1. Age equivalent is a score that relates a child’s performance to the typical performance of children of the same age (Pierangelo & Giuliani, 2017, p. 48).
Example of an age-equivalent score (Suny Cortland, 2018).
2. Alternate forms reliability is established when more than one version of a test is given to the same person or group (Pierangelo & Giuliani, 2017, p. 103). Each test version should assess the same skills. The scores from the versions are then compared for consistency. Alternate forms reliability is also known as equivalent forms reliability and parallel forms reliability.
3. Assessment is a process of appraising a person’s knowledge or behavior (Pierangelo & Giuliani, 2017, p. 5). A thorough assessment includes collecting data that will allow an informed decision to be made. The process can be formal or informal, but the outcome should identify the strengths and weaknesses of the individual being assessed. An example of a formal assessment is the Wechsler Intelligence Scale for Children – Fifth Edition (WISC-V), and an example of an informal assessment is a portfolio.
4. Chronological age is the subject’s actual age at the time an assessment is administered (Pierangelo & Giuliani, 2017). For example, a child who was born January 21, 2012 has a chronological age of six years, six months as of July 21, 2018.
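The date arithmetic in this example can be checked with a short Python sketch. The function name `chronological_age` is hypothetical, and the sketch counts only completed years and months; a published test protocol may handle leftover days or rounding differently.

```python
from datetime import date


def chronological_age(birth: date, test: date) -> tuple[int, int]:
    """Return the age at the test date as (completed years, completed months)."""
    months = (test.year - birth.year) * 12 + (test.month - birth.month)
    if test.day < birth.day:
        months -= 1  # the current month has not been completed yet
    return divmod(months, 12)


years, months = chronological_age(date(2012, 1, 21), date(2018, 7, 21))
print(f"{years} years, {months} months")  # 6 years, 6 months
```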
5. Concurrent validity is the degree to which a person’s test score indicates what their outcome would be on a criterion measure, or on a satisfactory mastery level, obtained in the same time frame (Pierangelo & Giuliani, 2017).
(Statistics how to, 2018)
6. Construct validity is the degree to which an assessment measures the trait it is intended to measure (Pierangelo & Giuliani, 2017).
(Statistics how to, 2018)
7. Content validity indicates whether the individual test items represent what the overall assessment is supposed to measure (Pierangelo & Giuliani, 2017).
(Statistics how to, 2018)
8. Content-referenced tests focus on specific skills and mastery of those skills (Pierangelo & Giuliani, 2017). A student’s mastery of the content is measured by the score they earn on the test. An example of a content-referenced test is a pretest that is administered prior to covering specific curriculum. The pretest allows the teacher to determine how much prior knowledge the student has about a specific subject.
9. Convergent validity indicates how closely a test’s results agree with those of other assessments that measure the same information (Pierangelo & Giuliani, 2017).
(Trochim, 2006)
10. Correlation is how two variables are associated with one another (Pierangelo & Giuliani, 2017).
(Matalone, 2018)
11. Criterion-referenced tests (CRTs) compare performance against a satisfactory level of mastery that is set as the standard (Pierangelo & Giuliani, 2017). The standard can be decided by a variety of individuals such as the publisher of the test, the school, or the teacher. CRTs is the abbreviation used for criterion-referenced tests.
12. Criterion-related validity compares an assessment’s scores to another standard measuring the equivalent skill (Pierangelo & Giuliani, 2017). An example would be comparing scores of two tests that measure the same standard.
13. Curriculum-based assessment (CBA) utilizes direct observation and documentation of the student’s knowledge related to the curriculum to make decisions about instruction (Pierangelo & Giuliani, 2017). This assessment is used when determining IEP goals related to specific curriculum. CBA is an acronym that can be used when referring to Curriculum-based assessment.
14. Curriculum-based measurement (CBM) tells how a student is progressing with basic academic instruction (McLane, 2018). CBM is an acronym that can be used when referring to Curriculum-based measurement.
15. Deciles divide a distribution of scores into ten equal units, or tenths (Pierangelo & Giuliani, 2017). For example, the seventh decile is the point below which 70 percent of the scores fall, and the first decile is the point below which 10 percent of the scores fall.
16. Discriminant validity indicates whether concepts that should be unrelated are in fact unrelated. A test with discriminant validity will not correlate with tests assessing other areas (Pierangelo & Giuliani, 2017). For example, an academic test should have low correlations with social and cognitive measures.
17. Dynamic assessment provides comparisons of a student’s academic progress over time (Pierangelo & Giuliani, 2017). Dynamic assessment allows a teaching component to be incorporated into the process of assessment.
18. Ecological assessment allows a student to be tested by direct observation in the student’s daily environment (Pierangelo & Giuliani, 2017). An example would be assessing the student’s performance or behavior in different everyday settings. The student may be well-behaved in theater arts and act out in English, or the student may be calm in academic classrooms but agitated in the cafeteria. This information is helpful when developing accommodations and modifications and when determining expectations for the student.
19. Grade equivalent is a score that compares an individual’s performance with that of students at a given grade level (Pierangelo & Giuliani, 2017). The score is expressed as a grade level in years and months. For example, a student who gets a grade equivalent score of 4.6 is performing at the level of the average student in the 4th grade, 6th month.
20. Informal reading inventory (IRI) is a tool that assesses a student’s reading progress and weaknesses. It helps with development of interventions. The informal reading inventory can be developed commercially or by a teacher. IRI is an acronym used for informal reading inventory.
21. Instructional planning uses assessment results to determine suitable instruction for an individual’s social, academic, and physical needs (Pierangelo & Giuliani, 2017). An example would be using information from a formal assessment to plan the type of instruction the student needs to be successful in the classroom.
22. Interrater reliability involves at least two raters (Pierangelo & Giuliani, 2017). The raters work independently of one another and are required to document observations and behaviors. An example would be two raters observing and documenting behaviors and reporting the information independently. The information is then merged.
23. Interval scale of measurement uses equal intervals between points, so that equal differences in scores represent equal amounts; its zero point is arbitrary rather than absolute (Pierangelo & Giuliani, 2017). The difference between two scores is the distance between the two points. An example is the years of the Western calendar.
24. Learning styles assessment helps determine the fundamental factors that influence the way one learns (Pierangelo & Giuliani, 2017). Examples of learning styles include spatial, auditory, linguistic, and kinesthetic (Overview of learning styles, 2018).
25. Mean is the average of scores (Pierangelo & Giuliani, 2017).
An example of the mean: 4 + 3 + 2 = 9, and 9 / 3 = 3, so the mean (average) is 3.
26. Measures of central tendency include statistical items such as mean, median, and mode related to score distribution (Pierangelo & Giuliani, 2017).
(Solace, 2014)
27. Median is the score that is in the middle or center of a set of scores (Pierangelo & Giuliani, 2017).
An example of the median number in a data set of scores would be 20, 20, 30, 40, 50.
The median is 30.
28. Mode is the score that occurs most often in a set of scores (Pierangelo & Giuliani, 2017).
An example of the mode number in a data set of scores would be 20, 20, 20, 40, 40, 50.
The mode is 20.
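The three measures of central tendency defined in entries 25, 27, and 28 can be computed with Python’s standard `statistics` module. This is a minimal sketch that reuses the median and mode data sets given above; the variable names are illustrative only.

```python
from statistics import mean, median, mode

median_scores = [20, 20, 30, 40, 50]
mode_scores = [20, 20, 20, 40, 40, 50]

average = mean(median_scores)        # (20 + 20 + 30 + 40 + 50) / 5 = 32
middle = median(median_scores)       # middle value of the sorted scores = 30
most_frequent = mode(mode_scores)    # 20 occurs three times

print(average, middle, most_frequent)
```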
29. Nominal scale of measurement divides observations into categories (Pierangelo & Giuliani, 2017). Frequency of the occurrence is then documented to develop a nominal scale. An example would be attaching a number or letter to a group to create categories.
Freshman = A
Sophomore = B
Junior = C
Senior = D
30. Normal distribution is the symmetrical, bell-shaped pattern of scores that would be expected if a test were administered to all students of the same age or grade specified by the assessment.
(Pierangelo & Giuliani, 2017, p. 37)
31. Norm-referenced tests (NRTs) are standardized tests that compare and rank test takers against one another (Pierangelo & Giuliani, 2017). NRTs is the abbreviation used for norm-referenced tests.
32. Ordinal scale of measurement uses a rank-order system that refers only to relative amounts (Pierangelo & Giuliani, 2017). An example would be the winners of a track meet, who are ranked 1st, 2nd, and 3rd place. The time in which each runner completed the race is not considered; only the order of finish is recorded.
33. Outcome-based assessment focuses on skills that are significant to real-life scenarios (Pierangelo & Giuliani, 2017). The assessment must consider the real-life situation, teach the skills that situation requires, and evaluate those skills. An example would be teaching a student to use public transportation: how to obtain a ticket, how to determine the bus number for the needed route, and how to ride.
34. Percentile specifies the percentage of scores, or of people, that fall at or below a given score (Pierangelo & Giuliani, 2017). For example, if you receive a percentile rank of 65, you did as well as or better than 65 percent of the people who were administered the assessment. Percentile is also referred to as percentile rank.
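A percentile rank can be sketched in Python as the share of scores at or below a given score. The function name `percentile_rank` and the group of scores are hypothetical, and published tests may define the rank slightly differently (for example, counting only scores strictly below).

```python
def percentile_rank(score: float, all_scores: list[float]) -> float:
    """Return the percentage of scores in all_scores that are at or below score."""
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100 * at_or_below / len(all_scores)


# Hypothetical scores for a group of ten test takers.
group = [55, 60, 62, 68, 70, 71, 75, 80, 85, 90]
print(percentile_rank(75, group))  # 70.0
```

Here seven of the ten scores are at or below 75, so a score of 75 has a percentile rank of 70 in this group.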
35. Performance-based assessment applies knowledge to real-life situations or activities (Pierangelo & Giuliani, 2017). It can also be a real-life simulation. Another term for performance-based assessment is naturalistic-based assessment.
36. Portfolio displays a student’s work and allows one to view the labor, development, and accomplishments made in multiple areas (Pierangelo & Giuliani, 2017). A math portfolio could include completed classwork, completed homework, and assessments. The portfolio could also house unfinished assignments. The portfolio should document the student’s progress, strengths, and weaknesses in the content area.
37. Predictive validity allows a forecast of future performance (Pierangelo & Giuliani, 2017). Examples of tests with predictive validity include career or aptitude tests that help determine potential success or failure in specific occupations or academic areas.
38. Protocol is the form on which answers and scores are recorded for an assessment (Pierangelo & Giuliani, 2017).
Example of the protocol used for assessment to calculate student age.
(Pierangelo & Giuliani, 2017, p. 42)
39. Purpose of assessment is to determine whether a child needs special education or whether other factors are affecting the child’s academic and behavioral performance in the school setting (Pierangelo & Giuliani, 2017, p. 5).
Assessment can also be used for developing the IEP, planning instruction, deciding educational placement, and establishing diagnosis and eligibility.
40. Quartiles refer to scores that have been divided into quarter sections (Pierangelo & Giuliani, 2017). An example of quartile is:
First quartile (1-25)
Second quartile (26-50)
Third quartile (51-75)
Fourth quartile (76-100)
The first quartile is also known as the lower quartile or the bottom 25%.
41. Range is the high score minus the low score (Pierangelo & Giuliani, 2017). The formula for range is range = high score – low score. Using the data set 12, 4, 18, 50, the range is calculated as follows: 50 – 4 = 46. The range is 46.
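The same calculation can be written as a one-line Python sketch using the data set from the example; the variable names are illustrative.

```python
scores = [12, 4, 18, 50]

# Range = high score minus low score.
score_range = max(scores) - min(scores)
print(score_range)  # 46
```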
42. Raw score is the number of items answered correctly on a test (Pierangelo & Giuliani, 2017). A raw score has not been weighted or manipulated. For example, a raw score of 9 out of 10 means the student answered 90% of the items correctly.
43. Reliability is how consistent the measurements are on an assessment (Pierangelo & Giuliani, 2017). A reliable assessment would provide the same score if given more than once to the same subject. If the score is not the same, the difference could be attributed to the psychological or physical state the person is in at the time the test is administered.
44. Reliability coefficient tells how dependable results of an assessment are over a period (Pierangelo & Giuliani, 2017). The coefficient can range from 0.00 to 1.00. The preferred coefficient is 0.90 to 1.00 which is interpreted as excellent.
45. Scaled score represents a raw score that has been converted to a common scale that provides a mathematical comparison among others (Pierangelo & Giuliani, 2017).
(Scott, 2016)
46. Skewed distribution is a distribution in which most of the test scores fall at the high or low end rather than in the middle (Pierangelo & Giuliani, 2017).
(Statistics how to, 2018)
47. Split-half reliability divides a test into two parts (Pierangelo & Giuliani, 2017). The student’s scores should be similar on both parts if the test is consistent. An example would be taking one test and using the odd-numbered questions for one set of scores and the even-numbered questions for the other set. The two sets of scores should be similar.
48. Standard deviation (SD) is how much the scores are spread around the average (Pierangelo & Giuliani, 2017). SD is an acronym for standard deviation.
49. Standard error of measurement (SEM) identifies the error between scores that are observed and scores that are estimated (Pierangelo & Giuliani, 2017). SEM is an acronym for standard error of measurement.
50. Standard score is a score that has been converted to fit a standard curve. It is also known as the z score (Pierangelo & Giuliani, 2017).
An example would be as follows: A person with an IQ of 115 on an IQ test with a mean of 100 and a standard deviation of 15 would have the following z score: (115 – 100)/15 = +1.0. The person’s IQ of 115 would be 1.0 standard deviation above the mean.
51. Standardized tests are administered with a specific procedure (Pierangelo & Giuliani, 2017). They are also timed, scored, and interpreted to acquire reliable scores. An example of a standardized test is the STAAR assessment in various subjects such as Reading and Math.
52. Standards-referenced tests tell if the individual has obtained the knowledge needed for various subjects and grade levels (Pierangelo & Giuliani, 2017). An example of a standards-referenced test is the State of Texas Academic Achievement and Readiness (STAAR) test.
53. Stanine means standard nines (Pierangelo & Giuliani, 2017). It is a score with an average of five and a standard deviation (SD) of two. It can range from one to nine.
(Azzolio, 2005)
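Because a stanine has a mean of five and a standard deviation of two, a z score can be mapped to an approximate stanine with the linear rule 2z + 5, rounded to a whole number and limited to the range one to nine. The Python sketch below illustrates that rule only; the function name `stanine_from_z` is hypothetical, and published norm tables usually assign stanines from percentile bands, so results can differ at the boundaries.

```python
import math


def stanine_from_z(z: float) -> int:
    """Map a z score to an approximate stanine (mean 5, SD 2, range 1 to 9)."""
    raw = math.floor(2 * z + 5.5)  # 2z + 5, rounded half up to a whole number
    return max(1, min(9, raw))     # keep the result between 1 and 9


print(stanine_from_z(0.0))   # 5
print(stanine_from_z(1.0))   # 7
print(stanine_from_z(-3.0))  # 1
```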
54. T score is a standard score with a mean of 50 and a standard deviation (SD) of 10 (Pierangelo & Giuliani, 2017). It is an alternate way to report a student’s performance on a test. For example, a pupil who scored 2.0 standard deviations above the mean (z = 2.0) would have a T score of 70, since T = 50 + 10(2.0).
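The conversion from a z score to a T score can be sketched in Python as follows; the function name `t_score` is hypothetical.

```python
def t_score(z: float) -> float:
    """Convert a z score to a T score using T = 50 + 10z."""
    return 50 + 10 * z


print(t_score(2.0))  # 70.0
print(t_score(0.0))  # 50.0
```

A score exactly at the mean (z = 0) therefore has a T score of 50.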
55. Task analysis is a method in which a task is divided into smaller sections or steps (Pierangelo & Giuliani, 2017).
(Bird, 2010)
56. Test-retest reliability shows the uniformity of an assessment when taken more than one time (Pierangelo & Giuliani, 2017).
(Ibrahim, 2011)
57. Validity is how well an assessment measures what is intended (Pierangelo & Giuliani, 2017).
(McLeod, 2013)
58. Validity coefficient indicates how useful the results of a test are (Statistics how to, 2018).
Validity coefficient values are noted in the table below.
(Pierangelo & Giuliani, 2017, p. 99)
59. Variance is a measure of how widely the scores are spread within a distribution (Pierangelo & Giuliani, 2017).
(Variance and standard deviation, 2009)
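Variance and standard deviation (entry 48) are closely related: the standard deviation is the square root of the variance. The Python sketch below uses a hypothetical set of eight scores and the population formulas from the standard `statistics` module; a textbook that divides by n – 1 (the sample formulas) would give slightly larger values.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical scores, chosen so the arithmetic works out evenly.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

average = mean(scores)        # 40 / 8 = 5
variance = pvariance(scores)  # average squared distance from the mean = 4
sd = pstdev(scores)           # square root of the variance = 2

print(average, variance, sd)
```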
60. Z score tells how many standard deviations (SDs) a score is from the mean (Pierangelo & Giuliani, 2017). The mean of a z score is zero. The formula to calculate z scores is z score = (test score – mean) / standard deviation. An example would be as follows: A person with an IQ of 115 on an IQ test with a mean of 100 and a standard deviation of 15 would have the following z score: (115 – 100)/15 = +1.0. The person’s IQ of 115 would be 1.0 standard deviation above the mean.
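The z score formula can be sketched in Python using the IQ example from this entry; the function name `z_score` is hypothetical.

```python
def z_score(score: float, mean: float, sd: float) -> float:
    """Return how many standard deviations a score lies from the mean."""
    return (score - mean) / sd


print(z_score(115, 100, 15))  # 1.0
print(z_score(100, 100, 15))  # 0.0
```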
References
Azzolio, A. (2005). Middle ground - Stanine -- Statistical standard nine normal distribution. Retrieved from Math n stuff: http://www.mathnstuff.com/math/spoken/here/2class/90/stanine.htm
Bird, D. (2010, September 6). Sample visual task analysis brush teach, bathe, and prepare breakfast (cereals). Retrieved from Slide share: https://www.slideshare.net/learningtoolssped/sample-visual-task-analysis
Ibrahim, A. (2011, October 28). Reliability and validity in physical therapy tests. Retrieved from Slide share: https://www.slideshare.net/aebrahim123/1-reliability-and-validity-in-physical-therapy-tests
Matalone, S. (2018). Correlation: Definition, analysis, & examples. Retrieved from Study.com: https://study.com/academy/lesson/correlation-definition-analysis-examples.html
McLane, K. (2018). What is curriculum-based measurement and what does it mean to my child? Retrieved from Reading Rockets: http://www.readingrockets.org/article/what-curriculum-based-measurement-and-what-does-it-mean-my-child
McLeod, S. (2013). What is validity? Retrieved from SimplyPsychology: https://www.simplypsychology.org/validity.html
Overview of learning styles. (2018). Retrieved from Learning-styles-online.com: https://www.learning-styles-online.com/overview/
Pierangelo, R. A., & Giuliani, G. (2017). Assessment in special education (5th ed.). Boston, MA.
Scott, S. (2016, July 5). Scaled scores for 2016 key stage 2 tests announced. Retrieved from Schools week: https://schoolsweek.co.uk/scaled-scores-for-key-stage-2-tests-announced/
Solace, S. (2014, September 29). Measures of central tendency: Mean, median, and mode. Retrieved from Owlcation: https://owlcation.com/stem/Measures-of-Central-Tendency-Mean-Median-and-Mode
Statistics how to. (2018). Construct validity: Simple definition, statistics used. Retrieved from Statistics how to: http://www.statisticshowto.com/construct-validity/
Statistics how to. (2018). Content validity (logical or rational validity). Retrieved from Statistics how to: http://www.statisticshowto.com/content-validity/
Statistics how to. (2018). Skewed distribution: Definition, examples. Retrieved from Statistics how to: http://www.statisticshowto.com/probability-and-statistics/skewed-distribution/
Statistics how to. (2018). Validity coefficient: Definition and how to find it. Retrieved from Statistics how to: http://www.statisticshowto.com/validity-coefficient-definition-and-how-to-find-it/
Statistics how to. (2018). What is concurrent validity. Retrieved from Statistics how to: http://www.statisticshowto.com/concurrent-validity/
Suny Cortland. (2018). Age-equivalent score. Retrieved from Suny Cortland: http://web.cortland.edu/andersmd/stats/agequiv.html
Trochim, W. M. (2006). Convergent & discriminant validity. Retrieved from Web center for social research methods: http://www.socialresearchmethods.net/kb/convdisc.htm
Variance and standard deviation. (2009, October 13). Retrieved from Slide share: https://www.slideshare.net/lmsindia/variance-and-standard-deviation