Cross Cultural Tests & Measurements
Week 3
Topic & Assignments
Validity and Test Development
Assignments
1. Discussion
2. Reverse Scoring Exercise
Validity
Does the test measure what it is supposed to measure?
Content Validity
Criterion-related validity
Construct Validity
Content Validity
Are the sample items representative of all possible items?
Is each question relevant to (representative of) the domain?
Interrater agreement (the opinions of judges) is the basis for deciding whether each question reflects the domain being measured
Face Validity
While not really a form of validity, it is often important in the public sphere that questions appear to fit “common sense”
May be particularly important to those taking the test
However, face validity can be misleading
Valid tests can include items that are surprising
Items with high face validity are more vulnerable to social desirability bias
Criterion-Related Validity
Concurrent
Simultaneously
Math test vs Current Standing in Math Course
Personality test and Psychiatrist Diagnosis
Predictive
Later date: future predictions
Employment Test and Supervisor ratings 6 months after employment
Validity Coefficient
Compute the correlation between test scores and criterion scores; higher correlations indicate more validity (0 to 1)
Coefficients rarely reach 1 (perfect validity); they commonly fall in the low to middle range and rarely exceed .8
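A minimal sketch of how a validity coefficient could be computed, assuming it is taken as the Pearson correlation between test scores and criterion scores. The helper name `pearson_r` and all the numbers are hypothetical, made up only to mirror the employment test example above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around each mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # Note: fails with ZeroDivisionError if either list has no variance
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data: employment test scores at hiring (predictor)
# and supervisor ratings 6 months later (criterion)
test_scores = [52, 61, 70, 45, 80, 66]
supervisor_ratings = [3.1, 3.4, 4.0, 2.8, 4.5, 3.2]

validity_coefficient = pearson_r(test_scores, supervisor_ratings)
print(round(validity_coefficient, 2))
```

The same function would apply to concurrent validity; the only difference is that the criterion is collected at the same time as the test.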
Construct Validity
The thing we are trying to measure (intangible quality or trait)
Inferred from behavior but arguably more than the behavior itself (underlying trait)
There is no single external referent sufficient to validate a construct
A network (pattern) of relationships can be derived instead
Since constructs are complex no single criterion will be accepted as entirely adequate.
Approaches to Construct Validity
Test Homogeneity, Group Differences & Interventions
Test Homogeneity
If all items on a test are measuring the same thing, they should correlate with the total score
Not sufficient on its own to establish construct validity
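One way to check homogeneity is to correlate each item with the total score. The sketch below assumes Pearson correlations and uses a hypothetical response matrix (5 respondents, 4 items scored 0 to 3); the names `pearson_r`, `responses`, and `item_total_r` are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical responses: rows are respondents, columns are items (0-3 scale)
responses = [
    [3, 2, 3, 0],
    [1, 1, 0, 2],
    [2, 3, 2, 1],
    [0, 0, 1, 3],
    [3, 3, 2, 2],
]

# Total score per respondent
totals = [sum(row) for row in responses]

# Correlate each item (column) with the total score
item_total_r = []
for j in range(len(responses[0])):
    item_scores = [row[j] for row in responses]
    item_total_r.append(pearson_r(item_scores, totals))

for j, r in enumerate(item_total_r, start=1):
    print(f"Item {j}: r = {r:.2f}")
```

In this made-up data the last item correlates negatively with the total, which would flag it for review (for example, as an item that still needs reverse scoring). Because each item is also part of the total, these correlations run slightly high.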
Group Differences
Persons expected to differ on the trait do so, OR persons thought to be high in the trait score high
Interventions
Compare results before and after an intervention
Test scores should change in the appropriate direction and amount
However, this is contingent on the intervention itself being successful
Convergent and Divergent Validity
Convergent:
Measures should correlate with similar measures
Two intelligence tests (or even all intelligence tests) should correlate to some degree
This holds only as long as the known tests have measured the construct accurately
Divergent:
A test should not correlate with measures of constructs from which it should differ
Example: social interest and intelligence
This Week…
Discussion Board
Select a test you are interested in (perhaps for dissertation). What does the test measure (include name of test)? If you were going to discuss the validity of the test, what would you look for in terms of:
1. Criterion-Related Validity (what behaviors or traits should correlate concurrently or predictively)?
2. Group Differences (what persons may be expected to differ on the trait)?
3. Convergent & Divergent Validity (what measures should correlate and not correlate)?
Reverse-Scored Items Quiz
Reverse scoring/coding:
Asking questions/statements in the opposite direction of what you are trying to measure
For example, what statements would be the opposite of depression?
When reverse scoring we change the answer (score) to the opposite for those reversed items:
So let’s take Q4 as an example. When scoring we have to change it based on:
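Original answer: 0 1 2 3
Reversed score: 3 2 1 0
The participant answered “1” to Q4. After reverse scoring this becomes a “2”, since an original answer of 1 maps to a reversed score of 2.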
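The mapping can also be written as a formula. This is a sketch that assumes equally spaced response options; the symbols $x$, $x_{\min}$, and $x_{\max}$ (the original answer and the lowest and highest possible answers) are introduced here for illustration.

```latex
\[
x_{\text{reversed}} = (x_{\min} + x_{\max}) - x
\]
\[
\text{For a 0--3 scale: } x_{\text{reversed}} = 3 - x,
\qquad \text{so } x = 1 \;\mapsto\; 3 - 1 = 2 .
\]
```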
Sum of non-reversed:
Sum of reversed:
Total:
First, we go through and identify the reverse-coded items
Then we re-score them
Finally, we add the non-reversed sum and the reversed sum to get the total score
Sum of non-reversed = ; Sum of reversed = ; Total =
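The three steps can be sketched in code. This is an illustration only, not the quiz answer: the function name `score_scale`, the answers, and the choice of which items are reverse-coded (other than Q4 with an answer of “1”) are all hypothetical, and a 0 to 3 response scale is assumed.

```python
def score_scale(answers, reversed_items, min_score=0, max_score=3):
    """Return (sum of non-reversed, sum of reversed after re-scoring, total)."""
    sum_non_reversed = 0
    sum_reversed = 0
    for item, answer in answers.items():
        if not min_score <= answer <= max_score:
            raise ValueError(f"{item}: answer {answer} is outside the scale")
        if item in reversed_items:
            # Steps 1 and 2: identify the reverse-coded item and re-score it
            sum_reversed += (min_score + max_score) - answer
        else:
            sum_non_reversed += answer
    # Step 3: add the two sums for the total score
    return sum_non_reversed, sum_reversed, sum_non_reversed + sum_reversed


# Hypothetical participant; only "Q4 answered 1" comes from the example above
answers = {"Q1": 2, "Q2": 0, "Q3": 3, "Q4": 1, "Q5": 2}
reversed_items = {"Q4", "Q5"}  # assumption: which items are reverse-coded

non_reversed, reversed_sum, total = score_scale(answers, reversed_items)
print(non_reversed, reversed_sum, total)  # prints: 5 3 8
```

Here Q4 goes from 1 to 2 and Q5 from 2 to 1, so the reversed sum is 3; the non-reversed items add to 5, giving a total of 8.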