WPS HW P2
CHAPTER 7
Reliability of Selection Measures
© 2019 Wessex Press • Human Resource Selection 9e • Gatewood, Feild, Barrick
Learning Objectives
Explain the meaning of reliability and why it is important in human resources selection.
Contrast the concepts of true scores and errors of measurement for selection procedures.
Compare and contrast several methods for estimating reliability.
Explain what a reliability coefficient means.
Understand why certain factors affect a reliability estimate.
Explain why the standard error of measurement is important in comparing individuals’ scores on a predictor.
What is Reliability?
Degree of dependability, consistency, or stability of scores on a predictor or criterion used in HR selection
Figure 7.1 shows an example involving the dependability of information, in the context of selection, after a computer programming aptitude test was administered to 10 individuals applying for a job as a computer programmer
The test was readministered after the first test results went missing; results show that each of the applicants had different scores for the two tests – test results are not consistent, therefore not reliable
What is Reliability?
A Definition of Reliability
The term reliability has a host of definitions, but in the context of HR selection it simply means the degree of dependability, consistency, or stability of scores on a measure used in selection research – predictors or criteria
Reliability of a measure is determined by the degree of consistency between two sets of scores on the same measure
If such scores are inconsistent, then “errors of measurement” are present
What is Reliability?
Errors of Measurement
Reliability concerns the extent to which a measure is free of errors of measurement, but none of our selection measures will be completely free of them
Selection measures designed to assess important job-related characteristics – knowledge, skills, personality traits – may be prone to error due to the sample of items used, the test taker, the examiner, or the situation in which testing takes place
We want to know the “true” scores of applicants for each characteristic being measured – unless our measure is perfectly reliable, we will encounter difficulties in knowing precisely these true scores
The score obtained on a measure – obtained score/raw score – consists of two parts: a true component and an error component
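The two-part decomposition in the last bullet can be written as one equation. The symbols are illustrative labels chosen here, not notation taken from the slides:

```latex
X_{\text{obtained}} = X_{\text{true}} + X_{\text{error}}
```

On a perfectly reliable measure, $X_{\text{error}}$ would be zero for every applicant, so each obtained score would equal the true score.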
What is Reliability?
Errors of Measurement
True score:
A hypothetical score for a person assuming no errors of measurement were present at the time of measurement or scoring
The score individuals would earn if they answered correctly the same percentage of problems on the test that they would have if all possible problems had been given and the test were a construct valid measure of the underlying phenomenon of interest
The score individuals would earn if they answered correctly the problems they actually knew, without being affected by external factors – lighting or temperature of the room in which testing took place, emotional state, physical health
What is Reliability?
Errors of Measurement
Error score:
Represents errors of measurement – those factors that affect obtained scores but are not related to the characteristic, trait, or attribute being measured
These factors distort respondents’ scores either over or under what they would have been on another measurement occasion – fatigue, anxiety, noise during testing
Figure 7.2 shows the relationship between reliability and errors of measurement for three levels of reliability of a selection measure
Table 7.1 summarizes some of the more common sources of error that contribute to the unreliability of selection measures
What is Reliability?
Methods of Estimating Reliability
We cannot measure reliability; we can only estimate it
Statistical procedures are commonly used to calculate what is called a reliability coefficient – an index of relationship
Summarizes the relationship between two sets of measures for which a reliability estimate is being made
The calculated index varies from 0.00 to 1.00 – the correlation coefficient obtained is regarded as a direct measure of the reliability estimate
The higher the coefficient, the less the measurement error and the higher the reliability estimate
With high reliability, we can have more confidence that a particular measure is giving a dependable picture of true scores for the attribute being measured
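As a rough illustration of how such a coefficient is computed, the sketch below correlates two sets of scores from the same respondents. The function name `pearson_r` and the score lists are hypothetical, invented for this example; they are not the data from Figure 7.1.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    # Sum of cross-products and sums of squares around each mean
    cross = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    ss_1 = sum((a - mean_1) ** 2 for a in first)
    ss_2 = sum((b - mean_2) ** 2 for b in second)
    if ss_1 == 0 or ss_2 == 0:
        raise ValueError("scores in each list must vary")
    return cross / sqrt(ss_1 * ss_2)


# Hypothetical scores for 10 applicants on two administrations of one test
first_admin = [35, 57, 39, 68, 74, 68, 54, 71, 41, 44]
second_admin = [47, 69, 49, 50, 69, 65, 38, 78, 54, 59]

reliability_estimate = pearson_r(first_admin, second_admin)
print(round(reliability_estimate, 2))
```

A correlation can in principle be negative; the slides treat reliability estimates as falling between 0.00 and 1.00, with values nearer 1.00 indicating less measurement error.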
What is Reliability?
Methods of Estimating Reliability
Four principal methods most often employed in selection research studies:
Test-retest – the same measure used to collect data from the same respondents at two different points in time
Parallel or equivalent forms – two versions of a selection measure administered to the same respondents at two different times; scores on the two forms are then correlated
Internal consistency reliability estimate – shows the extent to which all parts of a measure are similar in what they measure (split-half reliability, Kuder-Richardson (K-R 20) reliability, Cronbach’s coefficient alpha (α) reliability)
Interrater reliability estimates – the determination of consistency or agreement among raters (interrater agreement, interclass correlation, intraclass correlation)
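Of the internal consistency estimates named above, coefficient alpha is probably the easiest to sketch in a few lines. The version below uses the usual formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name `cronbach_alpha` and the response data are hypothetical, and the sketch assumes complete data with every respondent answering all k items.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")
    # Transpose rows (respondents) into columns (items)
    items = list(zip(*responses))
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores must vary across respondents")
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical ratings: 5 respondents answering 4 items on a 1-5 scale
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

The sample variance (n − 1 denominator) is used for both the items and the totals; using the population variance for both would give the same alpha, since the ratio is unchanged.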
Interpreting Reliability Coefficients
What Does a Reliability Coefficient Mean?
Specific to the reliability estimation method and group on which it was calculated
A necessary but not a sufficient condition for validity
Based on responses from a group of individuals
Expressed by degree
Determined ultimately by judgment
Interpreting Reliability Coefficients
How High Should a Reliability Coefficient Be?
There is no generally agreed-upon value above which reliability is acceptable and below which it is unacceptable
The more critical the decision to be made, the greater the need for precision of the measure on which the decision will be based, and the higher the required reliability coefficient
Imprecise predictors can have long-term consequences for an organization – dependable predictors are essential for accurately evaluating key personnel
Criterion measures should be reliable; however, their reliability need not be as high as that of predictors for them to be useful
Test users must consider the specific circumstances surrounding their situations to determine how much measurement error they are willing to tolerate, and therefore whether the reliability coefficient is adequate
Interpreting Reliability Coefficients
Factors Influencing the Reliability of a Measure
Method of estimating reliability (Figure 7.9)
Individual differences among respondents
Stability
Sample
Length of a measure (Figure 7.10)
Test question difficulty (Figure 7.11)
Homogeneity of a measure’s content
Response format
Administration and scoring of a measure
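The list names length as a factor but gives no formula. One standard way to quantify the effect is the Spearman-Brown prophecy formula, brought in here as an illustration and not taken from the slides. It assumes the added items are comparable in content and quality to the existing ones. The function name `spearman_brown` is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a measure is lengthened by length_factor.

    length_factor = 2 means the measure is doubled with comparable items;
    length_factor = 0.5 means it is cut in half.
    """
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical case: a measure with reliability 0.50 is doubled in length
projected = spearman_brown(0.50, 2)
print(round(projected, 2))
```

Under these assumptions, lengthening a measure raises the projected reliability and shortening it lowers the projection, with diminishing returns as the measure grows.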
Interpreting Reliability Coefficients
Standard Error of Measurement
Reliability is a group-based statistic
To obtain an estimate of the error for an individual, we can use the standard error of measurement – a number in the same measurement units as the measure for which it is being calculated
Formula for calculating standard error:
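The formula itself did not survive in this copy of the slide. The standard definition, given here as an assumption about what the slide showed, is:

```latex
\sigma_{\text{meas}} = \sigma_x \sqrt{1 - r_{xx}}
```

where $\sigma_x$ is the standard deviation of obtained scores on the measure and $r_{xx}$ is its reliability estimate. As a hypothetical worked case, a test with $\sigma_x = 10$ and $r_{xx} = 0.91$ gives $\sigma_{\text{meas}} = 10\sqrt{0.09} = 3$ score points.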
Interpreting Reliability Coefficients
Standard Error of Measurement
To interpret differences in individuals’ scores:
The difference between two individuals’ scores should not be considered significant unless the difference is at least twice the standard error of measurement of the measure
Before the difference between scores of the same individual on two different measures should be treated as significant, the difference should be greater than twice the standard error of measurement of either measure
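The first rule of thumb can be sketched as a small check. This assumes the standard form of the standard error of measurement, SD × sqrt(1 − reliability); the function names, the test statistics, and the applicant scores are all hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM in score units, assuming SEM = SD * sqrt(1 - reliability)."""
    if sd < 0:
        raise ValueError("standard deviation cannot be negative")
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1 - reliability)


def difference_is_meaningful(score_a, score_b, sem):
    """Apply the rule: the gap must be at least twice the SEM."""
    return abs(score_a - score_b) >= 2 * sem


# Hypothetical test: standard deviation 10, reliability estimate 0.84
sem = standard_error_of_measurement(10, 0.84)

# Hypothetical applicants: a 10-point gap and a 5-point gap
print(difference_is_meaningful(50, 60, sem))
print(difference_is_meaningful(50, 55, sem))
```

With these numbers the SEM is about 4 points, so only gaps of roughly 8 points or more would be treated as real differences between applicants.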
Interpreting Reliability Coefficients
Evaluating Reliability Coefficients
The Buros Institute of Mental Measurements reviewed more than 1,000 commercially available tests published in The Eighth Mental Measurements Yearbook. For the tests listed, the Institute found:
Over 22% appeared without any reliability information
7% showed neither reliability nor validity data
9% showed no reliability data for certain subtests or forms
28% did not report any normative data
Interpreting Reliability Coefficients
Reliability: A Concluding Comment
Even though the assessment and interpretation of reliability can be complex, it is fundamental to the proper use of HR selection measures
The validity of a measure depends on its reliability – reliability of predictor scores and criterion scores is necessary, but not sufficient, for a score’s validity or interpretation
Knowledge of reliability information and other associated statistics is critical for making accurate assessments and decisions about individuals seeking employment