Personality Testing in Personnel Selection: Adverse impact and differential hiring rates
Stephen D. Risavy and Peter A. Hausdorf
Department of Psychology, University of Guelph, Guelph, ON, Canada N1G 2W1. phausdor@uoguelph.ca
Personality tests are often used in selection and have demonstrated predictive validity across
a variety of occupational groups and performance criteria. Although different selection
decision methods can be used to make selection decisions (e.g., compensatory top down,
compensatory with sliding bands, noncompensatory) from personality test results, there is a
paucity of research addressing the influence of these different selection decision methods on
issues such as adverse impact and differential hiring rates. This gap in the literature is
redressed in the current study. Results from 398 bus operator candidates indicated that there
may be adverse impact and differential hiring rate issues depending on the selection decision
method used and the designated group being assessed. Implications and future research
directions are discussed.
1. Introduction
Talent management – the attraction, development, and retention of key employees – begins with finding the right people. Because hiring the right people is critical for
organizational success and hiring the wrong people (i.e.,
making false positive selection decisions) may have a
negative impact on an organization, selection is of great
importance to human resource practitioners. The
costs associated with making false-positive hiring
decisions are substantial; unqualified employees may
make costly errors, may require close supervision, and
may need to receive training to become qualified for the
job that they were hired to perform. Conversely, the
economic benefits of making correct hiring decisions
have been documented in the organizational sciences
literature (e.g., Hunter & Schmidt, 1982). For example,
when selecting candidates based on ability, substantial
labor savings have been found because of the increased
productivity associated with high performers (Hunter &
Schmidt, 1982).
As a result of the importance of making accurate
selection decisions, organizational researchers and prac-
titioners have invested considerable effort in assessing
different selection tools. Survey data have provided evidence
that personality tests are often employed as an
assessment measure in personnel selection processes
(e.g., Heller, 2005; Rothstein & Goffin, 2006; Ryan &
Sackett, 1987). Moreover, Conscientiousness – one of
the focal dimensions of personality – appears to have
some predictive validity across occupational groups and
performance criteria in both North American (Barrick &
Mount, 1991; Hough, Eaton, Dunnette, Kamp, & McCloy,
1990) and European (Salgado, 1997) communities.
Although some researchers have argued that personality
tests are generally not associated with adverse impact
(e.g., Ones & Anderson, 2002), there is some evidence of
group differences in mean scores between subgroups
(e.g., Bartram, 1992; Dion & Yee, 1987).
Because of the possibility that there are group differ-
ences in personality test data, selecting employees using
this information may be associated with adverse impact
and/or differential hiring rates. Moreover, adverse impact
and/or differential hiring rates may vary based on the
decision method that is used to make selection decisions
(e.g., compensatory top down, compensatory with sliding
bands, noncompensatory). There is a paucity of research
addressing the possible influence of these different selection
decision methods on issues such as adverse impact
and differential hiring rates when using personality test
data to make hiring decisions.
The current study focuses on the practical issue of
exploring possible group differences in personality as-
sessment and the potentially resulting adverse impact on
White females and minorities under different selection
decision methods. Moreover, differential hiring rates
under the different selection decision methods between
majority and minority group members will be assessed. In
© 2011 Blackwell Publishing Ltd.,
9600 Garsington Road, Oxford, OX4 2DQ, UK and 350 Main St., Malden, MA, 02148, USA
International Journal of Selection and Assessment Volume 19 Number 1 March 2011
order to situate the current research within the extant
literature, this paper will begin with a brief review of
adverse impact and diversity management. Next, a review
of the literature pertaining to the use of personality
testing in personnel selection will be presented. Subse-
quently, different possible selection decision methods for
using selection data to make hiring decisions will be
discussed. Following this discussion, a study will be
presented from an actual selection context with a sizable
minority sample in order to assess if there is evidence of
adverse impact and differential hiring rates when using
personality test data to make hiring decisions under
different selection decision methods.
1.1. Adverse impact and diversity management
Adverse impact occurs when a selection procedure
results in different selection rates for different group
members (i.e., minority group members having lower
selection rates than majority group members; Alexander,
Barrett, & Doverspike, 1983; Guion, 1998). Courts
typically turn to the longstanding four-fifths rule in order
to assess if adverse impact against a particular sex, race,
or ethnic subgroup is occurring. The four-fifths rule
states that the selection rate for the minority group
should be no less than four-fifths (i.e., .80) that of the
selection rate for the majority group (Equal Employment
Opportunity Commission, U.S. Department of Labor, &
U.S. Department of Justice, 1978).
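The four-fifths comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the applicant and hire counts are invented purely to show the arithmetic.

```python
def selection_rate(hired: int, applicants: int) -> float:
    """Proportion of applicants in a group who were selected."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return hired / applicants


def impact_ratio(focal_rate: float, comparison_rate: float) -> float:
    """Selection rate of the focal group divided by that of the comparison group."""
    if comparison_rate <= 0:
        raise ValueError("comparison_rate must be positive")
    return focal_rate / comparison_rate


def violates_four_fifths(focal_rate: float, comparison_rate: float) -> bool:
    """True when the focal group's rate is below four-fifths (.80) of the comparison rate."""
    return impact_ratio(focal_rate, comparison_rate) < 0.80


# Hypothetical counts, for illustration only (not data from the study).
minority_rate = selection_rate(hired=12, applicants=100)
majority_rate = selection_rate(hired=20, applicants=100)

print(round(impact_ratio(minority_rate, majority_rate), 2))  # 0.6
print(violates_four_fifths(minority_rate, majority_rate))    # True
```

The ratio, rather than the raw difference in rates, is what the rule evaluates, so the same absolute gap can pass at high selection rates and fail at low ones.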
If a selection tool is found to have adverse impact (and
is not found to have criterion-related validity), then the
organization using that selection tool is highly susceptible
to legal challenges on the basis of discrimination. It is
important to note that adverse impact can arise from
mean differences between subgroups on the selection
instruments. In spite of the relation between adverse
impact and subgroup differences, it is possible to have
subgroup differences and no adverse impact (e.g., hiring
situations where there are high selection rates); however,
it is extremely rare to have a situation in which there are
no subgroup differences and adverse impact (Newman,
Jacobs, & Bartram, 2007; Newman & Lyon, 2009).
In cases for which adverse impact is present, organiza-
tions may choose to implement affirmative action policies
such as giving preferential treatment to minorities. For
example, when several candidates have the same score
on a selection test (or at least when score differences are
not statistically significant), an affirmative action policy
may state that minorities are to be selected before
majority group members for those candidates with the
same (or statistically similar) scores. However, it is worth
noting that this solution can create problems with
reverse discrimination.
Diversity management represents an alternative to
affirmative action policies. Diversity management pro-
grams have the goal of increasing minority representation
in more senior positions as well as providing more career
development opportunities for minorities (e.g., Williams
& Bauer, 1994). If diversity management programs are
successful, then there will be an increased representation
of minorities at higher levels in the workplace; this
increased visibility of minorities may act as a signal to
help recruit other minorities to the organization (e.g.,
Thomas, 1991). Moreover, even in cases in which adverse
impact is not present, it may still be desirable for
organizations to increase the diversity of the workforce
by hiring additional minority group members. The key
issue discussed in the current paper is how personality
assessment in the selection process affects adverse
impact and diversity management.
1.2. Personality test use in selection
Hogan (1991) noted that personality is typically used to
refer to: (1) the underlying structures, dynamics, pro-
cesses, and propensities that bring about behavioral
actions; or (2) the way these actions are observed and
described by others in terms of their content. In recent
years, personality dimensions have converged into the
‘Big Five’ model of personality – Conscientiousness
(dependable, organized, self-disciplined), Extraversion
(sociable, talkative, active), Emotional Stability (the op-
posite of Neuroticism; calm, unemotional, secure),
Agreeableness (altruistic, nurturing, caring), and Open-
ness to Experience (imaginative, cultured, broad-minded;
Digman, 1990; McCrae & Costa, 1987). The Big Five
describes behavioral actions in terms of a person’s
dispositional characteristics (Hogan, 1991) and these
characteristics have been associated with organizational
outcomes (e.g., job performance ratings; Barrick &
Mount, 1991; Tett, Jackson, & Rothstein, 1991).
Meta-analytic evidence from the early 1990s (e.g.,
Barrick & Mount, 1991; Tett et al., 1991) pointed to the
beneficial, albeit modest, value of personality measures
in predicting job performance ratings. For example, the
highest estimated true correlation – corrected for un-
reliability (in the predictor and the criterion) and range
restriction – reported by Barrick and Mount (1991) was
.22 for the relation between Conscientiousness and job
performance. Another meta-analysis published in the
same year (Tett et al., 1991) found a similar corrected
mean validity of .18 for Conscientiousness. Moreover,
personality variables have been found to be important in
explaining incremental variance in job performance be-
yond general cognitive ability (e.g., Day & Silverman,
1989; McCrae & Costa, 1987; Salgado, 1998). Although
some scholars have argued that personality is only a weak
predictor of job performance (e.g., Guion, 1965; Guion &
Gottier, 1965; Locke & Hulin, 1962; Morgeson et al.,
2007a, 2007b), personality tests have been touted as
having less adverse impact than other tests. Despite this
view, group differences in mean scores across gender and
ethnic groups have been found in prior research.
Regarding gender differences, extant large-scale, meta-
analytic research has provided evidence that across
different personality assessments, males score higher on
Assertiveness, whereas females score higher on Anxiety,
Gregariousness, Trust, and Tendermindedness (Feingold,
1994). A different study (using data from personality test
manuals) provided evidence that males score considerably
higher on Rugged Individualism and slightly higher on
Adjustment, Potency, and Intellectance, whereas females
score slightly higher on Dependability and Affiliation
(Hough, 1998). It is important to note that although
there were significant group differences in the aforemen-
tioned studies, the effect sizes associated with these
differences were often small in magnitude (Feingold,
1994; Hough, 1998; Ones & Anderson, 2002) and,
moreover, the differences were at times in favor of the
minority group (e.g., females scoring higher on Trust
and Affiliation). Other research studies have found
more substantive differences between males and females;
for example, Bartram (1992) found evidence that
males score substantially lower than females on
Anxiety and substantially higher on Tough-Poise and
Independence.
With respect to ethnic group differences, extant meta-
analytic research has provided evidence that White
respondents score higher on Sociability and Extraversion
(global) and lower on Anxiety, whereas Black respon-
dents score higher on Self-Esteem, Conscientiousness
(global), and Cautiousness (Foldes, Duehr, & Ones,
2008). The aforementioned study by Hough (1998)
provided evidence that White respondents score higher
on Affiliation and Intellectance, whereas Black respondents
score slightly higher on Potency. Other research
studies have found more substantive differences between
ethnic groups; for example, in a Canadian context,
Dion and Yee (1987) found evidence that Anglos and
Europeans score higher than Asians on Affiliation, Dom-
inance, Exhibition, and Nurturance.
Although some scholars (e.g., Foldes et al., 2008;
Hough, 1998; Ones & Anderson, 2002) have concluded
that personality tests contain scant evidence of adverse
impact, the fact that group differences were found and
that those differences were at times fairly large points to
the possibility that adverse impact may be an issue under
some conditions. As mentioned previously, when sub-
group differences are present there is an opportunity for
adverse impact to be present as well (especially when
working with low selection rates; Newman et al., 2007;
Newman & Lyon, 2009). As a result, the issue of adverse
impact in personality tests deserves further scrutiny;
specifically, adverse impact and differential hiring rates
may be more or less prevalent depending on how the
personality test data are being used to make selection
decisions.
Prior research has not specifically assessed how orga-
nizations are using personality test data to make selection
decisions in an actual selection context. It is possible that
when using different selection decision methods in an
actual selection context, group differences may result in
adverse impact and/or differential hiring rates for mino-
rities. The specific selection decision methods used when
making hiring decisions based on personality test data and
the resulting influence on adverse impact and differential
hiring rates are important and previously overlooked
practical issues to consider when including personality test
data in the selection process.
1.3. Selection decision methods for using assessment data
Rothstein and Goffin (2006) argued that organizational
members in charge of selection procedures may not
understand the complexities and importance of using
personality tests appropriately. Industrial and Organiza-
tional psychologists have not provided practitioners with
specific guidelines for the use of personality data when
making hiring decisions. Despite this gap, generic guide-
lines for selection tools exist and may be applicable to
personality assessment data.
Table 1 displays the typical selection decision methods,
as can be found in sources such as Guion and Highhouse
(2006) and Society for Industrial and Organizational
Psychology (SIOP) (2003). As indicated in Table 1, there
are many options for using assessment data to select
from a pool of job candidates. The table operates much
like a flowchart in the sense that, first, the organizational
hiring members can choose if they will be using a
compensatory or a noncompensatory method (first
column). Compensatory methods involve combining
scores on selection predictors into an overall score.
Because this involves summing scores across predictors,
high scores on some predictors can compensate for low
scores on other predictors (Guion & Highhouse, 2006;
SIOP, 2003). With noncompensatory methods, candidate
scores must exceed a predetermined value for each
predictor (i.e., a hurdle). As a result, a high score on
one predictor cannot compensate for a low score on
another predictor (Guion & Highhouse, 2006; SIOP,
2003). It is also possible to use a combination of
compensatory and noncompensatory methods depend-
ing on the requirements for the position as well as the
number and type of predictors (e.g., hurdle on one
predictor and compensatory on the other predictors;
Guion & Highhouse, 2006).
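The contrast between the two families of methods can be made concrete with a small sketch. The predictor names, scores, and cut values below are hypothetical and chosen only to show how a high score offsets a low one under a unit-weighted composite but not under per-predictor hurdles.

```python
from typing import Dict

Scores = Dict[str, float]


def composite_score(scores: Scores) -> float:
    """Compensatory, unit-weighted composite: the sum of all predictor scores."""
    return sum(scores.values())


def passes_all_hurdles(scores: Scores, cut_scores: Scores) -> bool:
    """Noncompensatory rule: every predictor must exceed its own cut score."""
    return all(scores[name] > cut for name, cut in cut_scores.items())


# Hypothetical candidate and thresholds, for illustration only.
candidate = {"predictor_a": 90.0, "predictor_b": 40.0}
cuts = {"predictor_a": 50.0, "predictor_b": 50.0}
composite_cut = 100.0

print(composite_score(candidate))                   # 130.0
print(composite_score(candidate) > composite_cut)   # True: the high score compensates
print(passes_all_hurdles(candidate, cuts))          # False: predictor_b fails its hurdle
```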
If choosing the compensatory method (top rows of
Table 1), the organization can require candidates to
complete all measures (second column) and subsequently
combine those scores into one score
using regression, rational, or unit weighting (i.e., summing
the scores; third column). It is important to note
that scores can be combined using either the regression
weights solely, rational determination solely, or a combi-
nation of regression and rational weighting. Guion and
Highhouse (2006) recommend that the weighting method
should be determined by a combination of both empirical
evidence and theoretical rationale. Lastly, the combined
scores can then be used to select candidates in a top
down (i.e., the highest scoring candidate will be selected
first and so on until the desired number of candidates
have been selected), top down with fixed bands (i.e.,
candidates are grouped based on a range of scores –
typically, plus or minus two times the standard error of
measurement [SEM] – and all candidates in the first band
must be selected before any candidates in the second
band are selected), top down with sliding bands (i.e.,
candidates are grouped based on a range of scores and
after the highest scoring candidate in the first band is
selected, the band is recalculated), or cut score approach
(e.g., organizational hiring members can rank and then
select top down or can set a cut score, rank those passing,
and then select top down; fourth column).
If choosing the noncompensatory method (bottom
rows of Table 1), the organization can then choose to
administer all of the measures at once (typically in 1–2
testing sessions) or to require that candidates meet the
hiring criterion on a prior measure in order to be allowed
to complete a subsequent measure (typically in sequential
sessions; i.e., multiple cut/hurdle; second column). Next,
the organizational hiring members can consider each
predictor score separately (third column) and can then
select all candidates surpassing the cut scores on all of the
measures (fourth column).
Although not focusing solely on personality test data,
some researchers have addressed the advantages and
disadvantages of various selection decision methods for
other combinations of selection assessments (e.g., Cam-
pion et al., 2001; Cascio, Outtz, Zedeck, & Goldstein,
1991; Hunter, Schmidt, & Rauschenberger, 1977; Sackett
& Roth, 1991, 1996; Schmidt & Hunter, 1998; Schmidt,
Mack, & Hunter, 1984). Schmidt et al. (1984) demon-
strated that rank ordering (or top down selection)
produced higher selection decision utility than cut scores
in which all candidates above the cut score were deemed
acceptable. In support of these findings, Cascio et al.
(1991) also found that selection decision methods invol-
ving rank ordering, top down selection had the most
utility (i.e., economic value); however, these researchers
also found that these selection decision methods had
adverse impact implications for overall assessment test
scores of a cognitive ability test for a sample of firefighter
candidates. Cascio and colleagues (1991) recommend the
use of the sliding band selection decision method – with
minority-based referral (i.e., an affirmative action policy) –
when reducing adverse impact is a primary goal in the
hiring process. However, the practice of using banding
procedures when making hiring decisions is a contentious
issue. Specifically, Schmidt (1991) has argued that all
banding procedures reduce the utility of the selection
process and that this loss is not offset by the potential benefit of
reduced adverse impact.
Given the disagreement with respect to the impact of
different selection decision methods (e.g., Cascio et al.,
1991; Schmidt, 1991), the current paper will focus on
each of the decision methods described in Table 1 (i.e.,
compensatory top down, compensatory top down with
fixed bands, compensatory top down with sliding bands,
compensatory cut score, and noncompensatory). In light
of the aforementioned debate regarding banding (e.g.,
Table 1. Selection decision methods for using assessment data
| Nature of the relationship between predictors and job performance | Testing process | Score combinations^a | Candidate ranking method |
| --- | --- | --- | --- |
| Compensatory (i.e., combining scores on predictors into a composite) | All candidates complete all measures | Single score: All predictor scores combined into one score using: (a) regression and/or rational weights^b or (b) unit weights (i.e., summing the scores) | (1) Top down; (2) Top down with fixed bands; (3) Top down with sliding bands; (4) Cut score |
| Noncompensatory (i.e., eliminating candidates that score below the cut score for a critical predictor) | (1) All candidates complete all measures (typically in 1–2 testing sessions); (2) Candidates do not complete all measures. Candidates must meet the hiring criterion on the prior measure to be allowed to complete the subsequent measure (typically in sequential sessions) | Multiple scores: Every predictor score considered separately | All candidates surpassing the cut scores on all of the measures are selected |
Note. Information adapted from Guion and Highhouse (2006) and SIOP Inc. (2003). This table originally appeared in Hausdorf and Risavy (2010).
^a All of the scores can be combined using either the unstandardized (b) or standardized (β) scores.
^b Guion and Highhouse (2006) suggest that the weighting method should be determined by both computational as well as rational logic.
Cascio et al., 1991; Schmidt, 1991), it is possible that
organizational preference for increasing minority repre-
sentation or maximizing performance may guide the
selection decision method that is utilized. Simulation-
based research has provided evidence that the relative
value an organization places on minority representation
and performance should guide decisions regarding the
selection decision method to be adopted (Sackett &
Roth, 1991, 1996). In sum, it appears that there are
various selection decision methods that can be
used to make hiring decisions based on personality test
data and the current study will assess the possible
influence of these different selection decision methods
on adverse impact and differential hiring rates – an issue
that has not yet been fully addressed in the extant
organizational sciences literature.
1.4. The current study
The current study assesses the possible group differences
present in personality measures used during a selection
process under different selection decision methods and
the resulting influence of those selection decision meth-
ods on adverse impact and differential hiring rates. The
NEO Personality Inventory (NEO PI-R; Costa & McCrae,
1992) – one of the most commonly used Big Five
measures of personality – was administered to a group
of bus operator candidates. The NEO was administered
as part of a validation study and thus, actual hiring
decisions were not made based on the assessment data.
Corresponding to the aforementioned focal purpose of
the current study, personality test data from the NEO
will be used to select candidates under different selection
decision methods and an assessment of group differences
that may lead to adverse impact and differential hiring
rates will be conducted. Specifically, each of the most
common selection decision methods – as presented
above and in Table 1 (i.e., compensatory top down,
compensatory top down with fixed bands, compensatory
top down with sliding bands, compensatory cut score,
and noncompensatory) – will be used to select
candidates and their influence on selection-based deci-
sions regarding White females and minorities will be
assessed.
2. Method
2.1. Participants and procedure
Participants were 398 bus operator candidates. The
sample consisted of 335 men (84.17%), 61 women
(15.33%), and 2 candidates who did not report their
gender (<1%). Seven candidates had less than a high school education, 64 candidates had a high school
education, 85 candidates had some college or university
education, 36 candidates had a university degree, and 3
candidates had a graduate degree. Of this sample 145
candidates were not visible minorities (36.43%), 249
candidates were visible minorities (62.56%), and 4 candi-
dates did not report their visible minority status (1.00%).
The minority sample consisted of people who identified
themselves as South Asian/Indo-Pakistani (N = 86), Black (N = 82), Chinese (N = 24), mixed race or color (N = 16), other South East Asian (N = 12), Central or South American (N = 10), Filipino (N = 7), West Asian or North African (N = 7), Oceanic (N = 3), and Korean (N = 2). Because of the large number of South Asian/Indo-Pakistani and Black candidates, analyses were con-
ducted for the minorities combined as well as separately
for the South Asian/Indo-Pakistani and Black candidates.
The candidates completed the selection measures (the
NEO as well as other measures that were unrelated to
the current study) during a validation study in a large-
sized Canadian city during one testing session.
2.2. Measures
2.2.1. Demographics
Participants were asked to report their gender, visible
minority status, and if applicable, their specific visible
minority group.
2.2.2. NEO PI-R
The NEO PI-R (Costa & McCrae, 1992) was adminis-
tered to candidates. The NEO measures the following
five personality dimensions: Conscientiousness (α = .88), Extraversion (α = .79), Neuroticism (α = .90), Agreeableness (α = .85), and Openness to Experience (α = .76). Each of the dimensions contained 48 items for a total of
240 items. Response options ranged from strongly
disagree (1) to strongly agree (5). For the purposes of
the current study it was expected that the personality
dimensions from the NEO would be related to
training performance in the sense that having higher
levels of Conscientiousness, Extraversion, Agreeableness,
and Openness to Experience and lower levels of
Neuroticism would be associated with higher training
performance.
2.3. Analytic strategy
First, mean differences on the NEO personality dimen-
sions were assessed using independent samples t-tests.
Mean differences on each of the dimensions were
compared for White females and minorities (as well as
the South Asian/Indo-Pakistani and Black groups) against
the scores from the White male group. Second, each of
the selection decision methods explicated above and in
Table 1 (i.e., compensatory top down, compensatory top
down with fixed bands, compensatory top down with
sliding bands, compensatory cut score, and noncompensatory)
was assessed in terms of the adverse impact
and/or differential hiring rates of White males, White
females, and minorities (as well as South Asian/Indo-
Pakistani and Black candidates). Table 2 provides a more
detailed explanation of how each of the selection deci-
sion methods was conducted.
3. Results
3.1. Mean difference analyses
The scores on each of the NEO dimensions for White
females and minorities (as well as the South Asian/Indo-
Pakistani and Black groups) were compared with the
scores from the White males group (Table 3) and
assessed using independent samples t-tests. Results in-
dicated that the only difference in favor of White males
was that they were significantly more open to experience
than the South Asian/Indo-Pakistani group. All other
significant differences were in favor of the White females
or minorities: (1) White females were more agreeable
and open to experience than White males; (2) minorities
were more conscientious than White males; and (3)
Black candidates were more conscientious and agreeable than
White males.
3.2. Selection decision method analyses
3.2.1. Compensatory top down
Candidate scores on the personality dimensions were
combined (each personality dimension was weighted
equally), and the top 50, 100, and 150 candidates were selected
using a top down procedure. In cases where candidates
had equal scores, candidates were selected with minor-
ity-based referral (i.e., minorities were selected before
candidates who were not minorities). One common
method for assessing adverse impact – especially in
Canada – is the four-fifths rule (Catano, Wiesner, Hack-
ett, & Methot, 2005). When the selection rate of the
designated group (e.g., White females, minorities) is less
than four-fifths that of the selection rate for the compar-
ison group (e.g., White males), there is evidence of
adverse impact for the designated group (Catano et al.,
2005). According to the four-fifths rule, the only evidence
of adverse impact occurred with the compensatory top
down selection decision method for 50 hires: the selection
rate of minorities (.11) was less than four-fifths that
of the White males group (.15; Table 4).
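The top down procedure with minority-based referral for ties can be sketched as follows. This is an illustrative reading of the description above, not the authors' code: the `Candidate` type, the names, and the scores are hypothetical, and ties among candidates with the same minority status are left in input order here rather than broken at random.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    score: float       # unit-weighted composite of the personality dimensions
    is_minority: bool


def select_top_down(candidates: List[Candidate], n_hires: int) -> List[Candidate]:
    """Rank by composite score (highest first); among equal scores,
    minority candidates are placed ahead of non-minority candidates."""
    ranked = sorted(candidates, key=lambda c: (-c.score, not c.is_minority))
    return ranked[:n_hires]


# Hypothetical pool, for illustration only.
pool = [
    Candidate("A", 610, False),
    Candidate("B", 600, False),
    Candidate("C", 600, True),
    Candidate("D", 590, True),
]

print([c.name for c in select_top_down(pool, 2)])  # ['A', 'C']
```

Candidates B and C are tied at 600, so with two openings the referral rule selects C ahead of B.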
3.2.2. Compensatory top down with fixed bands
Candidates were grouped based on a range of scores
(plus or minus two times the SEM) and then the top 50,
100, and 150 candidates were selected using a top down
procedure with minority-based referral for fixed bands
(i.e., within each band minorities were selected first
and once all of the candidates in the band were selected
the next band was created). Bands were calculated by
subtracting two times the SEM from the top combined
score of the personality scale in question. Candidates
were selected with minorities in the band being selected
before other candidates. All candidates in the band
needed to be selected before calculating the next band.
Table 2. Analytic strategies for the selection decision methods
| Selection decision method | Analytic strategy |
| --- | --- |
| Compensatory top down | Candidate scores on the personality dimensions were combined (each personality dimension was weighted equally). The top 50, 100, and 150 candidates were selected using a top down procedure^a |
| Compensatory top down with fixed bands | Candidates were grouped based on a range of scores (plus or minus two times the standard error of measurement [SEM]) and then the top 50, 100, and 150 candidates were selected using a top down procedure with minority-based referral for fixed bands (i.e., within each band minorities were selected first and once all of the candidates in the band were selected the next band was created) |
| Compensatory top down with sliding bands | Candidates were grouped based on a range of scores (plus or minus two times the SEM) and then the top 50, 100, and 150 candidates were selected using a top down procedure with minority-based referral for sliding bands (i.e., within each band minorities were selected first and once the top scorer(s) in the band was(were) selected the band was adjusted) |
| Compensatory cut score | Candidate scores on the personality dimensions were combined (each personality dimension was weighted equally). The candidates scoring above the cut score^b were selected |
| Noncompensatory | Candidates who scored below the cut score for any of the personality dimensions were eliminated from the selection process. The remaining candidates were selected |
Note.
^a For all top down procedures, in cases where candidates have the same score on the scale in question, candidates were selected in two alternative ways: (1) at random, and (2) minority candidates were selected first and then the remaining candidates were selected randomly.
^b Cut score thresholds for each personality dimension were computed. The process for computing cut score thresholds for each personality dimension involved taking the mean score on the personality dimension in question for all bus operators in the sample who passed training (out of the candidates that advanced from selection to training). Next, the standard error of measurement (SEM) was subtracted from each mean dimension score. As an illustrative example, for the Conscientiousness dimension, the mean score for all bus operators who passed training was 145.95. Next, the SEM for the Conscientiousness dimension was calculated (SEM = 5.56) and subtracted from the mean dimension score. The resulting figure was used as the cut score for the Conscientiousness dimension (i.e., 140.39; see Appendix A for the cut score calculations for Conscientiousness as well as the other personality dimensions).
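The cut score arithmetic in note b of Table 2 can be reproduced directly. In the sketch below, `cut_score` follows the rule stated in the note (mean of those who passed training minus one SEM). The `standard_error_of_measurement` helper uses the conventional classical test theory formula, SD × √(1 − reliability); that the SEM was obtained this way is an assumption, since the text does not say how it was calculated.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Conventional SEM formula (assumed here, not stated in the text)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def cut_score(mean_of_passers: float, sem: float) -> float:
    """Cut score = mean dimension score of those who passed training minus one SEM."""
    return mean_of_passers - sem


# Conscientiousness values reported in the table note.
print(round(cut_score(145.95, 5.56), 2))  # 140.39
```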
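According to the four-fifths rule, there was no evidence
of adverse impact against minorities when using the
compensatory top down with fixed bands selection decision
method; however, there was evidence of adverse impact
against White females when 50 candidates were selected
(Table 5).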
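The four-fifths comparison reported in the tables can be illustrated with a short sketch. It uses the candidate counts from Table 5 and, as in the tables, treats White males as the reference group. The function name `four_fifths_check` is hypothetical, and the sketch assumes that unrounded selection rates are compared, whereas the tables report rates rounded to two decimals.

```python
def four_fifths_check(pool, offers, reference="White males"):
    """Return {group: (selection_rate, flagged)} under the four-fifths rule.

    A group is flagged when its selection rate is below 4/5 of the
    reference group's selection rate (assumption: unrounded rates).
    """
    rates = {group: offers[group] / pool[group] for group in pool}
    threshold = 0.8 * rates[reference]
    return {
        group: (rate, group != reference and rate < threshold)
        for group, rate in rates.items()
    }


# Counts transcribed from Table 5 (fixed bands with minority-based referral).
pool = {"White males": 117, "Minorities": 249, "White females": 23}
offers_n50 = {"White males": 7, "Minorities": 42, "White females": 1}
offers_n100 = {"White males": 28, "Minorities": 65, "White females": 7}

for label, offers in (("N = 50", offers_n50), ("N = 100", offers_n100)):
    result = four_fifths_check(pool, offers)
    flagged = [group for group, (_, hit) in result.items() if hit]
    rounded = {group: round(rate, 2) for group, (rate, _) in result.items()}
    print(label, rounded, "flagged:", flagged)
```

With these counts, only White females fall below the threshold at N = 50, and no group is flagged at N = 100, which agrees with the entries in Table 5.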
3.2.3. Compensatory top down with sliding bands
Similar to the compensatory top down with fixed bands
selection decision method, candidates were again
grouped based on a range of scores (plus or minus two
times the SEM) and then the top 50, 100, and 150
candidates were selected using a top down procedure
with minority-based referral for sliding bands (i.e., within
each band minorities were selected first and once the top
scorer[s] in the band was[were] selected the band was
adjusted). The focal difference from the fixed bands
approach is that for sliding bands, once the highest
scoring candidate(s) in the band is(are) selected, the
band can then be recalculated based on the next highest
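scoring candidate. According to the four-fifths rule, there
was no evidence of adverse impact against minorities when
using the compensatory top down with sliding bands selection
decision method; however, there was evidence of adverse
impact against White females when 50 candidates were
selected (Table 6).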
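The difference between fixed and sliding bands can be made concrete with a minimal sketch. It is not the procedure code used in the study: the names `Candidate`, `select_fixed_bands`, and `select_sliding_bands` are hypothetical, the candidate pool is invented for illustration, and the sketch assumes that within a band minorities are taken in descending score order, followed by the remaining candidates in descending score order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    score: float
    minority: bool


def _pick(band):
    # Minority-based referral: minorities first, then by score (assumption).
    return max(band, key=lambda c: (c.minority, c.score))


def select_fixed_bands(candidates, n_hires, band_width):
    """Exhaust each band completely before the next band is created."""
    remaining, band, hired = list(candidates), [], []
    while remaining and len(hired) < n_hires:
        if not band:
            top = max(c.score for c in remaining)
            band = [c for c in remaining if c.score >= top - band_width]
        choice = _pick(band)
        band.remove(choice)
        remaining.remove(choice)
        hired.append(choice)
    return hired


def select_sliding_bands(candidates, n_hires, band_width):
    """Recompute the band from the highest remaining score after every hire,
    so the band moves down as soon as the top scorer has been selected."""
    remaining, hired = list(candidates), []
    while remaining and len(hired) < n_hires:
        top = max(c.score for c in remaining)
        band = [c for c in remaining if c.score >= top - band_width]
        choice = _pick(band)
        remaining.remove(choice)
        hired.append(choice)
    return hired


# Hypothetical pool; the study used a band width of 61.58 on the composite.
demo_pool = [
    Candidate("A", 100, False),
    Candidate("B", 95, False),
    Candidate("C", 92, True),
    Candidate("D", 88, True),
    Candidate("E", 80, False),
]
print([c.name for c in select_fixed_bands(demo_pool, 3, 10)])
print([c.name for c in select_sliding_bands(demo_pool, 3, 10)])
```

In this invented pool, the fixed band (scores of 90 and above) is exhausted before candidate D can be considered, whereas the sliding band moves down once candidate A is hired and brings D into range ahead of B.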
3.2.4. Compensatory top down cut score
Candidate scores on the personality dimensions were
combined (each personality dimension was weighted
equally). The candidates scoring above the cut score
were selected. The process for computing cut score
thresholds for each personality dimension involved taking
the mean score on the personality dimension in question
for all bus operators in the sample who passed training.
The reason for setting the cut scores based on the
candidates who passed training was that it was assumed
that those who passed training were acceptable for the
job and thus, represented at least minimally acceptable
selection test scores. Next, the SEM was subtracted from
each mean dimension score. As an illustrative example,
for the Conscientiousness dimension, the mean score for
all bus operators who passed training was 145.95. Next,
the SEM for the Conscientiousness dimension was
calculated (SEM = 5.56) and subtracted from the mean
dimension score. The resulting figure was used as the cut
score for the Conscientiousness dimension (i.e., 140.39;
see Appendix A for the cut score calculations for
Conscientiousness as well as the other personality
dimensions). According to the four-fifths rule, there
was no evidence of adverse impact when using the
compensatory top down cut score selection decision
method; however, the selection rate for the minorities
(.45) was at the lowest possible value before being
less than four-fifths that of the White males group (.56;
Table 7).
Table 3. Mean differences on the personality dimensions of the NEO
Personality dimension         White males   White females   Minorities   South Asian/Indo-Pakistani   Blacks
Conscientiousness        M    143.25        139.21          146.99*      147.76                       148.24*
                         SD   15.28         19.23           16.03        17.06                        16.49
                         N    107           19              210          71                           66
Extraversion             M    126.77        129.35          124.62       124.06                       124.56
                         SD   13.82         11.21           14.35        13.95                        13.07
                         N    105           20              202          69                           63
Neuroticism              M    56.64         56.67           60.36        62.63                        59.04
                         SD   18.95         20.99           18.89        21.34                        18.53
                         N    103           21              205          67                           67
Agreeableness            M    135.06        140.86*         135.17       131.78                       140.23*
                         SD   15.34         8.91            17.03        17.00                        17.07
                         N    109           22              208          72                           66
Openness to experience   M    112.47        121.23*         111.92       108.25*                      114.66
                         SD   12.35         17.15           13.17        13.19                        12.94
                         N    105           22              206          71                           64
Note. *Mean is significantly different (p < .05) from the mean for White males.
Table 4. Compensatory top down with minority-based referral
Hires     Group           Total candidate pool (A)   Number of candidates made job offers (B)   Selection rate (B/A)   Adverse impact according to the four-fifths rule?
N = 50    White males     117                        17                                         .15                    Yes, for minorities
          Minorities      249                        28                                         .11
          White females   23                         5                                          .22
N = 100   White males     117                        33                                         .28                    No
          Minorities      249                        58                                         .23
          White females   23                         7                                          .30
N = 150   White males     117                        43                                         .37                    No
          Minorities      249                        93                                         .37
          White females   23                         11                                         .48
Note. According to the four-fifths rule, the minimum selection rate of White females and minorities must be .12 (4/5 × .15) for N = 50, .22 (4/5 × .28) for N = 100, and .30 (4/5 × .37) for N = 150. The number of candidates made job offers (column B) does not equal 100 and 150 because of missing data regarding the minority/gender variables.
Table 5. Compensatory top down with fixed bands with minority-based referral
Hires     Group           Total candidate pool (A)   Number of candidates made job offers (B)   Selection rate (B/A)   Adverse impact according to the four-fifths rule?
N = 50    White males     117                        7                                          .06                    Yes, for White females
          Minorities      249                        42                                         .17
          White females   23                         1                                          .04
N = 100   White males     117                        28                                         .24                    No
          Minorities      249                        65                                         .26
          White females   23                         7                                          .30
N = 150   White males     117                        28                                         .24                    No
          Minorities      249                        115                                        .46
          White females   23                         7                                          .30
Note. According to the four-fifths rule, the minimum selection rate of White females and minorities must be .05 (4/5 × .06) for N = 50, .19 (4/5 × .24) for N = 100, and .19 (4/5 × .24) for N = 150. The SEM for the bands was calculated by adding the SEM for each dimension and multiplying that sum by two (= 61.58).
Table 6. Compensatory top down with sliding bands with minority-based referral
Hires     Group           Total candidate pool (A)   Number of candidates made job offers (B)   Selection rate (B/A)   Adverse impact according to the four-fifths rule?
N = 50    White males     117                        7                                          .06                    Yes, for White females
          Minorities      249                        42                                         .17
          White females   23                         1                                          .04
N = 100   White males     117                        16                                         .14                    No
          Minorities      249                        80                                         .32
          White females   23                         4                                          .17
N = 150   White males     117                        27                                         .23                    No
          Minorities      249                        116                                        .47
          White females   23                         7                                          .30
Note. According to the four-fifths rule, the minimum selection rate of White females and minorities must be .05 (4/5 × .06) for N = 50, .11 (4/5 × .14) for N = 100, and .18 (4/5 × .23) for N = 150. The SEM for the bands was calculated by adding the SEM for each dimension and multiplying that sum by two (= 61.58).
Table 7. Compensatory cut score
Group           Total candidate pool (A)   Number of candidates made job offers (B)   Selection rate (B/A)   Adverse impact according to the four-fifths rule?
White males     117                        65                                         .56                    No
Minorities      249                        111                                        .45
White females   23                         13                                         .57
Note. According to the four-fifths rule, the minimum selection rate of White females and minorities must be .45 (4/5 × .56). The cut score was calculated by summing the cut scores for the Conscientiousness, Extraversion, Agreeableness, and Openness to experience dimensions and then subtracting the cut score for the Neuroticism dimension (= 434.82).
3.2.5. Noncompensatory
The personality dimension cut scores were used as
hurdles with the remaining candidates being selected;
specifically, the personality dimensions were each used as
a hurdle (in an arbitrary order). According to the four-
fifths rule, there was evidence of adverse impact when
using the noncompensatory selection decision method;
specifically, the selection rate of minorities (.18) was
less than four-fifths that of the White males group (.25;
Table 8).
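The contrast between the compensatory cut score method and the noncompensatory (hurdle) method can be sketched as follows. The dimension cut scores are those reported in Appendix A, and the composite follows the description in the note to Table 7. The function names and the candidate profile are hypothetical, and the sketch assumes that a candidate's composite is formed in the same way as the composite cut score and that Neuroticism acts as a reversed hurdle (scores above its cut score eliminate the candidate).

```python
# Dimension cut scores as reported in Appendix A.
CUT_SCORES = {
    "Conscientiousness": 140.39,
    "Extraversion": 120.19,
    "Agreeableness": 130.77,
    "Openness to experience": 106.43,
    "Neuroticism": 62.96,
}
REVERSED = {"Neuroticism"}  # higher scores are less desirable


def composite(scores):
    """Equally weighted sum, with reversed dimensions subtracted."""
    return sum(-v if dim in REVERSED else v for dim, v in scores.items())


def passes_compensatory(scores):
    # A high score on one dimension can offset a low score on another.
    return composite(scores) > composite(CUT_SCORES)


def passes_noncompensatory(scores):
    # Every dimension is a hurdle; failing any one eliminates the candidate.
    for dim, cut in CUT_SCORES.items():
        failed = scores[dim] > cut if dim in REVERSED else scores[dim] < cut
        if failed:
            return False
    return True


# Hypothetical candidate: strong on Conscientiousness, just under the
# Extraversion cut score.
candidate = {
    "Conscientiousness": 165,
    "Extraversion": 118,
    "Agreeableness": 135,
    "Openness to experience": 110,
    "Neuroticism": 55,
}
print(round(composite(CUT_SCORES), 2))    # composite cut score
print(passes_compensatory(candidate))     # composite of 473 clears the cut
print(passes_noncompensatory(candidate))  # fails the Extraversion hurdle
```

The composite cut score reproduces the 434.82 given in the note to Table 7. The hypothetical candidate is selected under the compensatory rule but eliminated under the noncompensatory rule, which illustrates why the two methods can yield different selection rates from the same test data.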
4. Discussion
The current study sought to redress one of the gaps due
to the paucity of research examining practical issues for
personality testing in personnel selection. The current
study examined the impact of different selection decision
methods on hiring decisions made based on personality
test data. The results of the current study suggested that
when using the personality dimensions of the NEO to
select employees from a pool of candidates there was no
evidence of adverse impact for minorities when using the
compensatory top down with fixed/sliding bands or the
compensatory cut score selection decision methods
(although for the compensatory cut score method, the
selection rate for minorities was as low as it could be
without indicating adverse impact). However, there
was evidence of adverse impact for minorities when
using the compensatory top down and noncompensatory
selection decision methods (although there was only
evidence of adverse impact in the compensatory top
down method when 50 candidates were selected; there
was no evidence of adverse impact when 100 or 150
employees were selected). Regarding differential hiring
rates, the selection rates varied based upon the selection
decision method invoked; specifically, minority selection
rates were highest when using the compensatory top
down with sliding bands selection decision method with
150 hires and lowest when using the compensatory top
down selection decision method with 50 hires. In sum,
the results for minorities demonstrated that both the
selection decision method and the number of candidates
selected affected adverse impact and hiring rates.
With respect to White females, there was no evidence
of adverse impact when using the compensatory top
down, compensatory cut score, or noncompensatory
selection decision methods. However, there was evidence
of adverse impact when using the compensatory
top down with fixed/sliding bands selection decision
methods with 50 hires. Because the decisions within
the bands were made based upon minority-based referral,
White females were disadvantaged under the banding
methods assessed in cases where only 50 candidates
were hired. Regarding differential hiring rates, White
female selection rates were highest when using the
compensatory cut score selection decision method and
lowest when using the compensatory top down with
fixed/sliding bands selection decision methods with 50
hires.
The results of the current study suggest that adverse
impact and differential hiring rates may or may not be an
issue depending on the selection decision method invoked
and the number of candidates selected. In sum,
the selection decision method that should be used by a
hiring organization will depend on the focal goals of the
organization during that particular selection process (e.g.,
to avoid adverse impact, to increase minority selection
rates, to increase utility). Furthermore, knowledge of the
impact of different selection decision methods will help
organizations to achieve these goals. For example, when
attempting to avoid adverse impact against minorities by
using a banding method, it is important to be cognizant
that if minorities are selected first within each band
(i.e., an affirmative action policy), then other designated
group members (e.g., White females) may be adversely
impacted and, moreover, reverse discrimination may also
occur. To help to circumvent potential issues with
affirmative action policies (such as selecting minorities
first within test score bands), tactics such as increasing
minority representation in more senior positions and
providing more career development opportunities for
minorities (i.e., diversity management programs; e.g.,
Williams & Bauer, 1994) should be proactively implemented.
With respect to personality test scores on the NEO,
the only significant mean difference in favor of White
males was that they were more open to experience than
the South Asian/Indo-Pakistani group. All other significant
differences were in favor of the minorities and White
females; specifically, White females were more agreeable
and open to experience than White males, minorities
were more conscientious than White males, and Blacks
were more conscientious and agreeable than White
males. Because there was minimal evidence of mean
differences favoring White males over White females or
minorities, these are promising results for including
personality tests in the selection process. Specifically,
the few differences in the personality dimensions did
not, for the most part, negatively impact minorities and
White females. However, the aforementioned results by
selection decision method and number of candidates
selected suggest that the assessment of adverse impact
needs to incorporate this information beyond simply
comparing mean differences.
Table 8. Noncompensatory
Group           Total candidate pool (A)   Number of candidates made job offers (B)   Selection rate (B/A)   Adverse impact according to the four-fifths rule?
White males     117                        29                                         .25                    Yes, for minorities
Minorities      249                        46                                         .18
White females   23                         6                                          .26
Note. According to the four-fifths rule, the minimum selection rate of White females and minorities must be .20 (4/5 × .25).
4.1. Limitations and future research directions
Although it was beyond the scope of the current paper to
test all of the possible combinations of different selection
decision methods, some of the most common methods
(i.e., compensatory top down, compensatory top down
with fixed bands, compensatory top down with sliding
bands, compensatory cut score, and noncompensatory)
were assessed. Nevertheless, there are various different
ways of combining assessment scores when using the
compensatory method (i.e., regression-, rational-, or unit
weighting) that were not investigated in the current
study (see Note 1). Moreover, candidate results can also be
combined using either the unstandardized (b) or standardized (β)
scores or a combination of compensatory and noncompensatory
methods (e.g., hurdle on one predictor and
compensatory on the other predictors). Future research
should seek to assess the adverse impact and differential
hiring rates of different selection decision methods that
have different combination methods (e.g., regression,
rational) and different types of scores being combined
(e.g., unstandardized, standardized).
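To illustrate why the combination method matters, the following sketch compares unit weighting with one possible rational weighting. It was not part of the current study: the weights in `RATIONAL_WEIGHTS`, the two candidate profiles, and the function name `weighted_composite` are all hypothetical and chosen only to show that a change in weights can reverse the rank order of two candidates.

```python
def weighted_composite(scores, weights):
    """Weighted sum of dimension scores (C, E, A, O, N)."""
    return sum(weights[dim] * scores[dim] for dim in weights)


# Unit weighting: every dimension counts equally (Neuroticism reversed).
UNIT_WEIGHTS = {"C": 1, "E": 1, "A": 1, "O": 1, "N": -1}
# Hypothetical rational weighting that doubles Conscientiousness.
RATIONAL_WEIGHTS = {"C": 2, "E": 1, "A": 1, "O": 1, "N": -1}

# Hypothetical candidate profiles.
candidate_1 = {"C": 150, "E": 120, "A": 130, "O": 110, "N": 60}
candidate_2 = {"C": 140, "E": 128, "A": 132, "O": 112, "N": 58}

for label, weights in (("unit", UNIT_WEIGHTS), ("rational", RATIONAL_WEIGHTS)):
    first = weighted_composite(candidate_1, weights)
    second = weighted_composite(candidate_2, weights)
    leader = "candidate_1" if first > second else "candidate_2"
    print(label, first, second, "higher:", leader)
```

Under unit weighting the second candidate ranks higher (454 versus 450), whereas under the hypothetical rational weights the first candidate does (600 versus 594). Any top down or banding procedure applied to these composites would therefore start from a different rank order.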
In the current study, cut scores were used because of
their prevalence in applied settings. However, the issue of
using cut scores to hire from a sample of job candidates
has been broached in previous research (e.g., Cooper-
Hakim & Viswesvaran, 2002) and the consensus appears
to be that the use of continuous measures is generally
preferable to using cut scores. The current study found
that the compensatory cut score method had a selection
rate for minorities that was extremely close to being
indicative of adverse impact. Moreover, the cut score
calculations in the current study were highly dependent
on the reliability of the personality scale dimensions.
Overall, setting cut scores in both research and practice
is difficult because of the variety of ways that they can be
calculated and thus, the impact of using different cut
score calculations under different selection decision
methods is an avenue for future research.
Socially desirable responding (i.e., faking) is an additional
issue that is important to consider with personality
tests in personnel selection. Tests utilized in selection
contexts often attempt to measure socially desirable
responding and thus, many personality tests used in
selection contain faking or lie scales. Regarding test
validity, induced faking has been associated with lower
criterion-related validity (e.g., Jackson, Wroblewski, &
Ashton, 2000). Faking research has provided empirical
support for the notion that candidates can and do
increase their scores in a socially desirable manner
when responding to personality assessments (i.e., ‘fake
good’) when motivated to present themselves in a
positive manner (e.g., Viswesvaran & Ones, 1999). Moreover,
candidates have been found to differ in the extent
to which they fake (Donovan, Dwight, & Hurtz, 2003). An
implication of candidate faking is that criterion-related
validity can be attenuated and that if faking is occurring, it
is possible that validity can be increased by accounting for
those socially desirable responses. In sum, assessing the
impact of faking on the results of personality test
data used in hiring decisions under different selection
decision methods is a potentially fruitful avenue for future
research.
Future research needs to assess how organizations use
data from personality measures to make selection-based
decisions. Surveying a representative sample of
organizations and asking selection professionals whether
they are using personality measures in their selection
processes and, if so, which measures they are using and
how they are using the results to make selection-based
decisions will help to elucidate the use of
personality measures and testing in contemporary
organizations. Explicating how personality assessment
data are actually being used in selection processes
will be an important complement to the research
conducted in the current study. In other words, are
organizations making decisions based on a profile
of candidates (i.e., across multiple traits), a reduced set
of traits, or a range of minimum and maximum trait
levels?
Finally, although the current study examined one of the
important practical issues pertaining to using personality
tests in personnel selection, there are several additional
practical issues related to the use of personality measures
in the selection process that will need to be addressed by
future empirical as well as theoretical investigations.
Some of the most important practical issues not fully
considered by the current paper and/or the extant
literature include the (1) use of cut scores; (2) use of
invasive or prohibited items; (3) use of norms or other
score adjustments; (4) need for people trained in
psychometrics to interpret the data; (5) inclusion of
non-job-related personality data; (6) use of item- versus
dimension-level data; and (7) use of raw score or percentile
results. These areas all represent additional practical
issues that need to be addressed by subsequent research
endeavors.
4.2. Summary and conclusion
Personality tests have become a common component of
many selection systems. However, the research literature
has not confirmed how organizations are making hiring
decisions based on personality tests in the selection
process and the impact of the different decision options.
The current paper provides evidence that adverse impact
may or may not be an issue depending upon the selection
decision method invoked. This paper suggests that the
optimal selection decision method should depend on the
goals of the organization (e.g., to avoid adverse impact, to
increase minority selection rates, to increase utility).
Furthermore, this study provides initial evidence of the
impact of selection decision methods on the adverse
impact and differential hiring rates of minorities and
White females. Specifically, organizations seeking to
make hiring decisions based on personality test data
and seeking to avoid adverse impact against minorities
should utilize compensatory top down with fixed/sliding
bands or compensatory cut score selection decision
methods. Although the compensatory top down and
noncompensatory selection decision methods may result in
hiring utility, organizations should be aware of the
possible adverse impact against minorities associated with
these methods when using personality test data to make
hiring decisions.
In sum, the current study focused on a widely used
personality measure (i.e., the NEO) administered to an
actual candidate sample that contained a uniquely large
number of minority candidates, in a Canadian context.
The findings suggest that it is imperative
for organizational members who are involved in personnel
selection decisions to be aware of the different
selection decision methods that are available to be used
when selecting job candidates based on personality test
data. Moreover, having an understanding regarding the
adverse impact and differential hiring rates that can be
expected based on using different selection decision
methods is also important information for human resource
professionals. The time has come to think more
carefully about how personality test data are being used
for selecting job candidates.
Acknowledgements
The authors would like to thank Chad Hayward, Deborah
Powell, and two anonymous reviewers for their helpful
comments on earlier versions of this paper.
The views expressed in this paper are those of the
authors and not of the organization that commissioned
the studies from which the data were obtained.
Note
1. We thank an anonymous reviewer for directing our attention
to this important limitation.
References
Alexander, R. A., Barrett, G. V., & Doverspike, D. (1983). An
explication of the selection ratio and its relationship to hiring
rate. Journal of Applied Psychology, 68, 342–344.
Barrick, M. R., & Mount, M. K. (1991). The Big Five personality
dimensions and job performance: A meta-analysis. Personnel
Psychology, 44, 1–26.
Bartram, D. (1992). The personality of UK managers: 16PF
norms for short-listed applicants. Journal of Occupational and
Organizational Psychology, 65, 159–172.
Campion, M. A., Outtz, J. L., Zedeck, S., Schmidt, F. L., Kehoe, J.
F., Murphy, K. R., et al. (2001). The controversy over score
banding in personnel selection: Answers to 10 key questions.
Personnel Psychology, 54, 149–185.
Cascio, W. F., Outtz, J., Zedeck, S., & Goldstein, I. L. (1991).
Statistical implications of six methods of test score use in
personnel selection. Human Performance, 4, 233–264.
Catano, V. M., Wiesner, W. H., Hackett, R. D., & Methot, L. L.
(2005). Recruitment and selection in Canada (3rd ed.). Toronto,
ON: Nelson Thompson Learning.
Cooper-Hakim, A., & Viswesvaran, C. (2002). A meta-analytic
review of the MacAndrew alcoholism scale. Educational and
Psychological Measurement, 62, 818–829.
Costa, P. T. Jr., & McCrae, R. R. (1992). Revised NEO personality
inventory (NEO PI-R) and NEO five-factor inventory (NEO-FFI):
professional manual. Odessa, FL: Psychological Assessment
Resources.
Day, D. V., & Silverman, S. B. (1989). Personality and job
performance: Evidence of incremental validity. Personnel
Psychology, 42, 25–36.
Digman, J. M. (1990). Personality structure: Emergence of the
five-factor model. Annual Review of Psychology, 41, 417–440.
Dion, K. L., & Yee, P. H. N. (1987). Ethnicity and personality in a
Canadian context. Journal of Social Psychology, 127, 175–182.
Donovan, J. J., Dwight, S. A., & Hurtz, G. M. (2003). An
assessment of the prevalence, severity, and verifiability of
entry-level applicant faking using the randomized response
technique. Human Performance, 16, 81–106.
Equal Employment Opportunity Commission, Civil Service
Commission, U.S. Department of Labor, & U.S. Department
of Justice. (1978). Uniform guidelines on employee selection
procedures. Federal Register, 43, 38290–38309.
Feingold, A. (1994). Gender differences in personality: A meta-
analysis. Psychological Bulletin, 116, 429–456.
Foldes, H. J., Duehr, E. E., & Ones, D. S. (2008). Group
differences in personality: Meta-analyses comparing five racial
groups. Personnel Psychology, 61, 579–616.
Guion, R. M. (1965). Personnel testing. New York: McGraw-Hill.
Guion, R. M. (1998). Assessment, measurement, and prediction for
personnel decisions. Mahwah, NJ: Lawrence Erlbaum and
Associates.
Guion, R. M., & Gottier, R. F. (1965). Validity of personality
measures in personnel selection. Personnel Psychology, 18,
135–164.
Guion, R. M., & Highhouse, S. (2006). Essentials of personnel
assessment and selection. Mahwah, NJ: Erlbaum.
Hausdorf, P. A., & Risavy, S. D. (2010). Decision making using
personality assessment: Implications for adverse impact and
hiring rates. Applied H.R.M. Research, 12, 100–120.
Heller, M. (2005). Court ruling that employer’s integrity test
violated ADA could open door to litigation. Workforce
Management, 84, 74–77.
Hogan, R. T. (1991). Personality and personality measurement.
In M. D. Dunnette, & L. M. Hough (Eds.), Handbook of
industrial and organizational psychology (Vol. 2, pp. 873–919).
Palo Alto, CA: Consulting Psychologists Press.
Hough, L. M. (1998). Personality at work: Issues and evidence. In
M. Hakel (Ed.), Beyond multiple choice: evaluating alternatives to
traditional testing for selection (pp. 131–166). Hillsdale, NJ:
Erlbaum.
Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., &
McCloy, R. A. (1990). Criterion-related validities of personality
constructs and the effect of response distortion on
those validities. Journal of Applied Psychology, 75, 581–595.
Hunter, J. E., & Schmidt, F. L. (1982). The economic benefits of
personnel selection using psychological ability tests. Industrial
Relations, 21, 293–308.
Hunter, J. E., Schmidt, F. L., & Rauschenberger, J. M. (1977).
Fairness of psychological tests: Implications of four definitions
for selection utility and minority hiring. Journal of Applied
Psychology, 62, 245–260.
Jackson, D. N., Wroblewski, V. R., & Ashton, M. C. (2000). The
impact of faking on employment tests: Does forced choice
offer a solution? Human Performance, 13, 371–388.
Locke, E. A., & Hulin, C. L. (1962). A review and evaluation of
the validity studies of activity vector analysis. Personnel
Psychology, 15, 25–42.
McCrae, R. R., & Costa, P. T. Jr. (1987). Validation of the five-
factor model of personality across instruments and observers.
Journal of Personality and Social Psychology, 52, 81–90.
Morgeson, F. P., Campion, M. A., Dipboye, R. L., Hollenbeck, J.
R., Murphy, K., & Schmitt, N. (2007a). Reconsidering the use
of personality tests in personnel selection contexts. Personnel
Psychology, 60, 683–729.
Morgeson, F. P., Campion, M. A., Dipboye, R. L., Hollenbeck, J.
R., Murphy, K., & Schmitt, N. (2007b). Are we getting fooled
again? Coming to terms with limitations in the use of
personality tests for personnel selection. Personnel Psychology,
60, 1029–1049.
Newman, D. A., Jacobs, R. R., & Bartram, D. (2007). Choosing
the best method for local validity estimation: Relative accuracy
of meta-analysis vs. a local study vs. Bayes-analysis.
Journal of Applied Psychology, 92, 1394–1413.
Newman, D. A., & Lyon, J. S. (2009). Recruitment efforts to
reduce adverse impact: Targeted recruiting for personality,
cognitive ability, and diversity. Journal of Applied Psychology, 94,
298–317.
Ones, D. S., & Anderson, N. (2002). Gender and ethnic
differences on personality scales in selection: Some British
data. Journal of Occupational and Organizational Psychology, 75,
255–276.
Rothstein, M. G., & Goffin, R. D. (2006). The use of personality
measures in personnel selection: What does current research
support? Human Resource Management Review, 16,
155–180.
Ryan, A. M., & Sackett, P. R. (1987). A survey of individual
assessment practices by I/O psychologists. Personnel
Psychology, 40, 455–488.
Sackett, P. R., & Roth, L. (1991). A Monte Carlo examination of
banding and rank order methods of test score use in
personnel selection. Human Performance, 4, 279–295.
Sackett, P. R., & Roth, L. (1996). Multi-stage selection strategies:
A Monte Carlo investigation of effects on performance and
minority hiring. Personnel Psychology, 49, 549–572.
Salgado, J. F. (1997). The five factor model of personality and job
performance in the European community. Journal of Applied
Psychology, 82, 30–43.
Salgado, J. F. (1998). Big Five personality dimensions and job
performance in army and civil occupations: A European
perspective. Human Performance, 11, 271–288.
Schmidt, F. L. (1991). Why all banding procedures in personnel
selection are logically flawed. Human Performance, 4, 265–277.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of
selection methods in personnel psychology: Practical and
theoretical implications of 85 years of research findings.
Psychological Bulletin, 124, 262–274.
Schmidt, F. L., Mack, M. J., & Hunter, J. E. (1984). Selection utility
in the occupation of U.S. park ranger for three modes of test
use. Journal of Applied Psychology, 69, 490–497.
Society for Industrial and Organizational Psychology Inc. (2003).
Principles for the validation and use of personnel selection
procedures (4th ed.). Bowling Green, OH: Author.
Tett, R. P., Jackson, D. N., & Rothstein, M. (1991). Personality
measures as predictors of job performance: A meta-analytic
review. Personnel Psychology, 44, 703–742.
Thomas, R. R. (1991). Beyond race and gender: unleashing the
power of your total work force by managing diversity. New York:
AMACOM.
Viswesvaran, C., & Ones, D. S. (1999). Meta-analyses
of fakability estimates: Implications for personality
measurement. Educational and Psychological Measurement, 59,
197–210.
Williams, M. L., & Bauer, T. N. (1994). The effect of managing
diversity policy on organizational attractiveness. Group and
Organization Management, 19, 295–308.
Appendix A
Table A1. Cut score calculations for the personality dimensions of the NEO
Personality dimension    Mean score for those who passed training (a)   Standard error of measurement (SEM)   Cut score (mean − SEM) (b)
Conscientiousness        145.95 (SD = 16.76)                            5.56                                  140.39
Extraversion             126.62 (SD = 14.51)                            6.43                                  120.19
Neuroticism              56.95 (SD = 20.16)                             6.01                                  62.96
Agreeableness            137.02 (SD = 15.47)                            6.25                                  130.77
Openness to Experience   112.97 (SD = 14.65)                            6.54                                  106.43
Note. The SEM was calculated as SD × √(1 − reliability). For the SEM calculations, the SDs came from the total sample that completed the personality dimension: Conscientiousness = 16.05, Extraversion = 14.04, Neuroticism = 18.99, Agreeableness = 16.15, and Openness to Experience = 13.35. Reliabilities were: Conscientiousness = .88, Extraversion = .79, Neuroticism = .90, Agreeableness = .85, and Openness to Experience = .76. (a) Those who failed training did so for reasons such as failing written/practical tests or receiving unsatisfactory progress reports. (b) Except for Neuroticism, where the SEM was added to the mean because higher scores on Neuroticism are less desirable.
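The calculations in Table A1 can be reproduced with a short sketch, using the means, total-sample SDs, and reliabilities reported in the table and its note. The function names are hypothetical, and the sketch assumes that each SEM was rounded to two decimals before being combined with the mean, which is consistent with the published figures.

```python
import math

# Values transcribed from Table A1 and its note.
MEAN_PASSED_TRAINING = {
    "Conscientiousness": 145.95,
    "Extraversion": 126.62,
    "Neuroticism": 56.95,
    "Agreeableness": 137.02,
    "Openness to Experience": 112.97,
}
TOTAL_SAMPLE_SD = {
    "Conscientiousness": 16.05,
    "Extraversion": 14.04,
    "Neuroticism": 18.99,
    "Agreeableness": 16.15,
    "Openness to Experience": 13.35,
}
RELIABILITY = {
    "Conscientiousness": 0.88,
    "Extraversion": 0.79,
    "Neuroticism": 0.90,
    "Agreeableness": 0.85,
    "Openness to Experience": 0.76,
}
HIGHER_IS_WORSE = {"Neuroticism"}


def sem(dimension):
    """SEM = SD * sqrt(1 - reliability), rounded to two decimals."""
    sd = TOTAL_SAMPLE_SD[dimension]
    return round(sd * math.sqrt(1 - RELIABILITY[dimension]), 2)


def cut_score(dimension):
    """Mean minus SEM, or mean plus SEM where higher scores are worse."""
    mean = MEAN_PASSED_TRAINING[dimension]
    if dimension in HIGHER_IS_WORSE:
        return round(mean + sem(dimension), 2)
    return round(mean - sem(dimension), 2)


for dimension in MEAN_PASSED_TRAINING:
    print(dimension, sem(dimension), cut_score(dimension))

# Band width used in Tables 5 and 6: two times the summed SEMs.
band_width = round(2 * sum(sem(d) for d in MEAN_PASSED_TRAINING), 2)
print(band_width)
```

Under these assumptions, the sketch returns the SEMs and cut scores listed in Table A1, and the band width matches the value of 61.58 given in the note to Table 5.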
Copyright of International Journal of Selection & Assessment is the property of Wiley-Blackwell and its
content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's
express written permission. However, users may print, download, or email articles for individual use.