Personality Testing in Personnel Selection: Adverse impact and differential hiring rates

Stephen D. Risavy and Peter A. Hausdorf

Department of Psychology, University of Guelph, Guelph, ON, Canada N1G 2W1. phausdor@uoguelph.ca

Personality tests are often used in selection and have demonstrated predictive validity across

a variety of occupational groups and performance criteria. Although different selection

decision methods can be used to make selection decisions (e.g., compensatory top down,

compensatory with sliding bands, noncompensatory) from personality test results, there is a

paucity of research addressing the influence of these different selection decision methods on

issues such as adverse impact and differential hiring rates. This gap in the literature is

redressed in the current study. Results from 398 bus operator candidates indicated that there

may be adverse impact and differential hiring rate issues depending on the selection decision

method used and the designated group being assessed. Implications and future research

directions are discussed.

1. Introduction

Talent management – the attraction, development, and retention of key employees – begins with finding the right people. Because hiring the right people is critical for

organizational success and hiring the wrong people (i.e.,

making false positive selection decisions) may have a

negative impact on an organization, selection is of great

importance to human resource practitioners. The nega-

tive costs associated with making false-positive hiring

decisions are substantial; unqualified employees may

make costly errors, may require close supervision, and

may need to receive training to become qualified for the

job that they were hired to perform. Conversely, the

economic benefits of making correct hiring decisions

have been documented in the organizational sciences

literature (e.g., Hunter & Schmidt, 1982). For example,

when selecting candidates based on ability, substantial

labor savings have been found because of the increased

productivity associated with high performers (Hunter &

Schmidt, 1982).

As a result of the importance of making accurate

selection decisions, organizational researchers and prac-

titioners have invested considerable effort in assessing

different selection tools. Survey data has provided evi-

dence that personality tests are often employed as an

assessment measure in personnel selection processes

(e.g., Heller, 2005; Rothstein & Goffin, 2006; Ryan &

Sackett, 1987). Moreover, Conscientiousness – one of

the focal dimensions of personality – appears to have

some predictive validity across occupational groups and

performance criteria in both North American (Barrick &

Mount, 1991; Hough, Eaton, Dunnette, Kamp, & McCloy,

1990) and European (Salgado, 1997) communities.

Although some researchers have argued that personality

tests are generally not associated with adverse impact

(e.g., Ones & Anderson, 2002), there is some evidence of

group differences in mean scores between subgroups

(e.g., Bartram, 1992; Dion & Yee, 1987).

Because of the possibility that there are group differ-

ences in personality test data, selecting employees using

this information may be associated with adverse impact

and/or differential hiring rates. Moreover, adverse impact

and/or differential hiring rates may vary based on the

decision method that is used to make selection decisions

(e.g., compensatory top down, compensatory with sliding

bands, noncompensatory). There is a paucity of research

addressing the possible influence of these different selection

decision methods on issues such as adverse impact

and differential hiring rates when using personality test

data to make hiring decisions.

The current study focuses on the practical issue of

exploring possible group differences in personality as-

sessment and the potentially resulting adverse impact on

White females and minorities under different selection

decision methods. Moreover, differential hiring rates

under the different selection decision methods between

majority and minority group members will be assessed. In

© 2011 Blackwell Publishing Ltd.,

9600 Garsington Road, Oxford, OX4 2DQ, UK and 350 Main St., Malden, MA, 02148, USA

International Journal of Selection and Assessment Volume 19 Number 1 March 2011

order to situate the current research within the extant

literature, this paper will begin with a brief review of

adverse impact and diversity management. Next, a review

of the literature pertaining to the use of personality

testing in personnel selection will be presented. Subse-

quently, different possible selection decision methods for

using selection data to make hiring decisions will be

discussed. Following this discussion, a study will be

presented from an actual selection context with a sizable

minority sample in order to assess if there is evidence of

adverse impact and differential hiring rates when using

personality test data to make hiring decisions under

different selection decision methods.

1.1. Adverse impact and diversity management

Adverse impact occurs when a selection procedure

results in different selection rates for different group

members (i.e., minority group members having lower

selection rates than majority group members; Alexander,

Barrett, & Doverspike, 1983; Guion, 1998). Courts

typically turn to the longstanding four-fifths rule in order

to assess if adverse impact against a particular sex, race,

or ethnic subgroup is occurring. The four-fifths rule

states that the selection rate for the minority group

should be no less than four-fifths (i.e., .80) that of the

selection rate for the majority group (Equal Employment

Opportunity Commission, U.S. Department of Labor, &

U.S. Department of Justice, 1978).
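
The four-fifths comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the applicant counts are assumptions made for the example and are not taken from the study's data.

```python
def selection_rate(hired: int, applicants: int) -> float:
    """Proportion of a group's applicants who were selected."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return hired / applicants


def impact_ratio(minority_rate: float, majority_rate: float) -> float:
    """Ratio of the minority selection rate to the majority selection rate."""
    if majority_rate <= 0:
        raise ValueError("majority_rate must be positive")
    return minority_rate / majority_rate


def violates_four_fifths(minority_rate: float, majority_rate: float) -> bool:
    """True when the minority rate is below four-fifths (.80) of the majority rate."""
    return impact_ratio(minority_rate, majority_rate) < 0.80


# Hypothetical counts: 12 of 100 minority and 20 of 100 majority applicants hired.
minority = selection_rate(12, 100)
majority = selection_rate(20, 100)
flag = violates_four_fifths(minority, majority)  # ratio is about .60, so flagged
```

A ratio at or above .80 would not be flagged under this rule, regardless of whether the underlying mean scores differ between groups.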

If a selection tool is found to have adverse impact (and

is not found to have criterion-related validity), then the

organization using that selection tool is highly susceptible

to legal challenges on the basis of discrimination. It is

important to note that adverse impact can arise from

mean differences between subgroups on the selection

instruments. In spite of the relation between adverse

impact and subgroup differences, it is possible to have

subgroup differences and no adverse impact (e.g., hiring

situations where there are high selection rates); however,

it is extremely rare to have a situation in which there are

no subgroup differences and adverse impact (Newman,

Jacobs, & Bartram, 2007; Newman & Lyon, 2009).

In cases for which adverse impact is present, organiza-

tions may choose to implement affirmative action policies

such as giving preferential treatment to minorities. For

example, when several candidates have the same score

on a selection test (or at least when score differences are

not statistically significant), an affirmative action policy

may state that minorities are to be selected before

majority group members for those candidates with the

same (or statistically similar) scores. However, it is worth

noting that this solution can create problems with

reverse discrimination.

Diversity management represents an alternative to

affirmative action policies. Diversity management pro-

grams have the goal of increasing minority representation

in more senior positions as well as providing more career

development opportunities for minorities (e.g., Williams

& Bauer, 1994). If diversity management programs are

successful, then there will be an increased representation

of minorities at higher levels in the workplace; this

increased visibility of minorities may act as a signal to

help recruit other minorities to the organization (e.g.,

Thomas, 1991). Moreover, even in cases in which adverse

impact is not present, it may still be desirable for

organizations to increase the diversity of the workforce

by hiring additional minority group members. The key

issue discussed in the current paper is how personality

assessment in the selection process affects adverse

impact and diversity management.

1.2. Personality test use in selection

Hogan (1991) noted that personality is typically used to

refer to: (1) the underlying structures, dynamics, pro-

cesses, and propensities that bring about behavioral

actions; or (2) the way these actions are observed and

described by others in terms of their content. In recent

years, personality dimensions have converged into the

‘Big Five’ model of personality – Conscientiousness

(dependable, organized, self-disciplined), Extraversion

(sociable, talkative, active), Emotional Stability (the op-

posite of Neuroticism; calm, unemotional, secure),

Agreeableness (altruistic, nurturing, caring), and Open-

ness to Experience (imaginative, cultured, broad-minded;

Digman, 1990; McCrae & Costa, 1987). The Big Five

describes behavioral actions in terms of a person’s

dispositional characteristics (Hogan, 1991) and these

characteristics have been associated with organizational

outcomes (e.g., job performance ratings; Barrick &

Mount, 1991; Tett, Jackson, & Rothstein, 1991).

Meta-analytic evidence from the early 1990s (e.g.,

Barrick & Mount, 1991; Tett et al., 1991) indicated a

beneficial, albeit modest, impact of personality measures

in predicting job performance ratings. For example, the

highest estimated true correlation – corrected for un-

reliability (in the predictor and the criterion) and range

restriction – reported by Barrick and Mount (1991) was

.22 for the relation between Conscientiousness and job

performance. Another meta-analysis published in the

same year (Tett et al., 1991) found a similar corrected

mean validity of .18 for Conscientiousness. Moreover,

personality variables have been found to be important in

explaining incremental variance in job performance be-

yond general cognitive ability (e.g., Day & Silverman,

1989; McCrae & Costa, 1987; Salgado, 1998). Although

some scholars have argued that personality is only a weak

predictor of job performance (e.g., Guion, 1965; Guion &

Gottier, 1965; Locke & Hulin, 1962; Morgeson et al.,

2007a, 2007b), personality tests have been touted as

having less adverse impact than other tests. Despite this

view, group differences in mean scores across gender and

ethnic groups have been found in prior research.

Regarding gender differences, extant large-scale, meta-

analytic research has provided evidence that across

different personality assessments, males score higher on

Assertiveness, whereas females score higher on Anxiety,

Gregariousness, Trust, and Tendermindedness (Feingold,

1994). A different study (using data from personality test

manuals) provided evidence that males score considerably

higher on Rugged Individualism and slightly higher on

Adjustment, Potency, and Intellectance, whereas females

score slightly higher on Dependability and Affiliation

(Hough, 1998). It is important to note that although

there were significant group differences in the aforemen-

tioned studies, the effect sizes associated with these

differences were often small in magnitude (Feingold,

1994; Hough, 1998; Ones & Anderson, 2002); moreover,

the differences were at times in favor of the

minority group (e.g., females scoring higher on Trust

and Affiliation). Other research studies have found

more substantive differences between males and females;

for example, Bartram (1992) found evidence that

males score substantially lower than females on

Anxiety and substantially higher on Tough-Poise and

Independence.

With respect to ethnic group differences, extant meta-

analytic research has provided evidence that White

respondents score higher on Sociability and Extraversion

(global) and lower on Anxiety, whereas Black respon-

dents score higher on Self-Esteem, Conscientiousness

(global), and Cautiousness (Foldes, Duehr, & Ones,

2008). The aforementioned study by Hough (1998)

provided evidence that White respondents score higher

on Affiliation and Intellectance, whereas Black respon-

dents score slightly higher on Potency. Other research

studies have found more substantive differences between

ethnic groups; for example, in a Canadian context,

Dion and Yee (1987) found evidence that Anglos and

Europeans score higher than Asians on Affiliation, Dom-

inance, Exhibition, and Nurturance.

Although some scholars (e.g., Foldes et al., 2008;

Hough, 1998; Ones & Anderson, 2002) have concluded

that personality tests contain scant evidence of adverse

impact, the fact that group differences were found and

that those differences were at times fairly large suggests

the possibility that adverse impact may be an issue under

some conditions. As mentioned previously, when sub-

group differences are present there is an opportunity for

adverse impact to be present as well (especially when

working with low selection rates; Newman et al., 2007;

Newman & Lyon, 2009). As a result, the issue of adverse

impact in personality tests deserves further scrutiny;

specifically, adverse impact and differential hiring rates

may be more or less prevalent depending on how the

personality test data are being used to make selection

decisions.

Prior research has not specifically assessed how orga-

nizations are using personality test data to make selection

decisions in an actual selection context. It is possible that

when using different selection decision methods in an

actual selection context, group differences may result in

adverse impact and/or differential hiring rates for mino-

rities. The specific selection decision methods used when

making hiring decisions based on personality test data and

the resulting influence on adverse impact and differential

hiring rates is an important and previously overlooked

practical issue to consider when including personality test

data in the selection process.

1.3. Selection decision methods for using assessment data

Rothstein and Goffin (2006) professed that organizational

members in charge of selection procedures may not

understand the complexities and importance of using

personality tests appropriately. Industrial and Organiza-

tional psychologists have not provided practitioners with

specific guidelines for the use of personality data when

making hiring decisions. Despite this gap, generic guide-

lines for selection tools exist and may be applicable to

personality assessment data.

Table 1 displays the typical selection decision methods,

as can be found in sources such as Guion and Highhouse

(2006) and Society for Industrial and Organizational

Psychology (SIOP) (2003). As indicated in Table 1, there

are many options for using assessment data to select

from a pool of job candidates. The table operates similarly

to a flowchart in the sense that first, the organizational

hiring members can choose if they will be using a

compensatory or a noncompensatory method (first

column). Compensatory methods involve combining

scores on selection predictors into an overall score.

Because this involves summing scores across predictors,

high scores on some predictors can compensate for low

scores on other predictors (Guion & Highhouse, 2006;

SIOP, 2003). With noncompensatory methods, candidate

scores must exceed a predetermined value for each

predictor (i.e., a hurdle). As a result, a high score on

one predictor cannot compensate for a low score on

another predictor (Guion & Highhouse, 2006; SIOP,

2003). It is also possible to use a combination of

compensatory and noncompensatory methods depend-

ing on the requirements for the position as well as the

number and type of predictors (e.g., hurdle on one

predictor and compensatory on the other predictors;

Guion & Highhouse, 2006).
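
To make the contrast concrete, the following minimal Python sketch applies a unit-weighted compensatory composite and a noncompensatory hurdle rule to the same two candidates. The candidate scores, the cut scores, and the function names are hypothetical, and treating a score equal to the cut score as a pass is an assumption.

```python
from typing import Dict

Scores = Dict[str, float]  # predictor name -> score for one candidate


def compensatory_score(scores: Scores) -> float:
    """Unit-weighted composite: a high score on one predictor can offset a low one."""
    return sum(scores.values())


def passes_all_hurdles(scores: Scores, cut_scores: Scores) -> bool:
    """Noncompensatory rule: every predictor must meet its own cut score.

    Assumption: a score equal to the cut score counts as a pass.
    """
    return all(scores[name] >= cut for name, cut in cut_scores.items())


# Hypothetical candidates and cut scores, for illustration only.
candidates = {
    "A": {"conscientiousness": 160.0, "agreeableness": 100.0},
    "B": {"conscientiousness": 130.0, "agreeableness": 125.0},
}
cuts = {"conscientiousness": 120.0, "agreeableness": 110.0}

composites = {cid: compensatory_score(s) for cid, s in candidates.items()}
hurdle_pass = [cid for cid, s in candidates.items() if passes_all_hurdles(s, cuts)]
```

In this example candidate A has the higher composite and would rank first under a compensatory method, yet only candidate B clears both hurdles under the noncompensatory method.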

If choosing the compensatory method (top rows of

Table 1), the organization can require candidates to

complete all measures (second column) and subse-

quently, can either combine those scores into one score

using regression-, rational-, or unit weighting (i.e., sum-

ming the scores; third column). It is important to note

that scores can be combined using either the regression

weights solely, rational determination solely, or a combi-

nation of regression and rational weighting. Guion and

Highhouse (2006) recommend that the weighting method

should be determined by a combination of both empirical

evidence and theoretical rationale. Lastly, the combined

scores can then be used to select candidates in a top

down (i.e., the highest scoring candidate will be selected

first and so on until the desired number of candidates

have been selected), top down with fixed bands (i.e.,

candidates are grouped based on a range of scores –

typically, plus or minus two times the standard error of

measurement [SEM] – and all candidates in the first band

must be selected before any candidates in the second

band are selected), top down with sliding bands (i.e.,

candidates are grouped based on a range of scores and

after the highest scoring candidate in the first band is

selected, the band is recalculated), or cut score approach

(e.g., organizational hiring members can rank and then

select top down or can set a cut score, rank those passing,

and then select top down; fourth column).

If choosing the noncompensatory method (bottom

rows of Table 1), the organization can then choose to

administer all of the measures at once (typically in 1–2

testing sessions) or to require that candidates meet the

hiring criterion on a prior measure in order to be allowed

to complete a subsequent measure (typically in sequential

sessions; i.e., multiple cut/hurdle; second column). Next,

the organizational hiring members can consider each

predictor score separately (third column) and can then

select all candidates surpassing the cut scores on all of the

measures (fourth column).

Although not focusing solely on personality test data,

some researchers have addressed the advantages and

disadvantages of various selection decision methods for

other combinations of selection assessments (e.g., Cam-

pion et al., 2001; Cascio, Outtz, Zedeck, & Goldstein,

1991; Hunter, Schmidt, & Rauschenberger, 1977; Sackett

& Roth, 1991, 1996; Schmidt & Hunter, 1998; Schmidt,

Mack, & Hunter, 1984). Schmidt et al. (1984) demon-

strated that rank ordering (or top down selection)

produced higher selection decision utility than cut scores

in which all candidates above the cut score were deemed

acceptable. In support of these findings, Cascio et al.

(1991) also found that selection decision methods invol-

ving rank ordering, top down selection had the most

utility (i.e., economic value); however, these researchers

also found that these selection decision methods had

adverse impact implications for overall assessment test

scores of a cognitive ability test for a sample of firefighter

candidates. Cascio and colleagues (1991) recommend the

use of the sliding band selection decision method – with

minority-based referral (i.e., an affirmative action policy) –

when reducing adverse impact is a primary goal in the

hiring process. However, the practice of using banding

procedures when making hiring decisions is a contentious

issue. Specifically, Schmidt (1991) has argued that all

banding procedures reduce the utility of the selection

process and that this loss is not offset by the potential benefit of

reduced adverse impact.

Given the disagreement with respect to the impact of

different selection decision methods (e.g., Cascio et al.,

1991; Schmidt, 1991), the current paper will focus on

each of the decision methods described in Table 1 (i.e.,

compensatory top down, compensatory top down with

fixed bands, compensatory top down with sliding bands,

compensatory cut score, and noncompensatory). In light

of the aforementioned debate regarding banding (e.g.,

Table 1. Selection decision methods for using assessment data

Columns: Nature of the relationship between predictors and job performance; Testing process; Score combinations [a]; Candidate ranking method

Compensatory (i.e., combining scores on predictors into a composite)
  Testing process: All candidates complete all measures
  Score combinations: Single score. All predictor scores combined into one score using (a) regression and/or rational weights [b] or (b) unit weights (i.e., summing the scores)
  Candidate ranking method: (1) Top down; (2) Top down with fixed bands; (3) Top down with sliding bands; (4) Cut score

Noncompensatory (i.e., eliminating candidates that score below the cut score for a critical predictor)
  Testing process: (1) All candidates complete all measures (typically in 1–2 testing sessions); (2) Candidates do not complete all measures. Candidates must meet the hiring criterion on the prior measure to be allowed to complete the subsequent measure (typically in sequential sessions)
  Score combinations: Multiple scores. Every predictor score considered separately
  Candidate ranking method: All candidates surpassing the cut scores on all of the measures are selected

Note. Information adapted from Guion and Highhouse (2006) and SIOP Inc. (2003). This table originally appeared in Hausdorf and Risavy (2010). [a] All of the scores can be combined using either the unstandardized (b) or standardized (β) scores. [b] Guion and Highhouse (2006) suggest that the weighting method should be determined by both computational as well as rational logic.

Cascio et al., 1991; Schmidt, 1991), it is possible that

organizational preference for increasing minority repre-

sentation or maximizing performance may guide the

selection decision method that is utilized. Simulation-

based research has provided evidence that the relative

value an organization places on minority representation

and performance should guide decisions regarding the

selection decision method to be adopted (Sackett &

Roth, 1991, 1996). In sum, it appears that there are

various selection decision methods that can be

used to make hiring decisions based on personality test

data and the current study will assess the possible

influence of these different selection decision methods

on adverse impact and differential hiring rates – an issue

that has not yet been fully addressed in the extant

organizational sciences literature.

1.4. The current study

The current study assesses the possible group differences

present in personality measures used during a selection

process under different selection decision methods and

the resulting influence of those selection decision meth-

ods on adverse impact and differential hiring rates. The

NEO Personality Inventory (NEO PI-R; Costa & McCrae,

1992) – one of the most commonly used Big Five

measures of personality – was administered to a group

of bus operator candidates. The NEO was administered

as part of a validation study and thus, actual hiring

decisions were not made based on the assessment data.

Corresponding to the aforementioned focal purpose of

the current study, personality test data from the NEO

will be used to select candidates under different selection

decision methods and an assessment of group differences

that may lead to adverse impact and differential hiring

rates will be conducted. Specifically, each of the most

common selection decision methods – as presented

above and in Table 1 (i.e., compensatory top down,

compensatory top down with fixed bands, compensatory

top down with sliding bands, compensatory cut score,

and noncompensatory) – will be used to select

candidates and their influence on selection-based deci-

sions regarding White females and minorities will be

assessed.

2. Method

2.1. Participants and procedure

Participants were 398 bus operator candidates. The

sample consisted of 335 men (84.17%), 61 women

(15.33%), and 2 candidates who did not report their

gender (<1%). Seven candidates had less than a high school education, 64 candidates had a high school

education, 85 candidates had some college or university

education, 36 candidates had a university degree, and 3

candidates had a graduate degree. Of this sample 145

candidates were not visible minorities (36.43%), 249

candidates were visible minorities (62.56%), and 4 candi-

dates did not report their visible minority status (1.00%).

The minority sample consisted of people who identified

themselves as South Asian/Indo-Pakistani (N = 86), Black (N = 82), Chinese (N = 24), mixed race or color (N = 16), other South East Asian (N = 12), Central or South American (N = 10), Filipino (N = 7), West Asian or North African (N = 7), Oceanic (N = 3), and Korean (N = 2). Because of the large number of South Asian/Indo-Pakistani and Black candidates, analyses were conducted

for the minorities combined as well as separately

for the South Asian/Indo-Pakistani and Black candidates.

The candidates completed the selection measures (the

NEO as well as other measures that were unrelated to

the current study) during a validation study in a large-

sized Canadian city during one testing session.

2.2. Measures

2.2.1. Demographics

Participants were asked to report their gender, visible

minority status, and if applicable, their specific visible

minority group.

2.2.2. NEO PI-R

The NEO PI-R (Costa & McCrae, 1992) was adminis-

tered to candidates. The NEO measures the following

five personality dimensions: Conscientiousness (α = .88), Extraversion (α = .79), Neuroticism (α = .90), Agreeableness (α = .85), and Openness to Experience (α = .76). Each of the dimensions contained 48 items for a total of

240 items. Response options ranged from strongly

disagree (1) to strongly agree (5). For the purposes of

the current study it was expected that the personality

dimensions from the NEO would be related to

training performance in the sense that having higher

levels of Conscientiousness, Extraversion, Agreeableness,

and Openness to Experience and lower levels of

Neuroticism would be associated with higher training

performance.

2.3. Analytic strategy

First, mean differences on the NEO personality dimen-

sions were assessed using independent samples t-tests.

Mean differences on each of the dimensions were

compared for White females and minorities (as well as

the South Asian/Indo-Pakistani and Black groups) against

the scores from the White male group. Second, each of

the selection decision methods explicated above and in

Table 1 (i.e., compensatory top down, compensatory top

down with fixed bands, compensatory top down with

sliding bands, compensatory cut score, and noncompen-

satory) were assessed in terms of the adverse impact

and/or differential hiring rates of White males, White

females, and minorities (as well as South Asian/Indo-

Pakistani and Black candidates). Table 2 provides a more

detailed explanation of how each of the selection deci-

sion methods was conducted.

3. Results

3.1. Mean difference analyses

The scores on each of the NEO dimensions for White

females and minorities (as well as the South Asian/Indo-

Pakistani and Black groups) were compared with the

scores from the White males group (Table 3) and

assessed using independent samples t-tests. Results in-

dicated that the only difference in favor of White males

was that they were significantly more open to experience

than the South Asian/Indo-Pakistani group. All other

significant differences were in favor of the White females

or minorities: (1) White females were more agreeable

and open to experience than White males; (2) minorities

were more conscientious than White males; and (3)

Blacks were more conscientious and agreeable than

White males.

3.2. Selection decision method analyses

3.2.1. Compensatory top down

Candidate scores on the personality dimensions were

combined (each personality dimension was weighted

equally). The top 50, 100, and 150 candidates were then selected

using a top down procedure. In cases where candidates

had equal scores, candidates were selected with minor-

ity-based referral (i.e., minorities were selected before

candidates who were not minorities). One common

method for assessing adverse impact – especially in

Canada – is the four-fifths rule (Catano, Wiesner, Hack-

ett, & Methot, 2005). When the selection rate of the

designated group (e.g., White females, minorities) is less

than four-fifths that of the selection rate for the compar-

ison group (e.g., White males), there is evidence of

adverse impact for the designated group (Catano et al.,

2005). According to the four-fifths rule, the only evidence

of adverse impact occurred with the compensatory top

down selection decision method for 50 hires: the selection

rate of minorities (.11) was less than four-fifths that

of the White males group (.15; Table 4).
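
The top down procedure with minority-based referral for ties can be sketched as follows. This is an illustrative sketch, not the authors' analysis code: the `Candidate` type, the function name, and the small candidate pool are hypothetical. The final line reuses the two selection rates reported above for 50 hires.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    cid: str
    composite: float   # equally weighted sum of the dimension scores
    minority: bool


def top_down(candidates: List[Candidate], n_hires: int) -> List[Candidate]:
    """Select the n_hires highest composites; ties go to minority candidates first."""
    # Sort by descending composite; within equal composites, minority (False) sorts first.
    ranked = sorted(candidates, key=lambda c: (-c.composite, not c.minority))
    return ranked[:n_hires]


# Hypothetical pool with a tie at 700 between c1 and c2.
pool = [
    Candidate("c1", 700.0, False),
    Candidate("c2", 700.0, True),
    Candidate("c3", 710.0, False),
    Candidate("c4", 690.0, True),
]
hired = top_down(pool, 2)

# Reported selection rates for 50 hires: minorities .11, White males .15.
reported_ratio = 0.11 / 0.15  # about .73, which is below the .80 threshold
```

Here the tie at 700 is resolved in favor of the minority candidate, so the two hires are c3 and then c2.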

3.2.2. Compensatory top down with fixed bands

Candidates were grouped based on a range of scores

(plus or minus two times the SEM) and then the top 50,

100, and 150 candidates were selected using a top down

procedure with minority-based referral for fixed bands

(i.e., within each band minorities were selected first

and once all of the candidates in the band were selected

the next band was created). Bands were calculated by

subtracting two times the SEM from the top combined

score of the personality scale in question. Candidates

were selected with minorities in the band being selected

before other candidates. All candidates in the band

needed to be selected before calculating the next band.

Table 2. Analytic strategies for the selection decision methods

Selection decision method: Analytic strategy

Compensatory top down: Candidate scores on the personality dimensions were combined (each personality dimension was weighted equally). The top 50, 100, and 150 candidates were selected using a top down procedure [a]

Compensatory top down with fixed bands: Candidates were grouped based on a range of scores (plus or minus two times the standard error of measurement [SEM]) and then the top 50, 100, and 150 candidates were selected using a top down procedure with minority-based referral for fixed bands (i.e., within each band minorities were selected first and once all of the candidates in the band were selected the next band was created)

Compensatory top down with sliding bands: Candidates were grouped based on a range of scores (plus or minus two times the SEM) and then the top 50, 100, and 150 candidates were selected using a top down procedure with minority-based referral for sliding bands (i.e., within each band minorities were selected first and once the top scorer(s) in the band was(were) selected the band was adjusted)

Compensatory cut score: Candidate scores on the personality dimensions were combined (each personality dimension was weighted equally). The candidates scoring above the cut score [b] were selected

Noncompensatory: Candidates who scored below the cut score for any of the personality dimensions were eliminated from the selection process. The remaining candidates were selected

Note. [a] For all top down procedures, in cases where candidates have the same score on the scale in question, candidates were selected in two alternative ways: (1) at random, and (2) minority candidates were selected first and then the remaining candidates were selected randomly. [b] Cut score thresholds for each personality dimension were computed. The process for computing cut score thresholds for each personality dimension involved taking the mean score on the personality dimension in question for all bus operators in the sample who passed training (out of the candidates that advanced from selection to training). Next, the standard error of measurement (SEM) was subtracted from each mean dimension score. As an illustrative example, for the Conscientiousness dimension, the mean score for all bus operators who passed training was 145.95. Next, the SEM for the Conscientiousness dimension was calculated (SEM = 5.56) and subtracted from the mean dimension score. The resulting figure was used as the cut score for the Conscientiousness dimension (i.e., 140.39; see Appendix A for the cut score calculations for Conscientiousness as well as the other personality dimensions).

Personality Testing in Personnel Selection 23

© 2011 Blackwell Publishing Ltd.

International Journal of Selection and Assessment

Volume 19 Number 1 March 2011

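According to the four-fifths rule, there was no evidence of adverse impact against minorities when using the compensatory top down with fixed bands selection decision method; however, there was evidence of adverse impact against White females when 50 candidates were selected (Table 5).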
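The four-fifths check applied throughout these results can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the function name `four_fifths_check` and the dictionary layout are assumptions made here, and White males are treated as the reference group because the table notes compute each threshold from the White male selection rate. The counts are taken from the rows of Table 5 for 50 candidates selected.

```python
def four_fifths_check(offers, pool, reference="White males"):
    """Return selection rates, the 4/5 threshold, and the groups falling below it."""
    # Selection rate = job offers (B) / total candidate pool (A) for each group.
    rates = {group: offers[group] / pool[group] for group in pool}
    # A group is flagged when its rate is below four-fifths of the reference rate.
    threshold = 0.8 * rates[reference]
    flagged = [g for g in rates if g != reference and rates[g] < threshold]
    return rates, threshold, flagged


# Counts from Table 5 (fixed bands, 50 candidates selected).
pool = {"White males": 117, "Minorities": 249, "White females": 23}
offers_n50 = {"White males": 7, "Minorities": 42, "White females": 1}

rates, threshold, flagged = four_fifths_check(offers_n50, pool)
for group, rate in rates.items():
    print(f"{group}: {rate:.2f}")
print(f"threshold: {threshold:.2f}; flagged: {flagged}")
```

With these counts only White females fall below the threshold, which matches the entry in the table.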

3.2.3. Compensatory top down with sliding bands

Similar to the compensatory top down with fixed bands selection decision method, candidates were again grouped based on a range of scores (plus or minus two times the SEM) and then the top 50, 100, and 150 candidates were selected using a top down procedure with minority-based referral for sliding bands (i.e., within each band minorities were selected first and once the top scorer[s] in the band was [were] selected the band was adjusted). The focal difference from the fixed bands approach is that for sliding bands, once the highest scoring candidate(s) in the band is (are) selected, the band is recalculated based on the next highest scoring candidate. According to the four-fifths rule, there was no evidence of adverse impact against minorities when using the compensatory top down with sliding bands selection decision method; however, there was evidence of adverse impact against White females when 50 candidates were selected (Table 6).
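The difference between fixed and sliding bands can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the procedure as implemented in the study: the function `select_with_bands`, the six candidates, and the band width of 10 points are all hypothetical (the study's band width was 61.58), and ties are resolved deterministically rather than at random.

```python
def select_with_bands(candidates, n_hires, width, sliding):
    """Top-down selection within score bands, with minority-first referral."""
    remaining = sorted(candidates, key=lambda c: c["score"], reverse=True)
    hired = []
    while remaining and len(hired) < n_hires:
        top = remaining[0]["score"]
        # The band holds everyone within `width` points of the current top score.
        band = [c for c in remaining if c["score"] >= top - width]
        # Inside the band, minority candidates are referred first, then by score.
        band.sort(key=lambda c: (not c["minority"], -c["score"]))
        # Sliding: pick one candidate, then recompute the band (it only moves
        # once the top scorer is gone). Fixed: exhaust the band before moving on.
        picks = band[:1] if sliding else band
        for c in picks[: n_hires - len(hired)]:
            hired.append(c)
            remaining.remove(c)
    return [c["name"] for c in hired]


# Hypothetical candidates, for illustration only.
candidates = [
    {"name": "A", "score": 100, "minority": False},
    {"name": "B", "score": 95, "minority": True},
    {"name": "C", "score": 92, "minority": False},
    {"name": "D", "score": 88, "minority": True},
    {"name": "E", "score": 85, "minority": True},
    {"name": "F", "score": 80, "minority": False},
]

fixed_hires = select_with_bands(candidates, n_hires=3, width=10, sliding=False)
sliding_hires = select_with_bands(candidates, n_hires=3, width=10, sliding=True)
print(fixed_hires, sliding_hires)
```

In this toy pool the fixed band fills all three places from the first band (B, A, C). Under sliding bands, once the top scorer A is hired the band moves down and the third place goes to minority candidate D instead of C.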

3.2.4. Compensatory top down cut score

Candidate scores on the personality dimensions were combined (each personality dimension was weighted equally). The candidates scoring above the cut score were selected. The process for computing cut score thresholds for each personality dimension involved taking the mean score on the personality dimension in question for all bus operators in the sample who passed training. The cut scores were based on the candidates who passed training because it was assumed that those who passed training were acceptable for the job and thus represented at least minimally acceptable selection test scores. Next, the SEM was subtracted from

Table 3. Mean differences on the personality dimensions of the NEO

| Personality dimension | Statistic | White males | White females | Minorities | South Asian/Indo-Pakistani | Blacks |
|---|---|---|---|---|---|---|
| Conscientiousness | M | 143.25 | 139.21 | 146.99* | 147.76 | 148.24* |
| | SD | 15.28 | 19.23 | 16.03 | 17.06 | 16.49 |
| | N | 107 | 19 | 210 | 71 | 66 |
| Extraversion | M | 126.77 | 129.35 | 124.62 | 124.06 | 124.56 |
| | SD | 13.82 | 11.21 | 14.35 | 13.95 | 13.07 |
| | N | 105 | 20 | 202 | 69 | 63 |
| Neuroticism | M | 56.64 | 56.67 | 60.36 | 62.63 | 59.04 |
| | SD | 18.95 | 20.99 | 18.89 | 21.34 | 18.53 |
| | N | 103 | 21 | 205 | 67 | 67 |
| Agreeableness | M | 135.06 | 140.86* | 135.17 | 131.78 | 140.23* |
| | SD | 15.34 | 8.91 | 17.03 | 17.00 | 17.07 |
| | N | 109 | 22 | 208 | 72 | 66 |
| Openness to experience | M | 112.47 | 121.23* | 111.92 | 108.25* | 114.66 |
| | SD | 12.35 | 17.15 | 13.17 | 13.19 | 12.94 |
| | N | 105 | 22 | 206 | 71 | 64 |

Note. *Mean is significantly different (p < .05) from the mean for White males.

Table 4. Compensatory top down with minority-based referral

| Number selected | Group | Total candidate pool (A) | Number of candidates made job offers (B) | Selection rate (B/A) | Adverse impact according to the four-fifths rule? |
|---|---|---|---|---|---|
| N = 50 | White males | 117 | 17 | .15 | Yes, for minorities |
| | Minorities | 249 | 28 | .11 | |
| | White females | 23 | 5 | .22 | |
| N = 100 | White males | 117 | 33 | .28 | No |
| | Minorities | 249 | 58 | .23 | |
| | White females | 23 | 7 | .30 | |
| N = 150 | White males | 117 | 43 | .37 | No |
| | Minorities | 249 | 93 | .37 | |
| | White females | 23 | 11 | .48 | |

Note. According to the four-fifths rule, the minimum selection rate of White females and minorities must be .12 (4/5 × .15) for N = 50, .22 (4/5 × .28) for N = 100, and .30 (4/5 × .37) for N = 150. The number of candidates made job offers (column B) does not sum to 100 and 150 because of missing data regarding the minority/gender variables.


each mean dimension score. As an illustrative example, for the Conscientiousness dimension, the mean score for all bus operators who passed training was 145.95. Next, the SEM for the Conscientiousness dimension was calculated (SEM = 5.56) and subtracted from the mean dimension score. The resulting figure was used as the cut score for the Conscientiousness dimension (i.e., 140.39; see Appendix A for the cut score calculations for Conscientiousness as well as the other personality dimensions). According to the four-fifths rule, there was no evidence of adverse impact when using the compensatory top down cut score selection decision method; however, the selection rate for minorities (.45) was at the lowest possible value before being less than four-fifths that of the White males group (.56; Table 7).
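The composite cut score can be reproduced from the dimension-level cut scores. The sketch below assumes that a candidate's composite is formed the same way as the composite cut score described in the note to Table 7 (the four desirable dimensions summed, with Neuroticism subtracted); the text states only that dimensions were weighted equally, so this scoring rule, the helper names, and both candidates are hypothetical.

```python
# Dimension-level cut scores reported in Appendix A (Table A1).
CUT_SCORES = {"C": 140.39, "E": 120.19, "N": 62.96, "A": 130.77, "O": 106.43}


def composite(scores):
    """Equal-weight composite; Neuroticism is subtracted (higher is less desirable)."""
    return scores["C"] + scores["E"] + scores["A"] + scores["O"] - scores["N"]


# Composite cut score, rounded to two decimals as in the note to Table 7.
composite_cut = round(composite(CUT_SCORES), 2)


def selected(candidate_scores):
    # Candidates scoring above the composite cut score are selected.
    return composite(candidate_scores) > composite_cut


# Hypothetical candidates, for illustration only.
candidate_1 = {"C": 150, "E": 125, "N": 60, "A": 135, "O": 110}  # composite 460
candidate_2 = {"C": 160, "E": 130, "N": 90, "A": 120, "O": 100}  # composite 420

print(composite_cut, selected(candidate_1), selected(candidate_2))
```

The computed composite cut score agrees with the value of 434.82 given in the note to Table 7.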

Table 5. Compensatory top down with fixed bands with minority-based referral

| Number selected | Group | Total candidate pool (A) | Number of candidates made job offers (B) | Selection rate (B/A) | Adverse impact according to the four-fifths rule? |
|---|---|---|---|---|---|
| N = 50 | White males | 117 | 7 | .06 | Yes, for White females |
| | Minorities | 249 | 42 | .17 | |
| | White females | 23 | 1 | .04 | |
| N = 100 | White males | 117 | 28 | .24 | No |
| | Minorities | 249 | 65 | .26 | |
| | White females | 23 | 7 | .30 | |
| N = 150 | White males | 117 | 28 | .24 | No |
| | Minorities | 249 | 115 | .46 | |
| | White females | 23 | 7 | .30 | |

Note. According to the four-fifths rule, the minimum selection rate of White females and minorities must be .05 (4/5 × .06) for N = 50, .19 (4/5 × .24) for N = 100, and .19 (4/5 × .24) for N = 150. The SEM for the bands was calculated by adding the SEM for each dimension and multiplying that sum by two (= 61.58).

Table 6. Compensatory top down with sliding bands with minority-based referral

| Number selected | Group | Total candidate pool (A) | Number of candidates made job offers (B) | Selection rate (B/A) | Adverse impact according to the four-fifths rule? |
|---|---|---|---|---|---|
| N = 50 | White males | 117 | 7 | .06 | Yes, for White females |
| | Minorities | 249 | 42 | .17 | |
| | White females | 23 | 1 | .04 | |
| N = 100 | White males | 117 | 16 | .14 | No |
| | Minorities | 249 | 80 | .32 | |
| | White females | 23 | 4 | .17 | |
| N = 150 | White males | 117 | 27 | .23 | No |
| | Minorities | 249 | 116 | .47 | |
| | White females | 23 | 7 | .30 | |

Note. According to the four-fifths rule, the minimum selection rate of White females and minorities must be .05 (4/5 × .06) for N = 50, .11 (4/5 × .14) for N = 100, and .18 (4/5 × .23) for N = 150. The SEM for the bands was calculated by adding the SEM for each dimension and multiplying that sum by two (= 61.58).

Table 7. Compensatory cut score

| Group | Total candidate pool (A) | Number of candidates made job offers (B) | Selection rate (B/A) | Adverse impact according to the four-fifths rule? |
|---|---|---|---|---|
| White males | 117 | 65 | .56 | No |
| Minorities | 249 | 111 | .45 | |
| White females | 23 | 13 | .57 | |

Note. According to the four-fifths rule, the minimum selection rate of White females and minorities must be .45 (4/5 × .56). The cut score was calculated by summing the cut scores for the Conscientiousness, Extraversion, Agreeableness, and Openness to experience dimensions and then subtracting the cut score for the Neuroticism dimension (= 434.82).


3.2.5. Noncompensatory

The personality dimension cut scores were used as hurdles with the remaining candidates being selected; specifically, the personality dimensions were each used as a hurdle (in an arbitrary order). According to the four-fifths rule, there was evidence of adverse impact when using the noncompensatory selection decision method; specifically, the selection rate of minorities (.18) was less than four-fifths that of the White males group (.25; Table 8).
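A short sketch may help show why the noncompensatory method is stricter than the compensatory cut score. The cut scores come from Appendix A and the composite cut from the note to Table 7. The helper names and the candidate are hypothetical, and treating the Neuroticism hurdle as failed when a score is above its cut is an assumption based on the appendix note that higher Neuroticism scores are less desirable.

```python
# Dimension-level cut scores from Appendix A and the composite cut from Table 7.
CUT_SCORES = {"C": 140.39, "E": 120.19, "N": 62.96, "A": 130.77, "O": 106.43}
COMPOSITE_CUT = 434.82


def failed_hurdles(scores):
    """Return the dimensions on which a candidate misses the cut score."""
    failed = []
    for dim, cut in CUT_SCORES.items():
        # Assumption: the Neuroticism hurdle is failed by scoring ABOVE its cut.
        misses = scores[dim] > cut if dim == "N" else scores[dim] < cut
        if misses:
            failed.append(dim)
    return failed


def passes_composite(scores):
    # Compensatory rule: a high score on one dimension can offset a low one.
    total = scores["C"] + scores["E"] + scores["A"] + scores["O"] - scores["N"]
    return total > COMPOSITE_CUT


# Hypothetical candidate: strong overall, slightly below the Agreeableness cut.
candidate = {"C": 160, "E": 130, "N": 55, "A": 128, "O": 115}

print(passes_composite(candidate), failed_hurdles(candidate))
```

This candidate clears the composite cut (478 against 434.82) but is eliminated under the hurdle approach because of a single dimension, which is the sense in which the method is noncompensatory.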

4. Discussion

The current study sought to redress one of the gaps due to the paucity of research examining practical issues for personality testing in personnel selection. The current study examined the impact of different selection decision methods on hiring decisions made based on personality test data. The results of the current study suggested that when using the personality dimensions of the NEO to select employees from a pool of candidates there was no evidence of adverse impact for minorities when using the compensatory top down with fixed/sliding bands or the compensatory cut score selection decision methods (although for the compensatory cut score method, the selection rate for minorities was as low as possible without indicating adverse impact). However, there was evidence of adverse impact for minorities when using the compensatory top down and noncompensatory selection decision methods (although there was only evidence of adverse impact in the compensatory top down method when 50 candidates were selected; there was no evidence of adverse impact when 100 or 150 employees were selected). Regarding differential hiring rates, the selection rates varied based upon the selection decision method invoked; specifically, minority selection rates were highest when using the compensatory top down with sliding bands selection decision method with 150 hires and lowest when using the compensatory top down selection decision method with 50 hires. In sum, the results for minorities demonstrated that both the selection decision method and the number of candidates selected affected adverse impact and hiring rates.

With respect to White females, there was no evidence of adverse impact when using the compensatory top down, compensatory cut score, or noncompensatory selection decision methods. However, there was evidence of adverse impact when using the compensatory top down with fixed/sliding bands selection decision methods with 50 hires. Because the decisions within the bands were made based upon minority-based referral, White females were disadvantaged under the banding methods assessed in cases where only 50 candidates were hired. Regarding differential hiring rates, White female selection rates were highest when using the compensatory cut score selection decision method and lowest when using the compensatory top down with fixed/sliding bands selection decision methods with 50 hires.

The results of the current study suggest that adverse impact and differential hiring rates may or may not be an issue depending on the selection decision method invoked and the number of candidates selected. In sum, the selection decision method that should be used by a hiring organization will depend on the focal goals of the organization during that particular selection process (e.g., to avoid adverse impact, to increase minority selection rates, to increase utility). Furthermore, knowledge of the impact of different selection decision methods will help to shed further light on how to achieve these organizational goals. For example, when attempting to avoid adverse impact against minorities by using a banding method, it is important to be cognizant of the finding that if banding decisions are being made based on selecting minorities first within each band (i.e., an affirmative action policy), then other designated group members (e.g., White females) may be adversely impacted and, moreover, reverse discrimination may also occur. To help circumvent potential issues with affirmative action policies (such as selecting minorities first within test score bands), tactics such as increasing minority representation in more senior positions and providing more career development opportunities for minorities (i.e., diversity management programs; e.g., Williams & Bauer, 1994) should be proactively implemented.

With respect to personality test scores on the NEO, the only significant mean difference in favor of White males was that they were more open to experience than the South Asian/Indo-Pakistani group. All other significant differences were in favor of the minorities and White

Table 8. Noncompensatory

| Group | Total candidate pool (A) | Number of candidates made job offers (B) | Selection rate (B/A) | Adverse impact according to the four-fifths rule? |
|---|---|---|---|---|
| White males | 117 | 29 | .25 | Yes, for minorities |
| Minorities | 249 | 46 | .18 | |
| White females | 23 | 6 | .26 | |

Note. According to the four-fifths rule, the minimum selection rate of White females and minorities must be .20 (4/5 × .25).


females; specifically, White females were more agreeable and open to experience than White males, minorities were more conscientious than White males, and Blacks were more conscientious and agreeable than White males. Because there was minimal evidence of mean differences favoring White males over White females or minorities, these are promising results for including personality tests in the selection process. Specifically, the few differences in the personality dimensions did not, for the most part, negatively impact minorities and White females. However, the aforementioned results by selection decision method and number of candidates selected suggest that the assessment of adverse impact needs to incorporate this information beyond simply comparing mean differences.

4.1. Limitations and future research directions

Although it was beyond the scope of the current paper to test all of the possible combinations of different selection decision methods, some of the most common methods (i.e., compensatory top down, compensatory top down with fixed bands, compensatory top down with sliding bands, compensatory cut score, and noncompensatory) were assessed. Nevertheless, there are various ways of combining assessment scores when using the compensatory method (i.e., regression-, rational-, or unit weighting) that were not investigated in the current study.¹ Moreover, candidate results can also be combined using either the unstandardized (b) or standardized (β) scores or a combination of compensatory and noncompensatory methods (e.g., hurdle on one predictor and compensatory on the other predictors). Future research should seek to assess the adverse impact and differential hiring rates of different selection decision methods that have different combination methods (e.g., regression, rational) and different types of scores being combined (e.g., unstandardized, standardized).
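To make the distinction between weighting schemes concrete, the following sketch compares unit weighting with one possible rational weighting. Everything in it is hypothetical: the two candidates, their standardized scores, the restriction to three dimensions, and the rational weights (Conscientiousness weighted three times as heavily) are chosen only to show that the rank order of candidates can change with the combination method.

```python
def weighted_composite(z_scores, weights):
    """Weighted sum of standardized dimension scores."""
    return sum(weights[dim] * z_scores[dim] for dim in weights)


# Unit weighting: every dimension counts equally (Neuroticism negatively).
unit_weights = {"C": 1.0, "A": 1.0, "N": -1.0}
# Rational weighting: hypothetical judgment-based weights, for illustration only.
rational_weights = {"C": 3.0, "A": 1.0, "N": -1.0}

# Hypothetical standardized scores for two candidates.
candidates = {
    "X": {"C": 1.0, "A": -0.5, "N": 0.0},
    "Y": {"C": 0.0, "A": 1.0, "N": 0.0},
}


def rank(weights):
    """Candidate names ordered from highest to lowest composite."""
    return sorted(
        candidates,
        key=lambda name: weighted_composite(candidates[name], weights),
        reverse=True,
    )


unit_order = rank(unit_weights)
rational_order = rank(rational_weights)
print(unit_order, rational_order)
```

Under unit weights candidate Y ranks first, whereas under the hypothetical rational weights candidate X does, so group selection rates under a top down rule could differ between the two schemes.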

In the current study, cut scores were used because of their prevalence in applied settings. However, the issue of using cut scores to hire from a sample of job candidates has been broached in previous research (e.g., Cooper-Hakim & Viswesvaran, 2002) and the consensus appears to be that the use of continuous measures is generally preferable to using cut scores. The current study found that the compensatory cut score method had a selection rate for minorities that was extremely close to being indicative of adverse impact. Moreover, the cut score calculations in the current study were highly dependent on the reliability of the personality scale dimensions. Overall, setting cut scores in both research and practice is difficult because of the variety of ways that they can be calculated and thus, the impact of using different cut score calculations under different selection decision methods is an avenue for future research.

Socially desirable responding (i.e., faking) is an additional issue that is important to consider with personality tests in personnel selection. Tests utilized in selection contexts often attempt to measure socially desirable responding and thus, many personality tests used in selection contain faking or lie scales. Regarding test validity, induced faking has been associated with lower criterion-related validity (e.g., Jackson, Wroblewski, & Ashton, 2000). Faking research has provided empirical support for the notion that candidates can and do increase their scores in a socially desirable manner when responding to personality assessments (i.e., 'fake good') when motivated to present themselves in a positive manner (e.g., Viswesvaran & Ones, 1999). Moreover, candidates have been found to differ in the extent to which they fake (Donovan, Dwight, & Hurtz, 2003). An implication of candidate faking is that criterion-related validity can be attenuated and that if faking is occurring, it is possible that validity can be increased by accounting for those socially desirable responses. In sum, assessing the impact of faking on the results of personality test data used in hiring decisions under different selection decision methods is a potentially fruitful avenue for future research.

Future research needs to assess how organizations use data from personality measures to make selection-based decisions. Surveying a representative sample of organizations and asking selection professionals if they are using personality measures in their selection processes and if so, what personality measures they are using and how they are using those results to make selection-based decisions will help to elucidate the use of personality measures and testing in contemporary organizations. Explicating how personality assessment data are actually being used in selection processes will be an important complement to the research conducted in the current study. In other words, are organizations making decisions based on a profile of candidates (i.e., across multiple traits), a reduced set of traits, or a range of minimum and maximum trait levels?

Finally, although the current study examined one of the important practical issues pertaining to using personality tests in personnel selection, there are several additional practical issues related to the use of personality measures in the selection process that will need to be addressed by future empirical as well as theoretical investigations. Some of the most important practical issues not fully considered by the current paper and/or the extant literature include the (1) use of cut scores; (2) use of invasive or prohibited items; (3) use of norms or other score adjustments; (4) need for people trained in psychometrics to interpret the data; (5) inclusion of non-job-related personality data; (6) use of item- versus dimension-level data; and (7) use of raw score or percentile results. These areas all represent additional practical issues that need to be addressed by subsequent research endeavors.

4.2. Summary and conclusion

Personality tests have become a common component of many selection systems. However, the research literature has not confirmed how organizations are making hiring decisions based on personality tests in the selection process and the impact of the different decision options. The current paper provides evidence that adverse impact may or may not be an issue depending upon the selection decision method invoked. This paper suggests that the optimal selection decision method should depend on the goals of the organization (e.g., to avoid adverse impact, to increase minority selection rates, to increase utility). Furthermore, this study provides initial evidence of the impact of selection decision methods on the adverse impact and differential hiring rates of minorities and White females. Specifically, organizations seeking to make hiring decisions based on personality test data and seeking to avoid adverse impact against minorities should utilize compensatory top down with fixed/sliding bands or compensatory cut score selection decision methods. Although compensatory top down and noncompensatory selection decision methods may offer hiring utility, organizations should be aware of the possible adverse impact against minorities associated with these methods when using personality test data to make hiring decisions.

In sum, the current study focused on a widely used personality measure (i.e., the NEO) administered to an actual candidate sample that contained a uniquely large number of minority candidates, in a Canadian context. The findings suggest that it is imperative for organizational members who are involved in personnel selection decisions to be aware of the different selection decision methods that are available when selecting job candidates based on personality test data. Moreover, an understanding of the adverse impact and differential hiring rates that can be expected from different selection decision methods is also important for human resource professionals. The time has come to think more carefully about how personality test data are being used for selecting job candidates.

Acknowledgements

The authors would like to thank Chad Hayward, Deborah Powell, and two anonymous reviewers for their helpful comments on earlier versions of this paper.

The views expressed in this paper are those of the authors and not of the organization that commissioned the studies from which the data were obtained.

Note

1. We thank an anonymous reviewer for directing our attention to this important limitation.

References

Alexander, R. A., Barrett, G. V., & Doverspike, D. (1983). An explication of the selection ratio and its relationship to hiring rate. Journal of Applied Psychology, 68, 342–344.

Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1–26.

Bartram, D. (1992). The personality of UK managers: 16PF norms for short-listed applicants. Journal of Occupational and Organizational Psychology, 65, 159–172.

Campion, M. A., Outtz, J. L., Zedeck, S., Schmidt, F. L., Kehoe, J. F., Murphy, K. R., et al. (2001). The controversy over score banding in personnel selection: Answers to 10 key questions. Personnel Psychology, 54, 149–185.

Cascio, W. F., Outtz, J., Zedeck, S., & Goldstein, I. L. (1991). Statistical implications of six methods of test score use in personnel selection. Human Performance, 4, 233–264.

Catano, V. M., Wiesner, W. H., Hackett, R. D., & Methot, L. L. (2005). Recruitment and selection in Canada (3rd ed.). Toronto, ON: Nelson Thompson Learning.

Cooper-Hakim, A., & Viswesvaran, C. (2002). A meta-analytic review of the MacAndrew alcoholism scale. Educational and Psychological Measurement, 62, 818–829.

Costa, P. T. Jr., & McCrae, R. R. (1992). Revised NEO personality inventory (NEO PI-R™) and NEO five-factor inventory (NEO-FFI): Professional manual. Odessa, FL: Psychological Assessment Resources.

Day, D. V., & Silverman, S. B. (1989). Personality and job performance: Evidence of incremental validity. Personnel Psychology, 42, 25–36.

Digman, J. M. (1990). Personality structure: Emergence of the five-factor model. Annual Review of Psychology, 41, 417–440.

Dion, K. L., & Yee, P. H. N. (1987). Ethnicity and personality in a Canadian context. Journal of Social Psychology, 127, 175–182.

Donovan, J. J., Dwight, S. A., & Hurtz, G. M. (2003). An assessment of the prevalence, severity, and verifiability of entry-level applicant faking using the randomized response technique. Human Performance, 16, 81–106.

Equal Employment Opportunity Commission, Civil Service Commission, U.S. Department of Labor, & U.S. Department of Justice. (1978). Uniform guidelines on employee selection procedures. Federal Register, 43, 38290–38309.

Feingold, A. (1994). Gender differences in personality: A meta-analysis. Psychological Bulletin, 116, 429–456.

Foldes, H. J., Duehr, E. E., & Ones, D. S. (2008). Group differences in personality: Meta-analyses comparing five racial groups. Personnel Psychology, 61, 579–616.

Guion, R. M. (1965). Personnel testing. New York: McGraw-Hill.

Guion, R. M. (1998). Assessment, measurement, and prediction for personnel decisions. Mahwah, NJ: Lawrence Erlbaum and Associates.

Guion, R. M., & Gottier, R. F. (1965). Validity of personality measures in personnel selection. Personnel Psychology, 18, 135–164.


Guion, R. M., & Highhouse, S. (2006). Essentials of personnel assessment and selection. Mahwah, NJ: Erlbaum.

Hausdorf, P. A., & Risavy, S. D. (2010). Decision making using personality assessment: Implications for adverse impact and hiring rates. Applied H.R.M. Research, 12, 100–120.

Heller, M. (2005). Court ruling that employer's integrity test violated ADA could open door to litigation. Workforce Management, 84, 74–77.

Hogan, R. T. (1991). Personality and personality measurement. In M. D. Dunnette, & L. M. Hough (Eds.), Handbook of industrial and organizational psychology (Vol. 2, pp. 873–919). Palo Alto, CA: Consulting Psychologists Press.

Hough, L. M. (1998). Personality at work: Issues and evidence. In M. Hakel (Ed.), Beyond multiple choice: Evaluating alternatives to traditional testing for selection (pp. 131–166). Hillsdale, NJ: Erlbaum.

Hough, L. M., Eaton, N. K., Dunnette, M. D., Kamp, J. D., & McCloy, R. A. (1990). Criterion-related validities of personality constructs and the effect of response distortion on those validities. Journal of Applied Psychology, 75, 581–595.

Hunter, J. E., & Schmidt, F. L. (1982). The economic benefits of personnel selection using psychological ability tests. Industrial Relations, 21, 293–308.

Hunter, J. E., Schmidt, F. L., & Rauschenberger, J. M. (1977). Fairness of psychological tests: Implications of four definitions for selection utility and minority hiring. Journal of Applied Psychology, 62, 245–260.

Jackson, D. N., Wroblewski, V. R., & Ashton, M. C. (2000). The impact of faking on employment tests: Does forced choice offer a solution? Human Performance, 13, 371–388.

Locke, E. A., & Hulin, C. L. (1962). A review and evaluation of the validity studies of activity vector analysis. Personnel Psychology, 15, 25–42.

McCrae, R. R., & Costa, P. T. Jr. (1987). Validation of the five-factor model of personality across instruments and observers. Journal of Personality and Social Psychology, 52, 81–90.

Morgeson, F. P., Campion, M. A., Dipboye, R. L., Hollenbeck, J. R., Murphy, K., & Schmitt, N. (2007a). Reconsidering the use of personality tests in personnel selection contexts. Personnel Psychology, 60, 683–729.

Morgeson, F. P., Campion, M. A., Dipboye, R. L., Hollenbeck, J. R., Murphy, K., & Schmitt, N. (2007b). Are we getting fooled again? Coming to terms with limitations in the use of personality tests for personnel selection. Personnel Psychology, 60, 1029–1049.

Newman, D. A., Jacobs, R. R., & Bartram, D. (2007). Choosing the best method for local validity estimation: Relative accuracy of meta-analysis vs. a local study vs. Bayes-analysis. Journal of Applied Psychology, 92, 1394–1413.

Newman, D. A., & Lyon, J. S. (2009). Recruitment efforts to reduce adverse impact: Targeted recruiting for personality, cognitive ability, and diversity. Journal of Applied Psychology, 94, 298–317.

Ones, D. S., & Anderson, N. (2002). Gender and ethnic differences on personality scales in selection: Some British data. Journal of Occupational and Organizational Psychology, 75, 255–276.

Rothstein, M. G., & Goffin, R. D. (2006). The use of personality measures in personnel selection: What does current research support? Human Resource Management Review, 16, 155–180.

Ryan, A. M., & Sackett, P. R. (1987). A survey of individual assessment practices by I/O psychologists. Personnel Psychology, 40, 455–488.

Sackett, P. R., & Roth, L. (1991). A Monte Carlo examination of banding and rank order methods of test score use in personnel selection. Human Performance, 4, 279–295.

Sackett, P. R., & Roth, L. (1996). Multi-stage selection strategies: A Monte Carlo investigation of effects on performance and minority hiring. Personnel Psychology, 49, 549–572.

Salgado, J. F. (1997). The five factor model of personality and job performance in the European community. Journal of Applied Psychology, 82, 30–43.

Salgado, J. F. (1998). Big Five personality dimensions and job performance in army and civil occupations: A European perspective. Human Performance, 11, 271–288.

Schmidt, F. L. (1991). Why all banding procedures in personnel selection are logically flawed. Human Performance, 4, 265–277.

Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262–274.

Schmidt, F. L., Mack, M. J., & Hunter, J. E. (1984). Selection utility in the occupation of U.S. park ranger for three modes of test use. Journal of Applied Psychology, 69, 490–497.

Society for Industrial and Organizational Psychology Inc. (2003). Principles for the validation and use of personnel selection procedures (4th ed.). Bowling Green, OH: Author.

Tett, R. P., Jackson, D. N., & Rothstein, M. (1991). Personality measures as predictors of job performance: A meta-analytic review. Personnel Psychology, 44, 703–742.

Thomas, R. R. (1991). Beyond race and gender: Unleashing the power of your total work force by managing diversity. New York: AMACOM.

Viswesvaran, C., & Ones, D. S. (1999). Meta-analyses of fakability estimates: Implications for personality measurement. Educational and Psychological Measurement, 59, 197–210.

Williams, M. L., & Bauer, T. N. (1994). The effect of managing diversity policy on organizational attractiveness. Group and Organization Management, 19, 295–308.


Appendix A

Table A1. Cut score calculations for the personality dimensions of the NEO

| Personality dimension | Mean score for those who passed trainingᵃ | Standard error of measurement (SEM); calculation = SD × √(1 − reliability) | Cut score (mean − SEM)ᵇ |
|---|---|---|---|
| Conscientiousness | 145.95 (SD = 16.76) | 5.56 | 140.39 |
| Extraversion | 126.62 (SD = 14.51) | 6.43 | 120.19 |
| Neuroticism | 56.95 (SD = 20.16) | 6.01 | 62.96 |
| Agreeableness | 137.02 (SD = 15.47) | 6.25 | 130.77 |
| Openness to Experience | 112.97 (SD = 14.65) | 6.54 | 106.43 |

Note. For the SEM calculations, the SDs came from the total sample that completed the personality dimension: Conscientiousness = 16.05, Extraversion = 14.04, Neuroticism = 18.99, Agreeableness = 16.15, and Openness to Experience = 13.35. Reliabilities were: Conscientiousness = .88, Extraversion = .79, Neuroticism = .90, Agreeableness = .85, and Openness to Experience = .76.

ᵃThose who failed training did so for reasons such as failing written/practical tests or receiving unsatisfactory progress reports.

ᵇExcept for Neuroticism, where the SEM was added to the mean because higher Neuroticism scores are less desirable.
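The values in Table A1 can be recomputed from the means, total-sample SDs, and reliabilities reported above. The sketch below is offered only as a check on the arithmetic; the function and variable names are assumptions made here and are not part of the original analysis.

```python
import math

# dimension: (mean of those who passed training, total-sample SD, reliability)
DIMENSIONS = {
    "Conscientiousness": (145.95, 16.05, 0.88),
    "Extraversion": (126.62, 14.04, 0.79),
    "Neuroticism": (56.95, 18.99, 0.90),
    "Agreeableness": (137.02, 16.15, 0.85),
    "Openness to Experience": (112.97, 13.35, 0.76),
}


def cut_score(mean, sd, reliability, higher_is_better=True):
    """SEM = SD * sqrt(1 - reliability); cut score = mean minus (or plus) SEM."""
    sem = sd * math.sqrt(1 - reliability)
    # For Neuroticism the SEM is added because higher scores are less desirable.
    cut = mean - sem if higher_is_better else mean + sem
    return sem, cut


results = {
    name: cut_score(mean, sd, rel, higher_is_better=(name != "Neuroticism"))
    for name, (mean, sd, rel) in DIMENSIONS.items()
}

for name, (sem, cut) in results.items():
    print(f"{name}: SEM = {sem:.2f}, cut score = {cut:.2f}")
```

Rounded to two decimals, the output agrees with the SEM and cut score columns of Table A1.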


Copyright of International Journal of Selection & Assessment is the property of Wiley-Blackwell and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.