Measurement
Outline
Hiring the Best Call Center Workers at Xerox
What Is Measurement?
Describing and Interpreting Data
  Types of Measurement
  Scores
  Shifting the Normal Curve
Using Data Strategically
  Correlations
  Interpreting Correlations
  Regressions
What Are the Characteristics of Useful Measures?
  Reliability
  Standard Error of Measurement
  Validity
  Using Existing Assessment Methods
  Selection Errors
  Develop Your Skills: Assessment Tips
  Standardization and Objectivity
Creating and Validating an Assessment System
  Benchmarking
  Evaluating Assessment Methods
Hiring the Best Call Center Workers at Xerox
Summary
Chapter Supplement
LEARNING OBJECTIVES
After studying this chapter, you should be able to:
• Explain why measurement and assessment are important to staffing.
• Describe patterns in data.
• Understand correlation and regression and explain how each is used.
• Define practical and statistical significance and explain why they are important.
• Define reliability and validity and explain how they affect the evaluation of a measure.
• Explain why standardization and objectivity are important in measurement.
Headquartered in Norwalk, Connecticut, Xerox is the world's leading provider of business process and document management services. The company employs people in 160 countries.¹ Xerox knows that its call center workers are key to its business strategy execution and provide high-quality service to customers. For years, the company considered previous call center experience an important screening factor in staffing these important positions. The company has never evaluated the appropriateness of that selection criterion, however, or whether it is the best screening method for its call center workers. The company also wonders whether a better selection system could improve talent quality and reduce turnover in its call centers. Imagine that the company asks you for advice. After studying this chapter, you should have some good ideas to share with the company.
Chapter 8 • Measurement
Effectively using data analytics to make decisions is a cornerstone of firm success.² Data analytics refers to the process of using data and analytical systems to arrive at optimal decisions, including statistical analyses of data. Data analytics can help organizations ensure they have a high-quality and talented workforce that can execute strategy and meet business objectives. Data analytics can also inform hiring, training, and succession planning decisions, and is critical when considering strategies for downsizing and layoffs. Some have argued that Google's success is due in large part to its reinvention of talent management through people analytics.⁴ Similarly, there recently has been a lot of talk about "Big Data," which refers to the exponential growth of data that is high volume, velocity, and variety, and which enables the detection and prediction of trends.⁵ Big data can be used for conducting data analytics to build templates for identifying the best talent in the workforce and predicting the best future hires.⁶ However, the use of data analytics and big data doesn't alter the importance of having quality data, and it highlights the need to understand how to interpret the statistics from that data. It also requires heightened awareness of the potential for spurious relationships.⁷
DATA ANALYTICS: the process of using data and analytical systems to arrive at optimal decisions, including statistical analyses of data
One of the most important activities involved in strategic staffing is the process of using data to make selection decisions. Creating a high-quality workforce depends on the accurate selection of employees who will best fit the organization's strategy, culture, and position requirements. If you hire poorly, then the organization will suffer regardless of how good its strategic plans are. Improperly assessing and measuring candidate characteristics can lead to systematically hiring the wrong people, offending and losing good candidates, and even exposing your company to legal action. In contrast, consistently hiring the most talented, motivated, and passionate employees who best fit your organization will result in a competitive edge that cannot be easily duplicated or replaced. Accurately selecting employees requires that you collect, measure, and interpret data relevant to your firm's strategic staffing efforts.
In this section of the book, Selecting, we turn our attention to these issues and the use of data to evaluate job applicants. After people have applied for jobs, we need to assess them in ways that allow us to compare them and make hiring decisions. We need to understand some of the basic principles of measurement so we can decide which candidates are best suited for jobs. That is the purpose of this chapter. Measurement pertains to other aspects of the staffing system as well. We will discuss more about these aspects in Chapter 13.
The information-technology services company TEKsystems knows that accurate measurements are essential to making good employee selection decisions. The company's clients evaluate the speed and success of the staffing consultants it provides. Because TEKsystems differentiates itself based on quality, it is critical that it hires people with the skills that meet its clients' needs. To measure the capabilities of its job candidates, the company carefully assesses applicants' skills at the prescreening stage as well as later in the hiring process. These assessments require the company to collect high-quality data.⁸
Likewise, global staffing firm PeopleAnswers uses systematic, objective, and data-driven scientific methods to help clients hire employees who perform better and stay longer. Curt Gray, senior vice president of HR at AmeriPride, one of the five largest uniform rental and linen supply companies in the United States, states they were able to work with PeopleAnswers to reduce turnover by 50 percent using screening and assessment technologies developed using data analytics. The cornerstone of using data effectively is collecting high-quality information on the right predictors and outcomes.⁹
MEASUREMENT: the process of assigning numbers according to some rule or convention to aspects of people, jobs, job success, or aspects of the staffing system
PREDICTIVE DATA: information about measures used to make projections about outcomes
CRITERION DATA: information about important outcomes of the staffing process
WHAT IS MEASUREMENT?
Evaluating people, processes, or outcomes requires collecting data, which requires measurement. In staffing, measurement is the process of assigning numbers according to some rule or convention to aspects of people, jobs, job success, or aspects of the staffing system.¹⁰ The measures can be used to select individual employees or to assess the staffing process in general. Interview ratings, knowledge test scores, and performance data are measures used to select employees. Measures relevant to the staffing process in general assess the following: (a) characteristics of the job (which enables the firm to create job requirements and job rewards matrices), (b) aspects of the staffing system (such as the number of days a job posting is run, where it is run, and the nature of the recruiting message), (c) characteristics of job candidates (such as personality or ability), and (d) staffing outcomes (such as performance or turnover). These types of measures enable the firm to improve its staffing system by identifying important patterns and relationships that can be used to make decisions or to design assessments and interventions.
The numerical outcomes of measurement are data. Strategic staffing centers around collecting and analyzing the relationship between two types of data:¹¹
1. Predictive data is information used to make projections about outcomes. For example, what data might you collect to predict turnover or job performance? Similarly, could you measure the conscientiousness of job candidates to see if it predicts some component of job success? This is predictive data. In terms of the general staffing system, predictive data can come from any part of the hiring process and can include information on sourcing quality, the basic qualifications of applicants, and their traits, competencies, and values.
2. Criterion data is information about important outcomes of the staffing process. Traditionally, this data includes measuring the job success of employees. More generally, criterion data should also include all outcome information relevant to the evaluation of the effectiveness of the staffing system against its goals. This can include measuring the company's return on investment related to its staffing measures, employee job success, time-to-hire, time-to-productivity, promotion rates, turnover rates, and new hire fit with company values.
In short, criterion data is information about outcomes of the staffing or selection process. Predictive data gives you information about the possible predictors of those outcomes.
Once you have collected data, you have to do something with it. Next, we discuss basic tools and techniques for describing and interpreting data, followed by a discussion of the characteristics of useful measures. It is critical to note that it is nearly pointless to analyze and interpret data of poor quality. The data must be accurate, dependable, and relevant to be worth collecting.
DESCRIBING AND INTERPRETING DATA
When a measure, such as an assessment test, is administered to job candidates, the data needs to be interpreted before it can be useful in making hiring decisions. Describing the scores within a distribution is important for interpreting what they mean for the entire group of candidates, as well as understanding the significance of any one particular score.
Types of Measurement
The tools you can use to describe and interpret the data depend on the level of measurement. The data can come from nominal, ordinal, interval, or ratio measures.
NOMINAL In nominal measurement, numbers are assigned to discrete labels or categories. No ordering is implied in the assigned values. Gender, race, and college major are examples of nominal measures. You could assign a "0" to males and a "1" to females to create a nominal measure for gender.
ORDINAL In ordinal measurement, attributes are ranked by assigning numbers in ascending or descending order. For example, the first person finishing a work sample task might receive a "1," the second person a "2," and so on. Ordinal measures don't tell you anything about the distance between the scores, though, just their rank.
INTERVAL In interval measurement, the distance between scores has meaning. The distance from 40 to 50 degrees in Fahrenheit is the same as the distance from 80 to 90 degrees. Thus, the interval is constant. However, the zero point on interval measures is arbitrary, so ratios computed using two different measures of the same attribute will not yield the same result. For example, 100 degrees Fahrenheit is twice that of 50 degrees Fahrenheit, but when converted to Celsius the 2:1 ratio doesn't hold. Examples of interval measurement in selection may include intelligence scores, personality assessment scores, and scoring keys for interview questions.
RATIO Ratio measurement includes a true and meaningful zero point. Thus, you can construct ratios from the measure. Salary, weight, height, typing speed, and sales per month are examples of ratio-level measures. If one person can lift 200 pounds and another 100 pounds, then the first person can lift twice as much as the second person whether the weight is in grams or pounds. Thus, the ratio holds because there is a true zero point. In a selection context, years of experience is a ratio measure because ratios will hold whether time is measured in years, minutes, or seconds.
The distinctions among the different types of measures are important because they influence how you can describe and interpret data. For example, it is generally not useful to compute an average of ordinal scores.
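The difference between interval and ratio measures can be checked numerically. The following Python sketch is illustrative only; the function name `fahrenheit_to_celsius` and the constant `GRAMS_PER_POUND` are our own labels, not part of the chapter. It reuses the temperature and lifting examples above.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a Fahrenheit temperature to Celsius."""
    return (deg_f - 32) * 5 / 9


GRAMS_PER_POUND = 453.59237  # illustrative constant for the unit change

# Interval scale: a 2:1 ratio in Fahrenheit does not survive conversion.
ratio_f = 100 / 50
ratio_c = fahrenheit_to_celsius(100) / fahrenheit_to_celsius(50)

# Ratio scale: a 2:1 ratio in pounds is still 2:1 in grams.
ratio_lb = 200 / 100
ratio_g = (200 * GRAMS_PER_POUND) / (100 * GRAMS_PER_POUND)

# Interval scale: equal distances remain equal after conversion.
gap_low = fahrenheit_to_celsius(50) - fahrenheit_to_celsius(40)
gap_high = fahrenheit_to_celsius(90) - fahrenheit_to_celsius(80)

print(f"Temperature ratio: {ratio_f:.2f} (F) vs {ratio_c:.2f} (C)")
print(f"Weight ratio:      {ratio_lb:.2f} (lb) vs {ratio_g:.2f} (g)")
print(f"10-degree F gaps in Celsius: {gap_low:.2f} and {gap_high:.2f}")
```

The 2:1 Fahrenheit ratio becomes roughly 3.8:1 in Celsius, while the lifting ratio stays at 2:1 in either unit. The two 10-degree Fahrenheit gaps convert to equal Celsius gaps, which is what makes temperature an interval measure.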
Scores
The process of assigning numerical values during measurement is scoring. In order to interpret scores properly, we need to understand the scoring system used. Data is often presented in terms of numerical scores, such as raw scores, standard scores, and percentile scores, which we discuss next.
RAW SCORES Raw scores are the unadjusted scores on a measure. On a job knowledge test, the raw score might represent the number of items answered correctly. For measures such as personality inventories that have no "right" or "wrong" answers, the raw score may represent the number of positive responses for a particular trait. Raw scores do not provide much useful information by themselves. Consider your score on a midterm. If you get 30 out of 50 questions correct, it is hard to know whether this is a good or a poor score. You may believe 30 is a poor score, but if you compare the results to the results of other people who took the same test, you may discover that 30 is the highest score. For criterion-referenced measures, or standards-based assessments, the scores have meaning in and of themselves. For example, candidates might be expected to exceed a certain level on a criterion measure, such as typing at least 90 words per minute, before they can advance to the next stage of the hiring process.
On criterion-referenced measures it is easy to see what a particular score indicates about proficiency or competence. In general, scores on norm-referenced measures have meaning only when compared to the scores of others. For example, candidates who reach a certain norm-referenced measure (for example, who score in the top third of their applicant group on a typing test) would advance to the next stage of the hiring process. Converting raw scores into standard scores (or percentiles), as we describe next, provides you with the kind of comparative information you need to use a norm-referenced measure.
NORMAL CURVE Many human characteristics, such as height, weight, math ability, and typing skill, are distributed in the population in a typical pattern known as the normal curve. In other words, the characteristics display a symmetrical bell-shaped appearance like the one shown in Figure 8-1. The distribution of scores under the normal curve is called the normal distribution. As you can see, a large number of individual cases cluster in the middle of the curve. The farther from the middle (or average) you go, the fewer the cases. Many distributions of scores follow the same normal curve pattern. Most individuals get scores in the middle range, near average. As score extremes are approached, fewer and fewer cases exist, indicating that progressively fewer individuals get lower scores (represented by the left tail of the curve) or higher scores (represented by the right tail of the curve). Other distributions are possible, but we will focus on the normal distribution because it is one of the most commonly used.
FIGURE 8-1 The Normal Curve Illustrating Standard Scores and Percentiles
  Percent of cases under portions of the curve: 0.13, 2.14, 13.59, 34.13, 34.13, 13.59, 2.14, 0.13
  Standardized z score: −3, −2, −1, 0, +1, +2, +3
  Percentiles: 0.13, 2.28, 15.87, 50.00, 84.13, 97.72, 99.87
  Test X: 35, 40, 45, 50, 55, 60, 65
  Test Y: 40, 60, 80, 100, 120, 140, 160
NOMINAL MEASUREMENT: a measurement in which numbers are assigned to discrete labels or categories
ORDINAL MEASUREMENT: a measurement in which attributes are ranked by assigning numbers in ascending or descending order
INTERVAL MEASUREMENT: a measurement in which the distance between scores on an attribute has meaning
RATIO MEASUREMENT: a measurement in which the distance between scores has meaning; it includes a true and meaningful zero point
RAW SCORES: the unadjusted scores on a measure
CRITERION-REFERENCED MEASURES: measures in which the scores have meaning in and of themselves
NORM-REFERENCED MEASURES: measures in which the scores have meaning only when compared to the scores of others
NORMAL CURVE: a curve representing the bell-shaped symmetrical distribution of some factor
NORMAL DISTRIBUTION: the distribution of scores under the normal curve
PERCENTILE SCORE: a raw score that has been converted into an expression of the percentage of people who score at or below that score
CENTRAL TENDENCY: the midpoint, or center, of the data
MEAN: a measure of central tendency reflecting the average score
MEDIAN: the middle score, or the point below which 50 percent of the scores fall
MODE: the most commonly observed score
PERCENTILE SCORE A percentile score is a raw score that has been converted into an expression of the percentage of people who score at or below that score. For example, in Figure 8-1, a score of 55 on Test X or 120 on Test Y would place a person at about the 84th percentile. This means that 84 percent of the people taking the test scored at or below this individual's score.
The second horizontal line below the curve in Figure 8-1 labeled "Percentiles" represents the distribution of scores in percentile units. By knowing the percentile score of an individual, you already know how that individual compares with others in the group. An individual at the 98th percentile scored the same as or better than 98 percent of the individuals in the group. This is approximately equivalent to getting a raw score of 60 on Test X or 140 on Test Y.
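The percentile values quoted above can be reproduced with a short calculation. The sketch below uses Python's standard `statistics.NormalDist` and assumes that Test X scores are exactly normal with a mean of 50 and a standard deviation of 5, and Test Y scores with a mean of 100 and a standard deviation of 20, as read from Figure 8-1. The helper names and the small list of midterm scores are hypothetical.

```python
from statistics import NormalDist

# Assumed score distributions, read from Figure 8-1.
test_x = NormalDist(mu=50, sigma=5)
test_y = NormalDist(mu=100, sigma=20)


def normal_percentile(dist: NormalDist, raw_score: float) -> float:
    """Percent of test takers expected to score at or below raw_score."""
    return 100 * dist.cdf(raw_score)


def empirical_percentile(scores: list, raw_score: float) -> float:
    """Percent of observed scores at or below raw_score (no normality assumed)."""
    at_or_below = sum(1 for s in scores if s <= raw_score)
    return 100 * at_or_below / len(scores)


p_x55 = normal_percentile(test_x, 55)    # one standard deviation above the mean
p_y120 = normal_percentile(test_y, 120)  # also one standard deviation above
p_x60 = normal_percentile(test_x, 60)    # two standard deviations above

# Hypothetical midterm scores out of 50, where 30 is the top score.
midterm_scores = [18, 22, 25, 27, 30]
p_midterm = empirical_percentile(midterm_scores, 30)

print(f"Test X, score 55:  {p_x55:.2f}")
print(f"Test Y, score 120: {p_y120:.2f}")
print(f"Test X, score 60:  {p_x60:.2f}")
print(f"Midterm, score 30: {p_midterm:.0f}")
```

Scores of 55 on Test X and 120 on Test Y land at the same percentile because both sit one standard deviation above their test's mean. The empirical version shows the same idea applied directly to a small applicant group, without assuming a normal curve.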
Percentiles can yield useful information. Assume you want to make highly competitive job offers. You can use data sources such as the Bureau of Labor Statistics (BLS), which typically reports the 10th, 25th, 50th, 75th, and 90th percentiles in the distribution of salaries for a given occupation. If you wish to pay salaries at the top 10 percent of the distribution, then you can use the BLS's percentiles to figure out how much you should pay.
You can also collect salary information from within your firm to determine what might be a competitive job offer. Assume you collected data from all your employees on their current salary levels. How might you describe the data? We discuss this next.
CENTRAL TENDENCY Central tendency describes the midpoint, or center, of the data. Typical measures of central tendency include the mean, median, and mode. The mean is a measure of central tendency reflecting the average score. For example, you could compute the average salary and then pay at or above this level to be competitive. The median is the middle score, or the 50th percentile, which is the point below which 50 percent of the scores fall. The mode is the most commonly observed score.
If scores are normally distributed, as they are in Figure 8-1, then the mean, median, and mode are in the same position. This is not always the case if scores are not normally distributed. In the case of data on annual pay, the distribution could have most employees at the left (lower end) of the range and relatively few employees at the high end. This is not a normal distribution. This is regularly observed in organizations, and it is called positive skew. In this case, the mode would be to the left, near the bulk of the distribution, because it is the most commonly observed score. The mean would be shifted to the right due to the high annual pay for a limited number of employees. The median would be somewhere in between. In this case, the firm might extend different job offers depending on which measure it used to describe the data. In labor disputes, it is not uncommon for managers to use average pay as a reference point (indicating higher pay across employees) whereas unions use the mode or median as a reference point (indicating lower pay across employees).
Alternatively, you might see a bimodal distribution, or a distribution with two modes, for annual pay. The mean and median would fall between the two modes, but neither would be representative of true compensation levels because there are probably two separate employee groups represented in the data. Perhaps the two modes represent some employees who are paid on a salary basis and other employees who are paid on an hourly basis. In this case, you might use one of the two modes to determine your competitive offer for compensation, or you could compute the mean, median, and mode separately for the two groups.
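A small numerical example may help show how the three measures of central tendency separate under positive skew and what a bimodal distribution looks like. The salary figures below are hypothetical values chosen for illustration; only the functions from Python's standard `statistics` module are real.

```python
from statistics import mean, median, mode, multimode

# Hypothetical annual salaries in thousands of dollars: most employees
# sit at the low end and a few are paid far more (positive skew).
salaries = [40, 40, 40, 45, 45, 50, 55, 60, 120, 250]

avg = mean(salaries)          # pulled to the right by the two large salaries
mid = median(salaries)        # middle of the ordered list
most_common = mode(salaries)  # most frequently observed value

print(f"mode={most_common}, median={mid}, mean={avg}")

# Hypothetical bimodal pay data: an hourly group and a salaried group.
two_groups = [30, 30, 30, 32, 35, 70, 72, 72, 72, 75]
modes = multimode(two_groups)           # every value tied for most frequent
overall_mean = mean(two_groups)         # falls between the two clusters
hourly_mean = mean(two_groups[:5])      # describe each group separately
salaried_mean = mean(two_groups[5:])

print(f"modes={modes}, overall mean={overall_mean}")
print(f"hourly mean={hourly_mean}, salaried mean={salaried_mean}")
```

With the skewed data the mode is lowest, the mean is highest, and the median falls between them, which is the ordering described above. In the bimodal data the overall mean describes neither group well, which is why summarizing the two groups separately is often more informative.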
VARIABILITY Variability describes the "spread" of the data around the midpoint. If you were told that an applicant scored 76 out of 100 points on a work sample test, what would you think? It's hard to know what to conclude without more information. What if you were told the mean was 70? This additional information helps because you can tell that the applicant did better than average, but you are still missing important information. How much better or worse did the applicant actually fare? To answer this, you need to know the variability of scores. What would you think if the lowest score was 64 and the highest was 76? What if you were told the lowest score was 40 and the highest 100? Knowing the variability of a distribution changes your interpretation of scores.
VARIABILITY: a measure that describes the "spread" of the data around the midpoint
There are a number of alternative measures of variability, but typical measures include the range, variance, and standard deviation. The range is the difference between the highest and lowest observed scores. The range is highly influenced by any single extreme score (an outlier), so it may not effectively represent the true variability in the data. Other measures of variability such as the variance and standard deviation are less affected by outliers. The variance is a mathematical measure of the spread based on squared deviations of scores from the mean. You can find the formula for variance in the supplement at the end of this chapter. The standard deviation is conceptually similar to the average distance from the mean of a set of scores. It is the positive square root of the variance. A data set with a larger standard deviation has scores with more variance and typically a larger range. For example, if the average score on a measure was 70, and the standard deviation was 3, the scores would be more tightly clustered around the mean than if the standard deviation was 15. If all the scores were the same, the standard deviation would be 0. If everyone scores the same on a measure, the measure isn't useful in predicting job performance or deciding who to hire. You can see in Figure 8-1 that the range and standard deviation are smaller for Test X than they are for Test Y.
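The work sample example above can be made concrete with two hypothetical sets of scores that share a mean of 70 but differ in spread. The sketch below uses the sample (n − 1) versions of variance and standard deviation from Python's `statistics` module; whether the chapter supplement uses the sample or the population formula is not shown here, so treat that choice as an assumption.

```python
from statistics import mean, stdev, variance

# Hypothetical work sample scores; both groups have a mean of 70.
tight_scores = [64, 67, 69, 70, 71, 73, 76]   # lowest 64, highest 76
wide_scores = [40, 55, 65, 70, 75, 85, 100]   # lowest 40, highest 100


def describe_spread(scores):
    """Return the mean plus three common measures of variability."""
    return {
        "mean": mean(scores),
        "range": max(scores) - min(scores),
        "variance": variance(scores),  # sample variance (divides by n - 1)
        "std_dev": stdev(scores),      # positive square root of the variance
    }


tight = describe_spread(tight_scores)
wide = describe_spread(wide_scores)

for label, summary in (("tight", tight), ("wide", wide)):
    print(
        f"{label}: mean={summary['mean']}, range={summary['range']}, "
        f"variance={summary['variance']:.1f}, std_dev={summary['std_dev']:.1f}"
    )

# If everyone scores the same, there is no spread at all.
no_spread = stdev([70, 70, 70])
```

A score of 76 is the top score in the tightly clustered group but only modestly above average in the widely spread group, even though both groups have the same mean.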
STANDARD SCORES Standard scores are converted raw scores that indicate where a person's score lies in comparison to a referent group. A common standard score is a z score, which measures the distance of a score from the mean in standard deviation units. There are three determinants of a z score: the raw score and the mean and standard deviation of the entire set of scores.
Look at Figure 8-1. Test X and Test Y have different raw score means. Notice that Test X has a mean of 50 and Test Y has a mean of 100. If an individual got a score of 65 on Test X, that person did very well. However, a score of 65 on Test Y would be a poor score. Raw scores often carry limited information by themselves.
Figure 8-1 shows the percent of cases 1, 2, and 3 standard deviations above the mean and 1, 2, and 3 standard deviations below the mean. As you can see, 34 percent of the cases lie between the mean and +1 standard deviation, and 34 percent of the cases lie between the mean and −1 standard deviation. Thus, approximately 68 percent of the cases lie between −1 and +1 standard deviations. Note that for Test X, the standard deviation is 5, and 68 percent of the test takers scored between 45 and 55. For Test Y, the standard deviation is 20, and 68 percent of the test takers scored between 80 and 120.
A z score is calculated by subtracting the referent group's mean from the target individual's raw score, and dividing the difference by the measure's standard deviation in the referent group. The resulting standardized score indicates how many standard deviations the individual's score is above or below the mean of the referent group. It can be seen in Figure 8-1 that approximately 84 percent of scores fall below a z score of +1, whereas nearly 100 percent of people fall below a z score of +3. The simple formula for a z score is
$$z_{\text{score}} = \frac{\text{Individual's raw score} - \text{Referent group mean}}{\text{Referent group standard deviation}}$$
A z score is negative when the target individual's raw score is below the referent group's mean, and positive when the target individual's raw score is above the referent group's mean.
To compare candidates, we often need a single overall score that represents each candidate's combined performance on all of the assessment methods used to evaluate them. Combining a candidate's raw scores on two or more measures that use different scoring systems is difficult. Imagine an assessment system that evaluates candidates using an interview scored on a 1-to-10 scale and a job knowledge test that is scored on a 0-to-100 scale. Simply averaging the two scores would give disproportionate weight to one of the tests, depending on the mean and standard deviation. Standardizing both scores by converting them to z scores allows them to be easily combined, as shown in Table 8-1.
RANGE: the difference between the highest and lowest observed score
OUTLIER: a score that is much higher or lower than most of the scores in a distribution
VARIANCE: a mathematical measure of the spread based on squared deviations of scores from the mean
STANDARD DEVIATION: the positive square root of the variance; it is conceptually similar to the average distance from the mean of a set of scores
STANDARD SCORES: converted raw scores that indicate where a person's score lies in comparison to a referent group
Z SCORE: a standard score that indicates the distance of a score from the mean in standard deviation units
In Table 8-1, the interview scores have a range of 15 to 22, a mean of 18.25, and a standard deviation of 3. The job knowledge test has a range of 69 to 87, a mean of 78.25, and a standard deviation of 7.46. If you subtract the mean from each raw score and divide by the standard deviation, you will obtain the standard score. For Felix, the calculations would be (15 − 18.25)/3 = −1.1 and (87 − 78.25)/7.46 = 1.2 (after rounding).
Although meaningfully combining the raw scores would be difficult, combining the z scores is easy and results in a single number reflecting how each candidate did on both of the assessment methods relative to the other candidates. In this case, Sue's outstanding interview score allowed her to overcome her slightly below-average job knowledge test score to be the candidate with the highest overall score. If a company wants to weight multiple assessment methods differently, each standard score can be multiplied by the desired weighting percentage. For example, the formula for weighting the interview score 60 percent and the job knowledge test score 40 percent would be
$$\text{Overall score} = (0.6 \times z_{\text{interview}}) + (0.4 \times z_{\text{job knowledge test}})$$
For Felix, this would be (0.6 × −1.1) + (0.4 × 1.2) = −0.18 rather than the 0.1 he received when the interview and knowledge test were equally weighted.
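The standardizing and weighting steps above can be traced in a few lines of code. This Python sketch uses the raw scores as read from Table 8-1 and rounds each z score to one decimal before combining, which mirrors how the table reports them. The function names `z_scores` and `overall_scores` are our own, and the use of the sample standard deviation is an assumption that happens to reproduce the 3 and 7.46 quoted in the text.

```python
from statistics import mean, stdev

# Raw scores as read from Table 8-1.
candidates = ["Felix", "Sue", "Lin", "Pierre"]
interview_raw = [15, 22, 19, 17]
knowledge_raw = [87, 77, 69, 80]


def z_scores(raw_scores, decimals=1):
    """Standardize raw scores against their own group, rounded as in the table."""
    group_mean = mean(raw_scores)
    group_sd = stdev(raw_scores)  # sample standard deviation (assumption)
    return [round((score - group_mean) / group_sd, decimals) for score in raw_scores]


def overall_scores(z_a, z_b, weight_a, weight_b):
    """Combine two lists of z scores using the given weights."""
    return {
        name: round(weight_a * a + weight_b * b, 2)
        for name, a, b in zip(candidates, z_a, z_b)
    }


z_interview = z_scores(interview_raw)
z_knowledge = z_scores(knowledge_raw)

equal = overall_scores(z_interview, z_knowledge, 1, 1)         # simple sum
weighted = overall_scores(z_interview, z_knowledge, 0.6, 0.4)  # 60/40 weighting
best = max(equal, key=equal.get)

for name in candidates:
    print(f"{name}: equal={equal[name]}, weighted={weighted[name]}")
print(f"Highest overall score (equal weights): {best}")
```

Run as written, the sketch gives Felix 0.1 under equal weighting and −0.18 under the 60/40 weighting, matching the worked figures in the text, and Sue has the highest overall score. Skipping the one-decimal rounding changes some totals slightly (Lin's equal-weight total, for instance), so the rounding step matters if you want to match the table exactly.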
Shifting the Normal Curve
When making candidate selection decisions, it is often assumed that the distribution of applicants' fit with the job reflects the normal curve as depicted by the current talent pool shown in Figure 8-2. If this is true, then a large burden is placed on the selection system to accurately identify which candidates fall to the far right of the curve (the best hires). In practice, however, many of the most desirable people for a position are not in the applicant pool at all. The most talented and competent people are often successfully employed because they are usually being promoted and rewarded for the work they do. As a result, most of these people are semi-passive job seekers at best. Without an effective sourcing and recruiting process, they will not apply to your firm. For example, during the 2008 economic downturn, it was difficult to get passive job seekers with deep experience and a proven track record in advertising to apply for positions at other companies.¹²
TABLE 8-1 Converting Raw Scores to Standard Scores
Candidate   Interview Score      Job Knowledge Test Score   Overall Score
            Raw     Standard     Raw     Standard           Standard Units
Felix       15      −1.1         87       1.2               (−1.1 + 1.2) = 0.1
Sue         22       1.3         77      −0.2               (1.3 − 0.2) = 1.1
Lin         19       0.3         69      −1.2               (0.3 − 1.2) = −0.9
Pierre      17      −0.4         80       0.2               (−0.4 + 0.2) = −0.2
FIGURE 8-2 Shifting the Normal Curve (curves shown: A, Active Job Seekers; Current Talent Pool; B, Passive Job Seekers)
A passive sourcing approach can result in a distribution of applicants that is shifted to the left, or lower end, as depicted by distribution A. An alternative way of looking at this is to think about the role that sourcing and recruiting play in terms of shaping the qualifications of the applicant pool. If done strategically, sourcing and recruiting can discourage applicants who are a poor fit from applying, and increase the number of high-quality passive and semi-passive candidates who do apply. In this case, the distribution would be shifted to the right, as depicted by distribution B. This recruiting and sourcing approach would yield candidates of higher quality, clearly reducing the burden on the selection system to identify the best candidates. This can significantly increase the likelihood of hiring excellent employees.
Describing and interpreting data is part of the process of using data strategically. Strategic staffing is further enhanced when you can use data to understand relationships between measures and variables. In particular, if you can identify predictors of desired staffing outcomes, then this can lead to new selection tools and interventions, such as recruiter training. The next section explains how you can assess the relationship between predictors and outcomes.
USING DATA STRATEGICALLY
Correlations
A correlation indicates the strength of a linear relationship between two variables. If people who score higher on a measure tend to perform better on the job, or if people who score higher on a measure perform lower on the job, scores and job performance are said to be correlated. A correlation coefficient, also called "Pearson's r" or the "bivariate correlation," is a single number that ranges from −1 to +1; it reflects the direction (positive or negative) and magnitude (the strength) of the relationship between two variables. A value of r = 0 indicates that the values of one measure are unrelated to the values of the other measure. A value of r = +1 means that there is a perfectly linear, positive relationship between the two measures. In other words, as the value of one of the measures increases, the value of the other measure increases by an exactly determined amount. By contrast, a value of r = −1 means that there is a perfectly negative (inverse) relationship between the two measures. In other words, as the value of one of the measures increases, the value of the other variable decreases by an exactly determined amount.
The information provided by correlations is useful for making staffing decisions. The health care literature is full of studies that document the positive correlation between patient outcomes and proper staffing in health care organizations, for example.¹³ You can find the formula for correlation in the supplement to this chapter along with the formula for Excel. Correlations can be easily computed using spreadsheets or software such as Microsoft Excel, SAS, or SPSS. In practice, we rarely see correlations remotely approaching +1 or −1. Even the correlation between people's height and weight is typically less than .80. In staffing contexts, we rarely have such precisely measured and highly correlated data. Measurement error, to be discussed later, also reduces the magnitude of the correlations we observe. In addition, restricting the variability of our applicant and hired pools can also reduce the size of the correlations we observe.
11ie 1ypica1 value s we might sec in staffing con texts arc + .30 or -:J~. Alth~ugh much lo...,~r than the theon::tical maximum and minimum , these values can result in s1gmfica11t 1~pro\•c m~nts m the quality of hires. Unstructured interview s, one of the most commonly used se1ccuon tech~iqucs. often have a correlation of +.20 or less with job pcrfonnancc. A well-structured perso~a11~y tcs1 can have a correlatio n of +. 30 with job performance. Thus, using sue~ a test have a s1gmflcant llOSitivc economic impact on an organization by significantl y irnprovmg the hinng process.
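The chapter supplement gives the formula itself; as a rough illustration of what a spreadsheet or statistics package does when it reports Pearson's r, the calculation can be sketched in a few lines of Python. The function name `pearson_r` and the six data points are hypothetical and are not taken from the chapter.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of each pair's deviations from the two means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable never varies")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: test scores and job performance ratings for six hires
test_scores = [3, 5, 6, 7, 8, 9]
performance = [2, 4, 5, 5, 7, 6]
r = pearson_r(test_scores, performance)
print(round(r, 2))
```

A result near +1 or -1 would indicate a nearly perfect linear relationship, and a result near 0 would indicate almost none. The check for zero variance matters in staffing data: if everyone in the hired pool has the same score, no correlation can be computed at all.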
CORRELATION: the strength of a linear relationship between two variables
CORRELATION COEFFICIENT: a single number that ranges from -1 to +1; it reflects the direction (positive or negative) and magnitude (the strength) of the relationship between two variables
FIGURE 8-3 Diagrams for Correlations (three panels of circles for X and Y: Very Highly Correlated; Completely Uncorrelated; Moderately to Strongly Correlated)
One way of thinking about correlations is depicted in the diagram¹⁴ shown in Figure 8-3. Think of the variance of a given variable as depicted by a circle. If the circles of two different variables are perfectly overlapping, then the variance of one variable is perfectly correlated with the other. In the first example, the two circles are nearly overlapping, suggesting the correlation between X and Y is approximately either +.90 or -.90. Why either positive or negative? Because the variance is shared regardless of the direction of the sign. In the second example, the two circles do not overlap at all, indicating that the correlation between X and Y is 0. In the third example, the two circles overlap nearly half, suggesting the correlation is about +.70 or -.70. Why a correlation of ±.70 for a nearly 50 percent overlap? As it turns out, the amount of variance shared by two variables is equal to the square of the correlation, or r², and .70² is .49, or about 49 percent overlap.
Another good way to understand the correlational relationship between two variables is to graph them. Figure 8-4 illustrates the correlations corresponding to several different patterns of data in scatter plots, or graphical illustrations of the relationship between two variables. Each point on the chart corresponds to how a particular person scored on a test and a measure of how he or she performed on the job.
From the scatter plots in Figure 8-4, you can see that a correlation of +1 occurs when the data points are in a perfect line. A correlation of +1 means that higher test scores on the measure correspond with an exact improvement in performance scores. The test score is called a predictor variable. A predictor variable is a variable used to predict the value of an outcome. In this case, the predictor variable (test score) perfectly predicts the outcome (performance). Now notice the lack of a relationship between scores and performance in the graph showing a correlation of r = -.05. In this case, the scores are almost completely independent of job performance ratings, and these scores are a poor predictor of performance.
When the relationship is perfect, as it is in the "+1" graph, it is easy to see how the trend line should be drawn. However, when the data do not depict a perfect relationship, it's harder to figure out how to draw the line. In this case, you will need to draw the line in such a way that it minimizes the distance of all the points from the line (i.e., minimizes errors of prediction). This is called a regression line, which will be discussed in the next section. When there is almost no relationship, the line will be nearly flat. When there is a negative relationship, the line slopes downward.
SCATTER PLOT: a graphical illustration of the relationship between two variables
PREDICTOR VARIABLE: a variable used to predict the value of an outcome
FIGURE 8-4 Correlations Expressed as Scatter Plots (each panel plots Test Scores from 0 to 10 on the horizontal axis; recoverable panel labels: r = +1.00, r = .32, Curvilinear Relationship)
If you found a correlation of r = -.43 between a measure and job performance, would the measure be useful in predicting which candidates are likely to perform well? Yes: just hire people who perform lower on the measure. A correlation of r = -.43 is of a reasonably high magnitude. Thus, it reflects a fairly strong relationship between the measure and on-the-job performance. The direction of the correlation isn't important. To make this easier to understand, imagine that the measure was assessing typing errors on a test and the job performance dimension was typing performance. Candidates scoring lower on the measure made fewer errors, and thus are likely to be better typists. In other words, negative correlations are just as useful as positive correlations; negative correlations involving a desirable staffing outcome just mean that lower-scoring candidates are preferable to those with higher scores.
SAMPLING ERROR: the variability of sample correlations due to chance
STATISTICAL SIGNIFICANCE: the degree to which the observed relationship is not likely due to sampling error
An additional type of relationship between two variables is a curvilinear relationship in which scores are related to outcomes in a nonlinear fashion. This can occur when higher scores are related to higher performance to a point, after which higher scores relate to lower performance. Curvilinear relationships are sometimes found between the personality trait of conscientiousness and job performance.¹⁵ Conscientiousness refers to being self-disciplined, striving to achieve, and tending to think carefully before acting. If you have ever worked with someone who was extremely detail oriented, strove for perfection, and had a hard time making decisions, you probably understand how too much conscientiousness can sometimes be a detriment to performance. Note, however, that in most situations more conscientious people tend to perform better on the job.
If you were to rely solely on the correlation coefficient to evaluate whether or not being conscientious is a predictor of people's job performance, you might underestimate the measure's usefulness. If a curvilinear relationship exists, rather than selecting candidates with the highest conscientiousness scores, it would be better to select candidates who score closer to the middle range on the measure. There are specialized statistical techniques for testing for curvilinearity, and it is important to collect data to determine whether a linear or curvilinear relationship exists for the position you're filling.
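To see why the correlation coefficient alone can understate a curvilinear predictor, consider the following minimal sketch. The nine conscientiousness scores and the inverted-U performance values are invented for illustration, and correlating performance with the squared distance from the midpoint is only a simple check, not one of the specialized techniques mentioned above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores from 1 to 9; performance peaks at the midpoint (5)
conscientiousness = [1, 2, 3, 4, 5, 6, 7, 8, 9]
performance = [20 - (c - 5) ** 2 for c in conscientiousness]

# The ordinary (linear) correlation misses the relationship entirely
linear_r = pearson_r(conscientiousness, performance)

# Correlating performance with squared distance from the midpoint reveals it
distance_from_middle = [(c - 5) ** 2 for c in conscientiousness]
curved_r = pearson_r(distance_from_middle, performance)

print(round(linear_r, 2), round(curved_r, 2))
```

In this constructed case the linear correlation is essentially zero even though performance is completely determined by the score, which is why graphing the data before relying on r is worthwhile.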
Other uses of the correlation coefficient include:
• Relating store sizes with staffing levels
• Relating seniority in a firm with how well employees perform on the job
• Relating the time to fill a job with new-hire quality
• Relating the quality of new hires with a business's performance and the satisfaction of its customers
Interpreting Correlations
Suppose you find a correlation between a job knowledge test and a measure of job success equal to .15. Should you use the measure? Answering this question requires assessing the likelihood that the observed relationship really exists, and then evaluating whether the relationship is strong enough to make the measure useful given its cost. Whenever two variables are correlated using a subset of the total population, there is always the chance that the sample used does not represent the total population. If you had a group of twenty employees, ten women and ten men, and randomly chose four of them, would you always choose two women and two men? On average, you would. But there would be some instances in which you would end up choosing four women. At other times, you would end up choosing four men. Similarly, when computing a correlation coefficient from a sample of people, the correlation might not accurately represent the correlation that exists in the general population or your applicant pool. Sampling error is simply the variability of sample correlations due to chance. The usefulness of a correlation can be evaluated by considering statistical significance and practical significance.
THE STATISTICAL SIGNIFICANCE OF A CORRELATION Statistical significance is the degree to which the observed relationship is not likely due to sampling error. A correlation is said to be statistically significant if it has less than a certain probability of being due to sampling error, usually 5 percent (called a p-value, for probability value). If the probability of a correlation due to chance is .03 (versus, say, .30), then the correlation is said to be statistically significant. In other words, the observed correlation is far enough from zero that it is unlikely to be due to sampling error, yielding a probability below .05 or 5 percent. The larger a sample, the more likely observed relationships capture the "true" relationship, and even small effect sizes will be statistically significant. In a small sample, the observed relationship must be larger for the relationship to reach statistical significance because there is a greater probability that the observed relationship is due to sampling error.
THE PRACTICAL SIGNIFICANCE OF A CORRELATION Unfortunately, statistical significance does not guarantee that a predictor is useful. If a sample size is large enough, then even very small correlations can be statistically significant because large samples tend to result in correlation estimates with little sampling error. For example, when the military studies predictors of troop performance, because they have a sample in the tens of thousands, even predictors with a very small correlation with the outcome are statistically significant.
After establishing statistical significance, the focus shifts to practical significance. Practical significance means that the observed relationship is large enough to be of value in a practical sense. A correlation that is statistically significant is not necessarily large enough to be of practical significance. Whether or not a correlation is large enough to be practically significant is in the eyes of the measurer. For example, if hiring errors for a particular job are not costly, then a correlation of .2 might be acceptable to an organization. However, a correlation of .2 might be too low for critical jobs in which hiring errors are costly. In yet other situations, a correlation as low as .15 can still be practically significant.
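The sampling example earlier in this section (randomly choosing four of twenty employees, ten women and ten men) can be worked out exactly. The sketch below counts the possible samples with Python's `math.comb`; the function name `prob_women_in_sample` is a hypothetical helper introduced only for this illustration.

```python
from math import comb

def prob_women_in_sample(k, women=10, men=10, sample_size=4):
    """Probability that a random sample (drawn without replacement) contains exactly k women."""
    ways_to_pick_women = comb(women, k)
    ways_to_pick_men = comb(men, sample_size - k)
    all_possible_samples = comb(women + men, sample_size)
    return ways_to_pick_women * ways_to_pick_men / all_possible_samples

for k in range(5):
    print(k, "women:", round(prob_women_in_sample(k), 3))

# Average number of women across all possible samples
expected_women = sum(k * prob_women_in_sample(k) for k in range(5))
print("expected number of women:", round(expected_women, 2))
```

Under these assumptions, the balanced outcome of two women and two men occurs only about 42 percent of the time, and an all-female sample occurs about 4 percent of the time (as does an all-male sample), even though the average across samples is exactly two women. Sample correlations vary around the population value in the same way.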
Practical significance is irrelevant unless the relationship is also statistically significant because otherwise the observed correlation could be due to chance. To be useful, a relationship needs to have both practical and statistical significance. Other factors will determine whether or not a correlation is useful: An assessment system that is inexpensive, for example, might still be useful even if the correlation is not large. Alternatively, if an assessment method that correlated .15 with job success was expensive, took a long time to administer, and was only moderately liked by job candidates, it might not be worth using even if it was a statistically significant predictor of a person's job success. It depends on the degree to which the assessment yields a return on the money a firm has invested in its use. For example, even if an organization used an assessment method with a low correlation with job success, the company might still earn a good return on the method. Consequently, it is important to look beyond the magnitude of the correlation.
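The dependence of statistical significance on sample size can be illustrated with a short calculation. The sketch below uses the Fisher z approximation, which is an assumption of this example rather than the method described in the chapter; statistical packages typically report a p-value based on the t distribution, so their results will differ slightly, especially in small samples. The function name `approx_p_value` is hypothetical.

```python
import math
from statistics import NormalDist

def approx_p_value(r, n):
    """Approximate two-tailed p-value for a sample correlation r from n people.

    Uses the Fisher z approximation (an assumption of this sketch).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must be strictly between -1 and +1")
    if n < 4:
        raise ValueError("the approximation needs a sample of at least 4")
    # Transform r, then scale by the approximate standard error 1 / sqrt(n - 3)
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# The same correlation of .15 evaluated at three sample sizes
for n in (50, 500, 20000):
    p = approx_p_value(0.15, n)
    print(n, round(p, 4), "significant" if p < 0.05 else "not significant")
```

With 50 people, a correlation of .15 does not fall below the .05 threshold, whereas with 500 or more it does. The correlation itself, and therefore its practical value, is the same in every case.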
Identifying statistically and practically significant relationships can help organizations execute their business strategies more effectively. The food wholesaler Sysco is a good example. Sysco, headquartered in Houston, Texas, periodically assesses the correlation between its customers' satisfaction and its employees' satisfaction. The company has found that customer loyalty and operational excellence are affected by a satisfied, productive, and committed workforce. Retaining its employees has also helped Sysco cut its operating costs. After discovering the correlation, Sysco implemented a rigorous set of programs to enhance the retention and satisfaction of its employees.¹⁷
Regressions
Generally, staffing professionals use more than one measure to assess job applicants because it improves the overall validity of a firm's selection process. However, with a correlation analysis, only two variables can be related to one another, so only one predictor variable can be used. Multiple regression is a statistical technique that predicts outcomes using one or more predictor variables. Assume the predictors and outcomes are measured on an interval or ratio level. Specialized techniques exist for variables that are measured on a nominal or ordinal level, but such approaches are beyond the coverage of this chapter. A human resource professional can do a multiple regression analysis to identify the ideal weights to assign to each assessment score (each predictor) to maximize the validity of a set of assessment methods. The analysis is based on each assessment method's correlation with job success (the outcome) and the degree to which the assessment methods are intercorrelated. For an example of what we mean by intercorrelated, suppose that cognitive ability is highly correlated with interview performance. In this case, it might not make sense to use both variables to predict job success because it would be redundant to do so. If the redundancy is too great, the regression analysis will retain only one of the predictors to use in the final prediction equation. In other words, the redundant predictor would be assigned a near-zero weight.
One way of visualizing relationships among the variables in multiple regression is depicted in the diagram¹⁸ shown in Figure 8-5. Assume Y is the criterion, or outcome being predicted, and X and Z are the predictor variables. In the first example, you can see by the overlap that both X and Z predict Y, and X and Z are uncorrelated with each other. In the second example, both X and Z predict Y, but X and Z are highly correlated. You can easily see in this second case that X and Z contribute little unique prediction beyond the other. This is similar to a case in which you measure the same concept (e.g., intelligence) using two different but highly related assessments. The redundancy doesn't add new information, so you can either eliminate one of the assessments or combine them (depending on cost). In the third example, you see a more typical situation. Here X and Z are moderately correlated with Y and with each other. Both X and Z add unique information to the prediction of Y, but they are related to each other. As an illustration, a firm might use a cognitive ability test and interview scores to predict job performance, where cognitive ability and interview scores are themselves related.
FIGURE 8-5 Diagrams of Multiple Regression (recoverable panel titles: X and Z Correlated with Y and Highly Correlated with Each Other; More Typical Example)
PRACTICAL SIGNIFICANCE: an observed relationship that is large enough to be of value in a practical sense
MULTIPLE REGRESSION: a statistical technique that predicts an outcome using one or more predictor variables; it identifies the ideal weights to assign each predictor so as to maximize the validity of a set of predictors; the analysis is based on each predictor's correlation with the outcome and the degree to which the predictors are themselves intercorrelated
The equations used to do a regression analysis can be computed by hand, but this is a cumbersome process if there are more than two predictor variables. The equation for two predictor variables (a multiple regression) can be found in the supplement at the end of this chapter. Software packages, including Excel, SPSS, SAS, and Stata, can be used to easily perform a multiple regression analysis. The instructions for using Excel to conduct a multiple regression analysis are provided as well.
In a multiple regression, each variable is examined for its statistical relationship to the predicted outcome, and a regression equation is derived. The regression equation is of the format:
Predicted job success = Constant + (b1 × Test score 1) + (b2 × Test score 2) + (b3 × Test score 3) + . . .
The constant, or intercept, is a number added to everyone's predicted job success score, and the bs are the regression weights that are multiplied by each test score. An applicant's scores on the test(s) are entered into the resulting model and used to calculate the test taker's predicted job success. Because the regression analysis operates on raw scores, these scores do not need to be standardized: the weights take into account the differences in means and standard deviations. For example, if Miguel scored 50 on an interview, 27 on a personality measure, and 20 on a job knowledge test, his predicted job success would be 141 based on the following equation:
Predicted job success = 10 + (2 × Interview) + (1 × Personality) + (.2 × Job knowledge)
The intercept (10) and weights (2, 1, and .2) come from the results of the statistical analysis. Miguel's score of 141 would then be compared with the other candidates to determine if he should be hired. In general, only equations with variables found to be statistically significant should be used to make staffing decisions.
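Applying a regression equation to a candidate's scores is simple arithmetic, as the following sketch shows. The intercept, weights, and Miguel's scores are the ones given in the text; the function name `predicted_job_success` and the dictionary keys are hypothetical choices made for this illustration.

```python
def predicted_job_success(scores, weights, intercept):
    """Apply a regression equation: intercept plus the sum of weight times raw score."""
    return intercept + sum(weights[name] * scores[name] for name in weights)

# Intercept and weights from the chapter's example equation
INTERCEPT = 10
WEIGHTS = {"interview": 2, "personality": 1, "job_knowledge": 0.2}

# Miguel's raw scores from the chapter's example
miguel = {"interview": 50, "personality": 27, "job_knowledge": 20}

# 10 + (2 x 50) + (1 x 27) + (.2 x 20) = 141
print(predicted_job_success(miguel, WEIGHTS, INTERCEPT))
```

The same function can be applied to every candidate so that predicted scores are compared on a common scale.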
Regression analysis is also used to predict future headcount requirements. Consider the regression equation that uses projected sales per month and the number of expected customers to determine a firm's headcount requirements:
Full-time employees = 60 + (.00015 × Sales) + (.3 × Expected customers)
If the firm's projected sales are $1,000,000, and the company projects that it will acquire 250 new customers, then it will need 285 full-time employees:
Full-time employees = 60 + (.00015 × 1,000,000) + (.3 × 250) = 60 + 150 + 75 = 285
To prevent giving different variables credit for predicting the same part of the criterion, multiple regression analysis examines the effect of each variable (e.g., each test score) on the criterion (e.g., job success) after controlling for other variables. Using multiple regression requires high-quality measures.
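What "controlling for other variables" means can be illustrated for the two-predictor case. The sketch below uses the standard formulas for standardized regression weights with two predictors, X and Z, and an outcome Y; the function name and the correlation values are hypothetical, and working with standardized weights (rather than the raw-score weights in the equations above) is a simplification chosen for this example.

```python
def standardized_weights(r_xy, r_zy, r_xz):
    """Standardized weights for predictors X and Z, plus the variance in Y they explain together."""
    if abs(r_xz) >= 1.0:
        raise ValueError("perfectly correlated predictors cannot be weighted separately")
    unshared = 1.0 - r_xz ** 2
    # Each weight is the predictor's correlation with Y after removing
    # the part it shares with the other predictor
    beta_x = (r_xy - r_zy * r_xz) / unshared
    beta_z = (r_zy - r_xy * r_xz) / unshared
    r_squared = beta_x * r_xy + beta_z * r_zy
    return beta_x, beta_z, r_squared

# Both predictors correlate .50 with the outcome in each scenario
print(standardized_weights(0.5, 0.5, 0.0))  # predictors uncorrelated with each other
print(standardized_weights(0.5, 0.5, 0.9))  # predictors highly redundant
```

With uncorrelated predictors, each keeps its full weight of .50 and together they explain 50 percent of the variance in the outcome. When the predictors correlate .90 with each other, each weight falls to about .26 and together they explain only about 26 percent, barely more than the 25 percent either one explains alone.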
WHAT ARE THE CHARACTERISTICS OF USEFUL MEASURES?
Two properties of a good measure are its reliability and validity. We discuss each next as well as the importance of a measure's standard error of measurement.
Reliability
Reliability refers to how dependably, or consistently, a measure assesses a particular characteristic. If you obtained wildly different weights each time you stepped on a scale, would you find the scale useful? Probably not. The same principle applies to measures relevant to staffing, such as job knowledge, personality, intelligence, and leadership skills.
A measure that yields similar scores for a given person when it is administered multiple times is reliable. Reliability sets boundaries around the usefulness of a measure. A measure cannot be useful if it is not reliable, but even if it is reliable, it still might not be useful; for example, if it doesn't measure what you're seeking to determine but something else instead. Reliability is a critical component of any staffing measure, including candidate assessment. If a person completes a personality test twice, will he or she get a similar score or a much different score? If the scores radically change, then perhaps the test isn't reliable. Why would a job candidate score differently when completing a personality test again, you might wonder? Think of why you might score differently on a midterm given on Monday and one given on Friday, and you should have some insights. Some possible reasons are the following:¹⁹
RELIABILITY: how dependably, or consistently, a measure assesses a particular characteristic
• The respondent's temporary psychological or physical state. For example, differing levels of anxiety, fatigue, or motivation can affect test results. If you are stressed the first time you are tested but are relaxed the second time, you might respond differently.
• Environmental factors. Differences in the environment, such as room temperature, lighting, noise, or even the test administrator, can influence an individual's performance. If it is quiet on one occasion, and you hear distracting construction equipment on the other, you might obtain different scores.
• The version, or form, of the measure. Many measures have more than one version, or form. For example, the ACT and SAT college entrance examinations have multiple forms. The items differ on each form, but each form is supposed to measure the same thing. Because the forms are not exactly the same, a respondent might do better on one form than on another. If one version happened to be harder, or it was equally challenging but tapped into material you knew less well, then you would perform more poorly. In the case of the ACT and SAT, scores can be adjusted to reflect the difficulty of each form.
• Different evaluators. Certain measures are scored subjectively; that is, they are determined by an evaluator's judgments of the respondent's performance or responses. Differences in the training, experience, and frame of reference among the evaluators can result in different scores for a respondent. This is why two interviewers evaluating the same job candidate might come to completely different conclusions about the quality of that job candidate.
RANDOM ERROR: error that is due to chance and fluctuates in unpredictable ways
SYSTEMATIC ERRORS: errors that occur because of consistent and predictable factors
DEFICIENCY ERROR: error that occurs when you fail to measure important aspects of the attribute you would like to measure
CONTAMINATION ERROR: error that occurs when other factors unrelated to whatever is being assessed affect the observed scores
These factors are sources of measurement error in the assessment process. Measurement error can be systematic or random. In some cases, the measurement error can be random, as in the flip of a coin. You don't always get 50 heads and 50 tails in 100 flips. This is an example of a random error. Similar things can happen in a staffing context. Running into traffic on the way to work or experiencing bad weather can cause employee productivity to randomly fluctuate in unpredictable ways. Systematic errors are errors that occur because of consistent and predictable factors. For example, an employee's productivity may go down every Tuesday because he or she works late at a second job Monday night.
The sources of systematic errors can include factors such as the time of day or day of the week. Administering a difficult work sample test in the morning might, for example, lead to different results than if the test was administered in the afternoon or late at night. Systematic errors can also result when there are systematic differences across evaluators. For example, some interviewers might regularly tend to rate most interviewees near the middle of a 1-to-10 scale, whereas other interviewers might tend to regularly rate most interviewees on the high end of the scale. In this example, the differences among the evaluators are a source of systematic error. Another source may be due to the measurement items themselves. Items that are reverse-worded, confusing, or overly complex can lead to systematic errors. The following question is a good example:
Using a 1-to-5 scale where 1 is very true and 5 is very untrue, answer this question: "In previous leadership roles you rarely failed to set goals on a timely and consistent basis while providing good feedback to your team."
If you reflect for a moment on this item, you can see how difficult it is to understand and how it could lead to systematic error due to wording. People may systematically vary in the accuracy of their responses, depending on their verbal ability, motivation to complete the survey quickly, or simple attention to detail. This is not random error because it is attached to specific and identifiable personal characteristics.
If there were no systematic or random errors of measurement, then the respondent would get the same score each time. If you step on a scale, then you will get a reading. If the scale is perfectly reliable and you step on it again 10 seconds later, you would see the same reading. This is your true score. If in real life your scale, like ours, slightly fluctuates, then you might see a slightly different reading 10 seconds later. This is random measurement error because it is due to a variety of factors unrelated to your actual weight. However, if you weigh yourself every day in the morning and after lunch, you might find you systematically weigh more after lunch.
In a staffing context, test and performance scores also contain a true score plus some variation due to random and systematic errors. An applicant is unlikely to obtain the exact same score on a job knowledge test every time. Part of this could be due to random factors.
The distinction between random and systematic errors is important: Systematic errors can be controlled. For example, you can weigh yourself at the same time each day while wearing the same clothes. Furthermore, you can use only highly trained interviewers. Random errors, however, cannot be controlled but still affect the quality of measurement. Some measures are subject to more random errors than others, just as some scales provide more reliable measures of weight than do others.
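The idea that an observed score combines a true score with random and systematic error can be made concrete with a small simulation. Everything in the sketch below is an assumption made for illustration: the true weight of 150 pounds, the size of the random fluctuation, the 2-pound after-lunch bias, and the function name `observed_scores`.

```python
import random
from statistics import mean

def observed_scores(true_score, n, random_sd=0.0, systematic_bias=0.0, seed=0):
    """Simulate n readings: true score + a constant bias + chance fluctuation."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    return [true_score + systematic_bias + rng.gauss(0, random_sd) for _ in range(n)]

TRUE_WEIGHT = 150.0  # hypothetical true score

# A perfectly reliable scale: every reading equals the true score
perfect = observed_scores(TRUE_WEIGHT, 1000)

# Random error only: readings scatter, but they average out near the true score
morning = observed_scores(TRUE_WEIGHT, 1000, random_sd=0.5)

# Random error plus a systematic 2-pound after-lunch bias: the average itself shifts
after_lunch = observed_scores(TRUE_WEIGHT, 1000, random_sd=0.5, systematic_bias=2.0)

print(round(mean(morning), 1), round(mean(after_lunch), 1))
```

Averaging many readings washes out random error but not systematic error, which is one way to see why systematic error has to be removed by controlling its source, such as weighing at the same time each day.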
A deficiency error is yet another type of error. It occurs when you fail to measure important aspects of the attribute you would like to measure. The underlying attribute being measured is called a construct. If you wanted to measure an applicant's ability to use calculus for an engineering job, then this ability is your construct of interest. However, if the test you were using focused only on algebra, a deficiency error would result. A contamination error occurs when other factors unrelated to one's advanced math skills (in this case) affect the observed scores. For example, if the calculus test had complex word problems, then language could influence the results. If the test was administered under varying conditions (a loud versus a quiet setting, the morning versus late at night, using calm and helpful administrators versus loud and anxious administrators), then these factors could affect scores.
The diagram shown in Figure 8-6 illustrates deficiency, contamination, and relevance. Assume you used supervisory ratings to measure job performance. In what ways might supervisory ratings be deficient? It is possible that supervisors focus on productivity rather than teamwork, quality, and safety. Or supervisors may not know about the quality of customer service provided and only attend to quantity of sales. In each case, the supervisor may overlook important aspects of job performance, and using their ratings alone may result in deficiency. Supervisory ratings can be contaminated, too. What might affect supervisory ratings other than actual job performance? The research literature is filled with information about sources of rater contamination, including stereotypes, halo effects, and similar-to-me bias, among many other sources.²⁰ This is contamination. The overlapping area between the construct and the measure indicates relevance, or the degree to which the measure captures the intended concept to be measured.
FIGURE 8-6 Deficiency, Contamination, and Relevance (overlapping areas for the construct and the measure)
It is impossible to eliminate all sources of error, but measures can be made more reliable by standardizing the measurement process as much as possible. For example, you can pretest the items on a test to ensure they are clear and they statistically correlate with each other consistently. Interviewers can be trained to ask the same questions, avoid bias, and use the same behaviorally based scoring key to make their ratings. Test administrators can give tests at consistent times, under similar conditions, and so forth.
Conceptually, reliability is the correlation of an item, scale, or measurement instrument with a hypothetical set of true scores. However, in practice, true scores are not available for the computation of a correlation. Instead, reliability must be estimated by correlating different types of observations. This will be elaborated upon later.
The reliability of a measure is indicated by the reliability coefficient, which is expressed as a number ranging between 0 and 1, with 0 indicating no reliability (no correlation between the measure and the true score) and 1 indicating perfect reliability (perfect correlation between the measure and the true score). Like a correlation, we express reliability as a decimal, for example, .70 or .91. The closer the reliability coefficient is to 1.0, the more repeatable or reliable the scores are. Near-perfect reliability is extremely rare. The reason that reliability coefficients are only positive (as opposed to correlations, which can range from -1 to +1) is that observed scores and true scores should relate to each other in a consistently positive manner. Table 8-2 presents some general guidelines for interpreting the reliability of a measure.
TABLE 8-2 General Guidelines for Interpreting Reliability Coefficients²¹
Reliability Coefficient Value    Interpretation
.90 and up                       Superior
.80-.89                          Good
.70-.79                          Adequate for most needs
.50-.69                          Limited applicability
.00-.49                          Not useful at all
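The guidelines in the table can be applied mechanically, as in the sketch below. The function name `interpret_reliability` is hypothetical, and treating each band's lower bound as a cutoff (so that a value such as .895 is classed as "Good") is an assumption, because the table lists values to only two decimal places.

```python
def interpret_reliability(coefficient):
    """Return the table's interpretation for a reliability coefficient between 0 and 1."""
    if not 0.0 <= coefficient <= 1.0:
        raise ValueError("a reliability coefficient must lie between 0 and 1")
    # Lower bound of each band, checked from the highest band downward
    bands = [
        (0.90, "Superior"),
        (0.80, "Good"),
        (0.70, "Adequate for most needs"),
        (0.50, "Limited applicability"),
        (0.00, "Not useful at all"),
    ]
    for lower_bound, label in bands:
        if coefficient >= lower_bound:
            return label

print(interpret_reliability(0.91))  # Superior
print(interpret_reliability(0.70))  # Adequate for most needs
```

These labels are general guidelines only; the acceptable level also depends on the type of measure and the context in which it is used.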
TEST-RETEST RELIABILITY: reflects the repeatability of scores over time and the stability of the underlying construct being measured
ALTERNATE OR PARALLEL FORM RELIABILITY: indicates how consistent scores are likely to be if a person completes two or more forms of the same measure
INTERNAL CONSISTENCY RELIABILITY: indicates the extent to which items on a given measure assess the same construct
INTER-RATER RELIABILITY
The rc\ia.l'•ihty coefficient is nm the onl~ th_i~g to consider in scl~ctmg or rejecting an a,s. scssment method. To evalua te a measure' s rehabthty, y_ou sh~u ld co nsider \he _type ofmcasurt the type of rchability estimate reported, and the cont~xt m wh ich the m~asurc wil l be used. ~ arc several types of rehabiliiy esti mates. Be fore dcc1dmg to use a mcasui:c, s~c h as a rcrsonaluv e\·aluation or cogniu,·c abi lity test, it is 1mportan1 to learn about _thc relrnbilny of the mcasu~. Organi,.auons someti me s purchase tests or assessme nt tools, and mfo nnat1 on about reliability IS often provided by the creator and publisher of these tests and t~ ls. You should .be familiar with the different kinds of reliability estimates reported. Next, we di sc uss several different types of reliability. Rel iabili ty can be esti mated in one of four ways.
TEST-RETEST RELIABILITY Test-retest reliability reflects the repeatability of scores over time and the stability of the underlying construct being measured (e.g., a person's math skills, personality, intelligence, honesty, and other relevant characteristics). Test-retest reliability is estimated by the correlation between two (or more) administrations of the same measure across different times or locations on the same sample. This assesses stability over time. Some constructs are more stable than others. For example, mechanical ability is more stable over time than is mood or anxiety. Therefore, we would expect a higher test-retest reliability coefficient on a mechanical aptitude test than on a measure of anxiety. For constructs like mood, which vary over time, an acceptable test-retest reliability coefficient may be lower than is suggested in Table 8-2.
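To make the computation concrete, the sketch below estimates test-retest reliability as the Pearson correlation between two administrations of the same measure to the same people. The scores are made-up illustration data, and the helper name `pearson` is an assumption.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five people, tested twice.
time_1 = [70, 82, 90, 65, 77]
time_2 = [72, 80, 93, 63, 75]

test_retest = pearson(time_1, time_2)
print(round(test_retest, 2))
```

The same calculation applies to parallel form reliability if `time_2` is replaced with scores from an alternate form of the measure.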
ALTERNATE OR PARALLEL FORM RELIABILITY Developers often make multiple forms or versions of a measure that are intended to assess the same thing and be of the same difficulty level. Alternate or parallel form reliability indicates how consistent scores are likely to be if a person completes two or more forms of the same measure. This reliability is estimated by the correlation between two (or more) administrations of different forms that are supposed to measure the same construct to the same population. A high parallel form reliability coefficient indicates that the different forms are very similar, which means that it makes virtually no difference which version of the measure is used. On the other hand, a low parallel form reliability coefficient suggests that the different forms are probably not comparable and may be measuring different things. In this case, the multiple forms cannot be used interchangeably and scores on each form cannot be directly compared.
INTERNAL CONSISTENCY RELIABILITY Internal consistency reliability indicates the extent to which items on a given measure assess the same construct. A high internal consistency reliability coefficient indicates that the items on a measure function in a similar manner. Internal consistency is based on the correlation among the items comprising a measure. For example, you might have 10 items measuring math skill. If all the items measure math skill reliably, and they are internally consistent, then scores on one item should correlate highly with scores on another. Items in the measure can be split into even and odd items or first half and second half items. Scores on these halves can then be correlated with each other. This is called split-half reliability, which is one indicator of internal consistency. The most commonly used indicator of internal consistency is Cronbach's alpha. It is an estimate of the average of all possible split-half reliabilities.
If finance and history questions were included on an exam for a staffing class, the test would have lower internal consistency reliability than if the test contained only staffing-related questions because the different types of items would yield varying patterns of scores. Measures that assess multiple characteristics are usually divided into distinct sections, and a separate internal consistency reliability coefficient is reported for each section in addition to one for the whole measure.
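The chapter does not spell out how Cronbach's alpha is computed. A minimal sketch, assuming the standard variance-based formula alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), is shown below. The item scores are hypothetical and the function name is an assumption.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha from a list of items.

    item_scores holds one list per item; each inner list has one score
    per respondent, in the same respondent order.
    """
    k = len(item_scores)
    # Total score for each respondent across all items.
    totals = [sum(person) for person in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: 4 math items answered by 5 people.
math_items = [
    [3, 4, 5, 2, 4],
    [3, 5, 5, 1, 4],
    [2, 4, 4, 2, 3],
    [3, 4, 5, 2, 5],
]
alpha = cronbach_alpha(math_items)
print(round(alpha, 2))
```

Mixing in items that track a different construct (the finance and history questions in the example above) would tend to raise the summed item variance relative to the total-score variance and pull alpha down.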
INTER-RATER RELIABILITY Inter-rater reliability indicates how consistent scores are likely to be if the responses are scored by two or more raters using the same item, scale, or instrument. On some measures, like during Olympic gymnastic events, different raters evaluate responses or behaviors and subjectively determine the score. Often in business contexts different people evaluate the same job applicant (e.g., the recruiter and the hiring manager). Differences in raters' judgments create variations in a person's scores for the same measure or event.
Inter-rater reliability is based on the correlation of scores between or among two or more raters who rate people or objects using the same item, scale, or instrument. A high inter-rater reliability coefficient indicates that the judgment process is consistent and that the resulting scores are reliable. Although inter-rater reliability coefficients are typically lower than other types of reliability estimates, rater training can increase inter-rater reliabilities. This type of reliability is particularly important for understanding the usefulness of interview evaluations.
These four reliability estimation methods are not necessarily mutually exclusive, nor do they need to yield the same results. A measure of job knowledge that has many different dimensions within the test might show low internal consistency reliability. However, people's job knowledge characteristics might be relatively stable, in which case the test scores will be similar across administrations (high test-retest reliability). In this case, you should compute separate internal consistency reliabilities for each dimension. As another illustration, two distinct measures of leadership capability might yield high parallel forms reliability. Additionally, the items within each of the measures might be internally consistent. However, the test-retest reliability could be low if leadership training occurred between the administration of the two tests. In this case, you would expect to see low test-retest reliability because the training ought to change or improve people's leadership capabilities.
Clearly, the acceptable level of reliability will differ depending on the type of measure and the reliability estimate used. A measure of mood may exhibit low reliability across administrations because, as we explained, moods fluctuate over time. However, the measure might still be useful for predicting how applicants will react during interviews. In this case, the items measuring mood must yield consistent scores among themselves, even if they vary over time as an overall score.
Standard Error of Measurement
As we have explained, the measurement process always contains some type of error. The problem is that we wish to use imperfect scores to make decisions despite the presence of error. It is helpful to know how much error exists when we use a given score. The standard error of measurement (SEM) is the margin of error that you should expect in an individual score because of the imperfect reliability of the measure. (The formula for SEM is given in this chapter's supplement.) SEM represents the spread of scores you might have observed had you tested the same person repeatedly. The lower the standard error, the more accurate the measurements. If the SEM is zero, then the observed score is the true score. However, we know that error exists so we can compute a range of possibilities around the observed score. This is a confidence interval. Although not technically precise, you can think of it in this manner. If you score 85 out of a possible 100 on a measure that has an SEM of 2, there is a 68 percent chance that the "true" score lies between 83 and 87, and about a 95 percent chance that the true score lies between 81 and 89.
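The arithmetic in this example can be sketched in a few lines of Python. The SEM formula itself appears in the chapter supplement; the version below assumes the standard psychometric form, SEM = SD × sqrt(1 - reliability), which may be presented differently there. Function names and the SD and reliability values are illustrative assumptions.

```python
from math import sqrt


def standard_error_of_measurement(score_sd, reliability):
    """Assumed standard form: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * sqrt(1.0 - reliability)


def confidence_interval(observed, sem, num_sems=1):
    """Return (low, high): the observed score plus or minus num_sems SEMs."""
    return (observed - num_sems * sem, observed + num_sems * sem)


# The chapter's example: observed score of 85 on a measure with an SEM of 2.
print(confidence_interval(85, 2))     # (83, 87), roughly 68 percent
print(confidence_interval(85, 2, 2))  # (81, 89), roughly 95 percent

# Hypothetical measure: score SD of 10 and reliability of .75.
print(standard_error_of_measurement(10, 0.75))  # 5.0
```

Under the assumed formula, raising reliability shrinks the SEM, so the interval around any observed score narrows.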
In a normal distribution, 68 percent of cases fall between +1 and -1 standard deviations from the mean, and approximately 95 percent of cases in a population fall between +2 and -2 standard deviations from the mean. The SEM tells us the standard deviation of errors. If we center our mean around 85 (the observed score), then we can use the SEM to determine the chance that the true score will fall within a given range. With an SEM of 2, one standard deviation of error below the mean is 83 (85 - 2) and one standard deviation above the mean is 87 (85 + 2). This gives us a 68 percent confidence interval. A similar computation can be made for the 95 percent confidence interval (85 plus or minus 2 × 2, or 81 to 89).
The SEM is a useful measure of the accuracy of individual scores. If you have received a manual with a test or assessment tool, then when you're evaluating the reliability coefficients of a measure, it is important to review the explanations provided for the following:
• The types of reliability used. The manual should explain why a certain type of reliability coefficient was reported and discuss the sources of random measurement error relevant to the measure.
• How the reliability studies were conducted. The manual should describe the conditions under which the data were obtained, including the length of time that passed between the administrations of a measure in a test-retest reliability study. In general, reliabilities tend to drop as the time between administrations increases.
• The characteristics of the sample group. The manual should indicate the important characteristics of the sample group used to gather the reliability information, such as the education levels, ages, occupations, and other relevant characteristics of the people in the group. This will allow you to compare the characteristics of the people you want to measure with the sample group. If they are sufficiently similar, then the reported reliability estimates will probably hold true for your population as well.
The important thing to remember is that high reliability measures will have lower SEMs, which means that observed scores are more likely to reflect true scores. Additionally, reliabilities can drift over time. With longer periods between administrations, test-retest correlations are likely to go down, and the SEMs will then go up. Moreover, as we have explained, even if a measure is reliable, it doesn't mean it's useful. Reliable measures may or may not measure what you intend to measure, and they may or may not predict desired staffing outcomes. This is where the issue of validity comes into play, which we discuss next.
STANDARD ERROR OF MEASUREMENT (SEM)
the margin of error that you should expect in an individual score because of the imperfect reliability of the measure
VALIDITY
how well a measure assesses a given construct and the degree to which you can make specific conclusions or predictions based on observed scores
Validity
Validity is the most important issue in selecting a measure. It refers to how well a measure assesses a given construct and the degree to which you can make specific conclusions or predictions based on observed scores. Validity is the cornerstone of strategic staffing. If you wish to use data to make decisions, then the data must relate in meaningful ways to desired outcomes. If you can predict high-quality talent using various kinds of tests, then they will give you a competitive edge over firms that do not use valid tests for selection.
It is important to understand the differences between reliability and validity. Validity will tell you how useful a measure is for a particular situation; reliability will tell you how consistent scores from that measure will be. You cannot draw valid conclusions unless the measure is reliable. But even when a measure is reliable, it might not be valid. For example, you might be able to measure a person's shoe size reliably, but it probably won't be useful as a predictor of the person's job performance. Any measure used in staffing needs to be both reliable and valid for the situation.
Figure 8-7 shows a popular bull's-eye illustration of the relationship between reliability and validity. The center of the target is whatever construct you are trying to measure, usually some aspect of job success. Each "shot" at the bull's-eye is a measurement for a single person. A bull's-eye means that your measure is perfectly assessing the person on that construct. The further you are from the center, the more your measurement is off for that person.
The dots close together in Figure 8-7 reflect higher reliability than the dots more spread out. Dots centered on the bull's-eye reflect higher reliability and validity than dots clustered away from the bull's-eye. You can easily see that if the measure is not reliable (the dots are widely scattered), it is not possible for them to be valid (on target).
Figure 8-7 shows three possible situations. In the first one, shots are consistent, but miss the center of the target; we are consistently measuring the wrong value for all observations. This measure is thus reliable (consistent), but not valid (not accurate). An everyday example might be a scale that consistently registers a weight that is 20 pounds too heavy. A staffing example might be a math test that gives consistent results but that is too easy. In the second bull's-eye, hits are spread across the target, and we are consistently missing the center, reflecting a measure that is neither reliable nor valid. This is like a scale that gives random readings. A math test that does a poor job of measuring math and is plagued by error (it is both contaminated and deficient) might yield the second pattern. The third bull's-eye shows hits that are consistently in the center of the target, reflecting a measure that is both reliable and valid. This is like a scale that gives the same weight each time, and the weight is accurate. In staffing, this pattern might be exhibited by a high-quality math test that is consistent in results and neither deficient nor contaminated. This is our goal in measurement and assessment.
FIGURE 8-7 Illustration of Reliability and Validity (panels: Reliable but Not Valid; Neither Reliable nor Valid; Both Reliable and Valid)
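The scale analogy can be turned into a rough numerical sketch. Below, the average error of repeated readings stands in for (lack of) validity and the spread of the readings stands in for (lack of) reliability. The true weight, the readings, and the function name are all made-up assumptions chosen to loosely mirror the three bull's-eye patterns.

```python
from statistics import mean, pstdev

TRUE_WEIGHT = 150  # hypothetical true weight in pounds


def bias_and_spread(readings, true_value):
    """Return (average error, spread) for repeated readings of one object."""
    return mean(readings) - true_value, pstdev(readings)


patterns = {
    "Reliable but not valid": [170, 170, 171, 169, 170],
    "Neither reliable nor valid": [120, 185, 140, 175, 160],
    "Both reliable and valid": [150, 151, 149, 150, 150],
}

for label, readings in patterns.items():
    bias, spread = bias_and_spread(readings, TRUE_WEIGHT)
    print(f"{label}: average error {bias:+.1f}, spread {spread:.1f}")
```

A small spread with a large average error corresponds to the first target (tight cluster, off center); a large spread corresponds to the second; small values for both correspond to the third.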
A measure's validity is established in reference to a specific purpose. Thus, the measure might be valid for some purposes but not be valid for others. For example, a measure you use to make valid predictions about someone's technical proficiency on the job may not be valid for predicting his or her leadership skills, job commitment, or teamwork effectiveness.
Similarly, a measure's validity is established in reference to specific groups called reference groups. Thus, the same measure might not be valid for different groups. For example, a problem-solving skills measure designed to predict the performance of sales representatives might not be valid or useful for predicting the performance of clerical employees.
As we have explained, the manuals that accompany assessment tools, or tests, should describe the reference groups used to develop the measures. The manuals should also describe the groups for whom the measure is valid and how the scores for the individuals belonging to each of the groups were interpreted. You, then, must determine if the measure is appropriate for the particular type of people you want to assess. This group of people is called your target population, or target group.
Although your target group and the reference group might not have to match perfectly, they must be sufficiently similar so that the measure will yield meaningful scores for your group. For example, you will want to consider factors such as the occupations, reading levels, and cultural and language differences of the people in your target group. Use only assessment procedures and instruments demonstrated to be valid for your target group(s) and for your specific purpose. This is important because the Uniform Guidelines on Employee Selection Procedures require assessment tools to have adequate supporting evidence for the conclusions reached with them in the event adverse impact occurs. Although not all employee selection procedures (interviews, for example) have to be validated, scored assessments that have an adverse impact should be validated if technically feasible.
The user of an assessment tool is ultimately responsible for making sure that validity evidence exists for the conclusions reached using the measures. This applies to all measures and procedures used (including interviews), whether the measures have been bought off-the-shelf, developed externally, or developed in-house. This means that if you develop your own measures or procedures, you should conduct your own validation studies. If validation is not possible, the scored assessment should be eliminated. If informal or nonscored assessments have an adverse impact, the employer should either eliminate the tool or use a more formal one that can be validated.
Although the Uniform Guidelines focus on adverse impact and legal liability, validation is even more important from a strategic perspective. Strategically, it makes sense to use only those measures that reliably and validly assess what is important to job success and that predict desired outcomes. Anything else is potentially an expensive waste of time. Invalid measures can lead to missed opportunities for selecting high-quality talent, or worse yet, the selection of people who will perform poorly. The cost of selection-related errors is high. However, they can be dramatically reduced by using valid measures for selection. There are many types of validity, all of which address the usefulness and appropriateness of using a given measure. We discuss them next.
FACE VALIDITY One aspect of validity is whether the measure seems to measure what it is supposed to measure. This is face validity. It is a subjective assessment of how well items or measures seem to be related to the requirements of the job. Face validity is often important to job applicants who tend to react negatively to assessment methods if they perceive them to be unrelated to the job (or not face valid). Even if a measure seems face valid, if it does not predict job performance, then it should not be used. Hypothetically, a measure of extroversion might look like an acceptable way to measure job candidates applying for a sales position. Nonetheless, it might still fail to predict whether or not an extroverted person performs well as a sales representative. Perhaps outgoing salespeople talk too much and sell too little, for example.
FACE VALIDITY
a subjective assessment of how well items seem to be related to the requirements of the job
VALIDATION
the cumulative and ongoing process of establishing the job relatedness of a measure
CONTENT-RELATED VALIDATION
the process of demonstrating that the content of a measure assesses important job-related behaviors
CONSTRUCT-RELATED VALIDATION
the process of demonstrating that a measure assesses the construct, or characteristic, it claims to measure
CRITERION-RELATED VALIDATION
the process of demonstrating that there is a statistical relationship between scores from a measure (the predictor) and the criterion (the outcome)
VALIDATION In order to be certain an employment measure is useful and valid, you must collect evidence relating the measure to a job. The process of establishing the job relatedness of a measure is called validation. Validation is the cumulative and ongoing process of establishing the job relatedness of a measure. The Uniform Guidelines on Employee Selection Procedures discuss the following three methods of conducting validation studies and describe conditions under which each type of validation method is appropriate:
• Content-related validation is the process of demonstrating that the content of a measure assesses important job-related behaviors. For example, a mathematical skills test would have high content validity for an engineering position, but a typing skills test might have low content validity if the job required only minimal typing. However, the same typing test might have strong content validity for a clerical position.22 Content validity also applies to the items making up a measure. A math test might have low content validity if it includes items focusing on, for example, psychology or biology, or other facets unrelated to the position being hired for.
• Construct-related validation is the process of demonstrating that a measure assesses the construct, or characteristic, it claims to measure. This method often applies to measures that attempt to assess the abstract traits of candidates, such as their personalities, honesty, or aptitudes. A construct-related validation would need to be done if, for example, a bank wanted to test its tellers for a trait such as "numerical aptitude." In this case, the aptitude is not an observable behavior, but a concept created to explain possible future behaviors. To demonstrate that the measure possesses construct validity, the bank would need to show (1) that the measure did indeed assess the desired trait (numerical aptitude) and (2) that this trait corresponded to success on the job.23 Construct validity is established by the pattern of correlations among items within a measure and the pattern of correlations of the scores from that measure with other relevant outcomes. Content validity can also be used to help establish construct validity.
• Criterion-related validation is the process of demonstrating that there is a statistical relationship between scores from a measure (the predictor) and the criterion (the outcome), usually some aspect of job success such as job performance, training performance, or job tenure. This form of validation uses either correlational or regression-based procedures. In other words, in the case of a positive relationship, individuals who score high on the measure should tend to perform better on the job success criterion than those who score low. If the criterion is obtained at the same time the predictor measure is collected, it is called concurrent validity; if the criterion is obtained after the initial measure (the predictor) is collected, then it is called predictive validity. Consider the position of a millwright, who installs, repairs, replaces, and dismantles machinery and heavy equipment. A measure might be designed to assess how employees' mechanical skills are related to their performance when it comes to servicing machines (criterion). A strong relationship would validate using the measure.24 Predictive validity would be estimated if you measured employees' mechanical skills before they were hired and then correlated those skills with their subsequent performance. Concurrent validity would be estimated if at a single point in time you measured the mechanical skills of a company's current employees and correlated their scores with their performance. The criterion-related validity of a measure is measured by the validity coefficient, which we discuss in more detail in the next section of this chapter.
All types of validity are important. You can establish content validity for a math skills test using job analysis techniques to determine the level, type, and difficulty of math required for a position and the importance of math to job performance. You can then construct a large number of items that potentially measure math skills and establish that job experts agree that the items are related to the math skills. You can also use job expert ratings to establish that each math item is essential by computing a content validity ratio (the formula is available in the chapter supplement). You can ask job incumbents and supervisors to determine if any important math skills are missing from the measure. This would help to establish the content and face validity of the measure. Then you could correlate the math skills test with the performance of engineers to see if they predict job performance. This establishes criterion-related validity. All these forms of validity, combined with reliability information and information about how the items within the math skill test relate to each other, can then establish the construct validity of the measure.
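The content validity ratio mentioned above is defined in the chapter supplement, which is not reproduced here. A common form is Lawshe's ratio, CVR = (n_e - N/2) / (N/2), where n_e is the number of job experts who rate an item as essential and N is the total number of experts. The sketch below assumes that form; the function name and the panel sizes are hypothetical.

```python
def content_validity_ratio(num_essential, num_experts):
    """Lawshe-style CVR (assumed form): ranges from -1 to +1."""
    if not 0 <= num_essential <= num_experts or num_experts == 0:
        raise ValueError("need 0 <= num_essential <= num_experts, num_experts > 0")
    half = num_experts / 2
    return (num_essential - half) / half


# Hypothetical panel of 10 job experts rating three math items.
print(content_validity_ratio(9, 10))   # 0.8: most experts call it essential
print(content_validity_ratio(5, 10))   # 0.0: the panel is evenly split
print(content_validity_ratio(10, 10))  # 1.0: unanimous
```

Under this form, a positive value means more than half of the experts consider the item essential; how high a value is required before keeping an item is a separate judgment that this sketch does not address.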
THE VALIDITY COEFFICIENT The validity coefficient is a number between 0 and +1 that indicates the magnitude of the relationship between a predictor (such as test scores) and the criterion (such as a measure of actual job success). The validity coefficient is the absolute value of the correlation between the predictor and criterion. The larger the validity coefficient, the more confidence you can have in predictions made from the scores. Because jobs and people are complex, a single measure can never fully predict what a person's performance will be because success on the job depends on so many factors. Therefore, validity coefficients rarely exceed .40 in staffing contexts.
As a general rule, the higher the validity coefficient, the more beneficial it is to use the measure. Validity coefficients of .21 to .35 are typical for a single measure. The validities of selection systems that use multiple measures will probably be higher because you are using different tools to measure and predict different aspects of performance. By contrast, a single measure is more likely to measure or predict fewer aspects of total performance. Table 8-3 shows some general guidelines for interpreting a single measure's validity. It is difficult to obtain validity coefficients above .50 even if multiple measures are used.
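As a rough sketch of these two ideas, the code below computes a validity coefficient as the absolute value of the predictor-criterion correlation and then labels it using the Table 8-3 cutoffs. The function names and the data are hypothetical; the toy sample is tiny, so its coefficient is far less stable (and may be far larger) than what real validation studies report.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def validity_coefficient(predictor, criterion):
    """Absolute value of the predictor-criterion correlation."""
    return abs(pearson(predictor, criterion))


def interpret_validity(coefficient):
    """Label a single measure's validity coefficient per Table 8-3."""
    if coefficient > 0.35:
        return "Very beneficial"
    if coefficient >= 0.21:
        return "Potential to be useful"
    if coefficient >= 0.11:
        return "Useful in certain circumstances"
    return "Unlikely to be useful"


# Hypothetical data: test scores at hire and later performance ratings.
test_scores = [55, 60, 65, 70, 75, 80]
performance = [3.0, 2.5, 3.5, 3.0, 4.0, 3.5]

coefficient = validity_coefficient(test_scores, performance)
print(round(coefficient, 2), interpret_validity(coefficient))
```

Because the coefficient is an absolute value, a predictor that relates negatively to the criterion (for example, an error count) yields the same magnitude as an equally strong positive predictor.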
EVALUATING VALIDITY Evaluating a measure's validity is a complex task. In addition to the magnitude of the validity coefficient, you should also consider at a minimum the following factors:
• The level of adverse impact associated with your assessment tool
• The number of applicants compared to the number of openings
• The number of currently successful employees
• The cost of a hiring error
• The cost of the selection tool
• The probability of hiring a qualified applicant without using a scored assessment tool
Here are three scenarios illustrating why you should consider these factors, individually and in combination with one another, when evaluating validity coefficients:
Scenario One: You have few applicants for each open position. Most of the applicants will be hired because the positions do not require a great deal of skill. In this situation, you might be willing to accept a selection tool that has a validity in the range of "potential to be useful" or "useful in certain circumstances" if the assessment method is cheap, you need to fill the positions quickly, you do not have many applicants to choose from, and the level of skill required is not that high.
Scenario Two: You are recruiting for jobs that require a high level of accuracy, and mistakes could be dangerous or costly. In this case, a slightly lower validity coefficient would probably not be acceptable because hiring an unqualified worker would be too much of a risk. You would probably want to use a selection tool that reported validities considered to be "very beneficial."
Scenario Three: The company you are working for is considering using a very costly assessment system that results in fairly high levels of adverse impact. The other assessment tools on the market associated with lower adverse impact are less valid and just as costly. Additionally, making a hiring mistake would put your company at too much risk. Consequently, your company decides to implement the assessment given the difficulty of hiring for the particular positions, the "very beneficial" validity of the assessment, and your failed attempts to find alternative instruments with less adverse impact. However, your company will continue to try to find ways to reduce the adverse impact of the system.

TABLE 8-3 General Guidelines for Interpreting Validity Coefficients 27

Validity Coefficient Value    Interpretation
Above .35                     Very beneficial
.21-.35                       Potential to be useful
.11-.20                       Useful in certain circumstances
Below .11                     Unlikely to be useful

VALIDITY COEFFICIENT
a number between 0 and +1 that indicates the magnitude of the relationship between a predictor (such as test scores) and the criterion (such as a measure of actual job success)
VALIDITY GENERALIZATION
the degree to which evidence of validity obtained in one situation can be generalized to another situation without further study
ost situations requ ire yo u to consider mulupl~ ~actors . Fo~exam ple, the recru it- Clearly, .:n contex t mu st be considered along with \'ahd ,ty. Even if a staffi ng syste m is
~~i:!:~:ic,s job success well , unintended consequence s _may result from the use of the system. For example, the following might be adversely affec ted.
• Applicants. A valid assessment system can result in adverse impact by differentially selecting people from various protected groups, have low face validity, and result in lawsuits. As we discussed in Chapter 3, fair employment legislation prohibits the use of tests to discriminate against job applicants because of their race, color, religion, sex, or national origin. In some cases, job candidates may perceive valid measures as irrelevant to the job.
• Organization's time and cost. A valid assessment system can have an unacceptably long time to fill or high cost per hire; result in the identification of high-quality candidates who demand high salaries, resulting in increasing payroll costs; and be cumbersome, difficult, or complex to use.
• Future recruits. A system can be valid but if it is too long or onerous then applicants, particularly high-quality applicants, are more likely to drop out of consideration; word that a firm is using time-consuming selection practices could reduce the number of applications; a valid system could result in differential selection rates and reduce the number of applicants from a particular gender, ethnicity, or background; and valid systems can still be viewed as unfair, resulting in fewer future applicants.
• Current employees. The assessment system may favor external applicants or not give all qualified employees an equal chance of applying for an internal position; employees might therefore question its fairness.
The point here is not to ignore validity. Rather, it is to highlight the need to address these factors so that highly valid measures can be used for selection while minimizing the downsides of using them.
Validity is typically evaluated using single samples for specific jobs. There are limitations, both practical and statistical, to conducting validity studies in cases in which there are relatively few people in a given position. Computing validities with small samples can lead to large sampling errors and reduce the likelihood that your findings will be statistically significant. One method of dealing with this problem is to use validity generalization.
VALIDITY GENERALIZATION Validity generalization refers to the degree to which evidence of a measure's validity obtained in one situation can be generalized to another situation without further study.28 A statistical technique called meta-analysis is used to combine the results of validation studies done for a single measure on many different target groups and for a variety of jobs. The goal of the meta-analysis is to estimate a measure's "true validity" and to identify whether we can generalize the results to all situations or determine if the same measure works differently in different situations.
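One basic ingredient of a meta-analysis is pooling validity coefficients across studies while giving larger samples more weight. The sketch below shows only that step, a sample-size-weighted mean, and is an assumption about how such pooling might be illustrated rather than the chapter's method. A full meta-analysis would also account for sampling error and other study artifacts, which this sketch ignores. The study values and the function name are hypothetical.

```python
def weighted_mean_validity(studies):
    """Sample-size-weighted mean of validity coefficients.

    studies is a list of (sample_size, validity_coefficient) pairs.
    """
    total_n = sum(n for n, _ in studies)
    return sum(n * r for n, r in studies) / total_n


# Hypothetical validation studies of one measure in three settings.
studies = [
    (50, 0.10),   # small study, low observed validity
    (200, 0.30),  # large study
    (150, 0.25),
]
pooled = weighted_mean_validity(studies)
print(round(pooled, 3))
```

The small study's low coefficient moves the pooled estimate only slightly, which reflects the point made above about large sampling errors in small samples.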
Validity generalization studies can often give staffing professionals insight about the strength of the relationship between a measure and a person's job performance. However, there is no guarantee that all employers would find the level of validity reported in a study when it comes to their own workforces. Every organization has different situational factors that can drastically impact the validity of a measure. Although the legal acceptability of validity generalization has yet to be thoroughly tested in the courts, online assessment companies, such as PreVisor (www.previsor.com), are using validity generalization as part of the validation of their collection of products.
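To make the idea of combining validation studies concrete, here is a minimal sketch of the first step of a meta-analysis: a sample-size-weighted average of validity coefficients. The four studies are hypothetical numbers invented for illustration, and the function names are our own. A full meta-analysis would also correct for sampling error, unreliability, and range restriction, which this sketch does not attempt.

```python
# Hypothetical validation studies of one measure:
# (sample size, observed validity coefficient). Illustrative values only.
studies = [(40, 0.18), (120, 0.31), (65, 0.25), (300, 0.28)]


def weighted_mean_validity(studies):
    """Average the validity coefficients, weighting each study by its sample size."""
    total_n = sum(n for n, _ in studies)
    return sum(n * r for n, r in studies) / total_n


def weighted_variance(studies):
    """Sample-size-weighted variance of the coefficients around their weighted mean."""
    mean_r = weighted_mean_validity(studies)
    total_n = sum(n for n, _ in studies)
    return sum(n * (r - mean_r) ** 2 for n, r in studies) / total_n


mean_r = weighted_mean_validity(studies)
var_r = weighted_variance(studies)
print(f"weighted mean validity: {mean_r:.3f}")
print(f"weighted variance across studies: {var_r:.4f}")
```

Larger studies pull the estimate toward their result, which is why small-sample studies with large sampling error have less influence. A large variance across studies would suggest that the measure may work differently in different situations.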
Using Existing Assessment Methods
A validation study is expensive. Moreover, as we have explained, many organizations do not have enough employees in a relevant job category to make it feasible to conduct a study. One alternative is to conduct cooperative studies across firms within an association to collect more validation data more quickly. For example, insurance companies can share data to obtain large amounts of validation data on specific positions. Another alternative is to use professionally developed assessment tools and procedures for which documentation on validity already exists. However, you must ensure that the validity evidence obtained from an "outside" study can be suitably "transported" to your particular situation. In fact, the Uniform Guidelines require as much. To determine if a particular measure is valid for your intended use, consult the manual and available independent reviews such as those in the Buros Institute's Mental Measurements Yearbook29 and Test Critiques.30
When evaluating validity information purchased from a vendor, you should consider the following:
• Available validation evidence supporting the use of the measure for specific purposes. The manual should include a thorough description of the procedures used in the validation studies and the results of those studies. Also consider the definition of job success used in the validation study.
• The possible valid uses of the measure. The purposes for which the measure can legitimately be used should be described, as well as the performance criteria that can validly be predicted.
• The similarity of the sample group(s) on which the measure was developed with the group(s) with which you would like to use the measure. For example, was the measure developed on a sample of high school graduates, managers, or clerical workers? What was the racial, ethnic, age, and gender mix of the sample?
• Job similarity. A job analysis should be performed to verify that your job and the original job are substantially similar in terms of ability requirements and work behavior.
• Adverse impact evidence. Consider the adverse impact reports from outside studies for each protected group that is part of your labor market. If this information is not available for an otherwise qualified measure, conduct your own study of adverse impact, if feasible.
In addition, if an organization would like to use a vendor's assessment or other tool globally, it is important to thoroughly evaluate this capability. Many vendors that claim to be global are actually not capable of delivering a product globally.31
This chapter's Develop Your Skills feature provides some advice on measuring the characteristics of job applicants.
DEVELOP YOUR SKILLS Assessment Tips32
To effectively assess job candidates, employers must be aware of the inherent limitations of any assessment procedure as well as how to properly use their chosen assessment methods. Here are 10 tips on conducting an effective assessment program:
1. The measures should be used in a purposeful manner: have a clear understanding of what you want to measure and why you want to measure it.
2. Use a variety of tools: because no single measurement tool is 100 percent reliable or valid, use a variety of tools to measure job-relevant characteristics.
3. Use measures that are unbiased and fair to all groups: this will allow you to identify a qualified and diverse set of finalists.
4. Use measures that are reliable and valid.
5. Use measures that are appropriate for the target population: a measure developed for use with one group might not be valid for other groups.
6. Ensure that administration staff is properly trained: the training should include how to administer the measure as well as how to handle special situations with sensitivity, for example, how and when to provide reasonable accommodations for people with disabilities.
7. Ensure suitable and uniform assessment conditions: noise, poor lighting, inaccurate timing, and damaged equipment can adversely affect respondents.
8. Keep your assessment instruments secure: developers and administrators should restrict access to the instrument's questions, and the measures should be periodically revised.
9. Maintain the confidentiality of the results: the results should be shared only with those who have a legitimate need to know. Personal information should not be released to other organizations or individuals without the informed consent of the respondent.
10. Interpret the scores properly: the inferences made from the results should be reasonable, well founded, and not based on superficial interpretation. Careful attention should be paid to contamination and deficiency errors. The manual for the tools should also provide instructions on how to properly interpret the results.
WHOLE-PERSON APPROACH: the practice of using a variety of measures and procedures to more fully assess people
STANDARDIZATION: the consistent administration and use of a measure
OBJECTIVITY: the amount of judgment or bias involved in scoring an assessment measure
Selection Errors
Professionally developed measures and procedures that are used as part of a planned assessment program can help you select and hire more qualified and productive employees even if the measures are not perfect. It is essential to understand that all assessment tools are subject to errors, both in measuring a characteristic, such as verbal ability, and in predicting job success criteria, such as job performance. This is true for all measures and procedures.
• Do not expect any measure or procedure to measure a personal trait or ability with perfect accuracy for every single person.
• Do not expect any measure or procedure to be completely accurate in terms of predicting a candidate's job success.
Certainly, selecting employees who are highly able is important. However, there are many factors that affect a person's performance. You also need a motivated employee who clearly understands the job to be performed, for example. The employee also needs the time and resources necessary to succeed in the job. Several of these factors can be predicted using good measurement tools. This is why selection procedures typically involve three to five distinct selection measures (at a minimum) that are combined in some fashion to make a final hiring decision.
Despite these efforts, there always will be cases in which a score or procedure will predict someone to be a good worker who, in fact, is not. There will also be cases in which an individual receiving a low score will be rejected when he or she would actually be a capable and good worker. In the staffing profession, these errors are called selection errors. False positives and false negatives are two types of selection errors. False positives occur when you erroneously classify a weak applicant as being a good hire. False negatives occur when you erroneously classify a strong applicant as being a weak hire. As you try to reduce one type of error you may increase the other, so there are trade-offs in how you make your hiring decision. These issues will be covered more in the following chapters.
Selection errors cannot be completely avoided, but they can be reduced, for example, by using a variety of measures. Using a variety of measures and procedures to more fully assess people is referred to as the whole-person approach to assessment. This approach will help reduce the number of selection errors and boost the effectiveness of your overall decision making.33
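The trade-off between false positives and false negatives can be illustrated with a small sketch. The eight candidates below are hypothetical: each pair holds an assessment score and whether the person would in fact have been a good worker, something an employer rarely knows in practice. The function name and the two cutoffs are assumptions made for illustration.

```python
def classify_decisions(candidates, cutoff):
    """Count correct decisions and selection errors for a given cutoff score.

    candidates: list of (assessment_score, would_succeed) pairs.
    Anyone scoring at or above the cutoff is treated as hired.
    """
    counts = {"true_positive": 0, "false_positive": 0,
              "true_negative": 0, "false_negative": 0}
    for score, would_succeed in candidates:
        hired = score >= cutoff
        if hired and would_succeed:
            counts["true_positive"] += 1
        elif hired:
            counts["false_positive"] += 1   # weak applicant classified as a good hire
        elif would_succeed:
            counts["false_negative"] += 1   # strong applicant classified as a weak hire
        else:
            counts["true_negative"] += 1
    return counts


# Hypothetical applicants, for illustration only.
candidates = [(52, True), (48, True), (47, False), (44, True),
              (41, False), (39, True), (35, False), (30, False)]

for cutoff in (40, 46):
    print(cutoff, classify_decisions(candidates, cutoff))
```

With these made-up numbers, raising the cutoff from 40 to 46 lowers the false positives from two to one but raises the false negatives from one to two.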
Standardization and Objectivity
Standardization is the consistent administration and use of a measure. Standardization reflects the consistency and uniformity of the conditions as well as the procedures for administering an assessment method. Computerization helps to ensure that all respondents receive the same instructions and the same amount of time to complete the assessment. Because maintaining standardized conditions is the responsibility of the people administering the assessment, training all administrators in proper procedures and control of conditions is critical. This is true for interviewing as well as any other assessment approach. In addition to being legally important, standardization is also valuable because recruiters should consistently evaluate candidates on their competencies, styles, and traits.
Norms reflect the distribution of scores of a large number of people whose scores on an assessment method are to be compared. The standardization sample is the group of respondents whose scores are used to establish norms. These norms become the comparison scores for determining the relative performance of future respondents.
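As a sketch of how norms are used, the function below locates a new respondent's score within a standardization sample by computing a percentile rank. The ten norm scores are hypothetical, the function name is our own, and counting tied scores as half is one common convention among several.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(norm_scores, score):
    """Percent of the standardization sample scoring below `score`,
    counting scores tied with `score` as half."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)            # scores strictly lower
    ties = bisect_right(ordered, score) - below    # scores exactly equal
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical standardization sample of ten test scores.
norm_sample = [32, 36, 38, 38, 40, 44, 46, 48, 50, 52]

for new_score in (38, 44, 60):
    print(new_score, percentile_rank(norm_sample, new_score))
```

A real standardization sample would be far larger and should resemble the people you intend to assess.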
Objectivity refers to the amount of judgment or bias involved in scoring an assessment measure. The scoring process for objective measures is free of personal judgment or bias. The number of words typed in a minute is an objective measure, as is the amount of weight a job candidate can lift. Subjective measures, on the other hand, contain items for which the score can be influenced by the attitudes, biases, and personal characteristics of the person doing the scoring. Whenever hiring decisions are subjective, it is also a good idea to involve multiple people in the hiring process, preferably of diverse gender and race, to generate a more defensible decision.34 Because they produce the most accurate measurements, it is best to use standardized, objective measures whenever possible.
CREATING AND VALIDATING AN ASSESSMENT SYSTEM
Creating an effective assessment and selection system for any position in any organization begins with a job analysis. As you learned in Chapter 4, after understanding the requirements of job success, you identify the important knowledge, skills, abilities, and other characteristics (KSAOs) and competencies required of a successful employee. You then identify reliable and valid methods of measuring these KSAOs and competencies, and create a system for measuring and collecting the resulting data. The integrity and usefulness of the data generated by each measure need to be considered when deciding which measures to use. The data collected from each measure are then examined to ensure that they have an appropriate mean and standard deviation. Remember, a measure on which everyone scores the same or nearly the same is not as useful as a measure that produces a wide range of scores. Candidates' scores on each assessment method are then correlated or entered into a regression equation to evaluate any redundancies among the measures and to assess how well the group of measures predicts job success. Adverse impact and the cost of the measures are also considered in evaluating each measure. After the final set of measures is identified, selection rules are developed to determine which scores are passing. The usefulness and effectiveness of the system is then periodically reevaluated to ensure that it is still predicting job success without adverse impact.
Benchmarking
It is sometimes useful to compare an organization's staffing data with those of other organizations to understand better whether the organization is doing well or poorly on a particular dimension. For example, is a voluntary turnover rate of 30 percent good or bad? In some positions, such as the positions held by retail employees, this would be a good turnover level compared to the industry average. In other positions, a 30 percent turnover rate would be unusually high. Benchmarking other firms can give a company comparative information about dimensions including the following:
• Application rates
• Average starting salaries
• Average time to fill
• Average cost per hire
There are numerous sources of relatively high-quality benchmark information, but it can be expensive. Some sources of benchmarking data include
• Corporate Leadership Council
• Watson Wyatt and other staffing consulting firms
• Hackett Group
• The Saratoga Institute (now part of PricewaterhouseCoopers)
• Staffing.org
• Many industry associations, such as the Society for Human Resource Management, track benchmark information and make it available to their members.
Evaluating Assessment Methods
The determinants of the effectiveness of any internally or externally developed assessment method include
1. Validity: whether the assessment method predicts the relevant components of job success
2. Return on investment: whether the assessment method generates a financial return that exceeds its cost
3. Applicant reactions: including the perceived job relatedness and fairness of the assessment method
4. Usability: the willingness and ability of people in the organization to use the method consistently and correctly
5. Adverse impact: whether the method can be used without discriminating against members of a protected class
6. Selection ratio: whether the method has a low selection ratio
The importance of a firm's selection ratio and base rate to the effectiveness of an assessment method deserves further elaboration. Taylor and Russell were among the first to demonstrate that validity alone will not determine the usefulness of an assessment. The tables they generated, taking into account selection ratio and base rate, demonstrated that assessments with high validity may not prove useful if a high number of those assessed are hired and that assessments with relatively low validity can still have a substantial impact on the improvement of job success rates if only a few of those assessed are hired. This work laid the foundation for utility analysis, which is discussed in a later chapter.
SELECTION RATIO: the number of people hired divided by the number of applicants
BASE RATE: the percent of employees who are defined as currently successful performers
A selection ratio is the number of people hired divided by the number of applicants. Lower selection ratios mean that a lower percentage of applicants are hired, and higher selection ratios mean that a greater percentage of applicants are hired. Lower selection ratios mean that the company is being more selective, and can reflect either hiring a low number of people or receiving a large number of applications. For example, a selection ratio of 75 percent means that the company is hiring 75 of every 100 applicants, which doesn't give the assessment method as good of a chance to weed out the applicants who are less likely to do the job well. Imagine if your selection ratio is 1.00 (100%), in which case you are hiring everyone who applies: no assessment tool can be useful in that circumstance.
The base rate is the percent of employees who are defined as currently successful performers. Clearly, organizations desire a base rate of 100 percent as this reflects a situation in which all employees are performing satisfactorily. The firm's system of HR practices, including staffing, training, compensation, and performance management, work together to affect a firm's base rate. If your base rate is 100 percent, then everyone who gets hired is already successful, and using an additional assessment tool will not improve your success rate.
The potential to improve the effectiveness of a new assessment system in terms of improving the base rate of a firm depends on the percent of its currently successful employees (the base rate), the selection ratio, and the validity of the new assessment method. If the base rate is already high, the potential impact of the new assessment method will be lower than if the base rate is lower and more of the firm's employees are performing poorly. A high-performing system (as evidenced by a high base rate) simply has less room for improvement than a lower-performing system.
Lowering the selection ratio can also improve the impact of the selection system. A selection ratio of 80 percent means that the company is only screening out 20 percent of its applicants and is possibly hiring many candidates who its assessment system has identified as lower potential performers. Of course, if the sourcing and recruiting processes resulted in a high-quality applicant pool made up mostly of high performers, a lower selection ratio might not be necessary.
The validity of a new assessment method can also affect the impact of the selection system. As we have discussed, assessment methods with higher validities are better able to improve the base rate than those with low validities. If the current base rate is high, the current selection ratio is high, and the validity of the new assessment method is moderate or low, it may not be worthwhile to use it.
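The joint effect of validity, selection ratio, and base rate can be explored with a small simulation. This is a minimal sketch under stated assumptions, not a reproduction of the Taylor and Russell tables: assessment scores and job performance are simulated as normally distributed with a correlation equal to the validity, the base rate is treated as the share of all applicants who would be successful if hired, and the function name and parameter values are our own.

```python
import random


def success_rate_among_hires(validity, selection_ratio, base_rate,
                             n_applicants=20000, seed=0):
    """Simulate hiring from the top of the score distribution and return
    the share of hires who turn out to be successful performers."""
    rng = random.Random(seed)
    noise_weight = (1 - validity ** 2) ** 0.5
    applicants = []
    for _ in range(n_applicants):
        score = rng.gauss(0, 1)
        # Performance correlates with the assessment score at about `validity`.
        performance = validity * score + noise_weight * rng.gauss(0, 1)
        applicants.append((score, performance))

    # Performance level reached by only the top `base_rate` share of applicants.
    performances = sorted((p for _, p in applicants), reverse=True)
    n_successful = max(1, round(base_rate * n_applicants))
    success_threshold = performances[n_successful - 1]

    # Hire the top `selection_ratio` share of applicants by assessment score.
    n_hired = max(1, round(selection_ratio * n_applicants))
    hired = sorted(applicants, key=lambda a: a[0], reverse=True)[:n_hired]
    return sum(1 for _, p in hired if p >= success_threshold) / n_hired


for validity in (0.3, 0.6):
    for selection_ratio in (0.10, 0.75, 1.00):
        rate = success_rate_among_hires(validity, selection_ratio, base_rate=0.5)
        print(f"validity={validity} selection ratio={selection_ratio:.2f} "
              f"success rate among hires={rate:.2f}")
```

In runs of this sketch, a selection ratio of 1.00 leaves the success rate at the base rate regardless of validity, while a low selection ratio lets even a modestly valid assessment raise it, which matches the argument in the text.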
Hiring the Best Call Center Workers at Xerox
Xerox wanted to improve hiring quality and reduce turnover in its call centers. It evaluated its hiring system and discovered that its current system of preferring applicants who had done the job before was ineffective: call center experience didn't matter in terms of performance or retention. After collecting data on broader characteristics related to performance and retention, Xerox learned that personality mattered a lot in predicting who would perform well and stay at least six months, long enough for the company to recoup its $5,000 investment in training its new hires. It learned that although inquisitive people tended to quit, creative personalities tended to stay longer and perform better. It also learned that living near the job and having reliable transportation were important to retention.
An assessment system focusing on personality was found to reduce new hire attrition. Xerox now uses software to screen applicants for its 48,700 call center jobs. Applicants take a computerized test that asks them to respond to scenarios they might encounter on the job and to choose between statements including, "I ask more questions than most people do" and "People tend to trust what I say." Applicants' personality traits, including creativity, are evaluated automatically, and the program spits out a score: red for low potential, yellow for medium potential, or green for high potential. Xerox occasionally hires some yellows if it thinks it can train them, but primarily hires greens.
Summary
Sound measurement lays the foundation for effective staffing. If job characteristics are poorly measured during a job analysis and if applicant characteristics are poorly assessed, it will be difficult to impossible to identify and hire the applicants who would be the best hires. Although measurement perfection is unlikely, using measures with appropriate reliability and validity will improve the accuracy of the predictions we make. Understanding different ways of describing data and knowing the appropriate use and interpretation of correlation and regression analysis help the staffing specialist best use data to make decisions, and evaluate the performance of outside consultants as well.
Acquiring and utilizing relevant information is an important part of making good decisions, and staffing decisions are no exception. Decision makers best use and improve the staffing system by understanding historical relationships among attributes of the staffing system, such as the recruiting source, recruiting message, and assessment methods used, and desired organizational outcomes. Outcomes of staffing system performance include performance, customer satisfaction, and the retention of top performers. The staffing improvements made possible by analyzing staffing data include reduced time to productivity for new hires, increased retention of high performers and high-potential employees, increased new hire quality, and greater hiring manager satisfaction. As you will learn in Chapter 13, the match between new hires and organizations can be enhanced and staffing practices and strategies can be better connected to the company's business strategy by analyzing data to improve a staffing system.

Takeaway Points
1. Measurement is essential to making good hiring decisions. Improperly assessing and measuring candidates' characteristics can lead to systematically hiring the wrong people, offending and losing good candidates, and even exposing your company to legal action. By contrast, properly assessing and measuring candidates' characteristics can give your organization a competitive advantage.
2. Measures of central tendency such as the mean, median, and mode and measures of variability such as range, variance, and standard deviation are useful for describing distributions of scores. This information can be used to compute standard scores, which can tell you how any individual performed relative to others and which can be used to easily combine scores that have different means and standard deviations.
3. Correlation is the strength of a relationship between two variables. Multiple regression is a statistical technique based on analysis of relationships. The technique identifies the ideal weights to assign each test so as to maximize the validity of a set of assessment methods; the analysis is based on each assessment method's correlation with job success and the degree to which the assessment methods are intercorrelated. Correlation and regression analyses are used to evaluate how well an assessment method predicts job success and to evaluate the effectiveness of a firm's overall staffing system.
4. Practical significance occurs when the correlation is large enough to be of value in a practical sense. Statistical significance is the degree to which the relationship is not likely due to sampling error. To be useful, a correlation needs to have both practical and statistical significance.
5. Reliability is how dependably or consistently a measure assesses a particular characteristic. Validity is how well a measure assesses a given construct and the degree to which you can make specific conclusions or predictions based on a measure's scores. A measure must be reliable in order to be valid. In order to be useful, a measure must be both reliable and valid.
6. Standardization is the consistent administration and use of a measure. Objectivity is the amount of judgment or bias involved in scoring an assessment measure. Because they produce the most accurate measurements, it is best to use standardized, objective measures whenever possible.

Discussion Questions
1. What types of measures of job candidates are most likely to be high in terms of their reliability and validity? Does this make them more useful? Why or why not?
2. How would you explain to your supervisor that the correlation between interview scores and new hire quality is low and persuade him or her to consider a new job applicant evaluation method?
3. What correlation would you need to see before you were willing to use an expensive assessment test?
4. When would it be acceptable to use a measure that predicts job success but that has adverse impact?
5. What do staffing professionals need to know about measurement?

Exercises
1. Strategy Exercise: Teddy bear maker Fuzzy Hugs pursues a high-quality, low-cost strategy and can't afford to hire underperforming manufacturing employees given its lean staffing model. Fuzzy Hugs has identified an assessment system that has high validity and predicts job success well but that is also very expensive and results in fairly high levels of adverse impact. The company is concerned about maintaining a diverse workforce, and wants to avoid legal trouble. The assessment tools it identified that had lower adverse impact had substantially lower validity as well, and were almost as expensive.
The company asks your professional advice about whether it should use the new assessment system. What advice do you give?
2. Develop Your Skills Exercise: This chapter's Develop Your Skills feature gave you some tips on assessing job candidates. Based on what you read in this chapter, what are three additional tips that you would add to the list? (Additional exercises are available at the end of this chapter's supplement that will enable you to build additional computational and decision-making skills when using data.)
3. Opening Vignette Exercise: The opening vignette described how Xerox developed an assessment system to improve the
… workers.
a. … Why?
b. If you applied to this company and were denied a job …

CASE STUDY
You just became the head of staffing for Baby Bots, a manufacturer of small robots. You were surprised to learn that the company had never validated the manual dexterity test it uses to assess job candidates for its manufacturing jobs. You decided to do a concurrent validation study and administered the test to thirty manufacturing workers. Their scores are reported in Table 8-4, along with their ages, sex, race, and job performance ratings. You also calculated the correlation between the manual dexterity test and job performance to assess the test's validity. You then examined the relationship between employees' test scores and their performance ratings. The results of this analysis are shown in Tables 8-5 and 8-6.

TABLE 8-4 Validation Data for the Manual Dexterity Test

Employee   Sex   Race   Age     Test Score   Performance Rating
1          …     …      35      36           90
2          …     …      32      …            95
3          …     …      44      50           95
4          …     …      42      49           93
5          …     …      36      46           89
6          …     …      33      52           94
7          …     …      45      50           92
8          …     …      …       50           93
9          …     …      34      …            83
10         …     …      …       44           89
11         …     …      30      40           87
12         …     …      39      48           95
13         …     …      31      …            90
14         …     …      …       39           80
15         …     …      …       48           92
16         …     …      …       38           79
17         …     …      44      38           80
18         …     …      33      36           72
19         …     …      …       46           89
20         …     …      36      …            92
21         …     …      22      …            89
22         …     …      28      32           70
23         …     …      19      48           94
24         …     …      23      48           94
25         …     …      27      36           74
26         …     …      18      46           85
27         …     …      26      44           79
28         …     …      21      50           95
29         …     …      23      34           70
30         …     …      28      44           83
Mean                    34.07   43.97        86.73
SD                      9.33    5.54         7.91
Min                     18.00   32.00        70.00
Max                     49.00   52.00        95.00
Range                   31.00   20.00        25.00

Job performance: 0 = no efficiency, 100 = 100% efficiency
Race: 0 = Hispanic, 1 = White, 2 = Black
TABLE 8-5 Correlation Table

                  Age    Test   Job Performance
Age               1.00
Test              0.12   1.00
Job Performance   0.18   0.86   1.00

Note: Correlations underlined and in bold indicate statistical significance at a level of p < .05.
TABLE 8-6 Above a Cutoff of 43

Sex            Above 43   Total Count   Percent
Males (0)      7          14            50.00
Females (1)    13         16            81.25
Total          20         30            66.67

Race           Above 43   Total Count   Percent
Hispanic (0)   6          9             66.67
White (1)      8          11            72.73
Black (2)      6          10            60.00
Total          20         30            66.67
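One common way to screen pass rates like those in Table 8-6 for evidence of adverse impact is the four-fifths rule of thumb from the Uniform Guidelines: a group's pass rate is compared with the highest group's pass rate, and a ratio below 0.80 is treated as a signal worth investigating. The sketch below applies that comparison using the counts implied by the table's totals and percentages. The function name is our own, and the rule is a screening heuristic rather than a legal conclusion.

```python
def impact_ratios(groups):
    """Compare each group's pass rate with the highest group's pass rate.

    groups maps a group name to (number above the cutoff, total count).
    """
    rates = {name: passed / total for name, (passed, total) in groups.items()}
    highest = max(rates.values())
    return {name: rate / highest for name, rate in rates.items()}


# Counts implied by the totals and percentages in Table 8-6.
by_sex = {"Male": (7, 14), "Female": (13, 16)}
by_race = {"Hispanic": (6, 9), "White": (8, 11), "Black": (6, 10)}

for label, groups in (("Sex", by_sex), ("Race", by_race)):
    for name, ratio in impact_ratios(groups).items():
        flag = "below four-fifths" if ratio < 0.8 else "at or above four-fifths"
        print(f"{label}: {name} impact ratio {ratio:.2f} ({flag})")
```

With samples this small, a ratio near 0.80 can shift with a single hire, so the result is best read alongside the sample sizes.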
Questions
1. What kind of relationship exists between employees' scores on the manual dexterity test and their performance ratings?
2. Suppose a candidate scored 44 on the manual dexterity test. The regression equation predicting job performance using the manual dexterity test is
32.465 + (1.234 × Manual dexterity test score)
What is the candidate's predicted job performance?
3. Assume that only candidates with predicted performance above 85 are to be hired. This translates to a score of at least 43 on the manual dexterity test. Assume only those with scores above 43 were hired (20 of the 30 people in this sample). Would the use of this test have led to evidence of adverse impact based on sex or race? The relevant data on the 20 people exceeding the cutoff were presented earlier in the table.
4. Given the validity results you found, would you recommend use of this test as a selection device? If so, how would you use it?

Semester-Long Active Learning Project
Finish the assignment for Chapter 7 and begin researching, describing, and critically analyzing the alignment between the position you chose and the firm's existing assessment practices. Devise a series of assessment methods (interviews, assessment centers, work samples, and so forth) for evaluating job candidates. Using what you learned in Chapter 4, identify how your assessment plan will enable the company to be compliant with equal employment opportunity (EEO) laws and other legal requirements.
Case Study Assignment: Strategic Staffing at Chern's
See the appendix at the back of the book for this chapter's Case Study Assignment.

Chapter Supplement

ATTENUATION DUE TO UNRELIABILITY
Because correlations are an important tool for determining the strategic usefulness of a particular assessment tool and for legal defensibility, it is important to recognize that observed correlations can be influenced by some important factors that may exist in organizational contexts. Attenuation is the weakening of observed correlations. No measure is perfect, and unreliability of measures can attenuate correlations.39

Consider Figure 8-8. In the first scatter plot we have depicted a perfect correlation along with arrows indicating the addition of random error. In the second scatter plot you can see the impact of the added measurement error (unreliability). You can easily see how unreliability adds "noise" or error to the system, attenuating the correlation. The observed correlation has moved from being perfect, +1.0, to about +.67. In the most extreme case, a correlation with a random variable (a completely unreliable measure) will hover around zero. The impact of error is another illustration of why it is important to be systematic in your staffing system and procedures. If different interviewers are affected by noninterview factors then they cannot be reliable and their scores will not be as strongly correlated with later job performance. Observed correlations can be corrected for attenuation if you know the reliability of the measure(s). The formula is
$$\text{Corrected } r_{xy} = \frac{\text{Observed } r_{xy}}{\sqrt{r_{xx} \times r_{yy}}}$$

where $r_{xy}$ represents the correlation between X and Y and $r_{xx}$ and $r_{yy}$ represent the reliability of measures of X and Y, respectively.
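The correction can be written as a short function. This is a minimal sketch: the function name is our own, and the observed correlation and reliabilities in the example are hypothetical values chosen for illustration.

```python
from math import sqrt


def correct_for_attenuation(observed_r, reliability_x, reliability_y):
    """Divide the observed correlation by the square root of the product
    of the two reliabilities."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must be greater than 0 and at most 1")
    return observed_r / sqrt(reliability_x * reliability_y)


# Hypothetical example: an interview score (reliability .80) correlates .30
# with a performance rating (reliability .60).
corrected = correct_for_attenuation(0.30, 0.80, 0.60)
print(f"corrected correlation: {corrected:.2f}")
```

When both measures are perfectly reliable the function returns the observed correlation unchanged. The lower the reliabilities, the larger the upward correction.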
FIGURE 8-8 Unreliability and Correlation (two scatter plots of Y against X)

CORRECTION FOR RANGE RESTRICTION

Another factor that can influence correlations in organizations is range restriction. Organizations do not hire randomly and they keep only the best performing employees. This reduces the variability of employees within the organization. Range restriction refers to the reduction of variability in observed scores, often due to some management practice or use of a selection device. Range restriction tends to attenuate observed correlations.
For example, we downloaded data from the Bureau of Labor Statistics and correlated the relationship between educational level and income. Table 8-7 shows correlations using actual data with varying levels of range restriction.

From the table you would conclude that education is strongly related to income using the full sample but you might also conclude that education is unimportant if you included only those with advanced degrees. Why the difference? As education is increasingly restricted in range, it can have less and less relationship with income. Imagine if everyone had exactly the same level of education: education could not correlate with income.
As a staffing eumple, ass ume yo u sdectrd only th ose "'ho scored 1n the top quanrr of II penmnaluy 1cs1. You would rcstric1 -~ r.uige o f \"ariabi hty m the sco res on the personal ity tesL This IS a problem ...,hen esubh sh1 ng en teno n-rela ted va lid - 11) m organizations. If you use a selcc tm n too l, or any process conelatcd v:ith a sel ection tool. to make st:i.ffi ng decisio ns, the n
mm Restriction of Range Uses Selected
Al l powble cases Ori!y people w,th some mcome
Os!~o p!e who graduate fro m high
On ly people who graduate with some colle ge
Onl y pe<Jp le w,tn a rr aster's or PhD or MD oeg ree
Correl.ition
.838 452
.410
293
.001
,, 20
" y lO
o.i---~~~ ------- o 6
X 10 12
you wi ll nece ssarily re strict the range of scon::s . E,·en rtl3Jn . ing on ly the best e mp loyees wi ll res tric l variability in important ways. You can correc t a n observed corre la tio n for range rcstnc- tion if yo u know lhc unre stricted variability (or standard dei·,a. tion ). The formula is
Correctcdrxr = rxr(St!SR ) V I - ,iy +
where r:o- repre se nts th e corre lation between X and Y, and Su and SR refer to the unre stricted and restricted standard d~- vi alions, respcc ti\·el y. You ca n use lh is fonnul a to correct for range restriction on either the predictor or criterio n as long .b you have the res tricted and unrestricted standard deviations More sophi sticated fon nulas exist for indirect range restriction or simult:i.neous com:ction for mo re th a n o ne variablc .40
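As a rough illustration, the range restriction correction can be computed as follows. The function name and the example inputs are hypothetical, and the code implements the standard direct range restriction formula, which is assumed here to be the one the text intends.

```python
import math


def correct_for_range_restriction(r_xy: float, sd_unrestricted: float,
                                  sd_restricted: float) -> float:
    """Correct an observed correlation for direct range restriction
    (hypothetical helper; u is the ratio S_U / S_R)."""
    u = sd_unrestricted / sd_restricted
    return (r_xy * u) / math.sqrt(1 - r_xy ** 2 + (r_xy ** 2) * (u ** 2))


# Hypothetical inputs: observed r = .30, and the unrestricted standard
# deviation is twice the restricted one.
corrected = correct_for_range_restriction(0.30, sd_unrestricted=10.0,
                                          sd_restricted=5.0)
print(round(corrected, 3))
```

When the two standard deviations are equal there is no restriction, and the function returns the observed correlation unchanged.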
MEASUREMENT FORMULAS
Unless indicated otherwise, X values are the observed scores, Y values are outcome scores, and n values are the number of observations in the sample. Sample Excel formulas assume data exist in columns of cells ranging from A1 to A30, continuing with B1 to B30 and so forth.
Mean

$$\bar{X} = \frac{\sum X}{n}$$

In Excel: @average(A1:A30)

Range

$$X_{max} - X_{min}$$

In Excel: @Max(A1:A30)-@Min(A1:A30)

Sample Variance

$$var_x = \frac{\sum (X - \bar{X})^2}{n - 1}$$

In Excel: @Var(A1:A30)

Sample Standard Deviation

$s_x$ = positive square root of the variance

In Excel: @stdev(A1:A30)
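For readers who prefer to check these descriptive statistics outside Excel, here is a minimal Python sketch using the standard library `statistics` module. The eight scores are made up for illustration.

```python
import statistics

# Hypothetical observed scores (X values).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)           # sum of X divided by n
score_range = max(scores) - min(scores)  # X_max minus X_min
variance = statistics.variance(scores)   # sample variance, n - 1 denominator
std_dev = statistics.stdev(scores)       # positive square root of the variance

print(mean, score_range, round(variance, 3), round(std_dev, 3))
```

`statistics.variance` and `statistics.stdev` use the n - 1 denominator, matching the sample formulas above.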
Z Score

$$z = \frac{X - \bar{X}}{s_x}$$
Sample Covariance

$$cov_{xy} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n - 1}$$

Sample covariance in Excel: @covar(A1:A30,B1:B30)*[n/(n - 1)]

Correlation Coefficient

$$r_{xy} = \frac{cov_{xy}}{s_x s_y}$$

In Excel: @correl(A1:A30,B1:B30)
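The covariance and correlation formulas can be worked through directly, as in the sketch below. The helper names and the five data pairs are hypothetical and serve only to show the arithmetic.

```python
import math


def sample_covariance(x, y):
    """Sum of cross-products of deviations divided by n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def sample_sd(x):
    """Sample standard deviation: the covariance of x with itself is its variance."""
    return math.sqrt(sample_covariance(x, x))


def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    return sample_covariance(x, y) / (sample_sd(x) * sample_sd(y))


# Hypothetical predictor (x) and outcome (y) scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(sample_covariance(x, y), round(correlation(x, y), 3))
```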
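Simple Regression

$$y' = bx + a$$

$$b = \frac{cov_{xy}}{s_x^2} \quad \text{OR} \quad b = r_{xy}\left(\frac{s_y}{s_x}\right)$$

$$a = \bar{Y} - b\bar{X}$$

where $y'$ is the outcome variable being predicted, $x$ is the predictor, $b$ is the slope of the regression line, and $a$ is the constant or intercept.

In Excel: @slope(A1:A30,B1:B30) and @intercept(A1:A30,B1:B30). Both functions take the range holding the outcome values first and the range holding the predictor values second.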
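A short sketch of the simple regression computations, under the assumption of a small made-up data set. The function name is hypothetical; the slope uses the covariance form of the formula.

```python
def simple_regression(x, y):
    """Return (slope b, intercept a) for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical predictor (x) and outcome (y) scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, a = simple_regression(x, y)

# Predicted outcome for a hypothetical new predictor score of 6.
predicted = b * 6 + a
print(round(b, 3), round(a, 3), round(predicted, 3))
```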
Multiple Regression with Two Variables

$$y' = b_1 x_1 + b_2 x_2 + a$$

$$b_1 = \left[\frac{r_{y1} - r_{y2}\,r_{12}}{1 - r_{12}^2}\right] \times \frac{s_y}{s_1}$$

$$b_2 = \left[\frac{r_{y2} - r_{y1}\,r_{12}}{1 - r_{12}^2}\right] \times \frac{s_y}{s_2}$$

$$a = \bar{Y} - (b_1 \times \bar{X}_1) - (b_2 \times \bar{X}_2)$$

where $y'$ is the outcome variable being predicted, $x_1$ and $x_2$ are the two predictor variables, $b_1$ is the slope for variable $x_1$, $b_2$ is the slope for variable $x_2$, and $a$ is the constant or intercept. Also, $r_{y1}$, $r_{y2}$, and $r_{12}$ denote the correlations between the outcome and $x_1$, the outcome and $x_2$, and the two predictors $x_1$ and $x_2$, respectively. Additionally, $s_y$, $s_1$, and $s_2$ are the standard deviations of $y$, $x_1$, and $x_2$, respectively.
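The two-predictor formulas can be applied directly to summary statistics, as sketched below. The function name and all input values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def two_predictor_regression(r_y1, r_y2, r_12, s_y, s_1, s_2,
                             mean_y, mean_x1, mean_x2):
    """Return (b1, b2, a) from correlations, standard deviations, and means."""
    denominator = 1 - r_12 ** 2
    b1 = ((r_y1 - r_y2 * r_12) / denominator) * (s_y / s_1)
    b2 = ((r_y2 - r_y1 * r_12) / denominator) * (s_y / s_2)
    a = mean_y - (b1 * mean_x1) - (b2 * mean_x2)
    return b1, b2, a


# Hypothetical summary statistics for an outcome and two predictors.
b1, b2, a = two_predictor_regression(
    r_y1=0.5, r_y2=0.4, r_12=0.2,
    s_y=10.0, s_1=2.0, s_2=5.0,
    mean_y=50.0, mean_x1=6.0, mean_x2=100.0,
)
print(round(b1, 4), round(b2, 4), round(a, 4))
```

If the two predictors are uncorrelated ($r_{12} = 0$), each slope reduces to the simple regression slope $r(s_y/s_x)$.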
In Excel 2013, it is relatively easy to compute a multiple regression for multiple predictors.

1. Make sure the Analysis ToolPak is installed (other versions of Excel will require slightly different instructions). This is a Microsoft Office Excel add-in that adds custom commands or custom features to Microsoft Office.
   a. Open a new spreadsheet, then click the "File" tab in the upper left corner.
   b. Click "Options" at the bottom of the left sidebar.
   c. Click "Add-Ins" on the left. Make sure it says Manage: "Excel Add-ins." Click "Go."
   d. In the "Add-Ins available" box, select the "Analysis ToolPak" check box, and then click "OK." If you already have a check mark in "Analysis ToolPak," then it is already installed. Do not uncheck it; instead, click the red X in the upper right corner of the dialog box to cancel. Otherwise, continue.
   e. If you are prompted that the Analysis ToolPak is not currently installed on your computer, click "Yes" to install it. Once loaded, the Data Analysis command is available in the Analysis group on the Data tab.
2. Ensure each column is labeled and there are no missing data points in your data set.
3. Click "Data," then go to the "Analysis" box, and click "Data Analysis."
   a. Scroll down and click on "Regression," then "OK."
   b. Click on "Input Y Range" and type in the range for the outcome for Y or highlight it using your mouse, including the label for the column (e.g., K1:K49, if your outcome is in column K, you have 48 cases, and a column label).
   c. Click on "Input X Range" and type in the range for all of the predictors or highlight it using your mouse, including the label for the columns (e.g., B1:F49, if you have 48 cases, 5 predictors, and column labels).
   d. Click "Labels" if you have included them as suggested.
   e. Click "OK." You just ran a multiple regression. You can follow a similar process for computing a correlation matrix by selecting "Correlation" in the "Data Analysis" option.
Split-Half Reliability (Spearman-Brown)

$$r_{xx} = \frac{2\,r_{\frac{1}{2}\frac{1}{2}}}{1 + r_{\frac{1}{2}\frac{1}{2}}}$$

where $r_{\frac{1}{2}\frac{1}{2}}$ is the correlation between scores on each half of the measure.
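As a brief illustration of the Spearman-Brown step-up, the sketch below converts a half-test correlation into an estimate of full-length reliability. The function name and the value .60 are assumptions made for the example.

```python
def spearman_brown(r_half: float) -> float:
    """Estimate full-length reliability from the correlation between two halves."""
    return (2 * r_half) / (1 + r_half)


# Hypothetical correlation of .60 between the two halves of a measure.
reliability = spearman_brown(0.60)
print(round(reliability, 2))
```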
Coefficient Alpha

$$\alpha = \frac{k}{k - 1}\left[1 - \frac{\sum \sigma_i^2}{\sigma_y^2}\right]$$

where $k$ is the number of items, $\sigma_i^2$ is the variance of each item, and $\sigma_y^2$ is the variance of total scores on the measure.
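Coefficient alpha can be computed from a table of item responses, as in this sketch. The function name and the small response matrix (four respondents, two items) are hypothetical.

```python
import statistics


def coefficient_alpha(responses):
    """Compute coefficient alpha from rows of respondents and columns of items."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # one tuple of scores per item
    item_variances = [statistics.variance(item) for item in items]
    totals = [sum(row) for row in responses]   # total score per respondent
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: each row is one respondent's scores on two items.
responses = [
    [1, 2],
    [2, 1],
    [3, 4],
    [4, 3],
]
alpha = coefficient_alpha(responses)
print(round(alpha, 2))
```

The same variance estimator is used for the items and for the totals, so the ratio inside the brackets does not depend on whether the n or n - 1 denominator is chosen.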
Standard Error of Measurement

$$SEM = s_x\sqrt{1 - r_{xx}}$$

SEM = standard error of measurement, also known as the standard deviation of observed scores for a given true score
$r_{xx}$ = reliability
$s_x$ = standard deviation of observed scores
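A minimal sketch of the standard error of measurement calculation follows. The function name and the inputs (a standard deviation of 15 and a reliability of .91) are illustrative assumptions.

```python
import math


def standard_error_of_measurement(s_x: float, r_xx: float) -> float:
    """Standard deviation of observed scores times the square root of (1 - reliability)."""
    return s_x * math.sqrt(1 - r_xx)


# Hypothetical test with a standard deviation of 15 and reliability of .91.
sem = standard_error_of_measurement(s_x=15.0, r_xx=0.91)
print(round(sem, 2))
```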
Content Validity Ratio (CVR)

$$CVR = \frac{n_e - \frac{N}{2}}{\frac{N}{2}}$$

where $n_e$ = number of judges stating that the knowledge, ability, skill, competency, or other characteristic measured by the item is essential to job performance, and $N$ = the total number of judges rating the items.
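The content validity ratio is simple enough to verify by hand, but a short sketch makes the boundary cases visible. The function name and the panel of 10 judges are hypothetical.

```python
def content_validity_ratio(n_essential: int, n_judges: int) -> float:
    """CVR: how far the count of 'essential' ratings is above or below half the panel."""
    half = n_judges / 2
    return (n_essential - half) / half


# Hypothetical panel: 8 of 10 judges rate the item as essential.
cvr = content_validity_ratio(n_essential=8, n_judges=10)
print(cvr)
```

The ratio is 1 when every judge rates the item essential, 0 when exactly half do, and negative when fewer than half do.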
Supplementary Computations
You can access the data for this section online at the Pearson website (Universal Toys.xls).

You work at a manufacturing plant called Universal Toys and you were just put in charge of a project to evaluate the validity of the current selection system. Your job is to evaluate the usefulness of two conscientiousness measures using the same test taken two weeks apart, a cognitive ability test, and two interview scores. The conscientiousness measure is a standard personality measure for work environments with scores ranging from 1 to 10 (ConscT1 at time 1; ConscT2 at time 2, two weeks later). The cognitive ability measure is a standard measure of intelligence with typical scores ranging from 80 to 130 (CogAbil). The first interview score is from an unstructured interview with the hiring manager and it ranges from 1 to 7 (HiringMgr). The second interview score is from a structured interview with the HR manager assessing key capabilities and experience and it ranges from 1 to 7 (HRMgr). All of these assessments have been used for the past three years to hire employees, and you have data on 48 current manufacturing employees with their scores on the predictors recorded in your Human Resource Information System (HRIS). Scores for these 48 employees are reported in the spreadsheet along with their age, sex, race, position code, and managerial job performance ratings. You are also asked to conduct an adverse impact analysis. Race is coded as "0" for white, "1" for African American, and "2" for Hispanic/Latino. Males are coded as "0" and females as "1." Position 1 refers to assembly, 2 to equipment setup, and 3 to quality control. Performance is measured as a rating from the Team Production Leader and it ranges up to 50. Attach printouts or cut and paste into a document and perform the following analyses on the data:
1. Using Excel or counting by hand, plot the distribution of HR manager interview scores for employees numbered [...] frequency of [...].
2. By hand calculate the mean, median, mode, range, variance, and standard deviation for the cognitive ability test [...].
3. Using Excel, repeat the calculations you just did by hand to check your work.
4. Using Excel, calculate the mean, median, mode, variance, and standard deviation for all interval or ratio level variables, including performance, across all employees (1 through 48).
5. By hand create a scatter plot of cognitive ability and performance for employees numbered 33 through 48. By hand compute a correlation of these same data points. Confirm your findings using Excel.
6. By hand compute a simple regression of the relationship between cognitive ability and performance for employees numbered 33 through 48. Use Excel to confirm the slope and intercept. Compute a predicted score for someone with an ability score of 118.
7. Use the correlation function to correlate the two sets of conscientiousness scores. What does this correlation tell you? Use the correlation function to correlate each of the conscientiousness scores with performance. With a sample size of 48 and a p value of .05, any correlation over .285 (ignoring the sign) is statistically significant. What did you find?
8. Now use the Regression function in the Data Analysis option to predict job performance using the two conscientiousness scores. What did you find? Why?
9. Insert a column after the two conscientiousness scores. Compute an average score for conscientiousness using the two conscientiousness scores. Correlate the average conscientiousness score with job performance. Assume job performance has a reliability of .90. Use the reliability of the conscientiousness measure and the reliability of job performance to correct the observed correlation for attenuation.
10. Use the correlation function to correlate the two sets of interview scores. What is the value and what does this correlation tell you? Why do you think you observed this?
11. Use the Regression function in the Data Analysis option to predict job performance using the two interview scores. What did you find and what would you recommend?
12. Use the correlation function to correlate ability with job performance. Correct this correlation for range restriction, assuming the unrestricted population has a standard deviation of 15 for ability.
13. Use the Regression function in the Data Analysis option to predict job performance using the average conscientiousness measure you created, cognitive ability, and interview scores from the hiring and HR managers. Look at the p value for each predictor. Any value less than .05 is significant. What do you see and what would you conclude? Are there any variables you can drop? Why or why not?
14. Write out the multiple regression equation from your results including all variables. What is the predicted score for someone with scores of 6, 120, 5, and 7 for average conscientiousness, cognitive ability, hiring manager interview score, and HR manager interview score, respectively?
15. What do you conclude overall about the validity of the current system? Write a short paragraph. Rerun the regression analysis dropping variables if you deem it appropriate to do so. Write out the final multiple regression equation you would recommend using.
16. Look at the Applicant Data in the Applicant tab of the Excel spreadsheet. Use Excel and your multiple regression equation to compute predicted scores on all of the applicants. Assume you wanted to hire only the top 50 percent of these applicants based on their predicted performance. Evaluate the implications of hiring only the top 50 percent of applicants for adverse impact by sex and race. Write a short paragraph and include computations for adverse impact.
17. Given the validity and adverse impact analysis results you found, what would you recommend? Is there adverse impact? Is this a legally defensible staffing system? Do the assessments have reliability and validity?

Endnotes
1. "Xerox," 2013, http://www.xerox.com/about-xerox/enus.html.
2. Pearson, N., Lesser, E., and Sapp, J., "A New Way of Working," Executive Report, IBM Global Services, NY, 2010, ftp://ftp.software.ibm.com/software/solutions/soa/pdfs/GBE03295-USEN-00.pdf; LaValle, S., "Business Analytics and Optimization for the Intelligent Enterprise," IBM Global Services, NY, 2009, http://www-03.ibm.com/innovation/us/smarterplanet/assets/smarterBusiness/business_analytics/gbe03211-usen-00.pdf.
3. Kelly, J., "Human Resources Data Analytics Brings Metrics to Workforce Management," SearchBusinessAnalytics.com, December 11, 2008, http://searchbusinessanalytics.techtarget.com/news/1507118/Human-resources-data-analytics-brings-metrics-to-workforce-management; Neumann, D., "The Power of Analytics in Human Capital Management," Talent Management, April 2008, http://www.personneldecisions.com/uploadedFiles/Articles/PowerofAnalyticsHCMgmt.pdf.
4. Sullivan, J., "How Google Became the #3 Most Valuable Firm by Using People Analytics to Reinvent HR," ERE.net, February 25, 2013, http://www.ere.net/2013/02/25/how-google-became-the-3-most-valuable-firm-by-using-people-analytics-to-reinvent-hr/.
5. Simon, G., "Mastering Big Data: CFO Strategies to Transform Insight into Opportunity," FSN and Oracle Corporation, December 18, 2012, http://www.fsn.co.uk/channel_bi_bpm_cpm/mastering_big_data_cfo_strategies_to_transform_insight_into_opportunity#.UeVml237bar; SAS, "Big Data: What Is It?" July 16, 2013, http://www.sas.com/big-data/index.html.
6. Biro, M. M., "Big Data Is a Big Deal," Forbes, June 23, 2013, http://www.forbes.com/sites/meghanbiro/2013/06/23/big-data-is-a-big-deal/.
7. Taleb, N. N., "Beware the Big Errors of 'Big Data,'" Wired, February 8, 2013, http://www.wired.com/opinion/2013/02/big-data-means-big-errors-people/.
8. "Case Study: TEKsystems," Brainbench.com Skills Measurement Report, October 2003, www.brainbench.com/pdf/CS_TEKsystems.pdf.
9. "How Did a Large Provider of Business Services Reduce Turnover by Nearly 50%?" PeopleAnswers Newsletter, Spring 2009, http://www.peopleanswers.com/site/app?reqPage=aboutus_news_newsletters_detail&m1\lcmld=9#Article1; Testimonials, Curt Gray of AmeriPride, http://www.peopleanswers.com/site/app?reqPage=results_testimonials&sortBy=0; "About AmeriPride," http://www.ameripride.com/US_Info/AboutAmeriPride/OurHistory.jsp?bmUID=1275448271107.
10. Adapted from Stevens, S. S., "Measurement, Statistics, and the Schemapiric View," Science, 161 (1968): 850.
11. Singh, R., "The Emperor's New Metrics, Part 2," Electronic Recruiting Exchange, September 8, 2005, www.ere.net/articles/db/001D8B71AF754FA18E074813EB6ADD8A.asp.
12. Hansen, F., "Ad Firm Finds Recruiting Passive Candidates in a Downturn Is No Easy Sell," Workforce Management Online, January 2009, http://www.workforce.com/archive/feature/26/13/05/index.php.
13. Joint Commission Resources, "Looking at Staffing Effectiveness Data," February 2003, www.jcrinc.com/4016.
14. These are often called Venn diagrams but this term is not technically correct. Venn diagrams represent all hypothetically possible relations among sets or collections of sets. Ballantines are much less precise and do not represent relationships among sets. However, many people still call these overlapping circles Venn diagrams because of their similarity to them. Cohen, J., Cohen, P., West, S. G., and Aiken, L. S., Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (3rd ed.), Mahwah, NJ: Lawrence Erlbaum Associates, 2003.
15. Le, H., Oh, I. S., Robbins, S. B., Ilies, R., Holland, E., and Westrick, P., "Too Much of a Good Thing: Curvilinear Relationships Between Personality Traits and Job Performance," Journal of Applied Psychology, 96 (2011): 113-133; LaHuis, D. M., Martin, N. R., and Avis, J. M., "Investigating Nonlinear Conscientiousness-Job Performance Relations for Clerical Employees," Human Performance, 18, 3 (2005): 199-212; Vasilopoulos, N. L., Cucina, J. M., and Hunter, A. E., "Personality and Training Proficiency: Issues of Bandwidth-Fidelity and Curvilinearity," Journal of Occupational and Organizational Psychology, 80, 1 (2007): 109-131.
16. See Digman, J. M., "Personality Structure: Emergence of the Five-Factor Model," Annual Review of Psychology, 41 (1990): 417-440.
17. Carrig, K., and Wright, P. M., "Building Profit Through Building People," Workforce Management Online, January 2007, www.workforce.com/section/09/feature/24/65/90/index.html.
18. Cohen, Cohen, West, and Aiken, Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences; Cohen, J., and Cohen, P., Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, Hillsdale, NJ: Lawrence Erlbaum Associates, 1975. Cohen and Cohen were among the first to use a diagram of three intersecting circles to illustrate three variable regression relationships. They called it a ballantine because of its similarity to the logo for a particular brand of beer. The authors of this text are proud owners of a serving tray with a Ballantine logo.
19. Based on Testing and Assessment: An Employer's Guide to Good Practices, U.S. Department of Labor Employment and Training Administration, 2000, www.onetcenter.org/dl_files/empTestAsse.pdf.
20. Bretz, R. D., Jr., Milkovich, G. T., and Read, W., "The Current State of Performance Appraisal Research and Practice: Concerns, Directions, and Implications," Journal of Management, 18, 2 (1992): 321-352.
21. From Testing and Assessment, 3-3.
22. French, W. L., Human Resources Management (2nd ed.), Boston, MA: Houghton Mifflin Co., 1990.
23. Ibid., 260.
24. Ibid.
25. Testing and Assessment.
26. Ibid.
27. Ibid.
28. Rafilson, F., "The Case for Validity Generalization," ERIC/TM Digest, 1991, http://ericae.net/db/edo/ED338699.htm.
29. Spies, R. A., Plake, B. S., Geisinger, K. F., and Carlson, J. F. (eds.), The Eighteenth Mental Measurements Yearbook, Lincoln, NE: Department of Educational Psychology at the University of Nebraska, 2010. See www.unl.edu/buros.
30. Keyser, D. J., and Sweetland, R. C., Test Critiques (Vols. 1-10), Austin, TX: PRO-ED, 1984-1994.
31. Ryan, A. M., Wiechmann, D., and Hemingway, M., "Designing and Implementing Global Staffing Systems: Part II-Best Practices," Human Resource Management, 42 (2003): 85-94.
32. Adapted from Testing and Assessment.
33. Ibid.
34. Hansen, F., "Recruiting on the Right Side of the Law," Workforce Management Online, May 2006, www.workforce.com/section/06/feature/24/38/12/.
35. Taylor, H. C., and Russell, J. T., "The Relationship of Validity Coefficients to the Practical Effectiveness of Tests in Selection," Journal of Applied Psychology, 23 (1939): 565-578.
36. Ibid.
37. Marasco, C., "Firms Drop Billions to Enter Fast-Growing 'Algorithm Hiring' Market," The Motley Fool, October 11, 2012, http://beta.fool.com/chrismarasco/2012/10/11/people-wont-hire-you-algorithms-will/14009/.
38. Walker, J., "Meet the New Boss: Big Data," The Wall Street Journal, September 20, 2012, http://online.wsj.com/article/SB10000872396390443890304578006252019616768.html.
39. Charles, E. P., "The Correction for Attenuation Due to Measurement Error: Clarifying Concepts and Creating Confidence Sets," Psychological Methods, 10, 2 (2005): 206-226.
40. Sackett, P. R., Lievens, F., Berry, C. M., and Landers, R. N., "A Cautionary Note on the Effects of Range Restriction on Predictor Intercorrelations," Journal of Applied Psychology, 92, 2 (2007): 538-544.