
12 SELECTION METHODS

LEARNING GOALS

By the end of this chapter, you will be able to do the following:

12.1 Gather personal history data from job applicants in a manner that minimizes distortions and embellishments

12.2 Assess letters of recommendation and reference checks in terms of factors that affect their validity (e.g., degree of writer familiarity with the candidate and job in question)

12.3 Choose an appropriate honesty test (e.g., overt vs. personality oriented)

12.4 Use valid and reliable measures of past training and experience

12.5 Implement drug screening and polygraph testing using appropriate legal guidelines

12.6 Design and implement employment interviews taking into account possible response distortion and considering social/interpersonal, cognitive, and individual differences that affect the process and outcomes of interviews

12.7 Administer structured employment interviews that maximize validity and reliability

12.8 Use caution in relying on social media and other big data and technological advancements (e.g., mobile and Web-based technology, computer scoring of text, remote interviewing, and virtual reality technology) for selection purposes

Cascio, Wayne F., and Herman Aguinis. Applied Psychology in Talent Management, SAGE Publications, Incorporated, 2018. ProQuest Ebook Central, http://ebookcentral.proquest.com/lib/ashford-ebooks/detail.action?docID=6403253. Created from ashford-ebooks on 2024-09-03 15:16:51.

Copyright © 2018. SAGE Publications, Incorporated. All rights reserved.

PERSONAL HISTORY DATA

Selection and placement decisions often begin with an examination of personal history data (i.e., biodata) typically found in application forms, biographical inventories, and résumés. Undoubtedly one of the most widely used selection procedures is the application form. Like tests, application forms can be used to sample past or present behavior briefly but reliably. Studies of the application forms used by 200 organizations indicated that questions generally focused on information that was job related and necessary for the employment decision (Lowell & DeLoach, 1982; Miller, 1980). However, over 95% of the applications included one or more legally indefensible questions. To avoid potential problems, consider omitting any question that

• Might lead to an adverse impact on members of protected groups,

• Does not appear job related or related to a bona fide occupational qualification, or

• Might constitute an invasion of privacy (Miller, 1980).

What can applicants do when confronted by a question that they believe is irrelevant or an invasion of privacy? Some may choose not to respond. However, research indicates that employers tend to view such a nonresponse as an attempt to conceal facts that would reflect poorly on an applicant. Hence, applicants (especially those who have nothing to hide) are ill advised not to respond (Stone & Stone, 1987).

Psychometric principles can be used to quantify responses or observations, and the resulting numbers can be subjected to reliability and validity analyses in the same manner as scores collected using other types of measures. Statistical analyses of such group data are extremely useful in specifying the personal characteristics indicative of later job success.

Opinions vary regarding exactly what items should be classified as biographical, since such items may vary along a number of dimensions, for example, verifiable–unverifiable; historical–futuristic; actual behavior–hypothetical behavior; firsthand–secondhand; external–internal; specific–general; and invasive–noninvasive (see Table 12.1). This is further complicated by the fact that “contemporary biodata questions are now often indistinguishable from personality items in content, response format, and scoring” (Schmitt & Kunce, 2002, p. 570). Nevertheless, the core attribute of biodata items is that they pertain to historical events that may have shaped a person’s behavior and identity (Mael, 1991).

Some observers have advocated that only historical and verifiable experiences, events, or situations be classified as biographical items. Using this approach, most items on an application form would be considered biographical (e.g., rank in high school graduating class, work history). By contrast, if only historical, verifiable items are included, then questions such as the following would not be asked: “Did you ever build a model airplane that flew?” Cureton (see Henry, 1965, p. 113) commented that this single item, although it cannot easily be verified for an individual, was almost as good a predictor of success in flight training during World War II as the entire Air Force Battery.

Weighted Application Blanks

A priori one might suspect that certain aspects of an individual’s total background (e.g., years of education, previous experience) should be related to later job success in a specific position. The weighted application blank (WAB) technique provides a means of identifying which of these aspects reliably distinguish groups of effective and ineffective employees. Weights are assigned in accordance with the predictive power of each item, so that a total score can be derived for each individual. A cutoff score then can be established, which, if used in selection, will eliminate the maximum number of potentially unsuccessful candidates. Hence, one use of the WAB technique is as a rapid screening device, but it may also be used in combination with other data to improve selection and placement decisions. The technique is appropriate in any organization having a relatively large number of employees doing similar kinds of work and for whom adequate records are available. It is particularly valuable for use with positions requiring long and costly training, with positions where turnover is abnormally high, or in employment situations where large numbers of applicants are seeking a few positions (England, 1971).

Weighting procedures are simple and straightforward (Owens, 1976), but, once weights have been developed in this manner, it is essential that they be cross-validated. Since WAB procedures represent raw empiricism in the extreme, many of the observed differences in weights may reflect not true differences, but only chance fluctuations.
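The weighting, cutoff, and cross-validation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure given by England (1971) or Owens (1976): the item names, the two tiny samples, the cutoff of 5, and the weighting rule (difference in endorsement rates between effective and ineffective employees, expressed in units of 10 percentage points) are all hypothetical.

```python
# Hypothetical application-blank items; each response is True/False.
ITEMS = ["prior_experience", "two_years_in_last_job", "owns_car"]


def person(exp, tenure, car, effective):
    """Build one (responses, is_effective) record for the toy samples below."""
    responses = {
        "prior_experience": exp,
        "two_years_in_last_job": tenure,
        "owns_car": car,
    }
    return responses, effective


def derive_weights(records, items):
    """Assumed rule: weight = (rate among effective - rate among ineffective) * 10."""
    effective = [resp for resp, is_eff in records if is_eff]
    ineffective = [resp for resp, is_eff in records if not is_eff]
    weights = {}
    for item in items:
        p_eff = sum(resp[item] for resp in effective) / len(effective)
        p_ineff = sum(resp[item] for resp in ineffective) / len(ineffective)
        weights[item] = round((p_eff - p_ineff) * 10)
    return weights


def total_score(responses, weights):
    """Sum the weights of the items the applicant endorsed."""
    return sum(w for item, w in weights.items() if responses[item])


def hit_rate(records, weights, cutoff):
    """Proportion classified correctly when score >= cutoff predicts 'effective'."""
    hits = sum((total_score(resp, weights) >= cutoff) == is_eff
               for resp, is_eff in records)
    return hits / len(records)


# Derivation sample: used to develop the weights.
derivation_sample = [
    person(True, True, True, True),
    person(True, True, False, True),
    person(True, False, True, True),
    person(False, True, False, True),
    person(False, False, True, False),
    person(False, False, False, False),
    person(True, False, True, False),
    person(False, True, False, False),
]

# Holdout sample: never used to develop the weights (cross-validation).
holdout_sample = [
    person(True, False, True, True),
    person(False, False, True, True),
    person(True, True, False, False),
    person(False, False, False, False),
]

weights = derive_weights(derivation_sample, ITEMS)
derivation_hit_rate = hit_rate(derivation_sample, weights, cutoff=5)
holdout_hit_rate = hit_rate(holdout_sample, weights, cutoff=5)
print(weights, derivation_hit_rate, holdout_hit_rate)
```

In this toy example the key classifies the derivation sample better than the holdout sample, which is the kind of shrinkage that cross-validation is meant to expose. With real data, far larger samples would be needed before any weight could be trusted.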

Biographical Information Blanks

The biographical information blank (BIB) technique is closely related to the WAB technique. Like WABs, BIBs involve a self-report instrument; although items are exclusively in a multiple-choice format, typically a larger sample of items is included, and frequently items are included that are not normally covered in a WAB. Glennon, Albright, and Owens (1966) and Mitchell (1994) have published comprehensive catalogs of life history items covering various aspects of the applicant’s past (e.g., early life experiences, hobbies, health, social relations), as well as present values, attitudes, interests, opinions, and preferences. Although primary emphasis is on past behavior as a predictor of future behavior, BIBs frequently rely also on present behavior to predict future behavior. Usually BIBs are developed specifically to predict success in a particular type of work. One of the reasons they are so successful is that often they contain all the elements of consequence to the criterion (Asher, 1972). The mechanics of BIB development and item weighting are essentially the same as those used for WABs (Mumford & Owens, 1987; Mumford & Stokes, 1992).

TABLE 12.1 ■ A Taxonomy of Biographical Items

Historical: How old were you when you got your first paying job?
Future or hypothetical: What position do you think you will be holding in 10 years? What would you do if another person screamed at you in public?

External: Did you ever get fired from a job?
Internal: What is your attitude toward friends who smoke marijuana?

Objective: How many hours did you study for your real-estate license test?
Subjective: Would you describe yourself as shy? How adventurous are you compared to your coworkers?

Firsthand: How punctual are you about coming to work?
Secondhand: How would your teachers describe your punctuality?

Discrete: At what age did you get your driver’s license?
Summative: How many hours do you study during an average week?

Verifiable: What was your grade point average in college? Were you ever suspended from your Little League team?
Nonverifiable: How many servings of fresh vegetables do you eat every day?

Controllable: How many tries did it take you to pass the CPA exam?
Noncontrollable: How many brothers and sisters do you have?

Equal access: Were you ever class president?
Nonequal access: Were you captain of the football team?

Job relevant: How many units of cereal did you sell during the last calendar year?
Not job relevant: Are you proficient at crossword puzzles?

Noninvasive: Were you on the tennis team in college?
Invasive: How many young children do you have at home?

Source: Republished with permission of John Wiley and Sons Inc., from Mael F. A. (1991). Conceptual rationale for the domain and attributes of biodata items. Personnel Psychology, 44, 773; permission conveyed through Copyright Clearance Center, Inc.

Résumés

Résumés are a source of personal history data in most employee selection situations. Although résumés are now usually submitted electronically, as far back as 1975, the estimate was that about 1 billion paper résumés were screened each year (Brown & Campion, 1994). When examiners extract personal history data from a résumé, they are particularly prone to cognitive biases and heuristics because information is often limited to one or two pages. Specifically, applicants are likely to be placed into stereotype-based categories in a rather automatic fashion, and then attributes believed to be typical of the group are assigned to individual applicants, even if those beliefs are factually incorrect. Many so-called “paper people” or “vignette” studies (Aguinis & Bradley, 2014) have been conducted in which résumés of hypothetical applicants are presented to judges, who have to provide ratings regarding each applicant’s job suitability (Derous, Ryan, & Serlie, 2015).

Social categorization can take place on more than one category. For example, Derous et al. (2015) conducted an experiment in which 60 Dutch recruiters rated the job suitability of applicants whose résumés included information on ethnicity (Dutch, Arab) and gender (female, male). Results showed that ratings were influenced by applicants’ ethnicity (i.e., Arabs were rated more negatively) and gender (i.e., men were rated more negatively), raters’ prejudice (i.e., those with more negative attitudes toward a particular group rated members of those groups more negatively), and job characteristics (i.e., results were more pronounced when jobs included more client contact).

A recent innovation is the use of video résumés, which are recorded video and audio messages in which job applicants can present themselves to potential employers. Video résumés allow applicants to express themselves in a way that is not possible using the more traditional paper format. It is also possible to create multimedia résumés, in which job applicants also include animations and text (Hiemstra, Derous, Serlie, & Born, 2012). Not much research is yet available on video résumés; however, Hiemstra et al. (2012) conducted a study involving 445 unemployed job seekers who had received a two-day job-application training in the Netherlands and found that they perceived video résumés to be fairer than traditional paper résumés regardless of applicant ethnicity (i.e., Dutch, Turkish, Moroccan, Surinamese/Antillean, other non-Westerners, and other Western applicants).

Overall, given the many factors that influence raters’ evaluation of personal history data based on résumé screening, it is important to (a) train raters to make sure they focus on job-related factors and (b) assess interrater reliability (Brown & Campion, 1994). Another important concern is the extent to which applicants may distort the information they provide, hoping to increase their chances of receiving a job offer. We discuss this topic later in the chapter.

Credit History

The big data movement has provided organizations with personal history data that were unthinkable just a few years ago. For example, a survey of members of the Society for Human Resource Management revealed that about 50% of employers conduct credit background checks on at least some applicants (Bernerth, 2012). One type of personal history data, credit scores, seems to be an objective indicator of a job applicant’s conscientiousness and even integrity, two clearly desirable KSAs for many jobs. If an applicant fails to keep a promise to his or her financial institution, this may be an indicator that he or she will similarly fail to keep a promise at work. Also, perhaps individuals who are under financial duress may be more prone to engaging in counterproductive behaviors at work (e.g., theft) (Bernerth, 2012).

Using credit background checks for employment purposes is legally permissible in the United States under the Fair Credit Reporting Act if applicants provide written authorization (Bernerth, 2012). However, some states, including California, Colorado, Connecticut, Delaware, Hawaii, Illinois, Maryland, Nevada, Oregon, Vermont, and Washington, as well as Washington, D.C., have restricted the use of credit histories of applicants and employees. For example, Colorado’s Employment Opportunity Act (SB13-018) prohibits an employer’s use of consumer credit information for employment purposes if the information is unrelated to the job. Moreover, it requires an employer to disclose to an employee or applicant if the employer uses consumer credit information to take adverse action against the employee or applicant and the particular credit information upon which the employer relied. It also authorizes an aggrieved employee to sue for an injunction, damages, or both.

The regulations in these jurisdictions seem justified, given evidence that credit scores are related to several demographic variables that in many cases are unrelated to job performance. For example, Bernerth (2012) collected Fair Isaac Corporation (FICO) scores for 112 university employees and alumni and conducted a regression analysis using credit scores as the criterion and the following demographic variables as predictors: minority status (nonminority, minority); gender (male, female); marital status (never been divorced, divorced); educational attainment (high school degree/GED, some college, 2-year college degree, 4-year college degree, some graduate or professional education, graduate degree); and age. The five predictors combined accounted for 34% of variance in credit scores, and the predictors (a) minority status (minority status associated with lower scores), (b) educational attainment (less education associated with lower scores), and (c) age (younger applicants received lower scores) had the strongest effects. Although educational attainment may be a job-related factor for some occupations and positions, the strong relation between ethnicity and credit scores guarantees that the use of this particular type of personal history data will result in adverse impact. In addition to legal issues, the use of credit scores has ethical connotations. Specifically, “critics of credit scores contend that using such information to make hiring decisions unfairly disadvantages individuals with low scores and traps them in a ‘vicious downward spiral’ where unemployment damages personal credit which, in turn, can hurt their job prospects” (Bernerth, 2012, p. 245). As is the case for all types of predictors, validity information is required, and this is particularly important in the presence of adverse impact.

Response Distortion in Personal History Data

Can job applicants intentionally distort personal history data? The answer is yes. For example, the “sweetening” of résumés is not uncommon, and one study reported that 20% to 25% of all résumés and job applications include at least one major fabrication (LoPresto, Mitcham, & Ripley, 1986). The extent of self-reported distortion was found to be even higher when data were collected using the randomized-response technique, which absolutely guarantees response anonymity and thereby allows for more honest self-reports (Donovan, Dwight, & Hurtz, 2003).
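The logic of the randomized-response technique can be illustrated with a short Python sketch. The chapter does not describe which variant Donovan, Dwight, and Hurtz (2003) used, so the following assumes a generic "forced-response" design: each respondent privately uses a randomizing device and, with probability p_truth, answers the sensitive question honestly; otherwise he or she simply answers "yes." No individual "yes" is incriminating, yet the group-level rate can be recovered because the observed proportion of "yes" answers equals p_truth × (true rate) + (1 − p_truth). The function names, the 30% true rate, and the sample size are hypothetical.

```python
import random


def simulate_answers(true_rate, n, p_truth, rng):
    """Simulate n respondents under the assumed forced-response design."""
    answers = []
    for _ in range(n):
        distorted = rng.random() < true_rate   # respondent's true status
        if rng.random() < p_truth:
            answers.append(distorted)          # instructed to answer honestly
        else:
            answers.append(True)               # instructed to say "yes" regardless
    return answers


def estimate_prevalence(answers, p_truth):
    """Solve observed_yes = p_truth * rate + (1 - p_truth) for the rate."""
    observed_yes = sum(answers) / len(answers)
    return (observed_yes - (1 - p_truth)) / p_truth


rng = random.Random(0)  # fixed seed so the illustration is repeatable
answers = simulate_answers(true_rate=0.30, n=20000, p_truth=0.5, rng=rng)
estimate = estimate_prevalence(answers, p_truth=0.5)
print(round(estimate, 3))
```

The estimate applies only to the group; the design offers no way to tell which individuals distorted their responses, which is precisely why it encourages candid answers.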

A study in which participants were instructed to “answer questions in such a way as to make you look as good an applicant as possible” and to “answer questions as honestly as possible” resulted in scores almost two standard deviations higher for the “fake good” condition (McFarland & Ryan, 2000). In fact, the difference between the “fake good” and the “honest” experimental conditions was larger for a biodata inventory than for other measures including personality traits such as extraversion, openness to experience, and agreeableness. In addition, individuals differed in the extent to which they were able to fake (as measured by the difference between individuals’ scores in the “fake good” and “honest” conditions). So, if they want to, individuals can distort their responses, but some people are more able than others to do so.

Fortunately, there are situational characteristics that an examiner can influence, which may make it less likely that job applicants will distort personal history information. The first such characteristic is the extent to which information can be verified. More objective and verifiable items are less amenable to distortion (Kluger & Colella, 1993). The concern with being caught seems to be an effective deterrent to faking. Second, option-keyed items are less amenable to distortion (Kluger, Reilly, & Russell, 1991). With this strategy, each item-response option (alternative) is analyzed separately and contributes to the score only if it correlates significantly with the criterion. Third, distortion is less likely if applicants are warned of the presence of a lie scale (Kluger & Colella, 1993) and if biodata are used in a non-evaluative, classification context (Fleishman, 1988). A fourth approach involves asking job applicants to elaborate on their answers. These elaborations require job applicants to describe more fully the manner in which their responses are true or to describe incidents to illustrate and support their answers (Schmitt & Kunce, 2002). For example, for the question “How many work groups have you led in the past 5 years?” the elaboration request can be “Briefly describe the work groups and projects you led” (Schmitt & Kunce, 2002, p. 586). The rationale for this approach is that requiring elaboration forces the applicant to remember more accurately and to minimize managing a favorable impression. The use of the elaboration approach led to a reduction in scores of about .6 standard deviation units in a study including 311 examinees taking a pilot form of a selection instrument for a federal civil service job (Schmitt & Kunce, 2002). Similarly, a study including more than 600 undergraduate students showed that those in the elaboration condition provided responses much lower than those in the non-elaboration condition (Schmitt, Oswald, Kim, Gillespie, & Ramsay, 2003).
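The option-keying strategy mentioned above can be sketched as follows. This is a simplified illustration rather than the procedure of Kluger, Reilly, and Russell (1991): the single item, its three response options, the criterion ratings, the unit weights (+1 or −1), and the use of a minimum absolute correlation in place of a formal significance test are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def build_option_key(choices, criterion, options, min_abs_r):
    """Analyze each response option separately against the criterion.

    An option enters the key only if its correlation with the criterion is at
    least min_abs_r in absolute value (a stand-in for a significance test).
    """
    key = {}
    for option in options:
        chose_option = [1 if c == option else 0 for c in choices]
        r = pearson(chose_option, criterion)
        if abs(r) >= min_abs_r:
            key[option] = 1 if r > 0 else -1
    return key


def score_response(choice, key):
    """Options that did not enter the key contribute nothing to the score."""
    return key.get(choice, 0)


# Hypothetical data: eight employees' answers to one item and their ratings.
choices = ["A", "A", "A", "B", "B", "C", "C", "C"]
performance = [1, 2, 1, 3, 2, 5, 4, 5]

option_key = build_option_key(choices, performance, ["A", "B", "C"], min_abs_r=0.5)
print(option_key)
```

Because each option is keyed on its own, an applicant cannot assume that the most socially desirable alternative is the one that earns credit. A key built on so few cases would of course need cross-validation before use.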

Validity of Personal History Data

Properly cross-validated biodata have been developed for many occupations, including life insurance agents; law enforcement officers; service station managers; sales clerks; unskilled, clerical, office, production, and management employees; engineers; architects; research scientists; and Army officers. Criteria include turnover (by far the most common), absenteeism, rate of salary increase, performance ratings, number of publications, success in training, creativity ratings, sales volume, and employee theft.

Evidence indicates that the validity of personal history data as a predictor of future work behavior is quite good. For example, Reilly and Chao (1982) reviewed 58 studies that used biographical information as a predictor. Over all criteria and over all occupations, the average validity was .35. A subsequent meta-analysis of 44 such studies revealed an average validity of .37 (Hunter & Hunter, 1984). A later meta-analysis that included results from eight studies of salespeople’s performance that used supervisory ratings as the criterion found a mean validity coefficient (corrected for criterion unreliability) of .33 (Vinchur, Schippmann, Switzer, & Roth, 1998).
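For readers unfamiliar with the phrase “corrected for criterion unreliability,” the standard correction divides the observed validity coefficient by the square root of the criterion’s reliability. The sketch below shows only the arithmetic; the observed correlation of .25 and the reliability of .52 are hypothetical values chosen for illustration and are not taken from Vinchur et al. (1998).

```python
from math import sqrt


def correct_for_criterion_unreliability(observed_r, criterion_reliability):
    """Classical correction for attenuation applied to the criterion only."""
    if not 0 < criterion_reliability <= 1:
        raise ValueError("criterion reliability must be in (0, 1]")
    return observed_r / sqrt(criterion_reliability)


# Hypothetical inputs, not values reported in the chapter.
corrected = correct_for_criterion_unreliability(observed_r=0.25,
                                                criterion_reliability=0.52)
print(round(corrected, 2))
```

The corrected value estimates what the validity would be if the criterion (here, supervisory ratings) were measured without error, so it is never smaller than the observed coefficient.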

As a specific illustration of the predictive power of these types of data, consider a study that used a concurrent validity design including more than 300 employees in a clerical job. A rationally selected, empirically keyed, and cross-validated biodata inventory accounted for incremental variance in the criteria over that accounted for by measures of personality and general cognitive abilities (Mount, Witt, & Barrick, 2000). Specifically, biodata accounted for about 6% of incremental variance for quantity and quality of work, about 7% for interpersonal relationships, and about 9% for retention. As a result, we now have empirical support for the following statement by Owens (1976) from more than four decades ago:

Personal history data also broaden our understanding of what does and does not contribute to effective job performance. An examination of discriminating item responses can tell a great deal about what kinds of employees remain on a job and what kinds do not, what kinds sell much insurance and what kinds sell little, or what kinds are promoted slowly and what kinds are promoted rapidly. Insights obtained in this fashion may serve anyone from the initial interviewer to the manager who formulates employment policy. (p. 612)
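The idea of incremental variance in the Mount, Witt, and Barrick (2000) study can be made concrete with a small calculation. The sketch below is a simplification under stated assumptions: the existing predictors (personality and cognitive ability) are collapsed into a single baseline composite, and the three correlations are hypothetical rather than values from the study. With two predictors, the squared multiple correlation can be computed directly from the three pairwise correlations, and the increment is that value minus the variance explained by the baseline alone.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def incremental_variance(r_y_baseline, r_y_biodata, r_baseline_biodata):
    """Variance explained by biodata beyond the baseline predictor alone."""
    full = r_squared_two_predictors(r_y_baseline, r_y_biodata, r_baseline_biodata)
    return full - r_y_baseline ** 2


# Hypothetical correlations chosen only to illustrate the arithmetic.
delta_r2 = incremental_variance(r_y_baseline=0.30,
                                r_y_biodata=0.35,
                                r_baseline_biodata=0.20)
print(round(delta_r2, 3))
```

The increment shrinks as the overlap between biodata and the baseline predictors grows, which is why a predictor with a respectable validity coefficient can still add little once other measures are in place.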

A caution is in order, however. Commonly, biodata keys are developed on samples of job incumbents, and it is assumed that the results generalize to applicants. However, a large-scale field study that used more than 2,200 incumbents and 2,700 applicants found that 20% or fewer of the items that were valid in the incumbent sample were also valid in the applicant sample. Clearly motivation and job experience differ in the two samples. The implication: Match incumbent and applicant samples as closely as possible, and do not assume that predictive and concurrent validities are similar for the derivation and validation of BIB scoring keys (Stokes, Hogan, & Snell, 1993).

Bias and Adverse Impact

Since the passage of Title VII of the 1964 Civil Rights Act, personal history items have come under intense legal scrutiny. While not unfairly discriminatory per se, such items legitimately may be included in the selection process only if it can be shown that (a) they are job related and (b) they do not unfairly discriminate against either minority or nonminority subgroups.

In one study, Cascio (1976b) reported cross-validated validity coefficients of .58 (minorities) and .56 (nonminorities) for female clerical employees against a tenure criterion. When separate expectancy charts were constructed for the two groups, no significant differences in WAB scores for minorities and nonminorities on either predictor or criterion measures were found. Hence, the same scoring key could be used for both groups.

Results from several studies have concluded that biodata inventories are relatively free of adverse impact, particularly when items do not reflect cognitive abilities (Breaugh, 2009). However, a meta-analysis by Bobko and Roth (2013) emphasized that most results are based on concurrent validity designs using incumbent samples, which likely decrease observed ethnicity-based differences. They estimated that the black–white mean standardized difference is d = .31, which was based on biodata that included a large number of KSAs.
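The standardized mean difference d reported above expresses the gap between two group means in pooled standard deviation units. A minimal sketch of the computation follows; the two sets of biodata scores are hypothetical and are unrelated to the Bobko and Roth (2013) estimate.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(group_a, group_b):
    """Mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = ((n_a - 1) * variance(group_a) +
                       (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical biodata scores for two applicant subgroups.
subgroup_1 = [52, 48, 50, 54, 46]
subgroup_2 = [49, 45, 47, 51, 43]

d = standardized_difference(subgroup_1, subgroup_2)
print(round(d, 2))
```

Because d is scale free, values from inventories with different score ranges can be compared or averaged in a meta-analysis.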

Unfortunately, other than the degree of cognitive abilities saturation, when differences exist, we often do not know why. This reinforces the idea of using a rational (as opposed to an entirely empirical) approach to developing biodata inventories, because it has the greatest potential for allowing us to understand the underlying constructs, how they relate to criteria of interest, and how to minimize between-group score differences. As noted by Stokes and Searcy (1999):

With increasing evidence that one does not necessarily sacrifice validity to use more rational procedures in development and scoring biodata forms, and with concerns for legal issues on the rise, the push for rational methods of developing and scoring biodata forms is likely to become more pronounced. (p. 84)

What Do Biodata Mean?

Criterion-related validity is not the only consideration in establishing job relatedness. Items that bear no rational relationship to the job in question (e.g., “applicant does not wear eyeglasses” as a predictor of theft) are unlikely to be acceptable to courts or regulatory agencies, especially if total scores produce adverse impact on a protected group. Nevertheless, external or empirical keying is the most popular scoring procedure and consists of focusing on the prediction of an external criterion using keying procedures at either the item or the item-option level (Stokes & Searcy, 1999). As defined by Mael (1991), “[T]he core attribute of biodata items is that the items pertain to historical events that may have shaped the person’s behavior and identity” (p. 763). Accordingly, as shown in Table 12.1, items measure behavioral intentions, self-descriptions of personality traits, and personal interests, among other constructs. Note, however, that biodata inventories resulting from a purely empirical approach do not help us understand what constructs are measured.

More prudent and reasonable is the rational approach, including job analysis information to deduce hypotheses concerning success on the job under study and to seek from existing, previously researched sources either items or factors that address these hypotheses (Stokes & Cooper, 2001). Essentially, we are asking the following questions: “What do biodata mean?” “Why do past behaviors and performance or life events predict non-identical future behaviors and performance?” (Breaugh, 2009; Dean & Russell, 2005). Thus, in a study of recruiters’ interpretations of biodata items from résumés and application forms, Brown and Campion (1994) found that recruiters deduced language and math abilities from education-related items, physical ability from sports-related items, and leadership and interpersonal attributes from items that reflected previous experience in positions of authority and participation in activities of a social nature. Nearly all items were thought to tell something about a candidate’s motivation. The next step is to identify hypotheses about the relationship of such abilities or attributes to success on the job in question. This rational approach has the advantage of enhancing both the utility of selection procedures and our understanding of how and why they work (cf. Mael & Ashforth, 1995). Moreover, it is probably the only legally defensible approach for the use of personal history data in employment selection.

The rational approach to developing biodata inventories has proven fruitful beyond employment testing contexts. For example, Douthitt, Eby, and Simon (1999) used this approach to develop a biodata inventory to assess people’s degree of receptiveness to dissimilar others (i.e., general openness to dissimilar others). As an illustration, for the item “How extensively have you traveled?” the rationale is that travel provides for direct exposure to dissimilar others and those who have traveled to more distant areas have been exposed to more differences than those who have not. Other items include “How racially (ethnically) integrated was your high school?” and “As a child, how often did your parent(s) (guardian(s)) encourage you to explore new situations or discover new experiences for yourself?” Results of a study including undergraduate students indicated that the rational approach paid off because there was strong preliminary evidence in support of the scale’s reliability and validity. However, even if the rational approach is used, the validity of biodata items can be affected by the life stage in which the item is anchored (Dean & Russell, 2005). In other words, framing an item around a specific, hypothesized developmental time (i.e., childhood versus past few years) is likely to help applicants provide more accurate responses by giving them a specific context to which to relate their response.


RECOMMENDATIONS AND REFERENCE CHECKS

Another source of personal history data is information provided by others in the form of recommendations and reference checks. Many prospective users ask a practical question: “Are recommendations and reference checks worth the amount of time and money it costs to process and consider them?” In general, four kinds of information are obtainable: (1) employment and educational history (including confirmation of degree and class standing or grade point average); (2) evaluation of the applicant’s character, personality, and interpersonal competence; (3) evaluation of the applicant’s job performance ability; and (4) willingness to rehire.

For a recommendation to make a meaningful contribution to the screening and selection process, however, certain preconditions must be satisfied. The recommender must have had an adequate opportunity to observe the applicant in job-relevant situations, he or she must be competent to make such evaluations, he or she must be willing to be open and candid, and the evaluations must be expressed so that the potential employer can interpret them in the manner intended (McCormick & Ilgen, 1985). Although the value of recommendations can be impaired by deficiencies in any one or more of the four preconditions, unwillingness to be candid is probably the most serious. However, to the extent that the truth of any unfavorable information cannot be demonstrated and it harms the reputation of the individual in question, providers of references may be guilty of defamation in their written (libel) or oral (slander) communications (Ryan & Lasek, 1991).

Written recommendations are considered by some to be of little value. For example, consider the opinions based on a survey of about 600 HR professionals with titles such as recruiting manager, employment lawyer, personnel consultant, and human resources specialist (Nicklin & Roch, 2009). About 80% of respondents agreed with the statement that “letter inflation is a problem that will never be entirely alleviated.” To a large extent, this opinion is justified, since the available evidence indicates that the average validity of recommendations is .14 (Reilly & Chao, 1982). A meta-analysis focused exclusively on academic performance found similar results: the average observed correlation with GPA in medical school was .13 (N = 916) and the correlation with clinical and internship performance was .12 (N = 1,120). The average correlation with GPA in college seems higher, r = .28 (N = 5,155) (Kuncel, Kochevar, & Ones, 2014). However, a meta-regression analysis (Gonzalez-Mulé & Aguinis, in press) showed that, above and beyond undergraduate GPA and verbal and quantitative GRE exam scores, letters of recommendation contributed only .003 additional proportion of variance to the prediction of grade point average in graduate school and only .011 to the prediction of faculty ratings of performance. Results were slightly more encouraging regarding the proportion of additional variance explained in the prediction of degree attainment: .024.

One of the biggest problems, and possibly the main reason for their overall lack of value-added predictive power, is that such recommendations rarely include unfavorable information and, therefore, do not discriminate among candidates. In addition, the affective disposition of letter writers has an impact on letter length, which, in turn, has an impact on the favorability of the letter (Judge & Higgins, 1998). In many cases, therefore, the letter may be providing more information about the person who wrote it than about the person described in the letter.

The fact is that decisions are made on the basis of letters of recommendation, particularly in academic settings (Nicklin & Roch, 2009). If such letters are to be meaningful, they should contain the following information (Knouse, 1987):

• Degree of writer familiarity with the candidate: This should include time known and time observed per week.

• Degree of writer familiarity with the job in question: To help the writer make this judgment, the person soliciting the recommendation should supply to the writer a description of the job in question.

• Specific examples of performance: This should cover such aspects as goals, task difficulty, work environment, and extent of cooperation from coworkers.

• Individuals or groups to whom the candidate is compared.

Unfortunately, many employers believe that reference checks are not permissible under the law. This is not true (Hedricks, Robie, & Oswald, 2013). In fact, employers may do the following: seek information about applicants, interpret and use that information during selection, and share the results of reference checking with another employer (Sewell, 1981). Moreover, employers may be found guilty of negligent hiring if they should have known at the time of hire about the unfitness of an applicant (e.g., prior job-related convictions, propensity for violence) that subsequently causes harm to an individual (Gregory, 1988; Ryan & Lasek, 1991). In other words, failure to check closely enough could lead to legal liability for an employer.

Reference checking is a valuable screening tool (see Box 12.1). An average validity of .26 was found in a meta-analysis of reference-checking studies (Hunter & Hunter, 1984). To be most useful, however, reference checks should be

• Consistent: If an item is grounds for denial of a job to one person, it should be the same for any other person who applies.

• Relevant: Employers should stick to items of information that really distinguish effective from ineffective employees.

• Written: Employers should keep written records of the information obtained to support the ultimate hiring decision made.

• Based on public records: Such records include court records, workers’ compensation, and bankruptcy proceedings. (Ryan & Lasek, 1991; Sewell, 1981)

Reference checking can also be done via telephone interviews (Taylor, Pajo, Cheung, & Stringfield, 2004). In a study implementing a procedure labeled structured telephone reference check (STRC), a total of 448 telephone reference checks were conducted on 244 applicants for customer-contact jobs (about two referees per applicant) (Taylor et al., 2004). STRCs took place over an eight-month period; they were conducted by recruiters at one of six recruitment consulting firms, and they lasted on average 13 minutes. Questions focused on measuring three constructs: conscientiousness, agreeableness, and customer focus. Recruiters asked each referee to rate the applicant compared to others they have known in similar positions, using the following scale: 1 = below average, 2 = average, 3 = somewhat above average, 4 = well above average, and 5 = outstanding. Note that the scale used is a relative, versus absolute, rating scale so as to minimize leniency in ratings. As an additional way to minimize leniency, referees were asked to elaborate on their responses. As a result of the selection process, 191 of the 244 applicants were hired, and data were available regarding the performance of 109 of these employees (i.e., those who were still employed at the end of the first performance appraisal cycle). A multiple-regression model predicting supervisory ratings of overall performance based on the three dimensions assessed by the STRC resulted in R² = .28, but customer focus was the only one of the three dimensions that predicted supervisory ratings (i.e., standardized regression coefficient of .28).

In closing, few organizations are willing to abandon altogether the practice of recommendation and reference checking, despite all the shortcomings. One need only listen to a grateful manager thanking the HR department for the good reference checking that “saved” him or her from making a bad offer to understand why. Also, from a practical standpoint, a key issue to consider is the extent to which the constructs assessed by recommendations and reference checks provide unique information above and beyond other data collection methods that we describe later in this chapter (e.g., employment interview).

POLYGRAPH TESTS

Polygraph instruments are intended to detect deception and are based on the measurement of physiological processes (e.g., heart rate) and changes in those processes. An examiner infers whether a person is telling the truth or lying based on charts of physiological measures in response to the questions posed and observations during the polygraph examination. Although they are often used for event-specific investigations (e.g., after a crime), they are also used (on a limited basis) for both employment and preemployment screening.

The use of polygraph tests has been severely restricted by a federal law passed in 1988. This law, the Employee Polygraph Protection Act, prohibits private employers (except firms providing security services and those manufacturing controlled substances) from requiring or requesting preemployment polygraph exams. Polygraph exams of current employees are permitted only under very restricted circumstances. Nevertheless, many agencies (e.g., U.S. Department of Energy) are using polygraph tests, given the security threats posed by international terrorism.

BOX 12.1 HOW TO GET USEFUL INFORMATION FROM A REFERENCE CHECK

In today’s environment of caution, many supervisors are hesitant to provide information about a former employee, especially over the telephone or via e-mail. To encourage them, consider doing the following:

• Take the supervisor out of the judgmental past and into the role of an evaluator of a candidate’s abilities.

• Remove the perception of potential liability for judging a former subordinate’s performance by asking for advice on how best to manage the person to bring out his or her abilities.

Questions such as the following might be helpful (Falcone, 1995):

• We’re a mortgage banking firm in an intense growth mode. The phones don’t stop ringing, the paperwork is endless, and we’re considering Mary for a position in our customer service unit dealing with our most demanding customers. Is that an environment in which she would excel?

• Some people constantly look for ways to reinvent their jobs and assume responsibilities beyond the basic job description. Others adhere strictly to their job duties and “don’t do windows,” so to speak. Can you tell me where Ed fits on that continuum?

Although much of the public debate over the polygraph as a lie detector focuses on ethical problems (Aguinis & Handelsman, 1997a, 1997b), at the heart of the controversy is validity: the relatively simple question of whether physiological measures can assess truthfulness and deception (Saxe, Dougherty, & Cross, 1985). An analysis of the scientific evidence on this issue is contained in a report by the National Research Council, which operates under a charter granted by the U.S. Congress. Its Committee to Review the Scientific Evidence on the Polygraph (2003) conducted a quantitative analysis of 57 independent studies investigating the accuracy of the polygraph and concluded the following:

• Polygraph accuracy for screening purposes is almost certainly lower than what can be achieved by specific-incident polygraph tests.

• The physiological indicators measured by the polygraph can be altered by conscious efforts through cognitive or physical means.

• Using the polygraph for security screening yields an unacceptable choice between too many loyal employees falsely judged deceptive and too many major security threats left undetected.

In sum, as concluded by the committee, the polygraph’s “accuracy in distinguishing actual or potential security violators from innocent test takers is insufficient to justify reliance on its use in employee security screening in federal agencies” (p. 6). These conclusions are consistent with the views of scholars in relevant disciplines. Responses to a survey completed by members of the Society for Psychophysiological Research and Fellows of the American Psychological Association’s Division 1 (General Psychology) indicated that the use of polygraph testing is not theoretically sound, claims of high validity for these procedures cannot be sustained, and polygraph tests can be beaten by countermeasures (Iacono & Lykken, 1997).
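The trade-off the committee describes, between loyal employees falsely judged deceptive and security threats left undetected, follows from simple arithmetic when true threats are rare. The sketch below is illustrative only: the number of examinees, the number of threats, and the sensitivity and specificity values are hypothetical assumptions chosen for the example and are not figures taken from the committee's report.

```python
def screening_outcomes(n_examinees, n_threats, sensitivity, specificity):
    """Expected outcome counts for one screening program (all inputs hypothetical)."""
    n_innocent = n_examinees - n_threats
    detected = n_threats * sensitivity               # threats correctly flagged
    missed = n_threats - detected                    # threats judged truthful
    false_alarms = n_innocent * (1 - specificity)    # innocent examinees flagged
    flagged = detected + false_alarms
    share_true = detected / flagged if flagged else 0.0
    return {
        "detected": detected,
        "missed": missed,
        "false_alarms": false_alarms,
        "share_of_flagged_who_are_threats": share_true,
    }


# Hypothetical scenario: 10,000 examinees, 10 of whom are actual threats.
# A lenient decision threshold flags more people; a strict one flags fewer.
lenient = screening_outcomes(10_000, 10, sensitivity=0.80, specificity=0.90)
strict = screening_outcomes(10_000, 10, sensitivity=0.20, specificity=0.995)

for name, result in (("lenient", lenient), ("strict", strict)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

Under these assumed values, the lenient threshold catches most threats but flags roughly a thousand innocent examinees, whereas the strict threshold flags far fewer innocent examinees but misses most threats.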

In spite of the overall conclusion that polygraph testing is not very accurate, potential alternatives to the polygraph, such as measuring brain activity through electrical and imaging studies, have not yet been shown to outperform the polygraph (Committee to Review the Scientific Evidence on the Polygraph, 2003). Such alternative techniques do not show any promise of supplanting the polygraph for screening purposes in the near future. Thus, although imperfect, the polygraph is likely to continue to be used for employee security screening until other alternatives become available.

HONESTY TESTS

Honesty testing is a multimillion-dollar industry, especially since the use of polygraphs in employment settings has been severely curtailed and “ban-the-box” laws in some states restrict employers from asking candidates about prior criminal convictions until later in the hiring process. Written honesty tests (also known as integrity tests) fall into two major categories: overt integrity tests and personality-based measures. Overt integrity tests typically include two types of questions. One assesses attitudes toward theft and other forms of dishonesty (e.g., endorsement of common rationalizations of theft and other forms of dishonesty, beliefs about the frequency and extent of employee theft, punitiveness toward theft, perceived ease of theft). The other deals with admissions of theft and other illegal activities (e.g., dollar amount stolen in the last year, drug use, gambling). Personality-based measures are not designed as measures of honesty per se, but rather as predictors of a wide variety of counterproductive behaviors, such as substance abuse, insubordination, absenteeism, bogus workers’ compensation claims, and various forms of passive aggression. Overall, personality-based measures assess broader dispositional traits such as socialization and conscientiousness. (Conscientiousness is one of the Big Five personality traits; we discuss these in more detail in Chapter 13.) In fact, despite the clear differences in content, both overt and personality-based tests seem to have a common latent structure reflecting conscientiousness, agreeableness, and emotional stability (Berry, Sackett, & Wiemann, 2007).

Do honesty tests work? Overall, the answer is yes, as several reviews have documented (Ones, Viswesvaran, & Schmidt, 1993; Van Iddekinge, Roth, Raymark, & Odle-Dusseau, 2012a). However, the precise extent to which such tests predict performance, and what specific facets of performance they predict, is less clear. Ones et al. (1993) conducted a meta-analysis of 665 validity coefficients that were based on 576,460 test takers. The average validity of the tests, when used to predict supervisory ratings of performance, was .41. Results for overt and personality-based tests were similar. However, the average validity of overt tests for predicting theft per se was much lower: .13. Van Iddekinge et al. (2012a) conducted a subsequent meta-analysis that relied on fewer studies (i.e., 104 studies representing 134 independent samples) because of “concerns centered around the perceived lack of methodological rigor within this literature and a heavy reliance on unpublished data from firms that publish the integrity tests (e.g., 90% of the studies in Ones et al.’s 1993 meta-analysis)” (Van Iddekinge, Roth, Raymark, & Odle-Dusseau, 2012b, p. 543). Van Iddekinge et al.’s (2012a) updated meta-analytic results revealed the following mean observed and corrected (for unreliability in the criterion) validity coefficients: .12 and .15 for job performance, .13 and .16 for training performance, .26 and .32 for counterproductive work behaviors, and .07 and .09 for turnover.
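The distinction between observed and corrected validity coefficients can be illustrated with the classical correction for attenuation due to criterion unreliability, in which the observed correlation is divided by the square root of the criterion reliability. The sketch below is a simplified illustration, not the procedure Van Iddekinge et al. (2012a) used: meta-analytic corrections are usually applied study by study or through artifact distributions, so the reliabilities "implied" by the rounded mean coefficients are rough approximations. The function names are illustrative.

```python
import math


def correct_for_criterion_unreliability(r_observed, criterion_reliability):
    """Classical disattenuation: r_corrected = r_observed / sqrt(r_yy)."""
    if not 0 < criterion_reliability <= 1:
        raise ValueError("criterion reliability must be in (0, 1]")
    return r_observed / math.sqrt(criterion_reliability)


def implied_criterion_reliability(r_observed, r_corrected):
    """Reliability that would turn r_observed into r_corrected (approximate)."""
    return (r_observed / r_corrected) ** 2


# (observed, corrected) pairs as reported in the text above.
reported_pairs = {
    "job performance": (0.12, 0.15),
    "training performance": (0.13, 0.16),
    "counterproductive work behaviors": (0.26, 0.32),
    "turnover": (0.07, 0.09),
}

for criterion, (r_obs, r_corr) in reported_pairs.items():
    r_yy = implied_criterion_reliability(r_obs, r_corr)
    print(f"{criterion}: implied criterion reliability of about {r_yy:.2f}")
```

For example, an assumed criterion reliability of .64 would raise an observed validity of .12 to .12 / .80 = .15, which matches the pair reported for job performance.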

The Van Iddekinge et al. (2012a) results were controversial and led to a forceful reaction on the part of test vendors (Harris et al., 2012), who concluded, “In light of Van Iddekinge et al.’s substantially smaller sample of studies, and arguable methodological decisions, we are inclined to accord more weight to Ones et al.’s findings when there is a difference in conclusion” (p. 535). In their defense, regarding the number of studies included in their meta-analyses, Van Iddekinge et al. (2012b) wrote, “After several months of correspondence . . . we were informed that it was no longer possible to provide us access to the additional studies. Moreover, Jones informed us that Vangent’s corporate attorneys wanted us to know that we did not have permission to use several technical reports we had obtained from another researcher because the reports had not been released into the public domain (yet apparently were provided to Ones et al., 1993, and other researchers).” Clearly, this is not the end of the discussion regarding the relative validity of honesty tests. At this point, we do not know what factors caused the different results reported by Ones et al. (1993) compared to Van Iddekinge et al. (2012a), but it seems that different study-inclusion criteria, corrections for artifacts, and second-order sampling error are not the culprits and more details on meta-analytic procedures, including coding, are necessary to address this issue (Sackett & Schmitt, 2012).

Although honesty tests are overall good predictors of certain performance facets, at least five key issues have yet to be resolved. First, as in the case of biodata inventories, there is a need for a greater understanding of the construct validity of integrity tests given that integrity tests are not interchangeable (i.e., scores for the same individuals on different types of integrity tests are not necessarily similar). Some investigations have sought evidence regarding the relationship between integrity tests and some broad personality traits. But there is a need to understand the relationship between integrity tests and individual characteristics more directly related to integrity tests such as object beliefs, negative life themes, and power motives (Mumford, Connelly, Helton, Strange, & Osburn, 2001). Second, women tend to score approximately .16 standard deviation unit higher than men, and job applicants aged 40 years and older tend to score .08 standard deviation unit higher than applicants younger than 40 (Ones & Viswesvaran, 1998). At this point, we do not have a clear reason for these findings. Third, many writers in the field apply the same language and logic to integrity testing as to ability testing. Yet there is an important difference: While it is possible for an individual with poor moral behavior to “go straight,” it is certainly less likely that an individual who has demonstrated a lack of intelligence will “go smart.” If they are honest about their past, therefore, reformed individuals with a criminal past may be “locked into” low scores on integrity tests (and, therefore, be subject to classification error) (Lilienfeld, Alliger, & Mitchell, 1995). Thus, the broad validation evidence that is often acceptable for cognitive ability tests may not hold up in the public policy domain for integrity tests. Fourth, there is the real threat of intentional distortion (Alliger, Lilienfeld, & Mitchell, 1996). It is quite ironic that job applicants are likely to be dishonest in completing an honesty test. For example, as mentioned earlier, McFarland and Ryan (2000) found that, when study participants who were to complete an honesty test were instructed to “answer questions in such a way as to make you look as good an applicant as possible,” scores were 1.78 standard deviation units higher than when they were instructed to “answer questions as honestly as possible.” Fifth, test publishers have an undeniable conflict of interest regarding research addressing the validity of their own tests, much like we described in Chapter 8 regarding the assessment of test fairness (i.e., differential prediction). At the same time, they have legitimate concerns that “[i]t would be helpful to publishers, as well as the field, if there were mechanisms to protect the interest of testing clients in the same manner as human subjects, and more opportunities to publish or distribute the many strong validity studies in publishers’ files” (Harris et al., 2012, p. 532).

Given the challenges and unresolved issues, researchers are exploring alternative ways to assess integrity and other personality-based constructs (e.g., Van Iddekinge, Raymark, & Roth, 2005). One promising approach is conditional reasoning testing (Frost, Chia-Huei, & James, 2007; James et al., 2005), which focuses on how people solve what appear to be traditional inductive-reasoning problems. However, the true intent of the scenarios presented is to determine respondents’ solutions based on their implicit biases and preferences. These underlying biases usually operate below the surface of consciousness and are revealed based on the respondents’ responses. Another promising approach is to assess integrity as part of a situational judgment test (discussed in detail in Chapter 13), in which applicants are given a scenario and are asked to choose a response that is most closely aligned with what they would do (Becker, 2005). Consider the following example of an item developed by Becker (2005):

Your work team is in a meeting discussing how to sell a new product. Everyone seems to agree that the product should be offered to customers within the month. Your boss is all for this, and you know he does not like public disagreements. However, you have concerns because a recent report from the research department points to several potential safety problems with the product. Which of the following do you think you would most likely do?

Possible answers:

A. Try to understand why everyone else wants to offer the product to customers this month. Maybe your concerns are misplaced. [–1]

B. Voice your concerns with the product and explain why you believe the safety issues need to be addressed. [1]

C. Go along with what others want to do so that everyone feels good about the team. [–1]

D. Afterwards, talk with several other members of the team to see if they share your concerns. [0]

The scoring for the above item is –1 for answers A and C (i.e., worst-possible score), 0 for answer D (i.e., neutral score), and +1 for answer B (i.e., best-possible score). One advantage of using scenario-based integrity tests is that they are intended to capture specific values rather than general integrity-related traits. Thus, these types of tests may be more defensible both scientifically and legally because they are based on a more precise definition of integrity, including specific types of behaviors. A study based on samples of fast-service employees (n = 81), production workers (n = 124), and engineers (n = 56) found that validity coefficients for the integrity test (corrected for criterion unreliability) were .26 for career potential, .18 for leadership, and .24 for in-role performance (all as assessed by managers’ ratings) (Becker, 2005).
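The scoring key for this item can be expressed as a simple lookup. The sketch below is a minimal illustration of keyed scoring for the single item shown above; the names SCORING_KEY and score_item are hypothetical and are not part of Becker's (2005) instrument.

```python
# Scoring key for the example item above: A and C are worst, D is neutral, B is best.
SCORING_KEY = {"A": -1, "B": 1, "C": -1, "D": 0}


def score_item(choice: str) -> int:
    """Return the keyed score for an applicant's chosen answer option."""
    option = choice.strip().upper()
    if option not in SCORING_KEY:
        raise ValueError(f"Unknown answer option: {choice!r}")
    return SCORING_KEY[option]


for option in "ABCD":
    print(option, score_item(option))
```

A full test would presumably sum keyed scores across many such items, each with its own key developed by subject matter experts.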

EVALUATION OF TRAINING AND EXPERIENCE

Judgmental evaluations of the previous work experience and training of job applicants, as presented on résumés and job applications, are a common part of initial screening. Sometimes evaluation is purely subjective and informal, and sometimes it is accomplished in a formal manner according to a standardized method. Evaluating job experience is not as easy as one may think because experience includes both qualitative and quantitative components that interact and accrue over time (Aguinis, O’Boyle, Gonzalez-Mulé, & Joo, 2016); hence, work experience is multidimensional and temporally dynamic (Tesluk & Jacobs, 1998). However, using experience as a predictor of future performance can pay off. Specifically, a study including more than 800 U.S. Air Force enlisted personnel indicated that ability and experience seem to have linear and noninteractive effects (Lance & Bennett, 2000). Another study that also used military personnel showed that work experience items predict performance above and beyond cognitive abilities and personality (Jerry & Borman, 2002). These findings explain why the results of a survey of more than 200 staffing professionals of the National Association of Colleges and Employers revealed that experienced hires were evaluated more highly than new graduates on most characteristics (Rynes, Orlitzky, & Bretz, 1997).

An empirical comparison of four methods for evaluating work experience indicated that the “behavioral consistency” method showed the highest mean validity, at .45 (McDaniel, Schmidt, & Hunter, 1988). This method requires applicants to describe their major achievements in several job-related areas. These areas are behavioral dimensions rated by supervisors as showing maximal differences between superior and minimally acceptable performers. The applicants’ achievement statements are then evaluated using anchored rating scales. The anchors are achievement descriptors whose values along a behavioral dimension have been determined reliably by subject matter experts.

A similar approach to the evaluation of training and experience, one most appropriate for selecting professionals, is the accomplishment record (AR) method (Hough, 1984). A comment frequently heard from professionals is “My record speaks for itself.” The AR is an objective method for evaluating those records. It is a type of biodata/maximum performance/self-report instrument that appears to tap a component of an individual’s history that is not measured by typical biographical inventories. It correlates essentially zero with aptitude test scores, honors, grades, and prior activities and interests.

Development of the AR begins with the collection of critical incidents to identify important dimensions of job performance. Then rating principles and scales are developed for rating an individual’s set of job-relevant achievements. The method yields (a) complete definitions of the important dimensions of the job, (b) summary principles that highlight key characteristics to look for when determining the level of achievement demonstrated by an accomplishment, (c) examples of accomplishments that job experts agree represent various levels of achievement, and (d) numerical equivalents that allow the accomplishments to be translated into quantitative indexes of achievement. When the AR was applied in a sample of 329 attorneys, the reliability of the overall performance ratings was a respectable .82, and the AR demonstrated a validity of .25. Moreover, the method appears to be fair for females, minorities, and white males.

What about academic qualifications? They tend not to affect managers’ hiring recommendations, as compared to work experience, and they could even have a negative effect. For candidates with poor work experience, having higher academic qualifications seems to reduce their chances of being hired (Singer & Bruhns, 1991). These findings were supported by a national survey of 3,000 employers by the U.S. Census Bureau. The most important characteristics employers said they considered in hiring were attitude, communication skills, and previous work experience. The least important were academic performance (grades), school reputation, and teacher recommendations (Applebome, 1995). Moreover, when grades are used, they tend to have adverse impact on ethnic minority applicants (Roth & Bobko, 2000).

DRUG SCREENING

Drug screening tests began in the military, spread to the sports world, and now are becoming common in employment (Aguinis & Henle, 2005). In fact, about 50% of employers use some type of drug screening for all of their job applicants in the United States (Lieberman, 2017). Critics charge that such screening violates an individual’s right to privacy and that the tests are frequently inaccurate (Morgan, 1989), for example, as a result of cheating (see Box 12.2). These critics do concede, however, that employees in jobs where public safety is crucial, such as nuclear power plant operators and commercial jet pilots, should be screened for drug use. In fact, perceptions of the extent to which different jobs might involve danger to the worker, to coworkers, or to the public are strongly related to the acceptability of drug testing (Murphy, Thornton, & Prue, 1991).

Do the results of such tests forecast certain aspects of later job performance? In perhaps the largest reported study of its kind, the U.S. Postal Service took urine samples from 5,465 job applicants. It never used the results to make hiring decisions and did not tell local managers of the findings. When the data were examined six months to a year later, workers who had tested positive prior to employment were absent 41% more often and were fired 38% more often. There were no differences in voluntary turnover between those who tested positive and those who did not. These results held up even after adjustment for factors such as age, gender, and race. As a result, the Postal Service implemented preemployment drug testing nationwide (Wessel, 1989).

Is such drug screening legal? In two rulings in 1989, the Supreme Court upheld (a) the constitutionality of the government regulations that require railroad crews involved in accidents to submit to prompt urinalysis and blood tests and (b) urine tests for U.S. Customs Service (now U.S. Customs and Border Protection) employees seeking drug-enforcement posts. Overall, an employer has a legal right to ensure that employees perform their jobs competently and that no employee endangers the safety of other workers. So, if illegal drug use, on or off the job, may reduce job performance and endanger coworkers, the employer has adequate legal grounds for conducting drug tests.

To avoid legal challenge, consider instituting the following procedures:

• Inform all employees and job applicants, in writing, of the company’s policy regarding drug use.

• Include the drug policy and the possibility of testing in all employment contracts.

• Present the program in a medical and safety context, namely, that drug screening will help to improve the health of employees and also help to ensure a safer workplace.

If drug screening will be used with employees as well as job applicants, tell employees in advance that drug testing will be a routine part of their employment (Angarola, 1985).

To enhance perceptions of fairness, employers should provide advance notice of drug tests, preserve the right to appeal, emphasize that drug testing is a means to enhance workplace safety, attempt to minimize invasiveness, and train supervisors (Konovsky & Cropanzano, 1991; Tepper, 1994). In addition, employers must understand that perceptions of drug testing fairness are affected not only by the program’s characteristics but also by employee characteristics. For example, employees who have friends who have failed a drug test are less likely to have positive views of drug testing (Aguinis & Henle, 2005).

COMPUTER-BASED SCREENING

The rapid development of computer technology over the past few years has resulted in faster microprocessors and more flexible and powerful software that can incorporate graphics and sound. These technological advances now allow organizations to conduct computer-based screening (CBS). Using the Internet, companies can conduct CBS and administer job-application forms, structured interviews (discussed later in this chapter), and other types of tests globally, 24 hours a day, 7 days a week (Jones & Dages, 2003).

BOX 12.2 PRACTICAL APPLICATION: CHEATING ON DRUG TESTS

Employers are increasingly concerned about job applicants and employees cheating on drug tests. The Internet is now a repository of products people can purchase at reasonable prices with the specific goal of cheating on drug tests. Consider the Whizzinator, an easy-to-conceal and easy-to-use urinating device for men that includes synthetic urine and an adjustable belt. The price? Just under $150.

Hundreds of similar products, particularly targeting urine tests, are offered on the Internet. Leo Kadehjian, a Palo Alto–based consultant, noted that “by far the most preferred resource is dilution” (Cadrain, 2003, p. 42). However, a very large number of highly sophisticated products are offered, including the following (Cadrain, 2003):

• Oxidizing agents that alter or destroy drugs and/or their metabolites

• Nonoxidizing adulterants that change the pH of a urine sample or the ionic strength of the sample

• Surfactants, or soaps, which, when added directly to a urine sample, can form microscopic droplets with fatty interiors that trap fatty marijuana metabolites

CBS can be used simply to convert a screening tool from paper to an electronic format that is called an electronic page turner. These types of CBS are low on interactivity and do not take full advantage of technology (Olson-Buchanan, 2002). By contrast, Nike uses interactive voice-response technology to screen applicants over the telephone, the U.S. Air Force uses computer-adaptive testing (CAT) on a regular basis (Ree & Carretta, 1998), and other organizations such as Home Depot and JCPenney use a variety of technologies for screening (Chapman & Webster, 2003; Overton, Harms, Taylor, & Zickar, 1997). CAT presents all applicants with a set of items of average difficulty and, if responses are correct, items with higher levels of difficulty. If responses are incorrect, items with lower levels of difficulty are presented. CAT uses item response theory (see Chapter 6) to estimate an applicant’s level of the underlying trait based on the relative difficulty of the items answered correctly and incorrectly. The potential value added by using computers as screening devices is obvious when one considers that implementation of CAT would be nearly impossible using traditional paper-and-pencil instruments (Olson-Buchanan, 2002).
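The branching logic described for CAT (start at average difficulty, move to harder items after a correct response and to easier items after an incorrect one) can be sketched in a few lines. This is a deliberately simplified up-and-down rule under stated assumptions: the five difficulty levels, the function names, and the simulated applicant are all hypothetical. Operational CAT systems instead use item response theory to estimate the trait level and select each item, which this sketch does not attempt.

```python
def run_adaptive_test(answer_fn, n_items=5, n_levels=5):
    """Administer n_items, raising difficulty after a correct answer and
    lowering it after an incorrect one (difficulty levels run 1..n_levels)."""
    level = (n_levels + 1) // 2          # begin with an item of average difficulty
    history = []
    for _ in range(n_items):
        correct = answer_fn(level)       # present an item at the current level
        history.append((level, correct))
        if correct:
            level = min(n_levels, level + 1)   # harder item next
        else:
            level = max(1, level - 1)          # easier item next
    return history


def simulated_applicant(level, ability=4):
    """Hypothetical applicant who answers correctly up to difficulty level 4."""
    return level <= ability


for level, correct in run_adaptive_test(simulated_applicant):
    print(f"difficulty {level}: {'correct' if correct else 'incorrect'}")
```

In this simulation the item difficulty climbs until it brackets the applicant's assumed ability and then oscillates around it, which conveys why different applicants end up seeing different items.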

There are several potential advantages of using CBS (Kantrowitz et al., 2011; Olson-Buchanan, 2002). First, administration may be easier. For example, standardization is maximized because there are no human proctors who may give different instructions to different applicants (i.e., computers give instructions consistently to all applicants). Also, responses are recorded and stored automatically, which is a practical advantage and can also help minimize data-entry errors. Second, applicants can access the test from remote locations, thereby increasing the applicant pool. Third, computers can accommodate applicants with disabilities in a number of ways, particularly since tests can be completed from their own (possibly modified) computers. A modified computer can caption audio-based items for applicants with hearing disabilities, or it can allow applicants with limited hand movement to complete a test. Finally, preliminary evidence suggests that Web-based assessment does not exacerbate adverse impact.

Despite the increasing availability and potential benefits of CBS, concerns about implementation include cost and potential cheating. Moreover, some testing experts believe that high-stakes tests, such as those used to make employment decisions, cannot be administered in unproctored Internet settings (Tippins et al., 2006). CAT can address the cheating concern because, in contrast to static testing (i.e., all applicants receive the same items), each applicant is administered a test including potentially different items. Additional challenges in implementing CBS include the relative lack of access of low-income individuals to the Internet, or what is called the digital divide (Stanton & Rogelberg, 2001).

Olson-Buchanan (2002) concluded that innovations in CBS have not kept pace with the progress in computer technology. This disparity was attributed to three major factors: (1) costs associated with CBS development, (2) lag in scientific guidance for addressing reliability and validity issues raised by CBS, and (3) the concern that investment in CBS may not result in tangible payoffs.

Fortunately, many of the concerns are being addressed by ongoing research on the use, accuracy, equivalence, and efficiency of CBS. For example, Ployhart, Weekley, Holtz, and Kemp (2003) found that proctored, Web-based testing has several benefits compared to the more traditional paper-and-pencil administration. Their study included nearly 5,000 applicants for telephone-service-representative positions who completed, among other measures, a biodata instrument. Results indicated that scores resulting from the Web-based administration had similar or better psychometric characteristics, including distributional properties, lower means, more variance, and higher internal-consistency reliabilities. Another study examined reactions to CAT and found that applicants’ reactions are positively related to their perceived performance on the test (Tonidandel, Quiñones, & Adams, 2002). Thus, changes in the item-selection algorithm that result in a larger number of items answered correctly have the potential to improve applicants’ perceptions of CAT.

In sum, HR specialists now have the opportunity to implement CBS in their organizations. If implemented well, CBS carries numerous advantages. In fact, the use of computers and the Internet is making testing cheaper and faster, and it may serve as a catalyst for even more widespread use of tests for employment purposes (Tippins et al., 2006). However, the degree of success in implementing CBS will depend not only on the features of the test itself but also on organizational-level variables such as the culture and climate for technological innovation (Anderson, 2003).

EMPLOYMENT INTERVIEWS

Use of the interview in selection is almost universal today (Moscoso, 2000). Perhaps this is so because in the employment context the interview serves as much more than just a selection device. The interview is a communication process, whereby the applicant learns more about the job and the organization and begins to develop some realistic expectations about both.

When an applicant is accepted, terms of employment typically are negotiated during an interview. If the applicant is rejected, the interviewer performs an important public relations function, for it is essential that the rejected applicant leave with a favorable impression of the organization and its employees. For example, several studies found that perceptions of the interview process and the interpersonal skills of the interviewer, as well as his or her skills in listening, recruiting, and conveying information about the company and the job the applicant would hold, affected the applicant’s evaluations of the interviewer and the company (Kohn & Dipboye, 1998; Schmitt & Coyle, 1979). However, the likelihood of accepting a job, should one be offered, was still mostly unaffected by the interviewer’s behavior (Powell, 1991).

As a selection device, the interview performs two vital functions: It can fill information gaps in other selection devices (e.g., regarding incomplete or questionable application blank responses; Tucker & Rowe, 1977), and it can be used to assess factors that can be measured only via face-to-face interaction (e.g., appearance, speech, poise, and interpersonal competence). Is the applicant likely to “fit in” and share values with other organizational members (Cable & Judge, 1997)? Is the applicant likely to get along with others in the organization or be a source of conflict? Where can his or her talents be used most effectively? Interview impressions and perceptions can help to answer these kinds of questions. In fact, well-designed interviews can be helpful because they allow examiners to gather information on constructs not typically assessed via other means, such as empathy (Cliffordson, 2002) and personal initiative (Fay & Frese, 2001). For example, a review of 388 characteristics rated in 47 actual interview studies revealed that personality traits (e.g., responsibility, dependability, and persistence, which are all related to conscientiousness) and applied social skills (e.g., interpersonal relations, social skills, team focus, ability to work with people) are rated more often in employment interviews than any other type of construct (Huffcutt, Conway, Roth, & Stone, 2001). In addition, interviews can contribute to the prediction of job performance over and above cognitive abilities and conscientiousness (Cortina, Goldstein, Payne, Davison, & Gilliland, 2000), as well as experience (Day & Carroll, 2002).

Since few employers are willing to hire applicants they have never seen, it is imperative that we do all we can to make the interview as effective a selection technique as possible. Next, we consider some of the research on interviewing and offer suggestions for improving the process and outcome.

Response Distortion in the Interview

Distortion of interview information is probable (Weiss & Dawis, 1960), the general tendency being to upgrade rather than downgrade prior work experience. That is, interviewees tend to be affected by social desirability bias, which is a tendency to answer questions in a more socially desirable direction (i.e., to attempt to look good in the eyes of the interviewer). In addition to distorting information, applicants tend to engage in influence tactics to create a positive impression, and they typically do so by displaying self-promotion behaviors (Stevens & Kristof, 1995).

But will social desirability distortion be reduced if the interviewer is a computer? According to Martin and Nagao (1989), candidates tend to report their grade point averages and scholastic aptitude test scores more accurately to computers than in face-to-face interviews. Perhaps this is due to the “big brother” effect. That is, because responses are on a computer rather than on paper, they may seem more subject to instant checking and verification through other computer databases. To avoid potential embarrassment, applicants may be more likely to provide truthful responses. However, Martin and Nagao’s study also placed an important boundary condition on computer interviews: There was much greater resentment by individuals competing for high-status positions than for low-status positions when they had to respond to a computer rather than a live interviewer.

A more comprehensive study was conducted by Richman, Kiesler, Weisband, and Drasgow (1999). They conducted a meta-analysis synthesizing 61 studies (673 effect sizes), comparing response distortion in computer questionnaires with traditional paper-and-pencil questionnaires and face-to-face interviews. Results revealed that computer-based interviews decreased social-desirability distortion compared to face-to-face interviews, particularly when the interviews addressed highly sensitive personal behavior (e.g., use of illegal drugs). Perhaps this is so because a computer-based interview is more impersonal: It removes the interviewee from the observation of an interviewer and from social cues that can arouse evaluation apprehension.

A more subtle way to distort the interview is to engage in impression-management behaviors (Lievens & Peeters, 2008; Roulin, Bangerter, & Levashina, 2015). For example, applicants who are pleasant and compliment the interviewer are more likely to receive more positive evaluations. Two specific types of impression management, ingratiation and self-promotion, seem to be most effective in favorably influencing interviewers’ ratings (Higgins & Judge, 2004). A research program involving five different experiments using real-time video coding showed that interviewers are not able to detect impression management, although they were better at detecting honest impression management (i.e., truthfully describing actual job-related abilities, accomplishments, and experiences) than deceptive impression management (i.e., embellishing job-related credentials or creating credentials that fit with the job requirements) (Roulin et al., 2015). Moreover, experienced interviewers were no better than novices. Training can help improve this situation. Specifically, such training should emphasize that deception detection improves when the interviewer focuses on story-related cues (e.g., vagueness, contradictions) instead of nonverbal cues (e.g., gaze aversions, posture change, fidgeting) (Roulin et al., 2015).

Reliability and Validity

An early meta-analysis of only 10 validity coefficients that were not corrected for range restriction yielded a validity of .14 when the interview was used to predict supervisory ratings (Hunter & Hunter, 1984). Subsequent meta-analyses that did correct for range restriction and used larger samples of studies reported more encouraging results. Wiesner and Cronshaw (1988) found a mean corrected validity of .47 across 150 interview validity studies involving all types of criteria. McDaniel, Whetzel, Schmidt, and Maurer (1994) analyzed 245 coefficients derived from 86,311 individuals and found a mean corrected validity of .37 for job performance criteria. However, validities were higher when criteria were collected for research purposes (.47) than for administrative decision making (.36). Marchese and Muchinsky (1993) reported a mean corrected validity of .38 across 31 studies. A fourth review (Huffcutt & Arthur, 1994) analyzed 114 interview validity coefficients from 84 published and unpublished references, exclusively involving entry-level jobs and supervisory rating criteria. When corrected for criterion unreliability and range restriction, the mean validity across all 114 studies was .37. Finally, Schmidt and Rader (1999) meta-analyzed 40 studies of structured telephone interviews and obtained a corrected validity coefficient of .40 using performance ratings as a criterion. The results of these studies agree quite closely.
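To see why corrected validities are larger than uncorrected ones, the two corrections mentioned above can be sketched in Python. The sketch assumes the standard textbook formulas: disattenuation for criterion unreliability (dividing by the square root of the criterion reliability) and the Thorndike Case II formula for direct range restriction. The meta-analyses cited here may have applied different or more elaborate procedures, the function names are hypothetical, and the input values are purely illustrative and are not taken from any of the studies.

```python
import math

def correct_for_criterion_unreliability(r_obs, r_yy):
    """Disattenuate an observed validity for unreliability in the criterion only."""
    if not 0.0 < r_yy <= 1.0:
        raise ValueError("criterion reliability must be in (0, 1]")
    return r_obs / math.sqrt(r_yy)

def correct_for_range_restriction(r_obs, u):
    """Thorndike Case II correction for direct range restriction.

    u is the ratio of the unrestricted (applicant) predictor SD to the
    restricted (incumbent) predictor SD; u = 1 means no restriction.
    """
    if u <= 0.0:
        raise ValueError("SD ratio must be positive")
    return (r_obs * u) / math.sqrt(1.0 + r_obs ** 2 * (u ** 2 - 1.0))

# Illustrative inputs only; NOT values reported in the studies cited above.
observed = 0.20
step1 = correct_for_criterion_unreliability(observed, r_yy=0.52)
corrected = correct_for_range_restriction(step1, u=1.5)

if __name__ == "__main__":
    print(f"observed={observed:.2f} "
          f"after unreliability correction={step1:.3f} "
          f"after range-restriction correction={corrected:.3f}")
```

Both corrections can only raise a positive coefficient (or leave it unchanged when the criterion is perfectly reliable and there is no restriction), which is consistent with the pattern of an uncorrected estimate falling well below the corrected estimates.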

Regarding reliability, a meta-analysis of 125 interrater reliability coefficients with a total sample size of 32,428 derived from employment interviews revealed an overall mean of .68 (Huffcutt, Culbertson, & Weyhrauch, 2013). However, the 80% credibility interval, meaning that 80% of the true population coefficients fall within this interval, ranged from .42 to .94. In fact, an analysis of the level of structure of the interviews revealed that reliability was lowest when structure was low (i.e., .36), and highest when structure was high (.76). Given our discussion in Chapter 7 regarding the relation between reliability and validity, the best way to improve validity is to improve the degree of structure of the interview (discussed later in this chapter).
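The two quantitative ideas in this paragraph can be illustrated with a short Python sketch. It makes two assumptions that are not stated in the text. First, it treats the true reliabilities as normally distributed, so an 80% credibility interval is the mean plus or minus about 1.28 standard deviations; the standard deviation of .203 used below is back-calculated from the reported interval, not a reported value. Second, it uses the classical test theory result that an observed validity cannot exceed the square root of the product of the predictor and criterion reliabilities, on the assumption that this is the relation discussed in Chapter 7. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def credibility_interval(mean_rho, sd_rho, level=0.80):
    """Interval expected to contain `level` of the true population coefficients."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.28 for an 80% interval
    return mean_rho - z * sd_rho, mean_rho + z * sd_rho

def max_validity(r_xx, r_yy=1.0):
    """Classical-test-theory ceiling on an observed validity coefficient."""
    return math.sqrt(r_xx * r_yy)

if __name__ == "__main__":
    # Mean of .68 from the text; SD of .203 is an assumed, back-calculated value.
    low, high = credibility_interval(0.68, 0.203)
    print(f"80% credibility interval: {low:.2f} to {high:.2f}")

    # Reliabilities for low- and high-structure interviews from the text,
    # with a perfectly reliable criterion assumed for simplicity.
    for label, r_xx in (("low structure", 0.36), ("high structure", 0.76)):
        print(f"{label}: reliability={r_xx:.2f}, "
              f"validity ceiling={max_validity(r_xx):.2f}")
```

Under these assumptions, raising interrater reliability from .36 to .76 raises the theoretical ceiling on validity substantially, which is the logic behind recommending more interview structure.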

As Hakel (1989) noted, interviewing is a difficult cognitive and social task. Managing a smooth social exchange while simultaneously processing information about an applicant makes interviewing uniquely difficult among all managerial tasks. Research continues to focus on cognitive factors (e.g., preinterview impressions) and social factors (e.g., interviewer–interviewee similarity). As a result, we now know a great deal more about what goes on in the interview and about how to improve the process. At the very least, we should expect interviewers to be able to form opinions only about traits and characteristics that are overtly manifest in the interview (or that can be inferred from the applicant’s behavior), and not about traits and characteristics that typically would become manifest only over a period of time (e.g., creativity, dependability, and honesty). In the following subsections, we examine what is known about the interview process and about ways to enhance the effectiveness and utility of the selection interview.

Factors Affecting the Decision-Making Process

A large body of literature attests to the fact that the decision-making process involved in the interview is affected by several factors. Specifically, 278 studies have examined numerous aspects of the interview (Posthuma, Morgeson, & Campion, 2002). Posthuma et al. (2002) provided a useful framework to summarize and describe this large body of research. We follow this taxonomy in part and consider factors affecting the interview decision-making process in each of the following areas: (a) social/interpersonal factors (e.g., interviewer–applicant similarity), (b) cognitive factors (e.g., preinterview impressions), (c) individual differences (e.g., applicant appearance, interviewer training and experience), and (d) structure (i.e., degree of standardization of the interview process and discretion an interviewer is allowed in conducting the interview).

Social/Interpersonal Factors

As noted earlier, the interview is fundamentally a social and interpersonal process. As such, it is subject to influences such as interviewer–applicant similarity and verbal and nonverbal cues. We describe each of these factors next.

Interviewer–Applicant Similarity. Similarity leads to attraction, attraction leads to positive affect, and positive affect can lead to higher interview ratings (Schmitt, Pulakos, Nason, & Whitney, 1996). Moreover, similarity leads to greater expectations about future performance (García, Posthuma, & Colella, 2008). Does similarity between the interviewer and the interviewee regarding race, age, and attitudes affect the interview? Lin, Dobbins, and Farh (1992) reported that ratings of African American and Latino interviewees, but not white interviewees, were higher when the interviewer was the same race as the applicant. However, Lin et al. (1992) found that the inclusion of at least one different-race interviewer in a panel eliminated the effect, and no effect was found for age similarity. Further, when an interviewer feels that an interviewee shares his or her attitudes, ratings of competence and affect are increased (Howard & Ferris, 1996). The similarity effects are not large, however, and they can be reduced or eliminated by using a structured interview and a diverse set of interviewers.

Verbal and Nonverbal Cues. In terms of verbal cues, Anderson (1960) found that the applicant was more likely to be hired in interviews where the interviewer did a lot more of the talking and there was less silence. Other research has shown that the length of the interview depends much more on the quality of the applicant (interviewers take more time to decide when dealing with a high-quality applicant) and on the expected length of the interview. The longer the expected length of the interview, the longer it takes to reach a decision (Tullar, Mullins, & Caldwell, 1979).

Several studies have also examined the impact of nonverbal cues on impression formation and decision making in the interview. Nonverbal cues have been shown to have an impact, albeit small, on interviewer judgments (DeGroot & Motowidlo, 1999). For example, Imada and Hakel (1977) found that positive nonverbal cues (e.g., smiling, attentive posture, smaller interpersonal distance) produced consistently favorable ratings. Most important, however, nonverbal behaviors interact with other variables such as gender. Aguinis, Simonsen, and Pierce (1998) found that a man displaying direct eye contact during an interview is rated as more credible than another one not making direct eye contact. However, a follow-up replication using exactly the same experimental conditions revealed that a woman displaying identical direct eye contact behavior was seen as coercive (Aguinis & Henle, 2001a).

Overall, the ability of a candidate to respond concisely, to answer questions fully, to state personal opinions when relevant, and to keep to the subject at hand appears to be more crucial in obtaining a favorable employment decision (Parsons & Liden, 1984; Rasmussen, 1984). High levels of nonverbal behavior tend to have more positive effects than low levels only when the verbal content of the interview is good. When verbal content is poor, high levels of nonverbal behavior may result in lower ratings.

Cognitive Factors

The interviewer’s task is not easy because humans are limited information processors and have biases in evaluating others (Kraiger & Aguinis, 2001). However, we have a good understanding of the impact of factors such as preinterview impressions and confirmatory bias, first impressions, stereotypes, contrast effect, and information recall. Let’s review major findings regarding the way in which each of these factors affects the interview.

Preinterview Impressions and Confirmatory Bias. Dipboye (1982, 1992) specified a model of self-fulfilling prophecy to explain the impact of first preinterview impressions. Both cognitive and behavioral biases mediate the effects of preinterview impressions (based on letters of reference or applications) on the evaluations of applicants. Behavioral biases occur when interviewers behave in ways that confirm their preinterview impressions of applicants (e.g., showing positive or negative regard for applicants). Cognitive biases occur if interviewers distort information to support preinterview impressions or use selective attention and recall of information. This sequence of behavioral and cognitive biases produces a self-fulfilling prophecy.


Consider how one applicant was described by an interviewer given positive information:

Alert, enthusiastic, responsible, well-educated, intelligent, can express himself well, organized, well-rounded, can converse well, hard worker, reliable, fairly experienced, and generally capable of handling himself well.

On the basis of negative preinterview information, the same applicant was described as follows:

Nervous, quick to object to the interviewer’s assumptions, and doesn’t have enough self-confidence. (Dipboye, Stramler, & Fontanelle, 1984, p. 567)

Content coding of employment interviews found that favorable first impressions were followed by the use of confirmatory behavior (such as indicating positive regard for the applicant, “selling” the company, and providing job information to applicants), while gathering less information from them. For their part, applicants behaved more confidently and effectively and developed better rapport with interviewers (Dougherty, Turban, & Callender, 1994). These findings support the existence of the confirmatory bias produced by first impressions.

Another aspect of expectancies concerns test score or biodata score information available prior to the interview. A study of 577 candidates for the position of life insurance sales agent found that interview ratings predicted the hiring decision and survival on the job best for applicants with low passing scores on the biodata test and poorest for applicants with high passing scores (Dalessio & Silverhart, 1994). Apparently, interviewers had such faith in the validity of the test scores that, if an applicant scored well, they gave little weight to the interview. When the applicant scored poorly, however, they gave more weight to performance in the interview and made better distinctions among candidates.

First Impressions. An early series of studies conducted at McGill University over a 10-year period (Webster, 1964, 1982) found that early interview impressions play a dominant role in final decisions (select/reject). These early impressions establish a bias in the interviewer (not usually reversed) that colors all subsequent interviewer–applicant interaction. Early impressions were crystallized after a mean interviewing time of only four minutes!

In addition, the interview is primarily a search for negative information. For example, just one unfavorable impression was followed by a reject decision 90% of the time. Positive information was given much less weight in the final decision (Bolster & Springbett, 1961).

Consider the effect of how the applicant shakes the interviewer’s hand (Stewart, Dustin, Barrick, & Darnold, 2008). A study using 98 undergraduate students found that quality of handshake was related to the interviewer’s hiring recommendation. It seems that quality of handshake conveys the positive impression that the applicant is extraverted, even when the candidate’s physical appearance and dress are held constant. Also, in this particular study women received lower ratings for the handshake compared with men, but they did not, on average, receive lower assessments of employment suitability.

Prototypes and Stereotypes. Returning to the McGill studies, perhaps the most important finding was that interviewers tend to develop their own prototype of a good applicant and proceed to accept those who match their prototype (Rowe, 1963; Webster, 1964). Later research has supported these findings. To the extent that the interviewers hold negative stereotypes of a group of applicants, and these stereotypes deviate from the perception of what is needed for the job or translate into different expectations or standards of evaluation for minorities, stereotypes may have the effect of lowering interviewers’ evaluations, even when candidates are equally qualified for the job (Arvey, 1979).

Similar considerations apply to gender-based stereotypes. The social psychology literature on gender-based stereotypes indicates that the traits and attributes necessary for managerial success resemble the characteristics, attitudes, and temperaments of the masculine gender role more than the feminine gender role (Aguinis & Adams, 1998). The operation of such stereotypes may explain the conclusion by Arvey and Campion (1982) that female applicants receive lower scores than male applicants.

Contrast Effects. Several studies have found that, if an interviewer evaluates a candidate who is just average after evaluating three or four very unfavorable candidates in a row, the average candidate tends to be evaluated favorably. When interviewers evaluate more than one candidate at a time, they tend to use other candidates as a standard. Whether they rate a candidate favorably, then, is determined partly by others against whom the candidate is compared (Hakel, Ohnesorge, & Dunnette, 1970; Heneman, Schwab, Huett, & Ford, 1975; Landy & Bates, 1973).

These effects are remarkably tenacious. Wexley, Sanders, and Yukl (1973) found that, despite attempts to reduce contrast effects by means of a warning (lecture) and/or an anchoring procedure (comparison of applicants to a preset standard), subjects continued to make this error. Only an intensive workshop (which combined practical observation and rating experience with immediate feedback) led to a significant behavior change. Similar results were reported in a later study by Latham, Wexley, and Pursell (1975). In contrast to subjects in group discussion or control groups, only those who participated in the intensive workshop did not commit contrast, halo, similarity, or first impression errors six months after training.

Information Recall. A practical question concerns the ability of interviewers to recall what an applicant said during an interview. Here is how this question was examined in one study (Carlson, Thayer, Mayfield, & Peterson, 1971).

Prior to viewing a 20-minute videotaped selection interview, 40 managers were given an interview guide, pencils, and paper and were told to perform as if they were conducting the interview. Following the interview, the managers were given a 20-question test, based on factual information. Some managers missed none, while others missed as many as 15 out of 20 items. The average number of items missed was 10.

After this short interview, half the managers could not report accurately on the information produced during the interview! By contrast, managers who had been following the interview guide and taking notes were quite accurate on the test. Those who were least accurate in their recollections assumed the interview was generally favorable and rated the candidate higher in all areas and with less variability. They adopted a halo strategy. Those managers who knew the facts rated the candidate lower and recognized intraindividual differences. Hence, the more accurate interviewers used an individual-differences strategy.

None of the managers in this study was given an opportunity to preview an application form prior to the interview. Would that have made a difference? Other research indicates that the answer is no (Dipboye, Fontanelle, & Garner, 1984). When it comes to recalling information after the interview, there seems to be no substitute for note taking during the interview. However, the act of note taking alone does not necessarily improve the validity of the interview; interviewers need to be trained on how to take notes regarding relevant behaviors (Burnett, Fan, Motowidlo, & DeGroot, 1998). Note taking helps information recall, but it does not in itself improve the judgments based on such information (Middendorf & Macan, 2002). In addition to note taking, other memory aids include mentally reconstructing the context of the interview and retrieving information from different starting points (Mantwill, Kohnken, & Aschermann, 1995).

Individual Differences

A number of individual-difference variables play a role in the interview process. These refer to characteristics of both the applicant and the interviewer. Let’s review applicant characteristics first, followed by interviewer characteristics.

Applicant Appearance and Other Personal Characteristics. Findings regarding physical attractiveness indicate that attractiveness is an advantage only in jobs where attractiveness per se is relevant. However, being unattractive appears never to be an advantage (Beehr & Gilmore, 1982). One study found that being perceived as obese can have a small, although statistically significant, negative effect (Finkelstein, Frautschy Demuth, & Sweeney, 2007). However, another study found that overweight applicants were no more likely to be hired for a position involving minimal public contact than they were for a job requiring extensive public contact (Pingitore, Dugoni, Tindale, & Spring, 1994).

Some of the available evidence indicates that ethnicity may not be a source of bias (Arvey, 1979; McDonald & Hakel, 1985). As noted earlier, there is a small effect for race, but it is related to interviewer–applicant race similarity rather than applicant race. However, a study examining the effects of accent and name as ethnic cues found that these two factors interacted in affecting interviewers’ evaluations (Segrest Purkiss, Perrewé, Gillespie, Mayes, & Ferris, 2006). Specifically, applicants with an ethnic name who spoke with an accent were perceived less positively compared to ethnic-named applicants without an accent and nonethnic-named applicants with and without an accent. These results point to the need to investigate interactions between an interviewee’s ethnicity and other variables. In fact, a study involving more than 1,334 police officers found a three-way interaction among interviewer ethnicity, interviewee ethnicity, and panel composition, such that African American interviewers evaluated African American interviewees more favorably than white applicants only when they were on a predominately African American panel (McFarland, Ryan, Sacco, & Kriska, 2004). Further research is certainly needed regarding these issues, given the demographic and societal trends discussed in Chapters 1 and 2.

Evidence available from studies regarding the impact of disability status is mixed. Some studies show no relationship (Rose & Brief, 1979), whereas others indicate that applicants with disabilities receive more negative ratings (Arvey & Campion, 1982), and yet a third group of studies suggests that applicants with disabilities receive more positive ratings (Hayes & Macan, 1997). The discrepant findings are likely due to the need to include additional variables in the design beyond disability status. For example, rater empathy can affect whether applicants with a disability receive a higher or lower rating than applicants without a disability (Cesare, Tannenbaum, & Dalessio, 1990).

Applicant personality seems to be related to interview performance. For example, consider a study including a sample of 85 graduating college seniors who completed a personality inventory. At a later time, these graduates reported the strategies they used in the job search and whether these strategies had generated interviews and job offers (Caldwell & Burger, 1998). Results revealed that invitations for a follow-up interview correlated .38 with conscientiousness and .27 with extraversion. Receiving a job offer correlated .34 with extraversion, .27 with agreeableness, .23 with openness to experience, and −.21 with neuroticism. In other words, being more conscientious and extraverted enhances the chances of receiving follow-up interviews; being more extraverted, more agreeable, more open to experience, and less neurotic is related to receiving a job offer. Follow-up analyses revealed that, when self-reports of preparation and all personality variables were included in the equation, conscientiousness was the only trait related to number of interview invitations received, and extraversion and neuroticism (negative) were the only traits related to number of job offers. A second study found that applicants’ trait negative affectivity had an impact on interview success via the mediating role of job-search self-efficacy and job-search intensity (Crossley & Stanton, 2005). Yet another study found that individuals differ greatly regarding their experienced anxiety during the interview and that levels of interview anxiety are related to interview performance (McCarthy & Goffin, 2004). Taken together, the evidence gathered thus far suggests that an applicant’s personality has an effect during and after the interview, and it also affects how applicants prepare before the interview.

Another issue regarding personal characteristics is the possible impact of pleasant artificial scents (perfume or cologne) on ratings in an employment interview. Research conducted in a controlled setting found that women assigned higher ratings to applicants when they used artificial scents than when they did not, whereas the opposite was true for men. These results may be due to differences in the ability of men and women to “filter out” irrelevant aspects of applicants’ grooming or appearance (Baron, 1983).

Applicant Participation in a Coaching Program. Coaching can include a variety of techniques, including modeling, behavioral rehearsal, role playing, and lecture, among others (Maurer & Solamon, 2007; Tross & Maurer, 2008). Is there a difference in interview performance between applicants who receive coaching on interviewing techniques and those who do not? Two studies (Maurer, Solamon, Andrews, & Troxtel, 2001; Maurer, Solamon, & Troxtel, 1998) suggest so. They included police officers and firefighters involved in promotional procedures that required an interview. The coaching program in the Maurer et al. (1998) study comprised the following elements: (a) introduction to the interview, including a general description of the process; (b) description of interview-day logistics; (c) description of types of interviews (i.e., structured versus unstructured) and advantages of structured interviews; (d) review of knowledge, abilities, and skills needed for a successful interview; (e) participation in and observation of interview role plays; and (f) interview tips. Participants in the coaching program received higher interview scores than nonparticipants for four different types of jobs (i.e., police sergeant, police lieutenant, fire lieutenant, and fire captain). Differences were found for three of the four jobs when controlling for the effects of applicant precoaching knowledge and motivation to do well on the promotional procedures. In a follow-up study, Maurer et al. (2001) found similar results.

Now let’s discuss interviewer characteristics and their effects on the interview.

Interviewer Experience. Although it has been hypothesized that interviewers with the same amount of experience will evaluate an applicant similarly (Rowe, 1960), empirical results do not support this hypothesis. Carlson (1967) found that, when interviewers with the same experience evaluated the same recruits, they agreed with each other to no greater extent than did interviewers with differing experiences. Apparently, interviewers benefit very little from day-to-day interviewing experience, since the conditions necessary for learning (i.e., training and feedback) are not present in the interviewer’s everyday job situation. Experienced interviewers who never learn how to conduct good interviews will simply perpetuate their poor skills over time (Jacobs & Baratta, 1989). By contrast, a positive relationship may exist between experience and improved decision making when experience is accompanied by higher levels of cognitive complexity (Dipboye & Jackson, 1999). In that case, experience is just a proxy for another variable (i.e., complexity) and not the factor improving decision making per se.

Interviewer Cognitive Complexity and Mood. Some laboratory studies, mainly using undergraduate students watching videotaped mock interviews, have investigated whether cognitive complexity (i.e., ability to deal with complex social situations) and mood affect the interview. Although the evidence is limited, a study by Ferguson and Fletcher (1989) found that cognitive complexity was associated with greater accuracy for female raters, but not for male raters. However, more research is needed before we can conclude that cognitive complexity has a direct effect on interviewer accuracy.

Regarding the effect of mood, Baron (1993) induced 92 undergraduate students to experience positive affect, negative affect, or no shift in current affect. Then students conducted a simulated job interview with an applicant whose qualifications were described as high, ambiguous, or low. This experiment led to the following three findings. First, when the applicant’s qualifications were ambiguous, participants in the positive affect condition rated this person higher on several dimensions than did students in the negative affect condition. Second, interviewers’ mood had no effect on ratings when the applicant appeared to be highly qualified for the job. Third, interviewers’ moods significantly influenced ratings of the applicant when this person appeared to be unqualified for the job, such that participants in the positive affect condition rated the applicant lower than those induced to experience negative affect. In sum, interviewer mood seems to interact with applicant qualifications such that mood plays a role only when applicants are unqualified or when qualifications are ambiguous.

Effects of Structure

Another major category of factors that affect interview decision making refers to the degree of structure in the interview (Levashina, Hartwell, Morgeson, & Campion, 2014). Structure is a matter of degree, and there are four dimensions one can consider: (1) questioning consistency, (2) evaluation standardization, (3) question sophistication, and (4) rapport building (Chapman & Zweig, 2005). Overall, structure can be enhanced by basing questions on results of a job analysis, asking the same questions of each candidate, limiting prompting, follow-up questioning, and elaboration on questions, using better types of questions (e.g., situational questions, which are discussed shortly), using longer interviews and a larger number of questions, controlling ancillary information (i.e., application forms, résumés, test scores, recommendations), not allowing the applicant to ask questions until after the interview, rating each answer on multiple scales, using detailed anchored rating scales, taking detailed notes, using multiple interviewers, using the same interviewer(s) across all applicants, providing extensive interviewing training, and using statistical rather than clinical prediction (discussed in detail in Chapter 13) (Campion, Palmer, & Campion, 1997).

The impact of structure on several desirable outcomes is clear-cut. First, a review of several meta-analyses reported that structured interviews are more valid (Campion et al., 1997). Specifically, the corrected validities for structured interviews ranged from .35 to .62, whereas those for unstructured interviews ranged from .14 to .33. Second, structure decreases differences between racial groups. A meta-analysis found a mean standardized difference (d) between white and African American applicants of .32 based on 10 studies with low-structure interviews and d = .23 based on 21 studies with high-structure interviews (Huffcutt & Roth, 1998). Note, however, that these differences are larger for both types of interviews if one considers the impact of range restriction (Roth, Van Iddekinge, Huffcutt, Eidson, & Bobko, 2002). Third, structured interviews are less likely than unstructured interviews to be challenged in court based on illegal discrimination (Williamson, Campion, Malos, Roehling, & Campion, 1997).
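To make the standardized difference (d) concrete, the following is a minimal sketch of how d might be computed for a single study, assuming the common pooled-standard-deviation form (Cohen's d). The function name and the ratings are hypothetical and invented for illustration; they are not data from Huffcutt and Roth (1998), whose values are meta-analytic means across studies.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """Return d = (mean_a - mean_b) / pooled standard deviation.

    Assumes the pooled-SD form of d; other variants exist.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical interview ratings for two applicant groups (illustration only).
group_a = [3.0, 4.0, 5.0, 4.0, 4.0]
group_b = [3.0, 3.0, 4.0, 4.0, 3.0]

d = standardized_mean_difference(group_a, group_b)
print(round(d, 2))
```

A d of 0 would mean the two groups received the same average rating; larger absolute values indicate larger group differences expressed in standard deviation units.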

A review of 158 U.S. federal court cases involving hiring discrimination from 1978 to 1997 revealed that unstructured interviews were challenged in court more often than any other type of selection device, including structured interviews (Terpstra, Mohamed, & Kethley, 1999). Specifically, 57% of cases involved charges against the use of unstructured interviews, whereas only 6% of cases involved charges against the use of structured interviews. Even more important is an examination of the outcomes of such legal challenges. Unstructured interviews were found not to be discriminatory in 59% of cases, whereas structured interviews were found not to be discriminatory in 100% of cases. Taken together, these findings make a compelling case for the use of the structured interview in spite of HR managers’ reluctance to adopt such procedures (van der Zee, Bakker, & Bakker, 2002).

Why are structured interviews qualitatively better than unstructured interviews? Higher reliability alone does not seem to be a sufficient explanation (Schmidt & Zimmerman, 2004). Most likely the answer is that unstructured interviews (i.e., the interviewer has no set procedure but merely follows the applicant’s lead) and structured interviews (i.e., the interviewer follows a set procedure) do not measure the same constructs (Huffcutt, Conway et al., 2001). Typically, structured interviews are the result of a job analysis and assess job knowledge and skills, organizational fit, interpersonal and social skills, and applied mental skills (e.g., problem solving). Therefore, constructs assessed in structured interviews tend to have a greater degree of job relatedness as compared to the constructs measured in unstructured interviews. When interviews are structured, interviewers know what to ask for (providing a more consistent sample of behavior across applicants) and what to do with the information they receive (helping them to provide better ratings).

Structured interviews vary based on whether the questions are about past experiences or hypothetical situations. Questions in an experience-based interview are past oriented; they ask applicants to relate what they did in past jobs or life situations that are relevant to the job in question (Janz, 1982; Motowidlo et al., 1992). The underlying assumption is that the best predictor of future performance is past performance in similar situations. Experience-based questions are of the “Can you tell me about a time when . . . ?” variety.

By contrast, situational questions (Latham, Saari, Pursell, & Campion, 1980; Maurer, 2002) ask job applicants to imagine a set of circumstances and then indicate how they would respond in that situation. Hence, the questions are future oriented. Situational interview questions are of the “What would you do if . . . ?” variety. Situational interviews have been found to be highly valid and resistant to contrast error and to race or gender bias (Maurer, 2002). Why do they work? Apparently the most influential factor is the use of behaviorally anchored rating scales. Maurer (2002) reached this conclusion based on a study of raters who watched and provided ratings of six situational interview videos for the job of campus police officer. Even without any training, a group of 48 business students showed more accuracy and agreement than job experts (i.e., 48 municipal and campus police officers) who used a structured interview format that did not include situational questions. Subsequent comparison of situational versus nonsituational interview ratings provided by the job experts showed higher levels of agreement and accuracy for the situational type.

Both experience-based and situational questions are based on a job analysis that uses the critical-incidents method (as described in Chapter 9). The incidents then are turned into interview questions. Each answer is rated independently by two or more interviewers on a five-point Likert-type scale. To facilitate objective scoring, job experts develop behavioral statements that are used to illustrate 1, 3, and 5 answers. Table 12.2 illustrates the difference between these two types of questions.
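The scoring procedure described above can be sketched in code. This is only an illustration under stated assumptions: the function name, the question labels, and the choice to combine ratings by simple averaging are hypothetical, since the text specifies independent ratings by two or more interviewers on a five-point scale but not how the ratings are aggregated.

```python
from statistics import mean

# Five-point scale; behavioral anchors illustrate the 1, 3, and 5 answers.
VALID_RATINGS = {1, 2, 3, 4, 5}


def score_candidate(ratings_by_question):
    """Average independent ratings per question, then across questions.

    ratings_by_question maps a question label to the list of ratings
    given independently by each interviewer (hypothetical structure).
    """
    per_question = {}
    for question, ratings in ratings_by_question.items():
        if len(ratings) < 2:
            raise ValueError(f"{question}: need two or more independent ratings")
        if any(r not in VALID_RATINGS for r in ratings):
            raise ValueError(f"{question}: ratings must be whole numbers from 1 to 5")
        per_question[question] = mean(ratings)
    overall = mean(list(per_question.values()))
    return per_question, overall


# Hypothetical ratings from two interviewers (illustration only).
candidate_ratings = {
    "situational_item": [5, 4],
    "experience_based_item": [3, 3],
}

per_question, overall = score_candidate(candidate_ratings)
print(per_question, overall)
```

Keeping the per-question averages alongside the overall score makes it easy to see where interviewers disagreed before any consensus discussion.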

Taylor and Small (2002) conducted a meta-analysis comparing the relative effectiveness of these two approaches. They were able to locate 30 validities derived from situational interviews and 19 validities for experience-based interviews, resulting in mean corrected validities of .45 for situational interviews and .56 for experience-based interviews. However, a comparison of the studies that used behaviorally anchored rating scales yielded mean validities of .47 for situational interviews (29 validity coefficients) and .63 for experience-based interviews (11 validity coefficients). In addition, mean interrater reliabilities were .79 for situational interviews and .77 for experience-based interviews. Finally, although some studies have found that the situational interview may be less valid for higher level positions (Pulakos & Schmitt, 1995) or more complex jobs (Huffcutt, Weekley, Wiesner, DeGroot, & Jones, 2001), the meta-analytic results found no differential validity based on job complexity for either type of interview.

Summary of Evidence-Based Suggestions for Improving the Interview Process and Outcome

Emphasis on employment interview research within a person-perception framework should continue. Also, this research must consider the social and interpersonal dynamics of the interview, including affective reactions on the part of both the applicant and the interviewer. The interviewer’s job is to develop accurate perceptions of applicants and to evaluate those perceptions in light of job requirements. Learning more about how those perceptions are formed, what affects their development, and what psychological processes best explain their development are important questions that deserve increased attention. Also, we need to determine whether any of these process variables affect the validity, and ultimately the utility, of the interview (Zedeck & Cascio, 1984). We should begin by building on our present knowledge to make improvements in selection-interview technology. Here are seven research-based suggestions for improving the interview process and outcome:

1. Link interview questions tightly to job analysis results, and ensure that behaviors and skills observed in the interview are similar to those required on the job. A variety of types of questions may be used, including situational questions, questions on job knowledge that is important to job performance, job sample or simulation questions, and questions regarding background (e.g., experience, education) and “willingness” (e.g., shift work, travel).

2. Ask the same questions of each candidate because standardizing interview questions has a dramatic effect on the psychometric properties of interview ratings. Consider using the following six steps when conducting a structured interview: (1) Open the interview, explaining its purpose and structure (i.e., that you will be asking a set of questions that pertain to the applicant’s past job behavior and what he or she would do in a number of job-relevant situations), and encourage the candidate to ask questions; (2) preview the job; (3) ask questions about minimum qualifications (e.g., for an airline, willingness to work nights and holidays); (4) ask experience-based questions (“Can you tell me about a time when . . . ?”); (5) ask situational questions (“What would you do if . . . ?”); (6) close the interview by giving the applicant an opportunity to ask questions or volunteer information he or she thinks is important, and explain what happens next (and when) in the selection process.

TABLE 12.2 ■ Examples of Experience-Based and Situational Interview Items Designed to Assess Conflict Resolution and Collaborative Problem-Solving Skills

Situational item: Suppose you had an idea for a change in work procedure to enhance quality, but there was a problem in that some members of your work team were against any type of change. What would you do in this situation?

(5) Excellent answer (top third of candidates): Explain the change and try to show the benefits. Discuss it openly in a meeting.

(3) Good answer (middle third): Ask them why they are against change. Try to convince them.

(1) Marginal answer (bottom third): Tell the supervisor.

Experience-based item: What is the biggest difference of opinion you ever had with a coworker? How did it get resolved?

(5) Excellent answer (top third of candidates): We looked into the situation, found the problem, and resolved the difference. Had an honest conversation with the person.

(3) Good answer (middle third): Compromised. Resolved the problem by taking turns, or I explained the problem (my side) carefully.

(1) Marginal answer (bottom third): I got mad and told the coworker off, or we got the supervisor to resolve the problem, or I never have differences with anyone.

Source: Campion, M. A., Campion, J. E., & Hudson, J. P., Jr. (1994). Structured interviewing: A note on incremental validity and alternative question types. Journal of Applied Psychology, 79, 999.

3. Anchor the rating scales for scoring answers with examples and illustrations. Doing so helps to enhance consistency across interviews and objectivity in judging candidates.

4. Whether structured or unstructured, interview panels are no more valid than are individual interviews (McDaniel et al., 1994). In fact, some panel members may see the interview as a political arena and attempt to use the interview and its outcome as a way to advance the agenda of the political network to which they belong (Bozionelos, 2005). As we have seen, however, mixed-race panels may help to reduce the similar-to-me bias that individual interviewers might introduce. Moreover, if a panel is used, letting panel members know that they will engage in a group discussion to achieve rating consensus improves behavioral accuracy (i.e., a rating of whether a particular type of behavior was present or absent) (Roch, 2006).

5. Provide a well-designed and properly evaluated training program to communicate this information to interviewers, along with techniques for structuring the interview (e.g., a structured interview guide, standardized rating forms) to minimize the amount of irrelevant information. As part of their training, give interviewers the opportunity to practice interviewing with minorities or persons with disabilities. This may increase interviewers’ ability to relate to these populations.

6. Document the job analysis and interview-development procedures, candidate responses and scores, evidence of content- or criterion-related validity, and adverse impact analyses in accordance with testing guidelines.

7. Institute a planned system of feedback to interviewers to let them know who succeeds and who fails and to keep them up-to-date on changing job requirements and success patterns.

There are no shortcuts to reliable and valid measurement. Careful attention to detail and careful “mapping” of the interview situation to the job situation are necessary, both legally and ethically, if the interview is to continue to be used for selection purposes.

THE FUTURE IS NOW: TECHNOLOGY AND BIG DATA

In previous sections, we described the use of computers, video résumés, the Internet, and credit scores. As technology progresses and HR specialists can take advantage of new technology as well as big data, we anticipate important changes in how and what type of information can be gathered to make selection decisions. Consider the following innovations.

Social Media

Social media permeates the lives of billions of individuals worldwide. As of 2016, the number of users of Facebook surpassed 1.9 billion people; Twitter, 330 million; and LinkedIn, 500 million (www.statista.com). Social media is an example of a source of big data (Harlow & Oswald, 2016), and it is changing the way in which people communicate, create relationships, and do business with each other (McFarland & Ployhart, 2015). Users of these sites share information on their personal history, attitudes, preferences, life choices, and behaviors. It seems natural, then, for employers to investigate applicants’ accounts before making a job offer. In fact, a senior manager for the EEOC noted that approximately 75% of recruiters are required to do online research on applicants, and 70% of recruiters surveyed reported rejecting individuals as a result (Roth, Bobko, Van Iddekinge, & Thatcher, 2016).

Can information available on social media sites be used in a valid and fair manner for selecting employees? Unfortunately, despite its current use, little evidence is available to answer this question. Kluemper, Rosen, and Mossholder (2012) asked two undergraduate students and a faculty member to assess the suitability for hire of 56 employed students based on the content of their Facebook pages and correlated those scores with ratings of job performance provided by supervisors. The resulting validity coefficient was .28. This is clearly an important study because it is one of a kind. But, in addition to the small sample size, suitability ratings were based on a hypothetical job of manager in a service industry, whereas the performance measure was from the students’ current (not necessarily supervisory) position.
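A validity coefficient of this kind is a correlation between predictor scores and a criterion measure. The sketch below assumes a plain Pearson correlation with no corrections, and it uses invented numbers; the function and variable names are hypothetical and the data are not from Kluemper et al. (2012).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for six people (illustration only).
suitability_ratings = [2, 4, 3, 5, 1, 4]   # predictor: rated suitability for hire
performance_ratings = [3, 3, 4, 4, 2, 5]   # criterion: supervisor rating

validity = pearson_r(suitability_ratings, performance_ratings)
print(round(validity, 2))
```

With only a handful of cases, a coefficient like this is very unstable, which is one reason the small sample in the study above calls for caution.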

In a more recent study, 86 recruiters rated the Facebook pages of graduate and undergraduate students who were near graduation and were looking for a job. The findings were much less promising, and even troubling (Van Iddekinge, Lanivich, Roth, & Junco, 2016). Recruiters rated the students using questions such as, “I can see how this person would be an attractive applicant to an organization,” “I would consider this person further for employment if they had the skills to fill an open position,” and “I would be hesitant to pursue this person as an applicant after viewing their Facebook profile” (reverse scored). About 14 months after students were hired, the authors collected performance information for 142 of them from their supervisors. Performance was measured with items such as “The employee performs tasks they are asked to complete,” “The employee goes out of their way to help other employees,” and “Overall, I am happy with this employee’s performance.” Students whose performance was rated by their supervisors also completed surveys including measures of turnover and turnover intentions. Recruiter ratings of applicants’ Facebook information were unrelated to supervisor ratings of job performance (rs = −.13 to −.04), turnover intentions (rs = −.05 to .00), and actual turnover (rs = −.01 to .01). Facebook ratings did not predict performance above and beyond general cognitive abilities, the Big Five personality traits, core self-evaluation, self-efficacy, and grade point average.

The fact that social media data are available, and in vast quantities, does not mean they are useful. The popular press, in particular, frequently shares anecdotal reports about the use of social media for employment decisions. These include claims discussing the merits, but also warning about risks, of using social media for this purpose. Unfortunately, there is not much empirical evidence to substantiate those claims (McFarland & Ployhart, 2015). Clearly, social media provides access to many applicants on a global scale. So, using it for recruitment purposes may be beneficial in terms of increasing the quantity and quality of the applicant pool. From the perspective of applicants, social networking sites, such as LinkedIn, allow them to connect directly with members of targeted organizations to receive more realistic previews about their potential employers. In addition, sites such as Glassdoor allow employees to rate their organizations regardless of what senior leaders want the employees to say (McFarland & Ployhart, 2015). So, social media increases the flow of information substantially, both for job applicants and for employers. As is the case with big data in general, however, there is a need to understand when and how such information can be used in a valid and fair manner to predict performance and other important criteria. Until such evidence becomes available, the jury is still out.

Mobile and Web-Based Selection

The use of computers and the Internet for administering tests is now pervasive. Moreover, the use of mobile devices allows applicants to upload application materials and complete forms and assessments anywhere and anytime. Technology for staffing is developing quickly, and it involves all aspects of the process, beginning with recruitment. In this section, we offer a brief overview of major issues, based on a review by Tippins (2015).

In terms of the delivery of assessments, the use of computers allows for more complex formats in addition to the traditional multiple-choice format. For example, an assessment may include audio or video content containing real people or equipment, avatars, or other forms of animation. Another technological innovation is the availability of an assessment portal, which is a single link from which applicants can access all assessments. Other possibilities include novel response-option formats, such as drag-and-drop, as well as thermometer response scales that allow for finer distinctions. These innovations may make it easier for applicants to respond more precisely, avoiding the scale-coarseness problem of traditional scales described in Chapter 6.

Mobile and Web-based selection also allows for innovations regarding scoring. For example, many vendors have large electronic databases of test takers, which allows them to create norms based on various types of jobs, occupations, industries, and even regions of the world. The availability of large electronic databases provides another advantage: the possibility of creating detailed reports to be shared with applicants. This is particularly useful for promotion decisions because employers can store large amounts of performance-related data over the span of employees’ tenure at the company.

Although there are advantages to using mobile and Web-based selection, there are also potential pitfalls. For example, the percentage of people with access to a fast Internet connection is quite large in the United States (73%), Canada (82%), South Korea (94%), Japan (86%), Switzerland (91%), and the Netherlands (88%), but this is not the case in countries such as Panama (12%), India (4.9%), and the Philippines (4.2%). Also, having access to the Internet does not guarantee equal opportunity regarding Web-based testing. For example, KSAs that are necessary in this particular testing context may not be necessary in the traditional paper-and-pencil context (i.e., familiarity with a computer, typing speed). If those KSAs are not job related, then they can have a negative effect on the validity of the assessment. Other challenges involve distractions present during test administration that are outside the test administrator’s control and possible cheating. The latter can be addressed by using proctors at test-administration sites and webcams for individual assessments. Finally, applicant reactions, especially about privacy, are particularly relevant in mobile and Web-based selection. Employers should be sure to explain to applicants the safeguards that are in place regarding the storage and use of data.

Computer Scoring of Text

Campion, Campion, Campion, and Reider (2016) provided an illustration of what is usually referred to as automated essay scoring (AES) or computer-automated scoring (CAS). Essentially, this is a technological advancement that is part of a family of techniques called computer-aided text analysis (CATA, as discussed in Chapter 6) to score the narrative responses of job applicants. A traditional challenge in using this technology is what is called the information retrieval problem, which is the difficulty in making a precise lexical match between words in a user’s query and words used within the document analyzed (e.g., essays, résumés). For example, we may be interested in learning about applicants’ leadership skills, but the candidate may not have used the terms leader or leadership. Instead, the narrative may include the term manager (Campion et al., 2016).
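The information retrieval problem can be illustrated with a toy example. The sketch below is not the method used by Campion et al. (2016); it simply shows, under the assumption of whole-word keyword matching, how an exact query for leader or leadership misses a narrative that uses manager instead. The function name, the sample narrative, and the list of related terms are all hypothetical.

```python
import re


def mentions_any(text, terms):
    """Return True if any query term appears as a whole word in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return any(term in tokens for term in terms)


# Hypothetical applicant narrative (illustration only).
narrative = "Served as manager of a six-person team and supervised weekly planning."

# Exact lexical query for the construct of interest.
exact_query = {"leader", "leadership"}

# Hypothetical expansion with related terms an analyst might add by hand.
expanded_query = exact_query | {"manager", "supervised"}

exact_hit = mentions_any(narrative, exact_query)
expanded_hit = mentions_any(narrative, expanded_query)
print(exact_hit, expanded_hit)
```

Hand-built term lists like this one are brittle, which helps explain why more sophisticated approaches try to model the relations among terms rather than rely on exact matches.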

Campion et al. (2016) created a program using the SPSS-IBM Premium Modeler package to measure six competencies: communication skill, critical thinking, people skill, leadership skill, managerial skill, and factual knowledge. The program identifies key terms within text and also constructs models on the relations among them to infer higher order characteristics or constructs in a candidate (such as the six competencies).

Campion et al.’s results demonstrated the potential of computer scoring of text because they showed that it is possible to program a computer to emulate a human rater when scoring narrative data. The computer-based scores were as reliable as those produced by human raters, and there was evidence of construct validity. From a practical perspective, computer scoring resulted in substantial savings compared to using human raters.

Remote Interviewing

The use of videoconferencing (e.g., using Skype) allows employers to interview distant applicants remotely and inexpensively (Chapman & Rowe, 2002). Telephone interviewing is quite common (Schmidt & Rader, 1999). However, some key differences between face-to-face interviews and interviews using technologies such as the telephone and videoconferencing may affect the process and outcome of the interview (Chapman & Rowe, 2002). In the case of the telephone, an obvious difference is the absence of visual cues (Silvester & Anderson, 2003). On the other hand, the absence of visual cues may reduce some interviewer biases based on nonverbal behaviors that were discussed earlier in this chapter. Regarding videoconferencing, the lack of a duplex system that allows for both parties to talk simultaneously may change the dynamics of the interview.

A hybrid way to conduct the interview is to do it face to face, record both audio and video, and then ask additional raters, who were not present in the face-to-face interview, to provide an evaluation (Van Iddekinge, Raymark, Roth, & Payne, 2006). However, a simulation that included 113 undergraduate and graduate students provided initial evidence that ratings may not be equivalent. Specifically, face-to-face ratings were significantly higher than those provided based on the videotaped interviews. Further research is needed to establish conditions under which ratings provided in face-to-face and videotaped interviews may be equivalent.

One study compared the equivalence of telephone and face-to-face interviews using a sample of 70 applicants for a job in a large multinational oil corporation (Silvester, Anderson, Haddleton, Cunningham-Snell, & Gibb, 2000). Applicants were randomly assigned to two groups. Group A received a face-to-face interview followed by a telephone interview, and group B received a telephone interview followed by a face-to-face interview. Results revealed that telephone ratings (mean = 4.30) were lower than face-to-face ratings (mean = 5.52), regardless of the interview order. Silvester et al. (2000) provided several possible reasons for this. During telephone interviews, interviewers may be more focused on content rather than extraneous cues (e.g., nonverbal behavior), in which case the telephone interview may be more valid than the face-to-face interview. Alternatively, applicants may have considered the telephone interview as less important and could have been less motivated to perform well, or applicants may have had less experience with telephone interviews, which could also explain their lower performance.

Another experimental study compared face-to-face interviews with videoconferencing interviews using a sample of undergraduate students being interviewed for actual jobs (Chapman & Rowe, 2002). Applicants in the face-to-face condition were more satisfied with the interviewer’s performance and with their own performance during the interview, as compared to applicants in the videoconferencing condition (Chapman & Rowe, 2002).

Virtual Reality Technology

Virtual reality technology (VRT) is a technological advance that has the potential to alter the way screening is done (Aguinis, Henle, & Beaty, 2001). Imagine applicants for truck-driver positions stepping into a simulator of a truck to demonstrate their competence. Or imagine applicants for lab-technician positions entering a simulated laboratory to demonstrate their ability to handle various chemical substances. VRT has several advantages because it has the potential to create such job-related environments without using real trucks or real chemicals. Thus, users can practice hazardous tasks or simulate rare occurrences in a realistic environment without compromising their safety. VRT also allows examiners to gather valuable information regarding future on-the-job performance. As noted by Aguinis, Henle, and Beaty (2001), “Just a few years ago, this would have only been possible in science fiction movies, but today virtual reality technology makes this feasible” (p. 70).

The implementation of VRT presents some challenges, however. For example, VRT environments can lead to sopite syndrome (i.e., eyestrain, blurred vision, headache, balance disturbances, drowsiness; Pierce & Aguinis, 1997). A second potential problem in implementing VRT testing is its cost and lack of commercial availability. However, VRT systems are becoming increasingly affordable. In fact, a Google Daydream View VRT headset costs less than $100. A final challenge faced by those contemplating the use of VRT is its technical limitations. In virtual environments, there is a noticeable lag between the user’s movement and the change of scenery, and some of the graphics, including the virtual representation of the user, may appear cartoonlike. However, given the frantic pace of technological advances, we should expect that some of the present limitations will soon be overcome.

In closing, new technology and the availability of big data are making possible the use of innovative selection methods. Some of these methods may be passing fads, whereas others may become popular. Regardless of their attractiveness and availability, understanding the constructs being measured and their job relatedness will continue to be essential for determining the appropriateness of these methods. It will also be important to understand hidden costs, such as negative applicant reactions and scores that are not as valid as those resulting from more traditional and well-established methods.

EVIDENCE-BASED IMPLICATIONS FOR PRACTICE

• Several methods are available to make decisions at the initial stages of the selection process (i.e., screening). None of these methods offers a “silver bullet” solution, so it is best to use them in combination rather than in isolation.

• Personal history data, collected through application forms, biographical information blanks, and résumés, are most useful when they are based on a rational approach: questions are developed and data collected based on a job analysis and hypotheses about relations between the constructs underlying items and job-performance constructs.

• Recommendations and reference checks are most useful when they are used consistently for all applicants and when the information gathered is relevant for the position in question.

• Polygraph testing is likely to lead to errors, and administrators should be aware that the physiological indicators can be altered by conscious efforts on the part of applicants.

• Honesty or integrity tests are either overt or personality oriented. Given challenges and unresolved issues with these types of tests, consider using alternatives to the traditional paper-and-pencil format, and include situational-judgment and conditional-reasoning tests.

• Evaluations of training and experience qualifications are most useful when they are directly relevant to specific job-related areas.

• For drug screening to be most effective and less susceptible to legal challenges, it should be presented within a context of safety and health and as part of a comprehensive policy regarding drug use.

• Employment interviews are used almost universally. Factors that may affect the validity of the employment interview include social/interpersonal issues, cognitive biases, individual differences in both interviewers and interviewees, the interview structure, and the format used (i.e., face to face, videotaped).

• The availability of big data and technological advancements (e.g., social media, mobile and Web-based selection, and virtual reality technology) create new opportunities, but there is much we need to understand about their validity and reliability before we can recommend their use more widely.

Discussion Questions

1. How can the usefulness of recommendations and reference checks be improved?

2. Is the use of video résumés and credit history scores equally fair for all job applicants? Why or why not?

3. As CEO of a large retailer, you are considering using drug testing to screen new hires. What elements should you include in developing a policy on this issue?

4. What instructions would you give to applicants who are about to complete a biodata instrument so as to minimize response distortion?

5. What is the difference between personality-based and overt honesty tests? Which constructs are measured by each type of measure?

6. Are you in favor of or against the use of polygraph testing for screening applicants for security screening positions at airports? Why?

7. In an employment interview, the interviewer asks you a question that you believe is an invasion of privacy. What do you do?

8. Employers today generally assign greater weight to experience than to academic qualifications. Why do you think this is so? Should it be so?

9. Discuss some of the advantages of using computer-based screening (CBS). Given these advantages, why isn’t CBS more popular?

10. Your boss asks you to develop a training program for employment interviewers. How will you proceed? What will be the elements of your program, and how will you tell if it is working?

11. Discuss the advantages of using a structured, as opposed to an unstructured, interview. Given these advantages, why are HR managers reluctant to conduct structured interviews?

12. Discuss potential pitfalls in using social media for selection purposes.

13. What type of empirical evidence would be useful to understand when and how social media can be used for selection purposes?

14. Provide examples of constructs and specific jobs for which the use of virtual reality technology would be an effective alternative compared to more traditional screening methods.
