Assessments in Human Resources


PREDICTING ASSESSMENT CENTER PERFORMANCE WITH 360-DEGREE, TOP-DOWN, AND CUSTOMER-BASED COMPETENCY ASSESSMENTS

CHRISTINE M. HAGAN, ROBERT KONOPASKE, H. JOHN BERNARDIN, AND CATHERINE L. TYLER

In the first criterion-related validity study of a complete 360-degree competency assessment process (i.e., where customer data are included), aggregated 360-degree assessment of 428 retail associate store managers on six competencies showed strong validity (.50) in the prediction of assessment center performance. In addition, 360-degree assessments on each of the six competencies were significantly correlated with the criteria. The aggregated 360-degree assessments also demonstrated incremental validity over managerial ratings alone in the prediction of assessment center criteria. Customer (mystery shopper) assessments were also significantly correlated with the assessment center criteria and exhibited incremental validity beyond supervisory assessments. © 2006 Wiley Periodicals, Inc.

Correspondence to: Christine M. Hagan, University of Miami; 414-S Jenkins; Coral Gables, FL 33124, Phone: 305-284-1380; Fax: 305-284-3655; E-mail: chagan@miami.edu

Human Resource Management, Fall 2006, Vol. 45, No. 3, Pp. 357–390 © 2006 Wiley Periodicals, Inc.

Published online in Wiley InterScience (www.interscience.wiley.com).

DOI: 10.1002/hrm.20117

The use of competencies as a fundamental building block of organizations and the people they employ is increasingly popular (Becker & Huselid, 1999; Becker, Huselid, & Ulrich, 2001; Lievens, Sanchez, & De Corte, 2004; Lucia & Lepsinger, 1999). Schippmann et al. (2000) state that the “practice of competency modeling has exploded onto the field of human resources over the past several years” (p. 704) and estimate that between 75% and 80% of surveyed companies use some form of competency-based application. Lievens et al. (2004) report that over 500 articles were published on the topic between 1995 and 2003. It is estimated that firms spend $100 million per year developing, implementing, and revising competency models (Athey & Orth, 1999).

One reason for the popularity of competency programs is the belief that traditional job-based management systems may impede an organization’s speed and agility in the face of today’s globalization and rate-of-change challenges. By contrast, a competency-based system should be organized around the capabilities and capacities needed to create customer value (Prahalad & Hamel, 1990). Lawler (1994) highlighted the distinction: “Despite its historical utility, there is growing evidence that it may be time for many organizations to move away from a focus on jobs and towards a focus on individuals and their competencies . . . Instead of thinking about people as having a job with a particular set of activities that can be captured in a relatively permanent and fixed job description, it may be more appropriate and more effective to think of them as human resources that work for an organization” (p. 4).

While some theoretical work has explored approaches to designing competency-based programs (e.g., Schippmann, 1999; Spencer & Spencer, 1993), and case studies have described the way competencies are used in organizations (e.g., Kirn, Rucci, Huselid, & Becker, 1999; McCowan, Bowen, Huselid, & Becker, 1999), little empirical work has been directed at the construct and criterion-related validity of competency measurement. The issue is important because program effectiveness is often contingent on the accuracy with which visions, goals, and concepts are translated into operational tools, techniques, and programs.

Many companies now use some form of multirater or 360-degree assessment process to measure managerial competencies (e.g., Lucia & Lepsinger, 1999; Ostroff, Atwater, & Feinberg, 2004; Smither, London, & Reilly, 2005). Lawler (1994) specifically suggested that within competency programs, supervisory judgments about individual capabilities may not be as accurate as those of other, perhaps more qualified observers, especially “peers and technical experts” (p. 9). Traditional top-down assessment (TDA) systems consist of one person, the direct supervisor, conducting a periodic evaluation of the employee’s competence or performance over a specified time period. Traditionally, TDA has been the most widely used approach to performance assessment in organizations (Scullen, Mount, & Goff, 2000) and in decision-making processes concerning who will be promoted (Powell & Butterfield, 1997).

While 360-degree systems were originally used for developmental purposes, they are increasingly being used for administrative decision making (Greguras, Robie, Schleicher, & Goff, 2003). Lucia and Lepsinger (1999) are among the many consultants recommending a 360-degree assessment approach to the measurement of competencies. Three-hundred-sixty-degree assessment programs gather performance information from subordinates, peers, supervisors, and (occasionally) customers. Borman (1997) asserts that two assumptions underlie these programs: (1) each source of rating offers at least somewhat unique information concerning the ratee’s performance and (2) evaluations from different sources exhibit incremental validity beyond that of any single source. Others suggest that 360-degree assessment (particularly if self-assessment is excluded) is less likely to be susceptible to rater bias than single-rater systems (Bernardin & Tyler, 2001), that receiving assessment from people with different relational perspectives provides a more comprehensive perspective (Yammarino & Atwater, 1993), that assessment is less likely to be ignored if subordinates and peers are included (London & Smither, 1995), and that 360-degree assessment fits better than traditional TDA given today’s downsized, more horizontal, team-based organizations (Murphy & Cleveland, 1995). Yet, there is little research that investigates the validity of multirater appraisal relative to the more common and traditional top-down competency assessment.

The purpose of this study is twofold. First, this article describes the process used by a Fortune 500 retailer to identify and develop a group of competencies that represent the core of its management system for store manager (SM) and associate store manager (ASM) positions. This competency program was specifically designed for and by this firm and involved the participation of many individuals and groups. We hope that adding a description of our approach to a growing body of research will be instructive for firms considering competency programs or for organizations with established programs that are interested in continuous improvement. The second purpose of this study is to investigate the validity of competency measurement using a 360-degree approach to competency assessment as opposed to traditional supervisory assessment. Since this study includes external customers as part of the full 360-degree network, we also explore the validity of this source of competency assessments. In these comparisons, the independent criterion is assessment center (AC) competency scores as judged by expert observers.

Core Competencies

Some of the earliest work on competency-based programs may be traced to a general disappointment with the ability of traditional HR tools (e.g., paper-and-pencil tests, measures of academic aptitude) to predict an individual’s behavior, particularly in unstructured situations. Spencer and Spencer (1993) described a competency as an individual characteristic existing at three levels. Knowledge and skill are “on the surface” and are, therefore, observable and measurable. Traits and motives are embedded deeply within an individual and tend to relate to more enduring characteristics like personality. Self-concept is found in between and includes an individual’s attitudes, values, and self-image. These three levels are linked to individual performance through an Intent–Action–Outcome model. On the basis of this theory, organizations were advised to focus on the more “bedrock” individual elements (traits and motives), because the surface characteristics (knowledge and skills) were easier to change through training and development. Such “bedrock” dimensions would best be measured, they suggested, by examining an individual’s behavior.

Strategy experts argue that an organization’s internal resources—including the capabilities of its workforce—could give rise to competitive advantage (Barney, 1991; Wernerfelt, 1984). Competencies benefit firms when they provide customer value, resist imitation, and enable a firm to create new business through product and service extensions, and through innovation. In their groundbreaking work, The Core Competence of the Organization, Prahalad and Hamel (1990; see also Hamel & Prahalad, 1994) conceptualized competencies very broadly to include both skills and technologies, arguing that skills without effective equipment, processes, and techniques will not perform to their full potential. Thus, the notion of competency grew to include both an individual employee attribute and organizational performance potential, and the managerial challenge expanded from a focus on organization–person and job–person match, to configuring and aligning a sociotechnical system. Today, human capital theory proposes that capabilities are the link between business plans and operational action (Ulrich, 1997) and that the capabilities of an organization’s employees, combined with its processes and its customer relationships, are a source of competitive advantage (Becker & Huselid, 1999).

Unfortunately, there is no widely accepted definition of the term competency (Schippmann et al., 2000). Our review of the literature suggests that “competencies” range from personal traits to work behaviors, and include everything in between. Lucia and Lepsinger (1999) base their approach to competency modeling on definitions of a competency from Klemp (1980), which they regard as “widely accepted among human resource specialists in corporate environments” (p. 5), and from Parry (1996) based on a synthesis of definitions from human resource development experts. Klemp (1980) defined a competency as an “underlying characteristic of a person which results in effective and/or superior performance on the job” (p. 21). Parry (1996) defined a competency as a “cluster of related knowledge, skills and attitudes that affects a major part of one’s job (a role or responsibility), that correlates with performance on the job, that can be measured against well-accepted standards” (p. 50). Lawler (1994) argues that skills are the basic building blocks of competencies and, often, uses the words interchangeably.

As indicated previously, Prahalad and Hamel’s definition (1990) extends well beyond individual capabilities. Milkovich and Newman (2005) assert that the definition of competency is evolving such that vague ideas relating to self-concepts, traits, and motives are being replaced by business-related descriptions of behaviors. Tett, Guterman, Bleier, and Murphy (2000) define competencies as work behaviors that are predicated on individual traits. A review of the competencies used by three major managerial consulting firms (cited in Tett et al., 2000) suggests that competencies include traits (e.g., “creativity,” “self-knowledge,” and “objectivity”), knowledge (e.g., “technical knowledge,” “procedural knowledge,” and “business knowledge”), skills (e.g., “delivering presentations” and “coaching”), abilities (e.g., “political savvy,” “drive for results,” and “strategic agility”), and behaviors (e.g., “confronting direct reports,” “directing others,” and “listening”).

The lack of consensus about the definition of a competency means that competency-related discussions may get bogged down in semantics. Given this situation, we suggest that competencies should be recognized and judged based on specific contextual features, particularly how they are developed, what they represent within an organization, and the role that they play in delivering customer value. Our review of the literature suggests that competency-based programs are distinguished from traditional job-based programs in three respects. First, competencies are about what jobs share in common, rather than what makes each job unique (Schippmann et al., 2000). Second, these commonalities are directly linked to achievement of organizational success, rather than success in any individual job (Prahalad & Hamel, 1990). Third, because they are such fundamental commonalities, these core competencies become the primary mechanism that drives organization design, job structure, and managerial practices (Lawler, 1994). This latter principle is what makes core competencies the “connective tissue” (Prahalad & Hamel, 1990) that links an organization’s workforce with its technology to create a seamless sociotechnical system. When this system creates outputs that customers value and that are difficult to imitate, it gives rise to sustainable competitive advantage.

Our client organization chose to draw on the definition of competency offered by Schippmann (1999) as clusters of measurable and behaviorally based characteristics or capabilities of people. This is a general definition that focuses on behavior but also includes specific knowledge and skills.

360-Degree Assessment

As indicated earlier, this article examines how effectively a 360-degree competency measurement system predicts expert judgments about competencies derived from an assessment center. Waldman, Atwater, and Antonioni (1998) suggest that many companies implement 360-degree feedback programs because it is faddish to do so, because they feel pressure to imitate the practices of others, as a response to internal politics, or to create an impression of openness and participation. Such programs often are designed and implemented without clear objectives and with no real plan for validating the instruments or for measuring program effectiveness. Like competency-based programs, 360-degree assessment may be a real-world practice whose use and popularity are outpacing the progress of empirical research about how and why it works (Becker & Huselid, 1999; Smither et al., 2005).

Research to date has focused on similarities and differences among raters and rating sources, and ratee reactions to 360-degree assessment (e.g., Fletcher, Baldry, & Cunningham-Snell, 1998; Greguras & Robie, 1998; Mount, Judge, Scullen, Sytsma, & Hezlett, 1998; Ostroff et al., 2004; Scullen, Mount, & Judge, 2003). Smither et al. (2005) conducted a meta-analysis of 24 longitudinal studies that focused on performance changes after receipt of multirater feedback. Among the criteria across these studies were assessment center performance, subsequent performance appraisals, objective performance data, and the satisfaction and turnover intentions of subordinates. They reported a performance-change mean effect size (corrected for measurement error and sampling error) of .15 (subordinate feedback), .05 (peer feedback), and .15 (supervisor feedback). Although the mean effect sizes are quite small, their positive direction suggests that performance does improve following multirater feedback.

In our search of the literature, we identified only one published study that investigated the degree to which multiple raters provide better prediction than traditional TDA. In their study of 36 managers in Australia, Atkins and Wood (2002) found that both supervisor ratings alone and the average of supervisor, peer, and subordinate ratings predicted overall assessment center ratings, and that the mean 360-degree assessments had higher criterion-related validity than TDA alone. Our study is similar to the Atkins and Wood (2002) research in that 360-degree assessments are used to predict AC ratings, but we extend and improve upon that research in a number of important ways. First, the 360-degree and assessment center criteria presented here were derived from a competency-modeling approach (i.e., Schippmann, 1999). Second, unlike Atkins and Wood (2002) and almost all published research on multirater assessment, this study includes assessments provided by external customers as part of the 360-degree network. Third, this study examines whether weighting individual competencies within a competency model on the basis of expert judgments improves predictive validity (Ganzach, Kluger, & Klayman, 2000). Fourth, this study was conducted with a much larger sample size (i.e., 428 versus 36).
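The comparison described above, supervisor ratings alone versus the average of several rating sources as predictors of assessment center ratings, can be illustrated with a minimal sketch. The data, the source labels, and the helper names `pearson_r` and `mean_360` are hypothetical and are not taken from either study; the sketch only shows how a criterion-related validity coefficient (a Pearson correlation between predictor and criterion) might be computed for each approach.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_360(ratings_by_source):
    """Unit-weighted average of one ratee's ratings across rating sources."""
    return mean(list(ratings_by_source.values()))


# Hypothetical seven-point ratings (illustrative only, not study data).
managers = [
    {"supervisor": 5, "peer": 4, "subordinate": 6},
    {"supervisor": 3, "peer": 4, "subordinate": 2},
    {"supervisor": 6, "peer": 6, "subordinate": 5},
    {"supervisor": 4, "peer": 3, "subordinate": 4},
    {"supervisor": 2, "peer": 3, "subordinate": 3},
]
ac_ratings = [5.0, 3.5, 6.0, 3.0, 2.5]  # hypothetical overall AC ratings

tda_scores = [m["supervisor"] for m in managers]  # top-down assessment only
agg_scores = [mean_360(m) for m in managers]      # mean across all sources

validity_tda = pearson_r(tda_scores, ac_ratings)
validity_360 = pearson_r(agg_scores, ac_ratings)
print(f"TDA validity: {validity_tda:.2f}; 360-degree validity: {validity_360:.2f}")
```

With real data, the two coefficients would be compared on the same sample of ratees; the sketch does not include a significance test for the difference between them.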

Despite a plethora of writing recommending the use of customer data, our study is only the second effort that includes external customers as a rating source in the 360-degree system. This study investigates the validity of “mystery shopper” data as the source of customer data. We were thus able to examine the relative and predictive value of this approach to assessment. While there has been considerable discussion of the important role customers should play in 360-degree assessments (e.g., Bernardin, Hagan, Kane, & Villanova, 1998; Cardy, 1998; London & Beatty, 1993), only Church (1997) reported the use of client feedback in his study of a business advisory firm.

While researchers have explicitly recommended the use of customer-generated criteria in the derivation of performance standards and performance management systems (Bernardin, 1992; London & Beatty, 1993; Villanova, 1992), there is very little research on customer-based assessment data. Church (1997) found little correspondence between clients and other rating sources. While marketing research indicates that surveying customer satisfaction is a widespread practice, the approach can be expensive, can capture information that is subjective and difficult to interpret, and its connection with future business results or outcomes is unclear. In addition, such data may have limited value in assessing the degree to which a firm’s strategic and tactical processes are effectively operationalized at the customer level (Moriarty, McLeod, & Dowell, 2003; Wilson, 2002).

Wilson (2002) reported that a very high percentage of retail and service companies use customer data for decision making, and 71% reported supplementing customer surveys with mystery shoppers. Mystery shoppers typically pose as customers in order to experience, evaluate, and report on the consistency of processes and procedures and the level of customer service (Finn & Kayande, 1999). While recent case studies describe the value of mystery shopping (e.g., Anderson & Bissell, 2004; Moriarty et al., 2003), little research has examined the psychometric qualities of mystery shopping data. In the only study of its kind to date, Finn and Kayande (1999) found acceptable levels of reliability for mystery shopper ratings, particularly when more stable, objective criteria were used, and when mystery shoppers received higher levels of training in what and how to observe.

To summarize, based on our review of the literature, this article contributes knowledge in four ways. It is the first study that investigates criterion-related validity within a competency measurement program. Second, using assessment center data as the criteria, we are able to assess the incremental validity of 360-degree competency assessment relative to the more traditional “top-down” or supervisory assessment approach. Third, this is a rare study of 360-degree assessment that includes external customers as a source of assessment information and, thus, enables us to begin to explore the value that customers may add to an overall system of competency measurement. Fourth, this study evaluates the validity of alternative approaches to weighting competencies.

Assessment Centers

Many companies use assessment centers, particularly for managerial assessment and the early identification of managerial potential (Goldstein, Yusko, & Nicolopoulos, 2001). Schippmann et al. (2000) make the compelling argument that there is little distinction between the assessment of dimensions typically used in assessment center literature and the most popular definitions of competencies. Goldstein et al. (2001), for example, simply changed the name of their rating factors from “ability dimensions” in their 1998 study to “competencies” in their 2001 article, even though the assessment center and the basic data used in the two studies were identical (Goldstein, Yusko, Braverman, Smith, & Chung, 1998).

An assessment center is a procedure for evaluating people in terms of human attributes, abilities, or capabilities that are judged to be relevant to organizational effectiveness (Thornton, 1992). Assessment centers are typically characterized by (1) their use of situational tests or work samples (2) to elicit specific behavior, (3) which is observed by trained assessors, (4) who make independent evaluations across multiple competencies about what they have seen, and (5) then pool their observations with other trained assessors to arrive at an overall assessment center rating. Assessment centers are one of the most popular and enduring approaches to measuring individual skills and capabilities in both private and public organizations, particularly for the assessment of managers and managerial potential (Arthur, Woehr, & Maldegen, 2000). Fiedler (2001) cites survey data indicating that at least 50% of major U.S. employers use some form of assessment center and that top HR executives consider assessment centers to be an effective approach to selection and promotion.

Among organizations that formally measure their assessment center’s effectiveness, 88% use a content validity approach (Spychalski, Quinones, Gaugler, & Pohley, 1997). In other words, firms assess the degree to which the exercises closely resemble real job activities, the behavioral responses to situations reflect performance dimensions that matter to the organization, and trained assessors generally make consistent and accurate observations. In their meta-analysis of 50 studies, Gaugler, Rosenthal, Thornton, and Bentson (1987) found a mean correlation of .37 when overall assessment ratings (OARs) were compared with a variety of other criteria, including individual performance ratings (.36), ratings of future potential (.53), performance in training programs (.35), and career progress (e.g., number of promotions, changes in salary over time) (.36). In a recent meta-analysis of AC dimensions, Arthur, Day, McNelly, and Edens (2003) collapsed 168 different dimensions reported in the literature into six categories and found a mean correlation of .36 between AC dimensions and job-related criteria. While the use of assessment center data as criteria can be criticized because such data are not actual measures of on-the-job performance, AC data have distinct advantages over on-the-job measures such as supervisory ratings.1

In summary, assessment centers are widely used, and resultant assessments have been shown to correlate with a number of important work outcomes. We thus argue that AC performance measures, both in the aggregate and at the individual competency level, represent relevant and important criteria for the study of managerial competencies.

Hypotheses

This study tests the following hypotheses:

H1a: The mean 360-degree ratings of the six competencies and the aggregated ratings across competencies will exhibit criterion-related validity in the prediction of AC performance.

H1b: The mean 360-degree ratings of the six competencies and the aggregated ratings across competencies will exhibit construct validity in the prediction of AC performance.

H2: The mean 360-degree ratings of the six competencies will exhibit incremental validity (beyond that of TDA) in the prediction of AC performance.

H3: The mean customer-based ratings of the six competencies will exhibit incremental validity (beyond that of TDA) in the prediction of AC performance.
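Incremental validity, as invoked in H2 and H3, is commonly quantified as the gain in explained criterion variance (ΔR²) when a second predictor is added to a model that already contains the first. The sketch below is an illustration under stated assumptions, not the authors' analysis: it handles only the two-predictor case, uses the standard closed-form expression for R² in terms of the three pairwise correlations, and the function names and data are hypothetical. A significance test for ΔR² is omitted.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 for regressing y on two predictors, from pairwise correlations.

    Assumes the predictors are not perfectly collinear (|r_12| < 1).
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def incremental_r2(criterion, baseline, added):
    """Gain in R^2 when `added` joins a model that already has `baseline`."""
    r_y1 = pearson_r(criterion, baseline)
    r_y2 = pearson_r(criterion, added)
    r_12 = pearson_r(baseline, added)
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1 ** 2


# Hypothetical scores (illustrative only): TDA as the baseline predictor,
# a customer-based rating as the added predictor, AC score as the criterion.
tda = [5, 3, 6, 4, 2, 5]
customer = [4, 4, 6, 2, 3, 6]
ac_score = [4.5, 3.5, 6.0, 3.0, 2.5, 5.5]

delta = incremental_r2(ac_score, tda, customer)
print(f"Delta R^2 from adding customer ratings to TDA: {delta:.3f}")
```

A ΔR² near zero would indicate that the added source is largely redundant with TDA; a larger value would indicate that it carries criterion-relevant information the supervisor rating lacks.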

Like other types of HR assessment, competency measurement may include the calculation of an overall score to summarize an individual’s capability and/or potential. The degree to which different summarizing methods create different overall scores has importance both for the individual and the organization. Most competency modeling approaches acknowledge that some competencies are more important than others when the purpose of measurement is prediction. Lucia and Lepsinger (1999), for example, describe the importance of properly weighing competencies for succession planning. We are unaware of any research that addresses the relative or absolute validity of expert-weighted competency ratings versus the unit-weighting of all competencies, where all competency “predictors” are weighted by 1.0.

Research has established that when it is technically feasible to use regression weights for predictors, this approach is superior to simple equal weighting (Cascio & Aguinis, 2005; Ganzach et al., 2000). But what about an expert-weighting approach where, as a part of the competency modeling process, experts are asked to estimate the relative predictive importance of each competency for some prescribed purpose? Would this expert-based approach also result in superior criterion-related validity compared to a unit-weighting approach? Most organizations do not have regression-based validation studies that can be used to derive cross-validated predictive weights. Should an organization also ask experts to estimate the predictive value of competencies? Are experts able to correctly weight the competencies using the obtained and cross-validated regression weights as the criterion?

This study will also examine the relative validities of expert-based versus unit-based weighting approaches to summarizing competency ratings for predictive purposes. Although the empirical evidence on expert-based systems is less compelling than it is for empirically derived regression weights, our final hypothesis is that the expert system will exhibit superior criterion-related validity beyond that of unit-weighting. We will make this comparison and investigate the relationship between the expert weights derived in the competency modeling project with the competency weights derived from the regression analysis.


H4: In deriving overall competency composites, weighting competencies based on expert judgments will demonstrate higher levels of criterion-related validity than a unit-weighted (i.e., simple averaging) approach.
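The two composite-scoring approaches contrasted in H4 can be sketched as follows. The competency labels, ratings, and expert weights below are hypothetical placeholders, and normalizing by the sum of the weights is an assumption made here so that both composites stay on the original rating scale; the article does not specify the scoring formula in this section.

```python
def composite(ratings, weights=None):
    """Weighted mean of competency ratings.

    With weights=None every competency is weighted 1.0 (unit weighting,
    i.e., a simple average). Otherwise `weights` maps each competency
    name to an expert-assigned weight.
    """
    if weights is None:
        weights = {name: 1.0 for name in ratings}
    total_weight = sum(weights[name] for name in ratings)
    weighted_sum = sum(weights[name] * ratings[name] for name in ratings)
    return weighted_sum / total_weight


# Hypothetical seven-point ratings for one ratee (illustrative only).
ratings = {"competency_a": 4, "competency_b": 6, "competency_c": 5}

# Hypothetical expert judgments of relative predictive importance.
expert_weights = {"competency_a": 3.0, "competency_b": 1.0, "competency_c": 1.0}

unit_score = composite(ratings)                    # simple average
expert_score = composite(ratings, expert_weights)  # expert-weighted average
print(f"Unit-weighted: {unit_score:.2f}; expert-weighted: {expert_score:.2f}")
```

Testing H4 would then amount to computing both composites for every ratee and comparing how strongly each correlates with the assessment center criterion.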

Method

Sample

Participants were 428 associate store managers (ASM) from a Fortune 500 retailer who were candidates for promotion to store manager (SM). All associate store managers had worked in a managerial capacity for the retailer for at least one year and were required to be performing at a satisfactory (or better) level as associate store managers on their last formal performance assessment. Assessment center, top-down, and other 360-degree competency ratings were available on each ASM. The demographic breakdown of the ASMs under study was 71% male, 70% white, 15% black, 9% Hispanic, and 6% other or not identified.

Competency Development Process

Competencies were developed as a part of a study of all retail store management jobs and succession planning at the company. The process for competency development was most compatible with the approach described by Schippmann (1999, pp. 153–189). Among the purposes of the project was to develop a 360-degree assessment program and an assessment center for developing and selecting store managers. All of the “core” competencies under study here were judged to be “critical” for performance in both the ASM and SM jobs.

As indicated earlier, competencies were defined as clusters of measurable and relevant behaviorally based characteristics or capabilities of people (Schippmann, 1999). The capabilities were described as descriptors reflecting abilities to perform specific work activities and may include specific skills or specific knowledge. The goal was to define all competencies with specific, observable, and verifiable descriptors that were reliably and logically classified together.

We used a variety of methods and data to derive competencies for assessment. We conducted a critical incident job analysis for ASM and SM jobs; reviewed company records and performance assessment criteria, formats, and data; interviewed subject-matter experts (SMEs); used two taxonomies of retail managerial competencies from previous projects; and reviewed responses to a questionnaire that derived ratings of the importance of various managerial tasks, activities, behaviors, knowledge, skills, and abilities for retail management positions within the company. Following the recommendations and research of Lievens et al. (2004), many diverse groups of SMEs, composed of representatives from different and critical organizational perspectives, were assembled to perform the different tasks required in the competency modeling process. SMEs included effective ASMs (n = 18), SMs (n = 11), district managers (DMs) (n = 7), human resource specialists (n = 5), and consultants (n = 4).

We began by assembling four groups of SMEs, made up of DMs, HR specialists, and consultants, who participated in the refine- ment of the company’s mission/vision state- ment, the derivation of a competitive strat- egy (adapted from Schippmann, 1999, pp. 35–68), and the initial development of core competencies to meet strategic business goals. We used the Schippmann (1999, p. 19) definition of a “competency domain” as a frame of reference for the discussion. These groups also helped to refine and amend a questionnaire to be used in subse- quent steps. The second step involved four groups of SMEs, made up of SMs, ASMs, and HR specialists who made importance ratings of managerial tasks, activities, behaviors, knowledge, skills, and abilities using the questionnaire. Then, these same four SME groups drafted and refined a list of compe- tencies based on both the strategic business goals for the company and the question- naire results that described the SM’s and

Human Resource Management DOI: 10.1002/hrm

Predicting Assessment Center Performance 365

ASM’s jobs. Six groups of SMEs wrote and refined descriptor content for each competency, using the approach described in the development of behavioral expectation scales (BESs) (Bernardin & Beatty, 1984, pp. 82–87). These descriptors were then “retranslated” into the list of competencies by different groups of SMEs who were also asked to identify “relative importance” weights for each competency and “relative predictive” weights for each competency, as described in the next section.

Six competencies were identified as “core” or “generalizable” competencies such that each was judged to be a critical and independent underlying competency for successful performance at both the ASM and the SM levels. The six competency categories were Oral Presentation and Communication, Written Communication, Interpersonal Skills, Planning and Organizing, Decision Making, and Leadership. These competencies were judged to be important for and representative of relatively stable and important work activities for retail management jobs within the organization. One additional competency, Technical Knowledge, was also identified as a “core” competency, but a majority of SMEs believed that the most valid and reliable measurement of this competency would be through standardized testing rather than assessments by any rating source, including SMs.

All core competency ratings made by assessors in the assessment centers and all 360-degree assessments were made on seven-point behavioral expectation scales following the prescriptions for such scales as described by Bernardin and Smith (1981). For example, Written Communication was defined by a team of SMEs as the “clear expression of ideas in writing and in good grammatical form.” The focus groups derived generic, behavioral examples of written communication, such as “exchanges information/reports with superior regarding the day’s activities,” and descriptors representing levels of competence, such as “completes all written reports and required forms in a manner that ensures the inclusion of all data necessary to meet the needs of the personnel using the information” and “uses appropriate vocabulary and avoids excessive technical jargon in required correspondence.” The descriptor anchors for each competency described work activities judged by SMEs to be important for performance at both the ASM and SM levels.

We tested different rating formats and presented various scale format options to SMEs. Contrary to recommendations by others (e.g., Lucia & Lepsinger, 1999), we concluded that the behavioral expectation approach combined with the rating method described by Bernardin and Smith (1981) was preferable to alternative formats and more compatible with the definition of a “competency.” Figure 1 presents an example of a competency rating scale for Planning and Organizing.

For each of the “core competencies,” the midpoint (4) behavioral expectation anchor was judged by SMEs to be at least “moderately predictive” of effective performance representing that same competency at the SM level, and of overall performance as an SM. For example, for Planning and Organizing, an acceptable level of competence meant that the manager “could be expected to establish clear strategies that are tied to specific objectives for all occasions in which such strategies are needed with a level of precision that facilitates the setting and meeting of mostly clear, appropriate, and attainable objectives.” Descriptors also were judged by SMEs for the degree to which they were indicative of an acceptable level of manifested performance at the SM level. As described in Bernardin and Smith (1981), more specific scale anchors were subject to retranslation as well and differed somewhat as a function of the rating source.

Weighting of Competencies

In order to assess the perceived relative importance of the competencies, we asked 20

366 HUMAN RESOURCE MANAGEMENT, Fall 2006

SMEs (SMs and DMs) to divide 100 points among the six competencies to reflect the relative importance of the competencies for effective performance as a store manager. Specifically, raters were asked to indicate for each competency “to what extent a high (or exceptional) level of a competency exhibited by a store manager would be important for exceptional performance as a SM. Review and consider the entire list of competencies and the definitions and then divide 100 ‘importance’ points among the six competencies.”

The mean relative importance weights for the six competencies were Oral Presentation and Communication (13%), Written Communication (7%), Interpersonal Skills (16%), Planning and Organizing (18%), Decision Making (16%), and Leadership (30%). These weights were used to form the expert-weighted competency composite (EWCC) for one assessment center criterion measure. These same SMEs were also asked to indicate how confident they were that if an SM displayed a very high level on a given competency, that SM would be an effective SM who would achieve all store business objectives. These ratings were made on a five-point scale ranging from 1 (not confident at all) to 5 (extremely confident). Mean ratings on the core competencies were Oral Presentation and Communication (1.5), Written Communication (1.2), Interpersonal Skills (2.5), Planning and Organizing (2.9), Decision Making (3.3), and Leadership (3.4).
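To make the weighting concrete, the following minimal Python sketch forms an expert-weighted composite from the six mean importance weights reported above. The function name `weighted_composite`, the example ratings, and the choice to rescale the weights to proportions (so that the composite stays on the seven-point scale) are illustrative assumptions, not details reported in the study.

```python
# Mean relative importance weights reported in the text (percent).
IMPORTANCE_WEIGHTS = {
    "Oral Presentation and Communication": 13,
    "Written Communication": 7,
    "Interpersonal Skills": 16,
    "Planning and Organizing": 18,
    "Decision Making": 16,
    "Leadership": 30,
}


def weighted_composite(ratings, weights):
    """Weighted sum of competency ratings.

    Weights are rescaled to sum to 1 (an assumption), so the result
    stays on the same scale as the ratings themselves.
    """
    total = sum(weights.values())
    return sum(ratings[name] * w / total for name, w in weights.items())


# Hypothetical consensus ratings for one candidate (seven-point scale).
example_ratings = {
    "Oral Presentation and Communication": 5,
    "Written Communication": 4,
    "Interpersonal Skills": 4,
    "Planning and Organizing": 5,
    "Decision Making": 3,
    "Leadership": 6,
}

ewcc = weighted_composite(example_ratings, IMPORTANCE_WEIGHTS)
print(f"Expert-weighted composite: {ewcc:.2f}")
```

Because Leadership carries 30% of the weight, a one-point change on that competency moves the composite more than four times as much as a one-point change on Written Communication (7%).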

FIGURE 1. Behavioral Expectation Scale for Planning and Organizing

SMEs were also asked to provide “relative predictive weights” for each of the competencies. Twenty-three SMEs (SMs and DMs), ten of whom participated in deriving relative importance weights, divided 100 points among the six competencies to reflect the relative ability of the competencies to predict the level of competency at the ASM level that would predict effective performance as an SM. Specifically, raters were asked to indicate for each competency “to what extent a high (or exceptional) level of a competency exhibited by an ASM would be predictive of exceptional performance as an SM. Review and consider the entire list of competencies and the definitions and then divide 100 ‘predictive’ points among the six competencies.”

The mean predictive weights for the six competencies were Oral Presentation and Communication (10%), Written Communication (6%), Interpersonal Skills (14%), Planning and Organizing (18%), Decision Making (20%), and Leadership (32%). These “predictive” weights were used to form the “predictive composite” (pred. comp.) discussed later. These same SMEs and an additional group of SMEs (n = 9) also provided confidence ratings of the competencies in which they were asked to indicate how confident they were that if an ASM displayed a very high level on a given competency, that ASM would be an effective SM. These ratings were made on a five-point scale ranging from 1 (not confident at all) to 5 (extremely confident). Mean ratings on the core competencies were Oral Presentation and Communication (1.9), Written Communication (1.5), Interpersonal Skills (2.3), Planning and Organizing (2.8), Decision Making (3.2), and Leadership (3.8).

The core competencies and their descriptors also formed the basis for the development of the assessment center and the professional customer protocols described in the next section.

Competency Rating Processes

All eligible ASMs received 360-degree competency assessments, and within the same month, they participated in the assessment center. The same competency definitions were used for the 360-degree assessment as for the AC ratings, although training materials describing “translated” work activities differed. At both the 360-degree stage and the AC stage of the project, participants were informed that the data had no administrative significance and would only be used for feedback to managers and, in aggregated form, as one basis for evaluating the effectiveness of the staffing decision processes within the company. Participants were informed also that such a process might be used as one basis for making staffing decisions in the future and that they would be asked for feedback concerning the process.

The 360-degree system was developed and administered following guidelines recommended by experts in the area (e.g., Fleenor & Brutus, 2001). All participants were informed about the anticipated 360-degree assessment about one month before the data were collected. Particular raters for the peer and subordinate sources were nominated by the ASMs, with a minimum of four raters required per source. Each ASM was instructed to nominate subordinates and peers who had worked with him/her for at least three months. Subordinates had to be hourly employees or supervisor trainees. On those occasions in which ASMs did not nominate raters, SMs nominated candidates.

Ratings by subordinates, peers, managers, and customers were completed approximately one month after a formal assessment cycle involving the ASMs and their managers. All raters completed a seven-point behavioral expectation scale (see Figure 1 for an example) assessing each of the six competencies. Raters were asked to make their assessment with the following: “Based on what you know of this person’s knowledge, skills, and abilities, plus his (her) relevant performance on the job, indicate what level of competence you would expect for this person at the store manager level. Place an ‘X’ on the vertical scale to reflect the level of expected competence. Also, please record at least one example of this manager’s performance, knowledge, skill, or ability that you consider to be the major justification for your assessment.” All rating sources were given the opportunity to rate all six competencies based on questionnaire responses from SMEs in the competency development stage. SMEs also judged all possible rating sources as valid sources of information about each of the six core competencies to a moderate (or greater) degree.

Professional customers employed by the organization (i.e., mystery shoppers) completed a special instrument dealing with the extent to which specific customer requirements related to the six competencies were met. The customers used three different standardized protocols developed to assess (and develop) competencies. The customers followed the scripts in order to test system policy as interpreted and implemented by the manager, with each scenario written to allow an opportunity for a manager to exhibit behaviors relevant to each of the competencies derived from the competency study.

Ratings were also made on the same competency scales, with additional work activities and behavioral statements serving to anchor the scales as the competencies related to the protocols. The mystery shopping protocols concerned the return of expensive office equipment, discussion and planning regarding a relationship with a small business, and a customer complaint about the behavior of independent contractors. For all three protocols, scripts were written to provide an opportunity for ASMs to demonstrate varying degrees of competence for each of the six core competencies. Of the ASMs who received customer-based assessments (n = 390), all received the customer complaint protocol. Of these 390 ASMs, 95 were also assessed using a second protocol: 51 received the small business protocol and 44 received the office equipment protocol. For these 95 ASMs, we used the mean rating across the two assessments. Coefficients of stability ranged from .62 to .78 for the competency assessments across the two protocols.
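A coefficient of stability of this kind is a correlation between ratings of the same competency obtained under the two protocols. The sketch below shows both steps (averaging the two ratings and correlating them) on hypothetical ratings. The Pearson estimator and all names and values are assumptions; the text does not state which estimator was used.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_across_protocols(first, second):
    """Per-manager mean of the ratings from two protocols."""
    return [(a + b) / 2 for a, b in zip(first, second)]


# Hypothetical ratings of one competency for six managers (seven-point scale).
complaint_protocol = [4, 5, 3, 6, 4, 5]
second_protocol = [4, 4, 3, 5, 5, 5]

stability = pearson_r(complaint_protocol, second_protocol)
combined = mean_across_protocols(complaint_protocol, second_protocol)
print(f"stability = {stability:.2f}; combined ratings = {combined}")
```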

All possible assessors, except the professional customers, received a written training program two months prior to assessments. This short program was designed to build skills in the area of behavioral observation and included definitions of rating errors and cognitive biases along with definitions of the competencies. All raters had the opportunity to refrain from assessing any (or all) of the competencies if a rater felt that s/he did not have adequate information or did not observe relevant behavior or activities.

While we examined several different methods for aggregating 360-degree data, for the purposes of this study, we followed Atkins and Wood (2002) and others (e.g., Facteau & Craig, 2001; Scullen et al., 2003) and calculated an average 360-degree assessment across all rating sources (except self-assessments) for each competency using the mean rating from each rating source. SM ratings of their subordinate ASMs were taken as the traditional top-down assessments for this study. Following research on expert weighting systems (most recently, Ganzach et al., 2000), we also derived a composite predictor score for each candidate based on the SME-derived predictive weight for each competency, as described earlier. This enabled us to compare validities for this approach versus a unit-weighted approach for the competencies and the weights derived from the regression analyses.
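The aggregation just described can be sketched in two steps: average the per-source mean ratings for each competency, then combine the six competency scores with either predictive or unit weights. The following Python sketch is illustrative only. The abbreviations, function names, and ratings are hypothetical, and rescaling the predictive weights to proportions is an assumption.

```python
from statistics import mean

# Mean predictive weights reported earlier (percent), keyed by abbreviation.
PREDICTIVE_WEIGHTS = {"OP": 10, "WC": 6, "IS": 14, "PO": 18, "DM": 20, "LS": 32}


def average_360(ratings_by_source):
    """Mean of the per-source mean ratings for one competency.

    Each source (e.g., peers) counts once regardless of how many raters it
    has. Self-assessments are assumed to be excluded before calling.
    """
    return mean(mean(r) for r in ratings_by_source.values() if r)


def composite(scores, weights=None):
    """Unit-weighted mean when weights is None, else a weighted mean."""
    if weights is None:
        return mean(scores.values())
    total = sum(weights.values())
    return sum(scores[c] * w for c, w in weights.items()) / total


# Hypothetical Leadership ratings for one ASM, grouped by rating source.
leadership = average_360({
    "peers": [4, 6, 5, 5],
    "subordinates": [3, 4, 4, 5],
    "manager": [5],
    "customer": [4],
})

# Hypothetical 360-degree averages for the other five competencies.
scores = {"OP": 4.5, "WC": 4.0, "IS": 4.2, "PO": 4.8, "DM": 4.1, "LS": leadership}
print(f"predictive composite = {composite(scores, PREDICTIVE_WEIGHTS):.2f}")
print(f"unit-weighted composite = {composite(scores):.2f}")
```

Averaging source means rather than pooling all raters keeps a source with many raters (such as subordinates) from dominating a source with a single rater (such as the manager).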

In order to evaluate incremental validity of 360-degree assessments, we investigated 360-degree ratings with and without SM assessments included in the 360-degree composite. Because of the uniqueness of the data, we also investigated the customer ratings as a part of, and independent of, the other rating sources. We selected these comparisons because we believed that organizations would only consider a 360-degree assessment program that included a TDA, but many organizations use only TDA for competency assessment, and a growing number, particularly in retail, include some form of customer data as a basis for assessing managers. Thus, the incremental validity of the customer data beyond what is provided by TDA has practical significance.
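Incremental validity of this kind is commonly summarized as the gain in squared multiple correlation when a second predictor is added to a first. As a minimal sketch, with two predictors that gain can be computed from the three pairwise correlations alone. The correlations in the example are hypothetical round numbers, not values from this study, and the function names are illustrative.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """Squared multiple correlation of a criterion on two predictors,
    computed from the pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def incremental_r2(r_y_base, r_y_added, r_base_added):
    """Gain in R^2 from adding a second predictor to a baseline predictor."""
    return r2_two_predictors(r_y_base, r_y_added, r_base_added) - r_y_base ** 2


# Hypothetical: the baseline (e.g., a top-down rating) correlates .40 with the
# criterion, the added source correlates .35, and the two sources
# intercorrelate .50.
gain = incremental_r2(0.40, 0.35, 0.50)
print(f"Delta R^2 = {gain:.3f}")
```

The more strongly the added source overlaps with the baseline, the smaller its unique contribution, which is why highly intercorrelated rating sources can each show respectable validity while adding little beyond one another.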

Assessment Center Development and Process

The assessment center method used for this study was developed based on the latest research on the validity and reliability of assessment center data (Caldwell, Thornton, & Gruys, 2003; Cascio & Aguinis, 2005; Gatewood & Feild, 2001; Lievens, 1998; Woehr & Arthur, 2003). The assessment center was designed to measure the core competencies and was constructed as a mechanism for choosing among internal and external candidates for promotion to SM jobs. Over a three-month period, all ASMs under study went through the one-day assessment center. The only criterion for eligibility was that the candidate must have been an ASM in good standing for at least one year.

The assessment center used trained observers to make judgments about behavior from specially developed assessment simulations. As with the typical assessment center method, information about an employee’s strengths and weaknesses on each competency was provided through a combination of assessment exercises designed to simulate the type of managerial work to which the candidate would be exposed. A team of trained assessors (Schleicher, Day, Mayes, & Riggio, 2002) observed and evaluated performance on the situational exercises. The assessors compiled and integrated their judgments on each exercise and formed a summary rating for each candidate being assessed. A battery of paper-and-pencil tests and questionnaires was also administered prior to the start of the group exercises.

The assessment center exercises allowed assessors to observe, record, classify, and evaluate job-predictive behaviors. These behaviors were determined by SMEs to be behaviors exhibited at the ASM level that were predictive or indicative of success for SMs. The assessors were responsible for observing the actual behavior of the candidate during each exercise and documenting how each candidate performed. Each “center” day consisted of four assessors observing and assessing 12 candidates.

The exercises used were an in-basket, two leaderless group discussions, and a case analysis followed by an oral presentation. All exercises were developed and refined based on the competency modeling study, with elements of each exercise developed to allow for effective or ineffective manifestation of the competencies. The exercises and assessment criteria were written in the context of the goals and vision statements derived from the competency modeling process.

Assessors were representatives from the organization and occupied higher managerial-level positions than the candidates being assessed. No raters providing data on the predictor side (i.e., through the 360 or the TDA) also provided AC data for any of the ASMs, nor were assessors aware of the customer data. Assessors received extensive training on assessment center methodology and were certified as assessors after they had participated in the assessment center themselves. Following research by Schleicher et al. (2002), assessors were trained using a variant of “frame-of-reference” training (Bernardin, Buckley, Tyler, & Wiese, 2000).

After participants completed the exercises, assessors made exercise and competency ratings for each candidate. Competency ratings were made on the same seven-point BESs described earlier, but with behavioral anchors that differed by competency and exercise. The training program provided specific examples of candidate performance to illustrate a competency for each exercise. Some of these examples were used as behavioral anchors for the BES. After individual assessor ratings were completed, the assessors assembled at a team meeting to pool their judgments and to derive an overall consensus rating for each candidate on each competency.

Following the conclusion of all competency assessments based on assessment center performance, consensus was also reached on an overall assessment of each competency based on all relevant information about each candidate. In addition, an overall assessment of potential (OAP) was completed to appraise each candidate’s potentiality for high performance as a store manager. These competency and OAP ratings were based on all assessment center data, including a summary of test and questionnaire results. Thus, assessors were instructed to consider assessment center performance plus the interpreted scores on the battery of paper-and-pencil tests and questionnaires. These tests included a cognitive ability test, a biographical instrument, a job-related personality test, an accomplishment record, and a forced-choice test of managerial self-efficacy (Bernardin, Villanova, & Cooke, 2005). The assessment center training manual provided a matrix, derived from the competency modeling project, that linked scores on these measures (and their subscales) with particular competencies. In making this particular OAP, assessors were instructed to focus on the potentiality for the associate manager to excel at the store manager job.

Results

Table I presents the descriptive statistics and the complete multitrait–multimethod (MTMM) matrix for all major variables under study.

The hypotheses were tested with correlational and regression analyses. While we investigated criterion and construct-related validities using other AC criteria, based on research by Ganzach et al. (2000) and others (see Cascio & Aguinis, 2005), for criterion-related validity, we focused our attention on the expert-weighted competency composite measure derived by summing across the weighted, consensus-derived competency assessments for the six competencies. We also investigated correlations with the overall assessment of potential.

We developed and selected OAP as a criterion of interest for two reasons. First, we believe this measure, which incorporated all information relevant to each competency, including paper-and-pencil tests and other information that were linked to the competencies, was more theoretically compatible with the definition of a competency we adopted in this research and adapted from Schippmann (1999) and others. Second, the OAP was highly correlated with the EWCC (r = .88, p < .001) and the average rating of the unit-weighted, consensus-derived competencies (r = .84, p < .001). We will report and discuss other criterion-related validities as well.

According to Hypothesis 1a, the mean 360-degree ratings of the six competencies and the aggregated ratings across competencies will exhibit criterion-related validity in the prediction of assessment center performance. Table II presents the correlations between the 360-degree assessments and the AC criterion measures. The correlations between the six 360-degree competency assessments and EWCC and OAP were all positive and statistically significant (p < .001), ranging from a low of .27 for Written Communication and Decision Making with OAP to .43 for Leadership and .40 for Oral Presentation in predicting EWCC. The unit-weighted mean of 360-degree assessments (across the six competencies) correlated .49 (p < .001) with EWCC and .42 (p < .001) with OAP.
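The significance of each correlation follows from the usual t test for a Pearson correlation, t = r√(N − 2)/√(1 − r²), with N − 2 degrees of freedom. The sketch below applies it to three of the correlations quoted above. It assumes N = 409, the sample size noted for the 360-degree ratings. The pairwise N for each correlation is not stated, so this value and the function name are assumptions. With roughly 400 degrees of freedom, a two-tailed p < .001 requires |t| of about 3.3.

```python
from math import sqrt


def t_statistic_for_r(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


ASSUMED_N = 409  # assumption: sample size noted for the 360-degree ratings

for label, r in [
    ("360-degree average with EWCC", 0.49),
    ("360-degree average with OAP", 0.42),
    ("lowest single competency with OAP", 0.27),
]:
    print(f"{label}: r = {r:.2f}, t = {t_statistic_for_r(r, ASSUMED_N):.2f}")
```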

Additional support for the criterion-related validity of 360-degree assessment in predicting the EWCC and the OAP of associate store managers came from the results of the regression analysis in which EWCC was regressed on the six competencies measured in the 360-degree assessment process. We controlled for race, gender, and minority classification in the regression analysis in

TABLE I. Multitrait–Multimethod Correlation Matrix of 360-Degree Assessment^a, Top-Down Assessment, Customer Assessment, and Assessment Center Competency Ratings

Method/Competency     Mean  S.D.  1       2       3       4       5       6       7       8       9       10      11      12      13      14
360-Degree Feedback^b
1. OP^c               4.38  1.18  -
2. WC                 4.22  1.15  .55     -
3. IS                 4.13  1.14  .52     .53     -
4. PO                 4.33  1.12  .53     .52     .49     -
5. DM                 4.55  1.11  .46     .50     .42     .51     -
6. LS                 4.72  1.06  .44     .46     .43     .49     .51     -
7. Overall Avg.       -     -     .78     .78     .75     .78     .75     .72
Top-Down Appraisal^d
8. OP                 4.58  1.45  .81     .43     .42     .48     .37     .39     .64     -
9. WC                 4.43  1.42  .41     .78     .42     .38     .40     .35     .61     .46     -
10. IS                4.35  1.39  .42     .40     .75     .42     .38     .34     .59     .48     .47     -
11. PO                4.55  1.42  .45     .49     .39     .76     .46     .43     .65     .50     .45     .41     -
12. DM                4.80  1.40  .40     .42     .35     .41     .75     .37     .59     .45     .50     .40     .47     -
13. LS                4.89  1.37  .46     .43     .37     .42     .47     .81     .65     .47     .43     .40     .45     .47     -
14. Overall Avg.      -     -     .67     .67     .61     .65     .64     .61     .84     .76     .75     .71     .74     .74     .72     -
Customer Assessment^e
15. OP                4.31  0.97  .80     .44     .43     .41     .38     .37     .63     .64     .34     .35     .37     .30     .38     .54
16. WC                4.09  0.82  .45     .71     .39     .38     .40     .26     .57     .36     .56     .31     .33     .30     .28     .49
17. IS                4.09  0.87  .40     .41     .75     .36     .34     .31     .57     .31     .32     .55     .27     .25     .29     .45
18. PO                4.18  0.99  .40     .45     .32     .58     .37     .34     .54     .37     .35     .27     .43     .32     .32     .47
19. DM                4.32  1.08  .53     .46     .32     .35     .41     .24     .51     .43     .37     .24     .30     .30     .29     .44
20. LS                4.34  1.02  .40     .48     .36     .35     .37     .39     .52     .34     .38     .26     .31     .30     .43     .46
21. Overall Avg.      -     -     .72     .70     .60     .58     .54     .46     .79     .59     .55     .46     .48     .42     .48     .68
Assessment Center^d
22. OP                4.25  1.53  .26^f   .17**   .26     .31     .23     .30     .34     .29^f   .13**   .19     .25     .22     .24     .30
23. WC                4.26  1.46  .17     .26^f   .22     .32     .18     .23     .30     .15**   .12^f*  .13**   .26     .14**   .18     .22
24. IS                4.30  1.51  .32     .28     .39^f   .32     .25     .28     .41     .29     .16**   .29^f   .30     .20     .19     .32
25. PO                4.27  1.51  .32     .16**   .25     .26^f   .18     .25     .31     .27     .08ns   .23     .15^f** .16**   .16**   .24
26. DM                4.39  1.40  .30     .25     .29     .29     .24^f   .25     .36     .19     .14**   .18     .17     .14^f** .19     .23
27. LS                4.28  1.44  .29     .25     .27     .27     .24     .44^f   .38     .22     .12*    .16**   .27     .20     .33^f   .29
28. Over. Avg. (OAP)  -     -     .32     .27     .32     .33     .27     .39     .42     .25     .13**   .23     .28     .20     .28     .31
29. Comp. (EWCC)      -     -     .40     .31     .39     .39     .31     .43     .49     .33     .17     .27     .32     .25     .31     .37

TABLE I (continued)

Method/Competency     Mean  S.D.  15      16      17      18      19      20      21      22      23      24      25      26      27      28
Customer Assessment^e
15. OP                4.31  0.97  -
16. WC                4.09  0.82  .37     -
17. IS                4.09  0.87  .36     .33     -
18. PO                4.18  0.99  .37     .38     .25     -
19. DM                4.32  1.08  .57     .51     .31     .37     -
20. LS                4.34  1.02  .36     .47     .26     .35     .47     -
21. Overall Avg.      -     -     .73     .71     .58     .65     .79     .70     -
Assessment Center^d
22. OP                4.25  1.53  .21^f   .16**   .21     .15**   .17**   .18     .26     -
23. WC                4.26  1.46  .17**   .11^f*  .15**   .20     .17**   .12*    .22     .50     -
24. IS                4.30  1.51  .26     .26     .31^f   .18     .19     .17**   .32     .47     .49     -
25. PO                4.27  1.51  .25     .17**   .22     .17^f** .12*    .10*    .24     .45     .34     .51     -
26. DM                4.39  1.40  .28     .20     .27     .13*    .22^f   .16**   .30     .38     .41     .48     .38     -
27. LS                4.28  1.44  .23     .18**   .21     .07ns   .10ns   .16^f** .22     .40     .38     .45     .40     .45     -
28. Over. Avg. (OAP)  -     -     .26     .20     .26     .10ns   .15**   .21     .28     .58     .52     .60     .56     .62     .79     -
29. Comp. (EWCC)      -     -     .33     .25     .32     .19     .21     .21     .35     .69     .62     .76     .72     .70     .80     .88

Note: All correlations are statistically significant to the p < .001 level unless otherwise noted (i.e., ** p < .01 and * p < .05 and “ns” for not significant).
^a Raters include managers, subordinates, peers, and professional customers.
^b N = 409.
^c OP = Oral Presentation, WC = Written Communication, IS = Interpersonal Skills, PO = Planning & Organizing, DM = Decision Making, LS = Leadership.
^d N = 428.
^e N = 390.
^f Monotrait-heteromethod convergent validities (correlations for the same competency assessed with different methods).

TABLE II. Relationships Between Competencies Measured by 360-Degree Assessment, Top-Down Assessment, and Customer Assessment and Assessment Center Competency Measures

Columns give the correlation with: EWCC = AC Composite (EWCC); OAP = overall assessment of potential; Avg. AC = Avg. AC Rating; AC OP, AC WC, AC IS, AC PO, AC DM, AC LS = AC competency ratings for Oral Presentation, Written Communication, Interpersonal Skills, Planning & Organizing, Decision Making, and Leadership.

Method/Competency     EWCC     OAP      Avg. AC  AC OP    AC WC    AC IS    AC PO    AC DM    AC LS
360-Degree
Oral Pres.            .40      .32      .38      .26^a    .17      .32      .32      .30      .29
Writ. Comm.           .31      .27      .32      .17^b    .26^a    .28      .16^b    .25      .25
Inter. Skills         .39      .32      .39      .26      .22      .39^a    .25      .29      .27
Plan. & Org.          .39      .33      .41      .31      .32      .32      .26^a    .29      .27
Dec. Making           .31      .27      .30      .23      .18      .25      .18      .24^a    .24
Leadership            .43      .39      .40      .30      .23      .28      .25      .25      .44^a
Average               .49      .42      .48      .34      .30      .41      .31      .36      .38
Pred. Composite       .50      .43      .49      .35      .30      .39      .31      .35      .41
Unit Wtd. Composite   .49      .42      .48      .34      .30      .41      .31      .36      .38
Top-Down
Oral Pres.            .33      .25      .33      .29^a    .15^b    .29      .27      .19      .22
Writ. Comm.           .17      .13^b    .17      .13^b    .12^ac   .16^b    .08^d    .14^b    .12^c
Inter. Skills         .27      .23      .27      .19      .13^b    .29^a    .23      .18      .16^b
Plan. & Org.          .32      .28      .32      .25      .26      .30      .15^ab   .17      .27
Dec. Making           .25      .20      .24      .22      .14^b    .20      .16^b    .14^ab   .20
Leadership            .31      .28      .29      .24      .18      .19      .16^b    .19      .33^a
Average               .37      .31      .37      .29      .22      .32      .23      .22      .29
Pred. Composite       .38      .33      .38      .30      .23      .31      .23      .23      .33
Unit Wtd. Composite   .37      .31      .37      .30      .22      .32      .24      .23      .29
Customer
Oral Pres.            .33      .26      .32      .21^a    .17^b    .26      .25      .28      .23
Writ. Comm.           .25      .20      .25      .16^b    .11^ac   .26      .17^b    .20      .18^b
Inter. Skills         .32      .26      .31      .21      .15^b    .31^a    .22      .27      .21
Plan. & Org.          .19      .10^d    .21      .15^b    .20      .18      .17^ab   .13^c    .07^d
Dec. Making           .21      .15^b    .22      .17^b    .17^b    .19      .12^c    .22^a    .10^d
Leadership            .21      .21      .21      .18      .12^c    .17^b    .10^c    .16^b    .16^ab
Average               .35      .28      .36      .26      .22      .32      .24      .30      .22
Pred. Composite       .32      .26      .33      .25      .21      .28      .21      .27      .20
Unit Wtd. Composite   .35      .28      .36      .26      .22      .32      .24      .30      .22

All correlations are statistically significant at the p < .001 level unless otherwise indicated.
^a Monotrait-heteromethod convergent validity (correlation for the same competency assessed with different methods).
^b Correlation is statistically significant at the p < .01 level.
^c Correlation is statistically significant at the p < .05 level.
^d Correlation is not statistically significant.


that they could exert an influence over the competency assessments and the predictor-criterion relationships. In each of the regression models, the regression coefficients of these control variables were not statistically significant. After entering the control variables, the six-competency model significantly predicted EWCC (adjusted R² = .26, F = 17.25, p < .001). We followed the same procedure using the OAP as the criterion. After controlling for gender, race, and minority classification, the six-competency model significantly predicted OAP (adjusted R² = .19, F = 11.31, p < .001). Taken together, the results support Hypothesis 1a by providing strong evidence for the criterion-related validity of 360-degree ratings in the prediction of AC criteria.

According to Hypothesis 1b, the mean 360-degree ratings of the six competencies and the aggregated ratings across competencies will exhibit evidence of construct validity when compared with competency ratings made by expert judges in an assessment center. Table II presents the convergent validities for each of the six 360-degree competencies. Correlations for the same competency (e.g., Oral Presentation) measured with different methods (e.g., 360 versus AC) should be significantly different from zero and large enough to encourage additional examination of validity (D. T. Campbell & Fiske, 1959). In addition, each convergent validity correlation should be larger than the other correlations found in its row and column, that is, larger than the correlations for different competencies measured by different methods and for different competencies measured with the same method (Pitoniak, Sireci, & Luecht, 2002).

The convergent validity coefficients were all statistically significant and ranged from a low of .24 (p < .001) for Decision Making to a high of .44 (p < .001) for Leadership. The average convergent validity across the six 360-degree competencies was .31. The correlations of Interpersonal Skills (r = .39, p < .001) and Leadership (r = .44, p < .001) were higher than all other correlations measured by heterotrait-heteromethod and heterotrait-monomethod approaches. However, convergent validities for the other four competencies (i.e., Oral Presentation, Written Communication, Planning and Organizing, and Decision Making) were not the highest in their respective rows and columns. For example, the correlation between 360-degree and AC competency measures of Oral Presentation was r = .26 (p < .001), versus correlations of .32, .32, .30, and .29 between 360-degree-measured Oral Presentation and AC-measured Interpersonal Skills, Planning and Organizing, Decision Making, and Leadership, respectively, and versus correlations of .31 and .30 between AC-measured Oral Presentation and 360-degree-measured Planning and Organizing and Leadership, respectively.

Using information provided in the MTMM correlation matrix in Table I, the monotrait-heteromethod average correlation was .31, the heterotrait-monomethod average correlation was .45, and the heterotrait-heteromethod correlation was .25. More specifically, the convergent validity for Oral Presentation was .26, compared to an average correlation of .28 of Oral Presentation with the other AC competencies, and a .50 average correlation with the other 360-degree competency ratings. The convergent validity for Written Communication was .26, compared to an average correlation of .22 of Written Communication with the other AC competency ratings, and .51 with the other 360-degree competency ratings. The convergent validity of Interpersonal Skills was .39, compared to an average of .26 for the other AC competencies and .48 with the other 360-degree competency ratings. The remaining three convergent validities for Planning and Organizing, Decision Making, and Leadership were .26 (compared to .30 for other AC competencies and .51 for other 360-degree competency ratings), .24 (compared to .22 for other AC competencies and .48 for other 360-degree competency ratings), and .44 (compared to .26 for other AC competencies and .47 for other 360-degree competency ratings), respectively. These results provide only weak support for the construct validity of the 360-degree competency assessment. Thus, Hypothesis 1b is not supported.
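The row-and-column comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: it uses only the 360-degree by AC (heteromethod) correlations reported in Table II, so it covers the heterotrait-heteromethod comparisons but not the heterotrait-monomethod ones, which depend on the full matrix in Table I. The names `LABELS`, `BLOCK`, and `mtmm_summary` are hypothetical.

```python
# Heteromethod block from Table II: rows are 360-degree competencies,
# columns are assessment center (AC) competencies, both in the order below.
LABELS = ["OP", "WC", "IS", "PO", "DM", "LS"]
BLOCK = [
    [.26, .17, .32, .32, .30, .29],  # Oral Presentation
    [.17, .26, .28, .16, .25, .25],  # Written Communication
    [.26, .22, .39, .25, .29, .27],  # Interpersonal Skills
    [.31, .32, .32, .26, .29, .27],  # Planning and Organizing
    [.23, .18, .25, .18, .24, .24],  # Decision Making
    [.30, .23, .28, .25, .25, .44],  # Leadership
]


def mtmm_summary(block, labels):
    """Return (mean convergent validity, mean heterotrait-heteromethod r,
    labels whose convergent validity exceeds every other value in its
    row and column of the heteromethod block)."""
    n = len(block)
    convergent = [block[i][i] for i in range(n)]
    off_diagonal = [block[i][j] for i in range(n) for j in range(n) if i != j]
    dominant = []
    for i in range(n):
        rivals = [block[i][j] for j in range(n) if j != i]   # same row
        rivals += [block[j][i] for j in range(n) if j != i]  # same column
        if convergent[i] > max(rivals):
            dominant.append(labels[i])
    return sum(convergent) / n, sum(off_diagonal) / len(off_diagonal), dominant


mono_mean, hetero_mean, dominant = mtmm_summary(BLOCK, LABELS)
print(round(mono_mean, 2), dominant)
```

Under this partial check, only Interpersonal Skills and Leadership have a convergent validity that exceeds every other entry in their row and column, which is consistent with the pattern reported in the text.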

According to Hypothesis 2, the mean 360-degree ratings of the six competencies will exhibit incremental validity (beyond that of top-down assessment) in the prediction of assessment center performance. Table II also reports the results of the correlations between the manager assessments (TDAs) and AC criterion data. We, of course, expected that the correlation between the average rating from all four assessment sources (i.e., managers, subordinates, peers, and customers) and assessment center criteria would be higher than the correlation between manager assessments (TDAs) and AC criteria, due to the assumption that ratings obtained from multiple sources are less susceptible to single-source bias and other forms of measurement error that can decrease their validity. The results confirmed our expectations that 360-degree assessment is a more valid predictor of assessment center performance and that it does, in fact, demonstrate incremental validity beyond that of top-down assessments in the prediction of assessment center performance.

Comparisons of 360-degree versus TDA validities for each competency favored 360-degree assessments. Each of the correlations between the six managerial competencies (measured by 360-degree assessment) and EWCC was significantly higher than the corresponding correlation between TDA and EWCC. For example, for Oral Presentation, the 360-EWCC correlation was r = .40 (versus r = .33 for TDA-EWCC; difference p < .01).

For OAP, comparisons of 360-degree versus TDA validities for each competency also favored 360-degree assessments. Each of the correlations between the six managerial competencies (measured by 360-degree assessment) and OAP was significantly higher than the corresponding correlation between a TDA competency assessment and OAP. For example, for Oral Presentation, the 360-OAP correlation was r = .32 (versus r = .25 for TDA-OAP; difference p < .01).

Also, the correlation between the unit-weighted, average 360-degree competency assessment and EWCC was .49 (p < .001), which was larger than the correlation between the average assessment across competencies from managers and EWCC (r = .37). The correlation between the unit-weighted, average 360-degree competency assessment and OAP was .42 (p < .001), which was larger than the correlation between the average assessment across competencies from managers and OAP (r = .31, p < .001). We tested for differences between correlations and found that the differences in magnitude between these criterion-related validities were statistically significant: .50 was larger than .43 (p < .01), .49 was larger than .37 (p < .001), and .42 was larger than .31 (p < .001).
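The text does not name the test used to compare these correlations. One common choice when two predictors are correlated with the same criterion in the same sample is Williams's t for dependent correlations; the sketch below assumes that test purely for illustration. It also needs the correlation between the two predictors, which is not reported in this section, so the value `r23 = 0.70` is a hypothetical placeholder and the resulting statistic should not be read as a result of the study.

```python
import math


def williams_t(r12, r13, r23, n):
    """Williams's t (df = n - 3) for H0: rho12 == rho13.

    Variable 1 is the shared criterion; variables 2 and 3 are the two
    predictors being compared; r23 is the predictor intercorrelation.
    """
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    r_bar = (r12 + r13) / 2
    denom = 2 * ((n - 1) / (n - 3)) * det + r_bar**2 * (1 - r23) ** 3
    return (r12 - r13) * math.sqrt((n - 1) * (1 + r23) / denom)


# EWCC with the 360-degree average (.49) versus EWCC with the manager
# average (.37), n = 428. r23 = 0.70 is an assumed value, not from the study.
t_stat = williams_t(0.49, 0.37, 0.70, 428)
print(round(t_stat, 2), "on", 428 - 3, "degrees of freedom")
```

Because the statistic grows as the two predictors become more highly correlated with each other, the same pair of validities can yield quite different t values under different assumptions about `r23`.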

In addition, we regressed EWCC on the six competencies measured in the TDA, then on the average of the six competency ratings from the 360-degree assessment. After entering the three control variables (gender, race, and minority classification), the top-down competency ratings, which were entered into the second block of the hierarchical regression analysis, explained 17% of the variance in EWCC (adjusted R² = .17, F = 10.51, p < .001). The average 360-degree competency ratings, which were entered into the final block, demonstrated incremental validity beyond that of the TDA ratings (change in adjusted R² = .13, p < .001). A similar effect was observed when we conducted two step-wise regression analyses using the same predictor variables but different criterion variables (namely, OAP and mean AC ratings across competencies). Using OAP as the criterion variable, the TDA competency ratings accounted for 12% of the variance in OAP (adjusted R² = .12, F = 7.21, p < .001), with average 360-degree ratings adding incremental validity above that of the TDA ratings (change in adjusted R² = .09, p < .001). Using mean AC ratings across competencies, the top-down competency ratings accounted for 17% of the variance in mean AC ratings (adjusted R² = .17, F = 10.07, p < .001). Again, the average 360-degree competency ratings exhibited incremental validity beyond that of the top-down appraisal ratings (change in adjusted R² = .11, p < .001).²
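The logic of a blockwise incremental validity test can be sketched with the standard adjusted R² and F-change formulas. The sketch is illustrative only: n = 428 is the study's sample size, and the predictor counts follow one reading of the text (three controls plus six TDA ratings in the reduced model, plus the single 360-degree average in the full model), but the unadjusted R² values passed in are assumptions, since only adjusted values are reported.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k predictors and n cases."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_change(r2_reduced, r2_full, n, k_reduced, k_full):
    """F test for the increase in R-squared when a block of predictors
    is added to a reduced model. Returns (F, df1, df2)."""
    df1 = k_full - k_reduced
    df2 = n - k_full - 1
    f_value = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f_value, df1, df2


# Hypothetical unadjusted R-squared values chosen for illustration.
f_value, df1, df2 = f_change(0.19, 0.32, n=428, k_reduced=9, k_full=10)
print(round(f_value, 1), df1, df2)
```

A large F on (df1, df2) degrees of freedom indicates that the added block explains variance beyond the predictors already in the model, which is the sense in which the 360-degree average is said to show incremental validity over TDA.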

Taken together, the results of the correlational and multiple regression analyses provide strong support for Hypothesis 2.

According to Hypothesis 3, the mean customer ratings of the six competencies will exhibit incremental validity (beyond that of top-down assessment) in the prediction of assessment center performance. Table II also presents the correlations between the six competency ratings measured by customers and the EWCC and OAP. The six competency correlations with EWCC ranged from r = .19 (p < .001) for Planning and Organizing to .33 (p < .001) for Oral Presentation. The unit-weighted, average competency assessment was correlated .35 (p < .001) with EWCC. Correlations using the OAP as the AC criterion were not as impressive, with a range of validities from .10 (not statistically significant) for Planning and Organizing to .26 (p < .001) for Oral Presentation and Interpersonal Skills. The predictor composite score for customers correlated .26 (p < .001) with the OAP, while the correlation between the average of the customer assessments and OAP was .28 (p < .001).

Additional support for the criterion-related validity of customer assessments for the EWCC and the OAP of associate store managers came from the results of two regression analyses in which EWCC and the OAP were regressed on the six competencies assessed by customers. After controlling for gender, race, and minority classification, the six-competency model significantly predicted the criterion variables (adjusted R² = .16, F = 9.20, p < .001 and adjusted R² = .10, F = 5.91, p < .001 for EWCC and OAP, respectively).

In terms of evaluating the criterion-related validity of customer ratings in the prediction of assessment center performance at the individual competency level, we expected evidence of convergent validity among the correlations of the same competency measured using customers and the AC. These validities ranged from r = .11 (p < .05) for Written Communication to r = .31 (p < .001) for Interpersonal Skills. The remaining correlations were also statistically significant: Oral Presentation (r = .21, p < .001), Planning and Organizing (r = .17, p < .01), Decision Making (r = .22, p < .001), and Leadership (r = .16, p < .01).

As stated previously, each convergent validity should be larger than the other correlations found in its row and column (Pitoniak et al., 2002). The correlation of Interpersonal Skills measured using customer assessment and the AC was statistically significant (r = .31, p < .001) and higher than all other correlations measured by heterotrait-heteromethod and heterotrait-monomethod combinations. The remaining five convergent validity correlations were not the largest values in their respective rows and columns.

To further explore the criterion-related validity of customer assessments in predicting AC performance, we regressed the EWCC and the OAP on top-down assessment and customer assessment competency ratings to ascertain whether customer data provide incremental validity beyond top-down assessment.

Additional support for the criterion-related validity of customer assessment in predicting the EWCC and the OAP of ASMs came from the results of two step-wise regression analyses. After controlling for gender, race, and minority classification, the top-down competency ratings, which were entered into the second block of the step-wise regression analysis, explained 16% of the variance in the EWCC (adjusted R² = .16, F = 9.35, p < .001). Also, after controlling for gender, race, and minority classification, the top-down competency ratings, which were entered into the second block of the step-wise regression analysis, explained 12% of the variance in OAP (adjusted R² = .12, F = 6.65, p < .001). The customer assessment competency ratings, which were entered into the final block of the regression analysis, demonstrated incremental validity beyond that of top-down assessment ratings (statistically significant change in adjusted R² = .05, p < .001 for EWCC and adjusted R² = .03, p < .001 for OAP).

Taken together, our results support Hypothesis 3 in that customer assessment exhibited incremental validity beyond traditional top-down assessment in the prediction of assessment center performance.

According to Hypothesis 4, when deriving overall competency composites, weighting competencies based on expert judgments will demonstrate higher levels of criterion-related validity than a unit-weighted (i.e., simple averaging) approach. We tested this hypothesis by comparing criterion-related validities using the two approaches to aggregating the predictive data. As Table II indicates, criterion-related validities were statistically significant (p < .001) and nearly identical for the two approaches (i.e., .50 for the expert-weighted approach versus .49 for the unit-weighted approach using the EWCC criterion, and .43 versus .42 for the 360-degree data using the OAP criterion). The validities for top-down and customer-based weighted assessments were also similar using the EWCC criterion. The validities were .38 for the top-down expert composite predictor and .37 for the average competency rating, while the expert-weighted customer composite was slightly less valid than the average competency rating (.32 versus .35).

While the expert-weighted competency composite resulted in higher criterion-related validity for the full 360-degree assessment, the differences between the validities were not statistically significant. Thus, Hypothesis 4 was not supported. Table III provides a summary of the hypotheses, decision rules for testing the hypotheses, and the results of each hypothesis test.
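Why unit and expert weights tend to produce nearly identical composites when the competencies are positively intercorrelated can be illustrated with simulated data. Everything in this sketch is an assumption made for illustration: the ratings are synthetic (six scores sharing a common factor, with intercorrelations near .5), and the expert weights are invented, since the study's actual weights are not given in this section.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def composite(rows, weights):
    """Weighted sum of competency scores for each ratee."""
    return [sum(w * v for w, v in zip(weights, row)) for row in rows]


random.seed(0)
rows = []
for _ in range(428):
    shared = random.gauss(0, 1)  # common factor across competencies
    rows.append([0.7 * shared + 0.7 * random.gauss(0, 1) for _ in range(6)])

unit_scores = composite(rows, [1 / 6] * 6)
# Hypothetical expert weights (sum to 1), heaviest on the last competency.
expert_scores = composite(rows, [0.10, 0.10, 0.20, 0.15, 0.15, 0.30])
r_composites = pearson(unit_scores, expert_scores)
print(round(r_composites, 2))
```

With predictors this highly intercorrelated, the two composites rank ratees almost identically, so their correlations with any criterion can differ only slightly.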

In order to estimate the degree of shrinkage in the multiple correlation coefficients, we used the recommended statistical cross-validation formula (Raju, Bilgic, Edwards, & Fleer, 1999). The statistical estimates of cross-validation (Browne, 1975; Raju et al., 1999) of the multiple correlation coefficients from regressing EWCC on 360-degree assessment, TDA, and customer assessment were .52 (versus R = .53 in this study), .42 (versus R = .43), and .41 (versus R = .42), respectively. The estimates of cross-validity of the multiple correlation coefficient from regressing OAP on 360-degree assessment, TDA, and customer assessment were .42 (versus R = .45 in this study), .35 (versus R = .37), and .36 (versus R = .35), respectively.
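A formula-based cross-validity estimate of this kind can be sketched as follows. The sketch assumes the commonly cited form of Browne's (1975) formula with the Wherry adjustment standing in for the population squared multiple correlation; the exact variant used in the study, and whether the control variables were counted among the predictors, are not stated here, so `k = 6` is an assumption and the output is not expected to reproduce the reported estimates exactly.

```python
def wherry_adjusted_r2(r2, n, k):
    """Wherry-type adjusted R-squared (estimate of population rho squared)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def browne_cross_validity(r2, n, k):
    """Estimated cross-validated multiple correlation, using the commonly
    cited form of Browne's (1975) formula for n cases and k predictors."""
    rho2 = max(wherry_adjusted_r2(r2, n, k), 0.0)
    rho2_cv = ((n - k - 3) * rho2**2 + rho2) / ((n - 2 * k - 2) * rho2 + k)
    return rho2_cv**0.5


# R = .53 for EWCC regressed on the 360-degree ratings, n = 428.
# k = 6 assumes only the six competencies are counted as predictors.
r_cv = browne_cross_validity(0.53**2, 428, 6)
print(round(r_cv, 2))
```

With a sample this large relative to the number of predictors, the estimated shrinkage is small, which is the substantive point of the comparison between R and its cross-validity estimate.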

Discussion

Overview

In what we believe to be the first criterion-related validity study of a complete 360-degree assessment process (i.e., where customer data are included), our results support the use of 360-degree assessment as a part of competency assessment. We found that mean 360-degree assessments on all six of the core competencies derived from the full complement of 360-degree sources were significantly correlated with independent criteria derived from assessment center competency ratings. The correlation between the expert-derived composite 360-degree predictor score and the EWCC was .50 (p < .001), and the correlation between the expert-derived composite 360-degree predictor score and the overall OAP was .43 (p < .001). The results of the regression analyses also provided strong support for the validity of the 360-degree approach to competency measurement. The adjusted R² for 360-degree assessment using all six predictive competencies was .26 for EWCC and .19 for OAP.

All six 360-degree competency scores were significantly correlated in the predicted direction with the composite performance score from the assessment center (EWCC) and the consensus-derived measure of overall potential for performance as a store manager, with a range of validities from .27 to .43. This finding lends credence to the competency modeling process used in this research, which concentrated on those core competencies believed to be the underlying individual difference bases for successful performance in the two managerial positions under study.

TABLE III  Summary of Hypotheses, Decision Rules for Hypothesis Testing, and Results of Hypotheses

Results for each decision rule are listed in the column order OPa, WC, IS, PO, DM, LS, OAP, EWCC.

H1a: The mean 360-degree ratings of the six competencies and the aggregated ratings across competencies will exhibit criterion-related validity in the prediction of AC performance.
  - Correlations between 360-degree ratings and AC performance (EWCC) should be positive and statistically significant.
    Results: Y Y Y Y Y Y - -
  - Correlations between 360-degree ratings and AC performance (OAP) should be positive and statistically significant.
    Results: Y Y Y Y Y Y - -
  - Multiple regression analysis of AC performance (EWCC and OAP) on 360-degree competency ratings should explain a statistically significant amount of variance.
    Results: - - - - - - Y Y
  Summary of test results: Strong support of H1a
  Practical Implications: These findings suggest that 360-degree assessment is a valid approach to competency assessment and, when feasible, should be used in performance management programs.

H1b: The mean 360-degree ratings of the six competencies and the aggregated ratings across competencies will exhibit construct validity in the prediction of AC performance.
  - Correlations of same competencies measured with different methods should be statistically significant and
    Results: Y Y Y Y Y Y - -
  - Exceed correlations of this competency measured with different methods and of other competencies measured with the same method.
    Results: N N Y N N Y - -
  Summary of test results: H1b not supported
  Practical Implications: Because evidence of the construct validation of measures requires multiple studies and analyses (beyond that of MTMM analysis), this finding should be viewed as tentative until additional evidence is provided.

H2: The mean 360-degree ratings of the six competencies will exhibit incremental validity (beyond that of TDA) in the prediction of AC performance.
  - Correlations between 360-degree competency ratings and AC performance (EWCC) should be significantly greater than TDA alone.
    Results: Y Y Y Y Y Y - -
  - Correlations between 360-degree competency ratings and AC performance (OAP) should be significantly greater than TDA alone.
    Results: Y Y Y Y Y Y - -
  - When AC performance (EWCC and OAP) is regressed on competency ratings in a step-wise procedure, the addition of 360-degree competency ratings to TDA competency ratings will result in the explanation of additional variance.
    Results: - - - - - - Y Y
  Summary of test results: Strong support of H2
  Practical Implications: By adding the other “270” elements of a 360-degree system, more can be learned about managers’ potential performance than just using top-down assessment alone. The extra time and expense associated with a full 360-degree program may be worth the investment if it results in better personnel decision making.

H3: The mean customer-based ratings of the six competencies will exhibit incremental validity (beyond that of TDA) in the prediction of AC performance.
  - Correlations between customer-based competency ratings and AC performance (EWCC) should be significantly greater than TDA alone.
    Results: Y Y Y Y Y Y - -
  - Correlations between customer-based competency ratings and AC performance (OAP) should be significantly greater than TDA alone.
    Results: Y Y Y N Y Y - -
  - When AC performance (EWCC and OAP) is regressed on competency ratings in a step-wise procedure, the addition of customer-based competency ratings to TDA will result in the explanation of additional variance.
    Results: - - - - - - Y Y
  Summary of test results: Strong support of H3
  Practical Implications: Customer feedback makes an important contribution to 360-degree assessment. Customers provide evaluative information that is both unique and related to organizational goals and objectives (e.g., identification of high-potential managers).

H4: In deriving overall competency composites, weighting competencies based on expert judgments will demonstrate higher levels of criterion-related validity than a unit-weighted (i.e., simple averaging) approach.
  - Correlation between 360-degree predictor composite and AC expert-weighted composite (EWCC) will be significantly greater than the correlation between 360-degree unit weighted composite and AC expert-weighted composite.
    Results: - - - - - - - N
  - Correlation between TDA predictor composite and AC expert-weighted composite will be significantly greater than the correlation between TDA unit-weighted composite and AC expert-weighted composite.
    Results: - - - - - - - N
  - Correlation between customer-based predictor composite and AC expert-weighted composite will be significantly greater than the correlation between customer-based unit weighted composite and AC expert-weighted composite.
    Results: - - - - - - - N
  Summary of test results: H4 not supported
  Practical Implications: Although expert weighting systems do not predict better than equal weighting systems, their use may result in higher levels of face validity and procedural justice. In the current legal environment, these outcomes may save organizations considerable time and cost by attracting fewer lawsuits and complaints.

Note: a OP = Oral Presentation, WC = Written Communication, IS = Interpersonal Skills, PO = Planning & Organizing, DM = Decision Making, LS = Leadership, OAP = Overall Assessment of Potential, and EWCC = Expert-Weighted Competency Composite.

The evidence for convergent and discriminant validity was disappointing but similar to that reported in other studies involving performance ratings and assessment center data (e.g., Atkins & Wood, p. 885). According to Nunnally and Bernstein (1994, p. 94), construct validation, which requires trait correlations to be high (convergent validity) and method correlations to be low (discriminant validity), is not easily achieved. Consequently, these findings should be viewed as tentative until additional studies are conducted to assess the convergent and discriminant validity of 360-degree ratings in the prediction of assessment center performance.

We were also interested in the extent to which a full 360-degree assessment showed incremental validity beyond that of top-down assessment with respect to the prediction of AC criteria. Using the independent AC criteria, we assessed the incremental validity of the full 360-degree assessment relative to the more traditional managerial or top-down assessment. Most experts on incremental validity recommend a consideration of the contribution of new approaches to assessment beyond what would be expected from data sources that are already available. Since almost all companies do some form of top-down performance appraisal, and many also do top-down competency assessment as we (and others) have defined it here, an important practical question is whether the other possible assessment sources demonstrate incremental validity. We found that the full 360-degree approach yielded better results. Using hierarchical regression analysis, we found that adding the other “270” elements of the 360-degree system to the predictive equations explained unique variance in the prediction of the weighted, composite measure and the overall assessment of potential.

We found that the weighted, composite TDA correlated .38 with the weighted criterion measure, while the weighted, composite 360-degree assessment correlated .50 with EWCC. The estimate of the increase in utility using the expert-weighted, 360-degree approach is 32%. Using the results of the comparable regressions with all six competencies in the equations (.43 for TDA versus .53 for 360), the 360-degree approach also improved utility by 31% (Schmidt & Hunter, 1998).
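The first utility figure can be reproduced under the assumption, standard in linear utility models, that the utility of a selection procedure is proportional to its validity when all other terms are held constant, so that the relative gain is simply the relative increase in validity. The function name `relative_gain` is hypothetical, and the text does not spell out the exact computation that was used.

```python
def relative_gain(r_new, r_old):
    """Percentage increase in validity (and, under a linear utility
    model with other terms fixed, in utility) of r_new over r_old."""
    return (r_new - r_old) / r_old * 100


# Composite 360-degree validity (.50) versus composite TDA validity (.38).
gain = relative_gain(0.50, 0.38)
print(round(gain))
```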

We also found that the criterion-related validities for each of the 360-degree assessments of core competencies were higher than the validities for the same competencies using TDA alone, and that all six of the competency comparisons were significantly different. The validity of the 360-degree assessments on Leadership, judged by experts to be the most important of the six competencies as a predictor of success as a store manager, increased from .31 based on TDA alone, to .43.

Customer Assessments

Our findings concerning customer assessment in general suggest that customers make an important contribution to 360-degree assessment. While management literature calls for the inclusion of customers in assessment and feedback systems, their actual presence in management research is a rarity. Little theoretical or empirical research in management has focused on external customers and the role they play in shaping organizational activities and decision making. The potential for and usefulness of customer contributions may differ on a number of variables, including industry characteristics, a firm’s chosen strategy, the nature of the product/service offered, the type of technology in use, distribution tactics, and the like.

For example, Lengnick-Hall (1996) suggested that when services are customized and important to our sense of self-esteem or well-being (e.g., health care providers and beauticians), customers may demand higher quality and more consistent performance standards than they would when services are more impersonal. Hagan and Bernardin (2003) posited that as services become more personal and intangible, a customer’s perceptions about service quality and provider performance may become more synonymous with one another. The business advisory/professional services setting studied by Church (1997) would appear to be an expert-driven situation characterized by long-term customer relationships and considerable customer “intimacy.”

By contrast, the mystery shopping studied here is a “scripted” approach to customer evaluation, focusing on capturing the quality of “random” organization-customer interactions and the performance of a branch office in meeting established operational standards. In any event, this study provides empirical support for the belief that customers provide evaluative information that is both unique and related to issues of organizational importance. Until some of these contingencies are addressed in the management research and literature, organizations that wish to structure effective customer-assessment systems may lack useful guidance.

Expert Weights

Even though the differences were not significant, we found that a competency modeling process, one that included an expert predictive composite weighting system for deriving a predictive equation for scores on the competencies, resulted in the highest correlation with the measure of potential compared to a unit-weighting approach to deriving a predictor score across competencies for each job candidate. Given the correlations among the 360-degree competency assessments, this finding is in line with previous research (Raju, Bilgic, Edwards, & Fleer, 1997). The correlation between the two sets of “predictor” (expert and unit weighted) scores was .98 (p < .001). This high correlation can be explained by investigating the MTMM matrix in Table I. Among the six 360-degree competencies, the average correlation of each competency with the other five was .49.
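Why expert and unit weights yield nearly interchangeable scores can be seen from the correlation between two weighted composites of the same positively intercorrelated measures. The sketch below assumes that all six competencies intercorrelate at the reported average of .49; the expert weights are hypothetical (the actual weights are not reproduced in this section), and the helper names are illustrative.

```python
from math import sqrt


def uniform_corr_matrix(n, r):
    """n x n correlation matrix with 1 on the diagonal and r elsewhere."""
    return [[1.0 if i == j else r for j in range(n)] for i in range(n)]


def quad_form(w, corr, u):
    """Compute w' R u for weight vectors w, u and correlation matrix R."""
    return sum(w[i] * corr[i][j] * u[j] for i in range(len(w)) for j in range(len(u)))


def composite_correlation(w, u, corr):
    """Correlation between two weighted composites of standardized variables."""
    return quad_form(w, corr, u) / sqrt(quad_form(w, corr, w) * quad_form(u, corr, u))


corr = uniform_corr_matrix(6, 0.49)                    # average intercorrelation from Table I
unit_weights = [1.0] * 6
expert_weights = [0.30, 0.20, 0.15, 0.15, 0.10, 0.10]  # hypothetical, for illustration only
r_composites = composite_correlation(expert_weights, unit_weights, corr)
print(f"Correlation between expert- and unit-weighted composites: {r_composites:.2f}")
```

With these assumed weights the two composites correlate in the high .90s, in line with the .98 reported above: when the components share this much variance, moderate differences in weights barely change the rank ordering of candidates.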

Implications for HR Practice

Many scholars are opposed to the use of 360-degree assessment for personnel decision making (for discussions see Fleenor & Brutus, 2001, and Waldman et al., 1998). However, the incremental validity evidence discussed earlier makes a compelling case for the use of 360-degree assessments for decision making. Individuals in organizations that use 360-degree assessment for administrative (rather than just developmental) purposes can use these findings to help persuade other employees and managers of the importance, relevance, and validity of conducting 360-degree (as opposed to just top-down) assessments. Individuals can emphasize that although the 360-degree assessment requires more time and energy on the part of managers, employees, and human resource professionals, such assessments provide incremental validity beyond that of TDA. This message can be communicated at several points during the 360-degree assessment process.

If an organization is opposed to using a full 360-degree assessment approach, a possible compromise is the use of more than one source but not all possible 360-degree sources. In retail, many companies are using customer-based data for a variety of administrative purposes, including the evaluation of individual managers. Our results provide support for this practice in that customer assessments explained unique variance beyond TDA. One interesting question that arises is the extent to which a combination of top-down and customer-based data can predict our criteria and whether this two-source approach is significantly different from the full 360-degree assessment. Comparisons of the criterion-related validity of 360 predictor composite versus the average weighted assessments across managers and customers revealed superior validity for 360 (.50 versus .37 and .35). Our results also showed that customer-based assessment explained unique variance beyond traditional top-down assessment and that the mean rating across the two rating sources for administrative decision making is more valid than just the TDA approach. So, those organizations uncomfortable with installing upward and peer appraisal for administrative purposes could still improve the validity of their decision making by incorporating a formal system of customer assessment to be used along with TDA.

Schippmann (1999) argued that effective competency programs are those that are customized to reflect a particular organization’s mission, strategy, culture, and technology. In this article, we describe the efforts of one organization to construct such a program. The process was complex and involved a number of individuals and groups. From start to finish, the competency modeling process took seven months. Our client chose to invest this time because it believed that a customized program would uniquely fit its needs.

In addition, we would suggest that customized programs, particularly if they are created with high levels of insider participation, could also produce higher levels of insider satisfaction both with the resultant program and with the decisions that emerge on the basis of the program. Wal-Mart is one of many organizations currently embroiled in litigation about the methods it uses in choosing candidates for promotion. Such litigation is not only extremely costly, but it also exacts heavy tolls on the employer-employee relationship, and on the attitudes of customers and community members about the organization’s value. If nothing else, the use of a competency-based model to make personnel decisions should enhance the perceived procedural justice and face validity of the approach and perhaps deter a proclivity to complain (or litigate) regarding the resultant decisions.

A similar argument could be offered for using an expert weighting system despite our finding that equal weights for the competencies predict just as well. An expert weighting system might enhance the face validity of the process and increase perceived procedural justice. Another possibility is that a predictive weighting system should also incorporate the source of the rating. For example, are assessments of “leadership” by subordinates more valid as predictors than assessments of the same competency by peers or supervisors?

While the competency titles that evolved from this process were similar to the titles of “off-the-shelf” competency models, the details in each scale that defined each competency were developed in the company and unique to the organization. This level of detail may have contributed to the strong criterion-related validity of the 360-degree data.

Limitations and Future Research Directions

The most obvious limitation of our study was the use of assessment center performance data and not actual on-the-job performance. While a compelling argument can be made that the use of assessment center data is a highly relevant criterion for the study of managers (e.g., Cascio & Aguinis, 2005), the use of multiple and relevant outcome measures such as store performance would make a stronger contribution to the literature on predicting managerial success. After all, the actual predictive validities of assessment centers, while statistically significant, are not all that impressive (i.e., .36 mean validities for AC dimensions and .37 mean validities for single measures of overall AC performance). Thus, drawing conclusions about real or potential on-the-job behaviors using AC performance should be done with some caution. On the other hand, in the absence of longitudinal outcome measures, AC measures that are independent of the predictors are a reasonable substitute, particularly when other criteria typically used in concurrent validation studies (e.g., performance, pay, and promotions) would be confounded with elements of the 360-degree predictors (Atkins & Wood, 2002). Also, given the correlations found across the “predictor” competencies (and sources), the sine qua non of criterion “equivalence” is likely; that is, the same managerial candidates would be “selected” using the battery derived from the AC study versus a study involving a relevant and reliable measure of on-the-job performance.

Our finding that 360-degree assessments better predicted an independent criterion than top-down assessments alone may have been facilitated by organizational factors present in this particular firm. The success of any multirater assessment program will be influenced by a number of variables, including the nature of the work, the structure of the jobs, the level and degree of supervision, the complexity of the technology in use, and the culture of the organization. The research reported here involves a 360-degree assessment program designed for use by a retailer with numerous branch offices linked by standardized technology and a strong internal team orientation. In addition, the selection of raters is a key element relating to program effectiveness. Performance-assessment literature, including literature about multirater programs, suggests that a “good” rater is one who (1) understands the job being rated; (2) has sufficient opportunity to observe performance or experience the outcomes; (3) is able to discern effective from ineffective behavior; and (4) is motivated to provide accurate ratings (Bernardin & Beatty, 1984; Tornow, 1993). In this research, ASMs were encouraged to select raters who fulfilled these criteria, and the organization invested considerable resources in rater training. Future research should investigate the degree to which these findings generalize to other organizational settings.

As firms build competency-based systems, their design and development experiences could be instructive to firms considering the approach, or those who are interested in improving an existing program. Our client organization chose to define competencies very behaviorally. As indicated earlier, Milkovich and Newman (2005) note that companies are moving toward more behaviorally based competencies. As additional cases are described, it will be interesting to examine how different definitions of “competency” affect program design and competency measurement.

Core competency theory suggests that organizing around worker capacities should improve a firm’s ability to respond rapidly and effectively to environmental change by making the company more flexible and “reconfigurable.” Future research should focus on whether these claims are valid and for what situations they are particularly well suited. For example, will competency-based systems add more value in an organization that creates customer value on the basis of high technical expertise or leading-edge innovation? Our client organization seeks to achieve competitive advantage on the basis of operational excellence. Our client chose to invest in creating a competency-based system because it believed that its recruitment, selection, performance feedback, career and individual development, and other related systems would facilitate a more cohesive HR system. Future research should be directed at identifying the situations in which competency systems are particularly useful, how they specifically add value, and how they facilitate improved business returns.

At this time, efforts to achieve some level of consensus about a definition of competencies are critical. It is difficult for a field to advance without such consensus. In the absence of a general definition, our research suggested that competency programs share certain characteristics in common. Earlier in the article, we identified such commonalities: (1) competencies focus on what jobs share in common, rather than what makes each job unique; (2) those commonalities are directly linked to achievement of organizational success, rather than success in any individual jobs; and (3) because they are fundamental commonalities, these competencies become the primary mechanism that drives organizational structure and managerial process. The program developed and implemented in our client organization meets these criteria. This organization specifically designed these competencies to represent the general characteristics of managers that would create value for customers and, thus, enable the organization to meet its goals. In addition, the company has made these characteristics the foundation of its overall management system. Over time, however, we may find that competency-based tools and techniques may vary as widely as traditional job analysis techniques.

Because this project was primarily concerned with the criterion validity of 360-degree competency assessment, we neither described nor discussed other types of statistical analyses that would have focused on the psychometric properties of the ratings themselves. Confirmatory factor analysis (CFA), for example, is widely used to assess measurement equivalence of performance ratings across multiple sources, although there is some evidence that the approach has considerable limitations (Woehr, Sheehan, & Bennett, 2005). Future research should continue to focus on improving our understanding about how multirater assessment works, what unique contribution each rating source provides, and the degree to which such ratings overlap.

One provocative issue from our findings is the demonstration that 360-degree assessment has superior validity to TDA. In Title VII litigation, if a firm successfully defends the job-relatedness of a contested practice, the plaintiff may still prevail by presenting evidence of an alternative practice that would enable the firm to obtain its same goals without the levels of adverse impact that resulted from the practice in question. In Albemarle v. Moody, 422 U.S. 405 (1975), the Supreme Court described this stage:

If an employer does then meet the burden of proving that its tests are “job-related,” it remains open to the complaining party to show that other tests or selection devices, without a similarly undesirable racial effect, would also serve the employer’s legitimate interest in “efficient and trustworthy workmanship.” Such a showing would be evidence that the employer was using its tests merely as a “pretext” for discrimination.

The Civil Rights Act of 1991 further asserted that such illegality would be established when “the complaining party makes the demonstration with respect to an alternative employment practice and the respondent refuses to adopt such alternative practice” (Section 2000 e-2 (k) (1) (A) (ii)).

It is in this context that expert testimony has been introduced asserting that the litigated practice could be replaced with a better (i.e., less discriminatory) practice. In both disparate treatment and disparate impact cases charging racial or gender bias, testimony has asserted that, in comparison to traditional top-down appraisal, assessment centers and multirater appraisal may be less prone to bias and that multirater appraisal is more likely to dilute an individual rater’s biases (Hennessey & Bernardin, 2003). Our results showing that 360-degree assessment actually is more valid than TDA, combined with a finding that 360-degree assessment would eliminate or reduce adverse impact against protected classes, could be compelling testimony where a company is accused of discrimination and it used TDA to make its personnel decisions.


CHRISTINE M. HAGAN received her PhD degree in business administration from Florida Atlantic University (Boca Raton). She also holds an MBA with an HR specialization from Fairleigh Dickinson University (Madison, New Jersey). She is currently a lecturer at the University of Miami’s School of Business. Her research interests include performance appraisal, HR management in service settings, compensation, and HR effectiveness. Her research has been published in a variety of journals, including Administrative Science Quarterly, Human Resource Management Review, and the Journal of Applied Behavioral Science. Prior to pursuing a doctoral degree, she spent over 20 years in human resource management positions with such organizations as General Electric, RKO General Entertainment, and Montefiore Medical Center.

ROBERT KONOPASKE earned his BA degree, Phi Beta Kappa, at Rutgers College, Rutgers University in 1986 and his master’s degree in international business studies at the University of South Carolina in 1994. In 2001, he obtained his PhD in business administration at the University of Houston. His primary research interests are traveling, short-term and other types of emerging global assignments, expatriate staffing and selection, cross-cultural training, work-family balance issues, and travel stress. He has published in such journals as the Journal of Applied Psychology, the Academy of Management Executive, the Journal of Business Research, Work & Stress, the International Journal of Human Resource Management, Compensation & Benefits Review, and Human Resource Management Review. Dr. Konopaske is an author of several textbooks and is a senior fellow at the Applied Management Sciences Institute. He recently became a member of the editorial board of the International Journal of Human Resource Management.

H. JOHN BERNARDIN is a university research professor in the College of Business at Florida Atlantic University in Boca Raton. He earned his PhD in industrial/organizational psychology from Bowling Green University and is the former director of doctoral studies in I/O psychology at Virginia Tech. He is past chair of the Division of Personnel/Human Resources of the Academy of Management. Dr. Bernardin was editor of Human Resource Management Review and has served on the editorial boards of numerous journals, including Academy of Management Review, the Human Resource Management Journal, and the Journal of Organizational Behavior. He is the author of six books and over 100 articles related to human resource management. His paper on employment discrimination was cited as the best paper of the year by the Society of Human Resource Management in 2003. Dr. Bernardin has consulted for many of the most successful companies in the world and has served as an expert witness in numerous employment discrimination lawsuits.

CATHERINE L. TYLER is an assistant professor at Oakland University in Rochester, Michigan. She received her PhD from Florida Atlantic University (Boca Raton) and an MBA from the University of South Florida. Her research interests include performance appraisal, employee selection, team development, ethics, and international human resources. She has published in journals such as the Journal of Applied Psychology and the Journal of Management Education. Other published works include book chapters on ethics in supervision and performance appraisal training. She has also presented works at national and international conferences and received a “Best Paper” nomination in the HR division of the Academy of Management.


NOTES

1. Some may question the use of assessment center data as criteria in criterion-related validation because such data are not measures of actual job performance typically operationalized by supervisory ratings of on-the-job performance. We have three responses to this issue that implies the criteria we used in this study are not “equivalent” to on-the-job performance measures (Bernardin & Beatty, 1984, p. 142; Brogden & Taylor, 1950).

First, the true definition of “equivalence” of criteria in empirical validation research is the selection and weighting of the predictors under study for subsequent use in personnel decision making (Brogden & Taylor, 1950). Thus, the implication of the criticism is that the use of assessment center data as criteria would result in different competencies being selected (e.g., 360-degree assessments of Competency A would be selected for use when AC data are used as criteria, while Competency A would not be selected if an on-the-job performance measure were used) and/or different weights (e.g., Competency A would receive a different weight in a selection battery because AC data were used versus on-the-job performance data).

In fact, there is no published evidence to support this claim. Brogden’s classic Army research found that job knowledge, ratings, and work samples were equivalent since their use as criteria resulted in essentially the same beta-weights for the predictors under study (F. L. Schmidt, personal communication, January 30, 2006). Regarding the circumstances under study here, given the level of correlation among the predictor competencies, even if the use of one or more on-the-job performance measures did result in different predictor weights, the derivation of a composite predictor “battery” (for actual selection purposes) would result in nearly identical rank ordering of job candidates using either the AC data or some other “relevant” measure of on-the-job performance.

In their discussion of criterion “relevance,” Cascio and Aguinis (2005) state that “a well-designed work sample test or performance management system may require a great deal of ingenuity, effort, and expense to construct” (p. 71). These scholars, among many others, obviously consider a work sample as an acceptable criterion measure in validation research. Of course, there is a plethora of published criterion-related validation studies involving work samples as criteria.

Second, most experts on the concept of validity, in general, and construct validity, in particular, argue that all forms of empirical or criterion-related validity should be interpreted as part of a construct validity assessment process (e.g., Binning & Barrett, 1989, see inferences 5 and 8). J. P. Campbell (1976), for example, suggested eight approaches to the study of construct validity, including correlational studies among variables theorized to be measuring the same thing. Studies of predictors correlated with criteria such as work samples and situational tests are very common in validation research (e.g., Schmidt & Hunter, 1981). Such criteria are often used because of the plethora of problems that arise when using on-the-job performance measures. Among the well-documented major problems with on-the-job performance measures for validation purposes are freedom from bias, criterion contamination and deficiency, opportunity bias, low reliability, rater leniency, and discriminability, all of which are either eliminated or at least reduced when using a well-controlled performance test, work sample, or assessment center. A strong argument could be made that the composite measures derived from this assessment center have more (not less) “relevance” than the on-the-job performance measures typically available in validation research (Cascio & Aguinis, 2005).

Our third response to the criticism is that the use of ratings of on-the-job performance in this study would probably confound “predictor” data with criterion data in the sense that some raters who participated in the 360-degree data collection would probably provide data related to on-the-job performance. Thus, because rater tendencies are relatively stable across ratees and occasions (Kane, Bernardin, Villanova, & Peyrefitte, 1995), such a research design would artificially inflate any obtained validity coefficient. We were not beset by this critical problem using the AC data.

2. In addition, we conducted four step-wise regression analyses to compare the differences in the regression weights for the six competencies using EWCC and OAP as the dependent variables (and 360-degree and TDA as the independent variables). Regressing EWCC on 360-degree assessment, we found that the standardized regression coefficients for Oral Presentation (p < .05), Interpersonal Skills (p < .01), Planning and Organizing (p < .05), and Leadership (p < .001) were statistically significant. When OAP was regressed on 360-degree assessment, the standardized regression coefficients for Interpersonal Skills (p < .05) and Leadership (p < .001) were statistically significant. In contrast, the TDA-EWCC regression analysis generated statistically significant standardized regression coefficients for Oral Presentation (p < .05), Written Communication (p < .05), Interpersonal Skills (p < .05), Planning and Organizing (p < .01), and Leadership (p < .01); and the TDA-OAP regression analysis resulted in statistically significant betas for Interpersonal Skills (p < .05), Planning and Organizing (p < .01), and Leadership (p < .01).

REFERENCES

Anderson, C., & Bissell, P. (2004). Using semi covert research to evaluate an emergency hormonal contraception service. Pharmacy World & Science, 26(2), 102–106.

Arthur, W., Jr., Day, E. A., McNelly, T. L., & Edens, P. S. (2003). A meta-analysis of the criterion-related validity of assessment center dimensions. Personnel Psychology, 56, 125–154.

Arthur, W., Jr., Woehr, D. J., & Maldegen, R. (2000). Convergent and discriminant validity of assessment center dimensions: A conceptual and empirical re-examination of the assessment center construct-related validity paradox. Journal of Management, 26, 813–835.

Athey, T. R., & Orth, M. S. (1999). Emerging competency methods for the future. Human Resource Management, 38, 215–226.

Atkins, P. W. B., & Wood, R. E. (2002). Self- versus others’ ratings as predictors of assessment center ratings: Validation evidence for 360-degree feedback programs. Personnel Psychology, 55, 871–904.

Barney, J. B. (1991). Firm resources and sustained competitive advantage. Journal of Management, 17(1), 99–120.

Becker, B. E., & Huselid, M. A. (1999). Overview: Strategic human resource management in five leading firms. Human Resource Management, 38, 287–301.

Becker, B. E., Huselid, M. A., & Ulrich, D. (2001). The HR scorecard: Linking people, strategy, and performance. Boston: Harvard Business School Press.

Bernardin, H. J. (1992). An ‘analytic’ framework for customer-based performance content development and appraisal. Human Resource Management Review, 2, 81–102.

Bernardin, H. J., & Beatty, R. W. (1984). Performance appraisal: Assessing human behavior at work. Boston: Kent-Wadsworth Publishing.

Bernardin, H. J., Buckley, M. R., Tyler, C. L., & Wiese, D. S. (2000). A consideration of strategies in rater training. In G. R. Ferris (Ed.), Research in personnel & human resources management (Vol. 18, pp. 221–274). Greenwich, CT: JAI Press.

Bernardin, H. J., Hagan, C. M., Kane, J. S., & Villanova, P. (1998). Effective performance management: A focus on precision, customers, and situational constraints. In J. W. Smither (Ed.), Performance appraisal: State of the art in practice (pp. 3–48). San Francisco: Jossey-Bass.

Bernardin, H. J., & Smith, P. C. (1981). A clarification of some issues regarding the development and use of behaviorally anchored rating scales. Journal of Applied Psychology, 66, 458–463.

Bernardin, H. J., & Tyler, C. L. (2001). Legal and ethical issues in multisource feedback. In D. W. Bracken, C. W. Timmreck, & A. H. Church (Eds.), The handbook of multisource feedback (pp. 447–462). San Francisco: Jossey-Bass.

Bernardin, H. J., Villanova, P., & Cooke, D. (2005). The relationship among the Big Five factors, rating discomfort and rating elevation. Manuscript under review.

Binning, J. F., & Barrett, G. V. (1989). Validity of personnel decisions: A conceptual analysis of the inferential and evidential bases. Journal of Applied Psychology, 74, 478–494.

Borman, W. C. (1997). 360 degree ratings: An analysis of assumptions and a research agenda for evaluating their validity. Human Resource Management Review, 7, 299–316.

Brogden, H. E., & Taylor, E. K. (1950). The theory and classification of criterion bias. Educational and Psychological Measurement, 10, 159–186.

Browne, M. W. (1975). Predictive validity of a linear regression equation. British Journal of Mathematical and Statistical Psychology, 28, 79–87.

Caldwell, C., Thornton, G. C., & Gruys, M. L. (2003). Ten classic assessment center errors: Challenges to selection validity. Public Personnel Management, 32, 73–88.

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.

Campbell, J. P. (1976). Psychometric theory. In M. D. Dunnette (Ed.), Handbook of industrial and organizational psychology (pp. 185–222). Chicago: Rand McNally.

Cardy, R. L. (1998). Performance appraisal in a quality context: A new look at an old problem. In J. W. Smither (Ed.), Performance appraisal: State of the art in practice (pp. 132–162). San Francisco: Jossey-Bass.

Cascio, W. F., & Aguinis, H. (2005). Applied psychology in human resource management. Upper Saddle River, NJ: Prentice Hall.

Church, A. H. (1997). Do you see what I see? An explanation of congruence in ratings from multiple perspectives. Journal of Applied Social Psychology, 27, 983–1020.

Facteau, J. D., & Craig, S. B. (2001). Are performance appraisal ratings from different rating sources comparable? Journal of Applied Psychology, 86, 215–227.

Fiedler, A. (2001). Adverse impact on Hispanic job applicants during assessment center evaluations. Hispanic Journal of Behavioral Sciences, 23, 102–110.

Finn, A., & Kayande, U. (1999). Unmasking a phantom: A psychometric assessment of mystery shopping. Journal of Retailing, 75(2), 195–217.

Fleenor, J. W., & Brutus, S. (2001). Multisource feedback for personnel decisions. In D. W. Bracken, C. W. Timmreck, & A. H. Church (Eds.), The handbook of multisource feedback (pp. 335–351). San Francisco: Jossey-Bass.

Fletcher, C., Baldry, C., & Cunningham-Snell, N. (1998). The psychometric properties of 360 degree feedback: An empirical study and a cautionary tale. International Journal of Selection and Assessment, 6, 19–34.

Ganzach, Y., Kluger, A. N., & Klayman, N. (2000). Making decisions from an interview: Expert measurement and mechanical combination. Personnel Psychology, 53, 1–20.

Gatewood, R. D., & Feild, H. S. (2001). Human resource selection. Fort Worth, TX: The Dryden Press.

Gaugler, B. B., Rosenthal, D. B., Thornton, G. C., III, & Bentson, C. (1987). Meta-analysis of assessment center validity. Journal of Applied Psychology, 72, 493–511.

Goldstein, H. W., Yusko, K. P., Braverman, E. P., Smith, D. B., & Chung, B. (1998). The role of cognitive ability in the subgroup differences and incremental validity of assessment center exercises. Personnel Psychology, 51, 357–374.

Goldstein, H. W., Yusko, K. P., & Nicolopoulos, V. (2001). Exploring black-white subgroup differences of managerial competencies. Personnel Psychology, 54, 783–807.

Greguras, G. J., & Robie, C. (1998). A new look at within-source interrater reliability of 360 degree feedback ratings. Journal of Applied Psychology, 83, 960–968.

Greguras, G. J., Robie, C., Schleicher, D. J., & Goff, M. (2003). A field study of the effects of rating purpose on the quality of multisource ratings. Personnel Psychology, 56, 1–21.

Hagan, C. M., & Bernardin, H. J. (2003). Customer feedback as a critical performance dimension: Review and exploratory examination. In C. A. Schriesheim & L. L. Neider (Eds.), New directions in human resource management (pp. 1–27). Greenwich, CT: Information Age.

Hamel, G., & Prahalad, C. K. (1994). Competing for the future. Boston, MA: Harvard Business School Press.

Hennessey, H. W., & Bernardin, H. J. (2003). The relationship between performance appraisal criterion specificity and statistical evidence of discrimination. Human Resource Management, 42, 143–158.

Kane, J. S., Bernardin, H. J., Villanova, P., & Peyrefitte, J. (1995). The stability of rater leniency: Three studies. Academy of Management Journal, 38, 1036–1051.

Kirn, S. P., Rucci, A. J., Huselid, M. A., & Becker, B. E. (1999). Strategic human resource management at Sears. Human Resource Management, 38, 329–335.

Klemp, G. O. (Ed.). (1980). The assessment of occupational competence. Washington, DC: Report to the National Institute of Education.

Lawler, E. E., III. (1994). From job-based to competency-based organizations. Journal of Organizational Behavior, 15, 3–15.

Lengnick-Hall, C. A. (1996). Customer contributions to quality: A different view of the customer-oriented firm. Academy of Management Review, 21, 791–824.

Lievens, F. (1998). Factors which improve the construct validity of assessment centers: A review. International Journal of Selection and Assessment, 6, 141–152.

Lievens, F., Sanchez, J. I., & De Corte, W. D. (2004). Easing the inferential leap in competency modeling: The effects of task-related information and subject matter expertise. Personnel Psychology, 57, 881–904.

London, M., & Beatty, R. W. (1993). 360-degree feedback as a competitive advantage. Human Resource Management, 32, 353–372.

London, M., & Smither, J. W. (1995). Can multi-source feedback change perceptions of goal accomplishment, self-evaluations, and performance related outcomes? Personnel Psychology, 48, 803–839.

Lucia, A. D., & Lepsinger, R. (1999). The art and science of competency models: Pinpointing critical success factors in organizations. San Francisco: Jossey-Bass.

McCowan, R. A., Bowen, U., Huselid, M. A., & Becker, B. E. (1999). Strategic human resource management at Herman Miller. Human Resource Management, 38, 303–308.

Milkovich, G. T., & Newman, J. M. (2005). Compensation. New York: McGraw-Hill Irwin.

Moriarty, H., McLeod, D., & Dowell, A. (2003). Mystery shopping in health service evaluation. British Journal of General Practice, 53, 942–946.

Mount, M. K., Judge, T. A., Scullen, S. E., Sytsma, M. R., & Hezlett, S. A. (1998). Trait, rater and level effects in 360-degree performance ratings. Personnel Psychology, 51, 557–576.

Murphy, K. M., & Cleveland, J. N. (1995). Understanding performance appraisal: Social, organizational and goal-based perspectives. Thousand Oaks, CA: Sage.

Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.

Ostroff, C., Atwater, L. E., & Feinberg, B. J. (2004). Understanding self-other agreement: A look at rater and ratee characteristics, context, and outcomes. Personnel Psychology, 57, 333–375.

Parry, S. B. (1996). The quest for competencies. Training, 33(7), 48–54.

Pitoniak, M. J., Sireci, S. G., & Leucht, R. M. (2002). A multitrait-multimethod investigation of scores from a professional licensure examination. Educational and Psychological Measurement, 62, 498–516.

Powell, G. N., & Butterfield, D. A. (1997). Effect of race on promotions to top management in a federal department. Academy of Management Journal, 40, 112–128.

Prahalad, C. K., & Hamel, G. (1990). The core competence of the corporation. Harvard Business Review, 67(3), 79–91.

Raju, N. S., Bilgic, R., Edwards, J. E., & Fleer, P. F. (1997). Methodology review: Estimation of population validity and cross validity and the use of equal weights in prediction. Applied Psychological Measurement, 21, 291–305.

Raju, N. S., Bilgic, R., Edwards, J. E., & Fleer, P. F. (1999). Accuracy of population validity and cross-validity estimation: An empirical comparison of formula-based, traditional empirical, and equal weights procedures. Applied Psychological Measurement, 23, 99–115.

Schippmann, J. S. (1999). Strategic job modeling: Working at the core of integrated human resources. Mahwah, NJ: Erlbaum.

Schippmann, J. S., Ash, R. A., Battista, M., Carr, L., Eyde, L. D., Hesketh, B., et al. (2000). The practice of competency modeling. Personnel Psychology, 53, 703–740.

Schleicher, D. J., Day, D. V., Mayes, B. T., & Riggio, R. E. (2002). A new frame of reference training: Enhancing the construct validity of assessment centers. Journal of Applied Psychology, 87, 735–746.

Schmidt, F. L., & Hunter, J. E. (1981). Employment testing: Old theories and new research findings. American Psychologist, 36, 128–137.

Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262–274.

Scullen, S. E., Mount, M. K., & Goff, M. (2000). Understanding the latent structure of job performance ratings. Journal of Applied Psychology, 85, 956–970.

Scullen, S. E., Mount, M. K., & Judge, T. A. (2003). Evidence of the construct validity of developmental ratings of managerial performance. Journal of Applied Psychology, 88, 50–66.

Smither, J. W., London, M., & Reilly, R. R. (2005). Does performance improve following multisource feedback? A theoretical model, meta-analysis, and review of empirical findings. Personnel Psychology, 58, 33–66.

Spencer, L. M., Jr., & Spencer, S. M. (1993). Competence at work: Models for superior performance. New York: Wiley.

Spychalski, A. C., Quinones, M. A., Gaugler, B. B., & Pohley, K. (1997). A survey of assessment center practices in organizations in the United States. Personnel Psychology, 50, 71–90.

Tett, R. P., Guterman, H. A., Bleier, A., & Murphy, P. J. (2000). Development and content validation of a ‘hyperdimensional’ taxonomy of managerial competence. Human Relations, 13, 205–251.

Thornton, G. C., III. (1992). Assessment centers in human resource management. Reading, MA: Addison-Wesley.

Tornow, W. W. (1993). Editor’s note: Introduction to special issue on 360-degree feedback. Human Resource Management, 32, 211–219.

Ulrich, D. (1997). Human resource champions: The next agenda for adding value and delivering results. Boston, MA: Harvard Business School Press.

Villanova, P. (1992). A customer-based model for developing job performance criteria. Human Resource Management Review, 2, 103–114.

Waldman, D. A., Atwater, L. A., & Antonioni, D. (1998). Has 360 feedback gone amok? Academy of Management Executive, 12(2), 86–94.

Wernerfelt, B. (1984). A resource-based view of the firm. Strategic Management Journal, 5(1), 171–180.

Wilson, A. M. (2002). Attitudes towards customer satisfaction measurement in the retail sector. International Journal of Market Research, 44, 213–222.

Woehr, D. J., & Arthur, W., Jr. (2003). The construct-related validity of assessment center ratings: A review and meta-analysis of the role of methodological factors. Journal of Management, 29, 231–258.

Woehr, D. J., Sheehan, K. M., & Bennett, W. (2005). Assessing measurement equivalence across rating sources: A multitrait-multimethod approach. Journal of Applied Psychology, 90, 592–600.

Yammarino, F. J., & Atwater, L. E. (1993). Understanding self-perception accuracy: Implications for human resource management. Human Resource Management, 32, 231–247.
