Psychometric Properties of Measures of Team Diversity With Likert Data

Article in Educational and Psychological Measurement · May 2014

DOI: 10.1177/0013164414541275


Lifang Deng, George A. Marcoulides and Ke-Hai Yuan. Psychometric Properties of Measures of Team Diversity With Likert Data. Educational and Psychological Measurement, published online 4 July 2014. DOI: 10.1177/0013164414541275. The online version of this article can be found at: http://epm.sagepub.com/content/early/2014/07/03/0013164414541275
Article

Educational and Psychological Measurement

1–23 © The Author(s) 2014

Reprints and permissions: sagepub.com/journalsPermissions.nav

DOI: 10.1177/0013164414541275 epm.sagepub.com

Psychometric Properties of Measures of Team Diversity With Likert Data

Lifang Deng¹, George A. Marcoulides², and Ke-Hai Yuan³

Abstract

Certain diversity among team members is beneficial to the growth of an organization. Multiple measures have been proposed to quantify diversity, although little is known about their psychometric properties. This article proposes several methods to evaluate the unidimensionality and reliability of three measures of diversity. To approximate the interval scale required by the measures of diversity, a transformation on the Likert-item scores is proposed. Ridge maximum likelihood is used to deal with the issue of small sample size, and methods for evaluating the significance of the difference of two reliability estimates with correlated samples are also developed. Results with a real data set on entrepreneurial teams indicate that different measures of diversity may correspond to significantly different estimates of reliability. Results also indicate that diversity measures obtained with the transformed data tend to be more unidimensional than their counterparts obtained from Likert data. However, diversity measures obtained from Likert data tend to yield greater reliability estimates. Among the three examined measures of diversity, the standard deviation is found to yield greater and more efficient reliability estimates than the others and is thus recommended.

Keywords

unidimensionality, reliability, normal-curve transformation, ridge structural equation modeling

¹ Beihang University, Beijing, China

² University of California, Santa Barbara, CA, USA

³ University of Notre Dame, Notre Dame, IN, USA

Corresponding Author:

Ke-Hai Yuan, University of Notre Dame, 123a Haggar Hall, Notre Dame, IN 46556, USA.

Email: [email protected]


Introduction

The compositional diversity of team members within an organization has been shown to affect the performance and growth of the organization (Harrison & Klein, 2007; Van Knippenberg, De Dreu, & Homan, 2004). Among various kinds of diversity (e.g., demographic, informational, experiential, or personality attributes), not all have been determined to be beneficial to the growth of an organization. Some researchers have indicated that it is merely the differences between team members in terms of their skill level, knowledge, and perspectives that are needed to foster creativity and innovation (e.g., Guzzo & Shea, 1992). However, the findings in the extant literature have not been consistent (Jackson, Joshi, & Erhardt, 2003; Stewart, 2006; Van Knippenberg et al., 2004; Webber & Donahue, 2001) and indicate that our understanding of diversity and its role is still relatively limited. To facilitate a better understanding of diversity, Harrison and Klein (2007) classified diversity into three distinctive types: separation, variety, and disparity. Separation refers to differences in position or opinion among team members; variety describes diversity in expertise, knowledge, or experience; and disparity refers to inequality in status or resources held among team members. Such a classification allows researchers to identify the different roles of different types of diversity.

A variety of measures have also been proposed to quantify different types of diversity among individuals within a team. According to Harrison and Klein (2007), separation should be measured by either the standard deviation or the average of Euclidean distances, variety should be measured by the so-called Blau's (1977) index or entropy (Teachman, 1980), and disparity should be measured by the coefficient of variation or the ratio of the average of the absolute differences over the mean. Harrison and Klein (2007) also discussed the type of scales required by each of these diversity measures and emphasized that measuring separation requires the observed data on team members to be at the interval scale, whereas measuring disparity requires the observed data to be at the ratio scale. However, in the study of human behavior within the fields of education, management, psychology, and related social and behavioral sciences, it is extremely difficult to obtain data at the ratio or even interval scales. What are typically obtained are data collected from a survey using questionnaires that are commonly only ordinal or Likert type. Nevertheless, researchers still regularly apply procedures that require interval-scale data to ordinal data. For example, factor analysis is commonly applied to Likert data for item selection or scale development (Raykov & Marcoulides, 2011). Although such a practice may still yield interpretable results, a better method is to factor analyze the polychoric correlation matrix (Babakus, Ferguson, & Jöreskog, 1987). Given that the observed values in Likert data used to determine the above-mentioned diversity measures are somewhat arbitrary, we propose to transform them to avoid the arbitrariness. The transformation is based on threshold values under the normal curve, parallel to those used in estimating polychoric correlations (Olsson, 1979). We can call it the normal-curve (NC) transformation. Although the transformed data are still limited in number of values, we argue that they are closer to the conditions required by diversity measures than the commonly used Likert data. To see the effect of the transformation, we will study the psychometric properties of several diversity measures when applied to both Likert and transformed data.

Unidimensionality and reliability are probably the two most basic psychometric properties one has to consider for any scale or instrument. Unidimensionality implies that the statistical dependence among the items can be accounted for by a single underlying latent trait, and reliability informs about the degree to which the observed individual differences are indicative of true individual differences on the latent dimension of interest. Measures of diversity, especially those for measuring separation with Likert data, are also subject to such properties if they aim to properly capture any underlying trait. In particular, when measurements in a scale are not unidimensional, the empirical meaning of the scale will be different from the meaning assigned to it, which will create interpretational confounding (e.g., Anderson & Gerbing, 1982; Bagozzi, 1980; Burt, 1973, 1976). Reliability is equally important because, for measurements with a low reliability index, the observed values of the obtained measurements can be mostly due to random errors. Additionally, because the value of the determined reliability index sets a bound on validity (Allen & Yen, 1979), a high reliability (index) is a necessary condition for high validity (Raykov & Marcoulides, 2011). We hope that by studying the unidimensionality and reliability of different measures of diversity, the inconsistent findings obtained to date on the roles of diversity can be better understood.

The methodological development presented in this article was motivated by the need to study the psychometric properties of diversity measures based on 13 Likert items administered to entrepreneurial teams. Because the number of teams plays the role of sample size, which is not sufficiently large, a method to deal with the issue of small sample sizes was also needed, especially when using factor analysis to evaluate the unidimensionality of the diversity measures. For such a purpose, we make use of the ridge maximum likelihood (ML) method originally developed in Yuan and Chan (2008). This method has been shown to yield more accurate parameter estimates than the normal-distribution-based maximum likelihood (NML) even for normally distributed data. We also develop methods for evaluating the significance of the difference of two reliability estimates with correlated samples. This enables us to determine whether different measures of diversity correspond to significantly different reliability estimates. If different diversity measures yield significantly different reliability estimates, then it is better to use the one that corresponds to the greatest reliability.

In the next section, the methodological components for studying the unidimensionality and reliability of different measures of diversity are given, including the formulations of different diversity measures, the NC transformation, ridge ML, and standard error (SE) for difference of reliability estimates. A real data set with Likert scale and its analysis are presented in the following section. We conclude with a discussion and recommendations. It is important to note that our focus is on the psychometric properties (unidimensionality and reliability) of different measures of diversity, not on interrater reliability issues (for further details on interrater reliability, see Algina, 1978; Schuster & Smith, 2002; Shrout & Fleiss, 1979).

Methodology

This section first introduces the three diversity measures that will be used in the analysis of the real data. Then, the NC transformation is described. Ridge ML for factor analysis is reviewed next. Formulas for SE of the difference of two reliability estimates are developed at the end of this section. These measures and techniques will be used to analyze the real data in the subsequent section.

Diversity Measures

Let $x_{ijk}$ be the score of person $k$ on item $j$ within team $i$, $k = 1, 2, \ldots, n_i$; $j = 1, 2, \ldots, p$; $i = 1, 2, \ldots, N$. Three measures of diversity derived from $x_{ijk}$ will be studied. These are the average of absolute distances (aad) among team members,

$$\mathrm{aad}_{ij} = \frac{2}{n_i(n_i - 1)} \sum_{k_1 = 1}^{n_i - 1} \sum_{k_2 = k_1 + 1}^{n_i} |x_{ijk_1} - x_{ijk_2}|; \tag{1}$$

the average of absolute deviations from the mean (aadm) of team members,

$$\mathrm{aadm}_{ij} = \frac{1}{n_i - 1} \sum_{k = 1}^{n_i} |x_{ijk} - \bar{x}_{ij}|, \tag{2}$$

where $\bar{x}_{ij} = \sum_{k = 1}^{n_i} x_{ijk}/n_i$; and the standard deviation (sd) among team members,

$$\mathrm{sd}_{ij} = \left[\frac{1}{n_i - 1} \sum_{k = 1}^{n_i} (x_{ijk} - \bar{x}_{ij})^2\right]^{1/2}. \tag{3}$$

Two measures for separation were recommended by Harrison and Klein (2007). One is the standard deviation in which the denominator is $n_i$ instead of $n_i - 1$. Another is the square root of the average of the squared Euclidean distances $(x_{ijk_1} - x_{ijk_2})^2$, in which $k_1 = k_2$ is not distinguished from $k_1 \neq k_2$. According to Biemann and Kearney (2010), these measures may contain substantial bias due to including terms that are obviously 0 or without correcting for the loss of degrees of freedom. The diversity measure in (1) only includes the absolute distances for different team members, and degrees of freedom loss are accounted for in (2) and (3). Parallel to the average Euclidean distance (aed) in Harrison and Klein (2007) or Biemann and Kearney (2010), we define $\mathrm{aed}_{ij}$ as

$$\mathrm{aed}_{ij} = \left[\frac{2}{n_i(n_i - 1)} \sum_{k_1 = 1}^{n_i - 1} \sum_{k_2 = k_1 + 1}^{n_i} (x_{ijk_1} - x_{ijk_2})^2\right]^{1/2}.$$

Because $\mathrm{aed}_{ij}$ is proportional to $\mathrm{sd}_{ij}$ (see Hays, 1981) and any results of reliability and unidimensionality analysis of $\mathrm{aed}_{ij}$ would be identical to those of $\mathrm{sd}_{ij}$, we do not separately examine $\mathrm{aed}_{ij}$ in this article.
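The measures in (1) to (3) and the aed can be computed for a single team and item as in the following minimal Python sketch. The function names and the example scores in `team` are illustrative assumptions, not part of the original study. The last line checks the proportionality noted above: with the $n_i - 1$ divisor, aed equals $\sqrt{2}$ times sd.

```python
from itertools import combinations
from math import sqrt

def aad(x):
    """Average absolute distance over all distinct pairs of members, as in (1)."""
    pairs = list(combinations(x, 2))          # n(n-1)/2 distinct pairs
    return sum(abs(u - v) for u, v in pairs) / len(pairs)

def aadm(x):
    """Average absolute deviation from the team mean with divisor n - 1, as in (2)."""
    m = sum(x) / len(x)
    return sum(abs(u - m) for u in x) / (len(x) - 1)

def sd(x):
    """Standard deviation with divisor n - 1, as in (3)."""
    m = sum(x) / len(x)
    return sqrt(sum((u - m) ** 2 for u in x) / (len(x) - 1))

def aed(x):
    """Square root of the average squared distance over distinct pairs."""
    pairs = list(combinations(x, 2))
    return sqrt(sum((u - v) ** 2 for u, v in pairs) / len(pairs))

team = [2, 4, 5]   # hypothetical Likert scores of three members on one item
print(aad(team), aadm(team), sd(team), aed(team))
print(abs(aed(team) - sqrt(2) * sd(team)) < 1e-12)   # aed is proportional to sd
```

Applying these functions to every team and item yields the $N \times p$ data matrices whose covariance matrices are analyzed later.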

As we were not able to locate any references in the literature in which the $\mathrm{aadm}_{ij}$ in (2) or an index that is proportional to $\mathrm{aadm}_{ij}$ has been proposed to measure diversity, $\mathrm{aadm}_{ij}$ can be regarded as a new measure. The psychometric properties of the three measures, aad, aadm, and sd, will be examined through real data analysis in the following section.

Quantities in the form of the average of absolute distances (e.g., aad) are not presented as stand-alone measures in either Harrison and Klein (2007) or Biemann and Kearney (2010). Instead, they are divided by the team mean score for measuring disparity. Another measure for disparity recommended by Harrison and Klein (2007) is the coefficient of variation. Since these measures require the observed data to possess the properties of a ratio scale, they may not be applicable to Likert data and will not be studied in this article. Similarly, variety will not be measured through Likert data, and we do not study Blau's index or entropy.

Normal-Curve Transformation

Since all three measures of diversity (aad, aadm, sd) are obtained by arithmetic operations, they are ideally applicable to data that are of interval scale (Harrison & Klein, 2007). However, as indicated previously, measurements in the social and behavioral sciences are typically Likert or ordinal scale. To approximate interval scales, we propose a transformation to Likert data in this subsection.

With a total of $N_t = \sum_{i = 1}^{N} n_i$ individual observations and $c$ categories for a given item, let $\hat{q}_l$ be the proportion of observations¹ for category $l$. Following the convention of polychoric correlations (Olsson, 1979), we may assume that, for each observed $x_{ijk}$, there is an underlying continuous variable $z_{ijk} \sim N(0, 1)$ such that $x_{ijk} = l$ whenever $z_{ijk}$ belongs to the interval $(h_{l-1}, h_l]$, where $h_0 < h_1 < \cdots < h_c$ are threshold values to be estimated. This implies that the probability for $x_{ijk} = l$ is given by

$$q_l = \Phi(h_l) - \Phi(h_{l-1}), \quad l = 1, 2, \ldots, c,$$

where $\Phi(\cdot)$ is the cumulative distribution function of $z \sim N(0, 1)$, with $h_0 = -\infty$ and $h_c = \infty$. Thus, the marginal maximum likelihood estimate of $h_l$ is given by

$$\hat{h}_l = \Phi^{-1}\left(\sum_{t = 1}^{l} \hat{q}_t\right), \quad l = 1, 2, \ldots, c - 1. \tag{4}$$

Based on this underlying NC assumption, we propose a transformation to the Likert $x_{ijk}$ by

$$y_{ijk} = (\hat{h}_{l-1} + \hat{h}_l)/2 \quad \text{if } x_{ijk} = l, \quad l = 1, 2, \ldots, c. \tag{5}$$

Notice that there are only $c - 1$ finite values of $\hat{h}_l$ in (4), and we cannot use $\hat{h}_0 = -\infty$ or $\hat{h}_c = \infty$ because they will result in $y_{ijk} = -\infty$ when $x_{ijk} = 1$ or $y_{ijk} = \infty$ when $x_{ijk} = c$. We further propose to use

$$\hat{h}_0 = \Phi^{-1}(.5/N_t) \quad \text{and} \quad \hat{h}_c = \Phi^{-1}(1 - .5/N_t). \tag{6}$$

The proposed values in (6) are equivalent to assigning a value of .5 to cells with zero observations in the analysis of contingency tables, because we can think of an extra category $x_{ijk} = 0$ below $x_{ijk} = 1$ and another extra category $x_{ijk} = c + 1$ above $x_{ijk} = c$, both with zero observations. The proposed values in (6) are also similar to the so-called continuity correction in applying the central limit theorem to categorical data (Feller, 1945), where a step of .5 is used when jumping from one whole number to the next.

Notice that the correction in (6) is for $y_{ijk}$ to avoid being $-\infty$ or $\infty$ whenever $x_{ijk} = 1$ or $c$. If the nominal number of categories is $c$ but only $c - 1$ or fewer categories are observed, we may simply treat the unobserved categories in the middle as having probability of zero by just applying the correction to the end points of $x_{ijk}$. We need to note that the transformed $y_{ijk}$ do not possess the property of interval scales, although they avoid the arbitrary nature of Likert data that assign consecutive whole numbers to ordered categories. Closely related to polychoric correlation, the rationale of the transformation in (5) depends heavily on the assumption of a normal curve underlying the observed frequencies. If the NC assumption holds, the $y_{ijk}$ obtained by the NC transformation determined by equations (4), (5), and (6) is simply the middle point of the interval to which $z_{ijk}$ belongs, and thus represents the best prediction of the true value of $z_{ijk}$ in the sense of smallest absolute mean difference.
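The steps in (4) to (6) can be sketched in a few lines of Python using only the standard library. The function name `nc_transform` and the example scores are illustrative assumptions. Clamping the cumulative proportions to $[.5/N_t, 1 - .5/N_t]$ is one possible way to apply the end-point correction in (6), including the case where an end category is unobserved; it is an assumption of this sketch rather than a procedure spelled out in the text.

```python
from collections import Counter
from statistics import NormalDist

def nc_transform(scores, c):
    """Map Likert scores 1..c to midpoints of estimated normal-curve intervals.

    scores: all N_t individual scores on one item (pooled over teams).
    Returns the transformed scores and the category-to-value mapping.
    """
    n_t = len(scores)
    counts = Counter(scores)
    lo, hi = 0.5 / n_t, 1 - 0.5 / n_t        # end-point correction, as in (6)

    # Cumulative proportions at h_0, h_1, ..., h_c (0 at h_0 and 1 at h_c).
    cum, running = [0.0], 0
    for l in range(1, c + 1):
        running += counts.get(l, 0)
        cum.append(running / n_t)

    # Thresholds as in (4); clamping replaces the infinite end thresholds
    # by the finite values in (6) (an assumption of this sketch).
    std_normal = NormalDist()
    h = [std_normal.inv_cdf(min(max(q, lo), hi)) for q in cum]

    # Midpoint of each interval, as in (5).
    values = {l: (h[l - 1] + h[l]) / 2 for l in range(1, c + 1)}
    return [values[s] for s in scores], values

scores = [1, 2, 2, 3, 3, 3, 4, 4, 5]          # hypothetical pooled item scores
y, values = nc_transform(scores, c=5)
print(values)
```

The diversity measures are then computed from the transformed values within each team, exactly as they would be from the Likert scores.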

Applying each of the three diversity measures, $\mathrm{aad}_{ij}$, $\mathrm{aadm}_{ij}$, and $\mathrm{sd}_{ij}$, to the transformed $y_{ijk}$ yields three more measures of diversity. In the next section, their reliability and unidimensionality are examined, and the results are contrasted with those obtained based on Likert data.

Ridge Maximum Likelihood for Factor Analysis With Small Sample Sizes

As indicated in the previous section, the number of teams, $N$, plays the role of sample size when evaluating the psychometric properties of the diversity measures aad, aadm, and sd. Since it can be expensive to have a large $N$, we use ridge ML for factor analysis of the diversity measures in (1) to (3) when studying their unidimensionality. Unless all the $n_i$ are sufficiently large, the diversity measures in (1) to (3) cannot be regarded as normally distributed. As such, we expect ridge ML to work better than NML when factor analyzing the diversity measures.

Let $S$ be a sample covariance matrix of size $p$, and we are interested in modeling $\Sigma = E(S)$ by a confirmatory factor model

$$\Sigma(\theta) = \Lambda \Phi \Lambda' + \Psi, \tag{7}$$

where $\Lambda$ is a factor loading matrix, $\Phi$ is a factor correlation matrix, and $\Psi$ is a diagonal matrix of measurement errors/uniquenesses. The widely used NML procedure for covariance structure analysis is to minimize

$$F_{ML}(S, \Sigma(\theta)) = \mathrm{tr}[S\Sigma^{-1}(\theta)] - \log|S\Sigma^{-1}(\theta)| - p$$

for parameter estimation. Let $a > 0$ be a small number and $S_a = S + aI$, with $I$ being the identity matrix. The ridge ML developed in Yuan and Chan (2008) is to estimate $\theta_a$ by minimizing $F_{ML}(S_a, \Sigma(\theta_a))$, and let the estimates be denoted by $\hat{\theta}_a$. The corresponding estimates $\hat{\theta}$ for $\theta$ are obtained by subtracting $a$ from each of the elements of $\hat{\theta}_a$ corresponding to the diagonal elements of $\Psi$, leaving the other elements of $\hat{\theta}_a$ unchanged. Standard errors of $\hat{\theta}$ are obtained by a sandwich-type covariance matrix, which accounts for the unknown underlying population distribution of the involved diversity measure. As for overall model evaluation, Yuan and Chan (2008) showed that, unless $a = 0$, $T_{ML} = (N - 1)F_{ML}(S_a, \Sigma(\hat{\theta}_a))$ does not asymptotically follow the nominal chi-square distribution $\chi^2_{df}$ even if data are normally distributed. They developed a rescaled statistic $T_{RML}$ and an adjusted statistic $T_{AML}$. Parallel to the development for NML in Satorra and Bentler (1994), $T_{RML}$ asymptotically follows a distribution whose mean equals $df$, and $T_{AML}$ asymptotically follows a distribution whose mean and variance equal those of the approximating distribution. Since the details of ridge ML have already been described in Yuan and Chan (2008), no further elaboration is given here. Our purpose is to apply ridge ML to evaluate the unidimensionality of each of the three measures of diversity in (1) to (3) and to determine whether the corresponding sample covariance matrix can be reasonably fitted by a one-factor model. Following the recommendation of Yuan and Chan (2008), $a = p/N$ is used in applying the ridge ML.
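The following NumPy sketch illustrates only the ridge step described above, not a full estimation routine: it evaluates the ML discrepancy function and shows, for a constructed one-factor covariance matrix, that adding $a$ to the diagonal of $S$ is absorbed by the uniquenesses, which is why subtracting $a$ from the fitted error variances recovers the estimates on the original scale. The loadings, uniquenesses, and function names are hypothetical, and no rescaled or adjusted test statistics are computed here.

```python
import numpy as np

def f_ml(S, Sigma):
    """Normal-theory ML discrepancy: tr(S Sigma^-1) - log|S Sigma^-1| - p."""
    p = S.shape[0]
    M = S @ np.linalg.inv(Sigma)
    _, logdet = np.linalg.slogdet(M)
    return np.trace(M) - logdet - p

def one_factor_sigma(lam, psi):
    """Sigma(theta) = lam lam' + diag(psi), one factor with unit variance."""
    lam = np.asarray(lam, dtype=float)
    return np.outer(lam, lam) + np.diag(psi)

# Hypothetical parameter values; S is constructed to fit the model exactly.
lam = np.array([0.8, 0.7, 0.6, 0.5, 0.4])
psi = np.array([0.36, 0.51, 0.64, 0.75, 0.84])
S = one_factor_sigma(lam, psi)

p, N = 5, 52
a = p / N                         # ridge tuning parameter a = p/N
S_a = S + a * np.eye(p)           # ridge-adjusted covariance matrix

# S_a is reproduced by the same loadings with uniquenesses psi + a,
# so the discrepancy is zero there, while the unshifted psi misfits S_a.
psi_a = psi + a
fit_ridge = f_ml(S_a, one_factor_sigma(lam, psi_a))
fit_unshifted = f_ml(S_a, one_factor_sigma(lam, psi))

psi_hat = psi_a - a               # back-transform: subtract a from error variances
print(fit_ridge, fit_unshifted, psi_hat)
```

In practice the minimization over the loadings and error variances would be carried out numerically on $S_a$; only the back-transformation in the last step is specific to ridge ML.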

Justifying the application of factor analysis to each of the diversity measures does not require assuming that each of $\mathrm{aad}_{ij}$, $\mathrm{aadm}_{ij}$, or $\mathrm{sd}_{ij}$ is identically distributed across $i = 1, 2, \ldots, N$. The development in Lee and Shi (1998) implies that the vector $d_i = (\mathrm{aad}_{i1}, \mathrm{aad}_{i2}, \ldots, \mathrm{aad}_{ip})'$ does not need to have the same population covariance as $i$ varies. Since for both reliability and unidimensionality the analysis is based on the sample covariance matrix $S$ of the corresponding diversity measures with the assumption $E(S) = \Sigma$, our study of the psychometric properties of $d_i$ is for the population represented by the sample $d_i$, $i = 1, 2, \ldots, N$. We will further discuss this point in the concluding section.

Standard Error for Difference of Two Reliability Estimates With Correlated Samples

Among the many available estimates of reliability for equally weighted composite scores, coefficient alpha is most widely used in practice even though it can over- or underestimate the population reliability (Raykov, 1997). Another popular estimate is coefficient omega defined through the factor loadings and error variances by fitting the sample covariance matrix to a one-factor model (McDonald, 1999). Both are applicable when evaluating the reliability of the different diversity measures. Our interest is whether different diversity measures will yield significantly different reliability estimates. Thus, we need to have an estimate of the SE of the difference of two estimates of alpha or omega. When the two estimates are independent, the variance of the difference of the two estimates is simply the summation of the variances of the two estimates of alpha or omega. However, with respect to the three diversity measures, the variance or SE of the difference of two estimates of alpha or omega depends on their correlation. Since the SE for the difference of two reliability estimates with correlated samples will facilitate comparison of reliabilities in other contexts, and the literature to date does not contain such a development, we provide more details for obtaining consistent SEs of the difference of two estimates of alpha and omega, respectively. We also present the necessary notation and formulas for calculating the SEs. The complete details leading to the calculation formulas are given in Appendices A and B.

Let $S = (s_{jk})$ be a sample covariance matrix of size $p$, and $s = \mathrm{vech}(S)$ be the vector obtained by stacking the elements in the lower-triangular part of $S$. Then, with $p^* = p(p + 1)/2$, $s$ is a $p^* \times 1$ vector, and the sample coefficient alpha is given by

$$\hat{\alpha} = g(s) = \frac{p}{p - 1}\left(1 - \frac{\sum_{j = 1}^{p} s_{jj}}{\sum_{j = 1}^{p} \sum_{k = 1}^{p} s_{jk}}\right) = \frac{p}{p - 1}\left(1 - \frac{a's}{b's}\right),$$

where $a$ is a $p^* \times 1$ vector whose elements are 1 corresponding to $s_{jj}$ and 0 elsewhere; and $b$ is also a $p^* \times 1$ vector whose elements are 1 corresponding to $s_{jj}$ and 2 corresponding to $s_{jk}$ when $j \neq k$. For example, at $p = 3$, $s = (s_{11}, s_{21}, s_{31}, s_{22}, s_{32}, s_{33})'$, $a = (1, 0, 0, 1, 0, 1)'$, and $b = (1, 2, 2, 1, 2, 1)'$. We need to have the Jacobian matrix or the matrix of derivatives of $g(s)$ with respect to the elements of $s$, and it is given by

$$\dot{g}(s) = -\frac{p}{p - 1}\left[\frac{1}{b's}\, a - \frac{a's}{(b's)^2}\, b\right].$$
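As a small illustration of the vech-based formulas above, the following pure-Python sketch builds the vectors $a$ and $b$, evaluates coefficient alpha as $g(s)$, and computes the gradient $\dot{g}(s)$. The function names and the $3 \times 3$ covariance matrix are hypothetical.

```python
def lower_triangle_positions(p):
    """(row, column) positions of vech(S): lower triangle, column by column."""
    return [(j, k) for k in range(p) for j in range(k, p)]

def vech(S):
    """Stack the lower-triangular elements of a square matrix S."""
    return [S[j][k] for j, k in lower_triangle_positions(len(S))]

def a_b_vectors(p):
    """a picks the variances; b weights variances by 1 and covariances by 2."""
    positions = lower_triangle_positions(p)
    a = [1 if j == k else 0 for j, k in positions]
    b = [1 if j == k else 2 for j, k in positions]
    return a, b

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def alpha(s, p):
    """Coefficient alpha written as g(s) = p/(p-1) * (1 - a's / b's)."""
    a, b = a_b_vectors(p)
    return p / (p - 1) * (1 - dot(a, s) / dot(b, s))

def alpha_gradient(s, p):
    """Derivatives of g(s) with respect to each element of s."""
    a, b = a_b_vectors(p)
    a_s, b_s = dot(a, s), dot(b, s)
    return [-p / (p - 1) * (ai / b_s - a_s * bi / b_s ** 2) for ai, bi in zip(a, b)]

# Hypothetical covariance matrix of three diversity items.
S = [[1.0, 0.5, 0.4],
     [0.5, 1.0, 0.3],
     [0.4, 0.3, 1.0]]
s = vech(S)
print(a_b_vectors(3))
print(alpha(s, 3), alpha_gradient(s, 3))
```

For this matrix, $a's = 3$ and $b's = 5.4$, so alpha is $1.5 \times (1 - 3/5.4) = 2/3$.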

With $s_1 = \mathrm{vech}(S_1)$ and $s_2 = \mathrm{vech}(S_2)$ from two correlated samples, the standard error for $\hat{\alpha}_2 - \hat{\alpha}_1 = g(s_2) - g(s_1)$ also involves the variance-covariance matrices of $s_1$ and $s_2$. Denote these by $\Gamma_{11} = \mathrm{Var}(\sqrt{n}\, s_1)$, $\Gamma_{22} = \mathrm{Var}(\sqrt{n}\, s_2)$, and $\Gamma_{12} = \mathrm{Cov}(\sqrt{n}\, s_1, \sqrt{n}\, s_2)$, where $n = N - 1$. These are consistently estimated by their sample counterparts, with details given in Appendix A. With the introduced notation, the result given in Appendix B implies that $\sqrt{n}\,[(\hat{\alpha}_2 - \alpha_2) - (\hat{\alpha}_1 - \alpha_1)]$ is asymptotically normally distributed with mean zero and variance consistently estimated by

$$\hat{\tau}^2_{\alpha} = \dot{g}'(s_1)\hat{\Gamma}_{11}\dot{g}(s_1) + \dot{g}'(s_2)\hat{\Gamma}_{22}\dot{g}(s_2) - 2\dot{g}'(s_1)\hat{\Gamma}_{12}\dot{g}(s_2). \tag{8}$$

It follows from (8) that the SE of $(\hat{\alpha}_2 - \hat{\alpha}_1)$ is consistently estimated by $\hat{\tau}_{\alpha}/\sqrt{n}$, which will be used in the next section when evaluating the significance of the difference $(\hat{\alpha}_2 - \hat{\alpha}_1)$. A confidence interval (CI) for $\alpha_2 - \alpha_1$ with confidence level $(1 - 2\beta)$ can be obtained as

$$[\hat{\alpha}_2 - \hat{\alpha}_1 - c_{\beta}\hat{\tau}_{\alpha}/\sqrt{n},\ \hat{\alpha}_2 - \hat{\alpha}_1 + c_{\beta}\hat{\tau}_{\alpha}/\sqrt{n}],$$

where $c_{\beta}$ is the critical value corresponding to probability $1 - \beta$ under the standard normal curve.
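The calculation in (8) and the resulting CI can be sketched with NumPy as follows. The estimator of the $\Gamma$ matrices used here, the sample covariance matrix of the vech of the centered cross-products, is a common distribution-free choice and is an assumption of this sketch; the exact estimator is the one given in Appendix A, which is not reproduced here. The function names and the simulated data are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def vech_positions(p):
    """(row, column) positions of vech(S): lower triangle, column by column."""
    return [(j, k) for k in range(p) for j in range(k, p)]

def alpha_and_gradient(S):
    """Coefficient alpha g(s) and its gradient with respect to s = vech(S)."""
    p = S.shape[0]
    pos = vech_positions(p)
    s = np.array([S[j, k] for j, k in pos])
    a = np.array([1.0 if j == k else 0.0 for j, k in pos])
    b = np.array([1.0 if j == k else 2.0 for j, k in pos])
    a_s, b_s = a @ s, b @ s
    alpha = p / (p - 1) * (1 - a_s / b_s)
    grad = -p / (p - 1) * (a / b_s - a_s * b / b_s ** 2)
    return alpha, grad

def vech_cross_products(D):
    """Row i holds vech of (d_i - mean)(d_i - mean)' for team i."""
    Z = D - D.mean(axis=0)
    return np.column_stack([Z[:, j] * Z[:, k] for j, k in vech_positions(D.shape[1])])

def alpha_difference(D1, D2, beta=0.025):
    """Difference of two alphas from correlated samples, its SE, and a CI."""
    n = D1.shape[0] - 1
    alpha1, g1 = alpha_and_gradient(np.cov(D1, rowvar=False))
    alpha2, g2 = alpha_and_gradient(np.cov(D2, rowvar=False))
    V1, V2 = vech_cross_products(D1), vech_cross_products(D2)
    q = V1.shape[1]
    # Assumed estimator: joint sample covariance of both sets of cross-products.
    G = np.cov(np.hstack([V1, V2]), rowvar=False)
    G11, G22, G12 = G[:q, :q], G[q:, q:], G[:q, q:]
    tau2 = g1 @ G11 @ g1 + g2 @ G22 @ g2 - 2 * (g1 @ G12 @ g2)   # as in (8)
    se = np.sqrt(max(tau2, 0.0) / n)       # guard against tiny negative rounding
    crit = NormalDist().inv_cdf(1 - beta)  # c_beta for a (1 - 2*beta) interval
    diff = alpha2 - alpha1
    return diff, se, (diff - crit * se, diff + crit * se)

# Hypothetical data: two diversity measures on the same 52 teams and 5 items.
rng = np.random.default_rng(0)
D1 = rng.normal(size=(52, 5)) + rng.normal(size=(52, 1))
D2 = D1 + 0.3 * rng.normal(size=(52, 5))
print(alpha_difference(D1, D2))
```

Because both data matrices come from the same teams, the cross term involving $\hat{\Gamma}_{12}$ is what distinguishes this SE from the one that would apply to independent samples.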

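We next consider the sample coefficient omega, which is defined through the estimates of a one-factor model. With $p$ items, the covariance structure of the one-factor model can be represented by (7), where $\Lambda = \lambda = (\lambda_1, \lambda_2, \ldots, \lambda_p)'$ is a vector of factor loadings, and $\Psi = \mathrm{diag}(\psi_{11}, \psi_{22}, \ldots, \psi_{pp})$ is a diagonal matrix of error variances. Let $\hat{\theta} = (\hat{\lambda}_1, \hat{\lambda}_2, \ldots, \hat{\lambda}_p, \hat{\psi}_{11}, \hat{\psi}_{22}, \ldots, \hat{\psi}_{pp})'$ be the ridge ML estimates for the one-factor model. Then the sample coefficient omega is given by

$$\hat{\omega} = h(\hat{\theta}) = \frac{(1_p'\hat{\lambda})^2}{(1_p'\hat{\lambda})^2 + \mathrm{tr}(\hat{\Psi})},$$

where $1_p$ is a vector of $p$ ones. With two covariance matrices $S_1$ and $S_2$, the ridge tuning parameters $a_1$ and $a_2$ can be different. We set them equal ($a_1 = a_2 = a = p/N$) in our study and denote $S_{a1} = S_1 + aI$ and $S_{a2} = S_2 + aI$. Let the parameter estimates obtained by minimizing $F_{ML}(S_{a1}, \Sigma(\theta_{a1}))$ and $F_{ML}(S_{a2}, \Sigma(\theta_{a2}))$ be denoted as $\hat{\theta}_{a1}$ and $\hat{\theta}_{a2}$, respectively. We need to introduce additional notation for presenting the SE of $\hat{\omega}_2 - \hat{\omega}_1 = h(\hat{\theta}_2) - h(\hat{\theta}_1)$.

Let $\mathrm{vec}(S)$ be the vector obtained by stacking all the columns of $S$ and $s = \mathrm{vech}(S)$. Then there exists a $p^2 \times p^*$ matrix $D_p$ such that $\mathrm{vec}(S) = D_p \mathrm{vech}(S)$, and $D_p$ is called the duplication matrix (e.g., Schott, 2005). Notice that the covariance structure $\Sigma(\theta)$ used in fitting $S_1$ and $S_2$ is the same; the difference between fitting the two samples lies in the parameter estimates. One is $\hat{\theta}_1$ and the other is $\hat{\theta}_2$; these are obtained by subtracting $a$ from the elements of $\hat{\theta}_{a1}$ and $\hat{\theta}_{a2}$ corresponding to each error variance, respectively. Let the Jacobian matrices of $\sigma(\theta) = \mathrm{vech}[\Sigma(\theta)]$ and $h(\theta)$ be denoted by $\dot{\sigma}(\theta) = \partial\sigma(\theta)/\partial\theta'$ and $\dot{h}(\theta)$ (see Yuan & Bentler, 2002), respectively; $W(\theta) = 2^{-1}D_p'[\Sigma^{-1}(\theta) \otimes \Sigma^{-1}(\theta)]D_p$ and $\hat{C}_{aj} = W(\hat{\theta}_{aj})\dot{\sigma}(\hat{\theta}_{aj})[\dot{\sigma}'(\hat{\theta}_{aj})W(\hat{\theta}_{aj})\dot{\sigma}(\hat{\theta}_{aj})]^{-1}$. Then Appendix B shows that $\sqrt{n}\,[(\hat{\omega}_2 - \hat{\omega}_1) - (\omega_2 - \omega_1)]$ is asymptotically normally distributed with mean zero and a variance that can be consistently estimated by

$$\hat{\tau}^2_{\omega} = \dot{h}'(\hat{\theta}_1)\hat{\Omega}_{11}\dot{h}(\hat{\theta}_1) + \dot{h}'(\hat{\theta}_2)\hat{\Omega}_{22}\dot{h}(\hat{\theta}_2) - 2\dot{h}'(\hat{\theta}_1)\hat{\Omega}_{12}\dot{h}(\hat{\theta}_2), \tag{9}$$

with $\hat{\Omega}_{jk} = \hat{C}_{aj}'\hat{\Gamma}_{jk}\hat{C}_{ak}$, $j, k = 1, 2$. The result in (9) allows us to evaluate the significance of $(\hat{\omega}_2 - \hat{\omega}_1)$. Alternatively, we can obtain a $(1 - 2\beta)$-level confidence interval (CI) for $(\omega_2 - \omega_1)$ as

$$[\hat{\omega}_2 - \hat{\omega}_1 - c_{\beta}\hat{\tau}_{\omega}/\sqrt{n},\ \hat{\omega}_2 - \hat{\omega}_1 + c_{\beta}\hat{\tau}_{\omega}/\sqrt{n}].$$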
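Coefficient omega itself is simple to compute once the one-factor estimates are available. The sketch below, with hypothetical loadings, error variances, and function names, evaluates omega from the loadings and error variances and shows the back-transformation from ridge estimates, in which only the error variances are shifted by $a$. It does not implement the SE in (9), which requires the quantities developed in the appendices.

```python
def omega(loadings, error_variances):
    """Coefficient omega for a one-factor model with unit factor variance."""
    common = sum(loadings) ** 2                  # (1' lambda)^2
    return common / (common + sum(error_variances))

def ridge_to_original(loadings_a, error_variances_a, a):
    """Keep the loadings; subtract the ridge constant a from each error variance."""
    return list(loadings_a), [psi - a for psi in error_variances_a]

# Hypothetical ridge ML estimates for p = 5 items and N = 52 teams.
p, N = 5, 52
a = p / N
loadings_a = [0.8, 0.7, 0.6, 0.5, 0.4]
error_variances_a = [0.36 + a, 0.51 + a, 0.64 + a, 0.75 + a, 0.84 + a]

loadings, error_variances = ridge_to_original(loadings_a, error_variances_a, a)
print(omega(loadings, error_variances))
```

With these values the loadings sum to 3.0 and the error variances to 3.1, so omega is $9/12.1$, or about .744.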

Before ending this section, we note that the validity of SEs for $\hat{\alpha}_2 - \hat{\alpha}_1$ and $\hat{\omega}_2 - \hat{\omega}_1$ in this subsection does not need the normality assumption.

Psychometric Analysis of Diversity Measures With Entrepreneurial Teams

The data are part of a longitudinal study examining the impact of team attributes and team process on team performance (Deng, Ye, & Xie, 2013). Participants for the study are members nested within teams, which are distributed across provinces in the well-developed Eastern part of China and include Beijing. Because diversity is known to affect team performance, 13 items measuring diversity were administered to each team member starting from the first wave. The English version of the 13 diversity items is included in Appendix C, and each participant was asked to endorse each item using a 5-point Likert scale (1 = strongly disagree, 2 = somewhat disagree, 3 = neutral/no opinion, 4 = somewhat agree, 5 = strongly agree). In the design, the first five items are about the information/background of team members and are used to measure information diversity. The last eight items are about their opinions and measure underlying diversity. Following the design, we separate the five information-diversity items from the eight underlying-diversity items when studying their psychometric properties of reliability and unidimensionality. It is also worth noting that, although Items 4, 6, 7, 8, 11, and 12 are phrased in the opposite direction from the other seven items, reversing them or not in the analysis does not affect the values of $\mathrm{aad}_{ij}$, $\mathrm{aadm}_{ij}$, or $\mathrm{sd}_{ij}$, since each of the diversity measures uses absolute values of centralized scores or score differences.
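The invariance to reverse coding noted above can be checked numerically. In this minimal sketch the team scores are hypothetical, and reversal on the 5-point scale is taken to be the usual mapping of a score $x$ to $6 - x$, which is an assumption since the text does not spell out the reversal rule.

```python
from itertools import combinations
from statistics import mean, stdev

def aad(x):
    """Average absolute distance over all distinct pairs of team members."""
    pairs = list(combinations(x, 2))
    return sum(abs(u - v) for u, v in pairs) / len(pairs)

def aadm(x):
    """Average absolute deviation from the team mean with divisor n - 1."""
    m = mean(x)
    return sum(abs(u - m) for u in x) / (len(x) - 1)

team = [1, 2, 2, 5]                       # hypothetical scores on a 5-point item
reversed_team = [6 - v for v in team]     # assumed reverse coding: x -> 6 - x

# Each pair compares the measure before and after reversal.
print(aad(team), aad(reversed_team))
print(aadm(team), aadm(reversed_team))
print(stdev(team), stdev(reversed_team))  # stdev uses the n - 1 divisor
```

Reversal changes the sign of every deviation and pairwise difference but not its absolute value or square, so all three measures are unchanged.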

There are a total of four waves of data in the longitudinal study of Deng et al. (2013). However, the majority of the participants showed little change in their answers to the 13 diversity items across the waves. Consequently, our analysis uses data from only the first wave. It should also be noted that there were many teams in which team members provided identical answers when endorsing each of the 13 items (resulting in a diversity value of 0 on all items), and these data are not included in our analysis. In summary, the study in this section is based on $N_t = 177$ individual participants from $N = 52$ teams, with the number of participants in a team ranging from 2 to 5.

As described in the second section, diversity measures $\mathrm{aad}_{ij}$, $\mathrm{aadm}_{ij}$, and $\mathrm{sd}_{ij}$ are obtained based on the Likert data and NC-transformed data for each of the 13 diversity items, respectively. These are also referred to as samples in the following discussion.

Distribution Properties

Before evaluating the reliability and unidimensionality of these measures, it is informative to check their distribution characteristics. In particular, we want to know whether the NC transformation has any effect on the distribution of the diversity measures. Table 1 contains the sample marginal skewness and excess kurtosis of each diversity measure for each of the 13 items. The absolute averages (aave) of the sample skewness and excess kurtosis across the 13 items are also reported at the bottom of the table. According to Table 34C of Pearson and Hartley (1954), at sample size

Table 1. Sample Skewness and Excess Kurtosis of the Three Measures of Diversities: Average of Absolute Distance (aad) Among Team Members, Average of Absolute Deviation From the Mean (aadm) of Team Members, and Standard Deviation (sd) Among Team Members.

Sample skewness

              Likert data                Transformed data
Item      aad     aadm       sd       aad     aadm       sd
1       1.640    1.725    1.294      .832     .859     .363
2        .827    1.024     .655      .618     .780     .385
3       1.028    1.225     .694      .681     .814     .375
4        .425     .778     .122      .505     .859     .202
5       1.057    1.418     .616     1.189    1.493     .707
6        .724     .812     .534      .705     .826     .485
7        .547     .646     .403      .588     .758     .317
8        .951    1.266     .613      .834    1.122     .492
9        .614     .874     .329      .349     .535     .106
10       .603     .818     .269      .543     .773     .219
11       .262     .386    -.034      .302     .434    -.011
12       .863    1.158     .587      .897    1.192     .583
13      1.693    1.903    1.357     1.666    1.880    1.297
aave     .864    1.079     .577      .747     .948     .426

Sample excess kurtosis

              Likert data                Transformed data
Item      aad     aadm       sd       aad     aadm       sd
1       2.187    2.588    1.074     -.071     .039    -.984
2        .175    1.133    -.515     -.654    -.162   -1.032
3        .588    1.246    -.339     -.356    -.021    -.803
4        .278     .913    -.080      .394    1.092    -.015
5       2.272    3.279     .966     2.962    3.751    1.689
6       -.669    -.411    -.844     -.672    -.345    -.936
7       -.242     .032    -.272     -.065     .371    -.412
8       1.339    2.851     .241      .954    2.044     .110
9       -.300     .456    -.874     -.972    -.522   -1.313
10      -.252     .244    -.625     -.339     .220    -.719
11      -.047     .304    -.484     -.036     .310    -.470
12       .839    2.501    -.122      .810    2.172    -.144
13      3.240    3.907    2.257     3.223    3.917    2.145
aave     .956    1.528     .669      .885    1.151     .829

Deng et al. 11

by guest on July 5, 2014epm.sagepub.comDownloaded from

50, sample skewness is statistically significant at 2% or 10% level if its absolute

value is greater than .787 or .533, respectively. At N = 52, these critical values are

slightly smaller. It is clear that multiple entries of sample skewness in the top panel

of Table 1 are greater than .787. This is because each of the diversity measures is

obtained using absolute values or square root of a summation of squared deviations,

and such kinds of measures tend to have longer right tail. The values of sample skew-

ness in Table 1 suggest that NC transformation does make the resulting diversity

measures less skewed on average, although not all the values following the

Table 2. Test Statistics by Fitting One-Factor Model to Each of the Three Measures of Information and Underlying Diversities: Average of Absolute Distance (aad) Among Team Members, Average of Absolute Deviation From the Mean (aadm) of Team Members, and Standard Deviation (sd) Among Team Members.

(a) Information diversity (5 items, df = 5)

Likert data Transformed data

Statistic p Value Statistic p Value

aad TML 8.450 .133 5.299 .380 TRML 8.872 .114 5.578 .349 TAML 5.160 .151 3.587 .343 dfAML 2.908 3.216

aadm TML 8.663 .123 5.394 .370 TRML 8.240 .143 5.388 .370 TAML 4.606 .179 3.548 .360 dfAML 2.794 3.293

sd TML 6.039 .302 3.770 .583 TRML 9.397 .094 5.711 .335 TAML 5.612 .131 3.580 .331 dfAML 2.986 3.134

(b) Underlying diversity (8 items, df = 20)

Likert data Transformed data

Statistic p Value Statistic p Value

aad TML 24.335 .228 28.521 .098 TRML 31.084 .054 28.682 .094 TAML 6.893 .177 7.546 .205 dfAML 4.435 5.262

aadm TML 24.193 .234 28.972 .088 TRML 35.471 .018 29.692 .075 TAML 9.685 .108 8.822 .180 dfAML 5.461 5.943

sd TML 13.196 .869 17.696 .607 TRML 30.545 .061 28.402 .100 TAML 6.958 .183 7.806 .208 dfAML 4.556 5.497

12 Educational and Psychological Measurement

by guest on July 5, 2014epm.sagepub.comDownloaded from

transformation become smaller. Comparing among the three diversity measures, the

values of sample skewness corresponding to sdij are uniformly the smallest while

those corresponding to aadmij are uniformly the largest, suggesting that different

diversity measures have different distributional shapes.

The lower panel of Table 1 contains the marginal sample excess kurtosis of each of the diversity measures. According to Pearson and Hartley (1954, Table 34C), at N = 50, sample kurtosis is significantly different from that of a normal distribution (whose excess kurtosis equals 0) at the 2% or 10% level if its value is outside the interval [−1.05, 1.88] or [−.85, .99], respectively. At N = 52, the endpoints of these intervals move slightly toward the center. As with skewness, multiple entries of sample excess kurtosis are outside the two intervals. While the kurtosis values of $aad_{ij}$ and $aadm_{ij}$ become smaller on average following the NC transformation, the average kurtosis of $sd_{ij}$ becomes greater. None of the diversity measures has uniformly the smallest excess kurtosis, although the absolute average for $sd_{ij}$ with the Likert data is the smallest.

The results in Table 1 suggest that, on average, sd has the smallest skewness and kurtosis with either the Likert data or the NC-transformed data. We note that the sample skewness and excess kurtosis for the 13th item are still significant at the .02 level. Thus, NML-based SEs (see, e.g., Van Zyl, Neudecker, & Nel, 2000) for reliability estimates are not valid even when the sample size is large, and SEs based on the sandwich-type covariance matrix are needed. Similarly, we have to rely on the rescaled or adjusted statistics when evaluating the unidimensionality of the diversity measures using factor analysis.
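To make the quantities summarized in Table 1 concrete, the following is a minimal sketch of how the three team-level diversity measures and their marginal skewness and excess kurtosis might be computed. It rests on assumptions that are not taken from the article: sd is computed with divisor n − 1, skewness and excess kurtosis use the simple moment-based forms $m_3 / m_2^{3/2}$ and $m_4 / m_2^2 - 3$, and the ratings in `teams` are made-up illustrative values.

```python
from itertools import combinations
from math import sqrt


def aad(ratings):
    """Average absolute distance over all pairs of team members."""
    pairs = list(combinations(ratings, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def aadm(ratings):
    """Average absolute deviation of the ratings from the team mean."""
    mean = sum(ratings) / len(ratings)
    return sum(abs(x - mean) for x in ratings) / len(ratings)


def sd(ratings):
    """Standard deviation among team members (assumed divisor: n - 1)."""
    mean = sum(ratings) / len(ratings)
    return sqrt(sum((x - mean) ** 2 for x in ratings) / (len(ratings) - 1))


def central_moment(values, order):
    """Sample central moment of the given order (divisor: number of values)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based sample skewness m3 / m2^(3/2) (an assumed estimator)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based sample excess kurtosis m4 / m2^2 - 3 (an assumed estimator)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical ratings of a single item by the members of four teams.
teams = [[1, 2, 4, 5], [3, 3, 3, 4], [2, 2, 5, 5], [1, 1, 1, 5]]

# One diversity score per team; the teams then play the role of observations.
sd_values = [sd(team) for team in teams]
print(skewness(sd_values), excess_kurtosis(sd_values))
```

With real data, each of the 13 items would yield one such vector of N = 52 team-level scores per diversity measure, and the skewness and kurtosis of each vector would correspond to one cell of Table 1.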

Unidimensionality

Because the first five items are designed to measure information diversity and the last eight items are designed to measure underlying diversity, we would like to see whether some or all of the diversity measures on the first five or last eight items follow a one-factor model. If some or all of them follow a one-factor model, then we may choose the most reliable measure for future applications. If none of them follows a one-factor model, then we may need to further study the dimensionality of these diversity measures to better understand their factor structure as well as their relationship with the content of the items from which these diversity measures are derived. Only after the factor structures of $aad_{ij}$, $aadm_{ij}$, or $sd_{ij}$ are well understood can we make better use of these diversity measures.

Since N = 52 plays the role of sample size, which may not be sufficiently large for factor analysis, we use ridge ML for more reliable parameter estimates and overall model evaluation. Following the recommendation of Yuan and Chan (2008), the ridge parameter is chosen as $a = p/N = 5/52$ when studying the five items of information diversity, and $a = 8/52$ when studying the eight items of underlying diversity. Fitting the one-factor model by ridge ML to the original Likert data as well as the NC-transformed data with five and eight items, respectively, we obtained the rescaled and adjusted test statistics, $T_{RML}$ and $T_{AML}$, together with their associated p values; these are reported in Table 2. The degrees of freedom for the adjusted statistic, $df_{AML}$, are also included to better understand the value of $T_{AML}$. The statistic $T_{ML}$ is reported as well for comparison purposes.
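The choice of the ridge parameter can be illustrated with a short sketch. It assumes, following our reading of Yuan and Chan (2008), that ridge ML replaces the sample covariance matrix $S$ by $S_a = S + aI$ before normal-theory ML fitting; the function name `ridge_covariance` and the example matrix are hypothetical, and the fitting step itself is not shown.

```python
def ridge_covariance(S, n_obs):
    """Return (S_a, a), where S_a = S + a*I and a = p / N.

    S is a p x p covariance matrix given as a list of lists and n_obs is
    the number of observations N (here, the number of teams).
    """
    p = len(S)
    a = p / n_obs
    S_a = [[S[i][j] + (a if i == j else 0.0) for j in range(p)]
           for i in range(p)]
    return S_a, a


# Hypothetical 5 x 5 covariance matrix for the information diversity items.
S = [[1.0 if i == j else 0.3 for j in range(5)] for i in range(5)]
S_a, a = ridge_covariance(S, n_obs=52)
print(round(a, 4))  # 0.0962, that is, 5/52
```

Only the diagonal of the matrix changes, which moves a near-singular $S$ away from singularity while leaving the covariances untouched.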

The statistics $T_{RML}$ and $T_{AML}$ for information diversity in the top panel of Table 2 suggest that, except for the marginal fit for the measure sd with the Likert data, the samples are well fitted by the one-factor model. The results also suggest that the fit to each of the three diversity measures with the NC-transformed data is a lot better than its counterpart obtained with the Likert data.

Table 3. Ridge ML Estimates of Factor Loadings and Error Variances by Fitting One-Factor Model to Each of the Three Measures of Information (Items 1-5) and Underlying (Items 6-13) Diversities With Likert Data: Average of Absolute Distance (aad) Among Team Members, Average of Absolute Deviation From the Mean (aadm) of Team Members, and Standard Deviation (sd) Among Team Members.

Information diversity (Items 1-5)

                  aad                         aadm                        sd
$\theta$          $\hat{\theta}$  SE    z     $\hat{\theta}$  SE    z     $\hat{\theta}$  SE    z
$\lambda_1$       .294    .175    1.676       .307    .185    1.657       .218    .108    2.013
$\lambda_2$       .408    .160    2.547       .375    .164    2.284       .353    .109    3.253
$\lambda_3$       .799    .184    4.348       .760    .193    3.946       .609    .109    5.593
$\lambda_4$       .148    .100    1.482       .154    .101    1.525       .121    .072    1.678
$\lambda_5$       .409    .126    3.249       .391    .134    2.926       .314    .081    3.858
$\psi_{11}$       .565    .148    4.484       .537    .147    4.296       .303    .072    5.579
$\psi_{22}$       .371    .107    4.383       .323    .097    4.335       .210    .063    4.832
$\psi_{33}$       .049    .218    .668        .065    .213    .756        .032    .098    1.316
$\psi_{44}$       .424    .102    5.089       .399    .110    4.491       .218    .049    6.379
$\psi_{55}$       .500    .160    3.737       .487    .165    3.542       .271    .077    4.765

Underlying diversity (Items 6-13)

                  aad                         aadm                        sd
$\theta$          $\hat{\theta}$  SE    z     $\hat{\theta}$  SE    z     $\hat{\theta}$  SE    z
$\lambda_6$       .425    .105    4.059       .459    .108    4.237       .317    .074    4.293
$\lambda_7$       .400    .085    4.679       .251    .114    2.207       .315    .061    5.199
$\lambda_8$       .400    .139    2.871       .203    .156    1.300       .340    .081    4.208
$\lambda_9$       .241    .063    3.816       .184    .084    2.180       .202    .050    4.030
$\lambda_{10}$    .265    .077    3.452       .392    .100    3.931       .208    .055    3.820
$\lambda_{11}$    .247    .060    4.130       .177    .091    1.934       .224    .046    4.918
$\lambda_{12}$    .315    .131    2.411       .215    .145    1.480       .271    .081    3.358
$\lambda_{13}$    .358    .084    4.271       .542    .126    4.287       .256    .055    4.661
$\psi_{66}$       .326    .087    5.507       .240    .091    4.345       .185    .044    7.732
$\psi_{77}$       .186    .045    7.518       .244    .053    7.449       .102    .025    10.363
$\psi_{88}$       .268    .082    5.162       .331    .084    5.751       .133    .041    6.995
$\psi_{99}$       .285    .058    7.585       .270    .066    6.370       .160    .030    10.471
$\psi_{10,10}$    .286    .057    7.708       .180    .058    5.756       .148    .028    10.634
$\psi_{11,11}$    .160    .040    7.740       .170    .036    9.007       .085    .020    11.729
$\psi_{12,12}$    .353    .064    7.960       .331    .076    6.401       .193    .035    9.905
$\psi_{13,13}$    .621    .231    3.348       .441    .157    3.791       .322    .117    4.074

The statistic $T_{AML}$ in the lower panel of Table 2 suggests that the fit of the one-factor model to each of the samples with underlying diversity is reasonable. However, the statistic $T_{RML}$ suggests that the fit is marginal, although the p values corresponding to $T_{RML}$ under the transformed data are uniformly larger. Similar to the results displayed in the top panel of Table 2, all the samples under the NC transformation are fitted uniformly better by the one-factor model according to $T_{AML}$. However, unlike in the top panel, where $sd_{ij}$ is fitted least well by the one-factor model, $aadm_{ij}$ is fitted least well in the lower panel.

Table 4. Ridge ML Estimates of Factor Loadings and Error Variances by Fitting One-Factor Model to Each of the Three Measures of Information (Items 1-5) and Underlying (Items 6-13) Diversities With NC-Transformed Data: Average of Absolute Distance (aad) Among Team Members, Average of Absolute Deviation From the Mean (aadm) of Team Members, and Standard Deviation (sd) Among Team Members.

Information diversity (Items 1-5)

                  aad                         aadm                        sd
$\theta$          $\hat{\theta}$  SE    z     $\hat{\theta}$  SE    z     $\hat{\theta}$  SE    z
$\lambda_1$       .192    .141    1.362       .178    .144    1.236       .187    .095    1.978
$\lambda_2$       .300    .132    2.279       .266    .127    2.087       .269    .091    2.968
$\lambda_3$       .675    .206    3.283       .682    .228    2.991       .484    .107    4.539
$\lambda_4$       .192    .129    1.485       .196    .126    1.557       .154    .096    1.598
$\lambda_5$       .317    .122    2.604       .306    .126    2.438       .244    .080    3.065
$\psi_{11}$       .583    .113    6.022       .577    .111    6.046       .323    .055    7.663
$\psi_{22}$       .311    .078    5.206       .288    .070    5.519       .168    .046    5.719
$\psi_{33}$       .052    .236    .625        .013    .269    .407        .054    .085    1.765
$\psi_{44}$       .520    .138    4.472       .489    .147    3.994       .267    .068    5.350
$\psi_{55}$       .439    .139    3.856       .434    .143    3.696       .230    .066    4.968

Underlying diversity (Items 6-13)

                  aad                         aadm                        sd
$\theta$          $\hat{\theta}$  SE    z     $\hat{\theta}$  SE    z     $\hat{\theta}$  SE    z
$\lambda_6$       .473    .116    4.094       .510    .120    4.255       .346    .080    4.356
$\lambda_7$       .490    .118    4.132       .379    .146    2.590       .367    .078    4.723
$\lambda_8$       .416    .166    2.505       .261    .184    1.417       .344    .096    3.562
$\lambda_9$       .305    .100    3.048       .257    .107    2.409       .267    .076    3.512
$\lambda_{10}$    .348    .117    2.984       .438    .133    3.292       .274    .080    3.420
$\lambda_{11}$    .333    .086    3.865       .266    .115    2.323       .304    .065    4.688
$\lambda_{12}$    .356    .155    2.291       .281    .170    1.646       .310    .094    3.288
$\lambda_{13}$    .386    .089    4.345       .508    .123    4.124       .279    .056    4.978
$\psi_{66}$       .402    .114    4.898       .306    .114    4.027       .227    .056    6.778
$\psi_{77}$       .310    .083    5.575       .368    .095    5.500       .169    .043    7.566
$\psi_{88}$       .409    .105    5.336       .458    .105    5.807       .205    .052    6.926
$\psi_{99}$       .606    .104    7.315       .553    .116    6.115       .341    .054    9.227
$\psi_{10,10}$    .539    .101    6.843       .419    .107    5.350       .282    .052    8.443
$\psi_{11,11}$    .332    .084    5.791       .337    .077    6.396       .175    .042    7.797
$\psi_{12,12}$    .527    .098    6.964       .488    .107    6.003       .282    .051    8.555
$\psi_{13,13}$    .724    .262    3.354       .595    .208    3.609       .376    .131    4.031

It is interesting to note that some of the statistics $T_{RML}$ in Table 2 are several times the corresponding $T_{AML}$, and so are their corresponding degrees of freedom. This is because the measures of diversity are not normally distributed. In particular, when data are far from symmetrically distributed, $T_{AML}$ may differ substantially from $T_{RML}$, and $df_{AML}$ automatically accounts for the value of $T_{AML}$ due to certain distributional characteristics of the sample. The statistic $T_{ML}$ is very close to $T_{RML}$ for some samples and quite different from $T_{RML}$ for other samples. This is expected because their difference also depends on the distribution of the sample.
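The p values in Table 2 can be reproduced from the reported statistics by referring each one to a chi-square distribution, with $T_{AML}$ using the noninteger $df_{AML}$. The following is a minimal sketch under that assumption; `chi2_sf` is a hypothetical helper that evaluates the upper-tail probability through the standard power series for the regularized lower incomplete gamma function, so that fractional degrees of freedom are handled without external libraries.

```python
from math import exp, lgamma, log


def chi2_sf(x, df, max_terms=1000):
    """Upper-tail probability P(chi2_df > x); df may be noninteger.

    Uses P(a, z) = z^a e^(-z) / Gamma(a) * sum_{n>=0} z^n / (a (a+1) ... (a+n))
    with a = df / 2 and z = x / 2, and returns 1 - P(a, z).
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < total * 1e-16:
            break
    lower = total * exp(a * log(z) - z - lgamma(a))
    return 1.0 - lower


# Entries for aad with Likert data, panel (a) of Table 2.
print(round(chi2_sf(8.450, 5), 3))      # T_ML with df = 5
print(round(chi2_sf(5.160, 2.908), 3))  # T_AML with df_AML = 2.908
```

Because the table reports rounded statistics, values recomputed this way can differ from the published p values in the last digit.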

In summary, there exist differences among the test statistics regarding the unidimensionality of the three diversity measures, but the differences are not substantial. The results in Table 2 suggest that the fit with NC-transformed data is substantially better than with Likert data. The difference in p values between $T_{RML}$ and $T_{AML}$ on each sample is consistent with the literature on their application with NML (Satorra & Bentler, 1994). In particular, Bentler and Yuan (1999), Fouladi (2000), Nevitt and Hancock (2004), and Savalei (2010) found that $T_{RML}$ tends to reject correct models too often at smaller sample sizes, and results in Nevitt and Hancock (2004) and Savalei (2010) indicate that Type I error rates of $T_{AML}$ tend to be lower than the nominal level.

Table 3 contains the ridge ML estimates of factor loadings and error variances for the three diversity measures with Likert data. As with the test statistics in Table 2, there exist noticeable differences among parameter estimates. For example, the estimates of $\lambda_8$, $\lambda_{11}$, and $\lambda_{12}$ with aadm in the lower panel of Table 3 are not statistically significant at the .05 level, whereas they are significant with aad and sd. Other noticeable patterns include (a) estimates of factor loadings and error variances with the measure sd for information diversity are uniformly the smallest and (b) estimates of error variances with sd for underlying diversity are uniformly the smallest. In particular, across the 13 items, the z-statistics for sd in Table 3 are the largest with a single exception ($\lambda_{10}$, for which aadm yields z = 3.931 versus 3.820 for sd), implying that parameter estimates under sd tend to be more efficient.

Estimates of factor loadings and error variances with NC-transformed data are reported in Table 4, where again there exist noticeable differences among parameter estimates across the three samples. Error variance estimates for the measure sd with underlying diversity are still uniformly the smallest, but the pattern with information diversity is not so clear. Again, the z-statistics with the measure sd are uniformly the largest across the 13 items. Comparing Tables 3 and 4, except for $\lambda_{11}$, whose estimate with aadm is nonsignificant in Table 3 but significant in Table 4, the transformation does not change the significance status of the parameter estimates. However, the transformation does cause fluctuations in the parameter estimates: some estimates are slightly greater in Table 3, while others are slightly greater in Table 4.

Table 5. Estimates of Reliabilities Alpha and Omega for Three Measures of Information and Underlying Diversities: Average of Absolute Distance (aad) Among Team Members, Average of Absolute Deviation From the Mean (aadm) of Team Members, and Standard Deviation (sd) Among Team Members.

(a) Reliability $\alpha$ applied to sample covariance matrix

Information diversity (5 items)

                  Likert data                   Transformed data
                  $\hat{\alpha}$  SE     z      $\hat{\alpha}$  SE     z
aad               .641    .092    6.988         .556    .101    5.500
aadm              .644    .097    6.615         .554    .101    5.488
sd                .669    .078    8.610         .603    .095    6.333
aadm − aad        .003    .014    .232          −.002   .014    −.166
sd − aad          .028    .028    .989          .047    .028    1.666
sd − aadm         .025    .041    .606          .049    .040    1.229

Underlying diversity (8 items)

                  Likert data                   Transformed data
                  $\hat{\alpha}$  SE     z      $\hat{\alpha}$  SE     z
aad               .738    .074    9.953         .714    .080    8.969
aadm              .723    .080    9.015         .701    .084    8.303
sd                .774    .063    12.356        .752    .068    11.004
aadm − aad        −.016   .020    −.769         −.014   .020    −.693
sd − aad          .036    .021    1.686         .038    .023    1.665
sd − aadm         .051    .039    1.324         .052    .040    1.296

(b) Reliability $\omega$ following ridge ML

Information diversity (5 items)

                  Likert data                   Transformed data
                  $\hat{\omega}$  SE     z      $\hat{\omega}$  SE     z
aad               .689    .077    8.973         .596    .080    7.483
aadm              .686    .086    7.969         .595    .083    7.159
sd                .716    .064    11.272        .632    .077    8.248
aadm − aad        −.003   .014    −.255         −.001   .016    −.036
sd − aad          .027    .027    1.018         .036    .033    1.103
sd − aadm         .031    .039    .784          .037    .047    .780

Underlying diversity (8 items)

                  Likert data                   Transformed data
                  $\hat{\omega}$  SE     z      $\hat{\omega}$  SE     z
aad               .739    .078    9.520         .715    .081    8.845
aadm              .727    .073    9.923         .705    .080    8.851
sd                .774    .065    11.846        .751    .070    10.761
aadm − aad        −.012   .025    −.490         −.010   .021    −.488
sd − aad          .035    .022    1.582         .036    .023    1.542
sd − aadm         .047    .038    1.236         .046    .039    1.196

Reliability

Table 5 contains the estimates of alpha and omega for both information diversity and underlying diversity. Their SEs and corresponding z-statistics are also reported in the table, as are the differences of the reliability estimates for either alpha or omega, together with their SEs and corresponding z-statistics. The results in Table 5 suggest that both $\hat{\alpha}$ and $\hat{\omega}$ with the Likert data are uniformly greater than those with the transformed data, and so are the corresponding z-statistics. Except for sd with underlying diversity, every $\hat{\omega}$ is greater than the corresponding $\hat{\alpha}$.

Among the three diversity measures (aad, aadm, sd), sd always corresponds to the largest $\hat{\alpha}$ and $\hat{\omega}$ with either the Likert or the transformed data, and for either information or underlying diversity. Except for the information diversity scale with Likert data in the top left portion of Table 5, aadm always corresponds to the smallest $\hat{\alpha}$ and $\hat{\omega}$. However, the largest z-statistic for a reliability difference is always that between sd and aad (sd − aad). This is because the SEs of the difference between the estimates for sd and aad tend to be smaller than those for sd and aadm, owing to different correlations between the reliability estimates.

Although none of the differences in reliability estimates is significant at the .05 level, three differences in $\hat{\alpha}$ are significant at the .10 level: that between sd and aad for information diversity with transformed data (z = 1.666), and those between sd and aad for underlying diversity with both Likert and NC-transformed data (z = 1.686 and 1.665). Two differences in $\hat{\omega}$ between sd and aad for underlying diversity are also marginal (z = 1.582 and 1.542). As there are only N = 52 independent teams in the analysis, we would expect the difference in reliability estimates between sd and aad to become more pronounced and significant with a larger N.
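The significance statements above can be checked by converting each z-statistic into a p value. The sketch below assumes a two-sided test against the standard normal distribution, which is consistent with the asymptotic normality of the reliability differences but is our reading rather than a procedure spelled out in the text; `two_sided_p` is a hypothetical helper.

```python
from math import erfc, sqrt


def two_sided_p(z):
    """Two-sided p value of a z-statistic under the standard normal."""
    return erfc(abs(z) / sqrt(2.0))


# z-statistics for sd - aad taken from Table 5.
z_values = {
    "alpha, information, transformed": 1.666,
    "alpha, underlying, Likert": 1.686,
    "alpha, underlying, transformed": 1.665,
    "omega, underlying, Likert": 1.582,
    "omega, underlying, transformed": 1.542,
}
for label, z in z_values.items():
    p = two_sided_p(z)
    print(f"{label}: z = {z:.3f}, p = {p:.3f}, below .10: {p < 0.10}")
```

Under this convention the three differences in alpha fall between .05 and .10, while the two differences in omega are slightly above .10.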

Most $\hat{\omega}$ are greater than $\hat{\alpha}$ in Table 5. There are also exceptions, for example, the measure sd with the NC-transformed sample for underlying diversity. Such observed differences are expected since the items may not be essentially tau-equivalent or strictly unidimensional (see, e.g., Raykov, 1997).
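For readers who want to trace the reliability estimates back to the tables, the following minimal sketch computes coefficient alpha from a covariance matrix and coefficient omega from one-factor estimates, using the standard formulas $\alpha = \frac{p}{p-1}\left(1 - \mathrm{tr}(S)/\mathbf{1}'S\mathbf{1}\right)$ and $\omega = (\sum_j \lambda_j)^2 / [(\sum_j \lambda_j)^2 + \sum_j \psi_{jj}]$. The function names are hypothetical, and the small covariance matrix is made up for illustration; the loadings and error variances are the rounded sd estimates for information diversity in Table 3.

```python
def coefficient_alpha(S):
    """Cronbach's alpha from a p x p covariance matrix (list of lists)."""
    p = len(S)
    trace = sum(S[i][i] for i in range(p))
    total = sum(sum(row) for row in S)
    return p / (p - 1) * (1.0 - trace / total)


def coefficient_omega(loadings, error_variances):
    """Omega for a one-factor model with unit factor variance."""
    common = sum(loadings) ** 2
    return common / (common + sum(error_variances))


# Rounded sd estimates for Items 1-5 (Likert data) from Table 3.
loadings = [0.218, 0.353, 0.609, 0.121, 0.314]
error_variances = [0.303, 0.210, 0.032, 0.218, 0.271]
print(round(coefficient_omega(loadings, error_variances), 3))  # 0.716

# Made-up covariance matrix with unit variances and covariances of .5.
S = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
print(round(coefficient_alpha(S), 3))  # 0.75
```

The omega value computed from the rounded Table 3 estimates agrees with the entry for sd with Likert data in panel (b) of Table 5.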

Discussion and Conclusion

In this article, we described methodology for studying the unidimensionality and reliability of diversity measures with Likert data. Using a real data set, the analyses indicate that the reliability estimates corresponding to sd are the greatest. The z-statistics for the reliability estimates corresponding to sd are the largest, and so are the corresponding z-statistics for estimates of factor loadings and error variances. The SEs for these estimates are also the smallest. These results indicate that the diversity measure sd tends to yield more efficient parameter estimates than both aad and aadm. With respect to unidimensionality, there is little difference in test statistics across the three diversity measures. Thus, among the three measures of diversity, sd is the preferred measure.

Comparing the NC-transformed data with the Likert data, the transformed data are closer to the underlying values if the NC assumption holds. The transformed data are less skewed on average; diversity measures with the transformed data also tend to be more unidimensional than those with Likert data. However, reliability estimates with the transformed data tend to be smaller than those with the Likert data. Additional studies are needed to further examine the merits of the transformed data versus the Likert data.

The implication of studying the reliability and unidimensionality of the diversity measures is that there exists a model in which each measure ($aad_{ij}$, $aadm_{ij}$, or $sd_{ij}$) is linearly related to a latent diversity trait plus an error or uniqueness term. In contrast, models for studying rater reliability assume that the observed scores are linearly related to some underlying traits. Clearly, both kinds of models are hypothetical and are motivated by the need to study the reliability and/or unidimensionality of the corresponding measurements. The results obtained in this study indicate that each of the three diversity measures is fitted reasonably well by the one-factor model, and each subscale defined by these measures also has decent reliability. More research is clearly needed as to whether the model behind the diversity measures aad, aadm, or sd or the model behind the original $x_{ijk}$ or $y_{ijk}$ is more plausible. This issue might be best addressed through an analysis in which many different sets of real data are examined rather than through analytical or simulation studies.

Appendix A

This appendix gives the formula for calculating consistent estimates $\hat{\Gamma}_{jk}$ of $\Gamma_{jk} = \mathrm{Cov}(\sqrt{n}\,s_j, \sqrt{n}\,s_k)$, $j, k = 1, 2$.

Let the two different measures of diversity be denoted by $d_{i1}$ and $d_{i2}$, each a $p \times 1$ vector, $i = 1, 2, \ldots, N = n + 1$. Let $\bar{d}_j$ be the sample mean of the jth diversity measure, and $t_{ij} = \mathrm{vech}[(d_{ij} - \bar{d}_j)(d_{ij} - \bar{d}_j)']$. Then a consistent estimate of $\Gamma_{jk}$ is given by

$$\hat{\Gamma}_{jk} = \frac{1}{N} \sum_{i=1}^{N} (t_{ij} - \bar{t}_j)(t_{ik} - \bar{t}_k)',$$

where $\bar{t}_j$ and $\bar{t}_k$ are the vectors of sample means of $t_{ij}$ and $t_{ik}$, respectively.
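The estimator above can be sketched in a few lines of plain Python. This is an illustrative implementation only: the function names (`vech`, `t_vectors`, `gamma_hat`) and the column-by-column ordering of the vech operator are our assumptions, and each data set is passed as a list of N rows with p diversity scores per row.

```python
def vech(matrix):
    """Stack the lower triangle (diagonal included) column by column."""
    p = len(matrix)
    return [matrix[i][j] for j in range(p) for i in range(j, p)]


def t_vectors(data):
    """Return t_i = vech[(d_i - dbar)(d_i - dbar)'] for every row d_i."""
    n_rows, p = len(data), len(data[0])
    mean = [sum(row[c] for row in data) / n_rows for c in range(p)]
    result = []
    for row in data:
        dev = [row[c] - mean[c] for c in range(p)]
        outer = [[dev[r] * dev[c] for c in range(p)] for r in range(p)]
        result.append(vech(outer))
    return result


def gamma_hat(data_j, data_k):
    """Estimate Gamma_jk as (1/N) * sum_i (t_ij - tbar_j)(t_ik - tbar_k)'."""
    t_j, t_k = t_vectors(data_j), t_vectors(data_k)
    n_rows, q = len(t_j), len(t_j[0])
    tbar_j = [sum(t[a] for t in t_j) / n_rows for a in range(q)]
    tbar_k = [sum(t[b] for t in t_k) / n_rows for b in range(q)]
    return [[sum((t_j[i][a] - tbar_j[a]) * (t_k[i][b] - tbar_k[b])
                 for i in range(n_rows)) / n_rows
             for b in range(q)]
            for a in range(q)]


# Made-up scores of five teams on two items for one diversity measure.
scores = [[1.0, 2.0], [2.0, 1.5], [3.0, 3.5], [4.0, 3.0], [5.0, 5.0]]
G11 = gamma_hat(scores, scores)
print(len(G11), len(G11[0]))  # 3 3, since vech of a 2 x 2 matrix has 3 elements
```

Passing the same data set twice gives $\hat{\Gamma}_{jj}$, and passing the scores of two different diversity measures on the same teams gives the cross term $\hat{\Gamma}_{12}$.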

Appendix B

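Asymptotic Distributions of $\hat{\alpha}_2 - \hat{\alpha}_1$ and $\hat{\omega}_2 - \hat{\omega}_1$

This appendix shows that both $\hat{\alpha}_2 - \hat{\alpha}_1$ and $\hat{\omega}_2 - \hat{\omega}_1$ are asymptotically normally distributed and gives the formulas for calculating consistent estimates of their variances.

In the Methodology section we introduced the notation $\hat{\alpha}_1 = g(s_1)$ and $\hat{\alpha}_2 = g(s_2)$. Their population counterparts are given by $\alpha_1 = g(\sigma_1)$ and $\alpha_2 = g(\sigma_2)$. It follows from standard asymptotics that

$$\begin{aligned}
\sqrt{n}\,[(\hat{\alpha}_2 - \alpha_2) - (\hat{\alpha}_1 - \alpha_1)] &= \sqrt{n}\,[g(s_2) - g(\sigma_2)] - \sqrt{n}\,[g(s_1) - g(\sigma_1)] \\
&= \dot{g}'(\sigma_2)\sqrt{n}\,(s_2 - \sigma_2) - \dot{g}'(\sigma_1)\sqrt{n}\,(s_1 - \sigma_1) + o_p(1),
\end{aligned} \tag{B1}$$

where $o_p(1)$ denotes a term that converges to 0 in probability as $n \to \infty$. According to the central limit theorem, $\sqrt{n}\,(s_1 - \sigma_1)$ and $\sqrt{n}\,(s_2 - \sigma_2)$ are jointly asymptotically normally distributed with asymptotic variance-covariance matrices $\Gamma_{11}$, $\Gamma_{22}$, and $\Gamma_{12}$. It follows from (B1) that

$$\sqrt{n}\,[(\hat{\alpha}_2 - \alpha_2) - (\hat{\alpha}_1 - \alpha_1)] \xrightarrow{\mathcal{L}} N(0, \tau_\alpha^2), \tag{B2}$$

where

$$\tau_\alpha^2 = \dot{g}'(\sigma_1)\Gamma_{11}\dot{g}(\sigma_1) + \dot{g}'(\sigma_2)\Gamma_{22}\dot{g}(\sigma_2) - 2\dot{g}'(\sigma_1)\Gamma_{12}\dot{g}(\sigma_2).$$

For two estimates $\hat{\omega}_1 = h(\hat{\theta}_1)$ and $\hat{\omega}_2 = h(\hat{\theta}_2)$ of omega, we have

$$\begin{aligned}
\sqrt{n}\,[(\hat{\omega}_2 - \omega_2) - (\hat{\omega}_1 - \omega_1)] &= \sqrt{n}\,[h(\hat{\theta}_2) - h(\theta_2)] - \sqrt{n}\,[h(\hat{\theta}_1) - h(\theta_1)] \\
&= \dot{h}'(\theta_2)\sqrt{n}\,(\hat{\theta}_2 - \theta_2) - \dot{h}'(\theta_1)\sqrt{n}\,(\hat{\theta}_1 - \theta_1) + o_p(1).
\end{aligned} \tag{B3}$$

The development in Yuan and Chan (2008) implies that

$$\sqrt{n}\,(\hat{\theta}_j - \theta_j) = \sqrt{n}\,(\hat{\theta}_{aj} - \theta_{aj}) = C_{aj}'\sqrt{n}\,(s_j - \sigma_j) + o_p(1), \quad j = 1, 2, \tag{B4}$$

where

$$C_{aj} = W(\theta_{aj})\dot{\sigma}(\theta_{aj})[\dot{\sigma}'(\theta_{aj})W(\theta_{aj})\dot{\sigma}(\theta_{aj})]^{-1}.$$

Combining (B3) and (B4) yields

$$\sqrt{n}\,[(\hat{\omega}_2 - \omega_2) - (\hat{\omega}_1 - \omega_1)] \xrightarrow{\mathcal{L}} N(0, \tau_\omega^2),$$

where

$$\tau_\omega^2 = \dot{h}'(\theta_1)\Omega_{11}\dot{h}(\theta_1) + \dot{h}'(\theta_2)\Omega_{22}\dot{h}(\theta_2) - 2\dot{h}'(\theta_1)\Omega_{12}\dot{h}(\theta_2)$$

with $\Omega_{jk} = C_{aj}'\Gamma_{jk}C_{ak}$.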
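As an illustration of how (B2) translates into a standard error, the sketch below evaluates the gradient of alpha and the resulting SE of a difference between two alpha estimates. Several pieces are assumptions rather than material from the appendix: alpha is taken to be $\frac{p}{p-1}(1 - \mathrm{tr}(S)/\mathbf{1}'S\mathbf{1})$, the gradient is derived from that expression with respect to vech(S) in column-by-column order, the SE is taken as $\sqrt{\hat{\tau}_\alpha^2 / n}$, and all function names are hypothetical. The matrices passed as `G11`, `G22`, and `G12` stand for consistent estimates of the corresponding $\Gamma_{jk}$.

```python
from math import sqrt


def alpha(S):
    """Cronbach's alpha from a p x p covariance matrix (list of lists)."""
    p = len(S)
    trace = sum(S[i][i] for i in range(p))
    total = sum(sum(row) for row in S)
    return p / (p - 1) * (1.0 - trace / total)


def alpha_gradient(S):
    """Gradient of alpha with respect to vech(S), column-by-column order.

    With a = tr(S), b = 1'S1, and c = p/(p-1): a diagonal element changes
    both a and b by one unit, giving -c (b - a) / b^2; an off-diagonal
    element appears twice in b and not in a, giving 2 c a / b^2.
    """
    p = len(S)
    a = sum(S[i][i] for i in range(p))
    b = sum(sum(row) for row in S)
    c = p / (p - 1)
    grad = []
    for j in range(p):
        for i in range(j, p):
            if i == j:
                grad.append(-c * (b - a) / b ** 2)
            else:
                grad.append(2.0 * c * a / b ** 2)
    return grad


def quad_form(u, M, v):
    """Compute u' M v for list-based vectors and matrices."""
    return sum(u[r] * M[r][s] * v[s]
               for r in range(len(u)) for s in range(len(v)))


def se_alpha_difference(S1, S2, G11, G22, G12, n):
    """SE of alpha_hat_2 - alpha_hat_1 implied by (B2): sqrt(tau^2 / n)."""
    g1, g2 = alpha_gradient(S1), alpha_gradient(S2)
    tau2 = (quad_form(g1, G11, g1) + quad_form(g2, G22, g2)
            - 2.0 * quad_form(g1, G12, g2))
    return sqrt(tau2 / n)
```

In practice the gradients would be evaluated at the sample covariance matrices of the two diversity measures, and n = N − 1 = 51 for the data analyzed here.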

Appendix C

Thirteen Items for Measuring Team Diversity

The first five items are for measuring information diversity, and the last eight items are for measuring underlying diversity. Participants were asked to endorse each of the items using a 5-point Likert-type scale.

1. Overall, the ages of team members are widely distributed.
2. Overall, team members have diverse background and training.
3. Overall, knowledge and specialty of team members are complementary.
4. Overall, team members have similar social experience (reversed).
5. Overall, team members have different expertise.
6. Overall, team members have the same value regarding entrepreneurial development (reversed).
7. If starting a new business, team members will aim to achieve the same goal (reversed).
8. If starting a new business, team members will have the same ambition (reversed).
9. Overall, team members have different personality.
10. Overall, team members have different working styles.
11. Overall, team members are on the same page regarding the goal of the team (reversed).
12. Overall, team members have the same consensus regarding the focus of the team (reversed).
13. Team members have different entrepreneurial philosophy.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research was supported in part by a grant from the National Natural Science Foundation of China (71002023) and a grant from China Scholarship Council.

Note

1. We omit the subscripts i, j, and k from q and h to simplify the notation.

References

Algina, J. (1978). Comment on Bartko’s “On various intraclass correlation reliability coefficients.” Psychological Bulletin, 85, 135-138.

Allen, M. J., & Yen, W. M. (1979). Introduction to measurement theory. Monterey, CA: Brooks/Cole.

Anderson, J. C., & Gerbing, D. W. (1982). Some methods for respecifying measurement models to obtain unidimensional construct measurement. Journal of Marketing Research, 19, 453-460.

Babakus, E., Ferguson, J. C. E., & Jöreskog, K. G. (1987). The sensitivity of confirmatory maximum likelihood factor analysis to violations of measurement scale and distributional assumptions. Journal of Marketing Research, 24, 222-228.

Bagozzi, R. P. (1980). Causal models in marketing. New York, NY: John Wiley.


Bentler, P. M., & Yuan, K.-H. (1999). Structural equation modeling with small samples: Test statistics. Multivariate Behavioral Research, 34, 181-197.

Biemann, T., & Kearney, E. (2010). Size does matter: How varying group sizes in a sample affect the most common measures of group diversity. Organizational Research Methods, 13, 582-599.

Blau, P. M. (1977). Inequality and heterogeneity. New York, NY: Free Press.

Burt, R. S. (1973). Confirmatory factor-analysis structures and the theory construction process. Sociological Methods & Research, 2, 131-187.

Burt, R. S. (1976). Interpretational confounding of unobserved variables in structural equation models. Sociological Methods & Research, 5, 3-52.

Deng, L., Ye, S., & Xie, L. (2013). A longitudinal study of team trait combinations, team process and performance. Journal of Management Science (manuscript under review).

Feller, W. (1945). On the normal approximation to the binomial distribution. Annals of Mathematical Statistics, 16, 319-329.

Fouladi, R. (2000). Performance of modified test statistics in covariance and correlation structure analysis under conditions of multivariate nonnormality. Structural Equation Modeling, 7, 356-410.

Guzzo, R. A., & Shea, G. P. (1992). Group performance and intergroup relations in organizations. In M. D. Dunnette & L. M. Hough (Eds.), Handbook of industrial and organizational psychology (Vol. 3, 2nd ed., pp. 269-313). Palo Alto, CA: Consulting Psychologists Press.

Harrison, D. A., & Klein, K. J. (2007). What’s the difference? Diversity constructs as separation, variety, or disparity in organizations. Academy of Management Review, 32, 1199-1228.

Hays, W. L. (1981). Statistics (3rd ed.). New York, NY: Holt, Rinehart & Winston.

Jackson, S. E., Joshi, A., & Erhardt, N. L. (2003). Recent research on team and organizational diversity: SWOT analysis and implications. Journal of Management, 29, 801-830.

Lee, S.-Y., & Shi, J.-Q. (1998). Analysis of covariance structures with independent and non-identically distributed observations. Statistica Sinica, 8, 543-557.

McDonald, R. P. (1999). Test theory: A unified treatment. Mahwah, NJ: Lawrence Erlbaum.

Nevitt, J., & Hancock, G. (2004). Evaluating small sample approaches for model test statistics in structural equation modeling. Multivariate Behavioral Research, 39, 439-478.

Olsson, U. (1979). Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika, 44, 443-460.

Pearson, E. S., & Hartley, H. O. (1954). Biometrika tables for statisticians (Vol. 1). London, England: Biometrika Trust.

Raykov, T. (1997). Scale reliability, Cronbach’s coefficient alpha, and violations of essential tau-equivalence with fixed congeneric components. Multivariate Behavioral Research, 32, 329-353.

Raykov, T., & Marcoulides, G. A. (2011). Introduction to psychometric theory. New York, NY: Routledge.

Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye & C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research (pp. 399-419). Thousand Oaks, CA: Sage.

Savalei, V. (2010). Small sample statistics for incomplete nonnormal data: Extensions of complete data formulae and a Monte Carlo comparison. Structural Equation Modeling, 17, 241-264.


Schott, J. (2005). Matrix analysis for statistics (2nd ed.). New York, NY: John Wiley.

Schuster, C., & Smith, D. A. (2002). Indexing systematic rater agreement with a latent-class model. Psychological Methods, 3, 384-395.

Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86, 420-428.

Stewart, G. L. (2006). A meta-analytic review of relationships between team design features and team performance. Journal of Management, 32, 29-54.

Teachman, J. D. (1980). Analysis of population diversity. Sociological Methods & Research, 8, 341-362.

Van Knippenberg, D., De Dreu, C. K. W., & Homan, A. C. (2004). Work group diversity and group performance: An integrative model and research agenda. Journal of Applied Psychology, 89, 1008-1022.

Van Zyl, J. M., Neudecker, H., & Nel, D. G. (2000). On the distribution of the maximum likelihood estimator of Cronbach’s alpha. Psychometrika, 65, 271-280.

Webber, S. S., & Donahue, L. M. (2001). Impact of highly and less job-related diversity on work group cohesion and performance: A meta-analysis. Journal of Management, 27, 141-162.

Yuan, K.-H., & Bentler, P. M. (2002). On robustness of the normal-theory based asymptotic distributions of three reliability coefficient estimates. Psychometrika, 67, 251-259.

Yuan, K.-H., & Chan, W. (2008). Structural equation modeling with near singular covariance matrices. Computational Statistics & Data Analysis, 52, 4842-4858.
