
Measuring Principal Performance How Rigorous Are Commonly Used Principal Performance Assessment Instruments?

Quality School LEADERSHIP

A QUALITY SCHOOL LEADERSHIP Issue Brief | January 2012

High-performing and dramatically improving schools are led by strong principals. The Quality School Leadership (QSL) services developed by American Institutes for Research give educators the tools they need to hire and assess their leaders. Our Quality School Leadership Identification (QSL-ID) process is a standardized hiring procedure built from research-based tools that local hiring committees can use to reach consensus when selecting a new school principal.

The QSL-ID process guides schools and districts through each of the specific steps to hiring the right school leader and allows them to:

• Establish an effective hiring committee that understands the specific leadership needs of the school or district.

• Recruit principal candidates based on the criteria that best meet school and district goals.

• Identify the strongest candidates and conduct an onsite performance assessment of finalists.

• Plan for a smooth leadership transition.

Learn more about our QSL-ID services at http://www.learningpt.org/expertise/educatorquality/schoolLeadershipIdentification.php/.


Revised January 2012

Christopher Condon, Ph.D. Matthew Clifford, Ph.D.

Contents

Introduction
New Standards for Principal Performance
Reliability and Validity
The Reviewed Measures
    Change Facilitator Style Questionnaire
    Diagnostic Assessment of School and Principal Effectiveness
    Instructional Activity Questionnaire
    Leadership Practices Inventory
    Performance Review Analysis and Improvement System for Education
    Principal Instructional Management Rating Scale
    Principal Profile
    Vanderbilt Assessment of Leadership in Education
Summary of Findings
Findings
References
Additional Resources

Acknowledgments

The authors wish to thank Kenneth Arndt, Ph.D.; David Behlow, Ph.D.; Kenneth Leithwood, Ph.D.; Martha McCarthy, Ph.D.; and Patrick Schuermann, Ed.D., whose review of this brief improved its content.

The authors also wish to thank Publication Services staff at American Institutes for Research, particularly Christine Hulbert and Laura King, who helped to shape the work.


Introduction

Assessing school principal performance is both necessary and challenging. It is necessary because principal performance assessments offer districts an additional mechanism to ensure accountability for results and reinforce the importance of strong leadership practices. After all, school principals are second only to classroom teachers as the most influential school factor in student achievement (Hallinger & Heck, 1998; Leithwood, Louis, Anderson, & Wahlstrom, 2004). Principal performance assessments also provide central office administrators and principals themselves with information they can use to build professional learning plans and chart professional growth. Such assessments are also challenging because principals’ practice and influence on instruction are sometimes not readily apparent.

During the past five years, many states have begun using validated measures in summative assessments of novice principal competency as a basis for certification decisions. These measures may be psychometrically sound but often cannot be used for formative performance assessments or professional development planning (Reeves, 2005). For such a measure to be used as a formative performance assessment, its test results would have to be disaggregated, and the underlying constructs would need to be made transparent to readers. In addition, administrative and analytic control would have to be transferred to local educators (see “Formative Versus Summative Assessment: What Is the Difference?”).

Although standardized tests are used for certification purposes, other types of assessments are being used by school districts to ascertain principal performance and plan professional learning. So, independent of standardized measures, which tend to serve summative purposes, other assessments are being used formatively to judge principal performance. Scanning the field, Goldring et al. (2009) found that school districts often use idiosyncratic and inconsistent measures for principal performance assessment. Districts’ principal performance assessments may or may not be aligned with existing professional standards, and they often lack justification or documentation of psychometric rigor (Heck & Marcoulides, 1996). In other words, district performance assessments allow for formative feedback, but the measures vary in quality and rigor. This variance opens up the possibility that scores are inaccurate or performance assessments do not reflect research-based standards of the field.

Superintendents and others who seek to improve principal performance assessment may select one or more of these measures or may develop and validate their own measures. Regardless of origin, assessments should be validated and reliable to ensure accuracy and applicability to principal performance.


This brief reports results of a scan of publicly available measures conducted by Learning Point Associates staff* in 2009. The measures included in this review are expressly intended to evaluate principal performance and have varying degrees of publicly available evidence of psychometric testing. The review of this information is intended to inform decision makers’ selection of job performance instruments used for hiring, performance assessment, and tenure decisions. This brief also addresses the importance of standards-based measures, the need for establishing reliability and validity, and the measures that are more widely accepted and psychometrically sound.

New Standards for Principal Performance

Knowledge about what strong principals do to develop and maintain teaching and learning excellence has evolved with the changes in the context of schooling and improved school leadership research. School principals are being asked to ensure that all students have access to high-quality instruction and all educators are held accountable for student learning. These tasks deepen and broaden principals’ professional responsibilities beyond their traditional roles as building managers.

New standards for principal performance have emerged and reflect new emphases in the profession. The Educational Leadership Policy Standards: ISLLC 2008, for example, are a widely recognized and frequently referenced set of standards for principals (Council of Chief State School Officers, 2008). The ISLLC Standards contain six domains for principal professional practice:

• Setting a widely shared vision for learning

• Developing a school culture and instructional program conducive to student learning and staff professional growth

• Ensuring effective management of the organization, operation, and resources for a safe, efficient, and effective learning environment

• Collaborating with faculty and community members, responding to diverse community interests and needs, and mobilizing community resources

• Acting with integrity, fairness, and in an ethical manner

• Understanding, responding to, and influencing the political, social, legal, and cultural context

* Learning Point Associates merged with American Institutes for Research in August 2010.

Formative Versus Summative Assessment: What Is the Difference?

No matter their form, assessments generally have two purposes. An assessment used for summative purposes tends to inform a decision about the test taker’s competence, and there is no opportunity for remediation or development after completion. An assessment used for formative purposes is also a measure of competence, but results are used to inform future actions. For example, a formative purpose of performance assessment is to inform a principal’s professional development plan. A single assessment may serve formative and summative purposes in different situations.

The Learning Point Associates scan included only publicly available and rigorously tested measures that are useful for formative assessment purposes.


As the ISLLC Standards suggest, principals must work within a well-formed ethical code to oversee instructional quality; develop teacher talents; establish a learning culture in schools; and work within and beyond the school to secure financial, human, and political capital to maintain and advance organizational operations.

The ISLLC Standards have been integrated into many states’ licensure procedures through the following means:

• Alignment of ISLLC Standards with state principal professional standards

• Requirement of all principal candidates to receive a certain score on a standardized examination, which has been validated against ISLLC Standards, as a prerequisite for certification

• Requirement of state-recognized preservice principal preparation programs to display and defend how program activities prepare and determine whether candidates meet ISLLC Standards

Less is known about the integration and alignment of ISLLC Standards, other standards lists, or other promising leadership practices with principal performance assessments.

Reliability and Validity

To be included in the scan, a measure had to have published documentation of validity and reliability testing. Such testing provides evidence of psychometric rigor, which should be considered by purchasers and users of performance assessments.

Assessments are considered valid when they measure what they are intended to measure. There are many types of validity, but two of the more salient types in constructing performance measures are content and construct validity. Content validity is established by ensuring that the test items under consideration measure all of the dimensions or facets of a given construct, such as principal performance. Content validity can be established by linking the test or other items to a set of standards, such as the ISLLC Standards, or practices, such as leadership effectiveness.

Construct validity is determined by the degree to which test items measure a “construct,” which is the element that the items purport to assess. For example, a construct may be ISLLC Standard 5, “An education leader promotes the success of every student by acting with integrity, fairness, and in an ethical manner” (Council of Chief State School Officers, 2008, p. 15). For this construct, multiple test items or another method for collecting evidence would be needed to determine the degree to which the standard is met. In this case, testing for construct validity would determine how well items and observations measure principals’ abilities to act with integrity, fairness, and in an ethical manner.

Reliability is a measure of consistency and stability. A measure has reliability when the responses are consistent and stable for each individual who takes the test. In other words, a principal should receive relatively the same score on multiple administrations of a given test if all factors remain the same.
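As a rough illustration of these two ideas, the sketch below computes a test-retest correlation (stability across two administrations) and Cronbach’s coefficient alpha (internal consistency across items). These are standard textbook formulas, not procedures documented in this brief; the function names and the sample scores are hypothetical and are included only to make the arithmetic concrete.

```python
from math import sqrt
from statistics import mean, variance


def retest_reliability(first, second):
    """Pearson correlation between two administrations of the same test.

    `first` and `second` hold one score per test taker, in the same order.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length score lists with at least 2 scores")
    m1, m2 = mean(first), mean(second)
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, second))
    ss1 = sum((a - m1) ** 2 for a in first)
    ss2 = sum((b - m2) ** 2 for b in second)
    if ss1 == 0 or ss2 == 0:
        raise ValueError("scores must vary within each administration")
    return cov / sqrt(ss1 * ss2)


def cronbach_alpha(responses):
    """Coefficient alpha for a table of item responses.

    `responses` has one row per respondent and one column per item.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item, and of each respondent's total score.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores must vary across respondents")
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical scores for five principals on two administrations.
first_administration = [72, 85, 90, 64, 78]
second_administration = [70, 88, 91, 60, 80]
print(round(retest_reliability(first_administration, second_administration), 2))
```

A coefficient near 1.0 from either function indicates highly consistent scores; which coefficient is appropriate depends on whether stability over time or agreement among items is the concern.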


The Reviewed Measures

Of the 20 school principal performance assessment measures identified through Google Scholar, eight met preestablished criteria for inclusion in the review (see “How Assessments Were Selected for Review”).

Some measures, such as the ETS School Leadership Series examinations, provided extensive documentation of reliability and validity testing but no information about the formative use of results in performance assessment, so these measures were not included in the review. Other measures, such as the Chicago Public Schools’ principal performance rubric, are clearly intended for use during performance assessments, but no documentation was available about the validity or reliability of these measures.

The following principal performance assessments were included in the review and may be useful resources for superintendents, human resource directors, and others who are charged with gauging principal skills and abilities for hiring, performance assessment, and tenure decisions. Table 1 provides additional information about each of the measures included in this review.

Change Facilitator Style Questionnaire

Vandenberghe (1988) developed the Change Facilitator Style Questionnaire (CFSQ) to measure the extent to which leaders can facilitate change (see School Administrators of Iowa, 2003). In CFSQ, three different approaches have been identified as change facilitator styles: initiator, manager, and responder. Data are categorized into three clusters with two scales/dimensions embedded within each cluster:

• Cluster 1. Concern for People: Scale 1 (Social/Informal) and Scale 2 (Formal/Meaningful)

• Cluster 2. Organizational Efficiency: Scale 3 (Trust in Others) and Scale 4 (Administrative Efficiency)

• Cluster 3. Strategic Sense: Scale 5 (Day-to-Day) and Scale 6 (Vision and Planning)

How Assessments Were Selected for Review

Learning Point Associates staff conducted a keyword search of Google Scholar to locate school principal performance assessment instruments. More than 5,000 articles were initially identified, but the majority of articles were not pertinent. To winnow the list further, publicly available performance assessment support documents had to report that the assessment was

• Intended for use as a performance assessment.

• Psychometrically tested for reliability and validity.

• Publicly available for purchase.

For the purposes of the review, psychometrically sound means that the instrument must be tested for validity and reliability using accepted testing measures. A minimum reliability rating of 0.75 must be achieved. Also, content validity and/or construct validity testing must have occurred.

Using these criteria, 20 assessments were identified, and eight principal performance assessment instruments were included in the final review.


Diagnostic Assessment of School and Principal Effectiveness

Ebmeier (1992) developed this measure to identify the strengths of schools and their leaders so that school improvement plans and principal professional development goals would be better informed. To complete the assessment, separate surveys are completed by students, teachers, parents, principals, and principal supervisors. The measures indicate how these groups view themselves, school leadership, and school performance. Multiple measures are completed by multiple groups to identify matches between school leader traits and school characteristics. These measures can be used separately depending on their purpose. For more information, see Ebmeier (1991).

Instructional Activity Questionnaire

This measure was developed by Larsen (1987) as a performance assessment tool that specifically addresses instructional leadership aspects of principals’ work (as cited in Heck, Larsen, & Marcoulides, 1990). The measure was developed through an extensive review of the school principal effectiveness literature.

Leadership Practices Inventory

Kouzes and Posner (2002) developed the Leadership Practices Inventory (LPI) by extensively interviewing and surveying leaders, including principals, to identify best leadership practices. Thus, LPI views leadership practices as transferrable across professional types. What works to inspire people in business settings also may work in educational settings. LPI’s domains are as follows: (1) modeling the way, (2) inspiring a shared vision, (3) challenging the process, (4) enabling others to act, and (5) encouraging the heart. This measure has found widespread appeal across many disciplines, and LPI can be completed as an online or print survey. For more information, see Kouzes and Posner (n.d.).

Performance Review Analysis and Improvement System for Education

The Performance Review Analysis and Improvement System for Education (PRAISE) assessment system was developed through an extensive review of school administrator effectiveness literature. As such, PRAISE domains are not specifically aligned with professional standards. The PRAISE domains are problem solving, relations with teachers, and professional qualities and competencies. PRAISE is a print assessment to be completed by the principal and his or her supervisor. For more information, see Knoop and Common (1985).


Principal Instructional Management Rating Scale

Hallinger and Murphy (1985) developed the Principal Instructional Management Rating Scale (PIMRS) to determine the degree to which principals serve as instructional managers. PIMRS also provides exemplars of each construct, which may be used by raters to identify changes in their own or others’ practices. PIMRS focuses on several constructs, including the dedicated use of time for improving instruction, coordinating curriculum, and evaluating instruction. For more information, see Leadingware (2008).

Principal Profile

The Principal Profile was developed through extensive interview and consultation with principals, teachers, superintendents, and department heads. The authors consulted with practitioners to establish validity and reliability but also to ensure that the measure was practical for use in school/district settings. Two key assumptions inform the tool: (1) student growth should be a benchmark for school leader effectiveness and a factor in performance evaluation and (2) school leader effectiveness is marked by consistency of actions, in that principals need a well-defined set of purposes and the skill and knowledge to achieve them on a consistent basis. For more information, see Leithwood and Montgomery (1986) and Leithwood (1987).

Vanderbilt Assessment of Leadership in Education

Since the Vanderbilt Assessment of Leadership in Education (VAL-ED) was developed in 2006, it has become one of the most widely used and respected measures of school leadership performance. Like the Diagnostic Assessment of School and Principal Effectiveness, VAL-ED assesses principal performance by gathering information from principals, teachers, and principal supervisors. The results from VAL-ED produce a quantitative diagnostic profile that is linked to the ISLLC Standards. VAL-ED is based on a thorough examination of the research literature, including a conceptual framework within which to place the scale. For more information, see Vanderbilt Peabody College (n.d.) and Porter, Murphy, Goldring, and Elliott (2006).

Summary of Findings

Table 1 synthesizes findings from the review of instruments. In the table, the content focus of the assessment (e.g., principal as change facilitator or principal as instructional leader) and evaluation approach (e.g., self-reflection survey or 360-degree evaluation) are indicated in the column labeled “Approach.” Validity measures and testing methods are generally described. In the “Reliability” column, a benchmark of 0.80 was used to indicate “moderate” reliability, and a benchmark of 0.90 was used to indicate “high” reliability. Any reliability rating below 0.80 was considered “poor.”
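The benchmarks described for the “Reliability” column can be expressed as a simple rule. The sketch below is one reading of that rule: it assumes the thresholds are inclusive (a coefficient of exactly 0.80 counts as “moderate”) and that a single coefficient is being rated. The brief does not state how an instrument reporting a range of coefficients was assigned its label, and the function name is hypothetical.

```python
def classify_reliability(coefficient: float) -> str:
    """Label a single reliability coefficient using the 0.80 and 0.90 benchmarks.

    Assumption: thresholds are inclusive, so 0.80 is "moderate" and 0.90 is "high".
    """
    if coefficient > 1.0:
        raise ValueError("a reliability coefficient cannot exceed 1.0")
    if coefficient >= 0.90:
        return "high"
    if coefficient >= 0.80:
        return "moderate"
    return "poor"


# Single coefficients reported in Table 1 (VAL-ED, LPI, PIMRS).
for value in (0.98, 0.79, 0.75):
    print(value, classify_reliability(value))
```

The three coefficients in the loop are the single values reported in Table 1 for VAL-ED, LPI, and PIMRS, and the labels the function assigns agree with the table’s ratings for those instruments.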


Table 1. School Leadership Measures

Instrument: Change Facilitator Style Questionnaire (CFSQ)
Author(s): Vandenberghe (1988)
Approach:
• 77-item assessment that addresses six domains
• Collects information about teachers’ view of the principal as leader
Time Required: Time required is not stated.
Content and Construct Validity: Content validity is based on extensive literature review and focus group study used in development. Construct validity is established through empirical validation using confirmatory factor analysis.
Reliability: Poor: Subscale coefficient alphas range from 0.64 to 0.95.

Instrument: Diagnostic Assessment of School and Principal Effectiveness
Author(s): Ebmeier (1992)
Approach:
• 360-degree evaluation focusing on educational leadership
• 213-item survey independently completed by students, teachers, principals, and others
• Results combined into a score
Time Required: 30–40 minutes per questionnaire
Content and Construct Validity: Content validity is substantiated through the development of a conceptual framework and extensive review by graduate students, college professors, and practitioners. Construct validity is established through factor analysis and high inter-item correlations.
Reliability: Moderate: Alpha coefficients range from 0.80 to 0.97.

Instrument: Instructional Activity Questionnaire
Author(s): Larsen (1987)
Approach:
• A 34-item assessment
• Covers three subscales that specifically address instructional leadership
Time Required: Time required is not stated.
Content and Construct Validity: Content validity is based on review of literature. Construct validity is empirically validated using confirmatory factor analysis.
Reliability: Moderate: Alpha coefficient ranges from 0.70 to 0.90.

Instrument: Leadership Practices Inventory (LPI)
Author(s): Kouzes and Posner (2002)
Approach:
• 30-item measure of general leadership practices to be completed by the principal and an observer or supervisor
• Widely used to measure leadership effectiveness
Time Required: Time required is not stated.
Content and Construct Validity: Content validity is established via extensive interviews and surveys of leaders. Construct validity is established via confirmatory factor analysis. Concurrent validity is established between LPI and other evaluations of managerial effectiveness, such as working with others, credibility, and team cohesiveness.
Reliability: Poor: Test-retest reliability for school principals is 0.79.

Instrument: Performance Review Analysis and Improvement System for Education (PRAISE)
Author(s): Knoop and Common (1985)
Approach:
• 81-item assessment that includes nine subscales
• Produces a two-dimensional leadership profile and identification of strengths and weaknesses
Time Required: 15–20 minutes
Content and Construct Validity: Content validity is based on thorough review of effectiveness literature and content validation. Construct validity is not examined.
Reliability: Moderate: Alpha coefficient ranges from 0.88 to 0.98, whereas test-retest reliability ranges from 0.59 to 0.80.

Instrument: Principal Instructional Management Rating Scale (PIMRS)
Author(s): Hallinger and Murphy (1985)
Approach:
• 71-item questionnaire that addresses 11 educational leadership subscales
• Widely used in the field
Time Required: 10–15 minutes
Content and Construct Validity: Content validity is based on a review of the instructional leadership literature. Content is validated through extensive expert review. Agreement among raters was 0.80 for each item for inclusion in the scale. Construct validity is shown by higher correlations among items within a subscale than for the same items for other subscales. In addition, PIMRS scores are corroborated by school documents.
Reliability: Poor: Alpha coefficient is 0.75.

Instrument: Principal Profile
Author(s): Leithwood and Montgomery (1986); Leithwood (1987)
Approach:
• Interview-based assessment technique that measures leadership effectiveness on certain tasks and characterizes leadership style
• Used primarily as a diagnostic tool
Time Required: 15–20 minutes
Content and Construct Validity: Content validity is based on review of the literature. Construct validity is based on empirical validation using confirmatory factor analysis.
Reliability: Poor: Inter-rater agreement from 0.50 to 1.00.

Instrument: Vanderbilt Assessment of Leadership in Education (VAL-ED)
Author(s): Porter et al. (2006)
Approach:
• 360-degree assessment tool to be administered to principals, teachers, and principals’ supervisors
• Consists of 72 items
• Produces a quantitative diagnostic profile
• Linked to ISLLC Standards
Time Required: 20 minutes
Content and Construct Validity: Content validity is based on examination of the research literature conceptual framework. Construct validity is based on confirmatory factor analyses, which revealed a great fit for core components and the key processes. There were high component and process intercorrelations (0.73 to 0.90). Concurrent validity is based on the fact that teacher and principal ratings are related (r = 0.47).
Reliability: High: Alpha is 0.98 for all 12 scales on different forms.


Findings

The Internet-based scan of scholarly articles and books identified 20 school principal performance assessments, which were intended for use in hiring, advancement, and tenure decisions. Of the 20 assessments, eight met criteria for rigor, which meant that the assessment development process was transparent and involved some psychometric testing, and measures were provided for review. Two of the eight assessments were developed in the past decade, and the remainder were developed between 1985 and 1992 (see Table 1).

The scan suggests that, although there is considerable interest in school principal quality and accountability, few principal performance assessments have been rigorously developed or make details of psychometric testing available for public review. One possible explanation for this finding is that few assessments are being used in the field, but the findings of Goldring et al. (2009) suggest that many principal performance assessments of varying quality are being used. Unpublished assessments were not included in the scan.

In addition, the age of instruments raises questions about their continued validity for assessing principal performance. Given the emphasis on instructional leadership, accountability, data-based decision making, community involvement, and other well-documented changes to the school principal position in the past 10 years, it is plausible that older measures do not capture essential features of the position. Changes in the position and additional research on principal effectiveness raise concerns and may be cause for revalidation of older assessments.

The scan also highlights the different approaches to assessing school principal performance. The eight principal performance assessments measure the degree to which principals fulfill different roles. For example, CFSQ addresses principals’ roles as change facilitators, VAL-ED focuses on principals as instructional leaders, and PRAISE examines principal capacity to improve school-level systems. Each provides test takers and principal evaluators with slightly different perspectives on principal practices.

In addition, the assessments take different approaches to data collection. Several measures use self-assessment questionnaires or rubrics that provide an aggregate score and help principals to answer the following question: “How do I think I am doing, in reference to professional competencies?” Others use more intensive 360-degree surveys from multiple constituents to create an aggregate profile, which can provide comparative information based on multiple perspectives to principals about their performance. The use of different constituencies to rate principal performance is a growing trend (Lashway, 2003). These evaluations answer the following question: “How do I, and others, believe I am doing, in reference to professional competencies?”


In conjunction with student achievement data, the performance assessments that are included in this review hold potential for raising principal accountability and identifying necessary changes in practice. However, principal performance assessment data will achieve desired ends only if principals and their supervisors view the data as credible and actionable and give assessment data considerable weight during principal performance evaluations. Close examinations of the principal performance evaluation process (its frequency and structure) would provide information about how assessments are used. In addition, this process would offer insight for assessment developers about how to structure assessment processes for better effects.


References

Council of Chief State School Officers. (2008). Educational Leadership Policy Standards: ISLLC 2008. Washington, DC: Author. Retrieved December 5, 2011, from http://www.ccsso.org/Documents/2008/Educational_Leadership_Policy_Standards_2008.pdf

DeAngelis, K. J., Peddle, M. T., & Trott, C. E. (with Bergeron, L.). (2002). Teacher supply in Illinois: Evidence from the Illinois teacher study. Edwardsville, IL: Illinois Education Research Council. Retrieved December 5, 2011, from http://ierc.siue.edu/documents/kdReport1202_Teacher_Supply.pdf

Ebmeier, H. (1991). The development and field test of an instrument for client-based principal formative evaluation. Journal of Personnel Evaluation in Education, 4(3), 245–278.

Ebmeier, H. (1992). Diagnostic assessment of school and principal effectiveness: A reference manual. Topeka, KS: KanLEAD Educational Consortium Technical Assistance Center.

Goldring, E., Cravens, X., Murphy, J., Porter, A., Elliott, S., & Carson, B. (2009). The evaluation of principals: What and how do states and urban districts assess leadership? The Elementary School Journal, 110(1), 19–39.

Hallinger, P., & Heck, R. (1996). Reassessing the principal’s role in school effectiveness: A review of empirical research, 1980–1995. Educational Administration Quarterly, 32, 5–44.

Hallinger, P., & Murphy, J. (1985). Assessing the instructional management behavior of principals. The Elementary School Journal, 86(2), 217–247.

Heck, R. H., Larsen, T. J., & Marcoulides, G. A. (1990). Instructional leadership and school achievement: Validation of a causal model. Educational Administration Quarterly, 26(2), 94–125.

Heck, R. H., & Marcoulides, G. A. (1996). The assessment of principal performance: A multilevel evaluation approach. Journal of Personnel Evaluation in Education, 10(1), 11–28.

Knoop, R., & Common, R. W. (1985, May). A performance appraisal system for school principals. Paper presented at the Annual Meeting of the Canadian Society for the Study of Education, Montreal, Quebec, Canada.

Kouzes, J., & Posner, B. (2002). The Leadership Practices Inventory: Theory and evidence behind the five practices of exemplary leaders. Unpublished document. Retrieved December 5, 2011, from http://media.wiley.com/assets/61/06/lc_jb_appendix.pdf

Kouzes, J., & Posner, B. (n.d.). The leadership challenge [website]. Retrieved December 5, 2011, from http://www.leadershipchallenge.com/WileyCDA/

Lashway, L. (2003). Improving principal evaluation. ERIC Digest. Eugene, OR: ERIC Clearinghouse on Educational Management.

Leadingware. (2008). Principal Instructional Management Rating Scale. Retrieved December 5, 2011, from http://www.philiphallinger.com/pimrs.html

Leithwood, K. (1987). Using The Principal Profile to assess performance. Educational Leadership, 45(1), 63–66.

Leithwood, K., Louis, K., Anderson, S., & Wahlstrom, K. (2004). How leadership influences student learning. New York: The Wallace Foundation. Retrieved December 5, 2011, from http://www.wallacefoundation.org/Knowledge-center/school-leadership/key-research/Documents/How-Leadership-Influences-Student-Learning.pdf

Leithwood, K. A., & Montgomery, D. J. (1986). Improving principal effectiveness: The principal profile. Toronto: OISE Press.

Porter, A. C., Murphy, J. F., Goldring, E. B., & Elliott, S. N. (2006). Vanderbilt Assessment of Leadership in Education. Nashville, TN: Vanderbilt University.

Reeves, D. B. (2005). Assessing educational leaders: Evaluating performance for improved individual and organizational results. Thousand Oaks, CA: Corwin.

School Administrators of Iowa. (2003). Administrator as a change leader: Change facilitator style profile interpretation. In The survival guide for Iowa school administrators. Clive, IA: Author.

Vandenberghe, R. (1988, April). Development of a questionnaire for assessing principal change facilitator style. Paper presented at the Annual Meeting of the American Educational Research Association, New Orleans, LA.

Vanderbilt Peabody College. (n.d.). Development of the Vanderbilt Assessment of Leadership in Education (VAL-ED). Nashville, TN: Author. Retrieved December 5, 2011, from http://peabody.vanderbilt.edu/x8451.xml

Additional Resources

Arizona Department of Education. (2006). Principal leadership survey. Phoenix, AZ: Author.

Cohen, E., & Miller, R. (1980). Coordination and control of instruction. Pacific Sociological Review, 23, 446–473.

Danielson, C. (2007). Enhancing professional practice: A framework for teaching (2nd ed.). Alexandria, VA: Association for Supervision and Curriculum Development.

Dornbusch, S., & Scott, W. (1975). Evaluation and the exercise of authority. San Francisco: Jossey-Bass.

ETS. (2009). The Praxis Series: Teacher licensure and certification [Website]. Retrieved December 5, 2011, from http://www.ets.org

Farkas, S., Johnson, J., Duffett, A., & Foleno, T. (with Foley, P.). (2001). Trying to stay ahead of the game: Superintendents and principals talk about school leadership. New York: Public Agenda. Retrieved December 5, 2011, from http://www.publicagenda.org/files/pdf/ahead_of_the_game.pdf

Goldring, E., Cravens, X., Porter, A., Murphy, J., & Elliott, S. N. (2007). The state of educational leadership evaluation: What do states and districts assess? Unpublished manuscript, Vanderbilt University.

Leithwood, K. A., Begley, P. T., & Cousins, J. B. (1994). Developing expert leadership for future schools. London: Falmer Press.

Leusner, D. M., & Ohls, S. (2008). Praxis III research update summary. Princeton, NJ: ETS.

Lortie, D. (1969). The balance of control and autonomy in elementary school teaching. In A. Etzioni (Ed.), The semi-professions and their organization (pp. 1–53). New York: Free Press.

Lortie, D. (1975). Schoolteacher: A sociological study. Chicago: University of Chicago Press.

Scott, L. (2008). Leadership performance planning worksheet. Santa Monica, CA: RAND.

Stallings, J., & Mohlman, G. (1981). School policy, leadership style, teacher change, and student behavior in eight schools: Final report. Mountain View, CA: Stallings Teaching and Learning Institute.

Wellisch, J., MacQueen, A., Carriere, R., & Duck, G. (1978). School organization and management in successful schools. Sociology of Education, 51, 211–226.

Wiggins, G. (1998). Educative assessment: Designing assessments to inform and improve student performance. San Francisco, CA: Jossey-Bass.

For more information about this brief, contact Matthew Clifford via e-mail at [email protected] or by phone at 630.689.8017.

ABOUT AMERICAN INSTITUTES FOR RESEARCH

Established in 1946, with headquarters in Washington, D.C., American Institutes for Research (AIR) is an independent, nonpartisan, not-for-profit organization that conducts behavioral and social science research and delivers technical assistance both domestically and internationally. As one of the largest behavioral and social science research organizations in the world, AIR is committed to empowering communities and institutions with innovative solutions to the most critical challenges in education, health, workforce, and international development.

Copyright © 2012 American Institutes for Research. All rights reserved.

This work was originally produced in whole or in part with funds from the U.S. Department of Education’s Fund for the Improvement of Education (FIE) Earmark Grant Awards Program under grant number U215K080187. The content does not necessarily reflect the position or policy of the Education Department, nor does mention or visual representation of trade names, commercial products, or organizations imply endorsement.

Making Research Relevant

1120 East Diehl Road, Suite 200 Naperville, IL 60563-1486 800.356.2735 | 630.649.6500

www.air.org