Annotated Bibliography Project


Psychology in the Schools, Vol. 46(5), 2009 © 2009 Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/pits.20385

VALIDITY OF THE BRACKEN SCHOOL READINESS ASSESSMENT FOR PREDICTING FIRST GRADE READINESS

JANET E. PANTER

Rhodes College

BRUCE A. BRACKEN

College of William & Mary

The Bracken School Readiness Assessment (BSRA) was administered to all kindergarten students enrolled in two rural elementary schools in the fall of 2004. Eight months later, the reading portion of the Metropolitan Readiness Tests, 6th Edition (MRT-6) was administered. Teachers were asked to indicate whether they had concerns about each student’s readiness for first grade and whether students had been retained or referred for other assessment(s) or services. The BSRA was found to be a good predictor of children’s readiness ratings, as well as their retention or referral for services. It also predicted performance on the MRT-6. This study partially validated the use of the BSRA as a screening measure to predict kindergarten performance and kindergarten teachers’ ratings of first grade readiness. © 2009 Wiley Periodicals, Inc.

Screening young children before school entry has become common practice in school districts. A 1999–2000 survey of state education departments found that 13 states mandated specific procedures; five states mandated screening, but allowed districts to determine procedures; in 26 states, local, although not statewide, procedures were the norm; and in 16 states, procedures were being developed for statewide screening. Only six states reported no screening at the state or local level (Saluja, Scott-Little, & Clifford, 2000). Data gathered in 1998–1999 for the Early Childhood Longitudinal Study, Kindergarten Class (National Center for Education Statistics [NCES], 2003) revealed that, in 61% of schools, personnel administer entrance or placement tests to incoming students. According to that report, respondents claimed that screening information was used primarily to determine children’s needs (47%) and to guide instruction (52%). Screening was less often used to allow early entry for children (19%), to support recommendations for delayed entry (27%), or to determine class placement (19%).

A variety of instruments are used for screening and eligibility testing, often with little attention paid to their psychometric soundness or their efficacy as academic screening tools (Bordignon & Lam, 2004; Bracken, 1987; Carlton & Winsler, 1999) or early childhood educational standards (Bracken & Crawford, 2007). Gredler (1997) distinguished two types of early childhood instruments: developmental screening measures and readiness measures. Developmental screening considers the child’s potential to acquire new skills, whereas readiness measures “tap skills believed to be related to school learning tasks that are predictive of school success” (p. 99). In a 1992 survey of school districts in New York State, May and Kundert identified four readiness assessment approaches typically used (with some districts using more than one approach): Developmental screening was used in 30% of the surveyed schools; readiness measures, 28%; informal observations, 20%; and school-constructed tests, 33%.

An examination of the literature reveals that current screening practices may be problematic for various reasons. First, screening programs often place the burden of proof for readiness on the child (i.e., the child must prove his or her readiness for kindergarten). Failure to demonstrate one’s readiness may result in decisions or interventions with limited, if any, benefit. For instance, parents might decide that their child should delay school entry, a practice most prevalent among affluent

Correspondence to: Janet Panter, Psychology Department, Rhodes College, 2000 N. Parkway, Memphis, TN 38112. E-mail: [email protected]


Table 1 Determining the Effectiveness of Screening Measures for Predicting Student Outcomes

Outcome measure: Yes (negative outcome)
    Predictive measure Yes (at-risk): a = identified at-risk with poor outcome (valid positives)
    Predictive measure No (not at-risk): b = identified as not at-risk with poor outcome (false negative)
    Index: Sensitivity = a/(a + b)

Outcome measure: No (positive outcome)
    Predictive measure Yes (at-risk): c = identified as at-risk with good outcome
    Predictive measure No (not at-risk): d = identified as not at-risk with good outcome
    Index: Specificity = d/(c + d)

Indices
    Positive predictive value = a/(a + c)
    Total correctly classified = (a + d)/(a + b + c + d)

Note. From Gredler (1992, 1997).

parents. In addition, school personnel sometimes recommend that a child be given the Gesellian “gift of time” and allowed to develop certain skills before beginning kindergarten. Those who argue for delayed entry as an intervention apparently believe that the skills involved will develop in all children without direct intervention. This question of delaying kindergarten entry is especially thorny, as students whose entry is delayed are likely to be those who would most benefit from early intervention services (Bordignon & Lam, 2004; Carlton & Winsler, 1999; Meisels, 1992, 1995). Similarly, students in early intervention programs, such as transitional classes, are typically taught at a slower pace, are not appropriately challenged, do not receive needed interventions, and are, as a group, no better off than peers recommended for such services but not receiving them (Carlton & Winsler, 1999). Moreover, screening programs overidentify minority and low-socioeconomic status (SES) students as at-risk for poor outcomes (Carlton & Winsler, 1999).

Gredler (1992, 1997) outlined procedures for evaluating an instrument’s efficacy as a screener. He recommended examining the following criteria: sensitivity, specificity, positive predictive value, and overall hit rate (see Table 1). Sensitivity has been defined as the percentage of children with actual negative outcomes or poor performance who were identified by the screening battery as being at-risk for negative outcomes. Specificity is the proportion of children with positive outcomes (i.e., those who are successful) who were classified by the screening battery as not at-risk. Positive predictive value is the percentage of children with actual negative outcomes from all those identified by the screening battery as being at-risk. Finally, the overall hit rate is the overall percentage of students correctly classified by the screening measure as at-risk or not at-risk.
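
In terms of the cell labels a, b, c, and d defined in Table 1, the four criteria can be written compactly. The block below only restates Table 1 in LaTeX, with the parentheses in the hit-rate numerator made explicit; nothing beyond the table is assumed.

```latex
\begin{align*}
\text{Sensitivity} &= \frac{a}{a + b} \\[4pt]
\text{Specificity} &= \frac{d}{c + d} \\[4pt]
\text{Positive predictive value} &= \frac{a}{a + c} \\[4pt]
\text{Overall hit rate} &= \frac{a + d}{a + b + c + d}
\end{align*}
```

Sensitivity and specificity are conditioned on the actual outcome (the rows of Table 1), whereas positive predictive value is conditioned on the screening decision (the at-risk column).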

Gredler (1992) evaluated 12 screening instruments to determine each test’s performance on these indices. He found an average sensitivity index of .77. In other words, 77% of the children who later had poor outcomes had been identified by the predictor measures as at-risk; the remaining 23% had been classified as not at-risk. The average specificity ratio was .81 (i.e., 81% of the students with positive outcomes were correctly identified as not at-risk). The average positive predictive value was .55, which raises concern given that 45% of those identified as at-risk were incorrectly classified.

In the current study, the authors evaluated the performance of the Bracken School Readiness Assessment (BSRA; Bracken, 2002) and the Brigance K & 1 Screen (Brigance, 1992) on these same indices. The BSRA is composed of the items on the Bracken Basic Concept Scale-Revised (BBCS-R; Bracken, 1998) School Readiness Composite (SRC). The usefulness of the BBCS-R SRC (i.e., the BSRA) for school screening has been partially validated previously (Panter, 1997,


2000). The previous version of the test, the BBCS (Bracken, 1984), differentiated between average and at-risk preschoolers (Stebbins & McIntosh, 1996), predicted performance on the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R; Wechsler, 1989; Laughlin, 1995), correlated highly with the Stanford-Binet-IV (Thorndike, Hagen, & Sattler, 1986) with African American preschoolers (Howell & Bracken, 1992), correlated highly with the Differential Ability Scales (Elliott, 1990; McIntosh, Wayland, Gridley, & Barnes, 1995), and outperformed the Gesell Developmental Exam (Ilg, Ames, Haines, & Gillespie, 1978) in predicting early childhood academic achievement (Sterner & McCallum, 1988).

In a 1996–1997 research study, Panter (1997, 2000) administered a multidimensional screening battery to 71 kindergarten students at three rural schools. An extensive battery was constructed to address the variety of factors believed to account for academic performance in later grades, kindergarten retention, referral for services, and teachers’ concerns about students’ performance in first grade (Tramontana, Hooper, & Selzer, 1988). In September 1996, Panter administered the following screening battery to participants: the tryout version of the BBCS-R, the parent version of the Social Skills Rating Scale (SSRS; Gresham & Elliott, 1990), and the Geometric Design (GD) subtest of the WPPSI-R. Demographic information, such as sex, gender, and age at school entry, was also collected. In the spring, the Metropolitan Readiness Tests, 6th Edition (MRT-6; Nurss & McGauvran, 1995) was administered to participants as an outcome measure. In addition, data were gathered regarding students’ retention and/or referral for services, teachers’ concerns about their students’ readiness for first grade, and teacher ratings on the Academic Competence scale of the SSRS (Gresham & Elliott, 1990).

Panter found the tryout version of the BBCS-R SRC to be the best predictor of retention or referral, as well as teacher ratings of Academic Competence on the SSRS (Gresham & Elliott, 1990). When the data were analyzed, children’s retention and teacher ratings of readiness were best accounted for by the tryout version of the BBCS-R SRC. With retention in grade as the outcome variable, the BBCS-R SRC (tryout version) correctly classified 90% (total sample) to 94% (Blacks only) of the students. In addition, analyses conducted to predict MRT-6 scores showed that the tryout BBCS-R SRC accounted for nearly half of the variance in the MRT-6 Pre-reading scores (r² = .45). In this 1998 study, the tryout BBCS-R SRC was considerably more accurate and accounted for more variance in the outcomes than the measures used in other studies.

In addition to examining the BSRA, the present study investigated the efficacy of the Brigance (Brigance, 1992) for predicting student achievement and readiness for first grade. The Brigance was designed as a screening measure and as a curriculum tool for making placement and instruction decisions (Brigance, 1992). It is presently used as the only screening measure by the school system where this research was conducted; furthermore, it is a popular and widely used screening measure. This popularity may be because of the measure’s apparent multidimensional nature, ease of administration, instructional properties, and connection to other Brigance instruments (Mantzicopoulos, 1999b; May & Kundert, 1992).

Mantzicopoulos (1999b) investigated the predictive utility of the Brigance for a sample of 256 Head Start students. The study found differences in total test scores by sex and age; racial/ethnic differences were not investigated. Specifically, boys performed less well than girls, and younger children (45–57 months) performed less well than their older counterparts (58–65 months).

To determine the Brigance’s validity as a predictor of performance, Mantzicopoulos calculated correlations between the Brigance, given at the beginning of Head Start, and two measures given at the beginning of kindergarten: the Kaufman Assessment Battery for Children Achievement Composite (K-ABC Achievement; Kaufman & Kaufman, 1983) and the math and reading subtests from the Woodcock–Johnson Tests of Achievement - Revised (WJ-R; Woodcock & Mather, 1990). Moderate correlations were found between the Brigance and the K-ABC Achievement scores (r = .55,


p < .001). The Brigance was also moderately correlated with the WJ-R Applied Problems subtest (r = .53, p < .01). Smaller correlations were found with the other WJ-R subtests: Letter–Word Identification (r = .41, p < .01); Calculation (r = .25, p < .05); and Passage Comprehension (r = −.18, p > .05).

Mantzicopoulos also examined the efficacy of the Brigance screen for predicting special education placement. Special education students obtained significantly lower Brigance total scores. However, the Brigance lacked accuracy when predicting students’ at-risk status. Using locally derived cut-score criteria, the Brigance correctly classified 72% of the sample, misidentifying 28% of the Head Start students.

One of the central issues in readiness screening is choosing outcome measures. In other words, what are we seeking to predict? To provide a standardized and norm-referenced measure of kindergarten students’ first grade readiness, the MRT-6 (Nurss & McGauvran, 1995) was administered. In this instance, then, readiness is synonymous with reading achievement. The use of a standardized measure is particularly important since the passage of the 2001 No Child Left Behind (NCLB) Act. As Chatterji (2006) points out, the primary purpose of the NCLB Act is to close socioeconomic and racial/ethnic group achievement gaps, particularly in reading. Given the importance of reading and widespread use of standardized tests, this study sought to examine students’ reading readiness and standardized test performance in addition to more subjective variables, such as retention and teachers’ readiness ratings.

The current study evaluated the validity of the BSRA and the Brigance as predictors of MRT-6 Pre-reading scores, teacher ratings of their students’ readiness for first grade, kindergarten retention, and referral for special education services. In the fall of 2004, the BSRA and Brigance were administered. In the spring of 2005, teachers provided first grade readiness ratings (concern or no concern) along with retention and referral information; also, the MRT-6 Pre-reading scale was administered to all kindergarten students in the two schools. Based on previous performance of the tryout BBCS-R SRC (Panter, 1997, 2000), the BSRA was predicted to correlate significantly and positively with MRT-6 Pre-reading scores and to favorably predict teacher ratings of their students’ readiness for first grade along with retention and/or referral for services. The Brigance was predicted to perform less well as a predictor of MRT-6 scores and teacher ratings of readiness and retention/referral.

Beyond the overall validity of the BSRA and Brigance as predictive measures, differential validity of these two measures was evaluated regarding MRT-6 performance. Previous research has shown the BSRA SRC to be a valid predictor for students of different races and both genders (Panter, 2000). Brigance mean scores differ by age and gender, but differences in predictive validity have not been shown (Mantzicopoulos, 1999b). It is important to identify such differences, if any, because screening and outcome measures may not be equally valid for all students. For example, if the BSRA or Brigance varies by race in predicting MRT-6 scores, school psychologists should interpret scores in a manner consistent with this differential validity.

In addition to examining the utility of these instruments, the authors considered the use of retention in grade as a criterion measure. Retention rates and reasons for retention vary greatly among teachers and between schools and school systems. Many factors, some of which are not related to the child’s performance in school, affect retention decisions. For instance, some teachers focus on within-child factors, such as size, age, or gender. Other times, extra-child factors influence a teacher’s decision whether to retain a particular child. The following factors have been found to be significant contributors to decisions for student retention: the school’s policy on retention; the availability of alternative services; graduation standards for students; teacher beliefs regarding development; parental views; child’s age; and gender (Germain & Merlo, 1991). In other words, students may be retained or not retained for reasons not pertinent to academic achievement.


Retention was used as a validation criterion in this study because it is one of the principal tools used by educators to intervene with kindergarten students who are deemed not ready for first grade. Given the continued use of retention, even with the lack of evidence for its efficacy (Carlton & Winsler, 1999; Hong & Raudenbush, 2005; Meisels, 1992; Shepard, 1997), it is important to examine teachers’ recommendations for retention along with their ratings of students’ first grade readiness.

METHODOLOGY

Participants

Participants were 86 kindergarten children attending two public schools in rural, western Tennessee. Both schools receive Title I assistance (i.e., federal funds for schools with a high percentage of students living in poverty). All 117 kindergarten students at both schools were initially assessed; the final sample consisted of 86 students who took the BSRA in the fall and received independent readiness ratings in the spring.

Of the original 117 students, 31 students were excluded from the final sample for the following reasons: older than cutoff age for kindergarten entry (3); previously retained in kindergarten (8); currently receiving special education services (9); primary language in the home not English (2); or missing information, such as date of birth or Brigance scores (9). Exclusionary criteria were based on factors that might confound the results, as differences might be attributable to these factors and not the variables of interest. Previously retained students and those older than the cutoff age for school entry had already been identified as needing extra assistance or time to develop or acquire skills. Likewise, students in special education had a demonstrated need for specialized instruction. Students whose primary language is not English have barriers to learning and assessment other than the ones being investigated here.

The final sample included 30 Black students, 50 White students, and six students belonging to other racial/ethnic groups (e.g., Asian). There were 37 boys and 49 girls. Because final data were collected at the end of the school year and there were additional scheduling difficulties, only 66 of the 86 students took the MRT-6; thus, there was a second reduction in sample size. These 66 students consisted of 25 Blacks, 39 Whites, and 2 of other racial/ethnic heritage; there were 26 boys and 40 girls.
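
The sample bookkeeping described above can be checked with a few lines of arithmetic. The sketch below only restates counts given in the text and in Table 2; the variable names are illustrative and not part of the study.

```python
# Exclusions from the 117 students initially assessed (counts from the text).
exclusions = {
    "older than cutoff age": 3,
    "previously retained": 8,
    "receiving special education": 9,
    "home language not English": 2,
    "missing information": 9,
}
initial_n = 117
final_n = initial_n - sum(exclusions.values())

# Subgroup sizes as reported in Table 2.
fall_by_race = {"White": 50, "Black": 30, "Other": 6}
fall_by_sex = {"boys": 37, "girls": 49}
mrt_by_race = {"White": 39, "Black": 25, "Other": 2}
mrt_by_sex = {"boys": 26, "girls": 40}

mrt_n = sum(mrt_by_race.values())
print(final_n, mrt_n)  # prints: 86 66
```

Each breakdown sums to its stated sample size, so the two reductions (117 to 86, then 86 to 66 for the MRT-6 analyses) are internally consistent.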

Instruments

BSRA (Bracken, 2002). The BSRA was individually administered to each child in the study. The BSRA consists of 88 items and requires approximately 15 minutes to administer. Concepts measured include color, letters, numbers/counting, sizes, comparisons, and shapes. As noted previously, this scale is a subset of the BBCS-R (Bracken, 1998), which is designed to assess concept acquisition and receptive language skills for children ages two and a half through seven. Items are presented in a multiple-choice format wherein participants point to or otherwise indicate their choice from among four or more response options.

The BBCS-R was normed on 1,100 children; the standardization sample closely matched the 1995 U.S. Census for gender, race/ethnicity, region, and parent education level. Four percent of the standardization group were children who had been classified as eligible for special education services. An additional 1.7% of the students received services under the exceptional education category of “gifted and talented.”

The BBCS-R has strong psychometric characteristics, with a total test internal consistency of .98 (r = .96–.99 across age groups); individual subtests also have high internal consistency (r = .85–.98). The BBCS-R Examiner’s Manual presents considerable validity evidence. In concurrent validity


studies, the BBCS-R demonstrated moderate-to-high correlations with the Preschool Language Scale-3 (r = .36–.80; Zimmerman, Steiner, & Pond, 1992) and the Peabody Picture Vocabulary Test-III (r = .69 and .79; Dunn & Dunn, 1997).

Brigance K & 1 Screen (Brigance, 1992). The Brigance screen is a criterion-referenced test the purpose of which is to identify children who may be in need of a comprehensive assessment. This screen is a subset of the Brigance Inventory of Early Development (Brigance, 1991), a longer and more global measure of a child’s functioning. The screen contains 12 subtests, measuring skills such as language, numbers, motor skills, letter recognition, and following directions. It takes approximately 15 minutes to administer, and detailed scoring criteria are provided in the manual. Students can obtain a maximum score of 100 based on weighted, not raw, scores. The manual does not make clear the rationale behind the weightings of each subtest score (Mantzicopoulos, 1999a). As only weighted Brigance scores were available to the authors, they were used in all analyses for the current study. It should be noted that this score, although weighted, is not a standard or norm-referenced score and thus, is comparable to the raw scores of other measures.

No normative data are provided in the Brigance manual. The manual recommends that local school districts develop local norms by which to measure and compare students’ performance. Comparing students within their respective districts provides teachers with information regarding a student’s relative performance, as compared to the student’s peers in the same school system. However, in a 1997 technical report based on a sample of 74 kindergartners, Glascoe reported good internal consistency (Guttman’s Lambda Coefficient = .99); Mantzicopoulos (1999b) found similar overall reliability (r = .91).

The Brigance screener is mandated by the participating school district for all students entering kindergarten; however, specific guidelines are lacking regarding the use of test results. The 1997 technical manual (Glascoe, 1997) recommends the use of cut-scores based on age to identify children who may be at risk for academic difficulties. The two schools participating in this study did not implement this practice or base decisions solely on Brigance results; however, these scores were one part of the information used to determine a student’s need for a comprehensive assessment. The Brigance was included in this study to determine its effectiveness in comparison to the BSRA and to assess its usefulness as the current screening measure for the local school district participating in this research.

MRT-6 (Nurss & McGauvran, 1995). The MRT-6 is designed to assess children’s readiness to begin mathematics and reading instruction. This instrument measures auditory, visual, language, and quantitative processes. Only the Pre-reading Composite was examined in this study; it includes beginning reading and story comprehension. The Pre-reading portion of the MRT-6 Level 2 is a two-session, group-administered test.

The MRT-6 was standardized in 1994 on 13,000 kindergarten and pre-kindergarten children. To establish norms, data were weighted so that the number of children from each cell would be proportionate to the number of children in the U.S. population enrolled in school districts with similar characteristics. The MRT-6 manual shows that the weighted norms sample very closely approximates the U.S. population (based on the 1990 census) and thus is representative of the nation.

The MRT-6 Level 2 Test has strong technical characteristics. The Pre-reading Composite score has a reported reliability of .91 (using Kuder-Richardson-20) for spring administration, demonstrating a high level of internal consistency. Its test–retest reliability coefficient was also .91, showing good stability over a 2-week period. The MRT-6 manual reports reliability for the Pre-reading score of .91. Validity studies, as reported in the MRT-6 Norms Manual, were also conducted using the MRT-6 to predict performance on two achievement tests. In both instances, the MRT-6 Pre-reading Composite score accounted for a significant percentage of the variance in the children’s later reading achievement.
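
For readers unfamiliar with the Kuder-Richardson-20 coefficient mentioned above, the following is a minimal sketch of how it is computed from right/wrong item scores. The function name `kr20` and the toy responses are hypothetical and are not drawn from the MRT-6 manual; the population form of the total-score variance is assumed.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a list of examinees, each a list of 0/1 item scores."""
    n_items = len(responses[0])
    n_people = len(responses)
    # Proportion of examinees passing each item (p); q = 1 - p.
    p = [sum(person[i] for person in responses) / n_people for i in range(n_items)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Variance of examinees' total scores (population form, an assumption here).
    total_var = pvariance([sum(person) for person in responses])
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_var)


# Hypothetical responses: 4 examinees x 3 items (not data from the MRT-6).
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(toy), 2))  # 0.75
```

Values near 1 indicate that items rank examinees consistently, which is the sense in which a coefficient of .91 reflects high internal consistency.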


Teacher Questionnaire. This brief form was given to each of the teachers. It contained a class roster, and teachers were asked to indicate which children were retained in grade and/or referred for any special services or assessment. Additionally, teachers were asked to indicate whether they had “serious concerns about this student’s readiness for first grade.” This yes/no rating was provided for every kindergarten student enrolled in May 2005.

Procedures

Application was made to the school district in which the two schools were located for approval to conduct this study. Once the study was approved, arrangements were made with the school principals to administer the BSRA at the beginning of the academic year. This administration was completed in August and early September 2004 by the first author. BSRA scores were not shared with teachers so as not to influence teachers’ attitudes, instruction, or decision making. In September 2004, the following student information was gathered from school records: date of birth, primary language used in the home, disabling conditions (if any), and scores on the Brigance screen. In May 2005, the reading portion of the MRT-6 was administered to all of the available kindergarten students in the two schools. In addition, teachers were asked to complete the Teacher Questionnaire.

Data Analysis

Three sets of analyses were performed for the following purposes: to identify differences between groups; to evaluate the effectiveness of the independent variables (IVs) (BSRA and Brigance scores) in predicting retention/readiness (dependent variables (DV)); and to determine the variance in MRT-6 scores (DV) accounted for by the BSRA and Brigance scores (IVs). The first set of analyses was conducted to identify differences by sex and/or race in raw or standard scores on the BSRA, Brigance, and MRT-6. Using gender and race as IVs, five 3 (Race) × 2 (Sex) analyses of variance (ANOVAs) were conducted using each of the test scores (raw and standard) as the DVs.

The second group of analyses used retention/referral and teachers’ readiness ratings as the DV to determine the predictive validity of the two IVs (Brigance and BSRA scores). Examining the utility of these measures in predicting teacher ratings and decision making (i.e., retention) was a central purpose of this study. Finally, the third set of analyses determined the amount of variance in MRT-6 scores (DV) accounted for by BSRA and Brigance scores (IVs). The MRT-6 provided an objective, norm-referenced measure of kindergarten outcomes, using data consistent with typical school performance measures (i.e., standardized test scores).

RESULTS

For the first set of analyses, participants’ mean raw and standard scores on the BSRA, Brigance, and MRT-6 were examined for differences between groups based on sex and/or race (see Table 2).¹ Significant differences were found for race on the MRT-6 Pre-reading Skills Composite raw [F (2, 65) = 6.17, p < .004] and standard scores [F (2, 65) = 5.57, p = .006]. Post hoc tests were conducted using the Tukey Honestly Significant Differences (HSD) test to reduce Type I error; Black students scored significantly lower than did White students on the MRT-6 Pre-reading scale.

The second set of analyses used the BSRA and Brigance scores to predict retention, referral, and/or teacher’s concern for student’s readiness for first grade as the DV (see Table 3). Five of the 86

¹ Standard scores were used here to examine differences between groups and to help the reader locate group performance in relation to the normative sample.


Table 2 Brigance, BSRA, and MRT-6 Raw and Standard Scores by Race and Sex

Cell entries are means, with standard deviations in parentheses.

                              No. of   Total Sample    Whites          Blacks          Other           Boys            Girls
Instrument                    Items    n = 86          n = 50          n = 30          n = 6           n = 37          n = 49
Brigance K & 1                —a       89.54 (11.45)   91.08 (9.54)    88.27 (11.29)   83.03 (22.72)   86.95 (10.91)   91.49 (11.57)
BSRA SRC Raw                  88       60.76 (15.36)   62.84 (13.33)   59.00 (16.86)   52.17 (21.61)   60.11 (14.71)   61.24 (15.96)
BSRA SRC Standard             —a       97.35 (12.96)   98.64 (11.64)   96.37 (13.71)   91.50 (19.37)   96.05 (11.21)   98.33 (14.18)

                              No. of   Total Sample    Whites          Blacks          Other           Boys            Girls
Instrument                    Items    n = 66          n = 39          n = 25          n = 2           n = 26          n = 40
MRT-6 Pre-reading Raw b       62       47.77 (7.44)    49.69 (6.35)    44.16 (7.83)    55.50 (2.12)    48.12 (6.65)    47.55 (7.99)
MRT-6 Pre-reading Standard b  —        101.63 (11.28)  104.53 (9.63)   96.15 (11.86)   113.33 (3.21)   102.14 (10.08)  101.29 (12.10)

a Number of items not reported for standard scores. b Blacks < Whites, p < .05. BSRA: Bracken School Readiness Assessment; MRT-6: Metropolitan Readiness Tests, 6th Edition; SRC: School Readiness Composite.

Table 3 Effectiveness of BSRA as Predictor of Retention/Referral and/or Teacher Readiness Concern for Total Sample Based on Actual Probability (15%)

                   Predictive Measure a
Outcome Measure    Yes (At-risk)   No (Not at-risk)   Total   Indices
Yes (−)            8               5                  13      Sensitivity = .62
No (+)             3               70                 73      Specificity = .96
Total              11              75                 86
Indices            Positive Predictive Value = .73            Correctly Classified = 91%

a Bracken School Readiness Assessment (BSRA) was the only significant predictor variable (Wilks’ lambda (1, 84) = .62, p = .0001).

students were retained in kindergarten; two of the five were referred for special services. Teachers expressed readiness concern (i.e., not ready for first grade) for 12 of the 86 students, including four of the five retainees. One retained student received a “no concern” rating. Given the small size of these groups, retainees and those for whom teachers had readiness concerns were combined into a single group (n = 13). The 13 students consisted of five Blacks, five Whites, and three of other racial/ethnic heritage. There were six boys and seven girls.

Discriminant analyses were conducted to predict student outcome by group (i.e., at-risk or not at-risk). For these analyses, students were classified as having a negative or poor outcome if they were retained, referred for services, or if teachers had serious concern regarding their readiness for first grade. Students in the positive outcome group were not retained or referred and received teacher ratings of “no concern” for first grade readiness. Discriminant analyses were conducted using the BSRA and Brigance (IVs) to predict group membership (DV). Probability of group membership for the analyses was based on the actual percentage of students with negative and positive outcomes.
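
To illustrate what it means to base the probability of group membership on the actual outcome percentages, the following is a minimal single-predictor sketch of linear discriminant classification with prior probabilities. The article does not describe its software or computations at this level, so the helper names (`fit_groups`, `classify`) and the scores are hypothetical and are not data from the study.

```python
import math
from statistics import mean


def fit_groups(groups):
    """Return per-group means and the pooled within-group variance."""
    means = {name: mean(xs) for name, xs in groups.items()}
    ss = sum((x - means[name]) ** 2 for name, xs in groups.items() for x in xs)
    n = sum(len(xs) for xs in groups.values())
    pooled_var = ss / (n - len(groups))
    return means, pooled_var


def classify(x, means, pooled_var, priors):
    """Assign x to the group with the largest linear discriminant score."""
    def score(name):
        mu = means[name]
        return x * mu / pooled_var - mu ** 2 / (2 * pooled_var) + math.log(priors[name])
    return max(means, key=score)


# Hypothetical screening scores, not data from the study.
groups = {
    "at_risk": [35, 40, 45, 50],
    "not_at_risk": [55, 60, 65, 70, 75, 80],
}
means, pooled_var = fit_groups(groups)

base_rate_priors = {"at_risk": 0.15, "not_at_risk": 0.85}
equal_priors = {"at_risk": 0.5, "not_at_risk": 0.5}

print(classify(52, means, pooled_var, base_rate_priors))  # not_at_risk
print(classify(52, means, pooled_var, equal_priors))      # at_risk
```

With a low base rate for negative outcomes, the cut point shifts so that a borderline score is less likely to be flagged as at-risk than it would be under equal priors.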


Table 4 Multiple Regression Analyses: BSRA SRC and Brigance K & 1 Scores Regressed on MRT-6 Pre-reading Composite

                         Total Sample (n = 66)          Blacks (n = 25)                 Whites (n = 39)
                         r (R²)                         r (R²)                          r (R²)
                         Actual         Correcteda      Actual          Correctedb      Actual          Correcteda
BSRA SRC                 .44a (.20)     .66 (.44)       .003 (.0009)    .005 (<.0001)   .56b (.32)      .82 (.67)
Brigance K & 1 Screen    .02 (.0004)    –               .54a (.30)      –               .0000 (.0000)   –

a Correlations were corrected for restriction in range; corrected correlation coefficients were not available for the Brigance because it does not provide standard scores and standard deviations. b p < .05.

For the total sample (n = 86), actual probability of negative outcomes was 15%. In the discriminant analysis, the BSRA was the only significant variable in the function (Wilks’ lambda (1, 84) = .62; see Table 3). In this instance, 90.7% of the students were correctly classified. A sensitivity index of .62 shows that 62% of the students with poor outcomes were identified by the BSRA as at-risk. The BSRA’s specificity value was high at .96. In other words, of the children with positive outcomes, 96% had been identified as not at-risk. The positive predictive value of the BSRA SRC was .73 (i.e., 73% of students identified as at-risk had negative outcomes).
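
The indices reported for Table 3 follow directly from its four cell counts and the formulas in Table 1. The short sketch below recomputes them; the variable names are illustrative.

```python
# Cell counts from Table 3, labeled as in Table 1.
a, b = 8, 5    # poor outcome: flagged at-risk / not flagged
c, d = 3, 70   # good outcome: flagged at-risk / not flagged

sensitivity = a / (a + b)
specificity = d / (c + d)
positive_predictive_value = a / (a + c)
hit_rate = (a + d) / (a + b + c + d)

print(f"{sensitivity:.2f} {specificity:.2f} {positive_predictive_value:.2f} {hit_rate:.1%}")
# 0.62 0.96 0.73 90.7%
```

Note that 5 of the 13 students with poor outcomes were not flagged, which is why sensitivity is the weakest of the four indices even though the overall hit rate is high.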

In the third set of analyses, a multiple regression analysis was conducted to determine what portion of the variance in the MRT-6 Pre-reading Composite score was accounted for by the BSRA and Brigance scores (see Table 4). As the MRT-6 score differed significantly by race, a total sample analysis was conducted, and then separate analyses were performed for Blacks and Whites. Because there were only two students with MRT-6 scores in the “Other” racial/ethnic grouping, their scores were omitted from these separate analyses.

For the total sample, the BSRA SRC accounted for 20% of the variance in the MRT-6 Pre-reading raw score. The Brigance score contributed less than .01% to that amount. Given the restricted range of the scores in this sample, corrected correlation coefficients were computed for the BSRA and MRT-6 standard scores; the Brigance K & 1 Screen does not provide standard scores, so corrected coefficients could not be computed. The corrected r² for the total sample was .44 (i.e., 44% of the variance in the MRT-6 Pre-reading scores was accounted for by the BSRA score).
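The article does not state which formula was used to correct for restriction in range. A common choice for direct restriction on the predictor is Thorndike's Case 2 correction, sketched below as an illustration only. The function name, the normative standard deviation of 15, and the restricted sample standard deviation of 10 are all assumptions introduced for this example; the sample standard deviations are not reported in this section, so the output is not expected to reproduce the corrected values in Table 4.

```python
import math


def correct_for_range_restriction(r, sd_unrestricted, sd_restricted):
    """Thorndike Case 2 correction for direct range restriction.

    r               -- correlation observed in the restricted sample
    sd_unrestricted -- predictor SD in the reference (normative) population
    sd_restricted   -- predictor SD in the restricted sample
    """
    k = sd_unrestricted / sd_restricted
    return (r * k) / math.sqrt(1 - r**2 + (r**2) * (k**2))


# Observed BSRA SRC / MRT-6 correlation for the total sample (from Table 4).
observed_r = 0.44

# ASSUMPTION: standard scores with a normative SD of 15.
normative_sd = 15.0

# HYPOTHETICAL: the restricted sample SD is not given in the text.
sample_sd = 10.0

corrected_r = correct_for_range_restriction(observed_r, normative_sd, sample_sd)
print(f"corrected r   = {corrected_r:.2f}")
print(f"corrected r^2 = {corrected_r ** 2:.2f}")
```

The narrower the sample's score range relative to the normative range, the larger the upward adjustment; when the two standard deviations are equal, the correlation is returned unchanged.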

When the analysis was performed for Black students only, the Brigance accounted for 30% of the variance, and the BSRA SRC contributed less than .01%. For Whites, the amount of variance explained by the BSRA was 32% with the Brigance adding less than .01%. The corrected BSRA coefficient for Whites only was .82.

DISCUSSION

For the total sample, the BSRA proved to be a good predictor of student outcomes. It correctly identified the majority of students who were retained, referred for services, and/or rated by teachers as possibly not ready for first grade. The classification hit rate of 90.7% displays excellent overall accuracy. However, these numbers might mask the possibility of differential predictive validity for the BSRA. When evaluating the BSRA and Brigance in relation to MRT-6 scores, the two measures performed differently based on racial/ethnic status. It was not possible to conduct separate discriminant validity analyses for Blacks and Whites because of the small sample size; furthermore, only 13 of the 86 students had negative outcomes by retention/referral and/or teacher readiness ratings, greatly reducing the power of statistical analyses to detect differences if the sample were divided by race. Further research with a larger sample will be needed to determine whether the BSRA works as well for classifying Blacks as Whites.

For the total sample, the Brigance K & 1 Screen did not significantly add to the predictive power of the BSRA and so did not enter the discriminant function. Schools using the Brigance as a screening measure should therefore reconsider its usefulness. At the very least, school psychologists should play an active role in evaluating the efficacy of the Brigance (as a kindergarten screening measure) in their schools and districts. School psychologists are best equipped to evaluate screening measures or batteries and to assist in the development and continued evaluation of screening procedures.

As discussed above, a screening measure’s performance should also be evaluated by its specificity, sensitivity, and positive predictive value. With a ratio of .96, the BSRA exhibited excellent specificity, especially compared to the .81 average Gredler found in his review. Of the 73 students with positive outcomes in the total sample, 70 were correctly classified as not at-risk. In other words, the BSRA produced only three “false positives” (students flagged as at-risk who went on to have positive outcomes).

The BSRA’s sensitivity ratio of .62 is lower than the average ratio reported by Gredler (1992) in his analyses of 12 screening measures; however, the sample size of the current study was very small, which had a negative effect on the analyses. With only 13 at-risk students, the discriminant functions lacked the statistical power needed to perform more accurately.

It is important to recognize, too, that school psychologists are typically dealing with small numbers of students for whom teachers have academic or readiness concerns. For instance, the two schools in this sample retained only 5 of the 86 participants. As noted above, retention occurs for a variety of reasons, some of which are nonacademic in nature. Choosing a single screening measure or battery to identify students who might be retained is difficult. When consulted by educators making such decisions, school psychologists should be aware of the many factors affecting retention as well as the appropriate use of screening instruments.

A more appropriate outcome variable than retention may be teacher ratings of student readiness for the next grade. These ratings are important because decisions, such as retention or referral for services, often rely on teachers’ opinions regarding a child’s overall performance and ability. This is especially true at the kindergarten level where less standardized testing is conducted. In this study, the outcome classification relied primarily on teacher ratings of their concern for students’ first grade performance. However, the nature of this criterion against which the BSRA is being judged (teacher ratings) was not evaluated for reliability or validity. It would have been beneficial to measure students’ actual first grade performance to determine teacher accuracy. Given the absence of data here, there may be “criterion contamination” (i.e., the BSRA is being evaluated against a criterion measure that may be less accurate than the BSRA is). In addition to obtaining longitudinal data, it may have been useful to use a less generic question, in other words, to ask teachers to specifically identify their concerns.

Regardless of the limitations of this study, future research is needed to determine the accuracy and consistency of teachers’ judgments. Considerable attention has been given to child-related variables, yet teacher-related factors are as important, if not more so. Teachers’ decision making (for retention/referral or regarding readiness) must be examined in some depth to balance our understanding of the factors affecting student success and the ways in which success is judged.

Regarding the positive predictive value of the BSRA, the ratio of .73 for the total sample is considerably higher than the .55 average value from Gredler’s 1992 review. In other words, 73% (8 of 11) of the students identified as at-risk had poor outcomes and were correctly identified. Only 3 (27%) of the 11 were incorrectly classified.
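The classification indices discussed above can be recomputed from the counts given in the text: 86 students in total, 13 with negative outcomes, 11 identified as at-risk (8 of them correctly), and 70 of the 73 students with positive outcomes classified as not at-risk. The sketch below is a minimal illustration; the function name and dictionary keys are choices made for this example, and the count of 5 missed students is derived here as 13 minus 8 rather than quoted from the article.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return standard screening indices from a 2 x 2 classification table.

    tp -- at-risk on the screen, negative outcome (true positives)
    fp -- at-risk on the screen, positive outcome (false positives)
    fn -- not at-risk on the screen, negative outcome (false negatives)
    tn -- not at-risk on the screen, positive outcome (true negatives)
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictive_value": tp / (tp + fp),
        "hit_rate": (tp + tn) / total,
        "base_rate": (tp + fn) / total,
    }


# Counts taken or derived from the text for the total sample (n = 86).
metrics = screening_metrics(tp=8, fp=3, fn=5, tn=70)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# sensitivity: 0.615
# specificity: 0.959
# positive_predictive_value: 0.727
# hit_rate: 0.907
# base_rate: 0.151
```

Rounded to two decimals, these agree with the reported sensitivity (.62), specificity (.96), positive predictive value (.73), hit rate (90.7%), and 15% rate of negative outcomes.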

The BSRA accounted for a significant portion of the variance in MRT-6 Pre-reading scores for the total sample and for Whites. However, it performed less well in this regard than in a previous study (Panter, 1997, 2000), especially for Black students. There are two factors specific to this study that may explain the overall difference. First, the MRT-6 was administered in the last 2 weeks of school because of ordering and scheduling limitations. This time of the year is quite hectic, and the students are involved in a range of “fun” activities that may make formal testing problematic. Second, many of the students were absent during that time, changing the classroom climate and testing procedures, as the total battery was not given; these factors may have lowered the reliability of the MRT-6 results.

Apart from the overall limitations of the MRT-6 data for this study, it is important to consider the differences between White and Black students. As noted above, if instruments exhibit differential predictive validity, school psychologists must interpret test results accordingly. In this study, Blacks scored lower than Whites on the MRT-6; also, the Brigance exhibited predictive power for the MRT-6 for Blacks only, and the BSRA was not significant for that group. The reasons for these differences are not clear, but two possibilities should be considered. First, the difference in the number of Black students (25) compared to White students (39) is substantial and likely to have affected the statistical analyses.

The second possibility is that Black students may have scored lower than Whites because of a preparation gap (e.g., Mead, 2004). Research shows that students from racial/ethnic minority groups often begin school less well prepared than their White peers, resulting in lower achievement. This gap is often attributed to differential access to preschool programs. Of the 30 Blacks in this sample, though, only 9 reported no preschool (preschool = 12; Head Start = 9). Whites showed similar proportions: preschool = 31; Head Start = 3; and none = 16. However, the programs that these students attended may have differed in quality of materials, teacher training, instruction, and other important ways. These differences may account for some portion of the discrepancy in performance. Given the importance of preschool involvement and preparation, school districts would benefit from providing school psychological expertise to programs in their communities not located within the public schools.

Regarding the prediction of Black students’ MRT-6 Pre-reading scores, it is not clear why the Brigance predicted performance whereas the BSRA was not significant. The BBCS-R SRC (tryout version) was a solid predictor of MRT-6 scores for Blacks in the previous study, so this finding may be an artifact of the problems described earlier in this article. It is also possible that MRT-6 scores have differential validity based on racial/ethnic minority status. Further research is needed to examine this possibility.

One reason for the success of the BSRA may be that its content is directly related to school performance. Bracken and Crawford (2007) surveyed all 50 U.S. states’ early childhood educational standards and found that the BBCS exceeded all states’ educational standards in the area of concept development. Furthermore, basic concepts are an important component of school performance and have been shown to affect students’ functioning in terms of conduct and formal testing, and in other areas (Bracken, 2002; Flanagan, Alfonso, Kaminer, & Rader, 1995). Lastly, Gredler’s (1992) standards address the instrument’s reliability and validity. The BSRA has proven to be a reliable and valid measure of students’ receptive language and concept acquisition. In addition, this study shows it to be a sound predictor of kindergarten performance.

Some authors recommend that screening batteries be multidimensional and evaluate behavior across contexts and content domains. Although the recommendation has considerable merit, conducting such a complex and involved screening has practical limitations in terms of cost, ease of administration, and timely completion. It seems preferable, then, to provide schools with an effective and affordable screening tool, such as the BSRA, that can be used to identify students in need of further assessment or targeted interventions.

Future research would benefit from gathering longitudinal data regarding students’ academic performance in first grade and their scores on local or state-mandated assessments. In addition, research into teachers’ decision-making processes, especially regarding retention and readiness, would be invaluable.

REFERENCES

Bordignon, C. M., & Lam, T. C. M. (2004). The early assessment conundrum: Lessons from the past, implications for the future. Psychology in the Schools, 41, 737 – 749.

Bracken, B. A. (1984). Bracken Basic Concept Scale. Chicago: The Psychological Corporation.

Bracken, B. A. (1987). Limitations of preschool instruments and standards for minimal levels of technical adequacy. Journal of Psychoeducational Assessment, 4, 313 – 326.

Bracken, B. A. (1998). Bracken Basic Concept Scale-Revised. San Antonio, TX: The Psychological Corporation.

Bracken, B. A. (2002). Bracken School Readiness Assessment. San Antonio, TX: The Psychological Corporation.

Bracken, B. A., & Crawford, E. (2007). Basic concepts in early childhood educational standards: A 50-state review. Manuscript submitted for publication.

Brigance, A. (1991). Brigance Inventory of Early Development. North Billerica, MA: Curriculum Associates.

Brigance, A. (1992). Brigance K & 1 Screen (3rd ed.). North Billerica, MA: Curriculum Associates.

Carlton, M. P., & Winsler, A. (1999). School readiness: The need for a paradigm shift. School Psychology Review, 28, 338 – 352.

Chatterji, M. (2006). Reading achievement gaps, correlates, and moderators of early reading achievement: Evidence from the Early Childhood Longitudinal Study (ECLS) kindergarten to first grade sample. Journal of Educational Psychology, 98, 489 – 507.

Dunn, L. M., & Dunn, L. M. (1997). Peabody Picture Vocabulary Test-III. Circle Pines, MN: American Guidance Services.

Elliott, C. D. (1990). Differential Abilities Scale. San Antonio, TX: Psychological Corporation.

Flanagan, D. P., Alfonso, V. C., Kaminer, T., & Rader, D. E. (1995). Incidence of basic concepts in the directions of new and recently revised American intelligence tests for preschoolers. School Psychology International, 16, 345 – 364.

Germain, R. B., & Merlo, M. (1991). Best practices in assisting promotion and retention decisions. In Student grade retention: A manual for parents and educators. Silver Spring, MD: National Association of School Psychologists.

Glascoe, F. P. (1997). Technical report for the Brigance screens. North Billerica, MA: Curriculum Associates.

Gredler, G. R. (1992). School readiness: Assessment and educational issues. Brandon, VT: Clinical Psychology Publishing Company, Inc.

Gredler, G. R. (1997). Issues in early childhood screening and assessment. Psychology in the Schools, 34, 99 – 106.

Gresham, F. M., & Elliot, S. N. (1990). Social Skills Rating System. Circle Pines, MN: American Guidance Service.

Hong, G., & Raudenbush, S. W. (2005). Effects of kindergarten retention policy on children’s cognitive growth in reading and mathematics. Educational Evaluation and Policy Analysis, 27, 205 – 224.

Howell, K. K., & Bracken, B. A. (1992). Clinical utility of the Bracken Basic Concept Scale as a preschool intellectual screener: Comparisons with the Stanford-Binet for African-American children. Journal of Clinical Child Psychology, 21, 255 – 261.

Ilg, F. L., Ames, L. B., Haines, J., & Gillespie, C. (1978). School readiness: Behavior tests used at the Gesell Institute. New York: Harper & Row.

Kaufman, A. S., & Kaufman, N. L. (1983). Kaufman Assessment Battery for Children. Circle Pines, MN: American Guidance Service.

Laughlin, T. (1995). The School Readiness Composite of the Bracken Basic Concept Scale as an intellectual screening instrument. Journal of Psychoeducational Assessment, 13, 294 – 302.

Mantzicopoulos, P. (1999a). Reliability and validity estimates of the Brigance K & 1 Screen based on a sample of disadvantaged preschoolers. Psychology in the Schools, 36, 11 – 19.

Mantzicopoulos, P. (1999b). Risk assessment of Head Start children with the Brigance K & 1 screen: Differential performance by sex, age, and predictive accuracy for early school achievement and special education placement. Early Childhood Research Quarterly, 14, 383 – 408.

May, D., & Kundert, D. (1992). Kindergarten screenings in New York State: Tests, purposes, and recommendations. Psychology in the Schools, 29, 35 – 41.

McIntosh, D. E., Wayland, S. J., Gridley, B., & Barnes, L. L. B. (1995). The relationship between the Bracken Basic Concept Scale and the Differential Abilities Scale with a preschool sample. Journal of Psychoeducational Assessment, 13, 39 – 48.

Mead, S. (2004). Open the preschool door, close the preparation gap. Progressive Policy Institute Policy Report. Retrieved December 23, 2007, from http://www.ppionline.org/documents/PreK 0904.pdf

Meisels, S. J. (1992). Doing harm by doing good: Iatrogenic effects of early childhood enrollment and promotion policies. Early Childhood Research Quarterly, 7, 154 – 174.

Meisels, S. J. (1995). Out of the readiness maze. Momentum, 26, 18 – 22.

National Association for the Education of Young Children. (1995). School readiness: A position statement of The National Association for the Education of Young Children. Washington, DC: Author.

National Center for Education Statistics. (2003). Schools’ use of assessments for kindergarten entrance and placement: 1998-99. Washington, DC: U.S. Department of Education.

Nurss, J. R., & McGauvran, M. E. (1995). Metropolitan Readiness Tests (6th ed.). San Antonio, TX: The Psychological Corporation.

Panter, J. (2000). Validity of the Bracken Basic Concept Scale-Revised for predicting performance on the Metropolitan Readiness Test (6th ed.). Journal of Psychoeducational Assessment, 18, 104 – 110.

Panter, J. E. (1997). Assessing the school readiness of kindergarten children. Unpublished doctoral dissertation, University of Memphis, Memphis, Tennessee.

Saluja, G., Scott-Little, C., & Clifford, R. (2000). Readiness for school: A survey of state policies and definitions. Early Childhood Research and Practice, 2 (2). Retrieved February 15, 2007, from http://ecrp.uiuc.edu/v2n2/index.html

Shepard, L. A. (1997). Children not ready to learn? The invalidity of school readiness testing. Psychology in the Schools, 34, 85 – 97.

Stebbins, M. S., & McIntosh, D. E. (1996). Decision-making utility of the Bracken Basic Concept Scale in identifying at-risk preschoolers. School Psychology International, 17, 293 – 303.

Sterner, A. G., & McCallum, R. S. (1988). Relationship of the Gesell Developmental Exam and the Bracken Basic Concept Scale to academic achievement. Journal of School Psychology, 26, 297 – 300.

Thorndike, R. L., Hagen, E. P., & Sattler, J. M. (1986). The Stanford-Binet Intelligence Scale (4th ed.). Chicago: Riverside.

Tramontana, M. G., Hooper, S. R., & Selzer, S. C. (1988). Research on the preschool prediction of later academic achievement: A review. Developmental Review, 8, 89 – 146.

Wechsler, D. (1989). Wechsler Preschool and Primary Scale of Intelligence-Revised. Chicago: Psychological Corporation.

Woodcock, R. W., & Mather, N. (1990). Woodcock-Johnson Tests of Achievement-Revised. Allen, TX: DLM Teaching Resources.

Zimmerman, I. L., Steiner, V. G., & Pond, R. E. (1992). Preschool Language Scale-3. San Antonio, TX: The Psychological Corporation.