Testing


BASC-3 Behavioral and Emotional Screening System

Review of the BASC-3 Behavioral and Emotional Screening System by THOMAS P. HOGAN, Professor of Psychology and Distinguished University Fellow, University of Scranton, Scranton, PA: DESCRIPTION. The BASC-3 Behavioral and Emotional Screening System (BASC-3 BESS), as suggested by its title, is designed to provide a quick screen for problems among children in the age range 3-18 years. It is essentially a short form of the Behavior Assessment System for Children, Third Edition (BASC-3). The test consists of five components: teacher and parent forms for preschool (ages 3-5) and child/adolescent (Grades K-12) and a student form for Grades 3-12. Each form is a single two-sided sheet with identification information and instructions on the front and test items on the back. Each form has a corresponding hand-scoring worksheet where responses may be transcribed, scored, and tallied into index scores (subscores) and a total score. Alternatively, the user may employ the test publisher’s Q-global system, which entails digital administration and scoring. For teacher and parent versions, the preschool form contains 20 and 29 items, respectively, as does the child/adolescent form. The student self-report form has 28 items. All forms are available in English; parent and student forms are available in both English and Spanish. Most items consist of brief descriptions of potential problems, for example (not actual items but indicative), “seems nervous” or “pokes other children.” A few items indicate positive behaviors or dispositions, for example (indicative only), “cooperates with others” or “seems happy.” The student form incorporates self-referencing in the statements: “I …” All responses are on a 4-point scale: N = never, S = sometimes, O = often, A = almost always. Instructions for completing the forms are simple and clear. 
The forms themselves are laid out cleanly, although considering the amount of blank space on both sides of each form, using a larger font size would make sense.

Each form yields a total score called the Behavioral and Emotional Risk Index (BERI). Each parent and teacher form includes the following subscores: Externalizing Risk Index, Internalizing Risk Index, Adaptive Skills Risk Index, and the F Index, the first three of which sum to the BERI along with several additional items that contribute to the total BERI but not to any of the subscores. The F Index draws on items from the other index categories. The student form has the total BERI and subscores for Internalizing Risk Index, Self-Regulation Index, and Personal Adjustment Risk Index; once again the F Index uses items from the other categories. Subscores contain 5-10 items. The F Index provides a measure of the tendency for the respondent to view the child in an excessively negative manner. It derives from extreme (N or A) marks for selected items. Two other validity indexes are provided only when using the Q-global scoring system. These are the Consistency Index, based on inconsistent responses to pairs of similar items, and the Response Pattern Index, based on analysis of pattern-marking of responses. High scores on any of the validity indexes lead to cautionary notes about interpreting other scores. The BASC-3 BESS manual notes that any of the components can be used singly or in combination. For example, a school might use the teacher, parent, and student forms for all students in a grade, or only the teacher form, or perhaps the teacher and student forms. The test manual recommends use in the three-tier screening system widely referenced by school psychologists: an initial, quick screen to spot potential problems, followed by a more thorough assessment (e.g., with the full BASC-3), and finally, a complete, highly individualized evaluation, with each of the second and third steps implemented as warranted by results from earlier steps. The test manual describes how such a tiered system might work. 
The test manual contains useful suggestions for administering the forms in various settings, what appear to this reviewer to be reasonable estimates of administration and scoring times, and practical advice on dealing with the thorny issue of securing parental permission.

DEVELOPMENT. BASC-3 BESS items consist of a selection of items from the full BASC-3. A table in the test manual conveniently cross-references BASC-3 BESS items to the corresponding BASC-3 scales. Selection of items depended primarily on results of principal components analysis of the BASC-3 item pool, complemented by examination of consistency of results across forms. Previously, the BASC-3 item pool had been subjected to bias analyses, described in the BASC-3 manual but not in the BASC-3 BESS manual. Following the item selection process, the final forms were subjected to confirmatory factor analysis (CFA), with the total BERI score as a second-order factor and the main subscores as primary factors. The test manual notes that CFA fit indices were generally marginal, a surprising result in light of how the items were selected.

TECHNICAL. Norms for both the teacher and parent forms are given for the following age groups: 3, 4-5, 6-7, 8-11, 12-14, and 15-18, with the last three groupings also used for the student form. Both gender-separate and gender-combined norms appear for each age group. Total BERI scores convert to T scores and percentile ranks. Subscores use a three-category classification based on raw scores (no normative conversions): normal risk, elevated risk, and extremely elevated risk. The same classification system applies to the total BERI score but based on T scores, with scores below 61, 61-70, and 71+ corresponding, respectively, to the three category labels. Validity indexes also use raw score categories resulting in labels of acceptable, caution, and extreme caution, generally corresponding to the extreme 1-2% of response distributions for these indexes. Standardization samples used for determining norms included 1,618 cases for the teacher form; 1,659 for the parent form; and 899 for the student form, all derived from a stratified sampling plan based on groupings by age, gender, parent education, race/ethnicity, and geographic region.
Cases were obtained by trained recruiters and independent examiners from daycare, school, and clinic settings to fill the sampling strata, essentially in a quota-sampling manner.
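
As a minimal sketch of the T-score classification described above (not the publisher's scoring software), the cutoffs for the total BERI score could be expressed in Python as follows. The function name and the assumption that T scores are reported as whole numbers are illustrative and do not come from the test manual.

```python
def classify_beri(t_score: int) -> str:
    """Map a total BERI T score to one of the three risk categories.

    Cutoffs follow the review: below 61, 61-70, and 71 or higher.
    Assumes whole-number T scores (an assumption for this sketch).
    """
    if t_score < 61:
        return "normal risk"
    if t_score <= 70:
        return "elevated risk"
    return "extremely elevated risk"


if __name__ == "__main__":
    for t in (50, 61, 70, 71):
        print(t, classify_beri(t))
```

Subscores would need a separate lookup, because the review states that they are classified from raw scores rather than from T scores.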

The test manual reports internal consistency, test-retest, and interrater reliability coefficients for the total BERI and subscores, except for the validity indexes, for all forms. Internal consistency data are reported separately by the age and gender groups used to develop norms, based on the full norming sample. Test-retest data are reported for the full forms (not separately by age groups or gender) for samples ranging in size from 53 to 212, with time intervals averaging about 3 weeks. Interrater reliability coefficients are reported for the parent and teacher forms using samples drawn from the same pool used for the test-retest studies; for these data, both parents completed a form on a child, and two different teachers completed a form for a child, respectively. Standard errors of measurement are given in T score units for the total BERI score and in raw score units for subscores, based on the internal consistency coefficients. Internal consistency coefficients (Spearman-Brown corrected split-half) for the total BERI ranged from .91 to .96, with a median of .95; alpha coefficients for subscores ranged from .72 to .93, with medians of .88 for parent and teacher forms and .84 for the student form. In both parent and teacher forms, internal consistency coefficients run about .05 lower for the Internalizing Risk Index than for the other subscores. For all forms, the total BERI score test-retest reliability coefficients (adjusted for variability) ranged from .87 to .93, with a median of .91. Subscore test-retest adjusted coefficients ranged from .76 to .92, with a median of .87. For the parent and teacher forms, interrater reliability coefficients for the total BERI score ranged from .67 to .83, with a median of .77, and subscore coefficients ranged from .52 to .85, with a median of .72.
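
For readers less familiar with the statistics named in the preceding paragraph, the following sketch shows the standard textbook formulas for the Spearman-Brown correction of a split-half correlation and for the standard error of measurement. The function names and input values are hypothetical; the review does not report the manual's actual SEM values, so the figures here are illustrative only.

```python
import math


def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability), in the same units as SD."""
    return sd * math.sqrt(1 - reliability)


if __name__ == "__main__":
    # Hypothetical half-test correlation of .90
    print(round(spearman_brown(0.90), 3))
    # T scores have a standard deviation of 10; .95 is the median BERI coefficient
    print(round(standard_error_of_measurement(10, 0.95), 2))
```

With a T-score standard deviation of 10 and a reliability of .95, this formula gives an SEM of roughly 2.2 T-score points.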
Validity evidence consists primarily of correlations between BASC-3 BESS scores and a host of other measures, including the BASC-3, the Achenbach System of Empirically Based Assessment (ASEBA), the Conners 3, the Autism Spectrum Rating Scales (ASRS), the Children’s Depression Inventory 2, and the Revised Children’s Manifest Anxiety Scale: Second Edition. Sample sizes for these studies ranged from 39 to 173, with a median of 68. The test manual also reports accuracy of classification rates in terms of sensitivity, specificity, positive predictive power, and negative predictive power for BASC-3 BESS versus BASC-3 scores, based on contrasting a combination of the two elevated categories in each instrument with the nonelevated categories. Finally, the test manual presents average BERI T scores for children identified by parents in the standardization program as having been classified with a behavioral, emotional, or learning problem.
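
The four classification-accuracy statistics named above follow directly from a 2 x 2 table of screener results against the criterion (here, elevated versus nonelevated on the full BASC-3). The sketch below uses the standard definitions; the function name and the counts are hypothetical and are not taken from the test manual.

```python
def classification_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy statistics from a 2 x 2 screening table.

    tp: elevated on both screener and criterion
    fp: elevated on screener, nonelevated on criterion
    fn: nonelevated on screener, elevated on criterion
    tn: nonelevated on both
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictive_power": tp / (tp + fp),
        "negative_predictive_power": tn / (tn + fn),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only
    for name, value in classification_accuracy(tp=40, fp=10, fn=10, tn=140).items():
        print(f"{name}: {value:.2f}")
```

Predictive power depends on the base rate of elevated cases in the sample, which is one reason a fuller discussion of these statistics is useful to screening users.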

COMMENTARY. For the most part, the array of technical information reported for the BASC-3 BESS is exemplary in terms of completeness of design, clarity of presentation, and interpretive commentary. The test authors are to be commended for their forthright presentation and analysis of data, admitting, for example, when results are (just) adequate when that is the case, rather than sugar-coating outcomes, as often happens in test manuals. In general, reliability is strong, especially for the total BERI score. Understandably, subscore reliabilities are lower but, for the most part, still quite respectable. Special caution is required when interpreting the Internalizing Risk Index, which for some reason rather consistently underperforms other scales. Interrater reliability coefficients are clearly lower than for the other reliability coefficients, but the interrater context here is rather different from that in many other contexts. Mother and father are unlikely to have the same experiential base for viewing a child; two different teachers are unlikely to have the same degree and type of familiarity with a child. One weak spot in the array of reliability data is the absence of any data for the validity indexes (F, Consistency, and Response Pattern). The Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014) explicitly state in Standards 2.0 and 2.3 that reliability data should be provided for any scores reported. The test manual contains no such data although the validity indexes are, arguably, scores to be interpreted. Presumably the raw data are available in the files for the test-retest and interrater reliability studies. The norming process, numbers of cases, and methods of analyzing and presenting data represent solid professional practice. 
Of course, the quota-sampling method employed leaves us with the vagaries of not knowing who did not participate and for what reasons. But that problem is virtually unavoidable for these types of norming projects. In general, the correlational studies presented as validity evidence reflect favorably on the BASC-3 BESS, both in terms of the total BERI score and the subscores. Correlations with a host of relevant measures are generally strong and in the expected directions. Results of the confirmatory factor analysis, reported in the test manual under item selection rather than validity, are curious. Why would a better fit not have emerged? The test manual does not speculate about that result, but it is worth pursuing. The presentation on accuracy of classifications (sensitivity, etc.) is promising but needs additional attention. The test manual (p. 55) concludes that “… a comprehensive review of the statistics associated with this table … is beyond the scope of this manual …” Not really: Such a review is quite relevant. Finally, the BASC-3 BESS manual, with the exceptions already noted, is quite complete, covering description of the instrument, possible uses, and technical matters. The manual could note the presence of webinars on the publisher’s website with one of the test authors describing the instrument and its use in a comprehensive three-tier assessment process. It also would be helpful for the test manual to include sample reports for individuals and groups, a practice followed in many test manuals.

SUMMARY. The BASC-3 Behavioral and Emotional Screening System provides an effective instrument for a quick, initial screen for emotional and behavioral problems among children in the target age range. It can be recommended for such usage, provided the user acknowledges that it is a preliminary, not a final, assessment. The forms are simple and easy to use and score. Technical documentation is quite complete.
Follow-up assessment is clearly warranted for children scoring in the elevated ranges. Emphasis should focus on the total BERI score. Subscores should be used cautiously.

REVIEWER’S REFERENCE American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Review of the BASC-3 Behavioral and Emotional Screening System by CHRISTOPHER A. SINK, Professor and Batten Chair, Counseling and Human Services, Old Dominion University, and KRISTY L. CARLISLE, Doctoral Candidate and Graduate Teaching Assistant, Department of Counseling and Human Services, Old Dominion University, Norfolk, VA: DESCRIPTION. The BASC-3 Behavioral and Emotional Screening System (BASC-3 BESS) is a brief instrument for use in schools, mental health clinics, pediatric clinics, and community health and research settings to screen behavioral and emotional strengths and weaknesses in children from preschool through high school. Though not a comprehensive diagnostic assessment, it is a tool used to assess children’s risk level for behavioral and emotional problems requiring intervention. The BASC-3 Behavior Intervention Guide links a child’s BASC-3 BESS score to evidence-based interventions. The BASC-3 BESS consists of three multi-informant screening measures to be completed by teachers, parents, and students. The measure includes two 20-item teacher forms: one for preschool (ages 3 through 5) and one for children and adolescents (Grades K through 12). Two parent/caregiver rating forms (29 items) are available in English and Spanish to assess preschool and child/adolescent behaviors. Teacher and parent forms each require about 5 minutes to complete. The English and Spanish versions of the 28-item student form (Grades 3 to 12) take 15 minutes or less to group-administer. Each form employs the same 4-point frequency response scale (never, sometimes, often, and almost always) to describe how often children may act, think, or feel. All forms can be administered traditionally with paper and pencil; digital options for administration and scoring also are available from the test publisher’s Q-global platform.

Although the most comprehensive perspective is obtained using the parent, teacher, and student measures, examiners can use the forms individually or in various combinations depending on their settings and the needs of their students/clients. Completed forms yield an overall score–the Behavioral and Emotional Risk Index (BERI)–in addition to subindexes. The teacher and parent forms provide the Externalizing Risk Index (ERI), the Internalizing Risk Index (IRI), and the Adaptive Skills Risk Index (ARI). Examples of behaviors from the ERI include hyperactivity, aggression, and conduct problems. The IRI measures behaviors associated with anxiety, depression, and somatization. The ARI assesses behaviors related to adaptability, social skills, and study skills. The student form generates these outcomes: the Internalizing Risk Index (IRI), the Self-Regulation Risk Index (SRI), and the Personal Adjustment Risk Index (PRI). Behaviors on the SRI are associated with self-control, and the PRI identifies problems with interpersonal relationship skills, self-esteem, and self-reliance. In addition to individual reports, group-level reports are available for all indexes. The test authors suggest a multistage model for screening, assessment, intervention, and monitoring of behavioral and emotional problems as part of a comprehensive program. The BASC-3 BESS, though limited to screening for risk status, can lead to more advanced assessment practices like diagnosis and prognosis, as well as early intervention, treatment, and prevention practices. Those who are identified as being at an elevated risk for behavioral and emotional problems may need further professional assessment to inform intervention, diagnosis, and treatment. Examiners need not be testing experts to administer the BASC-3 BESS; the accompanying materials include detailed instructions for administration, scoring, and interpretation. 
They also clearly describe which children to appraise, which forms to use, how often to assess the target population, and methods of choosing suitable raters. Useful examiner instructions and suggestions are provided to assist informants with the rating process and procedures. Effective ways to communicate testing outcomes to parents/guardians also are included. Detailed instructions are provided for using the hand-scoring worksheet. Should test administrators desire immediate scoring and reporting, they can use computer software (Q-global). To inform accuracy of score interpretation, the test includes several relevant indexes. First, the F Index is a validation metric to identify respondents with tendencies to rate a child’s behaviors in an overly negative way. Second, the Consistency Index assists in identifying respondents who provide differing responses to items typically answered similarly. Third, the Response Pattern Index can be used to recognize invalid rater forms, particularly those that may have responded in an inattentive or patterned way. Consistency and Response Pattern Indexes are available only by using the Q-global software.

Other pertinent indexes are well explained, and the manner by which examiners interpret them can be found in the test manual. For instance, the overall BERI score and the interpretation of the subindex scores on each form are explicated in a relatively uncomplicated manner. BERI scores correspond to one of three levels of risk: T scores no more than 1 standard deviation above the mean of 50 (i.e., 60 or lower) reflect a normal level of risk; T scores 1 to 2 standard deviations above the mean (i.e., 61-70) indicate an elevated risk level; and T scores more than 2 standard deviations above the mean (i.e., 71+) suggest an extremely elevated level of risk. It should be noted that BERI T scores are not normalized; however, combined-gender norm and separate-gender norm options are available. Subindex scores are not interpreted using T scores or percentile ranks, but with raw scores. These latter values fall into ranges representing less than 1 standard deviation above (for nonadaptive subindex scores) or below (for adaptive subindex scores) the mean (normal risk); 1 to 2 standard deviations above or below the mean (elevated risk); and more than 2 standard deviations above or below the mean (extremely elevated risk). Subindex scores may be useful for specifying areas of dysfunction when an overall BERI score is elevated.

DEVELOPMENT. The BASC-3 BESS was developed to assess a wide variety of behaviors and emotions, focusing on both strengths and weaknesses of children and adolescents from preschool through Grade 12. Ratings are gathered from teachers, parents, and students using brief forms that can be administered to individuals or groups. The measure was developed alongside the Behavior Assessment System for Children, Third Edition (BASC-3), Teacher Rating Scales (TRS), Parent Rating Scales (PRS), and Self-Report of Personality (SRP).
The steps of the development process would be clearer if the test authors had included more information about the TRS, PRS, and SRP and explained how they differ from the teacher, parent, and student forms in the BASC-3 BESS. The test authors indicate BASC-3 BESS item selection started with items that had been retained following bias analyses on the BASC-3 standardization project items, and the BASC-3 BESS manual refers readers to the BASC-3 manual for details. Item selection for the BASC-3 BESS was focused on the goal of producing a single total score from each teacher, parent, and student form. This total score was designed to identify the presence or absence of behavioral and emotional problems. Composite scores would detect specific areas of dysfunction once an elevated total score was observed.

A series of principal components analyses (PCA) was implemented on the items from each composite scale of each BASC-3 form (i.e., externalizing problems, internalizing problems). The type of rotation was not reported. Items were selected based on the following criteria: (a) generated the highest loadings on each composite scale, (b) provided unique content, and (c) maintained similar psychometric properties across forms and levels. Subsequently, a confirmatory factor analysis (CFA) was conducted to evaluate the reliability and factorial validity of the initial item selections. This analysis led to the inclusion of the overall BERI score in the teacher and parent forms, along with the ERI, IRI, and ARI subindexes. For the student form, the CFA supported the inclusion of the overall BERI score, as well as the IRI, SRI, and PRI subindexes. The test authors report many items on the BASC-3 BESS forms also appear on the BASC-3 forms, but the overlapping items and the unique items are not identified in the BASC-3 BESS manual. The standardization forms included more than 775 potential BASC-3 and BASC-3 BESS items.

TECHNICAL. The test authors reported that the BASC-3 standardization sample was representative of the U.S. population in terms of gender, socioeconomic status, race/ethnicity, geographic region, and special-education/gifted status. Between April 2013 and November 2014, more than 9,000 forms were collected for the standardization project from 311 expert examiners in 44 states. Norms were subsequently developed using a sample that included approximately equal numbers of males and females in each age group: 3, 4-5, 6-7, 8-11, 12-14, and 15-18. The sample was stratified by age according to parent education level, race/ethnicity, and geographic region to match the 2013 U.S. Census figures.
Sample sizes for each form were as follows: 1,618 for the teacher forms; 1,659 for the parent forms; and 899 for the student form. For the total BERI score, normative comparisons for each age group and form can be made using separate-gender or combined-gender tables in the test manual. Norms for subindexes on each form were derived from the combined-gender group.

Internal consistency, stability, and interrater reliability analyses are clearly summarized in the technical manual. Internal consistency was assessed using split-half reliability for the BERI scores and alpha coefficients for the subindex scores. Reliability coefficients for the BERI on all teacher, parent, and student forms were excellent, ranging from .91 to .96. Alpha coefficients for the subindex scores on the teacher and parent forms were good to excellent, ranging from .76 to .93, but lower on the student form, ranging from .72 to .90. Test-retest reliability studies were conducted using diverse samples of 53 to 212 participants with time intervals of 1 to 10 weeks, with an average interval of 3 weeks. Coefficients were adjusted for biasing effect resulting from sampling differences in variability of scale or subindex scores. Adjusted coefficients for the total BERI scores on all forms were good to excellent and ranged from .87 to .93. Adjusted coefficients for the subindices were adequate to excellent, ranging from .76 to .92, with most in the upper .80s or above. Interrater reliability studies were conducted by examining ratings of the same child by each parent completing separate parent forms or by two teachers completing separate teacher forms. The studies used a diverse sample of 58 to 231 participants with time intervals of 1 to 2 weeks. Results indicated moderate to high agreement across raters on all forms with adjusted coefficients ranging from .52 to .85 and most values in the .60s to .80s. Construct validity was carefully examined. Correlations between the BERI and the subindex scores for all forms ranged from .68 to .88. Next, convergent evidence for construct validity was provided. Strong correlations between the BASC-3 BESS total BERI score and the BASC-3 Behavioral Symptoms Index (teacher and parents) and Emotional Symptoms Index (self-report) ranged from .90 to .92. 
Correlations between BASC-3 BESS subindex scores and BASC-3 composite scale scores ranged from .86 to .92. Moderately high correlation coefficients were reported between the BASC-3 BESS and the Achenbach System of Empirically Based Assessment Total Problems composite scales, with most values ranging from upper .50s to .70s. The test authors also reported moderate correlation coefficients with the Conners 3 rating scales used to assess ADHD, with most correlations ranging from the .40s to .60s. Moderate correlation coefficients were also reported between the BASC-3 BESS and the Autism Spectrum Rating Scales, ranging from .23 to .58. Correlation coefficients were slightly higher for the Children’s Depression Inventory 2 with most values in the .50s and .60s. Lower coefficients were observed between the BASC-3 BESS and the Revised Children’s Manifest Anxiety Scale: Second Edition with correlations in the upper .30s to lower .50s. Finally, correlations between the BASC-3 BESS BERI and the BASC-2 BESS total scores were extremely high (.95 to .98), providing strong evidence to generalize research conducted on the BASC-2 BESS to the BASC-3 BESS. The test authors describe how to compare BASC-3 scores to BASC-2 scores for longitudinal purposes.

In terms of predictive validity, when comparing average teacher, parent, and student BERI scores to reports of classification or diagnosis of ADHD, autism spectrum disorder (ASD), emotional/behavioral disturbance (EBD), hearing impairment, specific learning disorder, and speech or language disorder, the ASD and EBD groups generated total BERI scores that fell into the elevated risk classification range. All other groups scored in the normal risk range.

COMMENTARY. The BASC-3 BESS is a psychometrically sound screening instrument for assessing behavioral and emotional issues in children and adolescents from preschool through Grade 12. It assesses both risk factors and adaptive skills to produce an overall score indicating risk level for behavioral and emotional issues, while also specifying subindex scores to indicate particular areas of strength or weakness. It is a brief and efficient way to gain information from multiple informants (in English and Spanish) and can be used in a variety of settings. Teacher, parent, and student forms are user-friendly and can be scored easily by hand or by using Q-global software. Items on the three forms were carefully chosen using factor analysis, and both positively and negatively worded items appear to minimize response bias. Validity indexes further inform accuracy and quality of responses. Construct evidence was evaluated by showing convergence with six other instruments. In addition, predictive validity evidence was demonstrated with examinees with diagnoses of autism and behavioral/emotional disturbance, in particular. The test authors noted that although no specialized training is required to administer the BASC-3 BESS, decisions about intervention and treatment require professional input. Furthermore, BASC-3 BESS results can be an integral part of a multistage model of intervention, but as an instrument designed to screen for risk, it may be most appropriate at the early stages of such a model.
More information would be helpful to better understand the theoretical orientation driving the item selection and the conceptualization of the subindexes.

SUMMARY. The BASC-3 BESS is a quality screening instrument that estimates children’s and adolescents’ risk levels for behavioral and emotional problems. The instrument is brief and flexible in its applicability, and it can be readily administered in school, clinical, and research settings. Information may be gathered using teacher and parent forms for younger and older age groups, as well as from a self-report student form. Test authors provide ample evidence for the measure’s reliability and validity. It is used widely in school settings and may be useful for multistage screening as a beginning tool to indicate risk and inform when intervention may be required.

*** Copyright © 2022. The Board of Regents of the University of Nebraska and the Buros Center for Testing. All rights reserved. Any unauthorized use is strictly prohibited. Buros Center for Testing, Buros Institute, Mental Measurements Yearbook, and Tests in Print are all trademarks of the Board of Regents of the University of Nebraska and may not be used without express written consent.