
LSHSS

Article

The Effect of Test Presentation on Children With Autism Spectrum Disorders and Neurotypical Peers

Mary Alt and Melanie Humphrey Moreno

Purpose: The purpose of this experiment was to determine if there is alternate forms reliability for paper- and computer-administered standardized vocabulary tests. Another purpose was to determine whether the behavioral ratings of children with autism spectrum disorders (ASDs) would improve during the computer-administered testing sessions secondary to a decreased need for social interaction.

Method: Thirty-six school-age children (half with ASDs, half neurotypical [NT]) took 2 versions (i.e., paper vs. computer) of the Expressive One-Word Picture Vocabulary Test (EOWPVT–2000; Brownell, 2000a) and the Receptive One-Word Picture Vocabulary Test (ROWPVT–2000; Brownell, 2000b). Order of presentation was counterbalanced across participants. Test sessions were videotaped, and randomly selected 1-min intervals were rated for behaviors. Standardized test scores and behavior ratings were compared for equivalence across the test presentation methods.

Results: Standard scores did not differ significantly between the two versions of the tests for either group of participants. There were no differences in behavioral ratings between the two methods of test presentation.

Conclusion: Alternate forms reliability was found, thus expanding the options for testing for school-age populations. The use of computers had no effect on the behaviors of the children with ASDs. The ramifications of this finding for assessment and intervention for children with ASDs are discussed.

Key Words: autism, computer-assisted intervention, assessment

Accurate testing is an important aspect of service provision. Clinicians have to weigh a range of options when considering assessment. Some of the most useful measures for designing intervention plans are language samples, dynamic assessments, and curriculum-based criterion-referenced measures, all of which can mimic the child’s everyday learning environment more closely than standardized tests can. However, there are situations in which standardized tests are useful, necessary, or both. The American Speech-Language-Hearing Association’s (ASHA’s, 2006) technical report on autism spectrum disorders (ASDs) noted that there is no clear empirical evidence to support any one assessment approach. Condouris, Meyer, and Tager-Flusberg (2003) found that there were significant correlations between the scores that children with autism achieved on standardized tests and the scores they achieved on measures of spontaneous language. Together, these results indicated that both types of assessment measures may be valid for children with ASDs.

Delays and deficits in social interaction and language, as used in social communication, are two of the diagnostic criteria for ASDs as defined by the Diagnostic and Statistical Manual of Mental Disorders—Fourth Edition (DSM–IV; American Psychiatric Association, 1994). In fact, even high-functioning children with ASDs have been found to have higher levels of general anxiety and social anxiety than children with specific language impairment and typically developing peers (Gillott, Furniss, & Walter, 2001). Such impairments in social communication may prevent children with ASDs from reaching their full potential when administration of a standardized test requires interaction with the test administrator, who may be an unfamiliar person. If children do demonstrate their full linguistic potential on the task, the experience may still be stressful for one or both parties due to the required social interaction.

Researchers have found that individuals with ASDs are a challenging population to assess because of their difficulties with test-taking procedures (Koegel & Mentis, 1985) and their “core social deficits” (Condouris et al., 2003, p. 351). These difficulties may mask their linguistic competence. Test takers with ASDs may need modifications during a formal assessment to enable them to use their skills in task recognition and their procedural knowledge of the test situation. Individuals with ASDs must be able to inhibit excessive off-task behavior and sustain their attention to the task at hand. Koegel, Koegel, and Smith (1997) found that children with ASDs routinely scored higher on components of standardized tests of language and intelligence when they took the tests in a condition that was individually designed to promote attention and motivation rather than when the tests were administered in the standardized manner.

University of Arizona, Tucson

Correspondence to Mary Alt: malt@email.arizona.edu

Editor: Marilyn Nippold
Associate Editor: Jeannene Ward-Lonergan

Received October 25, 2010
Revision received March 7, 2011
Accepted July 24, 2011
DOI: 10.1044/0161-1461(2011/10-0092)

LANGUAGE, SPEECH, AND HEARING SERVICES IN SCHOOLS • Vol. 43 • 121–131 • April 2012 © American Speech-Language-Hearing Association

One characteristic of some individuals with ASDs is the need for sameness of routine. Gillott et al. (2001) found that high-functioning children with ASDs had higher levels of separation anxiety and obsessive–compulsive characteristics than children without autism. The authors noted that many of the characteristics of obsessive–compulsive disorder overlap with features that are often associated with autism, such as the need for sameness of routine. Taking standardized tests, particularly with an unfamiliar examiner, is not part of most people’s routine; thus, the novelty of the act might contribute to the stressfulness of the testing situation.

Computers and Children With ASDs

Computer-assisted intervention (CAI) has been suggested as a means to provide new learning for children with ASDs when traditional situations may not be appropriate or productive (Powell, 1996). Some studies have reported that children with ASDs respond better to information that is presented by a computer rather than by a teacher (Moore & Calvert, 2000; Quill, 1995). Computers may be a favorable way to present tasks to individuals with ASDs partially because using computers may be perceived as predictable and controllable (Strickland, Marcus, Mesibov, & Hogan, 1996). Moreover, using a computer screen is often preferable for learners with ASDs because these learners tend to have a strong interest in computers (Shane & Albert, 2008), and irrelevant information is less likely to distract them (Shane & Weiss-Kapp, 2007).

Moore and Calvert (2000) assessed the attention, motivation, and learning of words in 14 children with ASDs in a behavioral/teacher program compared to a computer program. Overall, the children in the computer program performed better than the children in the teacher program. Specifically, the children with ASDs were attentive 97% of the time in the computer condition versus 62% of the time in the teacher condition. Whereas 57% of the children in the computer condition wanted to continue treatment, 0% of the children in the teacher condition wanted to continue. Finally, the children learned 74% of the targeted nouns in the computer condition compared to 41% in the teacher condition.

In a literature review, Panyan (1984) reported increased curiosity and improved attention and social behavior, including the use of spontaneous speech, when CAI was used with children with ASDs. Williams, Wright, Callaghan, and Coughlan (2002) found that eight children with ASDs spent more time on a reading task when it was presented on a computer than when it was presented by personal instruction. In addition, the children were monitored using a standardized behavioral observation scale that included more than 35 behaviors, both positive (e.g., showing, pointing) and negative (e.g., pushing people, tantrumming). The children’s behaviors were rated as less resistant when the reading material was presented via a computer than when it was presented via personal instruction. Kozima, Nakagawa, and Yasuda (2005) showed that a simple interactive robot elicited interaction from preschool children with ASDs. Although the researchers did not directly compare their intervention to human interaction, they noted that children used facial expressions with the robot that the children’s parents had not previously observed. Kozima et al. hypothesized that the robot was successful due to its relative simplicity and predictability. It may be that, with regard to language assessment, a computerized version of a standardized test would meet the modification needs of children with ASDs (Williams et al., 2002).

Reports suggest that a program that promotes the optimal use of electronic screen displays on TVs, computer screens, and video games is beneficial when working with children with ASDs (Shane & Weiss-Kapp, 2007). However, relatively few studies have investigated the effects of computers on children with ASDs. Ramdoss et al. (2011) systematically reviewed studies that used computer-based interventions for children with ASDs. They found only 10 studies that met their criteria (i.e., included a person with ASDs, had a communication skill as a dependent variable, and intervention was computer based). Although most of the studies showed some gains for participants, only four of the 10 were classified as having enough experimental control to yield certainty of evidence ratings of preponderance or conclusive. Many reports of CAI in the literature are anecdotal or qualitative, and internal validity is often questionable due to small sample sizes. To our knowledge, there are no studies that have examined standardized test administration via a computer in the population with ASDs. Factors inhibiting computer use include parental and professional fears of increased social withdrawal, the computer becoming the focus of an obsession, and computer technology in general (Bernard-Opitz, Ross, & Tuttas, 1990). The evolution of computers in the past 20 years may have mitigated some of these fears.

Using CAI when assessing children who experience difficulty with social withdrawal and obsessive behaviors may help the children compensate for these deficits during testing. CAI has the potential to alleviate some of the social interaction necessary for traditional test administration. If successful, CAI may be better accepted as a viable means of decreasing the impact of social difficulties during testing and treatment and in classroom settings.

Testing With Neurotypical (NT) Children

For our study, we hypothesized that NT children would perform similarly on both a traditional vocabulary test with a human administrator and a computerized version of the same test. Differences were not expected because NT children are not expected to have social deficits that would impact test performance and administration. Additionally, the absence of sensory and attention problems in the NT control group indicates that there should not be a difference in performance between the traditional paper and computer tasks. However, before one can comfortably use scores derived from a computer-administered test, one must be confident that there are no significant differences between the alternate test forms.

The Current Study

The primary purpose of this study was to determine whether the performance of children with ASDs on standardized vocabulary tests would be equivalent when the tests were administered traditionally versus when they were administered via a computer. There are several reasons why examining alternate forms reliability of test presentation was of interest. First, high alternate forms reliability would provide clinicians with additional choices for a testing format. Certainly, it could be more convenient to bring a single laptop to an evaluation than a bag full of bulky easels. High alternate forms reliability could also have implications for assessment via telepractice. It is also possible that children with ASDs might benefit from the use of computer technology. It would be beneficial to clinicians if they could achieve a reliable score using a test presentation method that makes a testing session more pleasant for both the examiner and the examinee. Second, there are researchers who administer standardized tests via a computer. It remains an empirical question as to whether or not scores obtained via computer administration should be considered equivalent to those obtained via the traditional method.

The current study was designed to determine if there would be alternate forms reliability between traditionally administered and computer-administered versions of two standardized vocabulary tests. We hypothesized that there would be equivalence between the two testing formats for the NT children, but that there may be a facilitative effect of computer-administered testing for the children with ASDs. We also expected the participants with ASDs to exhibit fewer negative behaviors during the computerized testing, and that there would be no measurable behavioral differences in NT children. A computerized version of a language test may allow children with ASDs to focus on the test stimuli while minimizing the potential negative repercussions of forced social interaction with the administrator.

METHOD

Participants

After receiving approval from the University of Arizona’s Institutional Review Board, we recruited a total of 18 children with a diagnosis of ASDs for our study. Five participants were female, and 13 participants were male. Participants with ASDs were recruited from several clinics in Tucson that provide speech-language and occupational therapy. Inclusion criteria for the study were (a) a diagnosis of either autism, Asperger’s syndrome (AS), pervasive developmental disorder—not otherwise specified (PDD–NOS), Rett’s disorder, or childhood disintegrative disorder; (b) age of 5–13 years; (c) English as a first language; (d) no neurological disease or trauma as reported by the parent or legal guardian; (e) normal vision per parent report; and (f) normal hearing as assessed by the test administrator. A wide age range was used for several reasons. First, there was no reason to suspect that age would influence test presentation format performance. Second, this age range is a common range for the tests we used; therefore, use of a wide age range allows for greater generalizability.

The parents of each child with ASDs were required to complete a questionnaire before testing began. The parents were asked to report their child’s diagnosis from a choice of autism, AS, PDD–NOS, Rett’s disorder, and childhood disintegrative disorder, as well as the title of the professional who provided the diagnosis (i.e., psychiatrist, neurologist, primary care physician, psychologist, or other). Criteria used by the various professionals to derive the diagnoses were unknown. When needed, clarification of the information provided in the questionnaire was sought during the initial assessment session.

Information reported by the parent was used to classify the functioning level of each child with ASDs as low, moderate, or high. This type of terminology is common when qualitatively describing children with ASDs. It typically refers to the children’s general cognitive, language, and social capabilities. We did not have direct, standardized measures of the children’s functioning in these domains; these determinations were made by parent report. No child who participated in this study was classified as low functioning, five children were classified as moderate, and 13 children were classified as high functioning (three of whom had AS). All of the children in the study were able to take part in both testing sessions that the study required (see Table 1).

We recruited 18 NT children with unremarkable developmental histories per parent report as control participants. Recruitment of controls was conducted through local after-school activity centers and interdepartmentally in the University of Arizona’s Speech, Language, and Hearing Sciences Department. Inclusion criteria for control participants included (a) English as a first language, (b) no history of neurological disease or trauma, (c) no developmental history of language delays or difficulties, and (d) normal hearing as verified by a hearing screening and normal vision as per parent report. The 18 children with ASDs were matched on age (±12 months) and sex to the 18 NT children. The mean age of the children with ASDs was 9;2 (years;months, median 9;0), and the mean age of the NT children was 9;1 (median 9;0).

Materials

We assessed all participants using the Expressive One-Word Picture Vocabulary Test (EOWPVT–2000; Brownell, 2000a; test–retest reliability 0.90) and the Receptive One-Word Picture Vocabulary Test (ROWPVT–2000; Brownell, 2000b; test–retest reliability 0.84). These tests are standardized and validated measures that are co-normed. We obtained permission from the research and statistics department of Academic Therapy Publications to scan test plates from the EOWPVT–2000 and the ROWPVT–2000 onto laptop computers. Each test yields raw scores, standard scores, percentile ranks, and normal curve equivalents. We used standard scores for the analyses because they enabled direct comparisons among different-age participants. Although there is a newer version of both of these tests currently available, the updated versions were not available when testing began. We chose vocabulary tests because they have relatively low language demands (i.e., one-word or nonverbal responses) and they provide a measurement of both expressive and receptive language skills.

Procedure

Both tests were administered in the standardized manner set forth in the test manuals. Both tests were given in their traditional test plate format (paper) as well as in a computerized format. Each child was seen for two sessions that occurred on separate days roughly 1 week apart. During each session, each child was assessed with two of the possible four tests (i.e., receptive–paper; receptive–computer; expressive–paper; expressive–computer). No child received two versions of the same test in a single visit. Rather, a child received one paper test and one computerized test in each session. The selection and order of the tests were randomized to control for order effects. The second author and a trained research assistant, who, in most instances, were not familiar to the participants, administered all of the tests. The same person was present at both sessions for 16 of the 18 children with ASDs and 17 of the 18 NT children.
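The session constraints described above can be illustrated with a short sketch. This is not the authors' randomization procedure, which the article does not describe; the function names `make_schedule` and `is_valid` and the test labels are hypothetical. The sketch assumes only the stated rules: two sessions, one paper test and one computerized test per session, and never two versions of the same test in one visit.

```python
import random

TESTS = ("receptive", "expressive")
FORMATS = ("paper", "computer")


def make_schedule(rng):
    """Return two sessions, each a list of two (test, format) pairs.

    Hypothetical helper: the article states the constraints the schedule
    satisfied, not how the randomization was carried out.
    """
    # Pick the format of the receptive test in session 1; the expressive
    # test in that session then gets the other format.
    first_format = rng.choice(FORMATS)
    other_format = "computer" if first_format == "paper" else "paper"

    session1 = [("receptive", first_format), ("expressive", other_format)]
    # Session 2 holds the two remaining test/format combinations.
    session2 = [("receptive", other_format), ("expressive", first_format)]

    # Randomize the order of the two tests within each session.
    rng.shuffle(session1)
    rng.shuffle(session2)
    return [session1, session2]


def is_valid(schedule):
    """Check the constraints stated in the Procedure section."""
    seen = set()
    for session in schedule:
        tests = {test for test, _ in session}
        formats = {fmt for _, fmt in session}
        # Each visit needs two different tests, one on paper and one on computer.
        if tests != set(TESTS) or formats != set(FORMATS):
            return False
        seen.update(session)
    # Across both visits, all four test/format combinations appear once.
    return len(seen) == 4
```

Under these assumptions there are only two possible pairings per child (receptive–paper with expressive–computer first, or the reverse), so the randomization amounts to choosing one pairing and then the order of tests within each visit.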

Computerized Presentation

For the computerized format, we scanned each test plate into the computer, fitting each scan into an individual slide in Microsoft PowerPoint. Each child was seated in front of a monitor for the computerized presentation. The monitor was connected to a laptop computer, from which the slide show could be advanced. Typically, the administrator was seated opposite the child. Unless a child sought interaction with the administrator (e.g., asked a question, made eye contact, made a comment), human interaction was not provided for the duration of the test. Generally, the administrator remained quiet, with her gaze on the laptop; the child remained seated in front of the monitor; and interactions were limited to child-initiated sequences. Examples of child-initiated sequences included initiations of conversation and physical contact.

The EOWPVT–2000 requires children to say the name of the image presented. Instructions for the test were recorded and presented as a .wav file. Appropriate prompts (e.g., What’s this? What’s she doing? What word names all of these?) and cues (e.g., What’s the arrow pointing to?) were also recorded as .wav files. Prompts and cues were embedded into each slide as a sound file that played automatically when the slide was presented. In accordance with traditional administration style, the carrier phrase What’s this? was dropped in the computerized administration after the first several items were presented. However, the What’s this? prompt was repeated for the first item in each item set. If an item required a different prompt (e.g., What’s she doing?), the necessary prompt was embedded and was presented with the appropriate slide. The computerized test presentation also included feedback slides. A feedback slide typically consisted of a sound icon that corresponded to words of encouragement that were presented on the slide (e.g., Good job. Keep going). One to two feedback slides were presented for every 9–10 test items. This placement was chosen to mirror the feedback that a child would receive during typical administration of the test. When a feedback slide was presented, a .wav file of positive reinforcement (e.g., Good job. You’re working hard. If you don’t know, take your best guess.) played.

Table 1. Characteristics of the study participants with autism spectrum disorders.

Sex   Age (years;months)   Functioning level   Diagnosis
F     8;3                  High                Asperger’s syndrome
F     8;9                  High                Asperger’s syndrome
F     9;1                  High                Autism spectrum disorder
F     9;8                  High                Autism spectrum disorder
F     12;8                 High                Asperger’s syndrome
M     6;0                  High                Autism spectrum disorder
M     6;4                  High                Autism spectrum disorder
M     6;7                  Moderate            Autism spectrum disorder
M     7;1                  High                Autism spectrum disorder
M     7;5                  Moderate            Autism spectrum disorder
M     8;10                 High                Autism spectrum disorder
M     9;3                  High                Autism spectrum disorder
M     9;3                  High                Autism spectrum disorder
M     9;9                  Moderate            Autism spectrum disorder
M     10;6                 High                Autism spectrum disorder
M     11;10                Moderate            Asperger’s syndrome
M     12;4                 High                Autism spectrum disorder
M     12;8                 Moderate            Autism spectrum disorder

The ROWPVT–2000 requires children to identify the picture that corresponds to a single word prompt. For the ROWPVT–2000, each prompt was embedded into the slide so that it played as the slide was being presented. If a child did not respond, or if the child requested a repetition using the arrow pad on the laptop, the stimulus sound played again. Feedback slides were used in the manner described previously.

Paper Administration

For the traditional paper administration format, standardized protocols were followed as specified in the testing manuals. Feedback during these administrations of the EOWPVT–2000 and the ROWPVT–2000 was consistent with that given during the computer administrations. Because the child and the administrator were seated in closer proximity and in direct line of sight to one another during this task, the frequency of feedback instances may have been greater than during the computerized test administration. Feedback was provided approximately every 9–10 items during testing.

Behavior Rating

All of the sessions with the children were videotaped. To assess whether or not the children’s behaviors were different between test modalities, raters who were blind to the purpose of the study and the diagnoses of the participants rated the children’s behaviors during testing. Usable video clips were available for 16 of the 18 participants in each group. Due to technical problems, video clips for two participants were unusable. One clip each was randomly selected from a paper testing session and a computer testing session to compare behaviors. The clips were roughly 1 min in length and were sampled from the 1st or 3rd min of testing. By design, the nature of the task was the same throughout the administration (e.g., child was exposed to test plate/image, expected to respond), so 1 min seemed appropriate to capture the nature of the child’s behavior during the task. However, by choosing two times, we were able to get a sample of the child’s behavior across the task. Not every child’s testing session took more than 3 min, so earlier clips were used to be consistent across participants. Short sessions (<3 min) occurred for some of the children with ASDs who quickly reached a ceiling on each test.

Behavioral Rating Scale

A behavioral rating instrument was developed specifically for this task due to the lack of available rating scales that were directly applicable to the current task. The rating scale could be used for two purposes: to record observable negative behaviors and to record overall impressions of the testing session. Raters scored children in 12 categories. All responses were based on a 5-point scale, and higher numbers indicated either more negative behaviors or more unfavorable impressions. Published rating scales like the Conners’ Rating Scales—Revised (Conners, 1997) were used as a guide to help choose categories of behaviors.

In order to measure negative behaviors, we focused on behaviors that children with ASDs are likely to exhibit, thus increasing the study’s face validity. Raters were asked to make judgments about the children’s behavior in nine areas:

• amount of fidgeting (e.g., leaving seat, kicking)

• amount of distractibility

• inability to adapt answer or behavior

• amount of self-stimulatory behavior (e.g., hand flapping)

• lack of motivation

• demonstration of excessive loudness

• demonstration of obsessive behaviors (e.g., moving/adjusting monitor or book, picking at self)

• inability to follow or adhere to directions

• demonstration of unexpected behaviors

The first four behaviors were rated in terms of the percentage of time the behaviors occurred during a video clip based on the administrator’s perceptions and the rating scale. The last five behaviors were not amenable to a percentage-of-occurrence scale and were rated in terms of their agreement with a statement about the behavior’s occurrence. Raters were also asked to provide their overall impression of the testing session on three measures: engagement in the task, pleasant to test, and the child’s enjoyment (see the Appendix).

Raters

Raters were two undergraduate students who were blind to the purpose of the study and the diagnosis of the child they were viewing. Three training videos were provided to familiarize the raters with the behaviors and the rater form and to provide anchor points. Training videos did not include participants who were scored and factored into the analysis. Both raters were given feedback regarding the scoring criteria. Of the 768 ratings made (12 ratings per session × 2 sessions per participant × 32 participants with usable video), raters were within 1 point of agreement for 691 scores (89.97%) on the 5-point scale. An additional 39 ratings were within 2 points of each other (5.08%). Thirty-eight ratings (4.95%) differed by 3 points or more.
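The agreement percentages above follow directly from the reported counts. The short check below is an illustrative sketch, not part of the original analysis; the function name `agreement_breakdown` is hypothetical, and the figure of 32 participants is taken from the 16 usable video clips per group reported in the Behavior Rating section.

```python
def agreement_breakdown(within_1, within_2, beyond_2):
    """Return the percentage of rating pairs in each agreement band."""
    total = within_1 + within_2 + beyond_2
    return {
        "total": total,
        "within_1_point": round(100 * within_1 / total, 2),
        "within_2_points": round(100 * within_2 / total, 2),
        "3_or_more_points": round(100 * beyond_2 / total, 2),
    }


# Counts reported in the article: 691, 39, and 38 ratings.
summary = agreement_breakdown(691, 39, 38)

# 12 ratings x 2 sessions x 32 participants with usable video (16 per group).
expected_total = 12 * 2 * 32

print(summary, expected_total)
```

The three bands sum to the full set of ratings, so the reported percentages account for every rating pair.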

RESULTS

Test Performance

We performed repeated measures analyses of variance (ANOVAs) to determine alternate forms reliability for the ROWPVT–2000 and the EOWPVT–2000. We ran a mixed design ANOVA with diagnosis (ASDs, NT) as the between-groups measure and test (ROWPVT–2000, EOWPVT–2000) and test presentation method (paper, computer) as the within-group measures. There was no significant difference between the test administration method for either group, F(1, 34) = .093, p = .76, ηp² = .002. Not surprisingly, there was a significant difference between the standard scores for the NT group and those for the ASDs group, with the NT group having higher standard scores and a large effect size, F(1, 34) = 29.86, p < .001, ηp² = .467. There was also a significant effect for both groups between scores on the expressive and receptive tests, F(1, 34) = 4.158, p = .04, ηp² = .108, with children scoring higher on the expressive test than on the receptive test (see Figure 1).¹ This finding had a moderate effect size. There were no significant interactions: method of presentation by diagnosis, F(1, 34) = .178, p = .675, ηp² = .005; version of test (receptive vs. expressive) by diagnosis, F(1, 34) = .064, p = .801, ηp² = .001; or method of presentation by version of test by diagnosis, F(1, 34) = .001, p = .971, ηp² = .000.

We did not predict differences based on age regarding factors that affect test performance. However, to ensure that this assumption was supported, we ran a simple linear regression to see if age would predict difference scores between the computer and paper versions of the task. There was no effect for age for the receptive test, F(1, 32) = 1.95, p = .171, ηp² = .057, or the expressive test, F(1, 32) = .824, p = .370, ηp² = .025.
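As a reading aid, the partial eta squared values reported with each F ratio can be approximately recovered from the F value and its degrees of freedom. The sketch below is illustrative only and is not the authors' analysis code; the function name `partial_eta_squared` is hypothetical. It uses the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error).

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom.

    Uses eta_p^2 = (F * df_effect) / (F * df_effect + df_error), which
    follows from eta_p^2 = SS_effect / (SS_effect + SS_error).
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# F ratios reported in the Test Performance section, all with df = (1, 34),
# paired with the effect sizes given in the text.
reported = {
    "diagnosis": (29.86, 0.467),
    "test (expressive vs. receptive)": (4.158, 0.108),
    "presentation method x diagnosis": (0.178, 0.005),
}

for label, (f_value, reported_eta) in reported.items():
    computed = partial_eta_squared(f_value, 1, 34)
    print(f"{label}: computed {computed:.3f}, reported {reported_eta:.3f}")
```

Values computed this way can differ from the published ones in the third decimal place because the published F ratios are themselves rounded.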

Behavior Ratings

We ran a second ANOVA to determine if there were significant differences between each group in terms of their behaviors during the tasks. There was no difference between the paper and computer administrations for either group, F(1, 30) = 1.71, p = .199, ηp² = .054. With the exception of a significant between-group difference with a moderate effect size, F(1, 30) = 4.76, p = .03, ηp² = .136, in which the children with ASDs exhibited significantly more negative behaviors than the NT children, no significant effect of test presentation was found on behavioral measurements. There was also no interaction between diagnosis and method of administration, F(1, 30) = .007, p = .19, ηp² = .054. This was true when behavioral scores were analyzed using an overall score (as reported above) and when behavioral scores were analyzed separately for actual behaviors observed, F(1, 30) = 2.16, p = .15, ηp² = .067 (interaction, F[1, 30] = .003, p = .953, ηp² = .000), as well as for overall impressions of each session, F(1, 30) = .68, p = .414, ηp² = .022 (interaction, F[1, 30] = .019, p = .891, ηp² = .000; see Figure 2).

Figure 1. Standard scores on alternate forms of the Expressive One-Word Picture Vocabulary Test (EOWPVT–2000; Brownell, 2000a) and the Receptive One-Word Picture Vocabulary Test (ROWPVT–2000; Brownell, 2000b) for both sets of participants: children with autism spectrum disorders (ASDs) and neurotypical children (NT).

¹ This difference was not predicted, and we have no explanation for these findings. The differences were small (ASDs means, Expressive = 81.77, Receptive = 79.13; NT means, Expressive = 113.55, Receptive = 110.16) but statistically significant. According to the manuals for these tests, a standard score difference between scores on the EOWPVT and the ROWPVT of 3 points was found in >25% of the standardization sample (Brownell, 2000a, 2000b). Thus, this difference likely does not have clinical significance.

Figure 2. Behavioral ratings for both groups on alternate forms testing.

Note. Higher scores indicate more negative behaviors. “All Ratings” is a combination of the “Observable Behaviors” and “Overall Impressions.”


As noted earlier, we wanted to test to ensure that our prediction that there would not be significant age differences was accurate. Therefore, we ran a simple linear regression to see if age would predict difference scores for all behavior scores between the computer and paper versions of the task. There was no effect for age, F(1, 30) = 2.11, p = .155, ηp² = .065.

DISCUSSION

We designed the current study to determine if there is alternate forms reliability between paper- and computer-administered versions of the ROWPVT–2000 and the EOWPVT–2000 with respect to children with ASDs and their NT peers. The answer was clearly yes for both groups for both tests. We were also interested in whether the children’s behavior would change based on the method of test administration. Results showed that the children’s behaviors during testing were not significantly different between the paper- and computer-administered tests.

These results were not surprising for the NT group. However, we expected some benefit of computer administration with respect to the ASDs group, at least for behavior, for two reasons. First, research has shown improved learning and interest when children use computer technology (e.g., Moore & Calvert, 2000; Quill, 1995). Second, we followed the logic that the social anxiety of a testing situation could result in falsely low scores for the ASDs group; therefore, the use of a computer to reduce the degree of interpersonal interaction might lead to higher test scores for these children. This was clearly not the case. One possible explanation lies with the simplicity of the task that was evaluated in this study. One-word picture vocabulary tests have a stable, predictable pattern in which stimuli are presented in the same way throughout the test, and the manner of responding stays the same. These tasks also ease the linguistic burden because children are processing and must respond verbally at the one-word level for the EOWPVT–2000 and are processing at the one-word level and must respond nonverbally for the ROWPVT–2000. This predictability and simplicity may have eased the social and linguistic difficulties found in most children with ASDs to the point that the addition of a computer had a negligible effect on performance. We hypothesize that if the children with ASDs were presented with more complex linguistic and social tasks, their deficits might manifest in their performances across methods of presentation.

Another possibility lies with the functioning level of the participants with ASDs. This study included 13 children who were classified (per parent report) as highly functioning and five who were classified as moderately functioning. Thus, the sample was biased toward higher functioning participants. It may be that a group of lower functioning children would have shown a significant difference in performance because even the simplicity of the vocabulary test might have been too stressful for them. High-functioning children are generally more adept with social interactions and language processing/output than their lower functioning peers, although they still experience increased anxiety compared to peers without ASDs (Gillott et al., 2001). The task evaluated in this study was linguistically simple, and the social demands were routine and predictable. Accordingly, the children’s performances were relatively stable across the paper and computer administrations. However, for lower functioning children, a one-word vocabulary test would typically be relatively more linguistically taxing, and performance differences may manifest themselves between paper and computer administrations. It is important to note that the current study was not designed to tease out the effect of functioning level on performance across the two administration formats. This hypothesis will need to be explicitly tested to determine if it is accurate.

A third possibility is that our rating scale was not an accurate way to measure the behaviors that we planned to measure. The scale we used was novel. However, it did have strong face validity in that it measured behaviors that are commonly observed clinically in children with ASDs. Ratings were made using a Likert rating scale, which is standard practice. Raters were trained using anchor videos. Most importantly, there was strong interrater reliability, indicating that the raters were able to measure the same behaviors in the same manner. So, although problems with the scale are a possible explanation for the lack of difference in behaviors across the testing formats, the evidence supporting the scale makes this possibility somewhat remote.

Although there are well-documented differences in the way children with ASDs interact socially (e.g., Condouris et al., 2003), these social differences may be diminished in the context of a standardized testing situation. In this study, children with ASDs were able to perform similarly on vocabulary tests, regardless of the level of human interaction required. The task presented was one that was highly regulated and predictable. This may have helped mitigate the social anxieties that are often experienced by children with ASDs so that they could perform more closely to their true potential. Paul and Cohen (1985) studied a small group of adults with ASDs and suggested that the structured nature of standardized testing might actually fit well with the behavioral profile of learners with ASDs.

Future studies that evaluate linguistically complex tasks requiring more verbal output and listening comprehension skills would be valuable. Tasks that require memory, manipulation of syntax, and following directions could be evaluated with paper and computer administrations to investigate such effects. Further, studies that include children with ASDs of different functioning levels might find that the alleviation of social interaction allowed by the computer testing component can be beneficial for children who are deemed to be lower functioning.

Alt & Humphrey Moreno: Effect of Test Presentation 127

Clinical Implications

Research clearly shows that computers are a valuable way of teaching children in the classroom (Moore & Calvert, 2000; Quill, 1995). Although the results of the present study do not show that computers give children an advantage during testing, they do indicate that incorporating computers into standardized testing should yield reliable results. This gives researchers and clinicians another tool to use to assess children’s language ability. Computers can be valuable to professionals who are called on to formally evaluate a child in that they can store all of the testing stimuli and are also capable of recording responses. Testing via computers may also be a viable way of combining/maximizing resources because one professional may oversee multiple test takers in one setting. Computer administration opens up assessment possibilities for telepractice, which is a method of service delivery that is supported in theory by ASHA (2005) as well as in practice based on a review by Reynolds, Vick, and Haak (2009), although they urge further research to define the parameters of the practice. Based on the current findings, a clinician could administer these standardized assessment measures remotely via a computer with the expectation of equivalent results. Future research will further our knowledge of how, precisely, computer testing may be used to our advantage.

Results from this study indicate that a score from either the paper format or the computer format of the EOWPVT–2000 and ROWPVT–2000 is reliable. However, when deciding between a paper and a computer test, it is critical that test administrators consider the individual to be assessed. A clinician may choose paper- or computer-based administration according to the individual child’s preference and testing style. The majority of the participants in this study expressed a preference for the computerized versions of the tests. However, some children became visibly tense when the computer was introduced, whereas others were overly distracted by their desire to turn the computer on and off. Future research should investigate how familiarity with computer technology and frequency of computer use affect children’s performance on computerized tests. Furthermore, learning style may also need to be considered when determining which kind of test presentation to use with a child. If a child is a visual learner, he or she may prefer the computer, whereas an auditory learner may prefer the traditional, paper administration.

Conclusion

This study showed that the use of computer administration of standardized vocabulary tests yielded alternate forms reliability with traditional paper testing for both children with ASDs and their NT peers. The use of a computer did not provide a significant advantage to the children with ASDs or the NT controls in terms of standard scores obtained or behaviors exhibited. However, the knowledge that the scores are comparable gives researchers and clinicians an alternate assessment format to consider for use in their evaluations.

ACKNOWLEDGMENTS

The work for this paper began as part of the second author’s master’s thesis. Preliminary findings were presented at the Arizona State Speech-Language-Hearing Association Conference in April 2010. We would like to thank all of the families who participated in this project, as well as all of the people who supported this work.

REFERENCES

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.

American Speech-Language-Hearing Association. (2005). Speech-language pathologists providing clinical services via telepractice: Position statement [Position statement]. Available from www.asha.org/policy.

American Speech-Language-Hearing Association. (2006). Principles for speech-language pathologists in diagnosis, assessment, and treatment of autism spectrum disorders across the life span [Technical report]. Available from www.asha.org/policy.

Bernard-Opitz, V., Ross, K., & Tuttas, M. L. (1990). Computer assisted instruction for autistic children. Annals of the Academy of Medicine Singapore, 19, 611–616.

Brownell, R. (2000a). Expressive One-Word Picture Vocabulary Test—Second Edition. Novato, CA: Academic Therapy Publications.

Brownell, R. (2000b). Receptive One-Word Picture Vocabulary Test—Second Edition. Novato, CA: Academic Therapy Publications.

Condouris, K., Meyer, E., & Tager-Flusberg, H. (2003). The relationship between standardized measures of language and measures of spontaneous speech in children with autism. American Journal of Speech-Language Pathology, 12, 349–358.

Conners, C. K. (1997). Conners’ Ratings Scales—Revised. New York, NY: Multi-Health Systems.

Gillott, A., Furniss, F., & Walter, A. (2001). Anxiety in high-functioning children with autism. Autism, 5, 277–286.

Koegel, L. K., Koegel, R. L., & Smith, A. (1997). Variables related to differences in standardized test outcomes for children with autism. Journal of Autism and Developmental Disorders, 27, 233–243.

Koegel, R. L., & Mentis, M. (1985). Motivation in childhood autism: Can they or won’t they? Journal of Child Psychology and Psychiatry, 26, 185–191.

Kozima, H., Nakagawa, C., & Yasuda, Y. (2005). Interactive robots for communication-care: A case-study in autism therapy. IEEE International Workshop on Robots and Human Interactive Communication, 341–346.

Moore, M., & Calvert, S. (2000). Brief report: Vocabulary acquisition for children with autism: Teacher or computer instruction. Journal of Autism and Developmental Disorders, 30, 359–362.

Panyan, M. V. (1984). Computer technology for autistic students. Journal of Autism and Developmental Disorders, 14, 375–382.

Paul, R., & Cohen, D. J. (1985). Comprehension of indirect requests in adults with autistic disorders and mental retardation. Journal of Speech and Hearing Research, 28, 475–479.

Powell, S. (1996). The use of computers in teaching people with autism. Autism on the Agenda: Papers from the National Autistic Society Conference. London, UK: National Autistic Society.

Quill, K. A. (1995). Teaching children with autism: Strategies to enhance communication and socialization. Albany, NY: Delmar.

Ramdoss, S., Lang, R., Mulloy, A., Franco, J., O’Reilly, M., Didden, R., & Lancioni, G. (2011). Use of computer-based interventions to teach communication skills to children with autism spectrum disorders: A systematic review. Journal of Behavioral Education, 20, 55–76.

Reynolds, A. L., Vick, J. L., & Haak, N. J. (2009). Telehealth applications in speech-language pathology: A modified narrative review. Journal of Telemedicine and Telecare, 15, 310–316.

Shane, H. C., & Albert, P. D. (2008). Electronic screen media for persons with autism spectrum disorders: Results of a survey. Journal of Autism and Developmental Disorders, 38, 1499–1508.

Shane, H. C., & Weiss-Kapp, S. (2007). Visual language in autism. San Diego, CA: Plural.

Strickland, D., Marcus, L. M., Mesibov, G. B., & Hogan, K. (1996). Brief report: Two case studies using virtual reality as a learning tool for autistic children. Journal of Autism and Developmental Disorders, 26, 651–659.

Williams, C., Wright, B., Callaghan, G., & Coughlan, B. (2002). Do children with autism learn to read more readily by computer assisted instruction or traditional book methods?: A pilot study. Autism, 6, 71–91.

APPENDIX (P. 1 OF 2). BEHAVIORAL RATING SCALE

1 = NEVER displays behavior
2 = Displays behavior less than 25% of time
3 = Displays behavior between 25% and 50% of time
4 = Displays behavior between 50% and 75% of time
5 = Displays behavior more than 75% of time
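For readers who want to apply the five scale anchors consistently, the following Python sketch maps the percentage of a rated interval in which a behavior occurs to a 1–5 rating. It is an illustration only: in the study, raters made these judgments from video rather than from measured percentages, the function name is hypothetical, and the handling of the exact boundary values (25%, 50%, and 75%) is an assumption because the anchors do not specify it.

```python
def rating_from_percentage(percent_of_time: float) -> int:
    """Map percent of an interval showing a behavior to a 1-5 rating.

    Hypothetical helper; boundary handling is an assumption, not from the scale.
    """
    if not 0 <= percent_of_time <= 100:
        raise ValueError("percent_of_time must be between 0 and 100")
    if percent_of_time == 0:
        return 1  # NEVER displays behavior
    if percent_of_time < 25:
        return 2  # less than 25% of time
    if percent_of_time <= 50:
        return 3  # between 25% and 50% (assumed to include exactly 50%)
    if percent_of_time <= 75:
        return 4  # between 50% and 75% (assumed to include exactly 75%)
    return 5  # more than 75% of time


# Example: a behavior seen in 18 s of a 60-s interval (30% of the time).
example_rating = rating_from_percentage(18 / 60 * 100)
print(example_rating)
```

Because higher ratings indicate more negative behaviors, any summary built on this mapping should be read with that direction in mind.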

APPENDIX (P. 2 OF 2). BEHAVIORAL RATING SCALE
