References

Brossart, D. F., Meythaler, J. M., Parker, R. I., McNamara, J., & Elliott, T. R. (2008). Advanced regression methods for single-

case designs: Studying propranolol in the treatment for agitation associated with traumatic brain injury. Rehabilitation

Psychology, 53(3), 357–369. https://doi-org.library.capella.edu/10.1037/a0012973

Advanced Regression Methods for Single-Case Designs: Studying Propranolol in the Treatment for Agitation

Associated With Traumatic Brain Injury

By: Daniel F. Brossart

Department of Educational Psychology, Texas A&M University;

Jay M. Meythaler

Department of Physical Medicine and Rehabilitation, Wayne State University;

Rehabilitation Institute of Michigan, Detroit, Michigan

Richard I. Parker

Department of Educational Psychology, Texas A&M University

James McNamara

Department of Educational Psychology, Texas A&M University

Timothy R. Elliott

Department of Educational Psychology, Texas A&M University

Acknowledgement: This study was funded in part by National Institute of Disability Research and Rehabilitation

Grant H 133G000072 awarded to Jay M. Meythaler. Appreciation is expressed to Michael E. Dunn for sharing

information and opinions about the history of single-case designs in rehabilitation psychology research. Graphs of

participant data not presented in this article are available upon request from Daniel F. Brossart.

In a thoughtful commentary, Aeschleman (1991) observed a decreasing interest in single-case research (SCR)

designs in the rehabilitation psychology literature: Between 1985 and 1989, Aeschleman found that only 6 of 402

empirical papers published in Rehabilitation Psychology, Archives of Physical Medicine and Rehabilitation, and

Rehabilitation Counseling Bulletin used a single-subject design (<1.5% of the total; Aeschleman, 1991, p. 43). A brief

examination of the past 15 years of Rehabilitation Psychology reveals one article that offered an innovative way to

analyze single-case data (Callahan & Barisa, 2005) and another that was a true single-case study (Pijnenborg,

Withaar, Evans, van den Bosch, & Brouwer, 2007).

We disagree with Aeschleman's bleak conclusion that SCR designs “… have not made a methodological impact on

research in rehabilitation psychology” (Aeschleman, 1991, p. 47). History informs otherwise: Many of the influential

research programs in rehabilitation psychology first appeared in the literature in single-case designs. Behavioral

approaches—championed in the classic Behavioral Methods in Chronic Pain and Illness (Fordyce, 1976)—were based

on earlier single-case studies. The potential of supported employment— arguably one of the few evidence-based

practices in rehabilitation psychology with considerable support from many randomized clinical trials (RCTs; Dunn &

EBSCOhost http://web.b.ebscohost.com.library.capella.edu/ehost/delivery?sid=7a3d9...

1 of 20 3/1/2019, 5:25 PM

Elliott, in press)— appeared in a study using a single-case design published in the Journal of Applied Behavior

Analysis (Wehman et al., 1989). And the ground-breaking extensions of Neal Miller's operant learning models to

visceral, reflex, and motor responses were achieved in single-case designs (Brucker & Ince, 1977; Ince, Brucker, &

Alba, 1978). Clearly, SCR designs have played a pivotal role in the rehabilitation psychology research base.

Unfortunately, SCR and case studies are often misconstrued as one and the same. An uncontrolled case study is a

study of a single client, dyad, or group in which observations are made under uncontrolled and unsystematic

conditions. The lack of experimental control in such a study may have contributed to an overall suspicion or distrust of

results based on a single subject in general. Designs that add more experimental control include systematic, repeated

observations of a single client, dyad, or group and are often called intensive single-case designs. For even more

experimental rigor, one could use a single-case experimental design, which is typically viewed as having greater

control than intensive single-case designs. These designs usually have behavioral goals or target behaviors that are

the main focus of interest and function as the dependent variable. They also have repeated measurements over time

and at least two treatment phases (baseline and treatment). Some have stated that the core essence of single-case

research is that “all dependent measures are collected repeatedly over the course of the experiment, and these data

are not combined with those from other participants to produce group averages for purposes of data analysis” (Morgan

& Morgan, 2001, p. 122). Nevertheless, there are also instances in which evaluating single-case data across

participants is helpful because it can increase the internal validity of the design.

In this article, we begin by briefly discussing some present issues, past practice, and some misunderstandings

regarding single-case research. We then show how the application of single-case research can be helpful in answering

substantive questions. To illustrate this, we use data collected from a double-blind, crossover RCT to examine the

effectiveness of drug therapy in reducing agitation in individuals with a traumatic brain injury (TBI). Furthermore, we

introduce a new methodology for analyzing single-case data and compare it with a more traditional regression method.

Why Consider SCR Now?

Although others have urged increased use of single-case research, such calls

appear to have had little effect on the behavior of researchers (Blampied, 2000; Goldfried & Wolfe,

1996; Hilliard, 1993; Morgan & Morgan, 2001). SCR continues to be an underused research design.

Several forces, however, do appear to be making an impact. One is the present-day focus on effect sizes. Many

journals now require investigators to report effect size with contextual information for their interpretation (Fidler, 2002).

A similar trend toward accountability, objectively measured outcomes, and greater scientific rigor can be seen in policy

statements by influential groups such as the National Research Council (Shavelson & Towne, 2002). The medical

profession's accountability reform has also played a part in the movement for the broader use of effect sizes (Oakley,

2002). Funding agencies, public and private, are increasingly requiring empirical results and effect sizes. In addition,

the call for greater accountability and objective, defensible results (Shavelson & Towne, 2002) in psychological and

educational research has been an important factor leading to greater scrutiny of how SCR is summarized.

Recognizing the Limitations of RCTs

There appears to be an increasing recognition that RCTs are ideal for answering some research questions but that the

design itself is not able to answer all important questions and that its implementation has certain limitations. This has

led some to continue to call for both efficacy and effectiveness studies (Tucker & Roth, 2006). Important questions

about how any given single RCT is conducted and the validity of the results gained have prompted guidelines for

registering RCTs for public scrutiny (Elliott, 2007). The intention is that this requirement will address deficiencies in the

quality control of RCTs. However, the validity of RCTs is often compromised in many applications relevant to

rehabilitation psychology by a low number of available participants (with low-incidence disabilities) and because true

control groups are difficult (if not impossible) to attain due to the lack of services that negate a “usual treatment”

scenario for a controlled, comparison group (such that any attention to control participants would be above and beyond

the typical experience or “treatment-as-usual”; Elliott, 2007).

The use of single-case designs also helps address the overuse of cross-sectional methods so common in

rehabilitation psychology. Just as many introductory research design texts talk about the monomethod bias for a single

research study, overuse of a single design within a field creates a lopsided literature base that lacks the advantages of

triangulation with multiple research methodologies. Researchers across the health care fields have called for an

expanded evidence base, reflected in a broadened focus and a plurality of methodologies to answer questions

regarding informed practice (Concato, Shah, & Horwitz, 2000; Spring et al., 2005). Single-case designs seem a ready

way to add methodological diversity to the literature base.

SCR Compared With Traditional Cross-Sectional Research Designs

The more commonly applied cross-sectional research designs are, in general, nomothetic approaches: They “aim to

establish lawful relations that apply across individuals” (Nesselroade, 1991, p. 96). Thus, two key characteristics of

cross-sectional designs are “static observations and multiple behavioral categories” (Baltes & Nesselroade, 1979, pp.

11–12). In contrast, SCR designs may be seen as a hybrid form of the longitudinal approach. Longitudinal designs

have the ability to identify not only the processes and causes of intraindividual change but also the processes and

causes of interindividual patterns of intraindividual change in behavioral development. Although single-case designs

may be used to explore patterns and processes, they typically focus on evaluating the impact of a treatment on a

client, student, or patient. Because attention is given to collecting data before treatment begins, after treatment starts,

and sometimes even after treatment ends, each research participant may serve as their own control. Thus, SCR can

be viewed as an alternative methodology for answering many of the same research questions as cross-sectional group

research and as a methodology that is uniquely capable of answering different and new research questions.

When Should One Use SCR Designs?

SCR should be considered as a top candidate research design to use in several circumstances. It is ideally suited for

studying low-incidence problems and conditions. Many behavioral issues that accompany conditions such as TBI and

spinal cord injury (SCI) are difficult to study in designs that rely on large, representative samples for randomization and

treatment. For example, SCR has been used to study treatments to promote wheelchair pushups among men with

SCIs (White, Mathews, & Fawcett, 1989) and other attempts to prevent pressure sores (Malament, Dunn, & Davis,

1975). These are significant clinical issues that often challenge and confound clinicians; however, they are not

manifested in a sufficient number of individuals required to attract the necessary attention and financial support for a

large-scale (or multisite) RCT.

For low-incidence problems, SCR designs are probably one of the few designs that researchers could use to expand

the knowledge base productively in a time-efficient manner. Cross-sectional designs can take a considerable amount

of time to obtain a sample of sufficient size for data analysis. SCR designs are also indicated for studies in which few

participants are able to meet the inclusion criteria for a study. In addition, SCR would be beneficial in any study in

which participants are required to participate over an extended period of time. Such studies often experience a fair

amount of attrition. If an SCR design were used, the complete data, although possibly from far fewer participants than

the study began with, would still allow important research questions to be answered.

Because each participant serves as their own control, the existing data would still allow one to make important

inferences (this is not to diminish the import of considerations one must make when interpreting results with high levels

of attrition). Multiple scenarios are presented in Appendix A as examples of when SCR should be considered.

Problems With Data Analyses

In spite of the fact that SCR has played an important historical role in psychology and that there have been a number

of replicable empirical findings in differing domains, Morgan and Morgan (2001) stated that SCR “remains relatively

obscure because of its disavowal of the statistical machinery that defines psychological research in the 21st century”

(p. 120). Furthermore, those involved in SCR have historically relied on visual analysis (Busk & Marascuilo, 1992;

Kratochwill & Brody, 1978), which Kazdin (1982) defined as the procedure (largely informal) for reaching a judgment

about reliable or consistent intervention effects by examining graphed data visually. Indeed, one of the most recent

review articles on single-subject research in rehabilitation failed to acknowledge any of the available statistical

procedures for analyzing data from these designs (Backman, Harris, Chisholm, & Monette, 1997).

There is a continued and legitimate need for visual analysis. As recently noted by Parker, Cryer, and Byrns (2006),

visual analysis plays at least seven important roles in SCR:

(a) to simultaneously consider multiple data attributes in complex graphs; (b) to identify cycles and other patterns

embedded within and across phases; (c) to distinguish between improvement and deterioration in effect sizes, and to

interpret effect size magnitudes; (d) to validate whether results (with predictions lines) are meaningful, by being within

score-scale limits; (e) to select the best statistical analysis techniques from multiple options; (f) to validate the

procedures and results from newer SCR analytic techniques, which lack a track record of successful published

applications; (g) to judge whether SCR datasets meet parametric data assumptions (p. 420).

Nevertheless, results on the basis of visual analysis have been shown to have low reliability even when judges are

experienced professionals, editors of single-case journals, or others provided with fully contextualized graphs with

other design and measurement improvements (Brossart, Parker, Olson, & Mahadevan, 2006; DeProspero & Cohen,

1979; Harbst, Ottenbacher, & Harris, 1991; Ottenbacher, 1990; Park, Marascuilo, & Gaylord-Ross, 1990). Neither

technique—visual analysis or statistical analysis—should be used in isolation: “In single-case research it seems

especially important to investigate how these two methods inform and support each other” (Brossart et al., 2006, p.

558).

Our own experience highlights the importance of using both visual and statistical analysis. For example, in previous

studies, we noticed large differences between visual analysis and the output from ITSACORR (Crosbie, 1993, 1995).

Further investigation showed that ITSACORR was unrelated to other statistical techniques as well as to visual analysis

of single-case data (Brossart & Parker, 2001; Parker & Brossart, 2003), which raised serious concerns about its

viability as a useful technique. Additional empirical studies have also highlighted its weaknesses (Huitema, 2004). It is

time for single-case researchers to abandon the sole use of visual analysis; the dogged refusal to incorporate

statistical analysis of single-case data will simply result in various fields or lines of research being ignored as irrelevant,

archaic, and unsophisticated.

Some of the underuse of statistical methods has been due to the cautiousness of researchers in applying univariate

parametric analyses because of well-placed concerns that the data fail to meet assumptions of homogeneity of

variance, normality, and serial independence. In fact, these assumptions are commonly violated by short, interrupted

data series. Even greater concerns have been voiced about the use of more complex parametric analyses, such as

repeated measures analysis of variance (ANOVA), as it makes even stronger assumptions of the data (sphericity;

Stevens, 2007). Because of these stringent assumptions, multivariate analysis of variance (MANOVA) has sometimes

been used to replace repeated measures ANOVA (RM-ANOVA). However, MANOVA still has strict assumptions

(homogeneity of variance-covariance matrices, absence of multicollinearity and singularity) and does not provide

output as useful as RM-ANOVA's partial effect sizes.

For simpler parametric analyses, concerns about unequal variance and nonnormal distributions are reasonably well

addressed by bootstrapping, a resampling technique that sidesteps data assumptions by relying on an empirical

sampling distribution (Davison & Hinkley, 1997; Good, 2001; Lunneborg, 2000; Simon, 1999). The bootstrap is

attractive and is just beginning to be applied to SCR (Parker, 2006). Violation of the assumption of serial independence

can be addressed through autoregressive integrated moving average backcasting (Parker et al., 2006). We take a

different approach in the present article, however: We use nonparametric analyses, which are burdened by only the minimal

assumptions of nominal-level data.
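
As a rough illustration of the resampling idea described above, the following Python sketch builds a percentile bootstrap interval for the difference between two phase means. The function name, the percentile method, the number of resamples, and the example scores are illustrative assumptions and are not taken from the sources cited. The sketch resamples data points independently within each phase, so it does not by itself address serial dependence.

```python
import random


def bootstrap_mean_shift_ci(baseline, treatment, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(treatment) - mean(baseline).

    Hypothetical helper: points are resampled with replacement within each
    phase, which assumes the points in a phase are exchangeable.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        b = [rng.choice(baseline) for _ in baseline]
        t = [rng.choice(treatment) for _ in treatment]
        diffs.append(sum(t) / len(t) - sum(b) / len(b))
    diffs.sort()
    # Indices of the lower and upper percentile cut points, kept in bounds.
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return diffs[lo_idx], diffs[hi_idx]


# Made-up agitation scores for one participant (illustration only).
placebo_phase = [8, 7, 9, 8, 7]
drug_phase = [5, 4, 6, 5, 4]
lower, upper = bootstrap_mean_shift_ci(placebo_phase, drug_phase)
print(f"95% bootstrap interval for the mean shift: [{lower:.2f}, {upper:.2f}]")
```

An interval that excludes zero would suggest a mean shift between phases under these assumptions. With very short phases the empirical sampling distribution is coarse, so such intervals are best read as tentative.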

Advantages of nominal-level data analysis include its applicability to any SCR data set, regardless of parametric

assumptions, and its greater ease of use, as remedial data transformations are not needed. The main assumption

made by nominal-level data analyses is an adequate sample size for a 2 × 2 table of about five expected data points

per cell (total N of at least 20–25). All nominal-level analyses based on the 2 × 2 table can produce two effect sizes: (a)

Phi (Φ), which is Pearson's r for a 2 × 2 table, and (b) the clinical outcome index, the “risk difference” (medical

terminology), here more appropriately named “improvement rate difference” (IRD). Given a 2 × 2 table with balanced

marginal values, these two values are equal (Φ = IRD). Standard output for both indices includes confidence intervals

around the obtained values. For more complex single-case designs, these nominal-level indices can be obtained

through logistic regression (LR).
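
The two indices can be illustrated with a minimal Python sketch. The cell layout (rows for phase, columns for improved versus not improved), the function name, and the counts are assumptions made for the example; how individual data points are classified as improved is not specified at this point in the text.

```python
import math


def two_by_two_effect_sizes(a, b, c, d):
    """Phi and improvement rate difference (IRD) for a 2 x 2 table.

    Assumed layout (hypothetical, for illustration):

                     improved   not improved
        treatment       a            b
        baseline        c            d
    """
    n_treatment, n_baseline = a + b, c + d
    n_improved, n_not_improved = a + c, b + d
    # IRD: proportion improved in treatment minus proportion improved in baseline.
    ird = a / n_treatment - c / n_baseline
    # Phi: Pearson correlation between the two dichotomous variables.
    phi = (a * d - b * c) / math.sqrt(
        n_treatment * n_baseline * n_improved * n_not_improved
    )
    return phi, ird


# Balanced margins (12 points per row and per column): the two indices coincide.
phi_bal, ird_bal = two_by_two_effect_sizes(a=9, b=3, c=3, d=9)
print(phi_bal, ird_bal)

# Unbalanced rows (10 treatment points, 14 baseline points): the indices differ.
phi_unbal, ird_unbal = two_by_two_effect_sizes(a=8, b=2, c=4, d=10)
print(round(phi_unbal, 3), round(ird_unbal, 3))
```

The first table has about the minimum total N described above (24 points, with six expected per cell), and it shows the equality Φ = IRD under balanced margins. The second shows how the two values drift apart when the margins are unequal.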

Other concerns with using statistical analyses on SCR data are related to the lack of relevance of effect sizes to the

traditional standard of visual analysis (Parsonson & Baer, 1992). An R² (or R) effect size derived from ordinary least

squares regression and interpreted as “percent of variance accounted for” does not resound with more traditional SCR

practitioners. A further advantage of nominal-level 2 × 2 table-based analyses is that they are based on

nonoverlapping data between phases, a keystone of visual analysis. Depending on the particular method, the

approach to measuring nonoverlapping data varies, but in all cases, the data overlap can be confirmed visually.

Comparison Method: Simple Mean Shift (SMS) Regression

Regression models have been used by single-case researchers since at least 1983 (Gorsuch, 1983). Since that time,

many different models for analyzing single-case data have been proposed (e.g., Allison & Gorman, 1993; Center,

Skiba, & Casey, 1985–1986; Faith, Allison, & Gorman, 1996). One of the advantages of regression models is that they

are familiar to many because they are often covered in doctoral training programs in the behavioral sciences. They

also produce a common effect size, R², which can be converted to other effect sizes such as Cohen's d (Rosenthal,

1991). Results from individual studies may also be summarized in meta-analytic studies. Additional advantages

include the relative ease of evaluating power and creating confidence intervals around the effect size. It is also fairly

easy to bootstrap regression models, especially those models that entail a single step (as opposed to those that

involve multiple steps; e.g., Allison & Gorman, 1993).

Every statistical method has limitations, and one disadvantage of the regression models is that the effect size, R², is

not easily interpreted in terms of treatment effectiveness. Another disadvantage is that there are numerous regression

models a single-case researcher may choose from. Some models try to control for trend in various ways, some across

the entire data series similar to a covariate in analysis of covariance (e.g., Gorsuch, 1983), others attempt to control for

trend in the baseline phase only (Allison & Gorman, 1993; Faith et al., 1996). The choice of model depends on the

question the investigator wants to answer. Furthermore, the effect sizes produced by these regression methods are not

directly comparable to those found in typical cross-sectional regression studies in terms of the characteristic range and

magnitude seen in SCR. Thus, the interpretive guidelines found in texts by Cohen (Cohen, 1988), for instance, are of

little help in SCR. Investigators have made some progress in trying to provide tentative interpretive guidance, but

guidelines per se are not available yet (see Brossart et al., 2006; Parker & Brossart, 2003; Parker et al., 2005). Thus,

the effect size coefficient does not directly communicate the degree of intervention effectiveness.

Among the regression methods available, the one discussed by Allison and colleagues appears to be one of the more

conceptually and empirically sound options (Allison & Gorman, 1994; Brossart et al., 2006; Parker & Brossart, 2003).

This method involves multiple steps and effectively controls baseline trend, but it is not without limitations. Because it

controls for baseline trend, the data series needs to have enough data points to assess trend accurately. Although one

may draw a trend line through three data points, any baseline based on only three data points should only be analyzed

by a regression method in which mean shift is examined, and even then such analysis should be considered tentative.

More data in each phase serves to increase the accuracy of any trend line produced. In addition, because Phase

A-predicted values are generated for Phase B, the technique may infrequently produce values that extend beyond the

range of the dependent variable (on the y-axis). Such values should be constrained to fit within the limits of the y-axis

variable. An additional limitation of the regression model promoted by Allison is that one cannot graph the output for

visual analysis. The semipartialing performed by this method changes the data so much that visual analysis is difficult.

Although trend is removed, graphing the final output does not lend itself toward a straightforward interpretation. In an

effort to improve the Allison technique, Parker et al. (2006) renamed the technique mean and slope adjustment

(MASAJ) and modified it so that it was visually interpretable and the question it addressed was slightly adjusted. The

MASAJ technique now answers the question, “What if phase A trend influence were eliminated or controlled in phase

B?” (Parker et al., 2006, p. 426). In contrast, the Allison technique answers a similar but different question: “What

phase differences would have been obtained if there had been no phase A trend in the entire dataset?” (p. 426).

We used a regression model that looks at an SMS between the baseline and treatment phase for the present study to

provide a comparison to the LR technique. Although it is one of the simplest models and does not control for baseline

trend, we felt it was important to provide a familiar comparison technique because it is very different from the LR

technique in terms of conceptual framework and output. This technique was also chosen because a few data sets

contained the treatment drug in the first phase with the “baseline” or placebo phase following. We deemed it

inappropriate to use a regression method that controlled for baseline trend when the treatment phase came prior to the

“baseline” phase.
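
A simple mean shift model of this kind can be sketched in Python as an ordinary least squares regression of the outcome on a 0/1 phase indicator. The function names and example scores are hypothetical. The conversion from r to Cohen's d shown here is the commonly cited equal-group-size formula, d = 2r / sqrt(1 - r²); the text above cites Rosenthal (1991) for such conversions without stating a formula, so treating this as the intended one is an assumption.

```python
import math


def simple_mean_shift(baseline, treatment):
    """OLS regression of the outcome on a phase dummy (0 = baseline, 1 = treatment).

    Returns (intercept, slope, r_squared). With a single dummy predictor the
    intercept is the baseline mean and the slope is the shift in means.
    No trend term is included, so baseline trend is not controlled.
    """
    y = list(baseline) + list(treatment)
    x = [0] * len(baseline) + [1] * len(treatment)
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


def r_to_d(r):
    """Convert a correlation to Cohen's d, assuming equal group sizes."""
    return 2 * r / math.sqrt(1 - r ** 2)


# Made-up scores for one participant (illustration only).
intercept, slope, r_squared = simple_mean_shift([8, 7, 9, 8], [5, 4, 6, 5])
r = math.copysign(math.sqrt(r_squared), slope)  # carry the direction of the shift
print(intercept, slope, round(r_squared, 3), round(r_to_d(r), 2))
```

Because the model uses only the phase indicator, it can be applied whichever phase comes first: reversing the coding changes the sign of the slope but not R².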

Autocorrelation

In cases in which the investigator chooses to use a regression technique, it is important to be aware that

autocorrelation has been an enduring problem. Data sets with levels of autocorrelation ≥ ± .20 may be considered

problematic regardless of statistical significance (Matyas & Greenwood, 1996). The presence of autocorrelation

violates the assumption of data independence. To remove autocorrelation, one may use an autoregressive integrated

moving average model with a lag-1 parameter for backcasting rather than forecasting, as is typically done. The

traditional cautions against using time series analysis for this application do not apply (see Parker et al., 2006).
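
As a rough check on the ±.20 guideline mentioned above, lag-1 autocorrelation can be computed directly. The Python sketch below uses the usual sample estimator; the function names are hypothetical, and whether the check is applied to raw scores or to regression residuals is left to the analyst (the sketch simply takes whatever series it is given). It does not implement the backcasting correction.

```python
def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation: the sum of cross-products of adjacent
    deviations from the mean, divided by the total sum of squared deviations."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1))
    return num / denom


def is_problematic(series, threshold=0.20):
    """Flag a series whose lag-1 autocorrelation is at least the threshold in
    absolute value, regardless of statistical significance."""
    return abs(lag1_autocorrelation(series)) >= threshold


# Made-up series (illustration only): a steadily rising series and an alternating one.
rising = [1, 2, 3, 4, 5]
alternating = [1, -1, 1, -1]
print(lag1_autocorrelation(rising), is_problematic(rising))
print(lag1_autocorrelation(alternating), is_problematic(alternating))
```

Both example series would be flagged: the first for positive and the second for negative autocorrelation. Estimates from series this short are very imprecise, which is one reason the guideline is stated without reference to significance tests.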

Addressing Threats to Validity

Among the strongest (in internal validity) and most flexible SCR designs is the multiple baseline design (MBD) across

subjects (Kazdin, 1982). The MBD permits an overall judgment of intervention effectiveness from multiple (typically 3

or 4) data series. Each data series represents one client. The most simple data series is AB, that is, a baseline phase

followed by an intervention phase. The strength of the MBD is in implementing the intervention at different times for the

clients, thus reducing the likelihood that the performance change is due to some event other than the intervention.

Increasing the number of clients, each with staggered intervention onset, improves the control of “history” as an

alternative explanation for behavior change (Kazdin, 1982). For history to be present, the external event would need to

impact the participants concurrently. Any history effect should be seen across all individuals at approximately the same

time. Without such evidence, the threat of history can usually be ruled out. Maturation is only a problem in special

circumstances in which the length of the study and the variables measured may, in fact, reflect developmental changes

in the participants.

With MBD, each data series and client are viewed as an independent replication, contributing evidence to the omnibus

judgment. That judgment is easy to make when improvement is uniformly strong across clients, but when results vary,

the overall judgment of intervention effectiveness is more difficult to make. That problem situation can be handled by

calculating effect sizes.

Statistical Methods Have Improved

Recent innovations in SCR include the ability to calculate effect sizes, in most cases with confidence intervals (Parker

et al., 2005; Parker & Hagan-Burke, 2007b), the use of phase contrasts (Parker & Brossart, 2006), controlling

autoregression, controlling preexisting baseline trend (Parker et al., 2006), and the use of the bootstrap (Parker, 2006).

The number of analytic techniques available for short data series has easily tripled since the early

1980s (Barlow & Hersen, 1984; Kazdin, 1982). The difficulty has been that few studies compared the statistical

techniques with each other and with visual analysis. Thus, those who wanted to use these statistical techniques had

little information in terms of how to interpret the output. Increasingly, researchers have recognized this deficiency in the

literature base and have made some progress in addressing this need (e.g., Brossart et al., 2006; Parker & Brossart,

2003). Presently, it appears that effect sizes vary, depending on the statistical technique used to produce them, and

that the effect size magnitudes produced from cross-sectional research are very different than those produced from

SCR (e.g., Parker et al., 2005).

Summary

To summarize, SCR designs should be used because they are ideally suited to address questions unanswerable by

cross-sectional designs, they address the overuse of cross-sectional designs in the literature base, and it is no longer

the case that there are few statistical methods to analyze single-case data. In addition, the MBD is a powerful design

that competes well against other designs in terms of internal validity. In the remainder of the present article, we present

a small RCT that can be conceptualized as a hybrid multiple baseline study. We then analyze these data using a

statistical technique burdened by few assumptions, which is well suited for SCR.

Illustrative Study

To illustrate the usefulness of SCR and advanced regression methods for analyzing data from these designs, we

examined data collected from a funded project (awarded to Jay Meythaler) to conduct a randomized, double-blind,

crossover trial of propranolol with a placebo control among patients who were more than 1 year postbrain injury (BI).

Agitated behavior after BI can be very disruptive during acute medical care, inpatient rehabilitation, and in the

community. Previous studies have reported agitated behavior in 11%–34% of patients with BI in the acute phase

(Brooke, Questad, Patterson, & Bashak, 1992; Levin & Grossman, 1978; Reyes, Bhattacharyya, & Heller, 1981).

Although prevalence rates of agitation in the postacute phase are lacking, many patients seen in long-term follow-up

after severe BI demonstrate significant behavioral dyscontrol and agitation. Such sequelae have a devastating impact

on family relationships and overall functioning, considerably hampering community reintegration of persons with BI.

Agitation is generally regarded as a disturbed behavioral pattern often accompanied by overactivity and an “explosive”

(i.e., lacking goal direction), impulsive aggression among persons with BI who have regained cognitive awareness

(Corrigan & Mysiw, 1988; Silver & Yudofsky, 1994). Historically, clinicians have relied on pharmacological treatments of

agitated behavior (Cardenas & McLean, 1992; Rowland & DePalma, 1995). A recent Cochrane review of these agents

observed that beta-blockers (particularly propranolol) appear to have the best evidence of effectiveness (Fleminger,

Greenwood, & Oliver, 2006). In spite of such reviews supporting the use of beta-blockers, a recent survey indicates

that specialists seem to prefer anti-epileptics and atypical antipsychotics (Francisco, Walker, Zasler, & Bouffard, 2007).

The mechanism of action for the anti-aggressive properties of propranolol is essentially unknown, although it is unlikely

to be due to propranolol's peripheral beta-blocking activity because the doses required to manage agitated behavior

often well exceed the doses required to saturate fully peripheral beta-adrenergic receptors (Coltart & Shand, 1970;

Yudofsky, Williams, & Gorman, 1981). Propranolol likely exerts its anti-aggressive properties via central

antagonism of noradrenergic and serotonergic neurotransmission at several subsets of receptors.

For example, both the noradrenergic and serotonergic systems have been implicated as neurophysiologic substrates

of aggressive behavior in animal studies, though these systems probably subserve different types of aggressive

behavior and seem to interact in a complex fashion (Cassidy, 1990; Eichelman, 1987; Miczek, Weerts, Haney, & Tidey,

1994). The locations of noradrenergic and serotonergic cell bodies (the locus ceruleus and dorsal raphe nuclei,

respectively), as well as their neuronal (white matter) projections, are particularly vulnerable to injury within the brain

as a result of acceleration/deceleration injuries, the most common mechanism of BI (Morrison, Millier, & Grzanna,

1979; Whyte & Rosenthal, 1993). Because propranolol acts on both beta-adrenergic receptors and

serotonin 5-HT1A and 5-HT1B receptors, its apparent effectiveness in managing agitation may be related to

modulation of neurotransmission in these damaged pathways.

However, the Cochrane review noted several problems that undermine confidence in the evidence base and that merit

closer scrutiny of propranolol as the preferred intervention for agitation. The reviewers found very few RCTs to evaluate

(only six were identified in the pharmacological literature generally), a reliance on small sample sizes, a lack of

systematic reporting of all treated participants, no replication studies, and no global outcome measure to

assess the complexity of agitated behavior in this population (Fleminger et al., 2006). Although the reviewers cited the

need for further RCTs of the effectiveness of pharmacological agents, researchers and clinicians were strongly advised

to revisit the use of “N of 1 research methods” to analyze the effectiveness of the intervention in research projects and

in clinical case management (Fleminger et al., 2006).

As we observed earlier, these clinical realities and methodological issues often vex intervention research in

rehabilitation. And as we demonstrate, SCR designs and advanced regression techniques can be used efficiently to

examine the effectiveness of clinical interventions for grouped data (necessary for RCTs) and for clinical case

management (to monitor individual response to treatment). In the remainder of this article, we demonstrate the use

and implications of SCR and regression techniques in a randomized, double-blind crossover trial of propranolol in the

treatment of agitation among persons with postacute BI.

Method

Twenty individuals with BI who were sequentially enrolled in an outpatient brain injury clinic were invited to participate

in the present study. Each potential participant and his or her family members were given a thorough explanation of the

study together with a detailed informed consent document. Every effort was made to explain the purpose of the study

and the risks and benefits of participation to the potential participant, and to obtain assent or refusal. For individuals

unable to provide informed consent, decisions regarding participation fell to family members or the person's

designated surrogate decision maker.

The inclusion criteria were as follows: (a) BI due to closed or penetrating head trauma and/or hypoxia greater than 1

year prior to entry into the study; (b) 14 years of age or older; and (c) a clinically significant level of agitated behavior,

defined as that which interferes with activities of daily living or independent living skills. In order to more carefully

operationalize the level of agitated behavior necessary for inclusion, this study relied on the behavioral ratings by

family members on the Agitated Behavior Scale (ABS; Corrigan, 1989) obtained by the staff member. Prospective

individuals qualified for entry into the study if they obtained at least two scores on the ABS (described in the Measures

section) of 25 or greater in a 2-week period.

The exclusion criteria were as follows: (a) medical contraindications to initiation of a beta-adrenergic blocker,

including a recent history of congestive heart failure, cardiac arrhythmia, atrioventricular conduction defect (2nd degree

or higher), or asthma requiring pharmacologic intervention; (b) clear medical indications for prescription of a beta-

adrenergic blocker for reasons other than agitation; (c) demonstrated inability to tolerate propranolol due to

hypotension or bradycardia; and (d) suspected development of increased intracranial pressure requiring neurosurgical

intervention (e.g., placement or revision of ventricular-peritoneal shunt).

Participants

The sample available for study consisted of 13 persons with TBI (4 women, 9 men). Participants who had only two

data points in either the baseline or treatment phase were excluded. The final sample that was analyzed consisted of

10 participants. Sample ethnicity consisted of 12 Caucasians and 1 African American. The average age of the

participants was 34 years (SD = 9.78).

Measures

The ABS (Corrigan, 1989) was used to assess agitation. The ABS is a 14-item scale designed to assess agitation

objectively among persons with TBI. At the end of each observation period, raters assign a number ranging from 1

(absent) to 4 (present to an extreme degree) for each item, representing the frequency of the agitated behavior and/or

the severity of a given incident. Total scores range from 14 (no agitation) to 56 (extremely severe agitation). In

previous studies, the ABS has demonstrated adequate reliability and validity (Corrigan, 1989). Factor analysis of the

ABS yielded a three-factor solution: Aggression, Disinhibition, and Lability (Corrigan & Bogner, 1994).
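
The scoring and screening rules described for the ABS can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `abs_total` and `qualifies_for_entry` and the sample ratings are hypothetical, while the item count (14), the rating range (1 to 4), and the entry rule (at least two totals of 25 or greater within the 2-week period) are taken from the text.

```python
N_ITEMS = 14                    # items on the ABS
MIN_RATING, MAX_RATING = 1, 4   # 1 = absent, 4 = present to an extreme degree
ENTRY_CUTOFF = 25               # total score required for study entry
ENTRY_MIN_COUNT = 2             # qualifying totals needed in the 2-week period


def abs_total(item_ratings):
    """Sum 14 item ratings (each 1-4) into an ABS total between 14 and 56."""
    if len(item_ratings) != N_ITEMS:
        raise ValueError("the ABS has exactly 14 items")
    if any(r not in range(MIN_RATING, MAX_RATING + 1) for r in item_ratings):
        raise ValueError("each item must be rated 1, 2, 3, or 4")
    return sum(item_ratings)


def qualifies_for_entry(totals_in_two_weeks):
    """True if at least two ABS totals in the screening window are 25 or greater."""
    high = [t for t in totals_in_two_weeks if t >= ENTRY_CUTOFF]
    return len(high) >= ENTRY_MIN_COUNT


# Hypothetical ratings for one observation period (invented for illustration).
ratings = [2, 3, 1, 2, 2, 1, 3, 2, 1, 2, 2, 1, 3, 2]
total = abs_total(ratings)
print(total, qualifies_for_entry([total, 26, 19]))
```

Keeping the validation inside `abs_total` means a mis-keyed rating form fails loudly rather than silently producing a total outside the 14 to 56 range.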

The initial ABS was completed by a family member in an interview conducted by Timothy R. Elliott. This rating was used to

determine whether the level of agitation was sufficient to qualify for the study. At the introductory evaluation prior to randomization,

family members met with Timothy R. Elliott to learn how they were to assess agitation each week of participation with

the ABS. During this session, family members were instructed in the use of the ABS. An instructional videotape

(depicting various agitated behaviors) was played for the family members to rate the depictions of agitation on the

ABS. These ratings were reviewed and critiqued by the staff member. Family members were given copies of the ABS

and instructed to rate the participant's agitation each week. Completed scales were mailed to the research team or

returned in subsequent visits.

Intervention

The study was designed to be a randomized, double-blinded, crossover trial. Upon enrollment in the study, each

participant had a 2-week observation period during which placebo was administered in a single-blind fashion. ABS

observations began during this period. Pharmacy personnel used a double-blind randomization procedure to assign

participants to receive either the active agent (propranolol) or placebo for the first arm of the study. The study drug

(propranolol or placebo) was prepared by the pharmacy and delivered to the clinic. A 2-week supply of study drug

contained in a blister pack and labeled with the dosage increment was provided at each clinic visit.

Participants had pulse and blood pressure checked at each clinic visit. Dose of the study drug was adjusted to a

tolerated dosage increment for supine blood pressures less than 55 diastolic or 95 systolic in patients under 50 years

of age; less than 70 diastolic or 110 systolic in patients 50 years of age and over. Eight participants were started at an

initial dose of 60 mg of long-acting propranolol (Inderal-LA) per day; 2 participants were started at an initial dose of 80

mg of propranolol (Participants 4 and 7). Dosages were increased for participants who demonstrated tolerance for the

preceding dosage. From this protocol, 1 participant received a maximum dosage of 180 mg (Participant 1), 1 received

a maximum dosage of 120 mg (Participant 8), 6 participants received a maximum dosage of 80 mg (Participants 3, 4,

5, 6, 7, 10), and 2 were maintained at a dosage of 60 mg (Participants 2 and 9).

Ratings of agitation for each individual were conducted weekly from 6 to 14 weeks (average 10 weeks). Of the 10

clients, 7 were assessed over 10 or more weeks. The design for each of 9 clients was a simple AB (baseline period of

no treatment, followed by a treatment period). For 1 participant, the treatment preceded the baseline period, forming a

BA design. Baseline phases ranged from 3 to 8 data points (mean 5.3 data points), and treatment phases had the

same range (mean 5.1 data points).

Data Analysis

For many research designs, logistic regression (LR) is a close contender to ANOVA in power and sensitivity, while

being burdened with fewer data assumptions (Fox, 2000; Menard, 2002; Pampel, 2000; Tabachnick & Fidell, 1996). LR

is similar to ordinary least squares (OLS) multiple regression but uses iterative maximum likelihood estimation (MLE)

rather than OLS. Like multiple regression, LR can use any combination of categorical or continuous predictors, but the

dependent variable must be categorical. LR performs similarly to discriminant function analysis (DFA), but it is

increasingly favored over the latter because of its fewer data assumptions (Press & Wilson, 1978). Unlike OLS

regression, LR does not assume (a) a linear relationship between the independent variables and the dependent

variable, (b) normally distributed variables, or (c) equal variance per cell. LR is offered by most statistics packages,

including NCSS (Hintze, 2007), SPSS, Stata, S-Plus, SYSTAT, and SAS.

Although LR is burdened by few data assumptions, ideally it needs at least 10 observations for each predictor variable

level (e.g., the smaller Phase A or B; Peduzzi, Concato, Kemper, Holford, & Feinstein, 1996). In addition, all predictor

cells should have frequencies of at least 1, and no more than 20% of cells should have frequencies below 5.
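
These rules of thumb are easy to check before running LR. The sketch below is a hypothetical helper, not part of any statistics package: the function name `check_lr_guidelines` and the example counts are invented, and the first rule is read here as requiring the smaller phase to contain at least 10 observations per predictor, which is one interpretation of the guideline.

```python
def check_lr_guidelines(phase_counts, cell_counts, n_predictors=1):
    """Return pass/fail flags for the three rules of thumb described in the text.

    phase_counts: observations per level of the dependent variable, e.g. {"A": 12, "B": 9}
    cell_counts:  frequencies in each predictor cell
    """
    smaller_phase = min(phase_counts.values())
    sparse_cells = sum(1 for count in cell_counts if count < 5)
    return {
        # Assumed reading: 10 observations in the smaller phase per predictor.
        "ten_per_predictor": smaller_phase >= 10 * n_predictors,
        # Every cell should have a frequency of at least 1.
        "no_empty_cells": all(count >= 1 for count in cell_counts),
        # No more than 20% of cells should have frequencies below 5.
        "at_most_20pct_sparse": sparse_cells <= 0.20 * len(cell_counts),
    }


# Hypothetical counts for a short single-case series.
flags = check_lr_guidelines({"A": 12, "B": 9}, [7, 5, 5, 4])
print(flags)
```

With only four cells, a single sparse cell already exceeds the 20% limit, which illustrates how quickly short series run up against these guidelines.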

LR does not yield a true effect size but rather one or more quasi-R² approximations (e.g., Cox & Snell, 1989;

Nagelkerke, 1991). These quasi effect sizes must be interpreted cautiously (e.g., they do not represent “percent of

variance explained” as do true R²s). A second LR output, and the one most important to this article, is a summary of

LR prediction accuracy in a 2 × 2 table. LR predicts membership of each data point in either baseline or intervention

phase, based on its relative magnitude. Chance level is 50% accuracy. The 2 × 2 agreement table, when analyzed by

using chi-square, yields Pearson's phi index of association, a bona fide effect size. Pearson's Φ and Φ² are R²

family members, and familiar to many researchers (Cohen, 1988). Phi also can be calculated from chi-square: $\Phi = \sqrt{\chi^2 / N}$

(where N is the total number of frequency counts in the 2 × 2 table).

In a balanced 2 × 2 table (from LR), phi also can be obtained by submitting four internal scores to analysis in a “two

proportions” statistical module. Phi approximates the difference between the two proportions and is exactly the same in

a balanced table. An advantage of using a “two proportions” module for analysis is that it commonly outputs

confidence intervals.
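
The relationship between phi and the difference between two proportions can be checked numerically. The sketch below uses hypothetical tables and assumes that "balanced" means both row totals and both column totals are equal; the function names and the cell layout, rows (a, b) and (c, d), are illustrative choices rather than the output format of any particular module.

```python
import math


def phi_from_cells(a, b, c, d):
    """Phi for a 2 x 2 table laid out as rows (a, b) and (c, d)."""
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return (a * d - b * c) / math.sqrt(row1 * row2 * col1 * col2)


def proportion_difference(a, b, c, d):
    """Difference between the row-wise proportions falling in the second column."""
    return d / (c + d) - b / (a + b)


# Hypothetical balanced table: all four marginal totals equal 10.
balanced = (8, 2, 2, 8)
# Hypothetical unbalanced table: row totals 12 and 8, column totals 11 and 9.
unbalanced = (9, 3, 2, 6)

for cells in (balanced, unbalanced):
    print(cells, phi_from_cells(*cells), proportion_difference(*cells))
```

In the balanced table the two quantities coincide apart from floating-point rounding; in the unbalanced table they are close but not identical, which is why the text describes phi as approximating the difference in the general case.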

In a single-case design, LR analysis requires two predictor variables, participant and scores, and the dependent

measure, PhaseAB. Though not essential, a fourth serial sequence variable, time, should be added. Participant is a

categorical predictor variable whose number of levels equals the number of clients (data series) (Levels I, II, III, etc.).

Scores serve as a predictor rather than as a dependent or criterion variable, as is the case with ANOVA or OLS

regression. The dependent variable, PhaseAB, is dichotomous (Levels A, B). A one-way (noninteraction) model is

specified. The output needed for the present study is only the 2 × 2 prediction accuracy table, which is ordinarily used

for prediction specificity and sensitivity (involving false negatives and false positives). Through LR, an attempt is made

to predict the phase to which a score belongs (baseline vs. treatment), based on its size. The prediction is made on

the basis of all participants' data, but the classification results also can be disaggregated by individual participant.
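
One way to picture the data layout this model requires is a "long" table with one row per weekly rating. The sketch below is a hypothetical illustration: the helper `to_long_format`, the field names, and the ABS values are invented, and each participant's phases are stored in the order they occurred so that a BA series can be represented alongside AB series.

```python
def to_long_format(series_by_participant):
    """Flatten per-participant phase series into one row per weekly rating."""
    rows = []
    for participant, phases in series_by_participant.items():
        time = 0
        for phase, scores in phases:          # phases are listed in temporal order
            for score in scores:
                time += 1
                rows.append({
                    "participant": participant,  # categorical predictor
                    "time": time,                # optional serial sequence variable
                    "score": score,              # predictor (not the outcome)
                    "phase_ab": phase,           # dichotomous dependent variable
                })
    return rows


# Hypothetical weekly ABS totals; the values are invented for illustration.
series = {
    "I": [("A", [30, 28, 33]), ("B", [27, 25, 24, 22])],
    "II": [("B", [26, 29, 31]), ("A", [30, 27, 28])],   # a BA ordering
}

rows = to_long_format(series)
for row in rows[:3]:
    print(row)
```

Note the reversal relative to ANOVA or OLS regression: `score` sits among the predictors, and `phase_ab` is what the model tries to predict.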

Results

Analysis of the propranolol data set results in a classification table presented in Table 1. Table 1 indicates that the

classification accuracy for these data is only about chance level, 50%. Any given data point has an equal chance of

belonging to the baseline versus treatment condition. These results represent an unsuccessful intervention. From a

total of 104 data points, only 57% were classified correctly for phase membership, which is close to chance level.

Submitting this table (only the interior four scores) to a chi-square analysis yields χ² = 1.9. Phi is output directly as

.135 or can be calculated as $\Phi = \sqrt{\chi^2 / N} = \sqrt{1.9 / 104} = .135$.

Phi can be interpreted approximately as “prediction accuracy beyond chance.”

Table 1. Classification Accuracy Table

From the 2 × 2 table, we also can calculate phi from the difference between two ratios: d/f − b/e = 30/51 − 24/53 =

.5882 − .4528 = .135. A two proportions statistical module provides a 90% exact confidence interval of −.03 < .135 <

.29; because the interval spans zero, the effect could have been obtained by chance alone. On the basis of all 10

participants, this phi effect size of approximately .14 indicates the magnitude of change from baseline to intervention

phases for this particular intervention. Guidelines for interpreting phi magnitudes were recently derived from 165

analyses of published SCR data (Parker & Hagan-Burke, 2007a). LR results correlated .83 with visual judgments, and

studies judged to show small or negligible results had effect sizes (interquartile range [IQR]) of .09–.43. Studies judged

as showing medium-size results had effect sizes (IQR) of .53–.72. And studies judged as showing large results had

effect sizes (IQR) of .82–1.0.
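
The arithmetic reported for the pooled classification table can be retraced with the sketch below. The four cell values are not printed in the text; they are inferred here from the quoted ratios (30/51 and 24/53) under the assumption that the table is laid out as rows (a, b) and (c, d) with row totals e and f. The confidence interval uses a Wald normal approximation, which is a simpler stand-in and not the exact method the two proportions module applied.

```python
import math
from statistics import NormalDist

# Cells inferred from the ratios d/f = 30/51 and b/e = 24/53 (layout is an assumption).
a, b = 29, 24   # row total e = 53
c, d = 21, 30   # row total f = 51
n = a + b + c + d


def pearson_chi_square(table):
    """Pearson chi-square for a 2 x 2 table, without continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


chi2 = pearson_chi_square([[a, b], [c, d]])
phi = math.sqrt(chi2 / n)

# Difference between the two proportions, as in the text: d/f - b/e.
p1, p2 = d / (c + d), b / (a + b)
diff = p1 - p2

# Wald 90% interval (normal approximation, not the exact interval reported).
z = NormalDist().inv_cdf(0.95)
se = math.sqrt(p1 * (1 - p1) / (c + d) + p2 * (1 - p2) / (a + b))
lower, upper = diff - z * se, diff + z * se

print(f"N={n} chi2={chi2:.2f} phi={phi:.3f} diff={diff:.3f} CI=({lower:.2f}, {upper:.2f})")
```

Under these assumptions the sketch agrees with the reported values to rounding (N = 104, χ² of about 1.9, Φ and the proportion difference both about .135), and the approximate interval also spans zero. Phi and the difference match only approximately because the inferred table is nearly, but not perfectly, balanced.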

This effect size does not indicate whether this change (or lack of change) can be attributed to the intervention.

Attributing change to the intervention depends on strength of the design. The design of this example is a multiple-

baseline design with 10 independent client AB data series, and with treatment initiated at 10 different times. Most

single-case researchers would consider this a strong design. Thus, our hypothesis that participants with agitation

would have a significant reduction in ABS scores on propranolol as compared with placebo was not supported.

Analyses by Participant

Besides obtaining an index of overall intervention effect, diagnostic understanding can be gained from effect sizes for

individual participants. This is accomplished in LR by dropping the participant predictor variable and entering the data

only one participant at a time. Table 2 includes the 10 effect sizes for the individual participants, which fall into

roughly two groups: a larger group of “little or no effect” (Φ = .04, .00, .00, .07, .33, .00) and a smaller group of

“moderate to strong effect” (Φ = .52, 1.0, .63, .87). We include these effect sizes and confidence intervals for the

individual participants because clinicians involved in monitoring patient progress would focus on each unique client's

progress, whereas researchers would probably want to distill the results across multiple baselines in order to

determine whether the treatment was effective. Generally, these results indicate that propranolol was not effective in

lowering agitation for the majority of participants. The level of analysis one uses depends on the question one needs

answered.
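
The per-participant procedure can be illustrated with a minimal, self-contained sketch. Several things here are assumptions rather than the authors' implementation: the ABS values are invented, plain gradient ascent is swapped in for the iterative maximum likelihood routines that statistics packages use, a predicted probability above .5 is taken as the classification rule, and all function names are hypothetical.

```python
import math


def fit_logistic(scores, is_treatment, learning_rate=0.1, steps=5000):
    """Fit P(treatment | score) by gradient ascent on the log-likelihood."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)
    z = [(s - mean) / sd for s in scores]          # standardize for stable steps
    b0 = b1 = 0.0
    for _ in range(steps):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * zi))) for zi in z]
        b0 += learning_rate * sum(y - pi for y, pi in zip(is_treatment, p)) / n
        b1 += learning_rate * sum((y - pi) * zi
                                  for y, pi, zi in zip(is_treatment, p, z)) / n

    def predict(score):
        logit = b0 + b1 * (score - mean) / sd
        return 1 if logit > 0 else 0               # probability above .5
    return predict, b1


def classification_table(scores, is_treatment, predict):
    """Return cells [[A as A, A as B], [B as A, B as B]]."""
    table = [[0, 0], [0, 0]]
    for s, y in zip(scores, is_treatment):
        table[y][predict(s)] += 1
    return table


# Hypothetical ABS totals for one participant (not data from the study).
baseline = [24, 29, 30, 31, 32, 33]
treatment = [20, 21, 22, 23, 24, 29]
scores = baseline + treatment
phase = [0] * len(baseline) + [1] * len(treatment)

predict, slope = fit_logistic(scores, phase)
(a, b), (c, d) = classification_table(scores, phase, predict)
phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
accuracy = (a + d) / len(scores)
print(a, b, c, d, round(phi, 3), round(accuracy, 3), slope < 0)
```

In this invented series, 10 of 12 points are classified correctly. The sign of the fitted slope carries information that phi alone does not: a negative slope means lower scores are associated with the treatment phase, whereas the same phi could arise from a series in which agitation rose on treatment.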

Table 2. Classification Tables for Each Individual Participant

Comparison to Regression

The results from the SMS regression model conducted on each participant are included in Table 2. In general, there

were three groups of participants. The largest group contained those participants that demonstrated no effect while

taking propranolol. These participants obtained R² values of .02, .07, .04, .05, .22, and .02 and classification rates of

54.5%, 50%, 50%, 54.5%, 66.7%, and 62.5%, respectively. A graphic depiction of the lack of effects observed for one

participant in this group is presented in Figure 1.

Figure 1. Example data set of noneffective treatment of agitation with propranolol. ABS = Agitated Behavior Scale.

Solid circles represent data collected in the baseline phase; solid triangles represent data from the treatment phase.

Two participants exhibited significantly elevated agitation during the propranolol phase (a 33-year-old Caucasian

woman and a 37-year-old Caucasian man). We obtained R² values of .23 and .70 with classification rates of 80% and

83.3%. The small R² value for the first participant seems to reflect the statistically nonsignificant phi. Figure 2 depicts

the ratings obtained for the 37-year-old man who exhibited significantly greater agitation on propranolol compared with

placebo.

Figure 2. Example data set of participant deterioration while on propranolol for the treatment of agitation. ABS =

Agitated Behavior Scale. Solid circles represent data collected in the baseline phase; solid triangles represent data

from the treatment phase.

Two other participants displayed significantly less agitation on propranolol than on placebo (a 51-year-old African

American man and a 23-year-old Caucasian man). We obtained R² values of .70 and .73 and classification rates of

100% and 92.9%. Figure 3 depicts the significant improvement exhibited by the 51-year-old man during the propranolol

phase.

Figure 3. Example data set of participant improvement while on propranolol for the treatment of agitation. ABS =

Agitated Behavior Scale. Solid circles represent data collected in the baseline phase; solid triangles represent data

from the treatment phase.

Two participants obtained R² values of .23 and .22. Their classification rates and phi values

were 80%, .52 (p = .10) and 66.7%, .33 (p = .41), respectively. The 80% classification rate for the first case is rather high,

but the phi value and the confidence interval for the bootstrapped R² value, which contains zero,

suggest that such a high classification rate should be interpreted with caution. Interpretability would likely be increased

if this case had one or two more data points in the baseline phase, beyond the minimum of three. The other R² value

of .22 was not associated with a high classification rate. This data series also obtained a nonsignificant phi, and the

confidence interval from the bootstrapped R² also contained zero. One can more confidently conclude that there is no

treatment effect in these cases.

It is important to emphasize that the interpretation of any statistical output needs to include visual analysis. These

results show that large effect sizes do not inform one as to the directionality of the results. This study produced some

high-correct classification rates; however, half those participants did better on propranolol, and the other half did worse

on propranolol.

Discussion

Our main objective in this article was to present a discussion of advanced regression methods for the analysis of data

produced by SCR. We presented two very different methods, LR and OLS regression. This is the first attempt that we

are aware of in which investigators have used LR to analyze SCR data. The application to an RCT with multiple-baseline

data from a drug study of the effectiveness of propranolol to treat agitation among individuals with BI was ideal for this

demonstration because of the high internal validity and the multiple data sets available for analysis. Factors suggesting

a high degree of internal validity include multiple-baseline design, double-blind features, and random assignment to

the ordering of treatment condition. Thus, although the overall sample size was small, the degree of experimental

control for the present study appears to be rather high.

Our analysis of the multiple-baseline data suggests that overall propranolol was not an effective treatment for agitation.

The effect size based on these data was .14. This is a small and nonsignificant effect size and could have been

obtained by chance alone. Yet, when we analyzed participants separately, we found that there were interesting

differences among the participants. Six individuals experienced little or no effect on propranolol. Four others evidenced

a moderate to strong effect in response to propranolol: 2 of these participants improved, and the other 2 did worse.

The individual variations in treatment response, which any analysis of overall group performance cannot address,

suggest that agitation may be influenced by several factors that have yet to be isolated or understood. The results of

our demonstration, then, have implications for clinical case management and for isolating other variables in future

studies of propranolol in the treatment of agitation.

Clinical scientists are typically interested in their patients' responses to treatment. The analysis of each participant's

data separately is in line with the clinician's interest in patient progress. We can see in these profiles that any particular

client's response may vary markedly from the overall analysis (which suggested no effect for the group). As seen in

these results, a few participants had notable results with propranolol. In the absence of contraindications and

troublesome side effects, some clinicians may choose to prescribe propranolol for agitation because it was effective for

some clients. Such a choice would seem to require adequate monitoring to determine whether continued

administration was beneficial, worthwhile, and cost-effective. These observations are consistent with other expert

opinions concerning the use of propranolol in the treatment of agitation (Fleminger et al., 2006).

There are limitations with the techniques we have demonstrated in this study. One does not evaluate trends or curves

with LR. In some cases, trend lines or curves may be of primary interest. In such cases, LR would not be an ideal

analytic tool. LR also has a ceiling limitation. If a treatment obtains a 100% correct classification rate (a Φ of 1), then

there is no way in which to evaluate any magnitude of difference with the technique beyond the minimum required to

obtain the 100% classification rate. Furthermore, additional work remains to determine how this LR procedure

performs with a wide variety of single-case data sets.

Although we have focused on the statistical results, it is important to note that the ratings we obtained in this study

were not complicated by patient self-report. The participants were rated by their family member. Thus, for any

participant who improved on propranolol, it may not be necessary for statistical significance to be achieved. Improved

quality of life for the family may be a more important consideration in some clinical scenarios.

Beyond the findings of this particular study, it should be noted that with an appropriate measure of outcome and the

implementation of a multiple-baseline design, we presented in this article a statistical procedure that should be

appreciated by peer reviewers and peer-reviewed outlets. There is no longer an acceptable rationale for conducting

SCR without statistical analysis. Single-case studies that feature a strong rationale, a multiple-baseline design, and

appropriate statistical analyses deserve a place in the evidentiary foundation of rehabilitation psychology research.

With all the key elements in place, any reluctance to publish such a study probably reflects editorial bias more than a

scholarly critique.

In many respects, the present controversies and needs in our research make for an exciting time for single-case

researchers. New statistical methods continue to be developed and refined. No longer must the single-case researcher

rely solely on visual analysis: Regression methods such as those presented here provide two powerful yet very

different methods for analyzing single-case data. In conjunction with visual analysis, it is hoped that those who may

have previously avoided SCR will now see new avenues for productive inquiry that can improve clinical practice,

enrich the literature base, and improve the quality of life for consumers of rehabilitation services.

References

Aeschleman, S. R. (1991). Single-subject research designs: Some misconceptions. Rehabilitation Psychology, 36,

43–49.

Allison, D. B., & Gorman, B. S. (1993). Calculating effect sizes for meta-analysis: The case of the single case.

Behaviour Research and Therapy, 31, 621–631.

Allison, D. B., & Gorman, B. S. (1994). “Make things as simple as possible, but no simpler.” A rejoinder to Scruggs and

Mastropieri. Behaviour Research and Therapy, 32, 885–890.

Backman, C. L., Harris, S. R., Chisholm, J. M., & Monette, A. (1997). Single-subject research in rehabilitation: A review

of studies using AB, withdrawal, multiple baseline, and alternating treatments designs. Archives of Physical Medicine

and Rehabilitation, 78, 1145–1153.

Baltes, P. B., & Nesselroade, J. R. (1979). History and rationale of longitudinal research. In J. R. Nesselroade & P.

B. Baltes (Eds.), Longitudinal research in the study of behavior and development (pp. 1–39). London: Academic Press.

Barlow, D. H., & Hersen, M. (Eds.). (1984). Single case experimental designs: Strategies for studying behavior change

(2nd ed.). Oxford, England: Pergamon Press.

Blampied, N. M. (2000). Single-case research designs: A neglected alternative. American Psychologist, 55, 960.

Brooke, M. M., Questad, K. K., Patterson, D. R., & Bashak, K. J. (1992). Agitation and restlessness after closed head

injury: A prospective study of 100 consecutive admissions. Archives of Physical Medicine and Rehabilitation, 73,

320–323.

Brossart, D. F., & Parker, R. I. (2001, March). Evaluating client improvement: Interrupted time series methods. Poster

session presented at the Houston 2001 National Counseling Psychology Conference, Houston, Texas.

Brossart, D. F., Parker, R. I., Olson, E. A., & Mahadevan, L. (2006). The relationship between visual analysis and five

statistical analyses in a simple AB single-case research design. Behavior Modification, 30, 531–563.

Brucker, B. S., & Ince, L. P. (1977). Biofeedback as an experimental treatment for postural hypotension in a patient

with a spinal cord lesion. Archives of Physical Medicine and Rehabilitation, 58, 49–53.

Busk, P. L., & Marascuilo, L. A. (1992). Statistical analysis in single-case research: Issues, procedures, and

recommendations, with applications to multiple behaviors. In T. R. Kratochwill & J. R. Levin (Eds.), Single-case research

design and analysis: New directions for psychology and education (pp. 159–185). Hillsdale, NJ: Erlbaum.

Callahan, C. D., & Barisa, M. T. (2005). Statistical process control and rehabilitation outcome: The single-subject

design reconsidered. Rehabilitation Psychology, 50, 24–33.

Cardenas, D. D., & McLean, A. (1992). Psychopharmacologic management of traumatic brain injury. Physical Medicine

and Rehabilitation Clinics of North America, 3, 273–290.

Cassidy, J. W. (1990). Neurochemical substrates of aggression: Toward a model for improved intervention, part 1.

Journal of Head Trauma Rehabilitation, 5, 83–86.

Center, B. A., Skiba, R. J., & Casey, A. (1985–1986). A methodology for the quantitative synthesis of intra-subject

design research. Journal of Special Education, 19, 387–400.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Coltart, D. J., & Shand, D. G. (1970). Plasma propranolol levels in the quantitative assessment of beta-adrenergic

blockade in man. British Medical Journal, 3, 731–734.

Concato, J., Shah, N., & Horwitz, R. I. (2000). Randomized, controlled trials, observational studies, and the hierarchy

of research designs. New England Journal of Medicine, 342, 1887–1892.

Corrigan, J. D. (1989). Development of a scale for assessment of agitation following traumatic brain injury. Journal of

Clinical and Experimental Neuropsychology, 11, 261–277.

Corrigan, J. D., & Bogner, J. A. (1994). Factor structure of the Agitated Behavior Scale. Journal of Clinical and

Experimental Neuropsychology, 16, 386–392.

Corrigan, J. D., & Mysiw, W. J. (1988). Agitation following traumatic brain injury: Equivocal evidence for a discrete

stage of cognitive recovery. Archives of Physical Medicine and Rehabilitation, 69, 487–492.

Cox, D. R., & Snell, E. J. (1989). Analysis of binary data (2nd ed.). London: Chapman & Hall.

Crosbie, J. (1993). Interrupted time-series analysis with brief single-subject data. Journal of Consulting and Clinical

Psychology, 61, 966–974.

Crosbie, J. (1995). Interrupted time-series analysis with short series: Why it is problematic; how it can be improved. In

J. M. Gottman (Ed.), The analysis of change (pp. 361–395). Mahwah, NJ: Erlbaum.

Davison, A. C., & Hinkley, D. V. (1997). Bootstrap methods and their application. Cambridge, England: Cambridge

University Press.

DeProspero, A., & Cohen, S. (1979). Inconsistent visual analyses of intrasubject data. Journal of Applied Behavior

Analysis, 12, 573–579.

Dunn, D., & Elliott, T. (in press). The place and promise of theory in rehabilitation psychology. Rehabilitation

Psychology.

Eichelman, B. (1987). Neurochemical and psychopharmacologic aspects for aggressive behavior. In H. Y. Meltzer

(Ed.), Psychopharmacology: The third generation of progress. New York: Raven Press.

Elliott, T. R. (2007). Registering randomized clinical trials and the case for CONSORT. Experimental and Clinical

Psychopharmacology, 15, 511–518.

Faith, M. S., Allison, D. B., & Gorman, B. S. (1996). Meta-analysis of single-case research. In R. D. Franklin, D.

B. Allison, & B. S. Gorman (Eds.), Design and analysis of single-case research (pp. 245–277). Mahwah, NJ: Erlbaum.

Fidler, F. (2002). The fifth edition of the APA Publication Manual: Why its statistics recommendations are controversial.

Educational and Psychological Measurement, 62, 749–770.

Fleminger, S., Greenwood, R. J., & Oliver, D. L. (2006). Pharmacological management for agitation and aggression in

people with acquired brain injury. Cochrane Database of Systematic Reviews, (4), CD003299. DOI:

10.1002/14651858.pub2.

Fordyce, W. E. (1976). Behavioral methods in chronic pain and illness. St. Louis, MO: Mosby, Inc.

Fox, J. (2000). Multiple and generalized nonparametric regression. Thousand Oaks, CA: Sage.

Francisco, G. E., Walker, W. C., Zasler, N. D., & Bouffard, M. H. (2007). Pharmacological management of

neurobehavioral sequelae of traumatic brain injury: A survey of current physiatric practice. Brain Injury, 21,

1007–1014.

Goldfried, M. R., & Wolfe, B. E. (1996). Psychotherapy practice and research: Repairing a strained alliance. American

Psychologist, 51, 1007–1016.

Good, P. I. (2001). Resampling methods: A practical guide to data analysis. Boston: Birkhäuser Boston.

Gorsuch, R. L. (1983). Three methods for analyzing time-series (N of 1) data. Behavioral Assessment, 5, 141–154.

Harbst, K. B., Ottenbacher, K. J., & Harris, S. R. (1991). Interrater reliability of therapists' judgments of graphed data.

Physical Therapy, 71, 107–115.

Hilliard, R. B. (1993). Single-case methodology in psychotherapy process and outcome research. Journal of

Consulting and Clinical Psychology, 61, 373–380.

Hintze, J. (2007). NCSS, PASS, and GESS [Computer software]. Kaysville, UT: NCSS.

Huitema, B. E. (2004). Analysis of interrupted time-series experiments using ITSE: A critique. Understanding Statistics:

Statistical Issues in Psychology, Education, and the Social Sciences, 3, 27–46.

Ince, L. P., Brucker, B. S., & Alba, A. (1978). Reflex conditioning in a spinal man. Journal of Comparative and

Physiological Psychology, 92, 796–802.

Kazdin, A. E. (1982). Single-case research designs: Methods for clinical and applied settings. New York: Oxford

University Press.

Kratochwill, T. R., & Brody, G. H. (1978). Single subject designs: A perspective on the controversy over employing

statistical inference and implications for research and training in behavior modification. Behavior Modification, 2,

291–307.

Levin, H. S., & Grossman, R. G. (1978). Behavioral sequelae of closed head injury: A quantitative study. Archives of

Neurology, 35, 720–727.

Lunneborg, C. E. (2000). Data analysis by resampling: Concepts and applications. Pacific Grove, CA: Brooks/Cole.

Malament, I. B., Dunn, M. E., & Davis, R. (1975). Pressure sores: An operant conditioning approach to prevention.

Archives of Physical Medicine and Rehabilitation, 56, 161–165.

Matyas, T. A., & Greenwood, K. M. (1996). Serial dependency in single-case time series. In R. D. Franklin, D. B. Allison,

& B. S. Gorman (Eds.), Design and analysis of single-case research (pp. 215–243). Mahwah, NJ: Erlbaum.

Menard, S. (2002). Applied logistic regression analysis (2nd ed.). Thousand Oaks, CA: Sage.

Miczek, K. A., Weerts, E., Haney, M., & Tidey, J. (1994). Neurobiological mechanisms controlling aggression:

Preclinical developments for pharmacotherapeutic interventions. Neuroscience & Biobehavioral Reviews, 18, 97–110.

Morgan, D. L., & Morgan, R. K. (2001). Single-participant research design: Bringing science to managed care.

American Psychologist, 56, 119–127.

Morrison, J. H., Molliver, M. E., & Grzanna, R. (1979, July 20). Noradrenergic innervation of cerebral cortex: Widespread

effects of local cortical lesions. Science, 205, 313–316.

Nagelkerke, N. J. D. (1991). A note on a general definition of the coefficient of determination. Biometrika, 78, 691–692.

Nesselroade, J. R. (1991). Interindividual differences in intraindividual change. In L. M. Collins & J. L. Horn (Eds.), Best

methods for the analysis of change: Recent advances, unanswered questions, future directions (pp. 92–105).

Washington, DC: American Psychological Association.

Oakley, A. (2002). Social science and evidence-based everything: The case of education. Educational Review, 54,

277–286.

Ottenbacher, K. J. (1990). Visual inspection of single-subject data: An empirical analysis. Mental Retardation, 28,

283–290.

Pampel, F. C. (2000). Logistic regression: A primer. Thousand Oaks, CA: Sage.

Park, H., Marascuilo, L., & Gaylord-Ross, R. (1990). Visual inspection and statistical analysis of single-case designs.

Journal of Experimental Education, 58, 311–320.

Parker, R. I. (2006). Increased reliability for single-case research results: Is the bootstrap the answer? Behavior

Therapy, 37, 326–338.

Parker, R. I., & Brossart, D. F. (2003). Evaluating single-case research data: A comparison of seven statistical

methods. Behavior Therapy, 34, 189–211.

Parker, R. I., & Brossart, D. F. (2006). Phase contrasts for multiphase single case intervention designs. School

Psychology Quarterly, 21, 46–61.

Parker, R. I., Brossart, D. F., Vannest, K. J., Long, J. R., De-Alba, R. G., Baugh, F. G., et al. (2005). Effect sizes in

single case research: How large is large? School Psychology Review, 34, 116–132.

Parker, R. I., Cryer, J., & Byrns, G. (2006). Controlling baseline trend in single-case research. School Psychology

Quarterly, 21, 418–443.

Parker, R. I., & Hagan-Burke, S. (2007a). Median-based overlap analysis for single case data: A second study.

Behavior Modification, 31, 919–936.

Parker, R. I., & Hagan-Burke, S. (2007b). Useful effect size interpretations for single-case research. Behavior Therapy,

38, 95–105.

Parsonson, B. S., & Baer, D. M. (1992). The visual analysis of data, and current research into the stimuli controlling it.

In T. R. Kratochwill & J. R. Levin (Eds.), Single-case research design and analysis (pp. 15–40). Hillsdale, NJ: Erlbaum.

Peduzzi, P., Concato, J., Kemper, E., Holford, T. R., & Feinstein, A. R. (1996). A simulation study of the number of events

per variable in logistic regression analysis. Journal of Clinical Epidemiology, 49, 1373–1379.

Pijnenborg, G. H. M., Withaar, F. K., Evans, J. J., van den Bosch, R. J., & Brouwer, W. H. (2007). SMS text messages

as a prosthetic aid in the cognitive rehabilitation of schizophrenia. Rehabilitation Psychology, 52, 236–240.

Press, S. J., & Wilson, S. (1978). Choosing between logistic regression and discriminant analysis. Journal of the

American Statistical Association, 73, 699–705.

Reyes, R. L., Bhattacharyya, A. K., & Heller, D. (1981). Traumatic head injury: Restlessness and agitation as

prognosticators of physical and psychologic improvement in patients. Archives of Physical Medicine and

Rehabilitation, 62, 20–23.

Rosenthal, R. (1991). Meta-analytic procedures for social research (Rev. ed.). Newbury Park, CA: Sage.

Rowland, T., & DePalma, L. (1995). Current neuropharmacologic interventions for the management of brain injury

agitation. NeuroRehabilitation, 5, 219–232.

Shavelson, R., & Towne, L. (Eds.). (2002). Scientific research in education. Washington, DC: Committee on Scientific

Principles for Educational Research, National Research Council, National Academy Press.

Silver, J. M., & Yudofsky, S. C. (1994). Aggressive disorders. In J. M. Silver, S. C. Yudofsky, & R. E. Hales (Eds.),

Neuropsychiatry of traumatic brain injury (pp. 313–356). Washington, DC: American Psychiatric Press.

Simon, J. L. (1999). Resampling: The new statistics. Arlington, VA: Rita Simon.

Spring, B., Pagoto, S., Kaufmann, P. G., Whitlock, E. P., Glasgow, R. E., Smith, T. W., et al. (2005). Invitation to a

dialogue between researchers and clinicians about evidence-based behavioral medicine. Annals of Behavioral

Medicine, 30, 125–137.

Stevens, J. (2007). Repeated measures analysis. In Intermediate statistics: A modern approach (3rd ed.). Mahwah,

NJ: Erlbaum.

Tabachnick, B. G., & Fidell, L. S. (1996). Using multivariate statistics (3rd ed.). New York: HarperCollins.

Tucker, J. A., & Roth, D. L. (2006). Extending the evidence hierarchy to enhance evidence-based practice for

substance use disorders. Addiction, 101, 918–932.

Wehman, P., West, M., Fry, R., Sherron, P., Groah, C., Kreutzer, J., et al. (1989). Effect of supported employment on

the vocational outcomes of persons with traumatic brain injury. Journal of Applied Behavior Analysis, 22, 395–405.

White, G. W., Mathews, R. M., & Fawcett, S. B. (1989). Reducing risk of pressure sores: Effects of watch prompts and

alarm avoidance on wheelchair push-ups. Journal of Applied Behavior Analysis, 22, 287–295.

Whyte, J., & Rosenthal, M. (1993). Rehabilitation of the patient with traumatic brain injury. In J. A. DeLisa (Ed.),

Rehabilitation medicine: Principles and practice (2nd ed., pp. 825–611). Philadelphia: Lippincott.

Yudofsky, S., Williams, D., & Gorman, J. (1981). Propranolol in the treatment of rage and violent behavior in patients

with chronic brain syndromes. American Journal of Psychiatry, 138, 218–220.

APPENDICES

APPENDIX A: Scenarios In Which Single-Case Research Designs are Useful

APPENDIX B

Within NCSS 2007, select Analysis, Regression/Correlation, Logistic Regression. Enter the phase variable as the dependent

variable and the treatment score variable as the Numeric Independent Variable (assuming it is continuous). In the output,

the Response Analysis section reports the % Correctly Classified; the row titled Total gives the overall percentage of

correctly classified data points. The section titled Classification Table contains the values to enter into the

Proportions – Two Independent analysis, which is listed under Analysis, Proportions. Take care to enter each value from

the classification table into the corresponding cell of the proportions test. Select Difference in the Statistics box and

Exact (although a Bootstrap is also available) in the Confidence Intervals box. The output lists Phi under the column

titled Estimated Value, with the confidence interval listed next to it. The Confidence Intervals tab can be used to set

the level of the confidence intervals the program produces. Alternatively, one may use the Chi-square Effect Size

Estimator found under Analysis, Descriptive Statistics, Contingency Tables. When the classification table is entered in

the cell boxes, the program produces the chi-square, the effect size (Phi), and the probability level. Following these

directions should yield all the output needed to report one's results.
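For readers without access to NCSS, the same sequence of steps can be roughly sketched in Python. This is an illustrative sketch only, not the NCSS implementation: the scores are hypothetical, the function names are invented for this example, a 0.5 probability cutoff is assumed for classification, and the exact confidence interval that NCSS reports is not reproduced. Phi is computed here from the 2 x 2 classification table as (ad - bc) / sqrt of the product of the four marginal totals, which equals the difference between two proportions only when the row and column marginals match.

```python
import math

def fit_logistic(x, y, iters=50):
    """Newton-Raphson fit of P(y=1 | x) = 1 / (1 + exp(-(b0 + b1*x)))."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            z = max(-30.0, min(30.0, b0 + b1 * xi))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient, intercept
            g1 += (yi - p) * xi       # gradient, slope
            h00 += w                  # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:               # near-singular, e.g. fully separated phases
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1

def classification_table(x, y, b0, b1):
    """2x2 counts: rows = actual phase (0, 1), columns = predicted phase (0, 1)."""
    table = [[0, 0], [0, 0]]
    for xi, yi in zip(x, y):
        pred = 1 if b0 + b1 * xi >= 0 else 0  # p >= 0.5 exactly when z >= 0
        table[yi][pred] += 1
    return table

def phi_and_chi_square(table):
    """Phi and the uncorrected Pearson chi-square (n * phi^2) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    phi = (a * d - b * c) / denom if denom else 0.0
    return phi, n * phi * phi

# Hypothetical scores, invented for illustration (not data from the study).
baseline = [7, 8, 6, 9, 7, 5, 8, 6]    # phase coded 0
treatment = [4, 5, 3, 6, 4, 7, 3, 5]   # phase coded 1
scores = baseline + treatment
phase = [0] * len(baseline) + [1] * len(treatment)

b0, b1 = fit_logistic(scores, phase)
table = classification_table(scores, phase, b0, b1)
pct_correct = 100.0 * (table[0][0] + table[1][1]) / len(scores)
phi, chi2 = phi_and_chi_square(table)

print("Classification table:", table)
print("Percent correctly classified: %.1f" % pct_correct)
print("Phi: %.3f  Chi-square: %.3f" % (phi, chi2))
```

Phase is the dependent variable and the score is the predictor, mirroring the NCSS setup described above. If the two phases do not overlap at all, the logistic fit does not converge to finite coefficients, and the sketch simply stops iterating.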

Submitted: December 31, 2007 Revised: June 3, 2008 Accepted: June 5, 2008

Source: Rehabilitation Psychology, Vol. 53(3), Aug 2008, pp. 357–369

Accession Number: 2008-11210-010

Digital Object Identifier: 10.1037/a0012973
