Part II • Alternative Approaches to Program Evaluation

In Part One, we referred to the varying roles that evaluation studies can play in education, government, business, nonprofit agencies, and many related areas, and readers were introduced to some of the different purposes of evaluation. We hinted at some of the different approaches to evaluation, but we have not yet exposed the reader to these approaches. We will do so in Part Two.

In Chapter 4, we examine the factors that have contributed to such differing views. Prior efforts to classify the many evaluation approaches into fewer categories are discussed, and the categories that we will use in the remainder of this book are presented.

In Chapters 5 through 8, we describe four categories of approaches that have influenced evaluation practice. These general approaches include those we see as most prevalent in the literature and most popular in use. Within each chapter, we discuss how this category of approaches emerged in evaluation, its primary characteristics, and how it is used today. Within some categories, there are several major approaches. For example, participatory evaluation has many models or approaches. We describe each approach, including its distinguishing characteristics and contributions, the ways in which the approach has been used, and its strengths and weaknesses. Then, in Chapter 9, we discuss other themes or movements in evaluation that transcend individual models or approaches, but that are important influences on evaluation practice today.

Many evaluation books, often authored by the developer of one of the approaches we discuss, present what Alkin (2004) has called “prescriptive theories” or approaches to evaluation. These books are intended to describe that approach in depth and, in fact, to suggest that the approach presented is the one that evaluators should follow. This book does not advocate a particular approach. Instead, we think it is important for evaluators and students studying evaluation to be familiar with the different approaches so they can make informed choices concerning which approach or which parts of various approaches to use in a particular evaluation. Each approach we describe tells us something about evaluation, perspectives we might take, and how we might carry out the evaluation. During this time of increased demands for evaluation in the United States and the world—what Donaldson and Scriven (2003) have called the “second boom in evaluation”—it is important for evaluators to be aware of the entire array of evaluation approaches and to select the elements that are most appropriate for the program they are evaluating, the needs of clients and other stakeholders, and the context of the evaluation.

Chapter 4 • Alternative Views of Evaluation

Orienting Questions

1. Why are there so many different approaches to evaluation?

2. Why is evaluation theory, as reflected in different approaches to evaluation, important to learn?

3. What philosophical and methodological differences influenced the development of different approaches?

4. How have evaluation approaches been categorized by others? How does this book categorize evaluation approaches? What is the rationale for each?

5. What practical issues contribute to the diversity of evaluation approaches?

In the early days, when evaluation was emerging as a field, it was troubled by definitional and ideological disputes. Those who wrote about evaluation differed widely in their views of what evaluation was, and those who conducted evaluation studies brought to the task diverse conceptions of how one should go about doing it. From 1960 to 1990, nearly 60 different proposals for how evaluations should be conducted were developed and circulated. These proposals have been chronicled from the early days of thinking about evaluation approaches (Gephart, 1978) to more recent reviews of the development of evaluation models (Stufflebeam, 2001b). These different prescriptions have been implemented with varying degrees of fidelity. To complicate the picture further, some evaluations were designed without conscious reference to any existing conceptual framework, occasionally resulting, if successful, in yet another evaluation approach.

The various approaches, or theories, proposed by evaluators make up the content of the field of evaluation. William Shadish titled his presidential address to the American Evaluation Association in 1997 “Evaluation Theory Is Who We Are” and argued that “[a]ll evaluators should know evaluation theory because it is central to our professional identity” (1998, p. 1). As he pointed out, evaluation theory “provides the language we use to talk to each other about evaluation” and “is the knowledge base that defines the profession” (Shadish, 1998, pp. 3, 5). Stufflebeam, too, emphasizes the importance of studying evaluation theory and its approaches. He writes, “The study of alternative evaluation approaches is important for professionalizing program evaluation and for its scientific advancement and operation” (2001b, p. 9). As illustrated in Shadish’s and Stufflebeam’s remarks, some evaluators use the term evaluation “theories”; others use the terms evaluation “models” or “approaches.” We prefer to use the word approaches, because few are as broad as a true theory and their intent is to guide how evaluation is practiced.¹

Today, although there is no dominant evaluation theory or approach, there is much more agreement than in the past. Nevertheless, it is important for readers to become familiar with the different approaches, not only to learn the knowledge base of the field and the issues that professional evaluators discuss, but also to help them make conscious choices about the approach or elements of different approaches that they intend to use in each evaluation. Many evaluators today use a mix of approaches, selecting elements that are most appropriate for the program they are evaluating, its context, and stakeholders. Sometimes a funder will select the approach to be used, although evaluators may choose to negotiate changes to that choice if the funder is not familiar with other approaches and the one chosen is inappropriate for the program or its context. But without knowledge of these different approaches, evaluators tend to make uninformed choices of the questions their evaluation should address, the ways in which stakeholders might be involved, the appropriate methods to use for collecting data, and the means for maximizing the use of the results. (See, for example, Christie’s 2003 study of practicing evaluators.)

Approaches to evaluation that have emerged as the most common or well-known are described in the chapters following Chapter 4. These approaches provide the conceptual tools for an evaluator to use in designing an evaluation that fits particular circumstances. In this chapter, we will discuss the factors that have influenced the differences in approaches, some of the ways in which approaches have been categorized, and how we have conceptualized the common approaches used today.

¹ Shadish (1998) defines “theory” in his address as “a whole host of more or less theoretical writings with evaluation as their primary focus” (p. 1). Like “approaches,” these writings discuss how evaluation should be conducted and the factors that influence its practice.

Diverse Conceptions of Program Evaluation

The many evaluation approaches that have emerged since 1960 range from comprehensive models to checklists of actions to be taken. Some authors opt for a comprehensive approach to judging a program, while others view evaluation as a process of identifying and collecting information to assist decision makers. Still others see evaluation as synonymous with professional judgment, where judgments about a program’s quality are based on opinions of experts. In one school of thought, evaluation is viewed as the process of comparing performance data with clearly specified goals or objectives, while in another, it is seen as synonymous with carefully controlled experimental research on programs to establish causal links between programs and outcomes. Some focus on the importance of naturalistic inquiry or urge that value pluralism be recognized, accommodated, and preserved. Others focus on social equity and argue that those involved with the entity being evaluated should play an important, or even the primary, role in determining what direction the evaluation study takes and how it is conducted.

The various models are built on differing—often conflicting—conceptions and definitions of evaluation. Let us consider an example from education.

• If one viewed evaluation as essentially synonymous with professional judgment, the worth of an educational program would be assessed by experts (often in the subject matter to be studied) who observed the program in action, examined the curriculum materials, or in some other way gleaned sufficient information to record their considered judgments.

• If evaluation is viewed as a comparison between student performance indicators and objectives, standards would be established for the curriculum and relevant student knowledge or skills would be measured against this yardstick, using either standardized or evaluator-constructed instruments.

• If an evaluation is viewed as providing useful information for decision making, the evaluator, working closely with the decision maker(s), would identify the decisions to be made and collect sufficient information about the relative advantages and disadvantages of each decision alternative to judge which was best. Or, if the decision alternatives were more ambiguous, the evaluator might collect information to help define or analyze the decisions to be made.

• If the evaluator emphasized a participative approach, he or she would identify the relevant stakeholder groups and seek information on their views of the program and, possibly, their information needs. The data collection would focus on qualitative measures, such as interviews, observations, and content analysis of documents, designed to provide multiple perspectives on the program. Stakeholders might be involved at each stage of the evaluation to help build evaluation capacity and to ensure that the methods used, the interpretation of the results, and the final conclusions reflected the multiple perspectives of the stakeholders.

• If the evaluator saw evaluation as critical for establishing the causal links between the program activities and outcomes, he or she might use random assignment of students, teachers, or schools to the program and its alternatives; collect quantitative data on the intended outcomes; and draw conclusions about the program’s success in achieving those outcomes.

As these examples illustrate, the way in which one views evaluation has a direct impact on the manner in which the evaluation is planned and the types of evaluation methods that are used. Each of the previous examples, when reviewed in detail, might be considered an excellent evaluation. But, evaluations must consider the context in which they are to be conducted and used. Each context—the nature and stage of the program, the primary audiences for the study and the needs and expectations of other stakeholders, and the political environment in which the program operates—holds clues to the approach that will be most appropriate for conducting an evaluation study that makes a difference in that context. Therefore, without a description of the context, we cannot even begin to consider which of the examples would lead to the best evaluation study. Nor can we judge, based on our own values, which example is most appropriate. Instead, we must learn about the characteristics and critical factors of each approach so that we can make appropriate choices when conducting an evaluation in a specific context.

Origins of Alternative Views of Evaluation

The diversity of evaluation approaches has arisen from the varied backgrounds, experiences, and worldviews of their authors, which have resulted in diverse philosophical orientations, and methodological and practical preferences. These different predispositions have led the authors—and their adherents—to propose sometimes widely different methods for conducting evaluations and for collecting and interpreting information or data. The differences in evaluation approaches can be traced directly to their proponents’ rather different views not only of the meaning and nature of evaluation but also of the nature of reality (ontology) and knowledge (epistemology).

To understand the origins of alternative conceptualizations of evaluation, the reader will first need an introduction to different philosophical views of ontology and epistemology.

Philosophical and Ideological Differences

Logical Positivism. Early evaluations emerged from the social sciences, in particular education and psychology, at a time when the dominant paradigm was positivism. Logical positivists, a more extreme branch of positivism, argued that knowledge was obtained entirely through experience, specifically through observation, and held rigid views concerning the world and data collection (Godfrey-Smith, 2003). They argued that (a) there is one reality of the objects we are studying and the aim of researchers and evaluators is to use social science research methods and theories of statistical probability to discover that one reality and to establish laws and theories about how things work, and (b) to effectively gain knowledge of that reality, researchers need to be “scientifically objective.” A key component of that approach is that researchers should maintain some distance from the program to be studied so as not to influence the program itself, the participants, or the results of the study. The methods used to achieve this objectivity, or distance, were typically quantitative in nature. Objectivity or objectivism, meaning that the researcher’s views and values do not influence the results obtained, was a key principle of positivism.

Postpositivism. Reichardt and Rallis (1994) note that logical positivism began to decline around the time of World War II, though elements of positivism continued to influence research and evaluation for some time. By 1984, however, Donald Campbell, a prominent research methodologist and evaluator with a quantitative orientation, noted that “twenty years ago logical positivism dominated the philosophy of science. . . . Today the tide has completely turned among the theorists of science in philosophy, sociology, and elsewhere. Logical positivism is almost universally rejected” (p. 27). Postpositivism emerged in reaction to logical positivism and many, unfortunately, confuse the two. Guba and Lincoln (1989) argued that the views of postpositivists were not compatible with other approaches to evaluation. However, Reichardt and Rallis (1994), quantitative and qualitative evaluators respectively, effectively refuted their arguments, demonstrating that postpositivists, such as Campbell and Stanley (1966) and Cook and Campbell (1979), did not hold the views of logical positivists. Instead, they showed through quotations from their work that these postpositivists, and others, recognized that facts and methods or inquiry choices in research are influenced by the values of the researcher, that knowledge is fallible and changing, that data can be explained by many different theories, and that reality is constructed by people and their experiences.

The focus of postpositivists, however, was on examining causal relationships to develop laws and theories to describe the external world, albeit temporary ones given the fallibility of knowledge. Replication and intersubjectivity, not objectivity, were the keys to ensuring good research (Frankfort-Nachmias & Nachmias, 2008). Intersubjectivity involves the ability to communicate what one does in research in such a way that others can judge its findings and replicate it to see if they obtain the same results. For evaluation, House and Howe (1999) note that one of the key characteristics of this philosophical approach, which they call the received view, is viewing facts as quite distinct from values and believing that evaluators should be focusing on the facts.

A Constructivist Paradigm. As evaluation continued, evaluators saw that context and values played very important roles in evaluation. Unlike many laws of science which are readily generalizable from one setting to the next, the factors that influence the success of education, social, and economic programs can differ dramatically from one setting to another. Also, clients and stakeholders for the evaluation often had information needs that were not so concerned with establishing causality as with gaining a better understanding of the program and those they served. Program developers recognized the many differing “realities” or conditions or life experiences of those that the programs were intended to serve and saw that programs had different effects on different kinds of clients. They wanted to know more about these issues to help them improve their programs. And values were an integral part of what programs, policies, and evaluations confronted. To exempt evaluation from such values was to make it incomplete.

The constructivist paradigm that was emerging then corresponded more closely to the views and experiences of these evaluators and program developers. Constructivists took a different view of ontology and epistemology (Guba & Lincoln, 1994). Although we now realize that the differences were not as extreme as they were sometimes portrayed, Guba and Lincoln focused on understanding our constructed world and, in particular, the multiple realities seen or experienced by different stakeholders. They argued that objectivity was not possible; we each see the world through our own lens, influenced by our own experiences. Later, House and Howe (1999) emphasized that the fact-value dichotomy, or the rigid distinction between “facts” which are objective and “values” which are subjective, is in fact (pun intended) a continuum. Our values influence what we perceive to be facts. Thus, evaluators should become involved with values—helping stakeholders articulate their values, considering the values inherent in the evaluation, and working to portray the program through different stakeholders’ perspectives of reality. Constructivism also continued its focus on what Schwandt (1997) calls the “localness” of knowledge. Evaluation is intended to provide understanding of a particular program and its context and is less concerned with generalizability and developing laws and theories for other settings.

A Transformative Paradigm. More recently, a new paradigm for evaluation has emerged—the transformative paradigm. It emerged initially, and is still most powerful, in international development work and in the developing world, though the paradigm is gaining proponents in the United States and Western countries. Like constructivism and postpositivism, this paradigm emerged in response to the strictures of positivism, but also developed in response to concerns in developing countries that research and evaluation often failed to address critical political and social problems. Like the constructivist paradigm, the transformative paradigm acknowledges multiple realities and the need for evaluation to capture those realities. However, the emphasis of the transformative paradigm is on the political, social, and economic factors that form those realities. The transformative paradigm is less concerned with methodological choices and more concerned with the nature of the problems that evaluation addresses and how stakeholders are involved in the evaluation. Transformative evaluations are concerned with empowering groups that have less power in society. These can include poor people, ethnic or racial minorities, women, and people with disabilities (Mertens, 1999). The focus of the evaluation is on helping these groups construct their own knowledge and empowering them by having them play a central role in the evaluation (Hall, 1992; Freire, 1970, 1982). The evaluator serves as a facilitator to the decisions made by the stakeholders about the evaluation in order to change power structures and knowledge. Some view transformative evaluation as a new paradigm. Others view it as an approach. We will cover this type of evaluation as an approach more extensively in Chapter 8.

The Influence of Paradigms on Evaluation Practice. These philosophical paradigms, and their implications for methodological choices, have influenced the development of different evaluation approaches. Some have argued that paradigms and qualitative and quantitative methods should not be mixed because the core beliefs of postpositivists and constructivists are incompatible (Denzin & Lincoln, 1994). As noted, Reichardt and Rallis (1994) argued and demonstrated that the paradigms were compatible. These and other pragmatists, representing different methodological stances—quantitative and qualitative—disputed the incompatibility argument and urged evaluators and researchers to look beyond ontological and epistemological arguments to consider what they are studying and the appropriate methods for studying the issues of concern. In other words, evaluative and methodological choices should not be based on paradigms or philosophical views, but on the practical characteristics of each specific evaluation and the concepts to be measured in that particular study. Today, there are many evaluators, some of whose approaches will be discussed in subsequent chapters, who skip the arguments over paradigms and prefer a pragmatic approach (Patton, 1990, 2001; Tashakkori & Teddlie, 2003). Howe (1988) and, more recently, Tashakkori and Teddlie (1998) have proposed the pragmatic approach as a paradigm in itself. They see discussions of ontology and epistemology as fruitless and unnecessary and argue that researchers’ and evaluators’ choice of methods should be based on the questions the evaluator or researcher is trying to answer. They write, “Pragmatist researchers consider the research question to be more important than either the methods they use or the paradigm that underlies the method” (Tashakkori & Teddlie, 2003, p. 21).

It is useful, however, for readers to be familiar with these paradigms because their philosophical assumptions were key influences on the development of different evaluation approaches and continue to play a role in many evaluations and approaches.

Methodological Backgrounds and Preferences

For many years, evaluators differed, and argued, about the use and value of qualitative or quantitative methods, as suggested previously. These methodological preferences were derived from the older paradigms described earlier. That is, the postpositivist paradigm focused on quantitative methods as a better way to obtain objective information about causal relationships among the phenomena that evaluators and researchers studied. To be clear, quantitative methods are ones that yield numerical data. These may include tests, surveys, and direct measures of certain quantifiable constructs such as the percentage of entering students who graduate from a high school to examine a school’s success, blood alcohol content for the evaluation of a drunk-drivers treatment program, or the numbers of people who are unemployed to evaluate economic development programs. Quantitative methods also rely on experimental and quasi-experimental designs, or multivariate statistical methods, to establish causality.

Constructivists were more concerned with describing different perspectives and with exploring and discovering new theories. Guba and Lincoln discussed developing “thick descriptions” of the phenomenon being studied. Such in-depth descriptions were more likely to be made using qualitative observations, interviews, and analyses of existing documents. Constructivists also see the benefit of studying causal relationships, but their emphasis is more on understanding those causal relationships than on establishing a definitive causal link between a program and an outcome. Given these emphases, constructivists favored qualitative measures. Qualitative measures are not readily reducible to numbers and include data collection methods such as interviews, focus groups, observations, and content analysis of existing documents.

Some evaluators have noted that the quantitative approach is often used for theory testing or confirmation while qualitative approaches are often used for exploration and theory development (Sechrest & Figueredo, 1993; Tashakkori & Teddlie, 1998). If the program to be evaluated is based on an established theory and the interest of the evaluation is in determining whether that theory applies in this new setting, a quantitative approach might be used to determine if, in fact, the causal mechanisms or effects hypothesized by the theory actually did occur. For example, a reading program based upon an established theory is being tried with a younger age group or in a new school setting. The focus is on determining whether the theory works in this new setting to increase reading comprehension as it has in other settings. Students might be randomly assigned to either the new method or the old one for a period of a few months, and then data would be collected through tests of reading comprehension. While qualitative methods could also be used to examine the causal connections, if the focus were on firmly establishing causality, quantitative approaches might be preferred. In contrast, if the evaluator is evaluating an experimental program or policy for which the theory is only loosely developed—for example, a new merit pay program for teachers in a particular school district—a qualitative approach would generally be more appropriate to better describe and understand what is going on in the program. Although a few districts are experimenting today with merit pay, little is known about how merit pay might work in educational settings, and results from other sectors are mixed (Perry, Engbers, & Jun, 2009; Springer & Winters, 2009). In this case, it would be important to collect much qualitative data through interviews with teachers, principals, and other staff; observations at staff meetings; content analysis of policy documents; and other methods to learn more about the impact of merit pay on the school environment; teacher retention, satisfaction, and performance; teamwork; teacher-principal relations; and many other issues.
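The random-assignment design in the reading-program example can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a procedure taken from the text: the function names `randomly_assign` and `mean_difference`, the roster of 20 students, and the seed value are all hypothetical.

```python
import random
from statistics import mean


def randomly_assign(student_ids, seed=None):
    """Shuffle the students and split them into two groups of near-equal size."""
    rng = random.Random(seed)  # a seeded generator makes the assignment reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"new_method": shuffled[:half], "old_method": shuffled[half:]}


def mean_difference(scores, groups):
    """Mean score of the new-method group minus that of the old-method group."""
    new_mean = mean(scores[s] for s in groups["new_method"])
    old_mean = mean(scores[s] for s in groups["old_method"])
    return new_mean - old_mean


# Hypothetical roster of 20 students; no real data is implied.
students = [f"student_{i:02d}" for i in range(1, 21)]
groups = randomly_assign(students, seed=42)
```

After the few months of instruction, a dictionary mapping each student to a reading comprehension score could be passed to `mean_difference` together with `groups`. A real evaluation would also need a test of statistical significance and attention to attrition, both of which this sketch omits.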

In the beginning years of evaluation, most evaluators’ training was in quantitative methods. This was particularly true for evaluators coming from the disciplines of psychology, education, and sociology. The emergence of qualitative methods in evaluation provided new methodologies that were initially resisted by those more accustomed to quantitative measures. Today, however, most evaluators (and researchers) acknowledge the value of mixed methods and most graduate programs recognize the need to train their students in each, though some may focus more on one method than another. For researchers, who tend to study the same or a similar subject most of their career, intensive training in a few methodologies appropriate for the types of constructs and settings they are studying is appropriate. But evaluators study many different programs and policies containing many different important constructs over the course of their careers. Therefore, evaluators now recognize the need to have skills in both qualitative and quantitative methods in order to select the most appropriate method for the program and context they are evaluating.

One useful framework for explaining the differences among evaluators and approaches over the years comes from Stevenson and Thomas (2006), who analyzed what they called the intellectual contexts for evaluation. They identified three traditions in evaluation that are closely tied to one’s original training and discipline:

(a) The experimental tradition is composed primarily of people trained in psychology and sociology, and in quantitative methods with a focus on establishing causality. Donald Campbell was an early leader in this tradition, moving social psychologists to think more practically about conducting useful research beyond the laboratory.

(b) The case/context tradition, led by Ralph Tyler and his student Lee Cronbach, is primarily grounded in education. This movement was rooted in testing and student assessment, but moved on to describe programs and work with teachers to gain an understanding of what was happening.

(c) The policy influence tradition is composed of people trained in political science and often working in the federal government. These leaders included Carol Weiss and Joseph Wholey. Their work on policy, which was somewhat removed from individual programs but tried to help elected and appointed government officials make decisions about what to fund and the directions government should take, led to a different kind of focus on use and designs.

Although evaluators come together today at large meetings of professional associations, such as those of the American Evaluation Association, which are attended by more than 2,000 evaluators, these traditions can still be seen. Evaluators from the different traditions learn a little from each other, but often continue to focus on the issues familiar to the environments in which they work and to their original training. By presenting different approaches in this textbook, we hope to help readers bridge these disciplines and traditions and learn what might be valuable from each for the context in which they work.

Disciplinary Boundaries and Evaluation Methodology. It is ironic that in a field with such a rich array of alternative evaluation approaches, there still exists, among some evaluators, a tendency to fall prey to the law of the instrument fallacy² rather than to adapt or develop evaluation methods to meet the needs of the program, the stakeholders, and the identified evaluation questions. In many cases, the law of the instrument fallacy in evaluation is grounded in the methods of the discipline of the evaluator’s original training. However, Scriven (1991c) has effectively argued that evaluation is not a single discipline but a transdiscipline that, like logic, design, and statistics, is applied to a wide range of disciplines.

Thus, our presentation of approaches is not meant to encourage a single approach, but to encourage the reader to adopt the approach or elements of different approaches that are appropriate for the particular evaluation he or she is planning.

Classifications of Evaluation Theories or Approaches

Existing Categories and Critiques

In recent years, several evaluators have attempted to categorize evaluation theories for different purposes. Shadish, Cook, and Leviton’s (1995) book was influential in reviewing important evaluation theorists, at least partly to illustrate historical trends and changes in the field, but primarily to identify and describe important evaluation theories. Shadish et al. identified three stages of evaluation theory as it emerged in the United States. The first stage, in the 1960s, was characterized by a focus on using scientifically rigorous evaluation methods for social problem solving or studying the effectiveness of government programs at achieving outcomes. The emphasis at this stage of evaluation was on examining causal effects of programs and, with this information, judging the value of each program. Shadish et al. focus on individual evaluators to illustrate the dominant theories at each stage. For the first stage, they profile Michael Scriven, who developed his theory of valuing—the process of reaching a judgment on the value of programs or policies—and Donald Campbell, who developed quasi-experimental methods to establish the causal effects of programs outside of the laboratory and discussed how these methods should be used by managers and evaluators. Stage two reflected evaluators’ growing concern with having evaluation results used.³

Evaluators’ focus on use in stage two prompted evaluation to grow and change in many ways, such as encouraging evaluators to establish relationships with specific stakeholders to facilitate use, and broadening the methods used to accommodate the potential information needs and values of different stakeholders. The theorists they profile in the second stage, Carol Weiss, Joseph Wholey, and Robert Stake, were concerned, in quite different ways, with increasing the responsiveness and utility of evaluations. In stage three, Shadish et al. view evaluators such as Lee Cronbach and Peter Rossi as integrating the first stage’s emphasis on truth or scientific validity with the second stage’s emphasis on evaluation’s utility to stakeholders. In efforts to have evaluation be both valid and useful, stage three evaluators introduce new concepts, such as developing the theory of a social program to aid in its evaluation, and adapt others.

² Kaplan (1964) described this fallacy by noting that, if you give a small boy a hammer, suddenly everything he encounters needs hammering. The same tendency is true, he asserts, for scientists who gain familiarity and comfort in using a particular method or technique: suddenly all problems will be wrested into a form so that they can be addressed in that fashion, whether or not it is appropriate.

³ Stage one theorists had not written extensively about use, assuming results would naturally be used by consumers, managers, or policymakers.

Stufflebeam (2001b), too, analyzed evaluation theories or what he, like us, calls “approaches.” His work was designed to reduce the burgeoning numbers of evaluation theories and to identify those with the greatest potential. He attempted to reduce the numbers of theories to those that are most useful by conducting an intensive study of 20 different evaluation approaches using some key descriptors to summarize each approach. He then used the Standards developed by the Joint Committee to judge nine approaches in more detail. His assessments of the various methods were also influenced by the extent to which each approach addresses what he sees as “evaluation’s fundamental requirement to assess a program’s merit or worth” (Stufflebeam, 2001b, p. 42). Of interest to us here is his categorization of the 20 approaches into three groups: (a) Question and/or Methods-Oriented approaches, (b) Improvement/Accountability approaches, and (c) Social Agenda/Advocacy approaches. His first category, Question and/or Methods-Oriented approaches, is the largest of the three groups, containing 13 of the 20 approaches. These approaches, Stufflebeam notes, are alike in that they “tend to narrow an evaluation’s scope” by focusing on either particular questions or methods (2001b, p. 16). Approaches in this category include ones that focus on particular strategies to determine what should be evaluated (objectives-oriented and theory-based approaches), on particular methods to collect data (objective testing, performance testing, experimental studies, case studies, cost-benefit analysis) or to organize data (management information systems), or on a particular method for presenting and judging results (clarification hearing).⁴ Stufflebeam’s second category, Improvement/Accountability approaches, contains approaches that “stress the need to fully assess a program’s merit or worth” (2001b, p. 42). Stufflebeam sees these approaches as more comprehensive in their evaluation of programs in order to serve their purpose of judging merit or worth. Typical examples include the accreditation/certification approach and Scriven’s consumer-oriented approach to judging the quality of products for potential consumers. The Social Agenda/Advocacy approaches, rather than having a primary emphasis on judging the overall quality of a product or relying upon a particular method, “are directed to making a difference in society through program evaluation” (Stufflebeam, 2001b, p. 62). In conducting evaluations, these approaches are concerned with involving or empowering groups who have less power in society. These approaches include Stake’s client-centered or responsive evaluation and House’s deliberative democratic evaluation.

⁴ These sub-categories are our own interpretation of the 13 approaches, not Stufflebeam’s.

In 1985, Alkin and Ellett argued that to be considered a comprehensive theory, evaluation theories must address three issues: methodologies, how the data are valued or judged, and use of the evaluation. Later, Alkin and House (1992) developed these issues into three continua: (a) methods could be characterized along a continuum from qualitative to quantitative; (b) values could be characterized from unitary (one value or way of judging the data and program) to plural (many values); and (c) use could range from aspirations for instrumental, or direct use, to enlightenment or indirect use. In 2004, Alkin and Christie used these dimensions to categorize evaluation theorists and their approaches through the visual model of a tree. The roots of the tree reflect what they see as the dual foundations of evaluation: social inquiry (using a “systematic and justifiable set of methods”) and accountability and control (reflecting the purposes and intended use of evaluation). The branches of the tree then reflect the three dimensions of methods, values, and use identified earlier by Alkin and House (1992). Individual theorists are placed on one of the three branches to reflect the key dimension of their approaches. Like Shadish et al. (1995), Alkin and Christie use individual evaluation theorists to illustrate different approaches to evaluation.

Each of these categorizations of evaluation approaches or theories provides useful insights into evaluation and its history and practice. Thus, Shadish, Cook, and Leviton illustrate the early focus on the truth that evaluation would bring to the judgments made about social programs, the later recognition that use needed to be consciously considered, and the integration and adaptation of the two issues in even later stages. Alkin and Christie’s model builds on these foundations identified by Shadish et al. in slightly different ways. Its roots are in social inquiry, accountability, and control, but it considers evaluation’s emphases in three areas: methods, values, and use. Stufflebeam’s categories are different from the first two in that he focuses not on individual evaluators and their writings to identify categories, but on the content of evaluation theories or models.⁵ He developed his categories by considering the orienting devices or principles used for focusing the evaluation. The priorities used to focus the evaluation are reflected in his three categories: using particular evaluation questions or methods, taking a comprehensive approach to making a judgment regarding quality of the program, or improving society and its programs by considering social equity and the needs of those with less power. Like Stufflebeam, our purpose is to reduce the current number of evaluation approaches. Although Stufflebeam’s method of reducing the approaches was to judge the quality of each, our synthesis of the approaches is intended to describe each approach to help you, the reader, to consider different approaches and their potential use in your work.⁶ Although the many different approaches to evaluation can seem confusing, their diversity allows evaluators to pick and choose either the approach or the elements of an approach that work best for each program they are evaluating. Our task is to categorize the approaches in a way that helps you consider them and to expand your views of possible ways in which evaluations may be conducted.

⁵ Of course, examining the writings of proponents leads one to consider the theories as well because the individual is writing about his or her evaluation approach or theory. The difference is that Alkin and Christie (2004) and Shadish et al. (1995) were focusing on individuals to illustrate theories, and Stufflebeam’s writing is less concerned with individuals. Although some of the theories Stufflebeam reviews are identified with one individual, others are not.

A Classification Schema for Evaluation Approaches

We have chosen to classify the many different evaluation approaches into the four categories that we have developed based on our identification of the primary factor that guides or directs the evaluation:

1. Approaches oriented to comprehensive judgments of the quality of the program or product: These approaches include expertise-oriented and consumer-oriented evaluations. They are the oldest approaches in evaluation, having been used by many before formal evaluation approaches were developed. We will discuss Elliot Eisner’s writings on connoisseurship and criticism, accreditation, and other types of expertise-oriented evaluations and Michael Scriven’s consumer-oriented approach. The expertise-oriented and consumer-oriented approaches differ rather dramatically in who conducts the evaluation and the methodology, but their commonality is that they direct evaluators to focus on valuing or judging the quality of the thing they are evaluating.

2. Approaches oriented to characteristics of the program: These approaches include objectives-based, standards-based, and theory-based evaluations. In each of these approaches, the evaluator uses characteristics of the program, its objectives, the standards it is designed to achieve, or the theory on which the program is based to identify which evaluation questions will be the focus of the evaluation.

3. Approaches oriented to decisions to be made about the program: These approaches include Daniel Stufflebeam’s Context-Input-Process-Product (CIPP) approach and Michael Patton’s Utilization-Focused Evaluation, as well as Joseph Wholey’s evaluability assessment and performance monitoring. These approaches focus on evaluation’s role in providing information to improve the quality of decisions made by stakeholders or organizations.

4. Approaches oriented to participation of stakeholders: These approaches include Robert Stake’s Responsive Evaluation, Practical Participatory Evaluation, Developmental Evaluation, Empowerment Evaluation, and democratically oriented approaches.

Placement of individual evaluation approaches within these categories is to some degree arbitrary. Several approaches are multifaceted and include characteristics that would allow them to be placed in more than one category. For clarity we have decided to place such approaches in one category and only reference their other features in chapters where it is appropriate. Our classification is based on what we see as the driving force behind doing the evaluation: the factors that influence the choice of what to study and the ways in which the study is conducted. Within each category, the approaches vary by level of formality and structure, some being relatively well developed philosophically and procedurally, others less developed. Some are used frequently; others are used less, but have had a major influence on evaluators’ thinking. The following chapters will expand on these approaches.

⁶ Although our purpose is not to judge the quality of each approach, but to introduce you to them, we do not include approaches that could not serve a valid purpose in an evaluation.

Major Concepts and Theories

1. During evaluation’s relatively short history, many different approaches or theories concerning how to practice evaluation have emerged.

2. Evaluators should be familiar with the various approaches in order to choose the approach or elements of approaches most appropriate for the specific program they are evaluating and its context.

3. The different evaluation approaches were influenced by differing views of ontology (the world and reality) and epistemology (knowledge), and the methods for obtaining valid knowledge. These views often are associated with the evaluator’s graduate training and life experiences.

4. Today, prominent paradigms in evaluation and the social sciences include postpositivist, constructivist, transformative, and pragmatist paradigms.

5. Others have categorized evaluation theories or approaches according to a focus on truth and use and an integration of the two; by the categories of questions or methods, improvement/accountability, and social agenda/advocacy; and by their focus on methods, values, or use.

6. We categorize theories based on the primary factor that guides the actions taken in the evaluation. Our categories include approaches that focus on making an overall judgment regarding the quality of the program, on program characteristics, on decisions to be made, and on stakeholder participation.

Discussion Questions

1. What are the key differences between the paradigms that have influenced evaluation? Which paradigm seems most appropriate to you? Why?

2. How can the ways in which one defines program evaluation impact an evaluation study?

3. What implications does the statement “evaluation is not a traditional discipline but a transdiscipline” have for the methodologies or approaches an evaluator may decide to use in an evaluation?


Application Exercises

1. Think about how you could approach evaluation. Describe the steps you think you would follow. Then, analyze your approach according to your philosophical and methodological preferences. Explain how your background and what you would be evaluating could affect your approach. Describe other things that might affect your approach to evaluation.

2. Identify a program in your area that you would like to see evaluated. List some qualitative evaluation methods that could be used. Now list some quantitative methods that you see as appropriate. What might the different methods tell you?

3. The Anderson Public School District has recently begun a new training program for principals. What questions would you ask if you were to conduct an evaluation of this training program from the postpositivist paradigm? What types of data would you collect? How might this evaluation be conducted differently if you took a constructivist perspective? A transformative perspective?