Self Reporting

The Mismeasure of Crime

Mosher, Clayton; Miethe, Terance D.; Hart, Timothy C.

CHAPTER 4

SELF-REPORT STUDIES

Respondents are a tricky bunch, and they do not always behave the way a researcher would wish or expect. In fact, surveys would be far more reliable without them.

—Coleman & Moynihan (1996, p. 77)

Self-report studies of crime were developed in the 1940s and 1950s, largely in response to concerns among criminologists that official measures of crime were systematically biased and provided a distorted picture of the nature and extent of crime and its correlates.

One of the primary advantages of self-report studies is that the information individuals provide regarding their behavior is not filtered through any official or judicial process. The criminal justice funnel, which illustrates how, at each stage of the system, more and more illegal behaviors are siphoned off from official crime counts, does not operate with respect to self-report data.

However, what individuals tell us about their behavior may or may not be a reliable and valid source for determining how involved they are in criminal activity. Memories of events, even of ones as dramatic as criminal episodes, may be fuzzy rather than clear, especially when it comes to recollecting the time period in which they occurred or the sequence of their occurrence. The questions that researchers ask may be phrased in ways that are different from the way people think of their behavior. For example, asking “In the past six months, have you abused or aggressed against a family member?” may elicit a different response than “In the past six months, have you slapped, hit, or punched anyone in your house?” Even with good nonjudgmental questions, however, respondents may be reluctant to answer fully and truthfully, at least partly because they are being asked to admit to behaviors that might result in their arrest if the actions became known to authorities.

One purpose of this chapter is to make you a savvy consumer, as well as evaluator, of self-report measures of crime. Because of its importance to both self-report and victimization data, we begin with a brief discussion of survey methodology. We then review the methodology and findings of some of the more prominent self-report studies, including the National Youth Survey (NYS), the National Longitudinal Study of Adolescent Health (Add Health), the Monitoring the Future (MTF) Survey, and the National Household Survey on Drug Abuse (NHSDA, now known as the National Survey on Drug Use and Health [NSDUH]). This is followed by a review of a prominent and enduring debate in the discipline of criminology regarding the connection between social class and crime, a debate that led to further refinements and improvements in self-report methodology. We then discuss self-report data from known offenders, which have provided particular insights into the crime patterns of individuals who have been apprehended by the criminal justice system. The chapter concludes with an examination of studies focusing on the reliability of self-reported data on drug use.

THE METHOD BEHIND THE MEASURE

Self-report measures of crime are subject to the same constraints found more generally in survey research. Criticisms regarding the adequacy or accuracy of self-report as well as victimization data have as much, if not more, to do with how the data are collected than with what those data might tell us about crime and criminals. In order to anticipate and understand these criticisms, we briefly address the sources of error in survey research before discussing specific self-report studies of crime.

Sources of Survey Error

At the core, evaluating any survey and the data derived from it revolves around two central issues (Phillips, Mosher, & Kabel, 2000):

Were the right people asked the right questions?

Did they answer truthfully?

 

In survey research terminology, these two issues expand into the four sources of total survey error: coverage error, sampling error, nonresponse error, and measurement error (see, e.g., Dillman, 2000; Groves, 1989, 1996; Junger-Tas & Marshall, 1999; Salant & Dillman, 1994).

Coverage error means that researchers selected individuals from a list (a sampling frame) that did not include all the people they intended to study (the target population). To illustrate this principle, consider the following example. A researcher is interested in determining how welfare recipients feel about their encounters with social service and criminal justice agencies. The researcher has access to a current list of welfare recipients in the state, from which a sample will be selected. In this example, there are already two limits on coverage: only people who (1) receive welfare as of a certain date and (2) live in the state can be included in the survey and described by its results.

There is an additional limitation imposed by the survey mode used to obtain information from the sample members. A telephone survey may be expedient, but a high percentage of welfare recipients may not have telephone service. A mail survey would likely include most, if not all, welfare recipients, but literacy may be a problem in enough cases to contribute to two other sources of error: nonresponse and measurement (which will be discussed later). A face-to-face survey could address the deficiencies of either of the other two modes but at great cost in terms of time and money. As this example illustrates, coverage error can be relatively easy to identify but not easy to correct.

Sampling error is an automatic, unavoidable result of surveying a subset, rather than taking a census, of all the people in the target population. This is the source of survey error that is referred to when journalists report that a political exit poll or public opinion survey has a “margin of error plus or minus five points.” It means that it can be estimated, by using a well-established statistical formula, how closely the survey sample mirrors the target population. Although it is rarely reported by journalists, sampling error is estimated within a specified confidence level that indicates how sure we are about the estimate. For example, if a survey’s sampling error is estimated as +/- 5 percentage points at the 95% confidence level, we can be confident that 95 times out of 100, the percentage of sample members who gave a certain response will be within 5 percentage points either way of the true percentage in the target population who would give that response if asked. Unfortunately, confidence levels apply only to predictions in the long run (referred to as an infinite number of trials); any particular sampling outcome may fall within or outside of the specific range of the 95% confidence level. Although sampling error cannot be completely eliminated from surveys, it can be reduced by increasing the sample size and obtaining responses from a larger proportion of people in the target population.
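The "well-established statistical formula" mentioned above is not spelled out in the text. For a proportion estimated from a simple random sample, one commonly used version is the normal-approximation margin of error. The sketch below assumes that formula and that sampling design; the function name `margin_of_error` and the sample sizes are hypothetical, and surveys that use clustering or weighting would need further adjustment.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    p: observed proportion giving a response (between 0 and 1)
    n: number of completed responses
    z: critical value; 1.96 corresponds to the 95% confidence level

    Assumes a simple random sample and the normal approximation.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Worst case (p = 0.5) for a few hypothetical sample sizes,
# printed in percentage points.
for n in (100, 400, 1600):
    print(n, 100 * margin_of_error(0.5, n))
```

Because the sample size sits under a square root, quadrupling the number of respondents only halves the margin of error, which is one reason large reductions in sampling error are expensive.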

Nonresponse error affects survey data when both of the following are apparent: (a) too many people in the sample did not respond to the survey, either because they could not be contacted via the survey mode or they refused to participate; and (b) the nonrespondents differ from respondents in ways that are important to the objectives of the survey. Why both conditions must hold is easily illustrated. Consider a survey in which researchers complete interviews with 70% of their sample, a quite respectable response rate for social surveys. Most of the respondents have brown eyes, whereas most of the nonrespondents have hazel, green, or blue eyes. Is this survey plagued by nonresponse error? The answer to this question depends on the research questions. If the researcher is interested in the relative sun-sensitivity reported by people with different eye colors or their preferences for using contact lenses of different hues, then nonresponse likely is a problem. Even though there is a high response rate to the survey, respondents differ from nonrespondents on a variable of potential interest: eye color. If, on the other hand, the researcher is interested in attitudes toward capital punishment, nonresponse on the basis of eye color would not be a source of error because eye color is not germane to this issue.

To return to the earlier example of surveying welfare recipients regarding their encounters with social service and criminal justice agencies, assume that the researcher obtains an 80% response rate. However, more than 60% of the respondents are female, and more than 70% of the nonrespondents are male. Nonresponse error constitutes a serious problem in this case because gender is a factor not only in the number but also in the character of contacts with social service and criminal justice agencies. It is possible for a survey with a 99% response rate to be subject to nonresponse error if the 1% who did not respond differ in predictably significant, substantive ways from the respondents. Likewise, a survey with a response rate of only 40% may be immune to nonresponse error if the nonrespondents are similar to respondents in ways that might make a difference in analyzing the data from the survey.
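The two conditions for nonresponse error can be made concrete with a standard textbook approximation that is not given in this chapter: the bias of a respondent-only estimate is roughly the nonresponse rate multiplied by the difference between respondents and nonrespondents. The sketch below assumes that approximation; the function name and all numbers are hypothetical, and in practice the nonrespondents' value is usually unknown and must be estimated from auxiliary data.

```python
def nonresponse_bias(response_rate, respondent_value, nonrespondent_value):
    """Approximate bias of an estimate based on respondents only.

    response_rate: share of the sample that responded (0 to 1)
    respondent_value: mean or proportion among respondents
    nonrespondent_value: mean or proportion among nonrespondents
        (hypothetical here; rarely observed directly)
    """
    nonresponse_rate = 1.0 - response_rate
    return nonresponse_rate * (respondent_value - nonrespondent_value)


# 80% response rate, but the two groups differ on the outcome of interest
# (e.g., 20% vs. 50% reporting contact with an agency): bias of about -0.06.
print(nonresponse_bias(0.80, 0.20, 0.50))

# 40% response rate, but the two groups are alike on the outcome: no bias.
print(nonresponse_bias(0.40, 0.30, 0.30))
```

The product form shows why both conditions matter: a low response rate produces no bias when the two groups are alike, and even modest nonresponse produces bias when they differ sharply.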

As mentioned in Chapter 1, the process of operationalization involves attaching meaning to abstract concepts and developing specific indicators and measures of those concepts. How researchers decide to measure these concepts, the nature and number of indicators that are used to identify them, and the specific wording used to define them are all sources of measurement error. In evaluating measures of any phenomenon, social scientists are concerned with issues of validity and reliability.

Validity and Reliability

Validity is the degree to which a measure captures what it is intended to measure: If a measure is valid, it is true and accurate. Some measures have prima facie validity (i.e., clear or self-evident; often called face validity). Other measures possess validity only for specific cases and within strictly defined boundaries. Consider this example. Which of the following is more valid as a measure of the physical stature of human beings: (a) height and weight as recorded by physicians at routine physical exams or by coroners at an autopsy, (b) sizes of clothing most frequently purchased from the inventories of top- or bottom-tier manufacturers and department stores, (c) dimensions of seating and lavatory areas in commercial airplanes, or (d) observations of, and conversations with, people at public events or on the streets at rush hour? The first option—height and weight as recorded by a medical practitioner—does seem to have face validity for measuring physical stature, but the other three options have fairly obvious limitations when it comes to measuring what is intended. However, people who have physical exams or those whose deaths require an autopsy may not be representative of all human beings. Thus, even the validity of what appears to be the most accurate measurement can be compromised by an inadequate or biased sampling frame. We will return to these threats to validity after defining the second criterion for evaluating any measurement.

Reliability is the extent to which the same results are obtained each time a measure is used. If something is a reliable measurement, then it is a precise, consistent, and dependable one. A bathroom scale that showed an individual having three different weights on three different occasions over a 10-minute period would not be reliable. In the case of self-report studies of criminal and deviant behavior, reliability refers to the ability of the procedure used and questions asked to generate consistent responses from the same respondents on repeated administrations. For example, if individuals are asked whether they have ever stolen something and they answer yes, then they should answer yes the next time they are asked the same question. But just as all squares are rectangles but not all rectangles are squares, all valid measures are reliable ones, but not all reliable measures are valid.
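Consistency across repeated administrations can be summarized numerically. The sketch below compares answers to the same yes/no question asked of the same respondents on two occasions, using simple percent agreement and Cohen's kappa (agreement corrected for chance). Both statistics are offered here as common illustrations rather than as the measures used in any study discussed in this chapter, and the function names and responses are hypothetical.

```python
def percent_agreement(first, second):
    """Share of respondents giving the same answer on both occasions."""
    if len(first) != len(second):
        raise ValueError("Both administrations must cover the same respondents")
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)


def cohens_kappa(first, second):
    """Agreement between two administrations, corrected for chance."""
    n = len(first)
    observed = percent_agreement(first, second)
    categories = set(first) | set(second)
    # Agreement expected by chance, given how often each answer was used.
    expected = sum(
        (first.count(c) / n) * (second.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # everyone gave the same single answer both times
    return (observed - expected) / (1.0 - expected)


# Hypothetical answers to "Have you ever stolen something?" at two interviews.
time1 = ["yes", "yes", "no", "no", "yes", "no", "no", "no", "yes", "no"]
time2 = ["yes", "no", "no", "no", "yes", "no", "no", "no", "yes", "no"]

print(percent_agreement(time1, time2))
print(cohens_kappa(time1, time2))
```

A high value on either statistic indicates consistent answers only; it says nothing about whether those answers are truthful, which is the separate question of validity.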

In survey research, threats to reliability and validity (i.e., measurement error) derive from any of four aspects of the study (see, e.g., Aquilino, 1994; Aquilino & Wright, 1996; Dillman & Tarnai, 1991; Dykema & Schaeffer, 2000). The survey mode, whether it is telephone, mail, face-to-face, or Internet or web based, may result in different answers to the same question, even when posed to the same types of respondents. For example, studies have found that respondents are more likely to report drug use on self-administered answer sheets than in face-to-face interviews (Harrison, 1997; also see discussions later in this chapter). The survey instrument may include questions with categories that are not mutually exclusive or with terms that are not interpreted the same way by different respondents. The survey interviewer may unintentionally prompt a particular response by either attempting to clarify the meaning of a question (resulting in leading the respondent) or by giving the impression that a particular response is correct or expected (resulting in a socially desirable answer from the respondent). Finally, the survey respondent may misunderstand the question, may feel that the question is too nosy and prying, or may just plain lie. All of these conditions will result in mistakes in measurement.

To restate the sources of survey error in the context of the two key questions regarding research (i.e., Were the right people asked the right question? Did they answer truthfully?), if the coverage of the target population is inadequate or the sampling strategy is inappropriate, or the nonresponse rate jeopardizes either of them, the right people have not been asked the right question. If the measurement strategy elicits responses that are imprecise or might be inaccurate or cannot be compared to others, then the data do not allow us to determine if the respondent is telling the truth.

SELF-REPORTS ON CRIME AND DELINQUENCY

Chapter 2 documented that most of the early self-report measures of crime and its correlates were intended to discover, document, and describe the true dimensions—or dark figure—of crime. Some researchers believed that there was a great deal of illegal behavior that was not captured by official statistics. Rather than taking the official statistics at face value, they attempted to learn about criminal activities directly from the individuals who were engaging in them, whether or not those activities were detected by law enforcement.

The work of James Short and Ivan Nye, briefly discussed in Chapter 2, serves as an instructive example of both the strengths and weaknesses of self-report data on illegal activities (see Nye & Short, 1957; Short, 1955, 1957; Short & Nye, 1957–1958, 1958). Non-institutionalized adolescents were the target population for these researchers, and “because they seem likely to be more representative of the general population than are college or training school populations,” Short and Nye (1958, p. 297) drew their samples from public high schools, administering an anonymous questionnaire to these students. Exhibit 4.1 lists the items included in Short and Nye’s questionnaire that were designed to measure the youths’ involvement in delinquent and criminal activities. From responses to the questionnaire, Short and Nye (1958) drew the following conclusions, among others: (a) delinquent conduct in the non-institutionalized population is extensive and variable; (b) self-reported delinquent conduct is similar to official delinquency and crime in that boys admit committing nearly all delinquencies more often than do girls, and the offenses for which boys and girls are most often arrested are the ones they admit to committing most often; and (c) self-reported delinquent conduct differs from official statistics in that delinquency is distributed more evenly throughout the socioeconomic classes of non-institutionalized populations, whereas official cases are concentrated in the lower economic strata.

There are, however, a number of questions that can be raised regarding Short and Nye’s work. First, are students enrolled in high school likely to be representative of the general youth population? What about dropouts and other young people who might have been absent for one reason or another on the day(s) the questionnaire was administered? It is likely that such individuals are more prone to be involved in criminal and delinquent behavior. Second, many of the behaviors listed in the questionnaire are not described in legalistic, criminal terms. One of the many challenges associated with obtaining valid and reliable self-reports and comparing these to official data is translating the reported behaviors into categories consistent with those in sources such as the UCR. Third, and in a similar vein, many of the items included on the Short and Nye (1958) questionnaire are oriented toward the less serious end of the crime scale. The fact that many self-report instruments focus on relatively trivial behaviors, such as skipping school and defying parents’ authority, has become an enduring criticism of self-report studies.

Despite these shortcomings, Short and Nye’s work was important in the sense that it revealed that a considerable amount of crime and delinquency was not officially recorded. And much of this hidden delinquency was apparently committed by young people from relatively privileged backgrounds; Short and Nye found few social class distinctions in either the range or frequency of involvement in self-reported illegal activities. As a result, “Short and Nye’s work stimulated much interest in both the use of self-report methodology and the substantive issue concerning the relationship between some measure of social status (socioeconomic status, ethnicity, race) and delinquent behavior” (Thornberry & Krohn, 2000, p. 37).

Hundreds of self-report surveys conducted in the past 60 years under the auspices of a variety of government agencies, academic institutions, and individuals largely confirm the findings from the earliest self-report studies of crime. However, as we will discuss in more detail later, several more recent studies, using more sophisticated methods, instruments, and analyses, have challenged the conclusions regarding little or no association between social class variables and involvement in delinquent and criminal behavior. In the following section, we describe four surveys, each of them national in scope and each of which has been used in numerous published studies, that arguably are standard bearers for collecting and analyzing self-report data. Not only do these provide self-report data on involvement in illicit activities, they also form the basis for research on and debate about techniques for improving the quality of self-report data. Two of the surveys (NYS and Add Health) measure both criminal and delinquent behavior in addition to the use of controlled substances. The other two (MTF and NSDUH) are focused on issues related to the use and abuse of legal and illegal substances.

National Youth Survey

First conducted in 1977, the NYS was designed specifically to provide both prevalence and incidence estimates of the commission of delinquent activities by youth. It is a longitudinal survey that uses a national-probability-based sample of young people who were 11 to 17 years old at the time of the first interview (Elliott, Huizinga, & Morse, 1986). Participants in this study were interviewed in their homes at one-year intervals through 1981 and at two- to three-year intervals at least through 1995. More than 90% of the original 1,725 participants have remained in the survey over time. Exhibit 4.2 provides a list of some of the questions used in the NYS.

Confidential, face-to-face interviews solicit information on the number of times the respondent has engaged in a specific delinquent or criminal activity within the past calendar year, with two different response sets used. If an individual’s response to an open-ended question indicates they have engaged in the particular activity more than 10 times, the interviewer asks the youth to select one of the following responses: (a) once a month, (b) once every 2 to 3 weeks, (c) once a week, (d) 2 to 3 times a week, (e) once a day, or (f) 2 to 3 times a day. Although described in nonlegalistic terms, the 47 activities asked about directly parallel offenses listed in the FBI’s Uniform Crime Report. Of the Part I offenses, only homicide is excluded; about 75% of Part II offenses are included, along with a wide range of misdemeanors and status offenses.

Exhibit 4.3 presents data on prevalence and incidence rates of self-reported offending for the first five waves of the NYS. Because the NYS is a longitudinal survey, the panel of respondents reporting on their behavior for 1976 is the same group of people reporting for 1980, and this is why the age range is different for each of the five years. With respect to prevalence rates (i.e., the percentage of respondents who report having engaged in certain types of crime) for felony assault and theft, both whites and blacks report lower involvement for 1980 than for 1976, and their self-reported rates of involvement in these offenses are nearly identical. For general delinquency, whites report a slightly higher involvement and blacks report a slightly higher involvement for 1980 than for 1976.
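The difference between prevalence and incidence can be shown with a small calculation. In the sketch below, prevalence is the percentage of respondents reporting at least one offense, as defined above; incidence is taken to be the number of reported offenses per 100 respondents, which is an assumption made for illustration and may not match the exact rate base used in the NYS exhibits. The function names and the counts are hypothetical.

```python
def prevalence_rate(counts):
    """Percentage of respondents who report at least one offense."""
    offenders = sum(1 for c in counts if c > 0)
    return 100.0 * offenders / len(counts)


def incidence_rate(counts, per=100):
    """Reported offenses per `per` respondents (rate base is an assumption)."""
    return per * sum(counts) / len(counts)


# Hypothetical self-reported thefts in the past year for ten respondents.
reported_thefts = [0, 0, 0, 1, 0, 12, 0, 3, 0, 0]

print(prevalence_rate(reported_thefts))  # 30.0
print(incidence_rate(reported_thefts))   # 160.0
```

In this made-up panel, three of ten respondents account for every offense and one of them for most, so the two rates can move independently: incidence can change sharply while prevalence stays the same.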

The NYS provided the database for a number of important substantive and methodological studies in criminology: a March 2010 search of Criminal Justice Abstracts, using the search term “national youth survey,” resulted in 177 entries, 137 of which were journal articles. We will discuss some of these studies in more detail in subsequent sections of this chapter, but here we mention a few to provide a sense of the range of topics that can be addressed by NYS data. Several studies have focused on gender, race, and social class similarities and differences in self-reported offending (e.g., Ageton, 1983; Huizinga & Elliott, 1987; Smith, Visher, & Jarjoura, 1991; Zhang & Messner, 2000). Some researchers have examined the relationship between drug use and involvement in predatory crime or juvenile involvement in violent crime (e.g., Chaiken & Chaiken, 1990; Elliott et al., 1986) or used NYS data to test the gateway drug theory (Rebellon & Van Gundy, 2006). Still others have used NYS data to test explanatory theories of delinquent and criminal behavior, including deterrence, strain, power-control, and control balance theories, among others (e.g., Blackwell & Reed, 2003; DeLisi & Hochstetler, 2002; Heimer & Matsueda, 1994; Jang, 1999a, 1999b; Jang & Johnson, 2001; Lauritsen, 1999; Ostrowsky & Messner, 2005; Pogarsky, Kim, & Paternoster, 2005). Researchers have also used NYS data to examine the relationship between religiosity, moral beliefs, and delinquency (Desmond, Soper, & Purpura, 2009) and between marriage and involvement in crime (King, Massoglia, & MacMillan, 2007).

National Longitudinal Study of Adolescent Health (Add Health)

The Add Health study, initiated in 1994 under a grant from the National Institute of Child Health and Human Development and administered by the University of North Carolina’s Carolina Population Center, is a nationally representative longitudinal study that was originally designed to collect data on how social contexts (families, friends, peers, schools, neighborhoods, and communities) influence teenagers’ health and risk behaviors (National Institutes of Health, n.d.). Among the data collected in the Add Health study are suicidal intentions or thoughts, biomarkers, substance use and abuse, violence, delinquency, criminal offending, and involvement with the juvenile and criminal justice systems (Carolina Population Center, 2010). In addition, school administrators provided information regarding characteristics of the schools that respondents attended and, if participants agreed, data from high school transcripts.

Add Health has gone through four waves, with the first study involving a stratified, random sample of all high schools in the United States, administered in 1994/1995, and resulting in 90,118 school questionnaires, 164 school administrator questionnaires, 20,745 in-home interviews of adolescents, and 17,700 parent questionnaires. For the second stage of Wave I, an in-home sample of 27,000 adolescents was drawn, consisting of a core sample from each community in addition to selected oversamples. In this stage, parents were asked to complete a questionnaire about family and relationships. The second wave of Add Health was conducted in 1996 and consisted of close to 15,000 in-home interviews with adolescents and 128 school administrator questionnaires. Wave III of the study consisted of Wave I respondents who could be located and re-interviewed in July of 2001 and April of 2002, resulting in 15,197 young adult in-home interviews (and the collection of biomarker data) as well as 1,507 interviews with partners of the original respondents. Finally, Wave IV of Add Health, conducted in April and June of 2007 and January and February of 2009, consisted of 15,701 adult in-home interviews (of the original respondents, who were then between the ages of 24 and 32) and biomarker collection (Carolina Population Center, 2010).

A March 2010 search of Criminal Justice Abstracts using “Add Health” as the search term resulted in 18 entries, with 16 of these being journal articles. The Carolina Population Center website lists several hundred more publications that have used Add Health data, and the National Institutes of Health’s website indicates that more than 3,000 scientists have used data from Waves I through III, resulting in the publication of more than 600 articles (National Institutes of Health, n.d.).

Criminological researchers have used Add Health data to study the relationship between family structure, family process, and economic factors and delinquency (Lieber, Mack, & Featherstone, 2009); the role of friendship sex composition in girls’ and boys’ involvement in serious violence (Haynie, Steffensmeier, & Bell, 2007); the impact of early puberty on experiencing violent victimization (Schreck, Burek, & Stewart, 2007); the role of social psychological processes in mediating the impact of neighborhood contexts on violence (Kaufman, 2005); and the impact of school and family attachments on drug use, delinquency, and violent behavior (Dornbusch, Erickson, Laird, & Wong, 2001). Others have taken advantage of some of the unique features of the Add Health data to address issues of criminological interest. As noted previously, the Add Health studies have collected extensive information on the parents of those surveyed—Foster and Hagan (2007) used these data to examine the effects of fathers’ incarceration on the detainment and exclusion of children during their transition to adulthood. Beaver, DeLisi, and Wright (2009) used the biomarker data in Add Health and concluded that genetic factors interact with delinquent peers and low self-control in predicting variation in delinquency.

Monitoring the Future: A Continuing Study of U.S. Youth

Since 1975, the MTF study has served as a primary source of information about illicit drug, alcohol, and tobacco use by young people in the United States (Johnston, O’Malley, & Bachman, 1999). Each year, published reports based on MTF data reveal the extent of use of several legal and illegal substances. The study also examines a variety of attitudes among 8th-, 10th-, and 12th-grade students, but it does not address involvement in other criminal and delinquent activities (MTF data are available online at http://www.monitoringthefuture.org).

MTF is an extraordinarily ambitious and costly project—between 15,000 and 20,000 students in each of three grades, in addition to between 9,000 and 16,000 college students and young adults, complete an MTF questionnaire each year. The data from any given MTF survey are directly comparable to those from previous years, largely because sampling techniques and question formats are consistent from one year to the next.

MTF began with a cross-sectional survey of a representative sample of all seniors in public and private high schools in the coterminous United States (Johnston et al., 1999), but it quickly became a longitudinal survey. With the exception of the first graduating class, follow-up questionnaires are mailed to a representative sample, consisting of approximately 2,400 individuals, of the members of each senior class who participated in the MTF. These follow-ups occur on seven occasions between the year of high school graduation and the year that the cohort reaches the age of 32, and they constitute the college student and young adult samples for each MTF survey year.

The MTF survey instrument has been modified over the years to accommodate the use of different types of drugs as well as corollary attitudes and behaviors. For example, a question on crack cocaine was first added to the instrument in 1986, and more detailed questions about all forms of cocaine were included in the 1987 version. Questions on crystal methamphetamine (ice) have been included since 1990; 8th, 10th, and 12th graders have been asked questions about MDMA (ecstasy) since 1996. Since 2007, MTF has placed emphasis on the use of prescription drugs (outside of medical supervision) and on the use of over-the-counter cough and cold medicines to get high. In addition to typical questions about licit and illicit drugs, such as age or grade at first use, frequency and quantity of use, and perceived availability of drugs, the MTF also queries respondents regarding their attitudes and beliefs about involvement in risky behaviors as well as their perceptions of the attitudes, beliefs, and behaviors of others with whom they associate.

The 2008 MTF survey included more than 46,000 students in 8th, 10th, and 12th grade in 386 secondary schools in the United States (Johnston, O’Malley, Bachman, & Schulenberg, 2009). Exhibit 4.4 shows the percentages of each MTF school sample group that reported having used various illicit drugs, alcohol, and tobacco at any time in the 30 days prior to completing the questionnaire in 2008. This table reveals that alcohol is the drug most frequently used by young people, with 43% of 12th graders, 29% of 10th graders, and 16% of 8th graders reporting they had consumed alcohol in the 30 days prior to being surveyed. Twenty-two percent of 12th graders, 16% of 10th graders, and 8% of 8th graders reported using any illicit drug in the previous 30 days, with marijuana being the most commonly used substance. Related to our discussion of socially constructed drug epidemics in Chapter 1, it is notable that less than 2% of 8th, 10th, and 12th graders reported using ecstasy in 2008, and less than 1% reported using methamphetamine in the previous 30 days. Rates of ecstasy and methamphetamine use were similarly low among college students and young adults.

MTF surveys also collect measures of regular or daily use of particular substances. Measuring regular use is important because a relatively large proportion of people who report having used a substance in the past month may be first-time and possibly only-time users. The more regularly a substance is used in a 30-day period, the greater the risk for negative consequences associated with long-term use of the substance. In addition, some substances are viewed as gateway drugs, whose regular use by young people may lead to more hard-core substance abuse and addiction (Johnston et al., 1999, p. I:25).

Exhibit 4.5 shows the percentages of the MTF sample groups that reported daily use (i.e., on at least 20 occasions in the past 30 days) of the so-called gateway drugs: marijuana, alcohol, and tobacco. Not surprisingly, the percentages are considerably lower than those in Exhibit 4.4: For marijuana and alcohol, they are as low as one tenth and never as high as one fourth of the percentages reporting use at least once in the past month. For cigarettes, the percentages in the two tables are closer—about one half to two thirds as many past month users reported being daily smokers.

Although the data do not appear in the tables included in this chapter, the 1988 MTF found that 2.3% of both 8th- and 10th-grade students reported heroin use, compared to 2.0% of seniors. These higher rates of heroin use by younger students may be an artifact of the MTF sampling strategy, as heroin users may be more likely than other students to drop out of school before their senior year (Snyder & Sickmund, 1999). The MTF researchers recognize the potential bias induced by the exclusion of high school dropouts but note that “since the bias from missing dropouts should remain just about constant from year to year, their omission should introduce little or no bias in change estimates” (emphasis in original, Johnston et al., 2009, p. 58). Nonetheless, the lack of responses from high school dropouts and from students who are absent from school when the MTF surveys are administered needs to be considered in interpreting the results from these surveys.

MTF surveys have provided valuable panel data on substance-use patterns among young people over time. Perhaps more important in terms of this chapter, they have provided essential data for evaluating the validity of self-report measures of illicit behaviors (see Caulkins, 2000; Harrison, 1997; Johnston & O’Malley, 1997).

National Survey on Drug Use and Health (NSDUH)

First conducted in 1971, the NSDUH (before 2002 this survey was known as the National Household Survey on Drug Abuse) is an enduring source of information about illicit drug, alcohol, and tobacco use in the United States (Substance Abuse and Mental Health Services Administration [SAMHSA], 2000). Like the MTF, each year the NSDUH reveals the prevalence as well as incidence of drug use. Unlike the MTF survey, however, the NSDUH is administered each year to a sample of non-institutionalized civilians who are 12 years of age or older (as opposed to just students).

Similar to the MTF study, the NSDUH is ambitious and costly, involving face-to-face interviews with almost 70,000 individuals across the United States. The survey is cross-sectional as opposed to longitudinal, and some groups in the target population are oversampled to ensure that there are a sufficient number of interviews to calculate reasonable estimates of drug use by those groups that either may not show up in sufficient numbers in a random sample of the population or may be of particular interest. For example, the NSDUH has traditionally oversampled people over the age of 35, and blacks and Hispanics have been oversampled since 1985. In certain years, residents from rural areas have been oversampled, while in other years, residents from urban areas or low socioeconomic status residents within those areas have been oversampled.

Similar to the MTF surveys, the NSDUH survey instrument has been modified over the years to accommodate trends in the use of different types of drugs and correlates of drug-using behaviors. Some of these modifications have been implemented for only one or two survey years. For example, the 1979 and 1982 surveys asked respondents not only about their own but also about their friends’ use of heroin in order to obtain a better sense of the prevalence of heroin use in the United States. The 1982 survey also included a special section on medical as well as nonmedical use of stimulants, sedatives, tranquilizers, and analgesics. In 1995, respondents were asked about their need for drug or alcohol treatment and their criminal record. Other changes in the questionnaire have become standard features of the NSDUH since they were first introduced. For example, since 1985, there have been questions on (a) the use of cigarettes and related products such as smokeless tobacco, (b) perceived consequences of using various drugs, and (c) the various ways in which cocaine is administered. In 1988, questions about crack cocaine use and sharing needles for drug injection were added to the survey. Questions about access to health insurance and total annual family income were introduced in 1990, about employment and drug testing in the workplace in 1991, and about mental health and access to health care in 1994.

Although the questions on the NSDUH survey have varied over the years, its mode of administration remained unchanged until 1999. For nearly three decades, the NSDUH was a face-to-face, paper and pencil interview (PAPI) that took approximately one hour to complete. Trained interviewers read the survey items aloud to respondents, who recorded their answers to questions deemed to be sensitive (such as those on substance use) on separate sheets so that interviewers could not see the responses. Interviewers recorded respondents’ answers to nonsensitive questions (such as those on occupational status and household composition) directly on the survey booklet.

The 1999 NSDUH heralded a major shift in the mode of administration. Rather than a PAPI, it was a combination of computer-assisted personal interview (CAPI) and a computer-assisted self-interview (CASI). The CAPI portion corresponded to the questions for which interviewers recorded respondents’ answers on the booklet, whereas the CASI portion allowed respondents to enter their own answers to sensitive questions. The use of computer-assisted interviewing (CAI) was expected not only to improve the efficiency of data collection and processing, but also to increase respondents’ honesty in reporting illicit drug use and related behaviors. However, ultimately, these changes did not result in improvements to the efficiency of data collection. Response rates early on in the 1999 survey were so low compared to previous years that additional, complicated subsampling and weighting techniques had to be applied. Furthermore, because the mode of administration has been shown to affect both response rates and the content of responses, “the NHSDA also included a supplemental sample using the paper and pencil interviewing mode for the purposes of measuring trends with estimates comparable to 1998 and prior years” (SAMHSA, 2000, p. A:1).

Another complicating, though ultimately highly beneficial, change in the 1999 NSDUH was the introduction of state-based probability sampling. Through 1998, with the exception of those survey periods when particular regions were oversampled, the NSDUH sampling design was based on national figures. Estimates regarding drug use could only be applied to the United States as a whole, not to individual states. (Drawing inferences about drug use across different regions of the country was somewhat less problematic, although still questionable.) To make it possible to calculate substance use estimates separately for states, as well as to allow for more detailed analysis of national patterns, the 1999 NSDUH drew “an independent, multi-stage area probability sample for each of the 50 states and District of Columbia” (SAMHSA, 2000, p. intro:1).

California, Florida, Illinois, Michigan, New York, Ohio, Pennsylvania, and Texas—eight states that together account for 48% of the U.S. population age 12 and older—were oversampled. Also oversampled were youths and young adults, so that each state’s sample was approximately equally distributed among three major age groups: 12 to 17 years, 18 to 25 years, and 26 years and older.

Exhibit 4.6 shows the percentages of NSDUH respondents in the 12 to 17, 18 to 25, and 26+ age groups who reported past month use of any illicit drugs, marijuana, cocaine, heroin, hallucinogens, inhalants, psychotherapeutic drugs,2 tobacco, and alcohol in 2008. Past month use of illicit drugs was highest in the 18 to 25 age category, and marijuana was the most frequently used illicit drug in all three age groups. The table also reveals fairly high levels of past month alcohol and binge alcohol use, especially in the 18 to 25 age group.

Although it is important to be cautious in making direct comparisons between the MTF data shown in Exhibit 4.4 and the NSDUH data presented in Exhibit 4.6, it is instructive to examine differences in what the two surveys reveal regarding substance use. Limiting the comparison to the 12- to 17-year-olds and 18- to 25-year-olds in Exhibit 4.6 and the 8th, 10th, and 12th graders and college students in Exhibit 4.4, MTF data show somewhat higher percentages of illicit drug, alcohol, and tobacco use overall than do the NSDUH data. These differences are likely due to differences in the mode of administration of the two surveys. Despite all best efforts to maintain privacy and ensure confidentiality, when questions about substance use are posed out loud, in person, and in one’s home, as is the case with the NSDUH, there may be a tendency to underreport one’s consumption of those substances. Alternatively, filling out a questionnaire anonymously, as is the case with the MTF survey, may allow a certain amount of bragging about, or overreporting of, one’s use of illicit drugs, alcohol, and tobacco.

Notes to Exhibit 4.6:

1. Tobacco products include cigarettes, smokeless tobacco (i.e., chewing tobacco or snuff), cigars, or pipe tobacco.

2. Binge alcohol use is defined as drinking five or more drinks on the same occasion at least 1 day in the past 30 days.

3. Heavy alcohol use is defined as drinking five or more drinks on the same occasion on each of 5 or more days in the past 30 days.

Similar to the NYS and MTF, the NSDUH has provided important data for substantive as well as methodological studies. Of particular value are studies that address measurement issues such as response bias and nonresponse error as well as the general validity of self-reported drug use (see Biemer & Witt, 1997; Caulkins, 2000; Gfroerer, Lessler, & Parsley, 1997; Harrell, 1997; Harrison, 1997; Miller, 1997; Turner, Lessler, & Gfroerer, 1992; Wright, Gfroerer, & Epstein, 1997).

SOCIAL CORRELATES OF SELF-REPORTED OFFENDING

Since their inception in the 1940s, self-report measures have consistently revealed dimensions of crime and its correlates that either were at odds with or could not be addressed by official statistics. For example, in self-report surveys, girls and women reported being just as delinquent and criminal, although less frequently and intensely, as did boys and men. Whites likewise admitted to involvement in a range and number of delinquent and criminal acts closely paralleling that of blacks. Middle- and upper-class youth self-reported similar levels of involvement in delinquent activities as lower-class youth. Their consistency notwithstanding, such findings were not uniformly accepted as valid among researchers and practitioners. Indeed, the debate over the “myth of social class and criminality” and the relationship between race or ethnicity and crime was, and continues to be, so essential to understanding the role of self-report measures that it warrants special attention here.3

Tittle, Villemez, and Smith (1978) standardized data from 35 studies, with publication dates spanning four decades, on the relationship between social class and criminality. Their conclusions were highly controversial and launched one of the more enduring and, at times, heated debates in the discipline of criminology. In essence, their analyses indicated that the negative association between social class and criminality revealed in official data was not only much more marked than the slight one observed from self-report data, but had also been declining substantially and steadily over the decades while remaining fairly stable in self-report studies. They found no support for the notion that people of lower social status were more involved in delinquency and crime. “In short, class and criminality are not now, and probably never were related, at least not during the recent past” (p. 652).

Hindelang, Hirschi, and Weis (1979) took issue with the findings of Tittle et al. (1978), arguing that misrepresentation of findings from self-report studies “create the illusion of discrepancy between the correlates of official and self-reported delinquency, when, in general, no such discrepancy has been demonstrated” (p. 996). Their main contention was that besides covering exclusively or primarily trivial offenses, self-report measures do not “tap the same domain of chargeable offenses as do official statistics” (p. 997). This domain should include a full range of types of offenses (i.e., “behavioral content,” p. 997) as well as “seriousness, both within (e.g., amount of theft) and across (e.g., school versus violent) offense types” (p. 997). Hindelang et al.’s analyses showed that if type of offense and seriousness are taken into account, then self-report data look much like official statistics in terms of a disproportionate involvement by males and by blacks in more serious offenses. They contended that neither self-report data nor official statistics were adequate to make any comparisons between social classes with regard to specific illegal behavior. Hindelang et al. (1979) concluded:

This evidence suggests to us that: (1) official measures of criminality provide valid indications of the demographic distribution of criminal behavior; (2) self-report instruments typically tap a domain outside the domain of official data; (3) within the domain they tap, self-report measures provide reliable and valid indicators of offending behavior; (4) the self-report method is capable of dealing with behavior within the domain of official data; and (5) in practice, self-report samples have been inadequate for confident conclusions concerning the correlates of offending behavior comparable in seriousness to that represented in official data. (p. 1009)

Elliott and Ageton (1980) similarly contended that self-report and official statistics do not measure the same things. They noted that “self-report measures of delinquency provide a different picture of the incidence and distribution of delinquent behavior than do official arrest records” (p. 95). Using data from the first year of the NYS, they constructed a measure of criminal behavior that was directly comparable to UCR data both in Hindelang et al.’s (1979) behavioral content (i.e., type of offense) and in time frame (i.e., the period during which the offenses occurred). They found “significant race differences for total [self-reported] delinquency and for predatory crimes against persons” (Elliott & Ageton, 1980, p. 102). In addition, Elliott and Ageton (1980) found that blacks and lower-class youth were disproportionately likely to be high-frequency offenders. In other words, these overall race and social class differences were largely the result of blacks and lower-class youth reporting the commission of so many, and so many more serious, offenses. Elliott and Ageton surmised that because “the more frequent and serious offenders are more likely to be arrested,” their NYS “data are more consistent with official arrest data than are data from most prior self-report studies” (p. 107). Calling particular attention to the tendency in self-report studies to truncate measures of the frequency of commission of offenses while simultaneously paying little attention to the seriousness of offenses, Elliott and Ageton concluded:

The most significant difference may not be between the nonoffender and the one-time offender, or even between the one-time and multiple-time offender. Equal or greater significance may be found between those reporting over (or under) 25 nonserious offenses, or between those reporting over (or under) 5 serious offenses. (p. 108)

Clelland and Carter (1980) began their critique of Tittle et al. by asserting that “the proposition of no relationship is the new myth of class and crime,” and noting that “for Tittle et al., criminologists play the role of 900-pound intellectual gorillas—they define ‘crime’ any way they please.” They argued that self-report studies are “nearly worthless” for examining the class-crime relationship, primarily because of the fact that they focused on minor forms of delinquency, such as “skipping school and throwing eggs” (pp. 320–324).

Braithwaite (1981) also took issue with those who denied an association between social class and criminality and noted that “if [a total of 35 works for Tittle et al.’s secondary analysis] is all that could be found, then they did not look very hard. … Perhaps Tittle et al. take their own findings seriously and adopt no extra precautions when moving about the slums of the world’s great cities than they do when walking in the middle class areas of such cities” (p. 37).

Braithwaite (1981) examined the findings of 143 studies that could address the relationship between social class and crime, 97 of which were based on official statistics and 46 of which were based on self-report measures. Nearly 93% of the official-record studies showed higher crime rates among lower-class as opposed to middle-class people; on the other hand, about 53% of self-report studies showed significantly, or at least notably, higher levels of delinquency by lower-class adolescents. Citing Elliott and Ageton’s (1980) finding that differences in self-reported delinquency result entirely from the contrast “between the lower class group and the rest of the sample” (p. 42), Braithwaite went on to assert that “the nature of the class distribution of crime depends entirely on what form of crime one is talking about” (p. 47).

The debate continued when Kleck (1982) argued that the finding of no relationship between social class and crime was largely due to the fact that lower-class youth had a greater tendency to underreport their involvement in delinquency. He used the following example. Suppose that in a given sample, lower-class respondents had committed an average of six delinquent acts, whereas middle-class youth had committed an average of four. If the middle-class group reported 90% of their delinquent acts but lower-class juveniles only reported 60%, both groups would show an identical mean number of reported acts, at 3.6. Kleck (1982) also noted that several self-report studies had drawn samples from a single school or cluster of schools in relatively class-homogeneous areas, resulting in a truncated range on the social class variable. This sampling strategy thus omitted the theoretically relevant underclass, who, he argued, were more likely to be involved in delinquency. Tittle, Villemez, and Smith (1982), in response to Kleck, said “Kleck, (and others) for example, believes that poor people are not only more criminal than those of other classes but bigger liars as well” (p. 437).
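Kleck's arithmetic can be checked directly. The short sketch below simply multiplies each group's true mean number of acts by its reporting rate, using the figures from his example; the function and variable names are illustrative assumptions, not part of Kleck's analysis.

```python
def mean_reported_acts(true_mean, reporting_rate):
    """Mean number of delinquent acts that would show up in survey responses."""
    return true_mean * reporting_rate

# Figures from Kleck's (1982) hypothetical example.
lower_class = mean_reported_acts(true_mean=6, reporting_rate=0.60)
middle_class = mean_reported_acts(true_mean=4, reporting_rate=0.90)

# Both groups report the same mean even though their true means differ by 50%.
print(round(lower_class, 1), round(middle_class, 1))
```

The point of the example is that a real difference in behavior can disappear entirely in the reported data when reporting rates differ between groups.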

Researchers have continued to attempt to specify the conditions under which socioeconomic status is associated with self-reported delinquency and crime. They have also devoted considerable effort toward documenting, if not necessarily improving, the reliability and validity of self-report measures of key variables. But as Tittle and Meier (1990) observed, “Regardless of the conceptual and methodological reasons, criminologists seem no closer to identifying the nature of the relationship [between social class and criminality] than 50 years ago” (p. 271). “Sometimes SES does appear to predict delinquency; most of the time it does not” (p. 294). A decade later, Dunaway, Cullen, Burton, and Evans (2000) drew much the same conclusion about adult criminality. Results from their mail questionnaire survey of a random sample of adults in a Midwestern city “largely reject the notion that social class has a strong main effect on adult criminality in the general population, and thus, they tend to support Tittle and Meier’s (1990) more recent evaluation of the class-crime debate” (p. 617).

RELIABILITY AND VALIDITY OF SELF-REPORT DATA

Debate over whether, and under what conditions, indicators of social class and race or ethnicity are associated with self-report measures of delinquency prompted a flurry of studies aimed at establishing the reliability and validity of self-report measures (see Elliott & Ageton, 1980; Fendrich & Vaughn, 1994; Hindelang et al., 1979, 1981; Huizinga & Elliott, 1986; Mensch & Kandel, 1988; Thornberry & Krohn, 2000). Perhaps most important, and least likely to be met, on the list of requirements for obtaining representative self-report data is to have a sample that is large enough to include sufficient numbers of relatively rare individuals, that is, “high-rate, serious offenders most likely to come to the attention of authorities” (Thornberry & Krohn, 2000, p. 40).

Among the elements necessary for reliable and valid self-report survey instruments, four are particularly germane to measures of delinquency and crime (Thornberry & Krohn, 2000, p. 41): (1) a wide range and variety of behaviors must be included; (2) serious offenses must be covered if comparisons are to be made to other kinds of data; (3) respondents must be asked to report on the actual, not relative, number of times they engaged in a particular behavior so that people who committed robbery four times are not lumped together with those who committed it 60 times in the past year; and (4) follow-up questions often are required to distinguish chargeable offenses from others—for example, some respondents may initially indicate that they have committed theft when what they actually have done is hidden someone’s books between classes.

Panel studies have generally shown that self-reported delinquency measures yield stable and consistent results from one period to another; that is, they are fairly reliable. Similarly, most tests find that self-reports measure what they set out to measure; that is, they are reasonably valid. However, there is evidence that some groups, some crimes, and some survey modes yield noticeably higher rates of underreporting. For example, “lower class youths tend to score higher on ‘lie’ scales within self-report measures”4 (Braithwaite, 1981, p. 47). Similarly, African American males substantially underreport their involvement in delinquency (Thornberry & Krohn, 2000). There is also some indication that girls may be more honest in reporting their involvement in delinquent behavior than boys (Kim, Fendrich, & Wislar, 2000). At least for one type of criminal behavior, the use of illegal drugs, rates of underreporting are higher for the more serious offenses and for telephone interviews compared to self-administered questionnaires (see Aquilino, 1994; Turner et al., 1992).

As Thornberry and Krohn (2000) observed, a conclusion drawn in the early 1980s may still be the most logical:

The self-report method appears to behave reasonably well when judged by standard criteria available to social scientists. By these criteria, the difficulties in self-report instruments currently in use would appear to be surmountable; the method of self-reports does not appear from these studies to be fundamentally flawed. Reliability measures are impressive and the majority of studies produce validity coefficients in the moderate to strong range. (Hindelang et al., 1981, p. 114, as cited in Thornberry & Krohn, 2000, p. 59)

SELF-REPORTS FROM KNOWN CRIMINALS AND DELINQUENTS

Some researchers believed that official statistics might be just the tip of the iceberg, not only in terms of who engaged in what kinds of crime but also in terms of how much crime those official criminals might account for. These researchers determined to get information directly from the source by surveying known (that is, arrested or incarcerated) criminals. Two prominent studies in this genre are the RAND inmate surveys and the ADAM program.

The RAND Inmate Survey(s)

What is commonly referred to as the RAND inmate survey is actually two surveys conducted at different times and with different samples. The objectives of both of these surveys were the same, however, and their findings parallel each other. Primary among those objectives was to learn from the source about the illegal behavior of convicted criminals, that is, “to gather information on individual patterns of criminal behavior—types of crime committed, degree of specialization in crime types, and changes in criminal patterns over time” (Visher, 1986, p. 166).

RAND researchers first completed exploratory interviews with 49 California prison inmates who were convicted of robbery (see Petersilia, Greenwood, & Lavin, 1977, in Visher, 1986). Using those interview data to construct a self-administered survey instrument, the first inmate survey was conducted in 1976 (Peterson & Braiker, 1981, in Visher, 1986; Tremblay & Morselli, 2000). A total of 624 inmates (representing only a 47% response rate) from five California prisons completed the anonymous questionnaire. The results from the exploratory study and those from the inmate survey were similar: “Most inmates committed few crimes per year. … A small group reported much higher frequencies of offending” (Visher, 1986, p. 164). RAND researchers were not satisfied that the measurement and sampling for this study were sufficient for broader generalization; thus, a more rigorous, more representative inmate survey was designed and conducted.

The second inmate survey, conducted in 1978, drew samples from both jail and prison populations in California, Michigan, and Texas (Tremblay & Morselli, 2000; Visher, 1986). Attempts were made to ensure that the samples were representative of a typical cohort of inmates for those states and that the offenses for which they were convicted covered a broad range in terms of seriousness. A total of 2,190 inmates from the three states completed the confidential questionnaire.5 By making the survey confidential rather than anonymous, researchers were able to compare inmates’ official records with their self-reported information.

The survey instrument included detailed questions about inmates’ illegal activities as juveniles, their adult criminal behavior in the two years prior to the arrest that resulted in their incarceration, and past as well as recent use of drugs and alcohol. Inmates’ attitudes on specific issues, their employment history, and their demographic data were also solicited. Inmates were asked to estimate the number of times in the previous two years they had committed each of 10 crimes, including burglary, business robbery, personal robbery, assault during robbery, other assaults, theft, auto theft, forgery/credit card swindles/bad checks, fraud, and drug dealing.

Exhibit 4.7 summarizes some of the findings from RAND’s second inmate survey. It presents both the median number of crimes per year (i.e., the maximum number of crimes 50% of the inmates reported having committed) and the number of crimes per year at the 90th percentile (i.e., the minimum number of crimes 10% of the inmates reported having committed). Half of the active robbers reported committing the crime no more than 5 times per year, whereas 10% of them reported robbing a person no less often than 87 times a year. The difference between the number of crimes per year reported by low-rate and by high-rate offenders is even more dramatic for the other property offenses. These data led to the conclusion that most people who engage in illegal behavior, even convicted criminals, do so infrequently, but some individuals are involved in crime with such regularity that they can be labeled career criminals.
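To see why the median and the 90th percentile tell such different stories for a skewed distribution, consider the sketch below. The offense counts are invented for illustration and are not RAND data, and the nearest-rank rule used here is only one of several common percentile conventions.

```python
def nearest_rank_percentile(values, pct):
    """Smallest observed value with at least pct percent of cases at or below it."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, so no floating-point rounding is involved.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]

# Hypothetical annual offense counts for ten respondents (not RAND data).
crimes_per_year = [1, 2, 2, 3, 4, 5, 6, 9, 40, 120]

median_rate = nearest_rank_percentile(crimes_per_year, 50)
p90_rate = nearest_rank_percentile(crimes_per_year, 90)
mean_rate = sum(crimes_per_year) / len(crimes_per_year)

print(median_rate, p90_rate, mean_rate)
```

In this invented sample, a handful of high-rate respondents pull the mean far above the median, which is why reporting the median alongside an upper percentile describes such data better than a single average.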

The RAND inmate surveys provided the database for constructing offender typologies—that is, a classification of offenders according to the types of crime they commit (Chaiken & Chaiken, 1984). One of the more important findings from the surveys was that criminals do not necessarily specialize in a single illegal enterprise but instead combine activities to accomplish a particular end. Researchers have also used RAND surveys to explore whether or not inmates were motivated by the expectation that crime pays and how much they claimed to have earned through engaging in criminal behavior (see Tremblay & Morselli, 2000; Wilson & Abrahamse, 1992). Others have used these data to examine the relationship between substance use and other types of illicit activity (see Chaiken & Chaiken, 1990). In many ways, and on the basis of attempts to replicate them, the RAND surveys persist as the standard for obtaining and analyzing self-report data from inmates (see Auerhahn, 1999).

Arrestee Drug Abuse Monitoring

Another example of obtaining self-reported information from known offenders is the Arrestee Drug Abuse Monitoring (ADAM) program, a program established by the National Institute of Justice to monitor drug use among arrestees in a number of jurisdictions in the United States (Taylor & Bennett, 1999). The forerunner of ADAM, the Drug Use Forecasting (DUF) program, was initiated in 1987 and demonstrated the feasibility of urinalysis as a means of measuring drug use by arrestees. By focusing on arrestees, a group that is more likely than other populations to be involved in drug use, ADAM presented a different picture of drug use than general household surveys such as the NSDUH. DUF and ADAM have been used extensively to provide information for the purposes of criminal justice policies, and the studies represent a major resource for criminologists analyzing the association between drug use and involvement in criminal activity (ADAM, 2000).

At each ADAM collection site, trained interviewers conduct voluntary and confidential interviews with arrestees who have been in a jail or booking facility for less than 48 hours. The interview covers basic demographics, drug use history, current drug use, recent participation in buying and selling drugs, lifetime drug and mental health treatment, and, for those who report any illegal drug use in the previous 12 months, detailed information on arrests and housing arrangements. In addition, voluntary urine samples are taken from the arrestees. These urine samples are tested for the presence of (at least) five illegal drugs: marijuana, cocaine (including crack), opiates (including heroin), methamphetamine, and PCP. In 1999, the ADAM program collected data from more than 30,000 adult male arrestees at 34 sites and from more than 10,000 adult female arrestees at 32 sites. In addition, data were collected from more than 2,500 juvenile male detainees at 9 sites and more than 400 juvenile female detainees at 6 sites (ADAM, 2000).

Due to a lack of funding, the ADAM program was terminated in 2003; however, in 2007, the Office of National Drug Control Policy resumed data collection in 10 former ADAM sites as ADAM II (Office of National Drug Control Policy, 2009).

In 2008, across all 10 sites, a total of 4,952 booked arrestees completed the interview portion of ADAM II and 3,924 provided a urine specimen. We focus here on a comparison of 2002 and 2008 data on the percent of male arrestees in the 10 sites testing positive for marijuana, cocaine, opiates, and methamphetamine.

Exhibit 4.8 shows that, across the 10 ADAM II sites and in both 2002 and 2008, marijuana was the drug for which most arrestees tested positive. However, there were substantial differences in the percentage testing positive for cocaine, opiates, and methamphetamine over time and across sites. In 2008, the percentage testing positive for cocaine ranged from a low of 17.2 in Sacramento to a high of 43.8 in Chicago. Arrestees in Chicago also had comparatively high rates of positive tests for opiate drugs, at 25.1% in 2002 and 28.6% in 2008. In contrast, in Charlotte, North Carolina, only 2.3% of arrestees in 2002 and 1.1% in 2008 tested positive for opiates. Exhibit 4.8 also confirms that methamphetamine is a drug more commonly used in western jurisdictions of the United States (see Mosher & Akins, 2007), with arrestees in Portland, Oregon, and Sacramento, California, being much more likely to test positive for this drug. However, it is notable that with the exception of Minneapolis, where the percentage testing positive for methamphetamine was identical in 2002 and 2008, each ADAM II site had a lower percentage of arrestees testing positive for methamphetamine in 2008, with a particularly large decrease in Portland.

In addition to providing useful information regarding patterns of drug use by arrestees, the ADAM project offers a rare opportunity to assess the validity of self-report data—that is, to determine to what extent people tell the truth when responding to a survey. Through comparisons of the self-reported drug use information to urinalysis results, researchers can analyze under- and over-reporting of drug use.

Verifying the Validity of Self-Reported Drug Use

Research has demonstrated that there is often a discrepancy between self-reporting of drug use and the results of urinalysis tests. For example, a study comparing self-reports and urinalysis results that relied on ADAM data from five U.S. cities (New York-Manhattan, Fort Lauderdale, Miami, Washington, D.C., and Birmingham, Alabama) found that 7.8% of arrestees underreported drug use (i.e., they denied using drugs but their urinalysis results were positive), compared with 1.9% who overreported (i.e., they reported using drugs but their urinalysis results were negative; Taylor & Bennett, 1999).
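The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by Taylor and Bennett (1999): the function names, the record format (one pair of true/false values per arrestee), and the counts in the example are all hypothetical, chosen only to show how underreporting, overreporting, and disclosure among those who test positive are computed.

```python
from collections import Counter


def classify(self_reported: bool, tested_positive: bool) -> str:
    """Label one arrestee by comparing the interview answer with the urinalysis."""
    if tested_positive and not self_reported:
        return "underreport"   # denied use, but the test was positive
    if self_reported and not tested_positive:
        return "overreport"    # reported use, but the test was negative
    return "concordant"        # interview and test agree


def validity_summary(records):
    """Summarize (self_reported, tested_positive) pairs as percentages."""
    counts = Counter(classify(s, t) for s, t in records)
    n = len(records)
    positives = sum(1 for _, t in records if t)
    disclosed = sum(1 for s, t in records if s and t)
    return {
        "underreport_pct": 100 * counts["underreport"] / n,
        "overreport_pct": 100 * counts["overreport"] / n,
        "concordance_pct": 100 * counts["concordant"] / n,
        # Share of those who tested positive and also admitted use.
        "disclosure_pct": 100 * disclosed / positives if positives else None,
    }


# Hypothetical sample of 100 arrestees (NOT real ADAM data).
records = (
    [(True, True)] * 30      # reported use, tested positive
    + [(False, True)] * 20   # denied use, tested positive
    + [(True, False)] * 5    # reported use, tested negative
    + [(False, False)] * 45  # denied use, tested negative
)
summary = validity_summary(records)
print(summary)
```

Note that the underreporting and overreporting percentages are taken over all arrestees in the sample, whereas the disclosure percentage is taken only over those who tested positive, so the two kinds of figures are not directly comparable.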

Studies have also indicated that underreporting varies according to the type of drug. There is generally a higher concordance rate for marijuana use, but for harder drugs such as cocaine and heroin, underreporting is much more common. For example, on the basis of 1988 DUF data, 47% of arrestees in New York City reported cocaine use, whereas 75% had positive urinalyses for the substance. In the same year, 41% of Philadelphia arrestees reported cocaine use, but 72% tested positive. However, 28% of the arrestees in New York City self-reported marijuana use, and 30% tested positive. Similarly, in Philadelphia, 28% reported using marijuana, and 32% tested positive (Thornberry & Krohn, 2000). Lu, Taylor, and Riley (2001) similarly found significant underreporting of cocaine use in ADAM data, with fewer than 50% of those who tested positive for cocaine admitting that they used the substance.
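The size of the difference between drugs is easier to see when the 1988 DUF figures quoted above are set side by side. The short sketch below computes, for each city and drug, the gap in percentage points between the share testing positive and the share reporting use. The percentages are those cited from Thornberry and Krohn (2000); the variable and function names are illustrative only. Because these are aggregate figures rather than person-level records, the gap shows how far self-reports fall short overall but cannot identify which individuals underreported.

```python
# (city, drug) -> (percent self-reporting use, percent testing positive),
# as quoted in the text from 1988 DUF data.
duf_1988 = {
    ("New York City", "cocaine"): (47, 75),
    ("Philadelphia", "cocaine"): (41, 72),
    ("New York City", "marijuana"): (28, 30),
    ("Philadelphia", "marijuana"): (28, 32),
}


def reporting_gap(self_report_pct, positive_pct):
    """Percentage-point shortfall of self-reported use relative to urinalysis."""
    return positive_pct - self_report_pct


gaps = {key: reporting_gap(*pcts) for key, pcts in duf_1988.items()}

for (city, drug), gap in gaps.items():
    print(f"{city:14s} {drug:10s} gap = {gap} percentage points")
```

The cocaine gaps (28 and 31 points) are roughly an order of magnitude larger than the marijuana gaps (2 and 4 points), which is the pattern the paragraph describes.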

Golub, Liberty, and Johnson (2005) used 2000 to 2001 ADAM adult arrestee interview and urinalysis data to examine disclosure of drug use and the correlates of disclosure. They found that arrestees were most likely to disclose recent marijuana use (82%), followed by methadone (69%). However, rates of disclosure for cocaine/crack, heroin, and methamphetamine were all about 50%. These researchers also found that white arrestees were much more likely than were black arrestees to disclose recent use of methamphetamine and that arrestees charged with drug offenses were generally more likely than those charged with less serious offenses to disclose recent use of each drug, with the exception of methadone. Perhaps most interestingly, Golub et al. (2005) found striking variation in disclosure rates for particular drugs across ADAM sites. For example, the marijuana disclosure rate varied from a low of 68% in Fort Lauderdale to a high of 93% in Spokane, while the cocaine/crack disclosure rate varied from 28% in Chicago to 70% in Kansas City. They noted that this variation might be attributable to differences across sites with respect to the nature of the jail where the interviews were conducted and the privacy it provides, the nature of the arrest experience and the hostility it engenders, and differences in the disapproval of various drugs across communities. Golub et al. concluded, “This analysis raises serious doubts about the validity of self-reported drug use, at least among arrestees” (2005, p. 932).

Magura et al. (1987, as cited in Magura & Kang, 1997) compared self-reports of drug use with urinalysis results for patients who were receiving methadone treatment in four clinics in New York City. Among subjects who tested positive for each drug, 65% did not report opiate use, 39% did not report benzodiazepine use, and 15% did not report cocaine use. Magura and Kang (1997) also found that African Americans were more likely than other groups to underreport drug use. In general, research suggests that the quality of survey data on racial and ethnic disparities in substance use is compromised by differential measurement error across racial and ethnic groups (Johnson & Bowman, 2003).

The findings of differential honesty in reporting by type of substance are consistent with social desirability theory (Edwards, 1957), which suggests that the distortion of self-reports, by underreporting or overreporting, occurs as a function of the perceived acceptability of the behavior in question. Because the use of marijuana is less stigmatized than the use of hard drugs in U.S. society, subjects may be more likely to truthfully report using the substance (Harrison, 1997).

SUMMARY AND CONCLUSIONS

Thornberry and Krohn (2000) suggested that “the self-report method of collecting data on delinquent and criminal behaviors is one of the most important innovations in criminological research in the 20th century” (p. 34).

Considerable improvements in survey methodology, as well as research efforts focused on enhancing the validity of self-reports over the years, have yielded greater confidence in the data that are collected in this manner.

Self-report data, however, are by no means without their weaknesses. Self-report measures of crime and its correlates continue to be constrained by the same elements that affect self-reports of all types of behavior. Concerns over sampling, representativeness, and generalizability (i.e., did we ask the right people?), along with instrument design and question wording and order (i.e., did we ask the right questions?), plague survey researchers generally and can be especially problematic with respect to surveys on crime. Concerns over the validity of responses (i.e., did the respondents answer truthfully?) likewise are not unique to self-report measures of crime.

At the same time, self-report data have unique strengths: For all their problems, self-report measures of crime provide valuable information that is not available through other measures. This is especially true if researchers are interested in etiological issues such as explanatory variables and models (theory testing) for delinquency and crime, the circumstances surrounding illegal behavior, the ages at which involvement in criminal activity begins and ends, patterns of offending over the life course, and related issues. To maximize the value of self-report data, proper care should be taken to approximate the ideal as closely as possible in methods, sampling, and instruments.

Given that victimization data are also a form of self-report measure, as will be discussed in the next chapter, it is impossible to overstate the importance of self-report measures to our understanding of crime and its correlates.