
Encyclopedia of Epidemiology

Type I and Type II Errors

Contributors: Rebecca Harrington & Li-Ching Lee

Edited by: Sarah Boslaugh

Book Title: Encyclopedia of Epidemiology

Chapter Title: "Type I and Type II Errors"

Pub. Date: 2008

Access Date: May 7, 2019

Publishing Company: SAGE Publications, Inc.

City: Thousand Oaks

Print ISBN: 9781412928168

Online ISBN: 9781412953948

DOI: http://dx.doi.org/10.4135/9781412953948.n465

Print pages: 1053-1057



Type I and Type II errors are two types of errors that may result when making inferences from results calculated on a study sample to the population from which the sample was drawn. They refer to discrepancies between the acceptance or rejection of a null hypothesis, based on sample data, as compared with the acceptance or rejection that reflects the true nature of the population data. Both types of error are inherent in inferential statistics, but they can be minimized through study design and other techniques.

Type I Error

The probability of a Type I error, also known as alpha (α), is the probability of concluding that a difference exists between groups when in truth it does not. Another way to state this is that alpha represents the probability of rejecting the null hypothesis when it should have been accepted. Alpha is commonly referred to as the significance level of a test or, in other words, the level at or below which the null hypothesis is rejected. It is often set at 0.05 which, although arbitrary, has a long history that originated with R. A. Fisher in the 1920s. The alpha level is used as a guideline to make decisions about the p value that is calculated from the data during statistical analysis: Most typically, if the p value is at or below the alpha level, the results of the analysis are considered significantly different from what would have been expected by chance.

The p value is also commonly referred to as the significance level and is often considered analogous to the alpha level, but this is a misuse of the terms. There is an important difference between alpha and p value: Alpha is set by the researcher at a certain level before data are collected or analyzed, while the p value is specific to the results of a particular data analysis. For instance, a researcher might state that he or she would use an alpha level of 0.05 for a particular analysis. This means two things. First, the researcher accepts that if the null hypothesis is true and the analysis were repeated an infinite number of times with samples of equal size drawn from the same population, 5% of the analyses would return significant results when they should not (a Type I error). Second, results with p values of 0.05 or less will be considered significant, that is, judged unlikely to be due to chance alone. The p value calculated for a particular experiment can be any number between 0 and 1: In this example, a p value of 0.02 would be considered significant while a p value of 0.8 would not be.

As an example of a Type I error, consider the case of two normally distributed populations whose true means are equal. If an infinite number of samples are drawn from those populations, the means of the samples will not always be equal, and sometimes will be quite discrepant. Because in most cases we do not know the true population means, we use statistics to estimate how likely the differences in the means found in our samples are, if the population means were truly the same. In doing this, we accept that in some percentage of the cases, we will make the wrong decision, and conclude that the population means are different when they are truly the same: The probability of making this incorrect decision is Type I error or alpha.
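The repeated-sampling argument above can be illustrated with a small simulation. This is only a sketch under stated assumptions: both samples are drawn from the same normal population with a known standard deviation, a two-sided z-test is used, and the function name `simulate_type_i_rate`, the sample size of 30 per group, and the number of trials are illustrative choices that do not come from the entry.

```python
import random
from statistics import NormalDist, fmean


def simulate_type_i_rate(n_per_group=30, n_trials=4000, alpha=0.05,
                         sigma=1.0, seed=0):
    """Estimate how often a two-sided z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference in means when sigma is known.
    se = sigma * (2.0 / n_per_group) ** 0.5
    rejections = 0
    for _ in range(n_trials):
        # Both samples come from the same population, so the true means are equal.
        group_a = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, sigma) for _ in range(n_per_group)]
        z = (fmean(group_a) - fmean(group_b)) / se
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
        if p_value <= alpha:
            rejections += 1  # a Type I error: H0 is true but was rejected
    return rejections / n_trials


rate = simulate_type_i_rate()
print(f"Estimated Type I error rate at alpha = 0.05: {rate:.3f}")
```

With enough trials the estimated rejection rate should settle near the chosen alpha, which is the sense in which alpha is the probability of a Type I error.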

Type II Error

The probability of a Type II error is known as beta (β). Beta is the probability of concluding that no difference exists between groups when in truth it does exist. As with alpha, we accept that there is some probability of drawing incorrect conclusions merely by chance: Often, the probability is set at 20%.

The complement of beta (i.e., 1 − β) is known as statistical power, and describes the probability of detecting a difference between sample groups when a difference of specified size truly exists in the population. The commonly accepted power level is 80%, corresponding to a beta of 20%, meaning that if a difference at least as large as the one we specify exists in the population from which our samples are drawn, then over an infinite number of trials we will detect that difference 80% of the time. If the power of a study is low, it may not be able to detect important differences that may truly exist, thereby missing potentially important associations.
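As a rough illustration of how power and beta relate, the sketch below computes power for a one-sided two-sample z-test. It assumes normally distributed outcomes with a known common standard deviation and equal group sizes; the function name `power_one_sided` and the example numbers (a difference of 0.5 standard deviations, 50 subjects per group) are hypothetical and are not taken from the entry.

```python
from statistics import NormalDist


def power_one_sided(delta, sigma, n_per_group, alpha=0.05):
    """Power of a one-sided two-sample z-test to detect a true difference delta."""
    std_normal = NormalDist()
    # Standard error of the difference between two group means.
    se = sigma * (2.0 / n_per_group) ** 0.5
    # Cutoff above which H0 (no difference) is rejected.
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    # Under the alternative, the z statistic is centered at delta / se.
    return 1.0 - std_normal.cdf(z_crit - delta / se)


power = power_one_sided(delta=0.5, sigma=1.0, n_per_group=50)
beta = 1.0 - power
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

Under these assumptions the power comes out at roughly 0.80, so beta is roughly 0.20. When `delta` is 0, the function returns alpha, because any rejection is then a Type I error.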


Importance of Type I and Type II Errors

Type I and Type II errors are generally thought of in the context of hypothesis testing. In hypothesis testing, the null hypothesis (H0) is often that there is no difference between groups while the alternative hypothesis (Ha) is that there is a difference between groups. Type I and Type II errors are the two types of errors that may occur when making a decision based on the study sample as to whether the null hypothesis or the alternative hypothesis is true. The 2 × 2 table shown below (Table 1) illustrates when a Type I or Type II error occurs in the context of hypothesis testing. These errors are important concepts in epidemiology because they allow for the conceptualization of how likely study results are to reflect the truth. From them, guidelines can be set as to how much uncertainty is acceptable in the sample when making an inference about the truth in the population, and they give an idea of how likely the data are to be able to detect a true difference.

The probability of a Type I error, alpha, and the complement of the probability of a Type II error, power, are used in the calculation of sample size. Prior to beginning a study, it is necessary to decide on the levels of error that are acceptable and from this, determine the sample size that corresponds to the chosen levels of error. As stated previously, although the common alpha and power level are 0.05 and 0.80, respectively, sometimes researchers choose different levels. Their choice depends in part on the relative importance of making a Type I or Type II error, because there is a trade-off between the alpha and power levels: When the alpha level is set lower, the beta necessarily becomes higher and vice versa. Figure 1 demonstrates why this trade-off occurs. Figure 1a shows a scenario where, using a one-sided test and specifying the alternative hypothesis as the average amount by which males are taller than females, or delta (Δ), the alpha is set at 0.05, and the beta is 0.20. When the alpha level is changed to 0.10, keeping all other factors (i.e., sample size) the same, the beta necessarily lowers to 0.12 as seen in Figure 1b. This happens because the amount of overlap between the two curves is predetermined by the values given to the null and alternative hypotheses. Increasing the alpha level shifts the cutoff to the left, thereby decreasing the size of beta (and increasing power).
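The trade-off and the sample size calculation described above can be sketched numerically. The code below is a normal-approximation sketch for a one-sided two-sample comparison with a known common standard deviation. The helper names `beta_for_alpha` and `n_per_group` are hypothetical, and the distributions behind Figure 1 are not given in the text, so the output should not be expected to match the figure exactly.

```python
from math import ceil
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta_for_alpha(alpha, shift):
    """Beta for a one-sided z-test; shift is the true difference divided by its SE."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    # Beta is the part of the alternative distribution that falls below the cutoff.
    return STD_NORMAL.cdf(z_crit - shift)


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Subjects needed per group to detect delta with the requested power."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2.0 * (sigma * (z_alpha + z_power) / delta) ** 2)


# Fix the overlap between the two curves so that alpha = 0.05 gives beta = 0.20,
# then vary alpha while holding everything else constant.
shift = STD_NORMAL.inv_cdf(0.95) + STD_NORMAL.inv_cdf(0.80)
for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f} -> beta = {beta_for_alpha(alpha, shift):.3f}")

# Hypothetical planning example: detect a difference of half a standard deviation.
print("n per group:", n_per_group(delta=0.5, sigma=1.0))
```

Under this approximation, raising alpha from 0.05 to 0.10 lowers beta from 0.20 to roughly 0.11, in the same neighborhood as the 0.12 quoted for Figure 1b, while lowering alpha to 0.01 raises beta well above 0.20.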

Table 1 Hypothesis Testing and Errors

                H0 Is True      H0 Is Not True
Accept H0       Correct         Type II error
Reject H0       Type I error    Correct

Generally, beta is set much higher than alpha because some consider a Type II error to be the less serious error to make, though this view is controversial. Deducing that no difference exists between groups when one does seems a less harmful mistake because it may lead to lack of action on the part of scientists (i.e., not implementing an efficacious intervention or drug treatment). On the other hand, deducing that there is a difference when there really is not may lead to inappropriate action, which could expose people to harmful side effects without bringing the expected benefits. The debate, however, comes about with the realization that lack of action is not always less harmful and, therefore, the levels at which the alpha and beta are set depend on the potential costs or benefits that may result from a Type I or Type II error.

The concepts of Type I and Type II errors also pertain to instances where the outcome measure is a categorical variable, not continuous. The principle is the same although statistics appropriate to categorical data are used to estimate effect size, such as chi-square or odds ratio (OR), rather than a statistic such as mean difference between groups. An example using the OR is demonstrated in Figure 2, which also demonstrates the influence of sample size on Type I and Type II errors. Let us assume the real effect is OR = 1.5 in the population and the alpha level is set at 0.05. The larger study sample has a smaller confidence interval (CI) range and is able to detect the difference between the experiment and the control groups at the set alpha level (0.05), whereas a smaller study sample fails to do so because the CI includes OR = 1 (H0). In other words, the analysis based on the smaller study sample resulted in a Type II error, failing to detect a true difference, which could also be stated as failing to reject the null hypothesis when it should have been rejected.
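The effect of sample size on the confidence interval for an OR can be sketched as follows. The cell counts are hypothetical and were chosen only so that the sample OR equals 1.5 in both studies. The interval uses the common log-based (Woolf) approximation, which is an assumption here, since the entry does not say how the intervals in Figure 2 were computed.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, alpha=0.05):
    """Odds ratio and its (1 - alpha) CI from a 2 x 2 table.

    a, b: subjects with and without the outcome in the experiment group
    c, d: subjects with and without the outcome in the control group
    """
    or_hat = (a * d) / (b * c)
    # Standard error of log(OR) under the log-based (Woolf) approximation.
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    lower = exp(log(or_hat) - z * se_log)
    upper = exp(log(or_hat) + z * se_log)
    return or_hat, lower, upper


# Hypothetical small study (100 subjects) and a study ten times larger.
small = odds_ratio_ci(30, 20, 25, 25)
large = odds_ratio_ci(300, 200, 250, 250)
for label, (or_hat, lower, upper) in (("small", small), ("large", large)):
    verdict = "CI includes 1" if lower <= 1.0 <= upper else "CI excludes 1"
    print(f"{label}: OR = {or_hat:.2f}, 95% CI {lower:.2f} to {upper:.2f} ({verdict})")
```

With these counts the smaller study's interval spans 1, so it fails to reject H0 even though the sample OR is 1.5, while the larger study's interval lies entirely above 1.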


In general, increasing the study sample size decreases the probability of making a Type II error without having to increase the alpha level. This is, in part, because increasing the sample size decreases the sampling variability of the estimate (its standard error) and thereby increases statistical power. Along the same line, when the sample size is very large, there is a high likelihood of finding statistically significant differences between study groups; however, the differences are not necessarily clinically significant.
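The point about very large samples can be made concrete with a short calculation. The sketch below assumes a two-sided two-sample z-test with a known standard deviation and an observed difference of 0.02 standard deviations, a hypothetical value chosen to stand for a clinically trivial effect. The function name `two_sided_p` is also an illustrative choice.

```python
from statistics import NormalDist


def two_sided_p(diff_in_sd, n_per_group):
    """Two-sided p value for an observed mean difference expressed in SD units."""
    # The standard error of the difference shrinks as the sample size grows.
    se = (2.0 / n_per_group) ** 0.5
    z = diff_in_sd / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


for n in (100, 10_000, 100_000):
    print(f"n per group = {n:>7}: p = {two_sided_p(0.02, n):.4g}")
```

The same tiny difference is far from significant with 100 subjects per group but falls well below 0.05 with 100,000 per group, which is why statistical significance in a very large study does not by itself imply clinical importance.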

In epidemiology, there has been some discussion as to the ethicality of conducting a study that has a high probability of Type II error, even if the likelihood of a Type I error is low. The issue arises because study participants are asked to take on risks by being in a study that they wouldn't take on otherwise, such as the use of experimental drugs or potential breaches to confidentiality, and many researchers consider it unethical to expose them to those risks unless the study has a high probability of finding differences if they truly exist. In a well-designed study with proper methodology, human subjects’ protections, and ample power, the risks taken on by participants are outweighed by the potential benefits to society that come with scientific findings. However, in a study with low power, the risks may not be outweighed by the potential benefits to society because of the lesser probability that a true difference will be detected. Grant applications and institutional review board proposals often require a power analysis for this reason, and further require that researchers demonstrate that they will be able to attract sufficient study subjects to give them adequate power.

Figure 1 Type I and Type II Error Trade-Off

Figure 2 Type I and Type II Errors for Categorical Variables



Rebecca Harrington and Li-Ching Lee

See also

• Hypothesis Testing
• Multiple Comparison Procedures
• p Value
• Sample Size Calculations and Statistical Power
• Significance Testing