Managerial Epidemiology
Chapter 10
Data Interpretation Issues
Learning Objectives
Distinguish between random and systematic errors
State and describe sources of bias
Identify techniques to reduce bias at the design and analysis phases of a study
Define what is meant by the term confounding and provide three examples
Describe methods to control confounding
Validity of Study Designs
The degree to which the inference drawn from a study is warranted when account is taken of the study methods, the representativeness of the study sample, and the nature of the population from which it is drawn.
Validity of Study Designs
Two components of validity:
Internal validity
External validity
Internal Validity
A study is said to have internal validity when there has been proper selection of study groups and a lack of error in measurement.
Concerned with the appropriate measurement of exposure, outcome, and association between exposure and disease.
External Validity
External validity implies the ability to generalize beyond a set of observations to some universal statement.
A study is externally valid, or generalizable, if it allows unbiased inferences regarding some other target population beyond the subjects in the study.
Sources of Error in Epidemiologic Research
Random errors
Systematic errors (bias)
Random Errors
Reflect fluctuations around a true value of a parameter because of sampling variability.
Factors That Contribute to Random Error
Poor precision
Sampling error
Variability in measurement
Poor Precision
Occurs when the factor being measured is not measured sharply.
Analogous to aiming a rifle at a target that is not in focus.
Precision can be increased by increasing sample size or the number of measurements.
Example: Bogalusa Heart Study
Sampling Error
Arises when obtained sample values (statistics) differ from the values (parameters) of the parent population.
Although there is no way to prevent a non-representative sample from occurring, increasing the sample size can reduce the likelihood of its happening.
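The link between sample size and sampling error can be illustrated with a small simulation. This is a minimal sketch, not part of the original material: the parent population (10,000 values with mean 120 and standard deviation 15) and the function name `spread_of_sample_means` are assumptions chosen purely for illustration.

```python
import random
import statistics

def spread_of_sample_means(population, sample_size, n_samples=1000, seed=0):
    """Standard deviation of the sample mean across repeated random samples."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)

# Hypothetical parent population (illustrative values, not study data).
pop_rng = random.Random(42)
population = [pop_rng.gauss(120, 15) for _ in range(10_000)]

spread_small = spread_of_sample_means(population, sample_size=25)
spread_large = spread_of_sample_means(population, sample_size=400)

print(f"n = 25:  spread of sample means = {spread_small:.2f}")
print(f"n = 400: spread of sample means = {spread_large:.2f}")
```

The sample means cluster more tightly around the population mean when each sample is larger, which is why a larger sample makes a badly non-representative result less likely without ruling it out.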
Variability in Measurement
The lack of agreement in results from time to time reflects random error inherent in the type of measurement procedure employed.
Bias (Systematic Errors)
“Deviation of results or inferences from the truth, or processes leading to such deviation. Any trend in the collection, analysis, interpretation, publication, or review of data that can lead to conclusions that are systematically different from the truth.”
Factors That Contribute to Systematic Errors
Selection bias
Information bias
Confounding
Selection Bias
Refers to distortions that result from procedures used to select subjects and from factors that influence participation in the study.
Arises when the relation between exposure and disease is different for those who participate and those who theoretically would be eligible for study but do not participate.
Example: Respondents to the Iowa Women’s Health Study were younger, weighed less, and were more likely to live in rural, less affluent counties than nonrespondents.
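A worked example may help show how differential participation distorts an association. The sketch below uses invented counts and participation probabilities (none of them come from the Iowa Women's Health Study): the source population has no exposure-disease association, but exposed cases are assumed to participate more readily than everyone else.

```python
def odds_ratio(table):
    """Odds ratio (a*d)/(b*c) for a 2x2 table stored as a dict."""
    return (
        table["exposed_cases"] * table["unexposed_noncases"]
    ) / (
        table["exposed_noncases"] * table["unexposed_cases"]
    )

# Hypothetical source population with no association (odds ratio = 1).
population = {
    "exposed_cases": 100,
    "exposed_noncases": 900,
    "unexposed_cases": 100,
    "unexposed_noncases": 900,
}

# Hypothetical participation probabilities: exposed cases are more willing.
participation = {
    "exposed_cases": 0.9,
    "exposed_noncases": 0.6,
    "unexposed_cases": 0.6,
    "unexposed_noncases": 0.6,
}

# Expected number of participants in each cell.
participants = {cell: population[cell] * participation[cell] for cell in population}

true_or = odds_ratio(population)
observed_or = odds_ratio(participants)

print(f"Odds ratio in source population: {true_or:.2f}")
print(f"Odds ratio among participants:   {observed_or:.2f}")
```

Under these assumptions the participants show an apparent association even though none exists in the source population. If participation depended only on exposure, or only on disease, the odds ratio would be unchanged; the distortion comes from participation depending on both together.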
Information Bias
Can be introduced as a result of measurement error in assessment of both exposure and disease.
Types of information bias:
Recall bias: better recall among cases than among controls.
Example: Family recall bias
Information Bias (cont’d)
Interviewer/abstractor bias--occurs when interviewers probe more thoroughly for an exposure in a case than in a control.
Prevarication (lying) bias--occurs when participants have ulterior motives for answering a question and thus may underestimate or exaggerate an exposure.
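The effect of recall bias on a case-control odds ratio can be sketched numerically. All numbers here are hypothetical: 200 cases and 200 controls with identical true exposure (80 exposed in each group), where cases are assumed to recall 95% of true exposures and controls only 70%. Unexposed subjects are assumed never to report exposure.

```python
def reported_counts(n_exposed, n_unexposed, recall_fraction):
    """Expected numbers reporting exposure / no exposure when only a
    fraction of truly exposed subjects recall their exposure."""
    reported_exposed = n_exposed * recall_fraction
    reported_unexposed = n_unexposed + n_exposed * (1 - recall_fraction)
    return reported_exposed, reported_unexposed

def odds_ratio(case_exp, case_unexp, control_exp, control_unexp):
    """Exposure odds ratio for a case-control 2x2 table."""
    return (case_exp * control_unexp) / (case_unexp * control_exp)

# Hypothetical truth: identical exposure in cases and controls.
true_or = odds_ratio(80, 120, 80, 120)

# Assumed recall: cases remember better than controls.
case_exp, case_unexp = reported_counts(80, 120, recall_fraction=0.95)
control_exp, control_unexp = reported_counts(80, 120, recall_fraction=0.70)
observed_or = odds_ratio(case_exp, case_unexp, control_exp, control_unexp)

print(f"True odds ratio:     {true_or:.2f}")
print(f"Observed odds ratio: {observed_or:.2f}")
```

Better recall among cases alone is enough to create an apparent association in this sketch, which is the motivation for memory aids and exposure validation.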
Confounding
The distortion of the estimate of the effect of an exposure of interest because it is mixed with the effect of an extraneous factor.
Indicated when the crude and adjusted measures of effect are not equal (a common rule of thumb is a difference of at least 10%).
Can be controlled for in the data analysis.
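The 10% comparison of crude and adjusted estimates can be written as a short check. This is a sketch under stated assumptions: the function names are hypothetical, the example estimates are invented, and the adjusted estimate is used as the denominator of the percent change (conventions differ on this point).

```python
def percent_change(crude, adjusted):
    """Absolute difference between crude and adjusted estimates,
    expressed as a percentage of the adjusted estimate (an assumed convention)."""
    return abs(crude - adjusted) / adjusted * 100

def suggests_confounding(crude, adjusted, threshold_pct=10.0):
    """True when the crude and adjusted estimates differ by at least the threshold."""
    return percent_change(crude, adjusted) >= threshold_pct

# Hypothetical estimates for illustration.
large_shift = suggests_confounding(crude=2.0, adjusted=1.5)
small_shift = suggests_confounding(crude=1.52, adjusted=1.50)

print(f"Crude 2.00 vs adjusted 1.50: confounding suggested = {large_shift}")
print(f"Crude 1.52 vs adjusted 1.50: confounding suggested = {small_shift}")
```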
Criteria of Confounders
To be a confounder, an extraneous factor must satisfy the following criteria:
Be a risk factor for the disease.
Be associated with the exposure.
Not be an intermediate step in the causal path between exposure and disease.
Simpson’s Paradox as an Example of Confounding
Simpson’s paradox means that an association in observed subgroups of a population may be reversed in the entire population.
Illustrated by examining the data (% of black and gray hats) first according to two individual tables and then by combining all the hats on a single table.
Simpson’s Paradox (cont’d)
When the hats are on separate tables, a greater proportion of black hats than gray hats fit on each table.
On table 1:
90% of black hats fit
85% of gray hats fit
On table 2:
15% of black hats fit
10% of gray hats fit
Simpson’s Paradox (cont’d)
When the man returns the next day and all of the hats are on one table:
60% of gray hats fit (18 of 30)
40% of black hats fit (12 of 30)
Note that combining all of the hats on one table is analogous to confounding.
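The reversal can be checked with simple arithmetic. The per-table counts below are not stated in the slides; they are reconstructed as the only whole-number counts consistent with the stated percentages and the combined totals of 30 hats per color (for example, 9 of 10 black hats on table 1 gives 90%).

```python
from fractions import Fraction

# (hats that fit, total hats); counts reconstructed from the stated percentages.
hats = {
    "table 1": {"black": (9, 10), "gray": (17, 20)},
    "table 2": {"black": (3, 20), "gray": (1, 10)},
}

def fit_rate(fit, total):
    """Exact proportion of hats that fit."""
    return Fraction(fit, total)

# Within each table, black hats fit more often than gray hats.
for table, colors in hats.items():
    black, gray = fit_rate(*colors["black"]), fit_rate(*colors["gray"])
    print(f"{table}: black {float(black):.0%}, gray {float(gray):.0%}")

# Combine both tables into one.
combined = {}
for color in ("black", "gray"):
    fit = sum(hats[table][color][0] for table in hats)
    total = sum(hats[table][color][1] for table in hats)
    combined[color] = (fit, total)

for color, (fit, total) in combined.items():
    print(f"combined: {color} {float(fit_rate(fit, total)):.0%} ({fit} of {total})")
```

The reversal arises because most gray hats sit on table 1, where fit is high for both colors, while most black hats sit on table 2, where fit is low. Table acts as the confounder.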
Examples of Confounding
Air pollution and bronchitis are positively associated. Both are influenced by crowding, a confounding variable.
The association between high altitude and lower heart disease mortality also may be linked to the ethnic composition of the people in these regions.
Techniques to Reduce Selection Bias
Develop an explicit (objective) case definition.
Enroll all cases in a defined time and region.
Strive for high participation rates.
Take precautions to ensure representativeness.
Reducing Selection Bias Among Cases
Ensure that all medical facilities are thoroughly canvassed.
Develop an effective system for case ascertainment.
Consider whether all cases require medical attention; consider possible strategies to identify where else the cases might be ascertained.
Reducing Selection Bias Among Controls
Compare the prevalence of the exposure with other sources to evaluate credibility.
Attempt to draw controls from a variety of sources.
Techniques to Reduce Information Bias
Use memory aids; validate exposures.
Blind interviewers as to subjects’ study status.
Provide standardized training sessions and protocols.
Use standardized data collection forms.
Blind participants as to study goals and classification status.
Try to ensure that questions are clearly understood through careful wording and pretesting.
Methods to Control Confounding
Prevention strategies--attempt to control confounding through the study design itself.
Three types of prevention strategies:
Randomization
Restriction
Matching
Two types of analysis strategies:
Stratification
Multivariate techniques
Randomization
Attempts to ensure equal distributions of the confounding variable in each exposure category.
Advantages:
Convenient, inexpensive; permits straightforward data analysis.
Disadvantages:
Need control over the exposure and the ability to assign subjects to study groups.
Need large sample sizes.
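A short simulation can illustrate why randomization tends to balance a confounder only when samples are large. Everything in this sketch is an assumption for illustration: the cohort is synthetic, smoking (30% prevalence) stands in for the confounder, and the function names are hypothetical.

```python
import random

def make_cohort(n, smoker_prevalence=0.3, seed=1):
    """Synthetic cohort in which each subject is a smoker with fixed probability."""
    rng = random.Random(seed)
    return [{"id": i, "smoker": rng.random() < smoker_prevalence} for i in range(n)]

def randomize(subjects, seed=0):
    """Randomly split subjects into two equal-sized study groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def smoker_share(group):
    """Proportion of smokers in a group."""
    return sum(s["smoker"] for s in group) / len(group)

def imbalance(n):
    """Absolute difference in smoker share between the two randomized groups."""
    treatment, control = randomize(make_cohort(n))
    return abs(smoker_share(treatment) - smoker_share(control))

for n in (20, 200, 2000):
    print(f"n = {n:4d}: difference in smoker share = {imbalance(n):.3f}")
```

Because assignment ignores every subject characteristic, the same balancing tendency applies to confounders that were never measured.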
Restriction
May prohibit variation of the confounder in the study groups.
For example, restricting participants to a narrow age category can eliminate age as a confounder.
Provides complete control of known confounders.
Unlike randomization, cannot control for unknown confounders.
Matching
Matches subjects in the study groups according to the value of the suspected or known confounding variable to ensure equal distributions.
Frequency matching--the number of cases with particular match characteristics is tabulated, and controls are then selected to give the same distribution of those characteristics.
Individual matching--the pairing of one or more controls to each case based on similarity in sex, race, or other variables.
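Frequency matching can be sketched as tabulating the cases by the matching characteristic and then drawing controls in the same proportions. The function `frequency_match`, the age groups, and the subject records below are hypothetical and serve only to illustrate the idea.

```python
import random
from collections import Counter

def frequency_match(cases, pool, key, ratio=1, seed=0):
    """Select `ratio` controls per case within each stratum defined by `key`."""
    rng = random.Random(seed)
    cases_per_stratum = Counter(key(case) for case in cases)
    controls = []
    for stratum, n_cases in cases_per_stratum.items():
        candidates = [person for person in pool if key(person) == stratum]
        needed = n_cases * ratio
        if len(candidates) < needed:
            raise ValueError(f"Not enough candidate controls in stratum {stratum!r}")
        controls.extend(rng.sample(candidates, needed))
    return controls

# Hypothetical cases: 3 aged 40-49, 5 aged 50-59, 2 aged 60-69.
case_groups = ["40-49"] * 3 + ["50-59"] * 5 + ["60-69"] * 2
cases = [{"id": i, "age_group": g} for i, g in enumerate(case_groups)]

# Hypothetical pool of potential controls: 20 in each age group.
pool_groups = ["40-49", "50-59", "60-69"] * 20
pool = [{"id": 100 + i, "age_group": g} for i, g in enumerate(pool_groups)]

controls = frequency_match(cases, pool, key=lambda s: s["age_group"], ratio=2)
control_distribution = Counter(c["age_group"] for c in controls)
print(control_distribution)
```

The stratum search inside the loop hints at the recordkeeping cost of matching: a pool that is thin in any stratum makes the match fail.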
Matching (cont’d)
Advantages:
Fewer subjects are required than in unmatched studies of the same hypothesis.
May enhance the validity of a follow-up study.
Disadvantages:
Costly because extensive searching and recordkeeping are required to find matches.
Two Analysis Strategies to Control Confounding
Stratification--analyses performed to evaluate the effect of an exposure within strata (levels) of the confounder.
Multivariate techniques--use computers to construct mathematical models that describe simultaneously the influence of exposure and other factors that may be confounding the effect.
Advantages of Stratification
Performing analyses within strata is a direct and logical strategy.
Minimum assumptions must be satisfied for the analysis to be appropriate.
The computational procedure is straightforward.
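A stratified analysis can be sketched in a few lines. The example compares the crude odds ratio with stratum-specific odds ratios and a Mantel-Haenszel pooled odds ratio; the Mantel-Haenszel estimator is one common choice and is not named in the slides. The two age strata and all counts are invented for illustration.

```python
def odds_ratio(a, b, c, d):
    """a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across a list of 2x2 tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Hypothetical data: age is related to both exposure and disease.
strata = {
    "younger": (10, 90, 40, 360),
    "older": (120, 280, 30, 70),
}

# Collapse the strata cell by cell to obtain the crude table.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))

crude_or = odds_ratio(*crude_table)
stratum_ors = {name: odds_ratio(*table) for name, table in strata.items()}
adjusted_or = mantel_haenszel_or(list(strata.values()))

print(f"Crude OR: {crude_or:.2f}")
for name, value in stratum_ors.items():
    print(f"OR in {name} stratum: {value:.2f}")
print(f"Mantel-Haenszel adjusted OR: {adjusted_or:.2f}")
```

In this invented data set the association seen in the crude table disappears within each age stratum, so the crude estimate reflects confounding by age rather than an effect of the exposure.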
Disadvantages of Stratification
Small numbers of observations in some strata.
A variety of ways to form strata with continuous variables.
Difficulty in interpretation when several confounding factors must be evaluated.
Categorization results in loss of information.
Multivariate Techniques
Advantages:
Continuous variables do not need to be converted to categorical variables.
Allow for simultaneous control of several exposure variables in a single analysis.
Disadvantages:
Potential for misuse.
Publication Bias
Occurs because of the influence of study results on the chance of publication.
Studies with positive results are more likely to be published than studies with negative results.
Publication Bias (cont’d)
May result in a preponderance of false-positive results in the literature.
Bias is compounded when published studies are subjected to meta-analysis.
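A small simulation can show how selective publication fills the literature with false positives. The setup is entirely hypothetical: every study examines an exposure with no true effect, statistically significant results are always published, and non-significant results are published with an assumed probability of 10%.

```python
import random

def simulate_literature(n_studies=2000, true_effect=0.0, standard_error=0.2,
                        p_publish_nonsignificant=0.1, seed=0):
    """Return significance flags for all studies and for the published subset."""
    rng = random.Random(seed)
    all_studies, published = [], []
    for _ in range(n_studies):
        estimate = rng.gauss(true_effect, standard_error)
        significant = abs(estimate / standard_error) > 1.96  # two-sided 5% test
        all_studies.append(significant)
        # Significant results are always published; others only sometimes.
        if significant or rng.random() < p_publish_nonsignificant:
            published.append(significant)
    return all_studies, published

all_studies, published = simulate_literature()
share_all = sum(all_studies) / len(all_studies)
share_published = sum(published) / len(published)

print(f"Significant among all studies conducted: {share_all:.1%}")
print(f"Significant among published studies:     {share_published:.1%}")
```

Since the true effect is zero, every significant result here is a false positive. A meta-analysis restricted to the published subset would pool a sample enriched with such results.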