epidemiology
Cohort Studies
HESC 401 Epidemiology
• The beginning of Chapter 8 covers alpha and p-values, which we have not discussed yet. We will cover them when we review relative risk, odds ratios, and other statistical concepts.
Objectives
• Describe the basic design of a cohort study
• Begin to understand a 2 x 2 table (Tables 9-1 and 9-2)
• Identify differences and similarities between cohort studies and randomized trials
• Describe the different approaches to selecting groups in cohort studies and any problems associated with them
• Identify and describe prospective and retrospective cohort studies and their similarities and differences
• Describe the Framingham cohort study
• Identify and explain biases associated with cohort studies
Design of a Cohort Study
• Two groups are selected by an investigator: exposed and non-exposed.
• The two groups are then followed up to compare the incidence of disease.
• If a positive association exists between the exposure and the disease, we would expect the incidence in the exposed group to be greater than the incidence in the non-exposed group.
• Comparing the incidence in exposed and non-exposed persons is the hallmark of this design.
Incidence of Disease
• The key advantage of a cohort study over a case-control or cross-sectional study is that we can calculate the incidence of disease. Because we follow people over time, we can identify who has newly developed the disease and hence calculate the incidence.
                  Then Follow to See Whether
                  Disease      Disease Does              Incidence Rates
                  Develops     Not Develop     Totals    of Disease
Exposed              a              b          a + b     a / (a + b)
Not exposed          c              d          c + d     c / (c + d)
The 2 x 2 table shown here (Table 9-1 in the book) illustrates the important point that cohort studies can estimate the incidence of disease in the exposed and unexposed groups (the last column). In later chapters, this will be important when calculating the “risk” of disease in those exposed compared with those not exposed.
                           Then Follow to See Whether
                           CHD          CHD Does                 Incidence per
                           Develops     Not Develop    Totals    1,000 per Year
Smoke cigarettes              84           2,916       3,000         28.0
Do not smoke cigarettes       87           4,913       5,000         17.4
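Table 9-2, shown here and in the book, provides an excellent example of calculating incidence for those exposed compared with those not exposed. Incidence of disease in smokers = new cases of disease / all those at risk in the exposed group = 84/3,000 = 0.028, or 28.0 per 1,000 per year. Likewise, incidence in nonsmokers = 87/5,000 = 0.0174, or 17.4 per 1,000 per year.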
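As a minimal sketch of the arithmetic in Table 9-2, the short Python example below computes the incidence per 1,000 for each group. The function name `incidence_per_1000` is an illustrative choice and does not come from the book.

```python
def incidence_per_1000(new_cases, at_risk):
    """Incidence = new cases / persons at risk, scaled to a rate per 1,000."""
    if at_risk <= 0:
        raise ValueError("at_risk must be positive")
    return 1000 * new_cases / at_risk


# Counts taken from Table 9-2.
smokers = incidence_per_1000(84, 3000)
nonsmokers = incidence_per_1000(87, 5000)

print(f"Smokers:    {smokers:.1f} per 1,000 per year")
print(f"Nonsmokers: {nonsmokers:.1f} per 1,000 per year")
print("Higher incidence in the exposed group:", smokers > nonsmokers)
```

The final comparison is the hallmark of the cohort design: the incidence in the exposed group is set against the incidence in the non-exposed group.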
Figure 9-2: Cohort study design, including exposed and non-exposed groups. Each group is broken down further into those who develop the disease and those who do not.
Comparing Cohort Studies with
Randomized Trials
• Both studies compare exposed and non-exposed groups.
• In randomized trials, for ethical reasons, the “exposure” is a treatment or preventive measure.
• In cohort studies, the “exposure” is often to a possibly toxic or carcinogenic agent.
• In both studies, an exposed group is compared with a non-exposed group or with a group with another exposure.
What is the Main Difference
Between the Two Studies?
• In a randomized trial, you are actually conducting an “experiment” and testing whether a drug or treatment has an effect.
• In a cohort study, by contrast, there is no randomization. Instead, you follow people over time to see whether being exposed or not exposed to something, such as cigarette smoke or a bad diet, has an effect on disease.
• The following slide displays the main difference between the two studies.
Figure 9-3: The selection of study groups in experimental and observational epidemiologic studies.
Selecting Study Populations
• There are two ways to develop exposed and non-exposed groups in cohort studies.
• 1) Create a study population by selecting groups on the basis of whether or not they were exposed. This design is shown on the next slide.
Figure 9-4: Design of a cohort study beginning with exposed and non-exposed groups.
Selecting Study Populations (cont’d)
• 2) Select a defined population before any members become exposed or before the exposure is identified. Select on the basis of some factor not related to exposure, and then take histories or blood tests of the entire population.
• Use the results of the histories or tests to separate the population into exposed and non-exposed groups.
Figure 9-5: Design of a cohort study starting with a defined population.
Issue with Using Second Selection
Approach
• Cohort studies often require a long follow-up period, lasting until enough outcomes have occurred.
• When the second approach is used, the exposure of interest may not occur for some time, even for many years after the population has been identified.
Prospective Cohort Study
• Also called a concurrent cohort or longitudinal cohort.
• The study is concurrent because the investigator identifies the original population at the beginning of the study and follows the subjects concurrently through calendar time until the point at which the disease does or does not develop.
• Figure 9-6 on the next slide shows a hypothetical prospective cohort study.
Figure 9-6: This shows the time frame for a hypothetical prospective cohort study that began in 2008. Over the course of the next 20 years, the investigator will follow the subjects in order to observe any outcomes.
Problems with Prospective Cohort Studies
• 1) The follow-up period is long. The study on the previous slide will take 20 years to complete.
• 2) Funding is generally limited to 3 to 5 years and would not last the entire duration of the study.
• 3) There is a risk that the subjects will outlive the investigator, or at least that the investigator may not survive to the end of the study.
Retrospective Cohort Study
• Also called a historical cohort study or a nonconcurrent prospective study.
• It is the same as a prospective cohort study in that it still compares exposed and non-exposed groups.
• In the retrospective study, we use historical data so that we can telescope the frame of calendar time for the study and obtain our results sooner.
• The study begins with a pre-existing population to reduce the duration of the study.
Main Similarity and Difference Between
Prospective and Retrospective Studies
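• Similarity: The designs are identical in that both compare exposed and non-exposed populations.
• Difference: Calendar time. The prospective study follows subjects forward from the start of the study, while the retrospective study uses historical data to shorten the calendar time needed.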
Figure 9-8: Time frames for hypothetical prospective and retrospective cohort studies begun in 2008.
Example of Prospective and Retrospective Studies
• Based on the diagram (Figure 9-8), for a prospective study, let’s say that in 2008 we identified a group of students, in 2018 we surveyed them on their smoking (the exposure), and we then followed them until 2028 to see whether those who were exposed or not exposed developed lung cancer.
• In a retrospective study begun in 2008, we would instead use existing records: a group of people who began to be observed in 1988 and were surveyed on smoking exposure in 1998. In 2008 we determine who in the exposed and non-exposed groups did or did not develop lung cancer.
The Framingham Study
• The Framingham Study of cardiovascular disease began in 1948.
• Residents who were considered eligible for the study were between 30 and 62 years of age.
• The study enrolled 5,127 men and women who were between 30 and 62 years of age and free of cardiovascular disease at the beginning of the study.
The Framingham Study (cont’d)
• Many “exposures” were defined for this study. They included smoking, obesity, elevated blood pressure, elevated cholesterol levels, low levels of physical activity, and other factors.
• New coronary events were identified by examining the study population every 2 years and by daily surveillance of hospitalizations at the only hospital in Framingham.
• The second approach for selecting a study population was used for this study.
The Framingham Study (cont’d)
• A defined population was selected on the basis of location of residence or other factors not related to exposures.
• The population of Framingham was observed over time to determine which individuals developed or already had the “exposure(s)” of interest, as well as to determine later on who developed the cardiovascular outcomes of interest.
• This approach allowed the investigators to examine the roles of multiple “exposures”, as well as the interactions among the exposures.
Potential Biases in Cohort Studies
• 1) Bias in the assessment of the outcome: According to Gordis, “If the person who decides whether disease has developed in each subject also knows whether that subject was exposed, and if that person is aware of the hypothesis being tested, that person’s judgment as to whether the disease developed may be biased by that knowledge.”
• 2) Information bias: If the method of collecting the data differs between the exposed and unexposed groups, the information obtained might be different for exposed persons than for non-exposed persons, and a significant bias can be introduced. This is likely to occur in historical cohort studies, since information is obtained from past records.
Potential Biases in Cohort Studies (cont’d)
• 3) Biases from non-response and losses to follow-up: Non-participation and non-response can introduce major biases that complicate interpretation of the study findings.
• For example, suppose you follow a group of factory workers to see whether exposure to asbestos is associated with lung disease. If several workers drop out of the study, and those who drop out are sicker and more likely to have been exposed to asbestos, then mostly healthy workers remain in the study, and it may appear that there is no association between asbestos and lung disease.
• Also, if people with the disease are lost to follow-up, incidence rates will be difficult to interpret for both groups.
• 4) Analytic bias: According to Gordis, “If the epidemiologists and/or statisticians who analyze the data have strong preconceptions, they may unintentionally include their biases into the data analysis and the interpretation of the study findings.”
• This could occur for any type of study design.
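To make the asbestos example under losses to follow-up more concrete, here is a small sketch with made-up numbers. Every count below is hypothetical and chosen only for illustration; none comes from Gordis or from a real study.

```python
def incidence_per_1000(new_cases, at_risk):
    """New cases divided by persons at risk, expressed per 1,000."""
    return 1000 * new_cases / at_risk


# Hypothetical cohort: all numbers are invented for illustration.
exposed_total, exposed_cases = 1000, 100      # workers exposed to asbestos
unexposed_total, unexposed_cases = 1000, 50   # workers not exposed

# What we would see with complete follow-up.
true_exposed = incidence_per_1000(exposed_cases, exposed_total)
true_unexposed = incidence_per_1000(unexposed_cases, unexposed_total)

# Assume 60 exposed workers who would have been counted as cases
# drop out before their disease is recorded.
lost_sick_exposed = 60
observed_exposed = incidence_per_1000(
    exposed_cases - lost_sick_exposed,
    exposed_total - lost_sick_exposed,
)

print(f"Complete follow-up: {true_exposed:.1f} vs {true_unexposed:.1f} per 1,000")
print(f"After dropout:      {observed_exposed:.1f} vs {true_unexposed:.1f} per 1,000")
```

With these invented numbers, the exposed group appears to have the lower incidence once the sick workers are lost, even though its true incidence is twice that of the non-exposed group.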
When is a Cohort Study Warranted?
• Figure 9-11 on the next slide reviews the basic steps in a cohort study.
• We begin with identifying an exposed group and an unexposed group. (Part A)
• We then ascertain the incidence in both the exposed and non-exposed groups. (Part B)
• If the exposure is associated with disease, we would expect to find a greater incidence in the exposed group than in the non-exposed group. (Part C)
Figure 9-11: The design of a cohort study. Part A starts with the exposed and non-exposed groups. In Part B, we are measuring the development of disease in both groups. Part C is the expected findings if the exposure is associated with disease.
What Can Make a Cohort Study Impractical?
• According to Gordis, “strong evidence does not exist to justify mounting a large and expensive study for in-depth investigation of the role of a specific risk factor in the etiology of a disease.”
• There are generally no appropriate past records or other sources of data to conduct a retrospective cohort study.
• Many of the diseases that are of interest today occur at very low rates. Therefore, large cohorts must be enrolled in a study to ensure that enough cases develop by the end of the study to permit valid analysis.
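The point about low disease rates can be illustrated with a rough calculation. The sketch below assumes a constant incidence rate and no losses to follow-up, and the rate, follow-up length, and target number of cases are hypothetical values chosen for illustration.

```python
import math


def expected_cases(cohort_size, rate_per_100k_per_year, years):
    """Approximate number of new cases, assuming a constant rate and
    ignoring losses to follow-up (a deliberate simplification)."""
    return cohort_size * rate_per_100k_per_year * years / 100_000


def cohort_size_needed(target_cases, rate_per_100k_per_year, years):
    """Smallest cohort expected to yield at least target_cases."""
    return math.ceil(target_cases * 100_000 / (rate_per_100k_per_year * years))


# Hypothetical disease: 5 new cases per 100,000 people per year.
size = cohort_size_needed(target_cases=100, rate_per_100k_per_year=5, years=10)
print(size)                          # people to enroll
print(expected_cases(size, 5, 10))   # expected cases in that cohort
```

Under these assumptions, a disease with 5 new cases per 100,000 people per year would need about 200,000 participants followed for 10 years to yield roughly 100 cases.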