Confounder or Effect Modifier

Excelsior College PBH 321

CONFOUNDING

Confounding is a mixing of effects of extraneous factors (confounders) with the effect of the exposure of interest. The association between exposure and disease is distorted because it is mixed with the effect of another factor that is associated with the disease. A confounder is therefore an alternate explanation for observed association between an exposure and disease. The result of confounding is to distort the true association toward or away from the null. Many epidemiologists refer to confounding as a type of bias.

Example: Who can run faster, men or women?

Exposure: gender
Outcome: speed
Hypothesis: The average running speed of men is faster than the average running speed of women.

All men and women in one town were invited to participate in a road race. On race day, both men and women come and race. The average running time for the men is faster than the women.

CONCLUSION: Men run faster than women, because of their gender.

But wait! Someone notices that women with young children did not race. In fact, women who ran the race were, on average, older than men who ran. For example, the average age of women was 50 years while the average age of men was 25 years.

CONCLUSION: Perhaps men were faster not because of their gender, but because they were younger.

Another race is held, this time making sure ages in the two groups (men and women) are comparable. That is, the men and women have the same distribution of ages. Race result: Once again, men are faster.

CONCLUSION: Controlling for age, men are still faster than women.

But wait! Someone points out that the men are, on average, taller than the women.

CONCLUSION: Perhaps men were faster not due to their gender, but because their legs are longer.

Another race is held, this time making sure both heights and ages in the two groups (men and women) are comparable. Race result: Once again, men are faster.

But wait! Someone points out that 50% of the women had hair longer than their shoulders, and only 5% of the men did!

CONCLUSION: Long hair made the women run slower? (Is this a reasonable conclusion?)

CRITERIA FOR CONFOUNDING

Let’s review the meaning of association. If a characteristic is associated with disease, then risk of disease is different among people with the characteristic compared to those without. If the characteristic is associated with exposure, then the distribution of the characteristic is different among people with the exposure compared to people without exposure (unbalanced between groups).

In general, for a characteristic to be a confounder, it must be associated with both the outcome and the exposure under study. (Think about the race example: Why would age and height be reasonable explanations, but not hair length?)

There are three major criteria that must be satisfied for a factor to be a confounder:

1) Confounder is a risk factor for the outcome, independent of exposure
2) Confounder is associated with exposure
3) Confounder is not in the causal pathway between exposure and disease.

We will now explore each of these criteria in more detail.

1. Association with the Outcome

A confounder can be a risk factor, a preventive factor, or a marker for a cause of disease. It must be associated with disease independent of exposure. We can think of this as meaning the association between the confounder and outcome is present regardless of the exposure. To figure this out, we look in the unexposed population.

Example: The road race again. Age and height are associated with speed regardless of gender. Taller people (both men and women) have greater speed. Younger people (both men and women) have greater speed.

2. Association with Exposure

We have already discussed the need for comparability between the exposed and unexposed groups in experimental and cohort studies. Recall the purpose of randomization in an experimental study, and the rules for selection of the comparison group in a cohort study: we want the groups being compared to be similar with respect to all factors other than the exposure of interest. A variable can only be a confounder if it is different between compared groups. In order to satisfy this criterion, the prevalence of the factor must differ between the exposed and the unexposed. In a cohort study, we examine this association among the entire study population (because the population is defined based on exposure status). In a case-control study, this association is only examined among the controls (because controls represent the exposure distribution in the source population).

[Diagram: the confounder is linked to both the exposure and the outcome (disease).]

Example: In a cohort study, smoking is a confounder of the effect of occupational exposures to dyes on bladder cancer. Smoking is associated with dye exposure, because workers tend to smoke more than nonworkers do.

3. Not in the Causal Pathway

A factor cannot be a confounder if it is a step in the causal chain or pathway between exposure and disease. For example, occupational dye exposure does not “cause” smoking, which then leads to bladder cancer, so we say smoking is not on the causal pathway between exposure and disease. This means that smoking may be a confounder.

An example of a non-confounder identified using this criterion is serum HDL: moderate alcohol consumption increases serum HDL levels which, in turn, decrease the risk of heart disease. HDL level is an intermediate step in this causal chain, and therefore not a confounder. Rather, it is something interesting that helps us understand the disease mechanism.

[Diagrams for criterion 3: Dye exposure, Smoking, and Bladder cancer (dye exposure does not cause smoking); Moderate alcohol → Serum HDL → Heart disease.]

EXAMPLE OF CONFOUNDED DATA

Let’s examine a hypothetical cohort study of dietary fat and heart disease. The hypothesis is that people with a high fat diet are more likely to develop heart disease.

Exposure: high fat diet
Outcome: heart disease
Confounder: gender

Our total population consists of 1000 exposed and 1000 unexposed individuals, and we calculate an overall or crude risk ratio:

                 Heart Disease   No Heart Disease   Total
High fat diet         600              400           1000
Low fat diet          300              700           1000

RR = (600/1000) / (300/1000) = 2.0
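The crude risk ratio above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `risk_ratio` and its argument names are illustrative choices, not part of the course material.

```python
def risk_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Risk ratio from the four cells of a cohort 2x2 table."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed


# Crude table from the text: high fat diet 600/400, low fat diet 300/700.
crude_rr = risk_ratio(600, 400, 300, 700)
print(f"Crude RR = {crude_rr:.1f}")  # prints "Crude RR = 2.0"
```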

Interpretation: Individuals with a high fat diet are two times as likely to develop heart disease during the study period as individuals with a low fat diet. Alternatively, we could also say those individuals are 100% more likely to develop heart disease.

You know enough about epidemiology now to understand that the risk of disease is often not the same for all individuals. Let’s see if the risk is the same for both men and for women. To do this, we separate the population by gender. We call this process stratification. Stratification allows us to examine the association within homogeneous categories (strata) of the confounding variable. In other words, stratification restricts the population to a single level of a characteristic we suspect is a confounder. We must then calculate RRs and interpret them separately for each stratum, calling them adjusted estimates.

Women                                  Men
        D+     D-                              D+     D-
E+      40    160                      E+     560    240
E-     160    640                      E-     140     60

RR = (40/200) / (160/800) = 1.0        RR = (560/800) / (140/200) = 1.0

(The two strata add up to the crude table: for example, 40 + 560 = 600 exposed individuals with heart disease.)

Interpretation: Among men, there is no greater risk of heart disease in the high fat versus low fat diet. Among women, there is no greater risk of disease in the high fat versus low fat diet. Does this make sense? The crude estimate suggested a twofold increased risk among the exposed relative to the unexposed, but the adjusted estimates suggest there is no association. This was the effect of confounding by gender: it created the appearance of an association when there was not actually one. Why might this be true? To find out, let’s walk through our three criteria for confounding:

[Diagram: Gender is linked to both Exposure and Disease.]

1) Confounder must be associated with outcome, independent of exposure. Is gender associated with heart disease, independent of dietary fat?

How do we know if a factor is related to the outcome, independent of exposure? We ask the question: even among the unexposed (low fat diet), is the factor (gender) associated with the outcome (heart disease)? To answer this question we can look at the relationship between the confounder and the disease in the unexposed population. In other words, even if you are not exposed, the factor must still be associated with disease to be a confounder.

Unexposed population: Low fat diet

          D+     D-
Male     140     60     Proportion of diseased men: 140/200
Female   160    640     Proportion of diseased women: 160/800

RR = (140/200) / (160/800) = 3.5

Note: These numbers come from the E- rows of the gender-stratified tables above. Make sure you understand where they came from.

Interpretation: Among individuals with a low fat diet, the risk of heart disease for men is 3.5 times the risk for women (250% greater). The factor is associated with disease, independent of exposure (regardless of exposure), and this criterion is met.

2) Confounder must be associated with exposure. Is gender associated with dietary fat?

To answer this question, we need to compare the relative proportion of men and women exposed. In other words, we calculate the prevalence ratio of exposure for men versus women. We can build another table from the row totals of the stratified tables and look at the RR of exposure for males to females:

          High fat diet   Low fat diet
Male           800             200         Proportion of men exposed: 800/1000
Female         200             800         Proportion of women exposed: 200/1000

RR = (800/1000) / (200/1000) = 4.0

Interpretation: Men are 4 times as likely as women to be exposed. (Alternatively, we could say that women are 0.25 times as likely, or 75% less likely, to be exposed than men.) In other words, the prevalence of a high fat diet is 4 times as high for men as for women. The criterion is satisfied. If there were equal proportions of exposed men and women, we could not say that gender is associated with exposure, because the RR produced from this table would be 1.0.

3) Confounder cannot be in the causal pathway. Exposure to a high fat diet is not likely to precede or cause gender! Therefore, gender is not on the exposure-disease pathway, and the third criterion is also met.

Using this same example, smoking is another likely confounder for this association. Walk through the criteria for confounding: Smoking 1) is a risk factor for heart disease, regardless of diet, 2) may be more common among individuals with a high fat diet, and 3) is not on the causal pathway between high fat diet and heart disease.
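The gender walk-through for the dietary fat example can also be checked numerically. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes the women's stratum equals the crude table minus the men's stratum, cell by cell (the text states that the whole population of 1000 exposed and 1000 unexposed is separated by gender).

```python
def risk(cases, noncases):
    """Proportion of a group that developed disease."""
    return cases / (cases + noncases)


def risk_ratio(table):
    """RR of high fat versus low fat diet for a table of (D+, D-) cells."""
    return risk(*table["high_fat"]) / risk(*table["low_fat"])


def prevalence_of_exposure(table):
    """Share of a group's members who eat a high fat diet."""
    exposed = sum(table["high_fat"])
    unexposed = sum(table["low_fat"])
    return exposed / (exposed + unexposed)


# Cells are (D+, D-), taken from the crude and men's tables in the text.
crude = {"high_fat": (600, 400), "low_fat": (300, 700)}
men = {"high_fat": (560, 240), "low_fat": (140, 60)}
# Assumption of this sketch: women = crude - men, cell by cell.
women = {
    diet: (crude[diet][0] - men[diet][0], crude[diet][1] - men[diet][1])
    for diet in crude
}

rr_crude = risk_ratio(crude)
rr_men = risk_ratio(men)
rr_women = risk_ratio(women)
print(f"crude {rr_crude:.1f}, men {rr_men:.1f}, women {rr_women:.1f}")

# Criterion 1: is gender associated with disease among the unexposed?
criterion_1 = risk(*men["low_fat"]) / risk(*women["low_fat"])
# Criterion 2: is gender associated with exposure (prevalence ratio)?
criterion_2 = prevalence_of_exposure(men) / prevalence_of_exposure(women)
print(f"criterion 1 RR = {criterion_1:.1f}, criterion 2 ratio = {criterion_2:.1f}")
```

A ratio of 1.0 for either criterion would mean gender fails that criterion and cannot confound the association in these data.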

IDENTIFYING CONFOUNDERS

Usually, risk factors for disease are potential confounders. These include lifestyle and behavioral factors (diet, smoking, alcohol consumption, etc.), obesity, gender, age, and genetic factors, as well as socio-demographic factors (income, education level, access to health care, etc.). Epidemiologists often rely on existing knowledge of disease to identify possible confounders before conducting a study, so that they can try to collect information on these confounders. Still, it is not possible to identify all possible confounders, or even to measure all known confounders. This results in residual confounding, because the exposure-disease association may still be distorted by either known or unknown confounders.

How do we know if a factor is a confounder in our data or not? One way is to compare the crude and adjusted measures of association, as above. If they differ appreciably, then the factor is a confounder. Confounding can cause the crude estimate to be either an overestimate or an underestimate of the true association.

Controlling for Confounders

Confounding tends to result in a biased estimate of the true exposure-outcome association. If we collect information on the confounder then we can usually remove its effect. Because confounders may bias our association and the interpretation of study results, we have to control for them. Confounding factors are nuisance variables because they get in the way of the relationship you want to study, and we want to remove their effect to observe the true association. We have done this already in Module 2 with age-standardization.

There are two major strategies to control for confounding: in the design phase or in the analysis phase of a study. To control for confounding in either phase you must have information on the variables that are potential confounders. This is true for all methods of controlling confounding except randomization.

Controlling for confounders during the design phase

Methods to control for confounding at the design phase of a study include randomization, restriction, and matching. These decisions are made before the study is conducted, in order to effectively control for these nuisance factors.

• Randomization: In Module 3 you learned about randomization in the context of experimental studies. With a large enough sample size, randomization is likely to control for both known and unknown confounders.

• Restriction: The investigator can restrict the inclusion criteria for study subjects and limit entrance to individuals who fall within a specified category of the confounder. In the road race example, you could restrict the race to people in a certain age range (say, 25-30) or to people in a given height range. By restricting the race, there cannot be major differences in age or height between exposed and unexposed groups. (Remember, a variable can only be a confounder if it is different between compared groups.) The advantage of restriction is that it is straightforward, convenient, and less expensive than collecting information for a larger group. However, restricting limits our ability to generalize study findings to other populations. For example, if we only allowed individuals aged 25-30 to race, we cannot make effective guesses about how 50-60 year olds (or some other age group) might have performed if they had raced.

• Matching: Here the investigator selects study subjects so that the potential confounders are distributed in an identical manner among the exposed and unexposed groups (cohort study) or among the cases and controls (case-control study). An exposed subject of specific characteristics is matched to an unexposed subject who shares those same characteristics.

Example: Matching in a cohort study of exercise and heart attack. Exposed group: exercisers. Unexposed group: non-exercisers. Confounders to be matched are age, gender, and smoking status, since we know these are strong risk factors for heart attack and are also likely associated with exposure (people who exercise tend to be nonsmokers). The matching on age doesn’t have to be exact, but should be close (plus or minus a couple of years).

Exposed subject                              Unexposed subject
45 year old female, nonsmoker, exerciser     45 year old female, nonsmoker, non-exerciser
50 year old male, nonsmoker, exerciser       50 year old male, nonsmoker, non-exerciser
41 year old female, smoker, exerciser        40 year old female, smoker, non-exerciser
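One way to picture the matching step is as a simple search through the pool of unexposed subjects. The sketch below is a hypothetical illustration, not a procedure described in the course: the `Subject` class, the `match_pairs` function, the greedy one-to-one strategy, and the two-year age tolerance are all assumptions, and the fourth non-exerciser is an invented extra who matches nobody.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    age: int
    sex: str
    smoker: bool


def match_pairs(exposed, unexposed, age_tolerance=2):
    """Greedy 1:1 matching on sex, smoking status, and age within a tolerance."""
    available = list(unexposed)
    pairs = []
    for person in exposed:
        candidates = [
            other for other in available
            if other.sex == person.sex
            and other.smoker == person.smoker
            and abs(other.age - person.age) <= age_tolerance
        ]
        if candidates:
            # Take the closest age, then remove that subject from the pool.
            best = min(candidates, key=lambda other: abs(other.age - person.age))
            pairs.append((person, best))
            available.remove(best)
    return pairs


exercisers = [
    Subject(45, "female", False),
    Subject(50, "male", False),
    Subject(41, "female", True),
]
non_exercisers = [
    Subject(40, "female", True),
    Subject(45, "female", False),
    Subject(50, "male", False),
    Subject(62, "male", True),  # hypothetical extra subject with no match
]

pairs = match_pairs(exercisers, non_exercisers)
for exposed_subject, unexposed_subject in pairs:
    print(exposed_subject, "<->", unexposed_subject)
```

Exposed subjects with no eligible partner are simply left unmatched here; a real study would need an explicit rule for them.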

Controlling for Confounders in the Analysis

In addition to these methods to control for confounding in the design of a study, we can also control for confounding using analysis methods. We have demonstrated stratification in many of the tables above. Because the RRs are calculated for each stratum of a factor separately, we can say we control for confounding by the factor via stratification.

Example: Case control study of oral contraceptive use and risk of heart attack. Age is a confounder.

                   Case   Control
OC use     Yes       39        24
           No       114       154

Crude OR = 2.2

Stratified analysis

                   Age < 40 years        Age ≥ 40 years
                   Case   Control        Case   Control
OC use     Yes       21        17          18         7
           No        26        59          88        95

                   OR = 2.8              OR = 2.8

Note each stratum is like a restricted analysis, because there is a narrow range of the confounder. The stratum-specific ORs (2.8) differ from the crude OR (2.2) by about 25%. This difference indicates that there is confounding by age. There are some limitations of stratification. It is difficult to control for many variables simultaneously because a large number of strata will be generated relative to the number of study subjects. There are also other methods for controlling for confounding in the analysis, such as multivariate regression and other types of statistical approaches.
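The odds ratios in the oral contraceptive example can be recomputed from the table cells with the usual cross-product formula. This is a minimal sketch; the function name and the choice to express the difference relative to the crude OR are assumptions made for illustration.

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Cross-product odds ratio for a case-control 2x2 table."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


crude_or = odds_ratio(39, 24, 114, 154)
or_under_40 = odds_ratio(21, 17, 26, 59)
or_40_plus = odds_ratio(18, 7, 88, 95)

print(f"Crude OR = {crude_or:.1f}")       # prints "Crude OR = 2.2"
print(f"Age < 40 OR = {or_under_40:.1f}")  # prints "Age < 40 OR = 2.8"
print(f"Age >= 40 OR = {or_40_plus:.1f}")  # prints "Age >= 40 OR = 2.8"

# Relative difference between a stratum-specific OR and the crude OR.
relative_difference = (or_under_40 - crude_or) / crude_or
print(f"Relative difference = {relative_difference:.0%}")
```

Working from the unrounded ORs gives a difference somewhat above one quarter, in line with the rough figure quoted in the text.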

Summary of Confounding

• Is a mixing of effects between an exposure, an outcome, and a third variable known as a confounder
• Considered a nuisance which must be eliminated
• Studies may have a small, moderate or large degree of confounding
• Confounding can either exaggerate or minimize the true association
• Epidemiologists have developed methods to control confounding in the design and analysis of studies

EFFECT MODIFICATION

Let’s say we are evaluating a potential confounder and conduct a stratified analysis, as above. If the stratum-specific estimates differ from the crude, we say that confounding was present. But what if the stratum-specific estimates are appreciably different from each other? This is called effect modification. Effect modification occurs when the association between exposure and disease varies by levels of a third variable. In other words, the presence, absence, or level of one risk factor influences the association between some other risk factor and the disease. The diagram shows the relationship between exposure and disease for a factor with only 2 strata (such as gender).

Example: In a cohort study, it appears that gender influences the association between dietary fat and heart disease (the association between dietary fat and heart disease is different for women than for men). Generally, effect modification reflects biological interactions. In this example, estrogen or other sex hormones that differ between genders may interact with dietary fat and alter its effect on the heart.

[Diagram: an effect modifier acts on the association between Exposure and Outcome (Disease); in the example, Gender modifies the association between Dietary fat and Heart disease.]

A stratified analysis of dietary fat and heart disease shows the effect modification. Notice that the stratum-specific RRs differ from the crude (demonstrating confounding) as well as from each other (demonstrating effect modification).

          Crude              Stratum-specific
                             Men                Women
          HD+    HD-         HD+    HD-         HD+    HD-
E+        600    300         200     50         400    250
E-        400    700         200    550         200    150

          RR = 1.8           RR = 3.0           RR = 1.1
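The three risk ratios for the effect modification example can be recomputed as follows. This is a minimal sketch that reads the first row of each table as exposed (E+) and the second as unexposed (E-); the function and variable names are illustrative.

```python
def risk_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Risk ratio from the four cells of a cohort 2x2 table."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed


# Cells in the order E+/HD+, E+/HD-, E-/HD+, E-/HD-.
tables = {
    "crude": (600, 300, 400, 700),
    "men": (200, 50, 200, 550),
    "women": (400, 250, 200, 150),
}

rrs = {name: risk_ratio(*cells) for name, cells in tables.items()}
for name, value in rrs.items():
    print(f"{name}: RR = {value:.1f}")
# prints:
# crude: RR = 1.8
# men: RR = 3.0
# women: RR = 1.1
```

The stratum-specific values sit on either side of the crude value and differ markedly from each other, which is the pattern the text describes as effect modification.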

Effect modification can improve our understanding about the association between exposure and disease. For example, if men and women respond differently to dietary fat, we could use this information to target certain preventive dietary measures for men with respect to heart disease.

Another example: The relation between body mass index (a measure of obesity) and breast cancer varies according to menopausal status. Among pre-menopausal women, higher BMI decreases risk. Among post-menopausal women, higher BMI increases (or does not affect) risk.

Let’s reflect on the difference between effect modification and confounding. We want to describe and understand effect modification, since it helps improve our understanding of disease mechanisms and possible differences in preventive treatment between groups. Confounding, on the other hand, is a nuisance and we need to control for it or eliminate it since it biases our measure of association. Since we use stratification to observe both effect modification and confounding, once we identify effect modification by a factor we can stop our evaluation of confounding by that same factor (since stratification will deal with confounding anyway).
