Managerial Epidemiology: Week 6

Chapter 14

Molecular and Genetic

Epidemiology

Learning Objectives

• Differentiate between molecular and

genetic epidemiology

• Describe principles of inheritance and

sources of genetic variation

• Define epidemiologic approaches for

the identification of genetic

components to disease

Peeking into the “Black Box”

• Many risk factors can be quantified

through questionnaires, records, and

easily measured attributes (such as

blood pressure and anthropometrics).

• The biological mechanism(s) through

which these factors influence disease

is not always apparent (i.e., a “black

box”).

Value of Mechanistic Insight

• Biologic plausibility is a criterion for

causality.

• Linking lifestyle risk factors with

measures of biologic effect

strengthens interpretations of

causality.

• This linkage, in turn, provides

stronger support for interventions.

Why Distinguish Between

Molecular and Genetic

Epidemiology?

• The basic tenets and principles of

molecular and genetic epidemiology are

the same.

• However, there are specific features

regarding design, analysis and

interpretation inherent in the latter.

Definition of Genetic

Epidemiology

• A discipline that seeks to unravel

the role of genetic factors and their

interactions with environmental

factors in the etiology of diseases,

using family and population study

approaches.

Key Aspects of This Definition

• Inherited susceptibility does not mean

inherited disease: environment

matters!

• When families are studied, the

observations (study subjects) are no

longer independent.

• This dependence requires special

considerations for the analysis of

data.

Genetic Epidemiology is a

Method to Answer:

• Does a disease cluster in families?

• If so, is that clustering likely a result of shared

non-genetic risk factors?

• If the clustering is not accounted for by shared

lifestyle or common environment, is the pattern

of disease consistent with inherited effects?

• If so, where is the putative gene?

What Diseases or Risk Factors

Cluster in Families?

• Heart disease

• Various cancers

• Alcoholism

• Others

Epidemiologic Assessment of

Clustering

• Case-control study

• Comparison of the frequency of a

positive family history

• Expectation under genetic influence
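
A minimal sketch of the comparison above, assuming an unmatched case-control study summarized as a 2x2 table. The counts are hypothetical, and the Woolf (log-based) 95% confidence interval is one common choice rather than something the slides prescribe. Under a genetic influence one would expect an odds ratio above 1.

```python
import math

def family_history_odds_ratio(cases_fh_pos, cases_fh_neg,
                              controls_fh_pos, controls_fh_neg):
    """Odds ratio for a positive family history (cases vs. controls)
    with a Woolf 95% confidence interval."""
    odds_ratio = (cases_fh_pos * controls_fh_neg) / (cases_fh_neg * controls_fh_pos)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / cases_fh_pos + 1 / cases_fh_neg
                          + 1 / controls_fh_pos + 1 / controls_fh_neg)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 60 of 200 cases and 30 of 200 controls report a family history.
or_estimate, ci_lower, ci_upper = family_history_odds_ratio(60, 140, 30, 170)
print(f"OR = {or_estimate:.2f}, 95% CI {ci_lower:.2f} to {ci_upper:.2f}")
```

An interval that excludes 1 suggests familial clustering, but it does not by itself separate genetic from shared non-genetic causes.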

Clustering of “Non-Genetic”

Exposures in Families

• Employment (e.g., several family

members with medical degrees)

• Radon from soil

• Religious preferences

• Lead in paint

• Others?

Major Point of This Section

• You cannot tell easily whether

clustering of a risk factor or disease

within a family is due to genetics,

culture, or shared environment

(including social or political factors).

• Clustering within a family will also

occur simply due to bad luck!

Other Correlates of Family

History

• Large family size

• Age of relatives (for an age-related

disease)

• Gender distribution (consider

testicular cancer, prostate disease,

ovarian cysts)
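
To illustrate why family size matters, the sketch below computes the chance of at least one affected relative purely by chance. It assumes, as a simplification not stated in the slides, that relatives are independent and share the same lifetime risk; the 10% risk is hypothetical.

```python
def prob_positive_family_history(lifetime_risk, n_relatives):
    """P(at least one affected relative), assuming independent relatives
    who all have the same lifetime risk."""
    return 1 - (1 - lifetime_risk) ** n_relatives

# Hypothetical lifetime risk of 10% for a common disease.
for n_relatives in (2, 5, 10):
    p = prob_positive_family_history(0.10, n_relatives)
    print(f"{n_relatives} relatives: P(positive family history) = {p:.3f}")
```

With no genetic effect at all, a larger family is more likely to yield a positive family history, which is why cases and controls should be compared on family size.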

Analysis Approach

• Model Y (case/control status) =

established risk factors.

• Add family history variable to denote

“genetic” influence (i.e., share genes

with an individual who has the

outcome of interest).
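
The slides describe a regression model. As a simpler stand-in for the same idea (the family history effect after accounting for an established risk factor), the sketch below uses a Mantel-Haenszel summary odds ratio across strata of that risk factor. The strata and counts are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio.

    Each stratum is a tuple (a, b, c, d):
      a = cases with a family history,    b = cases without,
      c = controls with a family history, d = controls without.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

# Hypothetical data stratified by one established risk factor (smoking).
strata = [
    (40, 60, 20, 80),  # smokers
    (20, 80, 10, 90),  # non-smokers
]
adjusted_or = mantel_haenszel_or(strata)
print(f"Family history OR adjusted for smoking = {adjusted_or:.2f}")
```

A logistic regression with a family history indicator plus the established risk factors would serve the same purpose and scales better to several covariates.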

Analysis Issues

• Try to compare (and control if necessary)

differences between cases and controls

with regard to size of family.

• Not easy to adjust for age of family

members or their risk factors.

• What types of data can you ask your

cases and controls to provide about their

relatives?

Motivation for Case-Control

Family Studies

• To rule out influence of shared

environment, family size differences, and

age on differences in the frequency of

family history between cases and

controls

• Need to enumerate the relatives of cases

and controls, and determine the disease

status and risk factor profile for each

relative

Conduct of Family Studies

• Ascertain “probands” (index cases).

• Define family (siblings? children?

parents? grandparents?)

• Invite family members to participate

• Collect data (and, typically, biological

samples)

How to Select Control Families

• Must decide how to identify controls

– From spouse’s side of proband’s family?

– Or select a random sample from the

population?

• Will controls be motivated to

participate?

• Must take HIPAA rules into account

Analysis Issues

• Exclude the index cases and controls

• Model disease (or behavior) of

interest based on age, sex, known

risk factors

• Evaluate evidence for genetic effect

through statistical significance of

variable(s) that indicate “relationship

to index case”

Analysis Issues (cont’d)

• Simplest “genetic” variable (1 if

relative of case, 0 if relative of control)

• Can also construct indicator variables

to designate type of relative (parent,

sibling, more distant relative)

• If not significant after including other

risk factors, then no evidence for

genetic influence

Evidence of Genetic Influence,

so far….

• Cases are more likely to have a family

history of disease than controls.

• The excess risk to relatives is not

accounted for by age, sex, and other risk

factors.

• What does that tell us about the

underlying genetic influence? (nothing)

Other Approaches to Identify

Genetic Influences

• Twin studies

• Segregation analysis

• Linkage analysis

Twin Studies

• A “natural experiment” of sorts

• Monozygotic (MZ) twins are genetically

identical.

• Dizygotic (DZ) twins share, on average,

the same proportion of genes as ordinary siblings (about 50%).

• Greater concordance (for dichotomous

traits) or correlation (for continuous traits)

for MZ than DZ twins is evidence of a

genetic influence.
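
The MZ versus DZ comparison can be sketched as follows. Pairwise concordance is one of several concordance definitions, and Falconer's formula is a common textbook approximation that the slides do not mention; all counts and correlations are hypothetical.

```python
def pairwise_concordance(both_affected, one_affected):
    """Among twin pairs with at least one affected twin,
    the share of pairs in which both twins are affected."""
    return both_affected / (both_affected + one_affected)

def falconer_heritability(r_mz, r_dz):
    """Falconer's approximation for a continuous trait:
    heritability = 2 * (r_MZ - r_DZ)."""
    return 2 * (r_mz - r_dz)

# Hypothetical dichotomous trait: counts of concordant and discordant pairs.
mz_concordance = pairwise_concordance(both_affected=30, one_affected=70)
dz_concordance = pairwise_concordance(both_affected=10, one_affected=90)
print(f"MZ concordance = {mz_concordance:.2f}, DZ concordance = {dz_concordance:.2f}")

# Hypothetical continuous trait: within-pair correlations.
h2 = falconer_heritability(r_mz=0.8, r_dz=0.5)
print(f"Falconer heritability estimate = {h2:.2f}")
```

Higher concordance in MZ than in DZ pairs points toward a genetic influence, on the assumption that both kinds of twins share their environment to a similar degree.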

Linkage Analysis

• One way to distinguish cultural inheritance

from genetic inheritance is to track a

region of our DNA that is transmitted from

parents to offspring in the same manner

as the disease/outcome of interest.

• This procedure works well for diseases

that follow simple rules of inheritance

(e.g., autosomal dominant or recessive).
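
Evidence for linkage is conventionally summarized with a LOD score, which the slides do not name. The sketch below assumes the simplest textbook case: phase-known, fully informative meioses in which recombinants between marker and disease locus can be counted directly. The counts are hypothetical.

```python
import math

def lod_score(recombinants, meioses, theta):
    """log10 of the likelihood ratio comparing linkage at recombination
    fraction theta against no linkage (theta = 0.5)."""
    if not 0 < theta < 0.5:
        raise ValueError("theta must lie strictly between 0 and 0.5")
    non_recombinants = meioses - recombinants
    log_lik_linked = (recombinants * math.log10(theta)
                      + non_recombinants * math.log10(1 - theta))
    log_lik_unlinked = meioses * math.log10(0.5)
    return log_lik_linked - log_lik_unlinked

# Hypothetical pedigree data: 2 recombinants observed in 20 informative meioses.
lod = lod_score(recombinants=2, meioses=20, theta=0.1)
print(f"LOD at theta = 0.1: {lod:.2f}")
```

A LOD score of 3 or more is the conventional threshold for declaring linkage for a simple Mendelian trait.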

Segregation Analysis

• Historically, linkage analysis required

knowledge of the mode of

transmission of the putative gene

[dominant versus recessive, allele

frequency, lifetime or age-specific risk

(penetrance)].

• Segregation analysis has been used

to estimate these parameters.

Genetic Epidemiology of

Complex Diseases

• “Complex diseases” are ones for which

the genetic influence may be modest and

environmental factors contribute to

disease risk.

• Segregation analysis is not typically done

for “complex diseases.”

• Modern approaches do not assume a model of

inheritance (non-parametric methods).

Use of Epidemiology to

Understand Genetic Variation

• The methods of genetic epidemiology

have been applied historically to

identify genes.

• Typically, epidemiologists are not

interested in mapping genes, but

rather in figuring out how genes

interact with environment to influence

disease risk and outcome.

Molecular Epidemiology

• Related individuals are not necessarily required

for studies of the association of genetic

variation with risk of disease.

• Both cohort and case-control designs can be

used.

• Because the germline DNA sequence is

unchanged since conception, one can readily

employ retrospective designs.

Common Strategies for Genetic

Marker Selection

• Genome-wide approach with anonymous

DNA markers (1,000,000 SNPs on a chip)

• SNPs or simple tandem repeat markers in

“candidate” genes based on a priori

knowledge about presumed function

• SNPs in candidate genes with known

functional effect on level or activity of

protein product

Primer on Single Nucleotide

Polymorphisms (SNPs)

• Because of our redundant genetic code,

some SNPs will not alter the encoded

amino acid (e.g., GGA, GGG, GGT and

GGC all encode glycine).

• SNPs that change an amino acid may not

necessarily lead to change in function of

the encoded protein.
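
The redundancy of the genetic code can be checked with a small lookup. The table below is a deliberately partial excerpt of the standard codon table (DNA coding-strand notation), enough to classify the examples shown; `classify_snp` is a hypothetical helper name.

```python
# Partial excerpt of the standard genetic code (coding-strand DNA codons).
CODON_TABLE = {
    "GGA": "Gly", "GGG": "Gly", "GGT": "Gly", "GGC": "Gly",
    "CCA": "Pro", "CCG": "Pro", "CCT": "Pro", "CCC": "Pro",
    "GAA": "Glu", "GAG": "Glu",
    "GTA": "Val", "GTG": "Val", "GTT": "Val", "GTC": "Val",
}

def classify_snp(ref_codon, alt_codon):
    """Label a single-base change as synonymous or nonsynonymous."""
    differences = sum(1 for r, a in zip(ref_codon, alt_codon) if r != a)
    if len(ref_codon) != 3 or len(alt_codon) != 3 or differences != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    return f"nonsynonymous ({ref_aa} -> {alt_aa})"

print(classify_snp("GGA", "GGC"))  # third-position change within the glycine codons
print(classify_snp("GAG", "GTG"))  # the glutamate-to-valine change seen in sickle cell disease
```

Whether a nonsynonymous change alters protein function still has to be established experimentally; the classification only describes the amino acid sequence.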

More on SNPs

• SNPs that don’t change an amino acid

may still lead to alternate splicing of the

transcript (and therefore be functionally

important).

• SNPs in a promoter region may influence the

level of protein product rather than its activity (and

therefore be biologically significant).

• SNPs in non-coding regions may still have

functional effect.

Caveats About SNP Studies

• If you’re interested in gene x environment

interactions–best to focus on SNPs with known

functional effect.

• Human biology is complex: are alterations in

one component of a pathway compensated for

by another?

• Most SNPs are likely to be modest risk factors–

requiring large sample sizes to determine

statistically significant association.
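
To make the sample size point concrete, the sketch below applies the standard two-proportion formula to an unmatched case-control study with equal numbers of cases and controls. It treats exposure as carrying the variant; the 20% carrier frequency among controls and the odds ratios are hypothetical, and the Bonferroni threshold of 0.05 / 1,000,000 is an assumption chosen to match a one-million-SNP chip.

```python
import math
from statistics import NormalDist

def cases_needed(control_carrier_freq, odds_ratio, alpha=0.05, power=0.80):
    """Approximate number of cases (with an equal number of controls)
    needed to detect the given odds ratio with a two-sided test."""
    p0 = control_carrier_freq
    # Carrier frequency among cases implied by the odds ratio.
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p0 * (1 - p0))) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)

genome_wide_alpha = 0.05 / 1_000_000  # Bonferroni correction, assumed here
for odds_ratio in (2.0, 1.5, 1.2):
    single_test = cases_needed(0.20, odds_ratio)
    genome_wide = cases_needed(0.20, odds_ratio, alpha=genome_wide_alpha)
    print(f"OR {odds_ratio}: {single_test} cases (single test), "
          f"{genome_wide} cases (genome-wide threshold)")
```

The required number of cases grows quickly as the odds ratio approaches 1 and again when the significance threshold is tightened for multiple testing.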

Realistic Expectations

• Almost every gene product is modified after translation

(e.g., glycosylation, acetylation,

methylation).

• Thus, the correlation between DNA sequence

and protein is far from perfect.

• Most GWAS “hits” are in “gene deserts.”

• May be necessary to examine multiple SNPs

within a gene and several genes within a

pathway.

Molecular Epidemiology –

Beyond Genetics

• Biomarkers of exposure and disease

extend beyond DNA.

• Viral or bacterial load

• Morphometric analysis of tissues/cells

• Hormone or lipid levels in blood or

urine

• Other examples?

Conclusion

• Molecular and genetic epidemiology represent

specialty areas of expertise.

• These specialty areas utilize and apply

advances in molecular biology and molecular

genetics of disease to:

– Unravel disease etiology.

– Enable novel approaches for early detection.

– Inform more effective interventions by targeting

those at greatest risk.