Managerial Epidemiology
Chapter 5
Sources of Data for Use in Epidemiology
Learning Objectives
Discuss criteria for assessing the quality and utility of epidemiologic data
Indicate privacy and confidentiality issues that pertain to epidemiologic data
Discuss the uses, strengths, and weaknesses of various epidemiologic data sources
Criteria for the Quality and Utility of Epidemiologic Data
Nature of the data
Availability of the data
Completeness of population coverage
Representativeness
Generalizability (external validity)
Thoroughness
Strengths versus limitations
Nature of the Data
Refers to the source of data, e.g., vital statistics, case registries, physicians’ records, surveys of the general population, or hospital and clinic cases.
Will affect the types of statistical analyses and inferences that are possible.
Availability of the Data
Refers to investigator’s access to data.
For example, medical records and other data with personal identifiers may not be used without patients’ consent.
Completeness of Population Coverage
Representativeness—the degree to which a sample resembles a parent population.
Generalizability (external validity)— ability to apply findings to a population that did not participate in the study.
Thoroughness—the care taken to identify all cases of a given disease.
Strengths versus Limitations
The utility of the data for various types of epidemiologic research.
Factors inherent in the data may limit their usefulness.
Incomplete diagnostic information.
Case duplication.
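Case duplication can be illustrated with a minimal sketch. It assumes each case report is a dictionary with hypothetical fields `name`, `birth_date`, and `diagnosis`, and treats two reports as the same case when those fields match after simple normalization. Real registries typically use more careful matching rules.

```python
def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(value.lower().split())


def deduplicate_cases(records):
    """Keep the first report seen for each (name, birth date, diagnosis) key."""
    seen = set()
    unique = []
    for rec in records:
        key = (
            normalize(rec["name"]),
            rec["birth_date"],
            normalize(rec["diagnosis"]),
        )
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical reports: the first two describe the same patient and diagnosis.
reports = [
    {"name": "Ann Lee", "birth_date": "1950-02-01", "diagnosis": "Breast cancer"},
    {"name": "ann  lee", "birth_date": "1950-02-01", "diagnosis": "breast cancer"},
    {"name": "Bo Chen", "birth_date": "1962-07-15", "diagnosis": "Melanoma"},
]
unique_reports = deduplicate_cases(reports)
print(len(reports), "reports,", len(unique_reports), "unique cases")
```

Counting the raw reports instead of the unique cases would overstate the number of cases in the numerator of any rate.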
Online Sources of Epidemiologic Data
Online bibliographic databases include MEDLINE, TOXLINE, and commercial databases.
National Library of Medicine’s PubMed®
MEDLINE is the main part of PubMed®
Premier source of health-related literature
TOXLINE—keyed to toxicology and includes information on drugs and chemicals
Selected Internet Addresses
American Public Health Association—http://www.apha.org
Centers for Disease Control and Prevention—http://www.cdc.gov
PubMed®—http://www.ncbi.nlm.nih.gov/sites/entrez
Confidentiality
Privacy Act of 1974
Prohibits the release of confidential data without the consent of the individual
Freedom of Information Act
Mandates the release of government information to the public, except for personal and medical files
The Public Health Service Act
Protects confidentiality of information collected by some federal agencies, e.g., NCHS
The HIPAA Privacy Rule
Refers to the Health Insurance Portability and Accountability Act of 1996
Sections of HIPAA “…require the Secretary of HHS to publicize standards for the electronic exchange, privacy and security of health information…”
Categories of protected health information pertain to individually identifiable data regarding:
The individual’s physical and mental health
Provision of health care to the individual
Payment for provision of health care
Data Sharing
Refers to the voluntary release of information by one investigator or institution to another for the purpose of scientific research.
Can enhance data quality and increase knowledge from research.
Key issue is the primary investigator’s potential loss of control over information.
Record Linkage
Joining data from two or more sources, e.g., employment records and mortality data.
Applications include genetic research, planning of health services, and chronic disease tracking.
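A minimal sketch of record linkage, using the employment and mortality example above. It assumes both sources share an exact identifier (the hypothetical field `person_id`); when no common identifier exists, linkage usually relies on approximate matching of names and dates, which this sketch does not attempt.

```python
def link_records(employment, deaths, key="person_id"):
    """Attach mortality information to each employment record with a matching key."""
    deaths_by_id = {d[key]: d for d in deaths}
    linked = []
    for worker in employment:
        death = deaths_by_id.get(worker[key])
        linked.append({
            **worker,
            "deceased": death is not None,
            "cause_of_death": death["cause"] if death else None,
        })
    return linked


# Hypothetical records from two separate sources.
employment = [
    {"person_id": "A1", "job": "welder"},
    {"person_id": "B2", "job": "clerk"},
]
deaths = [{"person_id": "A1", "cause": "lung cancer"}]

linked = link_records(employment, deaths)
for row in linked:
    print(row)
```

The linked file supports questions that neither source answers alone, such as whether mortality differs by occupation.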
Statistics Derived from the Vital Registration System
Mortality statistics
Birth statistics: certificates of birth and fetal death.
Mortality Statistics
Mortality data are nearly complete, because deaths in the U.S. and other developed countries rarely go unreported.
Death certificates include demographic information about the deceased and cause of death (immediate cause and contributing factors).
Limitations of Mortality Data
Certification of cause of death.
For example, in an elderly person with chronic illness, exact cause of death may be unclear.
Lack of standardization of diagnostic criteria.
Stigma associated with certain diseases, e.g., AIDS, may lead to inaccurate reporting.
Limitations of Mortality Data (cont’d)
Errors in coding by nosologist
Changes in coding
Revisions of the International Classification of Diseases (ICD).
Sudden increases or decreases in a particular cause of death may be due to changes in coding.
Birth Statistics: Certificates of Birth and of Fetal Death
Birth certificate includes information that may affect the neonate, such as congenital malformations, birth weight, and length of gestation.
Sources of unreliability:
Mothers’ recall of events during pregnancy may be inaccurate.
Conditions that affect neonate may not be present at birth.
Birth Statistics (cont’d)
Varying state requirements for fetal death certificates.
Both types of certificates have been used in studies of environmental influences upon congenital malformations.
Both provide nearly complete data.
Reportable Disease Statistics
Federal and state statutes require health care providers to report those cases of diseases classified as reportable and notifiable.
Include infectious and communicable diseases that endanger a population, e.g., STDs, measles, foodborne illness.
Limitations of Reportable Disease Statistics
Possible incompleteness of population coverage.
For example, asymptomatic persons would not seek treatment.
Failure of physician to fill out required forms.
Unwillingness to report cases that carry a social stigma.
Screening Surveys
Conducted on an ad hoc basis to identify individuals who may have infectious or chronic diseases. Examples: breast cancer screenings, health fairs.
Clientele are highly selected.
Individuals who participate are concerned about the particular health issue.
Multiphasic Screening
Administration of 2 or more screening tests during a single screening program
Ongoing screening programs often are carried out at worksites.
Potential biases from worker attrition
Data can be useful for research on occupational health problems.
Data may not contain etiologic information.
Disease Registries
Registry--a centralized database for collection of data about a disease
Coding algorithms are used to maintain patient confidentiality.
Applications of registries:
Patient tracking
Identification of trends in rates of disease
Case-control studies
Example: SEER program
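The slide does not say which coding algorithms registries use to maintain confidentiality. As one possible illustration only, the sketch below replaces a patient identifier with a keyed hash (HMAC-SHA256 from the Python standard library). The key, the identifier format, and the function name `pseudonymize` are assumptions made for this example.

```python
import hashlib
import hmac

# Hypothetical secret; a real registry would store and manage its key securely.
SECRET_KEY = b"registry-secret-key"


def pseudonymize(patient_identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable code that stands in for the patient identifier."""
    normalized = patient_identifier.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# The same patient always maps to the same code; different patients do not.
code_a = pseudonymize("Ann Lee|1950-02-01")
code_b = pseudonymize(" ann lee|1950-02-01 ")
code_c = pseudonymize("Bo Chen|1962-07-15")
print(code_a == code_b, code_a == code_c)
```

Because the code is stable, a registry can still track a patient over time and detect duplicate reports without storing the name alongside the clinical data.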
Surveillance, Epidemiology, and End Results (SEER) Program
Conducted by the National Cancer Institute (NCI)
Collects cancer data from different cancer registries across the U.S.
Provides information about trends in cancer incidence, mortality, and survival
Morbidity Surveys of the General Population
Morbidity surveys collect data on the health status of a population group.
Obtain more comprehensive information than would be available from routinely collected data
Example: National Health Interview Survey
National Health Survey
Authorized under the National Health Survey Act of 1956 to obtain information about the health of the U.S. population.
Refers generically to a group of surveys and not a single survey.
In response to the Act, the National Center for Health Statistics (NCHS) conducts three separate and distinct programs.
NCHS Survey Programs
National Health Interview Survey (NHIS)
Health Examination Survey (HES)
Various surveys of health resources
National Hospital Discharge Survey
National Ambulatory Medical Care Survey
National Health Interview Survey (NHIS)
General household health survey of the U.S. civilian noninstitutionalized population
Studies a comprehensive range of conditions such as diseases, injuries, disabilities, and impairments
Health Examination Survey (HES)
Provides direct information about morbidity through examinations, measurements, and clinical tests
Identifies conditions previously unreported or undiagnosed
Provides information not previously available for a defined population
Now known as the National Health and Nutrition Examination Survey (NHANES)
Behavioral Risk Factor Surveillance System (BRFSS)
Collects data on behaviorally related phenomena
Behavioral risks for chronic diseases
Preventive activities
Healthcare utilization
The largest telephone survey in the world
California Health Interview Survey (CHIS)
Provides information on the health and demographic characteristics of California residents
Uses telephone survey methods
Topics include
Physical and mental health
Health behaviors
Health insurance coverage and utilization
Conducted on a continuing basis
Insurance Data
Sources include:
Social Security--provides data on disability benefits and Medicare.
Health insurance--provides data on those who receive care through a prepaid medical program.
Life insurance--provides information on causes of mortality; also provides results of physical examinations.
Limitations of Insurance Data
Data may not be representative of entire population, as the uninsured are excluded.
Clinical Data Sources
Hospital data
Diseases treated in special clinics and hospitals
Data from physicians’ practices
Hospital Data
Consists of both inpatient and outpatient data
Deficiencies of data:
Not representative of any specific population
Different information collected on each patient
Settings may differ according to social class of patients; e.g., specialized clinics, emergency rooms
Diseases Treated in Special Clinics and Hospitals
Data cannot be generalized because patients are a highly selected group.
Case-control studies can be done with unusual and rare diseases.
However, it is not possible to determine incidence and prevalence rates without knowing the size of the denominator.
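The denominator problem can be shown with a short worked sketch. The case count and population size below are invented for illustration; the point is that the number of cases seen at a clinic becomes a rate only once the size of the population at risk is known.

```python
def rate_per_100k(cases: int, population_at_risk: int) -> float:
    """Cases per 100,000 persons; undefined without a known, positive denominator."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be known and positive")
    return cases * 100_000 / population_at_risk


# Hypothetical numbers: 45 new cases seen at a specialty clinic in one year.
clinic_cases = 45

# If the clinic served a defined population of 300,000, a rate could be computed.
assumed_population = 300_000
print(rate_per_100k(clinic_cases, assumed_population))  # 15.0
```

A specialty clinic usually draws patients from an undefined catchment area, so no value for `assumed_population` can be justified and the case count stays a numerator without a rate.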
Data from Physicians’ Practices
Limited application due to:
Confidentiality of patient data
Highly selected group of patients
Lack of standardization of information collected
Useful for the purposes of:
Verification of self-reports
Source of exposure data
Absenteeism Data
Records of absenteeism from work or school
Possible deficiencies:
Data omit people who neither work nor attend school.
Not all people who are ill take time off.
Those absent are not necessarily ill.
Useful for the study of rapidly spreading conditions
School Health Programs
Provide information about immunizations, physical exams, and self-reports of illness
Have been used in studies of intelligence, mental retardation, and disease etiology
Paffenbarger et al. used information from the health records of college students to track causes of chronic diseases.
Morbidity Data from the Armed Forces
Reports from physicals, hospitalizations, and selective service examinations
Data have been used for:
Studies of disease etiology.
Studies of twins who served in the Korean War or WWII to determine the influence of “nature and nurture” on the cause of disease.
Studies investigating genetic factors in obesity
Other Data Sources Relevant to Epidemiologic Studies
U.S. Bureau of the Census publications:
Statistical Abstract of the United States
County and City Data Book
Decennial Censuses of Population and Housing
Historical Statistics of the United States, Colonial Times to 1970
U.S. Bureau of the Census
Provides information on the general, social, and economic characteristics of the U.S. population
U.S. Census is administered every 10 years.
Attempts to account for every person and his or her residence
Characterizes population according to sex, age, family relationships, and other demographic variables