Statistics question
PLAN(NING AHEAD): SAMPLING VARIABILITY
Biol/Stat 2244 – Peter
Objectives
By the end of this module, you should be able to:
Identify the inference goal of a study;
Distinguish between, and categorize data by, type of variables;
Identify the number of comparison groups in a study;
Distinguish between matched and independent samples;
Use vocabulary relevant to selecting possible inference procedures (e.g. matched, independent samples, two-sample, confidence interval, etc.).
Scientific Inquiry Framework: PPDAC
Problem: Define the research question.
Plan: Decide how to address the research Problem.
Data: Execute your Plan and examine your Data.
Analysis: Extract meaning from your Data.
Conclusion: Interpret your results in the context of the Problem.
(MacKay & Oldford, 2000)
Plan
Create a plan—including data collection and
analysis—to address the research question(s)
• What is the study population and sampling
strategy?
• What will be measured for the response
variable(s)?
• How will you deal with explanatory variables?
• What statistical procedures do you plan to use?
Sampling variability
variation in the value of a statistic from sample to
sample
Sampling error
difference between a statistic and a parameter,
due to chance sample composition
Reasonable video explanation of sampling error: https://www.youtube.com/watch?v=uGuWrPFStdg
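The two definitions above can be illustrated with a small simulation. This is a minimal sketch, not part of the course materials: the population is made up (10,000 simulated blood pressure values), and the function name `simulate_sample_means`, the sample size of 25, and the seeds are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, n_samples, seed=2244):
    """Draw repeated simple random samples and return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: 10,000 made-up systolic blood pressures (mmHg).
pop_rng = random.Random(1)
population = [pop_rng.gauss(130, 15) for _ in range(10_000)]
mu = statistics.mean(population)  # the parameter

# The statistic (sample mean) computed on 1,000 different samples of 25.
sample_means = simulate_sample_means(population, sample_size=25, n_samples=1000)

# Sampling error: statistic minus parameter, one value per sample.
sampling_errors = [m - mu for m in sample_means]

# Sampling variability: how much the statistic varies from sample to sample.
sd_of_means = statistics.stdev(sample_means)

print(f"parameter (population mean): {mu:.2f}")
print(f"first three sample means: {[round(m, 2) for m in sample_means[:3]]}")
print(f"SD of the sample means: {sd_of_means:.2f}")
```

Each sample gives a slightly different mean, so each has its own sampling error. The spread of those means is the sampling variability that inference procedures rely on.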
Statistical Inference
Forming judgements about population parameters
and relationships among variables, on the basis of
sample data
[Diagram: sample → infer → population]
Inference procedures use knowledge of
sampling variability and/or estimates of
sampling error to help draw conclusions,
e.g. hypothesis tests, confidence intervals
Describing your data and analyses
Questions that can help identify possible analysis
procedures
• What is your analysis goal?
• How many “samples” or groups are you
comparing?
• Are your data independent or matched
“samples”?
• What type of variables are you analysing?
• What characteristic(s) is/are of interest?
Some inference procedure ‘goals’
Is Enduron better at reducing blood pressure, on
average, than Diovan in men with hypertension?
• Confidence Interval: Estimate the value of a parameter
• Hypothesis Test: Assess evidence against a claim about a
parameter or relationship between variables
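The two goals can be sketched side by side on the same data. This is only an illustration under stated assumptions: the blood pressure reductions are simulated (not real trial data), the group sizes of 60 are invented, and a large-sample Normal approximation is used for simplicity rather than any particular procedure from the course.

```python
import math
import random
import statistics
from statistics import NormalDist

rng = random.Random(2244)
# Made-up reductions in systolic blood pressure (mmHg); not real trial data.
enduron = [rng.gauss(12, 6) for _ in range(60)]
diovan = [rng.gauss(10, 6) for _ in range(60)]

# Observed difference in mean reduction, and its estimated standard error.
diff = statistics.mean(enduron) - statistics.mean(diovan)
se = math.sqrt(
    statistics.variance(enduron) / len(enduron)
    + statistics.variance(diovan) / len(diovan)
)

# Goal 1 (confidence interval): estimate the value of a parameter,
# here the difference in mean reduction, with an approximate 95% interval.
z_star = NormalDist().inv_cdf(0.975)
ci = (diff - z_star * se, diff + z_star * se)

# Goal 2 (hypothesis test): assess evidence against the claim of no difference,
# in favour of Enduron reducing blood pressure more on average (one-sided).
z = diff / se
p_value = 1 - NormalDist().cdf(z)

print(f"difference in mean reduction: {diff:.2f} mmHg")
print(f"approximate 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"one-sided p-value: {p_value:.4f}")
```

Both calculations use the same standard error, which is an estimate of sampling variability. They differ only in whether the goal is to estimate or to assess a claim.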
Number of “samples”
Does weight gain differ between oral contraceptive (“OC”)
users and IUD users?
In the context of inference, the term “sample” is generally
interpreted as ‘comparison group’
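[Diagram, Study 1: a sample of OC users (n1) represents the population of OC users; a sample of IUD users (n2) represents the population of IUD users.]
[Diagram, Study 2: a sample of females (n) from a population of females is split into Treatment: OC (n1) and Treatment: IUD (n2); each treatment group represents OC users or IUD users, respectively.]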
Structure of data/“samples”
Many statistical procedures require understanding the
structure of the data collected:
• Independent samples
• Matched/paired samples: explicit matches (non-arbitrary)
between responses and/or units
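Why the structure matters can be seen by analysing the same numbers both ways. The sketch below uses invented weights (kg) for eight hypothetical individuals measured before and after a treatment, so the matches are explicit. The variable names and values are assumptions for illustration only.

```python
import math
import statistics

# Made-up weights (kg) for the same eight individuals, before and after.
before = [62.1, 70.4, 55.8, 81.0, 66.3, 74.9, 59.2, 68.7]
after = [63.0, 71.9, 56.1, 82.4, 67.5, 75.2, 60.8, 69.9]
n = len(before)

# Matched/paired view: work with one difference per individual.
differences = [a - b for a, b in zip(after, before)]
mean_diff = statistics.mean(differences)
se_paired = statistics.stdev(differences) / math.sqrt(n)

# Independent-samples view (inappropriate here): ignores the matching.
se_independent = math.sqrt(
    statistics.variance(before) / n + statistics.variance(after) / n
)

print(f"mean weight change: {mean_diff:.2f} kg")
print(f"SE treating data as paired: {se_paired:.2f}")
print(f"SE treating data as independent: {se_independent:.2f}")
```

The estimated mean change is the same either way, but in this example the paired standard error is much smaller because pairing removes the large person-to-person variation in weight. Choosing a procedure that ignores the data structure would misstate the sampling variability.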
Types of variables
Categorical
• nominal: Values can be named
• ordinal: Values can be ordered
Quantitative
• interval: Distance between consecutive values is constant
• ratio: Zero means none
Also: discrete or continuous
Characteristic(s) of interest
The analysis procedure may depend on the
characteristic of interest for the population
• Parameters: mean, median, proportion, variance
• Distribution: Normal, Poisson, …
• Relationship: linear, exponential, …
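As a small illustration of the parameters listed above, the sketch below computes the corresponding sample statistics from one data set. The weight-gain values and the 2 kg threshold used for the proportion are invented for this example.

```python
import statistics

# Made-up weight gains (kg) for ten hypothetical participants.
weight_gain = [1.2, 0.4, 3.1, 2.2, -0.5, 1.8, 2.9, 0.0, 4.0, 1.1]

sample_mean = statistics.mean(weight_gain)
sample_median = statistics.median(weight_gain)
sample_variance = statistics.variance(weight_gain)  # n - 1 denominator

# Proportion: share of participants gaining more than 2 kg (assumed threshold).
sample_proportion = sum(g > 2 for g in weight_gain) / len(weight_gain)

print(f"mean: {sample_mean:.2f} kg")
print(f"median: {sample_median:.2f} kg")
print(f"variance: {sample_variance:.2f} kg^2")
print(f"proportion gaining > 2 kg: {sample_proportion:.2f}")
```

Each statistic estimates a different population characteristic, so the characteristic of interest narrows down which inference procedures are candidates.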
Summary
• Statistical analysis should be part of the initial
plan for your research;
• Study design and variable type influence your
choice of analysis.
Prompt: What are some variables that could be both
categorical and quantitative, depending on the
method of measuring them?
Post to “Thinking about different types of variables”