Statistics with R Programming
Week 7 Outline (10-14-20):
Assignment 4 due Friday at midnight.
Exam 1 will be given after we finish Aho Ch 5 (and after Assignment 4, which covers Aho Ch 5, is returned to you). The earliest it could be is Week 8, but it may be Week 9. Recall that midterm exams will be available for you to take outside of class for 48 hours. My guess is that the 48 hours after our class that week is the most sensible window, but I am open to suggestions. We will start Aho Chapter 6 before Exam 1, but Aho Chapter 6 will not be on Exam 1.
· Which 48 hours?
· You may not discuss the exams with anyone other than the professor during the 48-hour exam period.
· You may use your notes or textbook, though I recommend you prepare as if you did not have these resources, so you don’t have to look everything up.
· You may not use any online resources, other than to access R if necessary.
· You may not seek help with any part of the exam from any online resources, including tutors.
Questions
Start Aho Chapter 6: Hypothesis testing for one population mean, or for comparing two population means from independent (quantitative) populations
· Handout on Introduction to hypothesis testing for one population mean (one sample t test)
· p-value approach
· do 8 steps by hand for example in handout
· if you have the raw data (unlike this example, where we only have the sample mean, sample sd, and sample size), the R code is
t.test(x,
alternative = c("two.sided", "less", "greater"),
mu = , conf.level = 0.95)
· one sided vs two sided tests
· critical value approach as alternative to p-value
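When only summary statistics are available, the p-value and critical value approaches can be sketched directly in R. This is a minimal illustration, not the handout's example: the values of xbar, s, n, mu0, and alpha below are hypothetical.

```r
# Hypothetical summary statistics (NOT the handout's values)
xbar  <- 52.3   # sample mean
s     <- 6.1    # sample standard deviation
n     <- 25     # sample size
mu0   <- 50     # value of the mean stated in the null hypothesis
alpha <- 0.05   # significance level

# Test statistic and degrees of freedom for a one sample t test
t_stat <- (xbar - mu0) / (s / sqrt(n))
df     <- n - 1

# p-value approach: the tail used depends on the alternative hypothesis
p_greater <- pt(t_stat, df, lower.tail = FALSE)            # Ha: mu > mu0
p_less    <- pt(t_stat, df)                                # Ha: mu < mu0
p_two     <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)   # Ha: mu != mu0

# Critical value approach for the right tail test
t_crit    <- qt(1 - alpha, df)
reject_cv <- t_stat > t_crit
reject_p  <- p_greater < alpha

print(c(t = t_stat, p_greater = p_greater, p_less = p_less, p_two = p_two))
print(c(t_crit = t_crit, reject_cv = reject_cv, reject_p = reject_p))
```

The two approaches always agree on the decision: the test statistic falls beyond the critical value exactly when the p-value is below alpha.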
· Paired t test as example of one sample t test (focus on differences)
· Paired data occur when a quantitative variable is measured twice on the same subject, like a before and after test
· The goal is to see if there is any change, on average, so the hypothesized mean difference in the null hypothesis is almost always 0 for paired data
· If you subtract the two measurements, you get the change for that unit, and the differences can be analyzed as one quantitative variable
· Example:
| | Subject1 | Subject2 | Subject3 | Subject4 |
|---|---|---|---|---|
| Before | 4 | 10 | 13 | 7 |
| After | 6 | 10 | 12 | 11 |
| Difference | 2 | 0 | -1 | 4 |
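The paired example above can be entered in R to confirm that a paired t test is the same as a one sample t test on the differences. This is a sketch for illustration only; with just four subjects the conditions for a t test would be hard to check, and the variable names are my own.

```r
# Data from the paired example (four subjects measured before and after)
before <- c(4, 10, 13, 7)
after  <- c(6, 10, 12, 11)

# Differences (after - before): one quantitative variable
d <- after - before
print(d)

# One sample t test on the differences, hypothesized mean difference 0
res_diff <- t.test(d, mu = 0)

# The same analysis requested as a paired t test
res_paired <- t.test(after, before, paired = TRUE)

print(res_diff)
print(res_paired)
```

Both calls report the same test statistic, degrees of freedom, and p-value, because the paired test works only with the differences.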
· Power of a test
· Power = 1 – β, where β is the probability of a type 2 error (not rejecting H₀ when H₀ is false)
· This means power is the probability of rejecting H₀ when H₀ is false, so we would like power to be near 1
· Power depends on α, how far the value in H₀ is from the true value of the parameter, variability in the population, and sample size
· Applet: https://www.geogebra.org/m/e8Usa8Pp
· Sometimes researchers want to determine how big their sample size should be to attain a particular level of power
· R has built-in functions to do this; here is the one for t tests
power.t.test(n = NULL, delta = NULL, sd = 1, sig.level = 0.05, power = NULL,
type = c("two.sample", "one.sample", "paired"),
alternative = c("two.sided", "one.sided"),
strict = TRUE)
· If you want to calculate sample size, leave n as NULL, and provide all the other inputs (delta is the difference between the value in H₀ and the true value of the parameter)
· If you want to do a power analysis, you can input sample size, and leave power as NULL
· If power is really high (especially because sample size is really high), it is really easy to detect a small departure from H₀
· Be careful to distinguish between statistical significance and practical significance
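The two uses of power.t.test() described above (solving for sample size, and solving for power) might look like the sketch below. The planning values (delta = 2, sd = 5, n = 30, and the large-sample case) are hypothetical and chosen only for illustration.

```r
# 1) Sample size: leave n unspecified (NULL) and supply power.
#    Hypothetical planning values: detect a difference of 2 when sd is 5.
ss <- power.t.test(delta = 2, sd = 5, sig.level = 0.05, power = 0.80,
                   type = "one.sample", alternative = "one.sided")
n_needed <- ceiling(ss$n)   # round up to a whole number of subjects
print(n_needed)

# 2) Power analysis: supply n and leave power unspecified (NULL)
pw <- power.t.test(n = 30, delta = 2, sd = 5, sig.level = 0.05,
                   type = "one.sample", alternative = "one.sided")
print(pw$power)

# 3) With a very large sample, even a tiny departure from the null value
#    (delta = 0.2 when sd = 5) is detected with high power
big <- power.t.test(n = 10000, delta = 0.2, sd = 5, sig.level = 0.05,
                    type = "one.sample", alternative = "one.sided")
print(big$power)
```

The third call illustrates the point about statistical versus practical significance: high power from a huge sample says nothing about whether a difference of 0.2 matters in practice.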
· What if we cannot conclude that the normality condition for the t test is met?
· Example: number of offspring of a species of fish
· Want to see if there is evidence that the center of the population is above 16
· Random sample: 12, 13, 14, 14, 16, 17, 17, 17, 20, 21, 25
· If the normality condition is not met, that means skewness or outliers are causing a problem, which means we focus on the population median instead of the population mean (we’ll use “eta” for population median)
· Sign test (not in textbook)
· If paired data, focus on differences
· Does not require the population to have any particular shape (but does require that binomial conditions are met)
· Hand calculate test statistic S
· S = number of values above the median in H₀
· If H₀ is true, then S will have a binomial distribution with p=0.5 and number of trials n* = n – (number of values equal to the value from H₀)
· In our example, there are six values above 16, so S=6
· If H₀ is really true, then each sample value should have 0.5 probability of being above the value from H₀
· We throw out any sample values that are the same as the value in H₀, so our “modified” sample size is n* = 10
· Use binomial distribution to get p-value
· On calculator: P-value=P(S>=6) = 1-binomcdf(10,.5,5) = 0.3770
· In R: P-value = P(S>=6) = 1-pbinom(5,10,.5)
· SIGN.test() function in BSDA package
SIGN.test(x, md = ,alternative= c("two.sided",
"less", "greater"))
· In our example
> offspring<-c(12,13,14,14,16,17,17,17,20,21,25)
> SIGN.test(offspring,md=16,alternative="greater")
One-sample Sign-Test
data: offspring
s = 6, p-value = 0.377
alternative hypothesis: true median is greater than 16
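The hand calculation for the sign test can also be reproduced in base R, without the BSDA package. This is a minimal sketch using the offspring data from the example; the names eta0, S, and n_star are my own, and binom.test() is used here only as a cross-check on the binomial tail probability.

```r
offspring <- c(12, 13, 14, 14, 16, 17, 17, 17, 20, 21, 25)
eta0 <- 16   # hypothesized population median

# Test statistic: number of values above the hypothesized median
S <- sum(offspring > eta0)

# Modified sample size: drop values equal to the hypothesized median
n_star <- sum(offspring != eta0)

# Right tail p-value: P(S >= 6) for a binomial with n* trials and p = 0.5
p_value <- 1 - pbinom(S - 1, n_star, 0.5)

# Cross-check with an exact binomial test
bt <- binom.test(S, n_star, p = 0.5, alternative = "greater")

print(c(S = S, n_star = n_star, p_value = p_value, binom_test_p = bt$p.value))
```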
· Wilcoxon Signed Rank test
· If paired data, focus on differences
· Requires that the population is symmetric (but not necessarily normal)
· Hand calculate test statistic V
· Subtract the median in H₀ from each value
· Ignore values of 0
· Rank the absolute differences from smallest to largest (by distance from 0), using average ranks for ties
· T+ = sum of positive ranks (from data points above the value from H₀)
· T− = sum of “negative” ranks (from data points below the value from H₀)
· Aho textbook says calculate V by
· For two tail test, V = min(T+, T−)
· For right tail test, V =
· For left tail test, V =
· R just uses V = T+ (the sum of positive ranks), so that’s what we’ll go with
· In our example (same as above with sign test, which was right tail test)
· T+ = 2+2+2+7.5+9+10 = 32.5
· Use R to get p-value
wilcox.test(x, alternative = c("two.sided", "less", "greater"), mu = )
· In our example
> wilcox.test(offspring,alternative="greater",mu=16)
Wilcoxon signed rank test with continuity correction
data: offspring
V = 32.5, p-value = 0.3226
alternative hypothesis: true location is greater than 16
· Which test has more power? (one sample t test, sign test, signed rank test)
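The hand calculation of V for the signed rank test can be checked in R with rank(). This is a sketch using the offspring data; the variable names are my own. Because the data contain ties and a value equal to 16, wilcox.test() cannot compute an exact p-value and warns about it, so the call is wrapped in suppressWarnings().

```r
offspring <- c(12, 13, 14, 14, 16, 17, 17, 17, 20, 21, 25)
eta0 <- 16   # hypothesized center

# Subtract the hypothesized value and ignore differences of 0
d <- offspring - eta0
d <- d[d != 0]

# Rank by distance from 0 (ties receive average ranks)
r <- rank(abs(d))

# Sum of ranks for values above and below the hypothesized value
sum_pos <- sum(r[d > 0])
sum_neg <- sum(r[d < 0])
print(c(sum_pos = sum_pos, sum_neg = sum_neg))

# R reports the sum of the positive ranks as V
wt <- suppressWarnings(
  wilcox.test(offspring, alternative = "greater", mu = eta0)
)
print(wt)
```

A quick check on the ranking: the two sums must add to n*(n* + 1)/2, which is 55 when n* = 10.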
· Hypothesis testing for two population means from independent populations
· Two independent sample t test
· Overview of how different than one sample t test
· Example: ______________
· Pooled vs unpooled (Welch’s approximate t test, also called the Satterthwaite approximation)
· Confidence interval for difference in two population means
· R code (y is quantitative response variable and x is categorical explanatory (group) variable)
t.test(y~x,alternative = c("two.sided", "less", "greater"),mu = , var.equal = FALSE,conf.level = 0.95)
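To see the difference between the unpooled (Welch) and pooled versions, the same data can be run both ways. The data frame below is hypothetical, made up only to illustrate the formula interface; y is the quantitative response and x is the group variable.

```r
# Hypothetical data: two independent groups of five
dat <- data.frame(
  y = c(12.1, 14.3, 11.8, 13.5, 15.2, 16.8, 17.1, 15.9, 18.4, 16.2),
  x = rep(c("A", "B"), each = 5)
)

# Unpooled (Welch) test: R's default, var.equal = FALSE
welch <- t.test(y ~ x, data = dat, var.equal = FALSE, conf.level = 0.95)

# Pooled test: assumes the two population variances are equal
pooled <- t.test(y ~ x, data = dat, var.equal = TRUE, conf.level = 0.95)

# Compare degrees of freedom and p-values
print(c(welch_df = unname(welch$parameter), pooled_df = unname(pooled$parameter)))
print(c(welch_p = welch$p.value, pooled_p = pooled$p.value))

# Confidence interval for the difference in population means (group A - group B)
print(welch$conf.int)
```

The pooled test uses n1 + n2 - 2 degrees of freedom, while Welch's approximation gives a value that is usually not a whole number and never larger than the pooled value.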
· What if we cannot conclude that the normality condition for the two sample t test is met?
· Wilcoxon rank sum test (sometimes referred to as “Mann-Whitney U”, which gives the same p-value as Wilcoxon rank sum test)
· If the normality condition is not met, that means skewness or outliers are causing a problem, which means we focus on population medians instead of population means
· Requires that the two populations have the same shape
· Hand calculate test statistic W
· Rank values from smallest to largest in the two groups as if they were all in one group (use average ranks for ties)
· R₁ = sum of ranks in sample 1
· R₂ = sum of ranks in sample 2
· Whichever group has the higher population median should generally produce higher values, so it should have the higher sum of ranks
· Rescale R₁ and R₂ by subtracting the lowest possible value for each (the value each would take if all of that group’s values were lower than all the values for the other group)
· W₁ = R₁ − n₁(n₁ + 1)/2
· W₂ = R₂ − n₂(n₂ + 1)/2
· This is because the sum of the first n whole numbers is n(n+1)/2
· Aho textbook says
· For two tail test, W = min(W₁, W₂)
· For right tail test, W =
· For left tail test, W =
· R just uses W= _________________ , so that is what we will do
· Use R to get p-value (y is quantitative response variable, x is categorical explanatory (group) variable)
wilcox.test(y~x, alternative = c("two.sided", "less", "greater"), mu = 0, exact = NULL)
· Two independent sample t test has more power than Wilcoxon rank sum test when the conditions for the t test are met
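The rank sum hand calculation can be sketched in R as follows. The two samples are hypothetical (chosen with no ties so that R can compute an exact p-value), and the names R1, R2, W1, and W2 are labels introduced here for the rank sums and their rescaled versions.

```r
# Hypothetical samples from two independent groups (no ties)
g1 <- c(3.1, 4.7, 5.2, 6.0)        # sample 1
g2 <- c(5.8, 6.4, 7.3, 8.1, 9.0)   # sample 2
n1 <- length(g1)
n2 <- length(g2)

# Rank all values together as one group (smallest value gets rank 1)
r <- rank(c(g1, g2))

# Sum of ranks in each sample
R1 <- sum(r[seq_len(n1)])
R2 <- sum(r[-seq_len(n1)])

# Rescale by subtracting the lowest possible rank sum for each sample
W1 <- R1 - n1 * (n1 + 1) / 2
W2 <- R2 - n2 * (n2 + 1) / 2
print(c(R1 = R1, R2 = R2, W1 = W1, W2 = W2))

# Compare the hand-calculated values with the statistic R reports
wt <- wilcox.test(g1, g2, alternative = "two.sided")
print(wt)
```

A useful check on the arithmetic is that the two rescaled values always add up to n1 * n2.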
· Duality between confidence intervals and hypothesis tests
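One way to see the duality is to check, for several hypothesized values, that a two-sided t test at level alpha rejects exactly when the hypothesized value falls outside the (1 - alpha) confidence interval. The data and the helper function check() below are hypothetical, written only for this illustration.

```r
# Hypothetical sample
x <- c(9.8, 10.4, 11.1, 10.9, 12.3, 11.7, 10.2, 11.5)
alpha <- 0.05

# For one hypothesized mean, record the two-sided p-value, the test decision,
# and whether the hypothesized mean lies outside the confidence interval
check <- function(mu0) {
  res <- t.test(x, mu = mu0, alternative = "two.sided", conf.level = 1 - alpha)
  c(mu0        = mu0,
    p_value    = res$p.value,
    reject     = res$p.value < alpha,
    outside_ci = mu0 < res$conf.int[1] || mu0 > res$conf.int[2])
}

# Logical results are stored as 1 (TRUE) and 0 (FALSE) in the matrix
out <- rbind(check(10), check(11), check(12))
print(out)
```

In every row the reject and outside_ci columns match, which is the duality: the confidence interval is the set of hypothesized values that the two-sided test would not reject.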
Chapter 6 reading for next time (in case you didn’t read it yet): Aho Chapter 6: Pages 197-224, 231-236
Chapter 7 reading for when we get to it: Aho Chapter 7: Pages 247-257, 265 (start with section 7.5.6) – 282 (stop at section 7.6.5.3.4)