statistics with R programming


Assignment 5 KEY, STA 610, Fall 2020

1. Consider #5 on pages 241-242 in the Aho textbook. Do not follow their parts (a) through (g), and do not assume the population variances are equal. The data are on BB so you don’t have to extract from R, but they are also found in R if you can load the asbio package.

(a) Determine if there is evidence in the sample that the O2 level below the town is lower than above the town, on average, in the populations. Regardless of your assessment of the conditions, assume both populations are normal in step 1 (after checking sample sizes and sample shapes).

Step 0 : From each individual (random location), we observe the O2 level (quantitative) and whether the location is above or below the town (categorical). The goal is to compare the mean O2 level above with the mean O2 level below (O2 is the response variable and above/below is the explanatory variable), so we have two quantitative populations. They are independent because the locations above are not connected (paired) in any way with the locations below.

Step 1 : population sd’s known? no

sample sizes over 30? No, 15 for both groups

The above group looks roughly normal; the below group shows a little left skewness and a moderate outlier on the high end. The problem says to assume both populations are normal.

Step 2 : mu1 = pop mean O2 for above, mu2 = pop mean O2 for below

Step 3 : Ho: mu1=mu2 H1: mu1 > mu2 alpha=.05

Step 4 : t = 1.96 (Show work by hand too because requested)

> t.test(O2~location,data=dO2)
	Welch Two Sample t-test
data:  O2 by location
t = 1.9551, df = 20.343, p-value = 0.06445
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.01183912  0.37183912
sample estimates:
mean in group Above mean in group Below 
               4.92                4.74 

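Step 5 : p-value = 0.06445/2 = 0.032 (I asked R for a two-tailed test so I could get the two-sided CI from R; halving is valid because the observed t is positive, which is the direction of H1)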

Step 6 : p-value<0.05, so reject Ho

Step 7 : There is sufficient evidence to support that the population mean O2 level is lower below the town than above it
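The halved p-value in Step 5 can be checked without the raw data. The following is a minimal sketch that assumes only the t statistic and Welch degrees of freedom printed by R above; the variable names (t_stat, df_welch, p_one, p_two) are illustrative and not part of the assignment.

```r
# Values copied from the Welch t-test output above
t_stat   <- 1.9551
df_welch <- 20.343

# One-sided p-value for H1: mu1 > mu2 is the upper-tail area beyond t
p_one <- pt(t_stat, df = df_welch, lower.tail = FALSE)

# The two-sided p-value that R reported is twice that tail area
p_two <- 2 * p_one

print(round(c(one_sided = p_one, two_sided = p_two), 4))

# If the data frame dO2 is loaded, R can run the one-sided test directly.
# "greater" assumes Above is the first factor level, as in the output above.
if (exists("dO2")) {
  print(t.test(O2 ~ location, data = dO2, alternative = "greater"))
}
```

The direct one-sided call gives the same p-value as halving, but its confidence interval is one-sided, which is why the key used the two-tailed call.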

(b) Calculate and interpret a 95% confidence interval for the difference in population means (assuming both populations are normal).

From above, (-0.012,0.372), we are 95% confident that the difference in the pop means (mu1-mu2) is between -0.012 and 0.372.

0 is inside the interval, so we would not reject Ho: mu1-mu2=0 against a two-sided alternative. However, this is a two-sided interval, which does not line up with a one-sided test, so it is OK that we get a different decision in this instance.
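To see why the two-sided interval and the one-sided test can disagree, the interval can be rebuilt from the reported summary numbers and compared with a one-sided 95% lower confidence bound. This is a sketch under the assumption that the standard error can be recovered as (difference in means)/t from the output in part (a); all variable names are illustrative.

```r
# Summary values copied from the Welch t-test output in part (a)
diff_means <- 4.92 - 4.74   # sample mean difference, Above - Below
t_stat     <- 1.9551
df_welch   <- 20.343

# Standard error implied by t = diff_means / se_diff
se_diff <- diff_means / t_stat

# Two-sided 95% CI: puts 2.5% in each tail
ci_two <- diff_means + c(-1, 1) * qt(0.975, df_welch) * se_diff

# One-sided 95% lower bound: puts all 5% in the lower tail,
# which matches H1: mu1 > mu2 at alpha = 0.05
lower_one <- diff_means - qt(0.95, df_welch) * se_diff

print(round(ci_two, 3))
print(round(lower_one, 3))
```

The two-sided interval reproduces the one printed by R, while the one-sided lower bound is positive, which agrees with rejecting Ho in the one-sided test.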

(d) Redo only steps 2, 3, 4, and 5 from 1a without assuming the populations are normal, but instead assuming the populations have the same shape and variability.

Step 2 : eta1= pop median for above eta2= pop median for below

Step 3 : Ho: eta1=eta2 H1: eta1>eta2 alpha=0.05

Step 4 : W=165 (do hand calculation too because it was requested)

> wilcox.test(dO2$O2~dO2$location, alternative = c("two.sided"), mu = 0, exact = TRUE)

	Wilcoxon rank sum test with continuity correction
data:  dO2$O2 by dO2$location
W = 165, p-value = 0.02926
alternative hypothesis: true location shift is not equal to 0



Step 5 :  p-value = 0.02926/2 = 0.015 (cut in half because I asked R for a two-tailed test)
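As a rough check on the one-sided p-value, the rank sum statistic can be compared with its normal approximation. The sketch below assumes only W = 165 and the two sample sizes of 15; it applies a continuity correction but no tie adjustment, so it will not match R's value exactly (R printed "with continuity correction" even though exact = TRUE was requested, presumably because the data contain ties). Variable names are illustrative.

```r
# Values taken from the wilcox.test output and the sample sizes in part (a)
W  <- 165
n1 <- 15
n2 <- 15

# Mean and standard deviation of W under Ho (no tie adjustment)
mu_W <- n1 * n2 / 2
sd_W <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Continuity-corrected z statistic and upper-tail p-value for H1: eta1 > eta2
z     <- (W - mu_W - 0.5) / sd_W
p_one <- pnorm(z, lower.tail = FALSE)

print(round(c(z = z, p_one = p_one), 4))
```

The approximation lands close to the 0.015 obtained by halving R's two-sided p-value; a tie adjustment would shrink the variance slightly and lower the p-value a little further.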



2.  Consider #7 on page 242 in the Aho textbook.  Do not follow their parts (a) through (e).  Note that the same variable was observed on each subject twice (which is why they are listing X – Y), and this must be taken into account.
(a)  Determine if there is evidence in the sample that program X is superior to program Y, on average, in the population.  Regardless of your assessment of the condition, assume the population is normal in step 1 (after checking sample size and sample shape).


Step 0 : From each individual (twin pair) we observed the difference in weight gain, which is one quantitative variable (when you have paired data, you should always focus on the differences)

Step 1 : population sd known? No

sample size over 30? No, n=12

The boxplot is mildly right skewed, but the problem says to assume the population is normal.

Step 2 : mu = pop mean difference in weight (X-Y)

Step 3 : Ho: mu=0 H1: mu>0 alpha=0.05

Step 4 : t = -5.2237 (hand calculate too because it was requested)
> t.test(diff,alternative="greater",mu=0)
	One Sample t-test
data:  diff
t = -5.2237, df = 11, p-value = 0.9999
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
 -1.799565       Inf
sample estimates:
mean of x 
-1.339167 



Step 5 : p-value = 0.9999

Step 6 : p-value>0.05, so do not reject Ho

Step 7 : There is not sufficient evidence to support that program X is superior with respect to population mean weight loss
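The one-sample t results for the differences can be reproduced from the printed summary alone. This sketch assumes only the sample mean, t statistic, and n = 12 from the output above; the standard error and standard deviation are backed out from t = mean/se, and the variable names are illustrative.

```r
# Values copied from the one-sample t-test output above
xbar   <- -1.339167   # mean of the X - Y differences
t_stat <- -5.2237
n      <- 12
df_t   <- n - 1

# Standard error and sample sd implied by t = xbar / se
se_diff <- xbar / t_stat
sd_diff <- se_diff * sqrt(n)

# Upper-tail p-value for H1: mu > 0
p_greater <- pt(t_stat, df_t, lower.tail = FALSE)

# One-sided 95% lower confidence bound, as printed by R
lower_bound <- xbar - qt(0.95, df_t) * se_diff

print(round(c(se = se_diff, sd = sd_diff, lower = lower_bound), 4))
print(p_greater)
```

Because the sample mean is negative, the data point in the opposite direction from H1, which is why the p-value is close to 1 rather than close to 0.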

(b)  Redo only steps 2, 3, 4 and 5 from 2a, but now assuming the population is symmetric.


Step 2 : eta = pop median difference in weight loss (X-Y)

Step 3 : Ho: eta=0 H1: eta>0 alpha=0.05

Step 4 : V=1 (hand calculate too because it was requested)
> wilcox.test(diff,alternative="greater",mu=0)
	Wilcoxon signed rank test
data:  diff
V = 1, p-value = 0.9998
alternative hypothesis: true location is greater than 0



Step 5 : p-value = 0.9998
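The signed rank p-value can be verified from V and n using the exact null distribution in base R. This sketch assumes there are no zero differences and no tied absolute differences, so that n stays at 12 and the exact distribution applies; variable names are illustrative.

```r
# Values taken from the signed rank output above
n <- 12   # number of twin pairs (assumes no zero differences)
V <- 1    # sum of the ranks of the positive differences

# P(V >= 1) = P(V > 0) under Ho, from the exact signed rank distribution
p_greater <- psignrank(V - 1, n, lower.tail = FALSE)

# Only one of the 2^n equally likely sign patterns gives V = 0
p_by_hand <- 1 - 1 / 2^n

print(round(c(psignrank = p_greater, by_hand = p_by_hand), 4))
```

V = 1 means the only positive difference also had the smallest absolute value, so almost every sign pattern under Ho produces a V at least this large.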


(c)  Redo only steps 2, 3, 4 and 5 from 2a, but now assuming the population is not symmetric.


Step 2 : eta = pop median difference in weight loss (X-Y)

Step 3 : Ho: eta=0 H1: eta>0 alpha=0.05

Step 4 : S=1 (hand calculate too)
> SIGN.test(diff,md=0,alternative="greater")
	One-sample Sign-Test
data:  diff
s = 1, p-value = 0.9998



Step 5 : p-value = 0.9998
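The sign test p-value is a binomial tail probability, so it can be checked in base R without SIGN.test (which appears to come from the BSDA package; that is an assumption, since the key does not say). The sketch assumes no zero differences, so all 12 pairs count; variable names are illustrative.

```r
# Values taken from the sign test output above
n <- 12   # number of nonzero differences (assumed to be all 12 pairs)
s <- 1    # number of positive differences

# Under Ho: eta = 0, S ~ Binomial(n, 0.5); H1: eta > 0 uses P(S >= s)
p_greater <- pbinom(s - 1, size = n, prob = 0.5, lower.tail = FALSE)

# Same tail probability from base R's exact binomial test
p_check <- binom.test(s, n, p = 0.5, alternative = "greater")$p.value

print(round(c(pbinom = p_greater, binom_test = p_check), 4))
```

This matches the signed rank p-value in part (b) because in both cases the only outcome excluded from the tail has probability 1/2^12.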