Dr. Boucher Spring 2021 Texas A&M University-Commerce

MATH 403 – Intro to Math Stat Quiz #9

(1) The results of the 2013 National Assessment of Educational Progress are in NAEP2013.csv, grouped by jurisdiction (50 states + DC). Consider the participants to be a random sample of all students in the United States. The variable G8mathchange2011 contains the change in Grade 8 average math scores for each jurisdiction from 2011 to 2013. A positive value means the average score for the jurisdiction increased from 2011 to 2013. You want to use this data to test whether the average Grade 8 math score nationwide changed from 2011 to 2013. Let 𝜇 denote the average change in the Grade 8 math score nationwide from 2011 to 2013.

(a) Always begin data analysis with an appropriate plot of the data. Start with a boxplot of the changes in Grade 8 average math scores for each jurisdiction from 2011 to 2013. Include the plot here and comment on it.

(b) It’s also a good idea to produce summary statistics of your data. Calculate the sample mean and sample variance of the set of math score changes. Include the output here.

(c) Now for the hypothesis test - what are the null and alternative hypotheses?

(d) Which test will you use, the t-test or z-test? What are the assumptions behind the test? Are they satisfied?

(e) Run the test and paste the output here.

(f) What conclusion do you reach? State the reasoning behind your conclusion and interpret your conclusion.

(g) Calculate a 95% confidence interval for 𝜇, the average change in the Grade 8 math score nationwide from 2011 to 2013. Paste the output here. Interpret the confidence interval – how does it elaborate upon the conclusion from the hypothesis test above?
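As a rough illustration of the arithmetic behind parts (b) and (e), the following Python sketch computes the sample mean, the sample variance, and a one-sample t statistic against a null value of zero change. Python is an assumption here, since the quiz does not prescribe it, and the function name `one_sample_t` and the eight values in `changes` are made up for illustration; they are not the NAEP data. The p-value is left out because the Python standard library has no t distribution; it would come from a t distribution with n - 1 degrees of freedom.

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """Return (mean, sample variance, t statistic, degrees of freedom)."""
    n = len(values)
    mean = statistics.mean(values)
    var = statistics.variance(values)   # sample variance, n - 1 denominator
    se = math.sqrt(var / n)             # standard error of the mean
    t = (mean - mu0) / se               # distance from mu0 in standard errors
    return mean, var, t, n - 1


# Made-up score changes for illustration only; NOT the NAEP data.
changes = [1.0, -2.0, 3.0, 0.0, 2.0, 1.0, -1.0, 4.0]
mean, var, t, df = one_sample_t(changes, mu0=0.0)
print(f"mean={mean:.3f} variance={var:.3f} t={t:.3f} df={df}")
```

With the real data, `changes` would hold the 51 values of G8mathchange2011, giving 50 degrees of freedom.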

(2) The file ‘MeatSelenium.csv’ contains the selenium levels (mcg/100g) of a sample of 144 portions of beef raised in a particular region. The selenium levels are in the variable ‘selen’. Let 𝜇 denote the average selenium level (mcg/100g) of all beef raised in this region. The minimum recommended daily intake for adults is 55 mcg. You will test whether the average selenium content of a 100g serving of this meat exceeds the minimum recommended daily intake for adults. Use α = 0.05.

(a) Always begin data analysis with an appropriate plot of the data. Start with a boxplot of the beef portion selenium levels. Include the plot here. Comment on what you see.

(b) It’s also a good idea to produce summary statistics of your data. Calculate the sample mean and sample variance of the selenium levels. Include the output here.

(c) Now for the hypothesis test - what are the null and alternative hypotheses?

(d) Which test will you use, the t-test or z-test? What are the assumptions behind the test? Are they satisfied?

(e) Run the test and paste the output here.

(f) What conclusion do you reach? State the reasoning behind your conclusion and interpret your conclusion.

(g) Calculate a 95% confidence interval for 𝜇, the average selenium level (mcg/100g) of all beef raised in this region. Interpret the confidence interval – how does it elaborate upon the conclusion from the hypothesis test above?
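To show how an upper-tailed test against 55 mcg and a 95% interval fit together, here is a minimal Python sketch. It uses the standard normal distribution as a large-sample stand-in for the t distribution, which is a simplification chosen only because the Python standard library has no t distribution; it is not a statement about which test part (d) expects. The function name `upper_tail_test` and the 144 values in `selen` are invented for illustration and are not the contents of MeatSelenium.csv.

```python
import math
from statistics import NormalDist, mean, stdev


def upper_tail_test(values, mu0, alpha=0.05):
    """Large-sample test of H0: mu = mu0 against H1: mu > mu0, plus a two-sided CI."""
    n = len(values)
    xbar = mean(values)
    se = stdev(values) / math.sqrt(n)          # estimated standard error
    z = (xbar - mu0) / se
    p_value = 1.0 - NormalDist().cdf(z)        # upper-tail probability
    zcrit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (xbar - zcrit * se, xbar + zcrit * se)
    return xbar, z, p_value, p_value < alpha, ci


# Invented values between 55 and 65; NOT the MeatSelenium.csv data.
selen = [60 + ((i * 7) % 11) - 5 for i in range(144)]
xbar, z, p_value, reject, ci = upper_tail_test(selen, mu0=55)
print(f"xbar={xbar:.2f} z={z:.2f} p={p_value:.4f} reject={reject}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Note that the test is one-sided while the interval is two-sided, which is worth keeping in mind when comparing the two results.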

(3) The file ‘Soda.csv’ contains survey responses from 339 adults regarding the % of daily liquid intake that is soda (‘Soda’), whether or not they are on a diet (‘Diet’), their gender (‘Gender’) and age in years (‘Age’). Consider the sample to be random and representative of all US adults. After reading the data into the data frame soda, the table() function gives the following summary information of the variable ‘Diet’ versus ‘Gender’:

(a) Use this information to test whether a majority of females are on a diet. Omit the missing responses.

(b) What is the associated 95% confidence interval? How does it agree with the result of the hypothesis test?

(c) Are the assumptions behind the hypothesis test and the confidence interval satisfied?
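One way to organize the calculation for a "majority" claim is a one-proportion z-test of H0: p = 0.5 against H1: p > 0.5, paired with a Wald confidence interval. The Python sketch below assumes that approach; the function name `proportion_test` is hypothetical, and the counts (60 of 100) are placeholders because the Diet-by-Gender table is not reproduced in this text. The large-sample check uses the common rule of thumb of at least 10 expected successes and 10 expected failures under H0, which is itself an assumption; other cutoffs are in use.

```python
import math
from statistics import NormalDist


def proportion_test(successes, n, p0=0.5, conf=0.95):
    """One-proportion z-test (H1: p > p0) and Wald confidence interval."""
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)             # standard error under H0
    z = (p_hat - p0) / se0
    p_value = 1.0 - NormalDist().cdf(z)            # upper-tail p-value
    zcrit = NormalDist().inv_cdf(0.5 + conf / 2)   # about 1.96 for 95%
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)    # standard error for the CI
    ci = (p_hat - zcrit * se_hat, p_hat + zcrit * se_hat)
    large_enough = n * p0 >= 10 and n * (1 - p0) >= 10
    return p_hat, z, p_value, ci, large_enough


# Placeholder counts for illustration only; NOT taken from Soda.csv.
p_hat, z, p_value, ci, ok = proportion_test(successes=60, n=100)
print(f"p_hat={p_hat:.3f} z={z:.3f} p={p_value:.4f}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f}) large-sample check: {ok}")
```

With the real table, `successes` would be the number of females on a diet and `n` the number of females with a non-missing Diet response.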