Dr. Boucher Spring 2021 Texas A&M University-Commerce
MATH 403 – Intro to Math Stat Quiz #9
(1) The results of the 2013 National Assessment of Educational Progress are in
NAEP2013.csv, grouped by jurisdiction (50 states + DC). Consider the participants to
be a random sample of all students in the United States. The variable
G8mathchange2011 contains the change in Grade 8 average math scores for each
jurisdiction from 2011 to 2013. A positive value means the average score for the
jurisdiction increased from 2011 to 2013. You want to use this data to test whether
the average Grade 8 math score nationwide changed from 2011 to 2013. Let 𝜇 denote
the average change in the Grade 8 math score nationwide from 2011 to 2013.
(a) Always begin data analysis with an appropriate plot of the data. Start with a
boxplot of the changes in Grade 8 average math scores for each jurisdiction
from 2011 to 2013. Include the plot here and comment on it.
(b) It’s also a good idea to produce summary statistics of your data. Calculate the
sample mean and sample variance of the set of math score changes. Include the
output here.
(c) Now for the hypothesis test - what are the null and alternative hypotheses?
(d) Which test will you use, the t-test or z-test? What are the assumptions behind
the test? Are they satisfied?
(e) Run the test and paste the output here.
(f) What conclusion do you reach? State the reasoning behind your conclusion
and interpret your conclusion.
(g) Calculate a 95% confidence interval for 𝜇 = the average change in the Grade 8
math score nationwide. Paste the output here. Interpret the confidence interval – how does
it elaborate upon the conclusion from the hypothesis test above?
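Parts (b), (e) and (g) rest on the same quantities: the sample mean, the sample variance and the standard error of the mean. The following is a minimal Python sketch of that arithmetic, not the expected quiz output. It assumes a handful of made-up change values rather than the contents of NAEP2013.csv, and it takes the critical value from a t table as an input. The function name `one_sample_t` is hypothetical.

```python
import math
import statistics


def one_sample_t(values, mu0, t_crit):
    """Return (mean, variance, t statistic, CI) for a one-sample t procedure.

    t_crit is the two-sided critical value for n - 1 degrees of freedom,
    supplied by the caller (for example from a t table).
    """
    n = len(values)
    mean = statistics.mean(values)
    var = statistics.variance(values)   # sample variance, divisor n - 1
    se = math.sqrt(var / n)             # standard error of the mean
    t_stat = (mean - mu0) / se
    ci = (mean - t_crit * se, mean + t_crit * se)
    return mean, var, t_stat, ci


# Hypothetical changes in average score, NOT taken from NAEP2013.csv.
changes = [1.0, -2.0, 3.0, 0.0, 2.0, 1.0, -1.0, 4.0]

# 2.365 is the tabulated 97.5th percentile of t with 7 degrees of freedom.
mean, var, t_stat, ci = one_sample_t(changes, mu0=0.0, t_crit=2.365)
print(f"mean={mean:.3f} var={var:.3f} t={t_stat:.3f} CI=({ci[0]:.3f}, {ci[1]:.3f})")
```

With the real data there are 51 jurisdictions, so the degrees of freedom would be 50 and the critical value would change accordingly. The interval contains the null value exactly when |t| is below the critical value, so a two-sided test at α = 0.05 and a 95% interval built this way always agree.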
(2) The file ‘MeatSelenium.csv’ contains the selenium levels (mcg/100g) of a sample of
144 portions of beef raised in a particular region. The selenium levels are in the
variable ‘selen’. Let 𝜇 denote the average selenium level (mcg/100g) of all beef
raised in this region. The minimum recommended daily intake for adults is 55 mcg.
You will test whether a 100g serving of this meat provides, on average, more than
the minimum recommended daily intake for adults. Use α = 0.05.
(a) Always begin data analysis with an appropriate plot of the data. Start with a
boxplot of the beef portion selenium levels. Include the plot here. Comment
on what you see.
(b) It’s also a good idea to produce summary statistics of your data. Calculate the
sample mean and sample variance of the selenium levels. Include the output here.
(c) Now for the hypothesis test - what are the null and alternative hypotheses?
(d) Which test will you use, the t-test or z-test? What are the assumptions behind
the test? Are they satisfied?
(e) Run the test and paste the output here.
(f) What conclusion do you reach? State the reasoning behind your conclusion
and interpret your conclusion.
(g) Calculate a 95% confidence interval for 𝜇 = the average selenium level
(mcg/100g) of all beef raised in this region. Interpret the confidence interval –
how does it elaborate upon the conclusion from the hypothesis test above?
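Question 2 asks whether the mean exceeds 55, so only the upper tail counts toward the p-value. The sketch below shows that calculation in Python under two stated assumptions: the summary numbers are made up (they are not computed from MeatSelenium.csv), and the standard normal distribution is used as the reference because it is close to a t distribution with 143 degrees of freedom. The helper name `upper_tail_p` is hypothetical.

```python
import math
from statistics import NormalDist


def upper_tail_p(mean, sd, n, mu0):
    """Test statistic and upper-tail p-value for H0: mu = mu0 vs Ha: mu > mu0.

    Uses the standard normal as the reference distribution (a large-sample
    approximation to the t distribution).
    """
    se = sd / math.sqrt(n)                   # standard error of the mean
    stat = (mean - mu0) / se
    p_value = 1.0 - NormalDist().cdf(stat)   # area to the right of stat
    return stat, p_value


# Hypothetical summary statistics, NOT taken from MeatSelenium.csv.
stat, p = upper_tail_p(mean=56.0, sd=6.0, n=144, mu0=55.0)
print(f"statistic={stat:.2f}, one-sided p-value={p:.4f}")
print("reject H0 at alpha = 0.05" if p < 0.05 else "fail to reject H0")
```

A two-sided 95% interval pairs with a two-sided test at α = 0.05, so for a one-sided test like this one the interval and the test can disagree in borderline cases.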
(3) The file ‘Soda.csv’ contains survey responses from 339 adults regarding the % of
daily liquid intake that is soda (‘Soda’), whether or not they are on a diet (‘Diet’),
their gender (‘Gender’) and age in years (‘Age’). Consider the sample to be random
and representative of all US adults. After reading the data into the data frame soda, the
table() function gives the following summary information of the variable ‘Diet’ versus
‘Gender’:
(a) Use this information to test whether a majority of females are on a diet. Omit
the missing responses.
(b) What is the associated 95% confidence interval? How does it agree with the
result of the hypothesis test?
(c) Are the assumptions behind the hypothesis test and the confidence interval
satisfied?
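Parts (a) through (c) can be sketched as one large-sample proportion calculation. The Python example below is illustrative only: the counts are hypothetical (they are not the values from the Diet versus Gender table), the interval is the simple Wald form, and the "at least 10 expected in each category" check is one common rule of thumb that the course may state differently. The function name `one_proportion` is hypothetical.

```python
import math
from statistics import NormalDist


def one_proportion(successes, n, p0=0.5, conf=0.95):
    """Large-sample z test of H0: p = p0 vs Ha: p > p0, plus a Wald interval."""
    p_hat = successes / n
    # The test uses the standard error under H0; the interval uses p_hat.
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 1.0 - NormalDist().cdf(z)
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    # Rule of thumb (an assumption here): expected counts under H0 >= 10.
    counts_ok = n * p0 >= 10 and n * (1 - p0) >= 10
    return p_hat, z, p_value, (p_hat - half, p_hat + half), counts_ok


# Hypothetical counts, NOT the values from the Soda.csv table:
# 60 of 100 non-missing female respondents on a diet.
p_hat, z, p_value, ci, counts_ok = one_proportion(60, 100)
print(f"p_hat={p_hat:.2f} z={z:.2f} p-value={p_value:.4f}")
print(f"95% CI=({ci[0]:.3f}, {ci[1]:.3f}) counts_ok={counts_ok}")
```

Here n is the number of female respondents with a non-missing Diet value, matching the instruction in (a) to omit missing responses.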