Statistics
Question
Before heading to the beach on a mid-winter afternoon, hardy Sydney surfers are very interested in what the temperature is likely to be, given the temperature at breakfast time. Afternoon temperatures, however, are also likely to be affected by other meteorological factors, such as precipitation and sunshine. The data below relate to daily temperatures in Sydney in July 2015 at 9 am and 3 pm, as well as rainfall in the 24 hours to 9 am, evaporation in the 24 hours to 9 am, and hours of bright sunshine in the 24 hours to midnight the day before.
You will use descriptive statistics, inferential statistics and your knowledge of multiple linear regression to complete this task.
Here is a table describing the variables in the data set:
| Variable | Definition |
| --- | --- |
| Rain (mm) | Millimeters of rain in the 24 hours to 9 am |
| Evaporation (mm) | Millimeters of water evaporation in the 24 hours to 9 am |
| Sun (hours) | Number of hours of sun in the 24 hours to midnight, the day before |
| 9 am Temp | The temperature at 9 am |
| 3 pm Temp | The temperature at 3 pm |
Required:
- Calculate the descriptive statistics from the data and display them in a table. Be sure to comment on the central tendency, variability and shape of each variable.
- Draw a graph that displays the distribution of the temperature at 3 pm.
- Create a box-and-whisker plot for the distribution of rain and describe the shape. Is there evidence of outliers in the data?
- What is the likelihood that the 3 pm temperature is no less than 17 degrees if it has rained in the 24 hours prior to 9 am? Is the temperature statistically independent of rain? Use a contingency table.
- Estimate the 90% confidence interval for the population mean hours of sun.
- Your supervisor recently stated that it is obvious that the mean 3 pm temperature is greater than the long-run average of 17.3 degrees Celsius. Test her claim at the 1% level of significance.
- Run a multiple linear regression using the data and show the output from Excel.
- Is the coefficient estimate for the rain total statistically different from zero at the 5% level of significance? Set up the correct hypothesis test using the results in the regression output table from Part (G), using both the critical-value and p-value approaches. Interpret the coefficient estimate of the slope. (2 Marks)
- Interpret the remaining slope coefficient estimates. Comment on whether the signs are what you would expect.
- Is the overall model statistically significant at the 1% level of significance? Use the p-value approach.
- Do the results suggest that the data satisfy the assumptions of a linear regression: linearity, normality of the errors, and homoscedasticity of the errors? Show this using scatter diagrams, normal probability plots and/or histograms, and explain.
- Based on the results of the regression, is it likely that other factors have influenced the afternoon temperature? If so, provide a couple of possible examples and indicate whether these would be likely to influence the regression results if they were included.
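For the contingency-table item in the list above (3 pm temperature versus rain), the calculation can be sketched in Python as follows. This is only a minimal illustration, not the expected Excel working: the function names are hypothetical, "rained" is assumed to mean more than 0 mm in the 24 hours to 9 am, and the sample values are made up for demonstration and are not the Sydney July 2015 data.

```python
from collections import Counter


def contingency(rain_mm, temp_3pm, threshold=17.0):
    """Cross-tabulate 'rained' (rain > 0 mm) against 'warm' (3 pm temp >= threshold)."""
    if len(rain_mm) != len(temp_3pm):
        raise ValueError("both series must cover the same days")
    # Keys are (rained, warm) pairs of booleans; values are day counts.
    return Counter((r > 0, t >= threshold) for r, t in zip(rain_mm, temp_3pm))


def conditional_and_marginal(counts):
    """Return (P(warm | rained), P(warm)) from the 2x2 table of counts."""
    n = sum(counts.values())
    rained = counts[(True, True)] + counts[(True, False)]
    warm = counts[(True, True)] + counts[(False, True)]
    if rained == 0:
        raise ValueError("no rainy days in the sample")
    return counts[(True, True)] / rained, warm / n


# Hypothetical values for illustration only, NOT the assignment data.
rain = [0.0, 2.4, 0.0, 0.0, 5.2, 0.2, 0.0, 1.0]
temp = [18.1, 16.0, 17.5, 19.2, 15.8, 17.0, 16.4, 17.9]

table = contingency(rain, temp)
p_warm_given_rain, p_warm = conditional_and_marginal(table)
print(f"P(temp >= 17 | rain) = {p_warm_given_rain:.3f}")
print(f"P(temp >= 17)        = {p_warm:.3f}")
```

If the two events were independent, the conditional probability would equal the marginal probability. With sample data the two rarely match exactly, so the comparison should be read as descriptive evidence rather than a formal test.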
If a community housing organisation asked for information regarding the characteristics of housing targeting Aboriginal and Torres Strait Islander households, explain whether a simple random sampling technique would provide an accurate representation of these households.
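One consideration for the sampling question above is how often a simple random sample would contain few or no households from a small subgroup. The sketch below computes the hypergeometric probability that a simple random sample drawn without replacement contains no subgroup households at all. The function name and all population figures are hypothetical and chosen purely for illustration; they are not taken from any real housing data.

```python
from math import comb


def prob_no_subgroup_in_srs(population, subgroup, sample_size):
    """P(a simple random sample without replacement has zero subgroup members)."""
    if not 0 <= subgroup <= population:
        raise ValueError("subgroup must be between 0 and population")
    if not 0 <= sample_size <= population:
        raise ValueError("sample_size must be between 0 and population")
    # Ways to draw the whole sample from non-subgroup households,
    # divided by all possible samples of that size.
    return comb(population - subgroup, sample_size) / comb(population, sample_size)


# Hypothetical figures: 10,000 households, 300 of them in the subgroup.
for n in (25, 50, 100):
    p = prob_no_subgroup_in_srs(10_000, 300, n)
    print(f"sample of {n}: P(no subgroup households) = {p:.3f}")
```

The probability falls as the sample grows, which is one way to see why small simple random samples can under-represent a minority group.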