STAT 202 Thursday September 13, 2012 Miller Name______________________________
Lab 3: Scatter Plots
Please submit your completed lab on Blackboard.
Part 1: More with Assessing Normality
Billionaires (data file is on Blackboard-Data-billionaires.txt)
By the usual definition of normal, being a billionaire is not at all normal. But if we examine the (relatively) small subset of the world’s population made up of the world’s billionaires from 1992, some of their demographics may show some Normality. Here you will examine the distributions of these demographics. Fortune magazine reported the following demographics for the billionaires: wealth (in billions), age (*=unknown), and the region where they are from (Asia, Europe, Middle East, United States, or Other).
1. Begin by examining the distribution of each variable. Give a brief description of the overall pattern of each distribution below. (For quantitative variables, discuss shape, center, and spread. For categorical variables, discuss percentages.)
· Wealth-
· Age-
· Region-
2. To get a complete look at how well the distribution of the billionaires’ ages fits a Normal distribution, look at the Normal quantile plot (Graphics-QQ plot). Include the plot below.
3. What conclusions can you make about the distribution of the billionaires’ ages based on the plot in #2 and why? Keep in mind that the horizontal axis corresponds to Z.
4. How do your conclusions relate to what we have discussed about the correlation, r? Please include an estimate of the value you think r would have.
5. There are three very obvious outliers that do not fit with the pattern that the rest of the data follows. Identify the ages of the outliers. Explain why they appear as outliers in the quantile plot by looking back at the distribution of ages. Also explain why, logically, it makes sense for these individuals to be outliers.
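For reference, the idea behind the Normal quantile plot in #2 and the correlation in #4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ages below are made up (they are not the billionaires data), the function names are hypothetical, and the plotting positions (i - 0.5)/n are one common convention that may differ from what your software uses.

```python
from math import sqrt
from statistics import NormalDist, mean


def pearson_r(xs, ys):
    """Pearson correlation r between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def normal_quantile_points(values):
    """Pair each sorted value with the Z score a standard Normal
    distribution would place at the same rank."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z, ordered))


# Made-up ages for illustration only (NOT the lab data).
ages = [41, 47, 52, 55, 58, 60, 62, 64, 67, 70, 74, 81]

points = normal_quantile_points(ages)
z_scores = [z for z, _ in points]
sorted_ages = [age for _, age in points]

# r close to 1 means the points fall near a straight line,
# which is what data from a Normal distribution tend to produce.
qq_r = pearson_r(z_scores, sorted_ages)
print(f"correlation of the quantile-plot points: {qq_r:.3f}")
```

Plotting z_scores on the horizontal axis against sorted_ages on the vertical axis gives the quantile plot itself; points that bend away from the line at either end are the ones to look at when discussing outliers.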
Part 2: More with Scatter Plots (data file is on Blackboard: fuel oil-Bahrain)
The given data were reported by the UN Statistics Division as part of JODI (Joint Oil Data Initiative, http://www.jodidata.org/WJODI.shtm). JODI was formed in 2003 with “the objective of improving the quality and transparency of international oil statistics.” Basically, most of the world felt that it was only fair to be able to see the data behind the frequent fluctuations in oil prices. Over 90 countries now contribute data. This data set shows the available information about the amount of fuel oil that Bahrain exported (“Export”) and the amount of refined fuel oil Bahrain produced (“Refinery Output”) over several years. Both are measured in thousand metric tons.
6. I want you to examine the relationship between “Export” and “Refinery Output.” Treat “Refinery Output” as the explanatory variable. Display the data together with a scatterplot.
7. Find the value of the correlation between “Export” and “Refinery Output.”
8. What conclusions can you make based on your scatterplot and calculated correlation value? Please be very specific and include both a statistical interpretation and a practical interpretation.
9. There is an outlier in your data set. It is the data value for July 2007 when the Refinery Output was 141,000 metric tons and the Export was 253,000 metric tons. What happens to the correlation value when this point is removed? Do you notice any changes in the pattern of the scatterplot?
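The comparison asked for in #7 and #9 (correlation with and without one point) can be sketched as follows. This is a minimal illustration, not the lab's method: only the July 2007 point (141, 253) comes from the text above, every other pair is made up, and the names used are hypothetical. Because the other values are invented, the numbers printed here say nothing about what happens in the real Bahrain data.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation r between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation(pairs):
    """Correlation for a list of (refinery_output, export) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    return pearson_r(xs, ys)


# (Refinery Output, Export), both in thousand metric tons.
# Only OUTLIER is taken from the lab text; the rest are made-up values.
OUTLIER = (141, 253)
records = [
    (610, 420), (655, 470), (700, 455),
    (720, 510), (760, 540), (805, 530),
    OUTLIER,
]

r_all = correlation(records)

# Drop the July 2007 point and recompute.
trimmed = [pair for pair in records if pair != OUTLIER]
r_trimmed = correlation(trimmed)

print(f"r with all points:        {r_all:.3f}")
print(f"r without July 2007 point: {r_trimmed:.3f}")
```

Comparing the two printed values, together with scatterplots drawn before and after removing the point, shows how much influence a single observation has on r.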