Math/Statistics (2nd)
Answer the following problems showing your work and explaining (or analyzing) your results.
1. Describe the measures of central tendency. Under what condition(s) should each one be used?
2. Last year, 12 employees from a computer company retired. Their ages at retirement are listed below. First, create a stem plot for the data. Next, find the mean retirement age. Round to the nearest year.
55 77 64 77 69 63 62 64 85 64 56 59
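One way to check a hand-drawn stem plot for Question 2 is the short sketch below. It assumes the usual convention of the tens digit as the stem and the ones digit as the leaf; the function name `stem_plot` is made up for this illustration.

```python
from collections import defaultdict

# Retirement ages from Question 2.
ages = [55, 77, 64, 77, 69, 63, 62, 64, 85, 64, 56, 59]

def stem_plot(values):
    """Group two-digit values by tens digit (stem) and collect ones digits (leaves)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(sorted(stems.items()))

plot = stem_plot(ages)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")

# Mean retirement age, rounded to the nearest year as the question asks.
mean_age = sum(ages) / len(ages)
print("mean retirement age:", round(mean_age))
```

Each printed row is one stem followed by its sorted leaves, so the row lengths show where the ages cluster.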
3. A retail store manager kept track of the number of car magazines sold each week over a 10-week period. The results are shown below.
27 30 21 62 28 18 23 22 26 28
a. Find the mean, median, and mode of car magazines sold over the 10-week period.
b. Which measure(s) of central tendency best represent the data?
c. Name any outliers.
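The three measures in Question 3 can be checked with Python's standard `statistics` module. The outlier test in this sketch uses the common 1.5 × IQR rule, which is an assumption here because the assignment does not say how outliers should be identified. Quartile conventions also differ between textbooks, so the fences may not match a hand calculation exactly.

```python
import statistics

# Weekly car magazine sales from Question 3.
weekly_sales = [27, 30, 21, 62, 28, 18, 23, 22, 26, 28]

mean_sales = statistics.mean(weekly_sales)
median_sales = statistics.median(weekly_sales)
modes = statistics.multimode(weekly_sales)  # list of the most frequent value(s)

# Assumed outlier rule: flag values more than 1.5 * IQR beyond the quartiles.
q1, _, q3 = statistics.quantiles(weekly_sales, n=4)
iqr = q3 - q1
outliers = [x for x in weekly_sales if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]

print("mean:", mean_sales, "median:", median_sales, "mode(s):", modes)
print("outliers:", outliers)
```

Comparing the mean with the median is a quick way to see how far a single extreme week pulls the average.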
4. Joe wants to pass his statistics class with at least a 75%. His prior four test scores are 74%, 68%, 84% and 79%. What is the minimum score he needs on the final exam to pass the class with a 75% average?
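The arithmetic behind Question 4 can be sketched as follows. It assumes the final exam counts the same as each of the four earlier tests, which the problem implies but does not state outright.

```python
# Joe's four prior test scores, in percent.
prior_scores = [74, 68, 84, 79]
target_average = 75

# Assumption: five equally weighted scores (four tests plus the final exam).
num_scores = len(prior_scores) + 1

# The five scores must total target_average * num_scores points.
required_final = target_average * num_scores - sum(prior_scores)
print("minimum final exam score:", required_final)
```

If the final exam were weighted differently, the required total would change and so would the minimum score.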
5. Nancy participated in a summer reading program. The numbers of books read by the 23 participants are as follows:
10 9 6 2 5 3 9 1 6 3 10 4 7 6 3 5 6 2 6 5 3 7 2
| Number of books read | Frequency |
|---|---|
| 1–2 | |
| 3–4 | |
| 5–6 | |
| 7–8 | |
| 9–10 | |
a. Complete the frequency table.
b. Find the mean of the data.
c. Find the median of the data.
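A tally for the Question 5 frequency table can be checked with a sketch like this one. The class intervals are taken from the table; the variable names are illustrative only.

```python
import statistics

# Books read by the 23 participants in Question 5.
books = [10, 9, 6, 2, 5, 3, 9, 1, 6, 3, 10, 4, 7, 6, 3, 5, 6, 2, 6, 5, 3, 7, 2]

# Class intervals from the frequency table (both endpoints inclusive).
bins = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
frequency = {f"{lo}-{hi}": sum(lo <= b <= hi for b in books) for lo, hi in bins}

mean_books = statistics.mean(books)
median_books = statistics.median(books)

for interval, count in frequency.items():
    print(f"{interval:>5} | {count}")
print("mean:", round(mean_books, 2), "median:", median_books)
```

The frequencies should add up to the number of participants, which is a useful check on the tally.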
6. The chart below represents the number of inches of snow for a seven-day period.
| Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday |
|---|---|---|---|---|---|---|
| 2 | 5 | 3 | 10 | 0 | 4 | 2 |
a. Find the mean, median, and mode.
b. Which is the best measure of central tendency?
c. Remove Wednesday from the calculations. How does that impact the three measures of central tendency?
d. Describe the effect outliers have on the measures of central tendency.
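Part c of Question 6 can be explored by computing the three measures twice, once with every day and once without Wednesday. The helper `summarize` is a made-up name for this sketch.

```python
import statistics

# Inches of snow per day from Question 6.
snowfall = {
    "Sunday": 2, "Monday": 5, "Tuesday": 3, "Wednesday": 10,
    "Thursday": 0, "Friday": 4, "Saturday": 2,
}

def summarize(values):
    """Return the mean, median, and mode(s) of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),
    }

all_days = summarize(list(snowfall.values()))
without_wednesday = summarize(
    [inches for day, inches in snowfall.items() if day != "Wednesday"]
)

print("all days:         ", all_days)
print("without Wednesday:", without_wednesday)
```

Placing the two summaries side by side shows which measures move when the largest value is dropped and which stay put.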
7. A dealership sold 15 cars last month. The purchase price of the cars, rounded to the nearest thousand, is represented in the table.
| Purchase price | Number of cars sold |
|---|---|
| $15,000 | 3 |
| $20,000 | 4 |
| $23,000 | 5 |
| $25,000 | 2 |
| $45,000 | 1 |
a. Find the mean and median of the data.
b. Which measure best represents the data? Use the results to support your answer.
c. What is the outlier and how does it affect the data?
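Because the Question 7 data are given as a frequency table, the mean is a weighted mean: each price is multiplied by the number of cars sold at that price. A minimal sketch, with illustrative variable names:

```python
import statistics

# Purchase price (dollars) -> number of cars sold, from the Question 7 table.
sales = {15_000: 3, 20_000: 4, 23_000: 5, 25_000: 2, 45_000: 1}

total_cars = sum(sales.values())

# Weighted mean: total revenue divided by total cars sold.
mean_price = sum(price * count for price, count in sales.items()) / total_cars

# For the median, expand the table into one price per car sold.
prices = [price for price, count in sales.items() for _ in range(count)]
median_price = statistics.median(prices)

print("cars sold:", total_cars)
print("mean price:", round(mean_price, 2))
print("median price:", median_price)
```

Expanding the table into a flat list is practical for 15 cars; for large counts, the median can instead be found from cumulative frequencies.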
8. What do the letters represent on the box plot?
9. The test scores from a math final exam are as follows:
64 85 93 55 87 90 73 81 86 79
a. Create a box plot using the data.
b. Label the five points on the box plot and include numerical answers from part "a."
10. Using the data and results from Question 9, answer the following questions.
a. What is the median?
b. What is the range?
c. What is the interquartile range?
d. In a short paragraph, describe the data in the box plot.
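The five points of the box plot in Questions 9 and 10 can be checked with the sketch below. It assumes the "median of each half" convention for quartiles that many introductory courses use; other conventions, including the defaults in some software, give slightly different quartiles. The function name `five_number_summary` is illustrative.

```python
import statistics

# Math final exam scores from Question 9.
scores = [64, 85, 93, 55, 87, 90, 73, 81, 86, 79]

def five_number_summary(values):
    """Minimum, Q1, median, Q3, maximum, with quartiles as medians of each half."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return {
        "min": data[0],
        "q1": statistics.median(lower),
        "median": statistics.median(data),
        "q3": statistics.median(upper),
        "max": data[-1],
    }

summary = five_number_summary(scores)
data_range = summary["max"] - summary["min"]
iqr = summary["q3"] - summary["q1"]

print(summary)
print("range:", data_range, "interquartile range:", iqr)
```

The range covers all of the scores, while the interquartile range covers only the middle half, so comparing the two shows how much the extremes stretch the plot.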
Continued:
Using the data you collected in Module 1, write a paper (1–3 pages) that includes all of the following content:
· Calculate the mean, median, and mode of your collected data. Show and explain your calculations.
· Are these numbers higher or lower than you expected? Explain.
· Which of these measures of central tendency do you think most accurately describes the variable you are looking at? Provide your justification.
· Create a box plot to represent the data, labeling all five points on the box plot and giving the numerical value of each. You may draw the plot and insert it in your paper as a picture. Make sure it is legible.
Data from Module 1 Provided Below:
The assignment this week is to collect quantitative data from your daily activities for a minimum of 10 days. Some examples of data to collect are:
· The number of minutes you spend studying every day.
· The time it takes to cook meals each day.
· The amount of daily time spent talking on the phone.
· The amount of time you drive each day.
In a paper (1–3 pages), describe the data you are going to collect and how you are going to keep track of the time. Within the paper, incorporate the concepts we are learning in the module including (but not limited to) probability theory, independent and dependent variables, and theoretical and experimental probability. Discuss your predictions of what you anticipate the data to look like and events that can skew the data. Collect data for at least 10 days. Do you think the data will provide a valid representation of these activities? Explain why or why not.
Data Collection Project:
Every event in life that can be measured can be quantified as a variable. The variable is either independent or dependent, and in either case it can be measured and used. One variable that I am testing is the amount of time it takes me to get to work each day. For the past two weeks, I have measured my commute time from turning on the engine to turning off the engine. This variable is quantitative and dependent: many factors affect the time involved in driving to work. The most obvious factors are traffic conditions, construction, and accidents that close routes. Before I collected the data, I hypothesized that they would be best modeled by a bell curve, since on some days I would arrive early and on others late, but mostly in the middle. However, the factors above skewed the data to the right. On eight of the ten days, there was construction in progress on the road outside my neighborhood, so the commute was significantly longer. Because of the construction, a sample of ten days does not provide a valid representation of the activity. Construction does occur from time to time, but having it on eight of the ten days was unusual. If the sample size were larger, such as 40 to 50 days, all types of factors (traffic, construction, accidents, and so on) could be incorporated to provide an accurate representation of the activity.
However, the time it takes to get to work is less important than whether I arrived on time. For each of the days, I also recorded whether I arrived on time as a Boolean variable with two values: arriving on time (1) and not arriving on time (0). Of the ten days, I arrived on time on seven and late on three. Therefore, the experimental probability of arriving on time to work is:
P(on time) = (days on time) / (days observed) = 7/10 = 0.70
The experimental probability of arriving on time from this dataset is 0.70. However, the theoretical probability is unknown, because the commute depends on random factors. From the perspective of this dataset, a 70% probability of arriving on time is not a good sign: assuming I work every weekday for one year (about 260 days), I would arrive late 30% of the time, or on about 78 days. As with the commute times, a larger sample of days would give a more accurate experimental probability of arriving on time, one that would tend to get closer to the theoretical probability.
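The calculation in the two paragraphs above can be reproduced with a few lines of Python. This is a sketch that uses only the counts reported in the paper. The figure of 260 working days per year is an assumption (52 weeks of 5 weekdays, with no holidays), chosen because it is consistent with the 78 late days stated above.

```python
# Counts reported in the paper.
days_observed = 10
days_on_time = 7
days_late = days_observed - days_on_time

# Experimental probability = favorable outcomes / total trials.
p_on_time = days_on_time / days_observed

# Assumption: 52 weeks * 5 weekdays, ignoring holidays and vacation.
workdays_per_year = 52 * 5

# Expected late days per year if the observed late rate held all year.
expected_late_days = days_late * workdays_per_year / days_observed

print("experimental P(on time):", p_on_time)
print("expected late days per year:", expected_late_days)
```

With only ten trials, one more late or on-time day would shift the estimate by a full 0.10, which illustrates why a larger sample gives a steadier figure.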