Statistics assignments deadline 3 days

profileMaisa
fwdstatsassignments.zip

MATH 215 Assignment 2.docx

MATH 215 Assignment 2

Khadra Sulub

Victor Olobatuyi

March 30, 2022

Assignment 2

Overview

Total marks:   / 71

This assignment covers content from Unit 2 of the course. It assesses your knowledge of the concepts and rules that allow us to compute the probabilities related to events that occur when conditions are uncertain.

Instructions

Show all your work and justify all of your answers and conclusions, except for the True/False questions.

Keep your work to 4 decimals, unless otherwise stated.

(4 marks)

Circle True (T) or False (F) for each of the following statements:

T F {H, T} represents the sample space for an experiment of flipping two coins, each with heads (H) on one side and tails (T) on the other.

T F 13/12 represents a possible value of a probability.

T F Suppose that the probability of event A occurring is 1/5 and the probability of event B occurring is 3/7. If events A and B are mutually exclusive, the probability that A and B occur together is 0.

T F If a researcher samples without replacement, then future probabilities of the sampling process are unaffected by prior probabilities.

(7 total marks)

At a recent convention, a group of 60 doctors were classified according to their specialties. The number of doctors in each specialty was summarized as follows:

Pediatrician: 18
General Practitioner: 29
Surgeon: 4
Dermatologist: 9

If a doctor is selected at random, what is the probability that:

(1 mark)

the doctor is a dermatologist?

The probability is 0.15

(2 marks)

the doctor is either a pediatrician or a general practitioner?

The probability is (18 + 29)/60 = 47/60 = 0.7833

(2 marks)

the doctor is not a pediatrician?

The probability is 1 - 18/60 = 42/60 = 0.7000

(2 marks)

the doctor is neither a general practitioner nor a dermatologist?

The probability is (18 + 4)/60 = 22/60 = 0.3667
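These four probabilities can be checked by counting directly. The following is a minimal sketch, assuming the specialty counts stated in the question; the helper name `prob` is illustrative only.

```python
from fractions import Fraction

# Specialty counts as stated in the question (60 doctors in total).
counts = {
    "Pediatrician": 18,
    "General Practitioner": 29,
    "Surgeon": 4,
    "Dermatologist": 9,
}
total = sum(counts.values())


def prob(*specialties):
    """Probability that a randomly selected doctor has one of the given specialties."""
    return Fraction(sum(counts[s] for s in specialties), total)


p_derm = prob("Dermatologist")
# Specialties are mutually exclusive, so the counts simply add.
p_ped_or_gp = prob("Pediatrician", "General Practitioner")
# Complement rule: P(not E) = 1 - P(E).
p_not_ped = 1 - prob("Pediatrician")
p_neither_gp_nor_derm = 1 - prob("General Practitioner", "Dermatologist")

for label, p in [
    ("dermatologist", p_derm),
    ("pediatrician or general practitioner", p_ped_or_gp),
    ("not a pediatrician", p_not_ped),
    ("neither general practitioner nor dermatologist", p_neither_gp_nor_derm),
]:
    print(f"{label}: {p} = {float(p):.4f}")
```

Exact fractions are used so that rounding to 4 decimals happens only when printing.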

(19 marks)

An analysis of blood donors examined blood type (A, B, AB or O) and whether the donor was male or female. The data is represented in the following table:

Gender        A     B     AB    O
Male (M)      35    16    2     40
Female (F)    29    25    5     38

In the questions below, round all calculated probabilities to 4 decimal places.

(3 marks)

1. What is the probability that a randomly selected donor is female?

There are 190 donors in total (93 male, 97 female). The probability is 97/190 = 0.5105

(4 marks)

1. What is the probability that a randomly selected donor is either male or has type O blood?

P(M or O) = P(M) + P(O) - P(M and O) = 93/190 + 78/190 - 40/190 = 131/190 = 0.6895

(4 marks)

1. What is the probability that a randomly selected donor is neither female nor has type AB blood?

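A donor who is neither female nor type AB is a male with type A, B or O blood, so the probability is (35 + 16 + 40)/190 = 91/190 = 0.4789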

(2 marks)

1. What is the probability that a selected donor has type AB blood, given that he is male?

P(AB | M) = P(AB and M)/P(M) = 2/93 = 0.0215

(2 marks)

1. Are the events that a person has type A blood (denoted by A) and that a person is female (denoted by F) mutually exclusive? Explain.

No. Two events are mutually exclusive only if they cannot occur together, that is, P(A ∩ F) = 0. In this sample 29 donors are female with type A blood, so P(A ∩ F) = 29/190 = 0.1526, which is not 0.

(4 marks)

1. Are the events that a person has type B blood (denoted by B) and that a person is female (denoted by F) independent? Justify your conclusion using the appropriate rule(s) of probability.

B and F are independent only if P(B ∩ F) = P(B) × P(F). Here P(B) = 41/190 = 0.2158 and P(F) = 97/190 = 0.5105, so P(B) × P(F) = 0.1102, while P(B ∩ F) = 25/190 = 0.1316. Because 0.1316 ≠ 0.1102, the events are not independent.
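The addition rule, conditional probability and the independence check used in this question can be sketched as follows. This is a minimal illustration, assuming the donor counts in the table above; the helper names (`p_gender`, `p_type`, `p_joint`) are not part of the assignment.

```python
from fractions import Fraction

# Donor counts from the table: rows are gender, columns are blood type.
table = {
    "M": {"A": 35, "B": 16, "AB": 2, "O": 40},
    "F": {"A": 29, "B": 25, "AB": 5, "O": 38},
}
n = sum(sum(row.values()) for row in table.values())


def p_gender(g):
    """Marginal probability of a gender."""
    return Fraction(sum(table[g].values()), n)


def p_type(t):
    """Marginal probability of a blood type."""
    return Fraction(sum(row[t] for row in table.values()), n)


def p_joint(g, t):
    """Joint probability of a gender and a blood type."""
    return Fraction(table[g][t], n)


p_female = p_gender("F")
# Addition rule: P(M or O) = P(M) + P(O) - P(M and O).
p_male_or_o = p_gender("M") + p_type("O") - p_joint("M", "O")
# Conditional probability: P(AB | M) = P(AB and M) / P(M).
p_ab_given_male = p_joint("M", "AB") / p_gender("M")
# A and F are mutually exclusive only if their joint probability is zero.
a_f_mutually_exclusive = p_joint("F", "A") == 0
# B and F are independent only if P(B and F) equals P(B) * P(F).
b_f_independent = p_joint("F", "B") == p_type("B") * p_gender("F")
```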

(13 total marks)

Three patients are accepted into a clinical trial for a new drug. According to the severity of the condition of each patient, the doctors estimate that the probability that the drug will be effective for Patient A is 0.7, the probability that it will be effective for Patient B is 0.2, and the probability that it will be effective for Patient C is 0.6. Assume that the success of the drug for the three patients is independent.

(8 marks)

1. Draw a tree diagram displaying all outcomes and joint probabilities measuring the effectiveness of the drug for all three patients.

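Let E = drug effective and N = drug not effective. The tree has one level per patient: Patient A (E 0.7, N 0.3), then Patient B (E 0.2, N 0.8), then Patient C (E 0.6, N 0.4). Because the results are independent, each joint probability is the product of the probabilities along its branch:

A   B   C   Joint probability
E   E   E   0.7 × 0.2 × 0.6 = 0.084
E   E   N   0.7 × 0.2 × 0.4 = 0.056
E   N   E   0.7 × 0.8 × 0.6 = 0.336
E   N   N   0.7 × 0.8 × 0.4 = 0.224
N   E   E   0.3 × 0.2 × 0.6 = 0.036
N   E   N   0.3 × 0.2 × 0.4 = 0.024
N   N   E   0.3 × 0.8 × 0.6 = 0.144
N   N   N   0.3 × 0.8 × 0.4 = 0.096
            Total           = 1.000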

(1 mark)

1. What is the probability that the drug will be effective for Patient B only?

The probability is P(not effective for A) × P(effective for B) × P(not effective for C) = 0.3 × 0.2 × 0.4 = 0.0240

(1 mark)

1. What is the probability that the drug will be effective for at least one of the patients?

The probability is 1 - P(effective for none) = 1 - (0.3 × 0.8 × 0.4) = 1 - 0.096 = 0.9040

(3 marks)

1. What is the probability that the drug will be effective for exactly two of the three patients?

The probability is (0.7 × 0.2 × 0.4) + (0.7 × 0.8 × 0.6) + (0.3 × 0.2 × 0.6) = 0.056 + 0.336 + 0.036 = 0.4280
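The eight branches of the tree can also be enumerated programmatically. The following is a minimal sketch, assuming the three effectiveness probabilities stated in the question and independence between patients; the variable names are illustrative.

```python
from itertools import product

# Stated probabilities that the drug is effective for each patient.
p_effective = {"A": 0.7, "B": 0.2, "C": 0.6}

# Each outcome is a tuple of booleans (effective for A, B, C). Independence
# lets us multiply the branch probabilities along each path of the tree.
outcomes = {}
for result in product([True, False], repeat=3):
    prob = 1.0
    for patient, effective in zip("ABC", result):
        prob *= p_effective[patient] if effective else 1 - p_effective[patient]
    outcomes[result] = prob

p_b_only = outcomes[(False, True, False)]
p_at_least_one = 1 - outcomes[(False, False, False)]
p_exactly_two = sum(p for r, p in outcomes.items() if sum(r) == 2)

for result, prob in outcomes.items():
    label = "".join("E" if ok else "N" for ok in result)
    print(f"{label}: {prob:.4f}")
```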

(11 total marks)

The Math Club is picking names out of a hat to determine who will serve as the Executive for their group. The person whose name is drawn first will serve as the President, and the person whose name is drawn second will be Vice-President. The Math Club has 13 members: 5 members whose focus of study is Statistics and 8 members whose focus of study is Calculus. [ Hint: A tree diagram would be helpful in analyzing this problem.]

What is the probability that:

(2 marks)

1. Both members chosen for the executive have Calculus as their focus of study?

The probability is (8/13) × (7/12) = 56/156 = 0.3590

(5 marks)

1. Any one member of the Executive has Statistics as a focus, and the other has Calculus as a focus?

The probability is (5/13) × (8/12) + (8/13) × (5/12) = 80/156 = 0.5128

(2 marks)

1. The chosen President has Calculus as a focus and the chosen Vice-President has Statistics as a focus?

The probability is (8/13) × (5/12) = 40/156 = 0.2564

(2 marks)

1. At least one member of the chosen Executive has a focus in Calculus?

The probability is 1 - P(both Statistics) = 1 - (5/13) × (4/12) = 1 - 20/156 = 0.8718
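Because names are drawn without replacement, the second draw depends on the first. A minimal sketch of the two-stage tree, assuming 5 Statistics and 8 Calculus members as stated; the function `draw_two` is a hypothetical helper:

```python
from fractions import Fraction

stats, calc = 5, 8
total = stats + calc


def draw_two(first, second):
    """P(first name drawn is `first`, second is `second`), without replacement."""
    remaining = {"S": stats, "C": calc}
    p1 = Fraction(remaining[first], total)
    remaining[first] -= 1  # the first name is not returned to the hat
    p2 = Fraction(remaining[second], total - 1)
    return p1 * p2


p_both_calc = draw_two("C", "C")
p_one_each = draw_two("S", "C") + draw_two("C", "S")
p_calc_then_stats = draw_two("C", "S")
p_at_least_one_calc = 1 - draw_two("S", "S")
```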

(6 total marks)

A survey questioned 200 individuals regarding their intention to vote (Conservative, Liberal or Other) in an upcoming election. It was found that 50% of the sample planned to vote Conservative and 40% planned to vote Liberal. Fifty percent of the men indicated that they planned to vote Liberal, and forty percent of the men planned to vote Conservative. Overall, 30% of the sample were women.

(5 marks)

1. Construct a two-way classification for these survey results.

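Men make up 70% of the sample (140 of 200) and women 30% (60). Of the men, 40% (56) plan to vote Conservative and 50% (70) Liberal, leaving 14 Other. Overall, 50% (100) plan to vote Conservative and 40% (80) Liberal, leaving 20 Other; the women's counts follow by subtraction.

         Conservative   Liberal   Other   Total
Men      56             70        14      140
Women    44             10        6       60
Total    100            80        20      200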

(1 mark)

1. Circle True (T) or False (F) for the following statement:

T F In this example, male and female are complementary events.

(11 total marks)

You are given the probabilities of events A, B and C as listed below:

(2 marks)

1.

Find .

P( A and C)

= 0.14

(2 marks)

1.

Find .

= 1.14

(2 marks)

1.

Find .

= P(B ∩ C)/P(C)

= 0.571

(2 marks)

1. Are B and C mutually exclusive events? Why?

B and C have no members in common, you cannot have all tails and heads at a time

(3 marks)

1. Are B and C independent events? Provide a mathematical justification of your conclusion using the appropriate rule(s) of probability.

B and C are independent events for example P(A and B and C)= P(A)*P(B)*P(C)

MATH 215 Assignment 3.docx

MATH 215 Assignment 3

Khadra Sulub

Victor Olobatuyi

April 5, 2022

Assignment 3

Overview

Total marks:   / 75

This assignment covers content from Unit 3 of the course. It assesses your knowledge of random variables, types of random variables and various types of probability distributions, along with their means and standard deviations.

Instructions

Show all your work and justify all of your answers and conclusions, except for the True/False questions.

Keep your work to 4 decimals, unless otherwise stated.

(4 marks)

Circle True (T) or False (F) for each of the following statements:

T F The following table, which lists values of x and their probabilities, represents a valid probability distribution:

x    P(x)
3    0.32
4    0.54
5    0.24

T F The following table, which lists values of x and their probabilities, represents a valid probability distribution:

x    P(x)
0    0.09
1    0.28
2    0.42
3    0.39

T F The speed of a car travelling on the Queen Elizabeth Highway is an example of a continuous variable.

T F The binomial distribution can be used only when the probabilities of the two possible outcomes are equal.

(16 marks)

The following table lists the frequency distribution of the number of vehicles owned per household from a sample of 200 households:

x    0     1      2     3     4    5
f    33    106    45    10    4    2

(4 marks)

Construct a probability distribution table for the number of vehicles owned per household.

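x       f      P(x) = f/200   x·P(x)   x²·P(x)
0       33     0.165          0.000    0.000
1       106    0.530          0.530    0.530
2       45     0.225          0.450    0.900
3       10     0.050          0.150    0.450
4       4      0.020          0.080    0.320
5       2      0.010          0.050    0.250
Total   200    1.000          1.260    2.450

(2 marks)

Calculate the mean of this probability distribution. Hint: Consider adding the appropriate column to the table created in part (a).

Mean = Σ x·P(x) = 0.53 + 0.45 + 0.15 + 0.08 + 0.05 = 1.26 vehicles per household

(4 marks)

Calculate the standard deviation of this probability distribution. Hint: Consider adding the appropriate columns to the table created in part (a).

SD = √(Σ x²·P(x) - mean²) = √(2.45 - 1.26²) = √0.8624 = 0.9287

(4 marks)

Give a brief interpretation (one or two sentences each) of the values of the mean and the standard deviation.

The mean of 1.26 is the average number of vehicles owned per household; it represents the center of the distribution.

The standard deviation of 0.9287 measures the typical spread of the number of vehicles per household around the mean, so most households own within about one vehicle of the average.

(2 marks)

What is the probability that a household selected at random will have at least two vehicles?

The probability is P(x ≥ 2) = 0.225 + 0.050 + 0.020 + 0.010 = 0.3050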
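The mean and standard deviation of a discrete distribution follow directly from the relative frequencies. A minimal sketch, assuming the frequency table of 200 households given in the question; the variable names are illustrative.

```python
from math import sqrt

# Frequencies from the sample of 200 households.
freq = {0: 33, 1: 106, 2: 45, 3: 10, 4: 4, 5: 2}
n = sum(freq.values())

# Relative frequencies serve as the probability distribution P(x).
pmf = {x: f / n for x, f in freq.items()}

mean = sum(x * p for x, p in pmf.items())
# Shortcut formula: variance = sum(x^2 * P(x)) - mean^2.
variance = sum(x**2 * p for x, p in pmf.items()) - mean**2
sd = sqrt(variance)

p_at_least_two = sum(p for x, p in pmf.items() if x >= 2)

print(f"mean = {mean:.4f}, sd = {sd:.4f}, P(x >= 2) = {p_at_least_two:.4f}")
```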

(13 total marks)

When transferring a goldfish to a new water source, such as a different fish tank, there is an 8% chance that the goldfish will die within the first week.

If we select at random 5 goldfish that have been transferred to a new water source, what is the probability:

(3 marks)

1. that exactly one of them will die within the first week?

Ans: With n = 5 and p = 0.08, P(x = 1) = C(5,1)(0.08)(0.92)^4 = 0.2866

(6 marks)

1. that fewer than three of them will die within the first week?

Ans: P(x < 3) = P(0) + P(1) + P(2) = 0.6591 + 0.2866 + 0.0498 = 0.9955

(2 marks)

1. that at least one of them will die within the first week?

Ans: P(x ≥ 1) = 1 - P(0) = 1 - (0.92)^5 = 1 - 0.6591 = 0.3409

(2 marks)

1. Circle True (T) or False (F) for each of the following statements:

T F If we randomly select 6 goldfish that have been transferred instead of 5, the experiment continues to satisfy the conditions for a binomial experiment.

T F John transfers each goldfish to the same bowl. In this case, the chance that a goldfish will die goes up by 1% for each additional goldfish that is selected. This new experiment continues to satisfy the conditions for a binomial experiment.
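The goldfish probabilities in this question follow the binomial formula P(x) = C(n, x) p^x (1 - p)^(n - x). A minimal sketch, assuming n = 5 independent goldfish and p = 0.08 as stated; the helper name `binom_pmf` is illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 5, 0.08

p_exactly_one = binom_pmf(1, n, p)
p_fewer_than_three = sum(binom_pmf(k, n, p) for k in range(3))  # k = 0, 1, 2
p_at_least_one = 1 - binom_pmf(0, n, p)  # complement of "none die"

print(f"{p_exactly_one:.4f} {p_fewer_than_three:.4f} {p_at_least_one:.4f}")
```

The formula applies only while p stays the same for every fish and the outcomes are independent, which is the point of the two True/False statements.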

(11 total marks)

Thirty percent of students graduate from high school before they reach the age of 18. In a random sample of 16 high-school graduates, what is the probability that:

[Hint: Use the binomial table.]

(3 marks)

1. more than 10 of them graduated before they were 18 years old?

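With n = 16 and p = 0.30, P(x > 10) = P(11) + P(12) + ... + P(16) ≈ 0.0016 (summing the rounded table entries 0.0013 + 0.0002 gives 0.0015)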

(3 marks)

1. at most 4 of them graduated before they were 18?

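With n = 16 and p = 0.30, P(x ≤ 4) = P(0) + P(1) + P(2) + P(3) + P(4) = 0.0033 + 0.0228 + 0.0732 + 0.1465 + 0.2040 = 0.4498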

(3 marks)

1. fewer than 7 of them graduated after they turned 18?

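Fewer than 7 of the 16 graduating after turning 18 means at least 10 graduated before 18. With n = 16 and p = 0.30, P(x ≥ 10) = P(10) + P(11) + P(12) + ... = 0.0056 + 0.0013 + 0.0002 = 0.0071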

(2 marks)

1. Would the binomial probability distribution representing the sample in Question 4a be skewed? If yes, in what direction? Describe the shape of the distribution in context of the study.

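Yes. With n = 16 and p = 0.30, the probability of success is less than 0.50, so the distribution is skewed to the right.

In context, most samples of 16 graduates will contain only a few who graduated before 18 (the mean is np = 4.8), and samples with many early graduates are increasingly unlikely.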

(5 total marks)

Use the standard normal distribution table to find:

(2 marks)

1.

z     +0.00    +0.01    +0.02    +0.03    +0.04    +0.05    +0.06    +0.07    +0.08
1.1   0.5623   0.5625   0.5627   0.5631   0.5632   0.5635   0.5645   0.5646   0.5649
1.2   0.5723   0.5734   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5663
1.3   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5765
1.4   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5735
1.5   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635
1.6   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635
1.7   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635

(3 marks)

1.

z      +0.10    +0.20    +0.30    +0.40    +0.50    +0.60    +0.70    +0.80    +0.90
2.0    0.5623   0.5625   0.5627   0.5631   0.5632   0.5635   0.5645   0.5646   0.5649
2.01   0.5723   0.5734   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5663
2.02   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5765
2.03   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5735
2.04   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635
2.05   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635
2.06   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635   0.5635

(11 total marks)

The daily milk production of a dairy cow is normally distributed with a mean of 3,500 milliliters and a standard deviation of 250 milliliters.

(4 marks)

1. What is the probability that a cow selected at random will produce between 3,200 and 3,855 milliliters of milk per day?

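z = (3200 - 3500)/250 = -1.20 and z = (3855 - 3500)/250 = 1.42, so the probability is P(-1.20 < z < 1.42) = 0.9222 - 0.1151 = 0.8071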

(4 marks)

1. What is the probability that a cow selected at random will produce 3,900 milliliters of milk or more?

z = (3900 - 3500)/250 = 1.60, so the probability is P(z ≥ 1.60) = 1 - 0.9452 = 0.0548

(3 marks)

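1. Forty percent (40%) of cows will likely produce less than what amount of milk, in milliliters?

An area of 0.4000 to the left corresponds to z ≈ -0.25, so x = 3500 + (-0.25)(250) = 3437.5 milliliters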

(4 marks)

The time taken to assemble a car in a certain plant is a normal random variable having a mean of 20 hours. If 6.3% of the cars assembled take longer than 25 hours to assemble, what is the standard deviation of assembly time?

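P(x > 25) = 0.063 leaves an area of 0.9370 to the left of 25, which corresponds to z = 1.53. Then SD = (25 - 20)/1.53 = 3.2680 hours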
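The two normal-distribution questions above can be checked without a printed table using Python's standard library. This is a minimal sketch under the stated means and standard deviations; because `NormalDist` does not round z to two decimals, results can differ slightly from table-based answers.

```python
from statistics import NormalDist

# Daily milk production: normal with mean 3500 mL and SD 250 mL.
milk = NormalDist(mu=3500, sigma=250)

p_between = milk.cdf(3855) - milk.cdf(3200)
p_at_least_3900 = 1 - milk.cdf(3900)
x_40th = milk.inv_cdf(0.40)  # amount below which 40% of cows fall

# Assembly time: mean 20 hours and P(X > 25) = 0.063, so the z-score of 25
# is the standard normal quantile with area 1 - 0.063 to its left.
z = NormalDist().inv_cdf(1 - 0.063)
sigma_assembly = (25 - 20) / z

print(f"{p_between:.4f} {p_at_least_3900:.4f} {x_40th:.1f} {sigma_assembly:.4f}")
```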

(11 total marks)

A national poll found that 60% of Canadians believe that life exists on other planets. In a randomly selected sample of 300 Canadians:

(3 marks)

1. what is the probability that fewer than 200 people in the sample believe in extraterrestrial life?

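Using the normal approximation with mean np = 180 and SD = √(npq) = √72 = 8.4853, and a continuity correction: P(x < 200) = P(x ≤ 199.5) = P(z ≤ 2.30) = 0.9893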

(4 marks)

1. what is the probability that at least 160 people in the sample believe in extraterrestrial life?

With mean 180 and SD 8.4853, and a continuity correction: P(x ≥ 160) = P(x ≥ 159.5) = P(z ≥ -2.42) = 0.9922

(4 marks)

1. what is the probability that exactly 190 Canadians in the sample believe in extraterrestrial life?

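With mean 180 and SD 8.4853, and a continuity correction: P(x = 190) = P(189.5 ≤ x ≤ 190.5) = P(1.12 ≤ z ≤ 1.24) = 0.8925 - 0.8686 = 0.0239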
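For a sample this large, the binomial count is usually approximated by a normal distribution with a continuity correction of 0.5. A minimal sketch, assuming n = 300 and p = 0.60 as stated; values differ from table answers in the third or fourth decimal because z is not rounded here.

```python
from math import sqrt
from statistics import NormalDist

n, p = 300, 0.60

# Normal approximation to the binomial: mean np, SD sqrt(npq).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Continuity correction: move each boundary by 0.5 toward the included values.
p_fewer_than_200 = approx.cdf(199.5)                      # X <= 199
p_at_least_160 = 1 - approx.cdf(159.5)                    # X >= 160
p_exactly_190 = approx.cdf(190.5) - approx.cdf(189.5)     # X == 190

print(f"{p_fewer_than_200:.4f} {p_at_least_160:.4f} {p_exactly_190:.4f}")
```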

These extra pages are for additional calculations. If you need them for your solutions, please reference them in the appropriate place in the questions.

MATH 215 Assignment 4.docx

MATH 215 Assignment 4

Khadra Sulub

Victor Olobatuyi

April 13, 2022

Assignment 4

Overview

Total marks:   / 92

This assignment covers content from Unit 4 of the course. It assesses your knowledge of sampling distributions that refer to probability distributions of sample statistics, such as the sample mean and sample proportion, and your ability to use sampling distributions in estimation and in hypothesis testing about population means and population proportions.

Instructions

Show all your work and justify all of your answers and conclusions, except for the True/False questions.

Keep your work to 4 decimals, unless otherwise stated.

Note: Finishing a test of hypotheses with a statement like “reject H0” or “do not reject H0” will be insufficient for full marks. You must also provide a written concluding statement in the context of the problem itself. For example, if you are testing hypotheses about the effectiveness of a medical treatment, you must conclude with a statement like, “we can conclude that the treatment is effective” or “we cannot conclude that the treatment is effective.”

(9 total marks)

The duration of long-distance telephone calls is normally distributed with a mean of and a standard deviation of . If a random sample of 64 telephone calls is used to reflect on the population of all long-distance calls, what is the probability that the sample mean call duration:

(3 marks)

will be more than 14 minutes?

Yes

(6 marks)

will be either less than 15 minutes or more than 20.5 minutes?

Its more than 20.5%

(7 total marks)

An insurance company states that 8% of all house insurance claims are fraudulent. If this estimate is correct, what is the probability that in a random sample of 184 house insurance claims, the proportion of claims that are fraudulent is

(3 marks)

1. less than 0.05?

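The sample proportion has mean 0.08 and standard deviation √(0.08 × 0.92/184) = 0.02, so P(p̂ < 0.05) = P(z < -1.50) = 0.0668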

(4 marks)

1. more than 0.10?

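With mean 0.08 and standard deviation √(0.08 × 0.92/184) = 0.02, P(p̂ > 0.10) = P(z > 1.00) = 1 - 0.8413 = 0.1587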
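The sampling distribution of a sample proportion is approximately normal with mean p and standard deviation √(pq/n) when n is large. A minimal sketch, assuming p = 0.08 and n = 184 as stated; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.08, 184

# Standard deviation of the sample proportion.
se = sqrt(p * (1 - p) / n)
p_hat = NormalDist(mu=p, sigma=se)

p_less_than_005 = p_hat.cdf(0.05)
p_more_than_010 = 1 - p_hat.cdf(0.10)

print(f"se = {se:.4f}, {p_less_than_005:.4f}, {p_more_than_010:.4f}")
```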

(11 total marks)

Researchers were interested in the number of monitors and screens (televisions and computer monitors) owned within households in Canada. They collected data from a random sample of ten households. The number of monitors/screens for the ten households was as follows:

5 8 1 3 3 4 2 7 6 4

(8 marks)

1. Assuming that this variable is normally distributed, construct a 90% confidence interval for the population mean.

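Ans: n = 10, sample mean = 43/10 = 4.3, s = √(44.1/9) = 2.2136, df = 9 and t = 1.833. The interval is 4.3 ± 1.833 × (2.2136/√10) = 4.3 ± 1.2831, that is, 3.0169 to 5.5831.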

(2 marks)

1. In a sentence or two, describe what this confidence interval represents.

Ans: We are 90% confident that the interval constructed above contains the true mean number of monitors and screens per Canadian household. If the sampling were repeated many times, about 90% of the intervals built this way would contain the population mean.

(1 mark)

1. Which of the following would produce a confidence interval with a larger margin of error than the 90% confidence interval? Clearly circle only one response.

Sampling only 5 households instead of 10, because 5 are easier to manage.

Sampling 5 households rather than 10, because a smaller sample size will result in a larger margin of error.

Sampling 20 households rather than 10, because a larger sample size will result in a larger margin of error.

Computing an 85% confidence interval rather than a 90% confidence interval, because a larger confidence interval will result in a larger margin of error.
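A t-based confidence interval for a small sample can be sketched as below. This assumes the ten observations listed in the question; the critical value 1.833 (90% confidence, 9 degrees of freedom) is copied from a standard t table, since Python's standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, stdev

screens = [5, 8, 1, 3, 3, 4, 2, 7, 6, 4]
n = len(screens)

x_bar = mean(screens)
s = stdev(screens)  # sample standard deviation (n - 1 in the denominator)

# Assumption: t critical value for 90% confidence with n - 1 = 9 degrees of
# freedom, taken from a printed t table.
T_CRIT = 1.833

margin = T_CRIT * s / sqrt(n)
ci = (x_bar - margin, x_bar + margin)

print(f"mean = {x_bar:.4f}, s = {s:.4f}, CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The margin of error is `T_CRIT * s / sqrt(n)`, so it grows when n shrinks or when the confidence level (and hence the critical value) rises.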

(10 total marks)

The manager of a large production plant would like to estimate the mean amount of time a worker takes to assemble a new component. A random sample of 64 workers indicates a mean time of 16.2 minutes. Assume that the population standard deviation of this assembly time is known to be 3.6 minutes.

(5 marks)

1. Construct a 95% confidence interval for the mean assembly time.

(4 marks)

1. How many workers should be involved in this study in order to have the mean assembly time estimated within 0.5 minutes of the population mean with 97.5% confidence?

(1 mark)

1.

Which of the following is a property of the sampling distribution of the sample mean, x̄? Clearly circle one response only.

1.

If you increase your sample size, the sample mean, x̄, will always get closer to the population mean, μ.

1.

The standard deviation of the sample mean is generally larger than the standard deviation from the original population, σ.

1.

The mean of the sampling distribution of x̄ is μ, the population mean.

1. The sampling error will account, in part, for errors that occur in the collection, recording and tabulation of the data.
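When the population standard deviation is known, both the confidence interval and the required sample size in this question come from the standard normal distribution. A minimal sketch, assuming the stated values (sample mean 16.2, σ = 3.6, n = 64); z is not rounded to two decimals here, so the interval can differ from a table-based answer in the fourth decimal.

```python
from math import ceil, sqrt
from statistics import NormalDist

x_bar, sigma, n = 16.2, 3.6, 64

# 95% confidence interval: x_bar +/- z * sigma / sqrt(n).
z_95 = NormalDist().inv_cdf(0.975)
margin = z_95 * sigma / sqrt(n)
ci_95 = (x_bar - margin, x_bar + margin)

# Sample size so that the margin of error is at most 0.5 minutes at 97.5%
# confidence: n = (z * sigma / E)^2, always rounded up.
z_975 = NormalDist().inv_cdf(1 - 0.025 / 2)
n_required = ceil((z_975 * sigma / 0.5) ** 2)

print(f"CI = ({ci_95[0]:.4f}, {ci_95[1]:.4f}), n = {n_required}")
```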

(8 total marks)

A researcher was interested in the enrollment of women within engineering majors at a local university. The data from a random sample of 25 engineering students were as follows:

Male Female Male Male Male Female Male Male Male Male Male Female Female Male Male Male Male Male Male Female Male Female Male Male Male

(7 marks)

1. Construct a 92.5% confidence interval for the population proportion of females with an engineering major.

(1 mark)

1. Suppose that we took a second sample and calculated that this sample had a 90% confidence interval of 0.19 to 0.35. Which of these is a valid interpretation of this confidence interval? Clearly circle only one response.

1. There is a 90% probability that a randomly selected student has a 0.19 to 0.35 probability of being female.

1. We are 90% confident that the mean proportion of being female in our sample is between 0.19 and 0.35.

1. We are 90% confident that 19 to 35% of female students will choose an engineering major.

1. We are 90% confident that the population mean proportion of females is between 0.19 and 0.35.

(8 total marks)

Suppose a consumer advocacy group would like to conduct a survey to find the proportion of consumers who bought the newest model of a particular vehicle who were happy with their purchase.

(4 marks)

1. How large a sample should they take so that a 90% confidence interval for the population proportion has a margin of error of 0.02? Assume that a preliminary sample suggests that 85% of consumers are satisfied.

(4 marks)

1. Assuming that the results of the preliminary sample are not available, what is the most conservative estimate of the sample size that would limit the margin of error to within 0.05 of the population proportion for a 98% confidence interval?
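The sample size for estimating a proportion is n = z² p q / E², rounded up; with no preliminary estimate, p = q = 0.5 gives the most conservative (largest) n. A minimal sketch; the function `sample_size` is a hypothetical helper, and because z is not rounded to two decimals the second result (542) is one less than the value obtained with the table value z = 2.33 (543).

```python
from math import ceil
from statistics import NormalDist


def sample_size(confidence, margin, p=0.5):
    """Smallest n giving the requested margin of error for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p * (1 - p) / margin**2)


# 90% confidence, E = 0.02, preliminary estimate p = 0.85.
n_with_prior = sample_size(0.90, 0.02, p=0.85)

# 98% confidence, E = 0.05, no preliminary estimate (p = 0.5 is conservative).
n_conservative = sample_size(0.98, 0.05)

print(n_with_prior, n_conservative)
```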

(11 total marks)

For the past several years, the mean literacy-achievement score for a population of third-grade students has been 45 with a population standard deviation of 15. A researcher is interested in whether an experimental teaching program is more or less effective than current teaching methods for literacy. After participating in the experimental teaching program, a random sample of 100 students had a mean score of 48.75.

(8 marks)

1.

Formulate and test the appropriate hypotheses using the p-value approach. Would you reject the null hypothesis ? What would be your conclusion, explained within the context of the test?

(2 marks)

1.

Would you reject the null hypothesis ? What would be your conclusion, explained within the context of the test?

(1 mark)

1.

How strong is the evidence against H0? (Hint: Refer to Unit 4, Section 7, in the Study Guide.)
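The p-value approach for this question can be sketched as follows. It assumes the stated values (μ0 = 45, σ = 15, n = 100, sample mean 48.75) and a two-tailed alternative, since the researcher asks whether the program is "more or less effective".

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n, x_bar = 45, 15, 100, 48.75

# Test statistic for a mean with known population standard deviation.
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Two-tailed test: H0: mu = 45 versus H1: mu != 45.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

H0 is rejected at any significance level larger than the p-value and not rejected at any smaller level.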

(14 total marks)

A researcher thinks that if hip-surgery patients go to physical therapy three times a week, instead of the usual twice a week, their recovery period will be shorter. In the past, population mean recovery time for hip-surgery patients who attended twice a week was 8.2 weeks. A random sample of 81 hip-surgery patients is selected. Each patient is asked to attend physical therapy three times a week. The sample results show a mean recovery time of 7.7 weeks and a standard deviation of 3.15 weeks.

Using a 5% level of significance, can the researcher conclude that the mean recovery time has decreased with more physical therapy? Formulate and test the appropriate hypotheses, using both the critical value approach and the p-value approach. Provide a concluding statement, interpreted within the context of the question.

Note: When you consult the appropriate statistical table to answer this question, use the degrees of freedom in the row following (where infinity shows as the ).

(8 marks)

1. Critical value approach

(6 marks)

1. p-value approach

(14 total marks)

A teacher believes that students with natural music abilities are more likely to be left-handed than typical students are. To test this, a random sample of 80 students in a performing arts school were selected. The results showed that 25 of these students were left-handed. Within the general population, it has been determined that 20% of students are left-handed.

Can it be concluded that the percentage of students in the performing arts school who are left-handed is greater than 20%? Formulate and test the appropriate hypotheses at a 1% significance level, using both the critical value approach and the p-value approach. Provide a concluding statement, interpreted within the context of the question.

(8 marks)

1. Critical value approach

(6 marks)

1. p-value approach


MATH 215 Assignment 5.docx

MATH 215 Assignment 5

Khadra Sulub

Victor Olobatuyi

April 19, 2022

Assignment 5

Overview

Total marks:   / 70

This assignment covers content from Unit 5. It assesses your ability to use sampling distributions in hypothesis testing about the difference between two or more population means or the difference between two population proportions, including tests for experiments with more than two categories and tests about contingency tables.

Instructions

Show all your work and justify all of your answers and conclusions.

Keep your work to 4 decimals, unless otherwise stated.

Note: Finishing a test of hypotheses with a statement like “reject H0” or “do not reject H0” will be insufficient for full marks. You must also provide a written concluding statement in the context of the problem itself. For example, if you are testing hypotheses about the effectiveness of a medical treatment, you must conclude with a statement like, “we can conclude that the treatment is effective” or “we cannot conclude that the treatment is effective.”

(9 marks)

A researcher is interested in examining the cholesterol levels of heart-attack patients. Cholesterol levels are measured for 28 heart-attack patients (2 days after their attacks) and 30 other hospital patients who did not have a heart attack. The researcher believes that cholesterol levels will be higher for the heart-attack patients. Random samples from each group provide the following results:

                                            Heart-Attack Patients   Non-Heart-Attack Patients
Sample Size                                 28                      30
Mean Cholesterol (mg/dL)                    213.9                   193.1
Standard Deviation of Cholesterol (mg/dL)   47.7                    22.3

Assume that the cholesterol levels for both populations are normally distributed and that the population standard deviations are equal.

Using a 5% significance level, can the researcher conclude that the mean cholesterol level of heart-attack patients is greater than the mean cholesterol level of non-heart-attack patients? Formulate and test the appropriate hypotheses. State and explain your conclusion within the context of the question. Use the critical value approach.

(9 marks)

A manufacturer wanted to improve on the processes used to produce electrical components. At the beginning of the year, the factory randomly examined 9,000 electrical components, and of these a total of 900 components were rejected after a quality-control inspection. A project was deployed to fix the problem. Following the project, 7,000 components were randomly picked to be examined. Of these, a total of 600 were rejected. Did the project intervention improve the process?

Test at the 2% significance level whether the population proportion of rejected components decreased after the project compared to the population proportion prior to the project. Formulate and test the appropriate hypotheses. Use the p-value approach. Be sure to clearly state and explain your conclusion within the context of the question.

(11 marks)

Researchers counted the number of breeding sea turtles on various sections of beach property in Cancun every year. Nine randomly selected sections of beach were used. The following table shows the number of counted sea turtles for two successive years (2015 and 2016).

Section   A    B    C    D    E    F    G    H    I
2015      62   54   36   42   61   76   84   75   43
2016      60   58   31   40   62   70   81   72   43

At the 5% significance level, can it be concluded that the number of breeding sea turtles in 2015 is different from the number of sea turtles in 2016? Formulate and test the appropriate hypotheses. Use the critical value approach. Assume the population of paired differences has a normal distribution. Clearly state and explain your conclusion within the context of the question.
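Because the same nine beach sections were counted in both years, the test is a paired-samples t test on the differences. A minimal sketch, assuming the counts in the table above; the critical value 2.306 (two-tailed, α = 0.05, 8 degrees of freedom) is copied from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

y2015 = [62, 54, 36, 42, 61, 76, 84, 75, 43]
y2016 = [60, 58, 31, 40, 62, 70, 81, 72, 43]

# Paired differences (2015 minus 2016) for each beach section.
d = [a - b for a, b in zip(y2015, y2016)]
n = len(d)

d_bar = mean(d)
s_d = stdev(d)
t_stat = d_bar / (s_d / sqrt(n))

# Assumption: two-tailed critical value for alpha = 0.05 with n - 1 = 8
# degrees of freedom, taken from a printed t table.
T_CRIT = 2.306
reject_h0 = abs(t_stat) > T_CRIT

print(f"d_bar = {d_bar:.4f}, s_d = {s_d:.4f}, t = {t_stat:.4f}, reject H0: {reject_h0}")
```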

(9 marks)

After introducing a new teaching curriculum, a teacher is interested in whether the grade distribution in his course is significantly different than it was in previous years. The distribution of grades before the introduction of the new curriculum was as follows:

Grade   Percentage
A       15%
B       40%
C       25%
D       15%
F       5%

A random sample of 150 students taken after the introduction of the new curriculum provided the following results:

Grade   Observed Frequency
A       30
B       65
C       35
D       15
F       5

Does the observed data contradict the hypothesis? Formulate and test the appropriate hypotheses at the 1% significance level. Use the critical value approach. Clearly state and explain your conclusion within the context of the problem.
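A chi-square goodness-of-fit test compares the observed frequencies with those expected under the earlier grade distribution. A minimal sketch, assuming the percentages and observed counts given above; the critical value 13.277 (α = 0.01, 4 degrees of freedom) is copied from a standard chi-square table.

```python
# Grade distribution before the new curriculum (H0) and observed counts after.
expected_pct = {"A": 0.15, "B": 0.40, "C": 0.25, "D": 0.15, "F": 0.05}
observed = {"A": 30, "B": 65, "C": 35, "D": 15, "F": 5}
n = sum(observed.values())

# Expected frequency for each grade under H0: E = n * p.
expected = {grade: n * pct for grade, pct in expected_pct.items()}

# Test statistic: sum of (O - E)^2 / E over all categories.
chi_sq = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)

# Assumption: critical value for alpha = 0.01 with k - 1 = 4 degrees of
# freedom, taken from a printed chi-square table.
CHI_CRIT = 13.277
reject_h0 = chi_sq > CHI_CRIT

print(f"chi-square = {chi_sq:.4f}, reject H0: {reject_h0}")
```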

(10 marks)

A marketing firm that markets refrigerators is interested in studying consumer behavior in the context of purchasing a particular brand of refrigerator. It wants to know, in particular, whether the income-level of the consumers influences their choice of refrigerator brand. Currently there are three brands available in the marketplace. Brand A is a premium brand, Brand B is a more moderately priced brand, and Brand C is the most economical brand.

A representative stratified random sampling procedure was adopted covering the entire market using income as the basis of selection. Income was classified into three categories: lower, middle and high. A sample of 200 consumers participated in this study and produced the following data:

Income   Brand A   Brand B   Brand C
Lower    20        30        50
Middle   20        25        15
High     10        15        15

At the 5% significance level, can it be concluded that there is a relationship between income-level and brand preference? Formulate and test the appropriate hypotheses. Use the critical value approach. Clearly state and explain your conclusion within the context of the question.

(12 marks)

Three colors of warning lights can be used on an automobile instrument panel. A researcher was interested to know whether users would have different reaction times depending on the color used in the panel. To find out, she randomly assigned, from 15 participants in total, 5 participants to each one of the 3 colors, and then measured their reaction times (in hundredths of a second, with decimal points deleted). The following data were obtained:

Red   Yellow   Blue
20    21       21
20    22       24
21    18       23
23    19       22
22    20       25

Given that the necessary assumptions are satisfied, can it be concluded, at the 5% level of significance, that not all mean reaction times to the colors are equal? Formulate and test the appropriate hypotheses. Use the critical value approach. Clearly state and explain your conclusion within the context of the question.
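A one-way ANOVA compares the variation between the three color groups with the variation within them. A minimal sketch, assuming the flattened data are read row by row as Red, Yellow, Blue; the critical value 3.89 for F with (2, 12) degrees of freedom at α = 0.05 is copied from a standard F table.

```python
from statistics import mean

# Assumption: each row of the data table lists Red, Yellow, Blue in that order.
groups = {
    "Red": [20, 20, 21, 23, 22],
    "Yellow": [21, 22, 18, 19, 20],
    "Blue": [21, 24, 23, 22, 25],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = mean(all_values)
k, n = len(groups), len(all_values)

# Between-groups and within-groups sums of squares.
ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ssw = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

msb = ssb / (k - 1)
msw = ssw / (n - k)
f_stat = msb / msw

# Assumption: critical value taken from a printed F table.
F_CRIT = 3.89
reject_h0 = f_stat > F_CRIT

print(f"SSB = {ssb:.4f}, SSW = {ssw:.4f}, F = {f_stat:.4f}, reject H0: {reject_h0}")
```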

(10 total marks)

The following ANOVA table is based on information obtained for five samples selected from five independent populations that are normally distributed with equal variances:

Source of Variation   Degrees of Freedom   Sums of Squares   Mean Square   Value of the Test Statistic
Between               ---                  332.100           ---           ---
Within                20                   ---               75.400
Total                 ---                  ---

(6 marks)

Fill in the missing values in the table as indicated by the blanks (---).

(4 marks)

Using a significance level of , indicate what your null and alternative hypotheses would be in this situation. Test these hypotheses, state your conclusion and explain its meaning in the context of this problem.


MATH 215 Assignment 6.docx

MATH 215 Assignment 6

Khadra Sulub

Victor Olobatuyi

April 29, 2022

Assignment 6

Overview

Total marks:   / 62

This assignment covers content from Unit 6. It assesses your knowledge of correlational analysis and regression analysis used to examine the relationship between two quantitative variables.

Instructions

Show all your work and justify all of your answers and conclusions, except for the True/False questions.

Keep your work to 4 decimals, unless otherwise stated.

Note: Finishing a test of hypotheses with a statement like “reject H0” or “do not reject H0” will be insufficient for full marks. You must also provide a written concluding statement in the context of the problem itself. For example, if you are testing hypotheses about the effectiveness of a medical treatment, you must conclude with a statement like, “we can conclude that the treatment is effective” or “we cannot conclude that the treatment is effective.”

(43 total marks)

A large warehouse superstore is interested in optimizing its customers’ shopping experiences and, as such, wants to ensure that it is able to staff the store properly during peak hours. The store management is interested in studying the relationship between the number of tills or checkouts that are open in the store and the amount of time it takes for a customer to check out (that is, the time it takes from when they get in line to when they complete their purchase). The data in the following table were collected from a random sample of 7 customers:

Tills Open (x)   Time to Checkout (minutes) (y)
2                17
9                10
12               5
5                12
3                15
10               8
6                12

(4 marks)

Construct a scatter diagram for these data with “Tills Open” on the horizontal (x) axis, and “Time to Checkout” on the vertical (y) axis. Note: Try to make relatively full use of the graph paper provided.

(2 marks)

Describe the general pattern of relationship between the two variables within the context of this question.

(11 marks)

Calculate the least squares regression line using “Time to Checkout” as the dependent variable and “Tills Open” as the independent variable.

(3 marks)

Calculate predicted values for and . Use these values to help plot the regression line on the scatter diagram you constructed in part a. above.

(10 marks)

Can it be concluded that the slope of the regression line is negative? Formulate and test the appropriate hypotheses at the 5% significance level. Use the critical value approach. Clearly state and explain your conclusion within the context of the problem.

(4 marks)

Construct a 95% confidence interval for β.

(2 marks)

Interpret the value of b in the sample regression line. What does it mean in the context of this question?

(2 marks)

One of the store managers regularly likes to keep 8 tills open on Saturdays. Use the equation of the regression line to provide the manager with the predicted time to check out if 8 tills are open.

(1 mark)

Which of the following cannot be answered from the regression equation? Clearly circle only one response.

A prediction of the value of y at a particular value of x.

An estimate of the slope between y and x.

An estimate of whether the linear association between variables is positive or negative.

An estimate of whether the association between variables is linear or non-linear.

(2 marks)

In a sentence or two, describe what information the standard deviation of errors provides.

(2 marks)

When the correlation between x and y is 1.0, what will the standard deviation of errors be? Why is this?
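The least squares line for this question can be computed from the sums of squares and cross products. A minimal sketch, assuming the seven (x, y) pairs in the table above; the names `ss_xy`, `ss_xx` and `predict` are illustrative.

```python
# Tills open (x) and time to checkout in minutes (y) for the 7 customers.
x = [2, 9, 12, 5, 3, 10, 6]
y = [17, 10, 5, 12, 15, 8, 12]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross products about the means.
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
ss_xx = sum(xi**2 for xi in x) - n * x_bar**2

b = ss_xy / ss_xx        # slope
a = y_bar - b * x_bar    # y-intercept


def predict(tills_open):
    """Predicted checkout time (minutes) from the fitted line y = a + b*x."""
    return a + b * tills_open


print(f"y-hat = {a:.4f} + ({b:.4f})x, predicted time at 8 tills = {predict(8):.4f}")
```

A prediction is most trustworthy inside the observed range of x, which here runs from 2 to 12 tills.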

(19 total marks)

Does the amount of education you have predict your salary? To answer this question, data from a random sample of 8 working adults was collected. Each participant reported the number of years of post-secondary education they have as well as their annual salary (in thousands of dollars). The data was as follows:

Post-Secondary Education (in years) (x)   Annual Salary (in thousands of dollars) (y)
4                                         70
0                                         55
5                                         40
8                                         80
10                                        125
4                                         95
2                                         85
6                                         60

You may use the following sums and sums of squares and cross products for the questions below.

           

(5 marks)

1. Calculate the least squares regression line using “Annual Salary” as the dependent variable and “Post-Secondary Education” as the independent variable.

(2 marks)

1. Interpret the value of b in the sample regression line. What does it mean in the context of this question?

(2 marks)

1. Compute the linear correlation between “Post-Secondary Education” and “Annual Salary”. Express your answer to 4 decimal places of accuracy.

(2 marks)

1. What percentage of variation in annual salary is explained by its linear relationship with post-secondary education?

(8 marks)

1. At the 2.5% significance level, can it be concluded that the correlation between post-secondary education and annual salary is positive? Formulate and test the appropriate hypotheses. Use the critical value approach. Clearly state and explain your conclusion within the context of the question.
