statistics homework needs help

funnydude166
hw.pdf

STAT 7 Final (Winter 2020)

March 14, 2020

This exam is worth 100 points. This is an open notes exam, i.e., you are allowed to

use notes and books. However, you are not allowed to collaborate among yourselves. The

deadline for submission is 11:59 PM on March 16. The deadline is a strict one and no late

submission will be entertained.

1. In public health studies, county-wise mortality rates of lung cancer are monitored to

understand any drastic difference in the rates of mortality due to lung cancer in two

adjacent counties. Among 100 and 200 lung cancer patients randomly sampled from

Hartford and New Haven counties, 10 and 5 respectively have succumbed. Assume

p1 and p2 are the proportions of lung cancer patients who died in Hartford and New Haven

counties.

• Test the hypothesis H0 : p2 = 0.1 vs. H1 : p2 ≠ 0.1 at 5% level of significance. (5

points)

• Test the hypothesis H0 : p1 = 0.05 vs. H1 : p1 > 0.05 at 5% level of significance.

(5 points)

• Test the hypothesis H0 : p1 = p2 vs. H1 : p1 ≠ p2 at 5% level of significance. (5

points)

Use the fact that P(Z > 1.96) = 0.025 and P(Z > 1.64) = 0.05.
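
The three tests in question 1 can be sketched with large-sample z statistics. This is only a sketch: it assumes the normal approximation is acceptable for these sample sizes, uses the null value p0 in the one-sample standard error, and uses the pooled proportion in the two-sample standard error. The function names are made up for illustration.

```python
import math

def one_sample_z(deaths, n, p0):
    """z statistic for H0: p = p0, with the standard error computed under H0."""
    p_hat = deaths / n
    se = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se

def two_sample_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

# Hartford: 10 deaths out of 100. New Haven: 5 deaths out of 200.
z_a = one_sample_z(5, 200, 0.10)     # two-sided: compare |z| with 1.96
z_b = one_sample_z(10, 100, 0.05)    # one-sided: compare z with 1.64
z_c = two_sample_z(10, 100, 5, 200)  # two-sided: compare |z| with 1.96

print(round(z_a, 3), round(z_b, 3), round(z_c, 3))
```

Each statistic is then compared with the critical value quoted in the question to decide whether H0 is rejected.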

2. A small study is conducted involving 10 infants to investigate the association between

gestational age at birth, measured in weeks, and birth weight, measured in grams.

          Cured   Not Cured   Total
Drug 1       37          13      50
Drug 2       88          21     109
Total       125          34     159

Table 1

Gestational age: 34.7, 36, 29.3, 40.1, 35.9, 40.8, 38.3, 37, 41.2, 39.8.

Birth weight: 1895, 2030, 1440, 2835, 3115, 4013, 3174, 3625, 2289, 2845.

• Find the correlation coefficient r of the gestational age at birth and birth weight.

(5 points)

• Test H0 : ρ = 0 vs. H1 : ρ ≠ 0 at 5% level of significance. Here ρ is the population

correlation coefficient between the two variables. Use P(t₈ > 2.36) = 0.025. (5

points)

• Construct a regression line Y = β0 + β1X + ε, where Y and X are gestational age at

birth and birth weight respectively. What are the best estimates of β0 and β1?

(10 points)

• Test H0 : β1 = 0 vs. H1 : β1 ≠ 0 at 5% level of significance. (5 points)

• Compute the coefficient of determination, R². (5 points)
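
The quantities asked for in question 2 all come from three sums of squares, so they can be sketched in one pass. This sketch follows the problem statement in taking Y as gestational age and X as birth weight, and assumes the usual simple linear regression model. The variable names are illustrative only.

```python
import math

# Data from question 2: Y is gestational age (weeks), X is birth weight (grams).
y = [34.7, 36, 29.3, 40.1, 35.9, 40.8, 38.3, 37, 41.2, 39.8]
x = [1895, 2030, 1440, 2835, 3115, 4013, 3174, 3625, 2289, 2845]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Sample correlation coefficient r.
r = sxy / math.sqrt(sxx * syy)

# t statistic for H0: rho = 0, with n - 2 = 8 degrees of freedom.
t_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Least-squares estimates of beta1 (slope) and beta0 (intercept).
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# t statistic for H0: beta1 = 0, from the residual sum of squares.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
se_b1 = math.sqrt(sse / (n - 2) / sxx)
t_b1 = b1 / se_b1

# Coefficient of determination.
r_squared = 1 - sse / syy

print(round(r, 4), round(t_r, 3), round(b0, 3), round(b1, 6), round(r_squared, 4))
```

In simple linear regression the slope test and the correlation test give the same t statistic, and R² equals r², which is a useful check on hand calculations. Both t statistics are compared with 2.36.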

3. A new casino game involves rolling 2 dice. The winnings are directly proportional to

the total number of sixes rolled. Suppose a gambler plays the game 100 times, with 0,

1 and 2 sixes observed 40, 30, 30 times respectively. Do you reject the hypothesis H0

that the dice are fair at 5% level of significance? Use the fact that P(χ²₂ > 5.99) = 0.05.

(10 points)
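
Question 3 is a chi-square goodness-of-fit test. A minimal sketch, assuming the two dice are independent so that under H0 the number of sixes is Binomial(2, 1/6):

```python
# Null probabilities of 0, 1 and 2 sixes with two fair, independent dice.
p_six = 1 / 6
probs = [(1 - p_six) ** 2, 2 * p_six * (1 - p_six), p_six ** 2]

observed = [40, 30, 30]
n = sum(observed)
expected = [n * p for p in probs]

# Pearson statistic, with (number of categories - 1) = 2 degrees of freedom.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
reject = chi2 > 5.99

print(round(chi2, 2), reject)
```

The expected count for two sixes is below 5, so the chi-square approximation is rough here, although the statistic is far beyond the critical value.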

4. The efficacy of a new drug, referred to as Drug 2, is tested against an existing drug,

referred to as Drug 1, for a particular disease. The data on the number of people using

Drug 1 and Drug 2 vs. people who are cured or not cured are given in Table 1. Test

H0 that there is no difference between Drug 1 and Drug 2 in curing people, at 5% level

of significance. Use P(χ²₁ > 3.84) = 0.05. (10 points)
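
For question 4, a sketch of the Pearson chi-square test on the 2 x 2 counts from Table 1. It assumes no continuity correction is wanted. Expected counts are row total times column total divided by the grand total.

```python
# Rows: Drug 1, Drug 2. Columns: cured, not cured (from Table 1).
observed = [[37, 13], [88, 21]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected count under H0
        chi2 += (o - e) ** 2 / e

# (2 - 1) * (2 - 1) = 1 degree of freedom, critical value 3.84.
reject = chi2 > 3.84

print(round(chi2, 4), reject)
```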

5. Suppose the probability of being cured of a certain virus infection is 0.90. Out of 20

randomly selected patients, 5 died from the virus infection. Define the random variable

X = no. of deaths.

• What distribution does X follow? (2 points)

• Find the mean, variance, and standard deviation of the random variable X. (3 points)

• Find the probability of 5 deaths. (5 points)
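
A sketch for question 5, assuming patients are independent and that a patient who is not cured dies, so that each patient dies with probability 1 - 0.90 = 0.10 and X is Binomial(20, 0.10):

```python
import math

n = 20
p = 1 - 0.90  # assumed probability of death for each patient

mean = n * p
variance = n * p * (1 - p)
sd = math.sqrt(variance)

# Binomial probability of exactly 5 deaths among 20 patients.
k = 5
p5 = math.comb(n, k) * p ** k * (1 - p) ** (n - k)

print(mean, round(variance, 2), round(sd, 4), round(p5, 5))
```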

6. There are 12 accidents per month on a highway. Define the random variable X = no.

of accidents in a week.

• What distribution does X follow? (2 points)

• Find the mean, variance, and standard deviation of the random variable X. (3 points)

• Find the probability of 5 accidents in a week. (5 points)
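
A sketch for question 6, treating X as Poisson. The weekly rate is an assumption: the question does not say how to convert months to weeks, and this sketch takes a month as 4 weeks, giving 12 / 4 = 3 accidents per week. A different conversion changes every number below.

```python
import math

weeks_per_month = 4          # assumption, not stated in the question
lam = 12 / weeks_per_month   # Poisson rate per week

# For a Poisson variable the mean and the variance both equal the rate.
mean = lam
variance = lam
sd = math.sqrt(variance)

# Poisson probability of exactly 5 accidents in a week.
k = 5
p5 = math.exp(-lam) * lam ** k / math.factorial(k)

print(mean, variance, round(sd, 4), round(p5, 4))
```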

7. In Orange County, 51% of the adults are male. It is also known that 9.5% of males

smoke cigars, while 1.7% of females are smokers. If a randomly selected adult is found

to be a smoker, what is the probability that this person is a male? (5 points)
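
Question 7 is a direct application of Bayes' rule. A sketch, assuming every adult is either male or female so that 49% are female:

```python
p_male = 0.51
p_female = 1 - p_male
p_smoker_given_male = 0.095
p_smoker_given_female = 0.017

# Total probability that a randomly selected adult is a smoker.
p_smoker = p_male * p_smoker_given_male + p_female * p_smoker_given_female

# Bayes' rule: P(male | smoker).
p_male_given_smoker = p_male * p_smoker_given_male / p_smoker

print(round(p_male_given_smoker, 4))
```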

8. A diagnostic test has a probability 0.95 of giving a positive result when applied to a

person suffering from a certain disease, and a probability 0.10 of giving a (false) positive

when applied to a non-sufferer. It is estimated that 0.5% of the population are

sufferers. Suppose that the test is now administered to a person about whom we have

no relevant information relating to the disease (apart from the fact that he/she comes

from this population). Calculate the following probabilities:

(a) that, given a positive result, the person is a sufferer; (5 points)

(b) that, given a negative result, the person is a non-sufferer. (5 points)
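
Both parts of question 8 follow from the four joint probabilities of disease status and test result. A sketch using only the figures given in the question (the variable names are illustrative):

```python
prevalence = 0.005         # P(sufferer)
sensitivity = 0.95         # P(positive | sufferer)
false_positive_rate = 0.10 # P(positive | non-sufferer)

# Joint probabilities of (status, result).
sufferer_pos = prevalence * sensitivity
sufferer_neg = prevalence * (1 - sensitivity)
healthy_pos = (1 - prevalence) * false_positive_rate
healthy_neg = (1 - prevalence) * (1 - false_positive_rate)

# (a) P(sufferer | positive)
p_sufferer_given_pos = sufferer_pos / (sufferer_pos + healthy_pos)

# (b) P(non-sufferer | negative)
p_healthy_given_neg = healthy_neg / (healthy_neg + sufferer_neg)

print(round(p_sufferer_given_pos, 4), round(p_healthy_given_neg, 5))
```

The low value in part (a) reflects the small prevalence: most positive results come from the much larger group of non-sufferers.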
