statistics homework

Stat 226 Module 4 Online Section

1. New York Stock Exchange. Suppose that the percentage returns for a given year for all stocks listed on the New York Stock Exchange follow a non-normal distribution with mean µ = 12.4 percent and standard deviation σ = 8.4 percent.

(a) Can you find the probability that a single stock has an annual return less than 34 percent? Yes or no.

(b) Consider drawing a random sample of n = 5 stocks from the population of all stocks and calculating the mean return of the five sampled stocks. We can think of the observed sample mean, x̄, as a single draw from a distribution that relates to the random variable X̄. We call this distribution: the sampling distribution of the sample mean.

i. What is the shape of the sampling distribution of the sample mean when n=5? Is it normal or non-normal or is there not enough information to tell?

ii. What is the mean of the sampling distribution of the sample mean when n=5?

iii. What is the standard error of the sampling distribution of the sample mean when n=5?

iv. Can you find the probability that the mean return of the five sampled stocks is less than 34 percent? If yes, what is it?

(c) The CLT says a sample size of n = 30 should be sufficiently large to ensure an approximately Normal distribution for the sampling distribution of the sample mean. Thus, we are able to calculate the following probabilities even though we are drawing from a non-normal distribution. To answer the following questions you will want to use the z-score formula: z = (x̄ − µ) / (σ/√n).

i. Specify the sampling distribution of the sample mean when n = 30, i.e. what are the mean, the standard error and the shape of the sampling distribution?

A. shape: Normal/not Normal/not enough information to tell (Multiple Choice.)

B. mean:

C. standard error:

ii. Find the probability that the mean return for the 30 sampled stocks is less than 10 percent.

iii. Find the probability that the mean return for the 30 sampled stocks is between 11 and 14 percent.
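The z-score calculation used in part (c) can be sketched in Python. This is only an illustration, not part of the assignment: the helper names `standard_error`, `prob_mean_below`, and `prob_mean_between` are made up for this sketch, and `statistics.NormalDist` from the standard library (Python 3.8+) stands in for a Normal table. It assumes n is large enough for the CLT approximation to apply. Table lookups round z to two decimals, so hand answers may differ slightly in the last digit.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_mean_below(x_bar, mu, sigma, n):
    """P(sample mean < x_bar) under the CLT Normal approximation."""
    z = (x_bar - mu) / standard_error(sigma, n)
    return NormalDist().cdf(z)  # standard Normal CDF evaluated at z


def prob_mean_between(low, high, mu, sigma, n):
    """P(low < sample mean < high) as a difference of two CDF values."""
    return prob_mean_below(high, mu, sigma, n) - prob_mean_below(low, mu, sigma, n)


# Values from the problem statement: mu = 12.4, sigma = 8.4, n = 30.
mu, sigma, n = 12.4, 8.4, 30
print("SE:", round(standard_error(sigma, n), 4))
print("P(mean < 10):", round(prob_mean_below(10, mu, sigma, n), 4))
print("P(11 < mean < 14):", round(prob_mean_between(11, 14, mu, sigma, n), 4))
```

The same functions should not be used for n = 5 here, because the population is non-normal and the Normal approximation is not justified for such a small sample.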

2. The CLT in action. Last semester 440 ISU students provided data on the question “How many boyfriends/girlfriends have you had in your life?” For the purpose of illustrating the CLT, we assume that these data correspond to the population of interest and all 440 data points are plotted in the top left histogram. Thus, we know the population mean is µ = 3.3 (blue vertical line) and the population standard deviation is σ = 2.74. The purpose of the exercise below is to help you understand how the sampling distribution for a sample of size n (here n = 5, 15, 30) is constructed and how the distribution changes as we increase n. Note that the remaining three histograms each show 100 sample means (each one for a different sample size) and therefore display an approximation of the sampling distribution of the sample mean X̄ for these three sample sizes.

(a) What is the shape of the population distribution, i.e. the distribution of the number of girl-/boyfriends?

Symmetry:

i. Symmetric - normal

ii. Symmetric - non-normal

iii. Skew right

iv. Skew left

v. Uniform

vi. Multimodal

(b) Specify the sampling distribution of the sample mean for samples of size n = 5, n = 15 and n = 30. Note that if we know the mean and variance of a random variable, but do not know the shape, we specify it as X ∼ (mean, variance).

i. n = 5: Shape: normal or non-normal or unknown? Mean: Std Error:

ii. n = 15: Shape: normal or non-normal or unknown? Mean: Std Error:

iii. n = 30: Shape: normal or non-normal or unknown? Mean: Std Error:
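As a rough check on the mean, variance and standard error entries in part (b), the quantities follow directly from µ, σ and n. The sketch below is an illustration only; the function name `sampling_distribution` is hypothetical, and it says nothing about the shape, which has to be judged from the population distribution and the sample size.

```python
from math import sqrt

# Population parameters from the problem statement.
MU = 3.3
SIGMA = 2.74


def sampling_distribution(mu, sigma, n):
    """Return (mean, variance, standard error) of the sample mean for size n."""
    variance = sigma ** 2 / n   # Var(X̄) = sigma^2 / n
    std_error = sigma / sqrt(n)  # SE(X̄) = sigma / sqrt(n) = sqrt(Var(X̄))
    return mu, variance, std_error


for n in (5, 15, 30):
    mean, variance, std_error = sampling_distribution(MU, SIGMA, n)
    print(f"n = {n:2d}: mean = {mean}, variance = {variance:.4f}, SE = {std_error:.4f}")
```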

(c) A JMP script called SamplingDistribution.jsl and a data file called partners.jmp are available on BlackBoard under Homework 4. Download both files to your desktop. We will use both files to illustrate the CLT and the behavior of the sampling distribution as the sample size increases.

(d) Taking a single sample of size n. Below are instructions and a table of results that you will fill out and then report online.

i. In JMP, first open the data file partners.jmp. Then open the script file SamplingDistribution.jsl. First, under Investigate choose the option Shape. Second, under the option population characteristics → population shape select My Data. Choose the partners.jmp dataset. In the next window, bring number of partners into X variable and click OK. The population mean box should now display 3.3 (instead of 0).

ii. Under Demo Characteristics adjust the Sample Size to 5 and Number of Samples to 1. You can generate the first sample by clicking on the Draw Samples button under Run Simulation. The sample mean will appear under the Sample Summary Table on the far right. Record this value as x̄1 in the table below, which should then be entered online.

iii. Obtain two more samples of size n = 5 and record the corresponding sample means in the first row of the table.

iv. Repeat the previous part for n = 15 and n = 30. Each result represents a single sample of size n. Record the results in the table provided below and then record them online:

          population mean   sample 1   sample 2   sample 3
          µ                 x̄1         x̄2         x̄3
n = 5     3.3
n = 15    3.3
n = 30    3.3

(e) Taking many samples of size n. Below are instructions and a table of results that you will fill out and report online.

i. In part (b), you found the mean and the variance of the sample mean for each of the three choices of n. Note that the square root of the variance of the sample mean is equal to the standard error (or SE(X̄)). Fill in the corresponding values for SE(X̄) in the table provided below and online.

ii. Next, let’s estimate the sampling distributions stated in (b) by increasing the number of samples we generate. We will begin with the sampling distribution for samples of size n = 5. Under Run Simulation, choose Reset to set the number of samples drawn back to zero. Under Demo Characteristics, change the Number of Samples to 1,000 samples and Animate Illustration? to Yes.

iii. Find the estimate of SE(X̄) in the Means Summary Table. Record this value in the table provided below. Note that this is only an estimate of the standard error, because we have 1,000 samples of size n instead of all possible samples of size n!

iv. Repeat (ii) and (iii) for n = 15 and n = 30. Don’t forget to Reset the number of samples drawn to zero before you start a new simulation! Record the results in the table provided below and then record them online:

          SE(X̄)   estimate of SE(X̄)
n = 5
n = 15
n = 30

v. As the sample size n increases, what happens to the standard error of the sampling distribution?

A. It approaches the standard deviation of the original distribution.

B. It decreases from the standard deviation of the original distribution.

C. It increases from the standard deviation of the original distribution.

D. It can do more than one of the above.

vi. As the sample size n increases, what happens to the shape of the sampling distribution? (Hint: Look at the histograms for the different sample sizes and compare the shape.) Select all of the true statements.

A. The shape of the sampling distribution becomes more like the shape of the original distribution.

B. The shape of the sampling distribution becomes more symmetric and bell-shaped.

C. The shape of the sampling distribution becomes more skewed right.

D. The mean of the sampling distribution approaches 0.

E. The mean of the sampling distribution increases.

F. The mean of the sampling distribution approaches the true population mean.
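The simulation in part (e) can be imitated outside JMP with a short Python sketch. It is not the SamplingDistribution.jsl script and does not use the partners.jmp data: the `population` list below is a hypothetical skewed-right stand-in of 440 values, the function name `simulate_se` is made up, and samples are assumed to be drawn with replacement. The point is only to show how the standard deviation of many sample means estimates SE(X̄) = σ/√n.

```python
import random
from math import sqrt
from statistics import mean, pstdev, stdev


def simulate_se(population, n, num_samples=1000, seed=0):
    """Estimate SE(X̄) as the standard deviation of num_samples sample means."""
    rng = random.Random(seed)
    # Each sample is n draws with replacement (an assumption of this sketch).
    sample_means = [mean(rng.choices(population, k=n)) for _ in range(num_samples)]
    return stdev(sample_means)


# Hypothetical skewed-right population of 440 counts (NOT the real survey data).
_rng = random.Random(226)
population = [int(_rng.expovariate(1 / 3.3)) for _ in range(440)]
sigma = pstdev(population)  # population standard deviation of the stand-in data

for n in (5, 15, 30):
    theoretical = sigma / sqrt(n)
    estimated = simulate_se(population, n)
    print(f"n = {n:2d}: SE = {theoretical:.3f}, estimate from 1,000 samples = {estimated:.3f}")
```

The estimate and the theoretical value will not match exactly, because 1,000 samples are only a subset of all possible samples of size n.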

3. Automobile Insurance. An automobile insurance company sells a one-year coverage insurance policy product for which the profit for each policy sold is a random variable Y that has a mean of $210 and a standard deviation of $7000 (a negative profit implies a loss). Suppose that the company will sell 10,000 of these policies. Let Ȳ denote the sample mean of the 10,000 policies. Use Table A and report probabilities to four digits after the decimal. For any values that have units of dollars, round to the nearest dollar. When entering numerical answers online, just enter the number (do not use units or commas).

(a) What is the shape of the distribution of individual policies? NORMAL or NON-NORMAL? Do you have enough information to compute probabilities concerning one of the 10,000 policies? Yes or no?

(b) What is the shape of Ȳ? NORMAL or NON-NORMAL? Do you have enough information to compute probabilities concerning Ȳ? Yes or no?

(c) What is the mean of the distribution of Ȳ?

(d) What is the standard deviation of the distribution of Ȳ?

(e) What is the probability that Ȳ < 0? You will need to report the z-score and the probability.

(f) What is the probability that Ȳ is greater than $45? You will need to report the z-score and the probability.

(g) If for the 10,000 policies Ȳ turns out to be $45, how much total profit will they make on this policy product? If Ȳ turns out to be $300, how much total profit will they make on this policy product?

(h) Compute the 5th and 95th percentiles of the distribution of Ȳ. Report the upper and lower z-scores as well as the percentiles.
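The calculations in parts (c) through (h) can be cross-checked with a short Python sketch. It is an illustration under the CLT Normal approximation for Ȳ, not a replacement for Table A: `statistics.NormalDist` (standard library, Python 3.8+) computes exact Normal probabilities, whereas a table lookup rounds z to two decimals, so the last digit may differ. The variable names and the helper `total_profit` are made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement.
mu, sigma, n = 210, 7000, 10_000

se = sigma / sqrt(n)            # standard deviation of Ȳ
y_bar = NormalDist(mu, se)      # approximate distribution of Ȳ by the CLT

z_zero = (0 - mu) / se          # z-score for Ȳ = 0
p_loss = y_bar.cdf(0)           # P(Ȳ < 0)

z_45 = (45 - mu) / se           # z-score for Ȳ = 45
p_above_45 = 1 - y_bar.cdf(45)  # P(Ȳ > 45)


def total_profit(mean_profit, policies=n):
    """Total profit = mean profit per policy times the number of policies."""
    return mean_profit * policies


# 5th and 95th percentiles of Ȳ and the matching standard Normal z-scores.
z_low, z_high = NormalDist().inv_cdf(0.05), NormalDist().inv_cdf(0.95)
p5, p95 = y_bar.inv_cdf(0.05), y_bar.inv_cdf(0.95)

print(f"SD of Y-bar: {se:.0f}")
print(f"z = {z_zero:.2f}, P(Y-bar < 0) = {p_loss:.4f}")
print(f"z = {z_45:.2f}, P(Y-bar > 45) = {p_above_45:.4f}")
print(f"Total profit at 45: {total_profit(45)}, at 300: {total_profit(300)}")
print(f"z = {z_low:.3f} and {z_high:.3f}, percentiles = {p5:.0f} and {p95:.0f}")
```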
