phyllis young - 12 probability and statistical questions
Question 1
The joint probability density of X and Y is given by
$$f(x,y) = \begin{cases} \tfrac{1}{4}(2x + y), & 0 < x < 1 \text{ and } 0 < y < 2; \\ 0, & \text{elsewhere.} \end{cases}$$
• Find:
1. the marginal density of X,
2. the marginal density of Y,
3. the conditional density of Y given X, and
4. E[Y |X = 1/4].
• Are X and Y independent? Provide a reason why or why not.
Question 2
Assume that the joint probability density function (PDF) of X and Y is given by
$$f_{X,Y}(x,y) = \begin{cases} \dfrac{1}{y}, & 0 < x < y,\ 0 < y < 1; \\ 0, & \text{otherwise.} \end{cases}$$
1. Determine the marginal PDF of X and the conditional PDF of X given Y.
2. Are X and Y independent? Explain why or why not.
3. Calculate the probability that the sum X + Y exceeds one half, that is, evaluate Pr(X + Y > 0.5). Hint: If X + Y ≤ 0.5 then X ≤ Y for 0 < Y ≤ 0.25, and X ≤ 0.5 − Y for 0.25 ≤ Y < 0.5.
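The decomposition in the hint can be checked numerically. The sketch below is only an illustration, not a model answer: it assumes the inner integral over x has already been done by hand, and the helper `simpson` is a hypothetical name introduced here.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Inner integral of the density 1/y over x, done by hand:
#   0 < y <= 0.25    : x ranges over (0, y),       giving y * (1/y) = 1
#   0.25 <= y < 0.5  : x ranges over (0, 0.5 - y), giving (0.5 - y) / y
p_low = simpson(lambda y: 1.0, 0.0, 0.25)
p_high = simpson(lambda y: (0.5 - y) / y, 0.25, 0.5)

# Pr(X + Y > 0.5) is the complement of the two pieces above.
p_exceeds_half = 1.0 - (p_low + p_high)
print(round(p_exceeds_half, 4))
```

The two pieces sum to (1/2) log 2, so the printed value should agree with 1 − (1/2) log 2.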
Question 3
Suppose that the random variable Y has a probability density function (pdf) given by
$$f_Y(y) = \begin{cases} k\,y^{3} e^{-y/2}, & y > 0; \\ 0, & \text{otherwise.} \end{cases}$$
(a) Find the value of k that makes $f_Y(y)$ a density function.
(b) Does Y have a Chi-squared ($\chi^2$) distribution? If so, how many degrees of freedom?
(c) What is the probability that Y lies within two standard deviations of its mean value?
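A rough numerical sketch of parts (a) and (c) follows. It assumes the density is read as a Gamma kernel with shape 4 and scale 2 (equivalently, a chi-squared kernel with 8 degrees of freedom), and it uses the closed-form tail sum that holds only for an even number of degrees of freedom. The function name `chi2_cdf_even` is introduced here for illustration.

```python
import math

# Normalising constant: the integral of y^3 * exp(-y/2) over (0, inf)
# equals Gamma(4) * 2^4, so k is its reciprocal.
k = 1.0 / (math.gamma(4) * 2 ** 4)

def chi2_cdf_even(x, df):
    """CDF of a chi-squared variable with even df, via the finite tail sum."""
    m = df // 2
    tail = math.exp(-x / 2) * sum((x / 2) ** j / math.factorial(j) for j in range(m))
    return 1.0 - tail

df = 8                                  # assumed: y^(df/2 - 1) matches y^3
mean, sd = df, math.sqrt(2 * df)        # chi-squared mean and standard deviation
lower, upper = max(mean - 2 * sd, 0.0), mean + 2 * sd
prob = chi2_cdf_even(upper, df) - chi2_cdf_even(lower, df)
print(k, round(prob, 4))
```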
Question 4
Let $Y_i$, for i = 1, 2, 3 and 4, be independent normally distributed random variables such that $E[Y_i] = i$ and $\mathrm{Var}(Y_i) = 1$. (Note that the mean of each $Y_i$ is different, whereas the variance for each $Y_i$ is the same.)
(a) Let $T_Y = \sum_{i=1}^{4} Y_i$.
(i) What is the probability distribution of $T_Y$?
(ii) Find $\Pr(T_Y \le 12.46)$.
(b) Now let $T_Z = \sum_{i=1}^{4} Z_i^2$, where $Z_i = (Y_i - i)$, for each i = 1, 2, 3 and 4.
(i) What is the probability distribution of $T_Z$?
(ii) Find the value of t such that $\Pr(T_Z > t) = 0.05$.
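The two numerical parts can be sketched with the standard library alone. This is an illustrative check under the usual results (a sum of independent normals is normal, and a sum of four squared standard normals is chi-squared with 4 degrees of freedom); the bisection bounds and the name `upper_tail` are choices made here.

```python
import math
from statistics import NormalDist

# (a) Sum of independent normals: mean 1 + 2 + 3 + 4, variance 4 * 1.
t_y = NormalDist(mu=sum(range(1, 5)), sigma=math.sqrt(4))
p_a = t_y.cdf(12.46)

# (b) Chi-squared with 4 degrees of freedom has upper-tail probability
# exp(-t/2) * (1 + t/2), which is decreasing in t.
def upper_tail(t):
    return math.exp(-t / 2) * (1 + t / 2)

lo, hi = 0.0, 50.0
for _ in range(100):                    # bisection for upper_tail(t) = 0.05
    mid = (lo + hi) / 2
    if upper_tail(mid) > 0.05:
        lo = mid
    else:
        hi = mid
t_crit = (lo + hi) / 2
print(round(p_a, 4), round(t_crit, 3))
```

In an examination setting the same values would normally be read from standard normal and chi-squared tables.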
Question 5
1. Show that if $\mu = E[X]$ is the mean of a random variable X, and $M_X(t)$ is the moment generating function (MGF) of X, then the MGF of $Y = X - \mu$ is
$$M_Y(t) = M_{X-\mu}(t) = \exp(-\mu t)\,M_X(t).$$
2. Hence establish that the $r$th derivative of $\exp(-\mu t)M_X(t)$ with respect to t, evaluated at the point t = 0, equals the $r$th centred moment $E[(X-\mu)^r]$. Hint: By definition $E[Y^r] = E[(X-\mu)^r]$ and $M_Y(t) = M_{X-\mu}(t)$; now differentiate the latter, evaluate at zero, and use the relationship between the MGF and the moments of a random variable.
3. Use the result in part 2 to verify that if X has a Binomial(n,p) distribution then the variance of X, $\mathrm{Var}[X]$, is $np(1-p)$. (Note: If X has a Binomial(n,p) distribution then $M_X(t) = \{p\exp(t) + (1-p)\}^{n}$.)
4. Show also that when X has a Binomial(n,p) distribution, the standardised third moment is
$$\frac{E[(X-\mu)^3]}{\mathrm{Var}[X]^{3/2}} = \frac{1 - 2p}{\sqrt{np(1-p)}}.$$
What can we conclude about the properties of the Binomial distribution from the standardised third moment (i) when p = 0.5 and (ii) when n is large?
Question 6
Let $X_1, \dots, X_n$ denote a sequence of independent and identically distributed (i.i.d.) $N(\mu_X, \sigma^2)$ random variables, and let $Y_1, \dots, Y_m$ denote an independent sequence of i.i.d. $N(\mu_Y, \sigma^2)$ random variables.
1. If $\mu_X = \mu_Y = \mu$, and $\bar{X} = n^{-1}\sum_{i=1}^{n} X_i$ and $\bar{Y} = m^{-1}\sum_{i=1}^{m} Y_i$, show that $\lambda\bar{X} + (1-\lambda)\bar{Y}$ is an unbiased estimator of $\mu$ for any value of $\lambda$ in the unit interval, i.e. $0 \le \lambda \le 1$.
2. Verify that the variance of this estimator is minimised when
$$\lambda = \frac{n}{n + m}$$
and determine the sampling distribution of the resulting estimator.
3. If $\mu_X = \alpha + \beta$ and $\mu_Y = \alpha - \beta$, determine a method of moments estimator for $\alpha$. In what way (or ways) and under what circumstances will this estimator be equivalent to the previous estimator?
Question 7
1. Let $X_1, \dots, X_n$ denote any sequence of random variables with constant mean $E[X_i] = \mu$ and constant variance $\mathrm{Var}[X_i] = \sigma^2$, $i = 1, \dots, n$. Show that the statistic
$$S^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2,$$
where $\bar{X} = n^{-1}\sum_{i=1}^{n} X_i$, is a method of moments estimator of $\sigma^2$.
2. It is sometimes suggested that when n is large the statistic $S^2$ will be approximately normally distributed with a mean of $\sigma^2$ and a variance of $\sigma^4/(2n)$. Verify that this approximation leads to the following $(1-\alpha)100\%$ large sample confidence interval for the standard deviation:
$$\frac{S}{\left(1 + \dfrac{z_{\alpha/2}}{\sqrt{2n}}\right)^{1/2}} < \sigma < \frac{S}{\left(1 - \dfrac{z_{\alpha/2}}{\sqrt{2n}}\right)^{1/2}},$$
where $\Phi(z_{\alpha/2}) = 1 - \alpha/2$.
Question 8
Suppose that $Y_1, Y_2, \dots, Y_n$ denotes a random sample from a $N(0, \sigma^2)$ distribution.
(a) Find the maximum likelihood estimator (MLE) for the population standard deviation parameter, $\sigma$. Denote this MLE by $\hat{\sigma}_n$.
(b) Determine the information per observation pertaining to the parameter $\sigma$, and denote it by $i(\sigma)$.
(c) Use the formula for $i(\sigma)$ to explain why
$$\sqrt{2n}\left(\frac{\hat{\sigma}_n}{\sigma} - 1\right) \xrightarrow{D} N(0, 1).$$
Question 9
A university administrator believes that only 20% of students enrolled in a Commerce degree will have decided on a major area of study by the end of the first year of their course. To test the administrator’s belief, a random sample of 160 first year Commerce students is surveyed at the end of their second semester of study and asked whether they have in fact already decided on their major.
(a) Find the form of the uniformly most powerful (UMP) hypothesis test to determine whether the belief stated above is true, or whether in fact the percentage of all university students enrolled in a Commerce degree who will have decided on a major by the end of their first year is greater than 20%. Explain why your test procedure is UMP.
(b) If 45 of the 160 Commerce students surveyed reported having decided on their major, construct an approximate 95% confidence interval for the percentage outlined in part (a), and state the corresponding test conclusion. Are the confidence interval and test conclusion compatible? Explain why or why not.
Question 10
A real estate agent operating across the eastern suburbs of Melbourne claims that the average selling time of all residential properties in the eastern suburbs of Melbourne is six weeks, or 42 days. A sample of 30 recently sold residential properties in the area has a sample average of 49 days and a sample standard deviation of 20 days.
(a) Undertake a hypothesis test to determine whether the real estate agent’s claim is correct at the 5% level of significance against the alternative that the average selling time of all residential properties in the eastern suburbs of Melbourne is different from six weeks.
(b) Provide a justification for the sampling distribution of the test you have used to test the null hypothesis in part (a).
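The arithmetic for part (a) can be sketched as follows. This assumes a two-sided one-sample t test is the intended procedure, which is exactly what part (b) asks the candidate to justify. The critical value is hard-coded from standard t tables because the Python standard library has no t quantile function.

```python
import math

# Sample size, sample mean, sample standard deviation, hypothesised mean (days).
n, xbar, s, mu0 = 30, 49.0, 20.0, 42.0

# One-sample t statistic with n - 1 = 29 degrees of freedom.
t_stat = (xbar - mu0) / (s / math.sqrt(n))

# Two-sided 5% critical value for 29 degrees of freedom, from standard tables
# (an assumption of this sketch, not computed here).
t_crit = 2.045

reject = abs(t_stat) > t_crit
print(round(t_stat, 3), reject)
```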
Question 11
A hypothesis test is said to be unbiased if the probability of rejecting the null hypothesis when the null hypothesis is true is smaller than the probability of rejecting the null hypothesis when the null hypothesis is false. Let T denote a statistic such that the likelihood ratio $L(\theta_0)/L(\theta_1)$ is a monotonically increasing function of T, where $\theta_0$ and $\theta_1$ denote values of the parameter $\theta \in \Theta = [-1, 1]$, and the probability density function of T is given by
$$f_T(t) = \begin{cases} 1 + \theta^2(0.5 - t), & 0 < t < 1; \\ 0, & \text{otherwise.} \end{cases}$$
If the statistic T is used to test the null hypothesis $H_0 : \theta = 0$ against the alternative $H_1 : \theta \ne 0$, show that the critical region $C_\alpha = \{T = t : t \le \alpha\}$ provides an unbiased and uniformly most powerful critical region of size $\alpha$ for testing $H_0$ against $H_1$.
Question 12
The number of successes in a sequence of n Bernoulli trials is to be used to test the hypothesis that the probability of success equals 0.5 against the alternative that it does not. Let $X_1, \dots, X_n$ denote the values obtained in a sequence of n Bernoulli trials where $\Pr(\text{Success}) = \Pr(X_i = 1) = p$ and $\Pr(\text{Failure}) = \Pr(X_i = 0) = 1 - p$.
1. Find an expression for the generalised likelihood ratio test statistic for testing the null hypothesis $H_0 : p = 0.5$ against the alternative $H_1 : p \ne 0.5$, and show that the critical region of the test can be formulated as
$$S \cdot \log S + (n - S) \cdot \log(n - S) \ge k,$$
where $S = \sum_{i=1}^{n} X_i$ is the number of successes observed in the sequence, and k is an appropriate critical value.
2. Verify that the critical region of this test is equivalent to the region in the sample space given by $\{X_1, \dots, X_n : |2S - n| \ge c_\alpha\}$, where $c_\alpha$ denotes a critical value that depends on $\alpha$, the size of the test.
3. Provide a formula that could be used to calculate a precise value of $c_\alpha$ that would ensure that $\Pr(\text{Type I Error}) \le \alpha$.
4. Indicate how you would approximate the value of cα when n is large.
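One way parts 3 and 4 might be explored numerically is sketched below. It assumes the exact size is computed from the Binomial(n, 0.5) null distribution of S and that the large-sample value comes from a normal approximation to 2S − n, which has mean 0 and variance n under the null hypothesis. The values n = 20 and α = 0.05 are illustrative and do not come from the paper, and the function names are introduced here.

```python
import math
from math import comb
from statistics import NormalDist

def size_of_test(n, c):
    """Exact Pr(|2S - n| >= c) when S ~ Binomial(n, 0.5)."""
    return sum(comb(n, s) for s in range(n + 1) if abs(2 * s - n) >= c) / 2 ** n

def smallest_critical_value(n, alpha):
    """Smallest attainable c whose exact size does not exceed alpha."""
    # 2S - n always has the same parity as n, so only those values of c matter.
    for c in range(n % 2, n + 3, 2):
        if size_of_test(n, c) <= alpha:
            return c

n, alpha = 20, 0.05                     # illustrative values only
c_exact = smallest_critical_value(n, alpha)
exact_size = size_of_test(n, c_exact)

# Large-n approximation: 2S - n is roughly N(0, n) under the null hypothesis.
c_approx = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(n)
print(c_exact, round(exact_size, 4), round(c_approx, 3))
```

Because S is discrete, the exact size is generally strictly below α, which is why part 3 is stated as an inequality.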
- END OF MOCK EXAMINATION PAPER -