Module STAT4002 – Basic Data Analysis

2019-2020 Semester 1

Coursework 2

Total........

Due Date: 12:00 noon, Tuesday 26th November 2019 [25 Marks]

Place completed coursework in the red box in the R Building, Wheatley.

Question 1

A medical centre collected data on the blood cholesterol levels of heart attack patients. You will need to retrieve the data file Cholest.mtw.

There are four columns in the Minitab worksheet.

C1 - Cholesterol levels of patients 2 days after heart attack

C2 - Cholesterol levels of patients 4 days after heart attack

C3 - Cholesterol levels of patients 14 days after heart attack

C4 - Cholesterol levels of control group patients

Note: the first three columns contain repeated measurements on the same 28 heart attack patients. The last column holds 30 independent measurements on different people (who have not had a heart attack) serving as controls for this experiment.

Use appropriate graphical and descriptive techniques to make useful comparisons between these measurements. Answers, in note form, must be written here, and neat extracts of Minitab output should be attached to this coursework.

Obtain 90% confidence intervals for the differences between cholesterol levels:

(a) at 2 days after a heart attack and in the control group

(b) at 2 days and at 14 days after a heart attack
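The two intervals above call for different calculations: the control group in (a) is an independent sample, whereas (b) compares repeated measurements on the same patients. The following is a minimal Python sketch of both calculations, offered only as a cross-check on Minitab output. The function names are hypothetical, the critical value `t_crit` is assumed to be looked up in t-tables, and the unpooled (Welch) standard error in the two-sample case is one possible choice rather than a prescribed method.

```python
import math
from statistics import mean, stdev


def paired_ci(first, second, t_crit):
    """CI for the mean of (first - second) measured on the same subjects.

    t_crit is the two-sided critical value for df = n - 1 (from tables).
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(first, second)]
    centre = mean(diffs)
    half_width = t_crit * stdev(diffs) / math.sqrt(len(diffs))
    return centre - half_width, centre + half_width


def two_sample_se_and_df(x, y):
    """Unpooled standard error of mean(x) - mean(y) and the Welch df."""
    vx = stdev(x) ** 2 / len(x)
    vy = stdev(y) ** 2 / len(y)
    se = math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return se, df


def two_sample_ci(x, y, t_crit):
    """CI for mean(x) - mean(y) with independent samples.

    t_crit should match the df returned by two_sample_se_and_df.
    """
    se, _ = two_sample_se_and_df(x, y)
    centre = mean(x) - mean(y)
    return centre - t_crit * se, centre + t_crit * se
```

For a 90% interval, `t_crit` is the upper 5% point of the t distribution with the relevant degrees of freedom.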

How well do the cholesterol levels at 2 days, 4 days and 14 days correlate? Provide the values and a comment.

Obtain two regression equations for the cholesterol level at 14 days: the first should be in terms of the level at 2 days; and the second in terms of the level at 4 days. (Cut & paste your results.)

How much of the variation in the 14-day values can be explained by the 2-day and 4-day values respectively?
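The correlation and regression parts of this question are linked: for a regression on a single predictor, the proportion of variation explained (R-squared) is the square of the correlation coefficient. A minimal Python sketch of the calculation follows. The function name `fit_line` and the data values are hypothetical and are not taken from Cholest.mtw.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = a + b*x.

    Returns (a, b, r, r_squared), where r is the Pearson correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r, r * r


# Hypothetical illustrative values only (NOT the coursework data).
day2 = [200, 220, 240, 260, 280]
day14 = [190, 200, 230, 235, 260]

a, b, r, r2 = fit_line(day2, day14)
print(f"day14 = {a:.2f} + {b:.3f} * day2, r = {r:.3f}, R-sq = {100 * r2:.1f}%")
```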

Q1 Mark:

Question 2

It is believed that students at Wheatley drink less coffee than those at Headington. The weekly coffee consumption of a random sample of students at each location was recorded, with the following results:

                           Wheatley   Headington
Number                     18         25
Mean no. of coffees/week   17.2       19.6
Standard deviation         2.7        2.4

Do the data confirm the belief stated above?

Carry out a significance test at the 5% level by hand, clearly showing the steps involved. What assumptions have you made in testing this belief?
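The arithmetic of a hand calculation can be checked with a short Python sketch. The version below assumes a pooled two-sample t-test (equal population variances) with a one-sided alternative that the Wheatley mean is lower; this is one common approach, not necessarily the one intended by the question. The function name `pooled_t` is hypothetical, and the critical value 1.683 is the approximate upper 5% point of t with 41 degrees of freedom, as read from tables.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled two-sample t statistic from summary statistics.

    Returns (t, df), assuming equal population variances.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Summary statistics from the table: Wheatley first, Headington second.
t_stat, df = pooled_t(18, 17.2, 2.7, 25, 19.6, 2.4)

# Assumed one-sided 5% critical value for df = 41 (approximate, from tables).
T_CRIT = 1.683

# H0: the means are equal; H1: the Wheatley mean is lower (lower-tail test).
reject_h0 = t_stat < -T_CRIT
print(f"t = {t_stat:.3f} on {df} df, reject H0 at 5%: {reject_h0}")
```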

Q2 Mark:

Question 3

This question uses the Pulse data you have met before; instead of simply looking at graphical displays and summary measures, you should now estimate values and test hypotheses.

First you need to retrieve the pulse data, pulse.mtw.

In answering these questions, you will need to submit extracts of Minitab printout.

(i) Obtain a 95% confidence interval for the population mean initial (at rest) pulse rate, assuming that the sample of 92 is a random sample from the general population.

(ii) Many physiological measures are normally distributed. Is this the case here for pulse1? Check using an appropriate Normality test. Provide a graph as evidence to support your conclusion.
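One way to attach a number to a normal probability plot is the correlation between the ordered data and their expected normal scores: values close to 1 are consistent with Normality. The Python sketch below illustrates the idea, which is similar in spirit to a Ryan-Joiner-type test. The function names and the plotting positions (i - 3/8)/(n + 1/4) are assumptions, so the result may not match Minitab's output exactly, and no critical values are supplied here.

```python
import math
from statistics import NormalDist


def normal_scores(n):
    """Approximate expected normal order statistics for a sample of size n.

    Uses the plotting positions (i - 3/8) / (n + 1/4), an assumed choice.
    """
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def probability_plot_correlation(values):
    """Correlation between the sorted data and their normal scores."""
    ordered = sorted(values)
    return pearson_r(ordered, normal_scores(len(ordered)))
```

Plotting the sorted data against `normal_scores(n)` gives the corresponding normal probability plot; points lying close to a straight line support Normality.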

(iii) For this part of the question you need to produce a new variable in C10 showing the differences between the pulse rates at the start and after one minute, i.e. C2-C1.

Perform a t-test to determine if there is a difference at the 5% level of significance between the pulse rate changes in the two groups (runners and non-runners).

You must use Minitab, but you need to write this up clearly as a full statistical test. You should copy and paste the results you obtain from Minitab.

(iv) What two assumptions does the test performed in (iii) rely on?

(v) Without doing further Minitab work, you can check the validity of one of these assumptions from the output you obtained in (iii). Explain why you think that the assumption is valid or not.

Q3 Mark:
