hw8 and task 2
Class 8 Homework
Answer the question: Is there a linear relationship between body weight and blood pressure and if yes, by how much does the blood pressure increase for every 1 kg of body weight?
Background:
In the last homework you calculated the association of obesity with hypertension using the chi-square test for categorical variables. Obesity and hypertension status were defined by cutoff values and were scored simply as yes or no.
It is possible that there is a threshold or cutoff for weight above which blood pressure increases dramatically. However, it is more likely that blood pressure increases gradually with weight. And once the weight reaches obesity status the blood pressure often reaches hypertension levels (I assume you all know that hypertension means that the blood pressure is abnormally high).
To test the hypothesis that blood pressure increases with weight, even before obesity and hypertension levels are reached, a pilot study was conducted in which the systolic blood pressure and weight of 23 healthy individuals were measured. (Note that the study is fictitious; real studies would have many more subjects.)
Ensuring various assumptions are met for the statistical tests:
The individuals were selected randomly, were not related, and data were collected in the same manner from each of the individuals. The study also controlled for other factors (besides weight) that may influence blood pressure (some of them with a known strong influence):
· The individuals were all in their 20s and 30s (the average blood pressure increases slightly with age, more pronounced after the age of 60).
· They did not have a family history of high blood pressure.
· They were not known to be diabetic or to suffer from kidney disease.
Blood pressure measurements can fluctuate when taken several times in a row and may also depend on who takes the blood pressure and with which method. For this study (as is common for clinical studies), blood pressure was measured 3 times in a row over a period of about 1 hr (not just once, as in the one-time measurement that you get at the doctor’s office). So we make the assumption here that the measurements are reliable.
Data set file (Excel): DATASET of Systolic Blood Pressure and weight measurements for 23 individuals.xls
Note: In the data set file I created 5 tabs to enter your answers. The tabs are numbered according to the instructions below (Step 5: Excel or Minitab Output, Step 6: Parameter template, Step 7: Fitted Line Plot, Step 8: Residuals analysis, Step 9: Conclusions)
Detailed instructions:
1. Rename the Excel file containing the dataset by adding your name.
2. Next to the dataset, create a scatter plot of the data (x axis: body weight, y axis: systolic blood pressure).
Give the plot a title and label each axis with the variable and the measurement unit (the latter in parentheses).
3. From the scatter plot, determine visually if there seems to be a relationship between body weight and blood pressure in the provided data set. If so, what kind of relationship do you presume (linear pos. correlation - up, linear neg. correlation – down)?
4. Underneath the scatter plot, in the text box provided, write 2 or 3 sentences about any relationship you hypothesize. (To create a text box yourself, go to INSERT > Text > Textbox; then click somewhere in the file and start typing into the textbox created. Click on the textbox and pull it in or out to change its size.)
5. Perform a linear regression analysis on the dataset (in Minitab or Excel). Save or copy the results to the tab “Step 5: Excel or Minitab Output”.
6. Provide the following results from the regression analysis (enter them into the tab “Step 6: Parameter template”):
a. R square value (I am writing this out instead of writing R², since Minitab and Excel call it R square in their output)
b. R value: you will have to calculate the R value by hand; Excel and Minitab don’t provide it. Note: the square root of R² is always positive. However, R takes the sign of the slope: if the slope is negative, R is negative; if the slope is positive, R is positive (see Lecture 8, slide 7). Give your R value the correct sign manually, + or –.
c. Standard error of the residuals
d. P value for the regression
e. Values of the coefficients (slope and y-intercept)
f. Confidence interval for the slope (slope only, not the y-intercept)
Note: Minitab doesn’t calculate this, but Excel does. See posted file “Excel’s Regression Analysis Tool”, tab 2.
g. Equation of the straight line
h. The amount by which the blood pressure increases (in mmHg) per kg of body weight, including its confidence interval
Tip: This is the slope with its confidence interval, stated with the measurement units of x and y. The slope is defined as the increase in y per unit of x.
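If you would like to cross-check your Excel or Minitab output, the parameters in step 6 can be sketched in a few lines of Python. This is only an illustration and not part of the assignment. The data values are made-up placeholders (not the course dataset), all variable names are hypothetical, and the t value is taken from a standard t table for the placeholder sample size. The p value is not computed here, since it requires the t distribution that Excel and Minitab supply.

```python
import math

# Placeholder values for illustration only; these are NOT the course dataset.
weight_kg = [55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0]
sbp_mmhg = [110.0, 114.0, 113.0, 119.0, 121.0, 120.0, 126.0, 128.0]

n = len(weight_kg)
mean_x = sum(weight_kg) / n
mean_y = sum(sbp_mmhg) / n

# Sums of squares and cross-products around the means.
sxx = sum((x - mean_x) ** 2 for x in weight_kg)
syy = sum((y - mean_y) ** 2 for y in sbp_mmhg)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weight_kg, sbp_mmhg))

# Least-squares coefficients (items e and g).
slope = sxy / sxx                    # mmHg per kg
intercept = mean_y - slope * mean_x  # mmHg

# R square, and R carrying the sign of the slope (items a and b).
r_square = sxy ** 2 / (sxx * syy)
r_value = math.copysign(math.sqrt(r_square), slope)

# Standard error of the residuals, with n - 2 degrees of freedom (item c).
residuals = [y - (intercept + slope * x) for x, y in zip(weight_kg, sbp_mmhg)]
deg_freedom = n - 2
std_error_residuals = math.sqrt(sum(e ** 2 for e in residuals) / deg_freedom)

# 95% confidence interval for the slope (items f and h).
# Assumption: t_crit is the two-sided 95% t-table value for 6 degrees of
# freedom (the placeholder data). For 23 subjects (21 df) it would be 2.080.
t_crit = 2.447
slope_std_error = std_error_residuals / math.sqrt(sxx)
ci_low = slope - t_crit * slope_std_error
ci_high = slope + t_crit * slope_std_error

print(f"R square = {r_square:.3f}, R = {r_value:+.3f}")
print(f"Standard error of residuals = {std_error_residuals:.3f} mmHg")
print(f"Line: SBP = {intercept:.2f} + {slope:.3f} * weight")
print(f"Slope = {slope:.3f} mmHg/kg, 95% CI {ci_low:.3f} to {ci_high:.3f}")
```

The values printed for your own data should agree with the Excel or Minitab output up to rounding; if they do not, check that x and y were not swapped.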
7. Provide a scatter plot of the data with the line of best fit (“Fitted Line Plot”). Include the equation of the line and the R² value in the plot. You can create this plot in Excel and/or copy the image of the Minitab Fitted Line Plot into your results file.
8. Discuss the normality assumption: A good regression analysis includes plots showing whether the residuals are roughly normally distributed over the entire range of the analysis. Add the following two plots from your regression analysis (copy and paste them into your results file) and discuss, again in a text box, whether you think the assumption of normality is met:
a. The “Residuals vs. Fits Plot” (Minitab) or the “X vs. Residuals Plot” (Excel)
b. The “Normal Probability Plot”
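For orientation, the numbers behind these two plots can be sketched in Python as follows. This is only an illustration and not part of the assignment. The data values are made-up placeholders (not the course dataset), the variable names are hypothetical, and the plotting position (i + 0.5)/n used for the normal scores is one common convention that may differ slightly from what Minitab or Excel use.

```python
from statistics import NormalDist

# Placeholder values for illustration only; these are NOT the course dataset.
weight_kg = [55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0]
sbp_mmhg = [110.0, 114.0, 113.0, 119.0, 121.0, 120.0, 126.0, 128.0]

# Least-squares fit of blood pressure on weight.
n = len(weight_kg)
mean_x = sum(weight_kg) / n
mean_y = sum(sbp_mmhg) / n
sxx = sum((x - mean_x) ** 2 for x in weight_kg)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weight_kg, sbp_mmhg))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual = observed value minus fitted value.
fitted = [intercept + slope * x for x in weight_kg]
residuals = [y - f for y, f in zip(sbp_mmhg, fitted)]

# Coordinates of the residuals vs. fits plot: look for an even band
# around zero without a curve or funnel shape.
print("fitted (mmHg)  residual (mmHg)")
for f, e in zip(fitted, residuals):
    print(f"{f:13.2f}  {e:15.2f}")

# Coordinates of the normal probability plot: sorted residuals against
# the quantiles expected from a standard normal distribution.
# Assumption: plotting position (i + 0.5) / n with i counted from 0.
sorted_residuals = sorted(residuals)
normal_scores = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
print("normal score   sorted residual (mmHg)")
for q, e in zip(normal_scores, sorted_residuals):
    print(f"{q:12.3f}  {e:22.2f}")
```

If the residuals are roughly normal, the points of the second table fall close to a straight line when plotted.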
9. Create another textbox in your results file and state your results in a few sentences by answering the following questions (don’t repeat the questions! They are just an aid for you to write your conclusions):
a. Is the increase in blood pressure with weight linear?
b. How good is the correlation (state the R value; note that it is R square, not R, that gives the fraction of the variation in Y that can be explained by the variation in X)?
c. How significant is the regression analysis (state the p value: the probability of seeing a relationship at least this strong from random scatter alone if there were truly no correlation)?
d. By how much does the blood pressure increase per kg of body weight?
e. Have the assumptions for linear regression been met (that is, can the data be trusted)?