DATA ANALYSIS ASSIGNMENT #2: Analysis of Heart Data Set

This assignment uses a data set that is included as part of the SAS software for training purposes. For this assignment, you will demonstrate that you can use the SAS OnDemand Web Editor to:

· open a data set

· print out information on the format of the data

· compute basic descriptive statistics, graphs, and tables

NOTE: Before you complete this assignment, you must sign up for SAS OnDemand for Academics for the course COH 602 offered at National University. A letter with a link explaining how to do this is posted on the course website.

NOTE: The code shown below is an example. Replace “height” and “weight” with the two variables you choose and everything should run correctly. Do not use the same pair of variables as the example (height and weight).

1. Go to the SAS OnDemand for Academics website and log in to your account.

2. On your SAS OnDemand profile page, click “SAS® Studio” (under “Applications”); the SAS Web Editor should open.

3. Select two numeric variables of interest from the HEART data set. In your previous assignment, you ran PROC CONTENTS, which listed the variable names and types. Alternatively, expand “Libraries” and “My Libraries” and then double-click “HEART” to open the data set and look at it.
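As a reminder of how the variable names and types can be listed, a minimal sketch of a PROC CONTENTS step follows. The exact statement used in the previous assignment is not shown in this handout, so treat this form as an assumption.

```sas
/* List the variable names, types, and labels in the HEART data set. */
/* Numeric variables are shown with Type = Num in the output table.  */
PROC CONTENTS DATA = sashelp.heart;
RUN;
```

Pick your two variables from those listed as numeric.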

4. Compute descriptive statistics by using PROC UNIVARIATE as follows:

PROC UNIVARIATE DATA = sashelp.heart;
    VAR height weight;
RUN;

5. Create a histogram for each of your two variables:

PROC SGPLOT DATA = sashelp.heart;
    HISTOGRAM height;
RUN;

PROC SGPLOT DATA = sashelp.heart;
    HISTOGRAM weight;
RUN;
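When judging skewness from a histogram, it can help to overlay a reference curve. The following is an optional sketch, not part of the required steps, that adds a fitted normal density to the histogram using the DENSITY statement of PROC SGPLOT. The variable name height is only the example variable from this handout.

```sas
/* Optional: histogram with a fitted normal curve overlaid.         */
/* A longer tail on one side of the curve suggests a skewed shape.  */
PROC SGPLOT DATA = sashelp.heart;
    HISTOGRAM height;
    DENSITY height / TYPE = NORMAL;
RUN;
```

If the bars extend noticeably farther from the curve on the right than on the left, that is visual evidence of positive skew (and the reverse for negative skew).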

6. Sort your data by sex and write the result to a new temporary data set named temp (SAS will not output any results for this step; the data are simply being sorted internally).

PROC SORT DATA = sashelp.heart OUT = temp;
    BY sex;
RUN;
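Because the sort step produces no visible results, one way to confirm that it worked is to print the first few rows of the sorted data set. This is an optional check, sketched here with the example variables; the choice of 10 rows is arbitrary.

```sas
/* Optional check: print the first 10 rows of the sorted data set. */
/* All 10 rows should show the same value of sex, the one that     */
/* comes first in sort order.                                      */
PROC PRINT DATA = temp (OBS = 10);
    VAR sex height weight;
RUN;
```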

7. Run the univariate procedure again, this time on the temporary data set created in step 6, with a BY statement so that the statistics are computed separately for each sex.

PROC UNIVARIATE DATA = temp;
    VAR height weight;
    BY sex;
RUN;
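For comparison only, PROC UNIVARIATE can also produce per-group statistics with a CLASS statement, which does not require the data to be sorted first. The sketch below is an alternative to the sort-then-BY approach, not the method this assignment asks you to use.

```sas
/* Alternative sketch: per-sex statistics without a prior PROC SORT. */
/* CLASS groups the analysis by the values of sex.                   */
PROC UNIVARIATE DATA = sashelp.heart;
    CLASS sex;
    VAR height weight;
RUN;
```

The statistics for each sex should match those from the BY-group run; only the layout of the output differs.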

8. Run the SGPLOT procedure again on the temporary data set created in step 6, with a BY statement so that a separate histogram is drawn for each sex.

PROC SGPLOT DATA = temp;
    HISTOGRAM height;
    BY sex;
RUN;

PROC SGPLOT DATA = temp;
    HISTOGRAM weight;
    BY sex;
RUN;
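If you would like to compare the two sexes on a single page, PROC SGPANEL can draw one histogram panel per group. This is an optional sketch, not one of the required steps; stacking the panels in one column (COLUMNS = 1) is a layout choice made here so the two distributions line up vertically.

```sas
/* Optional: one histogram panel per sex, stacked in a single column */
/* so the two distributions can be compared on the same scale.       */
PROC SGPANEL DATA = sashelp.heart;
    PANELBY sex / COLUMNS = 1;
    HISTOGRAM height;
RUN;
```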

9. Write a one-page summary of your analysis in which you discuss the results of the statistical procedures you ran in SAS. Report the mean, standard deviation, median, interquartile range, and mode for each numeric variable. Interpret what these values tell you about the sample of people you are studying, based upon these variables. Using the histograms from PROC SGPLOT, as well as the mean and median values you produced, explain whether each variable is positively skewed, negatively skewed, or symmetric. Lastly, did sorting the data by sex, and then analyzing each sex separately, change the descriptive statistics and histograms for your two variables? If you use the “skewness” statistic as part of your evidence for this part, make sure that you understand how to interpret it correctly (see notes below). For every statement you make about a variable in your interpretation, support that statement with evidence from the descriptive statistics you have produced. “Evidence” consists of actual statistics (report the numbers) that support what you are stating.

This paper should be typed and double-spaced in a Word document and submitted to the course dropbox for this assignment. Again, do not forget to type your name in the document!
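The statistics requested for the summary are spread across several PROC UNIVARIATE output tables. As a convenience, the sketch below uses PROC MEANS to collect them in one compact table, first overall and then by sex. It is offered as an optional aid rather than a required step; the MAXDEC = 2 rounding is an arbitrary choice, and height and weight are the handout's example variables.

```sas
/* Optional: one table with the statistics named in the summary.   */
/* QRANGE is the interquartile range (Q3 minus Q1).                */
PROC MEANS DATA = sashelp.heart
           N MEAN STD MEDIAN Q1 Q3 QRANGE MODE SKEWNESS MAXDEC = 2;
    VAR height weight;
RUN;

/* The same table, computed separately for each sex. */
PROC MEANS DATA = sashelp.heart
           N MEAN STD MEDIAN Q1 Q3 QRANGE MODE SKEWNESS MAXDEC = 2;
    CLASS sex;
    VAR height weight;
RUN;
```

A skewness value near zero suggests a roughly symmetric distribution, a positive value suggests a longer right tail (the mean typically exceeds the median), and a negative value suggests a longer left tail.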

Below is the grading rubric with the criteria that I will be using for the grading. Please use this as a guide when writing your summary to ensure that you do not miss anything important.

| Grading Item | Points Possible | Points Earned | Comments |
|---|---|---|---|
| Descriptive statistics reported for each type of variable are accurate | 2 | | |
| Interpretation of descriptive statistics for each variable is accurate | 4 | | |
| Discussion of skewness of distribution for each numeric variable is accurate | 4 | | |
| Interpretation of effect of sorting upon results for each variable is accurate | 4 | | |
| Summary has the correct format and length; writing is clear and understandable | 4 | | |
| TOTAL | 18 | | |