The attached CSV file contains yearly information about hospitals. The following variable definitions apply.
1) year: year in which the observation was made, coded 1 = 2001, 2 = 2002, 3 = 2003
2) hospital: hospital id, from 1 to 24
3) pop: potential population supported in the community
4) enrollment: actual enrolled individuals from the community (cannot exceed population)
5) RWPs: inpatient weighted workload (standardized workload)
6) RVUs: outpatient weighted workload (standardized workload)
7) prevscore: prevention score (0 to 100), percent of 10 metrics achieved, empirical
8) expenditures: how much money was spent by this facility during the year
9) education: whether or not a facility provides graduate medical education, {0,1}
10) admissions: number of admissions during the year
11) dispositions: number of dispositions during the year
12) beddays: number of bed days during the year
13) visits: number of outpatient visits during the year
14) ftes: number of full-time employees per year
15) satisfaction: overall patient satisfaction, percent by year
16) access: satisfaction with access, percent by year
1) Import the dataset into R. Identify whether each variable is quantitative or qualitative. Identify the level of measurement of each variable.
2) Provide appropriate descriptive statistics for all variables. Be sure to provide measures of center and measures of dispersion.
3) Provide graphs of all variables (boxplots, histograms, bar charts, etc.) as appropriate. DO NOT provide qualitative graphs for quantitative data and vice versa.
4) Test the hypothesis that patient satisfaction (15) and access (16) are correlated. Provide a scatterplot of the two. What might the result mean?
5) Test the hypothesis that mean prevention scores (7) differ based on whether graduate medical education (GME) is present (9).
6) Test the hypothesis that mean expenditures (8) for all facilities are identical for all three years (1). Run appropriate post-hoc tests. Interpret all tests.
7) Test whether expenditures (8) is normally distributed. Provide a complete interpretation of the tests. If it is not, find the best transformation and add the transformed variable to your dataset (17, if needed).
8) Using expenditures (8) or your transformed variable (17), build a model of expenditures that includes all quantitative variables. Test for collinearity among the variables. If you find collinearity, remove one of the offending variables and re-run the analysis. Interpret all output. Provide examples of how this model might be used to forecast expenditures for a facility within this three-year period.
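One way tasks 1 and 4 through 8 might be approached in base R is sketched below. It is not a model answer. It assumes the CSV's column names match the variable names listed above, and the function names (prepare_hospitals, run_tests, vif_manual) are made up for illustration. The choice of tests (Pearson correlation, Welch t-test, one-way ANOVA with Tukey HSD, Shapiro-Wilk, hand-computed variance inflation factors) is one reasonable option among several. Note that the ANOVA here treats the hospital-year observations as independent, which ignores that each hospital is measured in all three years.

```r
# Sketch only: `df` is assumed to be a data frame whose columns are named
# exactly as in the variable list above; the real file may differ.

# Task 1: mark the qualitative variables as factors.
prepare_hospitals <- function(df) {
  df$year      <- factor(df$year, levels = 1:3, labels = c("2001", "2002", "2003"))
  df$hospital  <- factor(df$hospital)
  df$education <- factor(df$education, levels = c(0, 1), labels = c("no GME", "GME"))
  df
}

# Tasks 4 to 7: return the test objects so each can be inspected and interpreted.
run_tests <- function(df) {
  # One-way ANOVA; assumes independent observations (see the caveat above).
  fit_aov <- aov(expenditures ~ year, data = df)
  list(
    correlation = cor.test(df$satisfaction, df$access),    # task 4, Pearson
    gme_ttest   = t.test(prevscore ~ education, data = df), # task 5, Welch
    anova       = summary(fit_aov),                         # task 6
    posthoc     = TukeyHSD(fit_aov),                        # task 6, post-hoc
    normality   = shapiro.test(df$expenditures)             # task 7
  )
}

# Task 8: variance inflation factor for each predictor, computed by hand as
# 1 / (1 - R^2), where R^2 comes from regressing it on the other predictors.
vif_manual <- function(df, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = df))$r.squared
    1 / (1 - r2)
  })
}
```

After reading the file with read.csv (the file name is not given here), passing the result through prepare_hospitals and then run_tests returns the test objects for inspection. vif_manual takes the prepared data and a character vector of predictor names; large values flag predictors that are nearly linear combinations of the others.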