Data Analytics for Healthcare decision Making 3
8/16/25, 12:48 PMPredicting an Outcome Using Regression Models Walkthrough Transcript
Page 1 of 3https://media.capella.edu/CourseMedia/mha5017element19451/transcript.asp
Capella University
Predicting an Outcome Using Regression Models
Walkthrough
In this video, we will use multiple regression models in Microsoft Excel to generate a prediction to support a health care decision.
Regression is a statistical tool used to understand the relationship between two or more dependent and independent variables. Multiple regression models help analyse the impact of every independent variable on a dependent variable.
In this assessment, we will perform a regression analysis using the Analysis ToolPak add-in in Microsoft Excel to understand the relationship between the cost of cancer treatment and the number of admissions in the month, number of patients insured, and cost of medical supplies; we will then generate a prediction report that reflects the adjusted cost of cancer treatment with respect to the independent variables affecting it. The data used for the analysis consist of the cost of cancer treatment in dollars, number of admissions in a month, number of admitted patients with insurance coverage, and cost of medical supplies in dollars. Here, the cost of cancer treatment is a dependent variable, while the number of admissions in the month, number of patients insured, and cost of medical supplies are independent variables. Variables that vary based on other variables are called dependent variables, or the outcome, and variables that affect dependent variables are called independent variables, or the predictors.
Let us begin. Select Data Analysis in the Data tab in the Excel sheet and select Regression from the Data Analysis dialog box. Select OK. For the Input Y Range, select the cost of cancer treatment column along with the label field. For the Input X Range, select the admissions in the month, number of patients insured, and cost
8/16/25, 12:48 PMPredicting an Outcome Using Regression Models Walkthrough Transcript
Page 2 of 3https://media.capella.edu/CourseMedia/mha5017element19451/transcript.asp
of medical supplies columns along with their respective labels. Enable the “Labels” option. Do note that the confidence level is set at 95% by default, which indicates that our threshold for tolerable error is 5%, or 0.05. Next, select an Output Range in the same sheet. This makes it easier to relate the output to the data set. Select OK
to generate a summary output. The summary output displays three components: a regression statistics table, an ANOVA table, and a regression coefficients table.
Using the process discussed above, we will see how the independent variables (predictors) of our data set impact the dependent variable (outcome). R-square values range from 0 to 1 and are expressed in percentages from 0% to 100%. The R-square value for this data set indicates the percentage change in cost of cancer treatment due to variation in the number of admissions in the month, number of patients insured, and cost of medical supplies. For the given data, the summary output indicates an R-square value of 0.70, which means that any movement in the predictors will have a 70% impact on the outcome. The p-value is interpreted based on the threshold for tolerable error, which is 0.05 in the given case. A p- value greater than 0.05 indicates that the predictor does not have a significant impact on the outcome. Our data set indicates an overall p-value (intercept) of 0.0105. The number of admissions in the hospital and the cost of medical supplies have p-values of 0.0004 and 0.0009, respectively, and significantly impact the cost of cancer treatment. The number of patients with insurance coverage has a p- value of 0.30 and does not significantly impact the cost of cancer treatment.
Using the regression model, we can also generate a prediction for the adjusted cost of cancer treatment after considering the independent variables affecting it. From the given data set, let us look at the variables for the month of January 2018. A prediction is generated using the equation Y = β0 + β1 × X1 + β2 × X2 + β3 × X3,
where β0 = intercept = 23062.17, β1 = admissions = 156.65, β2 = insured patients
= 315.44, and β3= cost of medical supplies= -288.63. The variables X1, X2, and X3
are the number of admissions in the month, number of patients insured, and cost of medical supplies in a month, respectively. For the month of January 2018, X1 =
46, X2 = 22, and X3 = 75. By substituting the amounts in the equation, the value of
8/16/25, 12:48 PMPredicting an Outcome Using Regression Models Walkthrough Transcript
Page 3 of 3https://media.capella.edu/CourseMedia/mha5017element19451/transcript.asp
Y, or the prediction (of the adjusted cost), is as follows:
Y = 23062.17 + 156.65 × 46 + 315.44 × 22 + -288.63 × 75
Y = 15560.51756
Thus, the predicted cost of cancer treatment for January 2018 is $15,561.
Our regression analysis shows that the independent variables— number of admissions in the month, number of patients insured, and cost of medical supplies — have an overall significant impact on the cost of cancer treatment. The analysis also uses the prediction equation to show the quantitative impact of the independent variables on the cost of cancer treatment.
In this video, we have learned how to use regression models to generate prediction reports that help in decision-making.
! BACK TO MEDIA " BACK TO TOP
Licensed under a Creative Commons Attribution 3.0 License