Biostats Logistic Regression SPSS
Part 3
Step-by-Step Guide to Assignment 6.3
Odds Ratios
Problem 3. Perform a simple logistic regression using SPSS and the practice problem 6.3 data set. Answer the following questions based on your SPSS output.
Step 1. Open the practice_problem_6.3.sav dataset.
Step 2. Go to Analyze > Regression > Binary Logistic.
Step 3. Place Survival_Status in the Dependent box and place Gender in the Covariate(s) box. Click Options.
Step 4. In the Logistic Regression: Options window, check Classification plots, Casewise listing of residuals, and CI for exp(B). Make sure CI for exp(B) is set to 95%. Click Continue. Click OK.
SPSS Output:
Case Processing Summary
| Unweighted Cases (a) |                      | N   | Percent |
|----------------------|----------------------|-----|---------|
| Selected Cases       | Included in Analysis | 181 | 100.0   |
|                      | Missing Cases        | 0   | .0      |
|                      | Total                | 181 | 100.0   |
| Unselected Cases     |                      | 0   | .0      |
| Total                |                      | 181 | 100.0   |
a. If weight is in effect, see classification table for the total number of cases.
The Case Processing Summary table shows there are 181 cases in the data set and there are no missing data. It also shows the percentages of cases represented in the regression analysis.
Dependent Variable Encoding
| Original Value | Internal Value |
|----------------|----------------|
| ALIVE          | 0              |
| DEAD           | 1              |
The Dependent Variable Encoding table shows that SPSS has numerically coded the two levels of survival status, which is a string variable in the data set: Alive = 0 and Dead = 1. Because Dead is coded 1, the model estimates the odds of death.
Categorical Variables Codings
|        |        | Frequency | Parameter coding (1) |
|--------|--------|-----------|----------------------|
| Gender | FEMALE | 53        | 1.000                |
|        | MALE   | 128       | .000                 |
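SPSS has also coded the Gender variable levels. The Categorical Variables Codings table shows that males (coded .000) are the reference category and females are coded 1.000, so the Gender(1) row of the output will provide the odds ratio for females relative to males.
The first set of output after the tables above is the Block 0 output: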
Classification Table (a, b)
|        | Observed           |       | Predicted: ALIVE | Predicted: DEAD | Percentage Correct |
|--------|--------------------|-------|------------------|-----------------|--------------------|
| Step 0 | Survival_Status    | ALIVE | 150              | 0               | 100.0              |
|        |                    | DEAD  | 31               | 0               | .0                 |
|        | Overall Percentage |       |                  |                 | 82.9               |
a. Constant is included in the model.
b. The cut value is .500
Variables in the Equation
|        |          | B      | S.E. | Wald   | df | Sig. | Exp(B) |
|--------|----------|--------|------|--------|----|------|--------|
| Step 0 | Constant | -1.577 | .197 | 63.862 | 1  | .000 | .207   |
The two tables above (Classification Table and Variables in the Equation table) reflect the predicted results of survival without any independent variables included in the model. Block 0 is also called the “constant only” model or the “reduced model”. It serves as the baseline to which a regression model with independent or predictor variables will be compared.
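The Block 0 values can be checked by hand from the observed counts in the Classification Table. The following is a minimal sketch in Python (not part of the SPSS workflow; the variable names are illustrative) that assumes the constant-only model simply reproduces the observed odds of the outcome coded 1 (Dead):

```python
import math

# Observed outcome counts from the Block 0 Classification Table.
alive = 150
dead = 31
total = alive + dead

# With no predictors, the fitted odds of the event coded 1 (DEAD)
# equal the observed odds of death in the sample.
baseline_odds = dead / alive           # corresponds to Exp(B) for the constant
baseline_b = math.log(baseline_odds)   # corresponds to B for the constant

# The predicted probability of DEAD is below the .500 cut value, so every
# case is classified as ALIVE; only the ALIVE cases are classified correctly.
p_dead = dead / total
percent_correct = 100 * alive / total

print(f"Exp(B) for constant: {baseline_odds:.3f}")
print(f"B for constant: {baseline_b:.3f}")
print(f"Predicted probability of DEAD: {p_dead:.3f}")
print(f"Overall percentage correct: {percent_correct:.1f}")
```

The printed values should agree with the Constant row (B and Exp(B)) and with the Overall Percentage in the Classification Table.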
The second set of output is labeled Block 1:
Omnibus Tests of Model Coefficients
|        |       | Chi-square | df | Sig. |
|--------|-------|------------|----|------|
| Step 1 | Step  | 5.506      | 1  | .019 |
|        | Block | 5.506      | 1  | .019 |
|        | Model | 5.506      | 1  | .019 |
The Omnibus Tests of Model Coefficients table compares the full (Block 1) model to the baseline (Block 0) model. If the chi-square significance (p) value is <0.05, then the Block 1 model is a significantly better predictor than the Block 0 model.
In this problem, the significance value is 0.019, which is less than 0.05, so the Block 1 model is a significantly better predictor than the Block 0 model.
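The omnibus chi-square is the drop in -2 log likelihood from the Block 0 model to the Block 1 model. The sketch below (Python, for illustration only) assumes the Block 0 value can be computed from the observed counts of 150 alive and 31 dead, and takes the Block 1 value of 160.252 from the Model Summary table. For 1 degree of freedom, the upper-tail chi-square probability can be written with the complementary error function.

```python
import math

# Observed outcome counts (Block 0 Classification Table).
n_alive, n_dead = 150, 31
n = n_alive + n_dead

# -2 log likelihood of the constant-only (Block 0) model: every case is
# assigned the overall sample proportion for its observed outcome.
neg2ll_block0 = -2 * (
    n_alive * math.log(n_alive / n) + n_dead * math.log(n_dead / n)
)

# -2 log likelihood of the Block 1 model, as reported in the Model Summary.
neg2ll_block1 = 160.252

# Likelihood-ratio (omnibus) chi-square with 1 degree of freedom.
chi_square = neg2ll_block0 - neg2ll_block1

# Upper-tail p-value for chi-square with df = 1: P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"-2LL (Block 0): {neg2ll_block0:.3f}")
print(f"Chi-square: {chi_square:.3f}")
print(f"p-value: {p_value:.3f}")
```

The erfc shortcut applies only when df = 1; a model with more predictors would need a general chi-square distribution function.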
Model Summary
| Step | -2 Log likelihood | Cox & Snell R Square | Nagelkerke R Square |
|------|-------------------|----------------------|---------------------|
| 1    | 160.252 (a)       | .030                 | .050                |
a. Estimation terminated at iteration number 5 because parameter estimates changed by less than .001.
Interpretation:
What do the R Square values in the Model Summary table mean?
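As background for this question, both values are pseudo R-square measures built from the model likelihoods rather than from explained variance as in linear regression. A minimal sketch of the usual formulas follows (Python, for illustration only). It assumes n = 181, the omnibus chi-square of 5.506, and a Block 0 -2 log likelihood equal to 160.252 + 5.506.

```python
import math

n = 181                  # cases included in the analysis
chi_square = 5.506       # omnibus (model) chi-square
neg2ll_block1 = 160.252  # -2 log likelihood of the fitted model
neg2ll_block0 = neg2ll_block1 + chi_square  # implied baseline -2LL

# Cox & Snell R Square: 1 - (L0 / L1)^(2/n), written with the chi-square.
cox_snell = 1 - math.exp(-chi_square / n)

# Maximum value Cox & Snell can reach for this data set: 1 - L0^(2/n).
cox_snell_max = 1 - math.exp(-neg2ll_block0 / n)

# Nagelkerke R Square rescales Cox & Snell so that its maximum is 1.
nagelkerke = cox_snell / cox_snell_max

print(f"Cox & Snell R Square: {cox_snell:.3f}")
print(f"Nagelkerke R Square: {nagelkerke:.3f}")
```

Both values are small here, which suggests that gender alone accounts for little of the variation in survival status even though the model is statistically significant.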
Variables in the Equation
|            |           | B      | S.E. | Wald   | df | Sig. | Exp(B) | 95% C.I. for EXP(B): Lower | 95% C.I. for EXP(B): Upper |
|------------|-----------|--------|------|--------|----|------|--------|----------------------------|----------------------------|
| Step 1 (a) | Gender(1) | -1.186 | .563 | 4.434  | 1  | .035 | .305   | .101                       | .921                       |
|            | Constant  | -1.319 | .217 | 37.081 | 1  | .000 | .267   |                            |                            |
a. Variable(s) entered on step 1: Gender.
The values of interest in the Variables in the Equation table are the significance of the Wald statistic, the Exp(B), and the 95% confidence interval for Exp(B). The Wald test determines whether a predictor variable makes a significant contribution to the model. A Sig. (p-value) of the Wald <0.05 indicates a significant contribution. Exp(B) is the odds ratio (OR) for the independent variable. It gives the multiplicative change in the odds of the dependent variable (here, the outcome coded 1, Dead) resulting from a one-unit change in the predictor variable.
An Exp(B) from 0.0 to less than 1.0 indicates an inverse relationship between the independent and the dependent variables. In this problem the modeled outcome is Dead (coded 1), so an Exp(B) <1.0 means the category has lower odds of death, that is, it is more likely to survive than the reference category.
An Exp(B) >1.0 indicates a positive relationship between the independent and dependent variables. In this problem an Exp(B) >1.0 means the category has higher odds of death, that is, it is less likely to survive than the reference category.
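The Wald statistic, Exp(B), and its confidence interval all follow from B and its standard error. The sketch below (Python, for illustration only) recomputes them from the rounded Gender(1) values in the table, assuming the usual normal-approximation interval with a critical value of 1.96. Small differences from the SPSS output can arise because the table shows B and S.E. to only three decimals.

```python
import math

# Gender(1) row from the Step 1 Variables in the Equation table.
b = -1.186
se = 0.563

# Wald chi-square (df = 1) is the squared ratio of B to its standard error.
wald = (b / se) ** 2

# Two-sided p-value for a chi-square with df = 1 via the normal tail.
p_value = math.erfc(abs(b / se) / math.sqrt(2))

# Odds ratio and its 95% confidence interval.
z_crit = 1.96  # approximate 97.5th percentile of the standard normal
exp_b = math.exp(b)
ci_lower = math.exp(b - z_crit * se)
ci_upper = math.exp(b + z_crit * se)

print(f"Wald: {wald:.3f}")
print(f"Sig.: {p_value:.3f}")
print(f"Exp(B): {exp_b:.3f}")
print(f"95% CI: {ci_lower:.3f} to {ci_upper:.3f}")
```

Because the interval does not include 1.0, it leads to the same conclusion as the Wald test at the 0.05 level.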
Interpretation:
Your interpretation must cover the following:
· Using data from the table to compare survival among males and females
· Identifying whether there is a significant association between gender and survival based on the CI for gender in this model
a. Are the results of the simple logistic regression similar to or different from the results of the simple odds ratio?
The OR from SPSS is the inverse of the OR calculated in problem 6.1.
b. How are they similar or different? Include output from SPSS and an interpretation of the OR and confidence intervals in your response.
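The OR hand calculated in problem 6.1 was 3.275 (95% CI 1.09, 9.88). This means that the odds of survival for females are more than 3 times the odds for males. The OR calculated by SPSS is 0.305 (95% CI 0.10, 0.92). Read as males compared to females, this means that the odds of survival for males are about 0.3 times (roughly 70% lower than) the odds for females. Neither confidence interval includes 1.0, so the association between gender and survival is significant in both analyses. The ORs are reciprocals of each other.
The hand-calculated OR in problem 6.1 compared the odds of survival for females to those for males. As noted earlier, SPSS coded Dead = 1 as the outcome and selected males to be the reference (male = 0), with females compared to males (female = 1). Its Exp(B) is therefore the odds of death for females relative to males, which is numerically the same as the odds of survival for males relative to females. This is why it is important to review the output tables that describe how the data were coded by SPSS.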
To replicate the results of problem 6.3 with the hand calculation, we need to reverse the direction of the comparison. This is done by taking the reciprocal of the hand-calculated OR, or 1 / 3.275.
1 / 3.275 = 0.305
To calculate the 95% CI, we apply the same methodology, but reverse the order so that 1 / lower 95% CI becomes the upper 95% CI and 1 / upper 95% CI becomes the lower 95% CI:
Upper limit is 9.88 in problem 6.1.
1 / 9.88 = 0.101
This becomes the lower limit for the 95% CI
Lower limit is 1.09 in problem 6.1.
1 / 1.09 = 0.92
This becomes the upper limit for the 95% CI
Thus, males have about 69.5% lower odds of survival than females (OR 0.305; 95% CI 0.10, 0.92). The hand-calculated results now duplicate the SPSS results.
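The reciprocal steps above can be sketched in a few lines of Python (for illustration only; invert_odds_ratio is a hypothetical helper name, not an SPSS function). Note that the confidence limits swap places when inverted, and that the percentage change in odds is (1 - OR) x 100.

```python
def invert_odds_ratio(odds_ratio, ci_lower, ci_upper):
    """Reverse the direction of an odds ratio and its confidence interval.

    The reciprocal of the old upper limit becomes the new lower limit,
    and the reciprocal of the old lower limit becomes the new upper limit.
    """
    return 1 / odds_ratio, 1 / ci_upper, 1 / ci_lower


# Hand-calculated values from problem 6.1.
new_or, new_lower, new_upper = invert_odds_ratio(3.275, 1.09, 9.88)

# Percentage by which the odds are lower than in the comparison group.
percent_lower_odds = (1 - new_or) * 100

print(f"OR: {new_or:.3f}")
print(f"95% CI: {new_lower:.2f} to {new_upper:.2f}")
print(f"Odds are about {percent_lower_odds:.1f}% lower")
```

The inverted limits agree with the SPSS interval to two decimals. The third decimal of the upper limit can differ slightly because the problem 6.1 limits were already rounded.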
c. What can you do using logistic regression to duplicate the results from part 2 of this application (the use of CMH for common odds)?
To duplicate the results from part 2 of this application (the common odds ratio), we would need to add disease severity to the Covariate(s) box along with gender and conduct a multivariable logistic regression analysis.