Biostats Logistic Regression SPSS

Part 3

Step-by-Step Guide to Assignment 6.3

Odds Ratios

Problem 3. Perform a simple logistic regression using SPSS and the practice problem 6.3 data set. Answer the following questions based on your SPSS output.

Step 1. Open the practice_problem_6.3.sav dataset.

Step 2. Go to Analyze > Regression > Binary Logistic.

Step 3. Place Survival_Status in the Dependent box and place Gender in the Covariate(s) box. Click Options.

Step 4. In the Logistic Regression: Options window, check Classification plots, Casewise listing of residuals, and CI for exp(B). Make sure CI for exp(B) is set to 95%. Click Continue. Click OK.

SPSS Output:

Case Processing Summary

Unweighted Cases (a)                        N      Percent
Selected Cases      Included in Analysis    181    100.0
                    Missing Cases           0      .0
                    Total                   181    100.0
Unselected Cases                            0      .0
Total                                       181    100.0

a. If weight is in effect, see classification table for the total number of cases.

The Case Processing Summary table shows there are 181 cases in the data set and there are no missing data. It also shows the percentages of cases represented in the regression analysis.

Dependent Variable Encoding

Original Value    Internal Value
ALIVE             0
DEAD              1

The Dependent Variable Encoding table shows that SPSS has numerically coded the two levels of survival status, which is stored as a string variable in the data set: Alive = 0 and Dead = 1. Because Dead is coded 1, the model predicts the odds of death.

Categorical Variables Codings

                    Frequency    Parameter coding (1)
Gender    FEMALE    53           1.000
          MALE      128          .000

SPSS has also coded the Gender variable levels. The Categorical Variables Codings table shows that females are coded 1 and males are coded 0, so males are the reference category and the output will provide the odds ratio of females relative to males.

The first set of Output after the above is Block 0 output:

Classification Table (a, b)

                                        Predicted
                                        Survival_Status      Percentage
Observed                                ALIVE      DEAD      Correct
Step 0    Survival_Status    ALIVE      150        0         100.0
                             DEAD       31         0         .0
          Overall Percentage                                 82.9

a. Constant is included in the model.
b. The cut value is .500

Variables in the Equation

                      B         S.E.    Wald      df    Sig.    Exp(B)
Step 0    Constant    -1.577    .197    63.862    1     .000    .207

The two tables above (Classification Table and Variables in the Equation table) reflect the predicted results of survival without any independent variables included in the model. Block 0 is also called the “constant only” model or the “reduced model”. It serves as the baseline to which a regression model with independent or predictor variables will be compared.
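The Block 0 figures can be reproduced by hand from the two outcome counts in the Classification Table. The following is a minimal sketch, assuming the usual closed-form results for a constant-only logistic model; the variable names are illustrative and do not come from SPSS.

```python
import math

# Observed outcome counts from the Block 0 Classification Table.
alive, dead = 150, 31
n = alive + dead

# With no predictors, the constant B is the log odds of the outcome coded 1 (DEAD).
odds_dead = dead / alive
b0 = math.log(odds_dead)

# Standard error of a log odds, and the Wald statistic (B / S.E.) squared.
se0 = math.sqrt(1 / dead + 1 / alive)
wald0 = (b0 / se0) ** 2

# Every case gets the same predicted probability of death (dead / n).
# It is below the .500 cut value, so every case is classified as ALIVE.
p_dead = dead / n
percent_correct = 100 * alive / n

print(f"B = {b0:.3f}, S.E. = {se0:.3f}, Wald = {wald0:.3f}, Exp(B) = {odds_dead:.3f}")
print(f"Predicted P(DEAD) = {p_dead:.3f}, overall percentage correct = {percent_correct:.1f}")
```

This shows why the baseline model predicts ALIVE for all 181 cases: its accuracy is simply the share of survivors in the sample.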

The second set of output is labeled Block 1:

Omnibus Tests of Model Coefficients

                   Chi-square    df    Sig.
Step 1    Step     5.506         1     .019
          Block    5.506         1     .019
          Model    5.506         1     .019

The Omnibus Tests of Model Coefficients table compares the full model (Block 1) to the baseline (Block 0) model. If the chi-square significance (p) value is less than 0.05, the Block 1 model is a significantly better predictor than the Block 0 model.

In this problem, the significance value is 0.019, which is less than 0.05, so the Block 1 model is a significantly better predictor than the Block 0 model.
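The omnibus chi-square is the drop in -2 log likelihood from the Block 0 model to the Block 1 model. The sketch below illustrates this under one loudly stated assumption: the guide does not print the gender by survival cell counts, so the counts used here (27 of 128 males dead, 4 of 53 females dead) are inferred from the reported totals and coefficients, not taken from the data set. The helper function name is hypothetical.

```python
import math

# (dead, alive) counts by gender. ASSUMPTION: inferred from the SPSS output
# (128 males, 53 females, 31 deaths overall); not printed in the guide.
counts = {"MALE": (27, 101), "FEMALE": (4, 49)}

def neg2_log_likelihood(groups):
    """-2 log likelihood when each group is fitted with its own observed death rate."""
    total = 0.0
    for dead, alive in groups:
        n = dead + alive
        total += dead * math.log(dead / n) + alive * math.log(alive / n)
    return -2.0 * total

# Block 0 (constant only): one pooled death rate for everyone.
dead_all = sum(d for d, _ in counts.values())
alive_all = sum(a for _, a in counts.values())
neg2ll_block0 = neg2_log_likelihood([(dead_all, alive_all)])

# Block 1: with a single binary predictor, the fitted rates equal the
# observed death rate within each gender.
neg2ll_block1 = neg2_log_likelihood(counts.values())

# Likelihood ratio chi-square with 1 degree of freedom.
chi_square = neg2ll_block0 - neg2ll_block1
# For 1 df, the upper-tail chi-square probability is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"-2LL Block 0 = {neg2ll_block0:.3f}, -2LL Block 1 = {neg2ll_block1:.3f}")
print(f"Chi-square = {chi_square:.3f}, p = {p_value:.3f}")
```

Under these assumed counts, the Block 1 value agrees with the -2 Log likelihood reported in the Model Summary table, which supports the inferred counts but does not replace checking them against the data set.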

Model Summary

Step    -2 Log likelihood    Cox & Snell R Square    Nagelkerke R Square
1       160.252 (a)          .030                    .050

a. Estimation terminated at iteration number 5 because parameter estimates changed by less than .001.

Interpretation:

What do the R Square values in the Model Summary table mean?
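As background for this question, both pseudo R Square values can be recomputed from numbers already in the output. This is a sketch using the standard Cox & Snell and Nagelkerke definitions; the constant-only -2 log likelihood is not printed by SPSS here, so it is derived as the Block 1 value plus the Model chi-square.

```python
import math

n = 181                  # cases included in the analysis
neg2ll_model = 160.252   # -2 Log likelihood from the Model Summary table
chi_square = 5.506       # Model chi-square from the Omnibus Tests table

# Derived, not printed by SPSS: -2LL of the constant-only (Block 0) model.
neg2ll_null = neg2ll_model + chi_square

# Cox & Snell: 1 - (L_null / L_model)^(2/n), written in terms of chi-square.
cox_snell = 1 - math.exp(-chi_square / n)

# Cox & Snell cannot reach 1, so Nagelkerke rescales it by its maximum value.
max_cox_snell = 1 - math.exp(-neg2ll_null / n)
nagelkerke = cox_snell / max_cox_snell

print(f"Cox & Snell R Square = {cox_snell:.3f}")
print(f"Nagelkerke R Square  = {nagelkerke:.3f}")
```

Both values are small, which is consistent with gender alone accounting for little of the variation in survival status.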

Variables in the Equation

                                                                            95% C.I. for EXP(B)
                          B         S.E.    Wald      df    Sig.    Exp(B)    Lower    Upper
Step 1 (a)    Gender(1)   -1.186    .563    4.434     1     .035    .305      .101     .921
              Constant    -1.319    .217    37.081    1     .000    .267

a. Variable(s) entered on step 1: Gender.

The values of interest in the Variables in the Equation table are the significance of the Wald statistic, Exp(B), and the 95% confidence interval for Exp(B). The Wald test determines whether each predictor variable makes a significant contribution to the model; a Sig. (p-value) below 0.05 indicates a significant contribution. Exp(B) is the odds ratio (OR) for the independent variable: the factor by which the odds of the outcome coded 1 (here, DEAD) are multiplied for a one-unit change in the predictor.

An Exp(B) from 0.0 to less than 1.0 indicates an inverse relationship between the independent and dependent variables. Because SPSS coded DEAD = 1, an Exp(B) < 1.0 in this problem means lower odds of death (that is, higher odds of survival) than the reference category.

An Exp(B) > 1.0 indicates a positive relationship between the independent and dependent variables. In this problem, an Exp(B) > 1.0 means higher odds of death (that is, lower odds of survival) than the reference category.
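The Gender(1) row of the Variables in the Equation table can be checked from B and S.E. alone. The sketch below assumes the usual Wald formulas and a normal critical value of 1.96 for the 95% interval. Because it starts from the rounded B and S.E. shown in the table, the Wald statistic differs slightly from the 4.434 that SPSS computes from unrounded values.

```python
import math

# B and S.E. for Gender(1), as printed (rounded) in the SPSS table.
b, se = -1.186, 0.563
z = 1.96  # approximate normal critical value for a 95% confidence interval

# Wald statistic and its p-value on 1 degree of freedom.
wald = (b / se) ** 2
p_value = math.erfc(math.sqrt(wald / 2))

# Exp(B) is the odds ratio; the CI is built on the log scale and exponentiated.
odds_ratio = math.exp(b)
ci_lower = math.exp(b - z * se)
ci_upper = math.exp(b + z * se)

print(f"Wald = {wald:.3f}, Sig. = {p_value:.3f}")
print(f"Exp(B) = {odds_ratio:.3f}, 95% CI = ({ci_lower:.3f}, {ci_upper:.3f})")
```

The interval lies entirely below 1.0, which matches the Wald test in indicating a significant association at the 0.05 level.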

Interpretation:

Your interpretation must cover the following:

· Using data from the table to compare survival among males and females

· Identifying whether there is a significant association between gender and survival based on the CI for gender in this model

a. Are the results of the simple logistic regression similar to or different from the results of the simple odds ratio?

The OR from SPSS is the inverse of the OR calculated in problem 6.1.

b. How are they similar or different? Include output from SPSS and an interpretation of the OR and confidence intervals in your response.

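The hand-calculated OR in problem 6.1 was 3.275 (95% CI 1.09, 9.88). This means that the odds of survival for females are more than 3 times the odds for males. The OR calculated by SPSS is 0.305 (95% CI 0.10, 0.92). Because SPSS modeled DEAD = 1 with males as the reference, this means that the odds of death for females are about 0.31 times the odds for males; equivalently, the odds of survival for males are about 70% lower than for females. The two ORs are reciprocals of each other.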

The hand-calculated OR in problem 6.1 compared males to females, with females as the referent group. As noted earlier, SPSS selected males as the reference (male = 0) and compared females to males (female = 1). This is why it is important to review the output tables that describe how SPSS coded the data.
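To make the direction of the comparison concrete, here is a sketch of a 2x2 odds ratio with a Woolf (log-based) 95% confidence interval. The cell counts are an assumption: they are inferred from the SPSS output rather than quoted from problem 6.1, and the method used there may differ.

```python
import math

# ASSUMPTION: cell counts inferred from the SPSS output, not quoted from problem 6.1.
male_dead, male_alive = 27, 101
female_dead, female_alive = 4, 49

# Odds of death for males relative to females (females are the referent group).
odds_male = male_dead / male_alive
odds_female = female_dead / female_alive
or_male_vs_female = odds_male / odds_female

# Woolf standard error of the log odds ratio and the 95% confidence interval.
se_log_or = math.sqrt(1 / male_dead + 1 / male_alive + 1 / female_dead + 1 / female_alive)
log_or = math.log(or_male_vs_female)
ci_lower = math.exp(log_or - 1.96 * se_log_or)
ci_upper = math.exp(log_or + 1.96 * se_log_or)

print(f"OR (males vs. females) = {or_male_vs_female:.3f}")
print(f"95% CI = ({ci_lower:.2f}, {ci_upper:.2f})")
```

The standard error of the log odds ratio here equals the S.E. that SPSS reports for Gender(1), since swapping the referent group changes only the sign of B.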

To replicate the results of problem 6.3 with the hand calculation, we need to switch the referent group to males. This is done by taking the reciprocal of the hand-calculated OR:

1 / 3.275 = 0.305

To calculate the 95% CI, we apply the same methodology, but reverse the order so that 1 / lower 95% CI becomes the upper 95% CI and 1 / upper 95% CI becomes the lower 95% CI:

Upper limit is 9.88 in problem 6.1.

1 / 9.88 = 0.101

This becomes the lower limit for the 95% CI

Lower limit is 1.09 in problem 6.1.

1 / 1.09 = 0.92

This becomes the upper limit for the 95% CI

Thus, the odds of survival for males are about 70% lower than for females (OR 0.305; 95% CI 0.10, 0.92). The hand-calculated results now duplicate the SPSS results.
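The reciprocal steps above can be collected into a few lines. This sketch uses only the figures quoted from problem 6.1; the variable names are illustrative.

```python
# Hand-calculated OR and 95% CI quoted from problem 6.1.
or_61 = 3.275
ci_61 = (1.09, 9.88)

# Switching the referent group inverts the OR.
or_inverted = 1 / or_61

# The CI limits are inverted too, and they swap places:
# 1 / upper becomes the new lower limit, 1 / lower the new upper limit.
ci_inverted = (1 / ci_61[1], 1 / ci_61[0])

# An OR below 1 expressed as a percentage reduction in odds.
percent_lower_odds = (1 - or_inverted) * 100

print(f"Inverted OR = {or_inverted:.3f}")
print(f"Inverted 95% CI = ({ci_inverted[0]:.3f}, {ci_inverted[1]:.3f})")
print(f"Odds are about {percent_lower_odds:.1f}% lower")
```

The upper limit comes out at 0.917 rather than SPSS's .921 because the problem 6.1 limits were rounded to two decimals before being inverted.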

c. What can you do using logistic regression to duplicate the results from part 2 of this application (the use of CMH for the common odds ratio)?

To duplicate the results from part 2 of this application (the common odds ratio), we would need to add the independent variable disease severity to the Covariate(s) box alongside gender and conduct a multivariable logistic regression analysis.