SPSS How To Guide for Project 4 - Toothman

***Note***

This guide was written using SPSS Version 22 for Windows. The basic instructions to complete the statistical tests should be the same in other versions, although opening files may vary between versions and operating systems.

Regardless, I suggest playing around with SPSS to learn it a bit on your own before attempting to complete Project 4.

Cross-Tabulation and Chi-Square: Background

Once your data are open, click Analyze, then Descriptive Statistics, and finally Crosstabs to create your cross-tabulation and analyze your data using chi-square.

Figure 1

This will open the following dialogue:

Figure 2

Next, select the variables you will use to create your crosstab. You should select your independent variable to be

your column variable, and the dependent variable to be your row variable. Select the variable and then click the

arrow next to the appropriate column or row box to move it to the appropriate section.

In this example, I selected SEXORNT as my column variable. SEXORNT is a nominal level variable created from

the question, “Which of the following best describes you?” with the following categories: “gay, lesbian, or

homosexual,” “bisexual,” and “heterosexual or straight.” Respondents who reported “don’t know,” “refused,” or

“not applicable” were coded as missing.

For my dependent variable, I selected marhomo as my row variable. marhomo is an ordinal level variable

created from responses to the statement, “homosexuals should have the right to marry.” Respondents reported

the following valid responses: “strongly agree,” “agree,” “neither agree nor disagree,” “disagree,” and “strongly

disagree.” Respondents who reported “cannot choose” or “not applicable” were coded as missing.

Figure 3

I will test the following hypotheses. The alpha has been set at 0.05.

H0: Sexual orientation and attitudes toward same sex marriage are statistically independent.

H1: Sexual orientation and attitudes toward same sex marriage are statistically dependent.

Next, you will need to click on the box labeled Statistics to tell SPSS which statistics you will calculate for your

cross-tabulation. The following dialogue will appear:

Figure 4

Next, you will click on the checkbox next to each of the statistics you will calculate. At a bare minimum, you will select the box for chi-square.

You will also select a checkbox to calculate an appropriate measure of association to test the strength of the relationship between your two variables. For our purposes, I will check the boxes for Lambda and for Phi and Cramer’s V. Later in the guide, I will show you an example using two ordinal level variables so you can effectively read the output.

Figure 5

Click Continue.

Now you will return to this screen:

Figure 6

Click Cells. The following cell display dialogue will appear.

Figure 7

By default, only Observed should be checked. This will report the number of people who fall into each cell. In

addition to observed, you should also check the box for column percentages. When you create your own table

in Word or Excel, I only want to see the column percentages in your new table.
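If you want to double check the column percentages SPSS reports, the arithmetic is easy to reproduce. The sketch below uses made-up counts (not the data in this guide) and a hypothetical helper named column_percentages; each cell is simply divided by its column total.

```python
# Hypothetical counts, NOT the guide's data. Rows are categories of the
# dependent variable; columns are categories of the independent variable.
observed = [
    [30, 10],  # e.g. "agree"
    [10, 10],  # e.g. "neither agree nor disagree"
    [10, 30],  # e.g. "disagree"
]


def column_percentages(table):
    """Return each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [100.0 * cell / total for cell, total in zip(row, col_totals)]
        for row in table
    ]


percentages = column_percentages(observed)
for row in percentages:
    print("  ".join(f"{p:5.1f}%" for p in row))
```

Each column should add up to 100%, which is a quick way to confirm you copied column percentages and not row percentages into your own table.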

Figure 8

Click Continue. This will return you to the crosstabs screen:

Figure 9

Now you are ready to click OK. This will produce the output.

The output screen will show you several boxes. Let’s go through them one by one.

Output

The first box simply shows you how many valid and missing cases you have. For your purposes, the Valid

percent should be 100%, and missing should be 0%.

Figure 10

The next box shows you the cross-tabulation. We’ll go through it one by one.

Figure 11

First, just look at the table and make sure your data are displayed appropriately. Ours are good! Notice the cells:

Figure 12

Just from looking at the cells, people who identify as gay, lesbian, or homosexual seem more likely than bisexuals or heterosexuals to agree that LGBT people should be able to marry. Bisexuals seem more divided, but lean more heavily toward agreeing that they should be allowed to marry. Heterosexuals seem even more divided: about half agree, 11.8% neither agree nor disagree, and more than a third disagree.

Take a look at the totals:

Figure 13

By now you should also notice that the sample is disproportionately heterosexual. If we only looked at the raw

totals, we would not have been able to infer much about the relationship between sexual orientation and

attitudes toward same sex marriage. You will need to replicate this table on your own, using either Excel or Word.

The next box shows you the chi-square statistics:

Figure 14

We are interested in the information contained in the row labeled Pearson Chi-Square.

Figure 15

Our obtained chi-square statistic is 22.355. We have eight degrees of freedom. The P-value (labeled Asymp. Sig. (2-sided)) is 0.004.

Because the P-value (0.004) is smaller than our alpha (0.05), we can reject the null hypothesis. Sexual orientation and attitudes toward same sex marriage are statistically dependent.
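As a sanity check on the numbers above, here is a minimal sketch that recomputes the degrees of freedom and the P-value from the reported statistic. The function names are hypothetical, and the P-value shortcut is only valid when the degrees of freedom are even (as they are here); it is not how SPSS computes the value internally.

```python
import math


def chi_square_df(n_rows, n_cols):
    """Degrees of freedom for a cross-tabulation: (rows - 1) * (columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def chi_square_p_value_even_df(statistic, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only handles positive, even df")
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    # Sum (x/2)^i / i! for i = 0 .. df/2 - 1, then scale by exp(-x/2).
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# marhomo has five valid categories (rows); SEXORNT has three (columns).
df = chi_square_df(5, 3)
p_value = chi_square_p_value_even_df(22.355, df)
print(df, round(p_value, 3))
```

The result agrees with the output: eight degrees of freedom and a P-value of about 0.004, which is below the alpha of 0.05.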

The next box shows us the directional measures. These are the first set of measures of association we

calculated.

Figure 16

In this box, we can find lambda. Since our dependent variable is marhomo, we need to refer to that section of

the table.

Figure 17

Lambda is zero! Why is that?

Lambda will always be zero when the mode for each category of the independent variable falls into the same category of the dependent variable, even if other measures of association tell us that the two variables actually are related. If the two variables seem related, based on the chi-square statistic or observations of the differences in percentages, we need to try a different measure of association to measure the strength of the relationship.
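To see why this happens, here is a small sketch of the lambda calculation with the row variable treated as dependent. Both tables are made up for illustration (they are not the guide's data), and goodman_kruskal_lambda is a hypothetical helper name.

```python
def goodman_kruskal_lambda(table):
    """Lambda with the row variable as dependent; table[r][c] holds counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    # Errors made by always guessing the overall modal row.
    errors_without_iv = n - max(row_totals)
    # Errors made by guessing the modal row within each column.
    errors_with_iv = sum(sum(col) - max(col) for col in zip(*table))
    return (errors_without_iv - errors_with_iv) / errors_without_iv


# Hypothetical table: the mode is the first row in every column, even though
# the column percentages clearly differ (about 67%, 50%, and 44% in row one).
same_mode = [
    [60, 25, 400],
    [20, 15, 300],
    [10, 10, 200],
]

# Hypothetical table: the modal row changes from one column to the next.
different_modes = [
    [60, 10],
    [20, 50],
]

lambda_same = goodman_kruskal_lambda(same_mode)
lambda_diff = goodman_kruskal_lambda(different_modes)
print(lambda_same, round(lambda_diff, 3))
```

In the first table, knowing the column never changes your best guess for the row, so lambda is zero despite the differences in percentages. In the second table the best guess changes, and lambda is well above zero.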

Lambda is not an adequate measure of association for our relationship. So let’s take a look at the final table in

our output:

Figure 18

Here, we need to look at the row labeled Cramer’s V.

Figure 19

Do not worry about the information in the column labeled Approx. Sig. Our Cramer’s V is 0.10. Interpret appropriately.
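For reference, Cramer’s V is built directly from the chi-square statistic: V is the square root of chi-square divided by N times the smaller of (rows - 1) and (columns - 1). The sketch below shows the calculation on a made-up 2 x 2 table (not the guide's data); the helper names are hypothetical.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square: sum of (observed - expected)^2 / expected."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    statistic = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def cramers_v(table):
    """Cramer's V = sqrt(chi-square / (N * min(rows - 1, columns - 1)))."""
    n = sum(sum(row) for row in table)
    smaller_dim = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_square_statistic(table) / (n * smaller_dim))


# Hypothetical counts for illustration only.
example = [
    [30, 10],
    [10, 30],
]
chi2 = chi_square_statistic(example)
v = cramers_v(example)
print(round(chi2, 3), round(v, 3))
```

V ranges from 0 (no association) to 1 (perfect association), so the 0.10 in our output sits near the bottom of that range.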

Chi-Square and Measures of Association for Two Ordinal Level Variables

If you select two ordinal level variables to complete this portion of the assignment, you will need to select a

different measure of association. For this next test, I will continue to use marhomo as the dependent variable.

The variable fund16 is the independent variable. Fund16 is an ordinal level variable that reports the “fundamentalism/liberalism of religion [the] respondent was raised in.” The category “fundamentalist” (1) means the religion was categorized as a very conservative denomination. “Moderate” (2) means the religion was categorized as more moderate. “Liberal” (3) means the religion was considered to be a liberal religious group. We can think of this as a continuum: higher scores indicate more liberal religious beliefs at 16; lower scores indicate more conservative religious beliefs.

We will test the following hypotheses:

H0: Religious fundamentalism at 16 and attitudes toward same sex marriage are statistically

independent.

H1: Religious fundamentalism at 16 and attitudes toward same sex marriage are statistically dependent.

Figure 20

We set up our chi-square test similarly to the previous example, until we get to the statistics box. We will still select a chi-square test to test the significance of the relationship, but to measure the association, we need to select gamma and either Kendall’s tau-b or Kendall’s tau-c. Since there are five categories on our dependent (row) variable and only three on our independent (column) variable, we should select Kendall’s tau-c in addition to gamma. This is because we use Kendall’s tau-c when the cross-tabulation is rectangular, meaning the numbers of rows and columns differ. If we had the same number of categories on both variables (a square table), we would use Kendall’s tau-b.

Figure 21

Output

Let’s take a look at the cross-tabulation:

Figure 22

It seems that something may be going on here. A little over one-third of people raised in fundamentalist religions agree that same-sex couples should be allowed to marry. Twelve and a half percent neither agree nor disagree, while over half disagree with the statement. By contrast, over one-half of those raised in moderate and liberal denominations agree that same-sex couples should be allowed to marry.

Keep in mind that on the fund16 variable, lower scores = more conservative beliefs; higher scores = more liberal beliefs. On our marhomo variable, lower scores = agreement that same-sex couples should be allowed to marry and higher scores = disagreement. With this in mind, and with the evidence presented above, it seems that if the two variables are statistically dependent, we are likely to have a negative relationship: as the independent variable score increases, the dependent variable score decreases.

The chi-square test suggests that we should reject our null hypothesis. Religious fundamentalism and attitudes

toward same-sex marriage are statistically dependent.

Figure 23

Now we can examine the box labeled symmetric measures. Here we can find our gamma and the Kendall’s tau-c.

Figure 24

We are only interested in the information contained in the value column. Our Kendall’s tau-c is -0.143 and our gamma is -0.193. Interpret appropriately.
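Both measures are built from concordant and discordant pairs, which is why their signs agree. The sketch below is a rough illustration using made-up counts (not the guide's data) and hypothetical helper names. It assumes rows and columns are both ordered from lowest to highest score, and it uses the usual textbook formulas: gamma = (C - D) / (C + D) and tau-c = 2m(C - D) / (N²(m - 1)), where m is the smaller of the number of rows and columns.

```python
def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in an ordered table."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:    # higher on both variables
                        concordant += table[i][j] * table[k][l]
                    elif l < j:  # higher on one, lower on the other
                        discordant += table[i][j] * table[k][l]
    return concordant, discordant


def gamma_and_tau_c(table):
    """Return (gamma, Kendall's tau-c) for an ordered cross-tabulation."""
    c, d = concordant_discordant(table)
    n = sum(sum(row) for row in table)
    m = min(len(table), len(table[0]))
    gamma = (c - d) / (c + d)
    tau_c = 2 * m * (c - d) / (n ** 2 * (m - 1))
    return gamma, tau_c


# Hypothetical 3 x 2 table: low row scores go with high column scores.
example = [
    [10, 30],
    [20, 20],
    [30, 10],
]
gamma, tau_c = gamma_and_tau_c(example)
print(round(gamma, 3), round(tau_c, 3))
```

Because discordant pairs outnumber concordant pairs in this made-up table, both values are negative, just as they are in our output. Gamma ignores tied pairs, so it is typically larger in absolute value than tau-c.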

Regression and Correlation

To begin, click Analyze, then Regression, and finally Linear to complete the regression and correlation portion of Project 4.

Figure 25

The following linear regression dialogue will appear:

Figure 26

You will need to select two appropriate variables to obtain your regression line equation. Remember, your

independent variable is the one you think will predict a change in the dependent variable.

In this example, weekswrk is an interval-ratio level variable reporting the number of weeks a respondent worked last year. It is the independent variable. VISLIB is an interval-ratio level variable reporting the number of times a respondent visited a public library last year. It is the dependent variable.

Figure 27

All we need to do to get the information to report our linear regression equation, the correlation coefficient, and

the coefficient of determination, is click OK. Your output will appear! I’ll go through each box one by one.

Figure 28

The first box just shows the variables in the equation and the method used to enter them (you don’t need to worry about this). Double check that the variable entered matches your independent variable and that the dependent variable below the box matches what you intended. We’re all good!

The Model Summary box reports Pearson’s correlation coefficient (R) and the coefficient of determination (R²).

Figure 29

Hold off on interpreting the correlation coefficient for now. We can also see from the coefficient of determination that only 0.1% of the variation in visits to the library last year is explained by the number of weeks worked last year.

The next box shows us ANOVA. ANOVA and regression are related! For this portion of the assignment, do not

worry about this box. This box shows us if there is a significant relationship in our regression (we did not cover

this in our class – we can conduct hypothesis tests with regression, too!). In short: there isn’t a statistically

significant relationship.

Figure 30

The final box, labeled coefficients, contains the information you need to report your regression line equation.

Figure 31

The information contained under the column labeled B shows us both the slope (b) and the y-intercept (a).

The row labeled (Constant) shows us the constant of the regression line. This is different language than you are already familiar with. The number in the cell where B and (Constant) meet is the y-intercept.

The row labeled WEEKS R WORKED LAST YEAR shows us various statistics relevant to how the independent

variable is related to the dependent variable. In the cell where B and WEEKS R WORKED LAST YEAR meet, we are

shown the slope of the line (b). For each unit increase in the number of weeks someone worked last year, we

expect a decrease in the number of library visits last year of 0.019.

Let’s return to the correlation box I showed you earlier.

Figure 32

The correlation coefficient presented here is only going to show a positive figure (this is what you should expect;

the explanation is beyond the scope of this class). However, based on the slope of our regression line, we know

that we actually have a negative relationship – as X increases, Y decreases. When you report your correlation

coefficient, make sure you report the appropriate sign. In this case, we know we have a very weak negative

relationship (R = -0.036).
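To see how the slope, the y-intercept, and the sign of the correlation fit together, here is a minimal least-squares sketch. The weeks and visits values are made up for illustration (they are not the survey data), and simple_regression is a hypothetical helper name.

```python
import math


def simple_regression(xs, ys):
    """Return (intercept a, slope b, Pearson's r) for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r


# Hypothetical data: weeks worked (x) and library visits (y).
weeks = [0, 10, 20, 30, 40, 52]
visits = [6, 2, 7, 1, 5, 3]

intercept, slope, r = simple_regression(weeks, visits)
print(round(intercept, 3), round(slope, 3), round(r, 3))

# The Model Summary box shows the absolute value of r; the sign comes from b.
reported_R = abs(r)
signed_r = math.copysign(reported_R, slope)

# Check on the guide's own numbers: R = 0.036 gives R squared of about 0.1%.
r_squared_percent = round(0.036 ** 2 * 100, 1)
print(signed_r == r, r_squared_percent)
```

The slope and r always share a sign because both have the same numerator, so taking the sign from the B column and attaching it to R from the Model Summary box recovers the signed correlation.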

Just to confirm, you can, but do not have to, calculate the bivariate correlation (click Analyze, then Correlate, and then Bivariate). Take a look at what we find:

Figure 33

There it is! We can see there is a negative correlation of -0.036!

ANOVA

To begin our ANOVA, you will first click Analyze, then Compare Means, and finally One-way ANOVA.

Figure 34

The following One-way ANOVA dialogue will appear:

Figure 35

Under dependent list, you will select your dependent variable. For our purposes, I’ll use our weekswrk variable

from earlier. Under factor, you will select your grouping variable. I’m selecting marital, which is a nominal level

variable reporting respondents’ marital status: married, widowed, divorced, separated, or never married. This is

the variable you will use to see if there are differences in the mean number of weeks worked last year by marital

status.

I’ll test the following hypotheses:

H0: There are no differences in mean weeks worked last year by marital status.

H1: There is at least one difference in the mean number of weeks worked last year by marital status.

Figure 36

From here, click on the box that says Post Hoc. Your book doesn’t discuss post-hoc tests, but they are very

useful in figuring out which groups differ and which do not. The following Post Hoc Multiple Comparisons

dialogue will appear:

Figure 37

There are lots of different tests we can use to see which groups differ. Click the box labeled simply Tukey. Then

click Continue.

Figure 38

This will bring you back to the One-Way ANOVA dialogue. Click OK.

Figure 39

Now your output will appear. We’ll go through it one by one. The first box shows you the sum of squares, mean squares, degrees of freedom, the obtained F-statistic, and the p-value: everything you need and are already comfortable calculating by hand.

Figure 40

In Project 4, I’ve asked you to report dfb, dfw, MSB, MSW, the obtained F-statistic, and the p-value. It’s all right

here! We can reject the null hypothesis because our ANOVA demonstrates there is at least one difference in

mean number of weeks worked last year. If you did not find a significant relationship, you’re pretty much done.
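If you want to connect the ANOVA box back to the hand calculations, the sketch below rebuilds dfb, dfw, MSB, MSW, and F from raw scores. The three groups are made up for illustration (they are not the survey data), and one_way_anova is a hypothetical helper name. It does not compute the p-value.

```python
def one_way_anova(groups):
    """Return the pieces of a one-way ANOVA table for a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: squared distance of each score
    # from its own group mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    dfb, dfw = k - 1, n - k
    msb, msw = ssb / dfb, ssw / dfw
    return {"dfb": dfb, "dfw": dfw, "MSB": msb, "MSW": msw, "F": msb / msw}


# Hypothetical weeks worked for three groups of three respondents each.
groups = [
    [50, 52, 48],
    [10, 12, 8],
    [40, 44, 42],
]
result = one_way_anova(groups)
print(result)
```

A large F means the group means differ far more than the scores vary within groups, which is the situation where the ANOVA leads us to reject the null hypothesis.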

If you did find a significant relationship, you need to take a look at the next bit of output. The Post Hoc Tests

output comes next.

Figure 41

The first column shows us all five marital status categories. Notice that it is labeled (I) MARITAL STATUS. The next column shows us each of the other four marital status categories, relative to the category reported in the first column. It is labeled (J) MARITAL STATUS. The third column, labeled Mean Difference (I-J), shows us the mean difference in weeks worked last year, subtracting the mean number of weeks worked for the second category (labeled J) from the mean number of weeks worked for the first category (labeled I). Let’s examine the first row:

Figure 42

First, we are looking at the mean difference in the number of weeks worked last year between married

respondents and widowed respondents. The value reported in the first cell under the Mean Difference (I-J)

column is calculated using the following formula:

Married − Widowed = mean difference in weeks worked

In this equation, I = Married and J = Widowed. The mean difference is 23.317 weeks.

This means that married respondents worked an average of 23.317 more weeks than widowed respondents did last year. Now take a look at the value under the sig. column. This is the P-value! The P-value is displayed as 0.000, which means it is smaller than 0.0005 and rounds to zero at three decimal places. At the alpha = 0.05 level, we can conclude that married respondents worked significantly more weeks last year than widowed respondents.

You need to do this for each row. Notice there is repeated information in the table. Let’s look at the row where

I = WIDOWED and J = MARRIED.

Figure 43

From this, we can see that widowed respondents worked significantly fewer weeks last year than married respondents did. They worked 23.317 weeks fewer, on average. Hey! That’s the same value with the opposite sign!
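The repetition in the table follows directly from how the Mean Difference (I-J) column is built. Here is a tiny sketch using made-up group means (not the values in the output) to show that every pair appears twice with opposite signs.

```python
from itertools import permutations

# Hypothetical group means, for illustration only.
group_means = {"married": 40.0, "widowed": 15.0, "divorced": 38.0}

# One entry per ordered pair (I, J), computed as mean of I minus mean of J.
differences = {
    (i, j): group_means[i] - group_means[j]
    for i, j in permutations(group_means, 2)
}

for (i, j), diff in differences.items():
    print(f"I = {i:<8} J = {j:<8} I-J = {diff:+.1f}")
```

Since each comparison shows up in both directions, you only need to report each significant pair once.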

Let’s identify each set that significantly differed:

Figure 44

Notice anything? Widowed respondents worked significantly fewer weeks last year than married, divorced, separated, and never married respondents. Perhaps even more interesting, the only significant differences involved widowed respondents. What might explain this? It is probable that most of the widowed respondents are elderly, and thus at retirement age already. Make sure you report all significant differences appropriately.

Good luck on Project 4! I hope this guide was helpful.