SPSS How To Guide for Project 4 - Toothman
***Note***
This guide was written using SPSS Version 22 for Windows. The basic instructions to complete the statistical
tests should be the same in other versions, although opening files may vary between versions and operating systems.
Regardless, I suggest playing around with SPSS to learn it a bit on your own before attempting to complete
Project 4.
Cross-Tabulation and Chi-Square: Background
Once your data are open, click Analyze, then Descriptive Statistics, and finally Crosstabs to create your cross-
tabulation and analyze your data using chi-square.
Figure 1
This will open the following dialogue:
Figure 2
Next, select the variables you will use to create your crosstab. You should select your independent variable to be
your column variable, and the dependent variable to be your row variable. Select the variable and then click the
arrow next to the appropriate column or row box to move it to the appropriate section.
In this example, I selected SEXORNT as my column variable. SEXORNT is a nominal level variable created from
the question, “Which of the following best describes you?” with the following categories: “gay, lesbian, or
homosexual,” “bisexual,” and “heterosexual or straight.” Respondents who reported “don’t know,” “refused,” or
“not applicable” were coded as missing.
For my dependent variable, I selected marhomo as my row variable. marhomo is an ordinal level variable
created from responses to the statement, “homosexuals should have the right to marry.” Respondents reported
the following valid responses: “strongly agree,” “agree,” “neither agree nor disagree,” “disagree,” and “strongly
disagree.” Respondents who reported “cannot choose” or “not applicable” were coded as missing.
Figure 3
I will test the following hypotheses. The alpha has been set at 0.05.
H0: Sexual orientation and attitudes toward same sex marriage are statistically independent.
H1: Sexual orientation and attitudes toward same sex marriage are statistically dependent.
Next, you will need to click on the box labeled Statistics to tell SPSS which statistics you will calculate for your
cross-tabulation. The following dialogue will appear:
Figure 4
Next, you will click on the checkbox next to each of the statistics you will calculate. At bare minimum, you will
select the box for chi-square.
You will also select a check-box to calculate an appropriate measure of association to test the strength of the
relationship between your two variables. For our purposes, I will check the boxes for lambda and phi and
Cramer’s V. Later in the guide, I will show you an example using two ordinal level variables so you can effectively
read the output.
Figure 5
Click Continue.
Now you will return to this screen:
Figure 6
Click Cells. The following cell display dialogue will appear.
Figure 7
By default, only Observed should be checked. This will report the number of people who fall into each cell. In
addition to observed, you should also check the box for column percentages. When you create your own table
in Word or Excel, I only want to see the column percentages in your new table.
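If you want to double-check the column percentages before you build your own table, here is a minimal Python sketch of the arithmetic. The counts are hypothetical (they are not from the data set used in this guide), and the function name is my own. Rows are categories of the dependent variable and columns are categories of the independent variable, just like in the crosstab.

```python
def column_percentages(observed):
    """Convert a table of counts into column percentages.

    observed: list of rows; each row holds one count per column.
    """
    col_totals = [sum(col) for col in zip(*observed)]
    return [
        [100 * cell / col_totals[j] for j, cell in enumerate(row)]
        for row in observed
    ]

# Hypothetical counts, invented purely for illustration.
counts = [
    [30, 10],  # row 1 of the dependent variable
    [20, 40],  # row 2 of the dependent variable
]
pct = column_percentages(counts)
for row in pct:
    print([round(value, 1) for value in row])
```

Each column of the result adds up to 100, which is a quick way to confirm you used column (not row) percentages.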
Figure 8
Click Continue. This will return you to the crosstabs screen:
Figure 9
Now you are ready to click OK. This will produce the output.
The output screen will show several boxes to you. Let’s go through them one by one.
Output
The first box simply shows you how many valid and missing cases you have. For your purposes, the Valid
percent should be 100%, and missing should be 0%.
Figure 10
The next box shows you the cross-tabulation. We’ll go through it one by one.
Figure 11
First, just look at the table and make sure your data are displayed appropriately. Ours are good! Notice the cells:
Figure 12
Just from looking at the cells, it looks like people who identify as gay, lesbian, or homosexual seem more likely
than bisexuals or heterosexuals to agree that LGBT people should be able to marry. Bisexuals seem more
divided, but lean more heavily toward agreeing that they should be allowed to marry. Heterosexuals seem even more
divided: about half agree, 11.8% neither agree nor disagree, and more than a third disagree.
Take a look at the totals:
Figure 13
By now you should also notice that the sample is disproportionately heterosexual. If we only looked at the raw
totals, we would not have been able to infer much about the relationship between sexual orientation and
attitudes toward same sex marriage. You will need to replicate this table on your own, using either Excel or
Word to create the table for you.
The next box shows you the chi-square statistics:
Figure 14
We are interested in the information contained in the row labeled Pearson Chi-Square.
Figure 15
Our obtained chi-square statistic is 22.355. We have eight degrees of freedom. The P-value (labeled Asymp. Sig.
(2-sided)) is 0.004.
Because the P-value of 0.004 is smaller than our alpha of 0.05, we can reject the null hypothesis. Sexual
orientation and attitudes toward same sex marriage are statistically dependent.
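If you are curious where these numbers come from, the following Python sketch mirrors the hand calculation. It is only an illustration: the 2x2 counts are hypothetical, the function names are my own, and the P-value shortcut works only when the degrees of freedom are even (as they are here, with df = 8). SPSS does all of this for you.

```python
import math

def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

def p_value_even_df(stat, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    if df % 2 != 0:
        raise ValueError("this shortcut requires an even number of degrees of freedom")
    half = stat / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Hypothetical 2x2 counts, invented purely for illustration.
toy_stat, toy_df = chi_square([[10, 20], [20, 10]])

# Values reported in this guide: chi-square = 22.355 with 8 degrees of freedom.
p = p_value_even_df(22.355, 8)
print(round(toy_stat, 3), toy_df, round(p, 3))
```

Note that a table with five rows and three columns has (5 - 1) x (3 - 1) = 8 degrees of freedom, which matches the SPSS output, and the P-value computed from 22.355 rounds to the 0.004 shown above.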
The next box shows us the directional measures. These are the first set of measures of association we
calculated.
Figure 16
In this box, we can find lambda. Since our dependent variable is marhomo, we need to refer to that section of
the table.
Figure 17
Lambda is zero! Why is that?
Lambda will always be zero when the mode for each category of the independent variable falls into the
same category of the dependent variable – even if other measures of association tell us that the two
variables actually are related. If the two variables seem related, based on the chi-square statistic or
observations of the differences in percentages, we need to try a different measure of association to
measure the strength of the relationship.
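To see why this happens, here is a small Python sketch of lambda as a proportional reduction in error. Both tables are hypothetical and the function name is my own. Rows are the dependent variable and columns are the independent variable. In the first table every column has its mode in the first row, so knowing the column never changes our best guess and lambda is zero, even though the column percentages clearly differ.

```python
def goodman_kruskal_lambda(table):
    """Lambda with the row variable as dependent (rows = dependent, columns = independent)."""
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    # Errors made when we guess the overall modal row for everyone.
    errors_without_iv = n - max(row_totals)
    # Errors made when we guess each column's own modal row.
    errors_with_iv = n - sum(max(col) for col in zip(*table))
    return (errors_without_iv - errors_with_iv) / errors_without_iv

# Hypothetical counts: every column's mode sits in the first row.
same_mode = [
    [40, 30, 50],
    [5, 10, 12],
    [5, 20, 38],
]
# Hypothetical counts: the two columns have different modal rows.
different_mode = [
    [40, 10],
    [10, 40],
]
lambda_same = goodman_kruskal_lambda(same_mode)
lambda_diff = goodman_kruskal_lambda(different_mode)
print(lambda_same, lambda_diff)
```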
Lambda is not an adequate measure of association for our relationship. So let’s take a look at the final table in
our output:
Figure 18
Here, we need to look at the row labeled Cramer’s V.
Figure 19
Do not worry about information in the column labeled approx. sig. Our Cramer’s V is 0.10. Interpret
appropriately.
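For reference, Cramer's V is built directly from the chi-square statistic. The sketch below shows the usual formula in Python. The inputs in the example call are hypothetical (a 2x2 table with 60 cases), not the values from our output, and the function name is my own.

```python
import math

def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi-square / (N * min(rows - 1, columns - 1)))."""
    return math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))

# Hypothetical inputs: chi-square of 6.667 from a 2x2 table with 60 cases.
v = cramers_v(6.667, 60, 2, 2)
print(round(v, 3))
```

For a table like ours, with five rows and three columns, the smaller of (rows - 1) and (columns - 1) is 2.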
Chi-Square and Measures of Association for Two Ordinal Level Variables
If you select two ordinal level variables to complete this portion of the assignment, you will need to select a
different measure of association. For this next test, I will continue to use marhomo as the dependent variable.
The variable fund16 is the independent variable. fund16 is an ordinal level variable that reports the
“fundamentalism/liberalism of religion [the] respondent was raised in.” The category “fundamentalist” (1)
means the religion was categorized as a very conservative denomination. “Moderate” (2) means the religion was
categorized as being a more moderate religion. “Liberal” (3) means the religion was considered to be a liberal
religious group. We can think of this as a continuum. Higher scores indicate more liberal religious beliefs at 16;
lower scores indicate more conservative religious beliefs.
We will test the following hypotheses:
H0: Religious fundamentalism at 16 and attitudes toward same sex marriage are statistically
independent.
H1: Religious fundamentalism at 16 and attitudes toward same sex marriage are statistically dependent.
Figure 20
We set up our chi-square test similarly to the previous example, until we get to the statistics box. We will still
select a chi-square test to test the significance of the relationship, but with respect to the association, we need
to select gamma and either Kendall’s tau-b or Kendall’s tau-c. Since there are five categories on our dependent
variable, and only three for our column variable, we should select Kendall’s tau-c in addition to gamma. This is
because we use Kendall’s tau-c when the cross-tabulation is a rectangle. If we had the same number of
categories on both variables, we would use Kendall’s tau-b.
Figure 21
Output
Let’s take a look at the cross-tabulation:
Figure 22
It seems that something may be going on here. A little over one-third of people raised in fundamentalist
religions agree that same-sex couples should be allowed to marry. Twelve and a half percent do not agree or
disagree, while over half disagree with the statement that same-sex couples should be allowed to marry. Over
one-half of those raised in moderate and liberal denominations agreed that same-sex couples should be allowed
to marry. So there might be something going on here. Keep in mind that on the fund16 variable, lower scores =
more conservative beliefs; higher scores = more liberal beliefs. On our marhomo variable, lower scores =
agreement that same-sex couples should be allowed to marry and higher scores = disagreement that same-sex
couples should be allowed to marry. With this in mind, and with the evidence presented above, it seems that if
the two variables are statistically dependent, we are likely to have a negative relationship. As the independent
variable score increases, the dependent variable score decreases.
The chi-square test suggests that we should reject our null hypothesis. Religious fundamentalism and attitudes
toward same-sex marriage are statistically dependent.
Figure 23
Now we can examine the box labeled Symmetric Measures. Here we can find our gamma and Kendall’s tau-c.
Figure 24
We are only interested in the information contained in the value column. Our Kendall’s tau-c is -0.143 and our
gamma is -0.193. Interpret appropriately.
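Both of these measures are based on counting concordant and discordant pairs of cases. The Python sketch below shows one way to do that count for a small table. The counts are hypothetical and the function names are my own. Rows and columns must both be ordered from the lowest category to the highest.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in an ordered cross-tabulation."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # only cells in lower rows
                for l in range(n_cols):
                    if l > j:                   # lower and to the right
                        concordant += table[i][j] * table[k][l]
                    elif l < j:                 # lower and to the left
                        discordant += table[i][j] * table[k][l]
    return concordant, discordant

def gamma_and_tau_c(table):
    """Gamma and Kendall's tau-c (Stuart's formula) for an ordered table."""
    c, d = concordant_discordant(table)
    n = sum(sum(row) for row in table)
    m = min(len(table), len(table[0]))          # smaller table dimension
    gamma = (c - d) / (c + d)
    tau_c = 2 * m * (c - d) / (n ** 2 * (m - 1))
    return gamma, tau_c

# Hypothetical 3x2 table: high column scores go with low row scores.
toy = [
    [5, 20],
    [10, 10],
    [20, 5],
]
gamma, tau_c = gamma_and_tau_c(toy)
print(round(gamma, 3), round(tau_c, 3))
```

Both values come out negative for this toy table because discordant pairs outnumber concordant pairs, which is the same pattern we see in our own output.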
Regression and Correlation
To begin, click Analyze, then Regression, and finally Linear to complete the regression and correlation portion of
Project 4.
Figure 25
The following linear regression dialogue will appear:
Figure 26
You will need to select two appropriate variables to obtain your regression line equation. Remember, your
independent variable is the one you think will predict a change in the dependent variable.
In this example, weekswrk is an interval-ratio level variable reporting the number of weeks a respondent
worked last year. It is the independent variable. VISLIB is an interval-ratio level variable reporting the number of
times a respondent visited a public library last year. It is the dependent variable.
Figure 27
All we need to do to get the information to report our linear regression equation, the correlation coefficient, and
the coefficient of determination, is click OK. Your output will appear! I’ll go through each box one by one.
Figure 28
The first box just shows the variables in the equation and the method used to enter them (you don’t need to
worry about this). Double-check that the variable listed as entered matches your independent variable and that
the dependent variable listed below the box matches what you intended. We’re all good!
The Model Summary box reports Pearson’s correlation coefficient (R) and the coefficient of determination (R²).
Figure 29
Hold off on interpreting the correlation coefficient for now. We can also see from the coefficient of
determination that only 0.1% of the variation in visits to the library last year is explained by how many weeks
the respondent worked last year.
The next box shows us ANOVA. ANOVA and regression are related! For this portion of the assignment, do not
worry about this box. This box shows us if there is a significant relationship in our regression (we did not cover
this in our class – we can conduct hypothesis tests with regression, too!). In short: there isn’t a statistically
significant relationship.
Figure 30
The final box, labeled coefficients, contains the information you need to report your regression line equation.
Figure 31
The information contained under the column labeled B shows us both the slope (b) and the y-intercept (a).
The row labeled (Constant) shows us the constant of the regression line. This is different language than you are
already familiar with. The number in the cell where B and (Constant) meet is the y-intercept.
The row labeled WEEKS R WORKED LAST YEAR shows us various statistics relevant to how the independent
variable is related to the dependent variable. In the cell where B and WEEKS R WORKED LAST YEAR meet, we are
shown the slope of the line (b). For each unit increase in the number of weeks someone worked last year, we
expect a decrease in the number of library visits last year of 0.019.
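If it helps to connect this output to the hand calculations, here is a short Python sketch of the least-squares slope (b) and y-intercept (a). The five data points are hypothetical and the function name is my own. This is the same arithmetic SPSS performs, just on a toy data set.

```python
def least_squares(x, y):
    """Return (a, b): the y-intercept and slope of the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx                # slope
    a = mean_y - b * mean_x      # y-intercept, the (Constant) row in SPSS
    return a, b

# Hypothetical data: weeks worked (x) and library visits (y).
weeks = [0, 10, 20, 30, 40]
visits = [5, 4, 6, 3, 2]
a, b = least_squares(weeks, visits)

# Predicted visits for someone who worked 25 weeks: Y = a + bX.
predicted = a + b * 25
print(round(a, 2), round(b, 2), round(predicted, 2))
```

A negative b means the predicted Y goes down as X goes up, which is how we read the slope in our own output.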
Let’s return to the correlation box I showed you earlier.
Figure 32
The correlation coefficient presented here is only going to show a positive figure (this is what you should expect;
the explanation is beyond the scope of this class). However, based on the slope of our regression line, we know
that we actually have a negative relationship – as X increases, Y decreases. When you report your correlation
coefficient, make sure you report the appropriate sign. In this case, we know we have a very weak negative
relationship (R = -0.036).
Just to confirm, you can, but do not have to, calculate the bivariate correlation (click Analyze, then Correlate,
and then Bivariate). Take a look at what we find:
Figure 33
There it is! We can see there is a negative correlation of -0.036!
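The relationship between the slope, R, and R² can also be checked with a few lines of Python. The data below are hypothetical and the function name is my own. The point of the sketch is that the signed correlation carries the same sign as the slope, while the square root of R² is always positive, which is why the Model Summary box never shows a negative R.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient, with its sign."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: weeks worked (x) and library visits (y).
weeks = [0, 10, 20, 30, 40]
visits = [5, 4, 6, 3, 2]

r = pearson_r(weeks, visits)          # signed correlation
r_squared = r ** 2                    # coefficient of determination
unsigned_r = math.sqrt(r_squared)     # what a square root of R-squared gives back

# Using the R reported in this guide: 0.036 squared is roughly 0.1%.
guide_r_squared = 0.036 ** 2
print(round(r, 3), round(unsigned_r, 3), round(guide_r_squared, 4))
```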
ANOVA
To begin our ANOVA, you will first click Analyze, then Compare Means, and finally One-way ANOVA.
Figure 34
The following One-way ANOVA dialogue will appear:
Figure 35
Under dependent list, you will select your dependent variable. For our purposes, I’ll use our weekswrk variable
from earlier. Under factor, you will select your grouping variable. I’m selecting marital, which is a nominal level
variable reporting respondents’ marital status: married, widowed, divorced, separated, or never married. This is
the variable you will use to see if there are differences in the mean number of weeks worked last year by marital
status.
I’ll test the following hypotheses:
H0: There are no differences in mean weeks worked last year by marital status.
H1: There is at least one difference in the mean number of weeks worked last year by marital status.
Figure 36
From here, click on the box that says Post Hoc. Your book doesn’t discuss post-hoc tests, but they are very
useful in figuring out which groups differ and which do not. The following Post Hoc Multiple Comparisons
dialogue will appear:
Figure 37
There are lots of different tests we can use to see which groups differ. Click the box labeled simply Tukey. Then
click Continue.
Figure 38
This will bring you back to the One-Way ANOVA dialogue. Click OK.
Figure 39
Now your output will appear. We’ll go through it one by one. The first box shows you the sum of squares, degrees
of freedom, mean squares, the obtained F-statistic, and the p-value: everything you need, and everything you are
already comfortable calculating by hand.
Figure 40
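As a reminder of how the pieces of this table fit together, here is a minimal Python sketch of the one-way ANOVA calculation. The three groups and their scores are hypothetical, and the function name is my own. The p-value is left out because SPSS reports it for you.

```python
def one_way_anova(groups):
    """Return SSB, SSW, dfb, dfw, MSB, MSW, and F for a dict of group scores."""
    all_scores = [score for scores in groups.values() for score in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    ssb = 0.0  # sum of squares between groups
    ssw = 0.0  # sum of squares within groups
    for scores in groups.values():
        group_mean = sum(scores) / len(scores)
        ssb += len(scores) * (group_mean - grand_mean) ** 2
        ssw += sum((score - group_mean) ** 2 for score in scores)
    dfb = len(groups) - 1
    dfw = len(all_scores) - len(groups)
    msb = ssb / dfb
    msw = ssw / dfw
    return {"SSB": ssb, "SSW": ssw, "dfb": dfb, "dfw": dfw,
            "MSB": msb, "MSW": msw, "F": msb / msw}

# Hypothetical scores for three groups, invented purely for illustration.
toy_groups = {
    "group_a": [1, 2, 3],
    "group_b": [2, 3, 4],
    "group_c": [6, 7, 8],
}
result = one_way_anova(toy_groups)
print(result)
```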
In Project 4, I’ve asked you to report dfb, dfw, MSB, MSW, the obtained F-statistic, and the p-value. It’s all right
here! We can reject the null hypothesis because our ANOVA demonstrates there is at least one difference in
mean number of weeks worked last year. If you did not find a significant relationship, you’re pretty much done.
If you did find a significant relationship, you need to take a look at the next bit of output. The Post Hoc Tests
output comes next.
Figure 41
The first column shows us all five marital status categories. Notice that it is labeled (I) MARITAL STATUS.
The next column shows us each of the other four marital status categories, relative to the category reported
in the first column. It is labeled (J) MARITAL STATUS. The third column, labeled Mean Difference (I-J), shows us
the mean difference in weeks worked last year, subtracting the mean number of weeks worked for the second
category (labeled J) from the mean number of weeks worked for the first category (labeled I). Let’s examine
the first row:
Figure 42
First, we are looking at the mean difference in the number of weeks worked last year between married
respondents and widowed respondents. The value reported in the first cell under the Mean Difference (I-J)
column is calculated using the following formula:
Married − Widowed = mean difference in weeks worked
In this equation, I = Married and J = Widowed. The mean difference is 23.317 weeks.
This means that married respondents worked an average of 23.317 more weeks than widowed respondents
did last year. Now take a look at the value under the Sig. column. This is the P-value! The P-value is displayed as
0.000, which means it is smaller than 0.001. This means that at the alpha = 0.05 level, we can conclude that
married respondents worked significantly more weeks last year than widowed respondents.
You need to do this for each row. Notice there is repeated information in the table. Let’s look at the row where
I = WIDOWED and J = MARRIED.
Figure 43
From this, we can see that widowed respondents worked significantly fewer weeks last year than married
respondents did. They worked 23.317 weeks fewer, on average. Hey! That’s the same value with the opposite sign!
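The repeated information in the table follows directly from how the Mean Difference (I-J) column is built. Here is a tiny Python sketch using hypothetical group means (these are not the means from our output, and the variable names are my own). Every pair of groups appears twice, once as I-J and once as J-I, and the two differences are the same size with opposite signs.

```python
from itertools import permutations

# Hypothetical mean weeks worked per group, invented purely for illustration.
group_means = {"married": 30.0, "widowed": 10.0, "divorced": 28.0}

# One entry per ordered pair (I, J), just like the rows of the post hoc table.
mean_diff = {
    (i, j): group_means[i] - group_means[j]
    for i, j in permutations(group_means, 2)
}

for (i, j), diff in mean_diff.items():
    print(f"I = {i:<9} J = {j:<9} I-J = {diff:+.1f}")
```

So when you report significant differences, you only need to describe each pair once.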
Let’s identify each set that significantly differed:
Figure 44
Notice anything? Widowed respondents worked significantly fewer weeks last year than married, divorced,
separated, and never married respondents. Perhaps even more interesting, the only significant differences
involved widowed respondents. What might explain this? It is probable that most of the widowed respondents
are elderly, and thus at retirement age already. Make sure you report all significant differences appropriately.
Good luck on Project 4! I hope this guide was helpful.