U5D2-64 - NEED TO KNOW SPSS - Interpret the Correlation, nature of the relationship in your own words of the Meta-Analysis reported on playing video games...see details
The second half of this course (Units 5–10) examines three fundamental inferential statistics: correlation (Units 5–6), t tests (Units 7–8), and analysis of variance (ANOVA; Units 9–10). The first inferential statistic we will focus on is correlation, denoted r, which estimates the strength of a linear association between two variables. By contrast, in Units 7–10, t tests and ANOVAs will examine group differences on some quantitative dependent variable.
Interpreting Correlation: Magnitude and Sign
Interpreting a Pearson's correlation coefficient (rXY) requires an understanding of two concepts:
1. Magnitude.
2. Sign (±).
The magnitude refers to the strength of the linear relationship between variable X and variable Y. Values of rXY range from −1.00 to +1.00. To determine magnitude, ignore the sign of the correlation: the absolute value of rXY indicates the extent to which variable X and variable Y are linearly related. For correlations close to 0, there is little or no linear relationship. As the correlation approaches either −1.00 or +1.00, the magnitude of the correlation increases. For example, |−.65| > |+.25|; that is, the magnitude of r = −.65 is greater than the magnitude of r = +.25.
In contrast to magnitude, the sign of a non-zero correlation is either negative or positive. These labels are not interpreted as "bad" or "good." Instead, the sign represents the slope of the linear relationship between X and Y. A scatter plot is used to visualize the slope of this linear relationship; it is a two-dimensional graph in which each dot represents one case's paired X, Y scores. Interpreting scatter plots is necessary to check the assumptions of correlation discussed below.
A positive correlation indicates that as values of X increase, the values of Y also increase (for example, height and weight). Figures 7.1 to 7.4 (pp. 262–263) in the Warner text illustrate positive correlations showing that as values of X increase on the horizontal axis, values of Y increase on the vertical axis. The magnitude of the correlations ranges from a perfect positive linear relationship of r = +1.00 in Figure 7.1 to a weak positive correlation of r = +.23 in Figure 7.4. Conversely, a negative correlation indicates that as values of X increase, the values of Y decrease (see Figure 7.6 on p. 265 representing r = −.75). Finally, when X and Y values are randomly distributed on the scatter plot (that is, there is no linear relationship), then r = 0.00 (see Figure 7.5 on p. 264).
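To make magnitude and sign concrete, the following is a minimal sketch of Pearson's r computed from its definition (the sum of cross-products of deviations divided by the square root of the product of the two sums of squares). The course uses SPSS for this calculation; Python appears here only as an illustration, and the variable names and scores are hypothetical, invented for the example.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviation scores: SP / sqrt(SS_x * SS_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical scores, invented for illustration only.
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 60, 68, 70]   # tends to rise as X rises
errors_made = [14, 12, 11, 9, 6, 5]     # tends to fall as X rises

r_pos = pearson_r(hours_studied, exam_score)
r_neg = pearson_r(hours_studied, errors_made)

print(f"r (hours, score)  = {r_pos:+.2f}")   # sign is positive
print(f"r (hours, errors) = {r_neg:+.2f}")   # sign is negative

# Magnitude ignores the sign, so compare absolute values.
stronger = "errors" if abs(r_neg) > abs(r_pos) else "score"
print("Stronger linear relationship with hours:", stronger)
```

The sign of each result describes the direction of the slope, while the absolute value describes how tightly the points cluster around a straight line.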
Assumptions of Correlation
All inferential statistics, including correlation, operate under assumptions that are checked prior to interpreting analyses. Violations of assumptions can lead to erroneous inferences regarding a null hypothesis. The first assumption of correlation is independence of observations for X and Y scores. The measurement of individual X and Y scores should not be influenced by errors in measurement or problems in research design. (For example, a student completing an IQ test should not be looking over the shoulder of another student taking that test; his or her IQ score should be independent.) This first assumption of correlation is not statistical in nature; it is controlled by using reliable and valid instruments and by maintaining proper research procedures to maintain independence of observations.
Unit 5 - Correlation: Theory and Logic INTRODUCTION
The second assumption is that, for Pearson's r, X and Y are quantitative and each variable is normally distributed. Other correlations discussed below do not require this assumption, but Pearson's r is the most widely used and reported type of correlation. It is therefore important to check this assumption when calculating Pearson's r in SPSS. This assumption is checked by a visual inspection of X and Y histograms and calculations of skew and kurtosis values.
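As a rough illustration of the skew and kurtosis check, the sketch below computes simple moment-based skewness and excess kurtosis for two hypothetical sets of scores. This is an assumption-laden simplification: SPSS reports sample-adjusted versions of these statistics, so its values will differ somewhat from the ones produced here, especially in small samples.

```python
def skew_and_kurtosis(scores):
    """Moment-based skewness and excess kurtosis (no small-sample adjustment)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((s - mean) ** 2 for s in scores) / n   # second central moment
    m3 = sum((s - mean) ** 3 for s in scores) / n   # third central moment
    m4 = sum((s - mean) ** 4 for s in scores) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0            # 0 for a normal distribution
    return skew, excess_kurtosis

# Hypothetical scores, invented for illustration only.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 4, 7, 12]

for name, data in (("symmetric", symmetric), ("right_skewed", right_skewed)):
    skew, kurt = skew_and_kurtosis(data)
    print(f"{name:>12}: skew = {skew:+.2f}, excess kurtosis = {kurt:+.2f}")
```

Values near 0 are consistent with a roughly normal shape; a clearly positive skew, as in the second data set, signals a long right tail that should also be visible in the histogram.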
The third assumption of correlation is that X and Y scores are linearly related. Correlation does not detect strong curvilinear relationships as shown in Figure 7.7 of the Warner text (p. 268). This assumption is checked by a visual inspection of the X, Y scatter plot.
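The point that Pearson's r misses curvilinear relationships can be sketched with a small constructed example. In the hypothetical data below, Y is determined exactly by X (Y = X²), yet the linear correlation is zero because the rising and falling halves of the curve cancel out.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviation scores: SP / sqrt(SS_x * SS_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Constructed example: a perfect U-shaped (curvilinear) relationship.
x_scores = [-3, -2, -1, 0, 1, 2, 3]
y_scores = [x ** 2 for x in x_scores]

r_curve = pearson_r(x_scores, y_scores)
print(f"r for Y = X^2: {r_curve:.2f}")
```

An r near zero therefore does not rule out a strong relationship; it only rules out a strong linear one, which is why the scatter plot must be inspected.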
The fourth assumption of correlation is that the X and Y scores should not have extreme bivariate outliers that influence the magnitude of the correlation. Bivariate outliers are also detected by a visual examination of a scatter plot. Figures 7.10 and 7.11 in the Warner text (pp. 272–273) illustrate how outliers can dramatically influence the magnitude of the correlation, which sometimes leads to errors in null hypothesis testing. Bivariate outliers are particularly problematic when a sample size is small, and Warner (2013) suggests an N of at least 100 for studies that report correlations.
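The influence of a single bivariate outlier in a small sample can be sketched as follows. The eight base cases are hypothetical and were constructed to have no linear relationship; one extreme case is then added to show how much r can change.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviation scores: SP / sqrt(SS_x * SS_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical small sample with no linear relationship.
x_base = [1, 2, 3, 4, 5, 6, 7, 8]
y_base = [5, 3, 6, 4, 5, 3, 6, 4]

# The same sample plus one extreme case that is high on both X and Y.
x_out = x_base + [20]
y_out = y_base + [20]

r_without = pearson_r(x_base, y_base)
r_with = pearson_r(x_out, y_out)

print(f"r without outlier: {r_without:+.2f}")
print(f"r with outlier:    {r_with:+.2f}")
```

With only nine cases, the one extreme point dominates the calculation. In a sample of 100 or more, the same point would carry far less weight, which is consistent with the sample size guidance above.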
The fifth assumption of correlation is that the variability in Y scores is uniform across levels of X. This requirement is referred to as the homogeneity of variance assumption, which is usually difficult to assess in scatter plots with a small sample size. Sometimes a potential violation can be detected, such as in Figure 4.43 of the Warner text (p. 169), but this assumption is typically emphasized when checking the homogeneity of variance for a t test or analysis of variance (ANOVA) studied later in the course.
Hypothesis Testing of Correlation
The null hypothesis for correlation predicts no significant linear relationship between X and Y, or H0: rXY = 0. A directional alternative hypothesis for correlation is either an expected significant positive relationship (H1: rXY > 0) or an expected significant negative relationship (H1: rXY < 0). A non-directional alternative hypothesis simply predicts that the correlation is significantly different from 0, without stipulating the sign of the relationship (H1: rXY ≠ 0).
For correlation as well as t tests and ANOVA studied later in the course, the standard alpha level for rejecting the null hypothesis is set to .05. SPSS output for a correlation showing a p value of less than .05 indicates that the null hypothesis should be rejected; there is a significant relationship between X and Y. A p value greater than .05 indicates that the null hypothesis should not be rejected; there is not a significant relationship between X and Y.
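The decision rule can be sketched in code. The t statistic below uses the standard formula for testing H0: rXY = 0, t = r√(N − 2) / √(1 − r²) with df = N − 2. Because Python's standard library has no t distribution, the p value here is only an approximation based on Fisher's z transformation and the normal distribution; SPSS reports the p value from the t distribution, so its output will differ slightly. The values of r and N are hypothetical.

```python
import math
from statistics import NormalDist

ALPHA = 0.05  # standard alpha level used in this course

def t_for_r(r, n):
    """t statistic and degrees of freedom for testing H0: r = 0."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

def approx_two_tailed_p(r, n):
    """Approximate non-directional p value via Fisher's z (an assumption,
    used here in place of the exact t distribution)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical result, invented for illustration only.
r, n = 0.40, 30
t, df = t_for_r(r, n)
p = approx_two_tailed_p(r, n)
decision = "reject H0" if p < ALPHA else "fail to reject H0"

print(f"r = {r:.2f}, N = {n}, t({df}) = {t:.2f}, approximate p = {p:.3f}")
print("Decision:", decision)
```

The same r with a much smaller N would produce a smaller t and a larger p, which is why significance alone does not describe the size of the association.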
Effect Size in Correlation
Even if the null hypothesis is rejected, how large is the association between X and Y? To provide additional context, the interpretation of all inferential statistics, including correlation, should include an estimate of effect size. An effect size is articulated along a continuum from "small" to "medium" to "large." Effect sizes allow the researcher to judge the practical importance of a result, not just its statistical significance. For a Pearson's correlation, effect size is expressed by either r or r². The proportion of variance shared between the two variables is obtained by squaring the correlation coefficient (r²). This is called the coefficient of determination and is a measure of the amount of variation explained by the association. For example, if the correlation between X and Y is r = .40, then r² = .16, meaning that 16% of the variance in Y (the dependent variable) is explained by X (the independent variable).
Page 298 in the Warner text provides guidelines on the interpretation of r and r². Roughly speaking, a correlation less than or equal to .10 (r² = .01) is "small," a correlation of .30 (r² = .09) is "medium," and a correlation above .50 (r² = .25) is "large." It is important to interpret correlation with this in mind, as it is possible to have a statistically significant correlation (because statistical significance is partially dependent on sample size) and still have a small effect size (which is calculated independently of sample size).
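The effect size guidelines can be sketched as a small helper. The benchmarks of .10, .30, and .50 come from the guidelines above; how to label values that fall between benchmarks is an assumption made here for illustration (values are binned downward), and the function names are hypothetical. The example values are the r = .40 used above, the r = .19 from this unit's discussion question, and the r = −.65 used earlier to illustrate magnitude.

```python
def r_squared(r):
    """Coefficient of determination: proportion of shared variance."""
    return r ** 2

def effect_size_label(r):
    """Rough label using the .30 and .50 benchmarks as cutoffs (an assumption)."""
    magnitude = abs(r)  # the sign does not affect effect size
    if magnitude >= 0.50:
        return "large"
    if magnitude >= 0.30:
        return "medium"
    return "small"

for r in (0.40, 0.19, -0.65):
    shared = r_squared(r)
    print(f"r = {r:+.2f}  r^2 = {shared:.4f}  "
          f"({shared:.1%} shared variance)  {effect_size_label(r)}")
```

Squaring makes the practical meaning easier to judge: even a statistically significant correlation can leave most of the variance in Y unexplained.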
Alternative Correlation Coefficients
Chapter 7 of the Warner text focuses on the most widely used correlation, referred to as Pearson's r. Pearson's r is calculated between X and Y variables that are measured on either the interval or ratio scale of measurement (for example, height and weight). Chapter 8 of the Warner text reviews other types of correlation that depend on other scales of measurement for X and Y. A point biserial (rpb) correlation is calculated when one variable is dichotomous (such as gender) and the other variable is interval/ratio data (such as weight). If both variables are ranked (ordinal) data, the correlation is referred to as Spearman's r (rs). Although the underlying scales of measurement differ from the standard Pearson's r, rpb and rs both range from −1.00 to +1.00 and are interpreted similarly.
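One way to see how Spearman's r relates to Pearson's r is to compute it as a Pearson correlation on ranks, with tied scores sharing the average of their ranks. The sketch below takes that approach; the helper names and the data are hypothetical, and SPSS would normally perform this calculation.

```python
import math

def pearson_r(x, y):
    """Pearson's r from deviation scores: SP / sqrt(SS_x * SS_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

def average_ranks(values):
    """Rank scores from 1 (lowest); tied scores share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every score tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rs(x, y):
    """Spearman's r computed as Pearson's r on the ranked scores."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: Y always rises with X, but not in a straight line.
x_scores = [1, 2, 3, 4, 5]
y_scores = [1, 4, 9, 16, 25]

r = pearson_r(x_scores, y_scores)
rs = spearman_rs(x_scores, y_scores)
print(f"Pearson r   = {r:.3f}")
print(f"Spearman rs = {rs:.3f}")
```

Because the ranks of X and Y agree perfectly in this example, rs reaches its maximum even though the raw scores do not fall on a straight line.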
If both variables are dichotomous, the correlation is referred to as phi (φ). A final test of association is referred to as chi-square. Phi and chi-square are studied in advanced inferential statistics.
Reference
Warner, R. M. (2013). Applied statistics: From bivariate through multivariate techniques (2nd ed.). Thousand Oaks, CA: Sage.
OBJECTIVES
To successfully complete this learning unit, you will be expected to:
1. Analyze the interpretation of correlation coefficients.
2. Identify the assumptions of correlation.
3. Identify null hypothesis testing of correlation.
4. Interpret a correlation reported in the scientific literature.
5. Analyze the assumptions of correlation.
Unit 5 Study 1
Use your Warner text, Applied Statistics: From Bivariate Through Multivariate Techniques, to complete the following:
• Read Chapter 7, "Bivariate Pearson Correlation," pages 261–314. This chapter addresses the following topics:
  ◦ Assumptions of Pearson's r.
  ◦ Preliminary data screening for Pearson's r.
  ◦ Statistical significance tests for Pearson's r.
  ◦ Factors influencing the magnitude and sign of Pearson's r.
  ◦ Effect-size indexes.
  ◦ Interpretation of Pearson's r values.
• Read Chapter 8, "Alternative Correlation Coefficients," pages 315–343. This chapter addresses the following topics:
  ◦ Correlations for rank or ordinal scores.
  ◦ Correlations for true dichotomies.
  ◦ Correlations for artificial dichotomies.
  ◦ Chi-square test of association.
PSY Learners – Additional Required Readings
In addition to the other required study activities for this unit, PSY learners are required to read the following articles. Note: For the first article, focus on interpreting Table 1.
Jia, Y., Konold, T. R., & Cornell, D. (2015). Authoritative school climate and high school dropout rates. School Psychology Quarterly. doi:10.1037/spq0000139
Anderson, C. A., & Bushman, B. J. (2001). Effects of violent video games on aggressive behavior, aggressive cognition, aggressive affect, physiological arousal, and prosocial behavior: A meta-analytic review of the scientific literature. Psychological Science, 12(5), 353–359.
SOE Learners – Suggested Readings
Some programs have opted to provide program-specific content designed to help you better understand how the subject matter in this study is incorporated into your particular field of study. The following readings are suggested for SOE learners.
Walk, M., & Rupp, A. (2010). Pearson product-moment correlation coefficient. In N. J. Salkind (Ed.), Encyclopedia of research design (pp. 1023–1026). Thousand Oaks, CA: Sage. doi:10.4135/9781412961288.n309
IBM SPSS Step-by-Step Guide: Correlations.
Unit 5 Discussion 2
Interpreting Correlations
A meta-analysis (Anderson & Bushman, 2001) reported that the average correlation between time spent playing video games (X) and engaging in aggressive behavior (Y) in a set of 21 well-controlled experimental studies was .19. This correlation was judged to be statistically significant. In your own words, what can you say about the nature of the relationship?
Reference
Anderson, C. A., & Bushman, B. J. (2001). Effects of violent video games on aggressive behavior, aggressive cognition, aggressive affect, physiological arousal, and prosocial behavior: A meta-analytic review of the scientific literature. Psychological Science, 12(5), 353–359.