Discussion: Correlation and Bivariate Regression

Correlation and Bivariate Regression

Correlation and Bivariate Regression Program Transcript

MATT JONES: This week, we're performing a Pearson correlation test, which is a rather simple procedure in SPSS. As with many of our tests, go ahead and click Analyze to open the drop-down menu. Because we're performing a correlation, we move down to Correlate and across to Bivariate. The Pearson correlation test is a bivariate test.

If you click on that, you'll see a box come up, Bivariate Correlations. Let's go ahead and perform a bivariate correlation for the respondent's socioeconomic status index and the respondent's highest level of education.

Now, it's important to remember that, in this GSS data set, the respondent's highest level of education is measured in two different ways: once as a categorical variable and once as an interval-ratio level variable. The categorical variable is the respondent's highest degree obtained. The interval-ratio level variable is the number of years of education completed.

We want to use the respondent's highest level of education as measured in years, the interval-ratio level variable, because a Pearson correlation test is intended for two metric-level variables and is easiest to interpret when both variables are measured that way.

So again, I see my variable listings off to the left. And I can scroll down to find the appropriate variables that I want to test for a possible correlation.

Here, I can see the highest year of school completed. I place my cursor over it, and it's highlighted. Again, I know this is the interval-ratio level of measurement because I can see the scale ruler icon next to it. I highlight that and move it over.

I scroll down to find the socioeconomic status index and, again, place my cursor over it, highlight it, and move it over. You'll see that SPSS, by default, selects the Pearson correlation coefficient. Note that there are two other correlation coefficients that we will talk about later in the class.

The output for the Pearson correlation coefficient is rather simple. Since it's a bivariate test, you'll see the bivariate combinations here. We can see that there is a correlation coefficient of 0.610 between the highest year of school completed and the respondent's socioeconomic index.

If we move below, we can see the test of significance and see that the p value for this test is displayed as 0.000, meaning p is less than 0.001, which is well below the conventional 0.05 threshold. Therefore, we can reject the null hypothesis that there is no relationship between the respondent's highest year of school completed and their socioeconomic index.
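As a rough illustration of what sits behind this output, the sketch below computes a Pearson correlation coefficient and the t statistic used to test it. The data values and the names `pearson_r`, `t_statistic`, `years_of_school`, and `ses_index` are assumptions made up for illustration; they are not the GSS values, so the result will not match the 0.610 shown in SPSS.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for the null hypothesis of no linear relationship (n - 2 df)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical toy data, NOT the GSS values used in the demonstration
years_of_school = [8, 10, 12, 12, 14, 16, 16, 18]
ses_index = [30, 28, 41, 45, 50, 62, 58, 75]

r = pearson_r(years_of_school, ses_index)
t = t_statistic(r, len(years_of_school))
print(f"r = {r:.3f}, t = {t:.2f}")
```

The p value that SPSS reports comes from comparing this t statistic with a t distribution on n - 2 degrees of freedom; that lookup is left out here to keep the sketch within the standard library.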

©2016 Laureate Education, Inc.

Looking at the Pearson correlation coefficient, we know that this is a positive relationship and that the relationship is moderate in strength.

Again, remember that a Pearson correlation coefficient is a standardized index that has a range of values from negative 1.0 to positive 1.0, with 0 indicating no linear relationship whatsoever. The closer the value is to negative 1.0 or positive 1.0, the stronger the relationship.
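To make the range concrete, here is a small sketch under assumed values (the lists and variable names are hypothetical): a perfectly rising line gives +1.0, a perfectly falling line gives -1.0, and an imperfect positive pattern falls in between. Because the coefficient is standardized, changing the units of one variable leaves it unchanged.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]                  # hypothetical values
rising = [2 * v + 1 for v in x]      # perfect positive line
falling = [10 - 3 * v for v in x]    # perfect negative line
noisy = [2, 1, 4, 3, 6]              # positive but imperfect pattern

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
r_noisy = pearson_r(x, noisy)
# Re-expressing x in different units (for example, years as months) does not change r
r_rescaled = pearson_r([12 * v for v in x], noisy)

print(f"{r_rising:.3f} {r_falling:.3f} {r_noisy:.3f} {r_rescaled:.3f}")
```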

You can see that, by default, SPSS flags significant correlations. If we move down to the bottom here, we can see that this correlation is significant at the 0.01 level.

Bivariate regression is in many ways similar to a Pearson correlation coefficient. Whereas a Pearson correlation coefficient provides us with the strength of a relationship between two variables, bivariate regression provides us with just a little bit more information. Let's go to SPSS to see how we can perform this test.

To perform this bivariate regression in SPSS, we click on Analyze. And we move our cursor down to Regression. Right away, you will see a number of options for regression. For bivariate regression, we're using a method called ordinary least squares, which in SPSS is referred to as Linear Regression. Bivariate regression often goes by the term simple linear regression as well.

If we click on that, we'll see that we have a number of options available to us. A dependent variable box and an independent variable box are the first things that we want to pay attention to. Let's go ahead and predict a respondent's socioeconomic status index from their highest level of education.

Again, we want to pay attention to levels of measurement. For our independent variable, we want to use the respondent's highest level of education measured as number of years in school. That is at the interval or ratio level of measurement. Let's go ahead and enter our dependent variable first, Socioeconomic Status Index.

So again, I can hover my cursor over this variable to make sure this is the proper variable that I want to select. I highlight it and just use the arrow button to move it over.

We'll scroll up to our independent variable, which is, again, the respondent's highest level of education measured as number of years. Move that over. And then I can click OK.
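For readers who want to see what the ordinary least squares method computes for one predictor, the following is a minimal sketch, not SPSS's own implementation. The function name `fit_line` and the data values are assumptions for illustration; the numbers are not the GSS values.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx                      # change in y per one-unit change in x
    intercept = mean_y - slope * mean_x    # predicted y when x equals 0
    return intercept, slope

# Hypothetical toy data, NOT the GSS values
years_of_school = [8, 10, 12, 12, 14, 16, 16, 18]   # independent variable
ses_index = [30, 28, 41, 45, 50, 62, 58, 75]        # dependent variable

intercept, slope = fit_line(years_of_school, ses_index)
predicted = [intercept + slope * x for x in years_of_school]
residuals = [y - p for y, p in zip(ses_index, predicted)]
print(f"constant = {intercept:.3f}, slope = {slope:.3f}")
```

The fitted line is the one that minimizes the sum of squared residuals, and with an intercept in the model those residuals sum to zero.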

Let's go ahead and walk through some of the output that SPSS provides us for the bivariate regression model. Let's first focus on our model summary. The large R, or multiple R, in a bivariate regression model is equal to the Pearson correlation coefficient (in absolute value). In this case, we have a statistic of 0.610. If we ran a Pearson correlation between a respondent's socioeconomic status index and their highest level of education, we would receive a Pearson correlation coefficient of 0.610.

The R Square, here a statistic of 0.372, provides us with more information about the overall model. From the 0.372, we can infer that about 37% of the variance in respondents' socioeconomic status index is accounted for, or explained, by their highest year of school completed.
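The link between the reported R and R Square can be checked with a line of arithmetic. The sketch below simply squares the 0.610 from the output; since that figure is itself rounded, the agreement with 0.372 is only to three decimal places.

```python
r = 0.610                # Pearson r / multiple R as reported in the output
r_squared = r ** 2       # in bivariate regression, R Square is r squared

print(f"R Square = {r_squared:.3f}")            # prints "R Square = 0.372"
print(f"Variance explained = {r_squared:.0%}")  # prints "Variance explained = 37%"
```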

The Adjusted R Square is similar in this case because we only have one predictor. As we increase the number of predictors in a multiple regression model, the Adjusted R Square will diverge from the R Square.

Next, we go to our ANOVA box. Here, we're testing for the overall significance of the regression model. You'll see a significance level displayed as 0.000, meaning p is less than 0.001, which is well below the conventional 0.05 threshold. Therefore, we can conclude that our model has statistical significance and the R Square can be interpreted.

Next, let's go ahead and interpret the coefficients output. You'll see here that we're provided with several statistics. The first statistic is the constant. This is where our regression line intercepts the y-axis, that is, the predicted value of the dependent variable when the independent variable equals 0.

Our next coefficient to interpret is for our independent variable, here, highest year of school completed. This is the unstandardized coefficient, so we can interpret it this way: for every one-unit increase in our independent variable, our dependent variable is predicted to change by this value.

So we'll say it in plain English. For every additional year of school completed, the socioeconomic status index is predicted to increase by 3.765 units, on average.
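As a worked example of that interpretation, the sketch below multiplies the reported coefficient by a number of additional years. The function name `predicted_change` is an assumption for illustration. Only differences are computed, because the value of the constant is not stated in the transcript.

```python
SLOPE = 3.765   # unstandardized coefficient reported in the output

def predicted_change(extra_years, slope=SLOPE):
    """Average predicted change in the socioeconomic index for extra years of school."""
    return slope * extra_years

one_year = predicted_change(1)     # one additional year of school
four_years = predicted_change(4)   # for example, four additional years
print(f"{one_year:.3f} {four_years:.3f}")
```

These are average differences between respondents who differ in years of schooling, not a guarantee for any individual respondent.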

We'll also note here that SPSS provides us with a standardized coefficient, or a beta, for our independent variable. You might notice right away that this value is the same as the Pearson R, 0.610. That's because the standardized coefficient expresses both variables in standard deviation units, and with only one predictor that standardized slope is the Pearson correlation coefficient.
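The equality of beta and the Pearson coefficient in the one-predictor case can be verified numerically. This is a sketch on hypothetical toy data (not the GSS values); it rescales the unstandardized slope by the ratio of the two standard deviations and compares the result with r.

```python
import math

# Hypothetical toy data, NOT the GSS values
x = [8, 10, 12, 12, 14, 16, 16, 18]     # years of school
y = [30, 28, 41, 45, 50, 62, 58, 75]    # socioeconomic index

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)

b = sxy / sxx                        # unstandardized slope
sd_x = math.sqrt(sxx / (n - 1))      # sample standard deviation of x
sd_y = math.sqrt(syy / (n - 1))      # sample standard deviation of y
beta = b * sd_x / sd_y               # standardized coefficient
r = sxy / math.sqrt(sxx * syy)       # Pearson correlation coefficient

print(f"beta = {beta:.6f}, r = {r:.6f}")
```

With more than one predictor this identity no longer holds in general, which is why the two values are reported separately.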

We, of course, also want to pay close attention to our significance. Here, we have a significance level displayed as 0.000, which is well below the 0.05 threshold. Therefore, we can reject the null hypothesis that there is no relationship between our two variables, highest year of school completed and respondent's socioeconomic index. It appears that the more school one completes, on average, the higher their socioeconomic index will be.

This was just a basic introduction to bivariate regression in SPSS. Although the procedures are rather simple, there is still a lot more to know about bivariate regression. As you've probably noticed, we didn't go over some of the output. If you have additional questions, be sure to use your textbook and also consult your faculty instructor. We want you to understand linear regression. And we're here to see you succeed.