
© 2014 by Pearson Higher Education, Inc Upper Saddle River, New Jersey 07458 • All Rights Reserved

Fox/Levin/Forde, Elementary Statistics in Criminal Justice Research, 4e

Chapter 10: Correlation


CHAPTER OBJECTIVES

10.1 Differentiate between the strength and direction of a correlation

10.2 Identify a curvilinear correlation

10.3 Discuss the characteristics of correlation coefficients

10.4 Calculate and test the significance of Pearson’s correlation coefficient

10.5 Calculate the partial correlation coefficient

Learning Objectives

After this lecture, you should be able to complete the following Learning Outcomes

10.1 Differentiate between the strength and direction of a correlation

Correlation

Until now we’ve examined the presence or absence of a relationship between two or more variables.

What about the strength and direction of this relationship?

  • We refer to this as the correlation between variables.

Strength of Correlation

  • This can be visualized using a scatterplot.

The X variable (independent variable, IV) is located on the horizontal axis

The Y variable (dependent variable, DV) is located on the vertical axis

Strength increases as the points more closely form an imaginary diagonal line across the center.

Direction of Correlation

  • Correlations can be described as either positive or negative.

Positive – both variables move in the same direction

  • High score on X variable tends to have a high score on the Y variable

Negative – the variables move in opposite directions

  • High score on X variable tends to have a low score on the Y variable
  • A positive or negative correlation represents a straight-line correlation

Points in the scatterplot tend to form a straight line through the center of the graph


Learning Objectives

After this lecture, you should be able to complete the following Learning Outcomes

10.2 Identify a curvilinear correlation

Curvilinear Correlation

A relationship between X and Y that begins as positive and becomes negative, or begins as negative and becomes positive

Fear of crime tends to have a curvilinear correlation with age

Fear of crime tends to decrease with age until people reach their thirties, after which fear tends to increase with age
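To see why a curvilinear relationship matters, the following minimal sketch computes a straight-line correlation on a U-shaped pattern. The ages and fear-of-crime scores are invented for illustration (they are not data from the textbook), and `pearson_r` is a helper name assumed here. The straight-line coefficient comes out near zero for the full age range even though fear is strongly related to age within each half.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r via the deviations formula: SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)


# Hypothetical data: an invented fear-of-crime score that falls until
# age 35 and rises afterwards (a U shape). Not taken from the textbook.
ages = [15, 20, 25, 30, 35, 40, 45, 50, 55]
fear = [(age - 35) ** 2 / 40 + 2 for age in ages]

r_all = pearson_r(ages, fear)              # whole age range
r_younger = pearson_r(ages[:5], fear[:5])  # ages 15 to 35: fear falls
r_older = pearson_r(ages[4:], fear[4:])    # ages 35 to 55: fear rises

print(f"all ages:   r = {r_all:.2f}")
print(f"ages 15-35: r = {r_younger:.2f}")
print(f"ages 35-55: r = {r_older:.2f}")
```

The negative and positive halves cancel each other out, which is why a scatterplot is needed to spot this kind of pattern.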


Learning Objectives

After this lecture, you should be able to complete the following Learning Outcomes

10.3 Discuss the characteristics of correlation coefficients

The Correlation Coefficient

Numerically expresses both the direction and strength of a relationship between two variables

  • Ranges between -1.0 and +1.0

Direction

  • The sign (either – or +) indicates the direction of the relationship

Strength

  • Values close to zero indicate little or no correlation
  • Values closer to -1 or +1 indicate stronger correlations

Learning Objectives

After this lecture, you should be able to complete the following Learning Outcomes

10.4 Calculate and test the significance of Pearson’s correlation coefficient (r)

Pearson’s Correlation Coefficient (r)

We can determine the strength and direction of the relationship between X and Y; both variables must be measured at the interval level

Focuses on the product of the X and Y deviations from their respective means

Deviations Formula:

Computational Formula:

Imagine we want to examine the relationship between the length of a murder trial in days and the length of time in hours that the jury deliberates.

The scatterplot seems to indicate a positive relationship with a few exceptions

Pearson’s r indicates precisely the degree to which jury deliberation tends to lengthen with lengthier trials

Pearson’s r focuses on the product of the X and Y deviations from their respective means

The deviation (X − X̄) tells how much longer or shorter than average a particular trial is

The deviation (Y − Ȳ) tells how much longer or shorter than average a particular jury deliberation is

With Pearson’s r we add the products to see if the positive products or the negative products are more abundant and sizeable

*

Calculating Pearson’s r requires us to compute the sum of products (SP) for all the cases

The sum of the final column (SP) is positive, which indicates a positive relationship

However, we have to divide SP by the square root of the product of the sums of squares of both variables (SS for X and SS for Y) to get an r between -1 and +1
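The calculation described above can be sketched in a few lines. The trial lengths and deliberation times below are hypothetical values chosen for easy arithmetic, not the table from the slides, and the variable names are assumptions. The sketch computes SP and the two sums of squares, then checks that the deviations formula and the computational formula give the same r.

```python
from math import sqrt

# Hypothetical data, invented for illustration (not the slide's table).
trial_days = [1, 3, 5, 7, 2, 6]             # X: length of trial in days
deliberation_hours = [2, 7, 8, 14, 6, 11]   # Y: jury deliberation in hours

n = len(trial_days)
mean_x = sum(trial_days) / n
mean_y = sum(deliberation_hours) / n

# Deviations formula: sum the products of the paired deviations (SP),
# then scale by the square root of the product of the sums of squares.
sp = sum((x - mean_x) * (y - mean_y)
         for x, y in zip(trial_days, deliberation_hours))
ss_x = sum((x - mean_x) ** 2 for x in trial_days)
ss_y = sum((y - mean_y) ** 2 for y in deliberation_hours)
r_deviations = sp / sqrt(ss_x * ss_y)

# Computational formula: works from raw sums instead of deviations.
sum_xy = sum(x * y for x, y in zip(trial_days, deliberation_hours))
sum_x2 = sum(x ** 2 for x in trial_days)
sum_y2 = sum(y ** 2 for y in deliberation_hours)
r_computational = (sum_xy - n * mean_x * mean_y) / sqrt(
    (sum_x2 - n * mean_x ** 2) * (sum_y2 - n * mean_y ** 2)
)

print(f"SP = {sp}, SS_X = {ss_x}, SS_Y = {ss_y}")
print(f"r (deviations)    = {r_deviations:.4f}")
print(f"r (computational) = {r_computational:.4f}")
```

SP on its own grows with the number of cases and the units of measurement; dividing by the square root of SS_X times SS_Y is what confines r to the range -1 to +1.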


Testing the Significance of Pearson’s r

To test the significance of a measure of correlation, we set up a null hypothesis:

no correlation exists in the population (ρ = 0).

ρ is pronounced rho

  • To test the significance of r, a t ratio with degrees of freedom N – 2 must be calculated.

A simplified method for testing the significance of r

  • Compare the calculated r to a critical value found in Table H in Appendix C
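As a rough illustration of the t ratio mentioned above, the sketch below converts a sample r into t with N – 2 degrees of freedom, using the standard form t = r√(N – 2) / √(1 – r²). The function name `t_ratio_for_r` and the example values (r = .60, N = 27) are assumptions made for this example, not figures from the chapter.

```python
from math import sqrt


def t_ratio_for_r(r, n):
    """Return (t, df) for testing H0: rho = 0, given sample r and size N."""
    if n < 3:
        raise ValueError("need at least 3 cases so that df = N - 2 >= 1")
    if abs(r) >= 1:
        raise ValueError("t is undefined when r is exactly -1 or +1")
    df = n - 2
    t = r * sqrt(df) / sqrt(1 - r ** 2)
    return t, df


# Hypothetical example values, chosen for easy arithmetic.
t, df = t_ratio_for_r(0.60, 27)
print(f"t = {t:.2f} with df = {df}")
```

The resulting t is then compared with the critical value of t for the same degrees of freedom; the null hypothesis is rejected when the calculated t is larger in absolute value.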

Step by Step – Pearson’s r

Requirements for the Use of Pearson’s r Correlation Coefficient

  • A Straight-Line Relationship
  • Interval Data
  • Random Sampling
  • Normally Distributed Characteristics

The Importance of Scatterplots

For a dataset containing several variables, a computer can obtain a correlation matrix

Displays in compact form the interrelationships of several variables simultaneously

In the correlation matrix, the entry in the 2nd row, 4th column tells us the correlation between the offender’s and the spouse’s education

The matrix is symmetric, with the portion above the diagonal being identical to the portion below the diagonal

Pitfall – correlations may gloss over some major violations of assumptions of Pearson’s r

The correlation matrix does not tell us if a linear relationship actually exists

IT IS IMPORTANT TO CHECK THE SCATTERPLOT
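A correlation matrix of the kind described above can be sketched without any statistics package. The three variables and their values below are hypothetical, and `pearson_r` is a helper name assumed for this example. The output shows 1.00 on the diagonal and identical entries above and below it.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r via the deviations formula: SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)


# Hypothetical data for seven cases, invented for illustration.
data = {
    "offender_educ": [10, 12, 12, 14, 16, 11, 13],
    "spouse_educ": [11, 12, 13, 13, 16, 10, 14],
    "offender_age": [45, 30, 38, 27, 33, 50, 29],
}

names = list(data)
# matrix[a][b] holds the correlation between variables a and b.
matrix = {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

for a in names:
    row = "  ".join(f"{matrix[a][b]:+.2f}" for b in names)
    print(f"{a:>14}  {row}")
```

Each entry summarizes only the straight-line component of a relationship, so a small coefficient in the matrix does not rule out a curvilinear pattern that a scatterplot would reveal.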

Learning Objectives

After this lecture, you should be able to complete the following Learning Outcomes

10.5 Calculate the partial correlation coefficient

It is also important to consider if a correlation holds up when controlling for additional variables

Again we can focus on scatterplots

A scatterplot displays all the information in the correlation coefficient

We can construct separate scatterplots for different subgroups of a sample to see if the correlation observed for the full sample holds when controlling for the subgroup or control variable

*

Imagine we examine the relationship between height and salary and find a correlation

Is there a 3rd variable that could explain that relationship? Perhaps sex?

If any correlation remains in either of the two sex-specific scatterplots, it is nowhere near as strong as the relationship between height and salary that we first saw

The partial correlation coefficient is the correlation between two variables after removing (partialing out) the common effects of a third variable

This too ranges from -1 to +1
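The subgroup idea can be illustrated with a small sketch. The heights (inches) and salaries (thousands of dollars) below are entirely hypothetical and were constructed so that the two groups differ on both variables; `pearson_r` is a helper name assumed here. The correlation for the combined sample is strong, while the correlation within each group is much weaker.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r via the deviations formula: SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)


# Hypothetical data, invented for illustration only.
women_height = [62, 63, 64, 65, 66, 67]
women_salary = [40, 42, 39, 41, 40, 42]
men_height = [68, 69, 70, 71, 72, 73]
men_salary = [50, 52, 49, 51, 50, 52]

# Full sample: ignore the control variable (sex).
r_all = pearson_r(women_height + men_height, women_salary + men_salary)

# Separate subgroups: hold the control variable constant.
r_women = pearson_r(women_height, women_salary)
r_men = pearson_r(men_height, men_salary)

print(f"full sample: r = {r_all:.2f}")
print(f"women only:  r = {r_women:.2f}")
print(f"men only:    r = {r_men:.2f}")
```

In this constructed example most of the full-sample correlation comes from the difference between the two groups, which is the pattern the separate scatterplots are meant to expose.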

Partial Correlation

The correlation between two variables, X and Y, after removing the common effects of a third variable, Z

When testing the significance of a partial correlation, a slightly different t formula is used

A consultant for a police department finds a -.44 correlation between performance on a physical fitness test (X) and salary (Y)

This might suggest that the department pays a lower salary to those in top shape

Perhaps a third variable (number of years on the force, Z) affects both X and Y
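A minimal sketch of how the third variable could be partialed out, using the standard first-order partial correlation formula. Only the -.44 correlation comes from the example above; the correlations of years on the force with fitness (-.68) and with salary (+.82), the sample size of 50, and the function names are assumptions introduced purely for illustration.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of X and Y after removing the common effects of Z."""
    return (r_xy - r_xz * r_yz) / (sqrt(1 - r_xz ** 2) * sqrt(1 - r_yz ** 2))


def t_ratio_for_partial_r(r_xy_z, n):
    """Return (t, df) for a first-order partial correlation; df = N - 3."""
    df = n - 3
    return r_xy_z * sqrt(df) / sqrt(1 - r_xy_z ** 2), df


r_xy = -0.44   # fitness and salary (from the example above)
r_xz = -0.68   # fitness and years on the force (assumed value)
r_yz = 0.82    # salary and years on the force (assumed value)

r_xy_z = partial_r(r_xy, r_xz, r_yz)
t, df = t_ratio_for_partial_r(r_xy_z, n=50)   # n = 50 is assumed

print(f"partial r = {r_xy_z:+.2f}")
print(f"t = {t:.2f} with df = {df}")
```

With these assumed values the sign of the relationship flips from negative to positive once years on the force is held constant, which shows how a control variable can change the interpretation of a simple correlation.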


The partial correlation coefficient is very useful for finding spurious relationships

The correlation between the rate of rape (per 100,000) in 1982 and the circulation of Playboy (per 100,000) in 1979 for 49 U.S. states (Alaska is an outlier on rape and is excluded) is r = +.40.

Because of this substantial correlation, many observers have asked:

If Playboy has this kind of effect on sex crimes, imagine what harm may be caused by truly hardcore pornography?


CHAPTER SUMMARY

10.1 Correlation allows researchers to determine the strength and direction of the relationship between two or more variables.

10.2 In a curvilinear correlation the relationship between two variables starts out positive and turns negative, or vice-versa.

10.3 The correlation coefficient numerically expresses the direction and strength of a linear relationship between two variables.

10.4 Pearson’s correlation coefficient can be calculated for two interval-level variables.

10.5 The partial correlation coefficient can be used to examine the relationship between two variables, after removing the common effect of a third variable.

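Deviations formula for Pearson’s r:

\[ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{SP}{\sqrt{SS_X \, SS_Y}} \]

Computational formula for Pearson’s r:

\[ r = \frac{\sum XY - N\bar{X}\bar{Y}}{\sqrt{\left(\sum X^2 - N\bar{X}^2\right)\left(\sum Y^2 - N\bar{Y}^2\right)}} \]

t ratio for testing the significance of r:

\[ t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}} \]

Partial correlation coefficient:

\[ r_{XY.Z} = \frac{r_{XY} - r_{XZ}\, r_{YZ}}{\sqrt{1 - r_{XZ}^2}\,\sqrt{1 - r_{YZ}^2}} \]

t ratio for testing the significance of a partial correlation:

\[ t = \frac{r_{XY.Z}\sqrt{N - 3}}{\sqrt{1 - r_{XY.Z}^2}} \]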