Data analysis and reporting


©2013 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.


Bivariate Statistical Analysis: Measures of Association

Week 12


LEARNING OUTCOMES

Apply and interpret simple bivariate correlations

Interpret a correlation matrix

Understand simple (bivariate) regression

Understand the least-squares estimation technique

Interpret regression output including the tests of hypotheses tied to specific parameter coefficients


Bringing Your Work to Your Home (and Bringing Your Home to Work)


  • Work-family conflict (WFC).

Conflict that results when the demands and responsibilities of one role “spill over” into the other role.

  • Researchers have examined many work and family characteristics (independent variables) that can predict WFC (a dependent variable).


The Basics

  • Measures of Association

Refers to a number of bivariate statistical techniques used to measure the strength of a relationship between two variables.

The chi-square (2) test provides information about whether two or more less-than interval variables are interrelated.

Correlation analysis is most appropriate for interval or ratio variables.

Regression can accommodate either less-than interval or interval independent variables, but the dependent variable must be continuous.


EXHIBIT 23.1 Bivariate Analysis—Common Procedures for Testing Association


Simple Correlation Coefficient

  • Correlation coefficient

A statistical measure of the covariation, or association, between two at-least interval variables.

  • Covariance

Extent to which two variables are associated systematically with each other.

  • Correlation coefficient (r)

Ranges from +1 to -1

Perfect positive linear relationship = +1

Perfect negative (inverse) linear relationship = -1

No correlation = 0

  • Correlation coefficient for two variables (X,Y)
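The correlation coefficient for two variables is commonly computed as the sum of cross-products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The following is a minimal Python sketch of that calculation; the function name `pearson_r` and the data (weekly hours worked and a work-family conflict score) are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of cross-products of deviations (the covariation).
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical data, invented for this example.
hours = [35, 40, 45, 50, 55, 60]
wfc = [2.0, 2.5, 2.4, 3.5, 3.9, 4.4]

r = pearson_r(hours, wfc)
print(round(r, 3))
```

Because the formula is symmetric in X and Y, swapping the two arguments gives the same r. A variable with no variation makes the denominator zero, so the sketch would raise a division error in that case.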


EXHIBIT 23.2 Scatter Diagram to Illustrate Correlation Patterns


Correlation, Covariance, and Causation

  • When two variables covary, they display concomitant variation.
  • This systematic covariation does not in and of itself establish causality.
  • e.g., Rooster’s crow and the rising of the sun

Rooster does not cause the sun to rise.


Coefficient of Determination

  • Coefficient of Determination (R²)

A measure obtained by squaring the correlation coefficient; the proportion of the total variance of a variable accounted for by knowing the value of another variable.

Measures that part of the total variance of Y that is accounted for by knowing the value of X.
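As a small illustration of squaring the correlation coefficient, the sketch below computes r for two short made-up series and then squares it. The function name and the data are assumptions introduced for the example.

```python
import math

def coefficient_of_determination(xs, ys):
    """R squared for two variables: the square of their Pearson correlation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    r = cross / math.sqrt(ss_x * ss_y)
    return r ** 2

# Hypothetical data: r works out to 0.8 here, so R squared is 0.64.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(round(coefficient_of_determination(x, y), 2))  # 0.64
```

Read this as: knowing X accounts for about 64 percent of the variance in Y in this toy data set. Squaring removes the sign, so R² alone does not show whether the relationship is positive or negative.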


Correlation Matrix

  • Correlation matrix

The standard form for reporting correlation coefficients for more than two variables.

  • Statistical Significance

The procedure for determining statistical significance is the t-test of the significance of a correlation coefficient.
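A correlation matrix and the t statistic for a single coefficient can be sketched as follows. The variable names and data are hypothetical, and the t formula used is the standard one for testing a correlation against zero with n - 2 degrees of freedom: t = r·sqrt((n - 2) / (1 - r²)).

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations for a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

def t_for_r(r, n):
    """t statistic for H0: no correlation (undefined when r is exactly +1 or -1)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical data for six salespeople, invented for this example.
data = {
    "calls": [10, 12, 15, 18, 20, 25],
    "sales": [22, 25, 31, 33, 40, 47],
    "experience": [1, 3, 2, 5, 4, 6],
}
matrix = correlation_matrix(data)
for row in data:
    print(row, [round(matrix[row][col], 2) for col in data])

n = len(data["calls"])
print(round(t_for_r(matrix["calls"]["experience"], n), 2))
```

The diagonal of the matrix is 1 (each variable correlates perfectly with itself) and the matrix is symmetric, which is why published matrices usually show only the lower triangle. The computed t would then be compared with a tabled critical value for n - 2 degrees of freedom.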


EXHIBIT 23.4 Pearson Product-Moment Correlation Matrix for Salesperson Example


What Makes Attractiveness?


  • What are the things that make someone attractive?
  • Many factors are correlated:

Fit

Attractiveness

Weight

Age

Manner of dress (how modern)

Personality (warm versus cold)

  • Results reveal:

Model seems to “fit” the store concept -> attractive.

Overweight -> less attractive

Age -> unrelated to fit or attractiveness

Modernness and perceived coldness -> less attractive

  • Can help a retailer determine what employees should look like.


Regression Analysis

  • Simple (Bivariate) Linear Regression

A measure of linear association that investigates straight-line relationships between a continuous dependent variable and an independent variable that is usually continuous, but can be a categorical dummy variable.

  • The Regression Equation (Y = α + βX)

Y = the continuous dependent variable

X = the independent variable

α = the Y intercept (the point where the regression line crosses the Y axis)

β = the slope coefficient (rise over run)
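To make the roles of the intercept and slope concrete, here is a minimal sketch that evaluates the line Y = α + βX. The intercept of 10 and slope of 2.5 are hypothetical values, not estimates from any data in these slides.

```python
def predict(alpha, beta, x):
    """Predicted Y on the straight line Y = alpha + beta * X."""
    return alpha + beta * x

# Hypothetical parameter values chosen for illustration.
alpha, beta = 10.0, 2.5

for x in (0, 1, 4):
    print(x, predict(alpha, beta, x))
# 0 10.0
# 1 12.5
# 4 20.0
```

At X = 0 the prediction equals the intercept, and each one-unit increase in X moves the prediction by the slope. When X is a categorical dummy variable coded 0 or 1, the same arithmetic means α is the predicted value for the group coded 0 and α + β is the predicted value for the group coded 1.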


The Regression Equation

  • Parameter Estimate Choices

β is indicative of the strength and direction of the relationship between the independent and dependent variable.

α (Y intercept) is a fixed point that is considered a constant (the value of Y when X equals zero).

  • Standardized Regression Coefficient (β)

Estimated coefficient of the strength of relationship between the independent and dependent variables.

Expressed on a standardized scale where higher absolute values indicate stronger relationships (range is from -1 to 1).


The Regression Equation (cont’d)

  • Parameter Estimate Choices

Raw regression estimates (b₁)

Raw regression weights have the advantage of retaining the scale metric—which is also their key disadvantage.

If the purpose of the regression analysis is forecasting, then raw parameter estimates must be used.

In other words, raw estimates are the appropriate choice when the researcher is interested only in prediction.

Standardized regression estimates (β)

Standardized regression estimates have the advantage of a constant scale.

Standardized regression estimates should be used when the researcher is testing explanatory hypotheses.
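The difference between raw and standardized estimates can be sketched as follows. The example assumes the usual conversion for simple regression, in which the standardized coefficient equals the raw slope multiplied by the ratio of the standard deviation of X to the standard deviation of Y. All names and data are hypothetical.

```python
import math

def raw_slope(xs, ys):
    """Raw (unstandardized) OLS slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def std_dev(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

def standardized_slope(xs, ys):
    """Standardized coefficient: raw slope rescaled by sd(X) / sd(Y)."""
    return raw_slope(xs, ys) * std_dev(xs) / std_dev(ys)

# Hypothetical data: the same income variable in dollars and in thousands.
income_dollars = [30000, 42000, 55000, 61000, 78000]
income_thousands = [v / 1000 for v in income_dollars]
spending = [1.2, 1.9, 2.1, 2.8, 3.3]

print(raw_slope(income_dollars, spending), raw_slope(income_thousands, spending))
print(standardized_slope(income_dollars, spending),
      standardized_slope(income_thousands, spending))
```

Changing the unit of X changes the raw slope by a factor of 1,000, while the standardized coefficient stays the same. That is the sense in which raw weights retain the scale metric and standardized weights have a constant scale.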


EXHIBIT 23.7 The Best-Fit Line or Knocking Out the Pins


Ordinary Least-Squares (OLS) Method of Regression Analysis

  • OLS

Guarantees that the resulting straight line will produce the least possible total squared error in using X to predict Y.

Generates a straight line that minimizes the sum of squared deviations of the actual values from this predicted regression line.

No straight line can completely represent every dot in the scatter diagram.

There will be a discrepancy between most of the actual scores (each dot) and the predicted score (Ŷ).

Uses the criterion of attempting to make the least amount of total error in prediction of Y from X.


The equation means that the value of Y for any value of X (Xi) is determined as the estimated intercept coefficient, plus the estimated slope coefficient multiplied by Xi, plus some error. The predicted value is this sum without the error term.
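The least-squares idea can be sketched in a few lines of Python. This uses the standard closed-form estimates for simple regression (slope = sum of cross-products divided by the sum of squared X deviations, intercept = mean of Y minus slope times mean of X); the function names and the five data points are hypothetical.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line for ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared differences between actual and predicted Y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for this example.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = ols_fit(xs, ys)
best = sum_squared_errors(xs, ys, intercept, slope)
print(round(intercept, 3), round(slope, 3), round(best, 3))  # 2.2 0.6 2.4

# Any other line, such as one with a slightly steeper slope, has a larger error.
print(sum_squared_errors(xs, ys, intercept, slope + 0.1) > best)  # True
```

The fitted line does not pass through every point, which is why the squared errors sum to 2.4 rather than zero. Nudging either coefficient away from the OLS estimates increases that sum.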


  • Statistical Significance Of Regression Model
  • F-test (regression)

Determines whether more variability is explained by the regression or unexplained by the regression.
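One way to see what the F-test compares is to split the total variation in Y into an explained and an unexplained part and take the ratio of their mean squares. The sketch below assumes simple regression, so the regression has 1 degree of freedom and the error has n - 2. The function name and data are hypothetical.

```python
def f_statistic(xs, ys):
    """F ratio (mean square regression / mean square error) for simple regression."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Total variation in Y and the part left unexplained by the line.
    sst = sum((y - mean_y) ** 2 for y in ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ssr = sst - sse  # variation explained by the regression

    msr = ssr / 1        # one independent variable
    mse = sse / (n - 2)  # error degrees of freedom
    return msr / mse

# Hypothetical data, invented for this example.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(f_statistic(xs, ys), 2))  # 4.5
```

A larger F means more variability is explained relative to what is left unexplained. Whether a given F is significant depends on a comparison with the critical value of the F distribution for 1 and n - 2 degrees of freedom, which this sketch does not compute.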


  • R²

The proportion of variance in Y that is explained by X (or vice versa)

A measure obtained by squaring the correlation coefficient; that proportion of the total variance of a variable that is accounted for by knowing the value of another variable.
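As a quick arithmetic check, R² can also be computed as explained variance divided by total variance. The sketch below uses the figures from the slide deck's worked calculation (3,398.49 explained out of 3,882.40 total); the function name is an assumption made for the example.

```python
def r_squared(explained, total):
    """Proportion of total variance that is explained."""
    if total <= 0:
        raise ValueError("total variance must be positive")
    return explained / total

r2 = r_squared(3398.49, 3882.40)
print(round(r2, 3))      # 0.875
print(round(1 - r2, 3))  # 0.125, the share left unexplained
```

An R² of 0.875 means that about 87.5 percent of the variance in Y is accounted for by knowing X.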


EXHIBIT 23.8 Simple Regression Results for Building Permit Example


EXHIBIT 23.9 OLS Regression Line


Size and Weight


  • The fight to get thin is a multibillion dollar business.
  • H1: Perceptions that a female model is overweight are related negatively to perceptions of attractiveness.
  • Can be tested with simple regression.
  • The results support the hypothesis.


Simple Regression and Hypothesis Testing

  • The explanatory power of regression lies in hypothesis testing. Regression is often used to test relational hypotheses.

The outcome of the hypothesis test involves two conditions that must both be satisfied:

The regression weight must be in the hypothesized direction. Positive relationships require a positive coefficient and negative relationships require a negative coefficient.

The t-test associated with the regression weight must be significant.
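The two conditions can be checked mechanically, as in the sketch below. It assumes the usual t statistic for a simple-regression slope (the slope divided by its standard error, with n - 2 degrees of freedom). The ratings are hypothetical, and the critical value of 2.306 is the two-tailed 0.05 value for 8 degrees of freedom, passed in as a parameter rather than computed.

```python
import math

def slope_and_t(xs, ys):
    """Return (slope, t) for the OLS slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope: sqrt(mean squared error / Sxx).
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope, slope / se_slope

def hypothesis_supported(slope, t, expected_sign, critical_t):
    """Both conditions: hypothesized direction and a significant t."""
    right_direction = slope * expected_sign > 0
    significant = abs(t) > critical_t
    return right_direction and significant

# Hypothetical ratings for ten models, invented for this example.
overweight = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
attractiveness = [9, 8, 8, 7, 6, 6, 4, 4, 3, 2]

slope, t = slope_and_t(overweight, attractiveness)
print(round(slope, 3), round(t, 2))
# H1 predicts a negative relationship, so expected_sign is -1.
print(hypothesis_supported(slope, t, expected_sign=-1, critical_t=2.306))  # True
```

A significant coefficient with the wrong sign would fail the first condition, so the hypothesis would not be supported even though the t-test is significant.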


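Correlation coefficient for two variables (X, Y):

$$r_{xy} = r_{yx} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$

Coefficient of determination:

$$R^2 = \frac{\text{Explained variance}}{\text{Total variance}}$$

$$R^2 = \frac{3{,}398.49}{3{,}882.40} = 0.875$$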