Info project 4
INFO 1010
MULTI VARIABLE
MEASURES OF VARIABILITY
Measures of Association Between Two Variables
Thus far we have examined numerical methods used
to summarize the data for one variable at a time.
Often a manager or decision maker is interested in
the relationship between two variables.
Two descriptive measures of the relationship
between two variables are the covariance and the
correlation coefficient.
Covariance
The covariance is a measure of the linear association
between two variables.
Positive values indicate a positive relationship.
Negative values indicate a negative relationship.
Covariance
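The covariance is computed as follows:
For samples:
$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
For populations:
$$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$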
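The two covariance formulas differ only in the means used and the divisor (n - 1 for a sample, N for a population). A minimal Python sketch, assuming the six golfing-study observations from the worksheet later in this deck; the function names are illustrative and not part of the original slides:

```python
def sample_covariance(x, y):
    """Sample covariance: sum of deviation products divided by n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def population_covariance(x, y):
    """Population covariance: same deviation products, divided by N."""
    N = len(x)
    mu_x = sum(x) / N
    mu_y = sum(y) / N
    return sum((xi - mu_x) * (yi - mu_y) for xi, yi in zip(x, y)) / N


# Golfing-study data from the worksheet (average drive, 18-hole score)
drive = [277.6, 259.5, 269.1, 267.0, 255.6, 272.9]
score = [69, 71, 70, 70, 71, 69]

print(round(sample_covariance(drive, score), 2))      # -7.08
print(round(population_covariance(drive, score), 2))  # -5.9
```

The negative sign says that longer average drives tend to go with lower scores. The population value is shown only to contrast the divisors; the six golfers are treated as a sample in the example.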
Correlation Coefficient
A high correlation between two variables does not
mean that one variable is the cause of the other.
Correlation is a measure of linear association and not
necessarily causation.
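Correlation Coefficient
The correlation coefficient is computed as follows:
For samples:
$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$
For populations:
$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$
Here $s_x$ and $s_y$ are the sample standard deviations of the two variables, and $\sigma_x$ and $\sigma_y$ are the population standard deviations.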
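The correlation coefficient divides the covariance by the product of the two standard deviations, which removes the units of measurement. A minimal Python sketch under the same assumptions as the golfing-study data in this deck; the function name and its `sample` flag are illustrative, not from the slides:

```python
from math import sqrt


def correlation(x, y, sample=True):
    """Pearson correlation: covariance / (std dev of x * std dev of y).

    sample=True divides by n - 1 (sample formula);
    sample=False divides by N (population formula).
    """
    n = len(x)
    divisor = n - 1 if sample else n
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / divisor
    sd_x = sqrt(sum((xi - x_bar) ** 2 for xi in x) / divisor)
    sd_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / divisor)
    return cov_xy / (sd_x * sd_y)


# Golfing-study data from the worksheet (average drive, 18-hole score)
drive = [277.6, 259.5, 269.1, 267.0, 255.6, 272.9]
score = [69, 71, 70, 70, 71, 69]

print(round(correlation(drive, score), 4))  # -0.9631
```

For the same data, the sample and population formulas give the same coefficient, because the divisor cancels between the covariance and the two standard deviations.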
Correlation Coefficient
The coefficient can take on values between -1 and +1.
Values near +1 indicate a strong positive linear
relationship.
Values near -1 indicate a strong negative linear
relationship.
The closer the correlation is to zero, the weaker the
linear relationship.
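A correlation near zero signals a weak linear relationship, not necessarily the absence of any relationship. The sketch below illustrates this with made-up data (hypothetical values chosen for illustration, not from the slides): two exactly linear series and one exactly quadratic series.

```python
from math import sqrt


def sample_correlation(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_x = sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
    s_y = sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))
    return s_xy / (s_x * s_y)


# Hypothetical illustration data (not from the slides)
x = [-2, -1, 0, 1, 2]
rising = [2 * xi + 1 for xi in x]   # exact increasing line
falling = [-3 * xi for xi in x]     # exact decreasing line
curved = [xi ** 2 for xi in x]      # exact relationship, but not linear

r_rising = sample_correlation(x, rising)    # close to +1
r_falling = sample_correlation(x, falling)  # close to -1
r_curved = sample_correlation(x, curved)    # close to 0

print(round(r_rising, 4), round(r_falling, 4), round(r_curved, 4))
```

The quadratic series is completely determined by x, yet its coefficient is essentially zero, because the positive and negative deviation products cancel.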
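Covariance and Correlation Coefficient
Example: Golfing Study
Sample Covariance
$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{-35.40}{6 - 1} = -7.08$$
Sample Correlation Coefficient
$$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-7.08}{(8.2192)(0.8944)} = -0.9631$$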
Using Excel to Compute the Covariance and Correlation Coefficient
Excel Formula Worksheet
     A               B               C                   D
1    Average Drive   18-Hole Score
2    277.6           69              Samp. Covariance    =COVARIANCE.S(A2:A7,B2:B7)
3    259.5           71              Samp. Correlation   =CORREL(A2:A7,B2:B7)
4    269.1           70
5    267.0           70
6    255.6           71
7    272.9           69
8
Using Excel to Compute the Covariance and Correlation Coefficient
Excel Value Worksheet
     A               B               C                   D
1    Average Drive   18-Hole Score
2    277.6           69              Samp. Covariance    -7.08
3    259.5           71              Samp. Correlation   -0.9631
4    269.1           70
5    267.0           70
6    255.6           71
7    272.9           69
8