SACJ2
Measures of Variability & Dispersion
The Concept of Dispersion
Dispersion refers to the variety, diversity, or amount of variation among scores
The greater the dispersion of a variable, the greater the range of scores and the greater the differences between scores
Introduction
Mueller’s & Schuessler’s Index of Qualitative Variation
Range
Variance
Standard deviation
Measures of variability or dispersion– looking at the central tendency is not enough to get a full understanding of the data.
Nominal data: Mueller’s and Schuessler’s index of qualitative variation.
Range– the distance over which a particular proportion of scores is spread (like the interval range we already talked about).
Deviation score– the distance of a score from the mean of its distribution.
Standard deviation– the square root of the variance; important for decision making.
Index of qualitative variation
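$\text{IQV} = \frac{\text{sum of the observed products}}{\text{sum of the expected products}} \times 100$
Number of products $= \frac{k(k-1)}{2}$, where $k$ is the number of categories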
Mueller’s and Schuessler’s index of qualitative variation– the percentage of actual heterogeneity for a particular attribute according to the expected distribution or maximum heterogeneity of that attribute.
Multiplying by 100 turns the proportion into a percentage.
Heterogeneity– amount of diversity
Sum of products = the observed amount of heterogeneity
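Sum of the products of the expected frequencies = the maximum possible heterogeneity, found by multiplying every pair of expected frequencies (an equal number of cases in each category) and adding the results.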
| Distribution of 1,000 rape victims according to relationship with rapist | ||
| Relationship of rapist to victim | Observed rapes | Expected Rapes |
| Date | 200 | 200 |
| Close friend | 100 | 200 |
| Family acquaintance | 200 | 200 |
| Stranger | 350 | 200 |
| Relative | 150 | 200 |
| Totals | 1000 | 1000 |
IQV= 95.6
An IQV of 100% would mean that all categories have the same frequency (maximum heterogeneity).
200 in each category would have meant that there was an equal distribution.
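The IQV for the table above can be checked with a short script. This is a minimal sketch, assuming the "products" are the products of every distinct pair of category frequencies and that the expected distribution is an equal split of N across the categories; the function names `sum_of_products` and `iqv` are made up for illustration.

```python
from itertools import combinations


def sum_of_products(freqs):
    """Sum of the products of every distinct pair of category frequencies."""
    return sum(a * b for a, b in combinations(freqs, 2))


def iqv(observed):
    """Index of qualitative variation as a percentage (0 to 100)."""
    k = len(observed)                # number of categories
    n = sum(observed)                # total number of cases
    expected = [n / k] * k           # assumption: equal split = maximum heterogeneity
    return sum_of_products(observed) / sum_of_products(expected) * 100


# Observed frequencies from the table: date, close friend,
# family acquaintance, stranger, relative
observed = [200, 100, 200, 350, 150]
print(sum_of_products(observed))     # 382500
print(iqv(observed))                 # 95.625
```

The result, 95.625, matches the IQV of 95.6 reported on the slide once rounded to one decimal place.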
Range (R)
Range indicates the distance between the highest and lowest scores in a distribution
Range (R) = High Score – Low Score
Quick and easy indication of variability
Can be used with ordinal or interval-ratio variables
Why can’t the range be used with variables measured at the nominal level?
The range
20, 23, 25, 27, 28, 30, 35, 35, 35, 36, 39, 40, 42, 43, 44, 45, 45, 45, 46, 49
Range– distance over which 100 percent of the scores in a distribution are spread.
49-20=29
Locate Q1 and Q3
Q1: 0.25 x 20 = 5, so Q1 is the 5th score (28)
Q3: 0.75 x 20 = 15, so Q3 is the 15th score (44)
Interquartile Range: 44 - 28 = 16
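The range and interquartile range for these 20 scores can be reproduced as follows. This is a sketch of the positional method used in the example (the score at position proportion x N), not a general quantile routine; the helper name `score_at_proportion` is hypothetical. Library quantile functions usually interpolate between scores and may give slightly different quartiles.

```python
scores = [20, 23, 25, 27, 28, 30, 35, 35, 35, 36,
          39, 40, 42, 43, 44, 45, 45, 45, 46, 49]


def score_at_proportion(sorted_scores, proportion):
    """Return the score at position proportion * N, counting from 1.

    Assumes proportion * N is a whole number, as it is in this example.
    """
    position = round(proportion * len(sorted_scores))
    return sorted_scores[position - 1]


ordered = sorted(scores)
value_range = ordered[-1] - ordered[0]     # high score minus low score
q1 = score_at_proportion(ordered, 0.25)    # 5th score
q3 = score_at_proportion(ordered, 0.75)    # 15th score
iqr = q3 - q1

print(value_range, q1, q3, iqr)            # 29 28 44 16
```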
Interquartile Range (Q)
A type of range measure
Considers only the middle 50% of the cases in a distribution
Avoids some of the problems of the range by focusing on just the middle 50% of scores
Limitation: Because the Interquartile Range is based on only two scores, it fails to yield any information from all of the other scores
| Satisfaction Score | ||
| Interval | f | cf |
| 175-179 | 4 | 111 |
| 170-174 | 6 | 107 |
| 165-169 | 3 | 101 |
| 160-164 | 13 | 98 |
| 155-159 | 8 | 85 |
| 150-154 | 7 | 77 |
| 145-149 | 10 | 70 |
| 140-144 | 9 | 60 |
| 135-139 | 10 | 51 |
| 130-134 | 15 | 41 |
| 125-129 | 11 | 26 |
| 120-124 | 10 | 15 |
| 115-119 | 5 | 5 |
| N=111 |
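$\text{Mdn} = L + \left(\frac{f_n}{f_f}\right)(i)$, where $L$ is the lower real limit of the interval containing the median, $f_n$ is the number of frequencies still needed to reach the median position, $f_f$ is the frequency found in that interval, and $i$ is the interval width
111 x .5 = 55.5
A cumulative frequency of 51 is as close as we can get to 55.5 without going over, so we need 4.5 of the 9 cases in the 140-144 interval
Mdn = 139.5 + (4.5/9) X 5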
=139.5+22.5/9
=139.5 + 2.5
=142
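The interpolation above can be sketched in code. This is a minimal illustration, assuming whole-number scores so that each lower real limit sits 0.5 below the stated lower limit; the function name `grouped_median` is made up for this example.

```python
# (lower stated limit, frequency) for each interval, lowest interval first
intervals = [(115, 5), (120, 10), (125, 11), (130, 15), (135, 10),
             (140, 9), (145, 10), (150, 7), (155, 8), (160, 13),
             (165, 3), (170, 6), (175, 4)]


def grouped_median(intervals, width):
    """Median of grouped data by linear interpolation within an interval."""
    n = sum(f for _, f in intervals)
    target = n * 0.5                       # position of the median case
    cumulative = 0                         # cases below the current interval
    for lower, f in intervals:
        if cumulative + f >= target:
            lower_real_limit = lower - 0.5     # assumes whole-number scores
            needed = target - cumulative       # frequencies still needed
            return lower_real_limit + (needed / f) * width
        cumulative += f
    raise ValueError("no frequencies supplied")


print(grouped_median(intervals, width=5))  # 142.0
```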
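Range for grouped data: 179.5 - 114.5 = 65 (upper real limit of the highest interval minus lower real limit of the lowest interval).
The range is a very unstable measure because it is very sensitive to deviant scores– a poor choice if you have outliers.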
Interquartile Range (Q) for grouped
Q1 position: 111 X .25 = 27.75
Q3 position: 111 X .75 = 83.25
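A possible worked version of the grouped interquartile range, assuming the same interpolation used for the grouped median is applied at the 25% and 75% positions of the satisfaction score table (the slide itself does not show these steps):

```latex
\begin{align*}
Q_1 \text{ position} &= 111 \times 0.25 = 27.75 \\
Q_1 &= 129.5 + \frac{27.75 - 26}{15} \times 5 \approx 130.08 \\
Q_3 \text{ position} &= 111 \times 0.75 = 83.25 \\
Q_3 &= 154.5 + \frac{83.25 - 77}{8} \times 5 \approx 158.41 \\
Q &= Q_3 - Q_1 \approx 28.32
\end{align*}
```

Here 26 and 77 are the cumulative frequencies just below the intervals 130-134 and 155-159, and 15 and 8 are the frequencies inside those intervals.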
Range (R): Limitations
Range is based on only two scores:
Distorted by atypically high or low scores
No information about variation between high and low scores
The average deviation
$AD = \frac{\sum |x - \bar{x}|}{N}$
| x | x-x̅ |
| 23 | -6 |
| 30 | 1 |
| 31 | 2 |
| 15 | -14 |
| 46 | 17 |
The AD is the average variation of scores from the mean of their distribution.
$x - \bar{x}$ = deviation score
$\sum |x - \bar{x}|$ = the sum of the absolute deviation scores
N = sample size.
Subtract the mean from each score to get $x - \bar{x}$.
Taking the absolute of each deviation turns the deviations into positive numbers.
Way to check yourself– if the mean was calculated correctly, the sum of all the deviation scores will always equal 0.
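The average deviation for the five scores in the table can be checked with a few lines of code. This is a minimal sketch; the variable names are chosen for illustration only.

```python
scores = [23, 30, 31, 15, 46]
n = len(scores)

mean = sum(scores) / n                        # 29.0
deviations = [x - mean for x in scores]       # -6, 1, 2, -14, 17

# Self-check from the notes: deviations from the mean always sum to 0
check = sum(deviations)

# Absolute values turn every deviation into a positive number
average_deviation = sum(abs(d) for d in deviations) / n

print(mean, check, average_deviation)         # 29.0 0.0 8.0
```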
Standard Deviation: Calculations
To solve:
Subtract mean from each score
Square the deviations
Sum the squared deviations
Divide the sum of the squared deviations by N
Find the square root of the result
Ungrouped Data: Variance & Standard deviation
| Case | X | X − X̅ | (X − X̅)² |
| 1 | 20 | -5 | 25 |
| 2 | 21 | -4 | 16 |
| 3 | 22 | -3 | 9 |
| 4 | 23 | -2 | 4 |
| 5 | 24 | -1 | 1 |
| 6 | 25 | 0 | 0 |
| 7 | 26 | 1 | 1 |
| 8 | 27 | 2 | 4 |
| 9 | 28 | 3 | 9 |
| 10 | 29 | 4 | 16 |
| 11 | 30 | 5 | 25 |
| N=11 | | | Σ(X − X̅)² = 110 |
| S² = 110/11 = 10 | S = √10 ≈ 3.16 |
Variance– the sum of the squared deviation scores divided by N: $S^2 = \frac{\sum (X - \bar{X})^2}{N}$
$\sum (X - \bar{X})^2$ = the sum of the squared deviation scores
N= sample size.
Standard deviation is the square root of the variance
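The five calculation steps can be traced on the eleven scores in the ungrouped table. This is a sketch that divides by N, as the notes do; Python's `statistics.pvariance` and `statistics.pstdev` also divide by N, while `statistics.variance` and `statistics.stdev` divide by N - 1.

```python
import math

scores = list(range(20, 31))                   # 20, 21, ..., 30
n = len(scores)                                # 11
mean = sum(scores) / n                         # 25.0

deviations = [x - mean for x in scores]        # step 1: subtract the mean
squared = [d ** 2 for d in deviations]         # step 2: square the deviations
sum_of_squares = sum(squared)                  # step 3: sum them (110.0)
variance = sum_of_squares / n                  # step 4: divide by N (10.0)
standard_deviation = math.sqrt(variance)       # step 5: square root (about 3.16)

print(sum_of_squares, variance, round(standard_deviation, 2))
```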
Grouped Data: variance & standard deviation
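$S^2 = \frac{\sum f(m - \bar{X})^2}{N}$ and $S = \sqrt{S^2}$, where $m$ is the midpoint of each interval and $f$ is its frequency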
| Distribution of Scores | |
| Interval | f |
| 652-653 | 4 |
| 650-651 | 5 |
| 648-649 | 6 |
| 646-647 | 7 |
| 644-645 | 9 |
| 642-643 | 13 |
| 640-641 | 15 |
| 638-639 | 13 |
| 636-637 | 10 |
| 634-635 | 8 |
| 632-633 | 6 |
| 630-631 | 4 |
| N= 100 |
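One way to get the variance and standard deviation for this distribution is the midpoint method: treat every case in an interval as if it scored at the interval midpoint, then apply the same divide-by-N formula used for ungrouped data. This is a sketch under that assumption, since the slide does not show its own working; the function name `grouped_variance` is hypothetical.

```python
import math

# (lower stated limit, frequency) for each interval, lowest interval first
intervals = [(630, 4), (632, 6), (634, 8), (636, 10), (638, 13), (640, 15),
             (642, 13), (644, 9), (646, 7), (648, 6), (650, 5), (652, 4)]


def grouped_variance(intervals, width):
    """Return (mean, variance) of grouped data using interval midpoints."""
    n = sum(f for _, f in intervals)
    # Midpoint of 630-631 is 630.5: stated lower limit + (width - 1) / 2
    midpoints = [lower + (width - 1) / 2 for lower, _ in intervals]
    freqs = [f for _, f in intervals]
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    variance = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints)) / n
    return mean, variance


mean, variance = grouped_variance(intervals, width=2)
standard_deviation = math.sqrt(variance)
print(round(mean, 2), round(variance, 2), round(standard_deviation, 2))
```

Under these assumptions the mean comes out near 640.98, the variance near 31.69, and the standard deviation near 5.63.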