Creative Exercises
Analyze quantitative data
Part I : Intro & Levels of Numerical Information
Why statistics matter
What can we do with quantitative data?
Statistics is everywhere
© 2008 Carol Cutler Riddick & Ruth V. Russell. All rights reserved.
Contents
Part 1
Levels of numerical information
Part 2
Descriptive statistics
Correlational statistics
Data Collected in Research Falls Into Two Categories
Numerical data (#, %, etc.): quantitative analysis
Nonnumerical data (e.g., adjectives, ideas, observations): qualitative analysis
Numerical data
Numeric variables have values that describe a measurable quantity as a number, like 'how many' or 'how much'.
Examples:
Number of participants in an adult fitness program
Percentage of ski trail users who prefer longer operating hours
Percentage of customers who are satisfied with their purchase
Statistics
Procedures used to describe, synthesize, analyze, and interpret numerical data
Three kinds of statistics …
Descriptive
Correlational
Inferential
Levels of Numerical Measurement
Continuous: Observations can take any value within a range of real numbers.
Examples: height, income, and age.
Discrete: Observations can take a value based on a count from a set of distinct whole values.
A discrete variable cannot take the value of a fraction between one value and the next closest value.
Examples: number of registered cars, number of business locations, and number of children in a family
Is it continuous or discrete?
Number of children
Income
Course credit
Distance
Addiction
Depression
Cultural awareness
Alcohol consumption
Number of national park visits
Population
House value
Is it continuous or discrete?
Number of children Discrete
Income Continuous
Course credit Discrete
Distance Continuous
Addiction Continuous
Depression Continuous
Cultural awareness Continuous
Alcohol consumption Continuous
Number of national park visits Discrete
Population Discrete
House value Continuous
Levels of Categorical Measurement
Categorical: values that describe a 'quality' or 'characteristic' of a data unit, like 'what type' or 'which category'.
Categorical variables fall into mutually exclusive (in one category or in another) and exhaustive (include all possible options) categories.
Nominal: Observations take values that cannot be organized in a logical sequence.
Examples: sex, business type, eye color, religion
Ordinal: Observations can take a value that can be logically ordered or ranked.
Examples: academic grades (e.g., A, B, C), clothing size (e.g., small, medium, large, extra large), and attitudes (e.g., strongly agree, agree, disagree, strongly disagree).
Quantitative Data Analysis Tools
Up to now, focus has been on univariate statistics
Focus on one variable at a time
Use descriptive statistics
Now, turn attention to bivariate statistics
Examine two variables at a time
Can be analyzed with correlational statistics or inferential statistics
Analyze quantitative data
Part II : Descriptive Statistics
Descriptive Statistics
Summarize 1 variable at a time
Descriptive statistics can use …
Frequency distributions
Relative comparisons
Measures of central tendency
Measures of variability
1. Frequency Distributions
… used to describe how data are distributed
… used to arrange the values of a variable alongside the number of responses for each value
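A frequency distribution can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `frequency_distribution` is hypothetical, and the four ratings (5, 7, 8, 8) are used only as sample input.

```python
from collections import Counter


def frequency_distribution(values):
    """Return (value, count, percent) rows, sorted by value."""
    counts = Counter(values)  # how many times each value occurs
    total = len(values)
    return [(value, count, 100 * count / total)
            for value, count in sorted(counts.items())]


ratings = [5, 7, 8, 8]  # illustrative sample input
for value, count, percent in frequency_distribution(ratings):
    print(f"{value}: {count} ({percent:.0f}%)")
```

Each row pairs a value with its count and its share of all responses, which is the basic layout of a frequency table.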
2. Relative Comparisons
Describe how data for a variable compare with each other
Ways relative comparisons can be stated …
Rate
Ratio
Proportion
Percentage
Relative Comparisons: Rate
Frequency of occurrence of a particular outcome
Steps:
Divide actual # occurrences by # possible occurrences
Multiply answer by a base so it is easier to understand
Rate Example
Calculate student-athlete injury rate
Steps:
1: # student-athletes = 97
2: # student-athletes injured = 64
3: Calculate # injured relative to # of all athletes: 64 ÷ 97 = 0.66
4: Multiply by a base of 10: 0.66 × 10 = 6.6
Interpretation: The injury rate is 6.6 of every 10 student-athletes
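The rate steps can be sketched in Python as follows. The function name `rate` and its `base` parameter are assumptions made for illustration; the numbers come from the injury example.

```python
def rate(occurrences, possible, base=10):
    """Occurrences per `base` possible occurrences."""
    if possible <= 0:
        raise ValueError("possible must be a positive number")
    return occurrences / possible * base


injury_rate = rate(64, 97, base=10)
print(f"{injury_rate:.1f} injured per 10 student-athletes")  # 6.6
```

Changing `base` to 100 or 1,000 gives the same rate on a different scale, which can be easier to read when occurrences are rare.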
Relative Comparisons: Ratio
Comparison of the frequency of one response with another
Step: Compare two responses to each other (A to B)
Example: # students who missed class because they could not wake up (N = 23) versus # who missed class because they were sick (N = 10), so 23 ÷ 10 = 2.3
Interpretation: For every 1 student who missed a class because of sickness, 2.3 missed because of sleeping in.
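A short Python sketch of the ratio calculation, using the counts from the example. The function name `ratio` is hypothetical.

```python
def ratio(count_a, count_b):
    """How many A responses there are for every one B response."""
    if count_b == 0:
        raise ValueError("count_b must not be zero")
    return count_a / count_b


overslept, sick = 23, 10
print(f"{ratio(overslept, sick):.1f} to 1")  # 2.3 to 1
```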
Relative Comparisons: Percentage
Proportion = a ratio to the total
Percentage = proportion multiplied by 100
Steps:
1: Compare # actual occurrences to total # possibilities
2: Convert answer from Step 1 into percentage by multiplying answer by 100
Percentage Example
Step 1: 41 kids fell from playground equipment, out of 68 total kids hurt on the playground (falling from equipment, tripping when running, etc.), so 41 ÷ 68 = 0.60
Step 2: 0.60 × 100 = 60%
Interpretation: 60% of all playground injuries were due to falling off playground equipment
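The proportion and percentage steps can be sketched in Python like this. The function names are hypothetical; the counts are those of the playground example. The code keeps the unrounded proportion and rounds only when printing, so it shows 60% just as the hand calculation does.

```python
def proportion(part, total):
    """Share of the total, between 0 and 1."""
    if total <= 0:
        raise ValueError("total must be a positive number")
    return part / total


def percentage(part, total):
    """Proportion multiplied by 100."""
    return proportion(part, total) * 100


print(f"{proportion(41, 68):.2f}")   # 0.60
print(f"{percentage(41, 68):.0f}%")  # 60%
```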
3. Measures of Central Tendency
… single number that describes a frequency distribution’s “center”
… # that represents the “average” or typical score
Mean
Median
Mode
Measures of Central Tendency: Mean
Arithmetic average
Steps:
1: Sum (add together) all the scores (X)
2: Divide sum by number of scores (N)
Using Excel
=SUM
=AVERAGE
Mean Example
Calculate the mean # miles jogged in a week by four people
Steps:
1: # miles jogged in week by four people = 3 + 7 + 4 + 6 = 20 miles
2: 20 ÷ 4 = 5
Interpretation: The mean # of miles four people jogged in a week was 5 miles.
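The two mean steps translate directly into Python. This is a minimal sketch with a hypothetical function name; Python's standard library also offers `statistics.mean` for the same purpose.

```python
def mean(scores):
    """Arithmetic average: sum of the scores divided by how many there are."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)


miles = [3, 7, 4, 6]  # miles jogged in a week by four people
print(mean(miles))    # 5.0
```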
Measures of Central Tendency: Median
Middle score of ordered distribution
Score divides a set of scores into two equal halves
Steps:
1: Arrange scores from lowest to highest, or vice versa
2: Find middle value (easy to do with odd number of scores); otherwise calculate mean for two middle scores
Using Excel
=MEDIAN
Median Example 1
Calculate median performance scores (ratings can be from 0 to 25) of three employees
Rating results are 5, 7, and 8
Median = 7 (since three scores and arranged from low to high; therefore, 7 is the middle score)
Interpretation: The median performance score for three rated employees is 7
Median Example 2
Calculate the median performance scores
Rating results of four employees are 5, 7, 8, 8
Since there is an even number of scores, the median is the mean of the two middle scores, 7 and 8: (7 + 8) ÷ 2 = 7.5
Interpretation: The median performance score for four rated employees is 7.5
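Both median examples can be reproduced with a short Python sketch. The function name is hypothetical; the standard library's `statistics.median` applies the same rule.

```python
def median(scores):
    """Middle score of the ordered list; mean of the two middle scores if even."""
    if not scores:
        raise ValueError("at least one score is required")
    ordered = sorted(scores)  # Step 1: arrange from lowest to highest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:            # odd count: a single middle score exists
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([5, 7, 8]))     # 7
print(median([5, 7, 8, 8]))  # 7.5
```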
Measures of Central Tendency: Mode
Most common or “popular” response
Step: Identify the score that appears the most often
Example: Reported performance scores were 5, 7, 8, 8, so mode = 8
Interpretation: Most popular performance rating recorded was 8
Using Excel
=MODE.SNGL
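Finding the mode can be sketched in Python as follows. The function name `modes` is hypothetical. Unlike a single-value result, this version returns a list, since more than one score can tie for most frequent.

```python
from collections import Counter


def modes(scores):
    """Return every score that occurs most often (more than one if tied)."""
    if not scores:
        raise ValueError("at least one score is required")
    counts = Counter(scores)
    highest = max(counts.values())
    return [score for score, count in counts.items() if count == highest]


print(modes([5, 7, 8, 8]))  # [8]
```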
Some “Average” Rules
Mean = do not use when there are outliers
Median = use with small sample with outliers
Mode = use with large sample with outliers
Analyze quantitative data
Part III : Descriptive & Correlational Statistics
4. Measures of Variability
… describe the spread of the data for a variable
Two most popular measures …
Range
Standard deviation
Measures of Variability: Range
Distance between the highest and lowest scores
Can be reported one of three ways …
Lowest to highest score
Highest to lowest score
Difference between the highest and lowest scores
Using Excel
=MIN
=MAX
Range Example
How can the range of four staff performance ratings be reported: 5, 7, 8, 8?
5 to 8 (lowest to highest)
8 to 5 (highest to lowest)
3 (difference, 8 − 5)
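The three ways of reporting the range can be sketched in Python. The function name `score_range` is hypothetical; the ratings are those from the example.

```python
def score_range(scores):
    """Return (lowest, highest, difference) for a list of scores."""
    if not scores:
        raise ValueError("at least one score is required")
    lowest, highest = min(scores), max(scores)
    return lowest, highest, highest - lowest


lowest, highest, difference = score_range([5, 7, 8, 8])
print(f"{lowest} to {highest}")  # 5 to 8
print(f"{highest} to {lowest}")  # 8 to 5
print(difference)                # 3
```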
Measures of Variability: Standard Deviation
… measures how far scores spread from the mean
… roughly the typical distance of a score from the mean
… a standard deviation of 0 means no spread: all the scores are the same
How to Calculate Standard Deviation
Steps:
1: Calculate the mean
2: Subtract this mean value from each score. These are called deviation scores
3: Square each of these deviation scores
4: Add these squared deviation scores together.
5: Divide this sum by N − 1 (where N represents the number of responses)
6: Take the square root of the answer found in Step 5
Using Excel
=STDEV.S
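The six steps can be followed one by one in Python. This is a sketch with a hypothetical function name, applied to the four performance ratings 5, 7, 8, 8 as sample input. It divides by N − 1, so it should agree with the standard library's `statistics.stdev`.

```python
import math


def sample_standard_deviation(scores):
    """Standard deviation using the N - 1 divisor, following the six steps."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are required")
    mean = sum(scores) / n                          # Step 1: the mean
    deviations = [score - mean for score in scores] # Step 2: deviation scores
    squared = [d ** 2 for d in deviations]          # Step 3: square each one
    total = sum(squared)                            # Step 4: add them together
    variance = total / (n - 1)                      # Step 5: divide by N - 1
    return math.sqrt(variance)                      # Step 6: square root


print(f"{sample_standard_deviation([5, 7, 8, 8]):.2f}")  # 1.41
```

For these ratings the mean is 7, the squared deviations are 4, 0, 1, and 1, their sum is 6, and 6 ÷ 3 = 2, whose square root is about 1.41.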
Interpreting Standard Deviation Graphs
Correlational Statistics
Used to describe relationship between two variables
How fluctuation in one variable is associated with fluctuation in the other (a correlation alone does not show that one causes the other)
Popular correlation statistic is Pearson product–moment correlation coefficient, usually referred to as correlation coefficient (r)
Direction of Correlations
Positive correlation (+) = as one variable increases (decreases), the other variable increases (decreases) (e.g., # hours practice and winning)
Negative correlation (−) = as one variable increases (decreases), the other variable decreases (increases) (e.g., exercise and resting heart rate)
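To make the sign of r concrete, the Pearson coefficient can be computed by hand in Python. This is a sketch only: the function name `pearson_r` is hypothetical, and all three data lists are made-up numbers chosen for illustration, not real measurements.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for paired scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]  # deviations from the mean of x
    dy = [y - mean_y for y in ys]  # deviations from the mean of y
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when a variable has no spread")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Made-up illustrative data (not from any study)
practice_hours = [1, 2, 3, 4, 5]
games_won = [2, 3, 5, 4, 6]
resting_scores = [6, 5, 3, 4, 2]

print(f"{pearson_r(practice_hours, games_won):.2f}")       # 0.90 (positive)
print(f"{pearson_r(practice_hours, resting_scores):.2f}")  # -0.90 (negative)
```

In the first pair both lists tend to rise together, so r is positive. In the second pair one falls as the other rises, so r is negative.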
Strength of Correlations
r ranges from −1.0 (perfect negative relationship) through 0 (no relationship) to +1.0 (perfect positive relationship)
Guidelines for interpreting
| Correlation coefficient value | Interpretation |
| --- | --- |
| ± .8 to ± 1.0 | Very strong relationship |
| ± .6 to ± .8 | Strong relationship |
| ± .4 to ± .6 | Moderate relationship |
| ± .2 to ± .4 | Weak relationship |
| 0 to ± .2 | Nonexistent relationship |
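The guideline table can be turned into a small lookup, sketched here in Python. The function name `describe_strength` is hypothetical. The table's bands share their end points (for example, .8 appears in two rows), so this sketch assumes a boundary value belongs to the stronger category.

```python
def describe_strength(r):
    """Label the strength of a correlation using the guideline table."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1.0 and +1.0")
    size = abs(r)  # strength ignores the direction (sign) of r
    if size >= 0.8:
        return "very strong"
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "nonexistent"


print(describe_strength(0.73))   # strong
print(describe_strength(-0.35))  # weak
```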
Coefficient of Determination
Take value of r and square it: r²
Tells the proportion of variation in one variable accounted for by variation in the other variable
Example: Relationship between # hours practice and winning basketball games
r = .73 (a strong relationship)
r² = .73 × .73 = .53
Interpretation: 53% of the variation in games won is accounted for by hours of practice; the remaining 47% is accounted for by something other than practicing
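The arithmetic for the coefficient of determination can be checked with a short Python sketch. The function name is hypothetical; r = .73 is the value from the example.

```python
def coefficient_of_determination(r):
    """Square of the correlation coefficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1.0 and +1.0")
    return r ** 2


r_squared = coefficient_of_determination(0.73)
print(f"{r_squared:.2f}")      # 0.53
print(f"{1 - r_squared:.2f}")  # 0.47
```

Because r is squared, a negative correlation of the same size gives the same result.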