Digital Mapping
Last Week
Test Scores (Make up assignment for extra points on third exam the week of April 2nd and 4th)
Please contact your teaching assistant if you want to view your exam
Summary statistics help us understand data sets and make decisions
Organization – Analysis/Interpretation – Presentation – Make Decisions
Imagine Average and SD as anchors of data analysis – places to start – concepts to root into…
Slowly moving up the learning curve and taking our time … Connect on your own terms – make it yours
Today
Standard Deviation and the Normal Curve
Thursday
Finish Standard Deviation, outliers, and correlation
Measures of Central Tendency
A central value – a typical value –
Average (Mean)
1, 1, 3, 3, 5, 7, 7, 9, 9
One number tells us a lot
“If I could throw away my data and replace it with one ‘average’ value, what would it be?”
“Getting a ‘representative’ sample”
“It’s the number ‘in the middle’, pulled up by large values and brought down by smaller ones.”
Summarizing Data
Types: Interval/ratio data: arithmetic mean (average).
Includes all numbers!
Example:
Find the arithmetic mean of the following numbers: 9, 3, 7, 3, 8, 10, and 2.
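The arithmetic mean can be worked out by hand or with a few lines of code. The sketch below is one way to do it; the function name `arithmetic_mean` is only an illustrative choice, not something from the slides.

```python
def arithmetic_mean(values):
    """Add every value, then divide by how many values there are."""
    return sum(values) / len(values)

numbers = [9, 3, 7, 3, 8, 10, 2]
print(sum(numbers), len(numbers))   # 42 7
print(arithmetic_mean(numbers))     # 6.0

# The earlier data set from the "Average (Mean)" slide
print(arithmetic_mean([1, 1, 3, 3, 5, 7, 7, 9, 9]))  # 5.0
```

Every number in the list contributes to the result, which is why a single very large or very small value can pull the mean up or down.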
Geographic Mean
Using x,y coordinates – where is the center of the classroom?
Instead of a number we are talking about a place
If we have 4 locations what is the middle or average location?
Add all the latitudes together and divide by the total number of locations, then do the same for the longitudes!
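A minimal sketch of the mean center calculation, assuming four made-up locations given as (latitude, longitude) pairs. The coordinates and the name `mean_center` are hypothetical. Plain averaging like this is a reasonable shortcut for places that are fairly close together.

```python
def mean_center(locations):
    """Average the latitudes and the longitudes separately."""
    lats = [lat for lat, lon in locations]
    lons = [lon for lat, lon in locations]
    return sum(lats) / len(lats), sum(lons) / len(lons)

# Four hypothetical locations (made-up coordinates, not real places)
locations = [(40.0, -75.0), (40.0, -76.0), (41.0, -75.0), (41.0, -76.0)]
print(mean_center(locations))  # (40.5, -75.5)
```

The answer is a place rather than a single number: one average latitude paired with one average longitude.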
Describe what the mean center of population is (the geographic center of the whole population). Notice that it has shifted west and south, and so has the population! Talk about your Philadelphia example and its rank among cities.
Standard Deviation
ON AVERAGE, HOW FAR ARE THE VALUES FROM THE MEAN?
Are your values all over the place – very high and very low?
Or are your values pretty close to average (the mean)?
Not very high – not very low
Summarizing Data
A relatively low standard deviation indicates that the data points tend to be very close to the mean (clustered around the mean).
A relatively high standard deviation indicates that the data are spread out over a large range of values (far from the mean).
Average = 15 SD = 5
Average = 15 SD = 10
SD Large or Small? (average distance from the mean)
Small or Large Standard Deviation?
Measures of Spread…
We have a measure of central tendency – we know what the average number will be…. But …
A 1,1,3,3,5,7,7,9,9
B 1,1,1,1,5,9,9,9,9
C 4,4,4,4,5,6,6,6,6
ON AVERAGE, HOW FAR ARE THE VALUES FROM THE MEAN?
The normal curve…
Is an ideal representation – your numbers will not always match it – but it is a very useful tool for understanding the spread or distribution of your data around the mean
1,1,1,1,5,9,9,9,9 (SD = ?)
Will the SD be greater or less than the previous data set of 1, 1, 3, 3, 5, 7, 7, 9, 9? (SD = 2.98)
SD is about 3.8
Tells us the values are spread further from the mean – a larger SD than 2.98!
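The SD values for these data sets can be reproduced with a short sketch. It assumes the population form of the standard deviation (divide by the number of values), which is the version that yields 2.98 for the first data set. The function name `population_sd` is an illustrative choice.

```python
import math

def population_sd(values):
    """Square root of the average squared distance from the mean."""
    mean = sum(values) / len(values)
    squared_distances = [(v - mean) ** 2 for v in values]
    return math.sqrt(sum(squared_distances) / len(values))

data_sets = {
    "A": [1, 1, 3, 3, 5, 7, 7, 9, 9],
    "B": [1, 1, 1, 1, 5, 9, 9, 9, 9],
    "C": [4, 4, 4, 4, 5, 6, 6, 6, 6],
}
for name, values in data_sets.items():
    print(name, round(population_sd(values), 2))
# A 2.98
# B 3.77
# C 0.94
```

All three data sets have a mean of 5, yet their SDs differ a lot: the more the values sit out at the extremes, the larger the SD. Python's standard library function `statistics.pstdev` performs the same population calculation.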
The average doesn’t tell the whole story
[20k, 20k, 40k, 60k, 60k] vs [35k, 35k, 40k, 45k, 45k]
Average is 40k for both
That tells us something important
SD for first data set: about 17.9k
SD for second data set: about 4.5k
The spread (variance) in the first data set is higher
The SD gives us more important information to understand what is going on with the data set.
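As a rough check on the two salary lists, the sketch below computes the mean and the population SD (divide by the number of values, the same convention that gives 2.98 for the data set 1, 1, 3, 3, 5, 7, 7, 9, 9). The variable names are illustrative, and the values are entered in thousands.

```python
import statistics

# Values in thousands (20 means 20k)
first = [20, 20, 40, 60, 60]
second = [35, 35, 40, 45, 45]

for label, values in (("first", first), ("second", second)):
    mean = sum(values) / len(values)
    sd = statistics.pstdev(values)  # population standard deviation
    print(label, mean, round(sd, 1))
# first 40.0 17.9
# second 40.0 4.5
```

Both lists share the same average, but the first list's SD is four times the second's, which is the extra information the average alone cannot give.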
The Normal Curve
Where data observations are likely to fall given a perfect distribution according to a particular protocol or standard
Distributions are rarely ‘normal’ but this is a quick way of assessing whether individual scores are close/far from the mean or similar to other observations
Normal Distribution
Observations (test scores for example) peak at the average and decline as they move away from the average
This is the case on both sides (negative and positive)
If most observations are at the average and the number of observations slowly decreases on both sides (negative and positive) as you move away from the mean, then the distribution is normal
Given a normal distribution
34 + 34 = 68
We expect 68 percent of data points to fall within 1 SD of the mean
13.5 + 13.5 = 27
We expect that about 27 percent of the data will fall between 1 and 2 SD of the mean
Or that (68 + 27) 95 percent of the data will fall within 2 SD of the mean
About 2 + 2 = 4
We expect that about 4 percent of the data will fall between 2 and 3 SD of the mean
Or that close to 100 percent (99.7) of the data will fall within 3 SD of the mean
If your test score is within 1 SD of the mean, you are similar to most other scores (normal)
If your test score is between 1 and 2 SD from the mean, you deviated further from the mean
If your test score is between 2 and 3 SD from the mean, your score is very different from most other scores
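The rounded percentages on the slide can be compared with the exact normal curve. The sketch below uses the standard relationship between the normal distribution and the error function (`math.erf` in Python's standard library); the helper name `share_within` is an illustrative choice.

```python
import math

def share_within(k):
    """Fraction of a normal distribution within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

These are the figures behind the familiar "68, 95, 99.7" rule of thumb. The per-band numbers drawn under the curve are rounded versions of the same calculation.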
You have made your own histograms – you know it’s not always normal…
Then why do we use it?
Normal distributions are more common than any other kind of distribution
Institutional memory and the QWERTY/Dvorak debate
It’s a simplification – that’s why we do math!
Even though our distribution isn’t normal it still tells us something useful
Using the normal curve we can also see how typical individual observations are.
If the mean is what is normal/average, a score close to the mean is similar to most other scores.
Mu (μ) – the average – the mean
Sigma (σ), the weird-looking o – the standard deviation (3 of them marked on each side of the mean)
Within one SD it's close to the average, and out here it's very far – 68 percent of observations will fall here in the middle, closest to the average.
How did I do on the test?
The average is 85 and the SD is 5 for a test – we are trying to figure out how we scored in comparison to the rest of the class…
68 percent of students fell between 80 and 90 (1 SD): 85 + 5 = 90, 85 - 5 = 80
95 percent of students fell between 75 and 95 (2 SD): 2(5) = 10, 85 + 10 = 95, 85 - 10 = 75
99.7 percent of students fell between 70 and 100 (3 SD): 3(5) = 15, 85 + 15 = 100, 85 - 15 = 70
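The same arithmetic can be sketched in code. The function names `sd_bands` and `sd_distance` are illustrative, and the score of 92 is a made-up example rather than a real result from the class.

```python
def sd_bands(mean, sd, max_k=3):
    """Lower and upper cutoffs at 1, 2, ... max_k SD from the mean."""
    return {k: (mean - k * sd, mean + k * sd) for k in range(1, max_k + 1)}

def sd_distance(score, mean, sd):
    """How many SDs a score sits from the mean (ignoring direction)."""
    return abs(score - mean) / sd

print(sd_bands(85, 5))         # {1: (80, 90), 2: (75, 95), 3: (70, 100)}

# Hypothetical score of 92: 7 points above the mean, and 7 / 5 = 1.4 SD
print(sd_distance(92, 85, 5))  # 1.4
```

A distance of 1.4 puts that hypothetical score between 1 and 2 SD from the mean: higher than most of the class, but not extremely unusual.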
In class assignment (Please stay so we can go over it)
The mean weight of an adult King Emperor Penguin is 75 pounds
The SD of these weights is 8 pounds
Draw the normal curve and fill in the bottom values
Also, if a penguin weighs 84 pounds, which SD interval will it fall into?