Stats
Statistics in Surveys and Polls Part 3: Point Estimation and Confidence Intervals
Chapter 3: The Problem with Samples (Part 2)
“Statistics almost always give you the wrong answer”
Sample ≠ Population
True even with good samples
Statistics ≠ Parameters
Statistics have bias and variability
To reduce bias, use a simple random sample (SRS) and appropriate statistics
To reduce variability, use a larger sample (n)
Population and Sample: Opinion about Military Action against ISIS
Practice Showing that Statistics ≠ Parameters
Flip a coin 100 times, count the number of heads. (A head represents “in favor” of military action against ISIS.)
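The coin-flip exercise can also be run as a small simulation. This is a minimal sketch, assuming a fair coin (true proportion 0.5); the function name `simulate_poll` and the seeds are illustrative choices, not part of the original exercise.

```python
import random

def simulate_poll(n=100, p_true=0.5, seed=None):
    """Flip a coin n times; return the proportion of heads ("in favor")."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n) if rng.random() < p_true)
    return heads / n

# Five repetitions of the 100-flip "poll". The parameter stays at 0.5,
# but the statistic changes from sample to sample.
estimates = [simulate_poll(seed=s) for s in range(5)]
print(estimates)
```

Each printed value is one sample proportion, which will usually differ from the parameter 0.5.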
Proportion “in favor” observed from 100 responses.
Point estimation
The proportion of “in favor” responses you obtained from your 100 “responses” is the point estimate.
It is a single guess at the true value of the parameter.
The point estimator is the statistic used to estimate the parameter of interest.
What is the point estimator?
What is the parameter of interest?
Types of errors
Sampling error (margin of error = MOE)
Error resulting from surveying some, but not all, of the units in the population
Coverage error (undercoverage)
Error resulting from not giving all units in the population an equal chance of being selected
Measurement error
Error resulting from poor question wording, or from questions presented in a way that produces inaccurate or uninterpretable answers
Nonresponse error
Error resulting when units that do not respond differ from units that do respond
Solving the problem
Margin of error (MOE) (sampling error): use it to “expand” our statistic
Statistic ± MOE
OR
Point estimate ± MOE
Statistic ± MOE has a better chance of hitting the parameter than the statistic alone; this chance is called the confidence level
Statistic ± MOE is sometimes called a confidence interval or an interval estimate
(Point estimate − MOE, Point estimate + MOE)
MOE
Solving the Problem (cont)
Calculating MOE
Qualitative
Estimate proportion (p): 95% MOE ≈ $\frac{1}{\sqrt{n}}$
Quantitative
Estimate mean (µ): 95% MOE ≈ $\frac{2s}{\sqrt{n}} \approx \frac{\text{range}}{2\sqrt{n}}$
Reduce MOE by _____________
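The MOE calculations can be sketched in a few lines. This assumes the rule-of-thumb 95% formulas MOE ≈ 1/√n for a proportion and MOE ≈ 2s/√n for a mean; the function names and the sample values used below are illustrative, not taken from the slides.

```python
import math

def moe_proportion(n):
    """Rule-of-thumb 95% margin of error for a sample proportion."""
    return 1 / math.sqrt(n)

def moe_mean(s, n):
    """Rule-of-thumb 95% margin of error for a sample mean (s = sample SD)."""
    return 2 * s / math.sqrt(n)

def interval(estimate, moe):
    """Interval estimate: (point estimate - MOE, point estimate + MOE)."""
    return (estimate - moe, estimate + moe)

# Hypothetical inputs: 100 coin flips with 54 heads.
print(moe_proportion(100))
print(interval(0.54, moe_proportion(100)))
```

Trying `moe_proportion` with several values of n shows how the MOE responds to sample size.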
Interpreting Point estimate ± MOE
Conclusion applies to the population
Never certain
Confidence level doesn’t have to be 95%
http://digitalfirst.bfwpub.com/stats_applet/stats_applet_4_ci.html
MOE
Use your point estimate to create a 95% confidence interval for p (the true proportion of Americans in favor of military action against ISIS).
Did your interval enclose p=0.5?
Point estimate ± MOE
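To see how often such an interval encloses p = 0.5, the coin-flip poll can be repeated many times. This is a sketch under stated assumptions: a fair coin, n = 100, and the rule-of-thumb MOE of 1/√n; the function names, trial count, and seed are illustrative.

```python
import math
import random

def covers_true_p(n, p_true, rng):
    """Run one simulated poll and report whether p_hat ± MOE encloses p_true."""
    heads = sum(1 for _ in range(n) if rng.random() < p_true)
    p_hat = heads / n
    moe = 1 / math.sqrt(n)  # rule-of-thumb 95% margin of error
    return p_hat - moe <= p_true <= p_hat + moe

def coverage_rate(trials=2000, n=100, p_true=0.5, seed=0):
    """Fraction of simulated intervals that enclose the true proportion."""
    rng = random.Random(seed)
    hits = sum(covers_true_p(n, p_true, rng) for _ in range(trials))
    return hits / trials

print(coverage_rate())
```

The printed fraction should land in the neighborhood of 95%, which is what the confidence level describes: most intervals enclose the parameter, but not all of them.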
These values are called ___________________________
2016 Presidential Poll Results
How was the MOE for the Detroit Free Press poll determined?
| Candidate | Trafalgar Group | Fox 2 Detroit/Mitchell | Detroit Free Press |
| Donald Trump | 0.49 ± 0.028 (0.462, 0.518) | 0.41 ± 0.031 (0.379, 0.441) | 0.38 ± 0.04 (0.34, 0.42) |
| Hillary Clinton | 0.47 ± 0.028 (0.442, 0.498) | 0.46 ± 0.031 (0.429, 0.491) | 0.42 ± 0.04 (0.38, 0.46) |
Give 95% confidence intervals for estimating the proportion of Michigan voters who would vote for each candidate.
In the above table, circle the confidence interval(s) that did not include the actual election result. What do you notice?
Michigan 2016 Presidential Election Results
Donald Trump (R) 47.6%
Hillary Clinton (D) 47.3%
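The comparison between the poll intervals and the election results can be checked mechanically. The sketch below uses only the numbers from the table and the results above; the dictionary layout and the function name `intervals_that_missed` are illustrative.

```python
# Point estimates and margins of error from the 2016 Michigan poll table.
polls = {
    "Trafalgar Group": {"moe": 0.028, "Trump": 0.49, "Clinton": 0.47},
    "Fox 2 Detroit/Mitchell": {"moe": 0.031, "Trump": 0.41, "Clinton": 0.46},
    "Detroit Free Press": {"moe": 0.04, "Trump": 0.38, "Clinton": 0.42},
}

# Actual Michigan results as proportions.
actual = {"Trump": 0.476, "Clinton": 0.473}

def intervals_that_missed(polls, actual):
    """List (poll, candidate) pairs whose interval excludes the actual result."""
    missed = []
    for name, poll in polls.items():
        for candidate, result in actual.items():
            low = poll[candidate] - poll["moe"]
            high = poll[candidate] + poll["moe"]
            if not (low <= result <= high):
                missed.append((name, candidate))
    return missed

for name, candidate in intervals_that_missed(polls, actual):
    print(f"{name}: interval for {candidate} missed the actual result")
```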
2016 Iowa Primary Polls
2016 Iowa Primary Election Results
Finding sample size
Sample size can be determined for a particular MOE
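Suppose we can live with an MOE of 1% (± 0.01)
If MOE = 0.01 = $\frac{1}{\sqrt{n}}$, then $\sqrt{n} = \frac{1}{0.01}$ and $n = \left(\frac{1}{0.01}\right)^2 = 10{,}000$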
What is the sample size for an MOE of 3%?
Qualitative observations
What do you notice about the relationship between n and MOE?
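The sample-size calculation for a proportion can be sketched as follows, assuming the rule of thumb MOE ≈ 1/√n, so that n ≈ (1/MOE)². The function name is illustrative, and rounding up to a whole respondent is a choice made here, not something stated on the slide.

```python
import math

def sample_size_for_proportion(moe):
    """Smallest whole n with 1/sqrt(n) <= moe, from n = (1/moe)^2."""
    n = (1 / moe) ** 2
    # round() guards against floating-point noise before rounding up
    return math.ceil(round(n, 6))

for moe in (0.01, 0.02, 0.03, 0.05):
    print(moe, sample_size_for_proportion(moe))
```

Comparing the rows shows the pattern between n and MOE: cutting the MOE in half requires about four times as many respondents.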
Finding sample size
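Quantitative observations
If MOE = $\frac{2s}{\sqrt{n}}$, then $\sqrt{n} = \frac{2s}{\text{MOE}}$ and $n = \left(\frac{2s}{\text{MOE}}\right)^2$
If MOE = $\frac{\text{range}}{2\sqrt{n}}$, then $\sqrt{n} = \frac{\text{range}}{2\,\text{MOE}}$ and $n = \left(\frac{\text{range}}{2\,\text{MOE}}\right)^2$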
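For a mean, the same idea can be sketched in code, assuming the rule of thumb MOE ≈ 2s/√n and the approximation s ≈ range/4. The function names and the example values (s = 15, range = 60, MOE = 3) are hypothetical and chosen only for illustration.

```python
import math

def sample_size_for_mean(s, moe):
    """n = (2s / MOE)^2, rounded up to a whole unit."""
    return math.ceil(round((2 * s / moe) ** 2, 6))

def sample_size_from_range(data_range, moe):
    """Same calculation using s ≈ range / 4, i.e. n = (range / (2 * MOE))^2."""
    return math.ceil(round((data_range / (2 * moe)) ** 2, 6))

# Hypothetical example: SD of 15 units (range about 60), desired MOE of 3 units.
print(sample_size_for_mean(15, 3))
print(sample_size_from_range(60, 3))
```

Both calls give the same sample size because a range of 60 corresponds to s ≈ 15 under the range/4 approximation.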
Sample size
| Population Size | Sample Size |
| 100 | 92 |
| 200 | 169 |
| 400 | 291 |
| 600 | 384 |
| 800 | 458 |
| 1,000 | 517 |
| 2,000 | 696 |
| 4,000 | 843 |
| 6,000 | 906 |
| 8,000 | 942 |
| 10,000 | 965 |
| 20,000 | 1,013 |
| 40,000 | 1,040 |
| 100,000 | 1,056 |
| 1,000,000 | 1,066 |
| 1,000,000,000 | 1,067 |
Sample sizes needed for a 3% margin of error at 95% confidence, by population size.
Source: Dillman, D. A. (2007), Mail and Internet Surveys, p. 207
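The pattern in the table can be approximated with a standard finite-population sample-size formula. This is a sketch, not a confirmed reconstruction of how the table was produced: it assumes a 50/50 split (p = 0.5), z = 1.96 for 95% confidence, and the formula n = N·p(1−p) / ((N−1)(MOE/z)² + p(1−p)). The function name is illustrative.

```python
def sample_size_finite(population, moe=0.03, z=1.96, p=0.5):
    """Sample size for a proportion, adjusted for a finite population."""
    variance = p * (1 - p)
    return population * variance / ((population - 1) * (moe / z) ** 2 + variance)

for size in (100, 1_000, 10_000, 1_000_000, 1_000_000_000):
    print(f"{size:>13,}  {sample_size_finite(size):8.1f}")
```

Under these assumptions the computed values fall within about one unit of the table entries, and they level off near 1,067 as the population grows, which is why very large populations do not need much larger samples.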
Interpreting Political Polls
Results from a recent poll of the percentage of likely-to-vote registered voters who would vote for a particular candidate. Do you think A or B will win?
Candidate A 59%
Candidate B 41%
Sampling error of 3%
Candidate A: 0.59 ± 0.03, giving (0.56, 0.62)
Candidate B: 0.41 ± 0.03, giving (0.38, 0.44)
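Interpreting Political Polls
[Figure: the interval for Candidate B, 0.38 to 0.44 around 0.41, and the interval for Candidate A, 0.56 to 0.62 around 0.59]
Largest gap: 0.62 − 0.38 = 24%
Smallest gap: 0.56 − 0.44 = 12%
Predict Candidate A is likely to win by as much as 24 percentage points or as little as 12 percentage points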
Interpreting Political Polls
Results from a recent poll of the percentage of likely-to-vote registered voters who would vote for a particular candidate. Do you think A or B will win?
Candidate A 45%
Candidate B 41%
Sampling error of 3%
Candidate A: 0.45 ± 0.03, giving (0.42, 0.48)
Candidate B: 0.41 ± 0.03, giving (0.38, 0.44)
Interpreting Political Polls
[Figure: the interval for Candidate B, 0.38 to 0.44 around 0.41, and the interval for Candidate A, 0.42 to 0.48 around 0.45; the two intervals overlap]
Predict there could be as much as a 10-point lead for Candidate A over Candidate B, or a 2-point lead for Candidate B over Candidate A
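Both polling examples follow the same arithmetic, which can be sketched as a pair of helper functions. The names `lead_range` and `intervals_overlap` are illustrative; the inputs are the poll shares and the 3% sampling error from the two examples.

```python
def lead_range(share_a, share_b, moe):
    """Return (largest, smallest) lead of A over B implied by the two intervals."""
    largest = (share_a + moe) - (share_b - moe)
    smallest = (share_a - moe) - (share_b + moe)
    return round(largest, 4), round(smallest, 4)

def intervals_overlap(share_a, share_b, moe):
    """True when the two ± MOE intervals share at least some values."""
    return abs(share_a - share_b) < 2 * moe

# First poll: 59% vs 41%. Second poll: 45% vs 41%. Both use MOE = 0.03.
print(lead_range(0.59, 0.41, 0.03), intervals_overlap(0.59, 0.41, 0.03))
print(lead_range(0.45, 0.41, 0.03), intervals_overlap(0.45, 0.41, 0.03))
```

A negative smallest lead means the intervals overlap, so the poll alone cannot say which candidate is ahead.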
Popularity of Congress
From January 3-6, 2013, 803 Americans surveyed by telephone were asked:
“What do you have a higher opinion of: Congress or _____?” MOE was 3.4%.
| Congress | John Edwards | Congress CI | Edwards CI | Statistical difference? |
| 45% | 29% | | | |

| Congress | Cockroaches | Congress CI | Cockroaches CI | Statistical difference? |
| 43% | 45% | | | |
MOE was 3.4%
The Problem with Polls
Poll results must be interpreted with extra care. The following case illustrates some of the problems, and this was a “good” poll. I have added to the example to make some points.
The Pew Research Center for the People and the Press imitated the methods of the better opinion polls. They dialed telephone numbers at random to get an “SRS” of 1000 household opinions on school choice. Here are the actual results:
| Outcome | Count | Percent of total called |
| Numbers eliminated: business, FAX, no service, … | not given | |
| Households without phones | unknown | |
| Never answered phone | 938 | 33% |
| Answered but refused to participate | 678 | 23% |
| Not eligible: no person age 18, language barrier, … | 221 | 8% |
| Incomplete interview | 42 | 1% |
| Complete interview: in favor | 451 | 16% |
| Complete interview: against | 392 | 14% |
| Complete interview: don’t know, don’t care | 157 | 5% |
| Total called | 2879 | |
Result was that 45% favored school choice vs. 39% against (with a sampling error of ± 2%).
Problems/Issues?
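One way to start on the question is to recompute a few quantities from the call outcomes. The sketch below uses only the counts listed above; treating "completed interviews divided by total called" as the response rate, and 1/√n as the 95% margin of error, are assumptions made here for illustration.

```python
import math

# Call outcomes exactly as listed in the results.
outcomes = {
    "never answered": 938,
    "refused": 678,
    "not eligible": 221,
    "incomplete interview": 42,
    "complete: in favor": 451,
    "complete: against": 392,
    "complete: don't know": 157,
}

total_called = sum(outcomes.values())
completed = sum(v for k, v in outcomes.items() if k.startswith("complete:"))

response_rate = completed / total_called
share_in_favor = outcomes["complete: in favor"] / completed
share_against = outcomes["complete: against"] / completed
rule_of_thumb_moe = 1 / math.sqrt(completed)

print(f"total called:      {total_called}")
print(f"completed:         {completed}")
print(f"response rate:     {response_rate:.1%}")
print(f"in favor, against: {share_in_favor:.1%}, {share_against:.1%}")
print(f"1/sqrt(n) MOE:     {rule_of_thumb_moe:.1%}")
```

Under these assumptions, only about a third of the numbers called produced a completed interview, and the rule-of-thumb MOE for 1000 completed interviews is closer to 3% than to the ± 2% quoted.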
Questions to ask
Who carried out the survey or poll?
What was the population?
How was the sample selected?
How large was the sample?
Sample size
Margin of error
What was the response rate?
How were the subjects contacted?
When was the survey conducted?
What were the exact questions?
What options were given for potential answers?
Issues
Population of Interest
Sampling Issues
Sample Selection
Undercoverage
Sample Size
Statistics ≠ Parameters
Non-sampling Issues
Processing Errors
Processing and Analysis Methods
Response Rate and Impact of Nonresponse
Methods Subjects Contacted
When Survey Conducted
Questions and Wording of Questions
Who Conducted Survey
Examples of Issues with Polls
Straw polls
http://news.yahoo.com/iowa-straw-poll-outs-gop-establishment-080859714--election.html
Push poll examples in South Carolina
https://www.washingtonpost.com/news/the-fix/wp/2016/02/12/the-south-carolina-push-poll-controversy-explained-video/
http://www.sourcewatch.org/wiki.phtml?title=Push_poll
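Sample Data
| | Qualitative | Quantitative |
| Parameter | p | µ |
| Point estimator | $\hat{p}$ | $\bar{x}$ |
| Point estimate | value of $\hat{p}$ | value of $\bar{x}$ |
| 95% CI | $\hat{p} \pm \text{MOE}$ | $\bar{x} \pm \text{MOE}$ |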
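Formulas (95% confidence)
MOE for a proportion: $\text{MOE} \approx \frac{1}{\sqrt{n}}$
MOE for a mean: $\text{MOE} \approx \frac{2s}{\sqrt{n}} \approx \frac{\text{range}}{2\sqrt{n}}$
Sample size for a proportion: $\frac{1}{\sqrt{n}} = 0.01 \Rightarrow \sqrt{n} = \frac{1}{0.01} \Rightarrow n = \left(\frac{1}{0.01}\right)^2 = 10{,}000$
Sample size for a mean, using the range: $\text{MOE} = \frac{\text{range}}{2\sqrt{n}} \Rightarrow \sqrt{n} = \frac{\text{range}}{2\,\text{MOE}} \Rightarrow n = \left(\frac{\text{range}}{2\,\text{MOE}}\right)^2$
Sample size for a mean, using s: $\text{MOE} = \frac{2s}{\sqrt{n}} \Rightarrow \sqrt{n} = \frac{2s}{\text{MOE}} \Rightarrow n = \left(\frac{2s}{\text{MOE}}\right)^2$
95% CI for p: $\hat{p} \pm \frac{1}{\sqrt{n}}$, that is $\left(\hat{p} - \frac{1}{\sqrt{n}},\ \hat{p} + \frac{1}{\sqrt{n}}\right)$
95% CI for µ: $\bar{x} \pm \frac{\text{range}}{2\sqrt{n}}$, that is $\left(\bar{x} - \frac{\text{range}}{2\sqrt{n}},\ \bar{x} + \frac{\text{range}}{2\sqrt{n}}\right)$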