95% confidence interval with graph
I. Introduction
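- A. Review of Population and Sample Estimates
- Mean: population $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$; sample $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
- Variance: population $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i - \mu)^2$; sample $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$
- B. Sampling
- 1. Samples
- The sample mean $\bar{X}$ is a random variable; for large samples its distribution is approximately normal.
- If you draw many samples of the same size, the average of their sample means will be close to the true population mean.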
- 2. Central Limit Theorem
- Definition
- If a random sample of size $n$ is taken from any population with mean $\mu$ and standard deviation $\sigma$, then for large $n$ the sampling distribution of the sample mean $\bar{X}$ is approximately normal with
- 1) Mean $E(\bar{X}) = \mu$, and
- 2) Standard error $SE(\bar{X}) = \sigma/\sqrt{n}$
- As $n$ increases, the standard error shrinks, so the sampling distribution of $\bar{X}$ concentrates more tightly around the true population mean $\mu$.
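The two properties of the Central Limit Theorem can be checked with a small simulation. This is only a sketch under assumed settings that are not part of the slides: the population is exponential with mean 1 and standard deviation 1, the sample size is 50, and the helper name `simulate_sample_means` is hypothetical.

```python
import math
import random
import statistics

def simulate_sample_means(n, num_samples=5000, seed=0):
    """Draw num_samples samples of size n from an exponential population
    (mean 1, standard deviation 1) and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]

n = 50
means = simulate_sample_means(n)

# The average of the sample means should sit near the population mean (1),
# and their spread should sit near sigma / sqrt(n).
print("average of sample means:", round(statistics.fmean(means), 3))
print("std. dev. of sample means:", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):", round(1 / math.sqrt(n), 3))
```

Even though the exponential population is strongly skewed, the spread of the sample means should land close to $\sigma/\sqrt{n}$.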
- C. Making Inferences
- To make inferences about the population from a given sample, we make one correction to the usual z-score: instead of dividing by the standard deviation, we divide by the standard error of the sampling process: $Z = (\bar{X} - \mu)/SE$, where $SE = \sigma/\sqrt{n}$.
- Today, we want to develop a tool to determine how confident we are that our estimates lie within a certain range.
II. Confidence Interval
- A. Definition of 95% Confidence Interval ($\sigma$ known)
- 1. Motivation
- We know that, on average, $\bar{X}$ is equal to $\mu$.
- We want some way to express how confident we are that a given $\bar{X}$ is near the actual $\mu$ of the population.
- We do this by constructing a confidence interval, which is a range around $\bar{X}$ that is likely to contain $\mu$.
- The standard error measures how much the sample mean varies from sample to sample. So we can say that $\bar{X}$ is equal to $\mu$ plus or minus some multiple of the standard error:
$\mu = \bar{X} \pm (\text{multiple}) \times \text{Standard Error}$
- 2. Constructing a 95% Confidence Interval
- a. Graph
- First, we know that the sample mean is distributed normally, with mean $\mu$ and standard error $SE = \sigma/\sqrt{n}$.
- b. Second, we determine how confident we want to be in our estimate of $\mu$.
- The confidence level is written $1 - \alpha$, where $\alpha$ (the $\alpha$-level) is the probability we accept that the interval misses $\mu$.
- So a 95% confidence interval has an associated $\alpha$-level of .05:
$\alpha = (1 - .95) = .05$ (the 5% level)
- Because we are concerned with both higher and lower values, the relevant range leaves $\alpha/2$ probability in each tail.
- c. Define a 95% Confidence Range
- Now let's take an interval around $\mu$ that contains 95% of the area under the curve.
- So if we take a random sample of size $n$ from the population, 95% of the time the sample mean $\bar{X}$ will fall within this range around $\mu$.
- What is the z-value associated with a .025 probability in each tail?
- From the z-table, we find that the z-value that leaves a .025 probability in the upper tail is 1.96.
- So our range around $\mu$ will be bounded by:
$[-z_{.025} \times SE,\; z_{.025} \times SE] = [-1.96 \times SE,\; 1.96 \times SE]$
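The z-table lookup can also be done numerically. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value: leaves alpha/2 in each tail of N(0, 1)."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

# Cutoffs for the confidence levels used in these slides.
for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 2))
```

For a 95% interval the function looks up the point with .975 of the area to its left, which is the same lookup as reading .025 in the upper tail of the z-table.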
- Now, let's take this interval of size $[-1.96 \times SE,\; 1.96 \times SE]$ and use it as a measuring rod.
- d. Interpreting Confidence Intervals
- What's the probability that the population mean $\mu$ will fall within $1.96 \times SE$ of a given sample mean $\bar{X}$? Because $\bar{X}$ lands within $1.96 \times SE$ of $\mu$ in 95% of samples, the answer is 95%.
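The 95% figure can be checked empirically by building the interval for many simulated samples and counting how often it contains the true mean. This is a sketch under assumed values that are not from the slides (a normal population with $\mu = 10$, $\sigma = 2$, and samples of size 25); the function name `coverage_rate` is hypothetical.

```python
import math
import random
from statistics import fmean

def coverage_rate(mu=10.0, sigma=2.0, n=25, trials=10000, seed=1):
    """Fraction of simulated samples whose interval
    x_bar +/- 1.96 * sigma / sqrt(n) contains the true mean mu."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - 1.96 * se <= mu <= x_bar + 1.96 * se:
            hits += 1
    return hits / trials

rate = coverage_rate()
print("share of intervals containing mu:", rate)
```

The share should come out close to 0.95. Note what varies here: $\mu$ is fixed, and it is the interval that moves from sample to sample.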
- e. In General
- We get the actual interval as $1.96 \times SE$ on either side of the sample mean $\bar{X}$.
- We then know that 95% of the time, this interval will contain $\mu$. This interval is defined by:
$\bar{X} - z_{.025}\,\sigma/\sqrt{n} < \mu < \bar{X} + z_{.025}\,\sigma/\sqrt{n}$
- For any confidence level $1 - \alpha$:
$\mu = \bar{X} \pm z_{\alpha/2} \times SE$
- For a 95% confidence interval:
$\mu = \bar{X} \pm z_{.025} \times SE = \bar{X} \pm 1.96 \times SE$
- Examples
- 1. Example 1: Calculate a 95% Confidence Interval
- Say we sample n = 180 people and record how many times each ate at a fast-food restaurant in a given week. The sample mean is 0.82 and the population standard deviation is 0.48. Calculate the 95% confidence interval for these data.
- Answer:
$SE = .48/\sqrt{180} = 0.036$
$z_{.025} \times SE = 1.96 \times .036 = .07$
$C.I. = .82 \pm .07$, or $.75 < \mu < .89$
- Why 95%?
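The arithmetic in Example 1 can be reproduced in a few lines. This is a minimal sketch; the function name `z_interval` and its parameters are hypothetical, and it assumes the population standard deviation is known, as in the example.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for mu when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)   # z * standard error
    return x_bar - margin, x_bar + margin

# Example 1: n = 180, sample mean 0.82, population sigma 0.48.
low, high = z_interval(x_bar=0.82, sigma=0.48, n=180)
print(f"{low:.2f} < mu < {high:.2f}")
```

Changing `confidence` to 0.90 or 0.99 reuses the same formula with a different critical value.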
- 2. Example 2: Calculating a 90% Confidence Interval
- A random sample of 16 observations was drawn from a normal population with $\sigma = 6$; the sample mean is $\bar{X} = 25$. Find a 90% ($\alpha = .10$) confidence interval for the population mean, $\mu$.
- First, find $z_{.10/2}$ in the standard normal tables:
$z_{.05} = 1.64$
- Second, calculate the 90% confidence interval:
$\mu = \bar{X} \pm z_{.05} \times \sigma/\sqrt{n} = 25 \pm 1.64 \times 6/\sqrt{16}$
$SE = 6/\sqrt{16} = 1.5$
$\mu = 25 \pm 1.64 \times 1.5 = 25 \pm 2.46$
$22.54 < \mu < 27.46$
- An interval built this way will contain the population mean in 90% of samples.
- What if we wanted to be 99% sure that the mean falls within the interval?
- What happens when we move from a 90% to a 99% confidence interval? The critical value grows, so the interval widens:
$z_{.005} = 2.58$
$25 \pm 2.58 \times 1.5 = 25 \pm 3.87$
$21.13 < \mu < 28.87$
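The widening of the interval as confidence rises can be seen by computing it at several levels for the Example 2 data. This is a sketch only; it uses exact critical values from `statistics.NormalDist` (about 1.645 and 2.576) rather than the rounded table values, so endpoints may differ from the hand calculation by about 0.01.

```python
from statistics import NormalDist

x_bar, se = 25.0, 6 / 16 ** 0.5   # Example 2: sigma = 6, n = 16, so SE = 1.5

intervals = {}
for confidence in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    intervals[confidence] = (x_bar - z * se, x_bar + z * se)
    low, high = intervals[confidence]
    print(f"{confidence:.0%}: {low:.2f} to {high:.2f}")
```

The sample and its standard error are the same in every row; only the critical value changes, which is the price of higher confidence.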
- C. Confidence Intervals ($\sigma$ unknown)
- 1. Characteristics of a Student t-distribution
- a. Shape of the Student t-distribution
- The t-distribution is bell-shaped like the normal but has heavier tails, so its cutoff values are larger ($t_{.025} > z_{.025}$). It changes shape as the sample size gets larger, and in the limit it becomes identical to the normal.
- b. When to use the t-distribution
- i. $\sigma$ is unknown
- ii. Sample size n is small
- 2. Constructing Confidence Intervals Using the t-Distribution
- A. Confidence Interval: the 95% confidence interval is
$\mu = \bar{X} \pm t_{.025} \times s/\sqrt{n}$
- B. Using t-tables
- Say our sample size is n and we want to know the cutoff value that leaves 95% of the area under the curve in the middle.
- i) Find Degrees of Freedom
- The degrees of freedom are the number of independent pieces of information used to calculate the standard deviation, s. We denote them d.f., and here d.f. = n - 1.
- ii) Look up in the t-table
- Now we go down the side of the table to the degrees of freedom and across to the appropriate t-value.
- That's the cutoff value that gives you an area of .025 in each tail, leaving 95% under the middle of the curve.
- iii) Example:
- Suppose we have sample size n = 15 and want $t_{.025}$. What is the critical value? With d.f. = n - 1 = 14, the table gives $t_{.025} = 2.145$.
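The table lookup can be mimicked with a small dictionary. The values below are an excerpt of the $t_{.025}$ column of a standard t-table, typed in by hand rather than computed, and the helper name `t_critical_95` is hypothetical.

```python
# Two-sided 95% critical values t_.025 from a standard t-table,
# keyed by degrees of freedom (excerpt only).
T_025 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    10: 2.228, 14: 2.145, 15: 2.131, 30: 2.042,
}

def t_critical_95(n):
    """Look up t_.025 for a sample of size n, using d.f. = n - 1."""
    df = n - 1
    if df not in T_025:
        raise KeyError(f"d.f. = {df} is not in this excerpt of the table")
    return T_025[df]

print(t_critical_95(15))  # d.f. = 14
print(t_critical_95(4))   # d.f. = 3
```

The key step is the row choice: the table is indexed by degrees of freedom, not by the sample size itself.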
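- Answer (for a sample of n = 4 observations: 64, 66, 89, 77):
$\bar{X} = (64 + 66 + 89 + 77)/4 = 74$
$s^2 = [(64-74)^2 + (66-74)^2 + (89-74)^2 + (77-74)^2]/3 = 132.7$
$SE = s/\sqrt{n} = \sqrt{132.7/4} = 5.76$
d.f. = 3
$t_{.025} = 3.18$
$t_{.025} \times SE = 3.18 \times 5.76 = 18$
$\mu = \bar{X} \pm 18 = 74 \pm 18$
$56 < \mu < 92$ (not very precise with a sample of only size 4)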
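The same calculation can be sketched in code. The critical value 3.182 is copied from a t-table for d.f. = 3 rather than computed, since the Python standard library has no t-distribution; the variable names are illustrative.

```python
import math
import statistics

scores = [64, 66, 89, 77]
t_025 = 3.182   # from a t-table, d.f. = n - 1 = 3 (assumed, not computed)

x_bar = statistics.mean(scores)
s = statistics.stdev(scores)             # sample std. dev., divides by n - 1
se = s / math.sqrt(len(scores))
low, high = x_bar - t_025 * se, x_bar + t_025 * se

print(f"mean = {x_bar}, s^2 = {s**2:.1f}, SE = {se:.2f}")
print(f"{low:.0f} < mu < {high:.0f}")
```

`statistics.stdev` already uses the n - 1 divisor, which matches the d.f. = n - 1 convention of the slides.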
III. Differences of Means
- A. Population Variance Known
- Now we are interested in estimating the difference in population means $(\mu_1 - \mu_2)$ using the difference in sample means $(\bar{X}_1 - \bar{X}_2)$.
- Say we take samples of sizes $n_1$ and $n_2$ from the two populations, and we want to estimate the difference between the two population means.
- To tell how accurate this estimate is, we can construct the familiar confidence interval around the difference:
$(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm z_{.025}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
- This would be the formula if the sample sizes were large and we knew both $\sigma_1$ and $\sigma_2$.
- B. Population Variance Unknown
- 1. If, as usual, we do not know $\sigma_1$ and $\sigma_2$, then we use the sample standard deviations $s_1$ and $s_2$ instead. When the population variances are not equal ($\sigma_1 \neq \sigma_2$):
$(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm t_{.025}\sqrt{s_1^2/n_1 + s_2^2/n_2}$
- Example: Test scores of two classes where one is from an inner-city school and the other is from an affluent suburb.
- 2. Pooled Sample Variances, $\sigma_1 = \sigma_2$ ($\sigma^2$ is unknown)
- If both samples come from the same population (e.g., test scores for two classes in the same school), we can assume that they have the same population variance $\sigma^2$. Then the formula becomes:
$(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm t_{.025}\sqrt{s_p^2/n_1 + s_p^2/n_2}$
or just
$(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm t_{.025}\, s_p\sqrt{1/n_1 + 1/n_2}$
- In this case, we say that the sample variances are pooled. The formula for $s_p^2$ is:
$s_p^2 = \dfrac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{(n_1 - 1) + (n_2 - 1)}$
- The degrees of freedom are $(n_1 - 1) + (n_2 - 1)$, or $(n_1 + n_2 - 2)$.
- 3. Example:
- Two classes from the same school take a test. Calculate the 95% confidence interval for the difference between the two class means.
X1    X2
64    56
66    71
89    53
77
$\sum X_1/n_1 = 296/4 = 74$
$\sum X_2/n_2 = 180/3 = 60$
- Answer:
$(\bar{X}_1 - \bar{X}_2) = 74 - 60 = 14$
$n_1 = 4$; $n_2 = 3$
$s_p^2 = (398 + 186)/(3 + 2) = 117$
$s_p = 10.8$
$SE = s_p\sqrt{1/n_1 + 1/n_2} = \sqrt{117}\,\sqrt{1/4 + 1/3} = 8.26$
d.f. = 5
$t_{.025} = 2.57$
$t_{.025} \times SE = 2.57 \times 8.26 = 21$
$(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm 21 = 14 \pm 21$
$-7 \le (\mu_1 - \mu_2) \le 35$
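The pooled calculation can be sketched as follows. The critical value 2.571 is copied from a t-table for d.f. = 5 rather than computed, and the helper name `sum_sq_dev` is hypothetical. Because the code does not round $s_p^2$ to 117 along the way, its standard error comes out near 8.25 rather than 8.26; the rounded interval is unchanged.

```python
import math
from statistics import mean

x1 = [64, 66, 89, 77]
x2 = [56, 71, 53]
t_025 = 2.571   # from a t-table, d.f. = n1 + n2 - 2 = 5 (assumed, not computed)

def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(x1), len(x2)
sp2 = (sum_sq_dev(x1) + sum_sq_dev(x2)) / (n1 + n2 - 2)   # pooled variance
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
diff = mean(x1) - mean(x2)
low, high = diff - t_025 * se, diff + t_025 * se

print(f"sp^2 = {sp2:.1f}, SE = {se:.2f}")
print(f"{low:.0f} <= mu1 - mu2 <= {high:.0f}")
```

The interval includes zero, so with samples this small the data do not rule out equal class means.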
- C. Matched Samples
- 1. Definition
- Matched samples are ones where you measure the same individual at two different points in time and then calculate the difference.
- 2. Advantage
- One advantage of matched samples is that they reduce the variance, because each individual serves as his or her own control for many other variables that may influence the outcome.
- 3. Calculating a Confidence Interval
- Now for each individual we can calculate the difference D from one time to the next.
- We then use these D's as the data set to estimate $\Delta$, the population difference.
- The sample mean of the differences will be denoted $\bar{D}$.
- The standard error will just be:
$SE = s_D/\sqrt{n}$
- Use the t-distribution to construct the 95% confidence interval:
$\Delta = \bar{D} \pm t_{.025} \times s_D/\sqrt{n}$
- 4. Example:
Student    X1 (Fall)    X2 (Spring)    D = X1 - X2
Trimble    64           57             7
Wilde      66           57             9
Giannos    89           73             16
Ames       77           65             12
$\bar{D} = (7 + 9 + 16 + 12)/4 = 11$
d.f. = n - 1 = 3
$s_D^2 = [(7-11)^2 + (9-11)^2 + (16-11)^2 + (12-11)^2]/3 = 46/3$
$s_D = 3.92$
$SE = s_D/\sqrt{4} = 3.92/2 = 1.96$
$t_{.025} = 3.18$
$t_{.025} \times SE = 3.18 \times 1.96 = 6$
So $\Delta = \bar{D} \pm t_{.025} \times s_D/\sqrt{n} = 11 \pm 6$, that is, $5 \le \Delta \le 17$.
- Notice that the standard error here is much smaller than in our unmatched examples.
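The matched-sample steps reduce to a one-sample t interval on the differences, which can be sketched as below. The critical value 3.182 is copied from a t-table for d.f. = 3 rather than computed, and the variable names are illustrative.

```python
import math
import statistics

fall = [64, 66, 89, 77]
spring = [57, 57, 73, 65]
t_025 = 3.182   # from a t-table, d.f. = n - 1 = 3 (assumed, not computed)

diffs = [a - b for a, b in zip(fall, spring)]   # D = X1 - X2 per student
d_bar = statistics.mean(diffs)
se = statistics.stdev(diffs) / math.sqrt(len(diffs))
low, high = d_bar - t_025 * se, d_bar + t_025 * se

print(diffs, d_bar, round(se, 2))
print(f"{low:.0f} <= Delta <= {high:.0f}")
```

The fall scores here are the same four values used in the unmatched examples, yet the standard error drops from 5.76 to about 1.96 because differencing removes the variation between students.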
IV. Confidence Intervals for Proportions
- Just before the 1996 presidential election, a Gallup poll of about 1500 voters showed 840 for Clinton and 660 for Dole. Calculate the 95% confidence interval for the population proportion of Clinton supporters.
- Answer: n = 1500
- Sample proportion P:
$P = 840/1500 = .56$
- Create a 95% confidence interval:
$\pi = P \pm \text{sampling allowance}$
$\pi = P \pm 1.96\sqrt{P(1-P)/n}$
- where $\pi$ and P are the population and sample proportions, and n is the sample size.
$\pi = .56 \pm 1.96\sqrt{.56(1-.56)/1500}$
$\pi = .56 \pm .03$
- That is, with 95% confidence, the proportion for Clinton in the whole population of voters was between 53% and 59%.
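The proportion interval can be sketched with the same formula. The function name `proportion_interval` is hypothetical, and the code assumes the normal approximation used on the slide, with $z_{.025} = 1.96$.

```python
import math

def proportion_interval(successes, n, z=1.96):
    """Normal-approximation interval: P +/- z * sqrt(P * (1 - P) / n)."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, p - margin, p + margin

# Poll from the example: 840 of 1500 respondents for Clinton.
p, low, high = proportion_interval(successes=840, n=1500)
print(f"P = {p:.2f}, interval = {low:.2f} to {high:.2f}")
```

The unrounded margin is about .025, which the slide rounds up to .03.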
Formulas and figure labels from the slide graphics:
- Mean: population $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$; sample $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
- Variance: population $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i - \mu)^2$; sample $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$
- Standardizing a sample mean: $Z = (\bar{X} - \mu)/SE$, with $SE = \sigma/\sqrt{n}$
- $\alpha$-level: $\alpha = (1 - .95) = .05$ (5% level), leaving $\alpha/2 = .025$ in each tail
- [Figure: sampling distribution centered at $\mu$ with spread SE; tails of $.05/2 = .025$ beyond $-1.96 \times SE$ and $1.96 \times SE$; sample means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ marked against the interval]
- Interval with $\sigma$ known: $\bar{X} - z_{.025}\,\sigma/\sqrt{n} < \mu < \bar{X} + z_{.025}\,\sigma/\sqrt{n}$; in general $\mu = \bar{X} \pm z_{\alpha/2} \times SE$, and for 95% $\mu = \bar{X} \pm 1.96 \times SE$
- Example 1: $SE = .48/\sqrt{180} = 0.036$; $1.96 \times .036 = .07$; $.75 < \mu < .89$ [Figure: curve centered at .82 with SE = .036 and limits .75 and .89]
- Example 2: $z_{.05} = 1.64$, $SE = 6/\sqrt{16} = 1.5$, $25 \pm 2.46$, $22.54 < \mu < 27.46$; $z_{.005} = 2.58$, $25 \pm 3.87$, $21.13 < \mu < 28.87$ [Figure: the 90% limits lie inside the 99% limits]
- [Figure: Normal Distribution versus Student-t, with cutoffs $z_{.025}$ and $t_{.025}$]
- Interval with $\sigma$ unknown: $\mu = \bar{X} \pm t_{.025} \times s/\sqrt{n}$
- t example: $\bar{X} = (64 + 66 + 89 + 77)/4 = 74$; $s^2 = 132.7$; $SE = \sqrt{132.7/4} = 5.76$; d.f. = 3; $t_{.025} = 3.18$; $56 < \mu < 92$ [Figure: curve centered at 74 with limits 56 and 92]
- Difference of means, variances known: $(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm z_{.025}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
- Difference of means, variances unknown and unequal: $(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm t_{.025}\sqrt{s_1^2/n_1 + s_2^2/n_2}$
- Difference of means, pooled variance: $(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm t_{.025}\, s_p\sqrt{1/n_1 + 1/n_2}$, with $s_p^2 = \dfrac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{(n_1 - 1) + (n_2 - 1)}$
- Pooled example: $\bar{X}_1 = 74$, $\bar{X}_2 = 60$, $(\bar{X}_1 - \bar{X}_2) = 14$; $s_p^2 = (398 + 186)/(3 + 2) = 117$; $s_p = 10.8$; $SE = 8.26$; d.f. = 5; $t_{.025} = 2.57$; $-7 \le (\mu_1 - \mu_2) \le 35$ [Figure: curve for $\mu_1 - \mu_2$ with limits -7 and 35]
- Matched samples: $SE = s_D/\sqrt{n}$ and $\Delta = \bar{D} \pm t_{.025} \times s_D/\sqrt{n}$; in the example $\bar{D} = (7 + 9 + 16 + 12)/4 = 11$, $s_D^2 = 46/3$, $SE = 1.96$, $5 \le \Delta \le 17$
- Proportions: $P = 840/1500 = .56$; $\pi = P \pm 1.96\sqrt{P(1-P)/n} = .56 \pm .03$