95% confidence interval with graph

lecture06.ppt

I. Introduction

  • A. Review of Population and Sample Estimates 
  • B. Sampling 
  • 1. Samples
  • The sample mean $\bar{X}$ is a random variable; for reasonably large samples its distribution is approximately normal.
  • If you draw many samples of a given size, the sample means will, on average, equal the true population mean.

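            Population                                              Sample

Mean        $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$                   $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$

Variance    $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2$    $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$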
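The population and sample formulas differ mainly in the divisor used for the variance (N versus n - 1). A minimal sketch with Python's standard `statistics` module makes the difference visible; the four scores are borrowed from a later example in these notes, and treating them as a complete population is an assumption made only for illustration.

```python
from statistics import mean, pvariance, variance

scores = [64, 66, 89, 77]

# Population formula: divide the squared deviations by N.
pop_var = pvariance(scores)

# Sample formula: divide by n - 1 to estimate sigma^2 from a sample.
sample_var = variance(scores)

print(f"mean = {mean(scores)}")
print(f"population variance (divide by N)     = {pop_var:.1f}")
print(f"sample variance     (divide by n - 1) = {sample_var:.1f}")
```

The sample variance is always the larger of the two for the same data, because the same sum of squared deviations is divided by a smaller number.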

  • 2. Central Limit Theorem
  • Definition
  • If a random sample of size n is taken from any population with mean $\mu$ and standard deviation $\sigma$, then for sufficiently large n the sampling distribution of the sample mean $\bar{X}$ will be approximately normal, with
  • 1) Sample mean $E(\bar{X}) = \mu$, and
  • 2) Standard error $SE(\bar{X}) = \sigma/\sqrt{n}$
  • As n increases, the standard error shrinks, so the sampling distribution of $\bar{X}$ concentrates more tightly around the true population mean $\mu$.
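The two claims of the theorem can be checked by simulation. The sketch below is only an illustration under stated assumptions: the population is Uniform(0, 1), chosen because it is clearly not normal, and the sample size, number of samples, and seed are arbitrary choices that do not come from the lecture.

```python
import random
import statistics

def simulate_sample_means(n, num_samples, seed=0):
    """Draw num_samples samples of size n from Uniform(0, 1); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.random() for _ in range(n))
        for _ in range(num_samples)
    ]

# Uniform(0, 1) has mean 0.5 and standard deviation sqrt(1/12).
POP_MEAN = 0.5
POP_SD = (1 / 12) ** 0.5

n = 25
means = simulate_sample_means(n, num_samples=2000)

observed_mean = statistics.fmean(means)   # should be close to mu
observed_se = statistics.stdev(means)     # should be close to sigma / sqrt(n)
predicted_se = POP_SD / n ** 0.5

print(f"mean of sample means: {observed_mean:.4f} (mu = {POP_MEAN})")
print(f"SD of sample means:   {observed_se:.4f} (sigma/sqrt(n) = {predicted_se:.4f})")
```

Rerunning with a larger n should show the spread of the sample means shrinking in proportion to $1/\sqrt{n}$.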

  • C. Making Inferences
  • To make inferences about the population from a given sample, we make one correction to the z-score: instead of dividing by the standard deviation, we divide by the standard error of the sampling process:

$Z = \frac{\bar{X} - \mu}{SE}$

  • Today, we want to develop a tool to determine how confident we are that our estimates lie within a certain range.

II. Confidence Interval

  • A. Definition of 95% Confidence Interval ($\sigma$ known)
  • 1. Motivation
  • We know that, on average, $\bar{X}$ is equal to $\mu$.
  • We want some way to express how confident we are that a given $\bar{X}$ is near the actual $\mu$ of the population.
  • We do this by constructing a confidence interval, which is a range around $\bar{X}$ that most probably contains $\mu$.
  • The standard error is a measure of how much error there is in the sampling process. So we can say that $\bar{X}$ is equal to $\mu \pm$ the standard error
  • $\mu = \bar{X} \pm$ Standard Error

  • 2. Constructing a 95% Confidence Interval
  • a. Graph
  • First, we know that the sample mean is distributed normally, with mean $\mu$ and standard error

$SE = \sigma/\sqrt{n}$

[Figure: normal sampling distribution of the sample mean, centered at $\mu$, with spread SE]

  • b. Second, we determine how confident we want to be in our estimate of $\mu$.
  • The confidence we choose fixes the $\alpha$-level: $\alpha$ is one minus the confidence level.
  • So a 95% confidence interval has an associated $\alpha$-level of .05:

$\alpha = (1 - .95) = .05$, the 5% level

  • Because we are concerned with both higher and lower values, the relevant range is $\alpha/2$ probability in each tail.

[Figure: normal curve centered at $\mu$, with area $\alpha/2$ in each tail]

  • c. Define a 95% Confidence Range
  • Now let's take an interval around $\mu$ that contains 95% of the area under the curve.
  • So if we take a random sample of size n from the population, 95% of the time the sample mean $\bar{X}$ will fall within this range around $\mu$.
  • What is the z-value associated with a .025 probability?
  • From the z-table, we find the z-value associated with a .025 tail probability is 1.96.
  • So our range will be bounded by:

[$-z_{.025} \cdot SE$, $z_{.025} \cdot SE$]

[$-1.96 \cdot SE$, $1.96 \cdot SE$]
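The z-table lookup can also be done in code. This is a small sketch using `NormalDist.inv_cdf` from Python's standard library; the helper name `z_critical` is made up for this illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    # Leave alpha/2 in the upper tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

Printed z-tables usually round these values to two decimals, which is where figures such as 1.96 come from.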

  • d. Interpreting Confidence Intervals
  • Now, let's take this interval of size [$-1.96 \cdot SE$, $1.96 \cdot SE$] and use it as a measuring rod.
  • What's the probability that the population mean $\mu$ will fall within the interval $\bar{X} \pm 1.96 \cdot SE$?
  • e. In General
  • We get the actual interval as $1.96 \cdot SE$ on either side of the sample mean $\bar{X}$.
  • We then know that 95% of the time, this interval will contain $\mu$. This interval is defined by:

$\bar{X} - z_{.025} \cdot \sigma/\sqrt{n} < \mu < \bar{X} + z_{.025} \cdot \sigma/\sqrt{n}$

$\mu = \bar{X} \pm z_{\alpha/2} \cdot SE$
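The statement that the interval contains $\mu$ 95% of the time refers to repeated sampling, which a simulation can illustrate. The sketch below is hedged accordingly: the population parameters (mu = 50, sigma = 10), the sample size, the number of trials, and the seed are all hypothetical values picked for the demonstration.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, trials, confidence=0.95, seed=1):
    """Fraction of simulated intervals X-bar +/- z * SE that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and build its interval.
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - z * se < mu < x_bar + z * se:
            hits += 1
    return hits / trials

rate = coverage(mu=50.0, sigma=10.0, n=30, trials=2000)
print(f"share of intervals containing mu: {rate:.3f}")
```

Each individual interval either contains $\mu$ or it does not; the 95% describes the long-run share of intervals that do.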

  • For a 95% confidence interval:

$\mu = \bar{X} \pm z_{.025} \cdot SE$

$\mu = \bar{X} \pm 1.96 \cdot SE$

  • Examples
  • 1. Example 1: Calculate a 95% Confidence Interval
  • Say we sample n = 180 people and record how many times each ate at a fast-food restaurant in a given week. The sample mean is 0.82 and the population standard deviation $\sigma$ is 0.48. Calculate the 95% confidence interval for these data.

  • Answer:

$SE = .48/\sqrt{180} = .036$

$z_{.025} \cdot SE = 1.96 \cdot .036 = .07$

C.I. = $.82 \pm .07$, or $.75 < \mu < .89$

  • Why 95%?
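The same arithmetic can be reproduced in a few lines. This is a sketch, not the lecture's own method: the function name `z_interval` is invented here, and it uses the exact normal quantile rather than the rounded table value 1.96.

```python
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5   # z_{alpha/2} * SE
    return x_bar - margin, x_bar + margin

# Fast-food example: n = 180, sample mean 0.82, sigma = 0.48.
low, high = z_interval(x_bar=0.82, sigma=0.48, n=180)
print(f"95% CI: {low:.2f} < mu < {high:.2f}")
```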

  • 2. Example 2: Calculating a 90% confidence interval
  • A random sample of 16 observations was drawn from a normal population with $\sigma = 6$; the sample mean is $\bar{X} = 25$. Find a 90% ($\alpha = .10$) confidence interval for the population mean, $\mu$.
  • First, find $z_{.10/2}$ in the standard normal tables

$z_{.05} = 1.64$

  • Second, calculate the 90% confidence interval

$\mu = \bar{X} \pm z_{.05} \cdot \sigma/\sqrt{n}$

$\mu = 25 \pm 1.64 \cdot 6/\sqrt{16}$

$SE = 1.5$

$\mu = 25 \pm 1.64 \cdot 1.5 = 25 \pm 2.46$

$22.54 < \mu < 27.46$

  • An interval constructed this way will contain the population mean 90% of the time.
  • What if we wanted to be 99% sure that the mean falls within the interval?

$z_{.005} = 2.58$

$25 \pm 2.58 \cdot 1.5$

$25 \pm 3.87$

$21.13 < \mu < 28.87$

  • What happens when we move from a 90% to a 99% confidence interval? The interval widens: more confidence requires a wider range.
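To see how the interval changes with the confidence level, the calculation can be repeated for several levels at once. The sketch below assumes the data of Example 2 ($\bar{X} = 25$, $\sigma = 6$, n = 16); because it uses exact quantiles (about 1.645 and 2.576) rather than the two-decimal table values, its bounds can differ from the hand calculation in the second decimal place.

```python
from statistics import NormalDist

x_bar, sigma, n = 25, 6, 16
se = sigma / n ** 0.5   # 6 / 4 = 1.5

intervals = {}
for confidence in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    intervals[confidence] = (x_bar - z * se, x_bar + z * se)
    low, high = intervals[confidence]
    print(f"{confidence:.0%}: {low:.2f} < mu < {high:.2f} (width {high - low:.2f})")
```

The width grows with the confidence level because the critical value grows while the standard error stays fixed.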

  • C. Confidence Intervals ($\sigma$ unknown)
  • 1. Characteristics of a Student t-distribution
  • a. Shape of the Student t-distribution
  • The t-distribution is bell-shaped like the normal but has heavier tails. Its shape changes as the sample size gets larger, and in the limit it becomes identical to the normal.

  • b. When to use the t-distribution
  • i. $\sigma$ is unknown
  • ii. Sample size n is small
  • 2. Constructing Confidence Intervals Using the t-Distribution
  • A. Confidence Interval
  • The 95% confidence interval is:

$\bar{X} \pm t_{.025} \cdot s/\sqrt{n}$

  • B. Using t-tables
  • Say our sample size is n and we want to know the cutoff value that leaves 95% of the area under the curve.

  • i) Find Degrees of Freedom
  • Degrees of freedom (d.f.) measure the amount of information used to calculate the sample standard deviation, s: d.f. = n - 1
  • ii) Look up in the t-table
  • Now we go down the side of the table to the degrees of freedom and across to the appropriate t-value.
  • That's the cutoff value that gives you an area of .025 in each tail, leaving 95% under the middle of the curve.
  • iii) Example:
  • Suppose we have sample size n = 15 and want $t_{.025}$. What is the critical value? With d.f. = n - 1 = 14, the table gives 2.145.

  • Answer:

$\bar{X} = (64 + 66 + 89 + 77)/4 = 74$

$s^2 = [(64-74)^2 + (66-74)^2 + (89-74)^2 + (77-74)^2]/3 = 132.7$

$SE = s/\sqrt{n} = \sqrt{132.7/4} = 5.76$

d.f. = 3

$t_{.025} = 3.18$

$t_{.025} \cdot SE = 3.18 \cdot 5.76 = 18$

$\mu = \bar{X} \pm 18$

$56 < \mu < 92$ (not very precise with a sample of only size 4)
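A short sketch of the same t-based interval follows. Python's standard library has no t-distribution, so the critical values are hard-coded from a standard t-table; the dictionary `T_025` and the function name are inventions for this illustration and cover only the degrees of freedom that appear in these notes.

```python
from statistics import mean, stdev

# Two-tailed 95% critical values t_{.025}, copied from a standard t-table.
# Keys are degrees of freedom; only the values needed here are listed.
T_025 = {3: 3.182, 5: 2.571, 14: 2.145}

def t_interval_95(data):
    """95% confidence interval for the mean when sigma is unknown."""
    n = len(data)
    se = stdev(data) / n ** 0.5      # stdev divides by n - 1
    margin = T_025[n - 1] * se       # d.f. = n - 1
    center = mean(data)
    return center - margin, center + margin

low, high = t_interval_95([64, 66, 89, 77])
print(f"95% CI: {low:.0f} < mu < {high:.0f}")
```

For other sample sizes the table would need more entries, or a statistics package that provides t quantiles.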

III. Differences of Means

  • A. Population Variance Known
  • Now we are interested in estimating the difference $(\mu_1 - \mu_2)$ using the difference in sample means, $\bar{X}_1 - \bar{X}_2$.
  • Say we take samples of size $n_1$ and $n_2$ from the two populations, and we want to estimate the difference between the two population means.
  • To tell how accurate this estimate is, we can construct the familiar confidence interval around the difference:

$(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm z_{.025} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

  • This would be the formula if the sample sizes were large and we knew both $\sigma_1$ and $\sigma_2$.
  • B. Population Variance Unknown
  • 1. If, as usual, we do not know $\sigma_1$ and $\sigma_2$, then we use the sample standard deviations $s_1$ and $s_2$ instead. When the population variances are not equal ($\sigma_1 \ne \sigma_2$):

$(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm t_{.025} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

  • Example: Test scores of two classes where one is from an inner city school and the other is from an affluent suburb.

  • 2. Pooled Sample Variances, $\sigma_1 = \sigma_2$ ($\sigma^2$ is unknown)
  • If both samples come from the same population (e.g., test scores for two classes in the same school), we can assume that they have the same population variance $\sigma^2$. Then the formula becomes:

$(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm t_{.025} \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$,

or just

$(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm t_{.025} \cdot s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$.

  • In this case, we say that the sample variances are pooled. The formula for $s_p^2$ is:

$s_p^2 = \frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{(n_1 - 1) + (n_2 - 1)}$

  • The degrees of freedom are $(n_1 - 1) + (n_2 - 1)$, or $(n_1 + n_2 - 2)$.
  • 3. Example:
  • Two classes from the same school take a test. Calculate the 95% confidence interval for the difference between the two class means.

X1    X2
64    56
66    71
89    53
77

$\sum X_1/n_1 = 296/4 = 74$

$\sum X_2/n_2 = 180/3 = 60$

  • Answer:

$\bar{X}_1 = 74$

$\bar{X}_2 = 60$

$(\bar{X}_1 - \bar{X}_2) = 14$

$n_1 = 4$; $n_2 = 3$

$s_p^2 = \frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{(n_1 - 1) + (n_2 - 1)}$

$= \frac{(64-74)^2 + (66-74)^2 + (89-74)^2 + (77-74)^2 + (56-60)^2 + (71-60)^2 + (53-60)^2}{(4-1) + (3-1)}$

$s_p^2 = (398 + 186)/(3 + 2) = 117$

$s_p = 10.8$

$SE = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 10.8 \sqrt{\frac{1}{4} + \frac{1}{3}} = 8.25$

d.f. = 5

$t_{.025} = 2.57$

$t_{.025} \cdot SE = 2.57 \cdot 8.25 = 21$

$(\mu_1 - \mu_2) = (\bar{X}_1 - \bar{X}_2) \pm 21 = 14 \pm 21$

$-7 \le (\mu_1 - \mu_2) \le 35$
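The pooled calculation has several steps, so a compact sketch may help in checking them. The function name `pooled_interval` is hypothetical, and the critical value is hard-coded from a standard t-table (d.f. = 5) because the standard library does not provide t quantiles.

```python
from statistics import mean

T_025_DF5 = 2.571   # t_{.025} for 5 degrees of freedom, from a standard t-table

def pooled_interval(x1, x2, t_crit):
    """CI for mu1 - mu2, assuming both populations share one variance."""
    m1, m2 = mean(x1), mean(x2)
    ss1 = sum((x - m1) ** 2 for x in x1)   # squared deviations, sample 1
    ss2 = sum((x - m2) ** 2 for x in x2)   # squared deviations, sample 2
    df = (len(x1) - 1) + (len(x2) - 1)
    s_p = ((ss1 + ss2) / df) ** 0.5        # pooled standard deviation
    se = s_p * (1 / len(x1) + 1 / len(x2)) ** 0.5
    diff = m1 - m2
    return diff - t_crit * se, diff + t_crit * se

low, high = pooled_interval([64, 66, 89, 77], [56, 71, 53], T_025_DF5)
print(f"95% CI: {low:.0f} <= mu1 - mu2 <= {high:.0f}")
```

Because the interval includes zero, these two small samples do not rule out equal class means.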

  • C. Matched Samples
  • 1. Definition
  • Matched samples are ones where you take a single individual, measure him or her at two different points in time, and then calculate the difference.
  • 2. Advantage
  • One advantage of matched samples is that they reduce the variance, because each individual acts as his or her own control for many other variables that may influence the outcome.

  • 3. Calculating a Confidence Interval
  • Now for each individual we can calculate the difference D from one time to the next.
  • We then use these D's as the data set to estimate $\Delta$, the population difference.
  • The sample mean of the differences will be denoted $\bar{D}$.
  • The standard error will just be:

$SE = s_D/\sqrt{n}$

  • Use the t-distribution to construct a 95% confidence interval:

$\Delta = \bar{D} \pm t_{.025} \cdot s_D/\sqrt{n}$

  • 4. Example:

Student    X1 (Fall)    X2 (Spring)    D = X1 - X2
Trimble    64           57             7
Wilde      66           57             9
Giannos    89           73             16
Ames       77           65             12

$\bar{D} = (7 + 9 + 16 + 12)/4 = 11$

d.f. = n - 1 = 3

$s_D^2 = [(7-11)^2 + (9-11)^2 + (16-11)^2 + (12-11)^2]/3 = 46/3$

$s_D = 3.92$

$SE = s_D/\sqrt{4} = 3.92/2 = 1.96$

$t_{.025} = 3.18$

$t_{.025} \cdot SE = 3.18 \cdot 1.96 \approx 6$

So $\Delta = \bar{D} \pm t_{.025} \cdot s_D/\sqrt{n} = 11 \pm 6$, or 5 to 17:

$5 \le \Delta \le 17$

  • Notice that the standard error here is much smaller than in most of our unmatched-sample examples.
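The matched-sample interval is just a one-sample t interval applied to the differences, which the following sketch makes explicit. The variable names are illustrative, and the critical value for 3 degrees of freedom is hard-coded from a standard t-table.

```python
from statistics import mean, stdev

T_025_DF3 = 3.182   # t_{.025} for 3 degrees of freedom, from a standard t-table

fall = [64, 66, 89, 77]      # X1 for Trimble, Wilde, Giannos, Ames
spring = [57, 57, 73, 65]    # X2 for the same four students

# Work with the per-student differences D = X1 - X2.
diffs = [x1 - x2 for x1, x2 in zip(fall, spring)]

d_bar = mean(diffs)
se = stdev(diffs) / len(diffs) ** 0.5   # s_D / sqrt(n)
margin = T_025_DF3 * se

low, high = d_bar - margin, d_bar + margin
print(f"D-bar = {d_bar}, SE = {se:.2f}")
print(f"95% CI: {low:.0f} <= Delta <= {high:.0f}")
```

Pairing removes the student-to-student variation, which is why the standard error is so much smaller than in the unmatched example.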

IV. Confidence Intervals for Proportions

  • Just before the 1996 presidential election, a Gallup poll of about 1500 voters showed 840 for Clinton and 660 for Dole. Calculate the 95% confidence interval for the population proportion $\pi$ of Clinton supporters.
  • Answer: n = 1500
  • Sample proportion P:

$P = 840/1500 = .56$

  • Create a 95% confidence interval:

$\pi = P \pm$ sampling allowance

$\pi = P \pm 1.96 \sqrt{\frac{P(1-P)}{n}}$,

  • where $\pi$ and P are the population and sample proportions, and n is the sample size.

$\pi = .56 \pm 1.96 \sqrt{\frac{.56(1-.56)}{1500}}$,

$\pi = .56 \pm .03$.

  • That is, with 95% confidence, the proportion for Clinton in the whole population of voters was between 53% and 59%.
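The proportion interval can be sketched the same way. The function name `proportion_interval` is made up for this illustration; it uses the table value 1.96 and the poll counts from the example. Unrounded, the sampling allowance comes out near .025, which the notes round up to .03.

```python
def proportion_interval(successes, n, z=1.96):
    """Approximate 95% confidence interval for a population proportion."""
    p = successes / n
    margin = z * (p * (1 - p) / n) ** 0.5   # z * sqrt(P(1 - P) / n)
    return p - margin, p + margin

# Gallup example: 840 of 1500 respondents for Clinton.
low, high = proportion_interval(840, 1500)
print(f"95% CI: {low:.3f} < pi < {high:.3f}")
```

This normal approximation is reasonable here because the sample is large and P is not close to 0 or 1.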
