Unit 6

Introduction to Inferential Statistics, the Sampling Distribution and Estimation

Unit Objectives

 Explain the purpose of inferential statistics in terms of generalizing from a sample

to a population.

 Explain the basic techniques of random sampling and these key concepts:

population, sample, parameter, statistic, representative, EPSEM.

 Differentiate among the sampling distribution, the sample, and the population.

 Explain the two theorems presented.

 Explain the logic of estimation and the role of the sample, sampling distribution, and population.

 Define and explain the concepts of bias and efficiency.

 Construct and interpret confidence intervals for sample means and sample proportions.

 Use SPSS to get sample statistics to use in the construction of confidence intervals.

Unit Outline

 Using Statistics

 Probability Sampling: Basic Concepts

 The Sampling Distribution

 Symbols and Terminology

 Introduction to Estimation

 Estimation Selection Criteria

 Interval Estimation Procedures

 Computing Confidence Intervals: A Summary

 Controlling the Width of Interval Estimates

Using Statistics

 The techniques in this chapter can be used to estimate:

 Changes in attitudes toward same sex marriage over time

 Changes in support for medical marijuana

 Levels of support for political candidates

 Effectiveness of health treatments

 Public reactions to controversial social issues

Probability Sampling: Basic Concepts

 Inferential statistics allow us to generalize what we see in our sample data to the

population.

 Probability samples are sometimes called random samples, although the former term is

preferred.

 Nonprobability samples are often used by researchers, but inferential statistics cannot be

performed on data from nonprobability samples (such as convenience samples).

The Sampling Distribution

 The sampling distribution is “the distribution of a statistic (such as a mean) for all possible sample outcomes of a certain size.”

 Sampling distributions are theoretical, not empirical.

 The sampling distribution is the central concept of inferential statistics.

 Constructing a Sampling Distribution

 Imagine a population of 40,000 students in a university.

 You select a sample of 1,000 students and note some

characteristic of the sample (such as its average age)

 You then select a second sample of 1,000 students and

again note some characteristic of the sample.

 You continue this procedure repeatedly until all possible

unique combinations of 1,000 students have been

sampled.

 Together, these sample means form the sampling

distribution.
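
Drawing every possible sample of 1,000 from 40,000 students is not feasible in practice, so the procedure above can only be approximated. A minimal sketch in Python, assuming an invented population of student ages; the age range, the seed, and the 500 repetitions are illustrative assumptions, not values from the unit:

```python
import random
import statistics

random.seed(6)  # fixed seed so the sketch is repeatable (arbitrary choice)

# Hypothetical population: ages of 40,000 students (values are invented).
population = [random.randint(17, 30) for _ in range(40_000)]

def sample_means(pop, sample_size, repetitions):
    """Draw repeated random samples and record each sample's mean."""
    means = []
    for _ in range(repetitions):
        sample = random.sample(pop, sample_size)  # draw without replacement
        means.append(statistics.mean(sample))
    return means

# 500 repetitions stand in for "all possible samples" of size 1,000.
means = sample_means(population, sample_size=1_000, repetitions=500)

print("population mean:       ", round(statistics.mean(population), 2))
print("mean of sample means:  ", round(statistics.mean(means), 2))
print("spread of sample means:", round(statistics.stdev(means), 3))
```

The list `means` is only an empirical approximation of the sampling distribution, which itself remains theoretical. Its center should sit very close to the population mean, and its spread should be far smaller than the spread of individual ages.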

 Theorem 1

 “If repeated random samples of size N are drawn from a normal population with mean (μ) and standard deviation (σ), then the sampling distribution of sample means will be normal with a mean (μ) and a standard deviation of $\sigma / \sqrt{N}$.”

 The dispersion of the sampling distribution is called the standard

error and is calculated by dividing the population standard

deviation (σ) by the square root of the sample size (N).

 This theorem requires that the population is normally distributed.

Symbols and Terminology

 Symbols for Means and Standard Deviations of Three Distributions

Introduction to Estimation

 Estimation is the first branch of inferential statistics.

 The goal in estimation is to estimate population values (or parameters) from sample

statistics.

 A common, useful, and frequently used estimation technique is the confidence

interval.

 Confidence intervals are mathematical statements of the range of

possible values (or intervals) for a population parameter.

 Confidence intervals have a probability level attached.

Estimation Selection Criteria

 Estimates are selected based on two criteria:

 Bias: An estimator is unbiased if the “mean of its sampling

distribution is equal to the population value of interest.”

 Sample means are unbiased estimators of population

means (recall the Central Limit Theorem), and sample

proportions are unbiased estimators of population

proportions.

 Efficiency: An estimator is efficient if the “sampling distribution is

clustered about its mean.”

 Efficiency is a matter of dispersion around the true

population parameter; therefore it is affected by the

sample size. Larger samples increase the efficiency of

the estimate (recall the Central Limit Theorem).
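
The link between sample size and efficiency can be seen by holding σ fixed and varying N in the standard error, σ divided by the square root of N. A small sketch, assuming an arbitrary σ of 15 and three illustrative sample sizes:

```python
import math

sigma = 15  # assumed population standard deviation (illustrative)

# Standard error sigma / sqrt(N) for increasingly large samples.
errors = {n: sigma / math.sqrt(n) for n in (100, 400, 1600)}

for n, se in errors.items():
    print(f"N = {n:>4}: standard error = {se:.3f}")
```

Each fourfold increase in N halves the standard error, so the sampling distribution clusters more tightly around its mean as the sample grows.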

Interval Estimation Procedures

 Step 1: Set alpha.

 Decide the risk you are willing to take of being wrong (of your interval estimate not containing the true population parameter).

 This value, the probability of error, is called alpha (α).

 A related value, the confidence level, is the probability

that the calculated interval will contain the true

population parameter.

 Alpha levels are stated as probabilities; confidence levels

are stated as percentages.

 If the alpha equals 0.05, then the confidence

level is 95%.

 If the alpha equals 0.01, then the confidence

level is 99%.

 Step 2: Find the Z score.

 Picture the sampling distribution, divide the probability of error (alpha) equally between the upper and lower tails of the distribution (divide it by 2), and then find the corresponding Z score for that area.

 For example, given an alpha of 0.05: half of the area (0.0250) goes in the right tail, and half (0.0250) goes in the left tail. Using Appendix A, we find that a Z score of ±1.96 leaves exactly this amount of area beyond it in each tail.

 Common Z scores for various alpha levels are:
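
Critical Z scores can also be computed rather than looked up. A minimal sketch, assuming Python's standard `statistics.NormalDist` in place of Appendix A; the function name `critical_z` and the three alpha levels shown are illustrative choices:

```python
from statistics import NormalDist

def critical_z(alpha):
    """Two-tailed critical Z score: alpha is split evenly between the tails."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    # The Z score that leaves alpha / 2 of the area in the upper tail.
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    confidence = (1 - alpha) * 100  # confidence level as a percentage
    print(f"{confidence:.0f}% confidence (alpha = {alpha:.2f}): "
          f"Z = ±{critical_z(alpha):.2f}")
```

For an alpha of 0.05 this gives the ±1.96 used in the example above. Printed tables may round some values slightly differently.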

 Step 3: Construct the confidence interval.

 Confidence intervals can be constructed around a sample mean to estimate a population mean.

 Confidence intervals can be constructed around a sample proportion to estimate a population proportion.

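 Use this formula when the population standard deviation is known: $\text{c.i.} = \bar{X} \pm Z\left(\sigma / \sqrt{N}\right)$

 Use this formula when the population standard deviation is unknown: $\text{c.i.} = \bar{X} \pm Z\left(s / \sqrt{N - 1}\right)$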

Interval Estimation Procedures for Sample Means (Large samples)

 Example: In a sample of 200, you obtain a mean of 105. Assume that the population standard deviation is 15. Calculate a 95% confidence interval for the population mean.

 $\text{c.i.} = 105 \pm 1.96\left(15 / \sqrt{200}\right) = 105 \pm 2.08$

 Therefore, the interval estimate runs from 102.92 (105 - 2.08) to 107.08 (105 + 2.08).

 Example: In a sample of 500, you obtain a mean of 45,000 and a standard deviation of 200. Calculate a 95% confidence interval for the population mean.

 $\text{c.i.} = 45{,}000 \pm 1.96\left(200 / \sqrt{500 - 1}\right) = 45{,}000 \pm 17.55$

 Therefore, the interval estimate runs from 44,982.45 (45,000 - 17.55) to 45,017.55 (45,000 + 17.55).
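
Both examples can be checked with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, Z is fixed at 1.96 for the 95% level, and the second function divides by the square root of N - 1, which is the version that reproduces the 17.55 margin in the second example.

```python
import math

Z_95 = 1.96  # Z score for a 95% confidence level (alpha = 0.05)

def ci_mean_sigma_known(mean, sigma, n, z=Z_95):
    """Interval when the population standard deviation (sigma) is known."""
    margin = z * sigma / math.sqrt(n)
    return mean - margin, mean + margin

def ci_mean_sigma_unknown(mean, s, n, z=Z_95):
    """Interval when only the sample standard deviation (s) is available.

    Uses N - 1 under the square root (assumption, see the note above).
    """
    margin = z * s / math.sqrt(n - 1)
    return mean - margin, mean + margin

low, high = ci_mean_sigma_known(105, 15, 200)
print(round(low, 2), round(high, 2))    # 102.92 107.08

low, high = ci_mean_sigma_unknown(45_000, 200, 500)
print(round(low, 2), round(high, 2))    # 44982.45 45017.55
```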

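Interval Estimation Procedures for Sample Proportions (Large samples)

 Formula: $\text{c.i.} = P_s \pm Z\sqrt{P_u(1 - P_u) / N}$

 Where $P_s$ is the sample proportion and $P_u$ is set to 0.5, so that $P_u(1 - P_u) = 0.25$, its largest possible value, which maximizes the width of the interval estimate.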

 Example: In a sample of 200, you obtain a proportion of 0.30. Calculate a 95% confidence interval for the population proportion.

 $\text{c.i.} = 0.30 \pm 1.96\sqrt{0.25 / 200} = 0.30 \pm 0.07$

 Therefore, the interval estimate runs from 0.23 (0.30 - 0.07) to 0.37 (0.30 + 0.07).
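
The proportion example can be verified the same way. A minimal sketch, assuming a hypothetical helper `ci_proportion` with the population proportion defaulting to 0.5 (so that the product under the square root is 0.25, the value that yields the 0.07 margin above):

```python
import math

def ci_proportion(p_s, n, z=1.96, p_u=0.5):
    """Confidence interval for a population proportion.

    p_u = 0.5 is the conservative choice: it makes p_u * (1 - p_u) = 0.25,
    the largest value that product can take.
    """
    margin = z * math.sqrt(p_u * (1 - p_u) / n)
    return p_s - margin, p_s + margin

low, high = ci_proportion(0.30, 200)
print(round(low, 2), round(high, 2))  # 0.23 0.37
```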

Controlling the Width of Interval Estimates

The width of the interval estimate can be controlled through manipulation of two terms in the

formula:

1. The confidence level (or alpha) can be raised or lowered.

Higher confidence levels (lower alphas) produce wider intervals.

2. The sample size can be increased or decreased.

Larger samples produce narrower intervals.
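
Both effects can be seen by computing the full width of an interval for a mean, which is twice the margin of error. A small sketch, assuming an illustrative σ of 15, Z scores of 1.96 (95%) and 2.58 (99%), and arbitrary sample sizes of 200 and 800:

```python
import math

def interval_width(z, sigma, n):
    """Full width of a confidence interval for a mean: 2 * Z * sigma / sqrt(N)."""
    return 2 * z * sigma / math.sqrt(n)

sigma = 15  # assumed population standard deviation (illustrative)

# 1. Raising the confidence level (larger Z) widens the interval.
width_95 = interval_width(1.96, sigma, 200)
width_99 = interval_width(2.58, sigma, 200)

# 2. Increasing the sample size narrows the interval.
width_n200 = interval_width(1.96, sigma, 200)
width_n800 = interval_width(1.96, sigma, 800)

print(f"95% vs 99% (N = 200):     {width_95:.2f} vs {width_99:.2f}")
print(f"N = 200 vs N = 800 (95%): {width_n200:.2f} vs {width_n800:.2f}")
```

Quadrupling the sample size only halves the width, because N enters the formula under a square root.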

Summary

 Since populations are almost always too large to test, a fundamental strategy of

social science research is to select a sample from the defined population and then

use information from the sample to generalize to the population. This is done either

by estimation or by hypothesis testing.

 Simple random samples are created by selecting cases from a list of the population

following the rule of EPSEM (each case has an equal probability of being selected).

Samples selected by the rule of EPSEM have a very high probability of being

representative.

 The sampling distribution, the central concept in inferential statistics, is a

theoretical distribution of all possible sample outcomes. Since its overall shape,

mean, and standard deviation are known (under the conditions specified in the two

theorems), the sampling distribution can be adequately characterized and utilized by

researchers.

Basic Terms

 Alpha (α)

 Bias

 Central Limit Theorem

 Confidence interval

 Efficiency

 EPSEM

 μ

 $\mu_{\bar{X}}$

 $\mu_p$

 Nonprobability Sample

 Ps

 Pu

 Parameter

 Representative sample

 Sampling distribution

 Simple random sample

 Standard error of the mean