Unit 6
Introduction to Inferential Statistics, the Sampling Distribution and Estimation
Unit Objectives
Explain the purpose of inferential statistics in terms of generalizing from a sample
to a population.
Explain the basic techniques of random sampling and these key concepts:
population, sample, parameter, statistic, representative, EPSEM.
Differentiate among the sampling distribution, the sample, and the population.
Explain the two theorems presented.
Explain the logic of estimation and the role of the sample, sampling distribution, and
population.
Define and explain the concepts of bias and efficiency.
Construct and interpret confidence intervals for sample means and sample proportions.
Use SPSS to get sample statistics to use in the construction of confidence intervals.
Unit Outline
Using Statistics
Probability Sampling: Basic Concepts
The Sampling Distribution
Symbols and Terminology
Introduction to Estimation
Estimation Selection Criteria
Interval Estimation Procedures
Computing Confidence Intervals: A Summary
Controlling the Width of Interval Estimates
Using Statistics
The techniques in this chapter can be used to estimate:
Changes in attitudes toward same sex marriage over time
Changes in support for medical marijuana
Levels of support for political candidates
Effectiveness of health treatments
Public reactions to controversial social issues
Probability Sampling: Basic Concepts
Inferential statistics allow us to generalize what we see in our sample data to the
population.
Probability samples are sometimes called random samples, although the former term is
preferred.
Nonprobability samples are often used by researchers, but inferential statistics cannot be
performed on data from nonprobability samples (such as convenience samples).
The Sampling Distribution
The sampling distribution is “the distribution of a statistic (such as a mean) for
all possible sample outcomes of a certain size.”
Sampling distributions are theoretical, not empirical.
The sampling distribution is the central concept of inferential statistics.
Constructing a Sampling Distribution
Imagine a population of 40,000 students in a university.
You select a sample of 1,000 students and note some
characteristic of the sample (such as its average age).
You then select a second sample of 1,000 students and
again note some characteristic of the sample.
You continue this procedure repeatedly until all possible
unique combinations of 1,000 students have been
sampled.
Together, these sample means form the sampling
distribution.
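The procedure above can be approximated with a short simulation. This is only a sketch: drawing every possible combination of 1,000 students out of 40,000 is not feasible, so the code draws a limited number of random samples instead. The student ages, the number of samples (500), and the function name are made-up assumptions for illustration.

```python
import random
import statistics


def simulate_sampling_distribution(population, sample_size, num_samples, seed=0):
    """Approximate the sampling distribution of the mean.

    Draws `num_samples` simple random samples (without replacement)
    and records the mean of each one.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical population: ages of 40,000 students (invented values).
age_rng = random.Random(42)
population = [age_rng.randint(17, 30) for _ in range(40_000)]

# 500 samples of 1,000 students each stand in for "all possible samples".
sample_means = simulate_sampling_distribution(
    population, sample_size=1_000, num_samples=500
)

print("population mean:           ", round(statistics.fmean(population), 2))
print("mean of the sample means:  ", round(statistics.fmean(sample_means), 2))
print("std. dev. of sample means: ", round(statistics.stdev(sample_means), 3))
```

The list of sample means is an empirical stand-in for the sampling distribution, which itself remains theoretical.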
Theorem 1
“If repeated random samples of size N are drawn from a normal population with
mean (μ) and standard deviation (σ), then the sampling distribution of sample
means will be normal with a mean (μ) and a standard deviation of $\sigma / \sqrt{N}$.”
The dispersion of the sampling distribution is called the standard
error and is calculated by dividing the population standard
deviation (σ) by the square root of the sample size (N).
This theorem requires that the population is normally distributed.
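As a small worked illustration of the standard error calculation described above, the sketch below divides a population standard deviation by the square root of the sample size. The value σ = 15 and the sample sizes are assumed for the example, and `standard_error` is a hypothetical helper name.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of N."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)


# Assumed population standard deviation of 15; N values chosen for illustration.
for n in (25, 100, 400):
    print(f"N = {n:>3}: standard error = {standard_error(15, n):.2f}")
```

Quadrupling the sample size cuts the standard error in half, because N enters the formula through its square root.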
Symbols and Terminology
Symbols for Means and Standard Deviations of Three Distributions
Introduction to Estimation
Estimation is the first branch of inferential statistics.
The goal in estimation is to estimate population values (or parameters) from sample
statistics.
A common, useful, and frequently used estimation technique is the confidence
interval.
Confidence intervals are mathematical statements of the range of
possible values (or intervals) for a population parameter.
Confidence intervals have a probability level attached.
Estimation Selection Criteria
Estimates are selected based on two criteria:
Bias: An estimator is unbiased if the “mean of its sampling
distribution is equal to the population value of interest.”
Sample means are unbiased estimators of population
means (recall the Central Limit Theorem), and sample
proportions are unbiased estimators of population
proportions.
Efficiency: An estimator is efficient if the “sampling distribution is
clustered about its mean.”
Efficiency is a matter of dispersion around the true
population parameter; therefore it is affected by the
sample size. Larger samples increase the efficiency of
the estimate (recall the Central Limit Theorem).
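Both criteria can be checked informally with a simulation. The sketch below assumes a made-up, roughly normal population (mean 100, standard deviation 15) and 1,000 repeated samples per sample size. For an unbiased estimator the average of the sample means should sit close to the population mean, and for a more efficient estimate the sample means should cluster more tightly as N grows.

```python
import random
import statistics


def draw_sample_means(population, n, repeats, rng):
    """Means of `repeats` simple random samples of size `n`."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]


rng = random.Random(1)
# Hypothetical population of 20,000 scores (invented for illustration).
population = [rng.gauss(100, 15) for _ in range(20_000)]
mu = statistics.fmean(population)

results = {}
for n in (25, 100, 400):
    means = draw_sample_means(population, n, repeats=1_000, rng=rng)
    # Store (center, spread) of the simulated sampling distribution.
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"N = {n:>3}: mean of sample means = {results[n][0]:.2f}, "
          f"spread = {results[n][1]:.2f}")

print(f"population mean = {mu:.2f}")
```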
Interval Estimation Procedures
Step 1: Set alpha. Decide the risk you are willing to take of being wrong (of your interval
estimate not containing the true population parameter).
This value, the probability of error, is called alpha (α).
A related value, the confidence level, is the probability
that the calculated interval will contain the true
population parameter.
Alpha levels are stated as probabilities; confidence levels
are stated as percentages.
If the alpha equals 0.05, then the confidence
level is 95%.
If the alpha equals 0.01, then the confidence
level is 99%.
Step 2: Find the Z score. Picture the sampling distribution, divide the probability of error (the
alpha) into the upper and lower tails of the distribution (divide it by 2),
and then find the corresponding Z score for that area.
For example, given an alpha of 0.05: Half of the area
(0.0250) goes in the right tail, and half (0.0250) goes in
the left tail. Using Appendix A, we find that a Z score of 1.96
leaves exactly this amount of area beyond it in each tail.
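Step 2: Find the Z score. Common Z scores for various alpha levels are:
Confidence level    Alpha (α)    α/2       Z score
90%                 0.10         0.0500    ±1.65
95%                 0.05         0.0250    ±1.96
99%                 0.01         0.0050    ±2.58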
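The Z score for a given alpha can also be found numerically instead of from a printed table. The sketch below uses `NormalDist` from Python's standard `statistics` module; the function name `z_for_alpha` is a hypothetical helper, and this is an alternative to the Appendix A lookup described in Step 2, not the textbook's own method.

```python
from statistics import NormalDist


def z_for_alpha(alpha):
    """Two-tailed Z score: the value that leaves alpha / 2 in each tail."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    # inv_cdf returns the Z value with the given area to its left.
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  confidence level = {1 - alpha:.0%}  "
          f"Z = ±{z_for_alpha(alpha):.3f}")
```

Printed tables round these values to two decimals. For an alpha of 0.10 the exact value is about 1.645, and tables commonly list it as 1.65.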
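Step 3: Construct the confidence interval.
Confidence intervals can be constructed around a sample mean to estimate a population mean.
Confidence intervals can be constructed around a sample proportion to estimate a population
proportion.
Use this formula when the population standard deviation is known:
$$\text{c.i.} = \bar{X} \pm Z\left(\frac{\sigma}{\sqrt{N}}\right)$$
Use this formula when the population standard deviation is unknown:
$$\text{c.i.} = \bar{X} \pm Z\left(\frac{s}{\sqrt{N-1}}\right)$$
Here $\bar{X}$ is the sample mean, s is the sample standard deviation, and N is the sample size.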
Interval Estimation Procedures for Sample Means (Large Samples)
Example: In a sample of 200, you obtain a mean of 105. Assume that the
population standard deviation is 15. Calculate a 95% confidence interval for the
population mean.
$$\text{c.i.} = 105 \pm 1.96\left(\frac{15}{\sqrt{200}}\right) \approx 105 \pm 1.96(1.061) \approx 105 \pm 2.08$$
Therefore, the estimate of the interval is from 102.92 (105 - 2.08) to 107.08 (105 + 2.08).
Example: In a sample of 500, you obtain a mean of 45,000 and a standard deviation of 200.
Calculate a 95% confidence interval for the population mean.
$$\text{c.i.} = 45{,}000 \pm 1.96\left(\frac{200}{\sqrt{500-1}}\right) \approx 45{,}000 \pm 1.96(8.953) \approx 45{,}000 \pm 17.55$$
Therefore, the estimate of the interval is from 44,982.45 (45000 - 17.55) to 45,017.55 (45000 +
17.55).
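The two examples for sample means can be reproduced with a short sketch. It assumes the large-sample Z interval, with the sample standard deviation divided by the square root of N - 1 when the population standard deviation is unknown (the form that matches the 17.55 margin in the second example). The function names are hypothetical.

```python
import math
from statistics import NormalDist


def ci_mean_known_sigma(mean, sigma, n, alpha=0.05):
    """Confidence interval for a mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / math.sqrt(n)
    return mean - margin, mean + margin


def ci_mean_unknown_sigma(mean, s, n, alpha=0.05):
    """Large-sample interval using the sample standard deviation s.

    Divides s by sqrt(N - 1), following the worked example in the text.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * s / math.sqrt(n - 1)
    return mean - margin, mean + margin


low, high = ci_mean_known_sigma(105, 15, 200)
print(f"known sigma:   {low:.2f} to {high:.2f}")

low2, high2 = ci_mean_unknown_sigma(45_000, 200, 500)
print(f"unknown sigma: {low2:.2f} to {high2:.2f}")
```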
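Interval Estimation Procedures for Sample Proportions (Large Samples)
Formula:
$$\text{c.i.} = P_s \pm Z\sqrt{\frac{P_u(1 - P_u)}{N}}$$
Where P_s is the sample proportion and P_u, the unknown population proportion, is set to 0.5.
This makes P_u(1 - P_u) equal to 0.25, its largest possible value, which maximizes the width
of the interval estimate.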
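Example: In a sample of 200, you obtain a proportion of 0.30. Calculate a 95% confidence
interval for the population proportion.
$$\text{c.i.} = 0.30 \pm 1.96\sqrt{\frac{(0.5)(0.5)}{200}} \approx 0.30 \pm 1.96(0.035) \approx 0.30 \pm 0.07$$
Therefore, the estimate of the interval is from 0.23 (0.30 – 0.07) to 0.37 (0.30 + 0.07).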
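The proportion example can be checked the same way. This sketch assumes the conservative approach of fixing the population proportion at 0.5 inside the square root, which is the setting that reproduces the 0.07 margin above. `ci_proportion` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def ci_proportion(p_s, n, alpha=0.05, p_u=0.5):
    """Large-sample confidence interval for a proportion.

    p_u defaults to 0.5, so p_u * (1 - p_u) is 0.25, the widest
    (most conservative) setting.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * math.sqrt(p_u * (1 - p_u) / n)
    return p_s - margin, p_s + margin


p_low, p_high = ci_proportion(0.30, 200)
print(f"95% interval: {p_low:.2f} to {p_high:.2f}")
```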
Controlling the Width of Interval Estimates
The width of the interval estimate can be controlled through manipulation of two terms in the
formula:
1. The confidence level (or alpha) can be raised or lowered.
Higher confidence levels (lower alphas) produce wider intervals.
2. The sample size can be increased or decreased.
Larger samples produce narrower intervals.
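Both effects on width can be seen by computing the full width of an interval for a mean under different settings. The sketch assumes a known population standard deviation of 15 (an illustrative value), and `interval_width` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def interval_width(sigma, n, alpha):
    """Full width (upper limit minus lower limit) of a Z interval for a mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * z * sigma / math.sqrt(n)


# Effect of the confidence level, holding N at 200.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}, N = 200: width = {interval_width(15, 200, alpha):.2f}")

# Effect of the sample size, holding alpha at 0.05.
for n in (200, 800, 3200):
    print(f"alpha = 0.05, N = {n:>4}: width = {interval_width(15, n, 0.05):.2f}")
```

Because the sample size sits under a square root, the sample must be four times as large to cut the width in half.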
Summary
Since populations are almost always too large to test, a fundamental strategy of
social science research is to select a sample from the defined population and then
use information from the sample to generalize to the population. This is done either
by estimation or by hypothesis testing.
Simple random samples are created by selecting cases from a list of the population
following the rule of EPSEM (each case has an equal probability of being selected).
Samples selected by the rule of EPSEM have a very high probability of being
representative.
The sampling distribution, the central concept in inferential statistics, is a
theoretical distribution of all possible sample outcomes. Since its overall shape,
mean, and standard deviation are known (under the conditions specified in the two
theorems), the sampling distribution can be adequately characterized and utilized by
researchers.
Basic Terms
Alpha (α)
Bias
Central Limit Theorem
Confidence interval
Efficiency
EPSEM
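μ (population mean)
μ_X̄ (mean of the sampling distribution of sample means)
μ_p (mean of the sampling distribution of sample proportions)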
Nonprobability Sample
Ps
Pu
Parameter
Representative sample
Sampling distribution
Simple random sample
Standard error of the mean