Week 3
©2013 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
17
Determination of Sample Size
A Review of Statistical Theory
LEARNING OUTCOMES
Understand basic statistical terminology
Interpret frequency distributions, proportions, and measures of central tendency and dispersion
Distinguish among population, sample, and sampling distributions
Explain the central-limit theorem
Summarize the use of confidence interval estimates
Discuss major issues in specifying sample size
Federal Reserve Finds Cards are Replacing Cash
- Payment options have gone high-tech and research supports that.
- The Federal Reserve conducted surveys of depository institutions, asking them to report the number of each type of payment the institutions processed.
- A sample of 2,700 was drawn from the 14,117 institutions in the population, targeting a 95% confidence level and accuracy of ±5 percent.
Introduction
- Descriptive Statistics
Describe characteristics of populations or samples.
- Inferential Statistics
Make inferences about whole populations from a sample.
- Sample Statistics
Variables in a sample or measures computed from sample data.
- Population Parameters
Variables in a population or measured characteristics of the population.
Making Data Usable
- Frequency Distribution
A set of data organized by summarizing the number of times a particular value of a variable occurs.
- Percentage Distribution
A frequency distribution organized into a table (or graph) that summarizes percentage values associated with particular values of a variable.
- Probability
The long-run relative frequency with which an event will occur.
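The three definitions above can be sketched in a few lines of Python. The deposit brackets and responses below are hypothetical, invented only for illustration; they are not the data behind the exhibits.

```python
from collections import Counter

# Hypothetical survey answers: the deposit bracket reported by each respondent.
deposits = ["under $3,000", "under $3,000", "$3,000-$4,999", "$5,000-$9,999",
            "under $3,000", "$3,000-$4,999", "$10,000 or more", "$3,000-$4,999"]

def frequency_distribution(values):
    """Count how many times each value of the variable occurs."""
    return dict(Counter(values))

def percentage_distribution(values):
    """Express each count as a percentage of all observations."""
    total = len(values)
    return {value: 100 * count / total
            for value, count in Counter(values).items()}

freq = frequency_distribution(deposits)
pct = percentage_distribution(deposits)
for bracket, count in freq.items():
    print(f"{bracket:>16}: {count} ({pct[bracket]:.1f}%)")
```

Dividing each percentage by 100 gives the relative frequency, which serves as an estimate of the probability of each bracket when the sample is large.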
EXHIBIT 17.1 Frequency Distribution of Deposits
EXHIBIT 17.2 Percentage Distribution of Deposits
EXHIBIT 17.3 Probability Distribution of Deposits
The Well-Chosen Average
- An “average” (mean) pay figure can be pulled upward by one highly paid executive even when most employees are low paid.
- The median may be more informative in such cases.
Making Data Usable (cont’d)
- Proportion
The percentage of elements that meet some criterion.
- Measures of Central Tendency
Mean: the arithmetic average.
Median: the midpoint; the value below which half the values in a distribution fall.
Mode: the value that occurs most often.
Population Mean: \( \mu = \frac{\sum_{i=1}^{N} X_i}{N} \)
Sample Mean: \( \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \)
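A short sketch of the three measures, using Python's standard `statistics` module. The pay figures are hypothetical and chosen to show how one extreme value separates the mean from the median.

```python
import statistics

# Hypothetical annual pay (in $1,000s): eight employees and one executive.
pay = [32, 35, 35, 38, 40, 41, 43, 45, 500]

mean_pay = statistics.mean(pay)      # arithmetic average, pulled up by the outlier
median_pay = statistics.median(pay)  # middle value of the sorted data
mode_pay = statistics.mode(pay)      # value that occurs most often

print(f"mean={mean_pay:.1f}  median={median_pay}  mode={mode_pay}")
```

In this made-up data set the mean is more than twice the median, which is why the median is often the better summary of skewed data such as pay.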
EXHIBIT 17.4 Number of Sales Calls Per Day by Salesperson
Measures of Dispersion
- The Range
The distance between the smallest and the largest values of a frequency distribution.
EXHIBIT 17.5 Sales Levels for Two Products with Identical Average Sales
EXHIBIT 17.6 Low Dispersion versus High Dispersion
Measures of Dispersion (cont’d)
- Why Use the Standard Deviation?
Variance
A measure of variability or dispersion.
Its square root is the standard deviation.
Standard deviation
A quantitative index of a distribution’s spread, or variability; the square root of the variance for a distribution.
Roughly, the typical distance of the values in a distribution from their mean.
Used to calculate the likelihood (probability) of an event occurring.
Calculating Deviation
Standard Deviation: \( S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \)
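The calculation can be followed step by step in a minimal Python sketch. The data values are hypothetical, and the sketch assumes the sample form of the formula (dividing by n - 1 rather than n).

```python
import math

# Hypothetical data: sales calls per day for eight salespeople.
calls = [4, 3, 2, 5, 3, 3, 1, 5]

n = len(calls)
mean = sum(calls) / n

# Step 1: deviation of each value from the mean, squared.
squared_deviations = [(x - mean) ** 2 for x in calls]

# Step 2: sample variance, using the n - 1 divisor (an assumption of this sketch).
variance = sum(squared_deviations) / (n - 1)

# Step 3: the standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(f"mean={mean:.2f}  variance={variance:.3f}  std_dev={std_dev:.3f}")
```

Squaring the deviations keeps positive and negative deviations from cancelling out; taking the square root at the end returns the result to the original units.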
EXHIBIT 17.7 Calculating a Standard Deviation
The Normal Distribution
- Normal Distribution
A symmetrical, bell-shaped distribution (normal curve) that describes the expected probability distribution of many chance occurrences.
About 99.7% of its values are within ±3 standard deviations of its mean.
Example: IQ scores
- Standardized Normal Distribution
A purely theoretical probability distribution that reflects a specific normal curve for the standardized value, z.
The Normal Distribution (cont’d)
- Characteristics of a Standardized Normal Distribution
It is symmetrical about its mean.
The mode identifies the normal curve’s highest point, which is also the mean and median, and the vertical line about which this normal curve is symmetrical.
The normal curve has an infinite number of cases (it is a continuous distribution), and the total area under the curve (the total probability) equals 1.0.
The standardized normal distribution has a mean of 0 and a standard deviation of 1.
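These characteristics can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `area_within` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist

# Standardized normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Probability that a value lies within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} standard deviations: {area_within(k):.4f}")

# Symmetry about the mean: half of the area lies below 0.
print(f"area below the mean: {z.cdf(0):.2f}")
```

The areas come out at roughly 68%, 95%, and 99.7% for one, two, and three standard deviations.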
EXHIBIT 17.8 Normal Distribution
The Normal Distribution (cont’d)
- Standardized Values
Used to compare an individual value to the population mean in units of the standard deviation: \( z = \frac{X - \mu}{\sigma} \)
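A minimal sketch of standardizing a value. The IQ scale used here (mean 100, standard deviation 15) is an assumption for illustration, and `standardized_value` is a hypothetical helper name.

```python
from statistics import NormalDist

def standardized_value(x, mu, sigma):
    """Distance of x from the mean, measured in standard deviations."""
    return (x - mu) / sigma

# Assumed scale for illustration: IQ scores with mean 100 and std dev 15.
z = standardized_value(130, mu=100, sigma=15)

# Share of a normal population expected to score above this value.
p_above = 1 - NormalDist().cdf(z)

print(f"z = {z:.1f}, proportion above = {p_above:.4f}")
```

Because z is unit-free, values from differently scaled variables can be compared on the same standardized normal curve.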
EXHIBIT 17.9 Standardized Normal Distribution
EXHIBIT 17.11 Standardized Values Can Be Computed from Flat or Peaked Distributions Resulting in a Standardized Normal Curve
EXHIBIT 17.12 Standardized Distribution Curve
Population Distribution, Sample Distribution, and Sampling Distribution
- Population Distribution
A frequency distribution of the elements of a population.
- Sample Distribution
A frequency distribution of a sample.
- Sampling Distribution
A theoretical probability distribution of sample means for all possible samples of a certain size drawn from a particular population.
- Standard Error of the Mean
The standard deviation of the sampling distribution.
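In practice the standard error of the mean is usually estimated from a single sample as the sample standard deviation divided by the square root of the sample size. A small sketch, with hypothetical input values:

```python
import math

def standard_error(sample_std_dev, n):
    """Estimated standard deviation of the sampling distribution of the mean."""
    return sample_std_dev / math.sqrt(n)

# Hypothetical inputs: sample standard deviation of 29.0 from 100 respondents.
se = standard_error(29.0, 100)
print(f"standard error of the mean: {se:.2f}")
```

The standard error describes how much sample means vary from sample to sample, not how much individual observations vary.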
EXHIBIT 17.13 Fundamental Types of Distributions
Central-limit Theorem
- Central-limit Theorem
The theory that, as sample size n increases, the distribution of the means of randomly selected samples of size n approaches a normal distribution, whatever the shape of the population distribution.
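A small simulation can make the theorem concrete. This is a sketch under stated assumptions: the population is taken to be exponential (strongly right-skewed, with mean 1 and standard deviation 1), and skewness is used as a rough indicator of how far the distribution of sample means is from the symmetric normal shape.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

def sample_means(n, draws=5000):
    """Means of `draws` random samples of size n from a skewed population."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(draws)]

def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

for n in (1, 5, 30):
    means = sample_means(n)
    print(f"n={n:>2}  mean of means={statistics.fmean(means):.3f}  "
          f"std dev of means={statistics.pstdev(means):.3f}  "
          f"skewness={skewness(means):.3f}")
```

As n grows, the skewness of the sample means shrinks toward 0 and their standard deviation shrinks toward 1/√n, even though the population itself is far from normal.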
EXHIBIT 17.14 Distribution of Sample Means for Samples of Various Sizes and Population Distributions
Estimation of Parameters and Confidence Intervals
- Point Estimates
An estimate of the population mean in the form of a single value, usually the sample mean.
Gives no information about the possible magnitude of random sampling error.
- Confidence Interval Estimate
A specified range of numbers within which a population mean is expected to lie.
An estimate of the population mean based on the knowledge that it will be equal to the sample mean plus or minus a small sampling error.
Sampling the World
- Gallup’s WorldView generates a comprehensive snapshot of the opinions of people from across the globe.
- Collects data from over 150 countries, generally through samples of 1,000 individuals in each.
- Provides valuable information on poorer and more rural populations.
Confidence Intervals
- Confidence Level
A percentage or decimal value that tells how confident a researcher can be about being correct.
It states the long-run percentage of confidence intervals that will include the true population mean.
The crux of the problem for a researcher is to determine how much random sampling error to tolerate.
Traditionally, researchers have used the 95% confidence level (a 5% tolerance for error).
Calculating a Confidence Interval
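\[ \mu = \bar{X} \pm Z_{c.l.} S_{\bar{X}} \]
\(\bar{X}\): approximate location (value) of the population mean
\(Z_{c.l.} S_{\bar{X}}\): estimation of the sampling error, where \(Z_{c.l.}\) is the z value for the chosen confidence level and \(S_{\bar{X}} = S / \sqrt{n}\) is the standard error of the mean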
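A minimal sketch of the calculation, assuming a large sample so that the z value (1.96 for 95 percent confidence) can be used. The sample figures are hypothetical, and `confidence_interval` is an illustrative helper name.

```python
import math

def confidence_interval(sample_mean, sample_std_dev, n, z=1.96):
    """Return (lower, upper): sample mean plus or minus z times the standard error."""
    standard_error = sample_std_dev / math.sqrt(n)
    margin = z * standard_error  # estimated sampling error
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean 37.5, standard deviation 12.0, n = 100.
lower, upper = confidence_interval(37.5, 12.0, 100)
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```

For small samples a t value would normally replace z; that refinement is outside this sketch.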
Calculating a Confidence Interval (cont’d)
Target and Walmart Shoppers Really Are Different
- 40% of respondents named both Target and Walmart as places where they shop.
- But 30% shopped at Walmart and not Target and 12% shopped at Target but not Walmart.
- Target shoppers who shun Walmart shop at more upscale stores, while Walmart shoppers who shun Target shop at discounters.
Sample Size
- Random Error and Sample Size
Random sampling error varies with samples of different sizes.
Increases in sample size reduce sampling error at a decreasing rate.
Diminishing returns - random sampling error is inversely proportional to the square root of n.
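The diminishing returns can be seen by computing the margin of error at several sample sizes. The standard deviation of 29.0 and the 95 percent z value of 1.96 are assumed inputs for illustration.

```python
import math

def margin_of_error(std_dev, n, z=1.96):
    """Random sampling error at a given sample size (z times the standard error)."""
    return z * std_dev / math.sqrt(n)

# Each step quadruples the sample size.
for n in (100, 400, 1600):
    print(f"n={n:>4}  margin of error = {margin_of_error(29.0, n):.2f}")
```

Quadrupling the sample size only halves the error, so each additional respondent buys less precision than the one before.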
EXHIBIT 17.18 Relationship between Sample Size and Error
Factors of Concern in Choosing Sample Size
- Variance (or Heterogeneity)
A heterogeneous population has more variance (a larger standard deviation) which will require a larger sample.
A homogeneous population has less variance (a smaller standard deviation) which permits a smaller sample.
- Magnitude of Error (Confidence Interval)
How precise must the estimate be?
- Confidence Level
How confident must the researcher be that the interval contains the true population value?
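Estimating Sample Size for Questions Involving Means
- Sequential Sampling
Conducting a pilot study to estimate the population parameters so that another, larger sample of the appropriate sample size may be drawn.
- Estimating sample size:
\[ n = \left( \frac{ZS}{E} \right)^2 \]
where Z is the standardized value for the chosen confidence level, S is the sample standard deviation (or an estimate of the population standard deviation), and E is the acceptable magnitude of error.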
Sample Size Example
- Suppose a survey researcher studying expenditures on lipstick wishes to have a 95 percent confidence level (Z = 1.96) and a range of error (E) of less than $2.00. The estimate of the standard deviation (S) is $29.00. What is the calculated sample size?
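The question can be answered with the standard sample-size formula for means, n = (ZS/E)². The sketch below assumes Z = 1.96 for the 95 percent confidence level and rounds up to the next whole respondent; the function name is an illustrative choice.

```python
import math

def sample_size_for_mean(z, std_dev, error):
    """n = (Z * S / E)^2, rounded up to the next whole respondent."""
    return math.ceil((z * std_dev / error) ** 2)

n = sample_size_for_mean(z=1.96, std_dev=29.00, error=2.00)
print(n)  # 808
```

The unrounded value is about 807.7, so 808 respondents are needed to keep the error within $2.00.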
Sample Size Example
- Suppose, in the same example as the one before, the range of error (E) is acceptable at $4.00. Sample size is reduced.
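Worked through with the same inputs as the previous example (Z = 1.96 for 95 percent confidence, S = $29.00) and the standard formula n = (ZS/E)², the arithmetic is:

```latex
n = \left( \frac{ZS}{E} \right)^2
  = \left( \frac{(1.96)(29.00)}{4.00} \right)^2
  = (14.21)^2
  \approx 201.9
  \;\Rightarrow\; n = 202
```

Because E is squared in the denominator, doubling the acceptable error from $2.00 to $4.00 cuts the required sample size to about one quarter.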
Calculating Sample Size at the 99 Percent Confidence Level
Determining Sample Size for Proportions
Determining Sample Size for Proportions (cont’d)
Calculating Example Sample Size at the 95 Percent Confidence Level
\[ p = .6 \qquad q = .4 \]
\[ n = \frac{(1.96)^2 (.6)(.4)}{(.035)^2} = \frac{(3.8416)(.24)}{.001225} = \frac{.922}{.001225} = 753 \]
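The same calculation can be reproduced in a few lines of Python. This sketch takes the values shown in the example (Z = 1.96, p = .6, q = .4, E = .035); the function name is a hypothetical choice.

```python
import math

def sample_size_for_proportion(z, p, error):
    """n = Z^2 * p * q / E^2, rounded up, where q = 1 - p."""
    q = 1 - p
    return math.ceil(z ** 2 * p * q / error ** 2)

n = sample_size_for_proportion(z=1.96, p=0.6, error=0.035)
print(n)  # 753
```

When no prior estimate of p is available, p = .5 is the conservative choice because it maximizes p times q and therefore the required sample size.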