statistical analysis
Sampling
The world of probability and nonprobability sampling
Relationship Among the Stages in the Research Process
Formulate Problem
Determine Research Design
Design Data Collection Method and Forms
Design Sample and Collect Data
Analyze and Interpret the Data
Prepare the Research Report
The World of Sampling
- Sample: A subset, or some part, of a larger population.
- Population (universe): Any complete group of entities that share some common set of characteristics.
- Population Element: An individual member of a population.
- Census: An investigation of all the individual elements that make up a population.
Six-Step Procedure for Drawing a Sample
Step 1: Define the Target Population
Step 2: Identify the Sampling Frame
Step 3: Select a Sampling Procedure
Step 4: Determine the Sample Size
Step 5: Select the Sample Elements
Step 6: Collect the Data from the Designated Elements
See the resource on the course site about Schlesinger and Associates, a company that also has a sampling service. Read about their offering.
The sample is where the researcher collects data for analysis. I always like to say: bad sample, bad data. It is important to think critically about whether you have the best sample source your resource constraints allow.
Sampling Designs
- Nonprobability Samples
  - Availability
  - Quota
  - Purposive
  - Snowball
- Probability Samples
  - Simple Random
  - Systematic Random
  - Stratified
    - Disproportionate
    - Proportionate
  - Cluster
See the white paper in resources and the text for definitions.
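As a rough illustration of three of the probability designs listed above, here is a minimal Python sketch. The function names, the fixed seeds, and the toy sampling frame are assumptions made for this example only; they do not come from the white paper or the text, which remain the reference for the definitions.

```python
import random


def simple_random_sample(frame, n, seed=0):
    """Simple random: every element has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


def systematic_sample(frame, n, seed=0):
    """Systematic random: random start, then every k-th element."""
    k = len(frame) // n                       # sampling interval
    start = random.Random(seed).randrange(k)  # random start inside the first interval
    return [frame[start + i * k] for i in range(n)]


def proportionate_stratified_sample(strata, n, seed=0):
    """Proportionate stratified: each stratum's share of the sample
    matches its share of the population (shares are rounded)."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        share = round(n * len(members) / total)
        sample[name] = rng.sample(members, share)
    return sample


# Hypothetical sampling frame of 100 numbered population elements.
frame = list(range(100))
strata = {"heavy users": frame[:60], "light users": frame[60:]}

print(simple_random_sample(frame, 10))
print(systematic_sample(frame, 10))
print(proportionate_stratified_sample(strata, 10))
```

A disproportionate stratified sample would replace the computed share with sample sizes chosen per stratum, for example to oversample a small but important group.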
Ultimately, one of the determinants of sample quality is sample size.
Samples will be more representative of the population if they are relatively large and selected through…
Why is a larger sample size better? The formula on the next slide demonstrates this.
Sample Standard Error
Let’s see this in practice on the next slide.
Sample Standard Error - Example
The larger the sample, the smaller the sample standard error, meaning the sample will generalize better to the population. But how much sample is enough? Too large a sample drains time and money, because acquiring sample is a large part of research costs.
To determine the appropriate sample size, use a sample size calculator; the formula behind the calculation is on the slides that follow.
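To make the relationship concrete, here is a small Python sketch. It assumes the formula meant is the usual standard error of the mean, the sample standard deviation divided by the square root of the sample size; the standard deviation of 10 and the sample sizes are made-up illustration values.

```python
import math


def standard_error(sample_std_dev, n):
    """Standard error of the mean: s / sqrt(n)."""
    return sample_std_dev / math.sqrt(n)


# Hypothetical sample standard deviation of 10, with growing sample sizes.
for n in (100, 400, 1600):
    print(n, standard_error(10.0, n))
# 100 1.0
# 400 0.5
# 1600 0.25
```

Note the diminishing returns: quadrupling the sample only halves the standard error, which is one reason more sample is not always worth the cost.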
How large should the sample be? This is a matter of resource constraints and feasibility. Sample size is often a function of budget and the time available to collect the data, so management will sometimes decide how large a sample is sufficient given these constraints. Nevertheless, the optimal sample size can be calculated, as the slides that follow show.
Sample Size Determination- Example
- Next, plug your Z-score, standard deviation, and margin of error (confidence interval) into this equation:
- $\text{Necessary Sample Size} = \dfrac{(\text{Z-score})^2 \times \text{StdDev} \times (1 - \text{StdDev})}{(\text{margin of error})^2}$
- Here is how the math works assuming you chose a 95% confidence level, .5 standard deviation, and a margin of error (confidence interval) of +/- 5%.
- $\dfrac{(1.96)^2 \times 0.5 \times 0.5}{(0.05)^2}$
- $\dfrac{3.8416 \times 0.25}{0.0025}$
- $\dfrac{0.9604}{0.0025}$
- 384.16
- 385 respondents are needed
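The same arithmetic can be checked with a few lines of Python. This is a sketch of the formula on this slide only; the function and parameter names are chosen for illustration, and the result is rounded up because a fraction of a respondent cannot be interviewed.

```python
import math


def necessary_sample_size(z_score, std_dev, margin_of_error):
    """(Z-score)^2 * StdDev * (1 - StdDev) / (margin of error)^2, rounded up."""
    raw = (z_score ** 2) * std_dev * (1 - std_dev) / (margin_of_error ** 2)
    return math.ceil(raw)


# 95% confidence (Z = 1.96), .5 standard deviation, +/- 5% margin of error.
print(necessary_sample_size(1.96, 0.5, 0.05))  # 385
```

Tightening the margin of error or raising the confidence level in the call above shows how quickly the required sample grows.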
Sample size is a function of the confidence level (the Z-score), the variability assumed (the standard deviation), and the margin of error.
If you choose a different confidence level, look up the corresponding Z-score in a Z-score table. The next slide lists the Z-scores for the most common confidence levels.
There are many sample size calculators; this one is recommended by the AMA. To use it, enter the confidence level (generally 95%) and the margin of error, or confidence interval (generally +/- 5%). The population field is usually left blank because we generally do not know how many are in the population (e.g., all the carbonated soft drink drinkers in the world).
https://www.surveysystem.com/sscalc.htm
More discussion of Z scores and their role in determining confidence intervals will be explored in later weeks when we discuss hypothesis testing and multivariate statistics. For now, just know that they are important inputs into the sample size determination formula and the Z score associated with 95% confidence is 1.96.
Sample Size Determination- Example
- Your confidence level corresponds to a Z-score. This is a constant value needed for this equation. Here are the z-scores for the most common confidence levels:
- 90% – Z Score = 1.645
- 95% – Z Score = 1.96
- 99% – Z Score = 2.576
If you choose a different confidence level, use a Z-score table to find your score.
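In place of a printed table, the Z-score for any confidence level can be computed. The sketch below assumes a two-sided interval on the standard normal distribution and uses Python's standard library (`statistics.NormalDist`, available from Python 3.8); the function name is made up for this example.

```python
from statistics import NormalDist


def z_score_for_confidence(confidence_level):
    """Two-sided Z-score for a confidence level given as a fraction (e.g. 0.95)."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be between 0 and 1")
    upper_tail = (1 - confidence_level) / 2  # area left in each tail
    return NormalDist().inv_cdf(1 - upper_tail)


for level in (0.90, 0.95, 0.99):
    print(level, round(z_score_for_confidence(level), 3))
# 0.9 1.645
# 0.95 1.96
# 0.99 2.576
```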
Field Procedures and Nonsampling Error
Nonsampling Error
- Observation Errors
  - Response Errors
  - Office Errors
- Nonobservation Errors
  - Noncoverage Errors
  - Nonresponse Errors
Error that arises in research that is not due to sampling
Nonsampling error can occur because of errors in conception, logic, interpretation of questions and replies, statistics, arithmetic, analyzing, coding or reporting
Nonsampling error cannot be accounted for statistically
Noncoverage error: Nonsampling error that arises because of a failure to include some units, or entire sections, of the defined target population in the sampling frame
Noncoverage error is basically a sampling frame problem
Can be reduced, although not necessarily eliminated, by recognizing its existence and working to improve the sampling frame
Nonresponse error: Nonsampling error that represents a failure to obtain information from some elements of the population that were selected and designated for the sample
This is a potential problem that only occurs when those who do respond are systematically different in some important way from those who don’t respond
Example – A university wants to assess the success of its graduates, based on their annual salaries, five years after graduation
Which graduates are more likely (less likely) to return their survey? Those who are happy (unhappy) with their salaries.
Refusals
Nonsampling error that arises because some designated respondents refuse to participate in the study
Personal interviews, followed by telephone interviews, are the most successful at overcoming refusals
Not-at-Homes
Nonsampling error that arises because some designated respondents are not at home when the interviewer calls
Generally, 3-4 callbacks are required to reach approximately 75% of the sample pool
Callbacks should be planned for different days and/or times of the day relative to the original call
Response error occurs when an individual provides a response to an item, but the response is inaccurate for some reason
Possible causes of response error include
Does the respondent understand the question?
Does the respondent know the answer to the question?
Is the respondent willing to provide the true answer to the question?
Is the wording of the question or the situation in which it is asked likely to bias the response?
Office Error: Nonsampling errors that arise in the editing, coding, or analysis phases of research
Most office errors can be reduced, if not eliminated, by exercising proper controls in data processing
Calculating Response Rates
- The general response rate calculation is
- RR = CI/E
- RR = Response Rate;
- CI = Number of Completed Interviews with Responding Units;
- E = Number of Eligible Responding Units in the Sample
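A quick sketch of the calculation in Python follows. The function name and the figures (240 completed interviews out of 800 eligible units) are hypothetical and only illustrate the formula above.

```python
def response_rate(completed_interviews, eligible_units):
    """RR = CI / E, returned as a fraction between 0 and 1."""
    if eligible_units <= 0:
        raise ValueError("eligible_units must be positive")
    return completed_interviews / eligible_units


# Hypothetical project: 240 completed interviews, 800 eligible units.
rr = response_rate(240, 800)
print(f"{rr:.0%}")  # 30%
```

Ineligible units are left out of E, so the denominator counts only units that could have responded.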
Discussion: How can a researcher improve response rates?
If nonsampling error can be controlled, the researcher will achieve a better response rate for the data collection effort.
Improving Response Rates
Methods of Improving Response Rates
- Reducing Refusals
  - Prior Notification
  - Motivating Respondents
  - Incentives
  - Questionnaire Design and Administration
  - Follow-Up
  - Other Facilitators
- Reducing Not-at-Homes
  - Callbacks
The response rate on a project serves as an indicator of the overall quality of a data collection effort
It also provides insight into the likely influence of nonresponse error on the project
Researchers must strive to obtain the highest response rates possible in a given situation
https://www.surveysampling.com/solutions/