what could have gone wrong


session_5_sampling_methods.pdf


Sampling (BUS 115, Winter 2017)

The Process of Gathering Primary Data

Step (known as…)

◦ Define what you want to learn (the research question)
◦ Define whom you want to learn from (the population)
◦ Decide whether you could/want to reach all or some of them (all: census; some: sampling)
◦ Learn from the selected people (research methods)

Census vs. Sampling

Census measures the Parameter
◦ A characteristic or measure of a population
◦ The variable that you are interested in (e.g., how much all of our target consumers like us)

Sampling measures the Statistic
◦ A characteristic or measure of a sample
◦ Statistics are calculated from sample data and used to estimate population parameters (e.g., how much this sample of consumers likes us)

Example: TV Ratings

Census vs. Sampling

◦ Ideal method: census (ask everyone in the city)
◦ Realistic (cost-effective) method: sample from the population (e.g., Nielsen Ratings)

https://www.youtube.com/watch?v=2AZ3ftjcJU4&list=PLEA28B1F5DDA45EBE&index=2

Procedure for Drawing a Sample

◦ Step 1: Identify the sampling frame
◦ Step 2: Select a sampling method
◦ Step 3: Determine the sample size
◦ Step 4: Collect data from the sample elements

Identify the sampling frame

Sampling frame

The LIST of population elements from which a sample (n) will be drawn
◦ May not cover the entire population (non-coverage error)
◦ Example: BMW customers vs. McDonald's customers

Sampling methods

◦ Probability samples

◦ Everyone has a known chance of being selected

◦ Non-probability samples

◦ Everyone’s chance is affected by the researcher’s judgment

Simple Random Sample (SRS)

Each member of the population has an equal probability of being selected.
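A minimal Python sketch of a simple random sample may make the definition concrete. The sampling frame (student IDs 1 to 1000), the sample size, and the function name `simple_random_sample` are assumptions chosen for illustration only.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; every element has the same chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)  # sampling without replacement


# Hypothetical sampling frame: student IDs 1..1000.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 50, seed=115)
print(len(sample), sample[:5])
```

`random.sample` draws without replacement, so no student can be selected twice.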

Systematic Sampling

A random start with a constant skip interval.
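As a rough sketch, systematic sampling can be written as a slice over the sampling frame. The frame, the sample size, and the rule used here for the skip interval (frame size divided by sample size, rounded down) are assumptions for illustration.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first interval, then take every k-th element."""
    k = len(frame) // n  # constant skip interval
    if k < 1:
        raise ValueError("sample size cannot exceed the frame size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start between 0 and k - 1
    return frame[start::k][:n]  # trim in case the frame is not a multiple of k


# Hypothetical sampling frame: student IDs 1..1000.
frame = list(range(1, 1001))
sample = systematic_sample(frame, 50, seed=115)
print(sample[:5])
```

Only the starting point is random; once it is drawn, the rest of the sample is fixed by the interval.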

Cluster Sampling

SRS among mutually exclusive clusters; census within each selected cluster.
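The two stages of cluster sampling can be sketched in a few lines of Python. The class sections and their members below are hypothetical, as is the function name `cluster_sample`.

```python
import random


def cluster_sample(clusters, num_clusters, seed=None):
    """SRS of clusters, then a census inside each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # stage 1: random clusters
    return {name: list(clusters[name]) for name in chosen}  # stage 2: take everyone


# Hypothetical clusters: class sections and their students.
clusters = {
    "section_A": ["a1", "a2", "a3"],
    "section_B": ["b1", "b2"],
    "section_C": ["c1", "c2", "c3", "c4"],
    "section_D": ["d1", "d2", "d3"],
}
selected = cluster_sample(clusters, 2, seed=115)
print(selected)
```

Randomness enters only in the choice of sections; inside a chosen section nobody is left out.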

Stratified Sampling

SRS within mutually exclusive strata.
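A short sketch of stratified sampling, assuming the same sampling fraction in every stratum (proportional allocation, which the slide does not specify). The strata (class standing) and their sizes are hypothetical.

```python
import random


def stratified_sample(strata, fraction, seed=None):
    """Run an SRS inside every stratum, so that all strata are represented."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        # Assumption: proportional allocation, at least one element per stratum.
        size = max(1, round(len(members) * fraction))
        sample[name] = rng.sample(members, size)
    return sample


# Hypothetical strata: students grouped by class standing.
strata = {
    "freshman": list(range(0, 400)),
    "sophomore": list(range(400, 700)),
    "junior": list(range(700, 900)),
    "senior": list(range(900, 1000)),
}
sample = stratified_sample(strata, 0.1, seed=115)
print({name: len(members) for name, members in sample.items()})
```

Unlike cluster sampling, every group contributes to the sample; the randomness is inside each group.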

Stratified vs. Cluster Sampling

Stratified Sampling
◦ Homogeneity within group
◦ Heterogeneity between groups
◦ All groups are included
◦ Example: sample by ethnicity
◦ Purpose: increase precision; assure representation under randomness

Cluster Sampling
◦ Homogeneity between groups
◦ Heterogeneity within group
◦ Random selection of groups
◦ Example: sample by class section
◦ Purpose: decrease cost

Sampling methods

Non-probability Sampling

Convenience Sampling
◦ Definition: approach whoever is most accessible
◦ Disadvantage: non- or infrequent visitors are underrepresented

Judgment Sampling
◦ Definition: subjective choice by the researcher
◦ Disadvantage: relies on the researcher’s knowledge and experience

Snowball Sampling
◦ Definition: additional members are selected based on their relationship with the current ones
◦ Disadvantage: opposing voices are underrepresented

Quota Sampling
◦ Definition: convenience sampling within each mutually exclusive stratum
◦ Disadvantage: non- or infrequent visitors (within each stratum) are underrepresented

The Nielsen Method

Error

◦ What do we wish to learn from marketing research?

◦ Information. In most cases, information about a population, usually the mean of a certain variable (e.g., how much our target consumers like our product).

◦ How do we find the information (i.e., the parameter)?

◦ Census: ask/observe everyone, which is usually not feasible. So, we do sampling.

◦ The likely distance between a statistic and the parameter is called ERROR. A random sample will always have error, usually expressed as “± X” or “± X%”.

◦ If you replicate the research with another random sample from the same population, the finding will be “very likely” to fall in that ± range. So will the parameter.

Intuition: What could matter for the error?

◦ The number of people in the sample?
◦ The actual average rating in that population?
◦ Population diversity (i.e., how much individuals’ opinions differ from each other)?
◦ The number of people in that population?
◦ The sampling method?

Error

Two terms in the statements about error have standard names: “very likely” is the “confidence”, and the “± X” range is the “maximum error allowed”.

Sample Size Calculation *

Sample mean follows Normal distribution.

[Figure: normal curve of the sample mean (the statistic x̄), centered on the parameter, with 2.5% in each tail beyond −1.96 SD and +1.96 SD]

“Z Score”

Confidence    Tail    Z
90%           5%      1.645
95%           2.5%    1.96
99%           0.5%    2.575
100%          0       ∞

Error allowed (the maximum error allowed) = Z * standard deviation of the sample mean
Error allowed² = Z² * variance of the sample mean = Z² * [σ²/n]

σ²: population variance; n: sample size ⇒ n = σ² * Z² / Error²

* Supplementary reading; not required in the exam
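The Z scores in the table can be checked with Python's standard library. This is only a sketch: the function name `z_score` is an assumption, and the computed 99% value is about 2.576, which the table gives as 2.575 (a common rounding).

```python
from statistics import NormalDist


def z_score(confidence):
    """Two-sided Z score: leaves (1 - confidence) / 2 in each tail."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)  # inverse CDF of the standard normal


for confidence in (0.90, 0.95, 0.99):
    tail = (1 - confidence) / 2
    print(f"{confidence:.0%} confidence, {tail:.1%} tail: Z = {z_score(confidence):.3f}")
```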

Sample Size Calculation

n = σ² * Z² / Error²

Questions:
1. To reduce error, n?
2. To increase confidence, n?

σ²: the variance of the population (how different the population elements are from each other)
Error: the half-width of the ± range, around the statistic, within which the population parameter is expected to fall
Z: the Z score of the confidence level, i.e., how confident we are that the population parameter falls within the above range
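One way to explore the two questions is to solve the formula for Error and try a few values. The sketch below assumes a hypothetical population standard deviation of 300; the function name `margin_of_error` is also an assumption.

```python
import math


def margin_of_error(sigma, z, n):
    """Error = Z * sigma / sqrt(n), i.e. n = sigma^2 * Z^2 / Error^2 solved for Error."""
    return z * sigma / math.sqrt(n)


sigma = 300.0  # hypothetical population standard deviation
for n in (100, 400, 1600):
    e90 = margin_of_error(sigma, 1.645, n)  # 90% confidence
    e95 = margin_of_error(sigma, 1.96, n)   # 95% confidence
    print(f"n = {n}: ±{e90:.1f} at 90% confidence, ±{e95:.1f} at 95% confidence")
```

Because n sits under a square root, quadrupling the sample size only halves the error, and at a fixed n a higher confidence level widens the ± range.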

Sample Size Calculation

Suppose you want to learn the average monthly living expenses of all UCR students. You want to keep the error of your result within “±$50” with 90% confidence.

How many students do you need to sample?

n = σ² * Z² / Error²

Confidence = 90%, so Z = 1.645

Error = 50

σ² = ? σ² is learned from secondary data, experience, or a preliminary survey
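To finish the calculation, a value for σ is needed. The sketch below assumes, purely for illustration, that a preliminary survey suggested a standard deviation of $300; that figure is not from the slides, and the function name `sample_size` is hypothetical.

```python
import math


def sample_size(sigma, z, error):
    """n = sigma^2 * Z^2 / Error^2, rounded up to a whole respondent."""
    return math.ceil((sigma ** 2) * (z ** 2) / (error ** 2))


sigma = 300.0  # hypothetical standard deviation of monthly expenses, in dollars
n = sample_size(sigma, z=1.645, error=50.0)
print(n)
```

Under this assumption, 300² * 1.645² / 50² ≈ 97.4, so about 98 students would be needed. A different σ changes the answer with its square.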

Sample Size Calculation

Suppose you want to learn the proportion of UCR students whose average monthly living expenses are greater than $1,000. You want to keep the error of your result within “±5%” with 95% confidence.

How many students do you need to sample?

n = σ² * Z² / Error²

Confidence = 95%, so Z = 1.96

Error = 0.05

σ² = ? For proportion data (binomial distribution), σ² = p * (1 − p)
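Since p is what the survey is trying to find, it has to be guessed before sampling. A common conservative choice, assumed here and not stated on the slide, is p = 0.5, which makes p * (1 − p) as large as it can be. The function name is hypothetical.

```python
import math


def sample_size_for_proportion(p, z, error):
    """n = p * (1 - p) * Z^2 / Error^2, rounded up to a whole respondent."""
    variance = p * (1 - p)  # variance of proportion (0/1) data
    return math.ceil(variance * z ** 2 / error ** 2)


# Hypothetical guesses for p; 0.5 is the worst case (largest variance).
for p in (0.1, 0.3, 0.5):
    print(p, sample_size_for_proportion(p, z=1.96, error=0.05))
```

With p = 0.5 the formula gives 0.25 * 1.96² / 0.05² ≈ 384.2, so about 385 students; any other guess for p gives a smaller n.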

Stratified Sampling *

◦ Homogeneity within group
◦ Heterogeneity between groups
◦ All groups are included
◦ Example: sample by ethnicity
◦ Purpose: increase precision

Error² = Z² * [σ²/n]

With homogeneous strata, the σ² within each stratum is smaller than the overall σ², so the same n gives a smaller error.

* Supplementary reading; not required in the exam

Stratified Sampling *

L: total number of strata
N_l: the population size of stratum l
n_l: the sample size of stratum l
x̄_l: the mean of the sample from stratum l
σ_l: the standard deviation of stratum l
W_l: the weight assigned to stratum l (0 < W_l < 1), so that

x̄ = Σ W_l * x̄_l (l = 1, 2, …, L)

Var(x̄) ≈ Σ W_l² * σ_l² / n_l

Neyman Allocation: n_l = n * W_l σ_l / Σ W_l σ_l, with which Var(x̄) reaches the minimum value: (Σ W_l σ_l)² / n

* Supplementary reading; not required in the exam
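A small numeric sketch can show how the Neyman allocation compares with a proportional one. The three strata, their weights, and their standard deviations below are hypothetical, and the function names are assumptions. In practice the allocations would also be rounded to whole respondents.

```python
def neyman_allocation(n, weights, sigmas):
    """n_l = n * W_l * sigma_l / sum(W_l * sigma_l) for every stratum l."""
    products = [w * s for w, s in zip(weights, sigmas)]
    total = sum(products)
    return [n * p / total for p in products]


def stratified_variance(weights, sigmas, sizes):
    """Var(x-bar) ~ sum(W_l^2 * sigma_l^2 / n_l)."""
    return sum(w ** 2 * s ** 2 / m for w, s, m in zip(weights, sigmas, sizes))


# Hypothetical strata: weights sum to 1; sigmas are within-stratum standard deviations.
weights = [0.5, 0.3, 0.2]
sigmas = [10.0, 20.0, 40.0]
n = 190

neyman = neyman_allocation(n, weights, sigmas)
proportional = [n * w for w in weights]

print("Neyman sizes:", neyman)
print("Neyman variance:", stratified_variance(weights, sigmas, neyman))
print("Proportional variance:", stratified_variance(weights, sigmas, proportional))
print("Minimum from formula:", sum(w * s for w, s in zip(weights, sigmas)) ** 2 / n)
```

The allocation sends more of the sample to strata that are large or highly variable, which is why its variance comes out below the proportional one and matches (Σ W_l σ_l)² / n.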

Critical Thinking Questions

Sample size does not depend on population size?

n = σ² * Z² / Error²

◦ Nielsen TV Rating sampling
◦ 114M households
◦ 20,000 sample
◦ Reasonable?

In business practice, sample size sometimes IS a function of population size. Based on this formula, think of why.
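For the Nielsen question, the formula can be turned around to see what error a sample of 20,000 implies. The sketch assumes a rating expressed as a proportion, the worst case p = 0.5, and 95% confidence; none of these choices come from the slides, and the function name is hypothetical.

```python
import math


def proportion_margin_of_error(p, z, n):
    """Error = Z * sqrt(p * (1 - p) / n), from n = sigma^2 * Z^2 / Error^2."""
    return z * math.sqrt(p * (1 - p) / n)


# Assumptions: worst-case p = 0.5 and 95% confidence (Z = 1.96).
error = proportion_margin_of_error(p=0.5, z=1.96, n=20_000)
print(f"±{error:.2%}")
```

Under these assumptions the error is roughly ±0.7 percentage points, and the 114M population size never enters the calculation.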