
Section 5

Data Distribution

Rhonda Knehans Drake

Associate Professor, New York University

Data Analytics, Interpretation and Reporting Copyright © 2013


• There are many distributional forms that our marketing data can take on.

• Most data follow one recognizable distributional form or another.

• Because of this, we can make estimates and forecasts about what our data is telling us, and do so with a known level of confidence.

Introduction


Examples of common forms of data in industry are:

– Time to failure follows what is known as an exponential distribution. Xerox takes advantage of this well-known fact to determine servicing needs and to set contract prices for its various equipment.

– Department stores take advantage of the fact that the period of time between the arrivals of two successive customers also follows an exponential distribution.
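To make the exponential model concrete, here is a minimal sketch that computes how likely a short or a long gap between customer arrivals would be. The 4-minute average gap is an invented figure for illustration only; it does not come from these notes, and the function names are hypothetical.

```python
import math

def exp_cdf(t, mean):
    """P(T <= t) for an exponential distribution with the given mean."""
    return 1.0 - math.exp(-t / mean)

def exp_survival(t, mean):
    """P(T > t) for an exponential distribution with the given mean."""
    return math.exp(-t / mean)

# Assumption for illustration: customers arrive, on average, 4 minutes apart.
mean_gap_minutes = 4.0

p_within_2 = exp_cdf(2.0, mean_gap_minutes)
p_over_10 = exp_survival(10.0, mean_gap_minutes)

print(f"P(next arrival within 2 minutes) = {p_within_2:.3f}")
print(f"P(gap longer than 10 minutes)    = {p_over_10:.3f}")
```

The same two functions apply to time-to-failure data; only the meaning of the mean changes (average time between failures instead of average time between arrivals).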

Examples of Data Distribution I


Examples of common forms of data in industry are:

– If you are counting the number of occurrences of a specific event within a set time period, the count follows what is called the Poisson distribution.

– The number of murders in NYC during the month of April follows a Poisson distribution. Based on this fact, the city can then estimate the murder rate for the next month.

– The hypergeometric distribution is used in quality control (QC) when checking a large lot for defectives. With this distribution, you can determine the probability that a sample of size n, drawn from a lot of size N, will lead you to accept the lot when in reality its defective rate exceeds your limit.
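The two count-based distributions above can be sketched in a few lines. This is an illustrative sketch only: the rate of 3 events per month, the lot of 1,000 items with 50 defectives, the sample of 20, and the "accept if at most 1 defective is found" rule are all assumed values, and the function names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) when events occur at an average rate of lam per period."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def hypergeom_pmf(k, lot_size, defectives, sample_size):
    """P(exactly k defectives in a sample drawn without replacement)."""
    return (math.comb(defectives, k)
            * math.comb(lot_size - defectives, sample_size - k)
            / math.comb(lot_size, sample_size))

def prob_accept(lot_size, defectives, sample_size, max_defects):
    """P(the sample shows at most max_defects defectives), so the lot is accepted."""
    return sum(hypergeom_pmf(k, lot_size, defectives, sample_size)
               for k in range(max_defects + 1))

# Assumed: an average of 3 events per month.
p_at_most_2 = sum(poisson_pmf(k, 3.0) for k in range(3))
print(f"P(at most 2 events next month) = {p_at_most_2:.3f}")

# Assumed: N = 1000 items, 50 truly defective (5%), sample n = 20,
# and the lot is accepted when the sample contains at most 1 defective.
p_accept = prob_accept(1000, 50, 20, 1)
print(f"P(accepting the lot) = {p_accept:.3f}")
```

If 5% is above your allowed defective rate, the second number is the chance that the sampling plan lets a bad lot through.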

Examples of Data Distribution I


• However, the most prevalent and most important functional form for a marketer is the normal distribution.

• Also known as the bell-shaped curve.

• Many data elements follow the normal distribution:

– Age

– Income Levels

– GPAs

– Spend

– Years of education

– Etc.

• And because all these measures tend to follow the normal curve, we as marketers and researchers are able to make many inferences about our metrics with a high degree of confidence.

Examples of Data Distribution II


• Referring back to the distribution of income from a prior chapter, we now know this data to be normally distributed.

• The distribution of income was symmetric and bell-shaped.

[Figure: Histogram and polygon for the relative frequency distribution of income levels. Horizontal axis: Income Categories, in $15,000 bins from $10,000 - $25,000 through $100,000 - $115,000. Vertical axis: Relative Frequency, 0 to 0.35.]

Normal Distribution


• What would you expect the measures of skewness and kurtosis to be for this data?

[Figure (repeated from the previous slide): Histogram and polygon for the relative frequency distribution of income levels.]
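One way to explore the skewness and kurtosis question is to compute both measures on simulated bell-shaped data. The sketch below is illustrative only: the incomes are randomly generated under an assumed mean of $55,000 and standard deviation of $10,000, not taken from the course data set, and the formulas used are the simple population (moment) versions. Some software reports "excess" kurtosis, which is this value minus 3.

```python
import random
import statistics

def skewness(data):
    """Population skewness: the average cubed z-score."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)

def kurtosis(data):
    """Population kurtosis: the average fourth-power z-score."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return sum(((x - m) / s) ** 4 for x in data) / len(data)

random.seed(0)  # fixed seed so the run is repeatable

# Assumed for illustration: normal incomes, mean $55,000, sd $10,000.
incomes = [random.gauss(55_000, 10_000) for _ in range(100_000)]

print(f"skewness = {skewness(incomes):.3f}")
print(f"kurtosis = {kurtosis(incomes):.3f}")
```

For symmetric, bell-shaped data the skewness should come out close to 0 and the kurtosis close to 3.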

Normal Distribution


• Recall the Empirical Rule from Section 4. It states that if the distribution of your data is symmetric and bell-shaped (that is, normally distributed), then approximately:

– 68% of the observations within the data set will lie within one standard deviation of the mean

– 95% of the observations within the data set will lie within two standard deviations of the mean

– 99.7% of the observations within the data set will lie within three standard deviations of the mean
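The three percentages in the Empirical Rule are rounded values of exact normal-curve probabilities. As a quick check, the sketch below computes them with the error function from Python's standard library and then turns the rule into dollar ranges. The $200 average spend and $40 standard deviation are made-up numbers for illustration, and the function names are hypothetical.

```python
import math

def prob_within(k):
    """P(observation lies within k standard deviations of the mean), normal data."""
    return math.erf(k / math.sqrt(2))

def interval(mean, sd, k):
    """The range from k standard deviations below to k above the mean."""
    return mean - k * sd, mean + k * sd

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")

# Assumed for illustration: average spend of $200 with a standard deviation of $40.
for k in (1, 2, 3):
    low, high = interval(200, 40, k)
    print(f"about {prob_within(k):.1%} of customers spend between ${low} and ${high}")
```

The first loop gives 0.6827, 0.9545, and 0.9973, which round to the 68%, 95%, and 99.7% quoted in the rule.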

The Spread of Normally Distributed Data I


• Pictorially, this looks as follows:



 

 -  + - 2 - 3  + 2  + 3

The Spread of Normally Distributed Data II


5.1 Rite Aid Pharmacy wishes to monitor the number of customers arriving at the checkout counter on Sunday afternoons for staffing purposes. What distributional form will this data follow?

5.2 How is the time to failure for GE light bulbs distributed?

5.3 How would you expect the average age of your customer base to be distributed?

5.4 Income on your customer database is distributed normally with a mean of $55,000 and a standard deviation of $10,000. What percent of the database do you estimate will have an income within the range $35,000 to $75,000?

5.5 How do the width and height of a normal distribution change when its mean remains the same but its standard deviation decreases? Show this graphically.

5.6 How do the width and/or height of a normal distribution change when its standard deviation remains the same but its mean increases? Show this graphically.

Section 5 Exercises