HansensSlides-MarketSegmentationAnalytics.pdf

1/28/2019


©JMH: [email protected] - No redistribution/reusage/etc without permission.

Market Segmentation via Classification/Cluster Analysis

Professor Jared M. Hansen, Ph.D.


Case: B2B Customer Satisfaction Index


Toyota’s Adjacent Segmentation Strategy

[Figure: Toyota models positioned by Price Level and Perceived Quality: Tercel, Paseo, Previa, Camry, Corolla, Avalon, Supra, Lexus]


Market Segment Requirements

Targeted Segments Must Be:

• Identifiable

• Reachable

• Sizeable

And Should Be:

• Profitable


3 Benefits of Market Segmentation

1. Helps in the design of marketing programs that are most effective for reaching homogeneous groups of customers.

2. Improves the strategic allocation of marketing resources.

3. Identifies opportunities for new product development.


Consumer Market Major Segment Bases

Geographic • Region, City or Metro Size, Density, Climate

Demographic • Age, Gender, Family Size and Life Cycle, Race, Occupation, Income

Psychographic • Lifestyle or Personality

Behavioral • Benefits, Usage Situations


Geographic Segmentation

Variable: Typical Breakdown

Region: Pacific, Mountain, West North Central, West South Central, East North Central, East South Central, South Atlantic, Middle Atlantic, New England

City or Metro Size: Under 5,000; 5,000-20,000; 20,001-50,000; 50,001-100,000; 100,001-250,000; 250,001-500,000; 500,001-1,000,000; 1,000,001-4,000,000; over 4,000,000

Density: Urban, suburban, rural

Climate: Northern, southern


Demographic Segmentation

Variable: Typical Breakdown

Age: Under 6, 6-11, 12-19, 20-34, 35-49, 50-64, 65+

Gender: Male, female

Family size: 1-2, 3-4, 5+

Family life cycle: Young, single; young, married, no children; young, married, youngest child under six; young, married, oldest child over six; older, married, with children; older, married, no children under 18; older, single; other

Income: Under $10,000; $10,001-$15,000; $15,001-$20,000; $20,001-$30,000; $30,001-$50,000; $50,001-$100,000; over $100,000

Occupation: Professional and technical; managers, officials, and proprietors; clerical, sales; craftspeople; farmers; laborers; retired; students; housewives; unemployed

Education: Grade school or less; some high school; high school; some college; college graduate; graduate degree

Religion: Catholic, Protestant, Jewish, Muslim, Hindu, other

Race/Ethnic: White, black, Asian, Hispanic

Nationality: American, British, French, German, Italian, Japanese


Psychographic Segmentation

Variable: Typical Breakdown

Lifestyle: Activities, interests, opinions

Personality: Compulsive, gregarious, authoritarian, ambitious

Values: Sense of belonging, excitement, warm relationships with others, self-fulfillment, security, being well respected, fun and enjoyment of life, self-respect, sense of accomplishment


Behavioral Segmentation

Variable: Typical Breakdown

Usage situation: Regular occasion, special occasion

Benefits: Quality, service, economy, speed

User status: Nonuser, ex-user, potential user, first-time user, regular user

Usage rate: Light user, medium user, heavy user

Loyalty status: None, medium, strong, absolute

Readiness stage: Unaware, aware, informed, interested, desirous, intending to buy

Attitude toward product: Enthusiastic, positive, indifferent, negative, hostile


Convey the Benefits to The Swing Group

The Usage-based Approach

[Figure: customer groups labeled Those Who Love Us, Those Who Are Indifferent, Those Who Hate Us, and the Swing Group; additional labels: Future Barriers, Understand Benefits]


Market Segmentation Example: Oldsmobile


Respect the brand 'contract': power to say no!

“In Canada for example, the lease offers for the entry level BMW 3 Series are very aggressive: the difference between driving a Toyota and a BMW can be as little as $100/monthly.”




Case: Pizza Hut Product Launch Decision


Market Segmentation

Hierarchical Clustering (exploratory)

K-Means Clustering (confirmatory)


Cluster Analysis

Cluster analysis is a class of techniques used to classify objects or cases into relatively homogeneous groups called clusters.

Objects in each cluster tend to be similar to each other and dissimilar to objects in the other clusters.

Cluster analysis is also called classification analysis, or numerical taxonomy.

Both cluster analysis and discriminant analysis are concerned with classification.

However, discriminant analysis requires prior knowledge of the cluster or group membership for each object or case included, to develop the classification rule.

In contrast, in cluster analysis there is no a priori information about the group or cluster membership for any of the objects. Groups or clusters are suggested by the data, not defined a priori.


Statistics Associated with Cluster Analysis

• Agglomeration schedule. An agglomeration schedule gives information on the objects or cases being combined at each stage of a hierarchical clustering process.

• Cluster centroid. The cluster centroid is the vector of mean values of the variables for all the cases or objects in a particular cluster.

• Cluster centers. The cluster centers are the initial starting points in nonhierarchical clustering. Clusters are built around these centers, or seeds.

• Cluster membership. Cluster membership indicates the cluster to which each object or case belongs.


• Dendrogram. A dendrogram, or tree graph, is a graphical device for displaying clustering results. Vertical lines represent clusters that are joined together. The position of the line on the scale indicates the distances at which clusters were joined. The dendrogram is read from left to right.

• Distances between cluster centers. These distances indicate how separated the individual pairs of clusters are. Clusters that are widely separated are distinct, and therefore desirable.
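The centroid, membership, and center-distance ideas above can be illustrated with a minimal sketch. The data, the function name `cluster_centroids`, and the cluster labels are hypothetical and chosen only for illustration.

```python
from math import dist
from statistics import fmean

def cluster_centroids(points, membership):
    """Return {cluster label: centroid}; a centroid is the per-variable mean."""
    groups = {}
    for point, label in zip(points, membership):
        groups.setdefault(label, []).append(point)
    return {
        label: tuple(fmean(column) for column in zip(*members))
        for label, members in groups.items()
    }

# Hypothetical data: two variables per respondent, plus each respondent's cluster.
points = [(1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0), (10.0, 10.0)]
membership = [1, 1, 2, 2, 2]

centroids = cluster_centroids(points, membership)
# Euclidean distance between the two cluster centers: larger means more distinct.
separation = dist(centroids[1], centroids[2])

print(centroids)
print(round(separation, 3))
```

A larger `separation` suggests the two clusters are more distinct, which is the property the last bullet describes as desirable.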


Conducting Cluster Analysis: Formulate the Problem

• Perhaps the most important part of formulating the clustering problem is selecting the variables on which the clustering is based.

• Inclusion of even one or two irrelevant variables may distort an otherwise useful clustering solution.

• Basically, the set of variables selected should describe the similarity between objects in terms that are relevant to the business analytics problem.

• The variables should be selected based on past research, theory, or a consideration of the hypotheses being tested. In exploratory research, the data scientist should exercise judgment and intuition.


Conducting Cluster Analysis: Select a Distance or Similarity Measure

• The most commonly used measure of similarity is the Euclidean distance or its square. The Euclidean distance is the square root of the sum of the squared differences in values for each variable. Other distance measures are also available.

  • The city-block or Manhattan distance between two objects is the sum of the absolute differences in values for each variable.

  • The Chebychev distance between two objects is the maximum absolute difference in values for any variable.

• If the variables are measured in vastly different units, the clustering solution will be influenced by the units of measurement. In these cases, before clustering respondents, we must standardize the data by rescaling each variable to have a mean of zero and a standard deviation of unity. It is also desirable to eliminate outliers (cases with atypical values).

• Use of different distance measures may lead to different clustering results. Hence, it is advisable to use different measures and compare the results.
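As a rough illustration of the three distance measures and of standardization, here is a minimal sketch in plain Python. The function names and the sample data are hypothetical. Using the sample standard deviation (`statistics.stdev`) is an assumption, since the slides only ask for a standard deviation of unity.

```python
from math import sqrt
from statistics import fmean, stdev

def euclidean(a, b):
    """Square root of the sum of squared differences across variables."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block distance: sum of absolute differences across variables."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    """Chebychev distance: largest absolute difference on any one variable."""
    return max(abs(x - y) for x, y in zip(a, b))

def standardize(rows):
    """Rescale each variable (column) to mean 0 and standard deviation 1.

    Assumes every column varies; a constant column would divide by zero.
    """
    columns = list(zip(*rows))
    means = [fmean(column) for column in columns]
    sds = [stdev(column) for column in columns]
    return [tuple((v - m) / s for v, m, s in zip(row, means, sds)) for row in rows]

a, b = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(euclidean(a, b), manhattan(a, b), chebyshev(a, b))

# Hypothetical respondents: (annual income in dollars, rating on a 1-5 scale).
raw = [(20000.0, 1.0), (40000.0, 3.0), (60000.0, 5.0)]
z = standardize(raw)
print(z)
```

Before standardization, income dominates any distance between respondents simply because of its units. After standardization both variables contribute on the same scale.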


Conducting Cluster Analysis: Select a Clustering Procedure – Hierarchical

• Hierarchical clustering is characterized by the development of a hierarchy or tree-like structure. Hierarchical methods can be agglomerative or divisive.

• Agglomerative clustering starts with each object in a separate cluster. Clusters are formed by grouping objects into bigger and bigger clusters. This process is continued until all objects are members of a single cluster.

• Divisive clustering starts with all the objects grouped in a single cluster. Clusters are divided or split until each object is in a separate cluster.

• Agglomerative methods are commonly used in marketing research. They consist of linkage methods, error sums of squares or variance methods, and centroid methods.
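The agglomerative process described on this slide can be sketched as a simple loop. This is only an illustration: the function name `agglomerate` and the data are hypothetical. The merge rule used here (the minimum distance between members of two clusters) is an assumed choice, and statistical packages use far more efficient algorithms.

```python
from math import dist

def agglomerate(points):
    """Repeatedly merge the two closest clusters until one cluster remains.

    Assumed merge rule: distance between clusters = distance between their
    two closest members. Returns the merge history as a list of
    (members_a, members_b, distance) tuples, using point indices.
    """
    clusters = [[i] for i in range(len(points))]  # every object starts alone
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        history.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return history

# Hypothetical data: four objects measured on two variables.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
history = agglomerate(points)
for members_a, members_b, d in history:
    print(members_a, "+", members_b, "at distance", round(d, 3))
```

With n objects the loop always performs n - 1 merges. A divisive method would run in the opposite direction, starting from one cluster and splitting.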


Conducting Cluster Analysis: Select a Clustering Procedure – Linkage Method

• The single linkage method is based on minimum distance, or the nearest neighbor rule. At every stage, the distance between two clusters is the distance between their two closest points.

• The complete linkage method is similar to single linkage, except that it is based on the maximum distance or the furthest neighbor approach. In complete linkage, the distance between two clusters is calculated as the distance between their two furthest points.

• The average linkage method works similarly. However, in this method, the distance between two clusters is defined as the average of the distances between all pairs of objects, where one member of the pair is from each of the clusters.
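The three linkage rules differ only in how they summarize the pairwise distances between two clusters. A minimal sketch follows, with a hypothetical function name and hypothetical data.

```python
from math import dist

def linkage_distance(cluster_a, cluster_b, method):
    """Distance between two clusters (lists of points) under a linkage rule."""
    pair_distances = [dist(p, q) for p in cluster_a for q in cluster_b]
    if method == "single":    # nearest neighbor: the two closest points
        return min(pair_distances)
    if method == "complete":  # furthest neighbor: the two furthest points
        return max(pair_distances)
    if method == "average":   # mean over all cross-cluster pairs
        return sum(pair_distances) / len(pair_distances)
    raise ValueError(f"unknown linkage method: {method}")

# Hypothetical clusters with two objects each.
cluster_1 = [(0.0, 0.0), (0.0, 1.0)]
cluster_2 = [(3.0, 0.0), (4.0, 0.0)]

for method in ("single", "complete", "average"):
    print(method, round(linkage_distance(cluster_1, cluster_2, method), 4))
```

For any pair of clusters, the single linkage distance is never larger than the average, and the average is never larger than the complete linkage distance.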


Linkage Methods of Clustering

[Figure: three panels, each showing Cluster 1 and Cluster 2: Single Linkage (minimum distance), Complete Linkage (maximum distance), Average Linkage (average distance)]


Conducting Cluster Analysis: Select a Clustering Procedure – Variance Method

• The variance methods attempt to generate clusters that minimize the within-cluster variance.

• A commonly used variance method is Ward's procedure. For each cluster, the means for all the variables are computed. Then, for each object, the squared Euclidean distance to the cluster means is calculated. These distances are summed for all the objects. At each stage, the two clusters whose merger produces the smallest increase in the overall within-cluster sum of squared distances are combined.

• In the centroid methods, the distance between two clusters is the distance between their centroids (means for all the variables). Every time objects are grouped, a new centroid is computed.

• Of the hierarchical methods, average linkage and Ward's method have been shown to perform better than the other procedures.
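One stage of Ward's procedure can be sketched as follows: compute the within-cluster sum of squares, then find the pair of clusters whose merger increases it least. The function names and data are hypothetical, and this illustrates the criterion only, not how any particular package implements it.

```python
from itertools import combinations
from statistics import fmean

def within_ss(cluster):
    """Sum of squared Euclidean distances from each object to the cluster means."""
    means = [fmean(column) for column in zip(*cluster)]
    return sum((v - m) ** 2 for point in cluster for v, m in zip(point, means))

def ward_increase(cluster_a, cluster_b):
    """Increase in total within-cluster sum of squares if the two are merged."""
    return within_ss(cluster_a + cluster_b) - within_ss(cluster_a) - within_ss(cluster_b)

# Hypothetical current clusters at some stage of the procedure.
clusters = [
    [(1.0, 1.0), (2.0, 1.0)],
    [(3.0, 1.0)],
    [(9.0, 9.0), (9.0, 10.0)],
]

costs = {
    (i, j): ward_increase(clusters[i], clusters[j])
    for i, j in combinations(range(len(clusters)), 2)
}
best_pair = min(costs, key=costs.get)  # the pair Ward's rule would merge next
print(costs)
print(best_pair)
```

Here the first two clusters are merged because joining them adds the least to the overall within-cluster sum of squares.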


Other Agglomerative Clustering Methods

[Figure: two panels illustrating Ward's Procedure and the Centroid Method]


Conducting Cluster Analysis: Select a Clustering Procedure – Nonhierarchical

• The nonhierarchical clustering methods are frequently referred to as k-means clustering. These methods include sequential threshold, parallel threshold, and optimizing partitioning.

• In the sequential threshold method, a cluster center is selected and all objects within a prespecified threshold value from the center are grouped together. Then a new cluster center or seed is selected, and the process is repeated for the unclustered points. Once an object is clustered with a seed, it is no longer considered for clustering with subsequent seeds.

• The parallel threshold method operates similarly, except that several cluster centers are selected simultaneously and objects within the threshold level are grouped with the nearest center.

• The optimizing partitioning method differs from the two threshold procedures in that objects can later be reassigned to clusters to optimize an overall criterion, such as average within-cluster distance for a given number of clusters.
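The sequential threshold method can be sketched in a few lines. The slide does not say how each seed is chosen, so taking the first still-unclustered object as the next seed is an assumption made for this illustration. The function name and data are also hypothetical.

```python
from math import dist

def sequential_threshold(points, threshold):
    """Group objects around successive seeds; clustered objects are never revisited.

    Assumed seed rule: the first object that is still unclustered.
    Returns a list of clusters, each a list of point indices.
    """
    unclustered = list(range(len(points)))
    clusters = []
    while unclustered:
        seed = unclustered[0]
        members = [i for i in unclustered
                   if dist(points[seed], points[i]) <= threshold]
        clusters.append(members)
        # Objects grouped with this seed are not considered for later seeds.
        unclustered = [i for i in unclustered if i not in members]
    return clusters

# Hypothetical data on two variables.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0), (5.0, 0.0)]
clusters = sequential_threshold(points, threshold=2.0)
print(clusters)
```

Because objects are never reassigned, the result depends on the order of the seeds. That order dependence is what the optimizing partitioning method addresses by allowing reassignment.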


Conducting Cluster Analysis: Select a Clustering Procedure

• I recommend that, when possible, hierarchical and nonhierarchical methods be used in tandem.

  • First, an initial clustering solution is obtained using a hierarchical procedure, such as average linkage or Ward's.

  • The number of clusters and cluster centroids so obtained are used as inputs to the optimizing partitioning method.

• Choice of a clustering method and choice of a distance measure are interrelated.

  • For example, squared Euclidean distances should be used with Ward's and centroid methods.

  • Several nonhierarchical procedures also use squared Euclidean distances.
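The second half of the tandem approach can be sketched as a k-means style optimizing partitioning step that starts from centroids supplied by an earlier hierarchical run. The starting centroids and data below are hypothetical stand-ins for that earlier output, and the function `kmeans` is a bare-bones illustration rather than a production routine.

```python
from math import dist
from statistics import fmean

def kmeans(points, initial_centers, max_iter=100):
    """Reassign objects to the nearest center and recompute centers until stable.

    The nearest center by Euclidean distance is also the nearest by squared
    Euclidean distance, so the assignment matches the squared-distance criterion.
    """
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each object joins its nearest center.
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its current members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append(tuple(fmean(col) for col in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical data and hypothetical centroids from a prior hierarchical solution.
points = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
hierarchical_centroids = [(2.0, 2.0), (7.0, 7.0)]

labels, centers = kmeans(points, hierarchical_centroids)
print(labels)
print(centers)
```

Seeding with the hierarchical centroids fixes both the number of clusters and the starting points, so the result does not depend on random initial seeds.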


Results of Hierarchical Clustering

Agglomeration Schedule Using Ward’s Procedure

        Clusters combined                      Stage cluster first appears
Stage   Cluster 1   Cluster 2   Coefficient    Cluster 1   Cluster 2   Next stage
1       14          16          1.000000       0           0           6
2       6           7           2.000000       0           0           7
3       2           13          3.500000       0           0           15
4       5           11          5.000000       0           0           11
5       3           8           6.500000       0           0           16
6       10          14          8.160000       0           1           9
7       6           12          10.166667      2           0           10
8       9           20          13.000000      0           0           11
9       4           10          15.583000      0           6           12
10      1           6           18.500000      6           7           13
11      5           9           23.000000      4           8           15
12      4           19          27.750000      9           0           17
13      1           17          33.100000      10          0           14
14      1           15          41.333000      13          0           16
15      2           5           51.833000      3           11          18
16      1           3           64.500000      14          5           19
17      4           18          79.667000      12          0           18
18      2           4           172.662000     15          17          19
19      1           2           328.600000     16          18          0
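One way to read the Coefficient column is to look for the stage where it jumps most sharply and stop merging just before it. This is a common heuristic and an assumption here, not something the slide prescribes. The sketch below applies it to the coefficients reported in the schedule, using the relative jump (the ratio to the previous stage).

```python
# Coefficients for stages 1-19, copied from the agglomeration schedule.
coefficients = [1.0, 2.0, 3.5, 5.0, 6.5, 8.16, 10.166667, 13.0, 15.583, 18.5,
                23.0, 27.75, 33.1, 41.333, 51.833, 64.5, 79.667, 172.662, 328.6]
n_objects = len(coefficients) + 1  # 19 merge stages imply 20 objects

def relative_jumps(coefs):
    """Map each stage (from 2 on) to its coefficient divided by the previous one."""
    return {stage: coefs[stage - 1] / coefs[stage - 2]
            for stage in range(2, len(coefs) + 1)}

jumps = relative_jumps(coefficients)
jump_stage = max(jumps, key=jumps.get)

# Stopping just before the jump stage leaves n_objects - (jump_stage - 1) clusters.
suggested_clusters = n_objects - (jump_stage - 1)
print(jump_stage, round(jumps[jump_stage], 3), suggested_clusters)
```

Other rules, such as the largest absolute increase, can point to a different number of clusters. The result is best treated as a starting point to check against the dendrogram and the interpretability of the clusters.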


Interpreting and Profiling the Clusters

• Interpreting and profiling clusters involves examining the cluster centroids. The centroids enable us to describe each cluster by assigning it a name or label.

• It is often helpful to profile the clusters in terms of variables that were not used for clustering. These may include demographic, psychographic, product usage, media usage, or other variables.

• We often use Compare Means to do the profiling.
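Profiling on a variable that was not used for clustering amounts to comparing its mean across clusters. The sketch below mimics that kind of comparison in plain Python. The variable (age), the data, and the function name `compare_means` are hypothetical.

```python
from statistics import fmean

def compare_means(values, membership):
    """Mean and count of a profiling variable within each cluster."""
    groups = {}
    for value, label in zip(values, membership):
        groups.setdefault(label, []).append(value)
    return {label: (fmean(vals), len(vals))
            for label, vals in sorted(groups.items())}

# Hypothetical cluster membership and a demographic variable not used in clustering.
membership = [1, 1, 2, 2, 2, 3]
age = [25, 31, 44, 48, 52, 66]

profile = compare_means(age, membership)
overall_mean = fmean(age)

for label, (mean_age, count) in profile.items():
    print(f"cluster {label}: n={count}, mean age={mean_age:.1f}")
print(f"overall mean age={overall_mean:.1f}")
```

Clusters whose means differ clearly from one another and from the overall mean are easier to name and to target.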


Case: Pizza Hut Product Launch Decision
