Segmentation Models and Lift/Gains/Profitability

Kenan-Flagler Business School The University of North Carolina

Professor Charlotte Mason prepared this note to provide material for class discussion rather than to illustrate either effective or ineffective handling of a business situation.

Assessing a Model’s Performance: Lifts and Gains

Models are created to predict or classify, so one way to assess a model’s performance is to compare its results with those obtained when no model is used. We can assess the value of a predictive model by using the model to rank or score a set of customers and then contacting them in that order. Lifts and gains are commonly used performance measures. Lift indicates how much better a model performs than the ‘no model’ or average performance. To show how lift is calculated, consider the results in Exhibit 1, which summarize the number of customers and number of buyers by recency decile for the BookBinders Book Club test involving the offer to purchase “The Art History of Florence.”

Recency Decile   # Customers   # Buyers
1 (top)                 3748        670
2                       7424       1058
3                       3820        459
4                       6254        638
5                       6158        521
7                       6229        474
8                       6184        389
9                       5346        203
10 (bottom)             4837        110
Total                  50000       4522

Exhibit 1 Recency Decile Summary

• Recency Decile: note that there are nine rather than ten deciles (decile 6 is absent) because large numbers of customers have the same value for months since last purchase close to the ‘dividing line’ between deciles

• # Customers: the number of customers in that decile

• # Buyers: the number of customers who bought “The Art History of Florence”


Lift and Cumulative Lift

From these ‘raw’ numbers we can compute the following, as shown in Exhibit 2:

• Cumulative # customers: the number of total customers up to and including that decile

• Cumulative % customers: the percent of total customers up to and including that decile

• Response Rate: the actual response rate for each decile, computed by the number of buyers divided by the number of customers for each decile

• Lift: (response rate for each decile) ÷ (overall response rate) ×100

• Cumulative Response Rate: the actual response rate up to and including that decile, computed as the sum of the number of buyers in the relevant deciles divided by the sum of the number of customers in the relevant deciles. For example, the cumulative response rate for decile 2 = (670 + 1058)/(3748+7424) = .1547, or 15.47%

• Cum(ulative) Lift: (cumulative response rate) ÷ (overall response rate) ×100
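The definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original note: the function name `lift_table` and the dictionary keys are assumptions chosen for readability, and the input is the Exhibit 1 data.

```python
# Exhibit 1 data: (decile label, # customers, # buyers).
# Decile 6 is absent in the source exhibit.
EXHIBIT_1 = [
    ("1", 3748, 670),
    ("2", 7424, 1058),
    ("3", 3820, 459),
    ("4", 6254, 638),
    ("5", 6158, 521),
    ("7", 6229, 474),
    ("8", 6184, 389),
    ("9", 5346, 203),
    ("10", 4837, 110),
]


def lift_table(rows):
    """Return one dict per decile with response rate, lift, and cumulative versions.

    `lift_table` is a hypothetical helper name, not something defined in the note.
    """
    total_customers = sum(customers for _, customers, _ in rows)
    total_buyers = sum(buyers for _, _, buyers in rows)
    overall_rate = total_buyers / total_customers

    table = []
    cum_customers = 0
    cum_buyers = 0
    for decile, customers, buyers in rows:
        cum_customers += customers
        cum_buyers += buyers
        rate = buyers / customers
        cum_rate = cum_buyers / cum_customers
        table.append({
            "decile": decile,
            "cum_customers": cum_customers,
            "cum_pct_customers": 100 * cum_customers / total_customers,
            "response_rate": rate,
            "lift": 100 * rate / overall_rate,
            "cum_response_rate": cum_rate,
            "cum_lift": 100 * cum_rate / overall_rate,
        })
    return table


for row in lift_table(EXHIBIT_1):
    print(f"{row['decile']:>2}  {row['response_rate']:.2%}  {row['lift']:.0f}  "
          f"{row['cum_response_rate']:.2%}  {row['cum_lift']:.0f}")
```

Rounded to the same precision, the printed values should agree with the lift and cumulative lift columns of Exhibit 2.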

Decile    # Customers  Cum # Customers  Cum % Customers  # Buyers  Response Rate  Lift  Cum Response Rate  Cum Lift
1 (top)          3748             3748             7.5%       670         17.88%   198             17.88%       198
2                7424            11172            22.3%      1058         14.25%   158             15.47%       171
3                3820            14992            30.0%       459         12.02%   133             14.59%       161
4                6254            21246            42.5%       638         10.20%   113             13.30%       147
5                6158            27404            54.8%       521          8.46%    94             12.21%       135
7                6229            33633            67.3%       474          7.61%    84             11.36%       126
8                6184            39817            79.6%       389          6.29%    70             10.57%       117
9                5346            45163            90.3%       203          3.80%    42              9.77%       108
10               4837            50000           100.0%       110          2.27%    25              9.04%       100
Total           50000                              100%      4522          9.04%

Exhibit 2 Lift Calculations

Lift is an index that indicates the model’s ability to beat the ‘no model’ case or average performance. For example, from Exhibit 2 we see that the lift for the top decile is 198. This indicates that by targeting only these customers we would expect to yield 1.98 times the number of buyers found by randomly mailing the same number of customers. In contrast, the last decile (decile 10) has only one-quarter (.25 times) the number of buyers as one would expect in a random sample of the same size. From the cumulative lift column we see that by targeting the top two deciles, we would expect to yield 1.71 times the number of buyers as compared with a random mailing. As a larger percent of the customers are included, cumulative lift will decrease, reaching 100 (or average response) when 100% of customers are included. Lift indices that exceed 100 indicate better than average performance or response, whereas lift indices less than 100 indicate poorer than average performance or response. Note that lift is a relative index: a lift of 400 could refer to a predicted 8% response rate or a predicted 80% response rate, depending on whether the overall or average response rate is 2% or 20%. A chart depicting the cumulative lift (plotting cumulative % of customers versus cumulative lift) is shown in Exhibit 3.
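The relative nature of the lift index, and the "1.98 times" reading of the top decile, can be checked with a short calculation. The sketch below is illustrative only; the helper names `implied_response_rate` and `expected_buyers_at_random` are assumptions introduced here, not terms from the note.

```python
def implied_response_rate(lift_index, overall_rate):
    """Convert a lift index (100 = average) into an absolute response rate."""
    return lift_index / 100 * overall_rate


def expected_buyers_at_random(n_contacted, overall_rate):
    """Buyers expected when n_contacted customers are chosen at random."""
    return n_contacted * overall_rate


# The same lift index of 400 implies very different response rates.
print(implied_response_rate(400, 0.02))  # overall rate of 2%:  about 8%
print(implied_response_rate(400, 0.20))  # overall rate of 20%: about 80%

# Top recency decile from Exhibit 1: 3,748 customers and 670 actual buyers.
overall_rate = 4522 / 50000
baseline = expected_buyers_at_random(3748, overall_rate)
print(round(baseline))           # roughly 339 buyers from a random mailing
print(round(670 / baseline, 2))  # actual buyers relative to the random baseline
```

The last ratio is the top-decile lift expressed as a multiple rather than as an index out of 100.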

[Chart: cumulative lift (vertical axis, 0 to 250) plotted against % of customers (horizontal axis, 0% to 100%); series: Recency Model, No Model]

Exhibit 3 Cumulative Lift Chart

Gains and Cumulative Gains

A different way to summarize a model’s performance is with gains and cumulative gains. Again, we begin with the raw numbers in Exhibit 1 and create the table shown below in Exhibit 4:

Decile    # Customers  Cum # Customers  Cum % Customers  # Buyers  Cum # Buyers  Gains  Cum Gains
1 (top)          3748             3748             7.5%       670           670    15%        15%
2                7424            11172            22.3%      1058          1728    23%        38%
3                3820            14992            30.0%       459          2187    10%        48%
4                6254            21246            42.5%       638          2825    14%        62%
5                6158            27404            54.8%       521          3346    12%        74%
7                6229            33633            67.3%       474          3820    10%        84%
8                6184            39817            79.6%       389          4209     9%        93%
9                5346            45163            90.3%       203          4412     4%        98%
10               4837            50000           100.0%       110          4522     2%       100%
Total           50000                              100%      4522

Exhibit 4 Gains and Cumulative Gains

• Gains: the proportion of responders in each decile

• Cum(ulative) Gains: the proportion of responders up to and including the decile, or simply the sum of the gains up to that decile.
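The two definitions above can be sketched as follows. This is an illustrative example, not code from the note; the function name `gains_table` is an assumption, and the buyer counts are taken from Exhibit 1.

```python
# Buyers per recency decile from Exhibit 1 (deciles 1-5 and 7-10).
BUYERS_BY_DECILE = [670, 1058, 459, 638, 521, 474, 389, 203, 110]


def gains_table(buyers):
    """Return (gains, cumulative gains) as proportions of all buyers.

    `gains_table` is a hypothetical helper name used only for this sketch.
    """
    total = sum(buyers)
    gains = [b / total for b in buyers]

    cum_gains = []
    running = 0
    for b in buyers:
        running += b                     # accumulate raw counts, not rounded percents
        cum_gains.append(running / total)
    return gains, cum_gains


gains, cum_gains = gains_table(BUYERS_BY_DECILE)
for g, cg in zip(gains, cum_gains):
    print(f"{g:.0%}  {cg:.0%}")
```

The sketch accumulates buyer counts rather than the rounded percentages. Summing the rounded gains in Exhibit 4 through decile 9 gives 97%, whereas the unrounded cumulative figure rounds to the 98% shown in the exhibit.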


The cumulative gains chart in Exhibit 5 is a useful visual representation for comparing a model to the ‘no model’ case or average performance. All models start at the 0-0 point: if 0% of the customers are mailed or targeted, then we will yield 0% of buyers. Similarly, all models end at the 100-100 point: if 100% of the customers are targeted, then we will yield 100% of buyers. The diagonal line represents the no model or baseline case. For example, if we randomly select 10% to mail or target, then we expect to get 10% of the buyers. Similarly, if we randomly select 50% to target, then we expect to get 50% of the buyers, and so on. The cumulative gains for the model reveal what proportion of responders we can expect to gain from targeting a specific percent of customers using the model. For example, results of using a recency model to target customers for “The Art History of Florence” show that by targeting the 7.5% most recent customers, we would gain 15% of total buyers. By targeting the top 22.3% most recent customers, we would gain 38% of total buyers. The larger the distance between the model and no model lines, the stronger or more powerful the model is.
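One way to make "the distance between the model and no model lines" concrete is to compute the points of the cumulative gains curve and subtract the diagonal baseline at each point. The sketch below is an assumption-laden illustration: measuring the vertical gap at each decile boundary is one reasonable choice, not a method prescribed by the note, and the helper names are hypothetical.

```python
# Exhibit 1 data (deciles 1-5 and 7-10).
CUSTOMERS = [3748, 7424, 3820, 6254, 6158, 6229, 6184, 5346, 4837]
BUYERS = [670, 1058, 459, 638, 521, 474, 389, 203, 110]


def cumulative_share(counts):
    """Running total of counts, expressed as a fraction of the grand total."""
    total = sum(counts)
    shares = []
    running = 0
    for c in counts:
        running += c
        shares.append(running / total)
    return shares


def gains_curve(customers, buyers):
    """Points (cum share of customers, cum share of buyers), starting at (0, 0)."""
    xs = [0.0] + cumulative_share(customers)
    ys = [0.0] + cumulative_share(buyers)
    return list(zip(xs, ys))


def gap_over_baseline(curve):
    """Vertical distance between the model curve and the diagonal at each point.

    On the no-model diagonal, the share of buyers equals the share of
    customers, so the gap is simply y - x.
    """
    return [y - x for x, y in curve]


curve = gains_curve(CUSTOMERS, BUYERS)
for (x, y), gap in zip(curve, gap_over_baseline(curve)):
    print(f"{x:6.1%} of customers -> {y:6.1%} of buyers (gap {gap:+.1%})")
```

For the recency model the gap is zero at both ends, as the text describes, and it is widest after the fourth decile, where 42.5% of customers account for about 62% of buyers.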

[Chart: cumulative gains (vertical axis, 0% to 100%) plotted against % of customers (horizontal axis, 0% to 100%); series: Recency Model, No Model]

Exhibit 5 Cumulative Gains Chart

Comparing Models

Lifts and gains can also be used to compare two or more alternative models, to track a model’s performance over time, or to compare a model’s performance on different samples. A cumulative gains chart comparing the recency model to a model using monetary value for the BookBinders Book Club “The Art History of Florence” mailing is shown in Exhibit 6. Clearly, the recency model is a more powerful predictor of response compared with the monetary model.


[Chart: cumulative gains (vertical axis, 0% to 100%) plotted against % of customers (horizontal axis, 0% to 100%); series: Recency Model, No Model, Monetary Model]

Exhibit 6 Cumulative Gains for Recency and Monetary Models

In summary, lift is a measure of the effectiveness of a predictive model. It is computed as the ratio of the results obtained with the model to the results obtained with no model. For a model predicting response, lift reveals how much more likely we are to get responders if we use the model than if we contact a random sample of customers. For a model predicting response, gains shows the percent of total possible responders gained by targeting a specific percent of the customers scored or ranked by a model. Cumulative lift and gains charts are useful visual tools for measuring and comparing a model’s performance. Both charts include a baseline or no model case; the greater the difference between the lift or gains curve and the baseline, the better the model.
