Excel application and statistics project

UziUziiii

SimulationNotes1.pdf

Home >Mathematics homework help >Statistics homework help >Excel application and statistics project

Business Modeling and Simulation Contents

1 Fundamentals of Simulation 4

1.1 The Central Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.2 Dart Game and the Estimation of π . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.2.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.3 A Dinner Deal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.3.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.4 What is Modeling? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.5 Excel Skills . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.5.1 Range Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.5.2 Data Array and Matrix Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.5.3 @ symbol and #This Row . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.5.4 Random Variables, PDF, and CDF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.5.5 Point Mass δx: Use Distribution to Model Scalars . . . . . . . . . . . . . . . . . . . . . . . . . 8

1.5.6 Generate Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1.5.7 Simulation, Sampling, Sample, and Sample Path . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.5.8 Conditional Probability and Markov Chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

1.5.9 Simulate a Markov chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.5.10 Statistics and Histogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

1.6 Excel Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.6.1 How to Enter an Array Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.6.2 How to Sort Data with LARGE and SMALL . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.6.3 How to Install Solver on Mac . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.6.4 How to Speed Up Excel Execution: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

1.6.5 Excel shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

1.6.6 How to get help on Excel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

1.7 Textbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2 Operations and Supply Chain Models 14

2.1 Decision Making under Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.2 A Sales Promotion Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.2.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.3 Flextrola . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.3.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.4 Supply Risk Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.5 Newsvendor Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.6 Excel Skills . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.6.1 Loops and Data Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.6.2 Logical Operations and Indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

3 Financial Models 22

3.1 Dynamic Systems and Brownian Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.2 Derivatives: Forwards, Futures, and Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.3 Pricing Asian Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

3.4 Hedging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3.4.1 Main Insights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3.5 Investment Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

3.5.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

3.5.2 Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

3.5.3 Managerial Insights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

4 Marketing Models 27

4.1 Salesforce Performance Evaluation: An Insurance Model . . . . . . . . . . . . . . . . . . . . . . . . . . 27

4.1.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

4.1.2 Main Insights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

4.2 Market Share Model: Brand Switching and Brand Loyalty . . . . . . . . . . . . . . . . . . . . . . . . . 29

4.2.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

4.3 Bass Diffusion Model: Predict New Product Sales and Product Rollover . . . . . . . . . . . . . . . . . 32

4.3.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

4.4 Excel Skills . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

4.4.1 Random Sample Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

5 Public Policies 35

5.1 Healthcare: The Spread and Prevention of an Infectious Disease . . . . . . . . . . . . . . . . . . . . . 35

5.1.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

5.1.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

5.2 Social Mobility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

5.2.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

5.2.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

5.3 Excel Skills . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

5.3.1 Data Array and Matrix Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

5.3.2 How to Generate Arbitrary Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

5.3.3 Conditional Probability and Markov Chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

6 Markov Chain Models 41

6.1 Credit Risk Rating Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

6.1.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

6.2 Bankruptcy in the Restaurant Business . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

6.3 Negotiation in Show Business . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

7 Pricing and Revenue Management 45

7.1 List Price Selling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

7.2 Auction and Optimal Bidding Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

7.2.1 First Price, Sealed Bid Auction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

7.2.2 Second Price, Sealed-bid Auction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

7.2.3 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

7.2.4 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

7.3 Car Rental Agency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

7.4 Revenue Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

7.4.1 Hints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

7.4.2 Marginal Value Argument . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

1 Fundamentals of Simulation

Simulation is a systematic way of thinking. It tells stories with logic. Just as every story has actors, plots, and contexts, so does simulation. In a story, we want to know who are the actors, how they interact, under what context, and what are the outcomes of their interactions. In simulation, random variables are our actors, and we seek to understand how and why their interplay leads to certain outcomes.

1.1 The Central Idea

The central idea of simulation is simple: The expectation of a random variable X can be approx- imated by averaging the realizations {x1, x2, · · · , xN } of X, and the probability of an event (e.g., (X > 0)) can be approximated by averaging its indicators 1∗ (xi > 0):1,2

E[X] ≈ ∑N

i=1 xi N

, P(X > 0) ≈ ∑N

i=1 1∗ (xi > 0) N

For each simulation, we must first specify the relevant random variables. Like actors, they each have names and behaviors (personalities, characters). For example, Bernoulli, Binomial, and Normal are the names of typical random variables. Their behaviors are uniquely defined by their distribution functions—probability distribution function (PDF) f , or cumulative distribution function (CDF) F. Depending on the problem, either one can be pleasant to work with. Their relation is F(x) =∫ x –∞ f (t)dt for continuous case, and F(x) =

∑ ti≤x f (ti) for discrete case.

Second, we care about the context of the simulation. This is done by specifying the business and other environmental parameters, e.g., the time horizon, the production cost, and the market price. Along with the parameters of random variables, they constitute the INPUT of the simulation.

Third, we need to figure out the logic—the plot—of the simulation. That is, how the random variables interact in each context. This is the fun part of the simulation: as the logic depends on the problem specifics, we must analyze each on its own merit.

Once we have specified the INPUT and the logic, we are ready to run the simulation. The OUTPUT records the outcomes and statistics that interest us. For example, the mean profit and its 95% confidence internal. Typically it involves statistic analysis. Taking together, we have

OUPUT = LOGIC( INPUT ).

1.2 Dart Game and the Estimation of π

Jim is to play a dart game with his friend. The square frame is of two by two size, with the round board of radius one siting inside. Jim is a lousy shooter. He can make each shot in the square, but otherwise the shots are random. His friend makes him a generous offer: Jim gets free beer if he shoots on the board. What is the probability p that Jim gets free beer?

1The indicator 1A(x) of the event A is a function such that 1A = 1 if x ∈ A, and 0 if x /∈ A. An event A is a measurable set, i.e., it can be expressed by the relations among variables, e.g., A = {x : a < x ≤ b}.

2Fundamentally, probability is a special kind of expectation. For example, for a random variable X, the probability P(X ≤ t) = E1{X≤t}, i.e., the expectation of the indicator 1{X≤t}.

1.2.1 Hints

There are at least three ways to solve this problem. The first is mathematical. since Jim shots uniformly in the square, his chance of hitting the square is one. Given that, his chance of hitting in the circle would equal the ratio between the circle area and the square area: p = πr

2 2×2 = π/4. This

approach requires that you know the formula and the value of circle area. If you don’t, we need to find other ways.

The second way is to do real experiments, by asking Jim to shot. For example, If Jim shots N = 100 times, and you can count the number n of times he hits in the circle. Then the estimation of the probability p ≈ nN .

The third way is to use simulation. When the sample size N is large, it may be too costly or infeasible to do real experiments. Instead, we can carry out the experiments on computer. We focus on a quarter of the square. First, specify the INPUT. There are only two random variables— X, Y ∼ Unifrom[0, 1]—to model each shot (X, Y ) ∈ [0, 1]2. In Excel, we use RAND().

Next, we specify the logic. For each shot (X, Y ), we need to know whether it is a hit. That is, hit= IF( d<1, 1, 0), where d =

√ X2 + Y 2 is the distance between the shot (X, Y ) and the

center (0, 0). So hit is the indicator of hitting (=1) or not (=0). In Excel: hit= IF( d<1, 1, 0); equivalently, hit = 1*(d<1).

To generate a sample, we can replicate N = 1000 shots. For each shot (Xi, Yi), we know whether hiti = 1 or 0. We then do statistic analysis. The total number of hits is

∑N i=1 hiti, and the

probability p = ∑N

i=1 hiti N . In Excel, sum(hit)/N, or average(hit).

Having estimated the probability p of free beer, we can also estimate π: from the relation p = π/4, we can estimate π by π = 4×p.

How to draw the board: The board can be modelled as the circle B with radius r = 1 and center (0, 0), i.e.,

B = {(x, y) ∈ R2 : x2 + y2 = 1}.

It can be parameterized by the angle t ∈ [0, 2π] as

B = {(x, y) ∈ R2 : x = cos(t), y = sin(t), t ∈ [0, 2π]}.

If we vary t from 0 to 2π and plot ( cos(t), sin(t)

) , then we trace out the circle B.

1.3 A Dinner Deal

Sam and Mary are salespeople. Sam’s weekly sales X is uniformly distributed: X ∼ U[1, 5] (in thousands of dollars). Mary’s sales Y follows normal distribution with mean µ = 3 and standard deviation σ = 1.3: Y ∼N(3, 1.32). Each week they compare their performance by the ratio X/Y . If ratio X/Y < 0.3, Sam buys the dinner; if ratio X/Y > 0.7, Mary buys; otherwise, they split the check. Assume sample size N = 200 weeks.

Compute and plot the distribution (histogram) of the dinner outcome; i.e., how often will Sam buy the dinner, how often will Mary buy, and how often will they split?

Is this deal fair?

1.3.1 Hints

Let R = X/Y be the ratio. Depending on R, for each week, we have three outcomes/events Ai, i = 1, 2, 3:

A1 = { Sam buys } = 1 ∗ (R < 0.3) A2 = { Mary buys } = 1 ∗ (R > 0.7) A3 = { Split } = 1 ∗ (R >= 0.3)∗ (R <= 0.7).

The three events are mutually exhaustive and jointly exhaustive. They can be modeled by three Bernoulli random variables (indicators); in Excel, we use three columns to store the three indicators.

Alternatively, the outcome can be modeled by a random variable V with three values:

V = ∑3

i=1 i ·1Ai = 1 ∗ (R < 0.3) + 2 ∗ (R > 0.7) + 3 ∗ (R >= 0.3)∗ (R <= 0.7),

where V = 1, 2, 3 represents event 1, 2, and 3 occurs.

1.4 What is Modeling?

All disciplines use models. So what is a model? It is a useful abstraction of reality. It is an abstraction because it does not intend to recreate the real world on a one-to-one scale. Rather, it distills the essence of a situation so that it can be readily deployed in other similar situations.

One reason we need models is because we have limited cognitive capacity—we cannot keep track of all the details all the time. Modeling is a way to filter out irrelevance and to focus on what truly matters. So models are the lens through which we see the world. We build models because we want to highlight the most relevant factors of the circumstances. In a broad sense, language, music, painting are all different forms of modeling: they stress certain aspects of reality by ignoring the rest. As such, Simplicity is the first criterion.

The second criterion is generality. We don’t want to build a new model for every single situation; that would defeat the very purpose of modeling. Rather, we would like our models to have sufficient generality, so that we can apply the same models to different situations with limited modifications.

Models must be useful. Unlike simplicity and generality, usefulness is trickier to assess. The challenge arises from the parallel of two worlds: the real and the model world; so there is a gap one must be able to cross.

The main idea of modeling is that, if we isolate the substantive factors of a system into a model, then the outcomes generated by the model should also find counterparts in the real world. Or, if we observes certain outcomes in the real world, and the model produces similar outcomes, then these substantive factors should be determinants in the real world, too.

Both arguments rely on inductive inference. In general, the model and real world should share similarities in structure, dynamics, and outcomes. Yet these similarities are no guarantee that the two worlds move in lock step. Cases abound of otherwise. Thus, relevance has to be taken by faith, not by the logic reasoning. There are gaps that one must be willing to cross.

1.5 Excel Skills

Notation alert: We write ≡ for equal by definition. We use italicized font for math formula,

and typewriter font for Excel formula: f (x) vs. f(x), f(x). When we translate math formula into Excel:

x3 =x^3, √ N = N0.5 = sqrt(N) =N^(0.5), e–xy = exp(–x ∗ y), a[(bc) – d] = a ∗ ((b ∗ c) – d).

1.5.1 Range Names

In Excel, we always define the names of the cells before we use them. Doing so simplifies the execution tremendously. It is a habit we must develop. You cannot use the names c or r; they are taken by Exel to mean c= column and r=row. See ch. 2 (range names) on the Excel 2010 textbook.

Shortcuts for Create Names from Selection: CTRL+SHIFT+F3 in PC, or Fn+CTRL+SHIFT+F3 in Mac.

1.5.2 Data Array and Matrix Notation

In Excel, data are organized as tables or arrays. We express them with matrix notion. A m-by-n matrix A ∈ Rm×n is an array of elements aij, i = 1, . . . , m, j = 1, . . . , n, in the form of

A = [aij] =

  a11 a12 . . . a1n a21 a22 . . . a2n ...

am1 am2 . . . amn

  .

All scalars and vectors are special cases of matrices. For example, a row vector c = (c1, c2, . . . , cn) ∈ R1×n is a 1-by-n matrix; a column vector

b =

  b1 b2 ... bm

 

is a m-by-1 matrix. Let row vector b′ = (b1, . . . , bn) be the transpose of column vector b. Let x, b ∈ Rn. The inner product of vectors c and x is c · x =

∑n i=1 cixi. The notation Ax ≤ b stands

for

Ax =

  a11 a12 . . . a1n a21 a22 . . . a2n ...

am1 am2 . . . amn

    x1 x2 ... xn

  =

LHS︷︸︸︷  ∑n

j=1 a1jxj∑n j=1 a2jxj

...∑n j=1 anjxj

  ≤

RHS︷︸︸︷  b1 b2 ... bn

 

That is, Ax is a n-by-1 vector, with i-th element being the inner product between the i-th row vector of matrix A and the column vector x.

Excel: the inner product c · x=SUMPRODUCT(c,x). For matrix A, we name the relevant cells as A. To retrieve the element aij in the ith row and jth column, we use =INDEX(A,i,j). Also, Ax =MMULT(A,x) (array function, use CTRL+SHIFT+ENTER).

1.5.3 @ symbol and #This Row

The @symbol is used to select the current row of MyDataColumn; omitting it will select all rows in that column. This is clear when using aggregate such as sum, average etc. For example, =AVERAGE(@MyDataColumn]) would just return the same value present in MyDataColumn, but =AVERAGE([MyDataColumn]) would return the average of the entire column.

#This Row or @ or @[Column Name]

Just the cells in the same row as the formula. These specifiers can’t be combined with any other special item specifiers. Use them to force implicit intersection behavior for the reference or to override implicit intersection behavior and refer to single values from a column.

Excel automatically changes #This Row specifiers to the shorter @ specifier in tables that have more than one row of data. But if your table has only one row, Excel does not replace the #This Row specifier, which may cause unexpected calculation results when you add more rows. To avoid calculation problems, make sure you enter multiple rows in your table before you enter any structured reference formulas.

1.5.4 Random Variables, PDF, and CDF

A random variable X describes a population, i.e., all possible outcomes of an experiment with probability distribution function (pdf, or pmf) f (x) ≥ 0. There are two types of random variables, discrete and continuous. For example, discrete uniform (thraw a die), Bernouli (0-1), Binomial (n, p), and multinomial, Poisson are of discrete type. Continuous uniform, normal N(µ,σ2), and exponential random variables are of continuous type.

If X is discrete with outcomes X = {x1, . . . , xn} such that f (xi) > 0, then ∑n

i=1 f (xi) = 1. Here the space X is called the support of random variable (r.v.) X; it is used to model the possible outcomes in business. If X is continuous with f (x) > 0 on interval [a, b], then

∫ b a f (x) dx = 1.

For a given pdf f (x), we can generate its cumulative distribution function (CDF) F(x):

F(xi) = i∑

j=1 f (xj) = f (x1) + f (x2) + · · ·+ f (xi), ∀i = 1, 2, . . . , n, if X is discrete

F(x) = ∫ x

a f (t) dt = the area under the pdf curve from a up to x, if X is continuous.

Excel: We express pdf and cdf by two columns of data: the first column lists the value of possible outcomes x; the second column gives corresponding f (x) (or F(x)).

1.5.5 Point Mass δx: Use Distribution to Model Scalars

A special distribution is point mass f (t) = δx (t) at x, i.e., with probability 1 at point x and 0 elsewhere. We can use point mass to express outcomes. For example, consider X = {1, 3, 5}. The outcome 3 is equivalent to the distribution

f = [0, 1, 0] = [f (1), f (3), f (5)] = δ3.

For outcome space X = {x1, x2, . . . , xn}, any outcome xi ∈ X can be equivalently expressly as the point mass δxi .

For discrete random variable X ∈X with px = P(X = x), the PDF f (t) is the weighted average of point masses, where the weights are px :

f (t) = ∑ x∈X

pxδx (t).

1.5.6 Generate Random Variables

To generate a random variable X from CDF F, we first create a uniform random seed U ∼U[0, 1] between 0 and 1; in Excel U = rand(). Then

X = F–1(U).

To see why, let G be the CDF of random variable X = F–1(U). We need to show that G equals F; i.e., the random variable indeed behave as the function F dictates. Indeed, this is the case:

G(x) = P{X ≤ x} = P{F–1(U) ≤ x} by definition X = F–1(U) = P{F(F–1(U)) ≤ F(x)} since F is increasing

= P{U ≤ F(x)} since F ( F–1(U)

) = U

= F(x). since U ∼U[0, 1] with CDF FU (u) = P(U ≤ u) = u.

For example, consider exponential random variable X ∼ exp(λ), with mean µ = EX = 1/λ, and CDF

F(x) = 1 – e–λx , x ≥ 0.

Hence, solving U = F(X) gives

X = F–1(U) = – 1 λ ln(1 – U).

Since U ∼U[0, 1] implies (1 – U) ∼U[0, 1], we have

X = –µ ln U. (1)

Excel:

random seed: U = RAND() continuous uniform U[a,b]: = a + (b-a)*RAND() discrete uniform U{a,...,b}: = RANDBETWEEN(a,b) ,or =a + INT((b-a+1)*rand()) normal(mu, sigma): =NORM.INV(rand(), mu, sigma) exponential(mu): = - mu*LN(rand()) Binormial: =BINOM.INV( n, p, rand()) Bernoulli: =BINOM.INV( 1, p, rand()), or 1*( rand()<p ).

To generate a discrete random variable X from CDF F, we use

X = min{xj : U < F(xj)},

i.e., the smallest x value whose CDF dominates U.

For an arbitrary CDF F, we first expressed it as two columns (xs, Fs) (equivalently as two rows), where

xs =

  x1 x2 ... xn

  , Fs =

 

0 F(x1) F(x2)

... F(xn)

  . (2)

We add a zero on top of Fs to accommodate Excel.3

To generate a random variable X that follows F, we use X= INDEX( xs, MATCH(rand(), Fs) ). Equivalently, id= MATCH( rand(), Fs) X= INDEX( xs, id ).

Here, MATCH function is to find, in the column Fs, the id of the largest element that is less than random seed rand(). Then INDEX is to take out the corresponding element (of the same id) from the column xs. See file F.INV.xls for detail.

1.5.7 Simulation, Sampling, Sample, and Sample Path

For a simple outcome, each observation (realization, instance) of random variable X ∼ F is a random number; we create it with xi = F–1(Ui), Ui ∼ U[0, 1], i = 1, . . . , N. The set x = {x1, . . . , xN } is a sample of population X with sample size N. The process to generate the sample is called simulation (or sampling).

Excel: For a simple simulation, the outcome can be describe by a single number. After generating the first observation x1, we drag down to generate the sample of size N. Usually, the first column is the index ID i of each observation xi, and the second column contains the value xi = F–1(Ui).

Sample path: for a complex outcome, each element of a sample consists of a sequence of T ob- servations, x = (x1, x2, . . . , xT )′; we call it a sample path. For example, weekly stock prices of a year T = 52, or coke purchasing history xt ∈ {coke, pepsi} of a year. In this case, the sample [x1, x2, . . . , xN ] is essentially a T-by-N matrix.

1.5.8 Conditional Probability and Markov Chain

Almost all businesses involve uncertainty and dynamics. We use random variables to capture uncertainty, as showing by standard deviation σ.

We use Markov chain to capture dynamics. A markov chain is a sequence of correlated random variables {Xt}t≥0 = {X0, X1, X2, . . .}. Here Xt ∈ S = {1, 2, . . . , k} describe the system state in period t. S is the state space, i.e., all possible outcomes for state variable Xt. The system dynamics is captured by the conditional probability: Given the current state Xt, we can predict the next period state Xt+1 by the conditional probability

pij ≡ P(Xt+1 = j|Xt = i), i, j ∈ S. 3In Excel, match(U, F) gives max{xj : F(xj) ≤ U }, but we want to find x = min{xi : U < F(xi)}. In other words,

Excel MATCH function provides the largest index among those who are smaller or equal to U ; but we want to find the smallest index among those who are greater than U .

The transition matrix P that defines the Markov chain is given by

P = [pij] =

  p11 p12 . . . p1k p21 p22 . . . p2k ...

pk1 pk2 . . . pkk

  .

In matrix P, the Xt = ith row describe the probability that the next state Xt+1 = j. Hence∑k j=1 pij = 1. For example, p12 is the probability of the next state Xt+1 = 2, given current state

Xt = 1.

1.5.9 Simulate a Markov chain

Each Markov chain is specified by a transition matrix P and initial distribution F0. Given transition matrix (PDF) P, we first construct a CDF matrix Q with additional zero as the first column:

Q = [Qij] =

  0, p11, p11 + p12,

∑3 j=1 p1j, . . . , 1

0, p21, p21 + p22, ∑3

j=1 p2j, . . . , 1 ... 0, pk1, pk1 + pk2,

∑3 j=1 pkj, . . . , 1

  .

The first column of zeros is to deal with EXCEL peculiarity. Each row of matrix Q is the conditional CDF we use to generate the next random variable Xt+1. For example, given current Xt = i, we use CDF Ft = INDEX(Q, i, 0)—the ith row of matrix Q—to create Xt+1.

Second, create the initial state X0 ∼ F0, where F0 is the CDF (with zero added on top): X0 = MATCH(RAND(), F0).

Third, in period t > 0, suppose the current state is Xt, we generate next period state by

Xt+1 = MATCH(rand(), INDEX(Q, Xt, 0)).

Here, Ft = INDEX(Q, Xt, 0) is to take out the Xt-th row of the CDF matrix Q, Ft, where 0 stands for the entire row. Then MATCH returns the next state Xt+1 by using inverse transform Xt+1 = [Ft]–1(U).

1.5.10 Statistics and Histogram

For a sample (column vector) x = (x1, . . . , xN )′, we care about the following statistics:

x̄ = ∑N

i=1 xi N

, (3)

s2 = ∑n

i=1(xi – x̄)2 N – 1

, (4)

95% CI = ( x̄ –

1.96× s √ N

, x̄ + 1.96× s √ N

) , (5)

P(x < b) = ∑N

i=1 1∗ (xi < b) N

. (6)

Excel:

∑ i xi =SUM(x), x̄ =AVERAGE(x)=SUM(x)/N, s =STDEV(x).

P(x < b) =SUM( 1*(x<b) )/N = AVERAGE( 1*(x<b) ).

1*(x<b) is the indicator of event (x<b): it equals 1 if the statement is true; 0 otherwise. This is a shorthand for IF((x<b), 1, 0).

The number 1.96 comes from NORM.INV(0.975, 0, 1).

Histogram: For a sample with data in column SAMPLE and size N, the best way to see its behavior is to build its histogram—the empirical PDF distribution.

In Excel, there are two ways. First, construct a category/class/type column that sample values may take, call it BIN, then, create another column for relative frequency: = FREQUENCY(SAMPLE, BIN)/N. Since this is an array function we use CTRL+SHIFT+ENTER (CMD+SHIFT+RETURN in Mac). 4

The second way is only for discrete random variables (e.g., Binomial): use =COUNTIF(SAMPLE, BIN), where BIN is category value.

1.6 Excel Tips

1.6.1 How to Enter an Array Formula

use CTRL+SHIFT+ENTER, or CMMD+SHIFT+RETURN. Specifically, after you select the block and enter the formula, you need to push down and hold the two keys "CTRL + SHIFT", and then press "ENTER" to finish. Youtube: https://www.youtube.com/watch?v=uqDtwfKxZyY

1.6.2 How to Sort Data with LARGE and SMALL

Suppose we have unsorted data, a column vector x = (x1, x2, . . . , xn)′. Define vector rank = (1, 2, . . . , n)′. Then, let vector y = (y1, . . . , yn)′ be the sorted data of x from the smallest to the largest, and vector z from the largest to the smallest:

y = SMALL(x, rank), z = LARGE(x, rank),

where we enter them as array formula (CTRL+SHIFT+ENTER).

1.6.3 How to Install Solver on Mac

http://www.econ.ucla.edu/riley/106P/OPTIMIZATION/MacbookUsers.htm

1.6.4 How to Speed Up Excel Execution:

https://longgao.wordpress.com/2010/04/20/how-to-speed-up-excel 4We use PC. For shortcuts in Mac, google “Shortcuts in Excel 2016 for Mac.”

1.6.5 Excel shortcuts

https://exceljet.net/keyboard-shortcuts

https://support.office.com/en-us/article/Keyboard-shortcuts-in-Excel-2016-for-Mac-acf5419e-1f87- 444d-962f-4e951a658ccd

1.6.6 How to get help on Excel

press (Fn) F1 in Excel, look up Excel 2010 textbook, or stop by office hours.

1.7 Textbooks

We mainly use lecture notes. The required textbook is

Excel 2010: Data Analysis and Business Modeling, by Winston, Microsoft Press, 2011.

For later editions of the Excel book, the chapter numbers may differ, but the chapter name and contents are the same.

Here are recommended books for simulation:

1. Probability Models for Economic Decisions, by Roger B. Myerson, Duxbury Press, 2005. This book provides economic framework.

2. Simulation, by Sheldon M. Ross, Academic Press. This book provides mathematical founda- tion for simulation.

3. Simulation and the Monte Carlo Method, by Reuven Y. Rubinstein, and Dirk P. Kroese, Wiley. This one covers the foundation well, with MATLAB codes.

2 Operations and Supply Chain Models

In a changing business world, firms often need to made decisions with uncertain outcomes. For example, to plan a promotion, a furniture company needs to stock chairs before customer demand and selling price are fully known (§2.2). To save cost, a manufacturer needs to buy components overseas, and run production long before customers place their orders (§2.3). In each case, the decision is made when the future is uncertain—bet before tossing the coin.

2.1 Decision Making under Uncertainty

Almost all decision making problems can be framed as an optimization problem (max or min):

max x∈X

V (x).

It has three elements: the decision x, the objective function V (x), and the constraints x ∈X (e.g., 0 ≤ x ≤ 10). The decision x is the action we take. It leads to various outcomes g(x, Y ), which are affected by random variable Y ∼ f . The random variable Y ∈ Y describes the world state, i.e., uncertain demand, supply, price, or market condition. Our objective is to maximize some chosen criterion V (x); e.g., expected profit or cost

V (x) = E[g(x, Y )] = ∑ y∈Y

g(x, y) · f (y) = ∫ Y g(x, y)f (y) dy.

The constraints stipulate the feasible actions we can take; e.g., limited resources, budgets, or time confine what we can do. Let x∗ be the optimal decision, and V∗ the optimal objective function. Hence, V∗ = V (x∗) = maxx∈X V (x).

We consider two approaches to find the optimal decision x∗. The first is simulation. It works as fol- lows. Let X = {x1, x2, . . . , xK } be the set of feasible decisions we can make. For each decision xk ∈ X , we first simulate a sample of world states, a column vector y = (y1, y2, . . . , yN )′. For each state yi, we compute the outcome g(xk, yi) to obtain the outcome sample

( g(xk, y1), g(xk, y2), . . . , g(xk, yN )

)′. Then the expect profit under decision xk is

V (xk) = E[g(xk, Y )] = ∑N

i=1 g(xk, yi) N

We do so for each feasible decision xk, k = 1, 2, . . . , K, to obtain (V (x1), V (x2), . . . , V (xK ))′. The optimal decision is the one that yields the largest expected value V (x). In Excel, we can accomplish this by functions MAX, MATCH, and INDEX; or we can plot decisions (x1, x2, . . . , xk)′ against expected profit (V (x1), V (x2), . . . , V (xK ))′, and find the optimal decision visually. Alternatively, once we specify the relation between input x and output V (x), we can use Excel Solver to find the optimal decision x∗.

The second approach is mathematical (optional for this class). If the objective function V (x) is well-behaved (e.g., convex or concave, i.e., V ′′(x) ≥ 0, or V ′′(x) ≤ 0), we can find x∗ by the marginal analysis. First, find the unconstrained choice x∗ that makes the marginal value nil. This is expressed as the first order condition:

d dx

V (x∗) = V ′(x∗) = 0.

If it also satisfies all the constraints (x∗ ∈ X), it is the optimal decision; otherwise, check the boundary points to find the optimal choice. For an illustration, see the newvendor model in §2.5.

2.2 A Sales Promotion Model

A catalog company plans a furniture promotion a year from now. It must order before the selling season starts. The company plans to buy S = 3000 chairs at a unit cost c = $175. The promotion will last for two months. At this point, both demand and price are uncertain, because of the market conditions, fashion trends, and effectiveness of advertising campaign, and competitors’ moves. The company forecasts that the random demand follows normal distribution: D ∼ N(µ = 2000,σ2 = 5002), and the selling price follows uniform distribution: p ∼ U[$200, $300]. After the promotion, the company can sell all leftover units at the 50% discount price v = p/2. Assume sample size N = 1000. a) If the company orders S = 3000, what is the distribution (histogram) of the (annual) total profit? b) For S = 3000, what is the total expected profit EV (including both promotion and discount sales)? And what is the stockout probability SP (i.e., the probability that the demand during promotion is more than supply S)? c) What is the total expected profit EV and stockout probability SP, when S = 500, 600, . . . , 3500? [Hint: use Data Table] d) Plot the graphs of profit EV (S), and stockout probability SP(S). Find the optimal order quantity that maximizes the total expected profit EV .

2.2.1 Hints

For each decision S, we find its economic consequence, the expected profit EV (S).

Excel:

demand= max(0, NORM.INV(rand(), mu, sigma) ) price= 200 + (300 - 200)* RAND() sales= demand*(demand<S) + S*(S<demand) leftover= S - sales profit= price*sales + 0.5*price*leftover - unit_cost*S

The purpose of demand = max(0, ·) is to ensure that demands are nonnegative.

2.3 Flextrola

Flextrola, Inc., an electronic systems integrator, is planning to design a key component for their next generation product with Solectrics. Flextrola will integrate the component with some software and then sell it to consumers. Given the short life cycles of such products and the long lead times quoted by Solectrics, Flextrola only has one opportunity to place an order with Solectrics prior to the beginning of its selling season. Fletrola’s demand during the season is normally distributed with a mean of 1000 and a standard deviation of 600.

Solectrics’ production cost for the component is $52 per unit, and it plans to sell the component for $72 per unit to Flextrola. Flextrola incurs essentially no cost associated with the software integration and handling of each unit. Flextrola sells these units to consumers for $121 each. Flextrola can sell unsold inventory at the end of the season in a secondary electronics market for $50 each. The existing contract specifies that once Flexitrola places the inventory, no changes are allowed. Also, Solectrics does not accept any returns of unsold inventory, so Flexitrola must dispose of excess inventory in the secondary market.

Assume sample size N = 5000, and Flexitrola orders S = 1200 units.

1. What is Flextrola’s stockout probability, i.e., probability that the demand cannot be satisfied?

2. What is Flextrola’s expected sales?

3. How many units of inventory can Flexitrola expect to sell in the secondary electronics market?

4. What is Flextrola’s fill rate, i.e., the fraction of demand that is satisfied?

5. What is Flextrorla’s expected gross margin percentage, i.e., (Revenue - Cost)/Revenue? Note Revenue includes both regular sales and salvage revenue.

6. For S = 1200, what is the distribution (i.e., histogram) of Flextrola’s (seasonal) profit? What is Flextrola’s expected profit V (S) and its 95% confidence interval (CI)?

7. What is Solectrics’ expected profit G(S)?

8. By the contract the menu of order quantities that Flextrola can order is S = 1000, 1100, . . . , 1500, 1600. Which order quantity can maximize the expected profit of Flextorla? [Hint: for each S = 1000, . . . , 1600, run simulation to find the profit V (S) and its CI. Then plot the curve V (S) and CI for each S and find the maximal S along with CI for each S. Use Data Table in Excel.]

9. Which order quantity can maximize Solectrics’ expected profit G(S) given the contractual menu of the order quantities?

2.3.1 Hints

In Excel, we can simulate the operations as follows.

1. INPUT block specifies the parameters for the business. It uses range names to define cells that we will use later. Doing so eliminates the needs for cumbersome data selection via click-and-select.

2. SAMPLE block simulates the business for 5000 seasons (sample size N). It generates random demand by D=MAX(0, norm.inv( rand(), mu, sigma )). The purpose of MAX(0, . ) is to ensure nonnegative demand. To ensure integer demand, use INT( norm.inv... ). To generate anther sample, press F9.

3. IF condition is the building block for most analysis. For example, IF( demand < S, 1, 0) returns 1 if (demand<S), and 0 otherwise. Equivalently, we can write: 1*(demand < S). In general,

4. Sales: = demand* (demand<S) + S* (S<demand). Since conditions (demand<S) and (S<demand) are mutually exclusive, only one of them will be 1 (TRUE), which provides the condition for sales quantity.

5. Fill rate: if demand > 0, fillrate = sales/demand; if demand = 0, fillrate = 1. In Excel, fillrate=IF(demand>0, sales/demand, 1).

6. OUTPUT block carries out statistic analysis. Typically we need to compute sample mean x̄ (AVERAGE), standard deviation s (STDEV.S), and 95% confidence interval x̄ ±1.96∗s/

√ n

for each estimation. Note that 1.96 comes from NORM.INV( .975, 0, 1). If we want 99% confidence interval, we should use 2.56 from NORM.INV( .995, 0, 1), and 1.645 for 90% CI from NORM.INV( .95, 0, 1).

7. DATA TABLE block conducts sensitivity analysis. It shows how the outputs of interest (e.g., mean profit and stockout probability) vary with input (e.g., quantity decision S). As such, Data table is extremely useful in Excel.

8. OPTIMAL SOLUTION finds the optimal solution for the Data Table. It first finds the largest profit by LARGE( mean_profits, 1), then checks out the corresponding ID by MATCH( optimal_profit, mean_profits, 0), where 0 ensures precise match. Finally, it finds the optimal quantity by INDEX( Ss, ID, 1).

2.4 Supply Risk Management

Consider a pharmaceutical company facing random demand D U[0,10] (continuous uniform) for the flu vaccine during the flu season. There are two suppliers. Supplier 1 has a unit ordering cost Ci = $1M and a random yield factor X1 ∼ U[0.5, 0.8] (continuous uniform); that is, if the firm orders Q unit from supplier 1, he only receives S1 = X1 · Q and pays c1S1. Supplier 2 has a unit purchasing cost c2 = $2M and is completely reliable (yield factor X2 = 1). The firm needs to place the order before the flu season; during the flu season, each unit fulfilled demand brings revenue $4M. Suppose the company order Q from supplier 1 before the flu season; during the season in case of supply shortage(i.e., received supply from supplier 1 is less than realized demand), the company will meet all the shortfall (the amount of demand that cannot be met by the stock delivered by supplier 1) by expediting or rushing orders from supplier 2 (at unit purchasing cost c2 = $2M). The percentage of the total demand fulfilled by supplier 2 is called rush probability R. Assume sample size N = 3 x 103.

a) Suppose the order quantity Q = 20. What is the expectation and 95% confidence interval of profit W? What is the expectation of the rush probability R?

b) The possible order quantities are Q ∈ {1, 2, . . . , 20}. Compute the expectation of rush probability R(Q), the expectation and 95% CI of the profit W (Q) under each order quantity. Plot the mean and CI of W(Q) and find the optimal order quantity Q*.

c) Can you explain why dual sourcing strategy, i.e., using both suppliers (0 < R < 1), is better than single sourcing strategy, i.e., using either supplier 1 (R = 0) or supplier 2 only (R = 1)?

d) Disruption risk. Rather than the random yield risk, the supplier 1 is susceptible to disruption risk: upon receiving order Q, he delivers S1 = Q in full with probability p1 ∼ U[0.5, 0.8], and delivers nothing S1 = 0 with probability (1 – p1). All else equal, what is the optimal replenishment strategy for the company? Which risk is more harmful to the company?

2.5 Newsvendor Model

If you are really curious, here is how to solve the newsvendor problem mathematically.

Everyday morning, a newsboy buys x newspapers at unit price c before seeing the demand. He then sells at price p during daytime, and salvages the leftover at price v in the evening. The daily

demand Y is a random variable, with probability density function (PDF) f (y) and cumulative distribution function (CDF) F(y).

(i) Show that the optimal order quantity x∗ is

F(x∗) = p – c p – v

. (7)

(ii) Use formula (1) to find the optimal order quantity x∗ for the Flextrola problem. Compare it with the simulation solution. What’s the difference?

SOLUTION: To attack the problem, we need four weapons:

(a) ddx (y ·x n) = nyxn–1, ddx (c) = 0, where c is a constant.

(b) For random demand Y , F(y) = ∫ y 0 f (y)dy, and E[g(Y )] =

∫∞ 0 g(y) · f (y)dy. In particular,

E[leftover] = E[max(x – Y , 0)] = ∫ x 0 (x – y) · f (y)dy, (8)

E[lost sales] = E[max(Y – x, 0)] = ∫ ∞

x (y – x) · f (y)dy. (9)

min a≤x≤b

V (x). (10)

It has three elements: x is the decision, V (x) is the objective function, and 0 ≤ x ≤ b are the constraints. Marginal Value Argument: if the objective function V (x) is well-behaved, then at the optimal decision x∗, the marginal value of decision x is zero, i.e.,

d dx

V (x) = V ′(x) = 0. (11)

(d) Leibniz integral rule:

d dx

∫ b(x) a(x)

g(x, y) ·dy = g ( x, b(x)

) ·b′(x) – g

( x, a(x)

) ·a′(x) +

∫ b(x) a(x)

d dx

g(x, y) ·dy. (12)

(i) Armed with the four weapons, we are now ready to attack. First, we use weapon (c) to frame the problem as an optimization:

min x≥0

V (x),

where x is the order quantity, V (x) is the objective function of the total expected cost at quantity x, and x ≥ 0 is the constraint.

Next, we use weapon (b) to specify the objective function V (x). We invoke logic of too-much-too- little logic: for each realized demand y, (i) if demand y > x, we experience lost sales of (y – x) at unit cost b = p – c; (ii) if demand y ≤ x, we suffers leftover of (x – y) at unit cost h = c – v. Hence, total expected cost is

V (x) = hE[leftover] + bE[lost sales]

= h ∫ x 0 (x – y) · f (y)dy + b

∫ ∞ x

(y – x) · f (y)dy. by weapon (b)

To find the optimal quantity x∗, we now enlist the marginal value argument (weapon (c)): V ′(x∗) = 0.

V ′(x) = h d dx

∫ x 0 (x – y) · f (y)dy︸︷︷︸

A(x)

+b d dx

∫ ∞ x

(y – x) · f (y)dy︸︷︷︸ B(x)

= hA(x) + bB(x), (A)

where

A(x) = d dx

∫ x 0 (x – y) · f (y)dy

= (x – x)f (x)(x)′ – (x – 0)f (0)(0)′ + ∫ x 0

d dx

[(x – y)f (y)]dy by Leibniz integral rule

= ∫ x 0

f (y)dy = F(x) by CDF definition, F(x) = ∫ x 0

f (y)dy, (B)

and

B(x) = d dx

∫ ∞ x

(y – x) · f (y)dy

= (∞– x)f (∞)(∞)′ – (x – x)f (x)(x)′ + ∫ ∞

d dx

[(y – x) · f (y)]dy

= – ∫ ∞

x f (y)dy = –[1 – F(x)]. by CDF definition, F(x) =

∫ x 0

f (y)dy (C)

Hence, by Eqs. (A), (B), and (C), we have V ′(x∗) = hF(x∗) + b(–1)[1 – F(x∗)] = 0, i.e.,

F(x∗) = b

b + h =

p – c (p – c) + (c – v)

= p – c p – v

as desired.

(ii) In the Flextrola case, we find backorder cost b = 49, holding cost h = 22, and critical ratio b/(b + h) = 0.69. Given demand Y ∼ N(µ = 100,σ2 = 6002), the optimal quantity is x∗ = F–1(0.69) = 1297.749769, which can is found either by Z-table or by Excel = NORM.INV(0.69, 1000, 600).

2.6 Excel Skills

All programs boil down to two elements: loops and logical conditions. Excel uses the Data Table (under What-If) for loops. It has build-in functions for logical operations (e.g., IF, COUNTIF, SUMIF, COUNTIFS, SUMIFS). We will use indicators to simplify logical operations.

2.6.1 Loops and Data Table

In Excel, the Data Table is the most powerful tool we must know. It allows us to change chosen INPUT, and get the corresponding OUT in a simple table. Essentially, it evalutes the function:

OUT= SIMULATION( INPUT ).

Suppose we want to evaluate function y = f (x), i.e., the output y of function/relation/process f for varying input x. We can define the function f explicitly as a one-line function (e.g., 2*x). Or, we define f implicitly as a process (e.g., dart game simulation). In this case, we set x as our input, and the output of the process as y to track.

We use one-way table for y = f (x), and two-way table for z = f (x, y). To implement one-way table, we put the xs in the first column (a21, a31, . . . , am1)′ and the output name (=y) in the top cell a12 of the second column (a12, a22, . . . , am2)′.5 To implement two-way table A = [aij], we must put the output name we want to track in the cell a11, put xs in the first column (a21, a31, . . . , am1)′, and put ys in the first row (a12, a13, . . . , a1n).

Excel can only invoke one layer reference. Suppose y = f (x), and z = g(y), are the results of two consecutive processes, When we construct the table for z = g(f (x)), it won’t work because in g(y), as Excel insists, there is no x involved. Therefore, the first layer y = f (x) won’t execute.

Still, in Excel, the two-way table is the best solution for sensitivity analysis. If you have to create 10 separate spreadsheets for evaluating the impact of 10 levels of the same parameter/decision, perhaps you should consider Data Table. In Excel,

Data-> What-If Analysis-> Data Table ...

For more details on Data Table in Excel, see

http://chandoo.org/wp/2010/05/06/data-tables-monte-carlo-simulations-in-excel-a-comprehensive- guide/

https://www.youtube.com/watch?v=IECd65RVoPg

See DataTable.pdf for detail.

2.6.2 Logical Operations and Indicators

Logical functions in Excel, hit F1 for help in Excel:

Function Description AND Returns TRUE if all of its arguments are TRUE FALSE Returns the logical value FALSE IF Specifies a logical test to perform NOT Reverses the logic of its argument OR Returns TRUE if any argument is TRUE TRUE Returns the logical value TRUE

However, using the notion of indicator, we can greatly simply our work. We summarize logical operations as follows.

Indicator Let IA be the indicator of event (statement) A (e.g., D > S), i.e, the logical value of the statement: IA = 1 if A is true; 0 otherwise. In Exel: IF(A,1,0), or simply 1*(A).

Causality A ⊂ B denotes A is a subset of B, i.e., A implies B. 5In case you have one line formula for f (x), put it in am2.

OR A∪B be the union of A and B. In Excel: OR(A,B), or (A+B) if we interpret positive numbers as TRUE.

AND A∩B the intersection of A and B. In Excel: AND(A,B), or (A*B).

NOT Ac, i.e., not A. In Excel: NOT(A), or 1-A.

Moreover, let A\B be A ∩ Bc, i.e., A but not B, A × B the Cartesian product of A and B. Let X denote the entire sample space. We have the following useful properties for general indicator functions:

I∅ = 0 IX = 1

A ⊂ B ⇔ IA ≤ IB IA∩B = IA · IB = min(IA, IB) IA∪B = IA + IB – IA · IB = max(IA, IB) IA\B = IA – IA · IB IA×B = IA · IB.

3 Financial Models

Finance is concerned with the allocation (investment) of assets and liabilities over space and time. One challenge is that business conditions often change dynamically and stochastically over time §3.1. We focus on three financial models, i.e., asset pricing §3.3, hedging §3.4, and investment §3.5.

Notation: We write ≡ for equal by definition. We use italicized font for math formula, and typewriter font for Excel formula: f (x) vs. f(x), f(x). When we translate math formula into Excel:

x3 =x^3, √ N = N0.5 = sqrt(N) =N^(0.5), e–xy = exp(–x ∗ y), a[(bc) – d] = a ∗ ((b ∗ c) – d).

3.1 Dynamic Systems and Brownian Motion

In business, we often face dynamic situations that evolve over time. For example, GE’s stock price in year 2008, daily sales in the past week. To describe their evolutions, we use the notion of sample path. Specifically, let random variable Xt be the system state in period t. The process

X = (Xt)Tt=0 = {X0, X1, X2, . . . , XT }

fully describes the system evolution. In particular, each realization (xt)t≥0 of X—a sample path— gives a specific trajectory of the evolution.

To specific the dynamics, we need the initial condition X0 and the recursion Xt = G(Xt–1) (the linkage between two consecutive states). For example, let X0 be the initial stock price. The stock price Xt follows the geometric random walk (discrete time Brownian motion), whose recursion G is given by

Xt = Xt–1 ·e(µ–σ 2/2)∆t+σ

√ ∆t·Z , (13)

where ∆t is the period length (step size), µ is the drift, σ is the volatility, and Z ∼N(0, 12) is the standard normal variate. In Excel, Z = NORM.INV(rand(), 0, 1); given Xt–1, the next period Xt is given by Eq. (13).

Therefore, for a dynamic system, each element of a sample is a sample path, consisting of a sequence of T observations, x = (x1, x2, . . . , xT )′. For example, weekly stock prices xt ∈ R+ of a year with T = 52, or coke purchasing history xt ∈ {coke, pepsi} of a year. If the sample size is N, then the sample [x1, x2, . . . , xN ] is essentially a T-by-N matrix.

3.2 Derivatives: Forwards, Futures, and Options

A derivative is a contract that derives its value from the performance of an underlying entity. The underlying entity can be an asset, index, or interest rate. Derivatives can be used to hedge risks, speculate (take a view on the future direction of the market), lock in an arbitrage profit, change the nature of a liability or an investment (without incurring the costs of selling one portfolio and buying another). Most derivatives are traded over-the-counter (off-exchange) or on an exchange (e.g., the New York Stock Exchange).6

6There are three main categories of financial instruments: derivatives, stocks (i.e., equities or shares), and debt (i.e., bonds and mortgages). The oldest example of a derivative in history is thought to be a contract transaction of olives, entered into by ancient Greek philosopher Thales, who profited from the exchange. A more recent historical example is bucket shops (outlawed a century ago). In the United States, after the financial crisis of 2007–2009, there has been increased pressure to move derivatives to trade on exchanges. See https : //en.wikipedia.org/wiki/Derivative.

Typical derivatives include forwards, futures, options, and swaps.

• A futures/forward contract gives the holder the obligation to buy or sell at a certain price. An option gives the holder the right to buy or sell at a certain price.

• A call option is an option (right) to buy a certain asset by a certain date T for a certain price K (the strike price).

• A put option is an option to sell a certain asset by a certain date T for a certain price K (the strike price).

• An American option can be exercised at any time during its life.

• A European option can be exercised only at maturity T.

• An Asian option payoff is related to average asset price.

Let r be the risk-free interest rate. For European options with maturity T, strike price K, and the spot price of asset at maturity XT , the payoffs are

call.payoff = max(XT – K, 0)e–rT , (14) put.payoff = max(K – XT , 0)e–rT . (15)

3.3 Pricing Asian Options

Unlike European and American options, the price of an Asian option depends the average price of the stocks:

X̄ = ∑T

t=1 Xt T

The stock prices (Xt)Tt=1 evolve according to geometric random walk:

Xt = Xt–1 ·e(µ–σ 2/2)∆t+σ

√ ∆t·Z , (16)

where ∆t is the period length (step size), µ is the drift, σ is the volatility, and Z ∼ N(0, 12) is the standard normal variate. In Excel, Z = NORM.INV(rand(), 0, 1). Once we have Xt–1, the next period Xt is given by Eq. (16).

The present value of a call Asian option is

R0 = max(X̄ – K, 0) ·e–rT .

For a put Asian option, it is R0 = max(K – X̄, 0) ·e–rT .

The fair price is E[R0].

Let the initial stock price X0 = 100, the annual drift µ = 0.08, the volatility σ = 0.12, exercise price K = 97, ∆t = 1/52 (year), maturity T = 1, risk-free rate r = 0.059, and sample size N = 1000 (runs/replications/years). Questions:

1. Plot the sample paths (Xt)52t=1 for the first 3 runs (each run is one year with initial price X0 = 100). Is there any pattern?

2. What is the distribution (histogram) of R0?

3. How should it the put option be priced, i.e., E[R0]?

4. What is the probability P(R0 > 1.1)?

5. Answer the above questions for Asian call option.

3.4 Hedging

The idea of hedging is similar to insurance: by paying a small amount up front, one gets insured against substantial downside risks. For example, suppose an investor holds the stock, whose price St evolves according to the geometric random walk:

St = S0 ·exp ( (µ – σ2/2)∆t + σ

√ ∆t ·Z

) ,

where initial price S0 = 100, annual drift µ = 0.08, volatility σ = 0.12, and standard normal variate Z ∼N(0, 12). Also, the risk free rate is r = 0.059.

The investor considers two strategies. The first strategy relies on luck and does not consider hedging, leaving his fortune to the stock market. Hence his payoff is

(ST – S0)e–rT .

The second strategy uses the European put option. To limit the down side risk, he buys the put option at the price c = 0.57, with maturity T = 0.5 year, step size ∆t = 0.5 year, and exercise price K = 97. So if the market price drops below K, he will exercise the option, selling at price K = 97. Otherwise, he will hold the stock with payoff St – S0. Hence, this strategy gives him payoff

[(K – S0) · (ST < K) + (ST – S0) · (ST ≥ K)] ·e–rT – c.

Here, we use indicators to simplify the formulation. Let 1A be the indicator of event A: 1A = 1 if A occurs, 0 otherwise. In Excel, e.g., 1 ∗ (S > K) is the shorthand for IF(S > K, 1, 0).

Assume same size N = 2000. We are interested in three questions:

Q1: What is the distribution (histogram) of the stock price ST ? mean and its 95% CI? Downside risk P(ST – S0 < 0)?

Q2: The payoff of a European put option is R0 = max(K – ST , 0)e–rT . What is the distribution (histogram) of R0 (the present value of the payoff of the put option)? How should it the put option be priced, i.e., E[R0] =? What is its 95% CI? Is c = 0.57 in the CI?

Q3: Compare the return of the next two investment strategies. Which one is riskier? Why? Draw your conclusion by the comparison of the distribution; mean; variance; downside risk of each strategy. Stock only (no hedging); Stock plus put option (hedging).

3.4.1 Main Insights

The main purpose of financial hedging is to reduce the risk, i.e., the variance of the payoff. We observe that, although hedging and stock-only strategies yield similar expected payoffs, the variance of the latter is much higher. This is because hedging limit downside loss to about $4, while stocks have no limit on loss.

3.5 Investment Strategies

A fund has a value of $1,000,000. Half of the fund is invested in a stock portfolio whose value follows the Geometric Brownian motion (geometric random walk model), and half is invested in bonds paying 5 percent per year. You plan to withdraw $120,000 per year at the end of each year. Your plan is to withdraw $60,000 from the stock portfolio and $60,000 from the bond portion. If either portion does not have $60,000, then the portion will be emptied and the remaining amount withdrawn from the other portion until both parts are depleted. Show the distribution of the length of time until the fund is fully depleted.

Now consider the following withdrawal policy. Other things being equal, we withdraw the money from the bond fund until it is depleted, then withdraw the money from the stock portfolio. Compare the distribution of the life of the funds with that obtained from the first strategy.

The system parameters are as follows:

%% inputs B0= 500; %initial bond fund S0= 500; %initial stock fund m= 0.08; %drift s= 0.12; %volatility dt= 1; %year, time interval r=0.05; %risk free interest rate mT= 50; %maximum allowed years N=2000; %sample size

3.5.1 Hints

We can view the system as an inventory system with two types of inventories, stock and bond funds; the original ‘demand’ for each fund is D = 60K; aside from withdrawing policy stipulated, the inventories are also altered by geometric random walk and interest rate, respectively.

Specifically, we track the system state in period t by (St, Bt): the value of stock fund St and bond fund Bt before withdraw, respectively. Then we need to figure out the system recursion dynamics, the link between period t + 1 and t. We need to find the stopping time T such that ST + BT < 0 for the first time.

We can first generate a sample path with fixed periods, say 50. Then use stopping criterion of St + Bt < 0 to find the stopping time T from this sample path. After that, we replicate N = 2000 sample paths (and find stopping time T for each path) for the sample of stopping time T. Then carry out the analysis based on the sample of T with size N = 2000.

3.5.2 Formulation

We use indicators to simplify the formulation. Let 1A be the indicator of event A: 1A = 1 if A occurs, 0 otherwise. In Excel, e.g., 1(D>S) = 1 ∗ (D > S); it is the shorthand for IF(D > S, 1, 0).

We view the system as an inventory system with two types of inventories, stock and bond funds; the original ‘demand’ for each fund is D = 60K; aside from withdrawing policy stipulated, the inventories are also altered by geometric random walk and interest rate, respectively.

A key step in specifying the dynamics—the link between periods t and t + 1—is to use the notion of shortage: the deficit to be covered by the other fund. For example, in the parallel withdraw, the shortage SBt of bound fund is SBt = (60 – Bt) ∗ (60 > Bt) (equivalently, IF(60 > Bt, 60 – Bt, 0), or max(60 – Bt, 0)). Depending on the current state Bt, the shortage SBt dynamically changes over time: initially it is zero, because Bt > 60; once the bond fund drops below 60K, Bt < 60, the shortage becomes positive. This shortage must be covered by the stock fund, as long as it is positive. The story for the stock shortage is similar.

Model the first Policy: Specifically, the system state in period t is defined by (St, Bt), the value of stock fund St and bond fund Bt before withdraw, respectively. . The system recursion dynamics, the link between period t + 1 and t, are given by

SSt = (60 – St) · (60 > St), (17) SBt = (60 – Bt) · (60 > Bt), (18)

St+1 = 1 · (St ≥ 60) · [St – 60 – SBt] ·e(µ–σ 2/2)1+σ

√ 1Z , (19)

Bt+1 = 1 · (Bt ≥ 60) · [Bt – 60 – SSt] · (1 + r). (20)

In the above, the unit of value is $1K, (St – 60) is the leftover stock fund after withdrawing 60K in period t, SBt ≡ (60 – Bt)(60 > Bt) is the shortage from bond fund that needs to be covered by the stock fund, and SSt ≡ (60 – St)(60 > St) is the shortage from stock fund to be covered by bond fund.

We need to find the stopping time T such that ST + BT < 0 for the first time.

Model the second policy: Under the new withdraw policy, the system recursion dynamics can be specified by

SBt = (120 – Bt) · (120 > Bt), (21)

St+1 = 1 · (St ≥ 120) · [St – SBt] ·e(µ–σ 2/2)1+σ

√ 1Z , (22)

Bt+1 = 1 · (Bt ≥ 120) · (Bt – 120) · (1 + r). (23)

The interpretation is similar to the first one, and thus omitted for brevity.

How to find the stoping time ST?

Let Ft = St + Bt be the total remaining fund. We need to find the first time that Ft ≤ 0. We have three ways. (i) Use ST = COUNTIF(Ft, ” > 0”). (ii) Use ST = MATCH(0, Ft, –1), where –1 means that Ft is descending. (iii) Use indicator alive = 1 ∗ (Ft > 0). Hence, ST = SUM(alive), i.e., ST = sum(1 ∗ (Ft > 0)) (CTRL+SHIFT+ENTER).

3.5.3 Managerial Insights

The bond-first strategy lasts longer than the parallel strategy. This is because, by withdrawing from the bond fund first, it gives the more time for more profitable stock fund to grow.

4 Marketing Models

4.1 Salesforce Performance Evaluation: An Insurance Model

You are a sales agent for an insurance company. Your compensation is based on the loss ratio. Each year, you can sell S policies, where S ∼ N(µ = 100,σ2 = 102). Each policy has basic and excessive liabilities. While all customers buy basic liability, only pe = 0.40 (40 percent) of them buy excessive ones. The basic liability X ∼U[50, 200]; the excessive liability, if purchased, is 0.2X. Each policy i = 1, . . . , S, has pc = 0.16 chance to generate a loss (claim) Yi ∼ exp(µ = rcPi), where rc is the loss parameter, and Pi is the policy’s premium. Assume sample size N = 500 years.

Taking together, the agent has loss ratio

R = TL TP

where TL = ∑S

i=1 Ci is the total loss, and TP = ∑S

i=1 Pi is the total premium. Any loss perfor- mance receives F grade, i.e., F = {R > 1}.

We consider four scenarios (rc, pc) ∈ {(5, 0.16), (10, 0.08), (20, 0.04), (80, 0.01)}. 1. For each scenario, find the following statistics for loss ratio R: mean ER, standard deviation s, max, min, probability P(R > 1), and standard error s/

√ N;

2. For each scenario, plot the distribution (histogram) of the loss ratio R. 3. [Optional] Your company terminates any agent who receives two Fs in a row (i.e., making loss for two consecutive years). For each scenario, what’s the chance that you can last for five years?

4.1.1 Hints

We need the Excel skill for random sample size §4.4.1. The problem can be solved in two ways. The first is basic (see Insurance.Basic.xls). We first simulate one year to find the ratio R, and then use Data Table to replicate 500 years, for final analysis. The second method simulates all 500 years in one shot, then use Data Table to do the final analysis; see Insurance.Advanced.xls.

Method 1: To find expected lost ratio ER, we need to simulated N years. In each year t = 1, 2, . . . , N, we need to know the lost ratio Rt = TLt/TPt. This ratio comes from S ∼ N(µ,σ2) policies. For each policy i = 1, . . . , S, we need to find the premium Pi and loss Ci, so that TLt =

∑ i Ci and TPt =

∑ i Pi.

In Excel, the sales are S = INT(NORM.INV(rand(), mu, sigma))

where function INT() is to ensure S is an integer.7

the premium Pi of policy i = 1, . . . , S comes from

X = a + (b – a)∗ rand()

Pi = X ∗ ( 1 + 0.2 ∗ BINOM.INV

( 1, pe, rand()

) ) 7To ensure nonnegative sales, use function max(0, S).

The loss Ci of the policy i = 1, . . . , S comes from8

Yi = –(rc ∗ Pi)∗ LN(rand())

Ci = Yi ∗ BINOM.INV(1, pc, rand() )

In this model, we tossed the coin twice: one to decide whether to buy the premium; the other to decide whether the accident occurs. They are special cases of the Binomial random variable BINOM.INV(n, p, rand()), with the number of trials being n = 1, and probability of success being p = pe or pc. Using indicator, we can generate them in a simple way: 1 ∗ (rand() < p). Hence, we have

Pi = X ∗ ( 1 + 0.2 ∗ (rand() < pe)

) ,

Ci = Yi ∗ (rand() < pc).

Once we simulate a year, we can find the ratio R for that year. The trouble is, the sales S in each year is changing, and it affects the size of the table we hold polices. In Excel, to simulate N = 500 years, we may have to manually change the sample size (depending on sales S) for N = 500 times, and record the resulting ratio Rt for each year t. This is not pleasant.

In Excel, to get around the problem of random sample size, we use Data Table and indicators. The tricky part is how to handle the indicator. When you replicate 200 customers, the simplest error-free way is to use drag-down.

The idea is to simulate a fixed, large number K of polices for all years, and then consider only actual sales. For example, we can set K = 200, (since S ∼N(100, 102), 200 is far beyond µ + 3σ = 130). This fixes the table size of policies for all years, with ID = 1, 2, . . . , 200. For each year t, we have actual sales S, so only the first S out K policies should be considered. We use indicator pick = 1∗(ID <= S) to mark each policy: pick = 1 means an actual sale; 0 otherwise. Then total premium

TP = SUM(P ∗ pick)

with ctrl+shift+enter (an Array operation). Equivalently,

TP = SUMPRODUCT(Pi, pick)

TL = SUMPRODUCT(Ci, pick)

ratio = TL/TP

Use the array function to find:

P(ratio > 1) = AVERAGE(1 ∗ (ratios > 1)) = COUNTIF(ratios, ” > 1”)

COUNT(ratios) .

Once we have the ratio, we use the Data Table to simulate N = 500 years.

Method 2: In Excel, we can only invoke one layer of Data Table. To analyze four scenarios, we must carefully arrange the INPUT. Specifically, we define four scenarios by a matrix P = [Pkj],

8Consider exponential random variable X, with mean µ = 1/λ, and CDF F(x) = 1 – e–λx for ≥ 0. To generate X from random seed U , we apply the inverse transform method to solve U = F(X). We get X = F–1(U ) = – 1

λ ln(1 – U ) = –µ ln(1 – U ). Since U ∼U[0, 1] implies (1 – U ) ∼U[0, 1], we have X = –µ ln U .

with each row specifying a scenario: for scenario k = 1, 2, 3, 4, rc = Pk1, and pc = Pk2. In Excel, rc = INDEX(P, k, 1), and pc = INDEX(P, k, 2).

For scenario k = 1, we simulate all 500 years in one table, with each row recording the data of that year. Specifically, we simulate sales S, 200 premiums, 200 losses, and then compute TL, TR, and ratio of that year. We then do output analysis for scenario k = 1.

Next, we use one-way Data Table to generate the outputs for other scenarios. In Excel, we use row input.

4.1.2 Main Insights

Through simulation, we find that four scenarios yield the similar expected loss ratio (0.79). Yet they differ drastically in variability: scenario 1 is most stable, while the scenario 4 is the most volatile one. This can be seen from the distribution (histogram) of the loss ratio: scenario 4 has long, fat right tail, which means greater odds of severe loss (car totaled). This makes sense: although scenario 4 has least change of accident pc = 0.01, it is most damaging once the accident occurs. On average, scenario 4 must pay rc = 80 times of the premium! In contrast, scenario one pays only rc = 5 times, but with higher frequency of accidents (pc = 0.16).

To fully understand a phenomenon X, we must know not only mean EX, but also its variance var(X).

4.2 Market Share Model: Brand Switching and Brand Loyalty

When firms compete, their market shares vary over time. By using advertising and promotions, a firm can increase its market share. It is not obvious, however, whether advertising or promotion will increase profits. Using simulation, we can model and analyze such evolving situations over time.

Market share: Does initial market share matter in the long run?

Coke and Pepsi are fighting for a regional cola market. Each week each customer in the market buys on case of Coke or Pepsi. If his last purchase was Coke, there is a pc = 0.90 chance that his next purchase will be Coke (and 0.10 chance Pepsi). If his last purchase was Pepsi, there is pp = 0.80 chance that their next purchase will be Pepsi. This customer behavior can be summarized by a Markov transition matrix

P = ( 0.90 0.10 0.20 0.80

) . (24)

Solve the problems a) and b) with both the micro and macro simulations (see the hints):

(a). The initial market share is S0 = [0.50, 0.50] (row vector). Simulate one year (52 weeks) of sales in the cola market with N = 104 customers and estimate each firm’s average market share St in each week t. Plot St against t.

(b). Suppose the initial market share is S0 = [0.1, 0.9] and [0.9, 0.1], respectively. Simulate and plot St against t for each S0. Does initial market share have impact on the final (equilibrium) market share? Why or why not?

(c). [Optional] Analytically, the market share St after t periods (weeks) is

St = St–1 ·P = S0 ·Pt.

Compare your simulation results to the analytical results. Are they consistent? See Section §5.3.1 for matrix operations in Excel.

Advertising effectiveness: Is advertising worthwhile?

(d). Suppose the entire cola market consists of N = 104 customers. A profit of $1 per case is earned. For $5 × 104 per year, an advertising firm guarantees that only 5% of Coke customers, instead of original 10% without advertisement, will switch to Pepsi each week. Is this worthwhile? Plot St before and after advertising. [Hint: simulate new equilibrium market share (for a year) and compare the profits under this new equilibrium to that without advertisement.]

Discount: to coupon or not to coupon?

(e). We now investigate whether Coke should give out coupons. A coupon sacrifices current profits to increase market share in the near future. Coke is considering issuing a coupon for a case of soda every five weeks; that is, if we start from week 1, then weeks 6, 11, . . . , and 51 are discount weeks. The coupon is for a $0.25 discount. Coke estimates that during the week the coupon is issued, only 5% of all Coke customers will switch to Pepsi, and 40% of all Pepsi customers will switch to Coke. Will this coupon strategy increase profits? Plot St before and after deploying the coupon strategy. Assume that the cost of issuing coupons is negligible.

4.2.1 Hints

We can solve this problem in three ways. We label Coke=0 and Pepsi=1.

Analytical with matrix operations (optional): Let St = (S0,t, S1,t) be the market shares of Coca=0 and Pepsi=1 at time t. Then the next market share St+1 = St · P. In Excel, St+1 = MMULT(St, P) (CTRL + SHIFT + ENTER, array function). See MarketShare.Analytical.xlsx.

Micro Simulation: We track each customer’s purchase over time, and then derive their collective behavior via statistics (aggregation) methods. For each customer with ID i = 1, . . . , N = 104, let xi,t ∈ {0, 1} (Coke or Pepsi) be his purchase in period t. We need to generate his purchase sample path for 52 weeks, xi = (xi,1, xi,2 . . . , xi,52) (a row of 52 numbers in Excel). His initial purchase xi,1 is determined by coin tossing: BINOM.INV(1, S1,1, rand()), where S1,1 = 0.5 (equivalently, 1 ∗ (rand() < S1,1). Given current purchase xi,t, his next purchase xi,t+1 is determined by coin- tossing: if xi,1 = 0, he tosses Coke-coin, with probability pc = 0.9 (Tail) to buy Coke, and probability 1–pc (Head) to buy Pespsi. If xi,1 = 1, he tosses Pespsi-coin, with probability 1–pp = 0.2 (Tail) to buy Coke, and probability pp = 0.8 (Head) to buy Pespsi. We use BINOM.INV(1, p, rand()) to generate the outcome of coin tossing, {0, 1}. Or 1 ∗ (rand() < p).

Given his current purchase xi,t (0 or 1), his next purchase is

xi,t+1 = IF(xi,t = 0, 1 ∗ (rand() < 1 – pc), 1 ∗ (rand() < pp)).

Alternatively, we can use

xi,t+1 = (xi,t = 0)∗ BINOM.INV(1, 1 – pc, rand()) + (xi,t = 1)∗ BINOM.INV(1, pp, rand()),

where indicator 1 ∗ (xi,t = 0) is to ensure he tosses Coke-coin, and the outcome BINOM.INV(1, 1 – pc, rand()) determines which brand to buy. We can simplify this process by

xi,t+1 = BINOM.INV(1, pt, rand()).

Here, pt decides which (coin) probability to use, conditional on current purchase xt. Using transition matrix P = [pij], we can also set pt = index(P, xi,t + 1, 2), or pt = IF(xi,t = 0, 1 – pc, pp).

We need to generate N = 104 purchase sample pathes for all N customers. We put it as a matrix (table) X = [xi,t], i = 1, . . . , 104, and t = 1, . . . , 52. For each week t, the purchases xt of all N customers are the column t of matrix X; in Excel, xt = INDEX(X, 0, t) , and the market share of Coke and Pepsi are

S1,t = sum(xt)/N, S0,t = 1 – S1,t.

This approach hinges on the special structure of the binomial random variable. Alternatively, we can use general conditional probability to o generate next purchase xi,t+1 given xi,t; see §5.3.3 for detail.

Macro Simulation: We track the market share directly. In each period t, we describe the system state by (Ct, It), where Ct is the number of Coke fans, and It the number of Pepsi fans. Hence, It = N – Ct. Initially C1 = BINOM.INV(N, S0,1, rand()). Given Ct, the next period Ct+1 comes from two sources, those who remain loyal Coke fans BINOM.INV(Ct, pc, rand()), and those who switch from Pepsi BINOM.INV(It, 1 – pp, rand()). Hence,

Ct+1 = BINOM.INV(Ct, pc, rand()) + BINOM.INV(N – Ct, 1 – pp, rand()),

and It+1 = N – Ct+1. The market share S0,t = Ct/N, and S1,t = It/N.

How to find the coupon weeks?

The customer behavior in coupon weeks follows a different Markov transition matrix Q, as apposed to P in no-coupon weeks; in Excel, we use MOD to decide the coupon weeks.

First, let pc be the probability that a Coke fan will buy coke in the next period; define pp similarly for Pespi fans. Then,

pc = IF(MOD(week, 5) = 1, 0.95, 0.9)

pp = IF(MOD(week, 5) = 1, 0.60, 0.8)

Alternatively, we use indicator I = 1 ∗ (MOD(week, 5) = 1) to label each week: 1=coupon,0=no- coupon. Then

pc = I ∗ 0.95 + (1 – I)∗ 0.9

pp = I ∗ 0.60 + (1 – I)∗ 0.8

Here, MOD(n, d) returns the remainder after number is divided by divisor. The result has the same sign as divisor. If divisor is 0, MOD returns the #DIV/0! error value. The MOD function can be expressed in terms of the INT function: MOD(n, d) = n – d ∗ INT(n/d). For example, MOD(7, 5) = 2, because 7 = 5×1 + 2.

4.3 Bass Diffusion Model: Predict New Product Sales and Product Rollover

You are the brand manager of Q-Phone, a new product based on quantum physics that is set to revolutionize the cell-phone market. Your firm predicts the market size for Q-Phone is m = 1000, 000 units. Initial sales are promising: the first year you sell 263,000 units. Two years after the launch, however, you get worried. Although total sales are now over 615,000 units, the monthly figures are declining for the past 10 months, despite your hard work. The CEO is blaming you and wants you to resign.

To save you job, you must convince the CEO that the decline is not your fault. The CEO is a number guy, so mere talk won’t work. To convince him, you must use the data and theory to demonstrate that the decline is due to the natural life cycle of Q-phone. Further, you want to predict when the Q-Phone market will saturate, and when the firm must develop and launch next generation of Q-Phone.

You first step is to predict when a consumer will buy. Let t be the months after launching Q-Phone. Each consumer will buy only one Q-Phone. We call those who have bought Q-Phone adopters, and those who haven’t prospects or potentials. In month t, the market potential (number of prospects) is Mt, sales is Xt, and accumulative sales is Nt. Hence,

Nt = ∑t

s=0 Xs, M0 = m, Mt + Nt = m.

Moreover, the fraction of adopters in month t is

Ft = M0 – Mt

M0 =

Nt m

A potential customer has probability ft to buy in month (t + 1); i.e., conditional on that consumer has not bought Q-Phone till month t, he will buy with probability ft in the next month. The conditional probability ft1–Ft follows the celebrated Bass diffusion Model:

ft 1 – Ft

= p + q ·Ft, (25)

where p is the innovation rate, and q the contagion (imitation) rate. Intuitively, purchasing decision is driven by two forces. First, a potential consumer may buy out of his intrinsic preference, indepen- dent of other customers. The innovation rate p measures such force. Second, a potential customer may buy because of the social influence from other customers. For example, he may talk to other adopters, or read online reviews (e.g., Facebook, Twitter, YouTube, Amazon, WeChat). The more people bought Q-Phone, the stronger these word-of-mouth forces. The term q · Ft measures such contagion, network effect.

By analyzing the data, you find market size m = 1 × 106, innovation parameter p = 0.01, and imitation parameter q = 0.2.

Q1: Simulate one life cycle of Q-Phone for T = 60 months: find and plot the market potential Mt, sales Xt, and accumulative sales Nt, for t = 0, 1, . . . , T.

Q2: There are four stages of a product life cycle: introduction, growth, mature, and decline. The mature stage starts when the sales pass the peak; the decline stage starts when the fraction of adopters is over 80% (i.e., Ft ≥ 0.80). Let tp and td be the starting time of the mature stage and the decline stage. When will the mature stage start? How long will the mature stage last, i.e., d = td – tp? Simulate N = 1000 life cycles (sample size).

Q3: The process of introducing new products and phasing out old ones is called product rollover. To have a sustainable growth, the firm must roll out the next generation, Q-Phone 2.0, before the decline stage. What is the latest time that Q-Phone 2.0 must be launched? Simulate N = 1000 life cycles (sample size).

Q4: Change m, p and q. Describe how the changes affect the Sales curve Xt, the peak time tp, and the duration d of the mature stage

4.3.1 Hints

Macro approach: We use market potential Mt to describe the system state. In period t, the market potential is Mt, hence the fraction of adopters is Ft = Nt/m = (M0 – Mt)/M0. Each prospect has probability ft = (1 – Ft)(p + qFt) to buy, and hence with probability 1 – ft he remains a prospect in the next period. There are Mt prospects. The number Mt+1 of prospects in month t + 1 is a Binomial random variable with

Mt+1 = BINOM.INV(Mt, 1 – ft, rand()).

There are several ways to find the peak time tp. First, we use indicator IPt = 1 ∗ (Xt ≥ Xt–1): as long as current sales are greater than the previous one, we are not at peak yet; after the peak, however, IPt = 0. Hence, the time to peak is tp =

∑ t IPt.

Second, we can use MATCH and INDEX:

tp = INDEX(months, MATCH(MAX(Xt), Xt, 0)).

Here, MATCH(MAX(Xt), Xt, 0) is to find the index of the maximum MAX(Xt) in the column Xt; INDEX then finds the corresponding month that the sales peak.

The decline time td = MATCH(0.8, Ft).

For each life cycle, we care about three outputs (tp, td, d): the peak time tp, the starting time td of the decline stage, and the duration d of the mature stage. After we simulate one life cycle, we use Data Table to replicate three outputs for N times.

Specifically, set M0 = m, N0 = m – M0, X0 = 0. Then

Mt = BINOM.INV(Mt–1, 1 – ft–1, rand()) Nt = m – Mt Xt = Nt – Nt–1 Ft = Nt/m ft = (1 – Ft)(p + qFt).

Micro Approach: When the market size m is moderate, we can also track each customer’s state and then aggregate their decisions for analysis. Specifically, let xit be the state of customer i’in period t: 0 =adopter; 1 =prospect. Hence, the total prospects are Mt =

∑ i xit, and the buying

probability in next period is ft = (1 – Ft)(p + qFt). For customer i, the initial state is xi0 = 1. In each period t, he buys the Q-Phone with probability ft–1, and remains prospect with probability (1 – ft–1). Hence, the Markov transition matrix in period t is

Pt = (

1 0 ft–1 1 – ft–1

) .

For Excel implementation, besides tracking the sample path (xit)t≥0 of each customer, we also need to track potential Mt and probability ft.

4.4 Excel Skills

4.4.1 Random Sample Size

In Excel, the sample size is usually given, e.g., N = 1000. So after generating the first observation, you can simply drag down to get the whole sample. However, when the sample size is random, e.g., in the insurance model, you need the following trick. The key is use indicator smartly.

Consider the grades of simulation class. The grade Xi of each student i is continuous uniformly distributed over [90, 100], i.e., X ∼ U[0, 10]. We are interested in the total team grades Y , the sum of all members’ grades of a team. It is known that the team size N follows discrete uniform distribution over { 1, 2, 3, 4 }. Therefore, for a randomly picked team, its total grades is Y =

∑N i=1 Xi. What is

the distribution of Y ? Assume sample size n = 50 (i.e., simulate 50 teams).

Solution: The basic idea is to first simulate an ’ideal’ team, then replicate it 50 times using Table command: in Excel,

Data-> Data Tools-> What-If Analysis-> Data Table ...

Now we describe how to simulate an ’ideal’ team. We can view it as a sample of X ∼ U[90, 100] with sample size N. However,unlike our previous simulation with known, fixed sample size, the sample size (team size) N in this case is random, i.e., N ∼ U[1, 4].

How to get around this problem? First, observe that the maximal possible size of a team is 4. If we first ’pretend’ every team has the maximal size 4, i.e., generate grades of a team with four members with ID = 1, 2, 3, 4. Then, we generate the team size N ∼ U[1, 4] and counts only first N members. This is equivalent to counting the total grades of a team with size N. The key idea is the indictor of picking, i.e., if the member is picked or not, i.e., IF(ID<=N, 1,0), or simply 1*(ID<=N). See file RandomSampleSize.xls for detail.

Technically, let K = max N be the upper bound of random variable N. Then

Y = K∑

i=1 Xi · I(i ≤ N) =

N∑ i=1

Xi ·1 + K∑

i=N +1 Xi ·0 =

N∑ i=1

Xi.

In Excel, we can also use the array functions, Y = SUM(X ∗ (ID <= N)), with ctrl+shift+enter. Or, Y = SUMPRODUCT(X, 1 ∗ (ID <= N)).

5 Public Policies

5.1 Healthcare: The Spread and Prevention of an Infectious Disease

A remote village has a population of 100 people. At the beginning of period 1, it has 5 diseased people (called infectives) and 95 healthy people (called susceptibles). In each period, an infective has 0.10 chance to die at the end of the period. During that period, he will encounter a particular susceptible with probability 0.05; if that encounter occurs, the susceptible has 0.50 chance to contract the disease at the end of the period. Model the evolution of the population over 100 periods. Assume sample size N = 1000 (villages).

(a) What is the probability that the population will die out?

(b) What is the probability that the disease will die out?

(d) Suppose that people use the infection protection during encounters. This reduces from 0.50 to 0.10 the chance that a susceptible will contract the disease during a single encounter with an infective. Now answers questions (a)-(c) under the assumption that everyone uses the protection.

(e) Each individual pays annual tax $10,000. The tax revenue is accumulative. Because the protection is costly, the government must know the benefits of mandating the protection. Compare the total tax revenue with and without protection over 100 periods. What is the effect of mandating the protection on the total accumulative tax revenue?

5.1.1 Hints

Macro Approach: This is a dynamic simulation. Each sample path consists of 100 observations (years). We define the state the system by (St, It), where St is the number of remaining susceptible in period t, and It is the number of infected in period t. Let Dt be the number of deaths till period t. Then S1 = 95, I1 = 5, D1 = 0, and St + It + Dt = 100 for t = 2, 3, . . . , 100.

To specify the dynamics, we can model the behavior of St and It by binomial random variables. In period t, there is a probability 0.05×0.50 = 0.025 that an infective will infect a particular suscep- tible, and with probability (1 – 0.025) not infect him. Since there are It infected, the probability that a particular susceptible is not infected during a period is

pt = (1 – 0.025)^It.

There are St susceptible and each has probability pt remaining uninfected. Hence, the number of susceptible in period t + 1 follows binomial distribution, St+1 ∼ Binomial(St, pt). In Excel, we use

St+1 = IF(It = 0, St, BINOM.INV(St, pt, rand()))

The number It+1 of infected in period t + 1, is the sum of those It infected remaining alive and the number of newly infected (St – St+1). Among those infected It, the number remaining alive follows Binomial distribution Binominal(It, 0.9), since each has 0.1 chance to die. Therefore,

It+1 = IF(It > 0, BINOM.INV(It, 0.9, rand()), 0) + (St – St+1).

In both St+1 and It+1, the use of IF is to prevent the Excel error; i.e., when It = 0 or pt = 0, the function binom.inv may run into trouble.

Once we figure out the dynamics over the time, we can simulate a sample path of 100 years for a village. After that, replicate the sample path (of 100 years) for 1000 times (villages) to get the sample for further analysis.

P(poluation.dieout) = P(D100 = 100), and P(dease.dieout) = P(I100 = 0). In Excel, we use indicators to label each village, and then average them to get the probabilities.

Micro Approach: we can also track the state xit of each villager i in each period: 1 = healthy, 2 = sick, 3 = dead. The Markov transition matrix is

Pt =

 pt 1 – pt 00 0.9 0.1 0 0 1

  ,

where the probability of remaining healthy is pt = (1 – 0.05×0.5)It .

5.1.2 Background

Black Death: The Black Death is one of the most devastating pandemics in human history, resulting in the deaths of an estimated 75 to 200 million people in Eurasia and peaking in Europe from 1347 to 1351. The Black Death is estimated to have killed 30-60% of Europe’s total population. In total, the plague may have reduced the world population from an estimated 450 million down to 350-375 million in the 14th century. It took 200 years for the world population to recover to its previous level. The plague recurred as outbreaks in Europe until the 19th century (source: wikipedia).

Spanish Flu: The 1918 flu pandemic (January 1918–December 1920), also known as the Spanish flu, was an unusually deadly influenza pandemic, the first of the two pandemics involving H1N1 influenza virus. It infected 500 million people around the world, including people on remote Pacific islands and in the Arctic, and resulted in the deaths of 50 to 100 million (3-5% of the world’s population), making it one of the deadliest natural disasters in human history (source: wikipedia).

SARS: Severe acute respiratory syndrome (SARS) is a viral respiratory disease of zoonotic origin caused by the SARS coronavirus (SARS-CoV). Between November 2002 and July 2003, an outbreak of SARS in southern China caused an eventual 8,098 cases, resulting in 774 deaths reported in 37 countries, with the majority of cases in China (9.6% fatality rate) according to the World Health Organization (WHO).[2] No cases of SARS have been reported worldwide since 2004. In late 2017, Chinese scientists traced the virus to cave-dwelling Horseshoe bats in Yunnan province (source: wikipedia).

Ebola: Ebola virus disease (EVD), also known as Ebola hemorrhagic fever (EHF) or simply Ebola, is a viral hemorrhagic fever of humans and other primates caused by ebolaviruses. The disease has a high risk of death, killing between 25 and 90 percent of those infected, with an average of about 50 percent. Between 1976 and 2013, the World Health Organization reports a total of 24 outbreaks involving 1,716 cases. The largest outbreak to date was the epidemic in West Africa, which occurred from December 2013 to January 2016 with 28,616 cases and 11,310 deaths. It was declared no longer an emergency on 29 March 2016. Another outbreak in Africa began in May 2017 in the Democratic Republic of the Congo (source: wikipedia).

5.2 Social Mobility

A study of social mobility across generations was conducted and three social levels were identified: 1= upper level (executive, managerial, high administrative, professional); 2= middle level (high grade supervisor, non-manual, skilled manual); 3= lower level (semi-skilled or unskilled). Transition probabilities from generation to generation were estimated to be

P =

 .45 .48 .07.05 .70 .25 .01 .50 .49

  . (26)

Suppose each person has one offspring in each generation. Consider T = 50 generations. Assume sample size N = 1000 (families) and initial distribution [0.5, 0.0, 0.5].

(a) What is the long-run percentage of each social level, i.e., steady-state distribution, i.e., the distribution at time T = 50? What if initial distribution is f0 = [1, 0, 0] or [0, 0, 1]? Does the initial distribution matter in the long run?

(b) Suppose that Adam is in level 1 and Cooper is in level 3. Compute the probability A(t)( resp. C(t)) that the 1st, 2nd, . . . , 10th generation offspring of Adam (resp. Cooper) is in level 1, respectively. Graph A(t) and C(t) against t = 1, 2, . . . , 10. What is A(10) and C(10)?

(c) On average, how many generations (mean and 95% CI) does it take for Adam’s family to have the first level 3 offspring? On average, how many generations (mean and 95% CI) does it take for Cooper’s family to have the first level 1 offspring?

(d) What are the social and policy implications of these results, in terms of eduction, taxation, welfare programs, etc.?

5.2.1 Hints

This is a general Markov chain simulation. We can no longer rely on Binomial random variables. For necessary machinery, please see the following section on Matrix and Markov chain.

Equivalent Representation of Scalars and Distribution: level 1 (Adam) is equivalent to the distribution f0 = [f0(1), f0(2), f0(3)] = [1, 0, 0], where f0(1) = P(X0 = 1) = 100%, f0(2) = P(X0 = 2) = 0, f0(3) = P(X0 = 3) = 0. Similarly, we can express level 3 (Cooper) as f0 = [0, 0, 1].

How to find the first passage time ft? For a sample path x = (x1, x2, . . . , xT ), the first time the level 3 appears is ft = MATCH(3, x, 0) (0 means exact match), if such a time exists. In case no level 3 shows up in the sample path, we can force the time to be T. We use ISNUMBER(ft) to determine if ft is a number. Taken together, we have IF(ISNUMBER(ft), ft, T). Equivalently,

IF(ISNUMBER(MATCH(3, x, 0)), MATCH(3, x, 0), T)

5.2.2 Background

The following books provide in-depth analysis:

The Son Also Rises: Surnames and the History of Social Mobility, 2015, by Gregory Clark.

Coming apart: The state of White America, 1960-2010, 2012, by Charles Murray.

The Case against Education: Why the Education System Is a Waste of Time and Money, 2018, by Bryan Caplan.

5.3 Excel Skills

5.3.1 Data Array and Matrix Notation

In Excel, data are organized as tables or arrays. We express them with matrix notion. A m-by-n matrix A ∈ Rm×n is an array of elements aij, i = 1, . . . , m, j = 1, . . . , n, in the form of

A = [aij] =

  a11 a12 . . . a1n a21 a22 . . . a2n ...

am1 am2 . . . amn

  .

All scalars and vectors are special cases of matrices. For example, a row vector c = (c1, c2, . . . , cn) ∈ R1×n is a 1-by-n matrix; a column vector

b =

  b1 b2 ... bm

 

is a m-by-1 matrix. Let row vector b′ = (b1, . . . , bn) be the transpose of column vector b. Let x, b ∈ Rn. The inner product of vectors c and x is c · x =

∑n i=1 cixi. The notation Ax ≤ b stands

for

Ax =

  a11 a12 . . . a1n a21 a22 . . . a2n ...

am1 am2 . . . amn

    x1 x2 ... xn

  =

LHS︷︸︸︷  ∑n

j=1 a1jxj∑n j=1 a2jxj

...∑n j=1 anjxj

  ≤

RHS︷︸︸︷  b1 b2 ... bn

 

That is, Ax is a n-by-1 vector, with i-th element being the inner product between the i-th row vector of matrix A and the column vector x.

5.3.2 How to Generate Arbitrary Random Variables

To generate a random variable X from CDF F, we first create a uniform random seed U ∼U[0, 1] between 0 and 1; in Excel U = rand(). Then9

X = F–1(U).

Excel has build-in functions to perform the inverse operation F–1 for commonly used, parametric distributions. For example,

random seed: U = RAND() uniform U[a,b]: = a + (b-a)*RAND() normal(mu, sigma): =NORM.INV(mu, sigma, rand()) exponential(mu): = - mu*LN(rand()) binormial: =BINOM.INV( n, p, rand()) Bernoulli: =BINOM.INV( 1, p, rand()) = 1*( rand()<p ).

In practice, we often encounter arbitrary distributions. To generate a variable X from arbitrary CDF F, we use

X = min{xj : U < F(xj)},

i.e., the smallest x value whose CDF dominates U.

In Excel, we first expressed it as two columns (or two rows) (xs, Fs), where

xs =

  x1 x2 ... xn

  , Fs =

 

0 F(x1) F(x2)

... F(xn)

  . (27)

We add a zero on top of Fs to accommodate Excel.10

To generate a random variable X that follows F, we use X= INDEX( xs, MATCH(rand(), Fs) ). Equivalently, id= MATCH( rand(), Fs) X= INDEX( xs, id ).

9To see why, let G be the CDF of random variable X = F–1(U ). We need to show that G equals F; i.e., the random variable indeed behave as the function F dictates. Indeed, this is the case:

G(x) = P{X ≤ x}

= P{F–1(U ) ≤ x} by definition X = F–1(U )

= P{F(F–1(U )) ≤ F(x)} since F is increasing

= P{U ≤ F(x)} since F (

F–1(U ) ) = U

= F(x). since U ∼U[0, 1] with CDF FU (u) = P(U ≤ u) = u.

10In Excel, match(U, F) gives max{xj : F(xj) ≤ U }, but we want to find x = min{xi : U < F(xi)}. In other words, Excel MATCH function provides the largest index among those who are smaller or equal to U ; but we want to find the smallest index among those who are greater than U .

5.3.3 Conditional Probability and Markov Chain

Almost all businesses involve uncertainty and dynamics. We use random variables to capture uncertainty, as showing by standard deviation σ.

We use Markov chain to capture dynamics. A markov chain is a sequence of correlated random variables {Xt}t≥0 = {X0, X1, X2, . . .}. Here Xt ∈ X = {1, 2, . . . , k} describe the system state in period t. X is the state space, i.e., all possible outcomes for state variable Xt. The system dynamics is captured by the conditional probability: Given the current state Xt, we can predict the next period state Xt+1 by the conditional probability

pij ≡ P(Xt+1 = j|Xt = i), i, j ∈ S.

The transition matrix P that defines the Markov chain is given by

P = [pij] =

  p11 p12 . . . p1k p21 p22 . . . p2k ...

pk1 pk2 . . . pkk

  .

In matrix P, the Xt = ith row describe the probability that the next state Xt+1 = j. Hence∑k j=1 pij = 1. For example, p12 is the probability of the next state Xt+1 = 2, given current state

Xt = 1.

How to generate a Markov chain with transition matrix P and initial state X0 ∼ F0.

First, given transition matrix (PDF) P, we first construct a CDF matrix Q with additional zero as the first column:

Q = [Qij] =

  0, p11, p11 + p12,

∑3 j=1 p1j, . . . , 1

0, p21, p21 + p22, ∑3

j=1 p2j, . . . , 1 ... 0, pk1, pk1 + pk2,

∑3 j=1 pkj, . . . , 1

  .

Second, create the initial state X0 ∼ F0, where F0 is the CDF (with zero added on top): X0 = MATCH(RAND(), F0).

Third, in period t > 0, suppose the current state is Xt, we generate next period state by

Xt+1 = MATCH(rand(), INDEX(Q, Xt, 0)).

6 Markov Chain Models

6.1 Credit Risk Rating Model

Markov chains are often used in finance to model the variation of corporations’ credit ratings over time. Rating agencies like Standard & Poors and Moody’s publish transition probability matrices that are based on how frequently a company that started with, say, a AA rating at some point in time, has dropped to a BBB rating after a year. Provided we have faith in their applicability to the future, we can use these tables to forecast what the credit rating of a company, or a portfolio of companies, might look like at some future time using matrix algebra.

Let’s imagine that there are just three ratings: A; B and default, with the following probability transition matrix P for one year:

P =

 0.81 0.18 0.010.17 0.77 0.06

0 0 1

  (28)

In reality, this transition matrix is updated every year. However if we assume no significant change in the transition matrix in the future, then we can use the transition matrix to predict what will happen over several years in the future. In particular, we can regard the transition matrix as a specification of a Markov chain model.

Assume the maximal lifetime of a firm is 200 years. Sample size N = 1000.

(a) We interpret this table as saying that a random A-rated company has an 81% probability of remaining A-rated, an 18% probability of dropping to a B-rating, and a 1% chance of defaulting on their loans. Each row must sum to 100%. Note the matrix assigns a 100% probability of remaining in default once one is there (called an absorption state). In reality, companies sometimes come out of default, but we keep this example simple to focus on a few features of Markov Chains.

Let Xt be its rating in year t. Now let’s imagine that a company starts with rating B (i.e., initial PDF f1 = (0, 1, 0)). What is the distribution of state Xt in 2 and 5 years? What is the probability that a currently A firm (with initial PDF f1 = (1, 0, 0)) becomes default in 2 years? And in 5 years?

(b) Now let us imagine that we have a portfolio of 300 companies with an A-rating and 700 companies with a B-rating, and we would like to forecast what the portfolio might look like in t = 2, 5, 10, 50, 200, years. Plot the evolution of the portfolio. [Hint: the initial distribution f1 = (0.3, 0.7, 0).]

(c) We mentioned that in this model ’Default’ is assumed to be an absorption state. This means that if a path exists from any other state (A-rating, B-rating) to the Default state then eventually all individuals will end up in Default. The model below shows the transition matrix for t = 1, 10, 50, 200. If this Markov chain model is a reasonable reflection of reality one might wonder how it is that we have so many companies left. A crude but helpfully economic theory of business rating dynamics assumes that if a company loses its rating position within a business sector, a competitor will take its place (either a new company or an existing company changing its rating) so we have a stable population distribution of rated companies.

So now we consider what happens if we introduce new firms each year. Suppose that each year new firms of rating A and B are created with equal chance. Suppose that the number of new firmed created in each year is the same as the number of firms that default in the previous year. So we now assume that the number of firms (non-defaulted bonds) is constant over time. For example, if

in year t = 10, there are 3 firms default, then 3 new firms are created, each with probability 0.5 of being A or B.

Under the new modelling assumption,

(d) What is the transition matrix?

(e) What is the long-run distribution of the rating, i.e., the distribution ST , at time T = 200? What fraction of firms are in A in the long run? What is the expected fraction of firms that default each year in the long run?

(f) What is the expected number of time periods before a ’A’ rated firm defaults?

6.1.1 Hints

This is a general Markov chain. From initial PDF f1 and transition matrix P, we create the CDF version F1 and Q. For example, from PDF f1 = (0, 1, 0), we have F1 = (0, 0, 1, 1), where the first 0 is to accommodate Excel.

Let state 1=A, 2=B, and 3=Default. Hence the state space X = {1, 2, 3}. Let Xit be the state of ith firm at time t, where i = 1, 2, . . . , N = 1000, and t = 1, 2, . . . , T = 200.

Then the sample paths of 1000 firms form the data matrix X = {Xit : i ≤ N, t ≤ T}. To generate X, we use

Xi1 = match(rand(), F1)

Xit = match(rand(), index(Q, Xi,t–1, 0))

where index(Q, Xi,t–1, 0) is the Xi,t–1th row of the matrix Q, i.e., the CDF of the conditional distribution P(Xit|Xi,t–1).

Given data [Xit], we can compute the distribution ft = (f1t, f2t, f3t) of period t by

fit = countif(index(X, 0, t), i)/N

where index(X, 0, t) is the tth column of the data X.

The transition matrix with new entry:

P =

 0.81 0.18 0.010.17 0.77 0.06

0.5 0.5 0

  (29)

How to find the first default time (passage time) Dt?

For a sample path x = (x1, x2, . . . , xT ) (ith row of X, x = index(X, i, 0), with initial x1 = A = 1), the first time state 3 appears is Dt = MATCH(3, x, 0), where 0 means exact match, provided such a time exists. In case no state 3 shows up in the sample path, we can force the time to be T. We use ISNUMBER(Dt) to determine if Dt is a number. Taken together, we have IF(ISNUMBER(Dt), Dt, T). Equivalently,

IF(ISNUMBER(MATCH(3, x, 0)), MATCH(3, x, 0), T)

6.2 Bankruptcy in the Restaurant Business

Without benefit of dirty tricks, Harry’s restaurant business St fluctuates in successive years between three states: 1 (bankruptcy), 2 (verge of bankruptcy), 3 (solvency). The Markov transition matrix of this process is

P =

  1 0 00.5 0.25 0.25 0.25 0.25 0.50

  . (30)

Assume the maximal lifetime of a restaurant is T = 50 years. Sample size N = 2000.

(a) Let X be the stopping time from the initial solvency, i.e., the years until Harry’s restaurant goes bankrupt assuming he starts from the state of solvency S0 = 3. Plot the histogram of X and compute its 95% confidence interval.

(b) Let K be the stopping time when the restaurant is initially on the verge of bankruptcy (S0 = 2). Plot the histogram of K and compute its 95% confidence interval.

(c) Does initial state matter in the restaurant business? Why or why not? Make your case based on the results from (a)-(b).

(d) What is the distribution of the restaurant’s state in year 50, ST , with initial state solvency, i.e., the probability of the restaurant in states 1, 2, 3, respectively? And the distribution with initial state S0 = 2?

Harry’s rich uncle Zeke decides it is bad for the family name if his nephew Harry is allowed to go bankrupt. Thus when state 1 is entered, Zeke infuses Harry’s restaurant with cash returning him to solvency with probability 1. Thus the transition matrix for this new Markov chain is

Q =

  0 0 10.5 0.25 0.25 0.25 0.25 0.50

  . (31)

Let Z denote the years between cash infusions from Zeke. Answer (e) and (f).

(e) Plot the distribution of Z and compute its 95% CI.

(f) What is the distribution of the restaurant state in the long run (e.g., ST ) with S0 = 3? In this scenario, does the initial state matter? Why or why not? Compare your answer to (a)-(d) and make your case.

6.3 Negotiation in Show Business

One of uncle Zeke’s business acquaintances is the Hollywood mogul Sam Darling. Harry, disgusted over the success of yuppie oriented shows like L.A. Law, contacts Sam Darling about the possibility of doing a real down home show called Optima Street Restaurateur. Harry’s idea is that the show will combine cooking tips and urban adventure. Initial story lines look promising and negotiations get serious. Harry’s niece alertly notices that the negotiations seem to evolve as if they followed a Markov chain. The time scale is in hours. From Harry’s point of view, the states St are: 1 = Royalties and other financial are totally inadequate; 2 = Rewards are adequate but artistic control is inadequate; 3 = Rewards and artistic control are adequate, but not enough relative and neighbors will be employed in the show;

4 = Rewards, artistic control, employment of relatives and neighbors are adequate, but Harry has second thoughts about giving up the life of the respected restaurateur for the fast lane of show business.

The evolution of the negotiation seems to follow the transition matrix P and starts in state 1.

P =

  0.4 0.4 0.2 0.0 0.3 0.4 0.3 0.0 0.2 0.1 0.4 0.3 0.1 0.4 0.2 0.3

  (32)

Assume the maximal negotiation time is 1200 hours. Sample size N = 1000.

a) What is the long-run percentage of time that Harry is in states 1, 2, 3, 4, respectively?

[Hint: For each sample path of length 1200 hours, find the fraction of time the state St is in 1,2,3,4, respectively. This gives us one estimation of the distribution of St. Now take a sample of N = 1000 paths. Averaging over N gives us the solution—the long-run distribution of St.]

b) Suppose Harry decides to agree to terms on the sixth time the negotiations reach state 3, i.e., the stopping time X. What is the distribution of X (plot its histogram)? What is the expected time EX after the start of negotiations that Harry will agree to terms? And the 95% CI of EX?

c) Assume that the negotiation have been in progress for 36 arduous hours, and Harry finds himself in state 4. he is sick of Sam Darling and his abrasive manner, and unsure that he wants to dedicate his life to show business. After 36 hours, Harry decides to give up the idea of show business if state 1 is entered before state 3. If state 3 is entered before state 1, he will agree to terms and proceed with the show. Find the probability that Harry gives up show business.

7 Pricing and Revenue Management

7.1 List Price Selling

7.2 Auction and Optimal Bidding Strategies

Consider a single unit auction with N bidders (customers/buyers). Each bidder i with independent private valuation vi ∼ F. The objective of each bidder, given his private valuation vi and the common knowledge of the valuation distribution F of all the bidders (but not others’ actual bids), is to bid bi in order to maximize his surplus. The assumption that the valuations of all bidders follow the same distribution F means that the auctioneer and all the bidders have the same belief about the market and the likely valuations of other bidders.

In general, an auction mechanism is specified by (i) the allocation that defines the distribution of goods, and (ii) the payment made by the bidders. Intuitively, auctions create competition situation so that in general the goods can be allocated efficiently to the one valuates (needs) it most, where allocation is captured by I(win) and the valuation (so the payment) is captured by vi (and so bi).The more the bidders, the higher the expected revenue, provided the bidders do not collude, which is ruled out by our independent private value assumption.

In the first price, sealed-bid auction, the bidder with the largest bid wins and pays his bid price. Note that the first price, sealed-bid auction is equivalent to Dutch auction in the sense of alloca- tion/payment.

In the second price auction, the bidder with the largest bid wins but only pays the second largest bid price. The second price, sealed-bid auction is equivalent to English auction in the sense of allocation/payment. Under this mechanism, the bidders have incentive to bid more aggressively than in the first price auction, since the winner only pays the second highest price.

7.2.1 First Price, Sealed Bid Auction

CA plans to use the first price, sealed-bid auction to sell a piece of land to N bidders, each bidder i with independent private valuation vi following uniform distribution U[0, 1] ($M). It is known there are total N bidders and you are bidder i = 1. All other bidders i = 2, . . . , N, shade 1/N of their valuation and bid (1 – 1/N) of their valuation vi, i.e., bi = (1 – 1/N)vi. For any of your valuation v1 (a random draw from U[0, 1]), your objective is to bid b1(x) = x ·v1, so as to maximize your expected surplus

S1(x) = E[(v1 – b1(x)) · 1(b1(x) = B1)], (33)

where your strategy x is from the set {0.1, 0.2, . . . , 1.0}. Let Bj be the jth largest bid among all bids [b1, . . . , bn]; that is, the jth element in vector B = [B1, B2, . . . , BN ], which is descending sorted from [b1, b2, . . . , bN ]. The expected revenue for CA is

RN = E[B1] = E[max{b1, b2, . . . , bN }]. (34)

Assume sample size K = 103.

(a) Consider two-bidder case N = 2. Compute your expected surplus S1(x) under each of the 10 bidding strategies. Graph S1(x) and find your optimal bidding strategy x∗ for N = 2 case. Compute and graph CA’s expected revenue R2(x) = E[max(b1(x), b2)] under each of your 10 bidding strategies.

(b) Can CA achieve the maximal expected revenue among R2(x)? Why or why not?

(c) Repeat (a) to compute the expected surplus, graph it and find the optimal bidding strategy x∗ for N = 5 and 10, respectively.

(d) Generalize from (a)-(c) to N-bidder case. What is your optimal bidding strategy x∗ and CA’s expected revenue under x∗?

7.2.2 Second Price, Sealed-bid Auction

CA also considers using 2nd price, sealed-bid auction to sell a piece of land to N bidders, each bidder i with independent private valuation vi uniformly distributed, vi ∼ U[0, 1]($M). It is known all other bidders i = 2, . . . , N, always bid their true valuation v, i.e., bi = vi. For any of your (i = 1) valuation v1 (a random draw from U[0, 1]), your objective is to bid b1(x) = x ·v1, so as to maximize your expected surplus

S1(x) = E[(v1 – B2)) · 1(b1(x) = B1)], (35)

where your strategy x is from the set {0.1, 0.2, . . . , 1.0}. The expected revenue for CA is

RN = E[B2]. (36)

Assume sample size K = 103.

(b) Repeat (a) to compute the expected surplus, graph it and find the optimal bidding strategy x∗ for N = 5 and 10, respectively.

(c) Generalize from (a) and (b) to N-bidder case. What is your optimal bidding strategy x∗ and CA’s expected revenue under x∗?

(d) Based on the expected revenue under x∗, should CA use 1st- or 2nd-price sealed-bid auction to sell the land? Why?

7.2.3 Background

• When was the Roman empire sold, and who bought it? http://math.ucr.edu/home/baez/puzzles/5.html

• Why a licence plate costs more than a car in Shanghai https://www.economist.com/china/2018/04/19/why-a-licence-plate-costs-more-than-a-car-in-shanghai

7.2.4 Hints

For a strategy x and N bidders, we first simulate one round auction and find the surplus S1(x) and revenue RN (x). Then we use two-way Data Table to replicate the surplus of different strategies x = 0.1, 0.2, . . . , 1 (row) and K rounds (column). From the resulting data set S, we do statistic

analysis. We apply the similar procedure to revenue RN (x) (i.e., (i) use two-way Data table (see §2.6.1 and videos therein), and (ii) statistic analysis).

For a sample x = (x1, x2, . . . , xn), its kth largest number is LARGE(x, k). Hence, the largest number is max(x) = LARGE(x, 1), and the second largest is LARGE(x, 2). Similar, SMALL(x, k) gives the kthe smallest number in sample x. Next, we elaborate

How to simulate the 1st price, sealed bid auction (Dutch):

1. Simulation the first round of the Dutch auction game. The main inputs are

N = 2 number of bidders k = 1 round 1 x = 0.1 strategy

We need the following variables (columns) to record the round 1 data:

[ID, v, bid, w, pay, S]

For each bidder ID i = 1, . . . , N, we record his data in a row: valuation vi, bid bidi, winning indicator wi, payment payi, and surplus Si. Note that bidder 1 taking strategy x. In Excel,

vi = RAND() bid1 = v1 ∗ x bidi = vi ∗ (1 – 1/N) wi = 1 ∗ (bidi = LARGE(bid, 1)) payi = wi ∗ bidi Si = wi ∗ (vi – payi)

Under input k = 1, x = 0.1, we track two outputs, bidder 1’s surplus and the CA’s revenue:

Sp = S1

R = LARGE(bid, 1)

2. Replicate output (Sp, R) for K = 1000 rounds, and 10 strategies. We use two-way table. For t = 1, 2, . . . , 1000, and x = [0.1, 0.2, . . . , 1.0], we create the data table [Sp(k, x)] as

 = Sp 0.1 0.2 . . . 1.0 1 Sp(1, 0.1) Sp(1, 0.2) . . . . . . 2 Sp(2, 0.1) Sp(2, 0.2) . . . . . . ...

... ...

1000 Sp(1000, 0.1) . . . . . . Sp(1000, 1.0)

 

where the row input = x, and column input = k. We use another data table for [R(k, x)], where R(k, x) is the revenue in round k under strategy x. Hence, we have two data tables [Sp(k, x)] and [R(k, x)]. Under strategy x, the expected surplus and revenue are averaged over K = 1000 rounds of xth column:

ES(x) = AVERAGE(Sp(1 : K, x))

ER(x) = AVERAGE(R(1 : K, x))

Hence, under N = 2, we have results ES = [ES(x)] and ER = [ER(x)] (two row vectors). We create another sheet called "RESULT", copy N, ES, ER and paste as numbers to the RESULT sheet. If we press F9, RESULT sheet will not change because they are static data by design.

3. Changing number N of bidders. Instead of creating different auction games for different N, we can create only one by using the random sample size technique ( §4.4.1; see the insurance model §4.1. The key is to create the largest possible sample (in this case N = 10), and then use the indicator 1 ∗ (ID <= N) to select real bidders (those phantom/dummy bidders with ID > N are set to have zero valuation and bids).

4. Record all the results. For each N = 2, 5, 10, we first change N, then rerun the simulation (press F9), and record N, ES, ER in the sheet "RESULT". We should have three set of data.

How to simulate 2nd price, sealed bid auction (English):

The process is similar to simulating the Dutch auction, except

vi = rand() bid1 = v1 ∗ x bidi = vi wi = 1 ∗ (bidi = large(bid, 1)) payi = wi ∗ large(bid, 2) Si = wi ∗ (vi – payi)

7.3 Car Rental Agency

Congratulations! You have recently been promoted to the position of regional manager for Hurts Car Rental Company in Florida. You need to make pricing decision and fleet size decision for Tampa branch to maximize its expected total profit in the coming month (with 4 weeks or periods), taking into account the behavior of both your customers and your competitor.

Hurts has one primary competitor in Tampa with whom Hurts competes for market share. Your Market Intelligence (MI) has found that the competitor has adopted a masked price following strategy; that is, the competitors p2t price in period t will follow your last price p

1 t–1 within ±20%

range: p2t = p

1 t–1 ×εt, εt ∼ U[0.80, 1.20], t = 2, 3, 4. (37)

Assume that the competitor begins by choosing the same price as yours in period (week) 1, p11 = p 2 1.

MI also provides the weekly demand forecast for the coming 4 weeks based on the historical data. Your and your competitor’s pricing strategies, pt = (p1t , p

2 t ), jointly determine both the market size

Nt(pt) and the market share πt(pt). The market size Nt(pt), or the total number of demands for Hurts and its competitor in week t, is determined by pt = (p1t , p

2 t ) via

Nt(pt) = b0 + b1p1t + b2p 2 t , (38)

where b0 = 105644.5, b1 = –377.5, and b2 = –676.25.

Figure 1: Market Size: Nt(pt) Figure 2: Market Share: πt(pt)

The market share πt(pt) of Hurts, or the probability that a unit demand choosing Hurts instead of its competitor, is determined by the Multinomial logit (MNL) model:11

πt(pt) = 1

1 + exp(–c0 – c1p1t – c2p 2 t )

(39)

where c0 = 0.0875, c1 = –0.0220, and c2 = 0.0170. Note that 0 ≤ πt(pt) ≤ 1, and your competitor’s market share is 1 – πt(pt).

Therefore, your demand Dt(pt) in week t follows Binomial distribution, Dt(pt) ∼ Bino(Nt,πt), where both the number of trials (market size Nt(pt)) and the probability of success (market share πt(pt)) are determined by the pricing strategies via (38) and (39), respectively. Since the market size is sufficiently large, as MI pointed out, by Central Limit Theorem, your demand Dt(pt) is assumed to follow normal distribution

Dt(pt) ∼N(µt,σ2t ), (40)

where µt = Nt · πt, and σt = √ Nt ·πt · (1 – πt), with Nt and πt are given in (38) and (39),

respectively.

Each unit demand materializes as a unit sale if met by a day-car supply. Unmet demand is lost. Thus, your revenue Rt(pt, St) in week t is given by

Rt(pt, St) = p1t ×min(St, Dt(pt)), (41) 11Based on the historical data, MI estimates coefficients bi in (38) via linear regression (regress() in MATLAB);

and market share coefficients ci in (39) via Multinomial logistic regression ( mnrfit(), glmfit() in MATLAB).

where St = 7Q is the weekly day-car supply with the fleet size Q, and Dt(pt) is Hurts weekly demand in (40).

Currently the branch has a fleet of Q = 1500 cars. Total costs are comprised of three parts: maintenance, inventory and fixed costs. Each unit sale incurs unit maintenance cost M = $13 (per sale), the amount of work attributable to oil changes, cleaning, and preventative maintenance. Each car in the fleet incurs unit monthly inventory cost I = $298 (per car). Fixed costs, K = $344978, are the sum of all other monthly costs of running the branch and do not vary based on sales or inventory quantity.

Each unit demand requires one day-car supply, i.e., one car for one day rental and returns that car in the following day; multi-day rentals count as multi-unit demands. For example, a request for 4-day rental of a car is treated as 4 units demands. With fleet size Q = 1500, you have weekly day-car supply St = 7×Q = 10500 in week t.

As the manager, you need to decide the weekly price p1t , t = 1, 2, 3, 4, and monthly fleet size Q that together maximize net monthly profit before taxes. The performance of each strategy is measured by two criteria, monthly total profit V , the sum of profits of four weeks, and monthly fill rate f , defined by the ratio between your monthly sales and your monthly demand.

Are you ready for the job?

Tasks:

Q1. First assume fleet size Q = 1500. Consider a static pricing strategy: p1 ≡ p11 = p 1 2 = p

1 3 =

p14 = $40, i.e., set the same price $40 for four weeks. Simulate the operation of Tampa branch for n = 103 (sample size) months (sample path with 4 observations). Compute the expected fill rate E[f ], total expect monthly profit VQ(p1), and its 95% confidence interval.

Q2. Assume Q = 1500. Repeat question 1 for p1 ∈ { 45, 50, . . . , 70 }. Graph VQ(p1) against p1 and find the optimal price p1∗ that maximizes the total profit V∗Q = maxp1 VQ(p

1).

Q3. Now change Q = 2000, repeat questions 1 and 2. Find the optimal price p1∗ and total profit V∗Q for fleet size Q = 2000.

Q4. Repeat questions 1 and 2, for Q = { 2500, 3000, . . . , 4000 }. For each Q, find the optimal associated price p1∗ and the optimal total profit V∗Q. Graph V

∗ Q against Q. Find the optimal fleet

size Q∗.

Bonus (optional): Now consider a dynamic pricing strategy. Suppose you may set the price p1t for each week differently, where p1t ∈ { 40, 45, . . . , 70 }. By the similar approach in questions 1-3, find the optimal price path (p1∗1 , p

1∗ 2 , p

1∗ 3 , p

1∗ 4 ) for each fleet size Q ∈ { 1500, 2000, . . . , 4000 } and

its associated optimal profit V∗Q. Graph V ∗ Q against Q. Find the optimal Q

∗. Compare the total profit under the dynamic pricing strategy with that under the static pricing strategy. Which one performs better? By how much? Why?

7.4 Revenue Management

An airline offers two fare classes for coach seats on a particular flight: full-fare class at $440/ticket and economy class at $218/ticket. There are x = 230 coach seats on the aircraft. Demand for full- fare seats has a mean of 43, a standard deviation of 8, and the following empirical distribution (it is

not normal distribution, so we need to use inverse transform method to generate random demand). Economy-class customers must buy their tickets three weeks in advance, and these tickets are expected to sell out. For given capacity x, let y∗(x) be the protection level, b(x) = x – y∗(x) the booking limit for low-fare seats, V (x) the maximal expected profit, and p(x) = V (x) – V (x – 1) the marginal value of xth seat. Assume sample size N = 103.

a) Rationing: For capacity x = 230, find V (230), y∗(230), and b(230) via the following two methods: analytical method (marginal value argument), and brute force simulation defined below.12

b) For capacity x = 100, find the profit V (100) and protection level y∗(100). Do V (x) and y∗(x) change? Why or why not?

c) Dynamic pricing: Now compute V (x) and p(x) for each x = 1, 2, . . . , 230. Plot V (x) and p(x) against capacity x. Base one these results, if you were the manager, what price should you charge for each seat? Can you explain why the ticket price typically increases when approaching the departure date?

d) Suppose that unsold seats may sometimes be sold at the last minute at the reduced rate r3 = $100 (similar to USAirways’ “Esaver” for last-minute travel). What effect will this have on

the protection level calculated in (a)? What is the optimal protection level now?

demand pdf cdf 40 0.25 0.25 41 0.06 0.31 42 0.04 0.35 43 0.01 0.36 44 0.06 0.42 45 0.07 0.49 46 0.02 0.51 47 0.03 0.54 48 0.03 0.57 49 0.05 0.62 50 0.03 0.65 51 0.05 0.70 52 0.04 0.74 53 0.06 0.80 54 0.09 0.89 55 0.11 1.00

7.4.1 Hints

For this revenue management problem, we need to use both analytical method (marginal value argument), and the ’brute force’ simulation to find the optimal policy, i.e., the optimal protection level for class 1.

Brute force simulation works as follows. Suppose we have ample economy class (class 2) demand such that it is always greater than 230. We begin with reserve only one seat (y = 1) for full fare class (class 1) and sell all 229 to class 2, and compute the associated profit V2(1). Then we compute the profit V2(2) for (y = 2), and so on. In the end, we can find V2(y), for y = 1, 2, 4, 6, 8, . . . . Plot V2(y) against y and then find the optimal y∗ that maximizes the profit.

12See also chapter “revenue management with capacity controls” in the MGT 207 textbook matching supply with demand by Cachon and Terwiesch.

Note the the reserved seats (stock) y is the initial inventory for period 1. Suppose we set the protection level y = 2, i.e., reserve 2 seat to sell in period 1 (to class 1). In the current period (period 2), the reward is r2 ·(230 – 2); and the expected reward from selling the reserve two seats in period 1 can be computed by simulated K = 103 class 1 demands d(k), k = 1 : K. The expected reward in period 1 V1(2) is then

V1(2) = E[r1 ·min(D1, 2)] = ∑K

k=1 r1 ·min(d(k), 2) K

The expected total reward V2(2) under the policy y = 2 is

V2(2) = r2 · (230 – 2) + ∑K

k=1 r1 ·min(d(k), 2) K

Replace y = 2 by y = 3, 4, . . . , 230 and compute corresponding

V2(y) = r2 · (230 – y) + V1(y) = r2 · (230 – y) + E[r1 ·min(D1, y)],

then plot V2(y) against y and find the optimal y.

Compare the y∗ from brute force simulation and ’smart’ simulation to see if they are consistent. You should convince yourself why y∗ is indeed optimal and appreciate a bit more the value of logical thinking, or organized common sense.

7.4.2 Marginal Value Argument

The marginal value argument unifies almost all the decision models. It states that, for a well- behaved decision problem maxx∈X V (x) (e.g., V is concave and X is a convex set), the optimal solution x∗ has zero marginal value,

V ′(x∗) = 0.

Deterministic producer problem In a competitive market with prevailing price r, a producer with production cost C(x) chooses the output x to maximize profit V (x) = r ·x – C(x). The optimal x∗ solves

V ′(x∗) = r – C ′(x∗) = 0, (42) which yields r = C ′(x∗). Intuitively, the optimal quantity x∗ equates the marginal revenue r with the marginal cost C ′(x∗). If the production cost is linear, C(x) = c · x, then r = C ′(x∗) = c, as predicted by the microeconomic theory.

Stochastic inventory problem A firm needs to stock inventory x before random demand D ∼ F arrives. The firm incurs unit stockout penalty cost b and unit leftover holding cost h. Let (x)+ = max(x, 0). He seeks to minimize expected cost

V (x) = h ·E(x – D)+ + b ·E(D – x)+. (43)

Solving V ′(x∗) = 0, we find x∗ satisfies

h ·P(D ≤ x∗) – b ·P(D > x∗) = hF(x∗) – b[1 – F(x∗)] = 0. (44)

Hence, quantity x∗ equates underage cost b · P(D > x∗). The marginal value argument says that the firm at the optimal x∗ should be indifferent between suffering stockout and having leftover. Hence,

F(x∗) = P(D ≤ x∗) = b

b + h .

Newsvendor problem A newsboy must buy newspapers before daily random demand D ∼ F realizes. He purchases x quantity at unit price c, sells at price p, salves leftover at price v. The boy seeks to maximize his expected profit

V (x) = p ·E[min(D, x)] + vE(x – D)+ – c ·x. (45)

Setting too-much cost h = (c – v), and too-little cost b = (p – c), we find optimality condition V ′(x∗) = 0 yields

(c – v) ·P(D ≤ x∗) – (p – c) ·P(D > x∗) = hF(x∗) – b[1 – F(x∗)] = 0. (46)

Once again, the marginal value argument asserts that, at the optimum, the marginal cost of stockout (underage) equals the marginal cost of overage. Hence,

F(x∗) = b

b + h =

p – c p – v

Revenue management Consider an airline sells Q seats to two classes of customers with unit revenue r1 > r2, D2 ample (D2 > Q), and D1 ∼ F. Class-2 arrives before class-1. When selling to class-2, airline must decide how many seats x to protect/reserve for class-1, so as to maximize its expected profit

V (x) = r1E[min(D1, x)] + r2(Q – x). (47)

Setting the marginal profit V ′(x∗) = 0 yields

r2 ·P(D1 ≤ x∗) – (r1 – r2) ·P(D1 > x∗) = hF(x∗) – b[1 – F(x∗)] = 0. (48)

Hence, we should reserve x∗ seats for class-1, such that

F(x∗) = P(D1 ≤ x∗) = b

b + h =

r1 – r2 r1

, (49)

where x∗ is the ( 1 – r2r1

) ×100 percentile of the class-1 demand distribution F, too-little cost

b = r2 – r1, and too-much cost h = r2.

Reactive Capacity and Last-Minutes Sales Suppose unsold seats may sometimes be sold at the last minute at a reduced rate r3 (r3 ≤ r2, similar to USAirways’ “eSavers” for last-minute travel). What effect will this have on the protection level calculated? The additional selling opportunity will reduce the risk/the severity of not selling tomorrow, therefore encourages the reservation, and hence the protection level x∗ increases. Formally, the profit V (x) is

V (x) = r1E[min(D1, x)] + r2(Q – x) + r3E[(x – D1)+]. (50)

Setting V ′(x) = 0, we obtain

(r2 – r3) ·P(D1 ≤ x∗) – (r1 – r2) ·P(D1 > x∗) = hF(x∗) – b[1 – F(x∗)] = 0. (51)

Hence, we should reserve x∗ seats for class-1, such that

F(x∗) = P(D1 ≤ x∗) = b

b + h =

r1 – r2 r1 – r3

. (52)