
An Introduction to Static and Quasi-Static Pricing Policies ©

Guillermo Gallego

Spring 2012

Abstract

We consider the static pricing problem that calls for maximizing profits in excess of marginal costs that are driven by state dynamics. We establish conditions for the existence and uniqueness of finite maximizers and show that optimal profits are decreasing convex in the marginal cost. The convexity of optimal profits in the marginal cost, together with randomness in marginal costs driven by state dynamics, is what justifies dynamic pricing. The results remain valid for the case of bounded capacity and when lower bounds are imposed on sales, under the assumption that aggregate demand is comprised of many small customers or that large customers are willing to take partial orders. We then consider the welfare problem and show that options on capacity eliminate the dead-weight loss when booking and consumption are separated by time and consumers have ex-ante homogeneous willingness to pay. We then consider existence and uniqueness issues when aggregate demand comes from several market segments. We show that aggregate demand inherits existence properties from the individual market segments, but this is not true for uniqueness properties. The problem of using a limited price menu to price multiple market segments is analyzed. Using a single price for all the market segments and a different price for each market segment are two extreme strategies that provide us with lower and upper bounds on profits. We next consider bounds and heuristics to design a menu of J > 1 prices for M > J market segments for a variety of demand functions, including linear and log-linear demands and demands governed by the multinomial logit model (MNL). Existence and uniqueness results for multiple products are provided for a variety of commonly used demand models.

1 Introduction

We are concerned with the following static pricing problem:

$$ r(z) = \sup_{p \in X} (p - z)\, d(p) \qquad (1) $$

where z is the marginal cost of capacity, d(p) is the demand at price p and X is the set of allowable prices. Economists are usually interested in the more general problem where costs are non-linear. Our interest in the simpler problem with linear costs stems from dynamic pricing, where problem (1) arises with z equal to the marginal value of capacity. Readers not interested in the connection to dynamic pricing can skip to Section 2.

To see the connection with dynamic pricing, consider the problem of maximizing the expected revenue that can be obtained from finite, non-replenishable capacity c over a finite horizon [0,T], assuming zero salvage value. Gallego and van Ryzin [5] show that when demand arrives as a Poisson process with intensity $d_t(p)$, then the value function V(t,x), representing the maximum expected revenue when the time-to-go is t and the remaining inventory is x, satisfies the Hamilton-Jacobi-Bellman (HJB) equation:

$$ \frac{\partial V(t,x)}{\partial t} = \sup_{p \in X} \bigl(p - \Delta V(t,x)\bigr)\, d_t(p), \qquad (2) $$

where ∆V(t,x) = V(t,x) − V(t,x−1) is the marginal value of the x-th unit of capacity, and the boundary conditions are V(t,0) = V(0,x) = 0. Equation (2) requires continuity of $d_t(p)$ with respect to t. If $d_t(p)$ is piecewise continuous, then the HJB equation (2) holds over each subinterval where $d_t(p)$ is continuous, with the boundary condition modified to be the value function over the remaining time horizon.

Notice that the optimization in (2) is of the form (1) with z = ∆V(t,x). If a maximizer, say $p_t(z)$, exists for each z ≥ 0 and each t ∈ [0,T], then an optimal solution to the dynamic pricing problem (2) is to set the price $P(t,x) = p_t(\Delta V(t,x))$ at state (t,x). There are two sources of price variation in dynamic pricing. The first source is variation due to state dynamics, as the marginal cost ∆V(t,x) changes with the state (t,x). Gallego and van Ryzin show that ∆V(t,x) is increasing in t, so the x-th unit is more valuable if we have more time to sell it, and decreasing in x, so the marginal value decreases with capacity. We will later show that $p_t(z)$ is increasing in z, so a sale at state (t,x) causes the price to instantaneously increase to P(t,x−1) > P(t,x). The second source of price variation is changes in $p_t(z)$ due to changes of demand $d_t(p)$ over time t. If $p_t(z) = p(z)$ is time invariant, then P(t,x) is increasing in the time-to-go t, since ∆V(t,x) is increasing in t. This means that prices decline in the absence of sales to stimulate demand. If $p_t(z)$ changes with time, then P(t,x) can either increase or decrease over time, as the forces of state dynamics may be in conflict with changes in willingness to pay.

Quasi-static pricing policies are heuristic pricing policies of the form

$$ P^h(t,x) = p_t(z(T,c)), \qquad 0 \le t \le T, $$

that react to changes in dt(p) in t but not to changes in the marginal value ∆V (t,x). Typically z(T,c) is chosen to capture the marginal value of capacity by solving the following fluid program:

$$ \bar{V}(T,c) = \min_{z \ge 0} \left[ cz + \int_0^T r_t(z)\, dt \right]. \qquad (3) $$

Program (3) arises in at least three different ways: 1) by using Approximate Dynamic Programming (ADP) with affine functions, 2) by using a fluid limit approximation and dualizing the capacity constraint, and 3) by modifying the differential equation (2) by replacing ∆V(t,x) with the partial derivative $V_x(t,x)$ of V(t,x) with respect to x. We will later show that (3) is a convex minimization program. We refer the reader to Gallego and van Ryzin [5] for a proof that $\bar{V}(T,c)$ is an upper bound on V(T,c) and for a discussion of the asymptotic optimality properties of the quasi-static pricing policy for the case $d(p) = d_t(p)$ for all t ∈ [0,T].

Quasi-static pricing policies are responsive to changes in willingness to pay but not responsive to changes in state dynamics. It can be shown that quasi-static pricing policies are asymptotically optimal, see Gallego [7], and they are a natural extension of the fixed-price policies in [5]. The fact that quasi-static pricing policies ignore state dynamics is materially detrimental only when both capacity and aggregate demand are relatively small. On the positive side, quasi-static pricing policies do not suffer from the nervousness of full dynamic pricing policies that react instantaneously to state dynamics, e.g., decreasing prices between sales and increasing them after each sale. This is an important advantage in practice, as quasi-static policies are easier to implement. Limits are often imposed on prices, so the optimization is restricted to $p \in X_t$, where $X_t$ may be a finite price menu. The design of the price menu is considered part of the problem. For example, if the cardinality of the set of different prices utilized by the static pricing heuristic $\{p_t(z(T,c)) : 0 \le t \le T\}$ is M and M is considered too large, then the task may be to select a pricing menu with at most J < M different prices, to prevent the pricing policy from being too nervous. We study a variant of this problem in Section 6.1.

The quasi-static heuristic is often made more dynamic by frequently re-solving (3), obtaining an updated value of the marginal value of capacity each time (3) is re-solved. More precisely, if the realization of demand deviates significantly from its deterministic path, then the value of z can be updated at state (s,y) to z(s,y), where z(s,y) is the minimizer of $yz + \int_0^s r_t(z)\,dt$. Prices are then updated to $p_t(z(s,y))$ for t ∈ [0,s] or until the deterministic problem (3) is solved again. If the system is updated continuously, we get a feedback policy $P^h(t,x) = p_t(z(t,x))$, which tends to perform better than the quasi-static policy but also requires more computation and results in more nervous prices. The reader is referred to Maglaras and Meissner [13], who show that the feedback policy is also asymptotically optimal, and to Cooper [2], who presents an example showing that updating z when the inventory and the time-to-go are small can hurt rather than help performance.

The near optimality of quasi-static pricing policies motivates the study of the static optimization problem (1). Although this is a special case of the basic pricing problem where marginal costs are constant, there are some subtle issues regarding existence and uniqueness. In addition, there are a number of variants of the problem that are of interest in their own right. Our aim in this chapter is to present the reader with a unified and comprehensive analysis of the problem.

In Section 2 of this chapter we present basic properties and existence of finite maximizers. We first show that r(z) is decreasing convex in z and present conditions on d(p) that guarantee the existence of a finite price p(z), increasing in z, such that r(z) = (p(z)−z)d(p(z)). In Section 3 we present sufficient conditions for the unimodality of r(p,z) in p and for the uniqueness of p(z). We analyze the case of bounded capacity and lower bounds on sales in Section 4. Multiple market segments are treated in Section 6. We first look into the question of existence and uniqueness when the demands of two or more market segments are aggregated. We show that existence conditions for the individual market segments are inherited by the aggregate demand. This is not so for uniqueness conditions. We then explore heuristics to price M market segments with at most J < M different prices. This problem may arise either because only a few prices are allowed or because detailed demand information from the different market segments is not enough to support using more prices. We show that it is often possible to design near-optimal price menus for values of J that are small relative to M. The welfare problem is discussed in Section 5, where call options on capacity are presented as a viable solution when booking and consumption are separated by time and customers learn their valuations between booking and the time of consumption.

2 Basic Properties and Existence of Finite Maximizers

Let d(p) : X ⊂ [0,∞) → [0,∞] be a function representing the demand for a product at price p ∈ X. For any z ≥ 0 let r(p,z) = (p − z)d(p) be the profit function for any p ∈ X, where we treat z as an exogenous unit cost. For z ≥ 0, we define $r(z) = \sup_{p \in X} r(p,z)$, the optimal profit as a function of the unit cost. We write sup instead of max in the definition of r(z) because the maximum may not be attained. To see this, consider the demand function d(p) = 1 for p ∈ [0, 10) and d(p) = 0 for p ≥ 10; then $r(z) = (10 - z)^+$ but the maximum is not attained. As an example where a finite maximizer fails to exist, consider the demand function $d(p) = p^{-b}$, p ≥ 0, for b ∈ (0, 1). Then $r(p,0) = p^{1-b}$, so r(0) = ∞ and there is no finite maximizer. Later we will present sufficient conditions for the existence of a finite maximizer p(z) < ∞ such that r(z) = r(p(z),z). However, even if the supremum is not attained we can show that r(z) is decreasing¹ convex in z.

Theorem 1 r(z) is decreasing convex in z.

Proof: Notice that for any z < z′, r(p,z) = (z′ − z)d(p) + r(p,z′) ≥ r(p,z′). Therefore $r(z) = \sup_{p \in X} r(p,z) \ge \sup_{p \in X} r(p,z') = r(z')$. To verify convexity, let α ∈ (0, 1), and let z(α) = αz + (1 − α)z′. Then

$$
\begin{aligned}
r(z(\alpha)) &= \sup_{p \in X} r(p, z(\alpha)) \\
&= \sup_{p \in X} \bigl[\alpha\, r(p,z) + (1-\alpha)\, r(p,z')\bigr] \\
&\le \alpha \sup_{p \in X} r(p,z) + (1-\alpha) \sup_{p \in X} r(p,z') \\
&= \alpha\, r(z) + (1-\alpha)\, r(z').
\end{aligned}
$$
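Theorem 1 places no monotonicity requirement on d(p), which can be checked numerically. The following is a minimal sketch, assuming a hypothetical non-monotone demand a·exp(−bp)·sin²(p) with illustrative parameter values and a finite price grid standing in for X; it evaluates r(z) on a grid of unit costs and checks that the values are decreasing and midpoint convex.

```python
import math

def demand(p, a=10.0, b=0.5):
    # Hypothetical non-monotone demand (an assumption for illustration only).
    return a * math.exp(-b * p) * math.sin(p) ** 2

# Finite price grid on [0, 30] standing in for the set X of allowable prices.
PRICES = [0.01 * i for i in range(3001)]

def r(z):
    # Optimal profit r(z) = sup over the price grid of (p - z) d(p).
    return max((p - z) * demand(p) for p in PRICES)

costs = [0.25 * k for k in range(21)]  # equally spaced unit costs in [0, 5]
values = [r(z) for z in costs]

TOL = 1e-9
is_decreasing = all(values[k] >= values[k + 1] - TOL for k in range(len(values) - 1))
is_midpoint_convex = all(
    values[k] <= 0.5 * (values[k - 1] + values[k + 1]) + TOL
    for k in range(1, len(values) - 1)
)
print(is_decreasing, is_midpoint_convex)
```

On a finite price grid, r(z) is a maximum of finitely many affine functions of z with slopes −d(p) ≤ 0, which is the same mechanism the proof uses.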

Remark 1: The convexity of r(z) implies that $cz + \int_0^T r_t(z)\,dt$ is convex in z, so to obtain z(T,c), the marginal cost to be used for quasi-static pricing, all we need to do is find the unconstrained minimizer of this convex function and take its positive part.

¹ We use the terms increasing and decreasing in the weak sense unless stated otherwise.
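As an illustration of Remark 1, the sketch below computes z(T,c) under the assumption of a time-invariant exponential demand d(p) = λe^{−p/θ}, for which r(z) = λθe^{−1−z/θ}. Setting the derivative of cz + T r(z) to zero gives the unconstrained minimizer θ[ln(λT/c) − 1], whose positive part is compared with a ternary search over z ≥ 0. The function names and parameter values are hypothetical.

```python
import math

def r_exp(z, lam, theta):
    # Optimal profit for exponential demand: r(z) = lam * theta * exp(-1 - z/theta).
    return lam * theta * math.exp(-1.0 - z / theta)

def fluid_objective(z, c, T, lam, theta):
    # Objective of the fluid program with time-invariant demand: c z + T r(z).
    return c * z + T * r_exp(z, lam, theta)

def z_closed_form(c, T, lam, theta):
    # Positive part of the unconstrained minimizer theta * (ln(lam T / c) - 1).
    return max(theta * (math.log(lam * T / c) - 1.0), 0.0)

def z_numeric(c, T, lam, theta, lo=0.0, hi=100.0, iters=200):
    # Ternary search, valid because the objective is convex in z.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if fluid_objective(m1, c, T, lam, theta) <= fluid_objective(m2, c, T, lam, theta):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

scarce = (z_closed_form(20.0, 10.0, 10.0, 5.0), z_numeric(20.0, 10.0, 10.0, 5.0))
ample = (z_closed_form(50.0, 10.0, 10.0, 5.0), z_numeric(50.0, 10.0, 10.0, 5.0))
print(scarce, ample)
```

Under this assumed demand model the quasi-static price would then be p(z(T,c)) = z(T,c) + θ throughout the horizon.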

Remark 2: Jensen's inequality implies that E[r(Z)] ≥ r(E[Z]). This means that a retailer prefers a random unit cost Z to a deterministic unit cost E[Z], provided that he can charge random prices p(Z). This also explains why dynamic pricing reacts to state dynamics, ∆V(t,x), even when demand $d_t(p) = d(p)$ is time invariant.

Remark 3: If r is twice differentiable then $r(Z) \approx r(E[Z]) + (Z - E[Z])\, r'(E[Z]) + 0.5\,(Z - E[Z])^2\, r''(E[Z])$. Taking expectations yields $E[r(Z)] - r(E[Z]) \approx 0.5\, \mathrm{Var}[Z]\, r''(E[Z])$. Consequently, the benefit of responding to cost changes is large when Z has a large variance and r has large curvature at E[Z].

Example 1 If d(p) = 1 − p over p ∈ [0, 1] then for z ∈ [0, 1], p(z) = (1 + z)/2 maximizes r(p,z) = (p−z)(1−p), resulting in $r(z) = (1-z)^2/4$. If z = 1/2 then r(1/2) = 1/16. Notice that a retailer with demand d(p) prefers a wholesaler with unit cost $z_1 = 1/3$ with probability 1/2 and unit cost $z_2 = 2/3$ with probability 1/2, since this leads to more than a 10% increase in expected profits, from 1/16 to 5/72. However, this does not mean that the retailer prefers to randomize prices if his true cost is z = 1/2, for any deviation from p(1/2) leads to lower profits.
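The arithmetic of Example 1, together with the second-order approximation of Remark 3, can be verified with exact rational numbers. This is a small sketch; the variable names are illustrative. Because r(z) = (1 − z)²/4 is quadratic with r″ = 1/2, the approximation 0.5·Var[Z]·r″(E[Z]) is exact in this case.

```python
from fractions import Fraction as F

def r(z):
    # Optimal profit for d(p) = 1 - p on [0, 1]: r(z) = (1 - z)^2 / 4.
    return (1 - z) ** 2 / 4

z_lo, z_hi = F(1, 3), F(2, 3)        # the two equally likely unit costs
mean_z = (z_lo + z_hi) / 2
var_z = ((z_lo - mean_z) ** 2 + (z_hi - mean_z) ** 2) / 2

deterministic = r(mean_z)             # profit at the deterministic cost E[Z]
expected = (r(z_lo) + r(z_hi)) / 2    # expected profit under the random cost
gain = expected - deterministic
second_order = F(1, 2) * var_z * F(1, 2)   # 0.5 * Var[Z] * r'' with r'' = 1/2
relative_gain = gain / deterministic

print(deterministic, expected, gain, second_order, relative_gain)
```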

The following Corollary pushes the idea a bit further. The proof is provided in the Appendix.

Corollary 1 If $g(y) : \mathbb{R}^m \to \mathbb{R}_+$ is increasing in y then r(g(y)) is decreasing in y. If g(y) is concave, then r(g(y)) is convex. Moreover, if $Y \in \mathbb{R}^m$ is random, then E[r(g(Y))] ≥ r(E[g(Y)]) ≥ r(g(E[Y])).

We can interpret z = g(y) as the unit cost, where y is the vector of component costs. As an example, g(y) = f′y where $f \in \mathbb{R}^m_+$ is the vector of resource requirements. This shows, again, that the retailer is better off with random costs Y than with deterministic costs E[Y].

We have not assumed that d(p) is decreasing in p, to allow for prestige goods whose demand may increase in price over a certain range. We will now show that we can construct a decreasing function d̄(p), based on d(p), such that under mild conditions we can find a maximizer p(z) of r(p,z) by finding a maximizer, say p̄(z), of r̄(p,z) = (p−z)d̄(p). Indeed, let $\bar{d}(p) = \sup_{p' \ge p} d(p')$ for all p ≥ 0, and assume that d(p) is upper-semi-continuous (USC). Recall that a function d(p) : X ⊂ [0,∞) → [0,∞] is USC at $p_0 \in X$ if $\limsup_{p \to p_0} d(p) \le d(p_0)$, and d(p) is USC in p ∈ X if it is USC at every point p ∈ X. Clearly d(p) USC implies that d̄(p) is USC. Moreover, any decreasing USC function is left-continuous with right limits (LCRL), so d̄(p) is LCRL. Let $\bar{r}(z) = \sup_{p \in X} \bar{r}(p,z)$. It is easy to construct examples where r̄(z) > r(z). The next Lemma shows that this is not possible if d(p) is USC.

Lemma 1 If d(p) is USC and p̄(z) is a finite maximizer of r̄(p,z), then p(z) = p̄(z) is a maximizer of r(p,z) and r(z) = r̄(z).


The proof of Lemma 1 can be found in the Appendix.

Our next task is to find conditions that guarantee the existence of a finite price that attains the maximum of r(p,z), and for this we need a few definitions from convex analysis, see Rockafellar [15]. A function d(p) is said to be proper if d(p) < ∞ for all p ∈ [0,∞). The product of two non-negative, proper USC functions is also USC. The product of two non-negative USC functions, proper or not, is also USC provided we treat 0 × ∞ = ∞, and we will agree to this convention to develop a unified theory for both proper and improper functions. Let $\bar{s}(z) = \int_z^\infty \bar{d}(y)\,dy$ be the area under the function d̄(y) to the right of z, and notice that

$$ \bar{r}(z) \le \bar{s}(z) \le \bar{s}(0) \quad \text{for all } z \ge 0. $$

The following result presents conditions that guarantee the existence of a finite maximizer. The proof of the result is somewhat technical and can be found in the Appendix.

Theorem 2 If d(p) is USC and s̄(0) < ∞ then for every z ≥ 0 there exists a finite price p(z) ∈ [z,∞) such that r(z) = r(p(z),z). Moreover, p(z) can be selected so that it is increasing in z.

Remark 1: If we want to guarantee the existence of a finite price for a given z, rather than for all z ≥ 0, then it is enough to require d to be USC on X ∩{p ≥ z} and to require s̄(z) < ∞.

Remark 2: Notice that the condition s̄(0) < ∞ is sufficient but not strictly necessary. To see this notice that d̄(p) = 1/p results in s̄(0) = ∞ yet r̄(p, 0) = 1 for all p > 0, so p̄(0) = 1 is optimal. However, p̄(z) = ∞ for all z > 0 since r̄(p,z) = 1 −z/p is increasing in p.

Remark 3: In many cases d(p) is eventually decreasing, i.e., there is a p′ such that d(p) is decreasing on p ≥ p′. However, Theorem 2 does not require this. For example, the demand function $d(p) = a \exp(-bp) \sin^2(p)$ is not eventually decreasing, yet $\bar{d}(p) \le a \exp(-bp)$, so $\bar{s}(0) \le a/b$.

The following result shows that if the demand comes from a maximum willingness to pay function with finite mean then the conditions of Theorem 2 apply.

Corollary 2 If d(p) = λP(W ≥ p) for some random variable W with E[W] < ∞, then there exists a finite maximizer p(z) such that r(z) = r(p(z),z).

Under the first two conditions of Corollary 2, the actual demand, say D(p) is random with d(p) = E[D(p)]. As an example, the potential demand may be Poisson with parameter λ and demand at price p may be a thinned Poisson with parameter λH(p). Notice that by defining H(p) = P(W ≥ p) instead of H(p) = P(W > p) we are able to claim that H(p) is LCRL. This is an innocuous assumption if the distribution of W is continuous, or if W is discrete and pricing is, as it is in practice, restricted to discrete values, e.g., dollars and cents. However, the case where W is discrete and prices are allowed to be continuous leads to technical problems. Thus, if a customer is willing to pay any price lower than $10, then there is no finite price that maximizes the revenue that we can generate from such a customer, but things are fine if he is willing to pay up to and including $10. For this reason it is convenient to think of W as the maximum willingness to pay when H(p) is defined as P(W ≥ p).


As an example, if W is exponential with mean θ, then $d(p) = \lambda e^{-p/\theta}$, p(z) = z + θ and $r(z) = \lambda\theta e^{-(z+\theta)/\theta} = \lambda\theta e^{-1} e^{-z/\theta}$. In this case the demand function d(p) has two parameters, λ representing the expected market size and θ representing the mean willingness to pay. There are, however, examples where d(p) = λH(p) with H(p) decreasing in p, where H(p) is not of the form P(W ≥ p) for some random variable W. To see this, consider the demand function $d(p) = \lambda p^{-\beta}$ for some β > 1. Then s(z) < ∞ and p(z) = βz/(β − 1) for all z > 0, yet there is no random variable W such that λP(W ≥ p) = d(p), as $p^{-\beta} > 1$ for p ∈ (0, 1). Often d(p) = λH(p), with H decreasing, can be written as d(p) = λf(αp + β ln p), where f is a decreasing function and α and β are non-negative parameters. For example, $f(x) = e^{-x}$, α = 1/θ and β = 0 yields $d(p) = \lambda e^{-p/\theta}$, while α = 0 and β > 1 yields $d(p) = \lambda p^{-\beta}$.

2.1 Demand Estimation

Suppose that time is rescaled into tiny intervals so that the demand $D_t = D(p_t)$ at price $p_t$ in period t is a Bernoulli random variable with expected value $d(p_t) = \lambda f(\alpha p_t + \beta \ln p_t) \ll 1$, for some positive, decreasing function f, e.g., $f(x) = e^{-x}$ or $f(x) = e^{-x}/(1 + e^{-x})$. Then $D_t = 1$ with probability $d(p_t)$ and $D_t = 0$ with probability $1 - d(p_t)$. Suppose we have data $(p_s, d_s)$, s = 1, . . . , t, where $d_s$ is the realized value of $D_s$ in period s. The likelihood function up to time t is given by

$$ L_t(\lambda,\alpha,\beta) = \prod_{s=1}^{t} d(p_s)^{d_s} \bigl(1 - d(p_s)\bigr)^{1-d_s}. $$

The log-likelihood function is given by

$$ l_t(\lambda,\alpha,\beta) = \sum_{s=1}^{t} \left[ d_s \ln\frac{d(p_s)}{1 - d(p_s)} + \ln\bigl(1 - d(p_s)\bigr) \right]. $$

The score equations are obtained by setting the derivatives of $l_t(\lambda,\alpha,\beta)$ with respect to λ, α and β equal to zero. The solutions to the score equations are the maximum likelihood estimators $\hat\lambda_t, \hat\alpha_t, \hat\beta_t$. One important concern is whether the sequence of estimators $\hat\lambda_t, \hat\alpha_t, \hat\beta_t$ converges to the true parameter values λ, α and β. An interesting finding is that to guarantee convergence there needs to be enough variability in the prices. Without enough variability, it is possible for the estimates to converge to incorrect values of the parameters.
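The need for price variability can be illustrated with the log-likelihood itself. The sketch below assumes f(x) = e^{−x} and β = 0, and uses small made-up data sets (not taken from any source). When every observation is collected at the same price, two different parameter pairs (λ, α) that imply the same demand at that price have identical likelihoods, so the data cannot tell them apart; with varied prices the likelihoods differ.

```python
import math

def demand(p, lam, alpha, beta=0.0):
    # d(p) = lam * f(alpha p + beta ln p) with the assumed choice f(x) = exp(-x).
    return lam * math.exp(-(alpha * p + beta * math.log(p)))

def log_likelihood(data, lam, alpha, beta=0.0):
    # l_t = sum_s [ d_s ln(d(p_s) / (1 - d(p_s))) + ln(1 - d(p_s)) ].
    total = 0.0
    for price, sale in data:
        q = demand(price, lam, alpha, beta)
        total += sale * math.log(q / (1.0 - q)) + math.log(1.0 - q)
    return total

# Hypothetical observations (price, realized Bernoulli demand).
single_price = [(2.0, 1), (2.0, 0), (2.0, 0), (2.0, 1), (2.0, 0), (2.0, 0)]
varied_price = [(1.0, 0), (2.0, 1), (3.0, 0), (1.0, 1), (2.0, 0), (3.0, 0)]

# Two parameter pairs chosen so that both imply the same demand at price 2.
params_a = (0.2, 0.5)
params_b = (0.2 * math.e, 1.0)

gap_single = abs(log_likelihood(single_price, *params_a) - log_likelihood(single_price, *params_b))
gap_varied = abs(log_likelihood(varied_price, *params_a) - log_likelihood(varied_price, *params_b))
print(gap_single, gap_varied)
```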

3 Unimodality of r(p,z) and uniqueness of p(z)

We now turn to conditions on the demand function d(p) that guarantee that r(p,z) does not have local, non-global maximizers or, more succinctly, that r(p,z) is unimodal in p ≥ z. This is equivalent to r(p,z) being quasi-concave in p ≥ z and to r(p,z) having convex upper level sets {p ≥ z : r(p,z) ≥ α} for all α. If d(p) is continuous and differentiable, we define the hazard rate at p to be h(p) = −d′(p)/d(p), where d′(p) is the derivative of d at p. The hazard rate function h(p) is defined for all $p < p_\infty = \sup\{p : d(p) > 0\}$. Notice that $p_\infty$ may be ∞. We call $p_\infty$ the null price, as d(p) > 0 for all $p < p_\infty$ and d(p) = 0 for all $p \ge p_\infty$. We say that a function f(p) has a unique sign change from + to − over p ≥ z if the function starts positive, becomes non-positive and stays non-positive once it becomes non-positive for the first time. Notice that we are not requiring f(p) to be decreasing, nor for a root of f(p) = 0 to exist. The following Theorem provides sufficient conditions for the existence of a finite maximizer. The proof of the Theorem is in the Appendix.

Theorem 3 If d(p) is differentiable and

f(p) = 1 − (p−z)h(p) (4)

has a unique sign change from + to − on p ≥ z, then r(p,z) is unimodal and

p(z) = sup{p : 1 − (p−z)h(p) ≥ 0} (5)

is a global maximizer of r(p,z).

Proof: The derivative of r(p,z) with respect to p can be written as

$$
\begin{aligned}
\frac{\partial r(p,z)}{\partial p} &= d(p) + d'(p)(p - z) \\
&= d(p)\,\bigl[1 - (p - z)h(p)\bigr] \qquad (6)
\end{aligned}
$$

for all p < p∞. As a result r(p,z) is increasing in p for all p < p(z) and decreasing for all p ≥ p(z). Moreover, r(p,z) = 0 for all p ≥ p∞, proving that p(z) is a global maximizer.

Notice that we cannot guarantee the existence of a root of 1−(p−z)h(p). This is because d′(p), and therefore f(p), need not be continuous. While Theorem 3 rules out the existence of local, non-global maximizers, there may be multiple global maximizers, i.e., multiple roots of f(p) = 0, if there is an interval over which h(p) = 1/(p−z). The following proposition provides stronger conditions for the existence and uniqueness of a finite maximizer and also provides bounds on p(z).

Proposition 1 a) If h(p) is continuous and increasing in p and h(z) > 0, then there is a unique optimal price satisfying z ≤ p(z) ≤ z + 1/h(z).

b) If ph(p) is continuous and increasing in p and there exists a finite z′ ≥ z such that 1 < z′h(z′), then there is a unique optimal price satisfying z ≤ p(z) ≤ z/(1−1/z′h(z′)).

c) If d̃(p) is a demand function with hazard rate h̃(p) such that h̃(p) ≥ h(p) or ph̃(p) ≥ ph(p) for all p, then p̃(z) ≤ p(z) where p̃(z) is a maximizer of r̃(p,z) = (p−z)d̃(p).

Proof: Part a) If h(p) is continuous and increasing in p then f(p) is continuous and strictly decreasing in p ≥ z. Moreover, f(z + 1/h(z)) = 1 − h(z + 1/h(z))/h(z) ≤ 0 < 1 = f(z), on account of h(z + 1/h(z)) ≥ h(z) > 0. Therefore there exists a unique p(z) satisfying (5) that is bounded below by z and above by z + 1/h(z). Part b) If ph(p) is increasing in p and z′h(z′) > 1, then f(p) is continuous in p > z and the equation f(p) = 0 can be written as ph(p) = p/(p−z), with the left-hand side increasing in p and the right-hand side decreasing to one for p > z. Since zh(z) < ∞ it follows that p(z) ≥ z. Notice that z/(1 − 1/(z′h(z′))) is the root of z′h(z′) = p/(p−z). Since ph(p) ≥ z′h(z′) ≥ p/(p−z) for all p ≥ z/(1 − 1/(z′h(z′))), it follows that p(z) is unique and bounded above by z/(1 − 1/(z′h(z′))). Part c) Clearly f̃(p) ≤ f(p), so p̃(z) ≤ p(z).

The reader may wonder whether there are demand functions that achieve the bounds in parts a) and b) of Proposition 1. Part c) suggests that the bounds may be attained when h(p) or ph(p) increase the least, e.g., when they are constant. For part a) this suggests the hazard rate h(p) = 1/θ, which corresponds to the exponential demand function $d(p) = \lambda e^{-p/\theta}$, resulting in p(z) = z + θ = z + 1/h(z). For part b) we try ph(p) = b > 1, corresponding to $d(p) = \lambda p^{-b}$, which is known as the constant price elasticity demand model. In this case p(z) = bz/(b−1) = z/(1−1/b) = z/(1 − 1/(z′h(z′))). Notice that the condition b > 1 is crucial, as there is no finite root p(z) if b < 1, or if b = 1 and z > 0.
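The two closed forms above can be recovered numerically from characterization (5). The following is a minimal sketch that applies bisection to f(p) = 1 − (p − z)h(p), assuming f has a unique sign change from + to − on the search bracket; the helper name `optimal_price` and the parameter values are hypothetical.

```python
def optimal_price(hazard, z, lo, hi, iters=200):
    # Bisection for p(z) = sup{p : 1 - (p - z) h(p) >= 0}, assuming f changes
    # sign once from + to - on [lo, hi] with f(lo) > 0 and f(hi) < 0.
    def f(p):
        return 1.0 - (p - z) * hazard(p)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) >= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

z = 2.0

# Exponential demand: constant hazard rate h(p) = 1/theta, so p(z) = z + theta.
theta = 5.0
p_exponential = optimal_price(lambda p: 1.0 / theta, z, lo=z, hi=z + theta + 1.0)

# Constant elasticity demand: p h(p) = b > 1, so p(z) = b z / (b - 1).
b = 3.0
p_elasticity = optimal_price(lambda p: b / p, z, lo=z, hi=10.0 * z)

print(p_exponential, z + theta)
print(p_elasticity, b * z / (b - 1.0))
```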

The reader is directed to van den Berg [17] and references therein for earlier efforts to characterize the existence or uniqueness of global maximizers. In particular, van den Berg assumes that H exists, is continuous and E[V] < ∞ to show existence. He assumes that ph(p) is strictly increasing to show uniqueness. He calls this condition the increasing proportional failure rate (IPFR) condition and gives a large list of distribution functions that satisfy it. Economists frequently write the first order condition f(p) = 0 as

$$ \frac{p - z}{p} = \frac{1}{p\,h(p)} = \frac{1}{|e(p)|} $$

where e(p) = −ph(p) = pd′(p)/d(p) is the elasticity of demand. Since ph(p) is the (absolute) elasticity of demand at price p, the IPFR condition is equivalent to assuming an increasing (absolute) demand elasticity. The reader is also referred to Lariviere and Porteus [10] for an equivalent assumption where ph(p) is called the generalized hazard rate.

The problem of maximizing r(p,z) can sometimes be transformed so that demand rather than price is the decision variable. This can be done if there is an inverse demand function, say p(d), that yields demand d at price p(d). This results in the problem of maximizing (p(d)−z)d over d. It is sometimes advantageous to use this formulation, as there are demand functions for which (p(d)−z)d is concave in d while r(p,z) is not concave in p. While we are cognizant of this advantage, and have used it in some of our research, it is interesting to note that there are also demand functions for which r(p,z) is concave in p without (p(d)−z)d being concave in d. We refer the reader to Ziya et al. [19] for an interesting analysis that shows the non-equivalence of the following assumptions: (i) concavity of pd(p) in p, (ii) concavity of dp(d) in d and (iii) ph(p) increasing in p.


4 Bounded Capacity and Sales Constraints

Consider pricing a product where up to c units can be procured at marginal cost z. At price p we can sell at most $d_c(p) = \min(d(p), c)$ units. It is possible to sell up to $d_c(p)$ units assuming that customers are willing to take partial orders or that demand comes from many customers with small demands. In this case the pricing problem can be formulated as $r_c(z) = \sup_{p \in X} r_c(p,z)$ where $r_c(p,z) = (p - z)\, d_c(p)$.

Proposition 2 If d(p) satisfies the conditions of Theorem 2, then so does dc(p) = min(d(p),c), and as a result there exists a finite maximizer pc(z), increasing in z, of rc(p,z) such that rc(z) = rc(pc(z),z) is decreasing convex in z.

Proof: Since $d_c(p) \le d(p)$ it follows that $\bar{s}_c(z) \le \bar{s}(z)$ for all z and consequently $\bar{s}_c(0) \le \bar{s}(0) < \infty$. If d(p) is USC then so is $d_c(p)$, because the minimum of USC functions is USC.

As an example, suppose that z = 0, d(p) = 3 for p ≤ 10 and d(p) = 0 for p > 10. If c = 2 then $d_2(p) = 2$ if p ≤ 10 and $d_2(p) = 0$ for p > 10. Then $r_2(p, 0)$ is maximized at $p_2(0) = 10$, resulting in $r_2(0) = 20$. Notice that at this price three units are demanded but only two units are sold. If demand comes from three different customers each requesting one unit this is not a problem, but if it comes from a single customer who wishes to fulfill all of his demand or none at all, then the formulation proposed here would be inappropriately optimistic. Indeed, if customers are not willing to take partial orders we can use a more conservative formulation: $\sup_{p \ge z} r(p,z)$ subject to d(p) ≤ c. The set of feasible prices for the current example is {p : p > 10} and over this range r(p, 0) = 0, so the profit under this formulation is zero. This would be the correct profit if demand comes from a single customer unwilling to take partial orders, but the formulation would be excessively pessimistic if the demand came from three different customers each demanding one unit at any price p ≤ 10.

We now turn to the questions of unimodality of rc(p,z) and uniqueness of pc(z).

Proposition 3 If the hazard rate h(p) of d(p) satisfies the conditions of Theorem 3 for a fixed z then so does the hazard rate hc(p) of dc(p) and as a result rc(p,z) is unimodal in p.

Proof: Suppose that 1 − (p− z)h(p) has a unique sign change from + to −. Let dc(p) = min(d(p),c). If d(0) < c then dc(p) = d(p) for all p ≥ 0 and there is nothing to show. Otherwise the hazard rate, say hc(p), of dc(p) is zero when d(p) > c and is equal to h(p) otherwise. Thus, if 1 − (p − z)h(p) has a unique sign change then so does 1 − (p − z)hc(p), showing that rc(p,z) is unimodal in p.

Let pmin(c) = sup{p ≥ 0 : d(p) ≥ c}. It is useful to think of pmin(c) as the market clearing price as demand exceeds supply for all p < pmin(c) and supply exceeds demand for all p > pmin(c). The following result links pc(z) to p(z) via the market clearing price pmin(c).

Corollary 3 The price

$$ p_c(z) = \sup\{p \ge p_{\min}(c) : 1 - (p - z)h(p) \ge 0\} = \max\bigl(p(z),\, p_{\min}(c)\bigr) $$

is a global maximizer of $r_c(p,z)$. Moreover, if either $h_c(p)$ or $p\,h_c(p)$ is strictly increasing, or the equation $1 - (p - z)h_c(p) = 0$ has a unique root, then $p_c(z)$ is unique.

If d(p) is continuous then the formulation maxp≥z rc(p,z) is equivalent to the formulation maxp≥z r(p,z) subject to d(p) ≤ c and we can bring in the machinery of Lagrangian Relaxation. The idea is to impose a penalty γ(d(p) − c) for violations of the capacity constraint where γ is a non-negative Lagrange multiplier. Subtracting the penalty results in the Lagrangian:

L(p,γ) = r(p,z) −γ(d(p) − c) = r(p,z + γ) + γc.

The agenda is to find $\min_{\gamma \ge 0} \max_{p \ge z} L(p,\gamma)$. The inner optimization is solved by p(z + γ) and the outer optimization is equivalent to $\min_{\gamma \ge 0} [r(z + \gamma) + \gamma c]$, which is a convex program in γ. Notice that γ ≥ 0 increases the marginal cost of capacity. Let $\gamma_c$ be any unconstrained minimizer of r(z + γ) + γc. Then the outer optimization is solved by $\gamma_c^* = \max(\gamma_c, 0)$. If d(p(z)) ≤ c, then $\gamma_c \le 0$ and consequently $\gamma_c^* = 0$. In other words, p(z) is an optimal solution if capacity is ample.² On the other hand, if d(p(z)) > c, then capacity is scarce and $\gamma_c$ is the root of d(p(z + γ)) = c. This corresponds to using the market clearing price $p_{\min}(c)$ discussed before. In summary, an optimal price is given by $\max(p(z), p_{\min}(c))$, and if $p_{\min}(c) > p(z)$ then there exists a $\gamma_c^* > 0$ such that $p_{\min}(c) = p(z + \gamma_c^*)$. As an example, consider the problem with $d(p) = \lambda e^{-p/\mu}$. Then p(z) = µ + z and, solving d(p) = c, $p_{\min}(c) = \mu \ln(\lambda/c)$, so $p_c(z) = \max(\mu + z,\ \mu \ln(\lambda/c))$ solves the pricing problem, and the problem is capacity constrained whenever $c < d(p(z)) = e^{-1} d(z)$. Also, $\gamma_c^* = \max(0,\ \mu[\ln(\lambda/c) - 1] - z)$.
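The capacitated example can be checked against a brute-force search. The sketch below assumes exponential demand with illustrative values of λ, µ and z; the market clearing price is obtained by solving λe^{−p/µ} = c, which gives µ ln(λ/c), and the closed-form price max(p(z), p_min(c)) is compared with a grid maximizer of (p − z)·min(d(p), c). The function names are hypothetical.

```python
import math

lam, mu, z = 100.0, 5.0, 2.0   # illustrative parameters (assumptions)

def d(p):
    # Exponential demand d(p) = lam * exp(-p / mu).
    return lam * math.exp(-p / mu)

def price_closed_form(c):
    p_unconstrained = mu + z                 # p(z) for exponential demand
    p_clearing = mu * math.log(lam / c)      # solves d(p) = c
    return max(p_unconstrained, p_clearing)

def multiplier(c):
    # gamma* = max(0, mu [ln(lam / c) - 1] - z), so that p(z + gamma*) = p_c(z).
    return max(0.0, mu * (math.log(lam / c) - 1.0) - z)

def price_grid(c, step=0.001, p_max=60.0):
    # Brute-force maximizer of r_c(p, z) = (p - z) min(d(p), c) over p >= z.
    best_p, best_val = z, 0.0
    for i in range(int((p_max - z) / step) + 1):
        p = z + i * step
        val = (p - z) * min(d(p), c)
        if val > best_val:
            best_p, best_val = p, val
    return best_p

scarce_c, ample_c = 10.0, 50.0   # d(p(z)) is about 24.7 for these parameters
print(price_closed_form(scarce_c), price_grid(scarce_c), multiplier(scarce_c))
print(price_closed_form(ample_c), price_grid(ample_c), multiplier(ample_c))
```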

4.1 Sales Constraints

Management may be interested in achieving a certain sales volume and impose the constraint d(p) ≥ c on sales. This is the opposite of a capacity constraint, and if d(p) is continuous the constraint can be handled by imposing a penalty γ(c−d(p)) on violations of the constraint. Subtracting the penalty results in the Lagrangian L(p,γ) = r(p,z) − γ(c − d(p)) = r(p,z − γ) − γc. The program is to minimize r(z − γ) − γc over γ ≥ 0. Notice that now γ ≥ 0 acts as a subsidy to the unit cost z. This is a convex program in γ. Let $\gamma_c$ be the unconstrained minimizer of r(z − γ) − γc. Then $\gamma_c^* = \max(\gamma_c, 0)$. If d(p(z)) ≥ c then $\gamma_c \le 0$ and consequently p(z) is an optimal solution. In this case, the target sales c is overshot. On the other hand, if d(p(z)) < c, then $\gamma_c$ is the root of d(p(z − γ)) = c. This corresponds to using the market clearing price $p_{\min}(c)$ discussed before. In summary, the optimal price is given by $p_c(z) = \min(p(z), p_{\min}(c))$.
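A similar brute-force check applies to the sales constraint. This sketch again assumes exponential demand d(p) = λe^{−p/µ} with illustrative parameters, and compares min(p(z), p_min(c)) with a grid maximizer of (p − z)d(p) over prices p ≥ z that satisfy d(p) ≥ c. The function names are hypothetical.

```python
import math

lam, mu, z = 100.0, 5.0, 2.0   # illustrative parameters (assumptions)

def d(p):
    # Exponential demand d(p) = lam * exp(-p / mu).
    return lam * math.exp(-p / mu)

def price_with_sales_target(c):
    # min(p(z), p_min(c)), where p_min(c) = mu ln(lam / c) solves d(p) = c.
    return min(mu + z, mu * math.log(lam / c))

def price_grid(c, step=0.001, p_max=60.0):
    # Brute-force maximizer of (p - z) d(p) over p >= z subject to d(p) >= c.
    best_p, best_val = None, -math.inf
    for i in range(int((p_max - z) / step) + 1):
        p = z + i * step
        if d(p) >= c:
            val = (p - z) * d(p)
            if val > best_val:
                best_p, best_val = p, val
    return best_p

binding_c, slack_c = 50.0, 10.0   # d(p(z)) is about 24.7 for these parameters
print(price_with_sales_target(binding_c), price_grid(binding_c))
print(price_with_sales_target(slack_c), price_grid(slack_c))
```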

5 Call Options and Social Welfare

Assume that demand is d(p) = λH(p) where H(p) = P(W ≥ p). While the seller is naturally interested in maximizing r(p,z) = (p − z)d(p), a social planner may be more interested in maximizing the sum of the seller's profit r(p,z) and the consumers' surplus s(p), where

$$ s(p) = \int_p^\infty d(y)\,dy = \lambda E[(W - p)^+]. $$

The social welfare problem is to maximize

$$ w(p,z) = s(p) + r(p,z) = \lambda \bigl[ E[(W - p)^+] + (p - z)H(p) \bigr]. $$

² When capacity is ample, c − d(p(z)) units will go unsold. Any attempt to reduce the price to sell these additional units will result in lower profits.

Let w(z) = maxp≥z w(p,z). It is easy to see, by just drawing a graph of E[(W − p)+] + (p − z)H(p), that an optimal solution to the welfare problem is to set p = z, so w(z) = s(z) + r(z,z) = s(z). Unfortunately, this solution reduces the profit of the seller to zero, as r(z,z) = 0, while giving all of the surplus s(z) to the customers.

Welfare planners call dead-weight loss the difference w(z)−w(p(z),z) between the optimal social welfare and the social welfare that results when the seller maximizes his profits. We now explore a situation where the dead-weight loss can be eliminated. The situation requires the use of call options on capacity when booking and consumption are separated by time and customers have homogeneous ex-ante valuations at the time of booking. Examples include a group of homogeneous customers booking air transportation a month in advance of traveling or a single customer buying a service contract for services over a certain period of time.

Suppose there is a time separation between booking and consuming a service and that each customer has random valuation, say W, for the service at the time of consumption. We assume that customers know the distribution of W at the time of booking and learn the realization of W at the time of consumption. We assume that the distribution H(p) = P(W ≥ p) is known by the seller. Under these conditions, the seller can benefit from offering call options to consumers. A call option requires an upfront non-refundable payment x that gives the customer the non-transferable right to buy one unit of the service at price p at the time of consumption; see Gallego and Sahin [6], Png [14], Shugan and Xie [16], Xie and Shugan [18]. The special case p = 0 is called advance selling.

Customers evaluate call options by the surplus they provide. A customer who buys an (x,p) option will exercise his right to purchase one unit of the service at the time of consumption if and only if W ≥ p. By doing this, an individual customer obtains expected surplus E[(W − p)+]. Since the consumer needs to pay x for this right, the consumer receives surplus E[(W − p)+] − x. We will impose a participation constraint λ[E[(W − p)+] − x] = s(p) − λx ≥ s̃, where s̃ ≥ 0 is a lower bound on the aggregate consumer surplus.

If all customers buy the call option then the seller’s profit is given by

λ[x + (p−z)H(p)]. (7)

This consists of the revenue from the non-refundable deposit x plus the profit p−z from those customers who exercise their options.

Consider now the problem of maximizing the expression in equation (7) with respect to (x,p) subject to the surplus constraint s(p) − λx ≥ s̃. Notice that the seller may set s̃ = 0 to extract as much surplus as possible from consumers. Here we will analyze the problem for other values of s̃ to show that it is possible to eliminate the dead-weight loss and use s̃ as a mechanism to distribute profits and surplus between the seller and the consumers.

Since the objective function (7) is increasing in x, it is optimal to set λx = s(p) − s̃, so the problem reduces to that of maximizing s(p) + r(p,z) − s̃ = w(p,z) − s̃ with respect to p. We already know that w(p,z) is maximized at p = z. Thus, the solution to the provider’s problem is to set p = z and x = [s(z) − s̃]/λ, so the provider obtains profits equal to s(z) − s̃, while consumers receive surplus s̃. We now explore the range of values of s̃ that guarantees that both the seller and the consumers are at least as well off as under the solution (x,p) = (0,p(z)), where price p(z) is offered to consumers after they know their valuations. At price p(z), the provider makes profit r(z), while purchasing customers obtain aggregate surplus s(p(z)). As a result, consumers are better off whenever s̃ ≥ s(p(z)), while the seller is better off whenever s(z) − s̃ ≥ r(z), so a win-win is achieved for any value of s̃ such that s(p(z)) ≤ s̃ ≤ s(z) − r(z). Since the solution eliminates dead-weight loss, s(z) ≥ r(z) + s(p(z)), and consequently the win-win interval is non-empty. Absent competition or an external regulator, the provider may simply select s̃ = 0, to improve his profits from r(z) to w(z), extracting all consumer surplus while also capturing the dead-weight loss.

The idea of using call options can be extended to the case where the variable cost Z of providing the service at the time of consumption is random. In this case, the option can be designed by setting λx = Es(Z) − s̃ and p = Z, so that by paying x in advance the option bearer has the right to purchase one unit of the service at the random marginal cost Z. It is interesting to measure the benefits to the provider of offering call options on capacity instead of selling at p(Z) when customers already know their valuations. In essence we want to compare Es(Z) − s̃ to Er(Z). To make this a fair comparison we will set s̃ = Es(p(Z)), so that both (x,p) with λx = Es(Z) − Es(p(Z)) and p = Z, and (x,p) = (0,p(Z)) result in the same consumer surplus. However, the benefits of offering call options may be larger as a monopolist need not compete against himself and can in fact extract all surplus by setting s̃ = 0. Our next result is for exponentially distributed W with mean θ. For convenience, we will let θ∗ = θ/e.

Proposition 4 If W is exponentially distributed with mean θ and the moment generating function MZ(−1/θ) = E[e−Z/θ] < ∞, then the lift in expected profits from offering call option (Es(Z) −Es(p(Z)),Z) relative to offering call option (0,p(Z)) is 72%. Moreover, the lift in profits for a monopolist who sets s̃ = 0 is 172%.

Proof: If W is exponential with mean θ, then p(Z) = Z + θ and r(Z) = λθ∗e^(−Z/θ). Consequently, the expected profit from (0,p(Z)) is E[r(Z)] = λθ∗MZ(−1/θ). Since s(Z) = λθe^(−Z/θ) and s(p(Z)) = r(Z), it follows that the expected profit from the call option is given by λ(θ − θ∗)MZ(−1/θ), and the relative lift in profits is equal to (θ − 2θ∗)/θ∗ = e − 2 = 72%. If the seller extracts all the surplus then the relative lift in profits is e − 1 = 172%.
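The two lifts in Proposition 4 can be checked numerically. The sketch below is only an illustration: the function names, the value of θ and the list of cost realizations are hypothetical, and the expectation over Z is replaced by an average over that list (the averaging constant cancels in the ratios).

```python
import math

def surplus(p, theta, lam=1.0):
    """Aggregate surplus s(p) = lam * E[(W - p)^+] for exponential W with mean theta."""
    return lam * theta * math.exp(-p / theta)

def profit(p, z, theta, lam=1.0):
    """Profit r(p, z) = (p - z) d(p) with d(p) = lam * exp(-p / theta)."""
    return (p - z) * lam * math.exp(-p / theta)

def option_lifts(costs, theta):
    """Relative lifts over posting p(Z) = Z + theta, for equally likely cost realizations."""
    posted = sum(profit(z + theta, z, theta) for z in costs)   # sum of r(Z)
    s_cost = sum(surplus(z, theta) for z in costs)             # sum of s(Z)
    s_posted = sum(surplus(z + theta, theta) for z in costs)   # sum of s(p(Z))
    fair = (s_cost - s_posted) / posted - 1.0    # same consumer surplus as posting
    monopoly = s_cost / posted - 1.0             # all surplus extracted
    return fair, monopoly

fair, monopoly = option_lifts(costs=[0.0, 1.0, 2.5, 7.0], theta=3.0)
print(f"fair lift = {fair:.4f}, monopoly lift = {monopoly:.4f}")
```

Under these assumptions the two numbers agree with e − 2 and e − 1 regardless of the cost values chosen.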

The lift in expected profits from the exponential distribution is quite large and one may wonder whether large lifts are also possible for other distributions. It is possible to show that if d(p) = λp^(−b), then for z > 0 and b > 1, the lift in profits is at least as large as that for the exponential demand model, with the benefits converging to those of the exponential distribution as b → ∞. Consequently, the benefits are at least as large under the constant price elasticity model as under the exponential demand model. Here we show that if W has a uniform distribution, then the lift in expected profits can be up to 50%. Readers not interested in the details of the analysis can skip to the next section.


Example 2 : If W is uniformly distributed over the interval [a,b] then s(p) = E[W] − p for p < a, s(p) = 0.5(b−p)^2/(b−a) for p ∈ [a,b] and s(p) = 0 for p > b. The revenue maximizing price is p(z) = max(a, (b+z)/2) for 0 < z ≤ b. For z > b there is no demand so we will confine our analysis to z < b. Then r(z) = a − z for 0 ≤ z < (2a−b)+ and r(z) = 0.25(b−z)^2/(b−a) for (2a − b)+ ≤ z ≤ b. The expected surplus from offering price p(z) is s(p(z)) = s(a) = E[W] − a = 0.5(b−a) = 2r(a) for 0 ≤ z < (2a−b)+, and s(p(z)) = 0.125(b−z)^2/(b−a) = 0.5r(z) for (2a − b)+ ≤ z ≤ b. An (x,p) option with p = z results in surplus −x + s(z) and for this to be more attractive we need x ≤ s(z) − s(p(z)). The contract (s(z) − s(p(z)),z) results in profits s(z) − s(p(z)) = E[W] − z − E[W] + a = a − z = r(z) for z < (2a − b)+ so there is no benefit in offering contracts when z < (2a − b)+. For (2a − b)+ ≤ z ≤ a we have s(z) − s(p(z)) = E[W] − z − 0.5r(z) ≥ r(z) on account of θ(z) = E[W] − z − 1.5r(z) ≥ 0 on (2a − b)+ ≤ z ≤ a. This can be verified by checking that θ(2a − b) = 0 and θ′(z) > 0 on the interval (2a − b)+ ≤ z ≤ a. In fact at z = a we have θ(a) = E[W] − a − 1.5r(a) = 0.5r(a) so the lift from contracts lies in (0, 0.5] over the interval ((2a − b)+,a). Finally, over the interval a ≤ z ≤ b we have s(z) − s(p(z)) = 2r(z) − 0.5r(z) = 1.5r(z) so there is a 50% lift.
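The three regimes of Example 2 can be verified with a short computation. This is a sketch with hypothetical function names and a hypothetical choice a = 2, b = 3 (so that (2a − b)+ = 1), with the market size normalized to one.

```python
def surplus(p, a, b):
    """s(p) = E[(W - p)^+] for W uniform on [a, b]."""
    if p < a:
        return 0.5 * (a + b) - p
    if p <= b:
        return 0.5 * (b - p) ** 2 / (b - a)
    return 0.0

def optimal_price(z, a, b):
    """p(z) = max(a, (b + z) / 2)."""
    return max(a, 0.5 * (b + z))

def optimal_profit(z, a, b):
    """r(z) = (p(z) - z) * P(W >= p(z))."""
    p = optimal_price(z, a, b)
    tail = 1.0 if p <= a else (b - p) / (b - a)
    return (p - z) * tail

def option_lift(z, a, b):
    """Lift of the contract (s(z) - s(p(z)), z) over posting p(z)."""
    option_profit = surplus(z, a, b) - surplus(optimal_price(z, a, b), a, b)
    return option_profit / optimal_profit(z, a, b) - 1.0

a, b = 2.0, 3.0
for z in (0.5, 1.5, 2.5):   # below (2a-b)+, between (2a-b)+ and a, above a
    print(z, round(option_lift(z, a, b), 4))
```

For these values the lift is zero in the first regime, strictly between 0 and 50% in the second, and exactly 50% in the third.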

5.1 Call Options and Service Contracts

As mentioned earlier, the idea of a call option may also apply to an individual customer buying a service contract for services over a certain period of time. The contract allows the customer to pay x in advance for the right to pay the marginal cost z each time the service need arises over a certain pre-specified horizon. If the expected number of services during this period of time is λ, and each service need has random value W, then a contract of the form (x,p) = (λ(s(z) − s̃),z) may be designed, by selecting s̃, to be as attractive as offering à la carte services at p(z). In this case, obtaining the surplus from à la carte services is a bit trickier because the decision of whether or not to buy a service at price p(z) for a current service of value W may influence the need for future services. As an example, consider the problem of repair services for a certain product. If the customer declines the service at price p(z) because W < p(z), then the customer forgoes the future utility associated with this product while the service provider forgoes the opportunity to continue servicing the product. This situation forces the customer to think carefully about whether or not to pay for the service at p(z) and forces the service provider to carefully design the contracts so they are win-win.

5.2 Call Options with Bounded Capacity

Assume there is a bounded capacity c. We will assume that each customer will buy at most one call option. We will formulate the problem with the unconstrained demand function and impose a condition on the number of customers that exercise the (x,p) option at the exercise price p. Under this formulation the seller’s profit is [s(p) − s̃] + r(p,z) subject to the constraint d(p) = λH(p) ≤ c. The constraint is equivalent to p ≥ pmin(c) so an optimal solution is to set the exercise price at max(z,pmin(c)) and the option price at s(max(z,pmin(c))) − s̃. This leads to profit [s(max(z,pmin(c))) − s̃] + r(max(z,pmin(c)),z) for the seller and aggregate consumer surplus s̃. It is instructive to compare the two cases: pmin(c) ≤ z and pmin(c) > z. In the first case the capacity constraint is not relevant as d(z) = λH(z) ≤ c, so the optimal option is p = z and λx = s(z) − s̃, the profit to the seller is s(z) − s̃, and the aggregate consumer surplus is s̃. On the other hand, if pmin(c) > z then λx = s(pmin(c)) − s̃ and p = pmin(c), resulting in seller’s profit equal to s(pmin(c)) − s̃ + r(pmin(c),z) = s(pmin(c)) − s̃ + c(pmin(c) − z). It is also possible to work directly with the truncated demand function dc(p). This leads to essentially the same result but it is a bit more subtle to interpret.

6 Multiple Market Segments

Suppose we have multiple market segments with demands dm(p), m ∈ M = {1, . . . ,M}. For any subset S ⊂ M, let $d_S(p) = \sum_{m \in S} d_m(p)$ denote the aggregate demand over market segments in S and let rS(p,z) = (p − z)dS(p) denote the profit function for market segments in S when the variable cost is z, and a common price p is offered to all market segments in S. We will first deal with questions related to the existence and uniqueness of finite maximizers of rS(p,z) before exploring the use of a finite price menu of J different prices to price the M market segments.

The following result shows that dS(p) inherits some desirable properties from the individual market demand functions dm(p),m ∈ S.

Proposition 5 If dm(p) satisfies the conditions of Theorem 2 for every m ∈ S ⊂ M, then so does dS(p). Moreover, there exists a finite price pS(z), increasing in z, such that rS(z) = rS(pS(z),z) is decreasing convex in z.

Proof: Since the sum of USC functions is USC it follows that dS(p) is USC. Moreover x̄m(0) < ∞ for all m ∈ S implies that $\bar{x}_S(0) = \sum_{m \in S} \bar{x}_m(0) < \infty$. As a result dS(p) satisfies the conditions of Theorem 2 so there exists a finite price pS(z), increasing in z, such that rS(z) = rS(pS(z),z) is decreasing convex in z.

It may be tempting to conclude that under the conditions of Proposition 5 pS(z) would lie in the convex hull of {pm(z),m ∈ S}, i.e., in the interval [minm∈S pm(z), maxm∈S pm(z)]. However, Example 3 shows that this is not true.

Example 3 Suppose that d1(p) = 1 for p ≤ 10 and d1(p) = 0 for p > 10. Then r1(p, 0) is maximized at p1(0) = 10 and r1(0) = 10. Suppose that d2(p) = 1 for p ≤ 9, d2(p) = .1 for 9 < p ≤ 99 and d2(p) = 0 for p > 99. Then r2(p, 0) is maximized at p2(0) = 99 resulting in r2(0) = 9.9 and total profit equal to 19.9 if each is allowed to be priced separately. Let S = {1, 2}, then rS(p, 0) = r1(p, 0) + r2(p, 0) is maximized at pS(0) = 9 < mini∈S pi(0) resulting in rS(0) = 18.
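Example 3 can be reproduced with a brute-force search over a price grid. This is a sketch: the grid resolution and the helper name best_price are arbitrary choices made here for illustration.

```python
def d1(p):
    return 1.0 if p <= 10 else 0.0

def d2(p):
    if p <= 9:
        return 1.0
    if p <= 99:
        return 0.1
    return 0.0

def best_price(demand, prices, z=0.0):
    """Grid maximizer of r(p, z) = (p - z) * demand(p)."""
    return max(prices, key=lambda p: (p - z) * demand(p))

prices = [k / 10 for k in range(0, 1001)]          # 0.0, 0.1, ..., 100.0
p1 = best_price(d1, prices)
p2 = best_price(d2, prices)
pS = best_price(lambda p: d1(p) + d2(p), prices)
print(p1, p2, pS, pS * (d1(pS) + d2(pS)))          # 10.0 99.0 9.0 18.0
```

The common price lies strictly below both individually optimal prices, as claimed in the example.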

Since the sum of quasi-concave functions is not, in general, quasi-concave, it should not be surprising that properties of dm(p) that imply quasi-concavity of rm(p,z), for each m ∈ M, are not, in general, inherited by $d_S(p) = \sum_{m \in S} d_m(p)$. Example 4 illustrates this.


Example 4 a) Suppose that dm(p) = exp(−p/bm) for m = 1, 2 with b1 < b2. Then the hazard rate hm(p) = 1/bm, is constant, and there is a unique price pm(z) = z + bm that maximizes rm(p,z). Let S = {1, 2} and notice that the hazard rate hS(p) of dS(p) is decreasing in p.

b) Suppose that $d_m(p) = p^{-b_m}$ for some bm > 1, then phm(p) = bm and there is a unique price pm(z) = bmz/(bm − 1) that maximizes rm(p,z). However, the proportional hazard rate phS(p) of dS(p) is decreasing in p.

This state of affairs is very unsatisfying because in both cases in Example 4 the profit function rS(p,z) is actually quasi-concave, even if the aggregate demand function dS(p) has decreasing hazard rate (part a) or decreasing proportional hazard rate (part b). Some level of satisfaction may be restored if sufficient conditions can be found so that rS(p,z) has a finite bounded maximizer. Here we present such conditions.

Theorem 4 Assume that the hazard rate hm(p) is continuous in p and there is a finite root pm(z) of fm(p) = 1 − (p − z)hm(p) = 0 for each m ∈ S. Assume further that phm(p) is increasing for each m ∈ S. Then rS(p,z) has a finite maximizer in the convex-hull of {pm(z),m ∈ S}.

Proof: It is easy to see that pm(z) > z is the root of $p/(p-z) = p\,h_m(p)$. Since the left hand side is decreasing in p and phm(p) is increasing in p, it follows that there is a unique root p > z. This implies that fm(p) > 0 on p < pm(z) and fm(p) < 0 on p > pm(z). Let fS(p) = 1 − (p − z)hS(p) where hS(p) is the hazard rate of dS(p). Since fS(p) is a convex combination of fm(p) = 1 − (p − z)hm(p) with weights θm(p) = dm(p)/dS(p), it follows that fS(p) > 0 for all p < minm∈S pm(z) because over that interval fm(p) > 0 for all m ∈ S. Also fS(p) < 0 for all p > maxm∈S pm(z) because over that interval fm(p) < 0 for all m ∈ S. Since the derivative of rS(p,z) is proportional to fS(p) it follows that rS(p,z) is increasing over p < minm∈S pm(z) and decreasing over p > maxm∈S pm(z). Moreover, rS(p,z) is continuous over the closed and bounded interval [minm∈S pm(z), maxm∈S pm(z)], so an appeal to the extreme value theorem (EVT) yields the existence of a global maximizer pS(z) of rS(p,z) in that interval.

Corollary 4 Theorem 4 holds if hm(p) is increasing in p for all m ∈ S.

The Corollary follows since then phm(p) is increasing in p for all m ∈ S.
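A small numerical check of Theorem 4 for two exponential segments, whose hazard rates are constant so that phm(p) is increasing. The segment parameters, the grid search and the helper names are assumptions made for illustration only.

```python
import math

def aggregate_profit(p, z, segments):
    """r_S(p, z) for demands d_m(p) = a_m * exp(-p / b_m)."""
    return (p - z) * sum(a * math.exp(-p / b) for a, b in segments)

def grid_argmax(f, lo, hi, n=20000):
    """Maximizer of f over an evenly spaced grid on [lo, hi]."""
    step = (hi - lo) / n
    return max((lo + k * step for k in range(n + 1)), key=f)

segments = [(1.0, 2.0), (1.0, 5.0)]     # hypothetical (a_m, b_m) pairs
z = 1.0
p_low = z + min(b for _, b in segments)     # smallest p_m(z) = z + b_m
p_high = z + max(b for _, b in segments)    # largest p_m(z)
p_S = grid_argmax(lambda p: aggregate_profit(p, z, segments), z, 10.0 * p_high)
print(p_low, round(p_S, 3), p_high)
```

The grid maximizer falls between the two segment-optimal prices, even though the hazard rate of the aggregate demand is decreasing.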

6.1 Pricing with Finite Price Menus

Consider now the situation where it is possible to use third degree price discrimination so that a different price can be used for each market segment m ∈ M without worrying about incentive compatibility. This situation arises when it is possible to vary price by time, location or customer attributes without cannibalizing demand from other market segments. We will embed this problem as part of a more general problem where we are allowed a price menu that consists of at most J ≤ M different prices. The use of a finite price menu J < M may result from constraints in pricing flexibility or because the demand functions of some of the market segments are not known with sufficient accuracy. We will assume that the demand functions dm(p), m ∈ M belong to the same family. By this we mean that dm(p) = λmHm(p), m ∈ M and the tail distributions Hm(p) = P(Vm ≥ p), m ∈ M differ only in their parameters. Examples of families of demand functions include linear, log-linear, CES, and logit, among others. We will assume that the profit function rm(p,z) = (p − z)dm(p) is quasi-concave for each m and that there is a unique finite maximizer pm(z) for each m ∈ M. We will assume that the market segments are ordered so that p1(z) ≤ . . . ≤ pM (z). Finally, we will assume that for any S ⊂ M, the profit function rS(p,z) has a finite maximizer in the interval [minm∈S pm(z), maxm∈S pm(z)], as guaranteed under the conditions of Theorem 4.

The extreme cases are J = 1 where a single price is used for all market segments and J = M where each market segment can be individually priced. In practice, one seldom has the freedom or sufficiently detailed knowledge to use J = M prices, particularly if M is large. In this section we solve to optimality the case J = M assuming detailed knowledge of the demand functions. In addition, we develop heuristics for J ∈ {1, . . . ,M − 1} that are robust to possible misspecification of demand functions dm(p), m ∈ M. If J = M the problem is to separately select prices pm, m ∈ M to maximize $\sum_{m \in \mathcal{M}} r_m(p_m,z)$. This problem has a trivial solution, namely to price market segment m at pm(z), m ∈ M, so the optimal profit is given by

$$R_M(z) = \sum_{m \in \mathcal{M}} r_m(z).$$

Since each rm(z) is decreasing convex in z it follows that RM (z) is decreasing convex in z. RM (z) will serve as a benchmark upper bound against which we will measure heuristics when the price menu allows only J < M prices.

Since we will be using heuristic prices, it is convenient to have a measure of how efficient it is to use price p instead of pm(z) for market segment m. This motivates defining the relative efficiency of price p instead of pm(z) for market segment m when the unit cost is z as

$$e_m(p,p_m(z),z) = \frac{r_m(p,z)}{r_m(z)}. \tag{8}$$

Notice that em(p,pm(z),z) ≤ 1, em(p,pm(z),z) reaches maximum efficiency at p = pm(z), and decays in both directions as a result of our quasi-concavity assumption. It is possible to find closed form formulas for em(p,pm(z),z) for many families of demand functions including linear, log-linear and CES. There are distributions that do not admit closed form expressions for em(p,pm(z),z), but the results that we derive here can still be applied to them numerically. The relative efficiencies of prices will help us deal with situations where we may not know the exact parameters of some of the market segments.

We will be particularly interested in families of demands for which em(p,pm(z),z) is independent of m, i.e., that the functional form of e does not depend on the market segment. The following result confirms that em is independent of m for the linear, for the log-linear and for the logit demand functions.

Lemma 2 For the linear demand function dm(p) = am − bmp,

$$e(p,p_m(z),z) = \frac{p-z}{p_m(z)-z}\left(2 - \frac{p-z}{p_m(z)-z}\right), \tag{9}$$

for the log-linear demand function dm(p) = am exp(−p/bm),

$$e(p,p_m(z),z) = \frac{p-z}{p_m(z)-z}\exp\left(1 - \frac{p-z}{p_m(z)-z}\right), \tag{10}$$

and for the logit demand function $d_m(p) = \lambda_m e^{a_m-p}/(1+e^{a_m-p})$,

$$e(p,p_m(z),z) = \frac{p-z}{p_m(z)-z+(e^{p-p_m(z)}-1)}. \tag{11}$$

Proof: For the linear demand function d(p) = a − bp, p(z) − z = (a − bz)/2b. Since a − bp(z) = b(p(z) − z) it follows that $r(z) = b(p(z)-z)^2$. Therefore

$$e(p,p(z),z) = \frac{(a-bp)(p-z)}{b(p(z)-z)^2}.$$

Then (9) follows from (a − bp) = 2b(p(z) − z) − b(p − z) since 2b(p(z) − z) = a − bz.

For the log-linear demand function $d(p) = ae^{-p/b}$, p(z) = z + b, so $d(p(z)) = e^{-1}d(z)$ and $r(z) = be^{-1}d(z)$. On the other hand, $r(p,z) = (p-z)e^{-(p-z)/b}d(z)$. As a result,

$$e(p,p(z),z) = \frac{p-z}{p(z)-z}\exp\{(p(z)-z)/b - (p-z)/b\}.$$

The result (10) follows since b = p(z) − z.

For the logit demand function $d(p) = e^{a-p}/(1+e^{a-p})$, p(z) is the root of the equation $p - z = 1 + e^{a-p}$, so $r(z) = e^{a-p(z)} = p(z)-z-1$. Consequently, the ratio r(p,z)/r(z) can be written as (p − z)/[(p(z) − z − 1)/d(p)] and the result follows if we can show that $(p(z)-z-1)/d(p) = p(z)-z-1+e^{p-p(z)}$. But this is equivalent to showing that $(p(z)-z-1)/e^{a-p} = e^{p-p(z)}$, or equivalently $p(z)-z-1 = e^{a-p(z)}$. But we know this to be true since $r(z) = e^{a-p(z)} = p(z)-z-1$.

Notice that in the first two cases what is important is the markup ratio (p−z)/(p(z)−z). On occasions we will write e(p,q,z) and this should be interpreted as the efficiency of using price p when q is optimal, so for example, e(p,q,z) = (p−z)/(q −z)[2 − (p−z)/(q −z)] for the linear demand model.
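The closed forms (9), (10) and (11) can be compared against the ratio r(p,z)/r(z) computed directly from the demand functions. The following sketch uses hypothetical parameter values, and the bisection routine for the logit optimal price is an illustrative choice rather than a method taken from the text.

```python
import math

def eff_linear(p, q, z):
    """Equation (9): efficiency of price p when q = p_m(z) is optimal."""
    m = (p - z) / (q - z)
    return m * (2.0 - m)

def eff_loglinear(p, q, z):
    """Equation (10)."""
    m = (p - z) / (q - z)
    return m * math.exp(1.0 - m)

def eff_logit(p, q, z):
    """Equation (11)."""
    return (p - z) / (q - z + math.exp(p - q) - 1.0)

def logit_optimal_price(a, z):
    """Bisection for the root of p - z = 1 + exp(a - p)."""
    lo, hi = z + 1.0, z + 1.0 + math.exp(a - z)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid - z - 1.0 - math.exp(a - mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def direct_ratio(demand, p, q, z):
    """r(p, z) / r(q, z) computed from the demand function itself."""
    return (p - z) * demand(p) / ((q - z) * demand(q))

# Hypothetical parameters.
z, p = 1.0, 2.5
lin_a, lin_b = 10.0, 2.0
log_a, log_b = 10.0, 2.0
logit_a = 3.0

linear = lambda x: lin_a - lin_b * x
loglinear = lambda x: log_a * math.exp(-x / log_b)
logit = lambda x: math.exp(logit_a - x) / (1.0 + math.exp(logit_a - x))

q_lin = (lin_a + lin_b * z) / (2.0 * lin_b)
q_log = z + log_b
q_logit = logit_optimal_price(logit_a, z)

print(eff_linear(p, q_lin, z), direct_ratio(linear, p, q_lin, z))
print(eff_loglinear(p, q_log, z), direct_ratio(loglinear, p, q_log, z))
print(eff_logit(p, q_logit, z), direct_ratio(logit, p, q_logit, z))
```

Each printed pair should coincide up to rounding, and none of the three efficiency functions depends on the scale parameters of the segment.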


The following result will be helpful in establishing our results.

Lemma 3 Suppose q1 < q2 and q ∈ (q1,q2) is selected so that e(q,q1,z) = e(q,q2,z), then e(q,p,z) ≥ e(q,q1,z) = e(q,q2,z) for all p ∈ (q1,q2).

Proof: Recall that e(q,p,z) deteriorates as q gets further from p in either direction. If p ∈ (q1,q) then e(q,p,z) > e(q,q1,z) as q is closer to p than to q1. On the other hand, if p ∈ (q,q2) then e(q,p,z) > e(q,q2,z) as q is closer to p than to q2.

We will now provide a bound when only one price is allowed for all of the market segments. We will make use of Lemma 3 to lower bound the ratio R1(z)/RM (z) where for J = 1 we write R1(z) = rM(z) as the maximum profit when all market segments are priced at pM(z).

Theorem 5 Assume that the functions rm(p,z) are quasi-concave and each has a unique finite maximizer pm(z). Suppose that the market segments are indexed so that pm(z) is increasing in m ∈M. Assume that em(αpm(z),pm(z),z),m ∈M is independent of m ∈M for all α > 0. Let q1 be the root of

e(q,p1(z),z) = e(q,pM (z),z) (12)

and let γ1(z) = e(q1,p1(z),z) = e(q1,pM (z),z) be the efficiency of using q1 for market segments 1 and M. Then

$$\frac{R_1(z)}{R_M(z)} \ge \frac{r_{\mathcal{M}}(q_1,z)}{R_M(z)} \ge \gamma_1(z).$$

Proof: Assume p1(z) and pM (z) are respectively the smallest and the largest optimal prices for the M market segments. Let q1 be the root of e(q,p1(z),z) = e(q,pM (z),z). Then, by Lemma 3 we know that e(q1,pm(z),z) ≥ γ1(z) for all m = 2, . . . ,M − 1. From this it follows that

$$\frac{R_1(z)}{R_M(z)} \ge \frac{r_{\mathcal{M}}(q_1,z)}{R_M(z)} = \sum_{m \in \mathcal{M}} e(q_1,p_m(z),z)\,\frac{r_m(z)}{R_M(z)} \ge \sum_{m \in \mathcal{M}} \gamma_1(z)\,\frac{r_m(z)}{R_M(z)} = \gamma_1(z).$$

Notice that Theorem 5 does not require precise knowledge of the demand functions dm(p) other than knowing that pm(z) ∈ [p1(z),pM (z)]. Without detailed knowledge of the demand functions dm(p), m ∈ {2, . . . ,M − 1} it is not possible to find RM (z) or even R1(z). However, it is possible to find q1, the root of equation (12). Theorem 5 guarantees that pricing all segments at q1 is not too far from optimal when p1(z) and pM (z) are not too far apart. Moreover, the actual performance rM(q1,z)/RM (z) can be significantly better than the lower bound γ1(z). Closed form expressions for γ1(z) will be presented shortly for the linear and log-linear demand functions after we generalize Theorem 5 to J > 1.

We will now define RJ(z), the maximum expected revenue that we can obtain if we are allowed to use up to J different prices for J ∈ {2, . . . ,M − 1}. Fix 1 < J < M and consider any partition S1, . . . ,SJ of M such that $\cup_{j=1}^{J} S_j = \mathcal{M}$ and Si ∩ Sj = ∅ for i ≠ j. Let

$$r_{S_j}(z) = \sup_{p \ge z} \sum_{m \in S_j} r_m(p,z)$$

and let

$$R_J(z|S_1,\ldots,S_J) = \sum_{j=1}^{J} r_{S_j}(z).$$

Optimizing over the partitions we obtain

$$R_J(z) = \max_{S_1,\ldots,S_J} R_J(z|S_1,\ldots,S_J)$$

where the maximum is taken over all partitions of M into J mutually exclusive and collectively exhaustive subsets. Notice that finding RJ(z) can be difficult as there is a combinatorial number of possible partitions of M. Moreover, solving for RJ(z) requires precise knowledge of all of the demand functions dm(p), m ∈ M.

To extend the heuristic for J > 1 we proceed as follows: Select break-points p1(z) = s0 < s1 < s2 . . . < sJ−1 < sJ = pM (z) and prices qj ∈ (sj−1,sj) such that e(qj,sj−1,z) = e(qj,sj,z) for each j and the efficiencies e(qj,sj,z) are independent of j. More precisely, the sjs and qjs are selected so that

e(qj,sj−1,z) = e(qj,sj,z) for all j = 1, . . . ,J (13)

and e(q1,s1,z) = e(q2,s2,z) = . . . = e(qJ,sJ,z). (14)

Let γJ(z) = e(q1,s1,z) and define the sets Mj = {m : pm(z) ∈ [sj−1,sj)} for j = 1, . . . ,J − 1 and MJ = {m : pm(z) ∈ [sJ−1,sJ]}. Notice that the qjs and sjs are independent of the precise specification of dm(p),m = 2, . . . ,M −1 and consequently γJ(z) is also independent of the intermediate demands. However, identifying the sets Mj,j = 1, . . . ,J does require some knowledge of the intermediate demand functions in the sense that we need to identify the subset Mj to which each pm(z) belongs.

Theorem 6 Under the assumptions of Theorem 5, offering price qj to all segments in Mj for j = 1, . . . ,J results in

$$\frac{R_J(z)}{R_M(z)} \ge \gamma_J(z).$$


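Proof: Clearly

$$\frac{R_J(z)}{R_M(z)} \ge \frac{\sum_{j=1}^{J}\sum_{m \in M_j} r_m(q_j,z)}{R_M(z)} = \sum_{j=1}^{J}\sum_{m \in M_j} e(q_j,p_m(z),z)\,\frac{r_m(z)}{R_M(z)} \ge \sum_{j=1}^{J}\sum_{m \in M_j} \gamma_J(z)\,\frac{r_m(z)}{R_M(z)} = \gamma_J(z)\sum_{j=1}^{J}\sum_{m \in M_j} \frac{r_m(z)}{R_M(z)} = \gamma_J(z).$$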

We now illustrate the lower bounds for a variety of demand functions.

6.2 Linear Demand Functions

Consider linear demand functions dm(p) = am − bmp, m = 1, . . . ,M. Then pm(z) = (am + bmz)/2bm and

$$e(p,p_m(z),z) = \frac{p-z}{\Delta_m(z)}\left(2 - \frac{p-z}{\Delta_m(z)}\right)$$

where ∆m(z) = pm(z) − z. Let

$$s_j = z + \Delta_1^{1-j/J}(z)\,\Delta_M^{j/J}(z), \quad j = 0, 1, \ldots, J \tag{15}$$

and prices

$$q_j = z + \frac{2\,\Delta_1^{1-(j-1)/J}(z)\,\Delta_M^{j/J}(z)}{\Delta_1^{1/J}(z) + \Delta_M^{1/J}(z)}, \quad j = 1, \ldots, J. \tag{16}$$

Proposition 6 Equations (15,16) are roots of equations (13, 14). Moreover,

$$\gamma_J(z) = \frac{4\,\Delta_1^{1/J}(z)\,\Delta_M^{1/J}(z)}{\left(\Delta_1^{1/J}(z) + \Delta_M^{1/J}(z)\right)^2}. \tag{17}$$

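Proof: To show that $e(q_j,s_j,z) = \frac{q_j-z}{s_j-z}\left(2 - \frac{q_j-z}{s_j-z}\right) = \gamma_J(z)$ for each j = 1, . . . ,J, first notice that

$$\frac{q_j-z}{s_j-z} = \frac{2\,\Delta_1^{1/J}(z)}{\Delta_1^{1/J}(z) + \Delta_M^{1/J}(z)}, \quad j = 1, \ldots, J,$$

and that

$$2 - \frac{q_j-z}{s_j-z} = \frac{2\,\Delta_M^{1/J}(z)}{\Delta_1^{1/J}(z) + \Delta_M^{1/J}(z)}, \quad j = 1, \ldots, J,$$

so

$$e(q_j,s_j,z) = \frac{q_j-z}{s_j-z}\left(2 - \frac{q_j-z}{s_j-z}\right) = \frac{4\,\Delta_1^{1/J}(z)\,\Delta_M^{1/J}(z)}{\left(\Delta_1^{1/J}(z) + \Delta_M^{1/J}(z)\right)^2}.$$

To show that e(qj,sj−1,z) = γJ(z) for all j notice that

$$\frac{q_j-z}{s_{j-1}-z} = \frac{2\,\Delta_M^{1/J}(z)}{\Delta_1^{1/J}(z) + \Delta_M^{1/J}(z)}, \quad j = 1, \ldots, J,$$

and that

$$2 - \frac{q_j-z}{s_{j-1}-z} = \frac{2\,\Delta_1^{1/J}(z)}{\Delta_1^{1/J}(z) + \Delta_M^{1/J}(z)}, \quad j = 1, \ldots, J,$$

so

$$e(q_j,s_{j-1},z) = \frac{q_j-z}{s_{j-1}-z}\left(2 - \frac{q_j-z}{s_{j-1}-z}\right) = \frac{4\,\Delta_1^{1/J}(z)\,\Delta_M^{1/J}(z)}{\left(\Delta_1^{1/J}(z) + \Delta_M^{1/J}(z)\right)^2}.$$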
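The break-points (15), prices (16) and bound (17) for linear demand are easy to compute. The sketch below uses hypothetical function names and takes the smallest and largest optimal markups as inputs; the numbers in the usage lines are illustrative.

```python
def eff_linear(p, q, z):
    """Efficiency (9) of price p when q is the optimal price."""
    m = (p - z) / (q - z)
    return m * (2.0 - m)

def linear_menu(z, delta_1, delta_M, J):
    """Break-points s_0..s_J, prices q_1..q_J and bound gamma_J for linear demand."""
    root_1 = delta_1 ** (1.0 / J)
    root_M = delta_M ** (1.0 / J)
    s = [z + delta_1 ** (1.0 - j / J) * delta_M ** (j / J) for j in range(J + 1)]
    q = [z + 2.0 * delta_1 ** (1.0 - (j - 1) / J) * delta_M ** (j / J) / (root_1 + root_M)
         for j in range(1, J + 1)]
    gamma = 4.0 * root_1 * root_M / (root_1 + root_M) ** 2
    return s, q, gamma

# Hypothetical inputs: unit cost 180 with markups 10 and 32.5, two prices.
s, q, gamma = linear_menu(z=180.0, delta_1=10.0, delta_M=32.5, J=2)
for j in range(1, 3):
    left = eff_linear(q[j - 1], s[j - 1], 180.0)
    right = eff_linear(q[j - 1], s[j], 180.0)
    print(j, round(q[j - 1], 2), round(left, 4), round(right, 4))
print(round(gamma, 4))
```

Each price qj is equally efficient at both ends of its interval, and that common efficiency equals γJ(z).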

The results for the linear demand function for z = 0 and J = 1 first appeared in Gallego and Queyranne [4].

One may wonder how large J needs to be to achieve γJ(z) ≥ 1−α for some pre-specified α and given ∆1(z), ∆M (z). The following corollary answers this question and Table 1 illustrates the results for a range of values of α and of the ratio ∆M (z)/∆1(z).

Corollary 5 Let a(z) = ∆M (z)/∆1(z) and $w(\alpha) = (1+\sqrt{\alpha})^2/(1-\alpha)$. If J is an integer greater than or equal to ln(a(z))/ ln(w(α)), then γJ(z) ≥ 1 − α.

Proof: Let $a_J(z) = a(z)^{1/J}$. Then $\gamma_J(z) = 4a_J(z)/(1+a_J(z))^2$, which is decreasing in $a_J(z) \ge 1$. Notice that w(α) is a solution to the equation $4w/(1+w)^2 = 1-\alpha$. Thus γJ(z) ≥ 1 − α whenever $a_J(z) \le w(\alpha)$, or equivalently whenever $a(z) \le w(\alpha)^J$. Solving for J gives the result.
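Corollary 5 translates directly into a small calculation. The sketch below is illustrative: the function names are hypothetical, and the menu size is floored at one price.

```python
import math

def w(alpha):
    """Root of 4w / (1 + w)^2 = 1 - alpha with w >= 1."""
    return (1.0 + math.sqrt(alpha)) ** 2 / (1.0 - alpha)

def gamma_linear(ratio, J):
    """Bound (17) written in terms of the markup ratio Delta_M / Delta_1."""
    a_J = ratio ** (1.0 / J)
    return 4.0 * a_J / (1.0 + a_J) ** 2

def smallest_menu(ratio, alpha):
    """Smallest integer J with J >= ln(ratio) / ln(w(alpha)), and at least 1."""
    return max(1, math.ceil(math.log(ratio) / math.log(w(alpha))))

for ratio, alpha in ((2.0, 0.10), (25.0, 0.01)):
    J = smallest_menu(ratio, alpha)
    print(ratio, alpha, J, round(gamma_linear(ratio, J), 4))
```

For a markup ratio of 2 and a 90% target this gives J = 2, and for a ratio of 25 and a 99% target it gives J = 17.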

                  ∆M (z)/∆1(z)
1 − α   w(α)      2     5    10    25
90%     1.92      2     3     4     5
93%     1.75      2     3     5     6
95%     1.58      2     4     6     8
98%     1.38      3     6     8    11
99%     1.22      4     9    12    17

Table 1: Smallest J such that γJ(z) ≥ 1 −α

From Table 1 we see that if the markup ratio ∆M (z)/∆1(z) = (pM (z) − z)/(p1(z) − z) = 2 we need only J = 2 to achieve an effectiveness of 95% regardless of the number of products M. If the markup ratio is 5 then J = 6 is enough to guarantee an effectiveness of 98%. The following example illustrates the lower bounds for a set of 10 products with linear demands as well as the actual performance of the heuristic for J = 1.


Example 5 Suppose that M = 10 with market sizes 100, 200, 300, 400, 500, 500, 400, 300, 200, 100, each with uniform willingness to pay functions U[Am,Am + 100] with Am = 100 + 5(m−1),m = 1, . . . , 10. Table 2 reports q1, γ1(z) and the actual performance rM(q1,z)/RM (z) of the heuristic. Table 3 reports the improvements on the efficiency lower bound as we enlarge the menu J. Recall that the results from the table are lower bounds on performance whereas the actual realization from a limited price menu can be significantly higher than the lower bound.

z      q1(z)      γ1(z)   rM(q1,z)/RM (z)
0      $110.11    99%     100%
50     $134.78    98%     100%
100    $159.18    97%     99%
120    $168.78    95%     99%
140    $178.18    93%     98%
160    $187.20    87%     95%
180    $195.29    72%     86%

Table 2: Prices, Lower Bounds and Actual Performance for Example 5.
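The entries of Table 2 can be approximated with a few lines of code. This sketch makes a simplifying assumption that is not stated in the example: each demand is treated as linear over the whole price range, dm(p) = λm(Am + 100 − p)/100, which ignores the truncation of demand at the market size λm for prices below Am. The function name is hypothetical.

```python
sizes = [100, 200, 300, 400, 500, 500, 400, 300, 200, 100]   # market sizes
lows = [100 + 5 * m for m in range(10)]                      # A_m = 100 + 5(m - 1)

def single_price_report(z):
    """Return (q1, gamma_1, achieved / optimal profit) under the linear assumption."""
    slopes = [lam / 100.0 for lam in sizes]              # b_m
    markups = [(A + 100.0 - z) / 2.0 for A in lows]      # Delta_m(z) = p_m(z) - z
    d1, dM = markups[0], markups[-1]
    x = 2.0 * d1 * dM / (d1 + dM)                        # common markup q1 - z
    gamma = 4.0 * d1 * dM / (d1 + dM) ** 2               # bound (17) with J = 1
    optimal = sum(b * d ** 2 for b, d in zip(slopes, markups))             # sum of r_m(z)
    achieved = sum(b * (2.0 * d - x) * x for b, d in zip(slopes, markups)) # sum of r_m(q1, z)
    return z + x, gamma, achieved / optimal

for z in (0, 100, 180):
    q1, gamma, performance = single_price_report(z)
    print(z, round(q1, 2), round(gamma, 2), round(performance, 2))
```

Under this assumption the price and bound at z = 180 come out at about 195.29 and 72%, with an actual performance of about 86%.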

z      γ1(z)   γ2(z)   γ3(z)   γ4(z)   γ5(z)
0      99%     100%    100%    100%    100%
50     98%     100%    100%    100%    100%
100    97%     99%     100%    100%    100%
120    95%     99%     99%     100%    100%
140    93%     98%     99%     100%    100%
160    87%     97%     98%     99%     99%
180    72%     92%     96%     98%     99%

Table 3: Efficiency Lower Bounds: J ∈{1, . . . , 5}, Example 5

Notice that the lower bound γJ(z) deteriorates as z increases and improves as J increases, and even for fairly high values of z, it is possible to obtain reasonably high lower bounds with J = 3 or J = 4. For most demand models the contribution margins (pm(z) − z)/pm(z) go down as the unit cost z increases. The behavior of the lower bound indicates that as margins become thinner it becomes more important to have more pricing flexibility. In other words, higher marginal costs require a higher J to achieve near optimality. In the context of Revenue Management this suggests that a rich fare menu is more important when capacity is scarce than when it is ample.

Remark: Sometimes it is possible to improve on the performance of a limited price menu by giving up on the lower market segments. For example, for J = 1 and z = 180 the profit from market segment 1 is less than 1% of the total. This suggests we can do better by dropping the effort to keep the relative efficiency of market segment 1 high. If we use the single price

$$q_1'(z) = z + \frac{2\,\Delta_2(z)\,\Delta_M(z)}{\Delta_2(z) + \Delta_M(z)} = \$198.06$$

to control the efficiency of markets 2 through 10, the performance for J = 1 improves from 86% to 91.5% even though the efficiency of market segment 1 drops significantly.


6.3 Log-Linear Demand Functions

The family of log-linear, or exponential, demand functions is of the form dm(p) = am exp(−p/bm). The maximizer of rm(p,z) is given by pm(z) = z + bm and the efficiency function is given by

$$e(p,p_m(z),z) = \frac{p-z}{p_m(z)-z}\exp\left\{1 - \frac{p-z}{p_m(z)-z}\right\}.$$

Let

$$s_j = z + b_1u^{j/J}, \quad j = 0, 1, \ldots, J \tag{18}$$

and prices

$$q_j = z + b_1u^{j/J}U_J, \quad j = 1, \ldots, J, \tag{19}$$

where $u = b_M/b_1$ and $U_J = \dfrac{\ln(u)}{J(u^{1/J}-1)}$.

Proposition 7 Equations (18,19) are roots of equations (13, 14). Moreover,

$$\gamma_J = U_Je^{1-U_J}. \tag{20}$$

Proof: To show that $e(q_j,s_j,z) = \frac{q_j-z}{s_j-z}\exp\left(1 - \frac{q_j-z}{s_j-z}\right) = \gamma_J(z)$ for each j = 1, . . . ,J, first notice that

$$\frac{q_j-z}{s_j-z} = U_J,$$

so $e(q_j,s_j,z) = U_Je^{1-U_J} = \gamma_J(z)$.

Notice that

$$\frac{q_j-z}{s_{j-1}-z} = u^{1/J}U_J,$$

so to show e(qj,sj−1,z) = e(qj,sj,z) it is enough to show that $u^{1/J}U_Je^{1-u^{1/J}U_J} = U_Je^{1-U_J}$. This is equivalent to showing that $\ln(u^{1/J}) = U_J(u^{1/J}-1)$, which is true because $U_J(u^{1/J}-1) = \ln(u)/J = \ln(u^{1/J})$.
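The log-linear break-points (18), prices (19) and bound (20) can be computed as follows. This is a sketch with hypothetical function names; it assumes bM > b1 so that u > 1, and the numbers in the usage lines are illustrative.

```python
import math

def eff_loglinear(p, q, z):
    """Efficiency of price p when q is optimal, for log-linear demand."""
    m = (p - z) / (q - z)
    return m * math.exp(1.0 - m)

def loglinear_menu(z, b_1, b_M, J):
    """Break-points s_0..s_J, prices q_1..q_J and bound gamma_J; requires b_M > b_1."""
    u = b_M / b_1
    U = math.log(u) / (J * (u ** (1.0 / J) - 1.0))
    s = [z + b_1 * u ** (j / J) for j in range(J + 1)]
    q = [z + b_1 * u ** (j / J) * U for j in range(1, J + 1)]
    return s, q, U * math.exp(1.0 - U)

# Hypothetical inputs: b_1 = 50, b_M = 140, unit cost 0.
for J in (1, 2, 3):
    s, q, gamma = loglinear_menu(0.0, 50.0, 140.0, J)
    print(J, [round(x, 2) for x in q], round(gamma, 4))
```

As with the linear case, each qj is equally efficient at sj−1 and sj, and the bound does not depend on z.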

Notice that unlike the linear demand function, for log-linear demand functions γJ is independent of z. However, just like the linear demand function the lower bound improves with J. On the other hand, γJ deteriorates as u = bM/b1 increases.

One may wonder how large J needs to be to achieve γJ ≥ 1 −α for some pre-specified α and given ∆1(z), ∆M (z). The following corollary answers this question and Table 4 illustrates the results for a range of values of α and of the ratio ∆M (z)/∆1(z).

Corollary 6 Let w(α) be the root of ln(a)/(a − 1) = 1 − α and let u = bM/b1. If J is an integer greater than or equal to ln(u)/ ln(w(α)), then γJ ≥ 1 − α.


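Proof: Let $u_J = u^{1/J}$. Then $U_J = \ln(u_J)/(u_J-1)$, which is decreasing in $u_J$, so setting $u_J = u^{1/J} = w(\alpha)$, solving for J and rounding up achieves $U_J \ge 1-\alpha$. Since $U_J \le 1$, it follows from (20) that $\gamma_J = U_Je^{1-U_J} \ge U_J \ge 1-\alpha$.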

                  ∆M (z)/∆1(z)
1 − α   w(α)      2     5    10    25
90%     1.92      2     4     6     8
93%     1.75      2     5     7     9
95%     1.58      2     6     8    11
98%     1.38      4     9    12    17
99%     1.22      6    12    17    24

Table 4: Smallest J such that γJ(z) ≥ 1 −α

Example 6 Suppose that M = 10 with log-linear demand functions with parameters a1, . . . ,aM given by 100, 200, 300, 400, 500, 500, 400, 300, 200, 100, respectively, and with parameters bm = 50 + 10(m − 1), m = 1, . . . , 10. Table 5 reports q1, γ1 and the actual performance rM(q1,z)/RM (z) of a common pricing policy for a range of values of z. Notice that in this case γ1 is independent of z and that the actual performance is significantly better than the lower bound but does deteriorate slowly with z.

Table 6 reports the improvements on the efficiency lower bound as we enlarge the menu J for different values of u. The key observation is that for log-linear demand functions pricing flexibility is important when u is large, but just a little flexibility can result in a fairly high lower bound on efficiency, with the true performance of the system likely to be significantly better.

| z | q1(z) | γ1 | rM(q1,z)/RM(z) |
|---|---|---|---|
| $0.00 | $80.08 | 88% | 96% |
| $50.00 | $130.08 | 88% | 96% |
| $100.00 | $180.08 | 88% | 96% |
| $150.00 | $230.08 | 88% | 95% |
| $200.00 | $280.08 | 88% | 95% |
| $250.00 | $330.08 | 88% | 95% |

Table 5: Efficiency Lower Bounds and Actual Performance J = 1
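The q1(z) and γ1 columns of Table 5 can be reproduced with a few lines of code. The sketch below is illustrative only and rests on two assumptions that this section does not restate: that for J = 1 the upper breakpoint is s1 = pM(z), and that the optimal markup of segment M is pM(z) − z = bM. Under those assumptions the relation (q1 − z)/(s1 − z) = U1 from the proof of Proposition 7 gives q1 = z + U1·bM. The function name `common_price_loglinear` is hypothetical.

```python
import math


def common_price_loglinear(z: float, b_min: float, b_max: float):
    """Common price q1 and lower bound gamma_1 for J = 1.

    Assumes (not stated in this section) that the optimal markup of the
    segment with parameter b_max equals b_max, so s_1 - z = b_max.
    """
    u = b_max / b_min
    U1 = math.log(u) / (u - 1.0) if u != 1.0 else 1.0
    q1 = z + U1 * b_max              # from (q1 - z) / (s1 - z) = U1
    gamma1 = U1 * math.exp(1.0 - U1)  # equation (20) with J = 1
    return q1, gamma1


# Example 6: b_1 = 50 and b_10 = 50 + 10 * 9 = 140.
for z in (0.0, 50.0, 100.0):
    q1, g1 = common_price_loglinear(z, 50.0, 140.0)
    print(f"z = {z:7.2f}  q1 = {q1:7.2f}  gamma1 = {g1:.0%}")
```

The markup q1 − z does not depend on z in this sketch, which is consistent with γ1 being constant across the rows of Table 5.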

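| u | γ1 | γ2 | γ3 | γ4 | γ5 |
|---|---|---|---|---|---|
| 1 | 100% | 100% | 100% | 100% | 100% |
| 2 | 94% | 99% | 99% | 100% | 100% |
| 3 | 86% | 96% | 98% | 99% | 99% |
| 4 | 79% | 94% | 97% | 99% | 99% |
| 5 | 73% | 92% | 96% | 98% | 99% |

Table 6: Efficiency Lower Bounds: J ∈ {1, . . . , 5}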


6.4 Logit Demand Model

Consider the logit demand functions $d_m(p) = \lambda_m H_m(p)$, where $H_m(p) = e^{\alpha_m - p}/(1 + e^{\alpha_m - p})$ denotes the probability that a customer will select a product of quality $\alpha_m$ at price $p$ over a no-purchase alternative under the logit model. This model arises when the utility of the product is $\alpha_m - p + \epsilon$ and the no-purchase alternative has utility $\epsilon'$, where $\epsilon$ and $\epsilon'$ are both standard Gumbel random variables. It is easy to see that $p_m(z)$, the maximizer of $r_m(p,z)$, is the unique root of $p = z + 1 + e^{\alpha_m - p}$ and the efficiency function is given by

$$e(p, p_m(z), z) = \frac{p - z}{p_m(z) - z + (e^{p - p_m(z)} - 1)}.$$

The key to showing the form of $e(p, p_m(z), z)$ is that $r_m(z) = p_m(z) - z - 1 = e^{\alpha_m - p_m(z)} = H_m(p_m(z))/(1 - H_m(p_m(z)))$, so the optimal profit per customer is the purchase to no-purchase odds ratio. The highest efficiency common markup, say $\Delta = p - z$, for two market segments with optimal markups $\Delta_i(z) = p_i(z) - z$, $i = 1, 2$, is given by

$$\Delta = \ln\left(\frac{\Delta_2 - \Delta_1}{e^{-\Delta_1} - e^{-\Delta_2}}\right),$$

where $\Delta$ is the root of the equation $e(z + \Delta, p_1(z), z) = e(z + \Delta, p_2(z), z)$. This results in the lower bound

$$\frac{R_1(z)}{R_2(z)} \geq \gamma_1(z) = e(z + \Delta, p_1(z), z) = \frac{\Delta(e^{-\Delta_1} - e^{-\Delta_2})}{(\Delta_2 - 1)e^{-\Delta_1} - (\Delta_1 - 1)e^{-\Delta_2}}.$$

Finding a closed form solution to $\gamma_J(z)$ is quite involved, but $\gamma_J(z)$ can be computed numerically by finding breakpoints $p_1(z) = s_0 < s_1 < \ldots < s_J = p_M(z)$ and prices $q_j \in (s_{j-1}, s_j)$ such that $e(q_j, s_{j-1}, z) = e(q_j, s_j, z)$ and $e(q_j, s_j, z) = e(q_1, s_1, z)$ for all $j = 1, \ldots, J$. The following example illustrates the behavior of the heuristic q1 for the case of J = 1 and the performance of γJ(z) for several values of J and z.
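For J = 1 the computation needs only the two extreme segments. The sketch below is an illustration under stated assumptions, not the author's implementation: it finds each optimal markup by bisection on the first-order condition p = z + 1 + e^{α − p}, then applies the closed forms for the common markup ∆ and the bound γ1(z). All function names are hypothetical, and the quality parameters 1 and 10 are the extreme values used in Example 7.

```python
import math


def optimal_markup(alpha: float, z: float, tol: float = 1e-12) -> float:
    """Solve Delta = 1 + exp(alpha - z - Delta) by bisection, where Delta = p - z."""
    def f(d: float) -> float:
        return d - 1.0 - math.exp(alpha - z - d)  # strictly increasing in d

    lo, hi = 1.0, 2.0 + max(0.0, alpha - z)  # f(lo) < 0 and f(hi) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def efficiency(p: float, p_opt: float, z: float) -> float:
    """Efficiency e(p, p_m(z), z) of pricing at p when p_opt is optimal."""
    return (p - z) / (p_opt - z + math.exp(p - p_opt) - 1.0)


def common_markup(d1: float, d2: float) -> float:
    """Markup that equalizes the efficiency of two segments with markups d1 < d2."""
    return math.log((d2 - d1) / (math.exp(-d1) - math.exp(-d2)))


def single_price_bound(alpha_lo: float, alpha_hi: float, z: float):
    """Return the common price q1 and the lower bound gamma_1(z)."""
    d1 = optimal_markup(alpha_lo, z)
    d2 = optimal_markup(alpha_hi, z)
    q1 = z + common_markup(d1, d2)
    return q1, efficiency(q1, z + d1, z)


for z in (0.0, 2.0):
    q1, g1 = single_price_bound(1.0, 10.0, z)
    print(f"z = {z:.2f}  q1 = {q1:.2f}  gamma1 = {g1:.0%}")
```

Bisection is used here only because it is simple and safe on a monotone function; any scalar root finder would serve.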

Example 7 Suppose that M = 10 with market sizes λm = 220 − 20m and quality parameters αm = m, m = 1, . . . , 10. Table 7 reports q1, γ1(z) and the actual performance rM(q1,z)/RM(z) of the heuristic that prices all market segments at q1, where q1 is the root of the equation e(q1,p1(z),z) = e(q1,pM(z),z). Table 8 provides values of γJ(z) for J = 1, . . . , 5 and z = 2k, k = 0, . . . , 5. In sharp contrast to the linear demand function, where γJ(z) decreases with z, here γJ(z) increases with z. The reason for this is that for the logit function the difference pM(z) − p1(z) is decreasing in z, implying that restricting the price menu works better as z increases.

| z | q1(z) | γ1(z) | rM(q1,z)/RM(z) |
|---|---|---|---|
| $0.00 | $3.44 | 49% | 77% |
| $2.00 | $4.78 | 52% | 82% |
| $4.00 | $6.35 | 62% | 85% |
| $6.00 | $7.91 | 77% | 89% |
| $8.00 | $9.46 | 92% | 95% |
| $10.00 | $11.14 | 99% | 99% |

Table 7: Prices, Lower Bounds and Actual Performance for Example 7.

| z | γ1(z) | γ2(z) | γ3(z) | γ4(z) | γ5(z) |
|---|---|---|---|---|---|
| $0.00 | 49% | 77% | 88% | 93% | 95% |
| $2.00 | 52% | 80% | 90% | 94% | 96% |
| $4.00 | 62% | 86% | 93% | 96% | 97% |
| $6.00 | 77% | 93% | 97% | 98% | 99% |
| $8.00 | 92% | 98% | 99% | 99% | 100% |
| $10.00 | 99% | 100% | 100% | 100% | 100% |

Table 8: Efficiency Lower Bounds: J ∈ {1, . . . , 5}, Example 7

7 Multiple Products

So far we have explored how to price a single product in one or more markets. In this section we explore the problem of maximizing $r(p,z) = (p - z)'d(p)$ where $z \in \mathbb{R}^n$ is the vector of unit costs, $p \in \mathbb{R}^n_+$ is the price vector and $d(p) \in \mathbb{R}^n$ is the demand function. As before, it is easy to see that $r(z) = \sup_p r(p,z)$ is decreasing convex in $z$. The problem of existence of a finite maximizer $p(z)$ and conditions for the uniqueness of $p(z)$ have attracted the attention of several researchers, but most of the work is for specific demand functions. Here we present some results for the linear demand function and demands driven by the nested logit model.

7.1 Linear Demand Function

Let $d(p) = a - Bp$ where $a$ and $p$ are $n$-dimensional vectors and $B$ is an $n \times n$ matrix. We are interested in finding conditions on $a$ and $B$ that guarantee the existence of a unique, non-negative, profit maximizing price vector $p(z)$ such that $r(z) = (p(z) - z)'d(p(z))$ for all $z \geq 0$. Maximizing $r(p,z)$ with respect to $p$ is equivalent to minimizing

$$\frac{1}{2} p'(B + B')p - (a + B'z)'p + a'z,$$

which is a quadratic function. A sufficient condition for this function to be convex is that $S = B + B'$ is positive definite. Recall that a matrix $S$ is positive definite if and only if $p'Sp > 0$ for all $p \neq 0$. It is known that $S$ is positive definite if and only if $B$ is, see [9]. If $B$ is positive definite then $S$ is invertible and since $S$ is symmetric, so is its inverse $S^{-1}$. If $B$ is positive definite then the maximizer of $r(p,z)$ is given by

$$p(z) = S^{-1}(a + B'z). \tag{21}$$

We will impose conditions on $a$ and $B$ so that $p(0) = S^{-1}a \geq 0$. A sufficient condition for this is that $a \geq 0$ and $S$ is a Stieltjes matrix or s-matrix. An s-matrix is a real symmetric, positive definite matrix with non-positive off-diagonal elements. Since we have already assumed that $S$ is positive definite, the only additional requirement is that $B_{ij} + B_{ji} \leq 0$ for all $i \neq j$, which is something we expect from the economics of the linear demand model. An important consequence is that an s-matrix has a non-negative inverse, implying that $p(0) \geq 0$ whenever $a \geq 0$. Since $p(z)$ is non-decreasing in $z$ it follows that $p(z) \geq p(0) \geq 0$ for all $z \geq 0$.

By adding and subtracting $Bz$ to the expression in parentheses on the right-hand side of (21) we can write

$$p(z) = z + S^{-1}d(z), \tag{22}$$

where $d(z) = a - Bz$ is the demand at $p = z$. It is also possible to write $d(p(z)) = a - Bp(z) = a - Bz - B(p(z) - z) = (I - BS^{-1})d(z)$ and then use the fact that $I - BS^{-1} = B'S^{-1}$ to obtain

$$d(p(z)) = B'S^{-1}d(z). \tag{23}$$

This allows us to write

$$r(z) = (p(z) - z)'d(p(z)) = d(z)'S^{-1}B'S^{-1}d(z) = d(z)'Nd(z), \tag{24}$$

where $N = S^{-1}BS^{-1}$. The last equality holds because a quadratic form is unchanged when its matrix is transposed.
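Equations (21) to (24) can be checked numerically on a small instance. The sketch below is illustrative only: the two-product data a, B and z are hypothetical, the helper names are not from the text, and the matrix inverse is hard-coded for the 2 × 2 case to keep the example free of external libraries.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]


def mat_vec(M, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def demand(a, B, p):
    """Linear demand d(p) = a - B p."""
    return [a_i - x for a_i, x in zip(a, mat_vec(B, p))]


def profit(a, B, z, p):
    """r(p, z) = (p - z)' d(p)."""
    return dot([p_i - z_i for p_i, z_i in zip(p, z)], demand(a, B, p))


def optimal_prices(a, B, z):
    """Return p(z) from (22) and the quadratic form d(z)' N d(z) from (24)."""
    Bt = transpose(B)
    S = [[B[i][j] + Bt[i][j] for j in range(2)] for i in range(2)]
    S_inv = inverse_2x2(S)
    d_z = demand(a, B, z)                       # demand at p = z
    markup = mat_vec(S_inv, d_z)                # S^{-1} d(z)
    p = [z_i + m_i for z_i, m_i in zip(z, markup)]
    quad_form = dot(d_z, mat_vec(S_inv, mat_vec(B, markup)))  # d' S^{-1} B S^{-1} d
    return p, quad_form


# Hypothetical data: substitutes, so the off-diagonal entries of B are negative.
a = [10.0, 8.0]
B = [[2.0, -0.5], [-1.0, 3.0]]
z = [1.0, 1.0]
p_star, r_quad = optimal_prices(a, B, z)
print(p_star, profit(a, B, z, p_star), r_quad)
```

In this instance the profit evaluated directly at p(z) and the quadratic form in (24) agree, and small perturbations of p(z) lower the profit, as expected for a strictly concave objective.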

7.1.1 Random Potential Demand

A natural extension to the linear demand model is to have random potential demand $d(0) = a$. We will assume that $a$ is a non-negative random vector with mean $\mu$. Are we better off if $a$ is random? The answer is yes if we can observe $a$ before deciding the price $p(z) = z + S^{-1}d(z)$ to offer. From equation (24) we can write the optimal profit function as $r(z) = (a - Bz)'N(a - Bz)$, which is a convex function of $a$ given that $N$ is positive definite. By Jensen's inequality $E_a[(a - Bz)'N(a - Bz)] \geq (\mu - Bz)'N(\mu - Bz) = \bar{d}(z)'N\bar{d}(z) = \bar{r}(z)$, where $\bar{d}(z) = \mu - Bz$ is the expected demand at $z$ and $\bar{r}(z)$ is the optimal profit corresponding to demand $\bar{d}(z)$. Suppose now the decision maker has to price before observing $a$. Is he worse off because of randomness? The answer is no, since if he prices in anticipation of average demand, his optimal price is $\bar{p}(z) = z + S^{-1}\bar{d}(z)$, resulting in expected profits $\bar{r}(z)$, which are equal to the profits that the decision maker would have made if $a = \mu$, i.e., if demand were deterministic. The implication here is that dynamic pricing can also be driven by randomness in the potential demand even if the marginal value of capacity is unchanged.
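A one-product instance makes the comparison concrete. In the scalar case S = 2b and N = 1/(4b), so (24) reads r(z) = (a − bz)²/(4b). The sketch below is illustrative: the two-point distribution of a, the slope b and the cost z are hypothetical numbers chosen so the arithmetic is easy to follow, and the function names are not from the text.

```python
def optimal_profit(a: float, b: float, z: float) -> float:
    """Scalar case of (24): r(z) = (a - b z)^2 / (4 b)."""
    return (a - b * z) ** 2 / (4.0 * b)


def expected_profit_fixed_price(scenarios, b: float, z: float, p: float) -> float:
    """Expected profit when the price p is set before a is observed."""
    return sum(prob * (p - z) * (a - b * p) for a, prob in scenarios)


# Hypothetical data: a is 8 or 12 with equal probability, so its mean is 10.
scenarios = [(8.0, 0.5), (12.0, 0.5)]
b, z = 1.0, 2.0
mu = sum(a * prob for a, prob in scenarios)

# Price after observing a: average the optimal profit over scenarios.
observe_first = sum(prob * optimal_profit(a, b, z) for a, prob in scenarios)

# Deterministic benchmark with a replaced by its mean.
deterministic = optimal_profit(mu, b, z)

# Price before observing a, using the price that is optimal for mean demand.
p_bar = z + (mu - b * z) / (2.0 * b)
price_first = expected_profit_fixed_price(scenarios, b, z, p_bar)

print(observe_first, deterministic, price_first)
```

With these numbers the three quantities are 17, 16 and 16: observing a before pricing gains from the convexity of r in a, while pricing at the mean-demand price recovers exactly the deterministic profit.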

7.1.2 Linear Component Costs

Suppose that the n products are built from m components according to the recipe matrix A = (Aij) where Aij is the number of units of component j used by product i. Suppose further that components can be procured at a linear cost y, where y is the vector of unit component costs. How many units of each component should the firm buy? And at what price should the products be sold if the demand function is d(p) = a − Bp? Since the demand is deterministic we can solve this problem by using the fact that the unit cost vector z is given by z = Ay. Therefore, it is optimal for the firm to price at p(Ay) and to sell d(p(Ay)), resulting in profits r(Ay) = (p(Ay) − Ay)′d(p(Ay)). The problem becomes more interesting if a is a non-negative random vector and there is a need to procure q units before observing the realization of a. Committing to q before observing a hurts, and if the decision maker has to set prices before observing a then randomness in a is detrimental to profits. On the other hand, if a can be observed before setting prices, the revenue advantage from Jensen's inequality can in some cases overcome the disadvantage of having to commit to q before observing a.
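The deterministic case can be worked through on a toy instance. The sketch below is an illustration under simplifying assumptions: the recipe matrix, component costs and demand parameters are hypothetical, and B is taken to be diagonal (independent products) so that p(z) = z + S^{-1}d(z) reduces to p_i = (a_i/b_i + z_i)/2 for each product. The component purchase quantities are computed as A′d(p(Ay)), which follows from the definition of A.

```python
def unit_costs(A, y):
    """z = A y: cost of each product as the cost of its components."""
    return [sum(A_ij * y_j for A_ij, y_j in zip(row, y)) for row in A]


def price_independent_products(a, b, z):
    """Optimal prices and sales when B is diagonal with entries b_i."""
    prices = [(a_i / b_i + z_i) / 2.0 for a_i, b_i, z_i in zip(a, b, z)]
    sales = [a_i - b_i * p_i for a_i, b_i, p_i in zip(a, b, prices)]
    return prices, sales


def component_purchases(A, sales):
    """Units of each component needed: A' d."""
    n, m = len(A), len(A[0])
    return [sum(A[i][j] * sales[i] for i in range(n)) for j in range(m)]


# Hypothetical data: 2 products, 3 components.
A = [[1, 2, 0],
     [0, 1, 3]]
y = [1.0, 0.5, 1.0]          # unit component costs
a = [10.0, 15.0]             # potential demands
b = [1.0, 2.0]               # own-price slopes (diagonal of B)

z = unit_costs(A, y)
prices, sales = price_independent_products(a, b, z)
purchases = component_purchases(A, sales)
total_profit = sum((p - c) * s for p, c, s in zip(prices, z, sales))
print(z, prices, sales, purchases, total_profit)
```

Here z = (2, 3.5), the prices are (6, 5.5), each product sells 4 units, the firm buys (4, 12, 12) units of the three components, and the profit is 24.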

7.2 Log-Linear Demand

The log-linear demand function for multiple products can be written as d(p) = exp(a − Bp). Unfortunately this demand function is not very amenable to analysis, as attempts to maximize r(p,z) = (p − z)′d(p) lead to unbounded solutions whenever the non-diagonal elements of B are negative. Indeed, if bij < 0 for some i ≠ j then there is an incentive to make pj very large, which has the negative effect of bringing demand and revenues from product j to near zero, but also the positive effect of artificially increasing demand and revenue for product i. If bij > 0 then increasing the price of product j decreases the demand of product i, which is what we would expect if the products are complements (e.g., a shirt and a tie) instead of substitutes. This leaves the case bij = 0 for all i ≠ j, which reduces to independent demands. This analysis shows that the log-linear demand function has limited applications to independent demands and the pricing of complementary products.
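The unboundedness is easy to see numerically. The sketch below is illustrative only: the two-product parameters are hypothetical, with b12 = −0.5 < 0, and the function name is not from the text. Holding p1 fixed and raising p2 drives the profit from product 1 up without limit while the profit from product 2 vanishes.

```python
import math


def loglinear_profit(p, z, a, B):
    """r(p, z) = (p - z)' exp(a - B p), with the exponential applied componentwise."""
    n = len(p)
    demand = [math.exp(a[i] - sum(B[i][j] * p[j] for j in range(n))) for i in range(n)]
    return sum((p[i] - z[i]) * demand[i] for i in range(n))


# Hypothetical data: b_12 = -0.5 < 0, so raising p_2 inflates demand for product 1.
a = [0.0, 0.0]
B = [[1.0, -0.5],
     [0.0, 1.0]]
z = [0.0, 0.0]

for p2 in (5.0, 10.0, 20.0, 40.0):
    print(p2, loglinear_profit([1.0, p2], z, a, B))
```

With p1 = 1 the profit behaves like e^{0.5·p2 − 1} for large p2, so no finite maximizer exists in this instance.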

7.3 The Nested Logit Model

In this section we consider pricing under the Nested Logit (NL) model, which is a popular generalization of the standard MNL model. Under the NL model, customers make product selection decisions sequentially: at the upper level, they first select a branch, called a “nest”, that includes multiple similar products; at the lower level, their subsequent selection is within that chosen nest (see McFadden [12], Carrasco and Ortuzar [1] and Greene [3]). Suppose that the substitutable products constitute n nests and nest i has mi products. Let pi = (pi1, pi2, . . . , pimi) be the price vector corresponding to nest i = 1, . . . , n, and let (p1, . . . , pn) be the price vector for all the products in all the nests. Let Qi(p1, . . . , pn) be the probability that a customer selects nest i at the upper level, and let qj|i(pi) denote the probability that product j of nest i is selected at the lower level, given that the customer selects nest i, where pi is the price vector for all the products in nest i. Qi(p1, . . . , pn) and qj|i(pi) are defined as follows:

$$Q_i(p_1, \ldots, p_n) = \frac{e^{\gamma_i I_i}}{1 + \sum_{l=1}^{n} e^{\gamma_l I_l}}, \tag{25}$$

$$q_{j|i}(p_i) = \frac{e^{\alpha_{ij} - \beta_{ij} p_{ij}}}{\sum_{s=1}^{m_i} e^{\alpha_{is} - \beta_{is} p_{is}}}, \tag{26}$$

where $\alpha_{is}$ can be interpreted as the “quality” of product $s$ in nest $i$, $\beta_{is} \geq 0$ is the product-specific price sensitivity for that product, $I_l = \log \sum_{s=1}^{m_l} e^{\alpha_{ls} - \beta_{ls} p_{ls}}$ represents the attractiveness of nest $l$, which is the expected value of the maximum of the utilities of all the products in nest $l$, and the nest coefficient $\gamma_i$ can be viewed as the degree of inter-nest heterogeneity. When $0 < \gamma_i < 1$, products are more similar within nest $i$ than across nests; when $\gamma_i = 1$, products in nest $i$ have the same degree of similarity as products in other nests, and the NL model reduces to the standard MNL model; when $\gamma_i > 1$, products are more similar to the ones in other nests.

The probability that a customer will select product $j$ of nest $i$, which can also be considered the market share of that product, is

$$\pi_{ij}(p_1, \ldots, p_n) = Q_i(p_1, \ldots, p_n)\, q_{j|i}(p_i). \tag{27}$$

The monopolist’s problem is to determine the price vectors (p1, . . . ,pn) to maximize the total expected profit

$$R(p_1, \ldots, p_n) = \sum_{i=1}^{n} \sum_{j=1}^{m_i} (p_{ij} - z_{ij})\, \pi_{ij}(p_1, \ldots, p_n), \tag{28}$$

where zij is the unit cost of product j in nest i. The objective function R(p1, . . . ,pn) fails to be quasi-concave in prices. When the objective function is rewritten with market shares as decision variables then the objective function can be shown to be concave if the price sensitivity parameters βij = βi are product independent in each nest and γi ≤ 1 for all i as shown in Li and Huh [11]. However, the objective function fails to be concave in the market shares in the more general case where the price sensitivities are product dependent.
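Equations (25) to (28) translate directly into code. The sketch below is a minimal illustration, not a pricing algorithm: it only evaluates the market shares and the expected profit at given prices. The function names and the nested-list data layout (one inner list per nest) are assumptions made for this example, and the numbers are hypothetical.

```python
import math


def nested_logit_shares(prices, alpha, beta, gamma):
    """Market shares pi_ij = Q_i * q_{j|i} from equations (25)-(27).

    prices, alpha and beta are lists of lists (one inner list per nest);
    gamma holds one nest coefficient per nest.
    """
    # Within-nest weights exp(alpha_is - beta_is * p_is).
    weights = [[math.exp(a - b * p) for a, b, p in zip(al, be, pr)]
               for al, be, pr in zip(alpha, beta, prices)]
    # Attractiveness I_i = log of the sum of weights in nest i.
    attractiveness = [math.log(sum(w)) for w in weights]
    nest_terms = [math.exp(g * I) for g, I in zip(gamma, attractiveness)]
    denom = 1.0 + sum(nest_terms)  # the 1 is the no-purchase alternative
    shares = []
    for w, term in zip(weights, nest_terms):
        Q_i = term / denom
        total = sum(w)
        shares.append([Q_i * w_k / total for w_k in w])
    return shares


def expected_profit(prices, costs, alpha, beta, gamma):
    """Total expected profit R from equation (28)."""
    shares = nested_logit_shares(prices, alpha, beta, gamma)
    return sum((p - c) * s
               for pr, cr, sr in zip(prices, costs, shares)
               for p, c, s in zip(pr, cr, sr))


# Hypothetical data: two nests with two and one products.
alpha = [[1.0, 0.5], [0.8]]
beta = [[1.0, 1.2], [0.9]]
gamma = [0.7, 1.0]
prices = [[1.5, 1.4], [1.6]]
costs = [[0.5, 0.4], [0.6]]
print(nested_logit_shares(prices, alpha, beta, gamma))
print(expected_profit(prices, costs, alpha, beta, gamma))
```

Setting every γi = 1 in this sketch recovers the standard MNL shares, which is a convenient sanity check on an implementation.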

Let pij(z) denote the optimal price for product j in nest i as a function of the vector of unit costs z. Gallego and Wang [8] show that the optimal price pij(z) adds two components to the unit cost zij. The first component is the reciprocal of the price sensitivity βij and the second one is a nest dependent constant θi, so that pij(z) = zij + 1/βij + θi. They also show that the nest dependent constants θi, i = 1, . . . , n, are linked as explained in the following theorem.

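Theorem 7 If $\gamma_i \geq 1$ or $\frac{\max_s \beta_{is}}{\min_s \beta_{is}} \leq \frac{1}{1 - \gamma_i}$, then there exists a unique constant $\phi$ such that

$$\theta_i + \left(1 - \frac{1}{\gamma_i}\right) w_i(\theta_i) = \phi,$$

and

$$p_{ij}(z_{ij}) = z_{ij} + \frac{1}{\beta_{ij}} + \theta_i^*,$$

where $w_i(\theta_i) = \sum_{k=1}^{m_i} \frac{1}{\beta_{ik}}\, q_{k|i}(\theta_i)$ and $q_{k|i}(\theta_i) = \frac{e^{\tilde{\alpha}_{ik} - \beta_{ik}\theta_i}}{\sum_{s=1}^{m_i} e^{\tilde{\alpha}_{is} - \beta_{is}\theta_i}}$.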

Theorem 7 is interesting because a non-concave optimization problem over $\sum_{i=1}^{n} m_i$ variables can be reduced, under mild conditions, to a root finding problem over a single variable. Even if the mild condition $\gamma_i \geq 1$ or $\frac{\max_s \beta_{is}}{\min_s \beta_{is}} \leq \frac{1}{1 - \gamma_i}$ fails to hold, Gallego and Wang [8] show that the problem reduces to a single variable maximization problem of a continuous function over a bounded interval, so the problem can be easily solved numerically. Gallego and Wang [8] also show that if different firms control different nests, the pricing problem under competition is strictly log-supermodular in the nest markup constants, so the equilibrium set is nonempty with the largest equilibrium preferred by all the firms.


8 Acknowledgments

I acknowledge the feedback from my students and collaborators. In particular, I would like to recognize the contributions and feedback from Anran Li and Richard Ratliff.

9 Appendix

Proof of Corollary 1 Proof: Clearly y′ ≥ y implies g(y′) ≥ g(y) and r decreasing implies that r(g(y′)) ≤ r(g(y)), showing that r(g(y)) is decreasing in y. To show that r(g(y)) is convex, notice that from the concavity of h it follows that g(αy + (1−α)y′) ≥ αg(y) + (1−α)g(y′) for all α ∈ [0, 1]. Then, since r is decreasing, it follows that r(g(αy + (1−α)y′)) ≤ r(αg(y) + (1−α)g(y′)). Finally, from the convexity of r we see that r(αg(y) + (1−α)g(y′)) ≤ αr(g(y)) + (1−α)r(g(y′)). Consequently, r(g(αy + (1−α)y′)) ≤ αr(g(y)) + (1−α)r(g(y′)), showing that r(g(y)) is convex.

Since r is convex, by Jensen’s inequality Er(Z) ≥ r(EZ). In particular, Er(g(Y)) ≥ r(Eg(Y)). By the concavity of g and Jensen’s inequality we have Eg(Y) ≤ g(EY). Since r is decreasing, it follows that r(Eg(Y)) ≥ r(g(EY)).

Proof of Lemma 1 Proof: If $d(\bar{p}(z)) = \bar{d}(\bar{p}(z))$ then $r(z) \leq \bar{r}(z) = \bar{r}(\bar{p}(z),z) = r(\bar{p}(z),z) \leq r(z)$, implying that $r(z) = \bar{r}(z)$ and that $p(z) = \bar{p}(z)$ is a finite maximizer of $r(p,z)$. Next, we will show that $d(\bar{p}(z)) < \bar{d}(\bar{p}(z))$ leads to a contradiction. To see this, first notice that $d(p) \leq \bar{d}(p) < \bar{d}(\bar{p}(z))$ for all $p > \bar{p}(z)$, $p \in X$, for otherwise there is a $p \in X$, $p > \bar{p}(z)$, such that $\bar{r}(p,z) > \bar{r}(z)$, contradicting the optimality of $\bar{p}(z)$. But then, $d(\bar{p}(z)) < \bar{d}(\bar{p}(z)) = \sup_{p \geq \bar{p}(z)} d(p)$ together with $d(p) \leq \bar{d}(p) < \bar{d}(\bar{p}(z))$ for all $p > \bar{p}(z)$ contradicts the fact that $d$ is upper-semicontinuous at $\bar{p}(z)$, since

$$d(\bar{p}(z)) < \bar{d}(\bar{p}(z)) = \lim_{\epsilon \downarrow 0} \sup_{\bar{p}(z) \leq p \leq \bar{p}(z) + \epsilon} d(p) \leq \limsup_{p \to \bar{p}(z)} d(p) \leq d(\bar{p}(z)),$$

where the last inequality follows from the USC of $d(p)$ at $\bar{p}(z)$.

Proof of Theorem 2. Proof: Since d(p) is USC and the product of non-negative USC functions is also USC, it follows that r(p,z) and r̄(p,z) are USC in p ∈ [z,∞). If d(p) = 0 for all p ≥ z, then p(z) = z and r(z) = r(z,z) = 0 and there is nothing to prove. Otherwise there exists a p′ > z such that 0 < d(p′) < ∞ for if not then s̄(z) = ∞. We will show that there is a q > p′ such that r̄(p,z) ≤ r(p′,z) for all p > q. This will allow us to restrict the optimization of r̄(p,z) to p ∈ [z,q]. Since r̄(p,z) is USC and the supremum is now taken over a closed and bounded set, the Extreme Value Theorem (EVT) guarantees the existence of a finite price, say p̄(z) ∈ [z,q] such that r̄(z) = r̄(p̄(z),z) = maxp≥z r̄(p,z). Then, by Lemma 1, p(z) = p̄(z) is a finite maximizer of r(p,z).

Let $\epsilon > 0$. We claim there exists a $p_1 > p'$ such that $\bar{s}(p) < \epsilon$ for all $p > p_1$. This follows because $\bar{s}(0) - \bar{s}(p)$ is increasing and converges to $\bar{s}(0)$ as $p \to \infty$. Consequently, there exists a $p_1 > p'$ such that $\bar{s}(0) - \bar{s}(p) > \bar{s}(0) - \epsilon$ for all $p > p_1$, or equivalently $\bar{s}(p) < \epsilon$ for all $p > p_1$. We claim there exists a price $q \geq p_1$ such that $\bar{r}(q,z) < \epsilon$. If $q$ does not exist, then $\bar{r}(p,z) \geq \epsilon$ for all $p > p_1$, implying that $\bar{d}(p) \geq \epsilon/(p - z)$ for all $p > p_1$ and consequently $\bar{s}(0) \geq \bar{s}(p_1) = \infty$, contradicting the finiteness of $\bar{s}(0)$.

Therefore for all $p > q$,

$$\bar{r}(p,z) = (q - z)\bar{d}(p) + (p - q)\bar{d}(p) \leq (q - z)\bar{d}(q) + \bar{s}(q) = \bar{r}(q,z) + \bar{s}(q) \leq 2\epsilon.$$

By taking $\epsilon \in (0, 0.5\, r(p',z))$ we guarantee that $\bar{r}(p,z) \leq \bar{r}(p',z)$ for all $p \geq q$, so we can limit the optimization to the closed and bounded set $[z, q]$, enabling us to call on the EVT to show the existence of $\bar{p}(z) \leq q$ such that $\bar{r}(z) = \bar{r}(\bar{p}(z),z)$.

We now turn to the monotonicity of the largest maximizer, say $p(z)$, of $r(p,z)$. Suppose that $z \leq z'$. If $p(z) \leq z'$ then there is nothing to show as then $p(z) \leq z' \leq p(z')$. On the other hand, if $z' < p(z)$ we will show that $r(p',z') < r(p(z),z')$ for all prices $p' \in [z', p(z))$, so then $r(z') = \max_{p \geq p(z)} r(p,z')$ and therefore $p(z') \geq p(z)$. To see this notice that $r(p',z) = (p' - z)d(p') \leq (p(z) - z)d(p(z))$, so $(p(z) - p')d(p(z)) \geq (p' - z)(d(p') - d(p(z))) > (p' - z')(d(p') - d(p(z)))$, and this implies that $r(p',z') < r(p(z),z')$, showing $p(z') \geq p(z)$.

Proof of Corollary 2 Proof: If d(p) is proper and decreasing, and λ = d(0) < ∞, then H(p) = d(p)/λ ∈ [0, 1] is also decreasing and therefore it is the complement of the cumulative distribution function (CCDF) of a non-negative random variable, say W. If H(p) = P(W ≥ p), then H(p) is left continuous with right limits (LCRL). Since a decreasing LCRL function is USC it follows that H(p) and therefore d̄(p) = d(p) is USC. In addition, E[W] < ∞ implies that s̄(0) = s(0) = λE[W] < ∞. As a result the conditions of Theorem 2 are satisfied.

References

[1] Carrasco, J., J. de D. Ortuzar. 2002. Review and assessment of the nested logit model. Transport Reviews 22(2) 197-218.

[2] Cooper, W. 2002. Asymptotic behavior of an allocation policy for revenue management. Oper. Res. 50(4) 720-727.

[3] Greene, W. H. 2007. Econometric Analysis. Pearson Education.

[4] Gallego, G. and M. Queyranne. 1995. Inventory Coordination and Pricing Decisions: Analysis of a Simple Class of Heuristics. Chapter 6 in Optimization in Industry 3, Mathematical Programming and Modeling Techniques in Practice, Anna Sciomachen, Ed. Wiley.

[5] Gallego, G., G. van Ryzin. 1994. Optimal dynamic pricing of inventories with stochastic demand over finite horizons. Management Science 40 999–1020.


[6] Gallego, G. and Sahin, O. 2010. Revenue Management with Partially Refundable Fares. Operations Research 58, 817-833.

[7] Gallego, G. 2010. Dynamic Pricing Chapter. Working paper, Columbia University.

[8] Gallego, G. and R. Wang. 2011. Multi-Product Price Optimization and Competition under the Nested Logit Model with Product-Differentiated Price Sensitivities. Working paper, Columbia University.

[9] Johnson, C. R. 1970. Positive Definite Matrices. Amer. Math. Monthly 77, 259-264.

[10] Lariviere, M. A., E. L. Porteus. 2001. Selling to the newsvendor: An analysis of price-only contracts. Manufacturing Service Oper. Management, 3 293-305

[11] Li, H., W. T. Huh. 2011. Pricing multiple products with the multinomial logit and nested logit models: Concavity and implications. To appear in Manufacturing Service Oper. Management.

[12] McFadden, D. 1974. Conditional logit analysis of qualitative choice behavior. P. Zarembka, ed., Frontiers in Econometrics. Academic Press, New York, 105-142.

[13] Maglaras, C. and J. Meissner. 2006. Dynamic Pricing Strategies for Multiproduct Revenue Management Problems. Manufacturing Service Oper. Management, 8, 2, 135-148.

[14] Png, I. P. L. 1989. Reservations: Customer insurance in the marketing of capacity. Marketing Sci., 8, 248-264.

[15] R. T. Rockafellar, Convex Analysis. Princeton University Press, 1970.

[16] Shugan, S., J. Xie. 2000. Advance pricing of services and other implications of separating purchase and consumption. J. Service Research, 2, 227-239.

[17] van den Berg, G. 2007. On the uniqueness of optimal prices set by monopolistic sellers. Journal of Econometrics, 14, 482-491.

[18] Xie, J., S. Shugan. 2001. Electronic tickets, smart cards, and online prepayments: When and how to advance sell. Marketing Sci., 20, 219-243.

[19] S. Ziya , H. Ayhan, and R. D. Foley, Relationships Among Three Assumptions in Revenue Management, Operations Research 52, (2004) 804-809.
