Multivariate Differential Calculus

Dr Damien S. Eldridge

Australian National University

20 February 2021

D. S. Eldridge (ANU) Multivariate Differential Calculus 20 February 2021 1 / 146

Readings Part 1

Chiang, AC (1984), Fundamental methods of mathematical economics (third edition), McGraw-Hill, Singapore: Chapters 6 to 8 and Chapter 10 (pp. 127–227 and 268–306).

Chiang, AC and K Wainwright (2005), Fundamental methods of mathematical economics (fourth edition), McGraw-Hill, Singapore: Chapters 6 to 10 (pp. 124–290).

Greene, WH (2000), Econometric analysis (fourth edition), Prentice-Hall, USA: Chapter 2 (Section 9) (pp. 49–58).

Haeussler, EF Jr, and RS Paul (1987), Introductory mathematical analysis for business, economics, and the life and social sciences (fifth edition), Prentice-Hall, USA: Chapter 17 (Sections 1 to 7 and Sections 9 to 10) (pp. 668–706 and 714–723).

Lindstrom, TL (2017), Spaces: an introduction to real analysis, The American Mathematical Society, USA: Chapters 2 and 6 (pp. 23–42 and 173–238).


Readings Part 2

Simon, CP, and L Blume (1994), Mathematics for economists, WW Norton and Company, USA: Chapters 13 to 15 (pp. 273–371).

Spiegel, MR (1981a), Schaum’s outline of theory and problems of advanced calculus (SI metric edition), McGraw-Hill, Singapore: Chapters 6 to 8 (pp. 101–179).

Spiegel, MR (1981b), Theory and problems of vector analysis and an introduction to tensor analysis (SI (metric) edition), Schaum’s Outline Series, McGraw-Hill Book Company, Singapore: Chapters 3 and 4 (pp. 35–81).

Sundaram, RK (1996), A first course in optimization theory, Cambridge University Press, USA: Chapter 1 (Section 4) (pp. 41–50).

Sydsaeter, K, P Hammond, A Strom, and A Carvajal (2016), Essential mathematics for economic analysis (fifth edition), Pearson Education, United Kingdom: Chapters 11 and 12 (pp. 407–494).


Some Introductory Remarks Part 1

You will hopefully be familiar with many aspects of differential calculus, including (but not necessarily limited to):

Limits, continuity, and differentiability of real-valued univariate functions of a real variable;

Partial derivatives, the gradient vector, and total differentials of real-valued multivariate functions of real variables; and

The implicit function theorem.

One way of interpreting a derivative in these cases is as the “best” affine (or, loosely speaking, linear) approximation to a function at a point.


Some Introductory Remarks Part 2

Here we will extend this “derivative as the best affine approximation” idea to a more general setting involving “real-vector-valued functions of a real vector”.

This more general setting involves functions of the form $f : S \to T$, where $S \subseteq \mathbb{R}^m$, $T \subseteq \mathbb{R}^n$, $m \in \mathbb{N}$, and $n \in \mathbb{N}$.

The “real-valued functions of a real vector” cases that you have already encountered are special cases of this more general setting.

In the case of real-valued univariate functions of a real variable, we have m = n = 1. In the case of real-valued multivariate functions of real variables, we have m > 1 and n = 1.

In this course, we will typically be dealing with real-valued multivariate functions of real variables.

But you might sometimes encounter more general functions between Euclidean spaces during your studies or research. One important example of this from econometrics involves the statistical distribution theory for some transformations of vectors of random variables.


Functions on Euclidean Spaces

We are interested in functions of the form $f : S \to T$, where $S \subseteq \mathbb{R}^m$, $T \subseteq \mathbb{R}^n$, $m \in \mathbb{N}$, and $n \in \mathbb{N}$. Recall the following.

$S \subseteq \mathbb{R}^m$ is the domain of $f$.

$T \subseteq \mathbb{R}^n$ is the co-domain of $f$.

$f(S) = \{y \in T : y = f(x) \text{ for some } x \in S\}$ is the image of $S$ under $f$.

Clearly $f(S) \subseteq T \subseteq \mathbb{R}^n$.

Since $f$ is a function, we know that it associates only one point in $T$ with any particular point in $S$.


Continuous Functions on Euclidean Spaces Part 1

Let $(S, d_S)$ and $(T, d_T)$ be metric spaces, where $S \subseteq \mathbb{R}^m$, $T \subseteq \mathbb{R}^n$, $m \in \mathbb{N}$, $n \in \mathbb{N}$, $d_S = d^m_E|_S$ is the restriction of the $m$-dimensional Euclidean metric to $S$, and $d_T = d^n_E|_T$ is the restriction of the $n$-dimensional Euclidean metric to $T$.

Consider a function of the form $f : S \to T$. $f$ is continuous at the point $x \in S$ if either of the following equivalent conditions holds.

For each $\epsilon > 0$, there exists some $\delta_\epsilon > 0$ such that $(y \in S \text{ and } d_S(x, y) < \delta_\epsilon) \implies d_T(f(x), f(y)) < \epsilon$.

For all sequences $\{x_k\}_{k \in \mathbb{N}}$ such that ($x_k \in S$ for all $k \in \mathbb{N}$) and $\lim_{k \to \infty} x_k = x$, it is the case that $\lim_{k \to \infty} f(x_k) = f(x)$.

f is said to be continuous on S if it is continuous at all points x ∈ S.


Continuous Functions on Euclidean Spaces Part 2

Let $(S, d_S)$ and $(T, d_T)$ be metric spaces, where $S \subseteq \mathbb{R}^m$, $T \subseteq \mathbb{R}^n$, $m \in \mathbb{N}$, $n \in \mathbb{N}$, $d_S = d^m_E|_S$ is the restriction of the $m$-dimensional Euclidean metric to $S$, and $d_T = d^n_E|_T$ is the restriction of the $n$-dimensional Euclidean metric to $T$.

For all $i \in \{1, 2, \ldots, n\}$, let $(T_i, d_i)$ be a metric space, where $T_i \subseteq \mathbb{R}$ and $d_i = d_E|_{T_i}$ is the restriction of the one-dimensional Euclidean metric to $T_i$.

Consider a function of the form $f : S \to T$. Since $T \subseteq \mathbb{R}^n$, we know that $f = (f_1, f_2, \ldots, f_n)^{\mathsf{T}}$ is a column vector that consists of $n$ component functions of the form $f_i : S \to T_i$, where $T_i \subseteq \mathbb{R}$ for all $i \in \{1, 2, \ldots, n\}$.

Theorem: $f$ is continuous at the point $x \in S$ if and only if all $n$ of the component functions $f_i$ are continuous at the point $x \in S$.

Theorem: $f$ is continuous on $S$ if and only if all $n$ of the component functions $f_i$ are continuous on $S$.


Continuous Functions on Euclidean Spaces Part 3

Let $(S, d_S)$ and $(T, d_T)$ be metric spaces, where $S \subseteq \mathbb{R}^m$, $T \subseteq \mathbb{R}^n$, $m \in \mathbb{N}$, $n \in \mathbb{N}$, $d_S = d^m_E|_S$ is the restriction of the $m$-dimensional Euclidean metric to $S$, and $d_T = d^n_E|_T$ is the restriction of the $n$-dimensional Euclidean metric to $T$.

Consider a function of the form $f : S \to T$.

Theorem: $f$ is continuous at the point $x \in S$ if and only if, for every open set $V \subseteq \mathbb{R}^n$ such that $f(x) \in V$, there exists an open set $U \subseteq \mathbb{R}^m$ such that both $x \in U$ and $f(z) \in V$ for all $z \in U \cap S$.

Theorem: $f$ is continuous on $S$ if and only if, for every open set $V \subseteq \mathbb{R}^n$, there exists an open set $U \subseteq \mathbb{R}^m$ such that $f^{-1}(V) = U \cap S$.

Recall that $f^{-1}(V) = \{x \in S : f(x) \in V\}$ is the pre-image of the set $V$ under the function $f$.

Note that continuity is only a local property of a function, in the sense that continuity at a particular point x does not tell us anything about continuity at other points.


Differentiable Functions on Euclidean Spaces Part 1

Let $S \subseteq \mathbb{R}^m$ and $T \subseteq \mathbb{R}^n$, where $m \in \mathbb{N}$ and $n \in \mathbb{N}$. Let $\|\cdot\|_m$ be the Euclidean norm on $\mathbb{R}^m$, $\|\cdot\|_n$ be the Euclidean norm on $\mathbb{R}^n$, $\|\cdot\|_S$ be the restriction of the Euclidean norm on $\mathbb{R}^m$ to $S$, and $\|\cdot\|_T$ be the restriction of the Euclidean norm on $\mathbb{R}^n$ to $T$. Where confusion is unlikely, we will drop the subscripts on the norm symbols.

Recall that the relationship between the Euclidean norm and the Euclidean distance for any Euclidean space is ‖x‖ = d(x, 0). Clearly S is a normed vector space because it is a sub-space of Euclidean m-space.

Clearly T is a normed vector space because it is a sub-space of Euclidean n-space.


Differentiable Functions on Euclidean Spaces Part 2

Let S be a sub-space of Euclidean m-space and T be a sub-space of Euclidean n-space.

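Consider a function of the form $f : S \to T$. $f$ is differentiable at the point $x \in S$ if there exists an $(n \times m)$ matrix $A$ such that, for all $\epsilon > 0$, there exists some $\delta_\epsilon > 0$ for which

\[ (y \in S \text{ and } 0 < \|y - x\| < \delta_\epsilon) \implies \|f(x) - f(y) - A(x - y)\| < \epsilon \|x - y\|. \]

Alternatively, we could express this definition as follows.

$f$ is differentiable at the point $x \in S$ if there exists an $(n \times m)$ matrix $A$ such that

\[ \lim_{y \to x} \left( \frac{\|f(y) - f(x) - A(y - x)\|}{\|y - x\|} \right) = 0. \]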


Differentiable Functions on Euclidean Spaces Part 3

Let S be a sub-space of Euclidean m-space and T be a sub-space of Euclidean n-space.

Consider a function of the form $f : S \to T$. When

\[ \lim_{y \to x} \left( \frac{\|f(y) - f(x) - A(y - x)\|}{\|y - x\|} \right) = 0, \]

we call $A$ the derivative of $f$ at the point $x$.

This is sometimes written as $A = Df(x)$, $A = \operatorname{grad} f(x)$, or $A = \nabla f(x)$.

The term “grad” comes from the word “gradient”. It indicates that the derivative of $f$ at the point $x$ is simply the slope of the tangent line (/plane/hyperplane) to $f$ at the point $x$.

The matrix $A$ is sometimes called the Jacobian matrix, or the Jacobian derivative, of the function $f$ at the point $x$.


Differentiable Functions on Euclidean Spaces Part 4

Let S be a sub-space of Euclidean m-space and T be a sub-space of Euclidean n-space.

Consider a function of the form f : S −→ T . We can use the fact that the derivative of f at the point x is simply the slope of the tangent line (/plane/hyperplane) to f at the point x to provide an intuitive interpretation of the definition of differentiability.

A function $g : \mathbb{R}^m \to \mathbb{R}^n$ is called an affine function if it takes the form $g(y) = Ay + b$, where $A$ is an $(n \times m)$ matrix and $b \in \mathbb{R}^n$ is an $(n \times 1)$ column vector.

If b = 0, then the function g is called a linear function.

Continued on the next slide.


Differentiable Functions on Euclidean Spaces Part 5

Continued from the previous slide.

Intuitively, the derivative of f at the point x is the “best” affine approximation to f at the point x.

By “best”, we mean that

\[ \lim_{y \to x} \left( \frac{\|f(y) - g(y)\|}{\|y - x\|} \right) = 0. \]

Clearly, we want $g(x) = f(x)$ as well, because it would seem unreasonable to call $g$ a good approximation to $f$ at the point $x$ if this were not the case.

Since $f(x) = g(x)$, we must have $g(x) = Ax + b = f(x)$.

This requires that $b = f(x) - Ax$.

Thus we have $g(y) = Ay + f(x) - Ax = A(y - x) + f(x)$.

Note that $f(y) - g(y) = f(y) - (A(y - x) + f(x)) = f(y) - f(x) - A(y - x)$.

Continued on the next slide.


Differentiable Functions on Euclidean Spaces Part 6

Continued from the previous slide.

As such, we need

\[ \lim_{y \to x} \left( \frac{\|f(y) - g(y)\|}{\|y - x\|} \right) = \lim_{y \to x} \left( \frac{\|f(y) - f(x) - A(y - x)\|}{\|y - x\|} \right) = 0. \]

But this is just the definition of the derivative of the function f at the point x that was provided earlier.

Illustrate this on the whiteboard for an example in which f : R −→ R.
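The “best affine approximation” property can also be checked numerically. The following is a minimal sketch, not part of the original slides: the map `f` and its hand-computed Jacobian are hypothetical examples chosen only for illustration. It evaluates the ratio $\|f(y) - f(x) - A(y - x)\| / \|y - x\|$ for points $y$ that approach $x$.

```python
import math

def f(x):
    """Hypothetical example map f: R^2 -> R^2 (an assumption for illustration)."""
    x1, x2 = x
    return (x1 * x2, x1 + x2 ** 2)

def jacobian(x):
    """Hand-computed (2 x 2) derivative matrix A = Df(x) of the map f above."""
    x1, x2 = x
    return ((x2, x1), (1.0, 2.0 * x2))

def norm(v):
    """Euclidean norm of a tuple of real numbers."""
    return math.sqrt(sum(c * c for c in v))

def approximation_ratio(x, y):
    """Return ||f(y) - f(x) - A(y - x)|| / ||y - x|| with A = Df(x)."""
    A = jacobian(x)
    d = tuple(yi - xi for yi, xi in zip(y, x))
    Ad = tuple(sum(a * di for a, di in zip(row, d)) for row in A)
    err = tuple(fy - fx - ad for fy, fx, ad in zip(f(y), f(x), Ad))
    return norm(err) / norm(d)

x = (1.0, 2.0)
steps = (1e-1, 1e-2, 1e-3)
# Approach x along the direction (1, 1); the ratio should shrink with the step.
ratios = [approximation_ratio(x, (x[0] + t, x[1] + t)) for t in steps]
print(ratios)
```

For this particular example the ratio shrinks in proportion to the step size, which is what the limit condition requires.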


Differentiable Functions on Euclidean Spaces Part 7

Let S be a sub-space of Euclidean m-space and T be a sub-space of Euclidean n-space.

Consider a function of the form $f : S \to T$.

If $f$ is differentiable at every point $x \in S$, then $f$ is differentiable on $S$.

If $f$ is differentiable on $S$, then the derivative operator applied to $f$ is itself a function of the form $Df : S \to M_{\mathbb{R},(n \times m)}$, where $M_{\mathbb{R},(n \times m)}$ is the space of all $(n \times m)$ matrices whose entries are real numbers.

Technically, we would need to define an appropriate norm on $M_{\mathbb{R},(n \times m)}$ in order to establish this result. However, we can avoid doing this by simply rewriting any $(n \times m)$ matrix as an $(mn \times 1)$ column vector. Such a vector would contain every entry of the original matrix. Note that the number of entries in an $(n \times m)$ matrix is the same as the number of entries in an $(mn \times 1)$ column vector. We can now think about $Df : S \to M_{\mathbb{R},(n \times m)}$ as being a function of the form $Df : S \to \mathbb{R}^{mn}$. This allows us to use the Euclidean norm on $\mathbb{R}^{mn}$.


Differentiable Functions on Euclidean Spaces Part 8

Let S be a sub-space of Euclidean m-space and T be a sub-space of Euclidean n-space.

Consider a function of the form $f : S \to T$. If $f$ is differentiable on $S \subseteq \mathbb{R}^m$ and $Df : S \to \mathbb{R}^{mn}$ is continuous on $S$, then $f \in C^1$, where $C^1$ is the set of all functions that are at least once continuously differentiable.

Theorem: $f$ is differentiable at the point $x \in S$ if and only if all $n$ of the component functions $f_i$ are differentiable at the point $x \in S$.

In this case, we have $Df(x) = (Df_1(x), Df_2(x), \ldots, Df_n(x))$, where the derivatives of the component functions are stacked as the rows of the $(n \times m)$ matrix $Df(x)$.

Theorem: $f$ is continuously differentiable on $S$ if and only if all $n$ of the component functions $f_i$ are continuously differentiable on $S$.


Differentiable Functions on Euclidean Spaces Part 9

The difference between differentiability and continuous differentiability is not trivial.

This can be seen from the following example.

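Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by

\[ f(x) = \begin{cases} 0 & \text{if } x = 0; \\ x^2 \sin\left(\frac{1}{x^2}\right) & \text{if } x \neq 0. \end{cases} \]

When $x \neq 0$, we have

\[ f'(x) = 2x \sin\left(\frac{1}{x^2}\right) - \left(\frac{2}{x}\right) \cos\left(\frac{1}{x^2}\right). \]

Since $|\sin(\cdot)| \leq 1$ and $\lim_{x \to 0} 2x = 0$, the first term converges to zero as $x \to 0$. However, since $|\cos(\cdot)| \leq 1$ with $\cos(1/x^2)$ oscillating between $-1$ and $1$, and $\lim_{x \to 0} |2/x| = \infty$, the second term oscillates without bound as $x \to 0$. It is clear that $\lim_{x \to 0} f'(x)$ is not well defined.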

Continued on the next slide.


Differentiable Functions on Euclidean Spaces Part 10

Continued from the previous slide.

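However, $f'(0)$ does exist.

\[ f'(0) = \lim_{x \to 0} \left( \frac{f(x) - f(0)}{x - 0} \right) = \lim_{x \to 0} \left( \frac{f(x) - 0}{x - 0} \right) = \lim_{x \to 0} \frac{f(x)}{x} = \lim_{x \to 0} \frac{x^2 \sin\left(\frac{1}{x^2}\right)}{x} = \lim_{x \to 0} x \sin\left(\frac{1}{x^2}\right). \]

Since $\left|\sin\left(\frac{1}{x^2}\right)\right| \leq 1$ for all $\frac{1}{x^2} \in \mathbb{R}$, we know that $-1 \leq \sin\left(\frac{1}{x^2}\right) \leq 1$ for all $\frac{1}{x^2} \in \mathbb{R}$, and hence that $-|x| \leq x \sin\left(\frac{1}{x^2}\right) \leq |x|$ for all $x \neq 0$.

This means that $f'(0) = \lim_{x \to 0} x \sin\left(\frac{1}{x^2}\right) = 0$.

Since $\lim_{x \to 0} f'(x)$ does not exist, it cannot be equal to $f'(0)$, so we know that $Df$ is not continuous at the point $x = 0$.

This means that $f$ is not continuously differentiable.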
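The two halves of this example can be illustrated numerically. The sketch below is an illustration only: the sample points $x_k = 1/\sqrt{2\pi k}$ are an assumption chosen so that $\sin(1/x_k^2) = 0$ and $\cos(1/x_k^2) = 1$. The difference quotients at zero shrink towards $f'(0) = 0$, while $f'(x_k) = -2\sqrt{2\pi k}$ grows without bound in absolute value as $x_k \to 0$.

```python
import math

def f(x):
    """f(x) = x^2 sin(1/x^2) for x != 0, and f(0) = 0."""
    return 0.0 if x == 0 else x * x * math.sin(1.0 / (x * x))

def f_prime(x):
    """Derivative away from zero: 2x sin(1/x^2) - (2/x) cos(1/x^2)."""
    inv = 1.0 / (x * x)
    return 2.0 * x * math.sin(inv) - (2.0 / x) * math.cos(inv)

# Difference quotients (f(h) - f(0)) / h are bounded by |h|, so they tend to 0.
steps = (1e-1, 1e-2, 1e-3)
quotients = [(f(h) - f(0.0)) / h for h in steps]

# Sample points approaching zero at which the cosine term dominates.
points = [1.0 / math.sqrt(2.0 * math.pi * k) for k in (1, 100, 10000)]
slopes = [f_prime(x) for x in points]

print(quotients)
print(slopes)
```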


Differentiable Functions on Euclidean Spaces Part 11

Theorem: If the functions $f : \mathbb{R}^m \to \mathbb{R}^n$ and $g : \mathbb{R}^m \to \mathbb{R}^n$ are both differentiable at the point $x \in \mathbb{R}^m$, then the function $f + g$ is also differentiable at that point and $D(f + g)(x) = Df(x) + Dg(x)$.

Theorem: If the function $f : \mathbb{R}^k \to \mathbb{R}^m$ is differentiable at the point $x \in \mathbb{R}^k$ and the function $g : \mathbb{R}^m \to \mathbb{R}^n$ is differentiable at the point $f(x) \in \mathbb{R}^m$, then the composite function $g \circ f$ is also differentiable at the point $x \in \mathbb{R}^k$ and $D(g \circ f)(x) = Dg(f(x)) Df(x)$.

The result that $D(g \circ f)(x) = Dg(f(x)) Df(x)$ is known as the “chain rule” of differentiation.

Note that these two theorems are “if, then” theorems, not “if and only if” theorems.
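The chain rule is a statement about matrix multiplication, which a small numerical check can make concrete. This is a sketch under stated assumptions: the maps `f` and `g`, their hand-computed derivatives, and the helper names are all hypothetical and chosen only for illustration. It compares the product $Dg(f(x)) Df(x)$ with a central-difference approximation to the derivative of $g \circ f$.

```python
def f(x):
    """Hypothetical inner map f: R^2 -> R^2."""
    x1, x2 = x
    return (x1 * x2, x1 + x2)

def Df(x):
    """Hand-computed (2 x 2) derivative of f."""
    x1, x2 = x
    return ((x2, x1), (1.0, 1.0))

def g(u):
    """Hypothetical outer map g: R^2 -> R."""
    u1, u2 = u
    return u1 ** 2 + 3.0 * u2

def Dg(u):
    """Hand-computed (1 x 2) derivative of g."""
    u1, u2 = u
    return ((2.0 * u1, 3.0),)

def matmul(A, B):
    """Product of two matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0])))
        for i in range(len(A))
    )

def numerical_gradient(func, x, h=1e-6):
    """Central-difference approximation to the derivative of a real-valued func."""
    grad = []
    for j in range(len(x)):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        grad.append((func(tuple(up)) - func(tuple(down))) / (2.0 * h))
    return tuple(grad)

x = (1.0, 2.0)
chain_rule = matmul(Dg(f(x)), Df(x))[0]          # Dg(f(x)) Df(x)
numeric = numerical_gradient(lambda z: g(f(z)), x)  # direct approximation
print(chain_rule, numeric)
```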


Differentiable Functions on Euclidean Spaces Part 12

Let S be a sub-space of Euclidean m-space and T be a sub-space of Euclidean n-space.

Consider a function of the form f : S −→ T . A necessary, but not sufficient, condition for f to be differentiable at a point x ∈ S is that f be continuous at that point. A necessary, but not sufficient, condition for f to be differentiable on S is that f be continuous on S.

There are some functions that are continuous on R but are not differentiable at any point in R.

Such cases are not necessarily just idle curiosities. For example, they play a central role in the study of Brownian motion (also known as the Wiener process), whose sample paths are, with probability one, continuous everywhere but differentiable nowhere. This stochastic process is encountered in some models of asset pricing.


Partial Derivatives Part 1

Consider a function $f : X \to \mathbb{R}$, where $X \subseteq \mathbb{R}^n$. This function can be written as $f(x_1, x_2, \ldots, x_n)$. Suppose that we hold the values that are taken by $(n - 1)$ of the independent variables constant and only allow one of the independent variables to take on different values. Specifically, suppose that we fix the value of $x_i = \bar{x}_i$ for all $i \in \{1, 2, \ldots, n\} \setminus \{k\}$ and allow only $x_k$ to vary.

To simplify notation, let

\[ x_{-k} = (x_1, x_2, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n) \]

and

\[ \bar{x}_{-k} = (\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{k-1}, \bar{x}_{k+1}, \ldots, \bar{x}_n). \]


Partial Derivatives Part 2

We can now think about the multivariate function as a univariate function, with xk being the sole independent variable.

To be precise, we have

\[ g(x_k) = f(x_k; \bar{x}_{-k}). \]

Suppose that the univariate function $g(x_k)$ is differentiable and that its derivative is given by

\[ g'(x_k) = \frac{dg(x_k)}{dx_k}. \]


Partial Derivatives Part 3

We use this derivative to define the partial derivative of the function $f(x_1, x_2, \ldots, x_n)$ with respect to the variable $x_k$. To be precise, we define this partial derivative as

\[ \frac{\partial f(x_k, x_{-k})}{\partial x_k} \equiv \frac{dg(x_k)}{dx_k}. \]

As a matter of convenience, we will sometimes use the following notation for a partial derivative:

\[ f_k(x_k, x_{-k}) = \frac{\partial f(x_k, x_{-k})}{\partial x_k}. \]

Note that $f_k(x_k; x_{-k})$ will potentially depend on the values taken by all of the independent variables, not just on the one that is allowed to vary.


Partial Derivatives Part 4

The formal definition of the partial derivative of the function $f(x_1, x_2, \ldots, x_n)$ with respect to the variable $x_k$ is

\[ \frac{\partial f(x_k, x_{-k})}{\partial x_k} := \lim_{h \to 0} \left\{ \frac{f(x_k + h, x_{-k}) - f(x_k, x_{-k})}{(x_k + h) - x_k} \right\} = \lim_{h \to 0} \left\{ \frac{f(x_k + h, x_{-k}) - f(x_k, x_{-k})}{h} \right\}. \]

Note that if we think of $f(x_k, x_{-k})$ as a univariate function of $x_k$ alone, then this definition collapses to the definition of the first-order derivative of a univariate function.


Partial Derivatives Part 5

Let $S \subseteq \mathbb{R}^n$ be an open set. Consider a function $f : S \to \mathbb{R}$. This function can be written as $f(x_1, x_2, \ldots, x_n)$.

Theorem: If $f$ is differentiable at the point $x \in S$, then all $n$ of the first-order partial derivatives of $f$ exist at the point $x$ and

\[ Df(x) = \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \ldots, \frac{\partial f(x)}{\partial x_n} \right). \]

Theorem: If all $n$ of the first-order partial derivatives of $f$ exist and are continuous at the point $x \in S$, then $Df(x)$ exists and

\[ Df(x) = \left( \frac{\partial f(x)}{\partial x_1}, \frac{\partial f(x)}{\partial x_2}, \ldots, \frac{\partial f(x)}{\partial x_n} \right). \]

Theorem: $f \in C^1$ on $S$ if and only if all of the first-order partial derivatives of $f$ exist and are continuous on $S$.


Partial Derivatives Example 1 Part 1

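Let $f(x, y) = 2x^2 - xy + y^2$. We will use the definition of a partial derivative to find the partial derivative of $f$ with respect to $x$. We have

\[ \begin{aligned} \frac{\partial f}{\partial x} &= \lim_{h \to 0} \left\{ \frac{f(x + h, y) - f(x, y)}{(x + h) - x} \right\} = \lim_{h \to 0} \left\{ \frac{f(x + h, y) - f(x, y)}{h} \right\} \\ &= \lim_{h \to 0} \left\{ \frac{\left( 2(x + h)^2 - (x + h)y + y^2 \right) - \left( 2x^2 - xy + y^2 \right)}{h} \right\} \end{aligned} \]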


Partial Derivatives Example 1 Part 2

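\[ \begin{aligned} &= \lim_{h \to 0} \left\{ \frac{2\left( x^2 + 2xh + h^2 \right) - xy - hy + y^2 - 2x^2 + xy - y^2}{h} \right\} \\ &= \lim_{h \to 0} \left\{ \frac{2x^2 + 4xh + 2h^2 - xy - hy + y^2 - 2x^2 + xy - y^2}{h} \right\} \\ &= \lim_{h \to 0} \left\{ \frac{4xh + 2h^2 - hy}{h} \right\} \\ &= \lim_{h \to 0} \left\{ 4x + 2h - y \right\} \\ &= 4x + 2(0) - y = 4x - y. \end{aligned} \]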
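The limit in this example can be watched numerically. In the sketch below (illustrative only; the evaluation point $(x, y) = (1, 3)$ and the step sizes are arbitrary assumptions), the difference quotient equals $4x + 2h - y$ for each $h$, so it approaches $4x - y$ as $h$ shrinks.

```python
def f(x, y):
    """The function from the example: f(x, y) = 2x^2 - xy + y^2."""
    return 2.0 * x ** 2 - x * y + y ** 2

def difference_quotient(x, y, h):
    """(f(x + h, y) - f(x, y)) / h, holding y fixed."""
    return (f(x + h, y) - f(x, y)) / h

x, y = 1.0, 3.0                # arbitrary evaluation point
steps = (1e-1, 1e-2, 1e-3)     # arbitrary shrinking step sizes
quotients = [difference_quotient(x, y, h) for h in steps]
limit = 4.0 * x - y            # the partial derivative derived above

print(quotients, limit)
```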


Partial Derivatives Example 2

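Let $f(x, y) = x^3 y + e^{xy^2} = x^3 y + \exp\left( xy^2 \right)$. We will use our knowledge of univariate calculus to find the partial derivative of $f$ with respect to $x$ and the partial derivative of $f$ with respect to $y$. We have

\[ \begin{aligned} \frac{\partial f}{\partial x} &= \frac{\partial}{\partial x} \left\{ x^3 y + \exp\left( xy^2 \right) \right\} \\ &= \frac{\partial}{\partial x} \left\{ x^3 y \right\} + \frac{\partial}{\partial x} \left\{ \exp\left( xy^2 \right) \right\} \\ &= 3x^2 y + y^2 \exp\left\{ xy^2 \right\}, \end{aligned} \]

and

\[ \begin{aligned} \frac{\partial f}{\partial y} &= \frac{\partial}{\partial y} \left\{ x^3 y + \exp\left( xy^2 \right) \right\} \\ &= \frac{\partial}{\partial y} \left\{ x^3 y \right\} + \frac{\partial}{\partial y} \left\{ \exp\left( xy^2 \right) \right\} \\ &= x^3 + 2xy \exp\left\{ xy^2 \right\}. \end{aligned} \]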
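One way to sanity-check partial derivatives obtained by hand is to compare them with central differences. The sketch below is illustrative only: the helper `central_difference`, the evaluation point, and the step size are assumptions made for this check and do not come from the slides.

```python
import math

def f(x, y):
    """The function from the example: f(x, y) = x^3 y + exp(x y^2)."""
    return x ** 3 * y + math.exp(x * y ** 2)

def df_dx(x, y):
    """Partial derivative with respect to x found above."""
    return 3.0 * x ** 2 * y + y ** 2 * math.exp(x * y ** 2)

def df_dy(x, y):
    """Partial derivative with respect to y found above."""
    return x ** 3 + 2.0 * x * y * math.exp(x * y ** 2)

def central_difference(func, point, k, h=1e-6):
    """Approximate the k-th partial derivative of func at point."""
    up, down = list(point), list(point)
    up[k] += h
    down[k] -= h
    return (func(*up) - func(*down)) / (2.0 * h)

point = (0.5, 1.5)  # arbitrary evaluation point
numeric_dx = central_difference(f, point, 0)
numeric_dy = central_difference(f, point, 1)
print(df_dx(*point), numeric_dx)
print(df_dy(*point), numeric_dy)
```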


Partial Derivatives Example 3 Part 1

Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by

\[ f(x, y) = \begin{cases} 0 & \text{if } (x, y) = (0, 0); \\ \dfrac{xy}{\sqrt{x^2 + y^2}} & \text{if } (x, y) \neq (0, 0). \end{cases} \]

All of the partial derivatives of this function exist everywhere, including at the point (x, y) = (0, 0).

However, the partial derivatives are not continuous at the point (0, 0).

The function f is not differentiable at the point (0, 0).


Partial Derivatives Example 3 Part 2

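Note that $f(x, 0) = 0$ whenever $x \neq 0$. This means that whenever $x \neq 0$, we have

\[ \frac{\partial f(x, 0)}{\partial y} = \lim_{h \to 0} \left( \frac{f(x, h) - f(x, 0)}{h} \right) = \lim_{h \to 0} \left( \frac{f(x, h)}{h} \right) = \lim_{h \to 0} \left( \frac{1}{h} \cdot \frac{xh}{\sqrt{x^2 + h^2}} \right) = \lim_{h \to 0} \left( \frac{x}{\sqrt{x^2 + h^2}} \right) = \frac{x}{|x|}, \]

which is equal to $1$ when $x > 0$ and to $-1$ when $x < 0$.

Note also that

\[ \frac{\partial f(0, 0)}{\partial y} = \lim_{h \to 0} \left( \frac{f(0, h) - f(0, 0)}{h} \right) = \lim_{h \to 0} \left( \frac{0 - 0}{h} \right) = \lim_{h \to 0} \left( \frac{0}{h} \right) = \lim_{h \to 0} 0 = 0. \]

Since $\frac{\partial f}{\partial y}$ exists at the point $(x, y) = (0, 0)$, but $\frac{\partial f(x, 0)}{\partial y}$ does not converge to $\frac{\partial f(0, 0)}{\partial y} = 0$ as $x \to 0$, we know that $\frac{\partial f}{\partial y}$ is not continuous at the point $(0, 0)$.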


Partial Derivatives Example 3 Part 3

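Note that $f(0, y) = 0$ whenever $y \neq 0$. This means that whenever $y \neq 0$, we have

\[ \frac{\partial f(0, y)}{\partial x} = \lim_{h \to 0} \left( \frac{f(h, y) - f(0, y)}{h} \right) = \lim_{h \to 0} \left( \frac{f(h, y)}{h} \right) = \lim_{h \to 0} \left( \frac{1}{h} \cdot \frac{hy}{\sqrt{h^2 + y^2}} \right) = \lim_{h \to 0} \left( \frac{y}{\sqrt{h^2 + y^2}} \right) = \frac{y}{|y|}, \]

which is equal to $1$ when $y > 0$ and to $-1$ when $y < 0$.

Note also that

\[ \frac{\partial f(0, 0)}{\partial x} = \lim_{h \to 0} \left( \frac{f(h, 0) - f(0, 0)}{h} \right) = \lim_{h \to 0} \left( \frac{0 - 0}{h} \right) = \lim_{h \to 0} \left( \frac{0}{h} \right) = \lim_{h \to 0} 0 = 0. \]

Since $\frac{\partial f}{\partial x}$ exists at the point $(x, y) = (0, 0)$, but $\frac{\partial f(0, y)}{\partial x}$ does not converge to $\frac{\partial f(0, 0)}{\partial x} = 0$ as $y \to 0$, we know that $\frac{\partial f}{\partial x}$ is not continuous at the point $(0, 0)$.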


Partial Derivatives Example 3 Part 4

In this example, f is not differentiable at the point (x, y) = (0, 0).

We will prove this by contradiction.

Suppose for the moment that f was differentiable at the point (x, y) = (0, 0).

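In this case, we must have

\[ Df(0, 0) = \left( \frac{\partial f(0, 0)}{\partial x}, \frac{\partial f(0, 0)}{\partial y} \right) = (0, 0). \]

This means that we must have

\[ \lim_{(x, y) \to (0, 0)} \left( \frac{\|f(x, y) - f(0, 0) - Df(0, 0)((x, y) - (0, 0))\|}{\|(x, y) - (0, 0)\|} \right) = 0. \]

This can be simplified to obtain

\[ \lim_{(x, y) \to (0, 0)} \left( \frac{\|f(x, y) - f(0, 0) - Df(0, 0)(x, y)\|}{\|(x, y)\|} \right) = 0. \]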

Continued on the next slide.


Partial Derivatives Example 3 Part 5

Continued from the previous slide.

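Note that for each $\epsilon > 0$, there must be some $a \in \mathbb{R}$ with $a \neq 0$ such that the point $(a, a) \in B_\epsilon((0, 0))$. Since $f(0, 0) = 0$, $Df(0, 0) = (0, 0)$, and $\|(x, y)\| = \sqrt{x^2 + y^2}$, we have

\[ \lim_{a \to 0} \left( \frac{\|f(a, a) - f(0, 0) - Df(0, 0)(a, a)\|}{\|(a, a)\|} \right) = \lim_{a \to 0} \left( \frac{a^2}{2a^2} \right) = \lim_{a \to 0} \left( \frac{1}{2} \right) = \frac{1}{2} \neq 0. \]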

Contradiction.

Hence our initial assumption must have been wrong.

Thus we can conclude that f is not differentiable at the point (x, y) = (0, 0).
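The contradiction can be seen numerically as well. In this minimal sketch (the helper name `ratio` and the sample points are assumptions made for illustration), the ratio that would have to vanish stays at $\frac{1}{2}$ along the diagonal $(a, a)$, even though it is zero along the $x$-axis.

```python
import math

def f(x, y):
    """f(x, y) = xy / sqrt(x^2 + y^2) away from the origin, and f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / math.sqrt(x * x + y * y)

def ratio(x, y):
    """|f(x, y) - f(0, 0) - Df(0, 0)(x, y)| / ||(x, y)|| with Df(0, 0) = (0, 0)."""
    return abs(f(x, y)) / math.sqrt(x * x + y * y)

steps = (1e-1, 1e-3, 1e-6)
diagonal = [ratio(a, a) for a in steps]   # approach the origin along (a, a)
axis = [ratio(a, 0.0) for a in steps]     # approach the origin along (a, 0)
print(diagonal, axis)
```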


Some Economic Applications

Marginal utilities;

Marginal products;

Various elasticities of demand;

Cournot aggregation; and

Engel aggregation.


Marginal Utilities Part 1

We will provide some illustrations of an application of the concept of a partial derivative by using a variety of different two-commodity utility functions. These utility functions take the form $U : \mathbb{R}^2_+ \to \mathbb{R}$. We will think of this as a function that ranks bundles of two commodities according to their desirability to the consumer. The equation for the graph of the utility function is

\[ U = U(Q_1, Q_2), \]

where $U$ is the level of utility (which is an ordinal concept, not a cardinal one, for people like me who worry about such things), $Q_1$ is the amount of good one that is consumed and $Q_2$ is the amount of good two that is consumed.


Marginal Utilities Part 2

We will consider a perfect substitutes utility function, a generalised perfect substitutes utility function, a Cobb-Douglas utility function and a constant elasticity of substitution utility function.

We should emphasise that because utility is only an ordinal concept (only ranking is important), the concept of marginal utility is not a meaningful one. However, the related concept of a marginal rate of substitution is meaningful.


Marginal Utilities Part 3

Suppose that we have a perfect substitutes utility function:

U (Q1, Q2) = Q1 + Q2.

The marginal utility of good one is simply the first-order partial derivative of this utility function with respect to the quantity of good one that is consumed. Thus we have

\[ MU_1(Q_1, Q_2) = \frac{\partial U(Q_1, Q_2)}{\partial Q_1} = \frac{\partial (Q_1 + Q_2)}{\partial Q_1} = 1. \]

The marginal utility of good two is simply the first-order partial derivative of this utility function with respect to the quantity of good two that is consumed. Thus we have

\[ MU_2(Q_1, Q_2) = \frac{\partial U(Q_1, Q_2)}{\partial Q_2} = \frac{\partial (Q_1 + Q_2)}{\partial Q_2} = 1. \]


Marginal Utilities Part 4

Suppose that we have a generalised perfect substitutes utility function:

U (Q1, Q2) = αQ1 + βQ2.

The marginal utility of good one is simply the first-order partial derivative of this utility function with respect to the quantity of good one that is consumed. Thus we have

\[ MU_1(Q_1, Q_2) = \frac{\partial U(Q_1, Q_2)}{\partial Q_1} = \frac{\partial (\alpha Q_1 + \beta Q_2)}{\partial Q_1} = \alpha. \]

The marginal utility of good two is simply the first-order partial derivative of this utility function with respect to the quantity of good two that is consumed. Thus we have

\[ MU_2(Q_1, Q_2) = \frac{\partial U(Q_1, Q_2)}{\partial Q_2} = \frac{\partial (\alpha Q_1 + \beta Q_2)}{\partial Q_2} = \beta. \]


Marginal Utilities Part 5

Suppose that we have a Cobb-Douglas utility function:

\[ U(Q_1, Q_2) = Q_1^{\alpha} Q_2^{1 - \alpha}, \]

where $0 < \alpha < 1$.

The marginal utility of good one is simply the first-order partial derivative of this utility function with respect to the quantity of good one that is consumed.

Continued on the next slide.


Marginal Utilities Part 6

Continued from the previous slide.

Thus we have

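\[ \begin{aligned} MU_1(Q_1, Q_2) &= \frac{\partial U(Q_1, Q_2)}{\partial Q_1} \\ &= \frac{\partial \left( Q_1^{\alpha} Q_2^{1 - \alpha} \right)}{\partial Q_1} = \alpha Q_1^{\alpha - 1} Q_2^{1 - \alpha} \\ &= \alpha Q_1^{-(1 - \alpha)} Q_2^{1 - \alpha} \\ &= \alpha \left( \frac{Q_2^{1 - \alpha}}{Q_1^{1 - \alpha}} \right) = \alpha \left( \frac{Q_2}{Q_1} \right)^{1 - \alpha}. \end{aligned} \]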


Marginal Utilities Part 7

The marginal utility of good two is simply the first-order partial derivative of this utility function with respect to the quantity of good two that is consumed. Thus we have

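\[ \begin{aligned} MU_2(Q_1, Q_2) &= \frac{\partial U(Q_1, Q_2)}{\partial Q_2} \\ &= \frac{\partial \left( Q_1^{\alpha} Q_2^{1 - \alpha} \right)}{\partial Q_2} = (1 - \alpha) Q_1^{\alpha} Q_2^{1 - \alpha - 1} \\ &= (1 - \alpha) Q_1^{\alpha} Q_2^{-\alpha} \\ &= (1 - \alpha) \left( \frac{Q_1^{\alpha}}{Q_2^{\alpha}} \right) = (1 - \alpha) \left( \frac{Q_1}{Q_2} \right)^{\alpha}. \end{aligned} \]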
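The two Cobb-Douglas marginal utility formulas can be checked against central differences, and their ratio gives the marginal rate of substitution mentioned earlier, which simplifies to $\left( \frac{\alpha}{1 - \alpha} \right) \left( \frac{Q_2}{Q_1} \right)$. The sketch below is illustrative only: the parameter value, the consumption bundle, and the helper names are assumptions.

```python
def utility(q1, q2, alpha):
    """Cobb-Douglas utility U = Q1^alpha * Q2^(1 - alpha)."""
    return q1 ** alpha * q2 ** (1.0 - alpha)

def mu1(q1, q2, alpha):
    """Marginal utility of good one: alpha * (Q2 / Q1)^(1 - alpha)."""
    return alpha * (q2 / q1) ** (1.0 - alpha)

def mu2(q1, q2, alpha):
    """Marginal utility of good two: (1 - alpha) * (Q1 / Q2)^alpha."""
    return (1.0 - alpha) * (q1 / q2) ** alpha

def central_difference(func, point, k, h=1e-6):
    """Approximate the k-th partial derivative of func at point."""
    up, down = list(point), list(point)
    up[k] += h
    down[k] -= h
    return (func(*up) - func(*down)) / (2.0 * h)

alpha, q1, q2 = 0.3, 2.0, 5.0  # illustrative values only

def u(a, b):
    return utility(a, b, alpha)

numeric_mu1 = central_difference(u, (q1, q2), 0)
numeric_mu2 = central_difference(u, (q1, q2), 1)
mrs = mu1(q1, q2, alpha) / mu2(q1, q2, alpha)
print(numeric_mu1, numeric_mu2, mrs)
```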


Marginal Utilities Part 8

Suppose that we have a constant elasticity of substitution utility function:

\[ U(Q_1, Q_2) = \left( \alpha Q_1^{\rho} + \beta Q_2^{\rho} \right)^{\frac{1}{\rho}}. \]

The marginal utility of good one is simply the first-order partial derivative of this utility function with respect to the quantity of good one that is consumed. Thus we have

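\[ \begin{aligned} MU_1(Q_1, Q_2) &= \frac{\partial U(Q_1, Q_2)}{\partial Q_1} \\ &= \frac{\partial \left( \left( \alpha Q_1^{\rho} + \beta Q_2^{\rho} \right)^{\frac{1}{\rho}} \right)}{\partial Q_1} \\ &= \left( \frac{1}{\rho} \right) \left( \alpha Q_1^{\rho} + \beta Q_2^{\rho} \right)^{\frac{1}{\rho} - 1} \left( \rho \alpha Q_1^{\rho - 1} \right) \\ &= \alpha Q_1^{\rho - 1} \left( \alpha Q_1^{\rho} + \beta Q_2^{\rho} \right)^{\frac{(1 - \rho)}{\rho}}. \end{aligned} \]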


Marginal Utilities Part 9

The marginal utility of good two is simply the first-order partial derivative of this utility function with respect to the quantity of good two that is consumed. Thus we have

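\[ \begin{aligned} MU_2(Q_1, Q_2) &= \frac{\partial U(Q_1, Q_2)}{\partial Q_2} \\ &= \frac{\partial \left( \left( \alpha Q_1^{\rho} + \beta Q_2^{\rho} \right)^{\frac{1}{\rho}} \right)}{\partial Q_2} \\ &= \left( \frac{1}{\rho} \right) \left( \alpha Q_1^{\rho} + \beta Q_2^{\rho} \right)^{\frac{1}{\rho} - 1} \left( \rho \beta Q_2^{\rho - 1} \right) \\ &= \beta Q_2^{\rho - 1} \left( \alpha Q_1^{\rho} + \beta Q_2^{\rho} \right)^{\frac{(1 - \rho)}{\rho}}. \end{aligned} \]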


Marginal Products Part 1

We will provide some illustrations of an application of the concept of a partial derivative by using a variety of different two-input and one-output production technologies. These technologies can be represented by a production function of the form $f : \mathbb{R}^2_+ \to \mathbb{R}_+$. We will think of this as a function that turns bundles of labour and capital into an amount of output. In other words, the equation for the graph of the production function is

Q = f (L, K) ,

where Q is the quantity of output that is produced when L units of labour and K units of capital are employed.

We will consider a perfect substitutes production function, a generalised perfect substitutes production function, a Cobb-Douglas production function and a constant elasticity of substitution production function.


Marginal Products Part 2

Suppose that we have a perfect substitutes production function:

f (L, K) = L + K .

The marginal product of labour is simply the first-order partial derivative of this production function with respect to the quantity of labour that is employed. Thus we have

\[ MP_L(L, K) = \frac{\partial f(L, K)}{\partial L} = \frac{\partial (L + K)}{\partial L} = 1. \]

The marginal product of capital is simply the first-order partial derivative of this production function with respect to the quantity of capital that is employed. Thus we have

\[ MP_K(L, K) = \frac{\partial f(L, K)}{\partial K} = \frac{\partial (L + K)}{\partial K} = 1. \]


Marginal Products Part 3

Suppose that we have a generalised perfect substitutes production function:

f (L, K) = αL + βK .

The marginal product of labour is simply the first-order partial derivative of this production function with respect to the quantity of labour that is employed. Thus we have

\[ MP_L(L, K) = \frac{\partial f(L, K)}{\partial L} = \frac{\partial (\alpha L + \beta K)}{\partial L} = \alpha. \]

The marginal product of capital is simply the first-order partial derivative of this production function with respect to the quantity of capital that is employed. Thus we have

\[ MP_K(L, K) = \frac{\partial f(L, K)}{\partial K} = \frac{\partial (\alpha L + \beta K)}{\partial K} = \beta. \]


Marginal Products Part 4

Suppose that we have a Cobb-Douglas production function:

\[ f(L, K) = A L^{\alpha} K^{\beta}. \]

The marginal product of labour is simply the first-order partial derivative of this production function with respect to the quantity of labour that is employed. Thus we have

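\[ MP_L(L, K) = \frac{\partial f(L, K)}{\partial L} = \frac{\partial \left( A L^{\alpha} K^{\beta} \right)}{\partial L} = \alpha A L^{\alpha - 1} K^{\beta}. \]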


Marginal Products Part 5

The marginal product of capital is simply the first-order partial derivative of this production function with respect to the quantity of capital that is employed. Thus we have

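\[ MP_K(L, K) = \frac{\partial f(L, K)}{\partial K} = \frac{\partial \left( A L^{\alpha} K^{\beta} \right)}{\partial K} = \beta A L^{\alpha} K^{\beta - 1}. \]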


Marginal Products Part 6

Suppose that we have a constant elasticity of substitution production function:

\[ f(L, K) = \left( \alpha L^{\rho} + \beta K^{\rho} \right)^{\frac{1}{\rho}}. \]

The marginal product of labour is simply the first-order partial derivative of this production function with respect to the quantity of labour that is employed. Thus we have

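\[ \begin{aligned} MP_L(L, K) &= \frac{\partial f(L, K)}{\partial L} \\ &= \frac{\partial \left( \left( \alpha L^{\rho} + \beta K^{\rho} \right)^{\frac{1}{\rho}} \right)}{\partial L} \\ &= \left( \frac{1}{\rho} \right) \left( \alpha L^{\rho} + \beta K^{\rho} \right)^{\frac{1}{\rho} - 1} \left( \rho \alpha L^{\rho - 1} \right) \\ &= \alpha L^{\rho - 1} \left( \alpha L^{\rho} + \beta K^{\rho} \right)^{\frac{(1 - \rho)}{\rho}}. \end{aligned} \]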


Marginal Products Part 7

The marginal product of capital is simply the first-order partial derivative of this production function with respect to the quantity of capital that is employed. Thus we have

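\[ \begin{aligned} MP_K(L, K) &= \frac{\partial f(L, K)}{\partial K} \\ &= \frac{\partial \left( \left( \alpha L^{\rho} + \beta K^{\rho} \right)^{\frac{1}{\rho}} \right)}{\partial K} \\ &= \left( \frac{1}{\rho} \right) \left( \alpha L^{\rho} + \beta K^{\rho} \right)^{\frac{1}{\rho} - 1} \left( \rho \beta K^{\rho - 1} \right) \\ &= \beta K^{\rho - 1} \left( \alpha L^{\rho} + \beta K^{\rho} \right)^{\frac{(1 - \rho)}{\rho}}. \end{aligned} \]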
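As a rough check on the constant elasticity of substitution marginal products, the sketch below evaluates the closed-form expressions and compares them with central differences. The parameter values and input bundle are assumptions chosen only so that the arithmetic is easy to follow. It also confirms that setting $\rho = 1$ recovers the generalised perfect substitutes case, in which the marginal products are $\alpha$ and $\beta$.

```python
def ces(l, k, alpha, beta, rho):
    """CES production function f(L, K) = (alpha L^rho + beta K^rho)^(1/rho)."""
    return (alpha * l ** rho + beta * k ** rho) ** (1.0 / rho)

def mpl(l, k, alpha, beta, rho):
    """Marginal product of labour from the closed-form expression above."""
    inner = alpha * l ** rho + beta * k ** rho
    return alpha * l ** (rho - 1.0) * inner ** ((1.0 - rho) / rho)

def mpk(l, k, alpha, beta, rho):
    """Marginal product of capital from the closed-form expression above."""
    inner = alpha * l ** rho + beta * k ** rho
    return beta * k ** (rho - 1.0) * inner ** ((1.0 - rho) / rho)

# Illustrative values only: with rho = 0.5, L = 4 and K = 9 the square
# roots are whole numbers, so alpha L^rho + beta K^rho = 0.8 + 1.8 = 2.6.
alpha, beta, rho = 0.4, 0.6, 0.5
l, k, h = 4.0, 9.0, 1e-6

numeric_mpl = (ces(l + h, k, alpha, beta, rho) - ces(l - h, k, alpha, beta, rho)) / (2.0 * h)
numeric_mpk = (ces(l, k + h, alpha, beta, rho) - ces(l, k - h, alpha, beta, rho)) / (2.0 * h)

# With rho = 1 the CES form collapses to alpha L + beta K.
linear_mpl = mpl(l, k, alpha, beta, 1.0)
linear_mpk = mpk(l, k, alpha, beta, 1.0)
print(mpl(l, k, alpha, beta, rho), numeric_mpl, linear_mpl, linear_mpk)
```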


Elasticities of Demand Part 1

Suppose that the equation for the graph of an individual’s Marshallian (or “Walrasian” or “ordinary” or “uncompensated”) demand function for commodity k is given by

$$
Q_k = D_k(p_1, p_2, \cdots, p_n, y),
$$

where $p_i$ is the price of commodity $i$ for each $i \in \{1, 2, \cdots, n\}$ and $y$ is the consumer’s income.

Elasticities of Demand Part 2

The own-price elasticity of demand for commodity $k$ for this consumer is

$$
\varepsilon^k_k(p_1, p_2, \cdots, p_n, y) = \frac{\partial \ln \left( D_k(p_1, p_2, \cdots, p_n, y) \right)}{\partial \ln (p_k)} = \left( \frac{p_k}{Q_k} \right) \left( \frac{\partial D_k}{\partial p_k} \right) = \left( \frac{p_k}{D_k(p_1, p_2, \cdots, p_n, y)} \right) \left( \frac{\partial D_k}{\partial p_k} \right).
$$

Elasticities of Demand Part 3

The cross-price elasticity of demand for commodity $k$ with respect to the price of commodity $l$ for this consumer is

$$
\varepsilon^k_l(p_1, p_2, \cdots, p_n, y) = \frac{\partial \ln \left( D_k(p_1, p_2, \cdots, p_n, y) \right)}{\partial \ln (p_l)} = \left( \frac{p_l}{Q_k} \right) \left( \frac{\partial D_k}{\partial p_l} \right) = \left( \frac{p_l}{D_k(p_1, p_2, \cdots, p_n, y)} \right) \left( \frac{\partial D_k}{\partial p_l} \right).
$$

Elasticities of Demand Part 4

The income elasticity of demand for commodity $k$ for this consumer is

$$
\varepsilon^k_y(p_1, p_2, \cdots, p_n, y) = \frac{\partial \ln \left( D_k(p_1, p_2, \cdots, p_n, y) \right)}{\partial \ln (y)} = \left( \frac{y}{Q_k} \right) \left( \frac{\partial D_k}{\partial y} \right) = \left( \frac{y}{D_k(p_1, p_2, \cdots, p_n, y)} \right) \left( \frac{\partial D_k}{\partial y} \right).
$$
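
The three elasticity definitions above share one recipe: multiply the relevant partial derivative by the ratio of the variable to the quantity demanded. The sketch below illustrates this in Python under stated assumptions. The demand function is a hypothetical log-linear form chosen only because its elasticities are known constants (its exponents), and `elasticity` is an illustrative helper, not something taken from the slides.

```python
def demand(p_k, p_l, y):
    """Hypothetical log-linear demand for commodity k (illustration only)."""
    return 10.0 * p_k**-1.5 * p_l**0.4 * y**0.8


def elasticity(g, point, i, h=1e-6):
    """Return (z_i / g(z)) * dg/dz_i at z = point, using a central difference."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    slope = (g(*up) - g(*down)) / (2.0 * h)
    return point[i] / g(*point) * slope


point = (2.0, 3.0, 100.0)  # hypothetical values of (p_k, p_l, y)

own_price = elasticity(demand, point, 0)    # should be close to -1.5
cross_price = elasticity(demand, point, 1)  # should be close to 0.4
income = elasticity(demand, point, 2)       # should be close to 0.8

print(own_price, cross_price, income)
```

Because the assumed demand function is log-linear, each elasticity equals the exponent on the corresponding variable, so the numerical values can be compared directly with −1.5, 0.4 and 0.8.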

Elasticities of Demand Part 5

We can use these elasticities to classify the types of commodities that are being considered.

If $\varepsilon^k_y > 0$, then commodity $k$ is a normal good. If $\varepsilon^k_y < 0$, then commodity $k$ is an inferior good.

If $\varepsilon^k_l > 0$, then commodities $k$ and $l$ are substitutes. If $\varepsilon^k_l < 0$, then commodities $k$ and $l$ are complements.

Elasticities of Demand Part 6

The Marshallian demand curve for most commodities will usually slope down. As such, we would usually expect $\varepsilon^k_k < 0$.

However, there are circumstances in which the Marshallian demand curve for a commodity can slope up over some range of prices (at least in theory). Such commodities are known as Giffen goods. In such circumstances, we would have $\varepsilon^k_k > 0$ over the relevant range of prices. Note that a necessary, but not sufficient, condition for a commodity to be a Giffen good is that it be an inferior good.

Cournot Aggregation Part 1

Suppose that a consumer’s preferences over bundles of L commodities are locally non-satiated in the neighbourhood of any potentially feasible commodity bundle. Then we know that budget exhaustion (which is sometimes called Walras’ law for the individual) must hold for the consumer.

This ensures that

$$
\sum_{l=1}^{L} p_l x_l(p, y) = y,
$$

where $p = (p_1, p_2, \cdots, p_L) = (p_k, p_{-k})$ is the price vector, $y$ is the consumer’s income and $x_l(p, y)$ is the consumer’s Marshallian demand for good $l$.

Note that this can be rewritten as

$$
p_k x_k(p_k, p_{-k}, y) + \sum_{l \neq k} p_l x_l(p, y) = y.
$$

Cournot Aggregation Part 2

Partially differentiating both sides of this equation with respect to the price of commodity $k$, we obtain

$$
\left\{ (1) \, x_k(p, y) + p_k \left( \frac{\partial x_k(p, y)}{\partial p_k} \right) \right\} + \sum_{l \neq k} p_l \left( \frac{\partial x_l(p, y)}{\partial p_k} \right) = 0.
$$

This can be simplified to obtain

$$
x_k(p, y) + \sum_{l=1}^{L} p_l \frac{\partial x_l(p, y)}{\partial p_k} = 0.
$$

This can be rearranged to obtain

$$
\sum_{l=1}^{L} p_l \frac{\partial x_l(p, y)}{\partial p_k} = -x_k(p, y).
$$

Cournot Aggregation Part 3

This can be rewritten as

$$
\sum_{l=1}^{L} \left( \frac{p_k}{p_k} \right) \left( \frac{x_l(p, y)}{x_l(p, y)} \right) \left( \frac{y}{y} \right) p_l \frac{\partial x_l(p, y)}{\partial p_k} = -x_k(p, y).
$$

This can be rearranged to obtain

$$
\sum_{l=1}^{L} \left( \frac{p_l x_l(p, y)}{y} \right) \left( \frac{p_k}{x_l(p, y)} \right) \left( \frac{\partial x_l(p, y)}{\partial p_k} \right) = -\left( \frac{p_k x_k(p, y)}{y} \right).
$$

Cournot Aggregation Part 4

This can be rewritten as

$$
\sum_{l=1}^{L} s_l \varepsilon^l_k = -s_k,
$$

where

$$
s_l = \frac{p_l x_l(p, y)}{y}
$$

is the budget share of commodity $l$ and

$$
\varepsilon^l_k = \left( \frac{p_k}{x_l(p, y)} \right) \left( \frac{\partial x_l(p, y)}{\partial p_k} \right)
$$

is the $k$th price elasticity of demand for commodity $l$.

The above formula is a result known as Cournot aggregation. It provides a relationship between the $k$th price elasticities of demand for the various commodities.

Engel Aggregation Part 1

Suppose that a consumer’s preferences over bundles of L commodities are locally non-satiated in the neighbourhood of any potentially feasible commodity bundle. Then we know that budget exhaustion (which is sometimes called Walras’ law for the individual) must hold for the consumer.

This ensures that

$$
\sum_{l=1}^{L} p_l x_l(p, y) = y,
$$

where $p = (p_1, p_2, \cdots, p_L)$ is the price vector, $y$ is the consumer’s income and $x_l(p, y)$ is the consumer’s Marshallian demand for good $l$.

Partially differentiating both sides of this equation with respect to income, we obtain

$$
\sum_{l=1}^{L} p_l \left( \frac{\partial x_l(p, y)}{\partial y} \right) = 1.
$$

Engel Aggregation Part 2

This can be rewritten as

$$
\sum_{l=1}^{L} \left( \frac{y}{y} \right) \left( \frac{x_l(p, y)}{x_l(p, y)} \right) p_l \left( \frac{\partial x_l(p, y)}{\partial y} \right) = 1.
$$

This can be simplified to obtain

$$
\sum_{l=1}^{L} \left( \frac{p_l x_l(p, y)}{y} \right) \left( \frac{y}{x_l(p, y)} \right) \left( \frac{\partial x_l(p, y)}{\partial y} \right) = 1.
$$

Engel Aggregation Part 3

This can be rewritten as

$$
\sum_{l=1}^{L} s_l \varepsilon^l_y = 1,
$$

where

$$
s_l = \frac{p_l x_l(p, y)}{y}
$$

is the budget share of commodity $l$ and

$$
\varepsilon^l_y = \left( \frac{y}{x_l(p, y)} \right) \left( \frac{\partial x_l(p, y)}{\partial y} \right)
$$

is the income elasticity of demand for commodity $l$.

The above formula is a result known as Engel aggregation. It provides a relationship between the income elasticities of demand for the various commodities.
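
Both aggregation results (Cournot and Engel) can be illustrated numerically with any demand system that satisfies budget exhaustion. The sketch below is not from the slides. It assumes a three-good demand system of the linear expenditure (Stone-Geary) form with hypothetical parameters, chosen because it exhausts the budget whenever the marginal budget shares sum to one and because it has non-zero cross-price effects. The elasticities are computed by central differences.

```python
GAMMA = [1.0, 2.0, 0.5]  # hypothetical subsistence quantities
BETA = [0.5, 0.3, 0.2]   # hypothetical marginal budget shares (sum to one)
N_GOODS = 3


def demand(i, p, y):
    """Demand for good i under the assumed linear expenditure system."""
    discretionary = y - sum(pj * gj for pj, gj in zip(p, GAMMA))
    return GAMMA[i] + BETA[i] * discretionary / p[i]


def share(i, p, y):
    """Budget share s_i = p_i * x_i / y."""
    return p[i] * demand(i, p, y) / y


def price_elasticity(i, k, p, y, h=1e-6):
    """Elasticity of the demand for good i with respect to the price of good k."""
    up, down = list(p), list(p)
    up[k] += h
    down[k] -= h
    slope = (demand(i, up, y) - demand(i, down, y)) / (2.0 * h)
    return p[k] / demand(i, p, y) * slope


def income_elasticity(i, p, y, h=1e-6):
    """Elasticity of the demand for good i with respect to income."""
    slope = (demand(i, p, y + h) - demand(i, p, y - h)) / (2.0 * h)
    return y / demand(i, p, y) * slope


p, y, k = [2.0, 1.0, 4.0], 50.0, 0  # hypothetical prices, income, price index

spending = sum(p[i] * demand(i, p, y) for i in range(N_GOODS))
cournot_lhs = sum(share(i, p, y) * price_elasticity(i, k, p, y) for i in range(N_GOODS))
cournot_rhs = -share(k, p, y)
engel_lhs = sum(share(i, p, y) * income_elasticity(i, p, y) for i in range(N_GOODS))

print("budget exhaustion:", spending, "=", y)
print("Cournot:", cournot_lhs, "=", cournot_rhs)
print("Engel:", engel_lhs, "= 1")
```

The two sides of each printed identity should agree up to numerical error. Changing `k` checks Cournot aggregation for a different price.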

Higher Order Derivatives Part 1

Let $S \subseteq \mathbb{R}^n$ be an open set and $f : S \to \mathbb{R}$ be a function that is differentiable on $S$, with

$$
Df = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \cdots, \frac{\partial f}{\partial x_n} \right).
$$

The vector $Df(x)$ is called the gradient vector of $f$ at the point $x \in S$. Note that $Df : S \to \mathbb{R}^n$ is itself a function. Suppose that the function $Df$ is itself differentiable at the point $x \in S$. This means that each of the $n$ first-order partial derivatives of $f$ must itself be differentiable at the point $x$.

Consider one such partial derivative, say $\frac{\partial f}{\partial x_i}$.

For each $i \in \{1, 2, \cdots, n\}$, we have $n$ “second-order” partial derivatives of $\frac{\partial f}{\partial x_i}$. These take the form

$$
\frac{\partial \left( \frac{\partial f}{\partial x_i} \right)}{\partial x_k} = \frac{\partial^2 f}{\partial x_k \partial x_i}
$$

for each $k \in \{1, 2, \cdots, n\}$. Note that there are $n^2$ such second-order partial derivatives in total.

Thus we have

$$
D \left( \frac{\partial f}{\partial x_i} \right) = \left( \frac{\partial^2 f}{\partial x_1 \partial x_i}, \frac{\partial^2 f}{\partial x_2 \partial x_i}, \cdots, \frac{\partial^2 f}{\partial x_n \partial x_i} \right)
$$

for each $i \in \{1, 2, \cdots, n\}$.

Higher Order Derivatives Part 2

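Let $S \subseteq \mathbb{R}^n$ be an open set and $f : S \to \mathbb{R}$ be a function that is differentiable on $S$, with

$$
Df = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \cdots, \frac{\partial f}{\partial x_n} \right).
$$

Note that $Df : S \to \mathbb{R}^n$ is itself a function. Suppose that the function $Df$ is itself differentiable at the point $x \in S$. The derivative of the gradient vector $Df$ at the point $x \in S$ is given by the matrix

$$
D(Df(x)) = D^2 f(x) =
\begin{pmatrix}
\frac{\partial^2 f(x)}{\partial x_1^2} & \frac{\partial^2 f(x)}{\partial x_2 \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_n \partial x_1} \\
\frac{\partial^2 f(x)}{\partial x_1 \partial x_2} & \frac{\partial^2 f(x)}{\partial x_2^2} & \cdots & \frac{\partial^2 f(x)}{\partial x_n \partial x_2} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f(x)}{\partial x_1 \partial x_n} & \frac{\partial^2 f(x)}{\partial x_2 \partial x_n} & \cdots & \frac{\partial^2 f(x)}{\partial x_n^2}
\end{pmatrix},
$$

where $\frac{\partial^2 f(x)}{\partial x_i^2} = \frac{\partial^2 f(x)}{\partial x_i \partial x_i}$.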

Higher Order Derivatives Part 3

The matrix $D^2 f(x)$ is called the Hessian matrix of $f$ at the point $x \in S$. If $f$ is twice-differentiable at every point $x \in S$, then $f$ is said to be twice-differentiable on $S$.

If $f$ is twice-differentiable on $S$ and $\frac{\partial^2 f}{\partial x_k \partial x_i}$ is a continuous function on $S$ for all $(i, k) \in \{1, 2, \cdots, n\} \times \{1, 2, \cdots, n\}$, then $f$ is said to be twice continuously differentiable on $S$. This is denoted by $f \in C^2$ on $S$.

Young’s Theorem: If $f \in C^2$ on $S$, then $\frac{\partial^2 f}{\partial x_k \partial x_i} = \frac{\partial^2 f}{\partial x_i \partial x_k}$ for all $i \neq k$. Young’s Theorem ensures that the Hessian matrix for $f$ will be symmetric when $f \in C^2$ on $S$.

Higher Order Derivatives Part 4

It is possible to extend the process of differentiation for multivariate functions to even higher orders than second-order derivatives.

Doing so for partial derivatives should be relatively straightforward.

Doing so for vector and matrix derivatives might be more complicated.

Second-Order Partial Derivatives Part 1

Consider a function f : S −→ R, where S ⊆ Rn is an open set. This function can be written as f (x1, x2, · · · , xn). Assume that this function is at least twice continuously differentiable with respect to all of its arguments.

Suppose that we want to find the second-order partial derivative of the underlying function f (x1, x2, · · · , xn) with respect to the variable xi first and then with respect to the variable xj second.

The notation for this second-order derivative is

$$
\frac{\partial^2 f(x_1, x_2, \cdots, x_n)}{\partial x_j \partial x_i} = f_{ij}(x_1, x_2, \cdots, x_n).
$$

Second-Order Partial Derivatives Part 2

Note that the first-order partial derivative of this function with respect to the variable $x_i$ is a function in its own right:

$$
\frac{\partial f(x_1, x_2, \cdots, x_n)}{\partial x_i} = f_i(x_1, x_2, \cdots, x_n) = g^i(x_1, x_2, \cdots, x_n).
$$

We can use this function $g^i(x_1, x_2, \cdots, x_n)$ to define the second-order partial derivative of the underlying function $f(x_1, x_2, \cdots, x_n)$ with respect to the variable $x_i$ first and then with respect to the variable $x_j$ second. To be precise, we have

$$
f_{ij}(x_1, x_2, \cdots, x_n) := \frac{\partial g^i(x_1, x_2, \cdots, x_n)}{\partial x_j}.
$$

Thus we have

$$
\frac{\partial^2 f(x_1, x_2, \cdots, x_n)}{\partial x_j \partial x_i} = \frac{\partial}{\partial x_j} \left( \frac{\partial f(x_1, x_2, \cdots, x_n)}{\partial x_i} \right).
$$

Young’s Theorem Part 1

Consider a function $f : S \to \mathbb{R}$, where $S \subseteq \mathbb{R}^n$ is an open set. This function can be written as $f(x_1, x_2, \cdots, x_n)$. Suppose that this function is at least twice continuously differentiable with respect to all of its arguments in a non-empty neighbourhood around the point

$$
x^0 = \left( x^0_1, x^0_2, \cdots, x^0_n \right).
$$

In this case, the order of differentiation will not affect the second-order partial derivatives evaluated at the point $x^0$.

Young’s Theorem Part 2

More precisely, in this case we will have

$$
\frac{\partial^2 f\left( x^0_1, x^0_2, \cdots, x^0_n \right)}{\partial x_j \partial x_i} = \frac{\partial^2 f\left( x^0_1, x^0_2, \cdots, x^0_n \right)}{\partial x_i \partial x_j}
$$

or, if you prefer,

$$
f_{ij}\left( x^0_1, x^0_2, \cdots, x^0_n \right) = f_{ji}\left( x^0_1, x^0_2, \cdots, x^0_n \right).
$$

This result is known as Young’s theorem.

The result is trivially true when $i = j$. This is the case of second-order own partial derivatives.

The result is also true, and much more interesting, when $i \neq j$. This is the case of second-order cross partial derivatives.

Young’s Theorem Example 1 Part 1

Recall the example in which

$$
f(x, y) = x^3 y + e^{x y^2} = x^3 y + \exp\left\{ x y^2 \right\}.
$$

We have already found that

$$
f_x(x, y) = \frac{\partial f}{\partial x} = 3x^2 y + y^2 \exp\left\{ x y^2 \right\}
$$

and

$$
f_y(x, y) = \frac{\partial f}{\partial y} = x^3 + 2xy \exp\left\{ x y^2 \right\}.
$$

Young’s Theorem Example 1 Part 2

This means that

$$
\begin{aligned}
f_{xx}(x, y) &= \frac{\partial f_x(x, y)}{\partial x} \\
&= \frac{\partial \left( 3x^2 y + y^2 \exp\left\{ x y^2 \right\} \right)}{\partial x} \\
&= (2)\left( 3x^{2-1} y \right) + y^2 \exp\left\{ x y^2 \right\} \left( y^2 \right) \\
&= 6xy + y^4 \exp\left\{ x y^2 \right\}.
\end{aligned}
$$

Young’s Theorem Example 1 Part 3

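It also means that

$$
\begin{aligned}
f_{xy}(x, y) &= \frac{\partial f_x(x, y)}{\partial y} \\
&= \frac{\partial \left( 3x^2 y + y^2 \exp\left\{ x y^2 \right\} \right)}{\partial y} \\
&= 3x^2 + \left\{ (2y) \left( \exp\left\{ x y^2 \right\} \right) + \left( \exp\left\{ x y^2 \right\} \right) (2xy) \left( y^2 \right) \right\} \\
&= 3x^2 + 2y \exp\left\{ x y^2 \right\} + 2x y^3 \exp\left\{ x y^2 \right\}.
\end{aligned}
$$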

Young’s Theorem Example 1 Part 4

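It also means that

$$
\begin{aligned}
f_{yx}(x, y) &= \frac{\partial f_y(x, y)}{\partial x} \\
&= \frac{\partial \left( x^3 + 2xy \exp\left\{ x y^2 \right\} \right)}{\partial x} \\
&= 3x^2 + \left\{ (2y) \left( \exp\left\{ x y^2 \right\} \right) + \left( y^2 \exp\left\{ x y^2 \right\} \right) (2xy) \right\} \\
&= 3x^2 + 2y \exp\left\{ x y^2 \right\} + 2x y^3 \exp\left\{ x y^2 \right\}.
\end{aligned}
$$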

Young’s Theorem Example 1 Part 5

It also means that

$$
\begin{aligned}
f_{yy}(x, y) &= \frac{\partial f_y(x, y)}{\partial y} \\
&= \frac{\partial \left( x^3 + 2xy \exp\left\{ x y^2 \right\} \right)}{\partial y} \\
&= 0 + \left\{ (2x) \left( \exp\left\{ x y^2 \right\} \right) + \left( 2xy \exp\left\{ x y^2 \right\} \right) (2xy) \right\} \\
&= 2x \exp\left\{ x y^2 \right\} + 4x^2 y^2 \exp\left\{ x y^2 \right\}.
\end{aligned}
$$

Note that $f_{xy}(x, y) = f_{yx}(x, y)$ in this example.
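
The second-order partial derivatives found in this example can be cross-checked numerically. The Python sketch below is an illustration rather than part of the slides: it differences the analytic first-order partials $f_x$ and $f_y$ to build a numerical Hessian at a hypothetical point and compares the result with the closed-form expressions above. The function names and the evaluation point are assumptions made for the example.

```python
import math


def fx(x, y):
    """Analytic first-order partial of f(x, y) = x^3 y + exp(x y^2) w.r.t. x."""
    return 3 * x**2 * y + y**2 * math.exp(x * y**2)


def fy(x, y):
    """Analytic first-order partial of f with respect to y."""
    return x**3 + 2 * x * y * math.exp(x * y**2)


def fxx_analytic(x, y):
    return 6 * x * y + y**4 * math.exp(x * y**2)


def fxy_analytic(x, y):
    e = math.exp(x * y**2)
    return 3 * x**2 + 2 * y * e + 2 * x * y**3 * e


def fyy_analytic(x, y):
    e = math.exp(x * y**2)
    return 2 * x * e + 4 * x**2 * y**2 * e


def numerical_hessian(x, y, h=1e-5):
    """Hessian [[fxx, fxy], [fyx, fyy]] from central differences of fx and fy."""
    fxx = (fx(x + h, y) - fx(x - h, y)) / (2 * h)
    fxy = (fx(x, y + h) - fx(x, y - h)) / (2 * h)  # x first, then y
    fyx = (fy(x + h, y) - fy(x - h, y)) / (2 * h)  # y first, then x
    fyy = (fy(x, y + h) - fy(x, y - h)) / (2 * h)
    return [[fxx, fxy], [fyx, fyy]]


x0, y0 = 0.5, 1.2  # hypothetical evaluation point
H = numerical_hessian(x0, y0)

print("numerical Hessian:", H)
print("analytic fxx, fxy, fyy:",
      fxx_analytic(x0, y0), fxy_analytic(x0, y0), fyy_analytic(x0, y0))
print("fxy - fyx:", H[0][1] - H[1][0])
```

The off-diagonal entries are computed by two different routes ($x$ first then $y$, and $y$ first then $x$), so their near-equality is a numerical illustration of Young’s theorem for this function.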

Young’s Theorem Example 2 Part 1

Young’s theorem is only guaranteed to hold when the underlying function f is at least twice continuously differentiable in the neighbourhood of the point of interest.

An example in which Young’s theorem cannot be applied because the underlying function is not twice continuously differentiable at the point of interest is provided by Spiegel (1981a, p. 126, Worked Problem 43). This example is presented below.

Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by

$$
f(x, y) =
\begin{cases}
0 & \text{if } (x, y) = (0, 0); \\
xy \left( \dfrac{x^2 - y^2}{x^2 + y^2} \right) & \text{if } (x, y) \neq (0, 0).
\end{cases}
$$

Young’s Theorem Example 2 Part 2

If $(x, y) = (0, 0)$, then the first-order partial derivatives are

$$
f_x(0, 0) = \lim_{h \to 0} \frac{f(h, 0) - f(0, 0)}{h} = \lim_{h \to 0} \frac{0}{h} = 0
$$

and

$$
f_y(0, 0) = \lim_{k \to 0} \frac{f(0, k) - f(0, 0)}{k} = \lim_{k \to 0} \frac{0}{k} = 0.
$$

If $(x, y) \neq (0, 0)$, then

$$
f_x(x, y) = \frac{\partial}{\partial x} \left\{ xy \left( \frac{x^2 - y^2}{x^2 + y^2} \right) \right\} = xy \left( \frac{4xy^2}{(x^2 + y^2)^2} \right) + y \left( \frac{x^2 - y^2}{x^2 + y^2} \right)
$$

and

$$
f_y(x, y) = \frac{\partial}{\partial y} \left\{ xy \left( \frac{x^2 - y^2}{x^2 + y^2} \right) \right\} = xy \left( \frac{-4x^2 y}{(x^2 + y^2)^2} \right) + x \left( \frac{x^2 - y^2}{x^2 + y^2} \right).
$$

Young’s Theorem Example 2 Part 3

If $(x, y) = (0, 0)$, then the second-order partial derivatives are

$$
f_{xx}(0, 0) = \lim_{h \to 0} \frac{f_x(h, 0) - f_x(0, 0)}{h} = \lim_{h \to 0} \frac{0}{h} = 0,
$$

$$
f_{yy}(0, 0) = \lim_{k \to 0} \frac{f_y(0, k) - f_y(0, 0)}{k} = \lim_{k \to 0} \frac{0}{k} = 0,
$$

$$
f_{xy}(0, 0) = \lim_{k \to 0} \frac{f_x(0, k) - f_x(0, 0)}{k} = \lim_{k \to 0} \frac{-k}{k} = -1
$$

and

$$
f_{yx}(0, 0) = \lim_{h \to 0} \frac{f_y(h, 0) - f_y(0, 0)}{h} = \lim_{h \to 0} \frac{h}{h} = 1.
$$

Note that $f_{xy}(0, 0) \neq f_{yx}(0, 0)$.
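
The failure of symmetry at the origin can be seen by evaluating the two difference quotients directly. The following Python sketch is only an illustration: it codes the first-order partial derivatives exactly as derived above (with the value zero at the origin) and evaluates the quotients that define $f_{xy}(0, 0)$ and $f_{yx}(0, 0)$ for a few hypothetical step sizes.

```python
def fx(x, y):
    """First-order partial of f with respect to x, as derived on the slides."""
    if x == 0.0 and y == 0.0:
        return 0.0
    r2 = x * x + y * y
    return x * y * (4 * x * y * y / r2**2) + y * (x * x - y * y) / r2


def fy(x, y):
    """First-order partial of f with respect to y, as derived on the slides."""
    if x == 0.0 and y == 0.0:
        return 0.0
    r2 = x * x + y * y
    return x * y * (-4 * x * x * y / r2**2) + x * (x * x - y * y) / r2


steps = [1e-1, 1e-3, 1e-6]  # hypothetical step sizes shrinking towards zero

# Difference quotient defining fxy(0, 0): move in the y direction, difference fx.
fxy_quotients = [(fx(0.0, k) - fx(0.0, 0.0)) / k for k in steps]

# Difference quotient defining fyx(0, 0): move in the x direction, difference fy.
fyx_quotients = [(fy(h, 0.0) - fy(0.0, 0.0)) / h for h in steps]

print("fxy quotients:", fxy_quotients)  # each entry is -1 up to rounding
print("fyx quotients:", fyx_quotients)  # each entry is +1 up to rounding
```

The quotients do not depend on the step size here because $f_x(0, k) = -k$ and $f_y(h, 0) = h$ exactly, which is why the two limits are $-1$ and $1$.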

Vector Derivatives of Vectors and Matrices Part 1

This material is drawn from Greene (2000, pp. 49-58).

Whether we think about row vectors or column vectors as representing points in Euclidean n-space is somewhat arbitrary. We can always redefine things so that results can be derived for one or the other of these approaches.

In this part of these slides and the econometric application that follows, I will think of vectors as being column vectors. The transpose of a vector will be a row vector.

Suppose that $a$ is an $(n \times 1)$ column vector, $x$ is an $(n \times 1)$ column vector, and $A$ is an $(n \times n)$ matrix. Then we have $D_x\left( a^T x \right) = a$, $D_x(Ax) = A^T$, $D_{x^T}(Ax) = A$ and $D_x\left( x^T A x \right) = \left( A + A^T \right) x$.

In the event that $A$ is a symmetric matrix, we have $D_x\left( x^T A x \right) = \left( A + A^T \right) x = 2Ax$.

Vector Derivatives of Vectors and Matrices Part 2

In attempting to see the relationship between these results, it might help to think about the n rows of the matrix A as being transposed column vectors, so that:

$$
A =
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
=
\begin{pmatrix}
a_1^T \\
a_2^T \\
\vdots \\
a_n^T
\end{pmatrix}.
$$
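
The rule for the derivative of a quadratic form, $D_x\left( x^T A x \right) = \left( A + A^T \right) x$, can be checked against a numerical gradient. The sketch below uses plain Python lists so that it has no dependencies. The matrix `A` (deliberately non-symmetric) and the point `x` are hypothetical, and `numerical_gradient` is an illustrative helper.

```python
# Hypothetical non-symmetric matrix and evaluation point (illustration only).
A = [[2.0, 1.0, 0.0],
     [-1.0, 3.0, 4.0],
     [0.5, 0.0, 1.0]]
x = [1.0, -2.0, 0.5]
n = len(x)


def quadratic_form(v):
    """Return v' A v for a vector v stored as a list."""
    return sum(v[i] * A[i][j] * v[j] for i in range(n) for j in range(n))


def numerical_gradient(g, v, h=1e-6):
    """Central-difference gradient of a scalar function g at the point v."""
    grad = []
    for i in range(len(v)):
        up, down = list(v), list(v)
        up[i] += h
        down[i] -= h
        grad.append((g(up) - g(down)) / (2.0 * h))
    return grad


# Formula from the slides: (A + A') x, computed entry by entry.
formula = [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) for i in range(n)]
numeric = numerical_gradient(quadratic_form, x)

print("formula:", formula)
print("numeric:", numeric)
```

Using a non-symmetric `A` matters: with a symmetric matrix the check could not distinguish $\left( A + A^T \right) x$ from $2Ax$.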

An Econometric Application Part 1

Recall the classical linear regression model $y = X\beta + e$, where $e \mid X \sim N(0, \sigma^2 I)$. Suppose that the vector $b$ is an estimate of the coefficient parameter vector $\beta$.

The predicted values of the response variables for each sample unit are given by the vector $\hat{y} = Xb$.

The deviations of the predicted values of the response from the actual values of the response are given by the vector $e = (y - \hat{y})$. The sum of squared deviations of the predicted values of the response from the actual values of the response is given by the scalar $e^T e = (y - \hat{y})^T (y - \hat{y})$.

An Econometric Application Part 2

Note that

$$
e^T e = (y - \hat{y})^T (y - \hat{y}) = \left( y^T - \hat{y}^T \right) (y - \hat{y}) = y^T y - y^T \hat{y} - \hat{y}^T y + \hat{y}^T \hat{y}.
$$

Note also that $e^T e$ is a scalar, which means that each of the terms in the above expression for $e^T e$ is also a scalar. Since the transpose of a scalar is the scalar itself, we have $\hat{y}^T y = \left( \hat{y}^T y \right)^T = y^T \hat{y}$.

This means that the sum of squared deviations expression may be written as $e^T e = y^T y - 2 y^T \hat{y} + \hat{y}^T \hat{y}$. Upon substituting $\hat{y} = Xb$ into this equation, we obtain $e^T e = y^T y - 2 y^T (Xb) + (Xb)^T (Xb)$. This may be rewritten as $e^T e = y^T y - 2 y^T X b + b^T X^T X b$.

An Econometric Application Part 3

We have established that the sum-of-squared-deviations function is given by $e^T e = y^T y - 2 y^T X b + b^T X^T X b$. The gradient vector for $e^T e$ is simply the derivative of $e^T e$ with respect to the vector $b$, which is

$$
\begin{aligned}
D_b\left( e^T e \right) &= D_b\left( y^T y \right) - 2 D_b\left( \left( y^T X \right) b \right) + D_b\left( b^T \left( X^T X \right) b \right) \\
&= 0 - 2 \left( y^T X \right)^T + 2 \left( X^T X \right) b \\
&= -2 X^T y + 2 \left( X^T X \right) b.
\end{aligned}
$$

Note that if we set $D_b\left( e^T e \right) = 0$ and rearrange to make $b$ the subject, we obtain $b = \left( X^T X \right)^{-1} X^T y$.

An Econometric Application Part 4

We have established that the sum-of-squared-deviations function is given by $e^T e = y^T y - 2 y^T X b + b^T X^T X b$. We have also established that the gradient vector for $e^T e$ is $D_b\left( e^T e \right) = -2 X^T y + 2 \left( X^T X \right) b$.

The Hessian matrix for $e^T e$ is simply the derivative of $D_b\left( e^T e \right)$ with respect to the vector $b^T$, which is

$$
\begin{aligned}
D_{b^T}\left( D_b\left( e^T e \right) \right) &= -2 D_{b^T}\left( X^T y \right) + 2 D_{b^T}\left( \left( X^T X \right) b \right) \\
&= 0 + 2 \left( X^T X \right)^T = 2 \left( X^T X \right).
\end{aligned}
$$

An Econometric Application Part 5

We have established that the sum-of-squared-deviations function is given by $e^T e = y^T y - 2 y^T X b + b^T X^T X b$. We have also established that the gradient vector for $e^T e$ is $D_b\left( e^T e \right) = -2 X^T y + 2 \left( X^T X \right) b$.

We have also established that the Hessian matrix for $e^T e$ is $D^2_{b b^T}\left( e^T e \right) = 2 \left( X^T X \right)$.

Note that if we set $D_b\left( e^T e \right) = 0$ and rearrange to make $b$ the subject, we obtain $b = \left( X^T X \right)^{-1} X^T y$.

It can be shown that the Hessian matrix is positive definite when $X$ has full column rank.

This means that $b$ minimises the sum-of-squared-deviations function.

Thus we can conclude that the ordinary least squares estimator of the coefficient parameter vector $\beta$ in the classical linear regression model is given by $b = \left( X^T X \right)^{-1} X^T y$.
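
The steps in this application (form $X^T X$ and $X^T y$, solve the first-order condition, then inspect the Hessian) can be sketched for a tiny data set. The Python below is an illustration under stated assumptions, not the method of any particular library. The data are hypothetical, there is one regressor plus an intercept so that $X^T X$ is a $2 \times 2$ matrix that can be inverted by hand, and positive definiteness is checked through the leading principal minors, a standard criterion for symmetric matrices that the slides do not cover.

```python
# Hypothetical data: an intercept column and a single regressor.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
X = [[1.0, xi] for xi in x_values]

# Build X'X (2 x 2) and X'y (2 x 1) entry by entry.
XtX = [[sum(row[i] * row[j] for row in X) for j in range(2)] for i in range(2)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(2)]

# b = (X'X)^{-1} X'y, using the explicit inverse of a 2 x 2 matrix.
det = XtX[0][0] * XtX[1][1] - XtX[0][1] * XtX[1][0]
b = [
    (XtX[1][1] * Xty[0] - XtX[0][1] * Xty[1]) / det,
    (XtX[0][0] * Xty[1] - XtX[1][0] * Xty[0]) / det,
]

# Gradient of e'e evaluated at b: -2 X'y + 2 (X'X) b. It should be (near) zero.
gradient = [
    -2.0 * Xty[i] + 2.0 * sum(XtX[i][j] * b[j] for j in range(2))
    for i in range(2)
]

# Hessian of e'e: 2 X'X. For a symmetric 2 x 2 matrix, positive leading
# principal minors imply positive definiteness.
hessian = [[2.0 * XtX[i][j] for j in range(2)] for i in range(2)]
minor_1 = hessian[0][0]
minor_2 = hessian[0][0] * hessian[1][1] - hessian[0][1] * hessian[1][0]

print("b (intercept, slope):", b)
print("gradient at b:", gradient)
print("leading principal minors of the Hessian:", minor_1, minor_2)
```

For larger problems one would normally solve the normal equations with a linear algebra routine rather than an explicit inverse. The hand-written $2 \times 2$ inverse is used here only to keep the sketch self-contained.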

Total Differentials Part 1

Consider a function f : S −→ R, where S ⊆ Rn is an open set. This function can be written as f (x1, x2, · · · , xn). Suppose that this function is at least twice continuously differentiable with respect to all of its arguments.

Suppose that we want to consider the impact of a small change in each of the independent variables on the value of the function.

Imagine that the initial values of the independent variables are

x = (x1, x2, · · · , xn) .

Following the change, the new values of the independent variables are

x + dx = (x1 + dx1, x2 + dx2, · · · , xn + dxn) .


Total Differentials Part 2

Note that the vector of changes in the independent variables is

$$
dx = (x + dx) - x = (x_1 + dx_1, x_2 + dx_2, \cdots, x_n + dx_n) - (x_1, x_2, \cdots, x_n) = (dx_1, dx_2, \cdots, dx_n).
$$

The actual change in the value of the function that is induced by these changes in the independent variables is

$$
df = f(x + dx) - f(x).
$$

When the function is non-linear, this actual change might sometimes be rather complicated to calculate explicitly.

Total Differentials Part 3

As such, we will sometimes employ a first-order (or linear) differential approximation for df .

This approximation is known as the (first-order) total differential of the function.

It is given by

$$
df \approx \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right) dx_i.
$$

If the function is twice continuously differentiable in all of its arguments and the change in each of the independent variables is sufficiently small, then this approximation for $df$ will be reasonably accurate.
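
How accurate the total differential is depends on the size of the changes. The Python sketch below illustrates this, assuming the function $f(x, y) = x^3 y + \exp\{x y^2\}$ from the earlier Young’s theorem example. The starting point and the changes $(dx, dy)$ are hypothetical. It compares the actual change $f(x + dx) - f(x)$ with the first-order approximation as the changes are scaled down.

```python
import math


def f(x, y):
    return x**3 * y + math.exp(x * y**2)


def fx(x, y):
    """Analytic partial derivative of f with respect to x."""
    return 3 * x**2 * y + y**2 * math.exp(x * y**2)


def fy(x, y):
    """Analytic partial derivative of f with respect to y."""
    return x**3 + 2 * x * y * math.exp(x * y**2)


x0, y0 = 1.0, 0.5              # hypothetical starting point
base_dx, base_dy = 0.01, 0.02  # hypothetical changes in x and y

errors = []
for scale in (1.0, 0.1, 0.01):
    dx, dy = scale * base_dx, scale * base_dy
    actual = f(x0 + dx, y0 + dy) - f(x0, y0)       # actual change in f
    approx = fx(x0, y0) * dx + fy(x0, y0) * dy     # total differential
    errors.append(abs(actual - approx))
    print(scale, actual, approx, abs(actual - approx))
```

Each tenfold reduction in the changes should shrink the approximation error by roughly a factor of one hundred, which is consistent with the error being of second order in the size of the changes.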

The Implicit Function Theorem Part 1

This is Theorem 15.2 in Simon and Blume (1994, p. 341).

Suppose that $G(x_1, x_2, \cdots, x_n, y)$ is a function that is at least once continuously differentiable around the point $(x^*_1, x^*_2, \cdots, x^*_n, y^*)$.

Suppose also that

$$
G(x^*_1, x^*_2, \cdots, x^*_n, y^*) = c \quad \text{for some constant } c
$$

and that

$$
\frac{\partial G}{\partial y}(x^*_1, x^*_2, \cdots, x^*_n, y^*) \neq 0.
$$

Continued on the next slide.

The Implicit Function Theorem Part 2

Continued from the previous slide.

Under these conditions, there is a function $y(x_1, x_2, \cdots, x_n)$ defined on some open ball $B$ around the point $(x^*_1, x^*_2, \cdots, x^*_n)$ that is at least once continuously differentiable such that:

(a) $G(x_1, x_2, \cdots, x_n, y(x_1, x_2, \cdots, x_n)) = c$ for all $(x_1, x_2, \cdots, x_n) \in B$;

(b) $y^* = y(x^*_1, x^*_2, \cdots, x^*_n)$; and

(c) for each $i \in \{1, 2, \cdots, n\}$,

$$
\frac{\partial y}{\partial x_i}(x^*_1, x^*_2, \cdots, x^*_n) = -\left( \frac{\frac{\partial G}{\partial x_i}(x^*_1, x^*_2, \cdots, x^*_n, y^*)}{\frac{\partial G}{\partial y}(x^*_1, x^*_2, \cdots, x^*_n, y^*)} \right).
$$
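
Conclusion (c) of the theorem can be illustrated with a case in which the implicit function happens to be known explicitly. The Python sketch below is an assumption-laden example rather than part of the slides: it takes $n = 1$, $G(x, y) = x^2 + y^2$ and $c = 1$ (the unit circle), and the hypothetical point $(x^*, y^*) = (0.6, 0.8)$, where $\partial G / \partial y = 1.6 \neq 0$. Near that point the implicit function is $y(x) = \sqrt{1 - x^2}$, so the formula can be compared with a direct numerical derivative.

```python
import math


def G(x, y):
    """Assumed example: the unit circle is the level set G(x, y) = 1."""
    return x**2 + y**2


def dG_dx(x, y):
    return 2.0 * x


def dG_dy(x, y):
    return 2.0 * y


x_star, y_star = 0.6, 0.8  # hypothetical point on the level set

# Implicit function theorem: dy/dx = -(dG/dx) / (dG/dy) at (x*, y*).
implicit_slope = -dG_dx(x_star, y_star) / dG_dy(x_star, y_star)


def y_of_x(x):
    """Explicit local solution of G(x, y) = 1 near (0.6, 0.8)."""
    return math.sqrt(1.0 - x**2)


# Direct numerical derivative of the explicit solution, for comparison.
h = 1e-6
numeric_slope = (y_of_x(x_star + h) - y_of_x(x_star - h)) / (2.0 * h)

print("level:", G(x_star, y_star))
print("slope from the theorem:", implicit_slope)
print("slope from the explicit solution:", numeric_slope)
```

In most economic applications no explicit solution is available, which is exactly when the formula in (c) is useful. The explicit solution is used here only as a check.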

Derivatives of Implicit Functions Part 1

Suppose that we know that some variable y is a function of n other variables (x1, x2, · · · , xn). Sometimes it is not easy to explicitly characterise this function in the form

y = f (x1, x2, · · · , xn) .

In some cases, we might only be able to characterise the function implicitly, along the lines of a relationship of the form

g (y, x1, x2, · · · , xn) = b,

where b ∈ R is some fixed constant.


Derivatives of Implicit Functions Part 2

Suppose that we want to obtain the partial derivative of y with respect to xk in such a case. How would we do this?

Since $b \in \mathbb{R}$ is some fixed constant, we must have $dg = 0$. Note that the total differential for $g$ is

$$
dg = \left( \frac{\partial g}{\partial y} \right) dy + \sum_{i=1}^{n} \left( \frac{\partial g}{\partial x_i} \right) dx_i.
$$

Thus we must have

$$
\left( \frac{\partial g}{\partial y} \right) dy + \sum_{i=1}^{n} \left( \frac{\partial g}{\partial x_i} \right) dx_i = 0.
$$

Derivatives of Implicit Functions Part 3

Suppose that the only variables that are allowed to change are $x_k$ and $y$. This means that $dx_k \neq 0$ and $dy \neq 0$. It also means that $dx_i = 0$ for all $i \neq k$. Substituting these values into the above equation yields

$$
\left( \frac{\partial g}{\partial y} \right) dy + \left( \frac{\partial g}{\partial x_k} \right) dx_k = 0.
$$

This can be rearranged to obtain

$$
\left. \frac{dy}{dx_k} \right|_{dx_i = 0 \text{ for all } i \neq k} = -\frac{\left( \frac{\partial g}{\partial x_k} \right)}{\left( \frac{\partial g}{\partial y} \right)}.
$$

Since we are holding $dx_i = 0$ for all $i \neq k$, this is really a partial derivative. Thus we have

$$
\frac{\partial y}{\partial x_k} = -\frac{\left( \frac{\partial g}{\partial x_k} \right)}{\left( \frac{\partial g}{\partial y} \right)}.
$$

Some Applications of Implicit Functions

Some potential applications of the implicit function theorem include the following.

The slope of an indifference (iso-utility) curve.

The slope of an iso-expenditure (budget) line.

The slope of an isoquant.

The slope of an isocost line.

Before discussing these examples, we will consider the related concepts of level sets, upper contour sets and lower contour sets.

Level Sets and Contour Sets Part 1

Consider a function $f : X \to Y$, where $X \subseteq \mathbb{R}^L$ and $Y \subseteq \mathbb{R}$. A level set for the function $f$ defined with respect to a point in the domain $\hat{x} \in X$ is defined to be

$$
f^0_{\hat{x}} = \{ x \in X : f(x) = f(\hat{x}) \}.
$$

A weak upper contour set for the function $f$ defined with respect to a point in the domain $\hat{x} \in X$ is defined to be

$$
f^+_{\hat{x}} = \{ x \in X : f(x) \geq f(\hat{x}) \}.
$$

Level Sets and Contour Sets Part 2

A strong upper contour set for the function $f$ defined with respect to a point in the domain $\hat{x} \in X$ is defined to be

$$
f^{++}_{\hat{x}} = \{ x \in X : f(x) > f(\hat{x}) \}.
$$

A weak lower contour set for the function $f$ defined with respect to a point in the domain $\hat{x} \in X$ is defined to be

$$
f^-_{\hat{x}} = \{ x \in X : f(x) \leq f(\hat{x}) \}.
$$

A strong lower contour set for the function $f$ defined with respect to a point in the domain $\hat{x} \in X$ is defined to be

$$
f^{--}_{\hat{x}} = \{ x \in X : f(x) < f(\hat{x}) \}.
$$

The Slope of an Indifference Curve Part 1

Suppose that $U : \mathbb{R}^L_+ \to \mathbb{R}$ is a utility function. The level sets for a utility function are known as indifference curves. The indifference curve that passes through consumption bundle $\hat{x} \in \mathbb{R}^L_+$ is defined to be

$$
U^0_{\hat{x}} = \left\{ x \in \mathbb{R}^L_+ : U(x) = U(\hat{x}) \right\}.
$$

The (weak) upper contour sets for a utility function could be called consumption requirement sets. The consumption requirement set that is associated with the reference consumption bundle $\hat{x} \in \mathbb{R}^L_+$ is defined to be

$$
U^+_{\hat{x}} = \left\{ x \in \mathbb{R}^L_+ : U(x) \geq U(\hat{x}) \right\}.
$$

The Slope of an Indifference Curve Part 2

Note that we could also define the level set and the (weak) upper contour set for a utility function in terms of a reference utility level, U. I will leave the modification of the definitions for the case in which a reference utility level is employed as an exercise for you to attempt.

Note that any point $x \in \mathbb{R}^L_+$ that belongs to a given indifference curve must yield the same level of utility.

This means that the change in utility along any indifference curve is zero.

In other words, $dU = 0$ along an indifference curve.

The Slope of an Indifference Curve Part 3

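Suppose that $L = 2$. We have

$$
dU = \left( \frac{\partial U}{\partial x_1} \right) dx_1 + \left( \frac{\partial U}{\partial x_2} \right) dx_2.
$$

Along any indifference curve, we must have

$$
dU = \left( \frac{\partial U}{\partial x_1} \right) dx_1 + \left( \frac{\partial U}{\partial x_2} \right) dx_2 = 0.
$$

This can be rearranged to obtain

$$
\left. \frac{dx_2}{dx_1} \right|_{dU = 0} = -\frac{\left( \frac{\partial U}{\partial x_1} \right)}{\left( \frac{\partial U}{\partial x_2} \right)} = -MRS_{12}(x),
$$

which is the formula for the slope of an indifference curve when $U : \mathbb{R}^2_+ \to \mathbb{R}$.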
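
As a worked illustration of the slope formula above, consider a Cobb-Douglas utility function. This functional form and the parameters $a$ and $b$ are assumptions introduced only for the example, and the calculation applies at bundles with $x_1 > 0$ and $x_2 > 0$.

```latex
\begin{align*}
U(x_1, x_2) &= x_1^{a} x_2^{b}, \qquad a > 0, \; b > 0 \\
\frac{\partial U}{\partial x_1} &= a x_1^{a-1} x_2^{b}, \qquad
\frac{\partial U}{\partial x_2} = b x_1^{a} x_2^{b-1} \\
\left. \frac{dx_2}{dx_1} \right|_{dU = 0}
  &= -\frac{a x_1^{a-1} x_2^{b}}{b x_1^{a} x_2^{b-1}}
   = -\frac{a x_2}{b x_1}
   = -MRS_{12}(x)
\end{align*}
```

The slope is negative at every interior bundle and depends only on the ratio $x_2 / x_1$, so in this assumed case it is the same at every bundle along a given ray from the origin.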

The Slope of a Budget Line Part 1

A consumer’s expenditure is given by the linear function $E : \mathbb{R}^L_+ \to \mathbb{R}$ that is defined by

$$
E(x) = p^T x = \sum_{l=1}^{L} p_l x_l,
$$

where $p \in \mathbb{R}^L_{++}$, so that $p_l > 0$ for all $l \in \{1, 2, \cdots, L\}$. We will refer to this as the budget function, because the term “expenditure function” has a specific meaning in microeconomics that differs somewhat from this function.

The Slope of a Budget Line Part 2

The level sets for a budget function are known as budget lines (or, if you prefer, budget hyper-planes).

The budget line that passes through consumption bundle $\omega \in \mathbb{R}^L_+$ is defined to be

$$
B^0_{\omega}(p, \omega) = \left\{ x \in \mathbb{R}^L_+ : E(x) = E(\omega) \right\}.
$$

In this case, we might think of $\omega \in \mathbb{R}^L_+$ as an endowment bundle. The budget line that is consistent with an exogenous money income equal to $y \in \mathbb{R}_+$ is defined to be

$$
B^0_{y}(p, y) = \left\{ x \in \mathbb{R}^L_+ : E(x) = y \right\}.
$$

The Slope of a Budget Line Part 3

The (weak) lower contour sets for a budget function are known as budget sets. The budget set that is associated with the endowment bundle $\omega \in \mathbb{R}^L_+$ is defined to be

$$
B^-_{\omega}(p, \omega) = \left\{ x \in \mathbb{R}^L_+ : E(x) \leq E(\omega) \right\}.
$$

Similarly, the budget set that is associated with an exogenous money income equal to $y \in \mathbb{R}_+$ is defined to be

$$
B^-_{y}(p, y) = \left\{ x \in \mathbb{R}^L_+ : E(x) \leq y \right\}.
$$

The Slope of a Budget Line Part 4

Note that any point $x \in \mathbb{R}^L_+$ that belongs to a given budget line must require the same level of expenditure.

This means that the change in expenditure along any budget line is zero.

In other words, $dE = 0$ along a budget line.

Suppose that $L = 2$. We have

$$
\begin{aligned}
dE &= \left( \frac{\partial E}{\partial x_1} \right) dx_1 + \left( \frac{\partial E}{\partial x_2} \right) dx_2 \\
&= \left( \frac{\partial (p_1 x_1 + p_2 x_2)}{\partial x_1} \right) dx_1 + \left( \frac{\partial (p_1 x_1 + p_2 x_2)}{\partial x_2} \right) dx_2 \\
&= p_1 dx_1 + p_2 dx_2.
\end{aligned}
$$

The Slope of a Budget Line Part 5

Along any budget line, we must have

$$
dE = p_1 dx_1 + p_2 dx_2 = 0.
$$

This can be rearranged to obtain

$$
\left. \frac{dx_2}{dx_1} \right|_{dE = 0} = -\frac{p_1}{p_2},
$$

which is the formula for the slope of a budget line when the consumption set is given by $\mathbb{R}^2_+$.

The Slope of an Isoquant Part 1

Suppose that $F : \mathbb{R}^K_+ \to \mathbb{R}_+$ is a single-product production function. The level sets for a production function are known as isoquants. The isoquant that is consistent with $y \in \mathbb{R}_+$ units of output being produced is defined to be

$$
F^0_y = \left\{ x \in \mathbb{R}^K_+ : F(x) = y \right\}.
$$

The (weak) upper contour sets for a production function are known as input requirement sets. The input requirement set that is associated with $y \in \mathbb{R}_+$ units of output being produced is defined to be

$$
F^+_y = \left\{ x \in \mathbb{R}^K_+ : F(x) \geq y \right\}.
$$

The Slope of an Isoquant Part 2

Note that we could also define the level set and the (weak) upper contour set for a production function in terms of a reference input bundle, x̂. I will leave the modification of the definitions for the case in which a reference input bundle is employed as an exercise for you to attempt.

Note that any point $x \in \mathbb{R}^K_+$ that belongs to a given isoquant must yield the same level of output.

This means that the change in output along any isoquant is zero.

In other words, $dF = 0$ along an isoquant.

The Slope of an Isoquant Part 3

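Suppose that $K = 2$. We have

$$
dF = \left( \frac{\partial F}{\partial x_1} \right) dx_1 + \left( \frac{\partial F}{\partial x_2} \right) dx_2.
$$

Along any isoquant, we must have

$$
dF = \left( \frac{\partial F}{\partial x_1} \right) dx_1 + \left( \frac{\partial F}{\partial x_2} \right) dx_2 = 0.
$$

This can be rearranged to obtain

$$
\left. \frac{dx_2}{dx_1} \right|_{dF = 0} = -\frac{\left( \frac{\partial F}{\partial x_1} \right)}{\left( \frac{\partial F}{\partial x_2} \right)} = -MRTS_{12}(x),
$$

which is the formula for the slope of an isoquant when $F : \mathbb{R}^2_+ \to \mathbb{R}_+$.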

The Slope of an Isocost Part 1

A firm’s expenditure on inputs is given by the linear function $E : \mathbb{R}^K_+ \to \mathbb{R}$ that is defined by

$$
E(x) = w^T x = \sum_{k=1}^{K} w_k x_k,
$$

where $w \in \mathbb{R}^K_{++}$ is the input price vector. Note that $w \in \mathbb{R}^K_{++}$ means that $w_k > 0$ for all $k \in \{1, 2, \cdots, K\}$. We will refer to this as the budget function, because the term “cost function” has a specific meaning in microeconomics that differs somewhat from this function.

The Slope of an Isocost Part 2

The level sets for a firm’s budget function are known as isocost lines.

The isocost line that passes through input bundle $\hat{x} \in \mathbb{R}^K_+$ is defined to be

$$
B^0_{\hat{x}}(w, \hat{x}) = \left\{ x \in \mathbb{R}^K_+ : E(x) = E(\hat{x}) \right\}.
$$

The isocost line that is consistent with a particular level of expenditure on inputs equal to $C \in \mathbb{R}_+$ is defined to be

$$
B^0_{C}(w, C) = \left\{ x \in \mathbb{R}^K_+ : E(x) = C \right\}.
$$

The Slope of an Isocost Part 3

Note that any point $x \in \mathbb{R}^K_+$ that belongs to an isocost line must involve the same level of expenditure on inputs.

This means that the change in expenditure on inputs along any isocost line is zero.

In other words, dE = 0 along an isocost line.

Suppose that K = 2. We have

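$$\begin{aligned}
dE &= \left( \frac{\partial E}{\partial x_1} \right) dx_1 + \left( \frac{\partial E}{\partial x_2} \right) dx_2 \\
&= \left( \frac{\partial (w_1 x_1 + w_2 x_2)}{\partial x_1} \right) dx_1 + \left( \frac{\partial (w_1 x_1 + w_2 x_2)}{\partial x_2} \right) dx_2 \\
&= w_1 dx_1 + w_2 dx_2.
\end{aligned}$$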

The Slope of an Isocost Part 4

Along any isocost line, we must have

$$dE = w_1 dx_1 + w_2 dx_2 = 0.$$

This can be rearranged to obtain

$$\left. \frac{dx_2}{dx_1} \right|_{dE=0} = -\frac{w_1}{w_2},$$

which is the formula for the slope of an isocost line for a firm with a production technology that involves two inputs.
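As a small numerical illustration of the isocost slope, the sketch below uses hypothetical input prices and a hypothetical reference bundle (neither comes from the lecture). It moves along the isocost line by choosing dx2 so that dE = 0, and then confirms that expenditure is unchanged.

```python
def expenditure(w, x):
    """Budget function E(x) = w^T x."""
    return sum(w_k * x_k for w_k, x_k in zip(w, x))


w = (2.0, 5.0)        # hypothetical input prices (w1, w2)
x_hat = (10.0, 4.0)   # hypothetical reference input bundle
C = expenditure(w, x_hat)

# Move along the isocost: raise x1 by dx1 and adjust x2 so that dE = 0.
dx1 = 1.0
dx2 = -(w[0] / w[1]) * dx1
x_new = (x_hat[0] + dx1, x_hat[1] + dx2)

print(C, expenditure(w, x_new), dx2 / dx1)
```

Because E is linear, the slope does not depend on the bundle or on the size of the step, unlike the isoquant slope.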


The Inverse Function Theorem

This is Theorem 15.9 in Simon and Blume (1994, p. 367).

Consider a function of the form $f : \mathbb{R}^n \to \mathbb{R}^n$. Suppose that:

(a) $f : \mathbb{R}^n \to \mathbb{R}^n$ is at least once continuously differentiable;
(b) $f(x^*) = y^*$; and
(c) the Jacobian matrix $Df$ is non-singular at the point $x^*$.

If this is the case, then:

(i) there exists an open ball $B_r(x^*)$ around the point $x^*$, and an open set $V$ around the point $y^*$, such that $f : B_r(x^*) \to V$ is both one-to-one and onto;
(ii) the inverse map $f^{-1} : V \to B_r(x^*)$ is at least once continuously differentiable; and
(iii) the derivative of the inverse function at the point $y^* = f(x^*)$ is equal to the inverse of the derivative of $f$ at the point $x^*$; that is, $(Df^{-1})(f(x^*)) = (Df(x^*))^{-1}$.
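Conclusion (iii) can be illustrated numerically. The sketch below assumes a hypothetical map $f(x_1, x_2) = (e^{x_1}, x_1 + x_2)$, chosen only because its inverse is available in closed form. It approximates both Jacobians by finite differences and multiplies them. The helpers `jacobian` and `matmul` are illustrative, not part of any library.

```python
import math


def f(x1, x2):
    """Hypothetical map from R^2 to R^2 (an assumption for illustration)."""
    return (math.exp(x1), x1 + x2)


def f_inv(y1, y2):
    """Inverse of f, defined for y1 > 0."""
    return (math.log(y1), y2 - math.log(y1))


def jacobian(func, point, h=1e-6):
    """Jacobian by central differences; entry [i][j] is d func_i / d x_j."""
    n = len(point)
    columns = []
    for j in range(n):
        up, down = list(point), list(point)
        up[j] += h
        down[j] -= h
        columns.append([(a - b) / (2 * h) for a, b in zip(func(*up), func(*down))])
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


x_star = (0.5, 2.0)
y_star = f(*x_star)

Df = jacobian(f, x_star)            # derivative of f at x*
Df_inv = jacobian(f_inv, y_star)    # derivative of the inverse at y* = f(x*)

# Part (iii) says Df_inv is the matrix inverse of Df, so this is close to I.
product = matmul(Df_inv, Df)
print(product)
```

Note that the derivative of the inverse is evaluated at y*, not at x*, which is the point of conclusion (iii).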


An Elasticity Application of the Inverse Function Theorem Part 1

Consider a function of the form f : S −→ R, where S ⊆ R is an open set. (Note that n = 1 in this application.)

Suppose that this function is at least once continuously differentiable.

The elasticity of f with respect to the variable $x_i$ at the point (x, f (x)) is given by the formula

$$\varepsilon^{f}_{i} = \left( \frac{x_i}{f(x)} \right) \left( \frac{\partial f}{\partial x_i} \right).$$

Another formula for the elasticity of f with respect to the variable $x_i$ at the point (x, f (x)) that is sometimes useful is

$$\varepsilon^{f}_{i} = \frac{\partial \ln(f(x))}{\partial \ln(x_i)}.$$

An Elasticity Application of the Inverse Function Theorem Part 2

We will use the chain rule of differentiation and the inverse function theorem to show that these two formulas are equivalent.

Note that

$$\frac{\partial \ln(f(x))}{\partial \ln(x_i)} = \left( \frac{\partial \ln(f(x))}{\partial f(x)} \right) \left( \frac{\partial f(x)}{\partial x_i} \right) \left( \frac{\partial x_i}{\partial \ln(x_i)} \right)$$

from the chain rule of differentiation.

We know from the inverse function theorem that

$$\left( \frac{\partial x_i}{\partial \ln(x_i)} \right) = \frac{1}{\left( \frac{\partial \ln(x_i)}{\partial x_i} \right)}.$$

An Elasticity Application of the Inverse Function Theorem Part 3

Thus we have

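$$\frac{\partial \ln(f(x))}{\partial \ln(x_i)} = \left( \frac{\partial \ln(f(x))}{\partial f(x)} \right) \left( \frac{\partial f(x)}{\partial x_i} \right) \left( \frac{1}{\left( \frac{\partial \ln(x_i)}{\partial x_i} \right)} \right).$$

Note that

$$\frac{\partial \ln(f(x))}{\partial f(x)} = \frac{1}{f(x)} \quad \text{and} \quad \frac{\partial \ln(x_i)}{\partial x_i} = \frac{1}{x_i}.$$

This means that

$$\frac{\partial \ln(f(x))}{\partial \ln(x_i)} = \left( \frac{1}{f(x)} \right) \left( \frac{\partial f(x)}{\partial x_i} \right) \left( \frac{1}{\frac{1}{x_i}} \right) = \left( \frac{1}{f(x)} \right) \left( \frac{\partial f(x)}{\partial x_i} \right) (x_i).$$

This can be rearranged to obtain

$$\frac{\partial \ln(f(x))}{\partial \ln(x_i)} = \left( \frac{x_i}{f(x)} \right) \left( \frac{\partial f}{\partial x_i} \right).$$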
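The equivalence of the two elasticity formulas can also be checked numerically. The sketch below assumes a hypothetical function $f(x_1, x_2) = 3 x_1^2 x_2^{0.5}$, evaluated at a point where both the inputs and the function value are positive so that the logarithms are defined. One helper uses the level formula and the other perturbs $\ln(x_i)$ directly. Both helper names are illustrative.

```python
import math


def f(x1, x2):
    """Hypothetical function with positive values on positive inputs."""
    return 3.0 * x1 ** 2 * x2 ** 0.5


def elasticity_levels(func, point, i, h=1e-6):
    """(x_i / f(x)) * (df/dx_i), using a central difference for df/dx_i."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    derivative = (func(*up) - func(*down)) / (2 * h)
    return point[i] / func(*point) * derivative


def elasticity_logs(func, point, i, h=1e-6):
    """d ln f / d ln x_i, perturbing ln(x_i) by +/- h."""
    up, down = list(point), list(point)
    up[i] *= math.exp(h)
    down[i] *= math.exp(-h)
    return (math.log(func(*up)) - math.log(func(*down))) / (2 * h)


point = (2.0, 9.0)
for i in range(2):
    print(i + 1, elasticity_levels(f, point, i), elasticity_logs(f, point, i))
```

For this particular f the elasticities are simply the exponents (2 and 0.5), so both columns of the printout should be close to those values.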


Homogeneous Functions Part 1

Consider a function $f : S \to \mathbb{R}$, where $S \subseteq \mathbb{R}^n$. This function can be written as $f(x_1, x_2, \cdots, x_n)$. Let λ > 0 be a positive real number.

The function $f(x_1, x_2, \cdots, x_n)$ is said to be homogeneous of degree r if

$$f(\lambda x_1, \lambda x_2, \cdots, \lambda x_n) = \lambda^r f(x_1, x_2, \cdots, x_n)$$

for all $(x_1, x_2, \cdots, x_n) \in S$ and all λ > 0. A function that is homogeneous of degree one is said to be linearly homogeneous.

Homogeneous Functions Part 2

Homogeneity is a cardinal property, not an ordinal one.

This means that homogeneity will not necessarily be preserved under a strictly increasing transformation.
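As a worked illustration of this point (the functions here are hypothetical examples, not drawn from the readings), take $f(x_1, x_2) = x_1 x_2$ on $\mathbb{R}^2_{++}$ and the strictly increasing transformation $g = f + 1$. Comparing $g(\lambda x)$ with $\lambda^r g(x)$ gives:

```latex
\begin{aligned}
f(\lambda x_1, \lambda x_2) &= (\lambda x_1)(\lambda x_2) = \lambda^2 f(x_1, x_2), \\
g(\lambda x_1, \lambda x_2) &= \lambda^2 x_1 x_2 + 1, \\
\lambda^r g(x_1, x_2) &= \lambda^r x_1 x_2 + \lambda^r.
\end{aligned}
```

For the last two lines to agree for every $(x_1, x_2)$ we would need both $\lambda^r = \lambda^2$ and $\lambda^r = 1$, which fails for any λ ≠ 1. So f is homogeneous of degree two, while g is not homogeneous of any degree, even though g ranks all points in exactly the same order as f.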

An ordinal property that is somewhat similar to homogeneity is that of homotheticity.

A discussion of homotheticity can be found in Simon and Blume (1994, pp. 483 and 500-504).


Euler’s Theorem Part 1

Suppose that the function $f(x_1, x_2, \cdots, x_n)$ is homogeneous of degree r.

In this case we have

$$\sum_{i=1}^{n} x_i \left( \frac{\partial f}{\partial x_i} \right) = r f(x_1, x_2, \cdots, x_n).$$

This result is known as Euler’s theorem.

Euler’s Theorem Part 2

It is straightforward to derive this result explicitly from the definition of homogeneity of degree r. The following proof is based on the one in Haeussler and Paul (1987, p. 723).

Since the function $f(x_1, x_2, \cdots, x_n)$ is homogeneous of degree r, we know that

$$f(\lambda x_1, \lambda x_2, \cdots, \lambda x_n) = \lambda^r f(x_1, x_2, \cdots, x_n)$$

for any choice of λ > 0.

Euler’s Theorem Part 3

Consider the left-hand side of this definition first.

Let $z_i = \lambda x_i$ for all $i \in \{1, 2, \cdots, n\}$. Totally differentiating

$$f(\lambda x_1, \lambda x_2, \cdots, \lambda x_n) = f(z_1, z_2, \cdots, z_n)$$

yields

$$df = \sum_{i=1}^{n} \left( \frac{\partial f}{\partial z_i} \right) dz_i.$$

Dividing both sides of this total differential by $d\lambda \neq 0$ yields

$$\frac{df}{d\lambda} = \sum_{i=1}^{n} \left( \frac{\partial f}{\partial z_i} \right) \left( \frac{dz_i}{d\lambda} \right).$$

Euler’s Theorem Part 4

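Suppose that we hold all of the $x_i$ variables constant and only allow λ to vary. This means that $dx_i = 0$ for all $i \in \{1, 2, \cdots, n\}$ and $d\lambda \neq 0$. This yields

$$\left. \frac{df}{d\lambda} \right|_{dx_i = 0 \text{ for all } i} = \sum_{i=1}^{n} \left( \frac{\partial f}{\partial z_i} \right) \left( \left. \frac{dz_i}{d\lambda} \right|_{dx_i = 0 \text{ for all } i} \right).$$

Note that

$$\left. \frac{df}{d\lambda} \right|_{dx_i = 0 \text{ for all } i} = \frac{\partial f}{\partial \lambda} \quad \text{and} \quad \left. \frac{dz_i}{d\lambda} \right|_{dx_i = 0 \text{ for all } i} = \frac{\partial z_i}{\partial \lambda}.$$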

Euler’s Theorem Part 5

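Thus we have

$$\frac{\partial f}{\partial \lambda} = \sum_{i=1}^{n} \left( \frac{\partial f}{\partial z_i} \right) \left( \frac{\partial z_i}{\partial \lambda} \right).$$

Note that

$$\frac{\partial z_i}{\partial \lambda} = \frac{\partial (\lambda x_i)}{\partial \lambda} = x_i$$

for all $i \in \{1, 2, \cdots, n\}$. Thus we know that the partial derivative of

$$f(\lambda x_1, \lambda x_2, \cdots, \lambda x_n)$$

with respect to λ is

$$\frac{\partial f(\lambda x_1, \lambda x_2, \cdots, \lambda x_n)}{\partial \lambda} = \sum_{i=1}^{n} \left( \frac{\partial f}{\partial z_i} \right) x_i,$$

where $z_i = \lambda x_i$ for all $i \in \{1, 2, \cdots, n\}$.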

Euler’s Theorem Part 6

Now consider the right-hand side of the following definition for a function that is homogeneous of degree r:

$$f(\lambda x_1, \lambda x_2, \cdots, \lambda x_n) = \lambda^r f(x_1, x_2, \cdots, x_n)$$

for any choice of λ > 0.

Note that λ does not appear in the term $(x_1, x_2, \cdots, x_n)$. As such, the partial derivative of $\lambda^r f(x_1, x_2, \cdots, x_n)$ with respect to λ is

$$\frac{\partial \lambda^r f(x_1, x_2, \cdots, x_n)}{\partial \lambda} = r \lambda^{r-1} f(x_1, x_2, \cdots, x_n).$$

Euler’s Theorem Part 7

Thus we have

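$$\begin{aligned}
& f(\lambda x_1, \lambda x_2, \cdots, \lambda x_n) = \lambda^r f(x_1, x_2, \cdots, x_n) \quad \text{for all } \lambda > 0 \\
\Longrightarrow \quad & \frac{\partial f(\lambda x_1, \lambda x_2, \cdots, \lambda x_n)}{\partial \lambda} = \frac{\partial \lambda^r f(x_1, x_2, \cdots, x_n)}{\partial \lambda} \\
\Longrightarrow \quad & \sum_{i=1}^{n} \left( \frac{\partial f}{\partial z_i} \right) x_i = r \lambda^{r-1} f(x_1, x_2, \cdots, x_n).
\end{aligned}$$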

Euler’s Theorem Part 8

Note that when λ = 1 we have $z_i = 1 x_i = x_i$ for all $i \in \{1, 2, \cdots, n\}$. This means that

$$\left( \frac{\partial f}{\partial z_i} \right) = \left( \frac{\partial f}{\partial x_i} \right)$$

for all $i \in \{1, 2, \cdots, n\}$ when λ = 1. Note also that when λ = 1 we have

$$\lambda^{r-1} = 1^{r-1} = 1.$$

Now suppose that we evaluate both sides of this last equation at the point λ = 1. This yields

$$\sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right) x_i = r f(x_1, x_2, \cdots, x_n),$$

which is the result known as Euler’s theorem.
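Both the intermediate identity from the proof and Euler’s theorem itself can be verified numerically. The sketch below assumes a hypothetical function $f(x_1, x_2, x_3) = x_1^{0.5} x_2 + x_3^{1.5}$, which is homogeneous of degree 1.5, and approximates the gradient by central differences. The helper `gradient` is illustrative only.

```python
def f(x1, x2, x3):
    """Hypothetical function that is homogeneous of degree 1.5."""
    return x1 ** 0.5 * x2 + x3 ** 1.5


def gradient(func, point, h=1e-6):
    """Central finite-difference approximation to the gradient of func."""
    grad = []
    for i in range(len(point)):
        up, down = list(point), list(point)
        up[i] += h
        down[i] -= h
        grad.append((func(*up) - func(*down)) / (2 * h))
    return grad


r = 1.5
x = (4.0, 3.0, 1.0)

# Intermediate step of the proof: for any lam > 0,
# sum_i x_i * (df/dz_i evaluated at z = lam * x) = r * lam**(r - 1) * f(x).
lam = 2.0
z = tuple(lam * x_i for x_i in x)
lhs_lam = sum(x_i * g_i for x_i, g_i in zip(x, gradient(f, z)))
rhs_lam = r * lam ** (r - 1) * f(*x)

# Euler's theorem is the special case lam = 1.
lhs = sum(x_i * g_i for x_i, g_i in zip(x, gradient(f, x)))
rhs = r * f(*x)

print(lhs_lam, rhs_lam, lhs, rhs)
```

At the chosen point f(x) = 7, so the last two printed values should both be close to 10.5.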


Applications of Homogeneity

Some applications of homogeneity include the following.

Returns to scale for production technologies in general.

Returns to scale for Cobb-Douglas production technologies.

Euler aggregation for a Marshallian demand function.

Product exhaustion under perfect competition and constant returns to scale.

Returns to Scale Part 1

Suppose that an n input and one output production technology can be represented by a production function of the form

$$Q = f(L_1, L_2, \cdots, L_n).$$

What happens if we increase all of the inputs by the same proportion?

Specifically, suppose that we move to an input bundle of the form $(\lambda L_1, \lambda L_2, \cdots, \lambda L_n)$, where λ > 1, so that every input increases.

Returns to Scale Part 2

The production technology is said to display decreasing returns to scale if this increase in all of the inputs induces a less than proportionate change in the amount of output that can be produced.

In other words, the production technology is said to display decreasing returns to scale if

$$f(\lambda L_1, \lambda L_2, \cdots, \lambda L_n) < \lambda Q,$$

which is equivalent to

$$f(\lambda L_1, \lambda L_2, \cdots, \lambda L_n) < \lambda f(L_1, L_2, \cdots, L_n).$$

Thus the production function for a production technology that displays decreasing returns to scale is either homogeneous of degree less than one or it is not homogeneous.

Returns to Scale Part 3

A simple replication argument would suggest that decreasing returns to scale do not make any sense.

Studies that find decreasing returns to scale are probably really finding diminishing marginal products.

There is probably some unobserved (or imprecisely observed) input that is implicitly being held fixed or, at least, not varying in the same proportion as all of the other inputs.


Returns to Scale Part 4

The production technology is said to display constant returns to scale if this increase in all of the inputs induces a proportionate change in the amount of output that can be produced.

In other words, the production technology is said to display constant returns to scale if

$$f(\lambda L_1, \lambda L_2, \cdots, \lambda L_n) = \lambda Q,$$

which is equivalent to

$$f(\lambda L_1, \lambda L_2, \cdots, \lambda L_n) = \lambda f(L_1, L_2, \cdots, L_n).$$

Thus the production function for a production technology that displays constant returns to scale will be homogeneous of degree one. (Homogeneity of degree one is sometimes referred to as linear homogeneity.)

Returns to Scale Part 5

Constant returns to scale production technologies are employed in the standard versions of the neoclassical model of economic growth.

This model is often referred to as the Solow-Swan model of economic growth, because one of the seminal papers on this model was written by Robert Solow and the other was written by Trevor Swan.

Note that the late Trevor Swan was an Australian economist who I believe was the first Professor of Economics at the Australian National University.

The relevant references are:

Solow, RM (1956), “A contribution to the theory of economic growth”, The Quarterly Journal of Economics 70(1), February, pp. 65–94; and

Swan, TW (1956), “Economic growth and capital accumulation”, The Economic Record 32, November, pp. 334–361.

Returns to Scale Part 6

The production technology is said to display increasing returns to scale if this increase in all of the inputs induces a more than proportionate change in the amount of output that can be produced.

In other words, the production technology is said to display increasing returns to scale if

$$f(\lambda L_1, \lambda L_2, \cdots, \lambda L_n) > \lambda Q,$$

which is equivalent to

$$f(\lambda L_1, \lambda L_2, \cdots, \lambda L_n) > \lambda f(L_1, L_2, \cdots, L_n).$$

Thus the production function for a production technology that displays increasing returns to scale is either homogeneous of degree more than one or it is not homogeneous.

Returns to Scale Part 7

The returns to scale displayed by some production technologies might vary with the input vector.

In such cases, the production technology will not necessarily display either constant returns to scale or increasing returns to scale over the entire set of possible values for the input vector.

Such a situation might be known as “variable returns to scale”.
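The comparison of f(λL) with λf(L) can be sketched in code. The example below is only indicative, because it checks a single input bundle and a single λ > 1 rather than all of them. Both technologies are hypothetical and purely illustrative: a Cobb-Douglas function with exponents summing to one, and a function whose classification changes with the bundle. The helper `returns_to_scale` is not a standard routine.

```python
import math


def returns_to_scale(func, inputs, lam=1.1, rel_tol=1e-9):
    """Compare f(lam * x) with lam * f(x) at one input bundle, for one lam > 1."""
    scaled_output = func(*[lam * x for x in inputs])
    proportional_output = lam * func(*inputs)
    if math.isclose(scaled_output, proportional_output, rel_tol=rel_tol):
        return "constant"
    return "increasing" if scaled_output > proportional_output else "decreasing"


def cobb_douglas(L, K):
    """Hypothetical technology with exponents summing to one."""
    return L ** 0.5 * K ** 0.5


def s_shaped(L, K):
    """Hypothetical technology whose returns to scale vary with the bundle."""
    return (L * K) / (1.0 + L * K)


print(returns_to_scale(cobb_douglas, (4.0, 9.0)))
print(returns_to_scale(s_shaped, (0.5, 0.5)))
print(returns_to_scale(s_shaped, (2.0, 2.0)))
```

The second function is not homogeneous, and the comparison comes out differently at small and large bundles, which is the sense in which returns to scale can vary with the input vector.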


Cobb-Douglas Production Functions Part 1

Suppose that a two-input and one-output production technology can be represented by a production function of the form

$$Q = f(L, K) = A L^{\alpha} K^{\beta}.$$

This type of production function is known as a Cobb-Douglas production function.

Cobb-Douglas Production Functions Part 2

Suppose that λ > 0. Note that

$$\begin{aligned}
f(\lambda L, \lambda K) &= A (\lambda L)^{\alpha} (\lambda K)^{\beta} \\
&= A \lambda^{\alpha} L^{\alpha} \lambda^{\beta} K^{\beta} \\
&= \lambda^{\alpha + \beta} A L^{\alpha} K^{\beta} \\
&= \lambda^{\alpha + \beta} f(L, K).
\end{aligned}$$

Thus we know that the Cobb-Douglas production function is homogeneous of degree (α + β).

Cobb-Douglas Production Functions Part 3

If (α + β) < 1, then the Cobb-Douglas production function displays decreasing returns to scale;

If (α + β) = 1, then the Cobb-Douglas production function displays constant returns to scale; and

If (α + β) > 1, then the Cobb-Douglas production function displays increasing returns to scale.
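The degree of homogeneity of a Cobb-Douglas function can be recovered numerically by solving f(λL, λK) = λ^r f(L, K) for r. The sketch below assumes hypothetical parameter values (A = 1, α = 0.3, β = 0.6), so the implied degree should be 0.9, which corresponds to the decreasing returns to scale case. The helper `implied_degree` is illustrative only.

```python
import math


def cobb_douglas(L, K, A=1.0, alpha=0.3, beta=0.6):
    """Q = A * L**alpha * K**beta, with hypothetical parameter values."""
    return A * L ** alpha * K ** beta


def implied_degree(func, L, K, lam):
    """Solve f(lam*L, lam*K) = lam**r * f(L, K) for r."""
    return math.log(func(lam * L, lam * K) / func(L, K)) / math.log(lam)


# The implied degree should not depend on the bundle or on lam.
degrees = [
    implied_degree(cobb_douglas, L, K, lam)
    for (L, K) in [(1.0, 1.0), (2.0, 5.0), (10.0, 0.5)]
    for lam in [0.5, 2.0, 3.0]
]
print(degrees)  # each entry is close to alpha + beta = 0.9
```

The fact that the same value comes back for every bundle and every λ is what distinguishes a homogeneous function from one with variable returns to scale.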


Euler Aggregation Part 1

Suppose that a consumer’s Marshallian demand for commodity k is given by the function

$$\begin{aligned}
Q_k &= x_k(p_1, p_2, \cdots, p_n, y) \\
&= x_k(p, y).
\end{aligned}$$

It can be shown that under certain circumstances, Marshallian demand functions are homogeneous of degree zero.

Euler Aggregation Part 2

It can be shown that under certain circumstances, Marshallian demand functions are homogeneous of degree zero.

We will not provide a formal proof of this proposition here. However, the result is fairly intuitive. If all prices and income increase by the same proportion, then the budget constraint for the consumer does not change. If his or her preferences remain the same, then the fact that the budget constraint has not changed would suggest that the choice problem that faces the consumer has not changed in any real sense. As such, the consumer’s optimal consumption bundle should not have changed either.


Euler Aggregation Part 3

If Marshallian demand functions are homogeneous of degree zero, then the Marshallian demand for commodity k will be homogeneous of degree zero.

In such circumstances, we know from Euler’s theorem that

$$y \left( \frac{\partial x_k(p, y)}{\partial y} \right) + \sum_{i=1}^{n} p_i \left( \frac{\partial x_k(p, y)}{\partial p_i} \right) = (0)\, x_k(p_1, p_2, \cdots, p_n, y).$$

This can be simplified to obtain

$$y \left( \frac{\partial x_k(p, y)}{\partial y} \right) + \sum_{i=1}^{n} p_i \left( \frac{\partial x_k(p, y)}{\partial p_i} \right) = 0.$$

Euler Aggregation Part 4

This can be rearranged to obtain

$$\sum_{i=1}^{n} p_i \left( \frac{\partial x_k(p, y)}{\partial p_i} \right) = -y \left( \frac{\partial x_k(p, y)}{\partial y} \right).$$

Upon dividing both sides of this equation by $x_k(p, y)$, we obtain

$$\sum_{i=1}^{n} \left( \frac{p_i}{x_k(p, y)} \right) \left( \frac{\partial x_k(p, y)}{\partial p_i} \right) = - \left( \frac{y}{x_k(p, y)} \right) \left( \frac{\partial x_k(p, y)}{\partial y} \right).$$

Euler Aggregation Part 5

Thus we have

$$\sum_{i=1}^{n} \varepsilon^{k}_{i} = -\varepsilon^{k}_{y},$$

where

$$\varepsilon^{k}_{i} = \left( \frac{p_i}{x_k(p, y)} \right) \left( \frac{\partial x_k(p, y)}{\partial p_i} \right)$$

is the elasticity of demand for commodity k with respect to the price of good i and

$$\varepsilon^{k}_{y} = \left( \frac{y}{x_k(p, y)} \right) \left( \frac{\partial x_k(p, y)}{\partial y} \right)$$

is the income elasticity of demand for commodity k.

As far as I am aware, this well-known result does not have a name. But it could be referred to as Euler aggregation, because it makes use of Euler’s theorem.
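This aggregation result can be checked numerically for a specific demand system. The sketch below assumes a two-good linear expenditure system with hypothetical parameters (`gamma` for subsistence quantities and `a` for marginal budget shares), chosen only because its demands are homogeneous of degree zero and have non-zero cross-price effects. Elasticities are approximated by central differences.

```python
def demand(k, p, y, gamma=(1.0, 2.0), a=(0.4, 0.6)):
    """Linear expenditure system demand for good k (hypothetical parameters)."""
    supernumerary = y - sum(p_j * g_j for p_j, g_j in zip(p, gamma))
    return gamma[k] + (a[k] / p[k]) * supernumerary


def price_elasticity(k, i, p, y, h=1e-6):
    """(p_i / x_k) * (dx_k/dp_i), using a central difference."""
    up, down = list(p), list(p)
    up[i] += h
    down[i] -= h
    derivative = (demand(k, up, y) - demand(k, down, y)) / (2 * h)
    return p[i] / demand(k, p, y) * derivative


def income_elasticity(k, p, y, h=1e-6):
    """(y / x_k) * (dx_k/dy), using a central difference."""
    derivative = (demand(k, p, y + h) - demand(k, p, y - h)) / (2 * h)
    return y / demand(k, p, y) * derivative


p, y = (2.0, 3.0), 20.0

# Homogeneity of degree zero: scaling all prices and income leaves demand unchanged.
print(demand(0, p, y), demand(0, tuple(2.0 * p_i for p_i in p), 2.0 * y))

price_sum = sum(price_elasticity(0, i, p, y) for i in range(2))
income = income_elasticity(0, p, y)
print(price_sum, income)  # the two should be close to equal and opposite
```

Here the own-price and cross-price elasticities for good 1 sum to the negative of its income elasticity, as the result above requires.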


Product Exhaustion Part 1

Consider a profit-maximising firm whose production technology has a single output (Q) and two inputs (L and K), and displays constant returns to scale. This production technology can be represented by a production function of the form

$$Q = f(L, K).$$

Suppose that this firm is a price-taker in both the output market and all input markets.

Constant returns to scale imply that the production function is homogeneous of degree one. Thus we know from Euler’s theorem that

$$\left( \frac{\partial f}{\partial L} \right) L + \left( \frac{\partial f}{\partial K} \right) K = f(L, K). \tag{1}$$

Product Exhaustion Part 2

Recall that the first-order conditions for profit maximisation by a firm with these characteristics are

$$\left( \frac{\partial f}{\partial L} \right) = \frac{w}{p} \quad \text{and} \quad \left( \frac{\partial f}{\partial K} \right) = \frac{r}{p},$$

where p is the output price, w is the input price for labour (L) and r is the input price for capital (K).

Upon substituting these FOCs into Equation (1), we obtain

$$\left( \frac{w}{p} \right) L + \left( \frac{r}{p} \right) K = f(L, K).$$

Product Exhaustion Part 3

This can be rearranged to obtain

$$wL + rK = p f(L, K).$$

In other words, the firm’s revenue from sales of its output will be exactly equal to the firm’s expenditure on inputs. This result, which implies zero profits for a perfectly competitive firm with a constant returns to scale production technology, is known as product exhaustion.

(Note that profit here means economic profit, not accounting profit.)
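Product exhaustion can be illustrated with a constant returns to scale Cobb-Douglas technology. The sketch below uses hypothetical values for A, α, the output price and the input bundle. Because the profit-maximising scale is indeterminate under constant returns, the input bundle is taken as given and the input prices are set to the values at which that bundle satisfies the first-order conditions. This is an assumption of the example rather than a solved optimisation problem.

```python
A, alpha = 2.0, 0.3   # hypothetical technology parameters
p = 5.0               # hypothetical output price


def f(L, K):
    """Constant returns to scale Cobb-Douglas: Q = A * L**(1 - alpha) * K**alpha."""
    return A * L ** (1 - alpha) * K ** alpha


def marginal_products(L, K):
    """Analytical marginal products of labour and capital."""
    mp_L = (1 - alpha) * f(L, K) / L
    mp_K = alpha * f(L, K) / K
    return mp_L, mp_K


L, K = 8.0, 3.0
mp_L, mp_K = marginal_products(L, K)

# Input prices at which (L, K) satisfies the first-order conditions.
w, r = p * mp_L, p * mp_K

revenue = p * f(L, K)
input_expenditure = w * L + r * K
print(revenue, input_expenditure)
```

The two printed values should coincide up to rounding, so that economic profit is zero. With exponents summing to something other than one, the same calculation would leave a gap between revenue and input expenditure.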

