Project #14 Optimization of Functions of Several Variables
Introduction: One of the major applications of calculus is determining where functions attain their maximum or minimum values. Consider that Nobel prizes have been awarded for optimization techniques! In introductory calculus, with functions of one variable, this was done with either the first or second derivative test (see Chapter 3 of Calculus by Bradley and Smith). In multivariable calculus, this was done for functions of two variables (see Chapter 13 of Bradley and Smith, pages 793-8).
Specifically, for a function f(x,y), one found the critical points (a,b) and then applied a second derivative test to see whether f(x,y) near the critical point was always greater than f(a,b) (a local minimum) or always less than f(a,b) (a local maximum). The other possibilities are a saddle point and a failure of the test to decide.
A couple of shortcomings of the approach presented are that no explanation is given for why the method works, and more important, there is nothing to suggest how it would be extended to a function of three or more variables.
In this project, we see how linear algebra answers both concerns by letting eigenvectors associated with the second derivative determine the best coordinate system to see how the function behaves and then having the eigenvalues determine the nature of the critical point.
Background: This project assumes a familiarity with the Principal Axis Theorem as well as a knowledge of optimization of functions of two variables. Suppose f(x,y) has continuous second derivatives, and (a,b) is a critical point of f, meaning that $f_x$ and $f_y$ are both 0 at (a,b). (Finding critical points means solving a system of two equations in two unknowns. This system is often nonlinear, with possibly several solutions. There is no general technique; anything goes!)
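Recall the two-term Taylor series for f(x,y) about (a,b):
$$f(x,y) \approx f(a,b) + f_x(a,b)(x-a) + f_y(a,b)(y-b) + \tfrac{1}{2}\left[ f_{xx}(a,b)(x-a)^2 + 2f_{xy}(a,b)(x-a)(y-b) + f_{yy}(a,b)(y-b)^2 \right]$$
Taking into account that (a,b) is a critical point and using increment notation, with $\Delta x = x-a$, $\Delta y = y-b$, and $\Delta f = f(x,y) - f(a,b)$, this becomes
$$\Delta f \approx \tfrac{1}{2}\left[ f_{xx}(\Delta x)^2 + 2f_{xy}\,\Delta x\,\Delta y + f_{yy}(\Delta y)^2 \right]$$
with all second partials evaluated at (a,b). The positive factor 1/2 does not affect the sign. Now if this last expression is always positive for any nonzero increment in x or y, then we have a local minimum. If it is always negative, then we have a local maximum. If it takes on both signs, + and -, for different (x,y) near (a,b), then we have a saddle point and no extremum.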
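As a rough numerical illustration of this sign test, the following Python sketch samples the quadratic expression on directions around the unit circle and records which signs occur. Because the expression is homogeneous of degree two, its signs on the unit circle are the signs it takes for all increments. The helper name `probe_signs`, the sample count, the tolerance, and the sample coefficients are assumptions made for illustration only; sampling can suggest, but not prove, the nature of a critical point.

```python
import math

def probe_signs(fxx, fxy, fyy, samples=360, tol=1e-12):
    """Sample q = fxx*dx^2 + 2*fxy*dx*dy + fyy*dy^2 on the unit circle.

    Returns the set of signs seen: "+", "-", and/or "0".
    Hypothetical helper, for illustration only.
    """
    signs = set()
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        dx, dy = math.cos(theta), math.sin(theta)
        q = fxx * dx * dx + 2.0 * fxy * dx * dy + fyy * dy * dy
        if q > tol:
            signs.add("+")
        elif q < -tol:
            signs.add("-")
        else:
            signs.add("0")
    return signs

# Assumed sample coefficients (f_xx, f_xy, f_yy), not taken from the project:
print(probe_signs(2.0, 0.0, 2.0))    # second partials of x^2 + y^2 at (0,0)
print(probe_signs(-2.0, 0.0, -2.0))  # second partials of -(x^2 + y^2) at (0,0)
print(probe_signs(0.0, 1.0, 0.0))    # second partials of xy at (0,0)
```

Seeing only "+" suggests a local minimum, only "-" a local maximum, and both "+" and "-" a saddle point. A "0" alongside a single sign marks a direction in which the quadratic expression vanishes, so the sampling alone does not settle the question.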
As an example, consider the function f(x,y) = 1/x + 1/y + 2xy. It has a critical point at
$(2^{-1/3}, 2^{-1/3})$. The second derivatives are
$f_{xx} = 2x^{-3}$, $f_{xy} = 2$, and $f_{yy} = 2y^{-3}$
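which evaluate, respectively, to 4, 2, and 4. Thus the expression for the increment in f is
$$\Delta f \approx \tfrac{1}{2}\left[ 4(\Delta x)^2 + 4\,\Delta x\,\Delta y + 4(\Delta y)^2 \right]$$
and, given that $\Delta x$ and $\Delta y$ can take on any sign, independent of each other, it is hard to tell whether $\Delta f$ always has the same sign. Enter linear algebra. In matrix form, the increment in f is
$$\Delta f \approx \tfrac{1}{2} \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}$$
where the matrix has eigenvalues 6 and 2, with eigenvectors (1, 1) and (1, -1). So if we take $(2^{-1/3}, 2^{-1/3})$ as the origin and then rotate by 45 degrees (due to the eigenvectors), writing $u = (\Delta x + \Delta y)/\sqrt{2}$ and $v = (\Delta x - \Delta y)/\sqrt{2}$ for the coordinates along the unit eigenvectors, the increment becomes
$$\Delta f \approx \tfrac{1}{2}\left( 6u^2 + 2v^2 \right) = 3u^2 + v^2$$
Due to the lack of any "cross terms" (uv), it is now clear that $\Delta f > 0$ for any u and v not both zero, indicating a local minimum at $(2^{-1/3}, 2^{-1/3})$.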
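The computation in this example can be checked numerically. The sketch below, which assumes NumPy is available, evaluates the gradient and the matrix of second partial derivatives of f(x,y) = 1/x + 1/y + 2xy at the critical point, finds the eigenvalues and eigenvectors with `numpy.linalg.eigh`, and confirms for one arbitrarily chosen increment that the quadratic expression agrees with its diagonal form in eigenvector coordinates. The function names `grad` and `hessian` and the test increment `d` are illustrative choices, not part of the project.

```python
import numpy as np

def grad(x, y):
    """Gradient (f_x, f_y) of f(x, y) = 1/x + 1/y + 2xy."""
    return np.array([-1.0 / x**2 + 2.0 * y, -1.0 / y**2 + 2.0 * x])

def hessian(x, y):
    """Matrix of second partials [[f_xx, f_xy], [f_xy, f_yy]]."""
    return np.array([[2.0 / x**3, 2.0],
                     [2.0, 2.0 / y**3]])

a = b = 2.0 ** (-1.0 / 3.0)   # the critical point from the text
H = hessian(a, b)

# eigh is meant for symmetric matrices: eigenvalues come back in
# ascending order, and the columns of Q are orthonormal eigenvectors.
eigenvalues, Q = np.linalg.eigh(H)

print("gradient at (a, b):", grad(a, b))
print("matrix of second partials:")
print(H)
print("eigenvalues:", eigenvalues)

# Compare the quadratic expression in (dx, dy) with its diagonal form in
# eigenvector coordinates w = Q^T d, for an arbitrary (assumed) increment d.
d = np.array([0.3, -0.1])
w = Q.T @ d
original_form = 0.5 * d @ H @ d
diagonal_form = 0.5 * np.sum(eigenvalues * w**2)
print("original form:", original_form, " diagonal form:", diagonal_form)
```

Both eigenvalues being positive is what makes the diagonal form positive for every nonzero increment.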
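The same procedure works for a function of three variables, f(x,y,z). A critical point is where all three partial derivatives are zero ($f_x = f_y = f_z = 0$). The two-term Taylor series for the increment in f near a critical point is
$$\Delta f \approx \tfrac{1}{2}\left[ f_{xx}(\Delta x)^2 + f_{yy}(\Delta y)^2 + f_{zz}(\Delta z)^2 + 2f_{xy}\,\Delta x\,\Delta y + 2f_{xz}\,\Delta x\,\Delta z + 2f_{yz}\,\Delta y\,\Delta z \right]$$
all partials being evaluated at the critical point. The matrix form of this increment is
$$\Delta f \approx \tfrac{1}{2} \begin{pmatrix} \Delta x & \Delta y & \Delta z \end{pmatrix} \begin{pmatrix} f_{xx} & f_{xy} & f_{xz} \\ f_{xy} & f_{yy} & f_{yz} \\ f_{xz} & f_{yz} & f_{zz} \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}$$
and if we change to a coordinate system of eigenvectors of the matrix, it will become diagonal. If all three diagonal entries (the eigenvalues) are positive, we have a local minimum. If all three are negative, we have a local maximum. If the signs are mixed, we have a saddle point. As an example, we consider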
$f(x,y,z) = x^3 + y^3 + 2xyz + z^2$
which has critical points at (0,0,0) and (3/2,3/2,-9/4). We examine the
second point as a possible local extremum. Using the second derivatives
$f_{xx} = 6x$, $f_{xy} = 2z$, $f_{xz} = 2y$, $f_{yz} = 2x$, $f_{yy} = 6y$, $f_{zz} = 2$
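and the expression above for the two-term Taylor series, we have that the increment in f near (3/2,3/2,-9/4) is
$$\Delta f \approx \tfrac{1}{2} \begin{pmatrix} \Delta x & \Delta y & \Delta z \end{pmatrix} \begin{pmatrix} 9 & -9/2 & 3 \\ -9/2 & 9 & 3 \\ 3 & 3 & 2 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}$$
where the matrix has eigenvalues of approximately 13.5, 7.67, and -1.17. Since they are not all of the same sign, the point (3/2,3/2,-9/4) is a saddle point of the function and not a local extremum.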
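The three-variable example can be reproduced with a short Python sketch, again assuming NumPy. It checks that the gradient vanishes at (3/2,3/2,-9/4), builds the matrix of second partial derivatives from the formulas above, and applies the eigenvalue rule. The helper `classify`, its tolerance `tol`, and its "undetermined" label for cases outside the three rules stated in the text are assumptions made for this illustration.

```python
import numpy as np

def gradient(x, y, z):
    """Gradient of f(x, y, z) = x^3 + y^3 + 2xyz + z^2."""
    return np.array([3 * x**2 + 2 * y * z,
                     3 * y**2 + 2 * x * z,
                     2 * x * y + 2 * z])

def hessian(x, y, z):
    """Symmetric matrix of second partial derivatives of f."""
    return np.array([[6.0 * x, 2.0 * z, 2.0 * y],
                     [2.0 * z, 6.0 * y, 2.0 * x],
                     [2.0 * y, 2.0 * x, 2.0]])

def classify(H, tol=1e-9):
    """Apply the eigenvalue rule to a symmetric matrix H.

    Hypothetical helper: returns a label and the eigenvalues in
    ascending order. Cases not covered by the three rules in the
    text are reported as "undetermined".
    """
    eigenvalues = np.linalg.eigvalsh(H)
    if np.all(eigenvalues > tol):
        label = "local minimum"
    elif np.all(eigenvalues < -tol):
        label = "local maximum"
    elif np.any(eigenvalues > tol) and np.any(eigenvalues < -tol):
        label = "saddle point"
    else:
        label = "undetermined"
    return label, eigenvalues

point = (1.5, 1.5, -2.25)
label, eigenvalues = classify(hessian(*point))
print("gradient:", gradient(*point))
print("eigenvalues:", eigenvalues)
print("classification:", label)
```

The same `classify` helper accepts a symmetric matrix of any size, which reflects how the eigenvalue test extends beyond two variables.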
Exercises:
In each case, find all critical points of f and test each one as a possible local extremum using linear algebraic techniques:
1. $f(x,y) = 3x - x^3 - 3xy^2$ (there are 4 critical points)
2. $f(x,y) = 6xy^2 - 2x^3 - 3y^4$ (there are 3 critical points)
3. $f(x,y) = \frac{x^4}{3} + \frac{y^4}{2} - 4xy^2 + 2x^2 + 2y^2 + 3$ (there are 5 critical points)
4. Suppose for a function of three variables, f(x,y,z), that (2,1,5) is a critical point and that its matrix of second partial derivatives has eigenvalues of 2, 3, and -1. What can you say about the critical point (2,1,5)?
5. Suppose a function of three variables has (2,7,6) as a critical point, and its matrix of second partial derivatives has eigenvalues 2, 3, and 0. What can you say about the critical point? What generalization can you draw?