
1.3 The implicit function theorem

Let us return to the positive aspect of Newton's method. You might ask, how can we ever guarantee in advance that an inequality such as (1.14) holds? The answer comes from considering not a single function, P, but rather a parameterized family of functions: Suppose that u ranges over some interval, or more generally, over some region in a vector space. To fix the notation, suppose that this region contains the origin, 0. Suppose that P is a function of u and x, and depends continuously on (u, x). Suppose that as a function of x, the function P is twice differentiable and satisfies (1.13) for all values of u (with the same fixed K). Finally, suppose that

P(0, 0) = 0. (1.26)

Then the continuity of P guarantees that for |u| and |x0| sufficiently small, the condition (1.14) holds, that is,

|P(u, x0)| < r

where r is small enough to guarantee that x0 is in the basin of attraction of a zero of the function P(u, ·). In particular, this means that for |u| sufficiently small, we can find an ε > 0 such that all x0 satisfying |x0| < ε are in the basin of attraction of the same zero of P(u, ·). By choosing a smaller neighborhood, given say by |u| < δ, starting with x0 = 0 and applying Newton's method to P(u, ·), we obtain a sequence of x values which converges exponentially to a solution of

P(u, x) = 0 (1.27)

satisfying

|x| < ε.

Furthermore, starting with any x0 satisfying |x0| < ε we also get exponential convergence to the same zero. In particular, there cannot be two distinct solutions to (1.27) satisfying |x| < ε, since starting Newton's method at a zero gives (inductively) xn = x0 for all n. Thus we have constructed a unique function

x = g(u)

satisfying

P(u, g(u)) ≡ 0. (1.28)
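The construction above can be sketched numerically. Here we use a hypothetical toy family P(u, x) = x + u + x², chosen only because it satisfies P(0, 0) = 0 and ∂P/∂x(0, 0) = 1 ≠ 0; it is not a family from the text.

```python
# A minimal sketch of the construction of g via Newton's method,
# for the toy family P(u, x) = x + u + x**2 (an assumption for
# illustration, not the text's P).
def P(u, x):
    return x + u + x**2

def dPdx(u, x):
    return 1 + 2 * x

def g(u, x0=0.0, tol=1e-12, max_iter=50):
    """Newton's method applied to x -> P(u, x), started at x0.
    For small |u| this converges to the zero defining x = g(u)."""
    x = x0
    for _ in range(max_iter):
        step = P(u, x) / dPdx(u, x)
        x -= step
        if abs(step) < tol:
            break
    return x

# g(u) should satisfy P(u, g(u)) = 0, in agreement with (1.28).
u = 0.1
print(abs(P(u, g(u))))  # essentially zero
```

For this quadratic family the zero near the origin is known in closed form, g(u) = (−1 + √(1 − 4u))/2, which the iteration reproduces.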

This is the guts of the implicit function theorem. We have proved it under assumptions which involve the second derivative of P which are not necessary for the truth of the theorem. (We will remedy this in Chapter ??.) However, these stronger assumptions that we have made do guarantee exponential convergence of our algorithm.

For the sake of completeness, we discuss the basic properties of the function g: its continuity, differentiability, and the computation of its derivative.


1. Uniqueness implies continuity. We wish to prove that g is continuous at any point u in a neighborhood of 0. This means: given β > 0 we can find α > 0 such that

|h| < α ⇒ |g(u + h) − g(u)| < β. (1.29)

We know that this is true at u = 0, where we could choose any ε > 0 at will, and then conclude that there is a δ > 0 with |g(u)| < ε if |u| < δ. To prove (1.29) at a general point, just choose (u, g(u)) instead of (0, 0) as the origin of our coordinates, and apply the preceding results to this new data. We obtain a solution f to the equation P(u + h, f(u + h)) = 0 with f(u) = g(u) which is continuous at h = 0. In particular, for |h| sufficiently small, we will have |u + h| ≤ δ, and |f(u + h)| < ε, our original ε and δ in the definition of g. The uniqueness of the solution to our original equation then implies that f(u + h) = g(u + h), proving (1.29).
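The continuity just proved can be observed numerically: for the same hypothetical toy family P(u, x) = x + u + x² (our running assumption, not the text's P), |g(u + h) − g(u)| shrinks with |h|.

```python
# A numerical sanity check of continuity of g, for the toy family
# P(u, x) = x + u + x**2 (an illustrative assumption).
def newton_solve(u, x0=0.0, tol=1e-12):
    """Newton's method on x -> x + u + x**2, defining g(u)."""
    x = x0
    for _ in range(50):
        step = (x + u + x**2) / (1 + 2 * x)
        x -= step
        if abs(step) < tol:
            return x
    return x

u = 0.05
hs = (1e-1, 1e-2, 1e-3, 1e-4)
diffs = []
for h in hs:
    d = abs(newton_solve(u + h) - newton_solve(u))
    diffs.append(d)
    print(h, d)  # the difference shrinks roughly in proportion to h
```

In fact the differences here decay linearly in h, anticipating the Lipschitz estimate |g(u + h) − g(u)| ≤ M|h| derived below.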

2. Differentiability. Suppose that P is continuously differentiable with respect to all variables. We have

0 ≡ P(u + h, g(u + h)) − P(u, g(u))

so, by the definition of the derivative,

0 = (∂P/∂u)h + (∂P/∂x)[g(u + h) − g(u)] + o(h) + o[g(u + h) − g(u)].

If u is a vector variable, say u ∈ Rⁿ, then ∂P/∂u is a matrix. The terminology o(s) means some expression which approaches zero so that o(s)/s → 0. So

g(u + h) − g(u) = −(∂P/∂x)⁻¹(∂P/∂u)h − o(h) − (∂P/∂x)⁻¹ o[g(u + h) − g(u)]. (1.30)

As a first pass through this equation, observe that by the continuity that we have already proved, we know that [g(u + h) − g(u)] → 0 as h → 0. The expression o([g(u + h) − g(u)]) is, by definition of o, smaller than any constant times |g(u + h) − g(u)| provided that |g(u + h) − g(u)| itself is sufficiently small. This means that for sufficiently small [g(u + h) − g(u)] we have

|o[g(u + h) − g(u)]| ≤ (1/2K)|g(u + h) − g(u)|

where we may choose K so that |(∂P/∂x)⁻¹| ≤ K. So bringing the last term over to the other side gives

|g(u + h) − g(u)| − (1/2)|g(u + h) − g(u)| ≤ |(∂P/∂x)⁻¹(∂P/∂u)h| + o(|h|),

and we get an estimate of the form

|g(u + h) − g(u)| ≤ M|h|


for some suitable constant, M. But then the term o[g(u + h) − g(u)] becomes o(h). Plugging this back into our equation (1.30) shows that g is differentiable with

∂g/∂u = −(∂P/∂x)⁻¹(∂P/∂u). (1.31)

To summarize, the implicit function theorem says:

Theorem 1.3.1 (The implicit function theorem). Let P = P(u, x) be a differentiable function with P(0, 0) = 0 and ∂P/∂x(0, 0) invertible. Then there exist δ > 0 and ε > 0 such that P(u, x) = 0 has a unique solution with |x| < ε for each |u| < δ. This defines the function x = g(u). The function g is differentiable and its derivative is given by (1.31).
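Formula (1.31) can be checked numerically in the scalar case. For the hypothetical toy family P(u, x) = x + u + x² (an assumption used throughout these sketches, not the text's P), we have ∂P/∂u = 1 and ∂P/∂x = 1 + 2x, so (1.31) predicts g′(u) = −1/(1 + 2g(u)).

```python
# Numerical check of (1.31) for the toy family P(u, x) = x + u + x**2
# (an illustrative assumption): dg/du should equal
# -(dP/dx)^{-1} (dP/du) = -1 / (1 + 2*g(u)).
def newton_solve(u, x0=0.0, tol=1e-14):
    """Newton's method on x -> x + u + x**2, defining g(u)."""
    x = x0
    for _ in range(60):
        step = (x + u + x**2) / (1 + 2 * x)
        x -= step
        if abs(step) < tol:
            return x
    return x

u, h = 0.05, 1e-6
g_u = newton_solve(u)
numeric = (newton_solve(u + h) - g_u) / h   # difference quotient for dg/du
formula = -1.0 / (1 + 2 * g_u)              # right-hand side of (1.31)
print(numeric, formula)  # the two agree closely
```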

We have proved the theorem under more stringent hypotheses in order to get an exponential rate of convergence to the solution. We will provide the details of the more general version, as a consequence of the contraction fixed point theorem, later on. We should point out now, however, that nothing in our discussion of Newton's method or the implicit function theorem depended on x being a single real variable. The entire discussion goes through unchanged if x is a vector variable. Then ∂P/∂x is a matrix, and (1.31) must be understood as matrix multiplication. Similarly, the condition on the second derivative of P must be understood in terms of matrix norms. We will return to these points later.

1.4 Attractors and repellers

We introduce some notation which we will be using for the next few chapters. Let F : X → X be a differentiable map where X is an interval on the real line. A point p ∈ X is called a fixed point if

F(p) = p.

A fixed point a is called an attractor or an attractive fixed point or a stable fixed point if

|F′(a)| < 1. (1.32)

Points sufficiently close to an attractive fixed point, a, converge to a geometrically upon iteration. Indeed,

F(x) − a = F(x) − F(a) = F′(a)(x − a) + o(x − a)

by the definition of the derivative. Hence if we take b < 1 to be any number larger than |F′(a)|, then for |x − a| sufficiently small, |F(x) − a| ≤ b|x − a|. So starting with x0 = x and iterating xn+1 = F(xn) gives a sequence of points with |xn − a| ≤ bⁿ|x − a|.
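This geometric convergence is easy to observe for a standard example (our choice, not one from the text): F(x) = cos(x) has an attractive fixed point a ≈ 0.7390851 with |F′(a)| = sin(a) ≈ 0.674 < 1, so (1.32) holds.

```python
import math

# Geometric convergence to the attractive fixed point of F(x) = cos(x),
# a standard example chosen for illustration. The fixed point is
# a ≈ 0.7390851 (cos(a) = a), with |F'(a)| = sin(a) ≈ 0.674 < 1.
a = 0.7390851332151607
x = 1.0
errors = []
for n in range(10):
    errors.append(abs(x - a))
    x = math.cos(x)  # iterate x_{n+1} = F(x_n)

# Successive error ratios |x_{n+1} - a| / |x_n - a| should settle
# near |F'(a)| = sin(a), confirming the geometric rate.
ratios = [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]
print(ratios)
```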

The basin of attraction of an attractive fixed point is the set of all x such that the sequence {xn} converges to a, where x0 = x and xn+1 = F(xn). Thus
