Root finding in one dimension


3. Root finding in one dimension

3.1 Why?

Solutions x = x0 to equations of the form f(x) = 0 are often required where it is impossible or infeasible to find an analytical expression for the vector x. If the scalar function f depends on m independent variables x1, x2, …, xm, then the solution x0 will describe an (m-1)-dimensional surface. Alternatively we may consider the vector function f(x) = 0, the solutions of which typically collapse to particular values of x. For this course we restrict our attention to a single independent variable x and seek solutions to f(x) = 0.

3.2 Bisection

This is the simplest method for finding a root to an equation. As we shall see, it is also the most robust. One of the main drawbacks is that we need two initial guesses xa and xb which bracket the root: let fa = f(xa) and fb = f(xb) such that fa fb <= 0. An example of this is shown graphically in figure 1. Clearly, if fa fb = 0 then one or both of xa and xb must be a root of f(x) = 0.

The basic algorithm for the bisection method relies on repeated application of

xc = (xa + xb)/2,
(1)

fc = f(xc).
(2)

By replacing the interval (xa,xb) with either (xa,xc) or (xc,xb) (whichever still brackets the root, as determined by the sign of fa fc), the error in our estimate of the solution to f(x) = 0 is, on average, halved. We repeat this interval halving until either the exact root has been found or the interval is smaller than some specified tolerance.
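
As a concrete illustration, a minimal sketch of this loop in the style of the Fortran program of section 3.7.7 might read as follows. The function f, the initial interval (0,2) and the tolerance tol are illustrative choices rather than part of the algorithm, and the interval is assumed to bracket the root on entry.

      PROGRAM BISECT
      REAL*8 x,xa,xb,xc,fa,fc,tol,f
      f(x) = COS(x) - 0.5D0
C     Initial bracket: requires f(xa)*f(xb) <= 0
      xa  = 0.0D0
      xb  = 2.0D0
      fa  = f(xa)
      tol = 1.0D-12
C     Halve the interval, keeping the half that brackets the root
10    xc = 0.5D0*(xa + xb)
      fc = f(xc)
      IF (fa*fc .GT. 0.0D0) THEN
        xa = xc
        fa = fc
      ELSE
        xb = xc
      ENDIF
      IF (xb - xa .GT. tol) GOTO 10
      WRITE(6,*) 'root = ', 0.5D0*(xa + xb)
      END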

3.2.1 Convergence

Since the interval (xa,xb) always brackets the root, we know that the error in using either xa or xb as an estimate for the root at the nth iteration must be en < |xa - xb|. Now since the interval (xa,xb) is halved for each iteration, then

en+1 ~ en/2.
(3)

More generally, if xn is the estimate for the root x* at the nth iteration, then the error in this estimate is

en = xn - x*.
(4)

In many cases we may express the error at the (n+1)th iteration in terms of the error at the nth iteration as

|en+1| ~ C|en|^p.
(5)

Indeed this relationship applies to all the techniques discussed in this course, but in many cases it applies only asymptotically as our estimate xn converges on the exact solution. The exponent p in equation (5) gives the order of the convergence. The larger the value of p, the faster the scheme converges on the solution, at least provided |en+1| < |en|. For first order schemes (i.e. p = 1), we require |C| < 1 for convergence.

For the bisection method we may estimate en as en = |xa - xb|. The form of equation (3) then suggests p = 1 and C = 1/2, showing the scheme is first order and converges linearly. Indeed convergence is guaranteed (a root to f(x) = 0 will always be found) provided f(x) is continuous over the initial interval.

3.2.2 Criteria

In general, a numerical root finding procedure will not find the exact root being sought (en = 0); rather, it will find some suitably accurate approximation to it. In order to prevent the algorithm continuing to refine the solution forever, it is necessary to place some conditions under which the solution process is to be finished or aborted. Typically this will take the form of an error tolerance on the interval size en = |xa - xb|, on the value of fc, or on both.

For some methods it is also important to ensure the algorithm is converging on a solution (i.e. |en+1| < |en| for suitably large n), and that this convergence is sufficiently rapid to attain the solution in a reasonable time. The guaranteed convergence of the bisection method does not require such safety checks which, combined with its extreme simplicity, is one of the reasons for its widespread use despite it being relatively slow to converge.

3.3 Linear interpolation (regula falsi)

This method is similar to the bisection method in that it requires two initial guesses to bracket the root. However, instead of simply dividing the region in two, a linear interpolation is used to obtain a new point which is (hopefully, but not necessarily) closer to the root than the equivalent estimate for the bisection method. A graphical interpretation of this method is shown in figure 2.

The basic algorithm for the linear interpolation method is repeated application of

xc = xa - fa(xb - xa)/(fb - fa),

with the interval (xa,xb) then replaced by whichever of (xa,xc) and (xc,xb) contains the root.
Because the solution remains bracketed at each step, convergence is guaranteed as was the case for the bisection method. The method is first order and is exact for linear f.
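
A sketch analogous to the bisection program above illustrates the change: only the formula for xc differs, while the bracketing logic is retained. Because one end point may be retained for many iterations, the interval width need not shrink to zero, so the stopping test here is placed on |f(xc)| rather than on the interval; again, the function and tolerance are illustrative choices.

      PROGRAM FALSI
      REAL*8 x,xa,xb,xc,fa,fb,fc,tol,f
      f(x) = COS(x) - 0.5D0
C     Initial bracket: requires f(xa)*f(xb) <= 0
      xa  = 0.0D0
      xb  = 2.0D0
      fa  = f(xa)
      fb  = f(xb)
      tol = 1.0D-12
C     New point where the chord from (xa,fa) to (xb,fb) crosses zero
10    xc = xa - fa*(xb - xa)/(fb - fa)
      fc = f(xc)
      IF (fa*fc .GT. 0.0D0) THEN
        xa = xc
        fa = fc
      ELSE
        xb = xc
        fb = fc
      ENDIF
      IF (ABS(fc) .GT. tol) GOTO 10
      WRITE(6,*) 'root = ', xc
      END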

3.4 Newton-Raphson

Consider the Taylor Series expansion of f(x) about some point x = x0:

f(x) = f(x0) + (x - x0)f'(x0) + ½(x - x0)²f''(x0) + O(|x - x0|³).
(6)

Neglecting the quadratic and higher order terms and solving the resulting linear approximation of f(x) = 0 for x gives

x1 = x0 - f(x0)/f'(x0).
(7)

Subsequent iterations are defined in a similar manner as

xn+1 = xn - f(xn)/f'(xn).
(8)

Geometrically, xn+1 can be interpreted as the value of x at which a line, passing through the point (xn,f(xn)) and tangent to the curve f(x) at that point, crosses the x axis. Figure 3 provides a graphical interpretation of this.

When it works, Newton-Raphson converges much more rapidly than bisection or linear interpolation. However, if f' vanishes at an iteration point, or indeed even between the current estimate and the root, then the method will fail to converge. A graphical interpretation of this is given in figure 4.
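
A minimal sketch of the iteration, again for the example function of section 3.7, might read as follows. The starting guess, tolerance, iteration cap and the guard against a vanishing f' are illustrative safety choices rather than part of the method itself.

      PROGRAM NEWTON
      INTEGER n
      REAL*8 x,xn,tol,f,df
      f(x)  = COS(x) - 0.5D0
      df(x) = -SIN(x)
      xn  = 1.5D0
      tol = 1.0D-12
      DO n = 1, 50
C       Abandon the iteration if f' becomes too small to divide by
        IF (ABS(df(xn)) .LT. 1.0D-14) STOP 'derivative too small'
        xn = xn - f(xn)/df(xn)
        IF (ABS(f(xn)) .LT. tol) GOTO 20
      ENDDO
20    WRITE(6,*) 'root = ', xn
      END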

3.4.1 Convergence

To study how the Newton-Raphson scheme converges, expand f(x) about the root x*,

f(x) = f(x*) + (x - x*)f'(x*) + ½(x - x*)²f''(x*) + O(|x - x*|³),
(9)

and substitute into the iteration formula (8). Since f(x*) = 0, the error en = xn - x* evolves as

en+1 = en - f(xn)/f'(xn) ≈ [f''(x*)/2f'(x*)] en².
(10)

Thus, by comparison with (5), there is second order (quadratic) convergence. The presence of the f' term in the denominator shows that the scheme will not converge if f' vanishes in the neighbourhood of the root.

3.5 Secant (chord)

This method is essentially the same as Newton-Raphson except that the derivative f'(x) is approximated by a finite difference based on the current and the preceding estimate for the root, i.e.

f'(xn) ≈ [f(xn) - f(xn-1)]/(xn - xn-1),
(11)

and this is substituted into the Newton-Raphson algorithm ( 8 ) to give

xn+1 = xn - f(xn)(xn - xn-1)/[f(xn) - f(xn-1)].
(12)

This formula is identical to that for the linear interpolation method discussed in section 3.3. The difference is that rather than replacing one of the two estimates so that the root is always bracketed, the oldest point is always discarded in favour of the new one. This means it is not necessary to have two initial guesses bracketing the root, but on the other hand, convergence is not guaranteed. A graphical representation of the method working is shown in figure 5, and failure to converge in figure 6. In some cases, swapping the two initial guesses x0 and x1 will change the behaviour of the method from convergent to divergent.
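
A minimal sketch, structured like the previous examples: the two starting guesses need not bracket the root, and the oldest point is discarded at each step. The guard against a zero denominator and the tolerance are illustrative choices.

      PROGRAM SECANT
      INTEGER n
      REAL*8 x,x0,x1,x2,f0,f1,tol,f
      f(x) = COS(x) - 0.5D0
      x0 = 0.0D0
      x1 = 2.0D0
      f0 = f(x0)
      f1 = f(x1)
      tol = 1.0D-12
      DO n = 1, 50
C       A flat chord gives no new estimate
        IF (f1 .EQ. f0) STOP 'flat chord'
        x2 = x1 - f1*(x1 - x0)/(f1 - f0)
C       Discard the oldest point, whether or not the root is bracketed
        x0 = x1
        f0 = f1
        x1 = x2
        f1 = f(x1)
        IF (ABS(f1) .LT. tol) GOTO 20
      ENDDO
20    WRITE(6,*) 'root = ', x1
      END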

3.5.1 Convergence

The order of convergence may be obtained in a similar way to the earlier methods. Expanding about the root x = x* for xn and xn-1 gives

f(xn) = f(x*) + en f'(x*) + ½en²f''(x*) + O(|en|³),
(13a)
f(xn-1) = f(x*) + en-1 f'(x*) + ½(en-1)²f''(x*) + O(|en-1|³),
(13b)

and substituting into the iteration formula (12) gives

en+1 ≈ [f''(x*)/2f'(x*)] en en-1.
(14)

Note that this expression for en+1 includes both en and en-1. In general we would like it in terms of en only. The form of this expression suggests a power law relationship. By writing

|en+1| = C|en|^p,
(15)

so that |en-1| = (|en|/C)^(1/p), and substituting into the error evolution equation (14), we obtain

C|en|^p = |f''(x*)/2f'(x*)| C^(-1/p) |en|^(1+1/p),
(16)

which we equate with our assumed relationship (15) to show

p = 1 + 1/p, and hence p = (1 + √5)/2 = 1.61803…
(17)

Thus the method is of non-integer order 1.61803… (the golden ratio). As with Newton-Raphson, the method may diverge if f' vanishes in the neighbourhood of the root.

3.6 Direct iteration

A simple and often useful method involves rearranging and possibly transforming the function f(x) by T(f(x),x) to obtain g(x) = T(f(x),x). The only restriction on T(f(x),x) is that solutions to f(x) = 0 have a one-to-one relationship with solutions to g(x) = x for the roots being sought. Indeed, one reason for choosing such a transformation for an equation with multiple roots is to eliminate known roots and thus simplify the location of the remaining roots. The efficiency and convergence of this method depend on the final form of g(x).

The iteration formula for this method is then just

xn+1 = g(xn).
(18)

A graphical interpretation of this formula is given in figure 7.
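
A minimal sketch, using the rearrangement g(x) = x + cos x - 1/2 that is examined in section 3.7.5.1 below; the starting guess, tolerance, iteration cap and crude divergence cut-off are illustrative choices.

      PROGRAM DIRECT
      INTEGER n
      REAL*8 x,xn,xold,tol,g
      g(x) = x + COS(x) - 0.5D0
      xn  = 0.5D0
      tol = 1.0D-12
      DO n = 1, 100
        xold = xn
        xn = g(xn)
C       Stop when successive estimates agree to within tol
        IF (ABS(xn - xold) .LT. tol) GOTO 20
C       Crude safety check against a divergent rearrangement
        IF (ABS(xn) .GT. 1.0D6) STOP 'diverging'
      ENDDO
20    WRITE(6,*) 'root = ', xn
      END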

3.6.1 Convergence

The convergence of this method may be determined in a similar manner to the other methods by expanding about x*. Here we need to expand g(x) rather than f(x). This gives

g(xn) = g(x*) + en g'(x*) + ½en²g''(x*) + O(|en|³),
(19)

so that, since x* = g(x*), the evolution of the error follows

en+1 = xn+1 - x* = g(xn) - g(x*) = en g'(x*) + O(|en|²).
(20)

The method is clearly first order and will converge only if |g'| < 1. The sign of g' determines whether the convergence (or divergence) is monotonic (positive g') or oscillatory (negative g'). Figure 8 shows how the method will diverge if this restriction on g' is not satisfied. Here g' < -1, so the divergence is oscillatory.

Obviously our choice of T(f(x),x) should try to minimise |g'(x)| in the neighbourhood of the root to maximise the rate of convergence. In addition, we should choose T(f(x),x) so that the curvature |g''(x)| does not become too large.

3.7 Examples

Consider the equation

f(x) = cos x - 1/2.
(21)

The root sought in the examples below is x* = arccos(1/2) = π/3 ≈ 1.0471976, and the tabulated errors are en = xn - x*.

3.7.1 Bisection method

Iteration    Error en            en+1/en
    0       -0.261799           -0.500001909862
    1        0.130900           -0.4999984721161
    2       -0.0654498          -0.5000015278886
    3        0.0327250          -0.4999969442322
    4       -0.0163624          -0.5000036669437
    5        0.00818126         -0.4999951107776
    6       -0.00409059         -0.5000110008581
    7        0.00204534         -0.4999755541866
    8       -0.00102262         -0.5000449824959
    9        0.000511356        -0.4999139542706
   10       -0.000255634        -0.5001721210794
   11        0.000127861        -0.4996574405018
   12       -0.0000638867       -0.5006848060707
   13        0.0000319871       -0.4986322611303
   14       -0.0000159498       -0.50274110020188
   15        0.00000801862


3.7.2 Linear interpolation

Iteration    Error en                   en+1/en
    0       -0.261799                   0.1213205550823
    1       -0.0317616                  0.0963178807113
    2       -0.00305921                 0.09340810209172
    3       -0.000285755                0.09312907910623
    4       -0.0000266121               0.09310313729469
    5       -0.00000247767              0.09310037252741
    6       -0.000000230672             0.09310059304987
    7       -0.0000000214757            0.09310010849472
    8       -0.00000000199939           0.09310039562066
    9       -0.000000000186144          0.09310104005501
   10       -0.0000000000173302         0.09310567679542
   11       -0.00000000000161354        0.09316100003719
   12       -0.000000000000150319       0.09374663216227
   13       -0.0000000000000140919      0.10000070962752
   14       -0.0000000000000014092      0.1620777746239
   15       -0.0000000000000002284


3.7.3 Newton-Raphson

Iteration    Error en             en+1/en              en+1/en²
    0        0.0235988            0.00653855280777     0.2770714107399
    1        0.000154302          0.0000445311143083   0.2885971297087
    2        0.00000000687124     0.000000014553       -
    3        1.0E-15
    4        Machine accuracy

3.7.4 Secant method

Iteration    Error en                  en+1/en                  |en+1|/|en|^1.618
    0       -0.261799                  0.1213205550823          0.2777
    1       -0.0317616                -0.09730712558561         0.8203
    2        0.00309063               -0.009399086917554        0.3344
    3       -0.0000290491              0.0008898244696049       0.5664
    4       -0.0000000258486          -0.000008384051747483     0.4098
    5        0.000000000000216716
    6        Machine accuracy

3.7.5 Direct iteration

There are a variety of ways in which equation (21) may be rearranged into the form required for direct iteration.

3.7.5.1 Addition of x

Use

xn+1 = g(xn) = xn + cos xn - 1/2
(22)

Iteration    Error en                  en+1/en
    0       -0.547198                  0.30997006568
    1       -0.169615                  0.1804233116175
    2       -0.0306025                 0.1417596601585
    3       -0.00433820                0.1350620072841
    4       -0.000585926               0.1341210323488
    5       -0.0000785850              0.1339937647134
    6       -0.0000105299              0.1339775306508
    7       -0.00000141077             0.1339750632633
    8       -0.000000189008            0.1339747523914
    9       -0.0000000253223           0.1339747969181
   10       -0.00000000339255          0.1339744440023
   11       -0.000000000454515         0.1339748963181
   12       -0.0000000000608936        0.1339759843399
   13       -0.00000000000815828       0.1339878013503
   14       -0.00000000000109311       0.1340617138257
   15       -0.0000000000001465442


3.7.5.2 Multiplication by x

Use

xn+1 = g(xn) = 2xn cos xn
(23)

Iteration    Error en        en+1/en
    0        0.0635232      -0.9577980958138
    1       -0.0608424      -0.6773664418235
    2        0.0412126      -0.9070721090152
    3       -0.0373828      -0.7297714456916
    4        0.0272809      -0.8754733164962
    5       -0.0238837      -0.7600455540808
    6        0.0181527      -0.854809477378
    7       -0.0155171      -0.778843985023
    8        0.0120854      -0.8410892481838
    9       -0.0101649      -0.7908921878228
   10        0.00803934     -0.8319464035605
   11       -0.00668830     -0.7987216482514
   12        0.00534209     -0.8258546748557
   13       -0.00441179     -0.8038528579103
   14        0.00354643     -0.8218010788314
   15       -0.00291446


3.7.5.3 Approximating f'(x)

The Direct Iteration method is closely related to the Newton-Raphson method when a particular choice of transformation T(f(x),x) is made. Consider approximating f near x by a straight line of slope h(x), and seeking the point x1 at which this line vanishes:

f(x1) ≈ f(x) + (x1 - x)h(x) = 0.
(24)

Rearranging equation (24) for x1 and labelling the different variables for different steps in the iteration gives

xn+1 = g(xn) = xn - f(xn)/h(xn).
(25)

Now if we choose h(x) = f'(x), then we recover the Newton-Raphson method with its quadratic convergence: differentiating g(x) = x - f(x)/h(x) and using f(x*) = 0 shows g'(x*) = 1 - f'(x*)/h(x*), which vanishes for this choice of h.

In some situations calculation of f'(x) may not be feasible. In such cases it may be necessary to rely on the first order methods or the secant method, which do not require a knowledge of f'(x), but whose convergence is relatively slow. The Direct Iteration method, on the other hand, provides us with a framework for a faster method: we select h(x) as a readily computed approximation to f'(x). For the present f(x) = cos x - 1/2 we may approximate f'(x) = -sin x by the parabola

h(x) = 4x(x - π)/π²,
(26)

which, like -sin x, vanishes at x = 0 and x = π and equals -1 at x = π/2.

Iteration    Error en                   en+1/en
    0        0.0235988                  0.02985973863078
    1        0.000704654                0.02585084310882
    2        0.0000182159               0.02572477890195
    3        0.000000468600             0.02572151088348
    4        0.0000000120531            0.02572134969427
    5        0.000000000310022          0.02572107785899
    6        0.00000000000797410        0.02570835580191
    7        0.000000000000205001       0.02521207213623
    8        0.00000000000000516850
    9        Machine accuracy

The convergence, while still formally linear, is significantly more rapid than with the other first order methods. For a more complex f(x), the computational cost of the extra iterations relative to Newton-Raphson may be significantly less than the cost of evaluating the derivative.

A further potential use of this approach is to avoid the divergence problems associated with f'(x) vanishing in the Newton-Raphson scheme. Since h(x) only approximates f'(x), and the accuracy of this approximation matters most close to the root, it may be possible to choose h(x) in such a way as to avoid a divergent scheme.

3.7.6 Comparison

Figure 9 shows graphically a comparison between the different approaches to finding the roots of equation (21). The clear winner is the Newton-Raphson scheme, with the approximated derivative for the Direct Iteration proving a very good alternative.

3.7.7 Fortran program*

The following program was used to generate the data presented for the above examples. Note that this is included as an illustrative example. No knowledge of Fortran or any other programming language is required in this course.

PROGRAM Roots
      INTEGER*4 i,j
      REAL*8    x,xa,xb,xc,fa,fb,fc,pi,xStar,f,df
      REAL*8    Error(0:15,0:15)
      f(x)=cos(x)-0.5
      df(x) = -SIN(x)
      pi = 3.14159265358979D0
      xStar = ACOS(0.5)
      WRITE(6,*)'# ',xStar,f(xStar)
C=====Bisection
      xa = 0
      fa = f(xa)
      xb = pi/2.0
      fb = f(xb)
      DO i=0,15
        xc = (xa + xb)/2.0
        fc = f(xc)
        IF (fa*fc .LT. 0.0) THEN
          xb = xc
          fb = fc
        ELSE
          xa = xc
          fa = fc
        ENDIF
        Error(0,i) = xc - xStar
      ENDDO
C=====Linear interpolation
      xa = 0
      fa = f(xa)
      xb = pi/2.0
      fb = f(xb)
      DO i=0,15
        xc = xa - (xb-xa)/(fb-fa)*fa
        fc = f(xc)
        IF (fa*fc .LT. 0.0) THEN
          xb = xc
          fb = fc
        ELSE
          xa = xc
          fa = fc
        ENDIF
        Error(1,i) = xc - xStar
      ENDDO
C=====Newton-Raphson
      xa = pi/2.0
      DO i=0,15
        xa = xa - f(xa)/df(xa)
        Error(2,i) = xa - xStar
      ENDDO
C=====Secant
      xa = 0
      fa = f(xa)
      xb = pi/2.0
      fb = f(xb)
      DO i=0,15
        IF (fa .NE. fb) THEN
C         If fa = fb then either method has converged (xa=xb)
C         or will diverge from this point
          xc = xa - (xb-xa)/(fb-fa)*fa
          xa = xb
          fa = fb
          xb = xc
          fb = f(xb)
        ENDIF
        Error(3,i) = xc - xStar
      ENDDO
C=====Direct iteration using x + f(x) = x
      xa = 0.0
      DO i=0,15
        xa = xa + f(xa)
        Error(4,i) = xa - xStar
      ENDDO
C=====Direct iteration using  xf(x)=0 rearranged for x
C-----Starting point prevents convergence
      xa = pi/2.0
      DO i=0,15
        xa = 2.0*xa*(f(xa)+0.5)
        Error(5,i) = xa - xStar
      ENDDO
C=====Direct iteration using xf(x)=0 rearranged for x
      xa = pi/4.0
      DO i=0,15
        xa = 2.0*xa*COS(xa)
        Error(6,i) = xa - xStar
      ENDDO
C=====Direct iteration using 4x(x-pi)/pi/pi to approximate f'
      xa = pi/2.0
      DO i=0,15
        xa = xa - f(xa)*pi*pi/(4.0*xa*(xa-pi))
        Error(7,i) = xa - xStar
      ENDDO
C=====Output results
      DO i=0,15
        WRITE(6,100)i,(Error(j,i),j=0,7)
      ENDDO
100   FORMAT(1x,i4,8(1x,g12.6))
      END
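
For readers who do wish to experiment, the program is written in fixed-form FORTRAN 77 (with the common DO...ENDDO extension) and should build with most Fortran compilers; for example, assuming the source is saved as roots.f and the GNU compiler is available:

      gfortran -std=legacy roots.f -o roots
      ./roots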





Stuart Dalziel, last page update: 17 February 1998