J. Numer. Anal. Approx. Theory, vol. 44 (2015) no. 2, pp. 146–153 ictp.acad.ro/jnaat
A SIMPLIFIED PROOF OF THE KANTOROVICH THEOREM FOR SOLVING EQUATIONS USING TELESCOPIC SERIES
IOANNIS K. ARGYROS1 and HONGMIN REN2
Abstract. We extend the applicability of the Kantorovich theorem (KT) for solving nonlinear equations using Newton-Kantorovich method in a Banach space setting. Under the same information but using elementary scalar telescopic ma- jorizing series, we provide a simpler proof for the (KT) [2], [7]. Our results provide at least as precise information on the location of the solution. Numeri- cal examples are also provided in this study.
MSC 2010. 65J15, 65G99, 47H99, 49M15.
Keywords. Newton-Kantorovich method; Banach space; majorizing series, telescopic series, Kantorovich theorem.
1. INTRODUCTION
Newton’s method is one of the most fundamental tools in computational analysis, operations research, and optimization [1, 2, 4, 7–11]. One can find applications in management science; industrial and financial research; data mining; linear and nonlinear programming. In particular interior point algo- rithms in convex optimization are based on Newton’s method.
The basic idea of Newton’s method is linearization. SupposeF :R→R is a differentiable function, and we would like to solve equation
(1.1) F(x) = 0.
Starting from an initial guess, we can have the linear approximation of F(x) in the neighborhood ofx0 :F(x0+h)≈F(x0) +F0(x0)h, and solve the resulting linear equationF(x0)+F0(x0)h= 0, leading to the recurrent method (1.2) xn+1 =xn−F0(xn)−1F(xn) (n≥0).
This is Newton’s method as proposed in 1669 by I. Newton (for polynomial only). It was J. Raphson, who proposed the usage of Newton’s method for general functions F. That is why the method is often called the Newton- Raphson method. Later in 1818, Fourier proved that the method converges
1 Department of Mathematical Sciences, Cameron University, Lawton, Oklahoma 73505- 6377, USA, e-mail: [email protected].
2College of Information and Engineering, Hangzhou Polytechnic, Hangzhou 311402, Zhe- jiang, PR China, e-mail: [email protected].
quadratically in a neighborhood of the root, while Cauchy (1829, 1847) pro- vided the multidimensional extension of Newton’s method (1.2). In 1948, L.
V. Kantorovich published an important paper [7] extending Newton’s method for functional spaces (the Newton-Kantorovich method (NKM)). That is F : D⊆X → Y, where X, Y are Banach spaces, and D is a open convex set [1, 7]. Ever, since thousands of papers have been written in a Banach space set- ting for the (NKM) as well as Newton-type methods, and their applications.
We refer the reader to the publications [1–11] for recent results (see also, the references there).
It is stated in the (KT) that Newton’s method (1.2) converges provided the famous for its simplicity and clarity Kantorovich hypotheses (KH) (see (C6)) is satisfied. (KH) uses the information (x0, F, F0). Any successful attempt for weakening (KH) under the same information is extremely important in computational mathematics, since that will imply the extension of the appli- cability of (NKM). We have already provided conditions weaker than (KH) [1, 2] by introducing the center Lipichitz condition, which is a special case of the Lipschitz condition.
In this study we present a new proof of the (KT) for Newton’s method using telescopic majorizing sequences. Our proof is simpler than the corresponding one provided by Kantorovich [7]. Section 2 contains the semilocal convergence of Newton’s method (1.2). Numerical examples where our results apply to solve nonlinear equations are provided in Section 3.
2. SEMILOCAL CONVERGENCE OF NEWTON-KANTOROVICH METHOD (NKM)
Let us assume that F0(x0)−1 ∈L(Y, X) at somex0 ∈D, and the following conditions hold:
(C1) 0<kF0(x0)−1k ≤β, (C2) 0<kF0(x0)−1F(x0)k ≤η, and Lipschitz continuity condition
(C3) kF0(x)−F0(y)k ≤Lkx−yk for some L >0, and for all x, y∈D.
It is convenient for us to define fora0 =b0 = 1,
(2.1) γ =βLη,
and scalar sequences
(2.2) an+1 = an
1−γanbn,
(2.3) cn= 12Lb2n,
(2.4) bn+1 =βan+1cnη.
We shall find a connection between (NKM){xn}, and scalar sequences {an}, {bn}, {cn}. Sequences {an}, {bn} and {cn} have been used by Kantorovich [7]. However, we present a new proof for the (KT).
Lemma 2.1. Under the (C1)−(C3) conditions further suppose:
(C4) xn∈D, and
(C5) γanbn<1.
Then, the following estimates hold:
(In) kF0(xn)−1k ≤anβ,
(IIn) kxn+1−xnk=kF0(xn)−1F(xn)k ≤bnη, and
(IIIn) kF(xn+1)k ≤cnη2.
Proof. We shall use induction to show items (In)−(IIIn). (I0) and (II0) follow immediately from the initial conditions. To show (III0), we use (1.2) forn= 0, (II0), and (C3) to obtain
(2.5)
F(x1) = F(x1)−F(x0)−F0(x0)(x1−x0)
= Z 1
0
[F0(x0+t(x1−x0))−F0(x0)](x1−x0)dt
⇒
(2.6)
kF(x1)k = Z 1
0
[F0(x0+t(x1−x0))−F0(x0)](x1−x0)dt
≤ Z 1
0
k[F0(x0+t(x1−x0))−F0(x0)]kdtkx1−x0k
≤ Lkx1−x0k2 Z 1
0
tdt= L2kx1−x0k2
≤ L2b20η2 =c0η2.
If xk+1 ∈ D (k ≤ n), then it follows from (C3) −(C5) and the induction hypotheses:
(2.7) kF0(xk)−1kkF0(xk+1)−F0(xk)k ≤ akβLkxk+1−xkk
≤ akβLbkη=γakbk <1.
In view of (2.7), and the Banach lemma on invertible operators [1, 2, 7]
F0(xk+1)−1 ∈L(Y, X), so that
(2.8) kF0(xk+1)−1k ≤ 1−kF0(xk)−1kFkkF0(xk0(x)−1k+1k )−F0(xk)k
≤ 1−γaakβ
kbk =ak+1β, which shows (In) for alln≥0.
As in (2.5), we have (2.9)
F(xk+1) = F(xk+1)−F(xk)−F0(xk)(xk+1−xk)
= Z 1
0
[F0(xk+t(xk+1−xk))−F0(xk)](xk+1−xk)dt and
(2.10) kF(xk+1)k ≤ L2kxk+1−xkk2 ≤ L2b2kη2=ckη2,
which shows (IIIn) for alln≥0. Moreover, we get by (1.2), (2.8) and (2.10):
(2.11)
kxk+2−xk+1k = kF0(xk+1)−1F(xk+1)k ≤ kF0(xk+1)−1kkF(xk+1)k
≤ βak+1ckη2 =bk+1η.
That completes the induction for (IIn) and the proof of the lemma.
We need to show the convergence of sequence {xn}, which is equivalent to proving that {bn} is a Cauchy sequence. To this effect we need the following auxiliary results:
Lemma 2.2. Suppose:
there exists x0 ∈D such that
(C6) 2γ ≤1, where, γ is given by (2.1)
and Condition (C4) holds. Then, the following assertions hold:
(a) Scalar sequence {an} increases. In particular, (C5) holds.
(b) limn→∞bn= 0.
(c)
(2.12) r :=
∞
X
k=0
bk = 2 1 +√
1−2γ. (d) (C4) holds if U(x0, rη)⊆D.
Proof. (a) We shall show using induction that {an}, {bn} and {cn} are positive sequences.
In view of the initial conditions, a0, b0,c0 are positive, and 1−γa0b0 >0.
Assume ak,bk, ck and 1−γakbk are positive for k ≤n. By (2.2), ak+1 >0, consequently bk+1 > 0. Moreover, 1−γak+1bk+1 > 0 by (C6) (see also [2], [7]). The induction is completed.
Solving (2.2) for bn, we obtain
(2.13) bn= 1
γ( 1 an
− 1 an+1
).
By telescopic sum, we have:
(2.14)
n−1
X
k=0
bk= 1 γ( 1
a0 − 1 an) = 1
γ(1− 1
an), since a0 = 1, or
(2.15) an= 1
1−γPn−1k=0bk.
But 1−γPn−1k=0bk decreases, so{an}given by (2.15) increases. Note also that an≥a0= 1.
(b) By (a)an≥1 is increasing, so that
(2.16) 0< 1
an
≤1.
Therefore,{a1
n}is monotonic on the compact set [0,1] and as such it converges to some limita. By lettingn→ ∞in (2.13), we get
(2.17) lim
n→∞bn= lim
n→∞
1 γ( 1
an
− 1 an+1
) = 1
γ(a−a) = 0.
(c) It follows from (b) and (2.14) that there exists r=P∞k=0bk. The value of r is well known to be given by (2.12) [2], [7].
(d) We have kx1 −x0k ≤ b0η = η ⇒ x1 ∈ U(x0, rη). Assume xk ∈ U(x0, rη)⊆Dfor all k≤n. Then, we have by Lemma 2.1 in turn that
kxk+1−x0k ≤ kxk+1−xkk+· · ·+kx1−x0k ≤(bk+· · ·+b0)η < rη,
⇒
xk+1∈U(x0, rη)⊆D.
That completes the proof of the lemma.
Remark 2.3. In view of (C3) there existsL0>0 such that kF0(x)−F0(x0)k ≤L0kx−x0k for allx∈D.
Note that
L0 ≤L holds in general and LL
0 can be arbitrarily large [2].
We can show the semilocal convergence result for (NKM) (1.2).
Theorem 2.4. Under conditions (C1)−(C3), (C6) further assume
(C7) U(x0, rη) ={x∈X|kx−x0k ≤rη} ⊆D where, r is given by (2.12).
Then, sequence {xn} generated by (NKM) (1.2) is well defined, remains in U(x0, rη) for alln≥0 and converges to a solutionx?∈U(x0, rη) of equation F(x) = 0. Moreover, the following estimate holds:
(2.18) kxn−x?k ≤
∞
X
k=n
bkη < rη.
Furthermore, x? is the only solution of equation F(x) = 0 in D0TD = D1 where, D0 =U(x0,βL1
0).,
Proof. It follows from Lemmas 2.1 and 2.2 (see also (IIn)) that {xn} is a Cauchy sequence in a Banach space X and as such it converges to some x?. We have limn→∞bn = 0, which implies by (2.3) that limn→∞cn = 0. By letting n → ∞ in (2.10) and using the continuity of operator F, we obtain F(x?) = 0.
By (C7), we obtain
(2.19) kxn+1−x0k ≤
n
X
k=0
kxk+1−xkk ≤
n
X
k=0
bkη < rη.
⇒xn+1 ∈U(x0, rη)
⇒x? = limn→∞xn∈U(x0, rη) (sinceU(x0, rη) is a closed set).
Letm > n. Then, we have
(2.20) kxn−xmk ≤
m−1
X
k=n
kxk−xk+1k ≤
m−1
X
k=n
bkη≤
∞
X
k=n
bkη < rη.
By letting m → ∞ in (2.20), we obtain (2.18). Finally, to show uniqueness, let y? ∈D0 be a solution of equationF(x) = 0. Define linear operator
(2.21) M =
Z 1 0
F0(x?+t(y?−x?))dt.
We have using (C3):
kF0(x0)−1kkM−F0(x0)k ≤βL0
Z 1 0
kx?+t(y?−x?)−x0kdt
≤βL0
Z 1 0
[(1−t)kx?−x0k+tky?−x0k]dt (2.22)
< βL0
2 (rη+ 1 βL0)≤ 1
2 +1 2 = 1.
It follows from (2.22), and the Banach lemma on invertible operators that M−1∈L(Y, X). In view of (2.21), we get
0 =F(y?)−F(x?) =M(y?−x?),
which impliesx?=y?. That completes the proof of the theorem.
Remark 2.5. If L0 = L, Theorem 2.4 reduces to the (KT). Otherwise, (i.e if L0 < L), then our theorem provides a more precise information on the location of the solution.
3. APPLICATIONS
In the first example we show that the Kantorovich hypothesis (see (3.2)) is satisfied with the bigger uniqueness ball of solution than before [2], [7].
Example 3.1. Let X = Y = R be equipped with the max-norm. Let x0 = 1, D=U(x0,1−q), q∈[0,1) and define functionF on Dby
(3.1) F(x) =x3−q.
Then, we obtain that β = 13, L= 6(2−q), L0 = 3(3−q) andη = 13(1−q).
Then, the famous for its simplicity and clarity Kantorovich hypothesis for solving equations using (NKM) [1, 2, 7] is satisfied, say for q=.6, since
(3.2) h= 2Lβη= 4
3(2−q)(1−q) = 0.746666. . . <1.
Hence, (NKM) converges starting at x0 = 1. We also have that r = 1.330386707, η = .133. . ., rη = .177384894, L0 = 7.2 < L = 8.4 and
1
βL0 = .41666. . .. That is our Theorem 2.4 guarantees the convergence of
(NKM) to x? = √3
0.49 = .788373516 and the uniqueness ball is better than the one given in (KT).
In the second example we apply Theorem 2.4 to a nonlinear integral equation of Chandrasekhar-type.
Example 3.2. Let us consider the equation
(3.3) x(s) = 1 + s
4x(s) Z 1
0
x(t)
s+tdt, s∈[0,1].
Note that solving (3.3) is equivalent to solving F(x) = 0, where F :C[0,1]→ C[0,1] defined by
(3.4) [F(x)](s) =x(s)−1−s 4x(s)
Z 1 0
x(t)
s+tdt, s∈[0,1].
Using (3.4), we obtain that the Fr´echet-derivative ofF is given by (3.5) [F0(x)y](s) =y(s)− s
4y(s) Z 1
0
x(t) s+tdt− s
4x(s) Z 1
0
y(t)
s+tdt, s∈[0,1].
Let us choose the initial point x0(s) = 1 for each s ∈ [0,1]. Then, we have that β = 1.534463572, η = .2659022747, L0 = L = ln 2 = .693147181, h = 2γ =.392066334 andr = 1.23784269 (see also [1,2,3,7]). Then, hypotheses of Theorem 2.4 are satisfied. In consequence, equation F(x) = 0 has a solution x? inU(1, ρ), where ρ=rη=.298816793.
Acknowledgements. This work was supported by National Natural Sci- ence Foundation of China (Grant No. 10871178).
REFERENCES
[1] I.K. Argyros, Polynomial equations in abstract spaces and applications, St. Lu- cie/CRC/Lewis Publ. Mathematics Series, 1998, Boch Raton Florida.
[2] I.K. Argyros,Computational theory of iterative methods, Series: Studies in Compu- tational Mathematics 15, Editors, C.K. Chui and L. Wuytack, Elservier Publ. Co. New York, USA, 2007.
[3] S. Chandrasekhar,Radiative Transfer, Dover. Publ, 1960, New York.
[4] P. Deuflhard,Newton Methods for Nonlinear Problems: Affine Invariance and Adap- tive Algorithms, Springer-Verlag, Berlin, Heidelberg, 2004.
[5] J.A. Ezquerro, M.A. Hern´andez, New Kantorovich-type conditions for Halley’s method, Appl. Num. Anal. Comp. Math., 2 (2005), pp. 70–77.
[6] J.A. Ezquerro, M.A. Hern´andez, An optimization of Chebyshev-method, J. Com- plexity, 25 (2009), pp. 343–361.
[7] L.V. Kantorovich, G.P. Akilov, Functional Analysis, Pergamon Press, Oxford, 1982.
[8] J.H. Mathews, Bibligraphy for Newton’s method, Available from:
http://mathfaculty.fullerton.edu/mathews/n2003/newtonsmethod/
Newton’sMethodBib/Links/Newton’sMethodBib lnk 3.html.
[9] Yu. Nesterov,Introductory Lectures on Convex Programming, Kluwer, Boston, 2004.
[10] S.J. Wright,Primal-Dual Interior Point Methods, SIAM, Philadelphia, 1997.
[11] T.J. Ypma,Historical developments of the Newton-Raphson method, SIAM Review, 37 (1995), pp. 531–551.
Received by the editors: September 12, 2012.