1 \Politehni a" University of Bu harest Department of Engineer S ien es Ele tri al Engineering and Computers Se tion
NUMERICAL METHODS
Boris Jora, Bogdan Dumitrescu, Cristian Oara
1995
Contents

1 Main features of numerical computation
  1.1 The floating point representation
  1.2 Computer arithmetic
  1.3 Numerical condition and numerical stability

2 Direct methods for solving linear systems
  2.1 Types of linear systems
  2.2 Triangular systems
  2.3 Gaussian elimination
  2.4 Pivoting strategies
    2.4.1 Partial pivoting
    2.4.2 Complete pivoting
  2.5 Solving determined linear systems
  2.6 Matrix inversion
    2.6.1 Triangular matrix inversion
    2.6.2 Matrix inversion using GPP and GCP
  2.7 Determinant calculus
  2.8 LDU and LU matrix factorizations
  2.9 Cholesky factorization
  2.10 Applications of the LU factorizations
    2.10.1 Linear system solving
    2.10.2 Matrix inversion
    2.10.3 Determinant calculus
  2.11 Numerical condition of linear systems
    2.11.1 Matrix norms
    2.11.2 Numerical condition of linear systems
    2.11.3 Numerical stability of direct algorithms for linear systems

3 The linear least squares problem
  3.1 The linear least squares problem
  3.2 Orthogonality. Householder and Givens transformations
    3.2.1 Householder transformations
    3.2.2 Givens transformations
  3.3 Orthogonal triangularization. QR factorization
  3.4 Solving the linear least squares problem
  3.5 Solving underdetermined linear systems

4 Iterative techniques in matrix algebra
  4.1 Vector and matrix sequences. Convergence
  4.2 Iterative techniques for linear system solving
    4.2.1 Jacobi iterative method
    4.2.2 Gauss-Seidel iterative method
    4.2.3 Relaxation methods
  4.3 Iterative refinement of the computed solution

5 Eigenvalues and eigenvectors
  5.1 The space Cn
  5.2 Eigenvalues and eigenvectors
  5.3 The real Schur form. Deflation techniques
  5.4 Power and inverse power methods
    5.4.1 The power method
    5.4.2 The inverse power method
  5.5 The QR algorithm
    5.5.1 Reduction to Hessenberg form
    5.5.2 The iterative stage of the QR algorithm

6 The singular value decomposition
  6.1 Definitions. Properties
  6.2 The SVD Algorithm
  6.3 Applications of the SVD
    6.3.1 On computing the rank of a matrix
    6.3.2 The general least squares problem
    6.3.3 Orthonormal basis for linear subspaces
    6.3.4 The pseudoinverse computation

Solutions to the problems
Chapter 1
Main features of numerical computation

This chapter is devoted to a discussion of the practical considerations that make the difference between an ineffective and an effective algorithm. Many existence proofs in mathematics specify procedures by which the object whose existence is to be established may actually be constructed. However, there is a wide gap between a mathematical algorithm and a reliable computer algorithm for solving a given problem. A correct mathematical algorithm can be inadequate for practical purposes because of the possibly disastrous effects of rounding errors, or because of its large complexity (concerning memory and execution time requirements). The memory requirement can be evaluated by the amount of input and auxiliary data. The execution time is usually evaluated by the number of arithmetic (and sometimes logical) operations needed to compute the solution from the input data. A more difficult task is to estimate the effects of rounding errors on the accuracy of the computed solution. To approach this problem, let us introduce some definitions. Let $x$ and $\hat{x}$ be real numbers, where $\hat{x}$ is interpreted as an approximation of $x$. Two natural measures of the approximation accuracy are given by:

Definition 1.1 Let $x, \hat{x} \in \mathbb{R}$, with $\hat{x}$ regarded as an approximation to $x$. The absolute error in $\hat{x}$ is the number
$$\varepsilon_a = |x - \hat{x}|.$$
If $x \neq 0$, the relative error is the number
$$\varepsilon_r = \left|\frac{x - \hat{x}}{x}\right| = \frac{\varepsilon_a}{|x|}.$$
Example 1.2 Let $x = 1.0$ and $\hat{x} = 0.999$ its approximation. Then $\varepsilon_a = 10^{-3}$ and $\varepsilon_r = 10^{-3}$. If $\hat{y} = 0.009$ is an approximation to $y = 0.01$, then the absolute error is the same, $\varepsilon_a = 10^{-3}$, but the relative error is a hundred times greater: $\varepsilon_r = 10^{-1}$.

This example indicates that the relative error is much more appropriate for estimating approximation accuracy.

1.1 The floating point representation
Rounding errors appear because real numbers are represented in the computer by a finite sequence of symbols (binary digits). To estimate such errors, let us recall the well-known positional representation of real numbers. Let us introduce:
- $\beta \in \mathbb{N}$, $\beta \geq 2$, the numeration base;
- $C = \{d_0, d_1, \ldots, d_{\beta-1}\}$, a set of symbols called digits, for the first $\beta$ natural numbers;
- $S = \{+, -, .\}$, an additional set of symbols.

Theorem 1.3 For every $x \in \mathbb{R}$ there is a unique infinite sequence of symbols from $C$ and $S$, of the form
$$x = s\,a_{n-1}a_{n-2}\ldots a_0 . a_{-1}a_{-2}a_{-3}\ldots,$$
which does not end in an infinite sequence of $d_{\beta-1}$, where $s \in \{+, -\}$ and $a_i \in C$, so that the value of $x$ is
$$x = s\left(\sum_{i=0}^{n-1} a_i\beta^i + \sum_{i=1}^{\infty} a_{-i}\beta^{-i}\right).$$
In this formula, the $a_i$ are regarded as the values of the natural numbers symbolized by the digits $a_i$, and $s$ is regarded as $+1$ or $-1$.
Example 1.4 In the base-ten numeration system, the two sequences $0.99999\ldots$ and $1.00000\ldots$ represent the same real number. The binary representations of the same number are $0.11111\ldots$ and $1.00000\ldots$ The accepted representation is, in both cases, $1.0000\ldots$.
Obviously, the computer representation of a real number may use only a finite number of symbols, and so it is inherently approximate. A "good" representation by a finite sequence with a fixed number of symbols must satisfy the following requirements:
- a sufficiently large range of represented numbers;
- sufficiently small relative representation errors;
- a uniform error distribution.

A good compromise between these requirements is given by the so-called Floating Point Format (FPF).

Definition 1.5 A Floating Point Format (FPF) is defined by three integers $(\beta, t, s)$, having the following significance: $\beta$ -- the numeration base ($\beta \geq 2$); $t$ -- the precision, i.e. the number of significant digits (the mantissa "length"); $s$ -- the number of exponent digits. A floating point number is a pair $(f, e)$, where $f = \pm 0.f_1 f_2 \ldots f_t$, $f_1 \neq d_0$, $f_i \in C$, is a signed, normalized ($f_1 \neq 0$) fractional real number with $t$ digits, called the mantissa, and $e = \pm e_1 e_2 \ldots e_s$ is a signed integer with $s$ digits, called the exponent. The exact value associated with the floating point number $(f, e)$ is $x = f\beta^e$.

Let $L$ be the minimum value of the exponent (the $s$-digit integer with all digits $d_{\beta-1}$, taken with minus sign) and $U = -L$ the maximum. Notice that all floating point numbers $x$ satisfy
$$\beta^{L-1} = m \leq |x| \leq M = \beta^U(1 - \beta^{-t}),$$
i.e. the range of the floating point representation is $R = [-M, M]$. Obviously, in all computer representations, $\beta = 2$.

Example 1.6 Let us consider the FPF with $(\beta, t, s) = (2, 2, 1)$. Then $L = -1$, $U = 1$, $m = 2^{-2} = 0.25_{10}$, $M = 2(1 - 2^{-2}) = 1.5_{10}$. The floating point numbers are:
$x_1 = (-0.11, +1) = -1.5_{10}$, $x_2 = (-0.10, +1) = -1.0_{10}$, $x_3 = (-0.11, 0) = -0.75_{10}$, $x_4 = (-0.10, 0) = -0.5_{10}$, $x_5 = (-0.11, -1) = -0.375_{10}$, $x_6 = (-0.10, -1) = -0.25_{10}$, $x_7 = (+0.10, -1) = 0.25_{10}$, $x_8 = (+0.11, -1) = 0.375_{10}$, $x_9 = (+0.10, 0) = 0.5_{10}$, $x_{10} = (+0.11, 0) = 0.75_{10}$, $x_{11} = (+0.10, +1) = 1.0_{10}$, $x_{12} = (+0.11, +1) = 1.5_{10}$.
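To make Example 1.6 concrete, here is a small Python sketch (an illustration added here, not part of the original text; function name and storage are ours) that enumerates all normalized numbers of a toy FPF; with the parameters of the example it reproduces the twelve values listed above.

```python
from fractions import Fraction

def enumerate_fpf(beta=2, t=2, L=-1, U=1):
    """Enumerate all normalized floating point numbers of the toy FPF
    with base beta, t mantissa digits and exponent range L..U, exactly."""
    numbers = set()
    for e in range(L, U + 1):
        # normalized mantissas 0.f1...ft with f1 != 0
        for mant in range(beta ** (t - 1), beta ** t):
            f = Fraction(mant, beta ** t)          # value of 0.f1...ft
            for sign in (+1, -1):
                numbers.add(sign * f * Fraction(beta) ** e)
    return sorted(numbers)

if __name__ == "__main__":
    vals = enumerate_fpf()
    print(len(vals), [float(v) for v in vals])
    # Expected for example 1.6: 12 numbers, from -1.5 to 1.5
```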
[Figure 1.1: The representation of all floating point numbers of Example 1.6 on the real axis.]

[Figure 1.2: Floating point number structure for $(\beta, t, s) = (2, 24, 7)$: one (mantissa) sign bit, the exponent sign and 7 exponent bits, and a $(23+1)$-bit mantissa.]

The geometrical representation, on the real axis, of all the numbers above is given in Figure 1.1. The floating point numbers are relatively uniformly distributed, i.e. $|x_i - x_{i-1}|/|x_i|$ is approximately the same for all $i$.

Example 1.7 A standard 32-bit FPF may be the following: $(\beta, t, s) = (2, 24, 7)$. Two bits are assigned to the mantissa and exponent signs, but only the latter must be physically stored; the former takes the place of the first bit of the mantissa, which always has the value "1" and thus need not be stored. An example of a floating point number in such a format is presented in Figure 1.2. For this FPF: $L = -127$, $U = 127$, $m = 2^{-127} = (2^{10})^{-13} \cdot 2^{3} \approx 10^{-38}$, $M \approx 10^{38}$ (we can approximate $2^{10} \approx 10^3$). This indicates that the FPF provides a wide range of numbers, represented with a relatively small number of digits. This example is very close to the actual IEEE standard format for single precision floating point numbers.

In order to estimate the precision of a floating point representation, let a FPF $(\beta, t, s)$ be given, and denote
$$F = \{x \in \mathbb{R} \mid x \text{ has an exact FP representation}\} \cup \{0\}. \tag{1.1}$$
Of course, $F$ is a finite set of rational numbers. To represent a real number $x \in [-M, M]$ in the FPF means to approximate it by a number $\hat{x} \in F$. This approximation can be expressed by a so-called rounding function.

Definition 1.8 If $(\beta, t, s)$ is a FPF and $F$ the set defined in (1.1), then a function
$$fl : [-M, M] \to F$$
which associates with every $x \in [-M, M]$ a unique FP representation $\hat{x} = fl(x)$ is called a rounding function. The approximation relative error
$$\frac{|x - fl(x)|}{|x|},$$
defined for all nonzero $x \in [-M, M]$, is called the roundoff error.
As the interval $[-M, M]$ is an infinite set of real numbers, each $\hat{x} \in F$ represents an infinite set of numbers from $[-M, M]$; so we are interested in establishing an upper bound on the roundoff error for a given rounding function. There are several rounding functions; we present only the one called chopped rounding. In order to introduce it, let us write the number $x \in [-M, M]$ in the form
$$x = f\beta^e = \pm 0.f_1 f_2 \ldots f_t f_{t+1}\ldots\,\beta^e = \pm 0.f_1 f_2 \ldots f_t\,\beta^e \pm 0.f_{t+1}f_{t+2}\ldots\,\beta^{e-t} = \tilde{f}\beta^e + g\beta^{e-t},$$
where $f_i \in C$, $f_1 \neq d_0$, $\tilde{f} = \pm 0.f_1 f_2 \ldots f_t$, $g = \pm 0.f_{t+1}f_{t+2}\ldots$. Obviously,
$$1/\beta \leq |f| < 1, \quad 1/\beta \leq |\tilde{f}| < 1, \quad 0 \leq |g| < 1. \tag{1.2}$$

Definition 1.9 The chopped rounding function
$$fl_1 : [-M, M] \to F$$
is defined by
$$\hat{x} = fl_1(x) = \begin{cases} \tilde{f}\beta^e, & \text{for } x \neq 0, \\ 0, & \text{for } x = 0. \end{cases}$$

Using (1.2) it is now easy to establish an upper bound for the roundoff errors introduced by $fl_1$. Indeed, for every $x \in [-M, M] \setminus \{0\}$:
$$\varepsilon_r = \frac{|x - fl_1(x)|}{|x|} = \frac{|f\beta^e - \tilde{f}\beta^e|}{|f|\beta^e} = \frac{|g|\beta^{e-t}}{|f|\beta^e} < \frac{\beta^{-t}}{\beta^{-1}} = \beta^{-t+1}.$$
This formula indicates that the roundoff error level is exclusively determined by the number of mantissa digits, and this is the reason for calling $t$ the floating point precision. On most computers, the floating point numbers have a fixed precision. Many computers have the ability to manipulate floating point numbers with about $2t$
mantissa bits. Such numbers are called double precision and computations involving them are said to be done in double precision. For instance, in the IEEE standard, the double precision numbers are stored on 64 bits, with $t = 53$ and $s = 10$ (an extra bit is for the exponent sign). For all rounding functions in use, the relative error bound has the form
$$\varepsilon_r \leq \mu\beta^{-t}, \tag{1.3}$$
where $\mu$ is a number of order unity. The $t$ mantissa digits are known as the significant digits of the real number involved. In the decimal numeration base, the IEEE single precision numbers ($t = 24$) have about 7 significant digits (we count 3 decimal digits for 10 binary digits) and double precision numbers ($t = 53$) have about 16 significant digits; consequently, the roundoff error bound is about $10^{-7}$ and $10^{-16}$, respectively; this quantity is also called the machine epsilon. From (1.3) it results that there exists a number $\varepsilon$ such that
$$\hat{x} = fl(x) = x(1 + \varepsilon), \quad |\varepsilon| \leq \mu\beta^{-t}, \tag{1.4}$$
which is another way to express the roundoff error.
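As an illustration of the machine epsilon mentioned above, the following Python sketch (an added example, not from the original text) finds the smallest power of 2 that still changes the value of 1.0 when added to it, in the standard double precision arithmetic.

```python
def machine_epsilon():
    """Return the distance from 1.0 to the next larger double precision
    number, found by halving a candidate until 1.0 + eps/2 rounds to 1.0."""
    eps = 1.0
    while 1.0 + eps / 2.0 != 1.0:
        eps /= 2.0
    return eps

if __name__ == "__main__":
    eps = machine_epsilon()
    print(eps)              # 2.220446049250313e-16 for IEEE double precision
    print(eps == 2.0**-52)  # True: t = 53 mantissa bits
```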
1.2 Computer arithmetic

Computers with floating point hardware are provided with a set of instructions for manipulating floating point numbers, such as addition, subtraction, multiplication or division. We must stress that the operations mentioned above differ from the corresponding mathematical ones, because the result necessarily belongs to the finite set $F$ of floating point numbers. Hence, the arithmetic operations cannot be performed exactly. The error introduced by the floating point arithmetic operations is called rounding error. The consequences can be very important; there is a real possibility that rounding errors accumulate and, in extensive computations, contaminate the answers so as to make them useless. Therefore, it is desirable that every algorithm be analyzed to show that rounding errors will not affect the results unduly. This is, in general, a very difficult task. In order to perform a rounding error analysis we need to know bounds for the errors in floating point operations. For most computers, the following model of floating point operation errors, based on the evaluation (1.4), is available: if we denote by $fl(xy)$, $fl(x/y)$ and $fl(x + y)$ the computed product, quotient and sum of the floating point numbers $x$ and $y$, then, in a $t$-digit FPF,
$$fl(x \;\mathrm{op}\; y) = (x \;\mathrm{op}\; y)(1 + \varepsilon), \quad |\varepsilon| \leq \mu\beta^{-t}, \tag{1.5}$$
where op can be $+$, $-$, $\cdot$ or $/$; $\mu$ is of order unity. This relation guarantees that these operations introduce only small relative errors into the computation. For addition, the relation can be written as $fl(x + y) = x(1 + \varepsilon) + y(1 + \varepsilon)$, which says that the computed sum of $x$ and $y$ is the exact sum of two numbers that are very near to $x$ and $y$. In numerical analysis terminology, this means that the addition algorithm is stable. For most purposes, this result is good enough; however, when $x$ and $y$ are of opposite sign and of nearly equal modulus, the rounding error may be important.

Example 1.10 If the floating point representations of $x$ and $y$ have the same exponent and the same first $k$ mantissa digits, i.e.
$$x = fl(x) = 0.d_1 d_2 \ldots d_k d_{k+1}\ldots d_t\,\beta^e, \qquad y = fl(y) = 0.d_1 d_2 \ldots d_k \delta_{k+1}\ldots\delta_t\,\beta^e,$$
then the floating point form of $x - y$ will be
$$fl(x - y) = 0.f_1\ldots f_{t-k}\,g_{t-k+1}\ldots g_t\,\beta^{e-k},$$
where the first $t - k$ digits are correct, but the others, denoted by $g$, have no significance (they might even be randomly assigned). In any further computation involving $x - y$ we can rely on at most $t - k$ correct significant digits, since a chain of calculations is not expected to be more accurate than its weakest point (there are, however, situations when the results are more accurate; for instance, if we compute $(x - y) + y$, the result will have at least $t - 1$ correct digits; but, usually, the analysis is done for the worst case). This loss of exact significant digits is due to the data involved, and not to the algorithm. It can often be avoided by changing the algorithm which implies such subtractions, as in the example below.

Example 1.11 The well-known formulas for computing the roots of $ax^2 + bx + c = 0$, $a \neq 0$,
$$x_1 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}, \qquad x_2 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}$$
are not the best ones from the numerical point of view, especially when $b^2 \gg 4ac$. The formulas
$$x_1 = \frac{-b - \operatorname{sign}(b)\sqrt{b^2 - 4ac}}{2a}, \qquad x_2 = \frac{-2c}{\,b + \operatorname{sign}(b)\sqrt{b^2 - 4ac}\,}$$
are better, as they avoid subtractions of nearby numbers (the second relation is deduced from $x_1 x_2 = c/a$).
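The effect can be checked numerically. The short Python sketch below (an added illustration, not part of the original text; function names are ours) compares the two pairs of formulas from Example 1.11 for a case with $b^2 \gg 4ac$, where the small root is close to $-c/b = -10^{-8}$.

```python
import math

def roots_naive(a, b, c):
    d = math.sqrt(b * b - 4 * a * c)
    return (-b - d) / (2 * a), (-b + d) / (2 * a)

def roots_stable(a, b, c):
    # Avoid cancellation: compute the large-magnitude root first,
    # then recover the other one from x1 * x2 = c / a.
    d = math.sqrt(b * b - 4 * a * c)
    s = 1.0 if b >= 0 else -1.0
    x1 = (-b - s * d) / (2 * a)
    x2 = -2 * c / (b + s * d)
    return x1, x2

if __name__ == "__main__":
    a, b, c = 1.0, 1.0e8, 1.0        # b*b >> 4*a*c
    _, x2_naive = roots_naive(a, b, c)
    _, x2_stable = roots_stable(a, b, c)
    print("naive :", x2_naive)       # 0.0 -- all significant digits lost
    print("stable:", x2_stable)      # -1e-08, close to the true small root
```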
We do not present how the basic floating point arithmetic operations are performed. We must only retain the evaluation (1.5) of the computed numbers and its possible consequences. Finally, we mention that when a floating point operation produces a number with a too large absolute value ($> M$), a floating point overflow occurs. When the result has a too small absolute value ($< m$, but nonzero), an underflow occurs. Obviously, every well constructed algorithm must specify what to do in the event of an overflow or an underflow. Most computers automatically set the resulting number to zero in the case of an underflow, and stop the computation, or at least write a warning message, in the case of an overflow. In the IEEE floating point standard, there is a special value called infinity which is assigned to the result at overflow; the computation continues with this value and the results follow the known rules for operations with infinity. Another special value -- Not a Number (NaN) -- is assigned to an indefinite result, such as $0 \cdot \infty$; an operation involving a NaN always has NaN as result.
1.3 Numerical condition and numerical stability

Any numerical problem requires the computation of some numerical results starting from some numerical initial data, such that the results can be interpreted as an approximation to the solution of a mathematically expressed problem at the initial data point. For such a problem to be well defined, the existence and uniqueness of the solution must be assured. Thus, a numerical problem can always be described as the evaluation of a function
$$f : X \subset \mathbb{R}^m \to Y \subset \mathbb{R}^n \tag{1.6}$$
at a given point $x \in X$. The $m$ components of the argument are the entry data and the $n$ components of $f(x)$ are the answers or output data.

Example 1.12 a) Given a set of three real numbers, i.e.
$$x = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \in X = \{x \in \mathbb{R}^3 \mid a \neq 0,\ b^2 - 4ac \geq 0\} \subset \mathbb{R}^3,$$
to solve the quadratic equation
$$ay^2 + by + c = 0$$
is a well defined numerical problem, since for any $x \in X$ there exists a unique vector
$$y = f(x) = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} \dfrac{-b - \operatorname{sign}(b)\sqrt{b^2 - 4ac}}{2a} \\[2mm] \dfrac{-2c}{\,b + \operatorname{sign}(b)\sqrt{b^2 - 4ac}\,} \end{bmatrix} \in \mathbb{R}^2 = Y,$$
which is the problem's solution.

b) To compute
$$\int_a^b e^{-t^2}\,dt,$$
where $a, b \in \mathbb{R}$ are two given numbers, is also a well defined numerical problem. Indeed, for the entry data $x = \begin{bmatrix} a \\ b \end{bmatrix} \in X = \mathbb{R}^2$, there is a unique integral value $y \in Y = \mathbb{R}$, although the indefinite integral $\int e^{-t^2}\,dt$ has no closed-form primitive.
The inaccuracy of the computed solution of a given numerical problem may have two essentially different sources:

1. The problem's sensitivity to initial data perturbations. In practice, only an approximation $\hat{x}$ to the entry data $x$ is known, and even at best we can calculate only $f(\hat{x})$ instead of $f(x)$. If the function $f$ defining our problem is too sensitive to argument variations, $f(x)$ and $f(\hat{x})$ may differ greatly even for a small relative error $\|x - \hat{x}\|/\|x\|$ of the entry data ($\|\cdot\|$ is a vector norm). Such a problem is called ill conditioned. If we attempt to solve an ill conditioned problem starting with inexact data, the solution will be inexact no matter how it is computed! The numerical condition of a problem $f : X \subset \mathbb{R}^m \to Y \subset \mathbb{R}^n$ at a point $x \in X$ can be expressed by the relative error quotient
$$\kappa(x) = \frac{\|f(x) - f(\hat{x})\| / \|f(x)\|}{\|x - \hat{x}\| / \|x\|}, \tag{1.7}$$
for $x \neq 0$ and $f(x) \neq 0$. When $\kappa(x)$ is small (say, of order unity) the problem is said to be well conditioned at $x$. Otherwise, it is ill conditioned.

2. The numerical quality of the algorithm. In order to solve a given numerical problem defined by the function (1.6), a computer executes a definite sequence of arithmetic and logical operations called an algorithm. In general, for one problem there may exist several algorithms. An algorithm gives a uniquely determined answer for given entry data, and so it can also be mathematically expressed by a function
$$\hat{f} : X \subset \mathbb{R}^m \to Y \subset \mathbb{R}^n.$$
We do not expect $\hat{f}$ to solve ill conditioned problems more accurately than the data warrants. However, we want $\hat{f}$ not to introduce larger inaccuracies by itself. This suggests the following requirement about $\hat{f}$: "for any $x \in X$, there is a nearby $\hat{x} \in X$ such that $f(\hat{x})$ is equal to $\hat{f}(x)$", i.e. the solution computed by
the given algorithm with exact entry data is the exact solution of the problem for some slightly perturbed entry data. Algorithms with this property are called stable.

It is easy to show that a well conditioned problem is solved by a stable algorithm with high accuracy. Indeed, from the numerical stability of the algorithm $\hat{f}$, for exact initial data $x$ there exists $\hat{x}$ such that the relative error is of roundoff error order:
$$\frac{\|x - \hat{x}\|}{\|x\|} \approx \mu\beta^{-t} \quad \text{and} \quad f(\hat{x}) = \hat{f}(x). \tag{1.8}$$
The problem being well conditioned, from (1.7) it results that
$$\frac{\|f(x) - f(\hat{x})\|}{\|f(x)\|} = \kappa\,\frac{\|x - \hat{x}\|}{\|x\|} \approx \kappa\mu\beta^{-t}, \tag{1.9}$$
where $\kappa$, $\mu$ are of order unity. Hence, from (1.8) and (1.9),
$$\frac{\|f(x) - \hat{f}(x)\|}{\|f(x)\|} \approx \kappa\mu\beta^{-t},$$
i.e., for a well conditioned problem, a stable algorithm does not introduce significant errors by itself. Similarly, it can be seen that in all other situations (i.e. for ill conditioned problems or unstable algorithms) the resulting error cannot be kept under control. When a stable algorithm is used to solve an ill conditioned problem, there is no guarantee that $f(x)$ and $f(\hat{x})$ are near one another and, although $f(\hat{x})$ is equal to $\hat{f}(x)$, the difference between $f(x)$ and $\hat{f}(x)$ may be important. When an unstable algorithm is used to solve a well conditioned problem, the computed solution with exact entries $\hat{f}(x)$ is an exact solution of the problem for some $\hat{x}$ far from $x$, and so $f(\hat{x})$ is far from $f(x)$. Finally, there is no hope of small errors when an unstable algorithm is applied to an ill conditioned problem.

Example 1.13 Let us consider the problem of computing the sum of two numbers, defined by
$$f : \mathbb{R}^2 \to \mathbb{R}, \quad y = f(x) = x_1 + x_2,$$
where $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \in \mathbb{R}^2$ is the entry data and $y = f(x) \in \mathbb{R}$ is the output data. The problem may or may not be ill conditioned, depending on the input data (see Example 1.10).
Suppose now that the sum is computed in floating point arithmetic by an algorithm $\hat{f} : \mathbb{R}^2 \to \mathbb{R}$,
$$\hat{y} = \hat{f}(x) = fl(fl(x_1) + fl(x_2)),$$
where $fl$ is a rounding function. From (1.4) it results that
$$fl(x_1) = x_1(1 + \varepsilon_1), \quad fl(x_2) = x_2(1 + \varepsilon_2), \quad |\varepsilon_1|, |\varepsilon_2| \leq \mu\beta^{-t}.$$
Using now (1.5), it can easily be proved that
$$\hat{y} = x_1(1 + \eta_1) + x_2(1 + \eta_2),$$
where $|\eta_1|, |\eta_2| \lessapprox 2\mu\beta^{-t}$ (here $1 + \eta_i = (1 + \varepsilon_i)(1 + \varepsilon)$, with $\varepsilon$ the addition error from (1.5)). Thus, given any vector $x \in \mathbb{R}^2$, there exists a nearby vector
$$\hat{x} = \begin{bmatrix} x_1(1 + \eta_1) \\ x_2(1 + \eta_2) \end{bmatrix}$$
such that
$$\hat{f}(x) = f(\hat{x}).$$
Hence, floating point addition satisfying (1.5) is a stable way of computing the sum of two numbers.

The above example is a very simple illustration of the so-called backward rounding error analysis of an algorithm. In this technique, one uses the error bounds (1.5) for floating point arithmetic to prove that the computed solution of a given problem is the exact solution of the problem with slightly perturbed input data. Such a proof guarantees that the algorithm is numerically stable and, for well conditioned problems, small relative errors in the computed data are assured. For nontrivial problems, showing that a proposed algorithm is stable may be a very difficult task. At present, there are relatively few algorithms for which a complete proof of numerical stability exists. Nevertheless, many important algorithms have a practical validity based on decades of numerical experience. As Wilkinson says: "the main object of an error analysis is not to establish precise bounds, but to expose the potential instabilities, if any, of an algorithm, and thus to find a way to improve it". In our approach, the main interest will be the presentation of the best existing algorithms that solve a certain problem, rather than proofs of their numerical stability.
Problems

P 1.1 The following example indicates that floating point addition is not associative. Let us consider $x_1 = 0.001$, $x_2 = 1$, $x_3 = -1$; the FPF is characterized by $\beta = 10$, $t = 3$. How big is the relative error when computing $y = x_1 + x_2 + x_3$?

P 1.2 Give a bound on the relative error for the floating point computation of $y = x_1 + x_2 + x_3$; suppose that $x_1$, $x_2$, $x_3$ are floating point numbers.

P 1.3 Consider the problem of computing the solution of the linear equation $ax + b = 0$, with $a, b \in \mathbb{R}$, $a \neq 0$. For which values of the input data $a$, $b$ is this problem ill conditioned? An "algorithm" to compute the solution is $x = -b/a$; is this algorithm numerically stable?

P 1.4 The same question for the equation $(a_1 + a_2)x + (b_1 + b_2) = 0$, where the input data is $a_1, a_2, b_1, b_2 \in \mathbb{R}$.

P 1.5 Give an expression for the distance between 1 and the next greater floating point number in the FPF defined by $(\beta, t, s)$.

P 1.6 Give the relative error bound for the rounding function which associates with each $x \in [-M, M]$ the nearest floating point number.
Chapter 2
Direct methods for solving linear systems

A set of $m$ linear equations
$$\begin{cases} a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n = b_1 \\ a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n = b_2 \\ \quad \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mn}x_n = b_m \end{cases} \tag{2.1}$$
where $a_{ij} \in \mathbb{R}$, $b_i \in \mathbb{R}$, for $i = 1:m$, $j = 1:n$, are given and $x_j$, $j = 1:n$, are the unknowns, is called a linear system. In matrix form, it can be written as
$$Ax = b, \tag{2.2}$$
where $A = (a_{ij})_{i=1:m,\,j=1:n} \in \mathbb{R}^{m \times n}$, $b = (b_i)_{i=1:m} \in \mathbb{R}^m$ and $x = (x_j)_{j=1:n} \in \mathbb{R}^n$.

2.1 Types of linear systems
The first main characteristic of a linear system is the relation between $m$ and $n$:
- when the number of equations is greater than the number of unknowns ($m > n$), the system (2.2) is called overdetermined;
- when, conversely, the number of unknowns is greater ($m < n$), the system is said to be underdetermined;
- in the case of a square matrix $A$ (i.e. $m = n$), the system (2.2) is determined;
- if the right hand side is a zero vector ($b = 0$), the system is called homogeneous.
Finding a solution to the linear system (2.2) means to compute a vector $x \in \mathbb{R}^n$ such that the system is satisfied. Of course, we can do this only if the system actually has (at least) a solution. If it has none, we must change the meaning of "solution" in such a manner that the system has one. When the solution is not unique, we must select, in some way, a single one and then compute it. The conditions under which the system (2.2) has a solution, and when it is unique, are well known from linear algebra, so we present them without proofs. Given a matrix $A \in \mathbb{R}^{m \times n}$, let us first introduce the following linear subspaces:
$$\mathrm{Im}\,A = \{y \in \mathbb{R}^m \mid \exists x \in \mathbb{R}^n \text{ such that } y = Ax\} \subset \mathbb{R}^m,$$
$$\mathrm{Ker}\,A = \{z \in \mathbb{R}^n \mid Az = 0 \in \mathbb{R}^m\} \subset \mathbb{R}^n.$$
The existence conditions for a solution of $Ax = b$ are given by:

Theorem 2.1 The linear system (2.2) has a solution if and only if one of the following equivalent conditions holds:
a) $\mathrm{rank}[A\ b] = \mathrm{rank}(A)$;
b) $b \in \mathrm{Im}\,A$.

Corollary 2.2 If $\mathrm{rank}(A) = m$, then the system (2.2) always has a solution.

Example 2.3 If $A = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 0 & 1 \end{bmatrix}$, then $\mathrm{Im}\,A$ is the plane $y_2 = y_1 + y_3$. If $b = [b_1\ b_2\ b_3]^T$ has $b_2 = b_1 + b_3$, then the system $Ax = b$ has a solution, since both conditions in Theorem 2.1 hold. Otherwise, it has none (see Figure 2.1).

When solutions exist, the uniqueness is derived from:

Theorem 2.4 If $x \in \mathbb{R}^n$ is a solution to the linear system (2.2), then the set of all solutions of (2.2) is the affine subspace
$$x + \mathrm{Ker}\,A = \{y = x + z \mid z \in \mathrm{Ker}\,A\}.$$
The solution $x$ is unique if and only if $\mathrm{Ker}\,A = \{0\}$, i.e. the matrix $A$ is monic ($A$ has linearly independent columns).
[Figure 2.1: $\mathrm{Im}\,A$ and the possible positions of $b$ (inside or outside $\mathrm{Im}\,A$) for Example 2.3.]

Example 2.5 If $A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$ and $b = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$, then $x = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ is a solution to $Ax = b$. On the other hand, $\mathrm{Ker}\,A$ is the line described by
$$\begin{cases} x_1 + x_2 = 0 \\ x_2 + x_3 = 0 \end{cases}$$
which passes through the points $O(0,0,0)$ and $P(1,-1,1)$. So all the vectors from $\mathrm{Ker}\,A$ have the form
$$z = \alpha \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}, \quad \alpha \in \mathbb{R},$$
and hence (see Figure 2.2)
$$x + \mathrm{Ker}\,A = \left\{\, y = \begin{bmatrix} 1 + \alpha \\ 1 - \alpha \\ \alpha \end{bmatrix} \;\middle|\; \alpha \in \mathbb{R} \,\right\}.$$
When the system (2.2) has more than one solution, a possible way to sele t a single one is to ask it to be of minimum "length", where the length is de ned by a suitable ve tor norm. When the system (2.2) has no solution it is onvenient to de ne a new kind of (pseudo)solution, like the ve tor x whi h minimizes a suitable ve tor norm of residual r = b Ax. It is obvious that if b 2 ImA, then su h a pseudosolution be omes a true solution of the system.
Remark 2.6
20
CHAPTER 2.
DIRECT METHODS FOR SOLVING LINEAR SYSTEMS
6x3 P
O x 9 R x2 y
KerA
:
x-1
x + KerA
Figure 2.2: Solutions (x { parti ular solution, y { general solution) for the system in example 2.5. Theorem 2.7 When the matrix A is square (A 2 Rnn ), the system (2.2) admits a unique solution for all b 2 Rn if and only if A is nonsingular, i.e.
det(A) 6= 0: If A is nonsingular, there exists a unique matrix A 1 2 Rnn , alled the inverse of A, su h that A 1 A = AA 1 = In and the unique solution of system (2.2) an be written as
x = A 1 b:
(2.3)
We must stress that (2.3) is not a good formula for the omputation of the solution of the system Ax = b and so are other known methods, like the Cramer's rule, et . In the sequel, we will see how to solve determined nonsingular systems in a reliable and eÆ ient manner. There are two main ategories of methods for nding a reliable solution to nonsingular linear systems:
Dire t methods, based on the redu tion, by a nite sequen e of transformations, to one or two triangular systems, whi h are solved by a simple substitution. To this ategory belong methods like the Gaussian elimination and the LU fa torizations, whi h are re ommended for systems of medium size (say, n < 200, but this limit depends on the omputer power).
2.2.
21
TRIANGULAR SYSTEMS
0 0 L
U
Figure 2.3: Lower and upper triangular matri es.
Iterative methods, based on the re ursive building of a ve tor sequen e onvergent to the system solution; these methods are re ommended for huge n, or when A has some spe ial stru ture.
This hapter is devoted to a presentation of dire t methods. 2.2 Triangular systems De nition 2.8 A matrix L 2 Rnn , whi h has lij = 0 for all i < j is alled lower triangular. A matrix U 2 Rnn , with uij = 0 for all i > j is alled upper triangular. The system Ax = b is alled lower (upper) triangular if the matrix A is lower (upper)
triangular. See gure 2.3.
The following result is obvious: A triangular matrix is nonsingular if and only if all its diagonal elements are nonzero.
Proposition 2.9
The algorithms for triangular system solving are very simple, be ause the unknowns an be omputed, in a de nite order, by numeri al substitution. Consider, rst, a lower triangular system: Lx = b; where L 2 Rnn, with lij = 0 for i < j , lii 6= 0, i = 1 : n, and b 2 Rn . The rst equation is l11 x1 = b1 ; from whi h x1 = b1 =l11 : (2.4)
22
CHAPTER 2.
DIRECT METHODS FOR SOLVING LINEAR SYSTEMS
In general, if we know x1, x2, .. ., xi 1, we an solve the i-th equation: i 1 X k=1
to obtain
lik xk + lii xi = bi
xi = bi
i 1 X k=1
!
lik xk =lii :
(2.5)
Formulas (2.4) and (2.5) de ne an algorithm for the omputation of the solution of Lx = b, known as the method of forward substitution. (Given a nonsingular lower triangular n n real matrix L and a ve tor b 2 Rn , this algorithm omputes the solution x of the system Lx = b.) 1. for i = 1 : n 1. s bi 2. if i > 1 then 1. for k = 1 : i 1 1. s s lik xk 3. xi s=lii Algorithm 2.10
(LTRIS)
This algorithm requires NLT RIS = Pni=1 (2(i 1) + 1) = n(n 1) + n = n2
oating point operations ( ops) and MLT RIS = n(n +1)=2+2n +1 n2=2 memory
oat lo ations. If the ve tor b is no further needed, the solution x an be stored in the same memory spa e as b. In order to solve an upper triangular nonsingular linear system: Ux = b; where U is an upper triangular matrix of order n, with uii 6= 0, i = 1 : n, and b 2 Rn , let observe that the last equation is unn xn = bn ; from whi h xn = bn =unn : (2.6) Moreover, if xn, xn 1, . .. , xi+1, are omputed, then, from the i-th equation: uii xi +
n X k=i+1
uik xk = bi
2.3.
23
GAUSSIAN ELIMINATION
we obtain
0
xi = bi
n X k=i+1
1
(2.7)
uik xk A =uii :
From formulas (2.6) and (2.7), whi h de ne the so alled ba k substitution, we obtain the following: (UTRIS) (Given a nonsingular upper triangular n n real matrix U and a ve tor b 2 Rn , this algorithm omputes the solution x of the system Ux = b.) 1. for i = n : 1 : 1 1. s bi 2. if i < n then 1. for k = i + 1 : n 1. s s uik xk 3. xi s=uii As in the lower triangular ase, the omputational eort is given by NUT RIS = n2
ops and the memory required is MUT RIS n2=2 (s alar) oat lo ations. Algorithm 2.11
2.3 Gaussian elimination The Gaussian elimination is a tehnique for system redu tion to triangular form. To des ribe it in terms of matrix operations, let us introdu e the notion of elementary triangular matrix. De nition 2.12 An elementary lower triangular matrix of order n and index s is a matrix of the form: Ms = In ms eTs ; (2.8) where ms = [0 0 : : : 0 s+1;s : : : ns ℄T (2.9) es = [0 0 : : : 1 0 : : : 0 ℄T : The stru ture of an elementary lower triangular (ELT) matrix of order n and index s is: 2 3 1 0 ::: 0 ::: 0 6 0 : : : 0 777 6 0 1 ::: 6 Ms =
6 6 6 6 6 6 6 4
::: ::: ::: ::: :::
0 0 1 0 0 s+1;s ::: 0 0 ns The main properties of su h a matrix are given by:
::: ::: ::: ::: :::
0 0 0 1
7 7 7: 7 7 7 7 5
24
CHAPTER 2.
DIRECT METHODS FOR SOLVING LINEAR SYSTEMS
a) An ELT matrix is nonsingular and Ms 1 = In + mseTs : (2.10) b) Let x 2 Rn be a ve tor with xs = eTs x 6= 0. Then, there exists an ELT matrix su h that for i = 1 : s (Msx)i = 0x;i; for (2.11) i = s + 1 : n. If xs = 0, then for all Ms: Ms x = x: (2.12) Proof. a) Obviously, det(Ms ) = 1, i.e. Ms is nonsingular. Moreover, Ms (In + ms eTs ) = (In mseTs )(In + ms eTs ) = In ms eTs + ms eTs ms eTs ms eTs = In ms(eTs ms)eTs = In be ause, from (2.9), eTs ms = 0. Hen e, (2.10) is true. b) Using (2.8), we obtain: (Ms x)i = ((In mseTs )x)i = (x mseTs x)i = xi isxs (2.13) and, hen e, from (2.9), (Msx)i = xi, for i = 1 : s. Now hoosing is = xi =xs ; for i = s + 1 : n (2.14) from (2.13) it results that (Msx)i = 0, for i = s + 1 : n. If xs = 0, then (2.13) leads immediately to (2.12). Note that, if xs 6= 0, then the ELT matrix de ned by (2.14) is the unique ELT matrix whi h repla e the last n s omponents of x by zeros. 2 The properties stated by (2.11) and (2.12) are ru ial in a matrix redu tion to triangular form. To obtain this goal, in the method of Gaussian elimination, the given matrix is premultiplied by a sequen e of ELT matri es, ea h hosen to introdu e a olumn with zeros below the diagonal. To show, formally, how to do this, we need a de nition and two propositions. De nition 2.14 If A 2 Rnn , then the leading prin ipal submatrix of order s of matrix A is the matrix: A[s℄ = (aij )i=1:s;j =1:s = A(1 : s; 1 : s) i.e. is the s s submatrix from the left upper orner of A. Proposition 2.15 The produ t of two (unit, i.e. with all diagonal elements equal to 1) lower (upper) triangular matri es is a (unit) lower (upper) triangular matrix and so is a produ t of any nite sequen e of su h matri es. Proposition 2.13
2.3.
25
GAUSSIAN ELIMINATION
The proof is left as exer ise. Let A 2 Rnn and L 2 Rnn a lower triangular matrix. Then, (LA)[s℄ = L[s℄A[s℄: If Li 2 Rnn, i = 1 : k, are all lower triangular matri es, then (L1 L2 : : : Lk A)[s℄ = L[1s℄L[2s℄ : : : Lk[s℄A[s℄: Proposition 2.16
The proof is left as exer ise. The following theorem and its proof des ribe the Gaussian elimination method for redu ing a matrix to upper triangular form. Theorem 2.17 If the matrix A 2 Rnn has all the leading prin ipal submatri es A[s℄ nonsingular, for s = 1 : n 1, then there exists a unit lower triangular matrix M 2 Rnn su h that the matrix MA = U is upper triangular. If A is nonsingular, then U is nonsingular.
The proof is onstru tive, i.e. it will des ribe a pro edure for an ee tive redu tion of matrix A to the triangular form U . The pro edure onsists of n 1 steps. [1℄ Step 1. Let A1 = A and a1 = Ae1 its rst olumn. By hypothesis, A[1℄ 1 =A = a11 6= 0. From proposition 2.13, there exists an ELT M1 su h that i=1 (M1a1 )i = a0;11 ; for for i = 2 : n. i.e. the matrix A2 = M1A1 has all subdiagonal elements of its rst olumn equal to zero: 2 (2) (2) (2) 3 Proof.
A2 =
Step s.
6 6 6 6 6 6 6 4
a11 a12 : : : 0 a(2) 22 : : : 0 a(2) 32 : : : ::: (2) 0 an2 : : :
a1n a(2) 2n a(2) 3n a(2) nn
Let suppose that the matrix As = Ms 1 : : : M2 M1 A = Ms
7 7 7 7 7: 7 7 5
1 As 1
26
CHAPTER 2.
DIRECT METHODS FOR SOLVING LINEAR SYSTEMS
has all subdiagonal elements, in the rst s 1 olumns, equal to zero. Let us onsider a olumn partition of the matrix As: As = [ a(1s) a(2s) : : : a(ss) : : : a(ns) ℄: Applying the proposition 2.16, we have A[ss℄ = Ms[s℄ 1 : : : M1[s℄A[s℄: Sin e the matri es Mi[s℄ are all unit lower triangular, then detMi[s℄ = 1, i = 1 : s 1. Hen e: s Y [ s ℄ detAs = a(iis) = detA[s℄ 6= 0 a(s) ,
i=1
and the element ss alled pivot, is nonzero. Now, we an use proposition 2.13 to
laim the existen e of the ELT matrix Ms de ned by: is = a(iss) =a(sss) ; i = s + 1 : n (2.15) so that (Msas(s) )i = 0, for i = s +1 : n. Moreover, the premultipli ation of As by Ms do not alter the zeros introdu ed in previous steps, be ause a(sjs) = 0, for j = 1 : s 1. Thus, the zeroing pro edure started at step 1 an be ontinued up to step n 1 (in luded), to make the matrix: U = An = Mn 1 Mn 2 : : : M1 A upper triangular. Obviously, from proposition 2.15, the matrix M = Mn 1 Mn 2 : : : M1 is unit lower triangular and, if A is nonsingular, U is also nonsingular, as a produ t of two nonsingular matri es. The proof is omplete. 2 The proof above gives a pro edure for the triangularization of a matrix, known as the method of Gaussian elimination. This pro edure is based on the following
omputation s heme: 1. for s = 1 : n 1 1. Compute the ELT matrix Ms (i.e. (MsA)i = 0, for i = s + 1 : n 2. A MsA
is , i
= s + 1 : n), so that
The Gaussian multipliers is are omputed with (2.15), and we an store them in the pla e of zeroed elements. By the mean of instru tion 1.2 of the s heme above,
2.3.
2
27
GAUSSIAN ELIMINATION
u11 21
u12 : : : u1s 6 u22 : : : u2s 6 6 ::: 6 6 6 s1 : : : uss s2 6 6 6 s+1;1 s+1;2 : : : s+1;s 6 ::: 4 n1 n2 : : : ns
u1;s+1 : : : u1n 3 u2;s+1 : : : u2n 77 7 ::: 7 us;s+1 : : : usn 777 +1) : : : a(s+1) 7 a(ss+1 ;s+1 s+1;n 7 7 ::: 5 ( s+1) ( s+1) an;s+1 : : : ann
After step s
2
u11 u12 : : : u1s 6 6 21 u22 : : : u2s 6 6 ::: 6 6 s1 s2 : : : uss 6 6 ::: 6 6 ::: 4 n1 n2 : : : ns
3
: : : u1n : : : u2n 777 7 ::: 7 : : : usn 77 7 ::: 7 7 ::: 5 : : : unn
Final ontents
Figure 2.4: The ontents of matrix A after step s of the Gaussian elimination and after the ompletion of the algorithm. all intermediate and nal results are stored in the memory spa e o
upied by the matrix A; see gure 2.4. The matrix multipli ation A MsA an be eÆ iently a
omplished, taking into a
ount that only the submatrix A(s + 1 : n; s : n) is altered. The olumn s is modi ed by zeroing the subdiagonal elements. For the rest of the submatrix, if aj = Aej , then (Msaj )i = ((In mseTs )aj )i = (aj msasj )i = aij isasj ; j = s+1 : n; i = s+1 : n: (2.16) Relations (2.15) and (2.16), and the omputation s heme presented above de ne the algorithm G of Gaussian elimination des ribed (below. The algorithm an be s)
arried out to ompletion only if all the numbers ass , s = 1 : n 1, alled pivot elements, are nonzero, i.e. all the leading prin ipal submatri es A[s℄, s = 1 : n 1, are nonsingular. (G) (Given a n n real matrix A with all leading prin ipal submatri es A[s℄, s = 1 : n 1, nonsingular, this algorithm overwrites the upper triangle of A, in luding the diagonal, with the upper triangular matrix U = Mn 1Mn 2 : : : M1A. The lower triangle of A is overwritten by the Gaussian multipliers is de ning the ELT matri es Ms , s = 1 : n 1.) 1. for s = 1 : n 1 1. for i = s + 1 : n 1. ais is = ais=ass 2. for i = s + 1 : n Algorithm 2.18
28
CHAPTER 2.
DIRECT METHODS FOR SOLVING LINEAR SYSTEMS
1. for j = s + 1 : n 1. aij aij isasj The required number of ops is: NG =
nX1 s=1
(n
s + 2(n s)2 ) =
n(n
1) + 2 n(n 1)(2n 1) 2n3 ; 2 6 3
while the size of ne essary memory is MG = n2 oat lo ations. The algorithm G an be used to solve the linear system Ax = b when the matrix A is nonsingular and all its leading prin ipal submatri es are also nonsingular. Indeed, the system Ax = b has the same solution as the upper triangular system: Ux = Mn 1 Mn 2 : : : M1 b; where the matrix U is given by algorithm G. This triangular system an be solved by algorithm UT RIS . However, the nonsingularity of the leading prin ipal submatri es of A is not a ne essary ondition for the existen e and uniqueness of a system solution. Thus, we have to modify the algorithm G in order to handle the situations when some leading prin ipal submatri es are singular, but A is still nonsingular. 2.4 Pivoting strategies
The singularity of a leading prin ipal submatrix A[s℄ implies that, at step s of Gaussian elimination, the pivot a(sss) is equal to zero; thus, there exists no Ms to zero, by premultipli ation, the subdiagonal elements in olumn s of As. The modi ation of algorithm G onsists of a row (or/and olumn) inter hange, whi h brings in the pivot position a nonzero element. Another reason for the row and olumn permutations is the numeri al improvement of the omputational pro ess. An intuitive explanation is that when a mathemati al pro ess fails to be de ned for a parti ular value of a parameter, there is a good han e that the orresponding numeri al pro ess will be unstable when the parameter is near the oending value. In our ase, we must avoid small pivot elements in order to improve the numeri al stability of the algorithm. Let us introdu e now the notion of permutation matrix, whi h allow the des ription of row and olumn inter hanges in terms of matrix operations. De nition 2.19
A matrix Pij 2 Rnn obtained from the n n unity matrix by
2.4.
29
PIVOTING STRATEGIES
inter hanging the i and j olumns (or rows), i.e. a matrix of the form (here i < j ): 2
Pij = [e1 e2 : : : ei 1 ej ei+1 : : : ej 1 ei ej +1 : : : en ℄ =
6 6 6 6 6 4
Ii 1
3
0 1
Ij
i
1
1 0
In
7 7 7 7 7 5
j
where all the missing elements are zero, is alled elementary permutation matrix (EPM). A produ t of elementary permutation matri es is a permutation matrix. The properties of elementary permutation matri es are stated in the folowing proposition, whi h proof is obvious. Proposition 2.20 a) An EPM is nonsingular and Pij 1 = Pij . b) The postmultipli ation of a matrix A by the EPM Pij inter hanges the olumn i with olumn j , i.e.: 8 < Aek ; for k 6= i; j (APij )ek = : Aej ; for k = i Aei ; for k = j .
) The premultipli ation of a matrix A by the EPM Pij inter hanges the row i with row j , i.e.: 8 T > < ek A; for k 6= i; j T ek (Pij A) = > eTj A; for k = i : T ei A; for k = j . 2.4.1
Partial pivoting
We are able now to present a rst pivoting strategy. Let modify the algorithm G as follows. If at the s-th step of the redu tion to triangular form (see the proof of theorem 2.17), the pivot element a(sss) is unsatisfa torily small, we may hoose some other element, say a(iss) 6= 0, the greatest in module on olumn s. The inter hanging of rows s and is moves a(iss) into the (s; s) position so that it be omes the new pivot. In order to avoid the alteration of the matrix stru ture obtained up to step s, we must hoose is s. See gure 2.5. For simpli ity, we shall denote the EPM Pi s by Ps . Thus, the s-th step of algorithm G is modi ed into: 1. Determine is so that jai sj = maxi=s:n jaisj. 2. Inter hange the is-th and s-th rows, i.e. A PsA. s
s
s
s
30
CHAPTER 2.
DIRECT METHODS FOR SOLVING LINEAR SYSTEMS
2
u11 : : : u1s : : : 6 ... : : : 6 6 0 6 6 a(sss) : : : 6 6 As = 6 0 ::: 6 ( s ) 6 ai s : : : 6 6 0 ::: 4 ( s) ans : : : s
u1n
3
7 7 7 ( s) 7 asn 77 7 7 7 a(issn) 77 7 5
s) a(nn
3
2
u11 : : : u1s : : : u1n 6 . . . : : : 777 6 6 0 7 6 6 a(iss) : : : a(isn) 77 6 7 Psi As = 66 0 ::: 7 6 (s) : : : a(s) 7 6 a ss sn 7 7 6 7 6 0 ::: 5 4 ( s) ( s) ans : : : ann s
s
s
Figure 2.5: Step s of Gaussian elimination, with row inter hange; a(sss) is too small; ( s) in Psi , the new pivot is ai s and have a "good" (big enough) value. s
s
3. Determine the orresponding ELT matrix Ms. 4. Compute A MsA. whi h provides a new matrix given by: A As+1 = Ms Ps As : The overall pro ess, known as Gaussian elimination with row partial pivoting, leads nally to the upper triangular form: U = An = Mn 1 Pn 1 Mn 2 Pn 2 : : : M1 P1 A: (2.17) The orresponding algorithm is given below. (GPP { Gaussian elimination with partial pivotmatrix A 2 Rnn , this algorithm overwrites the upper
Algorithm 2.21
(Given a triangle of A, in luding the diagonal, with the upper triangular matrix U from (2.17). The lower triangle of A is overwritten by the Gaussian multipliers is de ning the ELT matri es Ms, s = 1 : n 1. The integers is, de ning the EPM-s Ps, are stored in a n 1 ve tor p, so that p(s) = is , for s = 1 : n 1.) 1. for s = 1 : n 1 1. Find the rst is 2 s : n su h that jai sj = maxi=s:n jaisj. 2. p(s) is 3. for j = s : n finter hange rows s and isg 1. asj $ ai j ing)
s
s
2.4.
31
PIVOTING STRATEGIES
4. for i = s + 1 : n 1. ais is = ais=ass 5. for i = s + 1 : n 1. for j = s + 1 : n 1. aij aij isasj The extra operations required by algorithm GP P , ompared P tonalgorithm G, ap1 pear only in the sear h of the pivot; the number of omparisons is s=1 (n s + 1) n2 =2 2n3 =3, so that the partial pivoting does not introdu e a signi ant overhead. The memory requirements is MGP P = n2. An important result related to algorithm GP P is given by: Theorem 2.22 If A 2 Rnn is nonsingular, then algorithm GP P an be arried out to ompletion and the resulting upper triangular matrix U is nonsingular. Proof. Algorithm GP P annot be arried out if, at some step s, the pivot element ai s = 0. In su h a ase, sin e the pivot has maximal module, all the other elements ais , for i = s : n are zero and, hen e, the matrix As is singular (see the stru ture of matrix As). The matri es Mk and Pk being nonsingular, the singularity of s
As = Ms 1 Ps 1 : : : M1 P1 A
implies that A is singular; ontradi tion. Hen e, all ai s 6= 0 and the algorithm
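To summarize the partial pivoting strategy in executable form, the following Python sketch (an added illustration, not the book's algorithm statement; names are ours, dense list-of-rows storage, no error handling) performs Gaussian elimination with row partial pivoting, storing the multipliers in the zeroed positions as in GPP, and then solves $Ax = b$ by applying the recorded transformations to $b$ followed by back substitution.

```python
def gpp(A):
    """Gaussian elimination with partial pivoting on a square matrix A
    (list of rows, modified in place). Returns the pivot index vector p;
    afterwards the upper triangle of A holds U and the strict lower
    triangle holds the multipliers mu_{is}."""
    n = len(A)
    p = []
    for s in range(n - 1):
        # pivot row: first i >= s with largest |a_{is}|
        i_s = max(range(s, n), key=lambda i: abs(A[i][s]))
        p.append(i_s)
        A[s], A[i_s] = A[i_s], A[s]             # interchange rows s and i_s
        for i in range(s + 1, n):
            mu = A[i][s] / A[s][s]
            A[i][s] = mu                        # store the multiplier
            for j in range(s + 1, n):
                A[i][j] -= mu * A[s][j]
    return p

def solve_gpp(A, b):
    """Solve A x = b using the factorization computed by gpp (copies inputs)."""
    A = [row[:] for row in A]
    b = b[:]
    p = gpp(A)
    n = len(b)
    for s in range(n - 1):                      # apply M_s P_s to b
        b[s], b[p[s]] = b[p[s]], b[s]
        for i in range(s + 1, n):
            b[i] -= A[i][s] * b[s]
    x = [0.0] * n                               # back substitution (UTRIS)
    for i in range(n - 1, -1, -1):
        acc = b[i]
        for k in range(i + 1, n):
            acc -= A[i][k] * x[k]
        x[i] = acc / A[i][i]
    return x

# Example: solve_gpp([[2.0, 1.0], [4.0, 1.0]], [3.0, 5.0]) -> [1.0, 1.0]
```

The two-function split mirrors the text: the factorization is done once, and the transformed right hand side plus a triangular solve give the solution, which is also how repeated right hand sides would be handled.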
an be arried out to ompletion. The nonsingularity of matrix U derives from the nonsingularity of all matri es in the right hand side of (2.17). 2 s
2.4.2
Complete pivoting
Better numeri al properties an be a hieved if we look for the pivot among aij , with i = s : n, j = s : n, i.e in the whole right lower orner of matrix As . The greatest element in module, say ai j , is moved into the pivot position (s; s), by the mean of a row and a olumn inter hange; see gure 2.6. Denoting the EPM Psi by Ps, and Qsj by Qs, the s-th step of algorithm G be omes: 1. Find is and js so that jai j j = maxi=s:n;j=s:n jaij j. 2. Inter hange the is-th and s-th rows, i.e. A PsA. 3. Inter hange the js-th and s-th olumns, i.e. A AQs. 4. Determine the orresponding ELT matrix Ms. s s
s
s
s s
32
CHAPTER 2.
2
DIRECT METHODS FOR SOLVING LINEAR SYSTEMS
u11 : : : u1s : : : 6 6 0 ... ::: 6 6 ( s) 6 ass : : : 6 As = 66 0 ::: 6 ( s ) 6 ai s : : : 6 6 0 ::: 4 ( s) ans : : : s
u1j : : : ::: ( s) asj : : : s
s
a(isj) : : : s s
u1n
7 7 7 ( s) 7 asn 77 7 7 7 a(issn) 77 7 5
s) a(njs) : : : a(nn s
3
2
u11 : : : u1j : : : u1s 6 6 0 ... ::: 6 6 ( s) 6 ai j : : : a(iss) 6 = 66 0 ::: 6 ( s ) 6 asj : : : a(sss) 6 6 0 ::: 4 ( s) anj : : : a(nss) s
Psi As Qsj s
s
s s
s
3
: : : u1n 7 7 ::: 7 ( s) 7 : : : ai n 77 s
7 7 7 a(sns) 77 7 5
s
:::
s
s) : : : a(nn
Figure 2.6: Step s of Gaussian elimination, with row and olumn inter hange. 5. Compute A MsA. Thus, the step s provides the following matrix transformations: A As+1 = Ms Ps As Qs : and the overall omputational pro ess leads to the upper triangular matrix: A U = Mn 1 Pn 1 : : : M1 P1 AQ1 : : : Qn 1 : (2.18) The orresponding algorithm, known as the method of Gaussian elimination with
omplete pivoting, is given below: (GCP { Gaussian elimination with omplete pivmatrix A 2 Rnn, this algorithm overwrites the upper
Algorithm 2.23
(Given a triangle of A, in luding the diagonal, with the upper triangular matrix U from (2.18). The lower triangle of A is overwritten by the Gaussian multipliers is de ning the ELT matri es Ms, s = 1 : n 1. The integers is and js , de ning the EPM-s Ps and Qs , are stored in two n 1 ve tors p and q, so that p(s) = is and q(s) = js , for s = 1 : n 1.) 1. for s = 1 : n 1 1. Find is 2 s : n and js 2 s : n su h that jai j j = maxi=s:n;j=s:n jaij j. 2. p(s) is 3. q(s) js 4. for j = s : n finter hange rows s and isg 1. asj $ ai j 5. for i = 1 : n finter hange olumns s and jsg 1. ais $ aij oting)
s s
s
s
2.5.
SOLVING DETERMINED LINEAR SYSTEMS
33
6. for i = s + 1 : n 1. ais is = ais=ass 7. for i = s + 1 : n 1. for j = s + 1 : n 1. aij aij isasj The extra operations required by algorithm GCP , ompared to algorithm G, are again only in the sear h of the pivot; the number of omparisons is now nX1 s=1
(n
n
X s + 1)2 = s2 n3 =3; s=2
unlike the partial pivoting, the omplete pivoting introdu es a number of omparison whi h is omparable to the number of other operations; this overhead is signi ant, so that the omplete pivoting must be used only when a great a
ura y of the results is required. The memory requirements is MGCP = n2. The orrespondent of theorem 2.22, with a similar proof, is: Theorem 2.24 If A 2 Rnn is nonsingular, then algorithm GCP an be arried out to ompletion and the resulting upper triangular matrix U from (2.18) is nonsingular. Remark 2.25 If we want to make the GPP and GCP algorithms to triangularize singular matri es, there are some slight modi ations to be done. In GPP, the following test must be added: if ai s = 0, then all ais, i = s : n, are zero and, for su h s, the instru tions 1.2 { 1.5 must not be exe uted. In GCP, if ai j = 0, then all aij , i = s : n, j = s : n, are zero and the algorithm terminates at step s be ause the matrix As is already upper triangular. s
s s
2.5 Solving determined linear systems Theorems 2.22 and 2.24 show that the GP P and GCP algorithms provide a good base to solve the nonsingular linear system Ax = b. Although, from numeri al point of view, the GCP algorithm is better, GP P is good enough for pra ti al uses, so that it is widely used. Let us onsider the following informal syntax for the use of the GP P and GCP algorithms: [M; U; p℄ = GP P (A) [M; U; p; q℄ = GCP (A) where M is the set of Gaussian multipliers is, s = 1 : n 1, i = s + 1 : n, U is the resulting upper triangular matrix and p and q are integer ve tors de ning the row
34
CHAPTER 2.
DIRECT METHODS FOR SOLVING LINEAR SYSTEMS
and olumn permutations. Although the matri es M and U overwrite the matrix A, we will use the syntax above for larity. Let now onsider the linear system Ax = b, with nonsingular A. Be ause all the ELT matri es Ms and EPM-s Ps are nonsingular, the system Ax = b is equivalent (i.e. has the same solution) to Mn 1 Pn 1 : : : M1 P1 Ax = Mn 1 Pn 1 : : : M1 P1 b; (2.19) where Ms, Ps are those from 2.17. Hen e, (2.19) des ribes an upper triangular system whi h an be solved by the UT RIS algorithm. We onsider the following informal syntax for the use of this algorithm: x = UT RIS (U; b): Thus, to solve Ax = b, we must ompute the ve tor: b d = Mn 1 Pn 1 : : : M1 P1 b; using the following s heme: 1. for s = 1 : n 1 1. b Psb 2. b Msb and the GP P and UT RIS algorithms. The orresponding algorithm is given below: Algorithm 2.26
(2.20)
(LSS GPP { linear system solver using GPP)
(Given a nonsingular matrix A 2 Rnn and a ve tor b 2 Rn, this algorithm omputes the solution x 2 Rn of the linear system Ax = b, using Gaussian elimination with partial pivoting.) 1. [M; U; p℄ = GP P (A) 2. for s = 1 : n 1 1. bs $ bp(s) 2. for i = s + 1 : n 1. bi bi isbs 3. x = UT RIS (U; b) The omputational eort is given by: NLSS
GP P = NGP P +
nX1 s=1
2(n
s) + NUT RIS
2n3 + n2 + n2 2n3 NGP P ; 3 3
2.5.
SOLVING DETERMINED LINEAR SYSTEMS
35
i.e. the main amount of work is in the triangularization of A. Obviously, the memory needed is MLSS GP P n2. GCP an be used in a similar way; it is easy to see that the system Ax = b is equivalent to Mn 1 Pn 1 : : : M1 P1 AQ1 : : : Qn 1 Qn 1 : : : Q1 x = Mn 1 Pn 1 : : : M1 P1 b: (2.21) Denoting Qn 1 : : : Q1 x = y; (2.22) it results from (2.18) that y an be omputed as the solution of the upper triangular system Uy = d; (2.23) 1 where d is the ve tor given by (2.20). From (2.22), and knowing that Qs = Qs, we obtain: x = Q1 Q2 : : : Qn 1 y (2.24) whi h an be omputed by the s heme: 1. x y 1. for s = n 1 : 1 : 1 1. x Qsx From relations (2.20){(2.24), it results the following (LSS GCP { linear system solver using GCP) (Given a nonsingular matrix A 2 Rnn and a ve tor b 2 Rn, this algorithm omputes the solution x 2 Rn of the linear system Ax = b, Algorithm 2.27
using Gaussian elimination with omplete pivoting.) 1. [M; U; p; q℄ = GCP (A) 2. for s = 1 : n 1 1. bs $ bp(s) 2. for i = s + 1 : n 1. bi bi isbs 3. x = UT RIS (U; b) 4. for s = n 1 : 1 : 1 1. xs $ xq(s) The omplexity of this algorithm is asimptoti ally the same as that of GCP . Remark 2.28 The Gaussian elimination, presented here in a systemati al manner, is the well known manual omputation method of redu tion and substitution. The row pivoting orresponds to a reordering of equations in (2.1) and the olumn pivoting orresponds to a reordering of the omponents of x. The numeri al stability of these algorithms will be dis ussed later.
36
CHAPTER 2.
DIRECT METHODS FOR SOLVING LINEAR SYSTEMS
2.6 Matrix inversion
The Gaussian elimination with pivoting is an eÆ ient way to ompute the inverse of a nonsingular square matrix. To use it to this goal, we need to know rst how to inverse a triangular nonsingular matrix. 2.6.1
Triangular matrix inversion
Let L 2 Rnn be a lower triangular matrix. Assume that L is nonsingular, i.e. lii 6= 0, i = 1 : n. In these onditions, it is well known that there exists a unique matrix X 2 Rnn su h that: LX = XL = In ;
alled the inverse of L. In order to ompute it, let onsider the olumn partition of the matrix equation LX = In: Lxj = ej ; j = 1 : n; (2.25) where xj = Xej is the j -th olumn of matrix X . The lower triangular system (2.25) may be solved in an eÆ ient manner taking into a
ount the parti ular form of the right hand term. Indeed, the system (2.25) an be partitioned as follows: # # " #" " x0j 0 L(11j ) 0 ; (2.26) = L(21j ) L(22j )
x00j
e00j
where L(11j) is the (j 1) (j 1) upper left submatrix of L and e00j = [1 0 : : : 0℄T 2 Rn j +1 : From (2.26) it results that: ( (j ) L11 x0j = 0 (2.27) L(21j ) x0j + L(22j ) x00j = e00j : Sin e L is nonsingular and, hen e, so are L(11j) and L(22j), and the relation (2.27) be omes: ( x0j = 0 (2.28) L(22j ) x00j = e00j : This relation states an important fa t, given by: Proposition 2.29 The inverse of a nonsingular lower triangular matrix is lower triangular.
2.6.
37
MATRIX INVERSION
An algorithm whi h omputes the inverse of a nonsingular lower triangular matrix may be derived from (2.28) and has the following s heme: 1. for j = 1 : n 1. if j > 1 then x0j = 0 2. x00j = LT RIS (L(22j); e00j ) It an be seen that the omputed inverse may overwrite the given matrix. Writing expli itly the lower triangular system solving, we obtain: (LINV) (Given a nonsingular lower triangular matrix L 2 Rnn, this algorithm overwrites L by its inverse.) 1. for j = 1 : n 1. ljj xjj = 1=ljj 2. if j = n then stop 3. for i = j + 1 : n P i l x 1. lij xij = k=j +1 ik kj =lii Algorithm 2.30
The number of ops required is (exer ise: prove this result): NLINV
= n+
nX1 X n j =1 i=j +1
2(i
j)
n3
3:
The ne essary memory is MLINV n2=2 oat lo ations. In order to ompute the inverse of an upper triangular nonsingular matrix U 2 Rnn , we must solve the matrix equation: UX = In ; or, equivalently, the n linear equations: Uxj = ej ; j = 1 : n; (2.29) where xj is the j -th olumn of matrix X , the inverse of U . Using the partition: "
(j ) U (j ) U11 12 0 U22(j)
#"
x0j x00j
#
"
#
e0j ;
= 0
(2.30)
where U11(j) is the j j upper left submatrix of U and e0j = [0 0 : : : 0 1℄T 2 Rj , the system (2.29) be omes: ( (j ) (j ) x00 = e0 U11 x0j + U12 j j (2.31) (j ) x00 = 0 U22 j
38
CHAPTER 2.
DIRECT METHODS FOR SOLVING LINEAR SYSTEMS
Since U_11^(j) and U_22^(j) are nonsingular, the relations (2.31) become

U_11^(j) x'_j = e'_j,
x''_j = 0,   (2.32)

and have the following significance:

Proposition 2.31 The inverse of a nonsingular upper triangular matrix is upper triangular.

The computational scheme resulting from (2.32) is:

1. for j = 1 : n
   1. x'_j = UTRIS(U_11^(j), e'_j)
   2. if j < n then x''_j = 0

If the systems (2.32) are solved in reverse order (i.e. for j = n, n−1, ..., 1), it can easily be seen that the components of the inverse matrix may overwrite those of the original matrix. Hence we have the following:

Algorithm 2.32 (UINV) (Given a nonsingular upper triangular matrix U ∈ R^{n×n}, this algorithm overwrites U by its inverse.)
1. for j = n : −1 : 1
   1. u_jj ← x_jj = 1 / u_jj
   2. if j = 1 then stop
   3. for i = j−1 : −1 : 1
      1. u_ij ← x_ij = −(Σ_{k=i+1}^{j} u_ik x_kj) / u_ii
As for the lower triangular case, N_UINV ≈ n³/3 and M_UINV ≈ n²/2.

2.6.2 Matrix inversion using GPP and GCP
If A is a nonsingular n×n real matrix, then Gaussian elimination with partial pivoting carries through to completion and provides a nonsingular upper triangular matrix U:

M_{n−1} P_{n−1} ... M_1 P_1 A = U,   (2.33)

where the M_s are (nonsingular) elementary lower triangular matrices and the P_s are elementary permutation matrices. From (2.33):

X ≡ A^{−1} = U^{−1} M_{n−1} P_{n−1} ... M_1 P_1

and, hence, the inverse of a matrix A can be computed as follows
Figure 2.7 (data storage in matrix inversion) illustrates the situation: A is overwritten first by U and the multipliers M, and then by U^{−1} and M.

1. [U, M, p] = GPP(A)
2. X = UINV(U)
3. for s = n−1 : −1 : 1
   1. X ← X M_s
   2. X ← X P_s

Remember that in the GPP algorithm the matrix U and the set of Gaussian multipliers μ_{is}, s = 1 : n−1, i = s+1 : n, denoted by M in the scheme above, overwrite the matrix A as in the left side of figure 2.7. In algorithm UINV we have seen that U^{−1} can overwrite U, so that after instruction 2 of the scheme above the data storage is as in the right side of figure 2.7. We shall describe an algorithm, based on this scheme, which computes the inverse of A with a minimum amount of additional memory. To this end, observe that the first two instructions need additional memory only for the vector p ∈ R^{n−1}, which stores the row permutations of GPP. Let us now analyze the for loop in instruction 3. The first assignment, X ← X M_s, can be detailed by a row partitioning of X:

x_i^T M_s = x_i^T (I_n − m_s e_s^T) = x_i^T − (x_i^T m_s) e_s^T = [x_{i1}  x_{i2}  ...  x_{i,s−1}  (x_{is} − x_i^T m_s)  x_{i,s+1}  ...  x_{in}],  i = 1 : n.   (2.34)

It can be seen that only the elements of the s-th column of X are modified. The second instruction of the for loop, X ← X P_s, interchanges the s-th and the i_s-th columns (i_s > s). Taking into account the above considerations and the reverse order of column calculation, step s of the for loop 3 in the scheme above, i.e. X ← X M_s P_s, can be described as in figure 2.8, where U^{−1} and M denote the elements of U^{−1} and the Gaussian multipliers not yet used (and not affected) during the execution of the loop up to the current step; affected elements (part of X) are marked. Thus, the only information that must be saved consists of the Gaussian multipliers μ_{is}, i = s+1 : n, required to perform (2.34).
Figure 2.8: Data storage in step s of matrix inversion.

In this way, a matrix inversion can be done with only two additional (n−1)-vectors: p, for storing the permutations, and, say, m, for temporarily saving the current Gaussian multipliers. The algorithm that implements the above considerations is:

Algorithm 2.33 (INV_GPP) (Given a nonsingular matrix A ∈ R^{n×n}, this algorithm overwrites A by its inverse A^{−1}. The algorithm uses GPP for the Gaussian triangularization and UINV to invert an upper triangular matrix.)
1. [U, M, p] = GPP(A)            {U and M are stored over A}
2. A ← X = UINV(U)               {only the upper left triangle is modified}
3. for s = n−1 : −1 : 1
   1. for i = s+1 : n
      1. m_i ← μ_{is}            {save the multipliers}
   2. for i = 1 : s              {relation (2.34), first s rows}
      1. a_is ← a_is − Σ_{k=s+1}^{n} a_ik m_k
   3. for i = s+1 : n            {relation (2.34), the other rows}
      1. a_is ← −Σ_{k=s+1}^{n} a_ik m_k
   4. if p(s) ≠ s then
      1. for i = 1 : n           {interchange columns s and p(s)}
         1. a_is ↔ a_{i,p(s)}
The number of floating point operations is given by

N_INV_GPP = N_GPP + N_UINV + Σ_{s=1}^{n−1} 2n(n−s) ≈ 2n³/3 + n³/3 + n³ = 2n³
and the memory needed is, of course, M_INV_GPP = n². It is remarkable that matrix inversion is not more difficult (in number of operations) than matrix multiplication.
A better precision can be obtained by using GCP instead of GPP. In this case

M_{n−1} P_{n−1} ... M_1 P_1 A Q_1 Q_2 ... Q_{n−1} = U.

Then

X ≡ A^{−1} = Q_1 Q_2 ... Q_{n−1} U^{−1} M_{n−1} P_{n−1} ... M_1 P_1

and the resulting algorithm is obtained by using GCP and adding to algorithm INV_GPP the row interchanges which correspond, in reverse order, to the column permutations from GCP. In practice, the precision obtained by INV_GPP is good enough for most applications.

Remark 2.34 The computational effort needed to invert a nonsingular matrix is about three times greater than the effort for solving a linear system of the same dimension. So, whenever possible, matrix inversion has to be avoided and replaced by linear system solving. For example, to compute the real scalar b^T A^{−1} c, where A ∈ R^{n×n} and b, c ∈ R^n, the recommended scheme is:

1. Solve Ax = c (so that x = A^{−1} c)
2. Compute b^T x
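A hedged numpy illustration of Remark 2.34 (the variable names are ours): one system solve replaces the explicit inverse, at roughly one third of the cost and usually with better accuracy.

import numpy as np

rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n)) + n * np.eye(n)    # a well-conditioned test matrix
b, c = rng.standard_normal(n), rng.standard_normal(n)

x = np.linalg.solve(A, c)                  # recommended: one solve, ~2n^3/3 flops
gamma = b @ x

gamma_via_inverse = b @ np.linalg.inv(A) @ c   # to be avoided: ~2n^3 flops
print(abs(gamma - gamma_via_inverse))          # tiny here, but the solve is cheaper and safer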
2.7 Determinant calculus
The GPP and GCP algorithms are good tools for computing the determinant of a square matrix. The following well-known or easy-to-verify facts give the idea of this computation.

Proposition 2.35 If A, B ∈ R^{n×n} are two square matrices, then det(A·B) = det(A)·det(B). The property holds for any finite number of square matrices.

Proposition 2.36 If A ∈ R^{n×n} is a triangular matrix, then det(A) = Π_{i=1}^{n} a_ii.
Proposition 2.37 The interchange of two rows or columns changes the sign of the matrix determinant; equivalently, if P is an elementary permutation matrix, then det(P) = −1.

The elementary lower triangular matrices M_s, s = 1 : n−1, in (2.33) have unit diagonal elements. Then det(M_s) = 1 and, hence,

det(A) = (−1)^k det(U) = (−1)^k Π_{i=1}^{n} u_ii,

where k ≤ n−1 is the number of proper elementary permutation matrices (i.e. P ≠ I_n) in (2.33). The corresponding algorithm is:

Algorithm 2.38 (DET) (Given a square matrix A ∈ R^{n×n}, this algorithm computes the matrix determinant using Gaussian reduction to triangular form, with row pivoting.)
1. [U, M, p] = GPP(A)
2. det ← 1
3. for s = 1 : n
   1. det ← det · u_ss
4. for s = 1 : n−1
   1. if p(s) ≠ s then
      1. det ← −det
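For concreteness, a compact Python analogue of this idea (our own sketch, not the book's DET/GPP routines): triangularize with partial pivoting, multiply the diagonal, and flip the sign once per actual row interchange, as in Proposition 2.37.

import numpy as np

def det_gauss_partial_pivoting(A):
    U = np.array(A, dtype=float)
    n = U.shape[0]
    sign = 1.0
    for s in range(n - 1):
        p = s + np.argmax(np.abs(U[s:, s]))    # pivot row
        if p != s:
            U[[s, p], s:] = U[[p, s], s:]
            sign = -sign                        # row interchange changes the sign
        if U[s, s] == 0.0:
            return 0.0                          # singular matrix
        mult = U[s+1:, s] / U[s, s]
        U[s+1:, s:] -= np.outer(mult, U[s, s:])
    return sign * np.prod(np.diag(U))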
The main computational effort in algorithm DET is due to the GPP execution. Obviously, GCP can be used instead of GPP, with a slightly better precision of the result.

2.8 LDU and LU matrix factorizations
In many situations it is convenient to express a given matrix as a product of two triangular matrices. This form can be used to simplify calculations involving the given matrix.

Definition 2.39 Let A ∈ R^{n×n} be given. If there exist a lower triangular matrix L ∈ R^{n×n} and an upper triangular matrix U ∈ R^{n×n} such that

A = L U   (2.35)

then the above expression is called an LU factorization of the matrix.
Definition 2.40 A matrix D ∈ R^{n×n} which is simultaneously lower and upper triangular (i.e. d_ij = 0 if i ≠ j) is called diagonal.
It is an easy exercise for the reader to prove the following:

Proposition 2.41 The product of any number of (unit) lower (upper) triangular matrices is a (unit) lower (upper) triangular matrix.
Let now A ∈ R^{n×n} be an LU-decomposable matrix and D a nonsingular diagonal matrix. Then, from propositions 2.29 and 2.31, D^{−1} is also diagonal and, hence, (2.35) can be written in the form

A = L U = L D D^{−1} U = L' U',   (2.36)

where L' = L D is lower triangular and U' = D^{−1} U is upper triangular. If D ≠ I, then the factorizations (2.35) and (2.36) differ, so the LU factorization, if one exists, is not unique. Thus it is useful to introduce:

Definition 2.42 If A ∈ R^{n×n}, then the expression A = L D U, where L ∈ R^{n×n} is unit lower triangular, U ∈ R^{n×n} is unit upper triangular and D ∈ R^{n×n} is a diagonal matrix, is called the LDU factorization of the matrix A.
The following important fact shows under which conditions an LDU factorization exists and is unique.

Theorem 2.43 A matrix A ∈ R^{n×n} has a unique LDU factorization if and only if the leading principal submatrices A[s], s = 1 : n−1, are all nonsingular.

Proof. We give a proof for the case when the matrix A itself is nonsingular; the proof for singular A is a nontrivial exercise left to the reader. Let us show first that if an LDU decomposition exists, it is unique. Suppose there exist two different LDU decompositions:

A = L D U = L' D' U'.   (2.37)

L, L', U, U' are unit triangular matrices and therefore nonsingular; moreover, L^{−1}, (L')^{−1}, U^{−1}, (U')^{−1} have the same structure. A is nonsingular, hence so are D and D', and the matrices D^{−1} and (D')^{−1} are diagonal. Then, from (2.37):

(L')^{−1} L = D' U' U^{−1} D^{−1},
where the left-hand side is a unit lower triangular matrix and the right-hand side is an upper triangular matrix. Hence

(L')^{−1} L = I_n,  i.e.  L = L'.

Now it results that

U' U^{−1} = (D')^{−1} D.

The left-hand side is a unit upper triangular matrix and the right-hand side a diagonal one. This is possible if and only if both equal the unit matrix I_n. Hence

U = U',  D = D',

which completes the proof of the uniqueness of the LDU decomposition. Also, if the LDU decomposition of the nonsingular matrix A exists, then D is nonsingular and it is easy to see that

A[s] = L[s] D[s] U[s],  s = 1 : n,

where, obviously, L[s], D[s] and U[s] are all nonsingular. Hence A[s], s = 1 : n, are all nonsingular. It remains to show the existence of the LDU decomposition when A[s] is nonsingular for s = 1 : n−1. From theorem 2.17, which states the theoretical basis for Gaussian elimination, if A[s], s = 1 : n−1, are nonsingular, then there exist elementary lower triangular matrices M_s, s = 1 : n−1, such that

M_{n−1} ... M_2 M_1 A = R,   (2.38)

where R is a nonsingular upper triangular matrix. Denoting

L = (M_{n−1} ... M_2 M_1)^{−1},  D = diag(r_11, r_22, ..., r_nn),  U = D^{−1} R,

(2.38) can be written as A = L D U, with L, D, U satisfying the conditions of an LDU decomposition. ∎

Practical LU factorizations associate the diagonal matrix D from the LDU factorization:
- with the matrix U, defining the so-called Doolittle factorization A = LU, where L is unit lower triangular and U is upper triangular,
- or with the matrix L, defining the Crout factorization A = LU, where L is lower triangular and U is unit upper triangular.

From theorem 2.43 it results that both the Doolittle and the Crout factorizations exist and are unique if and only if the leading principal submatrices A[s], s = 1 : n−1, are all nonsingular.
An algorithm to compute the Doolittle factorization is provided by the algorithm G for Gaussian triangularization. Indeed, the algorithm G produces the ELT matrices M_s, so that from (2.38) there results the LU factorization A = LU, with L = (M_{n−1} ... M_2 M_1)^{−1} and U = R. Hence, to compute the Doolittle factorization, it is sufficient to apply the algorithm G and to compute

L = M_1^{−1} M_2^{−1} ... M_{n−1}^{−1}.

But (see proposition 2.13) M_s^{−1} = I_n + m_s e_s^T, s = 1 : n−1, and hence

L = (I_n + m_1 e_1^T)(I_n + m_2 e_2^T) ... (I_n + m_{n−1} e_{n−1}^T) = I_n + Σ_{s=1}^{n−1} m_s e_s^T + S,

where S is a sum of matrix terms of the form T = ... m_p e_p^T m_q e_q^T ... with p < q. Since e_p^T m_q = μ_{pq} = 0 for all p < q, we have T = 0 and S = 0. Thus

L = I_n + Σ_{s=1}^{n−1} m_s e_s^T =
[ 1          0          ...  0            0 ]
[ μ_21       1          ...  0            0 ]
[ μ_31       μ_32       ...  0            0 ]
[ ...        ...        ...               . ]
[ μ_{n−1,1}  μ_{n−1,2}  ...  1            0 ]
[ μ_n1       μ_n2       ...  μ_{n,n−1}    1 ]

and this means that the algorithm G computes the Doolittle decomposition, with both matrices L (without its unit diagonal elements) and U overwriting the matrix A. The Crout decomposition can be computed by taking in (2.36) LU as the Doolittle decomposition and D = diag(u_11, u_22, ..., u_nn); then L'U' is the Crout decomposition of A. However, there are other possibilities to organize the computation so as to obtain so-called compact factorizations. We shall derive such a Crout factorization,
Figure 2.9: Intermediate stage in the Crout factorization (the first s−1 columns of L and the first s−1 rows of U are already computed; at step s the elements l_is, i = s : n, and u_sj, j = s+1 : n, are determined).
based on the idea of successive computation of a column of L and of a row of U. The procedure may be started from the identity A = LU, resulting in

A e_1 = L U e_1 = L e_1,  i.e.  l_i1 = a_i1,  i = 1 : n,   (2.39)

and

e_1^T A = e_1^T L U = l_11 e_1^T U,  i.e.  a_1j = l_11 u_1j,  j = 2 : n,

where l_11 ≠ 0 (since l_11 = a_11 = A[1] ≠ 0). Then

u_1j = a_1j / l_11,  j = 2 : n.   (2.40)

Let us now suppose that the first s−1 columns of L and the first s−1 rows of U are known (computed); see figure 2.9. From the identity A = LU we have

a_is = Σ_{k=1}^{s−1} l_ik u_ks + l_is,  i = s : n,

where the only unknown scalar is l_is. Hence

l_is = a_is − Σ_{k=1}^{s−1} l_ik u_ks,  i = s : n.   (2.41)
Also,

a_sj = Σ_{k=1}^{s−1} l_sk u_kj + l_ss u_sj,  j = s+1 : n,

where now the only unknown scalar is u_sj. It can be shown that the nonsingularity of A[s] implies l_ss ≠ 0. Hence

u_sj = (a_sj − Σ_{k=1}^{s−1} l_sk u_kj) / l_ss,  j = s+1 : n.   (2.42)
So, the procedure started by (2.39) and (2.40) can be continued to completion by (2.41) and (2.42), provided that A[s], s = 1 : n−1, are all nonsingular, which guarantees that the division in (2.42) can be executed. The resulting algorithm is:

Algorithm 2.44 (CROUT – Crout factorization) (Given a matrix A ∈ R^{n×n} whose leading principal submatrices A[s], s = 1 : n−1, are all nonsingular, this algorithm computes the Crout factorization A = LU, where L is lower triangular and U is unit upper triangular. L and U are stored over the corresponding elements of A (without the diagonal elements of U, which are 1).)
(0. a_i1 ← l_i1 = a_i1, i = 1 : n)
1. for j = 2 : n
   1. a_1j ← u_1j = a_1j / l_11
2. for s = 2 : n
   1. for i = s : n
      1. a_is ← l_is = a_is − Σ_{k=1}^{s−1} l_ik u_ks
   2. if s = n then stop
   3. for j = s+1 : n
      1. a_sj ← u_sj = (a_sj − Σ_{k=1}^{s−1} l_sk u_kj) / l_ss
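A direct Python transcription of the compact scheme (2.39)–(2.42) may make the access pattern clearer; it is a sketch in 0-based indexing, with separate L and U matrices and no pivoting, so all leading principal submatrices are assumed nonsingular.

import numpy as np

def crout(A):
    n = A.shape[0]
    L = np.zeros((n, n))
    U = np.eye(n)                      # unit upper triangular
    for s in range(n):
        for i in range(s, n):          # relation (2.41): column s of L
            L[i, s] = A[i, s] - L[i, :s] @ U[:s, s]
        for j in range(s + 1, n):      # relation (2.42): row s of U
            U[s, j] = (A[s, j] - L[s, :s] @ U[:s, j]) / L[s, s]
    return L, U

For a suitable A, L @ U reproduces A up to rounding errors.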
The floating point operations asymptotically total N_CROUT ≈ 2n³/3, as in the G, GPP or GCP algorithms, and the memory used is M_CROUT ≈ n². To avoid the requirement that A[s], s = 1 : n−1, be nonsingular, only row pivoting is possible; it may be introduced after instruction 2.1 of the algorithm above.

2.9 Cholesky factorization
An important special case of the LU factorization which can be performed by an efficient and reliable algorithm is that of positive definite matrices.
Definition 2.45 A square symmetric (A = A^T) matrix A is called positive definite if for any nonzero vector x ∈ R^n

x^T A x > 0   (2.43)

and positive semidefinite if x^T A x ≥ 0.

Definition 2.46 If A is a real n×n matrix and I = {i_1, i_2, ..., i_r} is a set of indices such that i_1 < i_2 < ... < i_r ≤ n, then the r×r matrix A' defined by a'_kl = a_{i_k i_l}, for k, l = 1 : r, is called a principal submatrix of A. Recall that if I = {1, 2, ..., r}, then A' = A[r] is called the leading principal submatrix of A of order r.

For positive definite matrices we have the following:

Proposition 2.47 A principal submatrix of a positive definite matrix is positive definite.

Proof. Let the set I = {i_1, i_2, ..., i_r} define a principal submatrix A' of the given positive definite matrix A ∈ R^{n×n}. For any nonzero vector x ∈ R^r, define the vector y ∈ R^n by

y_{i_k} = x_k,  k = 1 : r;  y_i = 0,  i ∉ I.

Obviously, x ≠ 0 implies y ≠ 0, and from the positive definiteness of A:

x^T A' x = y^T A y > 0,

i.e. A' is positive definite. ∎

Corollary 2.48 All leading principal submatrices of a symmetric positive definite matrix, and the matrix itself, are nonsingular.

Proof. It is sufficient to show that a positive definite matrix is nonsingular and then to apply proposition 2.47. Assume that a positive definite matrix A is singular. Then there exists a nonzero vector x ∈ R^n such that Ax = 0 and therefore x^T A x = 0, which contradicts the definition of positive definiteness. ∎

The theoretical basis for the Cholesky decomposition is given by:

Theorem 2.49 For any symmetric positive definite matrix A ∈ R^{n×n} there exists a unique lower triangular matrix L ∈ R^{n×n} with positive diagonal elements such that

A = L L^T,

which is called the Cholesky factorization.
Proof. The matrix A is positive definite; hence, according to corollary 2.48, A[s], s = 1 : n, are nonsingular and, therefore, A has a unique LDU factorization (see theorem 2.43). From the matrix symmetry (A = A^T):

A = L' D U' = (U')^T D (L')^T.

The uniqueness of the LDU decomposition and the relation above imply U' = (L')^T. Hence, there exist a unique unit lower triangular matrix L' ∈ R^{n×n} and a unique diagonal matrix D ∈ R^{n×n} such that

A = L' D (L')^T.   (2.44)

Moreover, using (2.43) for x_j = (L')^{−T} e_j, (2.44) leads to

x_j^T A x_j = e_j^T (L')^{−1} L' D (L')^T (L')^{−T} e_j = e_j^T D e_j = d_jj > 0,

for all j = 1 : n. Hence all diagonal elements of D are positive. Then, denoting

E = diag(√d_11, √d_22, ..., √d_nn),

we have D = E² and (2.44) can be written as

A = L' E E (L')^T = L L^T,

where L = L' E is a lower triangular matrix with positive diagonal elements l_ii = √d_ii, i = 1 : n. We have proved the existence of the Cholesky factorization A = L L^T; the uniqueness derives from the uniqueness of the LDU decomposition and the uniqueness of the matrix E (with positive diagonal elements) such that D = E². ∎

There are several algorithms for computing the Cholesky factorization of a positive definite matrix, which differ in the ordering of the scalar computations. A possible order is to calculate, from top to bottom, the s-th column of L, for s = 1 : n. Identifying, element by element, the lower triangle of A with that of L L^T (only the lower left triangle of A is represented, since A is symmetric),

[ a_11                    ]   [ l_11                    ] [ l_11  ...  l_s1  ...  l_n1 ]
[  ...   ...              ]   [  ...   ...              ] [       ...   ...            ]
[ a_s1  ...  a_ss         ] = [ l_s1  ...  l_ss         ] [       l_ss  ...  l_ns      ]
[  ...   ...   ...        ]   [  ...   ...   ...        ] [              ...   ...     ]
[ a_n1  ...  a_ns ... a_nn]   [ l_n1  ...  l_ns ... l_nn] [   0               l_nn     ]
for the first column we have

a_11 = l_11²  ⇒  l_11 = √a_11,
a_i1 = l_i1 l_11  ⇒  l_i1 = a_i1 / l_11,  i = 2 : n,   (2.45)

i.e. the computational process can start. Now, assuming that the first s−1 columns of L have been computed, the identification of the (s, s) element in A = L L^T yields

a_ss = Σ_{k=1}^{s−1} l_sk² + l_ss²,   (2.46)

where the only unknown scalar is l_ss. According to theorem 2.49, if A is positive definite there exists a unique positive l_ss such that (2.46) holds. Hence

a_ss − Σ_{k=1}^{s−1} l_sk² > 0

and

l_ss = sqrt( a_ss − Σ_{k=1}^{s−1} l_sk² ).   (2.47)

Also,

a_is = Σ_{k=1}^{s−1} l_ik l_sk + l_is l_ss,  i = s+1 : n,

where, in the given order of computation, the only unknown is l_is. Hence

l_is = ( a_is − Σ_{k=1}^{s−1} l_ik l_sk ) / l_ss,  i = s+1 : n.   (2.48)
Formulas (2.45), (2.47) and (2.48) define the algorithm given below.

Algorithm 2.50 (CHOL – Cholesky factorization) (Given a symmetric matrix A ∈ R^{n×n}, this algorithm establishes whether the matrix is positive definite. If so, it overwrites the lower triangle of A with the matrix L of the Cholesky factorization A = L L^T.)
1. if a_11 ≤ 0 then
   1. Print('A is not positive definite')
   2. stop
2. a_11 ← l_11 = √a_11
3. for i = 2 : n
   1. a_i1 ← l_i1 = a_i1 / l_11
4. for s = 2 : n
   1. α ← a_ss − Σ_{k=1}^{s−1} l_sk²
   2. if α ≤ 0 then
      1. Print('A is not positive definite')
      2. stop
   3. a_ss ← l_ss = √α
   4. if s = n then stop
   5. for i = s+1 : n
      1. a_is ← l_is = (a_is − Σ_{k=1}^{s−1} l_ik l_sk) / l_ss

The above algorithm is the best test for the positive definiteness of a matrix. This relies on the fact that the Cholesky factorization of a symmetric matrix can be carried out to completion if and only if the matrix is positive definite (see theorem 2.49). The CHOL algorithm requires about N_CHOL = n³/3 flops and, additionally, n square root calculations. The memory needed is about M_CHOL = n²/2 (a symmetric matrix is usually stored by one of its triangular parts).
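A short Python sketch of this column-by-column Cholesky scheme (names are ours; it raises an exception instead of printing, and returns L rather than overwriting A):

import numpy as np

def cholesky_lower(A):
    n = A.shape[0]
    L = np.zeros((n, n))
    for s in range(n):
        alpha = A[s, s] - L[s, :s] @ L[s, :s]      # relations (2.46)-(2.47)
        if alpha <= 0.0:
            raise ValueError("A is not positive definite")
        L[s, s] = np.sqrt(alpha)
        for i in range(s + 1, n):                   # relation (2.48)
            L[i, s] = (A[i, s] - L[i, :s] @ L[s, :s]) / L[s, s]
    return L

Calling cholesky_lower on a symmetric positive definite A gives L with L @ L.T ≈ A; on an indefinite matrix it stops at the first nonpositive pivot, exactly as the text describes.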
2.10 Applications of the LU factorizations
The LU factorizations are useful in many computations involving a square matrix. In the general case, for reliability reasons, the LU factorization must be computed with a suitable pivoting strategy. However, to emphasize the utility of the LU decomposition, we shall assume that a generic given matrix A ∈ R^{n×n} actually has such a factorization.

2.10.1 Linear system solving
The solution of the linear system Ax = b can be found using the following scheme:

1. A = LU (compute the LU factorization)
2. Solve the lower triangular system Ly = b
3. Solve the upper triangular system Ux = y

The computational effort is about the same as in Gaussian elimination, and its main part is spent in the factorization process. When the linear system has multiple right-hand sides, i.e. has the form

A X = B,
where B ∈ R^{n×p} and X results with the same dimensions, the system can be reduced to p systems with a single right-hand side, A x_j = b_j, j = 1 : p, where b_j and x_j are the j-th columns of B and X, respectively. However, the computational effort is not p times larger: since A is the same for all p systems, the LU factorization can be computed only once, and then only triangular systems have to be solved. The algorithm is the following:

1. A = LU (compute the LU factorization)
2. for j = 1 : p
   1. Solve the lower triangular system Ly = b_j
   2. Solve the upper triangular system Ux_j = y

The number of operations will be only about 2n³/3 + 2pn².
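The same idea with library routines, as a hedged illustration (scipy's lu_factor / lu_solve play the roles of steps 1 and 2.1–2.2; the data are random test values):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(1)
n, p = 100, 5
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, p))

lu, piv = lu_factor(A)                 # ~2n^3/3 flops, done once
X = np.column_stack([lu_solve((lu, piv), B[:, j]) for j in range(p)])
print(np.allclose(A @ X, B))           # True, each column cost only ~2n^2 flops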
2.10.2 Matrix inversion
If a nonsingular matrix A has an LU decomposition, then A^{−1} = U^{−1} L^{−1}, so an inversion procedure can be:

1. A = LU (compute the LU factorization)
2. U ← X = U^{−1}
3. L ← Y = L^{−1}
4. A^{−1} ← X Y

The number of operations is about 2n³.

2.10.3 Determinant calculus
Obviously, det(A) = det(LU) = det(L) det(U) = (Π_{i=1}^{n} l_ii)(Π_{i=1}^{n} u_ii). So an LU factorization gives an immediate possibility for evaluating the determinant.

2.11 Numerical condition of linear systems

2.11.1 Matrix norms
In order to evaluate the numerical condition of a linear system, we need some elementary notions about matrix norms.

Definition 2.51 A function ν : R^{m×n} → R_+ is a matrix norm on R^{m×n} if:
1. ν(A) > 0, ∀A ∈ R^{m×n}, A ≠ 0.
2. ν(αA) = |α| ν(A), ∀A ∈ R^{m×n}, ∀α ∈ R.
3. ν(A + B) ≤ ν(A) + ν(B), ∀A, B ∈ R^{m×n}.

This definition has some immediate consequences, given by:

Corollary 2.52 If ν : R^{m×n} → R_+ is a matrix norm, then:
1. ν(0) = 0.
2. ν(−A) = ν(A), ∀A ∈ R^{m×n}.
3. The matrix sequence (A_k)_{k∈N} is convergent with limit A if and only if the scalar sequence (ν(A_k − A))_{k∈N} converges to zero, i.e.

lim_{k→∞} A_k = A  ⇔  lim_{k→∞} ν(A_k − A) = 0.

To use matrix norms independently of the matrix dimensions, we introduce the following:

Definition 2.53 A function ν : ∪_{m=1}^{∞} ∪_{n=1}^{∞} R^{m×n} → R_+ is a family of matrix norms if, for each m, n > 0, the restriction of ν to R^{m×n} is a matrix norm. A family of matrix norms is consistent if

ν(AB) ≤ ν(A) ν(B)

whenever the product AB is defined. If n = 1, then ν is a family of vector norms.

The following theorem provides a practical way to define concrete matrix norms:

Theorem 2.54 Let ν be a family of vector norms. For A ∈ R^{m×n}, define

‖A‖ = max_{ν(x)=1} ν(Ax).

Then ‖·‖ : ∪_{m=1}^{∞} ∪_{n=1}^{∞} R^{m×n} → R_+ is a consistent family of matrix norms.

The family of matrix norms ‖·‖ is said to be subordinate to the family of vector norms ν.
The most natural families of vector norms to use in theorem 2.54 are the p-norms (or Hölder norms), for p = 1, 2, ∞, defined by

‖x‖_p = ( Σ_{i=1}^{m} |x_i|^p )^{1/p},

i.e., respectively,

‖x‖_1 = Σ_{i=1}^{m} |x_i|,
‖x‖_2 = sqrt( Σ_{i=1}^{m} x_i² )   (the euclidian norm),
‖x‖_∞ = max_{i=1:m} |x_i|.

The matrix 1-norm and ∞-norm, defined as in theorem 2.54, are easily computed:

‖A‖_1 = max_{‖x‖_1=1} ‖Ax‖_1 = max_{j=1:n} Σ_{i=1}^{m} |a_ij|,
‖A‖_∞ = max_{‖x‖_∞=1} ‖Ax‖_∞ = max_{i=1:m} Σ_{j=1}^{n} |a_ij|.

The matrix p-norms, applied to vectors seen as one-column matrices, give the same values as the corresponding vector p-norms. Hence there is no possibility of confusion in using the same symbol ‖·‖_p for the matrix norm and the vector norm. Another widely used matrix norm is the Frobenius norm, defined by

‖A‖_F = sqrt( Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij² ).
The family of Frobenius norms is consistent.
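As a quick check of the formulas above, the 1-, ∞- and Frobenius norms can be computed directly and compared with numpy's built-in matrix norms (a small illustrative example of ours):

import numpy as np

A = np.array([[1.0, -2.0, 3.0],
              [0.0,  4.0, -1.0]])

norm_1   = max(np.sum(np.abs(A), axis=0))   # maximum absolute column sum
norm_inf = max(np.sum(np.abs(A), axis=1))   # maximum absolute row sum
norm_fro = np.sqrt(np.sum(A**2))

print(norm_1,   np.linalg.norm(A, 1))       # both 6.0
print(norm_inf, np.linalg.norm(A, np.inf))  # both 6.0
print(norm_fro, np.linalg.norm(A))          # both sqrt(31)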
2.11.2 Numerical condition of linear systems
Let us return now to the problem of the numerical condition of determined nonsingular linear systems of the form Ax = b, where A ∈ R^{n×n} and b ∈ R^n are given. For a perturbation (A, b) → (A + ΔA, b + Δb) of the input data, the perturbation of the solution,

x → x + Δx,

is such that

(A + ΔA)(x + Δx) = b + Δb.

Hence, since Ax = b,

A Δx + ΔA x + ΔA Δx = Δb.

Neglecting the product ΔA Δx, we obtain

Δx ≈ −A^{−1} ΔA x + A^{−1} Δb

and, for a consistent family of matrix norms ‖·‖,

‖Δx‖ ≤ ‖A^{−1}‖ ‖ΔA‖ ‖x‖ + ‖A^{−1}‖ ‖Δb‖.

If x ≠ 0, a relative error bound is given by

‖Δx‖/‖x‖ ≤ ‖A^{−1}‖ ‖ΔA‖ + ‖A^{−1}‖ ‖Δb‖ / ‖x‖
         = ‖A^{−1}‖ ‖A‖ · ‖ΔA‖/‖A‖ + ‖A^{−1}‖ ‖A‖ · ‖Δb‖/(‖A‖ ‖x‖)
         ≤ ‖A^{−1}‖ ‖A‖ ( ‖ΔA‖/‖A‖ + ‖Δb‖/‖b‖ ),

since ‖b‖ = ‖Ax‖ ≤ ‖A‖ ‖x‖. Hence the relative errors ‖ΔA‖/‖A‖ and ‖Δb‖/‖b‖ in the input data are amplified by

κ(A) = ‖A^{−1}‖ ‖A‖   (2.49)

times, independently of the algorithm used to solve the system. The real number κ(A) defined in (2.49) is called the condition number of the matrix A and provides a measure of the numerical condition of the given linear system. A smaller κ corresponds to a better numerical condition.

Remark 2.55 A usual way to check a computed solution x̄ of the linear system Ax = b is to calculate a norm of the residual r = b − Ax̄,

‖r‖ = ‖b − Ax̄‖,

and to conclude that if ‖r‖ is "small", then x̄ is a "good" (accurate) solution. In fact this conclusion is not true if the matrix A is ill conditioned. Indeed, the true solution is x = A^{−1} b, so

x − x̄ = A^{−1} b − A^{−1}(b − r) = A^{−1} r

and, hence, the relative error has the following bound:

‖x − x̄‖/‖x‖ = ‖A^{−1} r‖/‖A^{−1} b‖ ≤ ‖A^{−1}‖ ‖r‖ / ‖A^{−1} b‖ ≤ κ(A) ‖r‖ / (‖A‖ ‖A^{−1} b‖) ≤ κ(A) ‖r‖/‖b‖

for any consistent norm ‖·‖. So the solution test "r is small" is valid only if the matrix A has a good numerical condition (i.e. a small condition number).
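A hedged numerical illustration of Remark 2.55 on a notoriously ill-conditioned matrix (the Hilbert matrix): the residual of the computed solution is tiny while the solution itself has lost most of its accuracy.

import numpy as np
from scipy.linalg import hilbert

n = 12
A = hilbert(n)                         # condition number around 1e16
x_true = np.ones(n)
b = A @ x_true

x = np.linalg.solve(A, b)
print(np.linalg.norm(b - A @ x))       # residual: very small
print(np.linalg.norm(x - x_true))      # error: far from small
print(np.linalg.cond(A))               # the condition number kappa(A) of (2.49)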
2.11.3 Numerical stability of direct algorithms for linear systems
The effects of rounding errors on the algorithms described in this chapter are usually expressed in terms of numerical stability, which involves a backward error analysis. An algorithm for solving the determined linear system Ax = b is called numerically stable if the computed solution x̄ is the exact solution of a linear system

(A + E) x̄ = b,

where the perturbation matrix E has a small norm (compared to ‖A‖). Thus the solution error has an equivalent in an initial data error. Evaluating a bound for ‖E‖ is a very difficult task; sometimes there are no such analytical bounds and the numerical stability of an algorithm must be assessed from practical experience. We present here only a brief characterization of the numerical stability of several important algorithms:

- The solutions of triangular systems are usually computed with high accuracy and the corresponding algorithms are stable.
- The algorithm GCP of Gaussian elimination with complete pivoting is numerically stable.
- Gaussian elimination with partial pivoting is not unconditionally stable, but in computing practice it is considered stable.
- The Crout reduction algorithm with row pivoting is also, in practice, a stable algorithm.
- The Cholesky factorization algorithm for positive definite matrices is unconditionally stable, so it should be used whenever the system matrix is symmetric and positive definite.

Problems
P 2.1 Let A, B ∈ R^{n×n} be nonsingular lower bidiagonal matrices (a_ij = 0 for i < j or i > j+1). Give efficient algorithms for:
a. the multiplications Ax and ABx, where x ∈ R^n;
b. the matrix multiplication AB;
c. solving the linear system Ax = c, where c ∈ R^n;
d. solving the linear system Ax = e_j, where e_j is the j-th unit vector;
e. solving the linear system ABx = c, where c ∈ R^n;
f. computing the inverse of A.

P 2.2 Let A ∈ R^{n×n} be a nonsingular lower bidiagonal matrix, and T = A A^T. Give efficient algorithms to:
a. compute det(T);
b. compute the Frobenius norm of T;
c. solve the linear system Tx = b, where b ∈ R^n;
d. compute the inverse of T;
e. compute the Cholesky factorization of T (prove first that T is symmetric and positive definite).

P 2.3 Let H ∈ R^{n×n} be a nonsingular upper Hessenberg matrix (h_ij = 0 for i > j+1).
a. If all the leading principal submatrices of H are nonsingular, adapt the Gaussian elimination algorithm for solving the linear system Hx = b, where b ∈ R^n; calculate the number of operations.
b. Adapt the GPP algorithm for the same problem.
c. Adapt the Crout algorithm for computing the LU factorization.

P 2.4 Consider the matrices H ∈ R^{n×n}, nonsingular upper Hessenberg, and R ∈ R^{n×n}, unit upper triangular. Present efficient algorithms for:
a. computing the matrix product HR;
b. computing the vector HRb, with b ∈ R^n;
c. solving the linear system HRx = b, with b ∈ R^n.
d. When all leading principal submatrices of H are nonsingular, the Crout factorization of A = HR can be obtained using one of the following two schemes:
Scheme 1. 1. Compute A = HR. 2. Compute the Crout factorization of A: A = LU.
Scheme 2. 1. Compute the Crout factorization of H: H = L·Ū. 2. Compute U = ŪR.
Which one is the most efficient?

P 2.5 a. Propose an algorithm to solve the linear system Ax = b, where A ∈ C^{n×n} is nonsingular and b ∈ C^n, using only real arithmetic.
b. Give an algorithm for solving the matrix equation AX = B, where A ∈ R^{n×n} is nonsingular and B ∈ R^{n×m}.

P 2.6 Give an efficient algorithm to solve the linear system A^k x = b, where A ∈ R^{n×n} is nonsingular, b ∈ R^n and k ∈ N, k > 1.

P 2.7 Let B = [R A; A 0] ∈ R^{2n×2n}, with A, R ∈ R^{n×n} nonsingular, R upper triangular. The LU factorization of A exists and is known (A = LU).
a. Write an algorithm for computing the LU factorization of B, B = L̃Ũ.
b. Propose an algorithm for solving the linear system Bx = d, where d ∈ R^{2n}.
Calculate the number of operations for both these algorithms.

P 2.8 Let A ∈ R^{2n×2n} be a nonsingular matrix with all its leading principal submatrices nonsingular, partitioned as A = [A_1 A_2; A_3 A_4], with A_1, A_2, A_3, A_4 ∈ R^{n×n} and A_3 upper triangular.
a. Write an algorithm for solving the linear system Ax = b, with b ∈ R^{2n}.
b. The same, keeping only the assumption that A is nonsingular.

P 2.9 Let A ∈ R^{n×n} be a nonsingular tridiagonal matrix (a_ij = 0 for i > j+1 or i < j−1).
a. Adapt the Gaussian elimination algorithm to this type of matrix.
b. Give the algorithm that solves Ax = b, with b ∈ R^n.
c. If A is symmetric and positive definite, adapt the Cholesky factorization algorithm for A.

P 2.10 Let u, v ∈ R^n be two nonzero vectors, and A = I_n + u v^T.
a. Give an algorithm to compute the Frobenius norm of A.
b. Give an efficient algorithm to compute the determinant of A. When is A nonsingular?
c. If A is nonsingular and b ∈ R^n, write an efficient algorithm for solving the linear system Ax = b.

P 2.11 Describe a variant of Gaussian elimination that introduces zeros into the columns of A, above the diagonal, in the order n : −1 : 2, and which produces the factorization A = UL, where U is unit upper triangular and L is lower triangular.

P 2.12 Suppose that A ∈ R^{n×n} has an LU factorization and that L and U are known. Give an algorithm which computes the (i, j) entry of A^{−1} in approximately (n−j)² + (n−i)² flops.

P 2.13 If A ∈ R^{n×n} is symmetric and positive definite, propose an algorithm for the factorization A = U U^T, where U is upper triangular and has positive diagonal elements.

P 2.14 If A, B ∈ R^{n×n} are nonsingular matrices, give an algorithm to solve the linear system (AB)^k x = c, where c ∈ R^n.

P 2.15 Give a variant of the algorithm LSS_GPP for solving the linear system Ax = b using Gaussian elimination with partial pivoting, without storing the multipliers.
Chapter 3
The linear least squares problem

3.1 The linear least squares problem
Let us consider the linear system

Ax = b,

where A ∈ R^{m×n}, b ∈ R^m. Generally, an overdetermined system (m > n) has no solution and we must change the meaning of solving it. A natural statement of the problem is to find a vector x ∈ R^n such that the vector y = Ax is as near as possible to the vector b. More precisely, we want to determine a vector x ∈ R^n that minimizes the function

ρ(x) = ν(b − Ax),

where ν is a vector norm on R^m. For an underdetermined system (m < n) there is generally more than one solution. A natural criterion to select a single one is to require it to have minimum "length", i.e. to minimize the function

ρ(x) = ν(x)|_{Ax=b},

where ν is a vector norm on R^n. Most of the standard techniques for finding minima of functions do not apply to the problems above since, in general, a norm is not a differentiable function of its argument. Thus there are no general procedures, independent of the norms involved, to solve these problems. When ν is the euclidian norm, the problem of minimizing the norm of the residual r = b − Ax is called the linear least squares (LLS) problem. The LLS problem is encountered in many areas of application, such as approximation theory of functions and data, statistics, etc.
Example 3.1 Consider a process described by a functional dependency y = f(u), known only from experimental data given by the set of pairs {(u_i, y_i) | i = 1 : m}. Our problem is to find an approximate analytical expression f̂ for the function f as a linear combination of n given independent functions g_1, g_2, ..., g_n that can be evaluated at the points u_i, i = 1 : m, i.e. to find the coefficients x_j, j = 1 : n, such that the function

ŷ = f̂(u) = Σ_{j=1}^{n} x_j g_j(u)

makes the residuals

r_i = y_i − f̂(u_i),  i = 1 : m,

as small as possible. If m > n, there are more residuals than unknowns and it will generally not be possible to make all of them equal to zero. The LLS approach is to choose the coefficients x_j so that

‖r‖ = ( Σ_{i=1}^{m} r_i² )^{1/2} = ( Σ_{i=1}^{m} ( y_i − Σ_{j=1}^{n} x_j g_j(u_i) )² )^{1/2} = ‖b − Ax‖,

where

b = [y_1 y_2 ... y_m]^T ∈ R^m,  A ∈ R^{m×n},  a_ij = g_j(u_i),  i = 1 : m,  j = 1 : n,

is minimized. In particular, if g_j(u) = u^{j−1}, j = 1 : n, then f̂ is a polynomial function of degree n−1.

The theory of LLS is intimately related to the geometry of the space R^n, especially to the notion of orthogonality, which makes the LLS problem meaningful. In the next section we recall some geometric properties of the vector space R^n.

3.2 Orthogonality. Householder and Givens transformations

Let us introduce some definitions, propositions and theorems. The proofs not given below can be found in any book of linear algebra.

Definition 3.2 Two vectors x, y ∈ R^m are orthogonal if

x^T y = Σ_{i=1}^{m} x_i y_i = 0.
The vectors u_1, u_2, ..., u_p are orthogonal if they are pairwise orthogonal, i.e.

u_i^T u_j = 0,  ∀ i ≠ j,  i, j = 1 : p.   (3.1)

If, in addition to (3.1), the vectors u_i have unit euclidian norm, ‖u_i‖ = 1, ∀ i = 1 : p, they are called orthonormal.

Definition 3.3 A set S of vectors from R^m is called a linear subspace of the space R^m if both the following conditions hold:
1. u + v ∈ S, ∀ u, v ∈ S.
2. αu ∈ S, ∀ α ∈ R, ∀ u ∈ S.

Obviously, every linear subspace contains the origin. In practice, a subspace is often given as the column space of a matrix, i.e. as

S = Im A = { y ∈ R^m | ∃ x ∈ R^n such that y = Ax },

where A ∈ R^{m×n}. The notion of orthogonality may be extended to linear subspaces.

Definition 3.4 The vector x ∈ R^m is orthogonal to the linear subspace S ⊆ R^m if it is orthogonal to every vector in S. Two subspaces S, T ⊆ R^m are orthogonal if each vector s ∈ S is orthogonal to each vector t ∈ T.

Definition 3.5 A linear subspace S ⊆ R^m is called the orthogonal complement of the linear subspace T ⊆ R^m if the two subspaces are orthogonal and complementary, i.e. for any vector x ∈ R^m there exist unique vectors s ∈ S and t ∈ T such that x = t + s.

Theorem 3.6 (Pythagoras) If x, y ∈ R^m are orthogonal, then

‖x + y‖² = ‖x‖² + ‖y‖²,

where ‖·‖ is the euclidian norm.

Theorem 3.7 Let A ∈ R^{m×n}. The vector x ∈ R^m is orthogonal to the linear subspace S = Im A if and only if A^T x = 0 (⇔ x ∈ Ker A^T). In other words, the subspace T = Ker A^T ⊆ R^m is the orthogonal complement of S = Im A.
Figure 3.1: Ker A^T is the orthogonal complement of Im A; data from example 3.8 (Im A is spanned by (1,0,1) and (0,1,1), Ker A^T by (1,1,−1)).
Example 3.8 Let A = [1 0; 0 1; 1 1] ∈ R^{3×2}. Then

S = Im A = { y ∈ R³ | y = [x_1; x_2; x_1 + x_2],  x_1, x_2 ∈ R },

i.e. S is the plane of equation y_3 = y_1 + y_2, and

T = Ker A^T = { y ∈ R³ | A^T y = 0 } = { y ∈ R³ | y_2 + y_3 = 0,  y_1 + y_3 = 0 } = Im [1; 1; −1].

See figure 3.1.
Definition 3.9 A square matrix Q ∈ R^{m×m} is called orthogonal if its columns are orthonormal, i.e.

q_i^T q_j = 1 if i = j,  0 if i ≠ j,   (3.2)

where q_i and q_j are the i-th and, respectively, the j-th column of Q.

Theorem 3.10 If the matrix Q ∈ R^{m×m} is orthogonal, then

Q^T Q = Q Q^T = I_m,   (3.3)
‖Qx‖ = ‖x‖,  ∀ x ∈ R^m.   (3.4)
These relations are a dire t onsequen e of (3.2) and of the de nition of the eu lidian norm. The property (3.3) expresses the important fa ts that an orthogonal matrix is nonsingular, its inverse is its transpose and it has orthonormal rows and olumns. The property (3.4) shows that an orthogonal transformation onserves the eu lidian ve tor norm (i.e. it de nes an isometry). Theorem 3.11 If Q1 ; Q2 ; : : : ; Qp 2 Rmm are orthogonal, then the matrix: Q = Q1 Q2 : : : Qp
(3.5)
is also orthogonal. Proof. Obviously, relations (3.2) and (3.3) are equivalent. Then (Q1 Q2 )T Q1 Q2 = QT2 QT1 Q1 Q2 = QT2 Q2 = Im . We pro eed by indu tion; let suppose that Q from (3.5) is orthogonal; then, Q0 = QQp+1, where Qp+1 is orthogonal, is orthogonal, by a
similar argument. 2 Two kinds of orthogonal matri es are important in LLS problem: a) Householder matri es or elementary re e tors, and b) Givens matri es or plane rotations. Let us introdu e them and stress their main properties. 3.2.1
Householder transformations
An elementary re e tor of order m (or Householder transformation) is a matrix U 2 Rmm , of the form: 1 uuT (3.6) U = Im
De nition 3.12
where u 2 Rm is a nonzero ve tor and = kuk2 =2: When ui = 0 for i = 1 : s 1, then the elementary re e tor is alled of index s. Obviously, an elementary re e tor Us of order m and index s has the following stru ture: " # 0 I Us = s0 1 U ; where U 2 R(m s+1)(m s+1) is an elementary re e tor of order m s + 1. Some simple onsequen es of de nition 3.12 are ontained in the following
63
64
CHAPTER 3.
Theorem 3.13 If U orthogonal.
THE LINEAR LEAST SQUARES PROBLEM
2 Rmm is an elementary re e tor, then U is symmetri and
From (3.6), U T = U , i.e. U is symmetri . Moreover: 1 1 2 uuT + 1 u(uT u)uT = Im ; T 2 T T U U = U = Im uu uu = Im Im Proof.
2
sin e uT u = kuk2 = 2 . So that, an elementary re e tor is its own inverse. 2 The ru ial property of elementary re e tors is that they an be used to introdu e zeros into a ve tor. Spe i ally, given a ve tor x 2 Rm , one an nd an elementary re e tor Us of index s su h that (Us x)i = 0, for i = s + 1 : m. The re ipe for determining the ve tor us whi h de ne su h a re e tor is given in the following theorem: Theorem 3.14 Let s 2 1 : m
1 and a ve tor x 2 Rm su h that 2 =
Then the ve tor us 2 Rm given by:
m X i=s
x2i 6= 0:
(3.7)
0; for i = 1 : s 1 uis (us )i = : xs + ; for i = s xi ; for i = s + 1 : n 8 <
(3.8)
de nes an elementary re e tor of order m and index s
Us = Im su h that
Proof.
1 u uT ; s s
s
s = kus k2 =2;
(3.9)
for i = 1 : s 1 (Usx)i = : ; for i = s 0; for i = s + 1 : n 8 < xi ;
Let us ompute
yi (Us x)i =
Im
From (3.7) and (3.8): uTs x =
m X i=1
1 usuT x = xi 1 usuT x = xi 1 uis(uT x): s s i s s s s i
uis xi = xs (xs + ) +
m X i=s+1
x2i = 2 + xs
(3.10)
3.2.
ORTHOGONALITY. HOUSEHOLDER AND GIVENS TRANSFORMATIONS
and s =
0
1
1 2 1 2 2A 2 i=s uis = 2 (xs + ) + i=s+1 xi = m X
m X
0
1
m X = 21 x2s + 2xs + 2 + x2i A = 2 + xs : i=s+1
Thus, from (3.8) and (3.10) we obtain 8 < xi ;
yi = xi uis = : ; 0;
for i = 1 : s 1 for i = s for i = s + 1 : n
and the proof is ompleted. 2 The theorem 3.14 almost gives an algorithm for omputing us and s whi h de ne the re e tor (3.9). The onstant is determined up to its sign by the fa t that Us is orthogonal (see (3.7)). In order to hoose the sign of , observe that the formulas for omputing uss (see (3.8)) and s: s = ( + xs)
give a better pre ision if and xs have the same sign, avoiding an ellation of signi ant digits in oating point arithmeti . As we shall see, in appli ations it is often the ase that having formed the ve tor us , we need no longer the zeroed omponents of x. But the last n s omponents of us are identi al with the last n s omponents of x; hen e, only the s-th omponent of x needs to be altered in overwriting x(s : m) by the nonzero elements of us. All these onsiderations may be summed up in the following algorithm. (Given s 2 1 : m 1 and a m-ve tor x, this algorithm returns , s and us su h that [(I (2= s )us uTs )x℄i = 0, for i = s + 1 : m. The nonzero omponents of us overwrite those of x.) Algorithm 3.15
1. 2. xs (2'. xi 3. s
qP
m x2 sign(xs ) i=s i uss = xs + uis = xi , for i = s + 1 : m) uss
65
66
CHAPTER 3.
3.2.2
THE LINEAR LEAST SQUARES PROBLEM
Givens transformations
De nition 3.16
A matrix Pij 2 Rmm of the form: 2
Pij =
6 6 6 6 6 4
Ii 1
d
Ij
i
where ; d 2 R and
1
3
d
In
7 7 7 7 7 5
j
2 + d 2 = 1 is alled a Givens transformation or plane rotation of order m in the (i; j ) plane.
The main properties of the Givens transformations are given by the following Theorem 3.17 1. The Givens transformations are orthogonal, i.e. PijT Pij = Pij PijT
= Im:
2. If x 2 Rm is a ve tor with x2i + x2j 6= 0, then there exists a Givens transformation de ned by: x x d= q 2j 2
= q 2 i 2; xi + xj xi + xj su h that
8 xk ;
yk (Pij x)k = :
for k 6= i; j
x2i + x2j ; for k = i
(3.11)
0; for k = j . Proof. 1. Obvious, by dire t al ulation. 2. To prove (3.11), it is suÆ ient to solve the system (Pij x)i = xi + dxj = (Pij x)j = dxi + xj = 0
2 + d 2 = 1 for the unknowns , d and . 2 To see that Givens transformations oer an alternative to elementary re e tors, let us observe that a sequen e of plane rotations of the form Ps = Psm : : : Ps;s+2 Ps;s+1 may be used for zeroing the s +1 : m elements of a ve tor x 2 Rm , as an elementary re e tor does in theorem 3.14.
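Minimal Python sketches of the two tools of this section may be useful; the function names and the 0-based indexing are ours, not the book's packed-storage routines, and the trailing part of the vector is assumed nonzero.

import numpy as np

def householder(x, s):
    """Reflector of index s (0-based): zeroes x[s+1:]."""
    sigma = np.linalg.norm(x[s:])
    if x[s] < 0.0:
        sigma = -sigma                  # same sign as x_s, to avoid cancellation
    u = np.zeros_like(x, dtype=float)
    u[s] = x[s] + sigma
    u[s+1:] = x[s+1:]
    beta = sigma * u[s]                 # beta = ||u||^2 / 2
    return u, beta

def apply_reflector(u, beta, y):
    return y - (u @ y) / beta * u       # (I - u u^T / beta) y

def givens(xi, xj):
    """Rotation parameters (c, d), c^2 + d^2 = 1, zeroing the j-th entry."""
    r = np.hypot(xi, xj)
    return xi / r, xj / r

x = np.array([3.0, 1.0, 5.0, 1.0])
u, beta = householder(x, 1)
print(apply_reflector(u, beta, x))      # [3, -sqrt(27), ~0, ~0]

c, d = givens(x[1], x[3])
y = x.copy()
y[1], y[3] = c*x[1] + d*x[3], -d*x[1] + c*x[3]
print(y)                                # [3, sqrt(2), 5, 0]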
3.3 Orthogonal triangularization. QR factorization

Solving the LLS problem in a numerically stable way requires a procedure for reducing a matrix to triangular form by orthogonal transformations. As the next theorem states, this is always possible. We shall use elementary reflectors for such a reduction.
Theorem 3.18 Let A 2 Rmn ; then, there exists an orthogonal matrix U su h that UA = R is upper triangular.
2 Rmm
(3.12)
Proof. Note rst that when m < n we shall use the term "upper triangular" for an "upper trapezoidal" form. To pre ise the index information, suppose, rst, that m > n. The proof will be onstru tive, leading to an orthogonal triangularization pro edure. Let onsider a olumn partition of the matrix A: A = [a1 a2 : : : an ℄ and let A1 = A. We pro eed by indu tion. Step 1. If ai1 , i = 2 : m, are not all zero, there exists an elementary re e tor of order m and index 1, U1 2 Rmm (see theorem 3.14), su h that (U1 a1 )i = 0; i = 2 : m: Then the matrix 3 2
A2 = U1 A1 = [U1 a1 U1 a2 : : : U1 an ℄ =
6 6 6 6 6 6 6 4
(2) a(2) 11 a12 : : : 0 a(2) 22 : : : (2) 0 a32 : : : ::: (2) 0 am2 : : :
a(2) 1n a(2) 2n a(2) 3n
a(2) mn
7 7 7 7 7: 7 7 5
has all the rst olumn underdiagonal elements equal to zero. If ai1 = 0, for i = 2 : m, then we take simply U1 = Im and pro eed to the next step. (The unity matrix Im is orthogonal !) Step s. Let suppose that in the rst s 1 steps of the pro edure we have obtained the matrix 3 2 a(11s) a(12s) : : : a(1ss) : : : 6 6 a(22s) : : : a(2ss) : : : 6 6 ... 6 ::: 6 ( s) As = Us 1 : : : U2 U1 A = 66 ass : : : 6 ) ::: 6 0 a(ss+1 6 ;s 6 ::: 4 ( s) ams : : :
a(1sn) a(2sn)
7 7 7 7 7 7 ( s) 7 : asn 77 ) 7 a(ss+1 ;n 7 7 5
s) a(mn
68
CHAPTER 3.
THE LINEAR LEAST SQUARES PROBLEM
Now, if a(iss) , i = s + 1 : m, are not all zero, there exists an elementary re e tor of order m and index s, Us 2 Rmm , su h that (see theorem 3.14): (Usa(ss) )i = 0; i = s + 1 : m: (Otherwise we take Us = Im and go to the next step.) Taking into a
ount that any re e tor of index s does not hange a ve tor whose last m s elements are zero (Usx = x (1= s )us(uTs x) = x, be ause uTs x = 0), it results that the matrix: As+1 = Us As = [Us a(1s) Usa(2s) : : : Us a(ns) ℄
has the rst s 1 olumns un hanged and in the s-th olumn the underdiagonal elments are zero. Thus, the pro edure of zeroing subdiagonal elements an be started, as in step 1, and, on e started, it an be ontinued, as in step s. So, when m > n, after n steps, we obtain an upper triangular matrix: R = An+1 = Un Un 1 : : : U2 U1 A =
"
#
R0 ; 0
where R0 2 Rnn is a square upper triangular matrix. The matrix: U
= UnUn
1 : : : U2 U1
(3.13)
is orthogonal, sin e it is a produ t of orthogonal matri es, and, hen e, we have obtained the triangularization (3.12). When m n, the same result is obtained after m 1 steps. The proof is
omplete. 2 In order to implement the above pro edure su h that the matrix R overwrites the matrix A we an use the following s heme: 1. for s = 1 : min(m 1; n) 1. Determine Us su h that (Usas)i = 0, for i = s + 1 : m 2. A UsA In order to ompute Us (i.e. the ve tor us and the s alar s), we shall use the algorithm 3.15. Observe that an e onomi al way to store us is to use the zeroed elements memory lo ations for the storage of uis, i = s + 1 : m. For uss and s one
an use two additional rows (m + 1 and m + 2) of the array A, like in the following
3.3.
ORTHOGONAL TRIANGULARIZATION.
QR FACTORIZATION
69
diagram for the situation after the s-th step: 2 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4
To ompute A
r11 u21
r12 : : : r1s r22 : : : r2s
...
...
...
...
us2 : : : rss rs;s+1 : : : +1) : : : us+1;1 us+1;2 : : : us+1;s a(ss+1 ;s+1 us1
...
...
um1 u11 1
...
um2 : : : ums u22 : : : uss 2 : : : s
...
s+1) a(m;s +1 : : :
Us A, we must ompute only aj Us aj = (Im
with
r1;s+1 : : : r1n r2;s+1 : : : r2n
=
=
m X k=s
7 7 7 7 7 7 rsn 77 +1) 7 a(ss+1 ;n 7 7 7 7 ( s+1) 7 amn 77 7 5
... ...
Us aj , for j = s +1 : m.
(1= s )usuTs )aj = aj
uTs aj = s
3
But
us
!
uks akj = s
and the rst s 1 elements of the olumn aj remain un hanged be ause uis = 0, for i = 1 : s 1. En apsulating the above ideas we obtain the following algorithm. In the algorithm, the statement "pass to next step" is used to indi ate that for the urrent value of the for loop ounter, the instru tions of the(s) y le body that follow this statement are not exe uted. This is the ase when ais = 0, for i = s : m (in su h
ases, Us = Im). (ORTHT { orthogonal triangularization) (Given a matrix A 2 Rmn , this algorithm overwrites the matrix A by the upper triangular matrix R and the ve tors us and s alars s whi h de ne the elementary re e tors Us, s 2 1 : p, p = min(m 1; n), su h that R = Un Un 1 : : : U2 U1 A.) 1. p = min(m 1; n) 2. for s = 1 : p q 1. sign(ass) Pmi=s a2is 2. if = 0 then am+1;s 0, am+2;s 0, pass to next step 3. am+1;s uss = ass + 4. am+2;s uss 5. ass rss = 6. for j = s + 1 : n Algorithm 3.19
      1. τ ← (Σ_{k=s}^{m} u_ks a_kj) / β_s
      2. for i = s : m
         1. a_ij ← a_ij − τ u_is

The complexity of the above algorithm is given by N_ORTHT = 4(mnp − (m+n)p²/2 + p³/3) flops and M_ORTHT ≈ mn floating point locations. If m ≥ n, the most frequently occurring case, the algorithm requires about 2(mn² − n³/3) flops. Moreover, it can be shown that the algorithm ORTHT is numerically stable. Note that the algorithm does not compute explicitly the orthogonal matrix U of (3.13), but only stores all the elements necessary to do it. An immediate application of algorithm ORTHT is the computation of the so-called QR factorization of the matrix A. The definition and the existence of the QR factorization are given by the following
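For illustration, here is a compact dense numpy sketch (ours, not the packed-storage ORTHT routine) of the orthogonal triangularization R = U_p ... U_1 A from theorem 3.18, keeping the reflector data in a Python list.

import numpy as np

def orthogonal_triangularization(A):
    R = np.array(A, dtype=float)
    m, n = R.shape
    reflectors = []
    for s in range(min(m - 1, n)):
        sigma = np.linalg.norm(R[s:, s])
        if sigma == 0.0:
            reflectors.append(None)           # U_s = I_m, pass to next step
            continue
        if R[s, s] < 0.0:
            sigma = -sigma
        u = R[s:, s].copy()
        u[0] += sigma                         # u_ss = a_ss + sigma
        beta = sigma * u[0]                   # beta_s = ||u_s||^2 / 2
        reflectors.append((u, beta))
        R[s:, s:] -= np.outer(u, (u @ R[s:, s:]) / beta)   # A <- U_s A
    return reflectors, np.triu(R)

A = np.random.default_rng(2).standard_normal((6, 4))
_, R = orthogonal_triangularization(A)
print(np.allclose(R.T @ R, A.T @ A))          # U orthogonal implies A^T A = R^T R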
alled QR fa torization of the matrix A. The de nition and the existen e of the QR fa torization are given by the following p2
3
Theorem 3.20 If A 2 Rmn , has linearly independent olumns, then it an be written uniquely in the form A = QR; (3.14)
where Q 2 Rmn has orthonormal olumns and R 2 Rnn is upper triangular with positive diagonal elements. The matrix fa torization (3.14) is alled QR fa torization. Proof. For for all x 6= 0,
all x 2 Rn , x 6= 0, we have Ax 6= 0. Hen e, xT AT Ax = kAxk2 > 0 i.e. the matrix B = AT A is positive de nite. Then, by Cholesky fa torization, there exists a unique upper triangular matrix with positive diagonal elements su h that B = RT R. Let Q = AR 1. Then QT Q = R T AT AR 1 = R T RT RR 1 = In , i.e. Q has orthogonal olumns and (3.14) holds. The uniqueness of the QR fa torization results from the uniqueness of the Cholesky fa torization. 2
The use of the Cholesky fa torization for omputing the matri es Q and R from (3.14) is not re ommended be ause of the ill numeri al ondition of matrix B = AT A. A mu h better method is based on the orthogonal triangularization (3.12). The matrix A having linearly independent olumns and m n, (3.12) may be written: A=
UT R
=
"
0 Q0 Q00 R
#
0 0 0 =QR;
(3.15)
where Q0 2 Rmn and R0 2 Rnn. This relation is the QR fa torization (3.14) up to the signs of the diagonal elements of R.
3.4.
71
SOLVING THE LINEAR LEAST SQUARES PROBLEM
The matrix Q0 an be omputed from: "
Q0 = Q In
#
"
In
#
"
0 = 0 = (Up : : : U2 U1) using the following omputational s heme: Q0 =
"
In
UT
T
In
#
"
In
0 = U1U2 : : : Up 0
#
#
1. 0 2. for s = p : 1 : 1 1. Q0 UsQ0 However, it should be stressed that in most appli ations, the matrix Q U T or the matrix Q0 from (3.15) is not needed; it is rather suÆ ient to know the fa tored form (3.13) of U . So that, writing an expli it algorithm to implement the above s heme remain an exer ise to the reader. 3.4 Solving the linear least squares problem Let return to the LLS problem of minimizing the eu lidian norm of the overdetermined linear system residual: min krk = xmin kb Axk; (3.16) x2R 2R where A 2 Rmn , m > n, and b 2 Rm are given. First, we are interested to establish the onditions under whi h the LLS problem has a solution and when this solution is unique. We shall develop furtherly an algorithm to ompute this (pseudo)solution of an overdetermined linear system. n
n
Theorem 3.21 Let A 2 Rmn and b 2 Rm be given. The linear least squares problem (3.16) always has a solution. The solution is unique if and only if the matrix has linearly independent olumns. Proof. Let S = ImA and T = S ? ve tor b an be uniquely written as:
its orthogonal omplement in Rm . Then, the
b = b1 + b2 ; b1 2 S ; b2 2 T :
(3.17) Obviously, the norm being a positive fun tion of its argument, (x) kb Axk is minimum i (x) = 2(x) is minimum. Also, b1 2 S , Ax 2 S for all x imply that b1 Ax 2 S ; hen e, b2 and b1 Ax are orthogonal (see gure 3.2). Then, applying theorem 3.6, 2 (x) = kb Axk2 = kb1 Ax + b2 k2 = kb1 Axk2 + kb2 k2 kb2 k2 :
72
CHAPTER 3.
b2
S = ImA
THE LINEAR LEAST SQUARES PROBLEM
CO C 6 b C CC b Ax C C b1 1 SoS bC1C Ax XXX S C X zSC Ax XXXX T = S?
Figure 3.2: Illustration for theorem 3.21. Thus, 2(x) will be minimum when kb1 Axk2 is minimum. But b1 2 S = ImA, so that we an always nd a x 2 Rn satisfying Ax = b1 : (3.18) For this x , kb1 Ax k2 = 0, whi h is ertainly a minimum, and min kb Axk = kb Axk = kb2 k: (3.19) x2R n
As we an see in the gure, (3.19) expresses the well known fa t that the minimum distan e from a point to a subspa e is given by the eu lidian length of the perpendi ular from the point to the subspa e. Now, from (3.18), the solution x is unique i A(x x0) = 0 imply x = x0 , i.e. KerA = f0g, or, equivalently, A has linearly independent olumns. 2 If A 2 Rmn has linearly independent olumns, then the LS solution of the overdetermined system Ax = b an be written in the form: x = A+ b; (3.20) where the matrix A+ = (AT A) 1 AT (3.21) is alled the pseudo-inverse or the Moore-Penrose generalized inverse of A. Corollary 3.22
As the matrix A has linearly independent olumns, the matrix B = AT A is nonsingular. Hen e, AT r = AT (b Ax ) = AT b2 , where b2 2 T ?S = ImA is Proof.
3.4.
SOLVING THE LINEAR LEAST SQUARES PROBLEM
73
de ned in (3.17). But b2 ?ImA implies b2Av = 0 for all v 2 Rn, parti ularly for v = e1 ; e2 ; : : : ; en . Hen e, bT2 A = 0, or AT b2 = 0 (or b2 2 ImA), and AT (b Ax ) = 0, i.e. the LS solution of the overdetermined linear system Ax = b is the lassi al solution of the determined nonsingular system AT Ax = AT b (3.22) whi h gives (3.20) and (3.21). 2 The linear system (3.22) is alled the system of normal equations for the LLS problem. The omputation of the LLS solution by solving the system of normal equations (3.22) is not re ommended, be ause of the ill numeri al ondition of the matrix B = AT A. A numeri ally stable way to ompute the LLS solution is by the mean of the orthogonal triangularization algorithm ORT HT . Indeed, the orthogonal transformations onserving the eu lidian norm, for all orthogonal matri es U 2 Rmm we have: kb Axk = kU (b Ax)k = kd Rxk; (3.23) where d = Ub; R = UA: (3.24) If the matrix U is that from theorem 3.21, then R is an upper triangular matrix, i.e. R an be written in the form " 0 # R R= 0 ; with R0 a square n n upper triangular matrix. R has linearly independent olumns be ause A has the same property. Hen e, R0 is a nonsingular matrix. Now, using the partition # " 0 d (3.25) d = d00 ; where d0 2 Rn and d00 2 Rm n , relation (3.23) an be written: kb Axk =
"
d0 R0 x d00
#
q
= kd0
R0 xk2 + kd00 k2 ;
whi h is minimum when d0 R0x = 0. Hen e, the LLS solution of the overdetermined system Ax = b an be omputed as the solution of the upper triangular nonsingular system: R 0 x = d0 :
74
CHAPTER 3.
THE LINEAR LEAST SQUARES PROBLEM
(Note that the same result is immediately obtained by substituting the QR fa torization of A, A = Q0R0, in the normal system (3.22): (R0)T (Q0 )T Q0R0x = (R0)T (Q0)T b, whi h implies R0x = (Q0)T b d0.) From (3.24), (3.25) and (3.13) it results that: d0 = [In 0℄d = [In 0℄Ub = [In 0℄Un : : : U2 U1 b: Thus, the ve tor d0 an be omputed using the s heme: 1. d b 2. for s = 1 : n 1. d Usd 0 3. d d(1 : n) The minimal residual ve tor r = b Ax = b2 has the signi an e of the orthogonal proje tion of the ve tor b onto the linear subspa e T = S ? = (ImA)T = KerAT , while the ve tor b1 = Ax has the signi an e of the proje tion of b onto the linear subspa e S = ImA. The two orthogonal proje tions of the ve tor b onto ImA and KerAT are best
omputed with the expressions: b1 = U1 U2 : : : Un
"
d0
#
0
#
0
"
b2 = U1 U2 : : : Un d00
whi h result from
"
#
"
#!
0 b= = 0 + d00 = b1 + b2 : Summarizing the above ideas, we obtain the following: UT d
UT
d0
Algorithm 3.23 (LLS { linear least squares) (Given a matrix A 2 Rmn , m > n, with linearly independent olumns, and a ve tor b 2 Rn ,
this algorithm omputes the LLS solution of the overdetermined linear system Ax = b. Also, the algorithm omputes the orthogonal proje tions b1 and b2 onto the linear subspa es ImA and KerAT , respe tively.) f The orthogonal triangularization of A g 1. [Us; s = 1 : n; R℄ = ORT HT (A) f Cal ulation of d, overwriting the ve tor b g 2. for s = 1 : n
3.5.
SOLVING UNDERDETERMINED LINEAR SYSTEMS
75
1. (Pmk=s uksbk ) = s 2. for i = s : m 1. bi bi uis f Computation of the LLS solution g 3. x = UT RIS (R; b) f Computation of the proje tions g 4. b1(1 : n) b(1 : n); b1(n + 1 : m) 0 5. b2(1 : n) 0; b2(n + 1 : m) b(n + 1 : m) 6. for s = n P : m1 : 1 1. ( k=s uksb1k ) = s 2. for i = s : m 1. b1Pi m b1i uis 3. ( k=s uksb2k ) = s 4. for i = s : m 1. b2i b2i uis The main omputational eort is onsumed for orthogonal triangularization. Hen e, the asymptoti number of oating point operations is the same as in ORT HT . Algorithm 3.23 is a very stable way of solving the LLS problem and, if ne essary, of omputing the orthogonal proje tions of a given ve tor onto two omplementary linear subspa es de ned by a given matrix. 3.5 Solving underdetermined linear systems The linear system Ax = b, where A 2 Rmn and b 2 Rn, with more unknowns than equations (n > m), have, generally, an in nite number of solutions. A natural
riterion, in the least square sense, for the sele tion of a solution, is its eu lidian norm. The solution of minimum eu lidian norm is alled normal solution of Ax = b. The onditions for the existen e and uniqueness of the normal solution are given by: Theorem 3.24 If the matrix A 2 Rmn , m < n, has linearly independent rows, then the system Ax = b has a unique normal solution, i.e. a unique ve tor x 2 Rn su h that: kx k = min kxk: Ax=b
The normal solution an be written in the form: x = A# b; where A# = AT (AAT ) 1 is alled the normal pseudo-inverse of A.
(3.26) (3.27)
76
CHAPTER 3.
THE LINEAR LEAST SQUARES PROBLEM
Proof. We shall give a constructive proof, which will lead us to an algorithm for computing the normal solution. If $A$ has linearly independent rows, then its transpose $A^T$ has linearly independent columns. Theorem 3.18 says that there exists an orthogonal matrix $U$ such that
$$U A^T = \begin{bmatrix} R \\ 0 \end{bmatrix},$$
where $R$ is a square $m \times m$ nonsingular upper triangular matrix; then
$$A = [R^T \; 0]\, U = [L \; 0]\, U, \qquad (3.28)$$
where $L = R^T$ is a nonsingular lower triangular $m \times m$ matrix. Thus, the system $Ax = b$ becomes
$$[L \; 0]\, U x = b. \qquad (3.29)$$
Denoting $Ux = y$ and partitioning
$$y = \begin{bmatrix} y' \\ y'' \end{bmatrix},$$
where $y' \in \mathbb{R}^m$ and $y'' \in \mathbb{R}^{n-m}$, (3.29) can be written in the form $L y' = b$. Hence, the set of all solutions of the system $Ax = b$ is
$$X = \left\{ x \in \mathbb{R}^n \;\Big|\; x = U^T \begin{bmatrix} L^{-1} b \\ y'' \end{bmatrix},\ y'' \in \mathbb{R}^{n-m} \right\}.$$
Since orthogonal transformations conserve the euclidean norm, for all $x \in X$:
$$\|x\| = \left\| U^T \begin{bmatrix} L^{-1} b \\ y'' \end{bmatrix} \right\| = \left\| \begin{bmatrix} L^{-1} b \\ y'' \end{bmatrix} \right\| = \sqrt{\|L^{-1} b\|^2 + \|y''\|^2}.$$
Hence
$$\min_{Ax = b} \|x\| = \min_{x \in X} \|x\| = \|L^{-1} b\|$$
and the unique $x^* \in X$ having this property is
$$x^* = U^T \begin{bmatrix} L^{-1} b \\ 0 \end{bmatrix}, \qquad (3.30)$$
the normal solution of the system $Ax = b$. In order to show that the solution (3.30) can be written in the form (3.26), (3.27), let us observe that, using (3.28), we have
$$A^T = U^T \begin{bmatrix} L^T \\ 0 \end{bmatrix}$$
and
$$(A A^T)^{-1} = \left( [L \; 0]\, U U^T \begin{bmatrix} L^T \\ 0 \end{bmatrix} \right)^{-1} = (L L^T)^{-1} = L^{-T} L^{-1}.$$
Hence
$$A^{\#} b = A^T (A A^T)^{-1} b = U^T \begin{bmatrix} L^T \\ 0 \end{bmatrix} L^{-T} L^{-1} b = U^T \begin{bmatrix} L^{-1} b \\ 0 \end{bmatrix} = x^*,$$
which completes the proof. $\Box$

The computation of the normal solution follows the proof above.

Algorithm 3.25 (USS -- underdetermined system solving) (Given a matrix $A \in \mathbb{R}^{m\times n}$, $m < n$, with linearly independent rows, and a vector $b \in \mathbb{R}^m$, this algorithm computes the normal solution $x^* \in \mathbb{R}^n$ of the underdetermined linear system $Ax = b$, i.e. the solution of minimum euclidean norm.)
{ The orthogonal triangularization of $A^T$ }
1. $[U_s,\ s = 1:m;\ R] = ORTHT(A^T)$
{ Solve the lower triangular system $R^T y' = b$, with $y'$ overwriting $b$ }
2. solve $R^T y' = b$ by forward substitution and overwrite $b \leftarrow y'$
{ Computation of the normal solution from (3.30) }
3. $x(1:m) \leftarrow b$;  $x(m+1:n) \leftarrow 0$
4. for $s = m:-1:1$
   1. $\sigma \leftarrow \big(\sum_{k=s}^{n} u_{ks} x_k\big)/\beta_s$
   2. for $i = s:n$
      1. $x_i \leftarrow x_i - \sigma u_{is}$
Statement 4 implements (3.30) written in the form
$$x^* = U_1 U_2 \cdots U_m \begin{bmatrix} y' \\ 0 \end{bmatrix}$$
by the following scheme:
1. $x \leftarrow \begin{bmatrix} y' \\ 0 \end{bmatrix}$
2. for $s = m:-1:1$
   1. $x \leftarrow U_s x$

Most of the floating point operations are consumed in the first statement, for the orthogonal triangularization. From the numerical point of view, the above algorithm is the best one for computing the normal solution of an underdetermined system with a full rank matrix.
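The same computation can be sketched with library routines; the fragment below is an illustration under the assumption that NumPy/SciPy replace the Householder routine ORTHT of the text (function names are ours):

```python
import numpy as np
from scipy.linalg import solve_triangular

def min_norm_solution(A, b):
    """Sketch of Algorithm USS: factor A^T = U^T [R; 0], solve the lower
    triangular system R^T y' = b, then map back with the orthogonal factor."""
    Q, R = np.linalg.qr(A.T)                     # A^T = Q R, Q is n x m, R is m x m
    y1 = solve_triangular(R.T, b, lower=True)    # R^T y' = b  (forward substitution)
    return Q @ y1                                # x* = Q y', the minimum-norm solution

A = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]])
b = np.array([3.0, 2.0])
x = min_norm_solution(A, b)   # equals A.T @ inv(A @ A.T) @ b
```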
Problems

P 3.1 Let $H \in \mathbb{R}^{m\times n}$ ($m > n$) be an upper Hessenberg matrix ($h_{ij} = 0$ for $i > j+1$). Give algorithms for:
a. computing the QR factorization of $H$;
b. solving the overdetermined linear system $Hx = b$, where $b \in \mathbb{R}^m$ is given.
Use Householder and Givens transformations and compare the two versions.

P 3.2 The same problem, when $H \in \mathbb{R}^{m\times n}$ is lower Hessenberg ($h_{ij} = 0$ for $j > i+1$).

P 3.3 a. Adapt the orthogonal triangularization algorithm ORTHT to the matrix $A = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}$, where $A_1 \in \mathbb{R}^{n\times n}$ is upper triangular and $A_2 \in \mathbb{R}^{(m-n)\times n}$.
b. The same when $A_2$ is also upper triangular.

P 3.4 Show that in (3.15) the columns of $Q' \in \mathbb{R}^{m\times n}$ form an orthogonal basis for $\mathrm{Im}\,A$ and the columns of $Q'' \in \mathbb{R}^{m\times(m-n)}$ form an orthogonal basis for $\mathrm{Ker}\,A^T$.

P 3.5 Show that a triangular orthogonal matrix is diagonal.

P 3.6 Consider a vector $x \in \mathbb{R}^m$ of unit norm. Find a matrix $Y \in \mathbb{R}^{m\times(m-1)}$ with orthonormal columns such that $x^T Y = 0$ (i.e. $[x \; Y]$ orthogonal).

P 3.7 Let $A \in \mathbb{R}^{m\times n}$, $m > n$, be a rank $n$ matrix whose QR factorization is known ($A = QR$, see (3.15)). Let $\tilde A = [A \; z]$, where $z \in \mathbb{R}^m$. Give an algorithm to compute the QR factorization of $\tilde A$.

P 3.8 Let $A \in \mathbb{R}^{m\times n}$, $m \ge n$, be a rank $n$ matrix whose QR factorization is known ($A = QR$, see (3.15)). Let $\tilde A = \begin{bmatrix} w^T \\ A \end{bmatrix}$, where $w \in \mathbb{R}^n$ is a vector. Show that the columns of $\tilde A$ are linearly independent. Give an algorithm to compute the QR factorization of $\tilde A$.

P 3.9 Let $A \in \mathbb{R}^{m\times 1}$ be a "column" matrix, $A \neq 0$. Give an algorithm to compute the pseudo-inverse of $A$.

P 3.10 Let $u, v \in \mathbb{R}^n$ be two vectors of equal norm, $\|u\| = \|v\|$. Give algorithms for:
a. computing an orthogonal matrix $Q \in \mathbb{R}^{n\times n}$ such that $Qu = v$;
b. computing an orthogonal matrix $W \in \mathbb{R}^{n\times n}$ such that $Wu \perp v$.

P 3.11 Let $a, b \in \mathbb{R}^n$ be two non-colinear vectors.
a. Give an algorithm for computing the scalar $x \in \mathbb{R}$ such that $\|ax - b\|$ is minimum. Give a geometric interpretation of $x$.
b. Give an algorithm for computing the projection of $a$ onto the direction of $b$.
Chapter 4
Iterative techniques in matrix algebra

The iterative methods are based on the step by step construction of a sequence convergent to the solution of a given numerical problem. A certain criterion is used for ending the iterations and the result is a truncated sequence, with the last term considered as a satisfactory approximation of the solution. In this chapter we shall be exclusively concerned with iterative methods for solving determined linear systems. Since the time required to reach sufficient accuracy with iterative techniques exceeds that required by direct techniques such as Gaussian elimination, iterative methods are seldom used for solving linear systems of small dimension. However, for large systems with a high percentage of zero entries these techniques prove efficient in terms of computer storage and time requirements. Systems of this type arise frequently in the numerical solution of boundary-value problems and partial differential equations. In order to appreciate the properties of such iterative techniques we first need some results on the convergence of vector and matrix sequences.

4.1 Vector and matrix sequences. Convergence

A vector or matrix norm (see Section 2.11.1) can be interpreted as a measure of vector or matrix "size", or as a measure of the distance between the vector (the matrix) and the origin of the corresponding space. Consequently, a vector norm on $\mathbb{R}^n$ allows us to introduce the concept of distance on $\mathbb{R}^n$.

Definition 4.1 Let $\|\cdot\| : \mathbb{R}^n \to \mathbb{R}_+$ be a vector norm on $\mathbb{R}^n$. Then the function
$$d : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}_+ \qquad (4.1)$$
defined by
$$d(x, y) = \|x - y\| \qquad (4.2)$$
is called a distance on $\mathbb{R}^n$ induced by the norm $\|\cdot\|$.

Example 4.2 The euclidean distance between two vectors $x, y \in \mathbb{R}^n$ is
$$d_2(x, y) = \|x - y\|_2 = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}. \qquad (4.3)$$
The distance induced by the 1-norm is
$$d_1(x, y) = \|x - y\|_1 = \sum_{i=1}^{n} |x_i - y_i| \qquad (4.4)$$
and the distance induced by the $\infty$-norm is defined by
$$d_\infty(x, y) = \|x - y\|_\infty = \max_{i=1:n} |x_i - y_i|. \qquad (4.5)$$
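A small NumPy illustration of the three induced distances (4.3)-(4.5) (purely illustrative, not part of the text's algorithms):

```python
import numpy as np

x = np.array([1.0, -2.0, 0.5])
y = np.array([0.0,  1.0, 0.5])

d2   = np.linalg.norm(x - y)              # euclidean distance, eq. (4.3)
d1   = np.linalg.norm(x - y, ord=1)       # 1-norm distance,    eq. (4.4)
dinf = np.linalg.norm(x - y, ord=np.inf)  # infinity-norm distance, eq. (4.5)
print(d2, d1, dinf)
```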
Definition 4.3 The distance between two matrices $A, B \in \mathbb{R}^{m\times n}$ is
$$d(A, B) = \|A - B\|, \qquad (4.6)$$
where $\|\cdot\|$ is an appropriate matrix norm.

The concept of distance on $\mathbb{R}^n$ can be used to define the limit of a sequence of vectors in $\mathbb{R}^n$. Similarly, the distance on $\mathbb{R}^{m\times n}$ allows us to introduce the limit of a matrix sequence.

Definition 4.4 A sequence $(x^{(k)})_{k\in\mathbb{N}}$ of vectors in $\mathbb{R}^n$ is called convergent to $x \in \mathbb{R}^n$, with respect to the norm $\|\cdot\|$, if and only if the sequence of real numbers $(\delta_k)_{k\in\mathbb{N}}$ defined by $\delta_k = \|x^{(k)} - x\|$ converges to zero. Similarly, a matrix sequence $(A^{(k)})_{k\in\mathbb{N}}$ with $A^{(k)} \in \mathbb{R}^{m\times n}$ is called convergent to $A \in \mathbb{R}^{m\times n}$, with respect to the matrix norm $\|\cdot\|$, if and only if the sequence of real numbers $(\gamma_k)_{k\in\mathbb{N}}$ defined by $\gamma_k = \|A^{(k)} - A\|$ converges to zero.

In studying iterative matrix techniques, it is of particular interest to know when the powers of a matrix become small, i.e., when all the entries of $A^k$ approach zero. Matrices of this type are called convergent.
Definition 4.5 A square matrix $A \in \mathbb{R}^{n\times n}$ is called convergent if
$$\lim_{k\to\infty} (A^k)_{ij} = 0, \quad i = 1:n,\ j = 1:n, \qquad (4.7)$$
or, equivalently,
$$\lim_{k\to\infty} A^k = 0. \qquad (4.8)$$

In order to derive the conditions under which a square matrix is convergent, we must introduce the concepts of eigenvalue and spectral radius of a (square) matrix. These concepts are very important in a more general framework and we shall pay special attention to them in a later chapter. For the moment we need only a few basic facts about them.

Definition 4.6 If $A \in \mathbb{R}^{n\times n}$ (or $A \in \mathbb{C}^{n\times n}$), the polynomial of degree $n$
$$p(\lambda) = \det(\lambda I - A) \qquad (4.9)$$
is called the characteristic polynomial of $A$. The $n$ zeros (over $\mathbb{C}$) of the characteristic polynomial are called the eigenvalues of the matrix $A$.

Definition 4.7 Let $A \in \mathbb{R}^{n\times n}$ (or $A \in \mathbb{C}^{n\times n}$). If $\sigma(A) = \{\lambda_1, \lambda_2, \ldots, \lambda_n\}$ is the set of its eigenvalues, then the spectral radius $\rho(A)$ of the matrix $A$ is defined by the nonnegative number
$$\rho(A) = \max_{i=1:n} |\lambda_i|. \qquad (4.10)$$

An important connection between the spectral radius of a matrix and the convergence of the matrix is stated subsequently.

Theorem 4.8 A square real matrix $A \in \mathbb{R}^{n\times n}$ is convergent (i.e. $\lim_{k\to\infty} A^k = 0$) if and only if the spectral radius of the matrix $A$ satisfies
$$\rho(A) < 1. \qquad (4.11)$$
The proof can be found in [?]. $\Box$
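The criterion of Theorem 4.8 is easy to test numerically; the helper below is only an illustration (its name is ours):

```python
import numpy as np

def is_convergent(A):
    """Theorem 4.8: A^k -> 0 if and only if rho(A) = max |lambda_i| < 1."""
    return np.max(np.abs(np.linalg.eigvals(A))) < 1.0

A = np.array([[0.4, 0.3], [0.1, 0.5]])
print(is_convergent(A))                   # True: rho(A) < 1
print(np.linalg.matrix_power(A, 50))      # entries are indeed close to zero
```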
4.2 Iterative techniques for linear system solving

An iterative technique to solve the determined linear system
$$Ax = b, \quad A \in \mathbb{R}^{n\times n},\ b \in \mathbb{R}^n, \qquad (4.12)$$
with $A$ nonsingular, consists in finding an initial approximation $x^{(0)}$ of the solution $x$ and constructing a sequence of vectors $(x^{(k)})_{k\in\mathbb{N}}$ that converges to $x$. Most of
these iterative techniques involve a conversion of the system (4.12) into an equivalent system of the form
$$x = Tx + c \qquad (4.13)$$
for a certain matrix $T \in \mathbb{R}^{n\times n}$ and vector $c \in \mathbb{R}^n$. After the initial vector $x^{(0)}$ is selected, the sequence of approximate solutions is constructed recurrently using the formula
$$x^{(k+1)} = T x^{(k)} + c, \quad k = 0, 1, 2, \ldots \qquad (4.14)$$
The matrix $T$ in (4.13) is chosen such that the computational effort to generate $x^{(k+1)}$ in (4.14) is reasonably small and the convergence speed is sufficiently high. The main result on the convergence of the sequence defined in (4.14) with initial vector $x^{(0)}$ is given below.

Theorem 4.9 For any $x^{(0)} \in \mathbb{R}^n$, the sequence $(x^{(k)})_{k\in\mathbb{N}}$ defined by (4.14) for $c \neq 0$ converges to the unique solution of the system (4.13) if and only if
$$\rho(T) < 1. \qquad (4.15)$$
Proof. The system (4.13) is equivalent to the system (4.12). Hence the matrix $I - T$ is nonsingular and
$$x = A^{-1} b = (I - T)^{-1} c. \qquad (4.16)$$
(The invertibility of the matrix $I - T$ can be derived from (4.15). Indeed, if the spectral radius $\rho(T)$ satisfies (4.15) then the eigenvalues of $T$ satisfy $|\lambda_i| < 1$, $i = 1:n$. Hence $|1 - \lambda_i| > 1 - |\lambda_i| > 0$, $i = 1:n$, i.e. $1 - \lambda_i \neq 0$, $i = 1:n$. But $1 - \lambda_i$ are the eigenvalues of the matrix $I - T$ and consequently $I - T$ is nonsingular.) Now it can easily be checked that if
$$S_m = \sum_{i=0}^{m} T^i \qquad (4.17)$$
then
$$(I - T) S_m = I - T^{m+1}. \qquad (4.18)$$
From (4.15) we have that $T$ is convergent and consequently $\lim_{m\to\infty} T^{m+1} = 0$. From (4.18) we obtain
$$(I - T)^{-1} = \sum_{i=0}^{\infty} T^i \qquad (4.19)$$
and from (4.14) we get, by induction,
$$x^{(k)} = T^k x^{(0)} + \left( \sum_{i=0}^{k-1} T^i \right) c. \qquad (4.20)$$
Assuming $\rho(T) < 1$ and using Theorem 4.8, from (4.20) we get
$$\lim_{k\to\infty} x^{(k)} = \lim_{k\to\infty} T^k x^{(0)} + \left( \lim_{k\to\infty} \sum_{i=0}^{k-1} T^i \right) c = (I - T)^{-1} c = x. \qquad (4.21)$$
Conversely, suppose that the sequence $(x^{(k)})_{k\in\mathbb{N}}$ converges to $x$ for any $x^{(0)}$. From (4.13) and (4.14),
$$e^{(k)} := x - x^{(k)} = Tx + c - Tx^{(k-1)} - c = T(x - x^{(k-1)}) = T e^{(k-1)}. \qquad (4.22)$$
By induction, from (4.22) we have
$$e^{(k)} = T^k e^{(0)}, \quad e^{(0)} = x - x^{(0)}. \qquad (4.23)$$
The convergence, for any $x^{(0)}$, of the sequence $(x^{(k)})_{k\in\mathbb{N}}$ is equivalent to the convergence, for any $e^{(0)}$, of the sequence $(e^{(k)})_{k\in\mathbb{N}}$ defined by (4.23) to the vector $e = 0$. Let $e^{(0)}_1, \ldots, e^{(0)}_n$ be $n$ initial vectors such that the matrix
$$E^{(0)} = [e^{(0)}_1, \ldots, e^{(0)}_n] \qquad (4.24)$$
is nonsingular. Then
$$\lim_{k\to\infty} E^{(k)} = \lim_{k\to\infty} T^k E^{(0)} = 0, \qquad (4.25)$$
from where
$$\lim_{k\to\infty} T^k = 0, \qquad (4.26)$$
i.e. the matrix $T$ is convergent. By Theorem 4.8, (4.15) is true and the proof ends. $\Box$
The necessary and sufficient convergence condition (4.15) is often difficult to check. A useful criterion to prove the convergence of a particular sequence to the system solution is given below.

Proposition 4.10 If the matrix $T \in \mathbb{R}^{n\times n}$ satisfies
$$\|T\| < 1 \qquad (4.27)$$
for a consistent family of matrix norms $\|\cdot\|$, then the sequence $(x^{(k)})_{k\in\mathbb{N}}$ defined by (4.14) converges to the solution $x$ of (4.12) for any initial approximation $x^{(0)}$.
Proof. Let $\lambda_i \in \sigma(T)$, where $\sigma(T)$ is the set of all eigenvalues of the matrix $T$. Then $\det(\lambda_i I_n - T) = 0$ and there exists a nonzero vector $y \in \mathbb{R}^n$ (or $y \in \mathbb{C}^n$), called an eigenvector, such that
$$T y = \lambda_i y. \qquad (4.28)$$
Hence
$$|\lambda_i|\,\|y\| = \|T y\| \le \|T\|\,\|y\|,$$
i.e.
$$|\lambda_i| \le \|T\| < 1, \quad \forall i = 1:n. \qquad (4.29)$$
The inequality (4.29) implies (4.15) and, by Theorem 4.9, the corresponding sequence is convergent. $\Box$

In order to obtain some particular iterative methods for linear system solving, consider the following decomposition of the matrix $A$ given in (4.12):
$$A = L + D + U, \qquad (4.30)$$
where $L$ is the strictly lower triangular part of $A$, $D$ is the diagonal part of $A$ and $U$ is the strictly upper triangular part of $A$, i.e.
$$L = \begin{bmatrix} 0 & 0 & \cdots & 0 & 0 \\ a_{21} & 0 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ a_{n-1,1} & a_{n-1,2} & \cdots & 0 & 0 \\ a_{n1} & a_{n2} & \cdots & a_{n,n-1} & 0 \end{bmatrix}, \quad
D = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix},$$
$$U = \begin{bmatrix} 0 & a_{12} & \cdots & a_{1,n-1} & a_{1n} \\ 0 & 0 & \cdots & a_{2,n-1} & a_{2n} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & a_{n-1,n} \\ 0 & 0 & \cdots & 0 & 0 \end{bmatrix}. \qquad (4.31)$$
In the following development we assume that $a_{ii} \neq 0$, $\forall i = 1:n$, i.e., the matrices $D$, $L + D$, $D + U$ are all nonsingular. Since the matrix $A$ is nonsingular, this assumption can always be satisfied after an appropriate row (column) permutation (see Remark 4.15).

4.2.1 Jacobi iterative method
Let
$$T_J = -D^{-1}(L + U) \qquad (4.32)$$
and
$$c_J = D^{-1} b. \qquad (4.33)$$
In this case (4.13) becomes
$$x = -D^{-1}(L + U)x + D^{-1} b \qquad (4.34)$$
or
$$(D + L + U)x = b,$$
and since (4.34) is equivalent to (4.12), (4.14) becomes
$$x^{(k+1)} = -D^{-1}(L + U)x^{(k)} + D^{-1} b, \quad k = 0, 1, 2, \ldots, \qquad (4.35)$$
which defines the so called Jacobi iterative method. In scalar form (4.35) can be written
$$x_i^{(k+1)} = \Big( b_i - \sum_{j=1, j\neq i}^{n} a_{ij} x_j^{(k)} \Big) \Big/ a_{ii}, \quad i = 1:n, \qquad (4.36)$$
which implements (4.35) in the form of the following linear system solving:
$$D x^{(k+1)} = -(L + U) x^{(k)} + b. \qquad (4.37)$$
The above method should be completed with an adequate criterion for ending the sequence of iterations. A possible way is to iterate (4.36) until
$$\frac{\|x^{(k+1)} - x^{(k)}\|}{\|x^{(k+1)}\|} \qquad (4.38)$$
becomes smaller than some prescribed tolerance $\varepsilon > 0$. To this end, any convenient norm can be used, but the most usual one is the $\|\cdot\|_\infty$ norm. Due to the small speed of convergence, or even due to possible divergence of the sequence, such a criterion might not be fulfilled within a reasonable number of iterations. Notice that there is no need to store all the vectors in the sequence; an additional vector updated at each step is sufficient.

Algorithm 4.11 (J ITER -- Jacobi iterative method)
(Given:
- a nonsingular matrix $A \in \mathbb{R}^{n\times n}$ with $a_{ii} \neq 0$ for $i = 1:n$
- a nonzero vector $b \in \mathbb{R}^n$
- an initial approximation $x \in \mathbb{R}^n$
- a tolerance $\varepsilon > 0$
- a maximum number of iterations $N$
the algorithm computes an approximate solution $x \in \mathbb{R}^n$ of the linear system $Ax = b$ or prints a message that the number of iterations was exceeded without attaining the given tolerance.)
1. $k \leftarrow 0$
2. while $k \le N$
   1. for $i = 1:n$
      1. $y_i \leftarrow \big(b_i - \sum_{j=1, j\neq i}^{n} a_{ij} x_j\big)/a_{ii}$
   2. if $\|y - x\|/\|y\| < \varepsilon$
      1. $x \leftarrow y$
      2. STOP
   3. $x \leftarrow y$
   4. $k \leftarrow k + 1$
3. Print('The maximum number of iterations was exceeded without attaining the given tolerance')
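A compact Python sketch of Algorithm 4.11 is given below; the vectorised update and the names used are ours, while the text's algorithm works componentwise:

```python
import numpy as np

def jacobi(A, b, x0, tol=1e-10, max_iter=100):
    """Sketch of Algorithm 4.11 (J_ITER): x_new = D^{-1} (b - (L+U) x)."""
    x = x0.astype(float).copy()
    D = np.diag(A)                 # diagonal entries a_ii
    R = A - np.diagflat(D)         # L + U (off-diagonal part)
    for _ in range(max_iter):
        y = (b - R @ x) / D        # scalar form (4.36) for all i at once
        if np.linalg.norm(y - x, np.inf) / np.linalg.norm(y, np.inf) < tol:
            return y
        x = y
    raise RuntimeError("maximum number of iterations exceeded")

A = np.array([[4.0, 1.0], [2.0, 5.0]])   # strictly diagonally dominant
b = np.array([1.0, 2.0])
x = jacobi(A, b, np.zeros(2))
```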
Using now the sufficient convergence condition (4.27), we obtain the following proposition.

Proposition 4.12 If the matrix $A$ is strictly diagonally dominant, i.e.,
$$|a_{ii}| > \sum_{j=1, j\neq i}^{n} |a_{ij}|, \quad i = 1:n, \qquad (4.39)$$
then the vector sequence given by Jacobi's iterative method converges to the solution $x$.

Proof. From (4.39) we have
$$\sum_{j=1, j\neq i}^{n} \frac{|a_{ij}|}{|a_{ii}|} < 1, \quad \forall i = 1:n.$$
Hence
$$\|T_J\|_\infty = \|{-D^{-1}(L+U)}\|_\infty = \max_{i=1:n} \sum_{j=1, j\neq i}^{n} \frac{|a_{ij}|}{|a_{ii}|} < 1$$
and, since $\|\cdot\|_\infty$ is a consistent family of matrix norms, we conclude by Proposition 4.10 that the Jacobi sequence is convergent. $\Box$

4.2.2
Gauss-Seidel iterative method
A possible improvement of the Jacobi iterative method is suggested by the following remarks:
- for computing $x^{(k+1)}$ only the components of $x^{(k)}$ are used;
- the computations are made in the successive order $x_1^{(k+1)}, \ldots, x_n^{(k+1)}$.

Since at step $i > 1$ the components $x_1^{(k+1)}, x_2^{(k+1)}, \ldots, x_{i-1}^{(k+1)}$ have already been computed and are supposedly better approximations of the exact solution components $x_1, x_2, \ldots, x_{i-1}$ than $x_1^{(k)}, x_2^{(k)}, \ldots, x_{i-1}^{(k)}$, it seems reasonable to compute $x_i^{(k+1)}$ using the most recently computed values. To be more specific, instead of (4.36) it seems better to use
$$x_i^{(k+1)} = \Big( b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)} \Big) \Big/ a_{ii}, \quad i = 1:n. \qquad (4.40)$$
4.2.
ITERATIVE TECHNIQUES FOR LINEAR SYSTEM SOLVING
87
This corresponds to the following choice of the matrix $T$ in (4.13) and (4.14):
$$T_{GS} = -(L + D)^{-1} U. \qquad (4.41)$$
This method, depicted in (4.40) and based on the lower-triangular linear system solving
$$(L + D) x^{(k+1)} = -U x^{(k)} + b, \qquad (4.42)$$
is called the Gauss-Seidel iterative method. Proceeding as in Algorithm 4.11, the Gauss-Seidel method can be implemented as follows.

Algorithm 4.13 (GS ITER -- Gauss-Seidel iterative method)
(Given:
- a nonsingular matrix $A \in \mathbb{R}^{n\times n}$ with $a_{ii} \neq 0$ for $i = 1:n$
- a nonzero vector $b \in \mathbb{R}^n$
- an initial approximation $x \in \mathbb{R}^n$
- a tolerance $\varepsilon > 0$
- a maximum number of iterations $N$
the algorithm computes an approximate solution $x \in \mathbb{R}^n$ of the linear system $Ax = b$ or prints a message that the number of iterations was exceeded without attaining the given tolerance.)
1. $k \leftarrow 0$
2. while $k \le N$
   1. for $i = 1:n$
      1. $y_i \leftarrow \big(b_i - \sum_{j=1}^{i-1} a_{ij} y_j - \sum_{j=i+1}^{n} a_{ij} x_j\big)/a_{ii}$
   2. if $\|y - x\|/\|y\| < \varepsilon$
      1. $x \leftarrow y$
      2. STOP
   3. $x \leftarrow y$
   4. $k \leftarrow k + 1$
3. Print('The maximum number of iterations was exceeded without attaining the given tolerance')
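A Python sketch of Algorithm 4.13 follows; as above, the helper name is ours and the code is only an illustration:

```python
import numpy as np

def gauss_seidel(A, b, x0, tol=1e-10, max_iter=100):
    """Sketch of Algorithm 4.13 (GS_ITER): each new component y_i immediately
    uses the components y_1..y_{i-1} computed in the same sweep."""
    n = len(b)
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        y = x.copy()
        for i in range(n):
            s1 = A[i, :i] @ y[:i]        # already updated components
            s2 = A[i, i+1:] @ x[i+1:]    # components from the previous sweep
            y[i] = (b[i] - s1 - s2) / A[i, i]
        if np.linalg.norm(y - x, np.inf) / np.linalg.norm(y, np.inf) < tol:
            return y
        x = y
    raise RuntimeError("maximum number of iterations exceeded")

A = np.array([[4.0, 1.0], [2.0, 5.0]])
b = np.array([1.0, 2.0])
x = gauss_seidel(A, b, np.zeros(2))
```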
Using for (4.41) the $\|\cdot\|_\infty$ family of matrix norms, the following result, similar to the one stated in Proposition 4.12, can be proved (exercise for the reader).

Proposition 4.14 If the matrix $A$ is strictly diagonally dominant, then the vector sequence defined by the Gauss-Seidel method converges to the solution $x$ for any initialization.
Remark 4.15 Notice that Algorithms 4.11 and 4.13 work only if $a_{ii} \neq 0$, $i = 1:n$. If this is not the case, a reordering of the equations (row permutations of the matrix $A$) can be performed such that $a_{ii} \neq 0$, $\forall i = 1:n$. Since the system is nonsingular, this is always possible. Indeed, let
$$\pi = \begin{pmatrix} 1 & 2 & \cdots & n \\ \pi(1) & \pi(2) & \cdots & \pi(n) \end{pmatrix}$$
be a permutation of the set $\{1, 2, \ldots, n\}$ and $P_\pi = [\,e_{\pi(1)} \; e_{\pi(2)} \; \cdots \; e_{\pi(n)}\,]^T \in \mathbb{R}^{n\times n}$ the corresponding permutation matrix. Let also $S_n$ denote the set of all ($n!$) permutations of the set $\{1, 2, \ldots, n\}$. It is well known that
$$\det A = \sum_{\pi \in S_n} (-1)^{\operatorname{sign}\pi}\, a_{\pi(1),1}\, a_{\pi(2),2} \cdots a_{\pi(n),n}.$$
Consequently, as $\det A \neq 0$, there exists $\pi \in S_n$ such that $a_{\pi(1),1}\, a_{\pi(2),2} \cdots a_{\pi(n),n} \neq 0$ (since if all the terms were zero then $\det A = 0$) and hence $a_{\pi(i),i} \neq 0$, $i = 1:n$. Clearly $(P_\pi A)(i, i) = a_{\pi(i),i} \neq 0$ and we conclude that $P_\pi A$ is a solution to our problem. $\Box$

Propositions 4.12 and 4.14 suggest that, in order to speed up the convergence, the equations should be arranged such that $a_{ii}$ is as large as possible. When we introduced the Gauss-Seidel method we claimed that it seems superior to the Jacobi method. In fact there are linear systems for which the Jacobi method converges and the Gauss-Seidel method does not, and conversely. In general, it is not known which of the two techniques, Jacobi or Gauss-Seidel, will work better in a particular case.

4.2.3
Relaxation methods
Let us define, for an approximation $x$ of the system solution, the residual vector
$$r = b - Ax. \qquad (4.43)$$
In an iterative method, a residual vector is associated with each computed approximation of the solution vector. The objective of an iterative method is to generate a sequence of approximations for which the associated residual vector converges to zero.
In the Gauss-Seidel method, the approximation available at the step at which $x_i^{(k+1)}$ is computed is
$$x = [x_1^{(k+1)}, x_2^{(k+1)}, \ldots, x_{i-1}^{(k+1)}, x_i^{(k)}, \ldots, x_n^{(k)}]^T \qquad (4.44)$$
and, if we denote by
$$r_i^{(k+1)} = [r_{1i}^{(k+1)}, r_{2i}^{(k+1)}, \ldots, r_{ni}^{(k+1)}]^T \qquad (4.45)$$
the residual vector at this step, then from (4.43) and (4.44)
$$r_{ii}^{(k+1)} = b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)} - a_{ii} x_i^{(k)}. \qquad (4.46)$$
Consequently
$$a_{ii} x_i^{(k)} + r_{ii}^{(k+1)} = b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)}. \qquad (4.47)$$
Recall that in the Gauss-Seidel method $x_i^{(k+1)}$ is computed by (4.40), so (4.47) can be rewritten as
$$a_{ii} x_i^{(k)} + r_{ii}^{(k+1)} = a_{ii} x_i^{(k+1)} \qquad (4.48)$$
or
$$x_i^{(k+1)} = x_i^{(k)} + \frac{r_{ii}^{(k+1)}}{a_{ii}}. \qquad (4.49)$$
Moreover, with (4.40), after the $i$-th component of $x^{(k+1)}$ has been computed,
$$r_{i,i+1}^{(k+1)} = b_i - \sum_{j=1}^{i} a_{ij} x_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k)} = 0. \qquad (4.50)$$
The reduction to zero of one of the entries of the residual vector is not necessarily the most efficient way to reduce the norm of the vector $r_{i+1}^{(k+1)}$. It can be shown that introducing a scalar parameter $\omega$ in the Gauss-Seidel update (4.49),
$$x_i^{(k+1)} = x_i^{(k)} + \omega\, \frac{r_{ii}^{(k+1)}}{a_{ii}}, \qquad (4.51)$$
leads, for certain choices of $\omega > 0$, to significantly faster convergence. The update (4.51) can be written in matrix form as follows. Consider the decomposition of the matrix $A$ depicted in (4.31) and modify the Gauss-Seidel method by redefining the matrix $T_{GS}$ in (4.41) as
$$T_\omega = \Big(L + \tfrac{1}{\omega} D\Big)^{-1} \Big( \big(\tfrac{1}{\omega} - 1\big) D - U \Big). \qquad (4.52)$$
In the modified method, the equation which defines an iteration is
$$\Big(L + \tfrac{1}{\omega} D\Big) x^{(k+1)} = \Big( \big(\tfrac{1}{\omega} - 1\big) D - U \Big) x^{(k)} + b, \qquad (4.53)$$
where $\omega$ is a scalar parameter used to speed up the convergence of the sequence. It is easy to show that (4.52) and (4.53) correspond to the updating (4.51) of the elements of $x$. Methods based on (4.51) or, equivalently, on (4.52) and (4.53), are called relaxation methods. For $0 < \omega < 1$ the procedures are called under-relaxation methods and can be used to obtain convergence for some systems which do not converge with the Gauss-Seidel method. For $\omega = 1$ one recovers the Gauss-Seidel method. For $\omega > 1$ the procedures are called over-relaxation methods and are commonly abbreviated SOR (Successive Over-Relaxation). These last methods can be used for speeding up the convergence of systems which already converge with the Gauss-Seidel algorithm. SOR methods are particularly useful for solving the linear systems occurring in the numerical solution of certain partial differential equations. Generally, the optimal $\omega$ lies in the interval $(1, 2)$. For more details see [?].
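The componentwise update (4.51) translates into a few lines of Python; the sketch below (names and the particular value of omega are ours) reduces to Gauss-Seidel for omega = 1:

```python
import numpy as np

def sor(A, b, x0, omega=1.25, tol=1e-10, max_iter=200):
    """Sketch of the relaxation update (4.51): a Gauss-Seidel sweep in which
    each correction r_ii / a_ii is scaled by the parameter omega."""
    n = len(b)
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        y = x.copy()
        for i in range(n):
            # residual component r_ii at this step, eq. (4.46)
            residual_i = b[i] - A[i, :i] @ y[:i] - A[i, i:] @ x[i:]
            y[i] = x[i] + omega * residual_i / A[i, i]
        if np.linalg.norm(y - x, np.inf) / np.linalg.norm(y, np.inf) < tol:
            return y
        x = y
    raise RuntimeError("maximum number of iterations exceeded")
```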
4.3 Iterative refinement of the computed solution
As we have already seen, when solving nonsingular linear systems $Ax = b$, the perturbations in the initial data ($A \to A + \Delta A$, $b \to b + \Delta b$) are amplified, in terms of a specified norm, by a factor of up to $\|A\|\,\|A^{-1}\| =: \kappa(A)$. Here $\|\cdot\|$ stands for a certain specified family of matrix norms. The number $\kappa(A)$ is called the numerical condition (condition number) of the matrix $A$. It is worth mentioning that this amplification of the initial errors is independent of the algorithm used. Moreover, performing the computations in approximate arithmetic introduces additional errors. If the algorithm employed for solving the linear system is numerically stable, these additional errors are of the same size as those caused by rounding the initial data when exact calculations are performed. For ill conditioned systems it is particularly important to have a means of improving the accuracy of a computed solution. Such a possibility is shown below. Let $x$ be an approximate solution of the system $Ax = b$, let
$$r = b - Ax \qquad (4.54)$$
be the residual vector and let $x^* = A^{-1} b$. Hence, if
$$e = x^* - x, \qquad (4.55)$$
then
$$Ae = Ax^* - Ax = b - (b - r) = r. \qquad (4.56)$$
When solving the system $Ae = r$ one expects $x_{\mathrm{new}}$ given by
$$x_{\mathrm{new}} = x + e \qquad (4.57)$$
to be more accurate than $x$. Here $e$ stands for the computed solution of the system (4.56). Unfortunately, the naive floating point execution of the above computations renders an $x_{\mathrm{new}}$ that is no more accurate than $x$. This is due to the fact that the computed residual vector (4.54) has few, if any, correct significant digits (recall the disastrous effects of floating point subtraction of nearby numbers). In [?] it is shown that a more accurate solution can be obtained if the computation of the residual vector is done in double precision (i.e. with a mantissa having a double number of (binary) digits). In order to improve the accuracy of a solution $x$ computed, for example, by Gaussian elimination with partial pivoting, the following sequence of statements can be iterated.
1. Compute in double precision $r = b - Ax$.
2. Solve $Ae = r$ (using the Gaussian multipliers and permutation from the initial computation of $x$).
3. $x \leftarrow x + e$.
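The loop above can be sketched as follows; the LU factors computed once are reused for every correction, and np.longdouble plays the role of the "double precision" residual (this is an illustration under those assumptions, not the text's formal algorithm):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def refine(A, b, n_steps=3):
    """Sketch of iterative refinement: reuse the LU factorization of A to solve
    A e = r, with the residual r accumulated in extended precision."""
    lu, piv = lu_factor(A)                 # factorization from the initial solve
    x = lu_solve((lu, piv), b)             # initial computed solution
    for _ in range(n_steps):
        r = b.astype(np.longdouble) - A.astype(np.longdouble) @ x
        e = lu_solve((lu, piv), np.asarray(r, dtype=float))   # solve A e = r cheaply
        x = x + e
    return x
```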
This iterative improvement of the solution accuracy is known as iterative refinement. An error analysis of iterative refinement can be found in [?].

Problems
P 4.1 Show that the Jacobi iterative method converges for $2 \times 2$ symmetric positive definite systems.

P 4.2 Show that if $A^T$ is strictly diagonally dominant (i.e. $|a_{jj}| > \sum_{i=1, i\neq j}^{n} |a_{ij}|$), then the Jacobi iterative method is convergent.

P 4.3 Consider the linear system $Ax = b$, where $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$. Which iterative method will converge faster: Jacobi or Gauss-Seidel?
Chapter 5
Eigenvalues and eigenvectors

Many objects and processes of the real world are described by mathematical models called dynamical systems. The linear dynamical systems form an important class which is widely used in various scientific and technical areas. A finite dimensional linear system is often represented by a set of real matrices and its properties are expressed in terms of these matrices. Some of the crucial properties of linear dynamical systems, such as stability, depend on a set of numbers and vectors related to a system matrix, called eigenvalues and eigenvectors. This is one possible motivation for studying the so called algebraic eigenvalue problem. Since the eigenvalues of a real matrix may be complex numbers and the eigenvectors may be complex vectors, a natural setting for the eigenvalue problem is the space $\mathbb{C}^n$. We begin with the presentation of the main features of this space.

5.1 The space $\mathbb{C}^n$
For our purposes, a complex $n$-vector $x$ is a collection of complex numbers $x_1, x_2, \ldots, x_n$ ordered in a column
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad x_i \in \mathbb{C}. \qquad (5.1)$$
Clearly, any complex $n$-dimensional vector can be written in the form
$$x = u + iv, \qquad (5.2)$$
where $u, v \in \mathbb{R}^n$ and $i$ is the imaginary unit. The set of all complex $n$-dimensional vectors and the set of all complex $m \times n$ matrices are denoted by $\mathbb{C}^n$ and $\mathbb{C}^{m\times n}$, respectively.
The spaces $\mathbb{R}^n$ and $\mathbb{C}^n$ feature a striking similarity. Since the axioms of the field of real numbers are also satisfied by the field of complex numbers, all the properties of the vector space $\mathbb{R}^n$ also hold in $\mathbb{C}^n$. However, the definition of the inner product highlights a difference between the two spaces. If $x \in \mathbb{C}^n$, denote by
$$x^H = \bar{x}^T \qquad (5.3)$$
its conjugate transpose.

Definition 5.1 If $x, y \in \mathbb{C}^n$, then the inner product of $x$ and $y$ is the complex number
$$\langle x, y \rangle = y^H x. \qquad (5.4)$$
Clearly, $\langle y, x \rangle = \overline{\langle x, y \rangle}$. The above definition of the inner product allows us to introduce the concept of orthogonality in $\mathbb{C}^n$ as follows.

Definition 5.2 Two vectors $x, y \in \mathbb{C}^n$ are orthogonal if
$$x^H y = 0. \qquad (5.5)$$
Since $\mathbb{C}^n$ with the inner product (5.4) forms a Hilbert space, the inner product can be used, in the usual way, to define the corresponding euclidean vector norm on $\mathbb{C}^n$ (induced by the inner product). Notice that $x^H x$ is always a nonnegative real number.

Theorem 5.3 The function $\|\cdot\| : \mathbb{C}^n \to \mathbb{R}_+$ defined by
$$\|x\| := \|x\|_2 = \sqrt{x^H x} \qquad (5.6)$$
is a vector norm on $\mathbb{C}^n$.

The proof is an easy exercise for the reader. More generally, all the definitions and theorems on norms in $\mathbb{R}^n$ can be extended to $\mathbb{C}^n$; however, when dealing with the 2-norm, the transpose must be replaced by the conjugate transpose.

Definition 5.4 The matrix $A \in \mathbb{C}^{n\times n}$ is Hermitian if
$$A^H = A. \qquad (5.7)$$

Theorem 5.5 If $A \in \mathbb{C}^{n\times n}$ is Hermitian then the number $x^H A x$ is real for any $x \in \mathbb{C}^n$.
Proof. We have $\overline{x^H A x} = (x^H A x)^H = x^H A^H x = x^H A x$. Hence $x^H A x$ is a scalar which equals its conjugate, i.e. it is a real number. $\Box$

Using Theorem 5.5 we can introduce the concept of a positive definite complex matrix.

Definition 5.6 A Hermitian matrix $A \in \mathbb{C}^{n\times n}$ is positive definite if
$$x^H A x > 0 \quad \forall x \in \mathbb{C}^n,\ x \neq 0. \qquad (5.8)$$
The analogue in $\mathbb{C}^n$ of an orthogonal real matrix is the unitary matrix introduced below.

Definition 5.7 A matrix $A \in \mathbb{C}^{n\times n}$ is unitary if
$$A^H A = I, \qquad (5.9)$$
i.e., if it has orthonormal (complex) columns. In a similar way to the real case, one can define elementary unitary matrices such as complex reflectors and complex rotations. However, it is rather unusual in practice to work with complex matrices, since in most situations it is possible to reduce the problem to the real case.

5.2 Eigenvalues and eigenvectors

Definition 5.8 Let $A \in \mathbb{C}^{n\times n}$. A vector $x \in \mathbb{C}^n$ is an eigenvector of $A$, corresponding to the eigenvalue $\lambda \in \mathbb{C}$, if the following properties hold:
a) $x \neq 0$;
b) $Ax = \lambda x$. $\qquad$ (5.10)

Clearly, if $x$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda$, the vector $y = \alpha x$ is also an eigenvector of $A$ corresponding to the same eigenvalue $\lambda$, for any $\alpha \in \mathbb{C}$, $\alpha \neq 0$. Thus, the eigenvectors are determined only by their direction. Some important properties of the eigenvalues of a given square matrix are stated subsequently.

Theorem 5.9 Let $A \in \mathbb{C}^{n\times n}$. The number $\lambda \in \mathbb{C}$ is an eigenvalue of $A$ if and only if the matrix $\lambda I - A$ is singular. Moreover, the matrix $A$ has exactly $n$ eigenvalues, counted with multiplicity, which are the zeros of the characteristic polynomial
$$p(\lambda) = \det(\lambda I_n - A). \qquad (5.11)$$
If $A \in \mathbb{R}^{n\times n}$, the complex eigenvalues occur in conjugate pairs.
Proof. If $\lambda$ is a fixed eigenvalue of $A$, then there exists a nonzero (eigen)vector $x$ such that $Ax = \lambda x$, i.e., $(\lambda I_n - A)x = 0$, and consequently $\lambda I_n - A$ is singular. Conversely, if for some $\lambda \in \mathbb{C}$ the matrix $\lambda I_n - A$ is singular, then there exists a nonzero vector $x \in \mathbb{C}^n$ such that $(\lambda I_n - A)x = 0$, i.e., $Ax = \lambda x$, and thus $\lambda$ is an eigenvalue of $A$. Since $\lambda I_n - A$ is singular if and only if $\lambda$ is a root of the algebraic equation of order $n$
$$p(\lambda) = 0 \qquad (5.12)$$
(called the characteristic equation of $A$), we infer that (5.12) has $n$ roots (counted with multiplicity) which coincide with the eigenvalues of $A$. If $A$ is a real matrix, the characteristic polynomial $p(\lambda)$ has real coefficients and consequently the complex roots occur in conjugate pairs. $\Box$

Denote by
$$\sigma(A) = \{\lambda_1, \lambda_2, \ldots, \lambda_n\} = \{\lambda \in \mathbb{C} \mid \det(\lambda I - A) = 0\} \qquad (5.13)$$
the set of all eigenvalues of a matrix $A \in \mathbb{C}^{n\times n}$; $\sigma(A)$ is called the eigenvalue spectrum of $A$. We are interested in matrix transformations that preserve the eigenvalue spectrum of a given matrix. Notice that the eigenvalues and eigenvectors are not preserved when the matrix is premultiplied or postmultiplied by another matrix. In particular, the elementary row or column operations, such as permutations, may change the eigenvalues. The eigenvalues are preserved by the so called similarity transformations introduced below.

Definition 5.10 Two matrices $A, B \in \mathbb{C}^{n\times n}$ are called similar if there exists a nonsingular matrix $T \in \mathbb{C}^{n\times n}$ such that
$$B = T A T^{-1}. \qquad (5.14)$$
If $T$ is a unitary (orthogonal in the real case) matrix, then $A$ and $B$ are called orthogonally similar.
Theorem 5.11 If $A, B \in \mathbb{C}^{n\times n}$ are similar then they share the same eigenvalue spectrum:
$$\sigma(A) = \sigma(B). \qquad (5.15)$$
Moreover, if $T$ is a similarity transformation as in (5.14) and if $x_A$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda \in \sigma(A)$, then the vector
$$x_B = T x_A \qquad (5.16)$$
is an eigenvector of the matrix $B$, corresponding to the same eigenvalue.
Proof. Let $A$ and $B$ be related by (5.14). Then $\det(\lambda I - B) = \det(\lambda I - T A T^{-1}) = \det(T(\lambda I - A)T^{-1}) = \det T \cdot \det(\lambda I - A)\cdot \det(T^{-1}) = \det(\lambda I - A)$, i.e., (5.15) is true. If $x_A$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda$, then $A x_A = \lambda x_A$, or $T A T^{-1} T x_A = \lambda T x_A$, from where $B x_B = \lambda x_B$ and consequently assertion (5.16) is also true. $\Box$

In most of our developments we shall be concerned with the generic case of the so called nondefective matrices, i.e., the matrices which have a complete set of linearly independent eigenvectors.

Theorem 5.12 Let $A \in \mathbb{C}^{n\times n}$ be a matrix with $n$ linearly independent eigenvectors and let $V \in \mathbb{C}^{n\times n}$ be the matrix built up of these eigenvectors as columns. Then
$$V^{-1} A V = \Lambda \in \mathbb{C}^{n\times n} \qquad (5.17)$$
is a diagonal matrix.

Proof. We have $V = [v_1, v_2, \ldots, v_n]$ and $A v_j = \lambda_j v_j$, $j = 1:n$, where $\lambda_j$ are the eigenvalues of $A$. Hence
$$A V = [A v_1, A v_2, \ldots, A v_n] = [\lambda_1 v_1, \lambda_2 v_2, \ldots, \lambda_n v_n] = [v_1, v_2, \ldots, v_n]\,\mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) = V \Lambda.$$
Since the vectors $v_j$, $j = 1:n$, are linearly independent, the matrix $V$ is nonsingular and (5.17) follows. $\Box$

It can be shown that if an $n \times n$ matrix has $n$ distinct eigenvalues then it is nondefective and hence, by Theorem 5.12, after a certain similarity transformation it has a canonical diagonal form. When a matrix does not have a diagonal form, it can be reduced, by a similarity transformation, to the so called Jordan canonical form:
$$J = \mathrm{diag}[J_1, J_2, \ldots, J_q], \qquad (5.18)$$
where
$$J_k = \begin{bmatrix} \lambda_k & 1 & 0 & \cdots & 0 \\ 0 & \lambda_k & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_k & 1 \\ 0 & 0 & \cdots & 0 & \lambda_k \end{bmatrix} \in \mathbb{C}^{n_k \times n_k}. \qquad (5.19)$$
Although the Jordan canonical form is essential in matrix analysis, it is not suitable for numerical purposes, because its structure is highly sensitive to numerical perturbations in the matrix entries. This is the reason why all numerical developments concerning matrix eigenvalues and eigenvectors use a more robust structure, called the real (or complex) Schur form.
5.3 The real Schur form. Deflation techniques

We are interested in a matrix form whose structure is insensitive to numerical perturbations and, at the same time, allows an easy computation of its eigenvalues.

Definition 5.13 A matrix $S \in \mathbb{R}^{n\times n}$ is said to be in real Schur form (RSF) if
$$S = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1q} \\ 0 & S_{22} & \cdots & S_{2q} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & S_{qq} \end{bmatrix}, \qquad (5.20)$$
where $S_{ii} \in \mathbb{R}^{1\times 1}$ or $S_{ii} \in \mathbb{R}^{2\times 2}$ and all $2 \times 2$ diagonal blocks have complex eigenvalues.

An immediate consequence of the Schur structure is given in the following proposition.

Proposition 5.14 Let $S \in \mathbb{R}^{n\times n}$ be in RSF. Then
$$\sigma(S) = \bigcup_{i=1}^{q} \sigma(S_{ii}). \qquad (5.21)$$
Hence, the real eigenvalues of an RSF can be determined by simple inspection of the main diagonal, and the complex ones can be computed by solving the corresponding second order algebraic equations
$$\det(\lambda I - S_{ii}) = 0. \qquad (5.22)$$
The complex Schur form (CSF) is an upper triangular structure. The eigenvalues of a matrix in CSF are, of course, its diagonal elements.

Let $A$ be a real $n \times n$ matrix and $v \in \mathbb{R}^n$ one of its real eigenvectors (if any). Then the associated eigenvalue can be computed as
$$\lambda = \frac{v^T A v}{v^T v}. \qquad (5.23)$$
If $v$ has unit euclidean norm, then
$$\lambda = v^T A v. \qquad (5.24)$$
This remark allows us to determine, by a similarity transformation, a real eigenvalue whenever a corresponding unit eigenvector is available. Indeed, let $v$ be such an eigenvector and $\bar V \in \mathbb{R}^{n\times(n-1)}$ a matrix chosen such that
$$V = [\,v \;\; \bar V\,] \qquad (5.25)$$
is orthogonal. Such a matrix exists since, if $U \in \mathbb{R}^{n\times n}$ is an elementary reflector which zeroes the last $n-1$ components of $v$, then $Uv = \|v\| e_1 = e_1$ and $v = U^T e_1$ is the first column of the orthogonal matrix $V = U^T = U$. Let us now apply the orthogonal similarity transformation defined by (5.25) to the matrix $A$. We get
$$A_1 = V^{-1} A V = V^T A V = \begin{bmatrix} v^T \\ \bar V^T \end{bmatrix} A\,[\,v \;\; \bar V\,] = \begin{bmatrix} v^T A v & v^T A \bar V \\ \bar V^T A v & \bar V^T A \bar V \end{bmatrix} = \begin{bmatrix} \lambda & v^T A \bar V \\ 0 & \bar V^T A \bar V \end{bmatrix}, \qquad (5.26)$$
since $\bar V^T A v = \lambda \bar V^T v = 0$ and $v^T A v = \lambda$.
In fact, in (5.26) we have performed one step of the deflation procedure. The procedure can be continued in a similar way on the reduced dimensional matrix
$$B = \bar V^T A \bar V. \qquad (5.27)$$
The resulting orthogonal transformation on $B$ can be expressed in terms of a suitably extended orthogonal transformation on the matrix $A_1$ defined in (5.26). For complex eigenvalues the associated eigenvectors are complex and complex arithmetic must be used. It can be shown that, for a pair of complex conjugate eigenvalues, the two complex deflation steps can be condensed and computed as a double real step, which produces a $2\times 2$ diagonal block of the corresponding Schur form. As a conclusion, by using the deflation procedure, the following important result can be stated.

Theorem 5.15 For any square matrix $A \in \mathbb{R}^{n\times n}$ there exists an orthogonal matrix $Q \in \mathbb{R}^{n\times n}$ such that
$$Q^T A Q = S \qquad (5.28)$$
is in RSF.

Thus, any matrix is orthogonally similar to an RSF. Since
$$\sigma(A) = \sigma(S) \qquad (5.29)$$
and the eigenvalues of $S$ can be easily computed, the deflation technique suggests a way of computing the eigenvalues of a matrix. Unfortunately, the deflation procedure requires at each step an eigenvector, which, in turn, cannot be computed by direct methods without a priori knowledge of the corresponding eigenvalue. Consequently, the deflation procedure must be completed with a method for computing an eigenvector without knowing the associated eigenvalue. Such a method
is necessarily an iterative one, because the existence of a direct method would contradict the well-known algebra theorem stating the impossibility of solving, by a finite sequence of elementary operations, a general algebraic equation of degree greater than four. The most used methods for iterative eigenvector computation are the power method and the inverse power method.

5.4 Power and inverse power methods
These methods are based on the following result.

Proposition 5.16 Let $A \in \mathbb{R}^{n\times n}$ be a square matrix, $\sigma(A) = \{\lambda_1, \lambda_2, \ldots, \lambda_n\}$ (5.30) its eigenvalue spectrum and $x_i$, $i = 1:n$, the corresponding eigenvectors associated with the eigenvalues $\lambda_i$. Then:
a) $\sigma(A^k) = \{\lambda_1^k, \lambda_2^k, \ldots, \lambda_n^k\}$ (5.31) for any $k \in \mathbb{N}$, and $x_i$, $i = 1:n$, is an eigenvector associated with the eigenvalue $\lambda_i^k$;
b) $\sigma(A - \mu I) = \{\lambda_1 - \mu, \lambda_2 - \mu, \ldots, \lambda_n - \mu\}$ (5.32) and $x_i$ is also an eigenvector associated with $\lambda_i - \mu$;
c) if $\mu \notin \sigma(A)$ then $A - \mu I$ is nonsingular,
$$\sigma\big((A - \mu I)^{-1}\big) = \Big\{ \frac{1}{\lambda_1 - \mu}, \frac{1}{\lambda_2 - \mu}, \ldots, \frac{1}{\lambda_n - \mu} \Big\}, \qquad (5.33)$$
and $x_i$, $i = 1:n$, is also an eigenvector associated with $\dfrac{1}{\lambda_i - \mu}$.

Proof. Using the hypothesis, for any $i \in 1:n$, $x_i \neq 0$, we have
$$A x_i = \lambda_i x_i. \qquad (5.34)$$
a) We have $A^2 x_i = \lambda_i A x_i = \lambda_i^2 x_i$ and, by induction, it follows immediately that
$$A^k x_i = \lambda_i^k x_i \qquad (5.35)$$
for all $k \in \mathbb{N}$, so (5.31) is true. Clearly $x_i$ is an eigenvector corresponding to $\lambda_i^k$.
b) From (5.34),
$$(A - \mu I) x_i = \lambda_i x_i - \mu x_i = (\lambda_i - \mu) x_i. \qquad (5.36)$$
) If A (5.36)
I
is nonsingular then 0 62 (A I ), i.e., i 6= 0, 8i 2 1 : n. From 1 x = (A I ) 1 x ; i = 1 : n; (5.37) i
i
i
whi h ompletes the proof. 2 In Proposition we have given some types of matri es that maintain a set of eigenve tors, although the set of eigenvalues is hanged. This provides a possibility to de ne a ve tor sequen e onvergent to an eigenve tor of a given matrix. 5.4.1
The power method
Let A 2 Rnn be a matrix having a dominant eigenvalue, i.e., an eigenvalue having the magnitude stri tly greater than the magnitude of the others. Suppose the eigenvalues are ordered by their magnitude j1 j > j2 j j3 j ::: jnj: (5.38) Let y(0) 2 Rn be an initial unit ve tor having a nonzero omponent on the dire tion of the eigenve tor asso iated with the dominant eigenvalue 1 2 (A). If A has a
omplete set of linearly independent eigenve tors x1 ; x2 ; :::; xn (5.39) then y(0) an be uniquely written as y(0) =
with
n X i=1
(5.40)
i xi
1 6= 0:
(5.41)
If we de ne a re urrent unit ve tor sequen e (y(k) )k2N y(k) = k Ay(k 1) ; k = 1; 2; (5.42) where 1 (5.43) k = ( kAy k 1) k is a s aling fa tor, then it is easy to show, by indu tion, that y(k) = ~k Ak y(0) ; (5.44) where ~k is a umulative s aling fa tor (~k = 1=kAk y(0) k). From (5.40), (5.44) and (5.31) we have y(k) = ~
k
n X i=1
Ak xi
= ~k
n X i=1
i ki xi
n X
!
= ~k 1 1 x1 + i ( i )k xi : 1 i=2 k
(5.45)
102
CHAPTER 5.
Using (5.38) we get from where and
i
1
EIGENVALUES AND EIGENVECTORS
< 1; i = 2 : n
lim k!1
(5.46)
i k =0 1
lim y(k) = x1; (5.47) where is a nonzero s alar su h that k x1 k = 1. Hen e, the ve tor sequen e de ned by the re urren e (5.42) is onvergent to an eigenve tor (5.47) asso iated with the dominant eigenvalue of the matrix A. The onvergen e speed is determined by the ratio j2 =1 j. The smaller the value of j2=1 j, the greater the onvergen e speed. Consequently, the power method works eÆ iently for the matri es with a strong dominant eigenvalue and a sparse stru ture (to easily multiply Ay(k 1) ). From a pra ti al point of view, the power method is attra tive due to the simpli ity of its re urren e (5.42). A drawba k to the method is that it onverges slowly when 1 is not strongly dominant. A ure for this problem is to transform the matrix A so that the transformed matrix has a strongly dominant eigenvalue and the same eigenve tors. The fa t that the speed of onvergen e of the power method depends on the ratio of the magnitude of the largest two eigenvalues suggests that one should rst transform the matrix so that it has a very large eigenvalue of multipli ity one. This idea is synthetised in the so alled inverse power method. k!1
5.4.2
The inverse power method
Suppose again that A 2 Rnn has the eigenvalues i; i = 1 : n (not ne essarily in any spe ial order), orresponding to the linearly independent eigenve tors xi, i = 1 : n, and suppose that is a lose approximation to 1 . Then, by Proposition 7.16, the matrix (I A) 1 has the eigenvalues ( i) 1 ; i = 1 : n, orresponding to the same eigenve tors. Consequently, if we hoose a starting ve tor y(0) non-defe tive with respe t to x1 , i.e., satisfying (5.40) and (5.41) we an de ne for the matrix (I A) 1 , using the power method, the unit ve tor sequen e y(k) = k (I A) 1 y(k 1) ; k = 1; 2; ::: (5.48) Now, if is mu h loser to 1 than to i; i = 2 : n, then ( 1 ) 1 will be mu h larger than ( i) 1 ; i = 2 : n, i.e., maxi=2:n j( i ) 1j (5.49) j( 1 ) 1j
5.4.
POWER AND INVERSE POWER METHODS
103
is mu h smaller than 1, and onsequently y(k) is very rapidly onvergent to x1 . The re urren e (5.48), with k a s aling ve tor, de nes the power method for the matrix (I A) 1 and it is known as the inverse power method for the matrix A. Of ourse, for performing the iteration (5.48), one has not to ompute the matrix (I A) 1 but to solve a linear system as in the following s heme. 1. Solve (I A)y(k) = y(k 2. y(k) y(k) =ky(k) k:
1)
with unknown y(k) .
Solving the above system requires about 2n3=3 oating point operations, whi h is a high pri e for the omputation of a single eigenve tor. Fortunately, the inverse power method is usually performed on Hessenberg matri es (as we will see in the QR algorithm) whi h redu es the omputational eort to approximately n2 ops per iteration. The inverse power method is one of the best methods for iterative omputing an eigenve tor of a given matrix. An important variant of the inverse power method is the Rayleigh quotient iteration whi h, at ea h iteration, uses in (5.48), as an approximation to a matrix eigenvalue , the urrent Rayleigh quotient (y(k 1) )H Ay(k 1) = (y(k 1) )H Ay(k 1) : (5.50) = ky(k 1) k2
This hoi e of improves the onvergen e speed of the ve tor sequen e to the orresponding eigenve tor. Namely, it an be shown that the Rayleigh quotient iteration assures a so- alled quadrati onvergen e, i.e., ky(k+1) x1 k ky(k) x1k2
(5.51)
where is a onstant s alar. This is a very fast onvergen e (for example, if ky(0)
x1 k " and = 1, then ky(k) x1 k "2 ). The formal algorithms for eigenve tor omputation that implement the power and inverse power methods are left as exer ises for the reader (see Problem 5.4). As an ending riterion for the iterative pro ess one an use k
ky(k) y(k 1) k < tol; ky(k) k
where tol is a pres ribed toleran e.
(5.52)
104
CHAPTER 5.
EIGENVALUES AND EIGENVECTORS
5.5 The QR algorithm The eigenvalue and eigenve tor problem an be solved as follows: 1. Find an eigenve tor by using the power or inverse power method. 2. Compute the orresponding eigenvalue by Rayleigh quotient. 3. Apply the de ation pro edure in order to redu e the problem dimension. After a nite number of the above steps one an ompute all the eigenvalues of a given matrix A. This pro edure is elegantly en apsulated in a famous algorithm known as the QR algorithm. The QR algorithm is an iterative de ation pro edure whi h re urrently onstru ts a matrix sequen e, with all the matri es orthogonally similar to the given matrix and onvergent to a S hur form. When only real arithmeti is used the sequen e limit is the real S hur form of the initial matrix. When the QR algorithm is applied to a full matrix it is prohibit expensive requiring approximately n3 oating point operations for ea h iteration. Consequently, the matrix must be rst redu ed by a nite number of similarity transformations to a form as lose as possible to the (real) S hur form . This form is the upper Hessenberg stru ture. Fortunately, the QR iterations preserve the upper Hessenberg stru ture and onsequently, the iteration on su h a parti ular matrix requires only approximately n2 oating point operations. The redu tion to Hessenberg form is
onsidered as the rst stage of the QR algorithm. 5.5.1
Redu tion to Hessenberg form
It is a famous lassi result in mathemati s that there is no algorithm, involving only a nite number of additions, multipli ations, divisions and root extra tion whi h solves a general polynomial equation of degree greater or equal to 5. Be ause the eigenvalue omputation is equivalent to solving the hara teristi equation there is no dire t ( nite) algorithm whi h redu es a matrix (of arbitrary dimension n) to a form loser to the S hur one than the upper Hessenberg form. Re all that the matrix H 2 Rnn is in upper Hessenberg form if hij = 0 8i > j + 1, i.e.: 2
3
h1;n 1 h1n 7 6 h2;n 1 h2n 77 6 6 6 h3;n 1 h3n 77 6 6 H=6 h4;n 1 h4n 77 : (5.53) 6 7 6 7 6 0 0 hn 1;n 1 hn 1;n 75 4 0 0 0 0 hn;n 1 hnn For redu ing a matrix A 2 Rnn to the upper Hessenberg form by orthogonal simih11 h12 h13 h21 h22 h23 0 h32 h33 0 0 h43
larity transformations, we shall use elementary re e tors. Re all that an elementary
5.5.
THE QR ALGORITHM
re e tor Us of order n and index s is an orthogonal matrix # " us uTs I s 1 0 nn Us = In 2 kusk2 = 0 Us 2 R ; where us = [0; ; 0; us;s ; us+1;s ; ; uns ℄T : Given a ve tor x 2 Rn there exists always a re e tor su h that (Usx)i = 0 for i = s + 1 : n (see Chapter 3.)
105 (5.54) (5.55) (5.56)
Theorem 5.17 For any matrix A 2 Rnn , there exists an orthogonal matrix U 2 Rnn su h that the matrix H = UAU T (5.57) is upper Hessenberg. Proof. We shall give a onstru tive proof based on the following omputational s heme: 1. for s = 1 : n 2 1. Compute an elementary re e tor Us+1 su h that (Us+1A)i;s = 0 for i = s + 2 : n 2. A Us+1A 3. A AUs+1 whi h overwrites the matrix A by the matrix A H = Un 1 U3 U2 AU2 U3 Un 1 : (5.58) Denoting U = Un 1 :::U3 U2 (5.59) we have U T = U2T U3T :::UnT 1 = U2 U3 :::Un 1 ; (5.60) be ause the elementary re e tors are symmetri . Consequently, (5.58) an be written in the form (5.57). It only remains to show that the above omputational s heme really reates an upper Hessenberg stru ture. We show this by nite indu tion. Step s=1: It is known (see Chapter 3) that there exists an elementary re e tor U2 su h that (U2 A)i;1 = 0 for i = 3 : n, i.e., whi h reates the upper Hessenberg stru ture on the rst olumn. The matrix U2 has the stru ture " # 1 0 U2 = 0 U : (5.61) 2
106
CHAPTER 5.
EIGENVALUES AND EIGENVECTORS
Hen e, the postmultipli ation of U2 A by U2 does not hange the rst olumn of U2 A. i.e., the zeroes in the rst olumn of U2 A are preserved in U2 AU2. Step s=k: Suppose that in the rst k 1 steps we have obtained the upper Hessenberg stru ture in the rst k 1 olumns. A Ak := Uk U2 AU2 Uk : (5.62) Then, there exists an elementary re e tor Uk+1 su h that (Uk+1 A)i;k = 0 for i = k + 2 : n, i.e., the premultipli ation by Uk+1 reates the upper Hessenberg stru ture in the k-th olumn whi h is not altered by postmultiplying with Uk+1. The stru ture " # I 0 k Uk+1 = 0 U (5.63) k+1 assures that the postmultipli ation by Uk+1 preserves the rst k olumns of the matrix (Uk+1Ak ). Hen e, the omputational s heme presented at the beginning of the proof provides the redu tion of a given matrix to the upper Hessenberg form by orthogonal similarity transformation. This ompletes the proof. 2 We leave to the reader the task of writing a detailed algorithm, based on the s heme given above (Problem 5.6. The algorithm is quite stable, this means that the omputed Hessenberg form in a oating point arithmeti is an exa t Hessenberg form of a slightly perturbed matrix A + E . Here E stands for a perturbation matrix whose entries are of magnitude omparable to the rounding errors. 5.5.2
The iterative stage of the QR algorithm
One of the most eÆ ient methods for nding all the eigenvalues of a matrix is an iterative te hnique, alled the QR algorithm, whi h uses, in an impli it manner, the power and inverse power methods to assure the iterative redu tion of a matrix to the S hur form, by ex lusively using orthogonal similarity transformations. To improve the eÆ ien y of the method, the given matrix is rst redu ed to the upper Hessenberg form. Suppose A 2 Rnn is a matrix in upper Hessenberg form. The QR algorithm produ es a sequen e of orthogonally similar matri es A = A1 ; A2 ; ; Ak ; Ak+1 ; (5.64)
onvergent to the S hur ((quasi)-upper triangular) form of the matrix A. The expli itly shifted QR algorithm is de ned by the following re urren e: 1:Ak k In = Qk Rk ; k = 1; 2; (5.65) 2:Ak+1 = Rk Qk + k In
5.5.
THE QR ALGORITHM
107
where the s alar k , alled the origin shift, is used to speed up the onvergen e. In the rst step of (5.65) the shifted matrix Ak k In is fa torized in a produ t of an orthogonal matrix Qk and an upper triangular matrix Rk . (See the QR fa torization in Chapter 3). In the se ond step, the next matrix Ak+1 is obtained multiplying the Qk and Rk matri es in reverse order and vanishing the shift by adding k In . The main properties of the matrix sequen e (5.64), alled the QR sequen e and generated by (5.65), are given by the following theorem. Theorem 5.18 a) If the initial matrix A1 = A of the QR matrix sequen e is upper
Hessenberg then all the matri es in sequen e have the upper Hessenberg stru ture. b) All the matri es of the QR sequen e are orthogonally similar and share the same eigenvalue spe trum.
) By an appropriate hoi e of the origin shift k , the QR sequen e is onvergent to a real S hur form of the initial matrix A.
The rst two properties an be easily derived. a) If Ak in (5.65) is an upper Hessenberg matrix, so is the shifted matrix Ak k In. If k 62 (Ak ), whi h is the usual ase, then Ak k In is nonsingular and so is the upper triangular matrix Rk . Hen e (5.66) Qk = (Ak k In )Rk 1 : The inverse of an upper triangular matrix is upper triangular and it is easy to he k that the produ t between an upper Hessenberg matrix and an upper triangular one is upper Hessenberg. Hen e Qk is an orthogonal upper Hessenberg matrix. It follows, by a straightforward omputation, that Rk Qk is upper Hessenberg and, learly, so is Ak+1. By indu tion, if A1 = A is upper Hessenberg then all Ak ; k = 2; 3; ::: are upper Hessenberg matri es. b) From (5.65) Rk = QTk (Ak k In ); (5.67) whi h introdu ed in (5.65) gives: Ak+1 = QTk (Ak k In )Qk + k In = QTk Ak Qk ; (5.68) i.e., Ak+1 and Ak are orthogonally similar and share the same eigenvalue spe trum. Applying (5.68) repeatedly we get Ak = QTk 1 Ak 1 Qk = QTk 1 QTk 2Ak 2 Qk 2 Qk 1 = = QTk 1QTk 2 Q1 A1Q1 Qk 2Qk 1 = Q~ Tk AQ~ k ; (5.69) where Q~ k = Q1 Q2 Qk 1 (5.70)
Proof.
108
CHAPTER 5.
EIGENVALUES AND EIGENVECTORS
is an orthogonal matrix as a produ t of orthogonal matri es. Hen e, all the matri es in the QR sequen e are orthogonally similar and onsequently share the same spe trum. The resulting orthogonal transformation (5.70) an be onstru ted re urrently by the relation Q~ k+1 = Q~ k Qk ; Q~ 0 = In ; k = 0; 1; 2; (5.71)
) We prove now that by an adequate hoi e of the shifts k the QR sequen e is
onvergent to the S hur form of the matrix. To this end, we show that the odiagonal element in the last row of Ak an be onstrained to approa h zero very swiftly. Moreover, all the subdiagonal elements of Ak may approa h zero somewhat more slowly. The reason is twofold: (i) The QR sequen e, with an appropriate hoi e of the origin shifts, implements a version of the Rayleigh quotient iteration and onsequently onverges quadrati ally. (ii) The QR sequen e implements simultaneously an extension of the power method. Thus, while the method onverges to a ertain eigenvalue, it also attempts to nding the others, albeit more slowly. In order to justify the above assertions let us rst onsider (5.65) from where, if k 62 (A), it results Rk = QTk (Ak k In ); (5.72) whi h with (5.69) and (5.71) be omes Rk = QTk (Q~ Tk AQ~ k k In) = QTk Q~ Tk (A k In )Q~ k = Q~ Tk+1 (A k In )Q~ k (5.73) or Q~ k+1Rk = (A k In )Q~ k : (5.74) We begin with the se ond assertion (ii). The rst olumn of (5.74) is Q~ k+1r1(k) = (A k In )~q1(k) ; (5.75) where q~1(k) = Q~ k e1 is the rst olumn of Q~ k and r1(k) = Rk e1 is the rst olumn of Rk . Sin e Rk is upper triangular we have (k) e r1(k) = r11 (5.76) 1 and (5.75) an be written in the equivalent form 1 (A In)~q(k) q~(k+1) = (5.77) 1
(k) r11
k
1
whi h expresses the power method's re urren e relations for omputing an eigenve tor of the matrix A using a s alar shift parameter k . As it has been shown in Se tion 5.4, the rst olumn of the orthogonal transformation matrix Q~ k onverges to an eigenve tor asso iated with the dominant eigenvalue of the matrix A k I .
5.5.
109
THE QR ALGORITHM
A
ording to the de ation pro edure (see Se tion 5.3) the rst olumn of the matrix
Q~ Tk AQk = Ak onverges to the S hur form of A, i.e., the subdiagonal element in the rst olumn of A is asymptoti ally vanishing. The speed of onvergen e depends
on the ratio of the absolute values of the rst and se ond dominant eigenvalues of
A k In .
In order to stress out the Rayleigh quotient iteration, let us derive from (5.74) the dependen e between the last olumns of the matri es Q~ k+1 and Qk , respe tively. Rewriting (5.74) in the form Rk Q~ Tk = Q~ Tk+1 (A k In ) (5.78) and separating the last row of (5.78) we get eTn Rk Q~ Tk = eTn Q~ Tk+1(A k In): (5.79) But Rk is upper triangular and its last row is (k) ℄ = r(k) eT : (5.80) eTn Rk = [0 0 rnn nn n Hen e (5.79) an be rewritten as (k) = (~ (~qn(k) )T rnn qn(k+1) )T (A k In ) (5.81) or (k) (AT In) 1 q~(k) ; q~n(k+1) = rnn (5.82) k n whi h de nes the inverse power iteration for nding an eigenve tor of the matrix AT . It is straightforward to prove that AT has the same eigenvalue spe trum as A but, in general, has other eigenve tors. Fortunately, if the last olumn of the orthogonal similarity transformation is an eigenve tor of AT then the o-diagonal elements of the last row of A are vanishing in a similar way as in the standard de ation pro edure. Indeed, onsider the orthogonal transformation i h (5.83) Q = Q vn with AT vn = vn : (5.84) Then # # " # " " T T T h i T A h Q Q AQ Q Av T n (5.85) A Q vn = T = 0 n Q AQ = vT v AQ vT Av n
n
n
n
where the expression for the se ond blo k row in the last equality follows from (5.83) and (5.84) vnT AQ = n vnT Q = 0 (5.86) vnT Avn = n vnT vn = n :
110
CHAPTER 5.
EIGENVALUES AND EIGENVECTORS
The quadratic convergence of \tilde{q}_n^{(k)} in (5.82) to an eigenvector of A^T can be achieved by choosing μ_k as in the Rayleigh quotient iteration (5.50):

    μ_k = (\tilde{q}_n^{(k)})^T A^T \tilde{q}_n^{(k)} / ‖\tilde{q}_n^{(k)}‖^2
        = (\tilde{q}_n^{(k)})^T A^T \tilde{q}_n^{(k)} = (\tilde{q}_n^{(k)})^T A \tilde{q}_n^{(k)}
        = e_n^T \tilde{Q}_k^T A \tilde{Q}_k e_n = e_n^T A_k e_n = a_{nn}^{(k)}.           (5.87)

The choice (5.87) of the origin shifts guarantees an excellent convergence of the QR sequence to the Schur form. Thus the proof of this part of the Theorem ends. □

In the above proof of convergence the explicitly shifted QR algorithm is defined by the recurrence (5.65), with the choice (5.87) of the origin shifts. When the element a_{n,n-1}^{(k)} satisfies a condition like

    |a_{n,n-1}^{(k)}| < tol (|a_{n-1,n-1}^{(k)}| + |a_{nn}^{(k)}|),

where tol is a prescribed tolerance level, we can consider a_{n,n-1}^{(k)} to be numerically negligible and set it to zero. Thus a_{nn}^{(k)} is the computed eigenvalue of A. After this step is completed, the dimension of the problem is reduced by one. Usually, 1.5-2 QR iterations are enough to compute one eigenvalue of a given Hessenberg matrix.

Another problem to be solved is to compute, in real arithmetic, a 2x2 diagonal block of the real Schur form. To remain in the real arithmetic field it is necessary to compress two QR iterations (5.65) into a single one, and to establish how to choose an adequate double origin shift. The details can be found in [?], [?].

Standard implementations of the QR algorithm use an iteration version called the implicitly shifted QR algorithm. The core of this version is an intensive use of the algorithm for reducing a matrix to upper Hessenberg form. In the implicitly shifted variant each iteration consists of two orthogonal similarity transformations which have the same effect as an explicitly shifted iteration. The first transformation, essentially defined by the first column of the matrix Q_k in (5.65), assures the desired origin shift but ruins the upper Hessenberg structure. The second transformation rebuilds the upper Hessenberg form by using the algorithm described in the proof of Theorem 5.17. For details the reader is referred to [?] and [?].

Eigenvector computation. Once the Schur form and the eigenvalues of the given matrix are obtained, the associated eigenvectors can be computed by solving the corresponding singular linear system or by a few iterations of the inverse power method. It is important to emphasize that in many applications one can, in an adequate manner, use the columns of the transformation Q defined in Theorem 5.15 instead of the eigenvectors. These columns are the so-called Schur vectors associated with the given matrix. The orthogonal transformation matrix Q can be computed by accumulating all the transformations performed during the execution of the QR algorithm.
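As an illustration of the explicitly shifted iteration (5.65) with the Rayleigh-quotient shift (5.87) and the deflation test above, the following Python/NumPy sketch iterates on an upper Hessenberg matrix. It is only an illustrative assumption of how such a driver could look (real eigenvalues are assumed, and no double-shift handling of 2x2 blocks is included); it is not the book's own implementation.

```python
import numpy as np

def qr_shifted_eigenvalues(H, tol=1e-12, max_iter=500):
    """Explicitly shifted QR iteration on an upper Hessenberg matrix H.

    Sketch only: uses the shift mu_k = a_nn^(k) (cf. (5.87)) and the
    deflation test |a_{n,n-1}| < tol*(|a_{n-1,n-1}| + |a_{nn}|);
    real eigenvalues are assumed (no 2x2 blocks are handled).
    """
    A = np.array(H, dtype=float)
    eigs = []
    n = A.shape[0]
    while n > 1:
        for _ in range(max_iter):
            mu = A[n-1, n-1]                      # Rayleigh-quotient shift
            Q, R = np.linalg.qr(A[:n, :n] - mu * np.eye(n))
            A[:n, :n] = R @ Q + mu * np.eye(n)    # one step of (5.65)
            if abs(A[n-1, n-2]) < tol * (abs(A[n-2, n-2]) + abs(A[n-1, n-1])):
                break
        A[n-1, n-2] = 0.0                         # deflate
        eigs.append(A[n-1, n-1])                  # computed eigenvalue
        n -= 1
    eigs.append(A[0, 0])
    return np.array(eigs[::-1])

# usage: reduce A to Hessenberg form first (e.g. with scipy.linalg.hessenberg)
```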
Problems

P 5.1 Prove that if A = [ A_1  A_2 ; 0  A_3 ], then λ(A) = λ(A_1) ∪ λ(A_3).

P 5.2 Compute a real Schur form of the matrix A = [ -1  2 ; -2  3 ].

P 5.3 Compute a real Schur form of the matrix A = [ 1  1 ; -1  1 ].

P 5.4 Write detailed algorithms for the power and inverse power methods.

P 5.5 Which will be the result of the power method applied to

    A = [ 5  8  13 ; 0  1  2 ; 0  0  -2 ],      B = [ 1  1 ; 3  -1 ],

with appropriate initial approximations? What does "appropriate" mean in these cases?

P 5.6 Write a detailed algorithm to reduce a given matrix A ∈ R^{n×n} to upper Hessenberg form by orthogonal similarity transformations. Use the computational scheme from the proof of Theorem 5.17.

P 5.7 There are p linear systems of the form (A + μ_k I_n) x_k = b_k to be solved, where A ∈ R^{n×n}, μ_k ∈ R, b_k ∈ R^n, for k = 1 : p.
a. Show that x_k can be computed from the solution of the system (H + μ_k I_n) y_k = d_k, where H ∈ R^{n×n} is upper Hessenberg and orthogonally similar to A.
b. Outline an algorithm for solving the p systems (A + μ_k I_n) x_k = b_k, using the idea of a. Compare the number of operations to that obtained using Gaussian elimination.

P 5.8
a. Consider the symmetric matrix A = [ 0  d ; d  0 ], with d ≠ 0. Compute the eigenvalues and eigenvectors of A.
b. Consider the diagonal matrix D = diag(d_1, ..., d_n), where d_i ≠ d_j for all i ≠ j, and B = [ 0  D ; D  0 ] ∈ R^{2n×2n}. Show that there exists a permutation matrix P (a product of elementary permutation matrices) such that P^T B P = diag(D_1, ..., D_n), where D_i = [ 0  d_i ; d_i  0 ]. Give an algorithm to compute the eigenvalues and eigenvectors of B.

P 5.9
a. Let A ∈ R^{2×2} be a symmetric matrix with a_{12} = β ≠ 0. Give an algorithm to compute the eigenvalues and the eigenvectors of A (verify that the eigenvectors are orthogonal).
b. Consider the symmetric matrix B ∈ R^{n×n} whose only nonzero off-diagonal entries are b_{1n} = b_{n1} = β ≠ 0 and whose diagonal entries satisfy b_{ii} = 1 for i = 2 : n-1. Give an algorithm to compute the eigenvalues and eigenvectors of B.

P 5.10 Show that a real symmetric matrix has a real eigenvalue spectrum and pairwise orthogonal eigenvectors. What computational facilities appear in the symmetric QR algorithm?

P 5.11 Suppose A ∈ C^{n×n} has distinct eigenvalues and B ∈ C^{n×n} commutes with A: AB = BA. Show that if Q^H A Q = T is the (complex) Schur decomposition of A, then Q^H B Q is upper triangular.

P 5.12 Give an algorithm for reducing a matrix A ∈ R^{n×n} to Hessenberg form, H = T^{-1} A T, where T is a product of elementary triangular (Gaussian) transformations. (The Hessenberg form similar to a given matrix is not unique.)

P 5.13 Suppose A ∈ R^{n×n} and z ∈ R^n. Give an algorithm for computing an orthogonal Q such that Q^T A Q is upper Hessenberg and Q^T z is a multiple of e_1.

P 5.14 Let H ∈ R^{n×n} be upper Hessenberg.
a. Describe the details of an explicitly shifted QR step, using the shift μ = h_{nn}.
b. Suppose that an orthogonal Q ∈ R^{n×n} was determined such that

    H~ := Q^T H Q = [ λ  *  ;  0  H_1 ],

where λ ∈ λ(H) and H_1 ∈ R^{(n-1)×(n-1)} is upper Hessenberg. Give an algorithm for computing an eigenvector associated with λ.

P 5.15 Let H ∈ R^{n×n} be upper Hessenberg. Write an algorithm which verifies whether H is a real Schur form (quasi upper triangular).

P 5.16 Give an algorithm to compute the eigenvalues of A = I_n + u v^T, where u, v ∈ R^n.
Chapter 6
The singular value decomposition

As we have already seen, orthogonality plays a striking role in matrix computations. The solution of the least squares (LS) problem and of the eigenvalue-eigenvector problem relies intensively on orthogonal transformations. Another important application of orthogonal transformations is the computation of an extremely useful structural matrix decomposition called the singular value decomposition (SVD). Among other features, the SVD makes it possible to solve the matrix rank computation problem efficiently. The important mathematical concept of matrix rank is perfectly clear in the exact arithmetic setting. However, in the presence of roundoff errors, all computed matrices are, generically, of full rank. The SVD allows us to introduce the practical concept of numerical rank. The SVD is also the best tool for solving the general least squares problem, i.e., computing the minimum norm vector x which minimizes the Euclidean norm of the residual ‖r‖ = ‖b - Ax‖ when the matrix A is rank deficient.

6.1 Definitions. Properties
Before stating the SVD, let us first recall some results from the theory of orthogonal transformations established in Chapter 3.

1. Orthogonal transformations preserve the Euclidean norm, i.e., if Q ∈ R^{m×m} is an orthogonal matrix, x ∈ R^m and ‖·‖ is the Euclidean vector norm, then

    ‖Qx‖ = ‖x‖.                                                                           (6.1)

2. The matrix 2-norm ‖·‖_2 and the Frobenius matrix norm ‖·‖_F are also invariant with respect to orthogonal transformations, i.e., for any orthogonal matrix Q ∈ R^{m×m} and any matrix A ∈ R^{m×n} we have

    ‖QA‖_2 = max_{‖x‖_2 = 1} ‖QAx‖_2 = ‖A‖_2,                                             (6.2)
    ‖QA‖_F = ( Σ_{i=1}^{m} Σ_{j=1}^{n} |(QA)_{ij}|^2 )^{1/2} = ‖A‖_F.                     (6.3)

3. In particular, it is easy to prove that for any orthogonal matrices Q and Z of appropriate dimensions we have

    ‖QAZ‖_2 = ‖A‖_2,                                                                      (6.4)
    ‖QAZ‖_F = ‖A‖_F.                                                                      (6.5)

4. The matrix rank is defined as follows.

Definition 6.1 Let A ∈ R^{m×n}. The rank of the matrix A is the maximal number of linearly independent columns of A, or, equivalently,

    rank(A) = dim(Im A),                                                                  (6.6)

where dim stands for the dimension of the argument space.

5. It can be shown (see any linear algebra book) that

    rank(A) = rank(A^T)                                                                   (6.7)

and, thus, any matrix has the same maximal number of independent rows and independent columns. Hence the rank of a given matrix A is the maximal dimension of a nonsingular square submatrix of A.

6. It is an easy exercise for the reader to show that

    rank(A) = rank(TAS)                                                                   (6.8)

for any nonsingular matrices T and S of appropriate dimensions.

Now we are able to define the SVD by means of the following theorem.

Theorem 6.2 (Singular Value Decomposition - SVD) If A ∈ R^{m×n} then there exist two orthogonal matrices U ∈ R^{m×m} and V ∈ R^{n×n} such that

    U^T A V = Σ = [ Σ_1  0 ; 0  0 ],                                                      (6.9)

where

    Σ_1 = diag(σ_1, σ_2, ..., σ_r) ∈ R^{r×r},                                             (6.10)

with

    σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0.                                                            (6.11)

The expressions (6.9)-(6.11) define the singular value decomposition of the matrix A.
If A = 0 then (6.9)-(6.11) are satis ed by r = 0, U = Im, V = In, = 0. If then kAk2 6= 0 and, by using a nite indu tion, we shall obtain the matrix stru ture in (6.9). Step 1. Let x 2 Rm be a unit eu lidian norm ve tor (kxk2 = 1) for whi h (6.12) kAk2 = kmax kAxk2 = kAxk2 xk =1
Proof. A 6= 0
2
(the existen e of su h a ve tor is assured by the theorem on erned with the matrix 2-norm). De ne the ve tor y 2 Rn as y=
where x is de ned in (6.12). Then
Ax kAk2 ;
(6.13)
kyk2 = yT y = kA1k kAxk22 = 1: q
q
(6.14)
2
As we have seen in the previous hapter (see the de ation te hnique), on e given a unit 2-norm ve tor q 2 Rp there exists a matrix Q 2 Rp(p 1) su h that the matrix i h Q = q Q is orthogonal (for example Q an hoosen to be a re e tor). Hen e, let U 1 2 Rm(m 1) and V 1 2 Rn(n 1) be su h that the matri es h i U1 = y U 1 2 Rmm (6.15) be orthogonal. Then 1 := U1T AV1 = From (6.13)
h
i
h
i
V1 = x V 1 "
yT U T1
#
A x V1
(6.16)
2 Rnn
=
"
#
yT Ax yT AV 1 : U T1 Ax U T1 AV 1
yT Ax = yT ykAk2 = kAk2 U T1 Ax = U T1 ykAk2 = 0:
(6.17) (6.18) (6.19)
Denoting
1 wT B1
(6.17) be omes
1 =
and applying (6.4) "
1 w1
:= kAk2 := yT AV 1 := U T1 AV 1 1=
"
U T AV 1
(6.20)
1 wT 0 B1
#
k1 k2 = kAk2 = 1 :
On the other hand,
# 2
2
=
"
12 + wT w B1 w
# 2
2
(6.22)
= (12 + wT w)2 + kB1 wk22
(12 + wT w)2 = (12 + kwk22 )2
and from the onsisten y property of the matrix 2-norm we have
"
1 w1
# 2
2
"
k1 k22
w1
# 2
2
Now, (6.23) and (6.24) imply i.e. or Hen e
(6.21)
= 12 (12 + kwk22 ):
(6.23) (6.24)
12 + kwk22 12 ;
(6.25)
kwk2 = 0
(6.26)
w = 0:
(6.27)
"
1
0
#
(6.28) 1 = 1 1 = 0 B1 and the obje tive of the rst step is attained. Step s. Suppose that in the rst s 1 steps of the diagonalization pro edure we have obtained " # 0 s 1 T T T s 1 = Us 1 U2 U1 AV1 V2 Vs 1 = 0 Bs 1 (6.29) where s 1 = diag(1; 2 ; :::; s 1 ); with 1 2 s 1 > 0: (6.30) U T AV
If Bs 1 = 0 the pro edure is ompleted. If Bs 1 6= 0 then there exists U~s 2 R(m s+1)(m s+1) and V~s 2 R(n s+1)(n s+1) both orthogonal (by the same arguments as in step 1) su h that # " 0 s T (6.31) U~s Bs 1 V~s = 0 B ; s = kBs 1 k2 > 0: s De ning # # " " I I s 0 s 0 (6.32) Us = 0 U~s ; Vs = 0 V~s ; whi h are both orthogonal, it results # " s 0 T (6.33) s = Us s 1Vs = 0 Bs ; with s a diagonal matrix with positive diagonal elements. To omplete the proof it is suÆ iently to show that s s 1: (6.34) But this inequality holds sin e the expression (6.31), written for the previous step, we have
"
# " #
0 0
~ s 1 s 1 T ~s 1
=
V s 1 = kBs 2 k2 =
Us 1 0 Bs 1 0 Bs 1
"
kzk2 =1
#
z
2
"
#"
2
#
0 s 1 0 max 0 B s 1 2 kz k 2 = 1 0 Bs 1 y 2 T T z = [0 y ℄ = kmax kBs 1yk2 = kBs 1 k2 = s: yk =1 Hen e, the diagonalization pro edure an be started and ontinued. Thus, for some r p = min(m; n) (6.35) we shall obtain for r < p Br = 0 (6.36) or Br does not exist for r = p. Finally, # " r 0 T T T T (6.37) r = Ur U2 U1 AV1 V2 Vr = U AV = 0 0 ; where U = U1 U2 :::Ur (6.38) V = V1 V2 :::Vr are orthogonal matri es, r := r ; := r and the proof ends. 2 = max
s 1
0
2
Remark 6.3 Let u_j = U e_j, j = 1:m, and v_j = V e_j, j = 1:n, be the j-th columns of the matrices U and V defined in (6.9), respectively. Rewriting (6.9) in the form

    AV = UΣ                                                                               (6.39)

or

    A^T U = V Σ^T                                                                         (6.40)

we get

    A v_j = σ_j u_j,   A^T u_j = σ_j v_j,   j = 1:p,  p = min(m, n).                      (6.41)

Theorem 6.2 and the above Remark allow us to introduce the following definition.

Definition 6.4 The nonnegative numbers σ_i, i = 1:p, p = min(m, n),

    σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0,   σ_{r+1} = σ_{r+2} = ... = σ_p = 0,                       (6.42)

are called the singular values of the matrix A. The columns u_j ∈ R^m of the orthogonal matrix U are called the left singular vectors of the matrix A. The columns v_j ∈ R^n of the orthogonal matrix V are called the right singular vectors of the matrix A.
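The defining relations (6.41) are easy to check numerically. The short Python/NumPy sketch below is an illustration added here (it is not part of the original text); it verifies (6.41) for a random matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

# numpy returns U (m x m), the singular values, and V^T (n x n)
U, s, Vt = np.linalg.svd(A)
V = Vt.T
p = min(A.shape)

for j in range(p):
    # A v_j = sigma_j u_j  and  A^T u_j = sigma_j v_j, cf. (6.41)
    assert np.allclose(A @ V[:, j], s[j] * U[:, j])
    assert np.allclose(A.T @ U[:, j], s[j] * V[:, j])

print("singular values:", s)   # ordered decreasingly, cf. (6.42)
```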
The SVD reveals a great deal about the structure of a given matrix. We give some properties in the propositions below. Let A ∈ R^{m×n} and let its SVD be given by Theorem 6.2 and Definition 6.4.

Proposition 6.5

    rank(A) = r,                                                                          (6.43)

i.e., the rank of a matrix is given by the number of its nonzero singular values.

Proof. The orthogonal matrices U^T and V are clearly nonsingular. Then, from (6.8) and (6.9)-(6.11),

    rank(A) = rank(U Σ V^T) = rank Σ = rank Σ_1 = r.  □

Let us now introduce the following partitions of U and V:

    U = [u_1 u_2 ... u_r u_{r+1} ... u_m],      V = [v_1 v_2 ... v_r v_{r+1} ... v_n],
    U(:, 1:r) =: U_1,  U(:, r+1:m) =: U_2,      V(:, 1:r) =: V_1,  V(:, r+1:n) =: V_2.
Proposition 6.6

    a)  Im A = Im U_1,                                                                    (6.44)
    b)  Ker A = Im V_2,                                                                   (6.45)

i.e., the first r columns of the matrix U form an orthonormal basis for the linear subspace Im A and the last n-r columns of the matrix V form an orthonormal basis for the linear subspace Ker A.

Proof.
a)
ImA = fy 2 Rm j9x 2 Rn su h that y = Axg: Let y 2 ImA. Then, from (6.9) " # " # i h 0 0 T y = Ax = U V T x = U 0 0 V T x = U1 U2 0 0 V x = U1 z h
where z = 0
i
V T x.
Hen e y 2 ImU1 and ImA ImU1: Conversely, if y 2 ImU1, i.e., y = U1x for some x 2 Rr then for z = y = U1 z = U1
h
0r;n
i
r
"
z w
#
(6.46) 1 x we have
for an arbitrarily hosen w 2 Rn r , or equivalently y = U1 [ 0℄V T V
where z~ = V
"
z w
#
"
z w
#
= U V T z~ = Az~
2 Rn . Hen e y 2 ImA and
ImU1 ImA: From (6.46) and (6.47) the desired equality (6.44) follows. b) KerA = fx 2 RnjAx = 0 2 Rm g: Let x 2 KerA. Then Ax = 0, i.e. " # #" # " Tx T 0 V V T 1 1 Ax = U V x = U 0 0 0 = 0: V2T x = U
(6.47)
(6.48)
But U and are nonsingular and onsequently V1T x = 0; (6.49) i.e., x 2 KerV1T : (6.50) Now, it is easily he kable that KerV1T = ImV2 : (6.51) The olumns of V form an orthogonal basis of Rn and hen e the olumns of V1 and the olumns of V2 form orthogonal basis for the two orthogonal omplementary subspa es S = ImV1 and T = ImV2 = S ?. Hen e any ve tor x 2 T is orthogonal to S , parti ularly to every olumn of V1 , i.e., xT V1 = 0 or, V1T x = 0 and onsequently x 2 KerV1T or equivalently T KerV1T . Conversely, if x 2 KerV1T , i.e., V1T x = 0 or, equivalently xT V1 = 0 or x ? ImV1 , i.e., x 2 T ) KerV1T T . Hen e (6.51) is true. From (6.48), (6.50) and (6.51) we have KerA ImV2 : (6.52) Conversely, if x 2 ImV2 = KerVT1 then V1T x = 0 and from (6.48) we get Ax = 0, i.e., x 2 KerA. Hen e ImV2 KerA: (6.53) The in lusions (6.52) and (6.53) prove the desired equality (6.45) and the proof ends. 2
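A minimal Python/NumPy sketch of Proposition 6.6 (an illustration added here, not part of the original text): the leading r left singular vectors span Im A and the trailing n-r right singular vectors span Ker A.

```python
import numpy as np

def range_null_bases(A, tol=1e-12):
    """Orthonormal bases for Im A and Ker A from the SVD (Proposition 6.6)."""
    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > tol * (s[0] if s.size else 1.0)))
    U1 = U[:, :r]          # orthonormal basis of Im A
    V2 = Vt[r:, :].T       # orthonormal basis of Ker A
    return U1, V2

# example: a rank-1 matrix
A = np.outer([1.0, 2.0, 3.0], [1.0, -1.0])
U1, V2 = range_null_bases(A)
assert np.allclose(A @ V2, 0)              # columns of V2 lie in Ker A
assert np.allclose(U1 @ (U1.T @ A), A)     # the columns of A lie in Im U1
```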
The following proposition highlights a close relation between the SVD and two particular matrix norms.

Proposition 6.7

    a)  ‖A‖_2 = σ_1,                                                                      (6.54)
    b)  ‖A‖_F = (σ_1^2 + σ_2^2 + ... + σ_r^2)^{1/2},

i.e., the 2-norm of a matrix coincides with its maximum singular value and the Frobenius norm equals the square root of the sum of the squared singular values.

Proof. a) From (6.4) we have

    ‖A‖_2 = ‖U^T A V‖_2 = ‖Σ‖_2 = max_{‖x‖_2 = 1} ‖Σx‖_2
          = max_{x_1^2 + ... + x_n^2 = 1} (σ_1^2 x_1^2 + σ_2^2 x_2^2 + ... + σ_r^2 x_r^2)^{1/2},

i.e.,

    ‖A‖_2^2 = max_{‖x‖_2 = 1} (σ_1^2 x_1^2 + σ_2^2 x_2^2 + ... + σ_r^2 x_r^2).            (6.55)

But σ_1 ≥ σ_2 ≥ ... ≥ σ_r > 0. Hence

    f(x_1, x_2, ..., x_n) = σ_1^2 x_1^2 + σ_2^2 x_2^2 + ... + σ_r^2 x_r^2
                          ≤ σ_1^2 (x_1^2 + x_2^2 + ... + x_r^2) ≤ σ_1^2 (x_1^2 + ... + x_n^2)

and consequently

    max_{‖x‖_2 = 1} f ≤ σ_1^2 ‖x‖_2^2 = σ_1^2.                                            (6.56)

The equality in (6.56) holds for x = [1 0 ... 0]^T, so that max_{‖x‖_2 = 1} f = σ_1^2, and from (6.55) we conclude that (6.54) is true.

b) From (6.5),

    ‖A‖_F = ‖U^T A V‖_F = ‖Σ‖_F = ‖Σ_1‖_F = (σ_1^2 + σ_2^2 + ... + σ_r^2)^{1/2}

and the proof ends.  □

The computation of the SVD is based on the following main result.

Theorem 6.8 The nonzero singular values σ_i, i = 1:r, of a matrix A ∈ R^{m×n} are the positive square roots of the nonzero eigenvalues of the symmetric positive semidefinite matrix

    B = A^T A,                                                                            (6.57)

i.e., if λ_1 ≥ λ_2 ≥ ... ≥ λ_r > 0 are the r nonzero eigenvalues of B, then

    σ_i = sqrt(λ_i),  i = 1:r.                                                            (6.58)

Proof. Using Theorem 6.2, there exist two orthogonal matrices U ∈ R^{m×m} and V ∈ R^{n×n} such that (6.9) holds and we get

    B = A^T A = V Σ^T U^T U Σ V^T = V Σ^T Σ V^T = V [ Σ_1^2  0 ; 0  0 ] V^T.              (6.59)

Hence B = A^T A and Σ^T Σ are orthogonally similar and share the same spectrum. Then

    λ(B) = λ(Σ^T Σ) = { σ_1^2, σ_2^2, ..., σ_r^2, λ_i = 0 for i = r+1:n },                (6.60)

which proves (6.58).  □

Theorem 6.2 shows that a symmetric positive semidefinite matrix factorized as in (6.57) has only real nonnegative eigenvalues. Moreover, any symmetric positive semidefinite matrix has this property.
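Theorem 6.8 can be checked directly with NumPy (an illustrative sketch added here, not from the original text): the singular values of A agree with the square roots of the eigenvalues of B = A^T A.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))

s = np.linalg.svd(A, compute_uv=False)        # singular values of A
lam = np.linalg.eigvalsh(A.T @ A)[::-1]       # eigenvalues of B = A^T A, descending

assert np.allclose(s, np.sqrt(np.maximum(lam, 0.0)))   # cf. (6.58)
```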
Proposition 6.9 Let B ∈ R^{n×n} be a symmetric matrix and let λ(B) = {λ_1, λ_2, ..., λ_n} and x_1, x_2, ..., x_n be its spectrum and the corresponding set of eigenvectors, respectively. Then the following assertions are true.

1.  λ(B) ⊂ R.                                                                             (6.61)
2.  x_i^T x_j = 0 for all i ≠ j,                                                          (6.62)
    i.e., all the eigenvectors are pairwise orthogonal.
3.  The real Schur form of B is diagonal.
4.  If, in addition, B ≥ 0, then
        λ_j ≥ 0,  j = 1:n.                                                                (6.63)

Proof. From the previous chapters we know that any matrix is orthogonally similar to a real Schur form

    S = Q^T B Q,                                                                          (6.64)

where Q is orthogonal and S is quasi upper triangular. Since

    S^T = Q^T B^T Q = Q^T B Q = S,                                                        (6.65)

we infer that S is a block diagonal matrix, each block having dimension at most 2. The 2x2 diagonal blocks are symmetric and, as can be easily checked, their spectra belong to R. From here we conclude that S is a real diagonal matrix, which is exactly assertion 3. From λ(B) = λ(S) we obtain that (6.61) also holds. Using (6.64) rewritten as

    QS = BQ                                                                               (6.66)

we get

    s_{jj} q_j = B q_j,  j = 1:n,                                                         (6.67)

where q_j = Q e_j, j = 1:n, is the j-th column of the matrix Q. From (6.67) we see that the eigenvectors of B are collinear with the corresponding columns of an orthogonal matrix, and (6.62) follows. If the symmetric matrix B is positive semidefinite, then x^T B x ≥ 0 for all x ∈ R^n. Taking successively x = Q e_j = q_j, j = 1:n, this inequality becomes

    q_j^T B q_j = e_j^T Q^T B Q e_j = e_j^T S e_j = s_{jj} = λ_j ≥ 0,  j = 1:n,            (6.68)

so assertion 4 is true and the proof ends.  □

Theorem 6.8 suggests a procedure for computing the singular values of a given matrix A by applying the QR algorithm to compute the eigenvalues of B = A^T A.
This procedure is not recommended from a numerical point of view, because the numerical condition of B can be poor: it equals the square of the numerical condition of A. In order to avoid the explicit computation of B, a numerically sound algorithm (due to Golub and Kahan) acting exclusively on the matrix A was developed. This algorithm, known as the SVD algorithm, is presented in the next section.

6.2 The SVD Algorithm
As we have already seen, the singular values of a matrix A ∈ R^{m×n} are the nonnegative square roots of the eigenvalues of the symmetric positive semidefinite matrix B_1 = A^T A (see (6.60) of Theorem 6.8). Moreover, there are other important relationships between the SVD of a matrix A and the RSF (real Schur form) of the matrices B_2 = A A^T and B_3 = [ 0  A ; A^T  0 ].

Theorem 6.10 Let A ∈ R^{m×n}, with m ≥ n, and consider its SVD

    U^T A V = Σ = [ Σ_1 ; 0 ],   Σ_1 = diag(σ_1, σ_2, ..., σ_n).                          (6.69)

Then the matrices B_2 and B_3 defined above have the following RSFs:

    S_2 = U^T (A A^T) U = diag(σ_1^2, σ_2^2, ..., σ_n^2, 0, ..., 0)   (m-n trailing zeros),   (6.70)

    S_3 = Q^T [ 0  A ; A^T  0 ] Q
        = diag(σ_1, σ_2, ..., σ_n, -σ_1, -σ_2, ..., -σ_n, 0, ..., 0)  (m-n trailing zeros),   (6.71)

where the orthogonal matrix Q is

    Q = (1/√2) [ V    V     0      ;  U_1  -U_1  √2 U_2 ],                                (6.72)

with

    U_1 = U(:, 1:n),   U_2 = U(:, n+1:m).

Proof. a) First, from the SVD of A we have

    A A^T = U Σ V^T V Σ^T U^T = U Σ Σ^T U^T = U [ Σ_1^2  0 ; 0  0 ] U^T = U S_2 U^T

and consequently S_2 = U^T A A^T U.
b) Se ond, from the SVD of A and the partition U2 ℄ U = [|{z} U1 |{z} n
we have
A = U
VT
=
h
U1 U2
and
AT
Hen e
#
"
T B3 = A0 A0 =
Let
"
m n
i
"
1 0
#
VT
= U1 1V T
= V 1U1T :
0 U1 1 V T
V 1 U1T
"
#
#
0
"
= U01 V0
#"
0 1 1 0
#"
0
VT
U1T
#
0 : (6.73)
S := 0 01 2 Rnn ; 1 = diag(1 ; ; n ) 1 and de ne the similarity transformation " # " # I 0 I n n 0 1 T := ;T = : In In
Then and
S~ = T ST 1 =
In In
"
1 1 0 1
#
(6.74)
(S ) = (S~) = (1 ) [ (
1) = f1 ; ; ng [ f 1 ; ; ng: For obtaining the RSF of the symmetri matrix S we need to determine a set of (orthonormal) eigenve tors for it. To this end, we nd rst a set of eigenve tors yj 2 R2n , j = 1 : 2n, for S~. Further, the eigenve tors xj 2 R2n of S will be
omputed as xj = T 1 yj : (6.75) Hen e, for j = j ; j = 1 : n, and with the partition yj =
"
yj0 yj00
S~yj = j yj
an be split as
(i j )yij0 + iyij00 = 0 ; = 0 ( i j )yij00
#
the system (6.76)
i = 1 : n:
(6.77)
The system (6.77) is satis ed by the nonzero ve tor yj given by yij00 = 0; i = 1 : n 0 yij = 0; i = 1 : n n fj g (6.78) 0 yjj = j 6= 0 For j = j ; j = 1 : n, the system (6.76) be omes (i + j )yij0 + iyij00 = 0 ; i = 1 : n; (6.79) = 0 ( i + j )yij00 whi h is satis ed by a nonzero ve tor yj00 if 00 = j 6= 0 yjj (6.80) the other omponents of yj00 being zero 00 = 0; i = 1 : n n fj g: yjj (6.81) From the rst set of equations (6.79) we have that yj00 an be yij0 = 0; i = 1 : n n fj g (6.82) 2j yjj0 + j j = 0 or 0 = 1 j : yjj (6.83) 2 Let now R := diag(1 ; 2 ; ; n ) (6.84) K := diag(1 ; 2 ; ; n ): Then from (6.78) and (6.80)-(6.83), the set of eigenve tors of S~ an be rewritten in the form # " 1K R 2 2 R2n2n (6.85) Y= 0 K and from (6.75) the orresponding set of eigenve tors of S is " #" # " # 1K 1K I R R n 0 1 2 2 X =T Y = I I (6.86) 0 K = R 21 K n n i.e. # " j ej (6.87) xj = e 2 R2n ; j = 1 : n; j j 1 xj = 2
"
j nej n j nej n
#
2 R2n; j = n + 1 : 2n:
(6.88)
For unity eu lidian norm eigenve tors, it is ne essary to have 2j + 2j = 1 ) j = p12 ; j = 1 : n; (6.89) 1 2 + 1 2 = 1 ) j = p2; j = 1 : n: 4 j 4 j Consequently the eigenve tor matrix X (6.86) be omes # " 1 I I n n X=p I I (6.90) 2 n n and a dire t he k shows that it is orthogonal. Hen e the RSF of S is # # " #" #" " 1 0 I I 0 I I 1 n n 1 n n T ~ = X SX = 2 In In 1 0 In In = 0 1 : (6.91) Now from (6.73) we have " # " # " # T T 0 A 0 V 0 U 1 B3 = A 0 = U 0 S V T 0 = 1 "
where " Q~ = 0
U1
= U01 V0 #
"
#
#
X ~ X T
0 V 1 0 X = U1 0 p2
V
"
"
0
VT
In In
In In
U1T
0
#
#
= Q~ ~ Q~ T ;
= p12
"
V U1
(6.92) V U1
#
2 R(m+n)n :
Finally, it is straightforward to see that the matrix # # " h 1 V V 0 0 (m+n)(m+n) ~ p Q= Q U = p 2 2 U1 U1 2U2 2 R "
#
(6.93)
(6.94)
T 0 = Im+n ) and is orthogonal ( = 2V0V 2UU T 2 3 " # 0 0 1 ~ B3 = Q 0 gg2mn n QT ) QT B3 Q = 64 0 1 0 75 2 0 0 0 0 0 Theorem 6.8 and the above one suggest the possibility of omputing the singular values of a matrix A as the eigenvalues of one of the matri es B1, B2 or B3, whi h in turn, an be omputed by using the QR Algorithm. However, for omputing the SVD of A the orthogonal matri es U and V are needed in addition.
QQT
1 2
For example,
let B = B1 = AT A and let Q be the orthogonal matrix produ ed by the QR algorithm QT BQ = diag(1 ; 2 ; : : : ; n ) =: (6.95) with i 0; i = 1 : n (be ause of AT A 0). The rst step in obtaining the SVD of A is the orthogonal ordering of the diagonal elements of . If Pij = [e1 ; : : : ; ei 1 ; ej ; ei+1 ; : : : ; ej 1; ei ; ej+1; : : : ; en ℄ is the (i; j )permutation matrix (whi h is orthogonal) then PijT Pij = Pij Pij = diag(1 ; : : : ; i 1 ; j ; i+1 ; : : : ; j 1 ; i ; j +1 ; : : : ; n ) (6.96) and any ordering algorithm an be used for obtaining P T P = diag(12 ; 22 ; ; n2 ) (6.97) with 1 2 : : : n. De ning V = QP (6.98) we have V T BV = V T AT AV = diag(12 ; 22 ; : : : ; n2 ) (6.99) where, as we have already seen, i, i = 1 : n, are the singular values of A and V is one of the two orthogonal matri es de ning the SVD of A. For omputing the matrix U let i , i = 1 : r, be the nonzero singular values of A 1 = diag(1 ; 2; ; r ) (6.100) and onsider the partition i h (6.101) V = V1 V2 ; V1 = V (:; 1 : r): Then (6.66) be omes # " # " V1T AT A h V V i = 21 0 gr 2 (6.102) 1 2 0 0 gn r =: V2T i.e. V1T AT AV1 = 21 ; V2T AT AV2 = 0 or 1 1V1T AT AV1 = 1; AV2 = 0: (6.103) Let U1 = (1 1 V1T AT )T = AV1 1 2 Rmr : (6.104)
128
CHAPTER 6.
THE SINGULAR VALUE DECOMPOSITION
The matrix U1 has orthonormal olumns U1T U1 = 1 1 V1T AT AV1 1 1 = Ir and if U2 2 Rm(m r) is a ompletion of U1 to an orthogonal matrix U2 ℄ U = [|{z} U1 |{z} r
m r
then from (6.103) and (6.104) we have U T AV
=
"
U1T U2T
#
h
A V1 V2
i
=
"
"
U1T AV1 U1T AV2 U2T AV1 U2T AV2
= 01 00
#
#
=
"
U1T U1 1 U2T U1 1
gr gm r
|{z} |{z}
r
(6.105) (6.106) 0 0
#
(6.107)
n r
whi h is exa tly the SVD of A and onsequently the matrix U an be " omputed # 0 1 from (6.104) and (6.106). More dire tly, from (6.107), the matrix 0 0 is upper triangular. So U T (AV ) an be seen as the orthogonal triangularization of the matrix AV 2 Rmn . Sin e su h a triangularization does not ne essarily give a diagonal ordered matrix it should be ompleted with a olumn pivoting strategy (see [?℄). The QR fa torization with olumn pivoting omputes a QR fa torization of a matrix C 2 Rmn (m n) with permuted olumns CP = QR; (6.108) where P 2 Rnn is a permutation matrix, Q 2 Rmm is orthogonal and, in the general ase, the upper triangular matrix R has the following stru ture R=
"
R11 R12
#
gr ; r = rank(C ); R11 nonsingular: gn r
(6.109) 0 0 In order to obtain the fa torization (6.108)-(6.109) let us ompute an orthogonal triangularization with olumn pivoting by using Householder re e tors. If, at step s, we have omputed the re e tors U1 ; U2 ; ; Us 1 and the permutation matri es P1 ; P2 ; ; Ps 1 su h that " (s 1) C (s 1) # gs 1 C 12 11 Us 1 U1 CP1 P2 Ps 1 = Cs 1 = 0 C22(s 1) gm s + 1 (6.110)
6.2.
THE SVD ALGORITHM
129
where C11(s 1) is upper triangular and nonsingular. Now, suppose that i (s 1) = h (s 1) (s 1) C22 (6.111) zs zs+1 zn(s 1) is the olumn partition of C22(s 1) and let l 2 s : n be the smallest index su h that (kzi(s 1) k2 ): (6.112) kzl(s 1) k2 = max i2s:n
Noti e that if r = rankC = s 1 then kzl(s 1) k2 = 0 and the fa torization is a hieved. For improving the eÆ ien y of the fa torization method, noti e that there is no need to ompute at ea h step the norms in (6.112) sin e the orthogonal transformations leave them un hanged and " # QT z = w gg1k 1 ) kQT z k = 2 + kwk22 ) kwk2 = kz k 2 (6.113)
holds for any orthogonal matrix Q 2 Rkk and any z 2 Rk . Permuting, at the urrent step s, the olumns l and s, we get nally that the fa torization (6.108) is given by Q = U1 U2 Ur (6.114) P = P1 P2 Pr : This algorithm was developed by Businger and Golub and is based on the following
computational scheme (ν_j holds the squared norm of the trailing part of column j, so that the downdating in step 5.8 is consistent with (6.113)).

1. Q ← I_m
2. for j = 1 : n
    1. ν_j ← ‖C(1:m, j)‖_2^2
3. r ← 0
4. Find the smallest l ∈ 1:n such that μ = ν_l = max_{j=1:n}(ν_j)
5. while μ > 0
    1. r ← r + 1
    2. p_r ← l
    3. C(1:m, r) ↔ C(1:m, l)
    4. ν_l ↔ ν_r
    5. Compute the reflector U_r such that (U_r C)(r+1:m, r) = 0
    6. C ← U_r C
    7. Q ← Q U_r
    8. for j = r+1 : n
        1. ν_j ← ν_j - c_{rj}^2
    9. if r < n
        1. Find the smallest l ∈ r+1:n such that μ = ν_l = max_{j=r+1:n}(ν_j)
       else
        1. μ ← 0
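A compact Python/NumPy sketch of this column-pivoted orthogonal triangularization (an illustration of the scheme above, under the stated assumptions; it is not the book's own code). It tracks squared column norms and stops when the remaining columns are numerically negligible.

```python
import numpy as np

def qr_col_pivoting(C, tol=1e-12):
    """Householder QR with column pivoting, C[:, perm] = Q @ R (cf. (6.108)-(6.115))."""
    C = np.array(C, dtype=float)
    m, n = C.shape
    Q = np.eye(m)
    perm = np.arange(n)
    nu = np.sum(C**2, axis=0)                    # squared column norms
    scale = max(nu.max(), 1.0)
    r = 0
    while r < min(m, n):
        l = r + int(np.argmax(nu[r:]))
        if nu[l] <= tol * scale:                 # remaining columns negligible
            break
        C[:, [r, l]], nu[[r, l]], perm[[r, l]] = C[:, [l, r]], nu[[l, r]], perm[[l, r]]
        x = C[r:, r].copy()                      # Householder vector for column r
        x[0] += (1.0 if x[0] >= 0 else -1.0) * np.linalg.norm(x)
        beta = x @ x
        if beta > 0:
            C[r:, :] -= np.outer(x, (x @ C[r:, :]) * (2.0 / beta))   # C <- U_r C
            Q[:, r:] -= np.outer(Q[:, r:] @ x, x) * (2.0 / beta)     # Q <- Q U_r
        nu[r+1:] = np.maximum(nu[r+1:] - C[r, r+1:]**2, 0.0)         # downdate, cf. (6.113)
        r += 1
    return Q, np.triu(C), perm, r                # r is the detected rank

A = np.array([[1., 2., 3.], [2., 4., 6.], [0., 1., 1.], [1., 0., 1.]])
Q, R, p, r = qr_col_pivoting(A)
assert np.allclose(A[:, p], Q @ R)
```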
The ve tor p 2 Nr ontains all the olumn permutations. Clearly, the upper triangular matrix R overwrites the matrix C and nally jr11 j jr22 j jrrr j > 0: (6.115) In our parti ular ase C = AV and the result of the orthogonal triangularization with olumn pivoting QT CP
= QT AV P =
"
R11 R12
0
#
gr
0
(6.116)
is an orthogonal upper triangular matrix (QT AV P )T QT AV P = P T V T AT QQT AV P = P T P (6.117) = Pr P2 P1 P1P2 Pr = diag(p2 ; p2 ; ; p2 ; 0; ; 0) whi h implies with (6.115) that pi = i; i = 1 : r. In on lusion, a possible s heme for the SVD omputing is the following. 1
2
r
1. Compute B = A^T A.
2. Apply to the matrix B the symmetric QR algorithm, which computes the matrix V and Σ = diag(Σ_1, 0).
3. Compute C = AV.
4. Compute the matrix U := Q from the QR factorization with column pivoting of the matrix C, i.e., such that CP = UR.

The computed matrices U, V and Σ = [ Σ_1 ; 0 ] (for m ≥ n) define the SVD of A. However, the explicit computation of the matrix B = A^T A can lead to a loss of information, as shown in the following example:

    A = [ 1      1
          0.001  0
          0      0.001 ]    ⇒    A^T A = [ 1.000001  1        ;
                                           1         1.000001 ].

In a floating point format with fewer than 7 decimal digits we have fl(A^T A) = [ 1  1 ; 1  1 ], a singular matrix, although A has full rank. A better method for computing the SVD was proposed by Golub and Kahan in 1965. Their algorithm determines U and V simultaneously by implicitly applying the symmetric QR algorithm to B = A^T A, i.e., working only on the matrix A, without explicitly forming the matrix B.
131
THE SVD ALGORITHM
This SVD algorithm has two stages. 1. The rst stage onsists of redu ing A to an upper bidiagonal form J su h that the tridiagonal matrix T = J T J oin ides with the one produ ed by the rst step of the QR symmetri algorithm applied to B . 2. The se ond stage onsists of iteratively redu ing the matrix J to diagonal form by vanishing asymptoti ally the overdiagonal elements. This an be a hieved using the bilateral orthogonal transformations that orrespond to an impli it QR symmetri step applied to B . Let us work now on some details. Stage 1: Bidiagonalization Suppose A 2 Rmn and m n. matri es U 2 Rmm and V 2 Rnn
2
J = U T AV
=
6 6 6 6 6 6 6 6 6 6 6 4
We show next how to ompute the orthogonal su h that 3
d1 f1 d2 f2
... ... ...
fn 1 dn
7 7 7 7 7 7 7 7 7 7 7 5
(6.118)
is bidiagonal. The matri es U and V will be omputed as produ ts of Householder transformations U = U1 U2 Un (6.119) V = V2 V3 Vn 1 using the following omputational s heme 1. U Im 2. V In 3. for s = 1 : n 1. Compute the re e tor Us su h that (Us A)(s + 1 : m; s) = 0 2. A UsA 3. U UUs 4. if s < n 1 1. Compute the re e tor Vs+1 su h that (AVs+1)(s; s + 2 : n) = 0 2. A AVs 3. V V Vs Noti e that at step s we have Us = diag(Is 1; U s) and the zeros reated in the rst s 1 rows and s olumns are not destroyed. Similarly, Vs+1 = diag(Is; V s+1)
132
CHAPTER 6.
THE SINGULAR VALUE DECOMPOSITION
does not ae t the zeros reated in the rst s olumns and s 1 rows. Finally, the matrix A is overwritten by the bidiagonal matrix A Un U2 U1 AV2 V3 Vn 1 = U T AV: (6.120) The detailed algorithm orresponding to the above s heme is given below. JQ (Bidiagonalization) Given the matrix A 2 Rnn the algorithm omputes the orthogonal matri es U 2 Rmm and V 2 Rnn and the bidiagonal matrix J su h that J = U T AV . The matrix J overwrites the matrix A. Algorithm 6.11
1. U eye(m) 2. V eye(n) 3. for s = 1 : n % Compute Us, A qUPsA and U = UUs 1. sign(ass) mi=s a2is 2. uss ass + 3. for i = s + 1 : m 1. uis ais 4. s uss 5. ass % A Us A 6. for i = s + 1 : m 1. ais 0 7. if s < n 1. for j = s P +1 :n 1. = ( mi=s uisaij )= s 2. for i = s + 1 : m 1. aij aij uis % U UUs 8. for i = 1 P: mm 1. ( k=s uik uks)= s 2. for j = s : m 1. uij uij ujs % Compute Vs+1, A AVs+1 and V 9. if s < n 1 % Vs+1 q 1. sign(as;s+1) Pnj=s+1 a2sj 2. vs+1;s+1 as;s+1 + 3. for j = s + 2 : n 1. vj;s+1 as;j
V Vs+1
6.2.
133
THE SVD ALGORITHM
4. s+1 vs+1;s+1 % A AVs+1 5. as;s+1 6. for j = s + 2 : n 1. asj 0 7. for i = s P +1 :m 1. ( ni=s+1 vj;s+1aij )= s+1 2. for j = s + 1 : n 1. aij aij vj;s+1 % V V Vs+1 8. for i = 1 P: nn 1. ( j=s+1 vij vj;s+1)= s+1 2. for j = s + 1 : n 1. vij vij vj;s+1 The omputational eort is Nop =
n X s=1
2p + 4(n n X
s)(m s) + 4(m s)(n s) + 4m(m s) + 4n(n s))
= 2np + 8 (nm s=1
s(n + m) + s2 ) + 4m
n X
(m
s=1
s) + 4n
n X
(n
s=1
s)
= 2np + 8(n2m (n + m)n2=2 + n3=3 + 4m(mn n2=2) + 4n(n2 n2=2) = 2np + 4n2 m 4n3=3 + 4mn(m n=2) + 2n3 : Remark 6.12 A more eÆ ient algorithm for a
umulating the transformations an be obtained if (6.119) are implemented out of the main loop by a ba kward a
umulation. For the hairy details see [?℄. Stage 2: Iterative diagonalization
The iterative diagonalization produ es a matrix sequen e J1 = J; J2 ; ; Jk ; (6.121) " #
onvergent to the diagonal matrix = 01 00 su h that the matrix sequen e T1 = J1T J1 ; ; Tk = JkT Jk ; (6.122) # " 2 is the QR symmetri sequen e with impli it shift onvergent to 01 00 2 Rnn, a RSF of the tridiagonal matrix T = T1.
134
CHAPTER 6.
Suppose
THE SINGULAR VALUE DECOMPOSITION
2
Jk := J =
6 6 6 6 6 6 6 6 6 6 6 4
3
d1 f1 d2 f2
... ... ...
fn 1 dn
7 7 7 7 7 7 7 7 7 7 7 5
(6.123)
has a bidiagonal form and apply an impli it shift QR algorithm step to the tridiagonal matrix T := Tk = JkT Jk =: J T J: (6.124) An impli it shift QR step onsists of 1. Computation of the shift := T (n; n) = d2n (6.125) or, better, using the Wilkinson shift, whi h is the losest to d2n eigenvalue of the matrix # " 2 + d2 d f f n 1 n 1 (6.126) T (n 1 : n; n 1 : n) = nd 2 f n 1 f 2 + d2 : n 1 n 1 n n 1 2. Compute an orthogonal matrix U1 su h that U1 e1 is the rst olumn of the expli it shift QR step transformation 2
6 6 6 66 6 4
t11 t21
3
2
7 7 7 7 7 7 5
6 6 6 66 6 4
d21 f1 d1
0 0 = ... ... 0 0 The matrix U1 an be a Givens rotation P12 su h that U1 e1 =
2
6 6 6 P12T 66 6 4
3. Compute
d21 f1d1
0 ... 0
3 7 7 7 7 7 7 5
2
3
7 7 7 7: 7 7 5
3
0 777 1 = 0. 77 = e1: .. 75 0 6 6 6 6 6 6 4
C = P12T T P12
(6.127)
that outlines an altered tridiagonal matrix in the position (3; 2).
(6.128)
(6.129)
6.2.
135
THE SVD ALGORITHM
4. Apply the HQ algorithm (adequately updated for the symmetri ase, i.e., the tridiagonalization algorithm) to the matrix C and obtain the matrix T = QT CQ (6.130) in tridiagonal form. The matrix Q an be a sequen e of rotations: Q = P23 Pn 1;n ;
(6.131)
su h that the new matrix is (6.132) T QT CQ = QT P12T T P12 Q = PnT+1;n P23T P12T T P12 P23 Pn 1;n : The idea of Golub-Kahan SVD step is detailed as follows. 1. Apply the above P12 Givens rotation to the matrix J instead of T K = JP12 (6.133) whi h alters the bidiagonal stru ture of the matrix J in the (1; 2) position. 2. Redu e the altered stru ture to the bidiagonal form by bilateral orthogonal transformations. J J 0 = Un 1 Un 2 U1 KV2 Vn 1 (6.134) where Us, Vs an be Givens rotations (or Householder re e tors). 2
Example 6.13
1.
Let m = 5, n = 3, J =
K
0 0 0 0 0 0
3 7 7 7 7 7 5
.
2 0 32 03 3 6 0 7 6 ? 7
s 0 6 6 7 7 6 76 7 K = JP12 = 66 0 0 77 4 s 0 5 = 666 0 0 777 : 2
4
2.
6 6 6 6 6 4
d1 f1 0 0 d2 f2 0 0 d3
0 0 0 0 0 0
2
?3 6 0 7 6 7 U1 K = 666 0 0 777 ) K 4
0 0 05 0 0 0
5
0 0 1
2
4
3 6 0 7 6 7 KV2 666 0 ? 777 ) K 4
0 0 05 0 0 0
0 0 0 0 0 0
5
03 6 0 7 6 7 U2 K 666 0 0 777 : 2
4
0 0 05 0 0 0
136
CHAPTER 6.
THE SINGULAR VALUE DECOMPOSITION
From the above example it is lear that the omputational s heme for (6.134) is for s = 1 : n
1 1. Compute the Givens rotation Us := Ps;s+1 su h that (UsK )(s + 1; s) = 0 2. K UsK 3. if s < n 1 1. Compute the Givens rotation Vs+1 := Ps+1;s+2 su h that (KVs+1)(s; s + 2) = 0 2. K KVs+1
The new matrix K = J 0 is bidiagonal and T 0 = J 0T J 0 = (Un 1 U1 JP12 V2 Vn 1 )T = Un 1 U1 JP12 V2 Vn 1 = VnT 1 V2T P12T J T U1T UnT 1Un 1 U1JP12 V2 Vn 1 = QT J T JQ (6.135) and Qe1 = P12 V2 Vn 1 e1 = P12 e1 is the same as in the QR-step for the tridiagonal matrix T . Consequently, Jk = J , de ning the SVD sequen e is su h that Tk = JkT Jk is the QR sequen e for the matrix B = AT A and, hen e, is onvergent to a diagonal form. It is possible that the diagonal limit matrix = J1 has no ordered diagonal elements. The ordering an be immediately a hieved by a suitable permutation sequen e. Rn
Given the ve tors d 2 that determine the bidiagonal matrix
Algorithm 6.14
and f 2
Rn
1
(Golub-Kahan SVD step) 2
J=
6 6 6 6 6 6 6 6 6 6 6 4
3
d1 f1 d2 f2
... ... ...
fn 1 dn
7 7 7 7 7 7 7 7 7 7 7 5
the algorithm omputes the su
eeding matrix J 0 = QT JQ in the SVD sequen e. The matrix J 0 overwrites the given matrix J . The new ve tors d0 and f 0 overwrite the ve tors d and f . 1. % Compute the Wilkinson shift Æ (fn2 2 + d2n 1 fn2 1 d2n )=2
6.2.
137
THE SVD ALGORITHM
fn2 1 + d2n Æ+sign(Æ(f)pÆ12d+(f1 ) 2. y = d1 ; z = f1d1 " n
2
n
n
)
1 dn 1 2
y 12 z
#
"
?
#
3. % Compute ; s su h that = 0 p 1. r y2 + z2 2. y=r 3. s z=r 4. % Compute J JP12 . Let be the outbidiagonal nonzero element 1. d1 sf1 2. f1 sd1 + f1 3. d1 4. d2 d2 5. sd2 % Redu tion to bidiagonal form (6.134) 5. for k = 1q: n 1 1. r d2k + 2 2. dk =r 3. s =r 4. dk r 5. fk + sdk+1 6. dk+1 sfk + dk+1 7. fk 8. fk+1 9. if k < nq 1 1. r = fk2 + 2 2. = fk =r 3. s = =r 4. fk r 5. dk+1 sfk+1 6. fk+1 dk+1s + fk+1 7. dk+1 8. dk+2 The above SVD-step must be ompleted with the a
umulation of the transformations. The number of ops required is Nop ' 2np + 24n 0 ' 6mn Nop 00 ' 6n2 Nop where Nop0 and Nop00 are the ops required for a
umulating U and V , respe tively. Typi ally, after few SVD-steps of the above algorithm fn 1 be omes negligible and PT
138
CHAPTER 6.
THE SINGULAR VALUE DECOMPOSITION
the problem dimension de reases. The following riterion is used to ontrol the pro ess of zeroing the supradiagonal entries if jfij "(jdi j + jdi+1j) then fi = 0
(6.136)
and for zeroing the diagonal elements the usual riterion is used if jdi j "kJ k then di = 0
(6.137)
where " is a small multiple of the unit roundo. The SVD algorithm onsists of Golub-Kahan iterative steps applied on the irredu ible part of the bidiagonal matrix J and simultaneously, based on the above
riterion, monitors the supradiagonal and diagonal zeros. The omputed SVD is an exa t SVD for a slightly perturbed matrix A + E . (The SVD Algorithm) Given a matrix A 2 Rmn (m n) and the toleran e level " the above algorithm overwrites A with U T AV = D + E , where U 2 Rmm , V 2 Rnn are orthogonal, D 2 Rmn is diagonal, E satis es kE k ukAk2 and u is the unit Algorithm 6.15
roundo. 1. Bidiagonalization. Apply Algorithm 6.14 (JQ) to ompute the bidiagonal matrix A J = U T AV 2. while q < n 1. for i = 1 : n1 1. if jai;i+1 j "(jaii j + jai+1;i+1j) 1. ai;i+1 0 2. Find the largest q and2the smallest p su h3 that in the partition A11 0 0 gp A(1 : n; 1 : n) = 64 0 A22 0 75 gn p q 0 0 A33 gq the bidiagonal matrix A22 is irredu ible (i.e. with all the supradiagonal entries zero) 3. if q < n 1. for i = p + 1 : n q + 1 if aii 0 1. ai;i+1 0 else 1. Apply the SVD step (Alg. 9.1) to the matrix A22 2. A diag(Ip; U; Iq+m n )T A diag(Ip; V; Iq )
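The two stopping tests (6.136) and (6.137) are easy to isolate as a helper routine. The Python/NumPy sketch below is an illustration added here (not the book's code): it sets negligible bidiagonal entries to exact zeros, as done between Golub-Kahan steps.

```python
import numpy as np

def deflate_bidiagonal(d, f, eps=1e-15, normJ=None):
    """Apply the criteria (6.136)-(6.137) to a bidiagonal matrix.

    d holds the diagonal d_1..d_n, f the superdiagonal f_1..f_{n-1};
    entries judged negligible are overwritten with exact zeros.
    """
    d = np.asarray(d, dtype=float).copy()
    f = np.asarray(f, dtype=float).copy()
    if normJ is None:
        normJ = np.sqrt(np.sum(d**2) + np.sum(f**2))       # Frobenius norm of J
    for i in range(f.size):
        if abs(f[i]) <= eps * (abs(d[i]) + abs(d[i+1])):    # criterion (6.136)
            f[i] = 0.0
    d[np.abs(d) <= eps * normJ] = 0.0                       # criterion (6.137)
    return d, f

# example
d, f = deflate_bidiagonal([2.0, 1.0, 1e-20], [1e-18, 0.5])
print(d, f)   # -> [2. 1. 0.]  [0.  0.5]
```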
6.3 Applications of the SVD

6.3.1 On computing the rank of a matrix
The omputed diagonal matrix is after ordering and without zeroing the diagonal entries 2 3 = J1 = UAV T = where
6 6 6 6 6 6 6 6 4
1
2
...
7 7 7 7 7; n 777 5
(6.138)
1 2 n 0:
(6.139) If we onsider now negligible the singular values that satisfy a riterion like (6.137), where the matrix norm is, for example, the Frobenius one i "kkF
=
v u n uX "t 2
i=1
(6.140)
i
and if the index i = r + 1 is the rst whi h satis es (6.140), the matrix be omes 2
=
6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 4
1
3
2
...
r
0
...
0
7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 5
(6.141)
and the numerical rank of A is

    rank A = rank Σ = r.                                                                  (6.142)

Clearly, the rank is an intrinsic property of the matrix itself, but due to roundoff in a computer one should always expect to find full rank matrices (i.e. r = n) unless a threshold δ is chosen below which the elements of Σ are considered zero. Usually the threshold is chosen of order ε‖A‖_F, where ε is the relative precision of the computer used.
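A minimal Python/NumPy version of this numerical rank test (an illustrative sketch, not from the original text; the threshold ε‖A‖_F follows the criterion (6.140)):

```python
import numpy as np

def numerical_rank(A, eps=np.finfo(float).eps):
    """Numerical rank: number of singular values above eps*||A||_F (cf. (6.140)-(6.142))."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > eps * np.linalg.norm(A, 'fro')))

# third column = sum of the first two plus a perturbation far below eps*||A||_F
rng = np.random.default_rng(2)
X = rng.standard_normal((8, 2))
A = np.column_stack([X, X[:, 0] + X[:, 1] + 1e-17 * rng.standard_normal(8)])
print(numerical_rank(A))   # typically 2, although in exact arithmetic rank(A) = 3
```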
6.3.2 The general least squares problem
The general least squares problem (GLS) consists in solving overdetermined or underdetermined linear systems

    Ax = b                                                                                (6.143)

with a rank deficient matrix A, i.e.,

    rank A < min(m, n).                                                                   (6.144)

Generally, in such a case the linear system (6.143) has no solution, but it has an infinite number of (pseudo)solutions in the least squares (LS) sense. In fact, the problem is to compute the minimum Euclidean norm (pseudo)solution

    ‖x*‖ = min { ‖x‖ : x ∈ R^n minimizes ‖b - Ax‖ }.                                      (6.145)
The following theorem shows how one can use the SVD to compute this normal (pseudo)solution.

Theorem 6.16 Suppose A ∈ R^{m×n}, m ≥ n and rank A = r < n. If U^T A V = Σ is the SVD of A, then the minimum norm pseudosolution to (6.143) is given by

    x* = Σ_{i=1}^{r} (u_i^T b / σ_i) v_i,                                                 (6.146)

where u_i = U e_i ∈ R^m and v_i = V e_i ∈ R^n are the i-th columns of U and V, respectively. Moreover, the minimum residual is

    (ρ*)^2 = ‖A x* - b‖_2^2 = Σ_{i=r+1}^{m} (u_i^T b)^2.                                  (6.147)
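A direct Python/NumPy transcription of (6.146)-(6.147) is given below as an illustration (it is not the book's code); it matches the minimum-norm solution returned by numpy.linalg.lstsq. The proof of the theorem follows.

```python
import numpy as np

def min_norm_ls(A, b, tol=1e-12):
    """Minimum-norm least squares pseudosolution via the SVD, cf. (6.146)-(6.147)."""
    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > tol * s[0]))
    c = U.T @ b
    x = Vt[:r, :].T @ (c[:r] / s[:r])        # x* = sum_i (u_i^T b / sigma_i) v_i
    residual2 = float(np.sum(c[r:]**2))      # (rho*)^2 = sum_{i>r} (u_i^T b)^2
    return x, residual2

# rank-deficient example: the two columns of A are parallel
A = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
b = np.array([1.0, 0.0, 1.0])
x, res2 = min_norm_ls(A, b)
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]   # lstsq also returns the min-norm solution
assert np.allclose(x, x_ref)
assert np.isclose(res2, np.sum((A @ x - b)**2))
```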
For any x 2 Rn we have krk22 = kb Axk22 = kb U V T xk22 = kU (U T b (V T x))k22 : Let us de ne now d = U T b; y = V T x: Then
" # 2
d(1 : r ) (1 : r; 1 : r )y (1 : r ) 2 2 krk2 = kd yk2 =
d(r + 1 : m) Proof.
= kd(1 : r)
(1 : r; 1 : r)y(1 : r)k22 + kd(r + 1 : m)k22 ;
2
6.3.
141
APPLICATIONS OF THE SVD
whi h attains its minimum for arbitrary y(r + 1 : n) and 2
y(1 : r) =
6
1 (1 : r; 1 : r)d(1 : r) = 6 6 6 4
Then the set of all minimizers is of the form x=Vy =V
"
y(1 : r) y(r + 1 : n)
uT1 b=1 uT2 b=2
...
uTr b=r
3 7 7 7: 7 5
#
among whi h, the minimum norm ones satisfy y(r + 1 : n) = 0: Consequently r X uTi b x = V (:; 1 : r)y(1 : r) = vi i=1 i and m X ( )2 = kb Ax k22 = kd(r + 1 : m)k22 = (uTi b)2 i=r+1
and the proof ends. 2 Noti e that the omputational s heme is straightforward. If m < n (underdetermined systems) the normal (pseudo)solution an be omputed in a similar way. The rank de ient underdetermined systems an have no solution and onsequently we must de ne and ompute a LS solution that is given by (6.146). 6.3.3
Orthonormal basis for linear subspa es
The SVD of matrix A 2 Rmn produ es orthonormal basis for all linear subspa es de ned by A. By theorem (6.2), if U T AV = is the SVD of A then (a) S1 = ImA = ImU (:; 1 : r) Rm (b) T1 = KerA = ImV (:; r + 1 : n) Rn (6.148) ( ) S2 = KerAT = ImU (:; r + 1 : m) Rm (d) T2 = ImAT = ImV (:; 1 : r) Rn : The proof of the rst two equalities is given in Proposition 6.6; ( ) and (d) follow immediately from (a) and (b), sin e U and V are orthogonal; however, we present here dire t proofs for the last two equalities.
142
CHAPTER 6.
THE SINGULAR VALUE DECOMPOSITION
( ) Suppose that x 2 KerAT . Then AT x = 0 ) (U V T )T x = 0 ) V T U T x = 0 ) T U T x = 0 " #" # Tx 0 U (: ; 1 : r ) 1 ) 0 0 U (:; r + 1 : m)T x = 0 ) 1U (:; 1 : r)x = 0 ) (U (:; 1 : r))T x = 0: But x 2 Rm and the olumns of U form a basis of Rm . Consequently, there exists y su h that " # " # Tx ( U (: ; 1 : r )) 0 T x = Uy ) y = U x = y(r + 1 : m) = y(r + 1 : m) )x=U
i.e.,
"
#
0 y(r + 1 : m) = U (:; r + 1 : m)y(r + 1 : m)
x 2 ImU (:; r + 1 : m) ) KerAT
ImU (:; r + 1 : m):
Conversely, if x 2 ImU (:; r + 1 : m) ) x = U (:; r + 1 : m)z ) AT x = V T U T U (:; r + 1 : m)z =V
T
=V
T
"
0
#
Im r z
"
U1T U2T
#
U2 z =
= 0 ) x 2 KerAT ) ImU (:; r + 1 : m) KerAT :
(d) Suppose x 2 ImAT . Then "
#
"
#
x= ) x = V 01 00 U T z = V y0 = V (:; 1 : r)y ) x 2 ImV (:; 1 : r) ) ImAT ImV (:; 1 : r) Conversely, if " " # 1y # y g r 1 1 x 2 ImV (:; 1 : r) ) x 2 V (:; 1 : r)y = V 0 gn r = V = 0 AT z
"
#
"
#"
#
1 = V 01 1 1y = V 01 00 1 y " 1y # T = V U U 1 = AT z ) ImV (:; 1 : r) ImAT :
6.3.
143
APPLICATIONS OF THE SVD
6.3.4
The pseudoinverse omputation
As we have already seen, the pseudoinverse of a full rank matrix A ∈ R^{m×n} is defined by

    m > n:  A⁺ = (A^T A)^{-1} A^T,
    m = n:  A⁺ = A^{-1},                                                                  (6.149)
    m < n:  A⁺ = A^T (A A^T)^{-1}.

The general definition is as follows.

Definition 6.17 Let A ∈ R^{m×n}. The pseudoinverse of A is the unique matrix X ∈ R^{n×m} that satisfies the four Moore-Penrose conditions

    (i)   A X A = A,
    (ii)  X A X = X,
    (iii) (A X)^T = A X,                                                                  (6.150)
    (iv)  (X A)^T = X A.
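These conditions are easy to verify numerically. The sketch below is an illustration added here (not from the original text): it builds A⁺ = V Σ⁺ U^T as in (6.151)-(6.152) and checks the four Moore-Penrose conditions against numpy.linalg.pinv.

```python
import numpy as np

def pinv_svd(A, tol=1e-12):
    """Pseudoinverse A+ = V Sigma+ U^T from the SVD, cf. (6.151)-(6.152)."""
    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > tol * s[0]))
    Sp = np.zeros((Vt.shape[0], U.shape[0]))     # Sigma+ is n x m
    Sp[:r, :r] = np.diag(1.0 / s[:r])
    return Vt.T @ Sp @ U.T

A = np.array([[1.0, 2.0], [2.0, 4.0], [0.0, 1.0]])
X = pinv_svd(A)
assert np.allclose(A @ X @ A, A)            # (i)
assert np.allclose(X @ A @ X, X)            # (ii)
assert np.allclose((A @ X).T, A @ X)        # (iii)
assert np.allclose((X @ A).T, X @ A)        # (iv)
assert np.allclose(X, np.linalg.pinv(A))
```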
It is straightforward to he k these onditions for the full rank ase. In the de ient rank ase, let U T AV = be the SVD of A. Then # " #" T 0 V 1 1 V2 ℄ ) A = U1 1 V1T : V1 |{z} U2 ℄ 0 0 A = U V T = [|{z} U1 |{z} V2T ; where V = [|{z} r r m r n r It is easy to show that the pseudo-inverse of is "
#
1 X = + = 1 0 ggrn 0 0
r
|{z} |{z}
r
2 Rnm
(6.151)
m r
Theorem 6.18 Let A 2 Rmn and U T AV = its SVD. Then the pseudo-inverse of A is A+ = V + U T : (6.152) Moreover, X = A+ is the unique minimal F -norm solution to the problem
min kAX
X 2Rnm
I m kF
(6.153)
and for any ve tor b the solution to the GLS system Ax = b an be written as x = A+ b: (6.154)
144 Proof.
Indeed,
CHAPTER 6.
THE SINGULAR VALUE DECOMPOSITION
The matrix de ned in (6.151) veri es all four Moore-Penrose onditions. (i)
AA+ A = U V T V + U T U V T
= U +V T = U V T = A:
(ii) A+AA+ = V +U T U V T V +U T = V +U T = A+: (iii) (AA+ )T = (U V T V +U T )T = (U +U T )T = U (+)T U T = = U +U T = U V T V +U T = AA+: (iv) (A+ A)T = (V +U T U V T )T = V (+)T V T = V +V T = = V +U T U V T = A+A: The Frobenius matrix norm is preserved under orthogonal transformations. Consequently rF (X ) = kAX Im kF = kU T AV V T X U T kF = kV T X U T kF = k
V T XU
Im kF
=
"
1 0 0 0
"
#"
Y11 Y12 Y21 Y22
#
Im
F
=
#
1 Y12
Im r F
1Y11 Ir ; 0 where Y := V T XU 2 Rnm and the partition of Y is onformaly to the partition of . Clearly, rF (X ) is minimum when1 both k1Y12 kF and k1 Y11 Ir kF are minimum, i.e., for Y12 = 0 and Y11 = 1 . Hen e, all the matri es X 2 Rnm that minimizes rF (X ) are of the form =
X = V Y UT
"
1 1 0
=V
"
1 1 0
#
T Y21 Y22 U :
#
But kX kF = Y12 Y22 is minimum for Y21 = 0; Y22 = 0. Consequently the F minimal F -norm solution for the problem (6.153) is X = V +U T = A+ . Finally, rewrite the solution (6.146) of the GLS problem in the matrix form 2 h
x = v1
vr
i
6 4
uT1 b=1
...
uTr b=r
3
2
7 5
= V (:; 1 : r) 64
uT1 b=1
...
uTr b=r
3 7 5
=
2
uT1 b=1
3
2
... 77 6 1 1 . 6 7 .. 6 T b= 7 u 6 r 7 r V 0 777 = V 666 ... 75 4 0 and the proof ends. 2 6 6 6 6 6 6 6 6 6 4
3
r 1
...
0
7 7 7 7 T 7U b 7 7 5
= V +U T b = A+b
Problems
P 6.1 Which are the singular values of A = [ 1  1 ; 1  1 ]?

P 6.2 Show that ‖QA‖_F = ‖AV‖_F = ‖A‖_F, where A ∈ R^{m×n} and Q ∈ R^{m×m}, V ∈ R^{n×n} are orthogonal.

P 6.3 Show that if A ∈ R^{m×n}, then ‖A‖_2 ≤ ‖A‖_F ≤ sqrt(rank(A)) ‖A‖_2.

P 6.4 Prove that if σ_1 is the largest singular value of A, then

    σ_1 = max_{y ∈ R^m, x ∈ R^n} (y^T A x) / (‖y‖_2 ‖x‖_2).

P 6.5 Show that if A ∈ R^{m×n}, m ≥ n, and rank(A) < m, then the SVD of A is not unique (however, notice that the singular values of A are unique).

P 6.6 Give explicit formulae for computing the SVD of a 2x2 matrix A.

P 6.7 Which is the SVD of A = u v^T, where u ∈ R^m, v ∈ R^n? Which is the rank of A?

P 6.8 Give an algorithm to compute the SVD of A = I_n + u v^T, where u, v ∈ R^n are two noncollinear vectors.

P 6.9 Show that if A ∈ R^{m×n} has rank n, then ‖A (A^T A)^{-1} A^T‖_2 = 1.

P 6.10 Prove that the condition number of a nonsingular matrix A ∈ R^{n×n} satisfies κ_2(A) := ‖A‖_2 ‖A^{-1}‖_2 = σ_1 / σ_n, where σ_1 and σ_n are the largest and the smallest singular values of A.

P 6.11 Let A = [ A_1  A_2 ; 0  A_3 ] ∈ R^{m×n}, where A_1 ∈ R^{p×p}. Sketch an algorithm (equivalent to algorithm JQ) to efficiently reduce A to bidiagonal form by bilateral orthogonal transformations.
Solutions to the problems P1.1 The result depends on the order of omputation; y1 = (x1 + x2 ) + x3 = 0, y2 = x1 +(x2 + x3 ) = 0:001. The exa t result is y2 (relative error equal to 0). For y1 , the relative error is j0 0:001j=0:001 = 1 (i.e. 100%). P1.2 The omputed result is y = fl(fl(x1 + x2 ) + x3 ) = fl((x1 + x2 )(1 + 1 ) + x3 ) = [(x1 + x2 )(1 + 1) + x3 ℄(1 + 2), with j1 j; j2j , and of order unity. It follows that: t
jy yj 1 + jx + x j : jy j jx + x + x j 1
1
2
2
t
3
Suppose that input data are ae ted by errors, so that (a + a; b + b) is used instead of (a; b). The result will be x + x. From (x + x)(a + a) = (b + b), negle ting ax, it results that x=x = a=a b=b. Thus, the problem is always well onditioned (small relative input errors imply small relative output error). Sin e x = fl( b=a) = ( b=a)(1 + ) = b(1 + )=a = b=a, with jj , the algorithm is numeri ally stable (b is lose to b). the problem inherits the ill onditioning of the sum (for example, when ja + a j is small and ja j, ja j big). The "algorithm" x = (b + b )=(a + a ) is stable. The next oating point number is x = 0:100 : : : 01 ; then, x 1 = (the relative error bound for hopped rounding). " 0:5 . P1.3
t
P1.4
1
1
2
1
Hint: C = AB is lower tridiagonal.
t+1
b1=a11 i=2:n 1. x (b
.
The algorithm is:
1
for
i
i
a
i;i
1
x
1
i
)=a
ii
Hint: x = 0, for i = 1 : j 1. is lower triangular: d.
1.
2
2
t+1
r
P2.1 b.
1. x 2.
1
1
P1.5
P1.6
2
e.
i
j=1:n 1. x 1=a 2. i = j + 1 : n 1. x a
Solve rst Ay = , then Bx = y.
for
jj
jj
for
ij
i;i
1
x
i
1;j
=a
ii
147
f.
The inverse
Hint: only here T must be omputed; T is tridiagonal and symmetri ; that is, only the diagonal and the rst under diagonal must be omputed. see the previous problem. ompute X = A like in the previous problem, and then T = X X ; X is lower triangular, X is upper triangular and T is symmetri . T = AA is not already the Cholesky fa torization, sin e the diagonal elements of A are not ne essarily positive. Let T = LL be the Cholesky fa torization. It is very natural to try to prove that L is lower bidiagonal (then, by uni ity, L is the Cholesky fa tor). For any s 2 1 : n 1: t =a +a =l +l t =a a =l l : It is again natural to try to prove that jl j = ja j; for the rst relation above, the signs are not important. For the se ond, if a is negative, we take l = a and l = a . The algorithm will be: P2.2 b.
.
1
d.
1
1
T
T
T
e.
T
2
ss
s;s
s+1;s
2
2
1
ss
s+1;s
s;s
ij
ss
1.
for
s=1:n ja j if s < n then 1. if a < 0 then l +1 else l +1 a +1
1. l 2.
ss
ss
s+1;s ss
ss
ij
2
1
ss
s+1;s
ss
s+1;s
ss
ss
s
s
;s
s
a +1
;s
s
;s
;s
It is obvious that the multipliers will be zero for i > j + 1. Then, the Gaussian elimination will have the form: P2.3 a.
1.
ij
s=1:n 1 1. h h =h 2. j = s + 1 : n 1. h h
for
s;s+1
s+1;s
ss
for
s+1;j
s+1;j
h +1 h s
;s
sj
The ve tor b will be modi ed a
ording to these parti ular values of the multipliers: 1. s = 1 : n 1 1. b b h b Then, an upper triangular system must be solved. Hint: the partial pivoting does not hange the upper Hessenberg stru ture. Noti e that L is lower bidiagonal. Hint: ompute y = Rb, then Hy. Solve Hy = b, like in the previous problem, then Rx = y. If b = + id, where ; d 2 , the system an be written as A[y z℄ = [ d℄, where y; z 2 and x = y + iz. It is a parti ular ase of , for m = 2. the matrix equation AX = B onsists of m linear systems: Ax = b , for j = 1 : m (x and b are the j -th olumn of X and B, respe tively). Using the algorithm: 1. j = 1 : m 1. use LSS GPP to solve Ax = b for
s+1
s+1
s+1;s s
b.
.
P2.4 b.
.
P2.5 a.
R
n
n
R
b
b.
j
j
j
for
j
j
148
j
is not a good idea, sin e the number of operations would be 2mn =3. Instead, GP P an be used only on e to triangularize A, like in the following: 1. [M; U; p℄ = GP P (A) 2. j = 1 : m 1. s = 1 : n 1 1. b $ b 1. i = s + 1 : n 1. b b b 2. x = UT RIS (U; b ) The number of operations is 2n =3 + O(mn ). Again, it is not a good idea, nor to ompute A (2kn ops), neither to use the algorithm based on the idea that A(A x) = b, applied re ursively: 1. j = 1 : k 1. use LSS GPP to solve Ax = b 2. b x whi h implies 2kn =3 ops. Again, GP P an be used only on e to nd the solution of all the systems in the statement 1.1 of the s heme above. Thus, this s heme is modi ed to: 1. [M; U; p℄ = GP P (A) 2. j = 1 : k 1. s = 1 : n 1 1. b $ b 1. i = s + 1 : n 1. b b b 2. b = UT RIS (U; b) 3. x b The number of operations is only 2n =3 + O(kn ). Suppose that: L 0 ; U~ = U Y : L~ = X L 0 U Then A LY B = L~ U~ = XU XY + A : Thus LY = 0 and, be ause A nonsingular implies L, U nonsingular, Y = 0; XU = R, thus X = RU , and X is upper triangular. A 0 x = d ) Ax = d R A Rx + Ax = d d x First solve Ly = d , Ux = y and x is obtained (2n ops). Then ompute f = d Rx (n ops); then solve Ly = f , Ux = y and x is obtained (2n ops). The total is of 3
for
for
p(s);j
sj
for
ij
ij
j
is
sj
j
3
2
3
k
P2.6
k
1
for
3
for
for
p(s)
s
for
i
i
is
s
3
2
P2.7 a.
1
b.
2
1
1
1
1
1
2
2
1
1
2
2
2
1
2
2
149
2
2
1
only 5n ops. The above omputational s heme an be applied to solve any blo k lower triangular systems. Use Gaussian elimination; a = 0, for i > j + n; the multipliers will respe t the same relation. Use Gaussian elimination with partial pivoting, whi h will not ae t the stru ture of A. 2
P2.8 a.
ij
ij
b.
P2.9 a.
1.
for
s=1:n
1. a 2. a
s+1;s
1
a +1 =a a +1 +1 s
s+1;s+1
;s
s
ss
;s
;
a +1 a s
;s
s;s+1
We present only an algorithm for orthogonal triangularization (adaptation of With Householder re e tors (U = I (u u )= , with u = [0 : : : u u 0 : : : 0℄): 1. s = 1 : n q 1. sign(a ) a + a 2. u a + , u a 3. u 4. a r = 5. j = s + 1 : n 1. (u a + u a )= 2. a a u 2. a a u With Givens rotations: 1. s = 1q: n a +a 1. t 2. a =t, d a =t 3. a r = t 4. j = s + 1 : n 1. t a + d a 2. a d a + a 3. a t In this ase, the se ond variant is better (the reverse situation is met for general matri es). Both the use of Householder re e tors or Givens rotations will destroy the lower Hessenberg stru ture (the algorithm will be identi al to ORT HT ). Hint: elementary re e tors U = I (u u )= , with u = [0 : : : 0 u 0 : : : 0 u : : : u ℄ will be used. u = [0 : : : 0 u 0 : : : 0 u : : : u ℄, where p = min(s + n; m). Sin e A = Q0R0 and R0 is nonsingular, ImA = ImQ0 (and Q0 has orthogonal
olumns). KerA = (ImA)? ; but (Q00 ) Q0 = 0, be ause U is orthogonal; thus Q00?Q0 and ImQ00?ImA. P3.1 a.
ORT HT ).
s
s
T s
s
T s
ss
s+1;s
for
ss
ss
s+1;s
ss
s
2 s+1;s
2
ss
s+1;s
ss
ss
ss
for
ss
sj
s+1;s
sj
sj
s+1;j
s
ss
s+1;j
s+1;j
s+1;s
for
2 s+1;s
2
ss
s
ss
ss
s+1;s
s
ss
for
s
s+1;j
sj
s
s
s+1;j
sj
s
s+1;j
sj
P3.2
P3.3
T s
a.
s
n+1;s
ss
b.
T s
ss
s
ms
ps
ms
P3.4
T
T
T
150
T s
s
From q q = 0, for j > 1, prove rst that q = 0 (Q is upper triangular). Continue by nite indu tion. Let U 2 be a Householder re e tor su h that Ux = e , or x = U e . Then, U = U = [x Y ℄, and Y = U (:; 2 : n) is the desired matrix. Let B = Q A~ = [Q A Q z℄ = [R y℄ 2 . B is already upper triangular, unless its last olumn, y = Q z. Let U 2 be the Householder re e tor for whi h (U y) = 0, for i > n. Then, U B = [R U y℄ = R~ is upper triangular; U Q A~ = R~ implies Q~ = QU . that A~ = QU R~ and then 1 0 w ~ Take Z = 0 Q . Then, Z A = R is upper Hessenberg. The QR fa torization of this matrix an be omputed as in problem 3.1. Let Q , Z be elementary re e tors of order n and index 1 su h that Q u = Z v = kuke . Then, Z Q u = v and Q = Z Q is the desired orthogonal matrix. Attention: Q is not omputed using matrix multipli ation. If Z = I 2zz and Q = I 2qq (we take kzk = kqk = 1), then Z Q = I 2zz 2qq + 4z(z q)q . The rst olumn of Z has the same dire tion as v (Z e = v=kuk). Consider an other
olumn of Z , denoted by y; obviously, y?v. Then, use the algorithm from to ompute an orthogonal W su h that W u = yP; hen e, W u?Pv. x = (a a) a b = ( a b )= a . There is no numeri al problem in using the pseudo-inverse, sin e a a is a s alar. The orthogonal proje tion of a onto Imb is by, where y 2 is su h that kby ak is minimum. Of ourse, y = (b b) b a. T
P3.5
1
1j
j
n
P3.6
n
R
T
1
1
T
T
P3.7
T
T
n
i
(n+1)
m
R
T
R
n
n
m
m
n
T
n
n
n
T
P3.8
T
P3.10 a.
1
1
1
T
1
1
1
T
1
1
1
T
1
T
1
b.
T
1
T
T
1
T
1
T
1 1
a
1
T
P3.11 a.
1
n
n
T
T
i=1
i
i
i=1
2 i
b.
R
1
T
T
0 a12=a11 , thus the eigenvalues of T are T = D 1(L + U ) = a12 =a22 0 p 1 2 = a12 = a11 a22 . Sin e A is positive de nite, then y Ay > 0 for y = [x 1℄; then, a11 x2 + 2a12 x + a22 > 0 for any x 6= 0 and the dis riminant 4(a12 a11 a22 ) is negative. In
on lusion, j1 2j < 1. P P4.2 k D 1 (L + U )k1 = max =1: =1 ja =a j < 1. P4.3 For Ja obi's method: T = D 1 (L + U ) = 1=02 10=2 , so that (T ) = 1=2. For Gauss-Seidel method, T = (L + D) 1U = 00 11==42 ; thus, (t ) = 1=4. Sin e (T ) < (T ), the Gauss-Seidel method onverges faster. P4.1
T
;
T
;
n
j
n
ij
i
jj
J
J
GS
GS
P5.1
GS
J
Suppose 2 (A) and
A1 A2 0 A3
x1 x2
=
x1 : x2
If x 6= 0, then A x = x and 2 (A ); if x = 0, then A x = x , and 2 (A ). Hen e, (A) (A ) [ (A ). But (A) and (A ) [ (A ) have the same ardinality. Another way to solve this problem is dire tly from det(I A) = det(I A ) det(I A ). 2
3 2 1
2
3
2
3
1
1
1
1
1
3
1
3
151
The eigenvalues of A are both equal to 1. The unique asso iated eigenve p 1 tor is x = (1= 2) 1 (with kxk = 1). Taking Q = [x y℄, where x y = 0, thus p y = (1= 2) 11 (with kyk = 1, too), it results that Q AQ = 01 41 . Noti e that the S hur form an be obtained, although A is defe tive. The eigenvalues of A are not real, thus a RSF of A is quite A. Only two ve tor variables are required. The entry data are x { the initial approximation (of unit norm), " { the toleran e, N { the maximum number of steps; here is the power method: P5.2
T
T
P5.3
P5.4
1.
k=1:N 1. y Ax 2. y y=kyk 3. ky xk < " 1. x y, 2. print('The method is not onvergent') for
if
then
stop
For A, the result is e , sin e Ae = 5e and (A) = f5; 2; 1g. For B, the method will not onverge, be ause there is no dominant eigenvalue: (B) = f1; 1g. If H = UAU , with U 2 , orthogonal, then (A + I )x = b is equivalent to (UAU + I )Ux = Ub . The algorithm is: P5.5
1
1
T
P5.7 a. T
k
k
n
R
1
n
k
k
k
k
b.
1. ompute orthogonal U su h that UAU = H , upper Hessenberg 2. k = 1 : p 1. d Ub 2. solve (H + I )y = d 3. x U y T
for
k
k
k
T
The main omputational eort is in step 1, i.e. O(n ) ops. In 2.2, the solution of an upper Hessenberg system may be obtained in O(n ) ops, see . The total is of O(n + pn ) ops. If Gaussian elimination is used p times to solve (A + I )x = b , then O(pn ) operations will be ne essary. For larity, we onsider only the ase when n is even. First, transform B by similarity, applying the elementary permutation matri es P ,P , ..., P , obtaining the matrix C = QAQ whi h has nonzero elements only on the se ondary diagonal; more pre isely, these elements are, from the upper right to the lower left orner: d , d , ..., d , d , ..., d , d . Then, applying the EPMs P , P , . . . , P , C is brought to a matrix with 2 2 diagonal blo ks, i.e. diag(D ; D ; : : : ; D ; D ). The permutation of this diagonal blo ks to the desired form an be now easily performed (with a sorting algorithm). 3
2
3
P2.3
2
.
k
k
k
3
P5.8 b.
n+1;2n
T
n+2;2n
1
3n=2;3n=2+1
1
2
n
n
2
1
2;2n
1
4;2n 2
3
4
152
n;n+2
2
P5.9 b. the eigenvalues are those of A and 1 with multipli ity n x is the asso iated eigenve tor, then: x1 + x = x1 ;
x1 + x = x ; x = x ; for i = 2 : n 1:
2. If 2 (B) and
n
n
i
n
i
x1 x
2 If 2 (A), then the rst two relations are ful lled for the eigenve tor of A, y = , and x = 0, for i = 2 : n 1. Otherwise, = 1 and x = x = 0 (and the eigenve tors are, for example, e ; : : : ; e ). If 2 (A) and Ax = x, then BAx = Bx, A(Bx) = (Bx) and, sin e the eigenvalues of A are distin t, A has a omplete set of linearly independent eigenve tors and Bx = x (Bx is an eigenve tor both for A and B ). Sin e the redu tion to S hur form based on the de ation pro edure is done using eigenve tors, the same similarity transformation brings A and B to the S hur form (similar arguments hold for the real ase). The idea is the same as for the orthogonal redu tion in Theorem 5.17, but using ELTs. Idea: nd U , elementary re e tor, su h that U z = kzke . Compute A0 = U AU . Then, transform A0 to Hessenberg form H = Q0 U AU (Q0 ) ; Q = Q0 U is the desired orthogonal transformation, sin e Q0 = U : : : U , and thus Qz = U : : : U kzke = kz ke . H I is upper Hessenberg; like in , its QR fa torization an be
omputed using a sequen e of Givens rotations P P : : : P (H I ) = R, where : : : P . The matrix multipli ation RQ an be performed without expli itly Q=P
omputing Q, by RQ = RP : : : P . The s heme of the QR step is: 1. h 2. H H I 3. ompute Givens rotations P , . . . , P su h that P : : : P H = R 4. H RP : : : P 5. H H + I All omputations ould be performed in pla e, in A. ~ = e , and H (Qe ) = (Qe ). Hen e Qe is the desired eigenve Obviously, He tor. If Q = Q is the Householder re e tor for whi h Qu = kuke , then A0 = Q AQ = I + kuke v Q is upper triangular and a0 = 1 for i 2; thus, 1 is an eigenvalue of multipli ity n 1. a0 = 1 + kukv q , where q = Qe is the rst olumn of Q. But q = u=kuk, hen e a0 = 1 + v u; this is the n-th eigenvalue of A. The matrix is symmetri , then there exists an orthogonal Q su h that Q AQ = D is diagonal; this is also the SVD of A (up to a reordering of the eigenvalues). Hen e the singular values of a symmetri matrix are its eigenvalues; in our ase, = 2, = 0. Let denote B := A A, C := AA . It is easy to prove that kAk = tra e(B) = P b and also kAk = tra e(C ). Then, kQAk = tra e(A Q QA) = tra e(B ) = kAk and kAV k = tra e(C ). n
R
2
1
i
2
n
1
n
P5.11
P5.12
P5.13
1
1
T
1
1
n
1
T
1
T
1
1
2
1
n
2
1
1
P5.14 a.
P3.1
12
T n
23
n
1;n
T
12
1;n
T n
T
12
1;n
nn
12
T n
1
1
1 1
T
n
1;n
1
1
1
1
1
T
11
1
12
T
P5.16
T
1;n
12
1;n
b.
n
T
11
T
ii
1
1
1
T
T
P6.1
1
T
P6.2
i=1
ii
2
2
n
2
F
F
F
153
2
2
T
T
T
F
2
F
kAk = P r , where r = rank(A) and kAk = . r
2
P6.3
2
i=1
2 1
i
2
1
Hint: if A = U V is the SVD of A, then (U , V are nonsingular): z U AV x z x max y Ax = 2Rmax m 2Rn kUz k kV wk = 2Rmax m 2Rn kz k kwk : 2Rm 2Rn kyk kxk Sin e the singular values of A are the eigenvalues of A A, they are uniquely de ned, and so is (whose diagonal is ordered). But U and V may not be unique. For example, if rank(A) < m, then in (6.106) U is uniquely de ned, but U is hosen as an orthogonal ompletion to U and may take several values (even when rank(A) = m 1, there are two possibilities: U = w or U = w, where w 2 is the unique unit norm ve tor su h that U w = 0). Another situation whi h allows multiple de nitions of U and V is when several singular values are equal. If U 2 , V 2 are Householder re e tors su h that U u = kuke 2 , and V v = kvke 2 , then: vk 0 2 : U AV = kukk 0 0 Obviously, rank(A) = 1 if u 6= 0, v 6= 0, and zero otherwise. Let w := Qv?u, where Q is omputed like in problem 3.10, and w := w= kwk, u := u=kuk. Compute an orthogonal C 2 su h that U = [u w C ℄ and V = [w u C ℄ are orthogonal (use the QR fa torization of [u w℄; see also problem 3.6). Then 3 2 3 2 u (I + uv )w u (I + uv )u 0 u uv w 1 + u uv u 0 0 0 5 U (I +uv )V = 4 w (I + uv )w w (I + uv )u 0 5 = 4 1 0 0 I 0 0 I and the problem has been redu ed to the 2 2 ase (anyway, n 2 singular values of A are equal to 1). A = (U V ) = V U , and thus the singular values of A are f ; : : : ; g. Then, kA k = 1= . Using Householder or Givens transformations, one an introdu e zeros as follows (m = 7, n = 6, p = 3; we use Givens rotations, for larity): F
T
P6.4
T
y
;x
T
2
z
2
;w
T
T
2
z
2
;w
2
2
T
P6.5
1
2
1
2
R
2
m
T
1
P6.7
m
R
1
m
R
1
1
m
R
1
n
n
1
1
n
R
T
1
m
R
1
n
P6.8
R
T
T
(n 2)
n
T
T
T
T
T
T
T
T
T
n
1
P6.10 1
1
n
1
T
1
1
1
2
T
T
T
2
n
2
1
T
n
P6.11
2
3
2
3
6 7 6 0 7 6 7 6 7 6 7 6 0 7 6 7 6 7 77 ; A P P A = 66 77 ; A=6 6 6 6 77 77 6 6 4 4 5 5 2 3 2 3 0 0 0 0 6 0 7 6 0 7 6 7 6 7 6 0 7 6 0 7 6 7 6 7 77 ; A P P P A = 66 77 ; A AP P = 6 6 6 6 77 0 77 6 6 4 5 4 0 5 0 12
45
46
13
45
154
46
47
2
0 0 0 6 0 6 6 0 6 ? 0 A AP P = 6 6 6 0 6 4 0 0 23
24
0
3
77 77 77 : 77 5
After these transformations A is bidiagonal on the rst row. A similar problem is obtained, but of dimension (m 1) (n 1); the square p p blo k shifted one diagonal position.