Constructive Aspects for the Generalized Orthogonal Group

by Steven Christopher Fuller

A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Auburn, Alabama
May 14, 2010

Keywords: generalized orthogonal group, generalized orthogonal matrix, generalized Householder matrix, indefinite inner product, elementary reflection

Copyright 2010 by Steven Christopher Fuller

Approved by:
Frank Uhlig, Chair, Professor of Mathematics
Greg Harris, Associate Professor of Mathematics
Douglas Leonard, Professor of Mathematics
Chris Rodger, Scharnagel Professor of Mathematics

Abstract

Our main result is a constructive proof of the Cartan-Dieudonné-Scherk Theorem over the real or complex fields. The Cartan-Dieudonné-Scherk Theorem states that for fields of characteristic other than two, every orthogonality can be written as the product of a certain minimal number of reflections across hyperplanes. The earliest proofs were not constructive, and more recent constructive proofs either do not achieve minimal results or are restricted to special cases. For the real or complex fields, this paper presents a constructive proof for decomposing a generalized orthogonal matrix into the product of the minimal number of generalized Householder matrices. A pseudo code and the MATLAB code of our algorithm are provided. The algorithm factors a given generalized orthogonal matrix into the product of the minimal number of generalized Householder matrices specified in the CDS Theorem. We also look at some applications of generalized orthogonal matrices. Generalized Householder matrices can be used to study the form of Pythagorean n-tuples and to generate them. Not every matrix can be factored in a QR-like form when a generalized orthogonal matrix is used in place of a standard orthogonal matrix. We find conditions on a matrix under which an indefinite QR factorization is possible, and see how close we can bring a general matrix to an indefinite QR factorization using generalized Householder eliminations.

Acknowledgments

I would like to thank my advisor, Dr. Uhlig, for all of his guidance throughout my master's and PhD programs. His experience and suggestions have often been crucial to my progress. I appreciate the assistance of the AU math department, my committee members, and other influential professors at Auburn. I would also like to thank my family and friends for their support, particularly my dad, whose encouragement and sacrifices have always made reaching my goals possible, and my wife, Torri, for all that she does and all that she means to me.

Table of Contents

Abstract
Acknowledgments
1 Introduction
2 A constructive proof of the CDS Theorem
  2.1 Outline
  2.2 Diagonal Congruence of a Symmetric Matrix
  2.3 Problem Reduction
  2.4 Case of D(U − I_n) not skew-symmetric
  2.5 Case of D(U − I_n) skew-symmetric
  2.6 Construction
3 Pseudo Code for a CDS Factorization over R or C
4 Some Applications of generalized orthogonal matrices
  4.1 Pythagorean n-tuples
  4.2 Indefinite QR factorization
Bibliography
Appendix: MATLAB code for the CDS Theorem factorization

Chapter 1
Introduction

Over a field F with char F ≠ 2, the Cartan-Dieudonné-Scherk Theorem states that a generalized orthogonal matrix can be factored into the product of a certain minimal number of generalized Householder matrices. Thus the CDS Theorem states that the elements of the generalized orthogonal group can be decomposed into the group's "basic building blocks." The CDS Theorem has been called the Fundamental Theorem of Algebraic Groups because of this.

The identity matrix I_n is the most basic orthogonal as well as generalized orthogonal matrix. It is the unit element of the group. Generalized Householder matrices are rank one modifications of the identity I_n and generalized orthogonal matrices. Thus they are the simplest generalized orthogonal matrices different from the identity. The CDS Theorem itself can be used to show that every generalized orthogonal matrix that is rank one removed from the identity is a generalized Householder matrix. Thus the generalized orthogonal group's "basic building blocks" are these generalized Householder matrices.

Cartan-Dieudonné-Scherk Theorem (CDS Theorem). Let F be a field with char F ≠ 2 and S ∈ M_n(F) be nonsingular symmetric. Then every S-orthogonal matrix Q ∈ M_n(F) can be expressed as the product of rank(Q − I_n) generalized S-Householder matrices, unless S(Q − I_n) is skew-symmetric. If S(Q − I_n) is skew-symmetric, then the same holds with rank(Q − I_n) + 2 generalized S-Householder matrices. Furthermore, the number of factors is minimal in either case.

In this paper, we will present a constructive proof of the CDS Theorem with F equal to R or C. We address the complications that arise when the inner product defined by a nonsingular symmetric matrix S is indefinite. For a given S-orthogonal matrix Q, we will explicitly determine a set of S-Householder matrix factors whose product is Q. Our construction, which relies mainly on the diagonalization of a symmetric matrix, factors Q into the product of the minimal number of S-Householder matrix factors as stated in the theorem.

We begin by establishing some notations that we use. Let F be a field and S ∈ M_n(F) be a nonsingular symmetric matrix. A vector x ∈ F^n is called S-isotropic if x^T S x = 0 and non-S-isotropic otherwise. A matrix S = S^T defines a symmetric bilinear inner product which may allow nonzero S-isotropic vectors. Two vectors x and y in F^n are called S-orthogonal if x^T S y = y^T S x = 0, and a matrix Q ∈ M_n(F) is called S-orthogonal provided Q^T S Q = S.

In a positive-definite inner product space over F with S = I_n the orthogonal Householder matrix has the form

  H = I_n − 2uu^T/(u^T u)

for a nonzero vector u ∈ F^n. The transformation induced by H represents a reflection across the hyperplane orthogonal to u. In Numerical Linear Algebra, Householder matrices over both R or C are often used to transform a matrix to Hessenberg form by similarity or to find a QR decomposition of a matrix. More on Householder matrices can be found in [8, 11] for example.
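For illustration, here is a minimal MATLAB sketch of such a standard reflection; the vector x and its dimension are hypothetical, and the sign of alpha follows the usual numerically stable convention:

x = [3; 4; 0];                        % hypothetical vector in R^3
alpha = -sign(x(1))*norm(x);          % |alpha| = ||x||, sign chosen to avoid cancellation
u = x - alpha*eye(3,1);               % u = x - alpha*e_1, the reflection direction
H = eye(3) - 2*(u*u')/(u'*u);         % H = I_n - 2uu^T/(u^Tu)
H*x                                   % yields alpha*e_1 = (-5; 0; 0)

Applying the same H to each column of a matrix performs one elimination step of a QR decomposition.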
For a general nonsingular symmetric matrix S ∈ M_n(F), Householder matrices generalize to S-Householder matrices, which have the form

  H_{S,u} := I_n − 2(uu^T S)/(u^T S u)

for a non-S-isotropic vector u ∈ F^n. Similar to the standard S = I_n case, we have H_{S,u} = H_{S,u}^{-1} and H_{S,u} is S-orthogonal; however, H_{S,u}^T does not necessarily equal H_{S,u}^{-1} for S-Householder matrices with general S = S^T. By construction, both the standard Householder matrices and the S-Householder matrices are rank one modifications of the identity matrix. Having to avoid S-isotropic vectors when forming S-Householder matrices is the main complication when trying to apply Householder matrices in indefinite inner product spaces. Householder matrices were generalized in [10] not only to symmetric bilinear inner products but also to skew-symmetric bilinear and Hermitian and skew-Hermitian sesquilinear inner products.

While the CDS Theorem states a precise minimum number of generalized Householder matrix factors that is required to express a generalized orthogonal matrix, this minimal factorization is not unique. The CDS Theorem applied to the identity I_n is an interesting case for which the factorization is not unique. Since the zero matrix is trivially skew-symmetric, two generalized Householder matrix factors are needed to express I_n. Used as its own inverse, any generalized Householder matrix times itself is a factorization of I_n. Hence in the CDS Theorem, the identity is to be considered a repeated reflection rather than a lack of reflections.

The earliest results regarding the CDS Theorem date back to 1938 by E. Cartan. Working over R or C, Cartan proved that every orthogonal transformation of R^n (or C^n) is the product of at most n reflections [3]. In 1948 J. Dieudonné extended the result to arbitrary fields with characteristic not two [5]. Finally, P. Scherk obtained the minimal number and dealt with arbitrary fields F of char F ≠ 2 in 1950. Part of Scherk's proof [13] relies on an existence argument and is not entirely constructive. In the literature the CDS Theorem is often stated with the non-minimal upper bound of n factors and Scherk's name is omitted, see [6, 14].

From an algebraic point of view, the fact that every generalized orthogonal group over a field F with char F ≠ 2 is generated by products of reflections, regardless of the minimal number of factors, is the important notion behind the CDS Theorem [14]. However, from a numerical standpoint, an algorithm that factors an S-orthogonal matrix into the product of the minimal number of S-Householder matrices is desirable. The structure of generalized orthogonal groups is studied in [15], and there the CDS Theorem is not only presented with the minimal upper bound of factors but also extended to fields with characteristic two; the conclusion being that every generalized orthogonal group, except in a space of dimension four over a field of characteristic two, is generated by reflections.

Over general fields the CDS Theorem is used in algebraic geometry and group theory. Our interest lies in numerical applications, and hence we work over R or C. Generalized real and complex orthogonal Householder matrices have applications in computer graphics, crystallography, and signal processing where a least-squares problem is repeatedly modified and solved [1, 12]. Some of these applications show less sensitivity to rounding errors compared to other methods while only slightly increasing, and at times even decreasing, computation requirements [12].
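To make the definition of H_{S,u} concrete, here is a small MATLAB sketch with a hypothetical nonsingular symmetric S and a non-S-isotropic u; it checks the properties stated above:

S = [2 1 0; 1 3 0; 0 0 -1];           % hypothetical nonsingular symmetric S
u = [1; 0; 1];                        % u'*S*u = 1, so u is non-S-isotropic
H = eye(3) - 2*(u*(u'*S))/(u'*S*u);   % H_{S,u} = I_n - 2 u u^T S/(u^T S u)
norm(H'*S*H - S)                      % ~ 0: H is S-orthogonal
norm(H*H - eye(3))                    % ~ 0: H = H^{-1}
norm(H' - inv(H))                     % nonzero here: H^T ~= H^{-1} for general S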
Our goal is to prove the CDS Theorem with F equal to R or C constructively. It should be noted that we use matrix transposes for both the real and complex case. Typically, matrix conjugate-transposes are used for extensions to the complex case. However, the work leading up to the CDS Theorem deals with general fields and only with orthogonal matrices. An extension of the CDS Theorem to the complex field with matrix conjugate-transposes and unitary matrices is an interesting, but different, case. This is open.

Existing proofs of the CDS Theorem are not constructive, and constructive proofs either do not achieve minimal results or are restricted to special cases. Scherk's proof [13] uses construction in parts, but some key steps are proved through existence arguments. Specifically, the existence of a non-isotropic vector that is needed to form a Householder factor [13, Lemma 4] is not proved in a constructive way by Scherk. The outline of our proof closely resembles [13] when restricted to F equal to R or C; however, we construct all elements for the S-Householder factorization of a generalized orthogonal matrix explicitly rather than rely on their existence.

In 2001, F. Uhlig gave a constructive proof over R in the case of S = I_n, and in the general real or complex case he found a constructive factorization requiring a non-minimal number of 2n − 1 S-Householder matrices [16]. Moreover, he noted that for the generalized real problem one may deal with a nonsingular diagonal matrix D that is congruent to the nonsingular symmetric matrix S rather than work with the more general matrix S; this is also mentioned in [7]. The proof of Uhlig's first result cannot be extended directly to general S because in the case of S = I_n the complication of nonzero isotropic vectors does not exist. His second result requires up to 2n − 1 factors by applying n − 1 generalized Householder matrices to reduce a matrix U to a matrix diag(±1). To complete the factorization, up to n additional generalized Householder matrices are needed to reduce the diagonal matrix diag(±1) to the identity I_n. By sorting the order in which the columns of U are updated, his method can be modified to obtain a smaller, but still non-minimal, upper bound of n − 1 + min{n+, n−} factors, where n+ and n− are the number of +1 and −1 entries in D, respectively. Whether the method can be further improved to construct the minimal number of factor matrices is unclear. There is a constructive proof [1] over R, C, or Q using Clifford algebras, but it requires up to n reflections to represent an orthogonal transformation of an n-dimensional inner product space.

Chapter 2
A constructive proof of the CDS Theorem

2.1 Outline

In Section 2.2 we detail a diagonalization process for real or complex symmetric matrices that will be utilized throughout the paper. Following [16], Section 2.3 establishes that the construction of the S-Householder factors of an S-orthogonal matrix Q can be reduced, via a matrix D = diag(±1) that is congruent to S, to constructing the D-Householder matrix factors of a D-orthogonal matrix U that is similar to Q. Then we constructively prove the CDS Theorem over R or C by induction on rank(U − I_n). The base case is verified by showing that U is a D-Householder matrix whenever rank(U − I_n) = 1. This establishes the generalized Householder matrices as elementary building blocks of the generalized orthogonal group. For rank(U − I_n) > 1, we address the two cases that D(U − I_n) is skew-symmetric or not.
In Section 2.4, we treat the case of a D-orthogonal matrix U for which D(U − I_n) is not skew-symmetric. A non-D-isotropic vector w is found satisfying certain conditions in order to form an appropriate Householder update matrix H_{D,w}. Because of our choice of w, this H_{D,w} will guarantee two conditions, namely that D(H_{D,w}U − I_n) is not skew-symmetric and that rank(H_{D,w}U − I_n) = rank(U − I_n) − 1. This determines one D-Householder factor of U and reduces the problem. Repetition finally establishes

  rank(H_{D,w_{r−1}} H_{D,w_{r−2}} ··· H_{D,w_1} U − I_n) = 1.

In Section 2.5, we deal with a skew-symmetric D(U − I_n). We first update U via H_{D,w} so that rank(H_{D,w}U − I_n) = rank(U − I_n) + 1. Here the rank increases rather than decreases, but our update forces D(H_{D,w}U − I_n) to be not skew-symmetric. Thus after one extra update, the method in Section 2.4 for D(U − I_n) not skew-symmetric can be used for all subsequent iterations. This accounts for the additional two factors that are required in the CDS Theorem when D(U − I_n) is skew-symmetric. Finally we discuss the details of our construction in Section 2.6.

2.2 Diagonal Congruence of a Symmetric Matrix

Given a symmetric matrix A in M_n(R) or M_n(C), in several instances we will need to find a full rank matrix C in M_n(R) or M_n(C), respectively, such that C^T A C is diagonal. Over the reals, it is well known that a real symmetric matrix is orthogonally diagonalizable. If A ∈ M_n(R) is symmetric and the columns of C ∈ M_n(R) are orthonormal eigenvectors of A, then C^T A C = Λ, where C^T C = I_n and diag(Λ) = (λ_1, λ_2, ..., λ_n) for the eigenvalues λ_1, λ_2, ..., λ_n of A. While not all complex symmetric matrices can be diagonalized in the usual sense using a unitary matrix and its conjugate-transpose, any complex symmetric matrix can be diagonalized with a unitary matrix and its transpose.

Takagi Factorization ([9, Corollary 4.4.4]). If A ∈ M_n(C) is symmetric, there is a unitary matrix C ∈ M_n(C) with C*C = I_n and a nonnegative diagonal matrix Σ ∈ M_n(R) such that C^T A C = Σ.

These factorizations not only exist but they can be constructed, see [9, Section 4.4]. Note that the entries of Σ are real and nonnegative. We only need this C to be invertible, but it happens to be unitary. This is the only place we use a unitary matrix in this paper. We will refer to the eigenvalue decomposition of a real symmetric matrix or to the Takagi factorization of a complex symmetric matrix as a T-diagonalization.
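In MATLAB, the real branch of such a T-diagonalization is a single call to eig; a minimal sketch with a hypothetical symmetric A (the complex branch would call a Takagi routine such as the one in the Appendix instead):

A = [0 1 0; 1 0 2; 0 2 0];            % hypothetical real symmetric A
[C, Lambda] = eig(A);                 % C has orthonormal eigenvector columns
norm(C.'*A*C - Lambda)                % ~ 0, so C^T A C is diagonal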
2.3 Problem Reduction

S-orthogonality of a matrix Q is equivalent to D-orthogonality of a matrix U that is similar to Q. Here D = diag(±1) is the inertia matrix of the nonsingular matrix S = S^T in M_n(R). Following the same reduction in M_n(C), the nonsingular matrix S = S^T is always congruent to D = I_n. Thus a generalized S-orthogonal matrix in M_n(C) is equivalent to a standard orthogonal matrix in M_n(C). Our construction is valid for both R and C so we will continue to treat them together. For the remainder of this paper we will limit F to be either R or C.

Lemma 2.1. If S = S^T ∈ M_n(F) is nonsingular symmetric, then
(i) S is congruent to a diagonal matrix D with diagonal entries ±1 via an invertible matrix Y ∈ M_n(F), i.e., S = Y^T D Y, and
(ii) a matrix Q ∈ M_n(F) is S-orthogonal if and only if Q is similar to a D-orthogonal matrix U = YQY^{-1} via the same Y from (i).

Proof. (i) This is Sylvester's law of inertia [8, 9] over R, but a similar proof holds for C. According to Section 2.2, T-diagonalize the nonsingular symmetric matrix S ∈ M_n(F) so that V^T S V = Λ for a real diagonal matrix Λ ∈ M_n(R) and an invertible matrix V ∈ M_n(F). Let λ_1, λ_2, ..., λ_n be the diagonal entries of Λ. If L = diag(1/√|λ_j|), then

  L^T V^T S V L = L^T diag(λ_j) L = diag(sign(λ_j)) = diag(±1) = D.

Working over C, all λ_j are positive according to the Takagi Factorization so that D = I_n. Over R or C, letting Y = (VL)^{-1} we have

  S = Y^T D Y.    (2.1)

(ii) Let S be a nonsingular symmetric matrix with (2.1) for Y invertible and D = diag(±1) according to part (i). Then

  Q is S-orthogonal
  ⟺ Q^T S Q = S
  ⟺ Q^T (Y^T D Y) Q = Y^T D Y
  ⟺ Y^{-T}(Q^T Y^T D Y Q)Y^{-1} = Y^{-T}(Y^T D Y)Y^{-1} = D
  ⟺ U^T D U = D for U = Y Q Y^{-1}
  ⟺ U is D-orthogonal and Q is similar to U.

If H_{D,w} is the D-Householder matrix formed from a non-D-isotropic vector w, then

  H_{D,w} = Y H_{S,Y^{-1}w} Y^{-1}    (2.2)

for S, D, and Y as in Lemma 2.1, because

  H_{D,w} = I_n − 2(ww^T D)/(w^T D w)
          = I_n − 2((YY^{-1})ww^T(Y^{-T}Y^T)D(YY^{-1}))/(w^T(Y^{-T}Y^T)D(YY^{-1})w)
          = I_n − Y [2(Y^{-1}w w^T Y^{-T}(Y^T D Y))/(w^T Y^{-T}(Y^T D Y)Y^{-1}w)] Y^{-1}
          = Y [I_n − 2(Y^{-1}w (Y^{-1}w)^T S)/((Y^{-1}w)^T S Y^{-1}w)] Y^{-1}
          = Y H_{S,Y^{-1}w} Y^{-1}.

This shows the relationship between D-Householder and S-Householder matrices. Consequently, if a D-orthogonal matrix U ∈ M_n(F) is written as the product of r D-Householder matrix factors as

  U = ∏_{j=1}^{r} H_{D,w_j}

for a set of r non-D-isotropic vectors {w_j ∈ F^n}, then by Lemma 2.1 and (2.2) we have

  Q = Y^{-1} U Y = Y^{-1} H_{D,w_1} H_{D,w_2} ··· H_{D,w_r} Y = H_{S,Y^{-1}w_1} H_{S,Y^{-1}w_2} ··· H_{S,Y^{-1}w_r}.

Hence, Q ∈ M_n(F) is the product of the r S-Householder matrices H_{S,Y^{-1}w_i}. Thus a factorization of an S-orthogonal matrix Q into a product of S-Householder matrices is equivalent to one for the corresponding D-orthogonal matrix U into a product of D-Householder matrices. For the remainder of this paper we denote by H_w, with a single subscript, the D-Householder matrix formed by the non-D-isotropic vector w and a fixed sign matrix D = diag(±1).

2.4 Case of D(U − I_n) not skew-symmetric

Assume that D = diag(±1), U ∈ M_n(F) is D-orthogonal with rank D(U − I_n) > 1, and D(U − I_n) is not skew-symmetric. In this case we show how to construct a D-Householder matrix H_w such that D(H_w U − I_n) is not skew-symmetric and rank D(H_w U − I_n) = rank D(U − I_n) − 1. The remaining cases are treated separately in the sections that follow.

Later we see that finding a D-Householder H_w that reduces the rank of D(H_w U − I_n) by one is easier than finding one that ensures that D(H_w U − I_n) is not skew-symmetric. For this reason, we will focus on the more difficult second goal first. For either task we need to find a vector v ∈ F^n with v^T N v ≠ 0, where N = D(U − I_n). This v generates a non-D-isotropic vector w = (U − I_n)v used to form H_w.

Lemma 2.2. For an arbitrary non-skew-symmetric matrix N ∈ M_n(F), a vector v with v^T N v ≠ 0 can always be constructed.

Proof. Since N is not skew-symmetric, we may T-diagonalize the nonzero symmetric matrix N + N^T:

  C^T(N + N^T)C = Λ,

where C has full rank and Λ is a diagonal matrix with rank(N + N^T) ≥ 1 nonzero diagonal entries. If v is a column of C corresponding to a nonzero diagonal entry α of Λ, then v^T(N + N^T)v = α. Since v^T N v = (v^T N v)^T, we have v^T N v = α/2 ≠ 0.
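The proof of Lemma 2.2 is directly implementable; a minimal MATLAB sketch with a hypothetical non-skew-symmetric N:

N = [1 2 0; -2 0 1; 0 -1 0];          % hypothetical N with N + N^T ~= 0
[C, Lambda] = eig(N + N.');           % T-diagonalize the symmetric part
[tmp, k] = max(abs(diag(Lambda)));    % column for a nonzero diagonal entry alpha
v = C(:,k);
v.'*N*v                               % equals alpha/2, in particular nonzero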
The vector v of Lemma 2.2 will be useful in some cases, but there is no assurance that the corresponding D(H_w U − I_n) will not be skew-symmetric. In Lemma 2.5, a test is established to determine whether D(H_w U − I_n) is skew-symmetric or not. The test relies on the inequality

  N + N^T ≠ (1/(v^T N v)) (N v v^T N + N^T v v^T N^T).    (2.3)

The following lemma is used repeatedly in our main construction step (Lemma 2.4).

Lemma 2.3. For N ∈ M_n(F), let v, b ∈ F^n satisfy v^T N v ≠ 0, b^T(N + N^T)b = 0, and b^T N v v^T N b ≠ 0. Then (2.3) is satisfied.

Proof. First b^T N v v^T N b = (b^T N v v^T N b)^T = b^T N^T v v^T N^T b. Now (2.3) follows by comparing b^T(N + N^T)b = 0 and

  b^T [(1/(v^T N v))(N v v^T N + N^T v v^T N^T)] b = 2(b^T N v v^T N b)/(v^T N v) ≠ 0.

The next result explains how to choose a vector v satisfying both v^T N v ≠ 0 and (2.3) in order to form w = (U − I_n)v and the Householder update H_w. This lemma is similar to a previous result [13, Lemma 4, pg. 484] that establishes only the existence of such a vector. The proof in [13] finds a contradiction when assuming that all non-isotropic vectors satisfy the negation of (2.3) by examining the dimension of a certain isotropic subspace. Here we construct a vector with the desired properties by finding a diagonal congruence of the symmetric matrix N + N^T and choosing a suitable linear combination of the columns of the diagonalizing matrix.

Lemma 2.4. Let N ∈ M_n(F) satisfy rank(N) > 1 and N + N^T ≠ 0. Unless rank(N) = rank(N + N^T) = 2, a vector v with v^T N v ≠ 0 that satisfies (2.3) can be found by construction. The case of rank(N) = rank(N + N^T) = 2 and, furthermore, any case with rank(N) even and N + N^T ≠ 0, is treated after Lemma 2.7.

Proof. We determine a vector v in three separate cases that depend on the rank of the matrix N + N^T.

Case 1. Suppose rank(N + N^T) = 1. T-diagonalize the symmetric matrix N + N^T:

  C^T(N + N^T)C = Λ,

where C is full rank and Λ is a diagonal matrix. Without loss of generality let the columns of C be ordered so that diag(Λ) = (α, 0, ..., 0) for α ≠ 0. Hence we have c_1^T(N + N^T)c_1 = α and c_j^T(N + N^T)c_k = 0 for j and k not both one. Since c_j^T(N + N^T)c_k = c_j^T N c_k + c_j^T N^T c_k = c_j^T N c_k + c_k^T N c_j, we have

  c_j^T N c_k = α/2           if j = k = 1,
  c_j^T N c_k = 0             if j = k ≠ 1,
  c_j^T N c_k = −c_k^T N c_j  if j ≠ k.

Thus the matrix C^T(N + N^T)C is the zero matrix except for its (1,1) entry α, and only one diagonal entry of C^T N C is nonzero, namely (C^T N C)_{11} = α/2. However, in C^T N C there must be at least one nonzero off-diagonal entry since rank(C^T N C) = rank(N) > 1 is assumed. Schematically, with * denoting possibly nonzero entries,

  C^T(N + N^T)C = [ α  0  ···  0 ]      C^T N C = [ α/2  *  ···  * ]
                  [ 0  0  ···  0 ]                [  *   0  ···  * ]
                  [ ⋮  ⋮       ⋮ ]                [  ⋮   ⋮       ⋮ ]
                  [ 0  0  ···  0 ],               [  *   *  ···  0 ].

Case 1 (a). If there is a nonzero off-diagonal entry in the first column or row of C^T N C, then for some j ≠ 1, c_j^T N c_1 = −c_1^T N c_j ≠ 0. In this case we let v = c_1 and b = c_j. Then v^T N v = α/2 ≠ 0, b^T(N + N^T)b = 0, and b^T N v v^T N b = c_j^T N c_1 c_1^T N c_j = −(c_1^T N c_j)^2 ≠ 0. By Lemma 2.3, v satisfies (2.3).

Case 1 (b). Next suppose c_1^T N c_j and c_j^T N c_1 are both zero for all j ≠ 1. Then C^T N C has a nonzero entry that is neither on the diagonal nor in its first row or column. Thus there exists a c_j^T N c_k = −c_k^T N c_j ≠ 0 with k ≠ j and neither j nor k equal to 1. In this case we let v = c_1 + c_j and b = c_k. Then

  v^T N v = c_1^T N c_1 + c_1^T N c_j + c_j^T N c_1 + c_j^T N c_j = c_1^T N c_1 ≠ 0,

b^T(N + N^T)b = 0, and

  b^T N v v^T N b = (c_k^T N c_1 + c_k^T N c_j)(c_1^T N c_k + c_j^T N c_k) = −(c_j^T N c_k)^2 ≠ 0.

By Lemma 2.3, v satisfies (2.3). Of course, v = c_1 + c_k will also satisfy the three conditions. This allows some freedom of choice in the implementation of the construction. In Case 1 (a) the nonzero pair of C^T N C sits in the first row and column; in Case 1 (b) it sits at positions (j,k) and (k,j) outside the first row and column.

Case 2. Suppose rank(N + N^T) = 2 and rank(N) > 2. Again we T-diagonalize the symmetric matrix N + N^T as

  C^T(N + N^T)C = Λ = diag(α, β, 0, ..., 0),

where C has full rank with columns ordered so that the two nonzero entries α and β are first and second along the diagonal of Λ, i.e.,

  c_j^T(N + N^T)c_k = α if j = k = 1,  β if j = k = 2,  0 otherwise.
As before c_j^T(N + N^T)c_k = c_j^T N c_k + c_j^T N^T c_k = c_j^T N c_k + c_k^T N c_j, so

  c_j^T N c_k = α/2           if j = k = 1,
  c_j^T N c_k = β/2           if j = k = 2,
  c_j^T N c_k = 0             if j = k ∉ {1,2},
  c_j^T N c_k = −c_k^T N c_j  if j ≠ k.

The matrix C^T N C has only two nonzero diagonal entries, namely (C^T N C)_{11} = α/2 and (C^T N C)_{22} = β/2. Since rank(C^T N C) = rank(N) > 2, there must be a nonzero off-diagonal entry in C^T N C. Furthermore, there must be at least one nonzero entry in C^T N C that is neither along the diagonal nor in the leading 2×2 block.

Case 2 (a). Suppose there is a nonzero off-diagonal entry in the first or second row or column of C^T N C that is not in the leading 2×2 block. Then c_j^T N c_k = −c_k^T N c_j ≠ 0 for j ∈ {1,2} and k ∈ {3,4,...,n}. In this case we let v = c_j and b = c_k. Then v^T N v equals α/2 or β/2 depending on j, b^T(N + N^T)b = 0 since k ∉ {1,2}, and

  b^T N v v^T N b = c_k^T N c_j c_j^T N c_k = −(c_j^T N c_k)^2 ≠ 0.

By Lemma 2.3, v satisfies (2.3).

Case 2 (b). Next suppose all entries in the first or second row and column of C^T N C outside the leading 2×2 block are zero. Then there exist j, k ∈ {3,4,...,n} such that c_j^T N c_k = −c_k^T N c_j ≠ 0. In this case we let v = c_1 + c_j and b = c_k. Then

  v^T N v = c_1^T N c_1 + c_1^T N c_j + c_j^T N c_1 + c_j^T N c_j = c_1^T N c_1 = α/2 ≠ 0,

b^T(N + N^T)b = 0, and

  b^T N v v^T N b = (c_k^T N c_1 + c_k^T N c_j)(c_1^T N c_k + c_j^T N c_k) = −(c_j^T N c_k)^2 ≠ 0.

By Lemma 2.3, v satisfies (2.3). Again, v = c_2 + c_j will satisfy the conditions, and thus the constructive process is flexible. In Case 2 (a) the nonzero pair of C^T N C lies in the first two rows and columns outside the leading 2×2 block; in Case 2 (b) it lies at positions (j,k) and (k,j) with j, k ≥ 3.

Case 3. Suppose rank(N + N^T) > 2. In this case, any vector v with v^T N v ≠ 0 will satisfy (2.3) because N v v^T N + N^T v v^T N^T is the sum of two dyads and can have rank at most 2. We can construct v according to Lemma 2.2.

Next we show that a vector v constructed to satisfy both v^T N v ≠ 0 and (2.3) provides a non-D-isotropic vector w = (U − I_n)v that guarantees that D(H_w U − I_n) is not skew-symmetric for the D-Householder matrix H_w formed by w.

Lemma 2.5. Let D = diag(±1), U ∈ M_n(F) be D-orthogonal, N = D(U − I_n), and v ∈ F^n satisfy v^T N v ≠ 0. Then
(i) w = (U − I_n)v is non-D-isotropic, and
(ii) if H_w is the D-Householder matrix formed by w, D(H_w U − I_n) is not skew-symmetric if and only if (2.3) holds for v.

Proof. Notice that for a D-orthogonal matrix U

  D(U − I_n) + (D(U − I_n))^T = −(U^T − I_n)D(U − I_n)    (2.4)

because

  D(U − I_n) + (D(U − I_n))^T = DU − D + U^T D − D = −(U^T D U − DU − U^T D + D) = −(U^T − I_n)D(U − I_n).

Then the first claim (i) can be seen directly:

  w^T D w = v^T(U − I_n)^T D(U − I_n)v = −v^T(N + N^T)v = −2v^T N v.    (2.5)

Thus w is non-D-isotropic if and only if v^T N v ≠ 0. Now consider

  D(H_w U − I_n) = D[(I_n − 2(ww^T D)/(w^T D w))U − I_n]
                 = D(U − I_n) − 2(Dww^T D U)/(w^T D w)
                 = N − 2(Nv v^T(U^T − I_n)DU)/(−2v^T N v)
                 = N + (Nv v^T(D − DU))/(v^T N v)
                 = N − (Nv v^T N)/(v^T N v).

Therefore D(H_w U − I_n) + [D(H_w U − I_n)]^T = 0, i.e., D(H_w U − I_n) is skew-symmetric, if and only if

  N − (Nv v^T N)/(v^T N v) + N^T − (N^T v v^T N^T)/(v^T N v) = 0,

or if and only if

  N + N^T = (1/(v^T N v))(N v v^T N + N^T v v^T N^T),

which is the negation of (2.3).

Lemma 2.6. For two matrices A, B ∈ M_n(F)

  rank(AB − I_n) ≤ rank(A − I_n) + rank(B − I_n).    (2.6)

Lemma 2.6 can easily be seen by considering the dimensions of the kernels of A − I_n, B − I_n, and AB − I_n. It is proved in [13, Lemma 3, pg. 483]. The result is used in the following lemma to show that rank(H_w U − I_n) is exactly one less than rank(U − I_n). Lemma 2.6 is also utilized in Section 2.6 to show that the number of D-Householder factors needed to express a D-orthogonal matrix U is minimal.
Lemma 2.7. Let D = diag(±1) and U ∈ M_n(F) be D-orthogonal. If N = D(U − I_n) is not skew-symmetric and v ∈ F^n satisfies v^T N v ≠ 0, then

  rank(H_w U − I_n) = rank(U − I_n) − 1,    (2.7)

where

  H_w = I_n − 2(ww^T D)/(w^T D w)    (2.8)

is the generalized Householder transform for w = (U − I_n)v.

Proof. Define w = (U − I_n)v. By Lemma 2.5, w is not D-isotropic. If x ∈ ker(U − I_n), then

  w^T D x = v^T(U − I_n)^T D x = −v^T[(U^T − I_n)D(U − I_n) + D(U − I_n)]x = 0    (2.9)

by (2.4), and

  H_w U x = H_w x = x − (2(w^T D x)/(w^T D w))w = x

by (2.8) and (2.9). Thus x ∈ ker(H_w U − I_n), which shows ker(U − I_n) ⊆ ker(H_w U − I_n). If v satisfies v^T N v ≠ 0 then

  v^T D(U − I_n)v = v^T N v ≠ 0.

Therefore v ∉ ker(U − I_n). We have w^T D w = −2v^T N v = −2v^T D w from (2.5) so that

  H_w v = v − (2(w^T D v)/(w^T D w))w = v + w = Uv.    (2.10)

Thus, H_w = H_w^{-1} and (2.10) imply that

  v = H_w H_w v = H_w U v,

or v ∈ ker(H_w U − I_n). Therefore ker(U − I_n) ∪ {v} ⊆ ker(H_w U − I_n), and since v ∉ ker(U − I_n),

  dim ker(U − I_n) + 1 ≤ dim ker(H_w U − I_n).

Thus rank(H_w U − I_n) ≤ rank(U − I_n) − 1. Finally by Lemma 2.6,

  rank(H_w H_w U − I_n) ≤ rank(H_w − I_n) + rank(H_w U − I_n),

or

  rank(U − I_n) ≤ 1 + rank(H_w U − I_n),

proving (2.7).

If N = D(U − I_n) is not skew-symmetric and rank(N) is even, then H_w formed as in Lemma 2.7 by any vector v with v^T N v ≠ 0 will satisfy (2.7). Therefore D(H_w U − I_n) cannot be skew-symmetric because rank D(H_w U − I_n) is odd. This addresses the case of rank N = rank(N + N^T) = 2 that was excluded in Lemma 2.4. Moreover, if rank(N) is even and greater than two while N + N^T ≠ 0, any vector v with v^T N v ≠ 0 will satisfy (2.7) and guarantee that the updated D(H_w U − I_n) is not skew-symmetric. A vector v satisfying v^T N v ≠ 0 for an arbitrary matrix N with N + N^T ≠ 0 has been constructed in Lemma 2.2. The vector v constructed in Lemma 2.4 also ensures both properties. The choice of vector used for the case of N not skew-symmetric and rank(N) even and greater than two leaves a certain freedom of choice with regards to the stability of the implementation.
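One full reduction step of this section can be sketched in MATLAB as follows. The D-orthogonal U below is a hypothetical example built as a product of two D-Householder factors, so rank(U − I_n) = 2; since this rank is even, any v with v^T N v ≠ 0 suffices, and we take the one from Lemma 2.2:

n = 3; D = diag([1 1 -1]);
Hh = @(w) eye(n) - 2*(w*(w'*D))/(w'*D*w);   % D-Householder from non-D-isotropic w
U = Hh([1;2;0])*Hh([0;1;3]);                % hypothetical D-orthogonal, rank(U - I) = 2
N = D*(U - eye(n));
[C, Lam] = eig(N + N.');                    % construct v as in Lemma 2.2
[tmp, k] = max(abs(diag(Lam)));
v = C(:,k);                                 % v.'*N*v ~= 0
w = (U - eye(n))*v;                         % non-D-isotropic by Lemma 2.5
U1 = Hh(w)*U;                               % one Householder factor peeled off
[rank(U - eye(n)), rank(U1 - eye(n))]       % returns [2 1], as in Lemma 2.7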
2.5 Case of D(U − I_n) skew-symmetric

Thus far, a reduction step can be executed as long as D(U − I_n) is not skew-symmetric. If D(U − I_n) is skew-symmetric, we now explain how to perform a generalized orthogonal update so that D(H_w U − I_n) is no longer skew-symmetric. Hence after one such step in the skew-symmetric case, the problem is reduced to the not skew-symmetric case of Section 2.4. In addition to choosing the Householder update matrix H_w, we prove that exactly two additional Householder factors are needed if D(U − I_n) is skew-symmetric.

Lemma 2.8. Let D = diag(±1) and U ∈ M_n(F) be D-orthogonal. Then D(U − I_n) is skew-symmetric if and only if

  (U − I_n)^2 = 0.    (2.11)

Proof. If D(U − I_n) is skew-symmetric, then

  D(U − I_n) + (D(U − I_n))^T = 0.    (2.12)

We obtain

  D(U − I_n)^2 + (U^T − I_n)D(U − I_n) = 0

by multiplying (2.12) on the right by U − I_n. By (2.4) and (2.12),

  D(U − I_n)^2 = 0.

Multiplying on the left by D gives us (2.11). On the other hand, if (2.11) holds then

  U(U − I_n) = U − I_n.    (2.13)

In this case

  (U^T − I_n)D(U − I_n) = (U^T − I_n)DU(U − I_n) = (U^T D U − DU)(U − I_n) = −D(U − I_n)^2.

Thus by (2.4) and (2.11), D(U − I_n) is skew-symmetric.

Lemma 2.9. Let D = diag(±1), U ∈ M_n(F) be D-orthogonal, w ∈ F^n be any non-D-isotropic vector, and H_w be the D-Householder matrix formed by w. If D(U − I_n) is skew-symmetric, then rank(H_w U − I_n) = r + 1 where r = rank(U − I_n). Moreover, r is even and r ≤ n/2.

Proof. For D(U − I_n) skew-symmetric, we know that (U − I_n)^2 = 0 from Lemma 2.8. Therefore

  im(U − I_n) ⊆ ker(U − I_n).    (2.14)

Hence for r = rank(U − I_n) we have r ≤ n − r, or r ≤ n/2. As D is nonsingular, r = rank D(U − I_n) as well, and since D(U − I_n) is skew-symmetric, r is even.

Next suppose x ∈ ker(U − I_n) and y ∈ im(U − I_n). Then y = (U − I_n)z for some z, and since D(U − I_n) is skew-symmetric we have

  y^T D x = z^T(U^T − I_n)D x = −z^T D(U − I_n)x = 0.

Therefore ker(U − I_n) ⊥_D im(U − I_n). Along with (2.14) this means all vectors in im(U − I_n) are D-isotropic. Let w ∈ F^n be any non-D-isotropic vector and define

  w^{⊥_D} = { x ∈ F^n | w^T D x = 0 }

as the space of vectors that are D-orthogonal to w. Since w is not D-isotropic, w ∉ im(U − I_n). That means that w is not D-orthogonal to ker(U − I_n), or w^{⊥_D} does not contain ker(U − I_n). Indeed, the space of vectors D-orthogonal to ker(U − I_n) has dimension r and contains im(U − I_n) by the above, so it equals im(U − I_n); were w D-orthogonal to all of ker(U − I_n), then w ∈ im(U − I_n), a contradiction. Therefore

  dim(ker(U − I_n) ∩ w^{⊥_D}) = n − r − 1.    (2.15)

The proof is complete by showing that

  ker(U − I_n) ∩ w^{⊥_D} = ker(H_w U − I_n).    (2.16)

Suppose first that x ∈ ker(U − I_n) ∩ w^{⊥_D}. Since x ∈ ker(U − I_n), Ux = x and

  (H_w U − I_n)x = (H_w − I_n)x = −(2(w^T D x)/(w^T D w))w.

The latter expression is zero because x ∈ w^{⊥_D}. Next suppose x ∈ ker(H_w U − I_n). Then Ux = H_w H_w U x = H_w x because H_w = H_w^{-1}. Hence

  (U − I_n)x = (H_w − I_n)x = −(2(w^T D x)/(w^T D w))w.    (2.17)

With D(U − I_n) skew-symmetric and (U^T − I_n)D(U − I_n) = 0 by (2.4), we have for any x that

  x^T(U^T − I_n)D(U − I_n)x = 0.    (2.18)

Substituting (2.17) into (2.18) gives

  (2(w^T D x)/(w^T D w))^2 (w^T D w) = 0.

The left side can only be zero if w^T D x = 0. Hence x ∈ w^{⊥_D}, and x ∈ ker(U − I_n) by (2.17). Therefore (2.16) holds, and finally by (2.15)

  dim ker(H_w U − I_n) = n − r − 1, or rank(H_w U − I_n) = rank(U − I_n) + 1.

Lemma 2.9 states that any non-D-isotropic vector w will suffice to form the update matrix H_w for which rank(H_w U − I_n) = rank(U − I_n) + 1 provided D(U − I_n) is skew-symmetric. For example, any standard unit vector e_j may be used here because e_j^T D e_j = d_j = ±1. In addition, Lemma 2.9 guarantees that rank(H_w U − I_n) is odd and consequently that D(H_w U − I_n) cannot be skew-symmetric.

In the case of D = I_n over R, the only D-orthogonal matrix U with D(U − I_n) skew-symmetric is the trivial U = I_n example. However, Section 2.5 is required to address general D. Here are D-orthogonal matrices U with skew-symmetric D(U − I_n) for n = 4 and n = 6 (with one consistent choice of signs):

  U = [  1   a   a   0
        −a   1   0   a
         a   0   1  −a
         0   a   a   1 ]  for any a ∈ F  and  D = diag(1, 1, −1, −1);

  U = [  1   a   b   b   0   a
        −a   1  −a  −a  −a   0
        −b   a   1   0  −b   a
         b  −a   0   1   b  −a
         0  −a  −b  −b   1  −a
         a   0   a   a   a   1 ]  for any a, b ∈ F  and  D = diag(1, 1, 1, −1, −1, −1).
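The n = 4 example can be checked numerically; a brief MATLAB verification (with a = 1) of Lemmas 2.8 and 2.9:

D = diag([1 1 -1 -1]);
U = [1 1 1 0; -1 1 0 1; 1 0 1 -1; 0 1 1 1];    % the n = 4 example with a = 1
norm(U'*D*U - D)                                % 0: U is D-orthogonal
norm(D*(U - eye(4)) + (D*(U - eye(4)))')        % 0: D(U - I) is skew-symmetric
norm((U - eye(4))^2)                            % 0: (U - I)^2 = 0, cf. Lemma 2.8
w = [1;0;0;0];                                  % any non-D-isotropic w, e.g. e_1
Hw = eye(4) - 2*(w*(w'*D))/(w'*D*w);
[rank(U - eye(4)), rank(Hw*U - eye(4))]         % [2 3]: the rank increases by one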
2.6 Construction

Let S ∈ M_n(F) be a nonsingular symmetric matrix that defines the indefinite inner product space. Let Q ∈ M_n(F) be S-orthogonal and r = rank(Q − I_n). Our construction factors Q into the product of a minimal number of S-Householder matrices. Depending on whether S(Q − I_n) is skew-symmetric or not, the minimal number of factors is r + 2 or r, respectively.

We begin by finding D = Y^{-T} S Y^{-1} = diag(±1) and the D-orthogonal matrix U = Y Q Y^{-1} as detailed in the proof of Lemma 2.1. Note that rank(U − I_n) = rank Y(Q − I_n)Y^{-1} = r, and S(Q − I_n) is skew-symmetric if and only if N = D(U − I_n) is. As shown in Section 2.3, the factorization of Q into the product of S-Householder matrices is equivalent to the factorization of U into the product of D-Householder matrices.

If r = 1, then U is a D-Householder matrix. Clearly U = I_n + xy^T for two nonzero vectors x, y ∈ F^n. Since U is D-orthogonal, or U^T D U − D = 0, we have

  0 = (I_n + yx^T)D(I_n + xy^T) − D = Dxy^T + yx^T D(I_n + xy^T).    (2.19)

Multiplying on the right by a nonzero vector z we obtain

  0 = (y^T z)Dx + [x^T D(I_n + xy^T)z]y.

Therefore y and Dx are linearly dependent, or y = γDx for some nonzero γ ∈ F. Substituting into (2.19) gives

  0 = γDxx^T D + γDxx^T D(I_n + γxx^T D) = γ(2 + γx^T D x)Dxx^T D.

Since x is nonzero, the dyad Dxx^T D is nonzero. Thus γ = −2/(x^T D x), and we have

  U = I_n + γxx^T D = I_n − 2(xx^T D)/(x^T D x).

Hence U is a D-Householder matrix. We next need to construct a non-D-isotropic vector to form U. If r = 1, N cannot be skew-symmetric, so there exists a constructible vector v with v^T N v ≠ 0 by Lemma 2.2. Then for w = (U − I_n)v,

  w = −(2(x^T D v)/(x^T D x))x = ψx

for ψ ∈ F. We have that w is not D-isotropic from (2.5) and

  U = I_n − 2(xx^T D)/(x^T D x) = I_n − 2(ψx(ψx)^T D)/((ψx)^T D(ψx)) = I_n − 2(ww^T D)/(w^T D w).

Next, suppose r > 1 and N = D(U − I_n) is not skew-symmetric. If rank N = rank(N + N^T) = 2, choose any vector v with v^T N v ≠ 0 according to the constructive proof of Lemma 2.2. Otherwise, construct a vector v using columns from the T-diagonalization of the symmetric matrix N + N^T as detailed in the proof of Lemma 2.4. Alternatively, we may choose any v with v^T N v ≠ 0 as described after Lemma 2.7 whenever r is even, and we may construct v according to Lemma 2.4 whenever r is odd. With either choice of v, let w = (U − I_n)v and H_w be the D-Householder matrix formed by w. If v is chosen according to Lemma 2.4, D(H_w U − I_n) is not skew-symmetric by Lemmas 2.4 and 2.5, and rank(H_w U − I_n) = r − 1 by Lemma 2.7. If r is even and any v satisfying v^T N v ≠ 0 is chosen, D(H_w U − I_n) is not skew-symmetric because rank D(H_w U − I_n) = r − 1 is odd by Lemma 2.7. We repeat the process inductively with H_w U in place of U until rank(H_{w_{r−1}} H_{w_{r−2}} ··· H_{w_1} U − I_n) = 1. Exactly r − 1 D-Householder factors have now been constructed. Thus in the case of N not skew-symmetric, U can be expressed as the product of r D-Householder matrices in total. To see that r is the minimal number of factors, suppose U can be expressed as the product of s D-Householder matrices. Using Lemma 2.6 repeatedly, we have r = rank(U − I_n) ≤ s.

Finally suppose N = D(U − I_n) is skew-symmetric. For any non-D-isotropic vector w, we have rank(H_w U − I_n) = r + 1 by Lemma 2.9 and consequently D(H_w U − I_n) cannot be skew-symmetric. Any standard unit vector e_i may be chosen here for w. The updated matrix H_w U can then be factored inductively as detailed in the previous paragraph. Only one D-Householder matrix is used here, but rank(H_w U − I_n) = r + 1 additional D-Householder matrices are now needed to complete the factorization. Thus in the case of N skew-symmetric, U can be expressed as the product of r + 2 D-Householder matrices in total. To see that r + 2 is the minimal number of factors, suppose U can be expressed as the product of s D-Householder matrices. Then U = H_w T for a D-Householder matrix H_w and a matrix T that is the product of the s − 1 remaining D-Householder factors. However, we have already seen that expressing T = H_w H_w T = H_w U as a product of D-Householder matrices requires at least r + 1 factors because D(H_w U − I_n) is not skew-symmetric. Thus s − 1 ≥ r + 1.

Therefore, setting ρ equal to r if N = D(U − I_n) is not skew-symmetric and equal to r + 2 if it is skew-symmetric, we have

  U = ∏_{j=1}^{ρ} H_{w_j}.

As shown in Section 2.3, this means that

  Q = ∏_{j=1}^{ρ} H_{S,Y^{-1}w_j}.

Furthermore, the number of factors is minimal. The freedom of choice for selecting the vectors w_j in the factorization process above can benefit the numerical stability of our method's implementation.

Chapter 3
Pseudo Code for a CDS Factorization over R or C

In this section we give a pseudo code for an algorithm to factor a given S-orthogonal matrix Q into the product of S-Householder matrices that uses the minimal number of factors as specified by the CDS Theorem. An outline of the algorithm was given in Section 2.6. Here we omit the earlier explanations of the proof and provide only the steps required.
As described in Section 2.6, there are generally several options in selecting a vector v at many of the steps, but for simplicity, we only list one here while making note when a different choice might have been made. Here F equals R or C and e_j denotes the jth standard unit vector. We use MATLAB notation for matrix columns, i.e., A(:,j) is the jth column of A.

Input: a nonsingular symmetric matrix S ∈ M_n(F) and an S-orthogonal matrix Q ∈ M_n(F)

Reduce S to D = diag(±1) and the S-orthogonal Q to a D-orthogonal U:
  T-diagonalize S so that V^T S V = diag(λ_1, λ_2, ..., λ_n)    (see Section 2.2)
  Set L = diag(1/√|λ_j|)
  Set D = L V^T S V L
  Set U = L^{-1} V^{-1} Q V L

Factor the D-orthogonal matrix U into a product of D-Householder matrices:
  Set r = rank(U − I_n)
  - If D(U − I_n) is skew-symmetric, update so that D(HU − I_n) is not:
  If D(U − I_n) is skew-symmetric
    Set W(:,1) = e_1    (free choice of any e_j)
    Set H_1 = I_n − 2(e_1 e_1^T D)/(e_1^T D e_1)    (Householders are never explicitly formed, see below)
    Update U ← H_1 U
    Set isSkew = 1
  Else
    Set isSkew = 0
  End if
  - Iterate until rank((∏ H_j) U − I_n) = 1:
  For ℓ equal 1 + isSkew to r + 2·isSkew − 1
    Set N = D(U − I_n)
    Find C such that C^T(N + N^T)C = diag(λ_1, λ_2, ..., λ_n)    (see Section 2.2)
    Sort the columns of C so that |λ_1| ≥ |λ_2| ≥ ··· ≥ |λ_n|
    Set B = C^T N C
    If rank N = rank(N + N^T) = 2 or if rank(N) is even
      Set v = C(:,1)    (or choose v = C(:,2))
    Else
      If rank(N + N^T) = 1    (Lemma 2.4, Case 1)
        If there is a nonzero entry in B(1,2:n)
          Set v = C(:,1)
        Else    (there is a B(j,k) ≠ 0 with j ≠ k and j,k > 1)
          Set v = C(:,1) + C(:,j)    (choice of nonzero entry B(j,k) and choice of C(:,k) instead of C(:,j))
        End if
      Elseif rank(N + N^T) = 2    (Lemma 2.4, Case 2)
        If there is a B(j,k) ≠ 0 with j ∈ {1,2}, k ∈ {3,4,...,n}
          Set v = C(:,j)    (choice of nonzero entry B(j,k))
        Else    (there is a B(j,k) ≠ 0 with j ≠ k and j,k ≥ 3)
          Set v = C(:,1) + C(:,j)    (choice of nonzero entry B(j,k) and choice of C(:,1) or C(:,2) and C(:,j) or C(:,k))
        End if
      Else    (rank(N + N^T) ≥ 3; Lemma 2.4, Case 3)
        Set v = C(:,1)    (rank(N + N^T) choices)
      End if
    End if
    Set w = (U − I_n)v
    Set W(:,ℓ) = w
    Set H_ℓ = I_n − 2(ww^T D)/(w^T D w)
    Update U ← H_ℓ U
  End for
  - Now rank(U − I_n) = 1, so U is a D-Householder matrix
  (Note: this step can easily be combined with the previous 'for' loop.)
  Set N = D(U − I_n)
  Find C such that C^T(N + N^T)C = diag(λ_1, λ_2, ..., λ_n)
  Sort the columns of C so that |λ_1| ≥ |λ_2| ≥ ··· ≥ |λ_n|
  Set v = C(:,1)
  Set w = (U − I_n)v
  Set W(:, r + 2·isSkew) = w
  Set H_{r+2·isSkew} = I_n − 2(ww^T D)/(w^T D w)

Convert D-Householders to S-Householders:
  For j equal 1 to r + 2·isSkew
    Set H_j ← V L H_j L^{-1} V^{-1}
    Set W(:,j) ← V L W(:,j)
  End for

Output: the matrix W = [w_1 w_2 ··· w_{r+2·isSkew}] ∈ F^{n×(r+2·isSkew)} with columns that form the S-Householder factors H_1, H_2, ..., H_{r+2·isSkew} such that

  Q = ∏_{j=1}^{r+2·isSkew} H_j

We note that in the MATLAB code, the D-Householder factors H_j = I_n − 2(w_j w_j^T D)/(w_j^T D w_j) of Q are never explicitly formed as matrices. Instead only their generating vectors w_j are stored. Then, the updates U ← H_{w_j} U are always performed vector- and dyad-wise by evaluating

  H_{w_j} U = (I_n − 2(w_j w_j^T D)/(w_j^T D w_j)) U = U − [2/((w_j^T diag(D)) w_j)] w_j ((w_j^T diag(D)) U),

where the row vector w_j^T D = w_j^T .* diag(D)^T is computed by entry-wise multiplication of vectors rather than by a vector-matrix multiplication. The operations count is reduced by a factor of n by taking advantage of the structure of D-Householder matrices. We follow the standard practice in numerical linear algebra for using Householder matrices efficiently.
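In MATLAB this dyad-wise update reads as follows, with d = diag(D) and w the current generating vector (names as in the Appendix code):

d = diag(D);                       % D is stored by its diagonal only
wTD = w.'.*d.';                    % row vector w^T D via entrywise multiplication
U = U - 2/(wTD*w)*w*(wTD*U);       % U <-- H_w U without forming H_w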
Chapter 4
Some Applications of generalized orthogonal matrices

In this chapter we look at applications that use generalized orthogonal matrices. In Section 4.1, using D = diag(1,1,...,1,−1), we see how D-orthogonal matrices and in particular D-Householder matrices can be used to study Pythagorean n-tuples. The results we consider are known, but we work with matrices and vectors rather than number theory and develop a few nice results. In the standard D = I_n case, the QR matrix factorization plays an important role in many numerical applications. In Section 4.2, we consider conditions which make an indefinite QR factorization possible for a given matrix A. When this is not possible, we study how close we can bring a matrix to triangular form R by applying D-Householder matrices to zero out entries below the diagonal of A.

4.1 Pythagorean n-tuples

A Pythagorean n-tuple is an integer vector x ∈ Z^n with x_1^2 + x_2^2 + x_3^2 + ··· + x_{n−1}^2 = x_n^2. Equivalently, a nonzero x ∈ Z^n is a Pythagorean n-tuple if x is D-isotropic for D = diag(1,1,...,1,−1), i.e., x^T D x = 0. Since generalized orthogonal matrices preserve the inner product, one can easily work with Pythagorean n-tuples using generalized orthogonal matrices. In this section we describe a few known number theoretic results with linear algebra techniques in light of indefinite inner product spaces.

For any D-orthogonal integer matrix U ∈ M_n(Z), note that x is a Pythagorean n-tuple if and only if Ux is a Pythagorean n-tuple since (Ux)^T D(Ux) = x^T U^T D U x = x^T D x. Clearly x = (1,0,0,...,0,1)^T is a Pythagorean n-tuple. We can show that one can map the generic Pythagorean n-tuple x = e_1 + e_n to any other Pythagorean n-tuple using a single D-Householder matrix.

Lemma 4.1. Let D = diag(1,1,...,1,−1) be n×n, x = (1,0,0,...,0,1)^T, and y ∈ Z^n be any Pythagorean n-tuple with y ≠ x. Then w = x − y is not D-isotropic and the D-Householder matrix H_w satisfies H_w x = y.

Proof. For w = x − y,

  w^T D x = x^T D x − y^T D x = −y^T D x

and

  w^T D w = x^T D x − y^T D x − x^T D y + y^T D y = 0 − y^T D x − y^T D x + 0 = −2y^T D x.

Since y ≠ x and y satisfies y^T D y = 0, at least one of y_2, y_3, ..., y_{n−1} is nonzero. Then w^T D w = −2y^T D x = −2(y_1 − y_n) must be nonzero. And

  H_w x = (I_n − 2(ww^T D)/(w^T D w))x = x − (2(w^T D x)/(w^T D w))w = x − w = y.

Thus any Pythagorean n-tuple is at most one step away from x = (1,0,0,...,0,1)^T in terms of D-Householder transformations. In general the D-Householder matrix H_w of Lemma 4.1 is not in M_n(Z). H_w maps x to the Pythagorean n-tuple y, as well as y to x. Also, it satisfies (H_w z)^T D H_w z = z^T D z = 0 for any Pythagorean n-tuple z. However, H_w z may not be a Pythagorean n-tuple since it is not necessarily an integer vector for all z.

A D-Householder matrix with only integer entries would map any Pythagorean n-tuple to a different Pythagorean n-tuple. If w ∈ Z^n and w^T D w ∈ {±1, ±2} for example, then the D-Householder matrix H_w maps Pythagorean n-tuples to Pythagorean n-tuples since no fractions would be involved. Clearly, for any Pythagorean n-tuple x ∈ Z^n and any nonzero constant k ∈ Z, kx is also a Pythagorean n-tuple. A Pythagorean n-tuple x = (x_1, x_2, ..., x_n)^T is called primitive if its greatest common divisor gcd(x_1, x_2, ..., x_n) = 1. Primitive Pythagorean n-tuples can be generated by applying D-Householder matrices to other primitive Pythagorean n-tuples. For w = (1,1,1,0,...,0,1)^T for example, we have w^T D w = 2, and the D-Householder matrix H_w formed by w has the form

  H_w = I_n − 2(ww^T D)/(w^T D w) = I_n − ww^T D = [  0 −1 −1           1
                                                     −1  0 −1           1
                                                     −1 −1  0           1
                                                              I_{n−4}
                                                     −1 −1 −1           2 ].

In [4], Cass and Arpaia show that this H_w generates all primitive Pythagorean n-tuples for 4 ≤ n ≤ 9.
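As a small MATLAB illustration (n = 4, with a hypothetical starting tuple), one application of this integer matrix H_w to the Pythagorean 4-tuple (3,4,0,5)^T produces a new Pythagorean 4-tuple:

n = 4; D = diag([ones(1,n-1) -1]);
w = [1; 1; 1; 1];                  % w = (1,1,1,0,...,0,1)^T; for n = 4 no middle zeros
Hw = eye(n) - w*(w'*D);            % integer matrix since w'*D*w = 2
x = [3; 4; 0; 5];                  % 3^2 + 4^2 + 0^2 = 5^2
y = Hw*x                           % y = (1; 2; -2; 3), and 1 + 4 + 4 = 9 = 3^2
y'*D*y                             % 0: y is again D-isotropic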
They prove that for any Pythagorean n-tuple y a sequence of Pythagorean n-tuples w_j can be found starting with w_0 = (1,0,...,0,1)^T so that w_0, w_1, w_2, ..., w_m = y. The single matrix H_w above for w = (1,1,1,0,...,0,1)^T is used to move through this sequence as described below. However, repeatedly applying any one generalized Householder matrix H is not very useful since H^{-1} = H. To avoid this, [4] denotes by S(w_j) the set of n-tuples found by permuting the first n − 1 entries of w_j and/or changing the signs of these entries. It is shown there that H_w maps at least one Pythagorean n-tuple of S(w_j) to an element of S(w_{j+1}) for each j. For other dimensions, the same H_w does generate Pythagorean n-tuples but will not generate all of them if only the single starting Pythagorean n-tuple x = (1,0,...,0,1)^T is used. In the case of n = 10 for example, [4] states that there are two orbits of Pythagorean n-tuples formed by x_1 = (1,0,...,0,1)^T and x_2 = (1,1,...,1,3)^T.

Aragón et al. [2] use a factorization of an orthogonal transformation into Householder transformations in terms of Clifford algebras to show the structure of Pythagorean n-tuples for n equal to three or four. They prove that (x_1, x_2, x_3) is a primitive Pythagorean triple if and only if

  x_1 = α^2 − β^2,  x_2 = 2αβ,  x_3 = α^2 + β^2

for α, β ∈ N relatively prime. They prove an analogous result for n = 4 and mention that the process extends to the general formula that

  x_1 = −λ_1^2 + Σ_{j=2}^{n−1} λ_j^2,  x_k = −2λ_1 λ_k for 2 ≤ k ≤ n − 1,  x_n = Σ_{j=1}^{n−1} λ_j^2

will be a Pythagorean n-tuple for λ_j ∈ Z.

4.2 Indefinite QR factorization

Our goal in this section is to establish a QR-like factorization of a matrix A using Householder eliminations in the indefinite inner product case. In the standard case with D = I_n, a matrix A is factored into A = QR for an orthogonal matrix Q and an upper triangular matrix R. We wish to generalize so that Q is D-orthogonal for a fixed D = diag(±1). First we note that such a generalized QR factorization is not possible for some pairs A and D.

Lemma 4.2. Let

  A = [ 1  0  0  1          D = [ 1  0  0  0
        0  1  1  0                0  1  0  0
        0  1 −1  0    and         0  0 −1  0
        1  0  0 −1 ]              0  0  0 −1 ].

Then the matrix A cannot be factored into a D-orthogonal matrix Q and an upper triangular matrix R satisfying A = QR.

Proof. Suppose there is a D-orthogonal matrix Q and an upper triangular matrix R with A = QR. Since R is upper triangular, the columns a_1, a_2, ..., a_n of A are linear combinations of the columns q_1, q_2, ..., q_n of Q. For 1 ≤ k ≤ n, this gives

  a_k = Σ_{j=1}^{k} r_{jk} q_j.

In particular, a_1 = r_{11} q_1. Since Q is D-orthogonal, Q^T D Q = D. It follows that

  a_1^T D a_1 = r_{11}^2 q_1^T D q_1 = r_{11}^2 d_1.

For this example, d_1 = ±1 and a_1^T D a_1 = 0 so that r_{11} must be zero. However, the first column of R cannot be zero since A has full rank.

One way to address this example could be to generalize the indefinite QR factorization to find A = QB for a D-orthogonal matrix Q and a matrix B with rows and/or columns that can be permuted into an upper triangular or nearly upper triangular form. However, the example in Lemma 4.2 cannot be factored in this way either. If A = QB, there would have to be a nonzero entry b_{jk} of B with the rest of the entries in the kth column of B all zero. But since A^T D A = B^T D B it follows that a_k^T D a_k = b_{jk}^2 d_j with d_j = ±1. Similar to the above result, we find the contradiction that b_{jk} = 0 since the A in Lemma 4.2 has a_k^T D a_k = 0 for each k.
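A quick MATLAB check of the obstruction in Lemma 4.2: every column of A is D-isotropic although A has full rank:

A = [1 0 0 1; 0 1 1 0; 0 1 -1 0; 1 0 0 -1];
D = diag([1 1 -1 -1]);
diag(A'*D*A)                       % all zeros: each a_k satisfies a_k'*D*a_k = 0
rank(A)                            % 4: A is nonsingular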
Choosing the D = diag(±1) independent of the given matrix A is another possibility here. For A of Lemma 4.2, an indefinite QR factorization is possible with D = diag(−1, 1, 1, −1) or D = diag(1, −1, −1, 1). It seems that for any given A there might be a D = diag(±1) for which an indefinite QR factorization is possible, but this question is open. In this section, we study the indefinite QR factorization given both an A and a D = diag(±1).

A key step in producing an indefinite QR-like factorization is mapping a vector x via a D-Householder matrix to a vector y that has as many zero entries as possible. Such a mapping theorem for G-reflectors is proved in [10]. In a symmetric bilinear form, G-reflectors are generalized Householder transformations, but they further generalize reflections across hyperplanes to skew-symmetric bilinear and sesquilinear Hermitian and skew-Hermitian forms. In our setting, the mapping theorem of [10] states that for distinct, nonzero vectors x, y there is a D-Householder matrix H such that Hx = y if and only if x^T D x = y^T D y and x^T D y ≠ x^T D x. In the following lemmas we map a vector x to a vector y that has only one or two nonzero entries using D-Householder matrices.

Lemma 4.3. Let D = diag(d_1, d_2, ..., d_n) = diag(±1), x ∈ R^n, and e_1, e_2, ..., e_n be the standard unit vectors. If x^T D x ≠ 0, a D-Householder matrix H can be constructed so that Hx = αe_j for any j with d_j x^T D x > 0 and α = ±√(d_j x^T D x).

Proof. If x^T D x < 0, choose j to correspond to one of the diagonal entries of D with d_j = −1. If x^T D x > 0, choose j with d_j = +1. Note that if d_j = +1 for all j, then D = I_n and x^T D x = x^T x > 0. Also, x^T D x is clearly negative if D = −I_n. Then for D = ±I_n, any j can be chosen. Hence a suitable j is available in all cases. Let α = ±√(d_j x^T D x) and w = x − αe_j. Then

  w^T D x = x^T D x − αe_j^T D x = x^T D x − αd_j x_j.

If x_j = 0, then this reduces to w^T D x = x^T D x ≠ 0. If x_j ≠ 0, choose the sign of α so that w^T D x ≠ 0. Then we see from

  w^T D w = x^T D x − αe_j^T D x − αx^T D e_j + α^2 e_j^T D e_j
          = x^T D x − 2αe_j^T D x + (d_j x^T D x)d_j
          = 2(x^T D x − αe_j^T D x) = 2w^T D x

that w is not D-isotropic. Thus the D-Householder matrix H_w associated with w satisfies H_w x = αe_j since

  H_w x = (I_n − 2(ww^T D)/(w^T D w))x = x − (2(w^T D x)/(w^T D w))w = x − w = αe_j.
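A minimal MATLAB sketch of the construction in Lemma 4.3, with hypothetical data (here x^T D x = 4 > 0, so we may pick j with d_j = +1):

D = diag([1 1 -1]);
x = [1; 2; 1];                          % x'*D*x = 1 + 4 - 1 = 4 > 0
j = 1; alpha = sqrt(D(j,j)*(x'*D*x));   % alpha = 2; positive sign works since w'*D*x ~= 0
w = x - alpha*eye(3,1);                 % w = x - alpha*e_j
Hw = eye(3) - 2*(w*(w'*D))/(w'*D*w);
Hw*x                                    % returns (2; 0; 0) = alpha*e_j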
We would like to find a similar reduction for x with x^T D x = 0. However, there is no D-orthogonal matrix that can reduce a D-isotropic x to a multiple of any standard unit vector e_j, as the following lemma shows.

Lemma 4.4. Let D = diag(d_1, d_2, ..., d_n) = diag(±1), x ∈ R^n, and e_1, e_2, ..., e_n be the standard unit vectors. If x^T D x = 0, then there is no D-orthogonal matrix U ∈ M_n(R) such that Ux is a nonzero multiple of e_j for any j.

Proof. Suppose there is a D-orthogonal matrix U with Ux = αe_j for a nonzero constant α and some j ∈ {1,2,3,...,n}. Since U is D-orthogonal, U^T D U = D. Then

  0 = x^T D x = x^T(U^T D U)x = (Ux)^T D(Ux) = α^2 e_j^T D e_j = α^2 d_j.

Since d_j = ±1, it follows that α = 0. This contradicts our assumption.

In the cases of D = ±I_n, Lemma 4.3 shows that there is a D-Householder matrix H to reduce x ∈ R^n to a multiple of any of the standard unit vectors e_1, e_2, ..., e_n. We can use this to work on the case of D-isotropic vectors x. While we cannot reduce x to a single standard unit vector with a D-orthogonal matrix, we can use two D-Householder matrices to map x to a linear combination of two standard unit vectors.

Lemma 4.5. Let x ∈ R^n be nonzero, e_1, e_2, ..., e_n be the standard unit vectors, and

  D = [ I_{n+}    0
          0    −I_{n−} ]

for n+ + n− = n and n+, n− > 0. If x^T D x = 0, then x can be mapped via the product of two D-Householder matrices H_1 and H_2 so that H_1 H_2 x = αe_j + βe_k, where α and β are nonzero constants and d_j = −d_k. Moreover, α^2 = β^2.

Proof. Let D_1 = I_{n+} and D_2 = −I_{n−}. For nonzero x with x^T D x = 0 we write x = (x_1, x_2)^T so that

  x^T D x = x_1^T D_1 x_1 + x_2^T D_2 x_2 = x_1^T(I_{n+})x_1 + x_2^T(−I_{n−})x_2 = x_1^T x_1 − x_2^T x_2.

Since x is nonzero, x_1^T D_1 x_1 = −x_2^T D_2 x_2 ≠ 0. By Lemma 4.3 there is a D_1-Householder matrix H_ŵ1 formed by ŵ_1 = x_1 − αê_j and a D_2-Householder matrix H_ŵ2 formed by ŵ_2 = x_2 − βê_{k−n+} such that H_ŵ1 x_1 = αê_j and H_ŵ2 x_2 = βê_{k−n+} for some j and k − n+ with 1 ≤ j ≤ n+ and 1 ≤ k − n+ ≤ n−. Here ê_j and ê_{k−n+} are the jth and (k − n+)th columns of I_{n+} and I_{n−}, respectively. Finally, we pad ŵ_1 and ŵ_2 with n− and n+ zeros, respectively, to get w_1 = (ŵ_1, 0)^T and w_2 = (0, ŵ_2)^T and form

  H_{w_1} = [ H_ŵ1    0          H_{w_2} = [ I_{n+}   0
                0   I_{n−} ]  and              0     H_ŵ2 ].

In this notation

  H_{w_1} H_{w_2} x = H_{w_1} (x_1, H_ŵ2 x_2)^T = (H_ŵ1 x_1, H_ŵ2 x_2)^T = (αê_j, βê_{k−n+})^T = αe_j + βe_k.

Since 1 ≤ j ≤ n+ and n+ < k ≤ n, we have d_j = 1 = −d_k. Moreover, by Lemma 4.3, α^2 = x_1^T D_1 x_1 = x_1^T x_1 and β^2 = −x_2^T D_2 x_2 = x_2^T x_2 = x_1^T x_1, so α^2 = β^2.

Appendix: MATLAB code for the CDS Theorem factorization

    if r_NNT == 1                            % Lemma 2.4 Case 1
        if max(abs(B(1,2:n))) > tol          % Lemma 2.4 Case 1 (a)
            v = C(:,1);
        else                                 % Lemma 2.4 Case 1 (b)
            B(:,1) = 0; B(1,:) = 0;
            B = B - diag(diag(B));
            B = abs(B);
            [J,K] = find(B == max(B(:)));
            v = C(:,1) + C(:,J(1));
        end;
    elseif r_NNT == 2                        % Lemma 2.4 Case 2
        if max(max(abs(B(1:2,3:n)))) > tol   % Lemma 2.4 Case 2 (a)
            B(3:n,:) = 0; B(1:2,1:2) = 0;
            B = abs(B);
            [J,K] = find(B == max(B(:)));
            v = C(:,J(1));
        else                                 % Lemma 2.4 Case 2 (b)
            B(1:2,:) = 0; B(:,1:2) = 0;
            B = B - diag(diag(B));
            B = abs(B);
            [J,K] = find(B == max(B(:)));
            v = C(:,1) + C(:,J(1));
        end;
    else % r_NNT > 2                         % Lemma 2.4 Case 3
        v = C(:,1);
    end;
end;
w = U*v - v;                                 % w = (U - I_n) v
W(:,iter) = w;                               % save w that forms the D-Householder H
wTD = w.'.*d.';                              % row vector w^T D, entrywise product
U = U - 2/(wTD*w)*w*(wTD*U);                 % update U <-- HU
end;
if s2d == 1, W = V*L*W; end;                 % convert back to S-Householders

function [C,L] = sym_diag(A,tol);
% ------------------------------------------------------------------------
% Find the T-diagonalization of a real or complex symmetric matrix A,
% i.e., find a nonsingular C and diagonal L with C^T A C = L. If A has only
% real entries, use an eigenvalue decomposition. If A has complex entries,
% use the Takagi factorization.
% The columns of C are sorted so that the diagonal entries of L appear
% in decreasing magnitude.
%
% INPUT:
%   A   = symmetric n by n matrix
%   tol = tolerance (optional)
% OUTPUT:
%   C = nonsingular matrix
%   L = diagonal matrix
% ------------------------------------------------------------------------
[n,n] = size(A);
if nargin == 1, tol = 10^(-10); end;
I = eye(n);
if isreal(A)
    [C,L] = eig(A);
else
    [C,L] = takagi(A,tol);
end;
diag_L = diag(L);
[tmp,srt] = sort(abs(diag_L),'descend');
C = C(:,srt);
L = diag(diag_L(srt));

function [C,L] = takagi(A,tol)
% ------------------------------------------------------------------------
% Diagonalize a complex symmetric matrix A, i.e., find a nonsingular C
% and diagonal L with C^T A C = L.
% This follows Theorem 4.4.3 and Corollary 4.4.4 in Horn and Johnson's
% Matrix Analysis.
%
% INPUT:
%   A   = complex n by n symmetric matrix
%   tol = tolerance (optional)
% OUTPUT:
%   C = nonsingular (unitary) matrix
%   L = diagonal matrix
% ------------------------------------------------------------------------
[n,n] = size(A);
if nargin == 1, tol = 10^(-10); end;
I = eye(n);
A0 = A;
U = I;
for j = 1:n-1
    W = [];
    AA_ = A*conj(A);
    [V,E] = eig(AA_);
    k = find(abs(diag(E)),1);            % index of a nonzero eigenvalue of A*conj(A)
    x = V(:,k);
    if rank([A*conj(x) x],tol) == 1
        W = x;
    else % rank([A*conj(x) x],tol) == 2
        mu = sqrt(E(k,k));
        W = A*conj(x) + mu*x;
        W = W / sqrt(W'*W);
    end;
    W(:,2:n-j+1) = null(W');
    U = U*blkdiag(eye(j-1),W);
    if j < n-1
        tmp = W'*A*conj(W);
        A = tmp(2:n-j+1,2:n-j+1);
    end;
end;
C = conj(U);
L = U'*A0*conj(U);
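For completeness, a small usage sketch of the routine sym_diag above, with a hypothetical test matrix:

A = randn(5); A = A + A.';         % random real symmetric test matrix
[C, L] = sym_diag(A);
norm(C.'*A*C - L)                  % ~ 0; diag(L) sorted by decreasing magnitude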