On the block Lanczos and block Golub–Kahan reduction methods applied to discrete ill‐posed problems

The reduction of a large‐scale symmetric linear discrete ill‐posed problem with multiple right‐hand sides to a smaller problem with a symmetric block tridiagonal matrix can easily be carried out by the application of a small number of steps of the symmetric block Lanczos method. We show that the subdiagonal blocks of the reduced problem converge to zero fairly rapidly with increasing block number. This quick convergence indicates that there is little advantage in expressing the solutions of discrete ill‐posed problems in terms of eigenvectors of the coefficient matrix when compared with using a basis of block Lanczos vectors, which are simpler and cheaper to compute. Similarly, for nonsymmetric linear discrete ill‐posed problems with multiple right‐hand sides, we show that the solution subspace defined by a few steps of the block Golub–Kahan bidiagonalization method usually can be applied instead of the solution subspace determined by the singular value decomposition of the coefficient matrix without significant, if any, reduction of the quality of the computed solution.

We consider the solution of minimization problems of the form

min_{X∈R^{n×p}} ||AX − B||_F,  (1)

with a large matrix A ∈ R^{ℓ×n}, whose singular values gradually approach zero without a significant gap. Thus, A is very ill-conditioned and may be rank deficient. The data matrix B ∈ R^{ℓ×p} with 1 < p ≪ ℓ is a "block vector" with many more rows than columns. The Frobenius norm ||M||_F of a matrix M is defined as follows. For two matrices M_1, M_2 ∈ R^{n×p}, we introduce the inner product ⟨M_1, M_2⟩ = trace(M_1^T M_2), where the superscript T denotes transposition and trace(⋅) stands for the trace of a square matrix. Then ||M||_F = ⟨M, M⟩^{1/2}. The usual inner product of elements u, v ∈ R^n is denoted by ⟨u, v⟩_2 = u^T v and the Euclidean norm by ||u||_2 = √⟨u, u⟩_2. Finally, in the following, ℛ(M) stands for the range of the matrix M.
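The trace inner product and the Frobenius norm it induces can be checked with a few lines of numpy; the random matrices below are illustrative only.

```python
import numpy as np

# The trace inner product <M1, M2> = trace(M1^T M2) and the induced
# Frobenius norm ||M||_F = <M, M>^(1/2), compared with numpy's built-in norm.
rng = np.random.default_rng(0)
M1 = rng.standard_normal((6, 3))
M2 = rng.standard_normal((6, 3))
inner = np.trace(M1.T @ M2)            # <M1, M2>
fro = np.sqrt(np.trace(M1.T @ M1))     # ||M1||_F
assert np.isclose(inner, np.sum(M1 * M2))          # entrywise characterization
assert np.isclose(fro, np.linalg.norm(M1, 'fro'))  # agrees with numpy
```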
Minimization problems like the one appearing in Equation (1) with a matrix with the properties described are commonly referred to as discrete ill-posed problems; see, for example, Reference 1 and the references therein. They arise, for instance, from the discretization of linear ill-posed problems, such as Fredholm integral equations of the first kind. Applications include color and hyperspectral image restoration; see, for example, References 2,3.
In discrete ill-posed problems of the form (1) that arise in applications in science and engineering, the matrix B typically represents measured data that are contaminated by an error E ∈ R^{ℓ×p}. Thus,

B = B_true + E,  (2)

where B_true ∈ R^{ℓ×p} represents the (unknown) noise-free block vector associated with B. We would ideally like to compute an approximation of the solution X_true ∈ R^{n×p} of minimal Frobenius norm of the minimization problem

min_{X∈R^{n×p}} ||AX − B_true||_F.

Let A^† denote the Moore-Penrose pseudoinverse of the matrix A. Then, X_true = A^† B_true. Note that the solution of (1), given by

A^† B = X_true + A^† E,

is not a useful approximation of X_true because, generally, ||A^† E||_F ≫ ||X_true||_F due to the presence of tiny positive singular values of A. The computation of a meaningful approximation of X_true from (1) requires that the system be regularized before solution, that is, the system (1) has to be modified so that its solution is less sensitive to the error E in B than the solution of (1). We regularize the system (1) in two steps: first A is projected to a generally fairly small block tridiagonal or block bidiagonal matrix by application of a few iterations of the block Lanczos tridiagonalization (BLT) algorithm to A (when A is symmetric) or of the block Golub-Kahan bidiagonalization (BGKB) algorithm (when A is nonsymmetric), respectively; then the reduced problem so obtained is solved by Tikhonov regularization. Discussions of these block algorithms for discrete inverse problems of the form (1) can be found in References 4 and [5, section 10.3.6] (for BLT), and in Reference 2 (for BGKB), as well as in Section 2. In addition, recent advances in understanding the convergence behavior of block Krylov methods based on the Arnoldi algorithm can be found in Reference 6, where ways of constructing matrices and right-hand sides producing any admissible convergence behavior are presented.
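The phenomenon ||A^† E||_F ≫ ||X_true||_F is easy to reproduce numerically. The following sketch uses the Hilbert matrix as a stand-in for a discretized ill-posed operator (an assumption for illustration, not one of the paper's test problems) and shows that the naive pseudoinverse solution amplifies a tiny data error enormously.

```python
import numpy as np

# Hypothetical illustration: naive solution A^+(B_true + E) for an
# ill-conditioned A is dominated by the propagated error A^+ E.
n, p = 12, 2
A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])  # Hilbert matrix
rng = np.random.default_rng(0)
X_true = rng.standard_normal((n, p))
B_true = A @ X_true
E = 1e-6 * rng.standard_normal((n, p))       # small data error
X_naive = np.linalg.pinv(A) @ (B_true + E)   # no regularization
err_naive = np.linalg.norm(X_naive - X_true)
# the error is amplified by the reciprocals of A's tiny singular values
assert err_naive > 1e3 * np.linalg.norm(E)
```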
The point of view adopted in this article is fundamentally different, as the derivations presented here are targeted at problems of the kind (1). Indeed, it is the purpose of this article to discuss the structure and properties of the block tridiagonal and block bidiagonal matrices determined by the BLT or BGKB algorithms, respectively, and to show the performance of Tikhonov regularization used jointly with these decompositions.
This article is organized as follows. Section 2 reviews some background material, namely: first, summaries are given about the BLT method for symmetric matrices A and the BGKB algorithm for nonsymmetric, possibly rectangular, matrices A; then, a description is added about how Tikhonov regularization can be applied to solve the reduced problems obtained by such algorithms. Section 3 presents new theoretical bounds for the diagonal and subdiagonal BLT and BGKB blocks when A stems from the discretization of a linear ill-posed problem. A few computed examples are presented in Section 4. Finally, Section 5 contains concluding remarks.

BLOCK ALGORITHMS AND TIKHONOV REGULARIZATION
Summaries of the BLT and BGKB algorithms are given in Sections 2.1 and 2.2, respectively. A solution method based on Tikhonov regularization applied to the projected problems associated to BLT and BGKB is described in Section 2.3.

Block Lanczos tridiagonalization
Let A ∈ R^{n×n} be a symmetric matrix and let B = X_1 S_1 be a compact QR factorization of B ∈ R^{n×p} with 1 ≤ p ≪ n, where X_1 ∈ R^{n×p} has orthonormal columns and S_1 ∈ R^{p×p} is upper triangular. Then, application of m ≪ n/p steps of the block Lanczos method to A with initial block vector X_1 yields a decomposition of the form

A Q_m = Q_{m+1} T_{m+1,m},  (3)

where the block columns X_1, X_2, …, X_{m+1} of Q_{m+1} = [X_1, X_2, …, X_{m+1}] ∈ R^{n×p(m+1)} are orthonormal, that is, Q_{m+1}^T Q_{m+1} = I_{p(m+1)}, and Q_m ∈ R^{n×pm} consists of the first m block columns of Q_{m+1}. Here, I_p ∈ R^{p×p} denotes the identity matrix and O_p ∈ R^{p×p} the zero matrix. Moreover, the matrix

T_{m+1,m} =
[ M_1   S_2^T                        ]
[ S_2   M_2    S_3^T                 ]
[       S_3    M_3     ⋱             ]
[              ⋱       ⋱      S_m^T  ]
[                      S_m     M_m   ]
[                             S_{m+1}]  ∈ R^{p(m+1)×pm}  (4)

is block tridiagonal with a leading symmetric pm × pm submatrix, which we denote by T_{m,m}. The diagonal blocks M_i ∈ R^{p×p}, i = 1, 2, …, m, are symmetric, and the subdiagonal blocks S_j ∈ R^{p×p}, j = 2, 3, …, m + 1, are upper triangular. We tacitly assume that m is small enough so that the decomposition (3) exists and can be computed as summarized in Algorithm 1.
The block columns X_i, i = 1, 2, …, m, of the matrix Q_m form an orthonormal basis for the block Krylov subspace

K_m(A, X_1) = ℛ([X_1, AX_1, …, A^{m−1}X_1]).

An approximate solution of (1) can be computed by the truncated BLT method as follows: compute the solution Y_m ∈ R^{pm×p} of the small minimization problem on the right-hand side of

min_{X∈K_m(A,X_1)} ||AX − B||_F = min_{Y∈R^{pm×p}} ||T_{m+1,m} Y − E_1 S_1||_F,  (5)

where E_1 = [I_p, O_p, …, O_p]^T ∈ R^{p(m+1)×p}; the right-hand side follows from (3) and Q_{m+1}^T B = E_1 S_1. Then, X̂_m := Q_m Y_m is the solution of the large minimization problem on the left-hand side of (5), as well as an approximate solution of (1). By choosing m suitably small, we can ensure that the matrix T_{m+1,m} is of full rank and that the effect of the error E in B on the computed solution X̂_m is smaller than if we attempt to solve the original problem (1). The latter is a consequence of the fact that the condition number of T_{m+1,m}, given by

κ(T_{m+1,m}) = ||T_{m+1,m}||_2 ||T_{m+1,m}^†||_2,

is an increasing function of m. Here and below || ⋅ ||_2 denotes the spectral norm of a matrix. A large condition number indicates that the solution Y_m of the problem on the right-hand side of (5) is very sensitive to errors in the data as well as to round-off errors introduced during the computations. In Section 3, we will discuss properties of the block tridiagonal matrix T_{m+1,m}, and in Section 2.3 how to stabilize the solution process by Tikhonov regularization; the solution method so obtained does not require the matrix T_{m+1,m} to be of full rank.

Algorithm 1. Block Lanczos tridiagonalization
Input: symmetric matrix A ∈ R^{n×n}, block vector X_1 ∈ R^{n×p} with orthonormal columns, number of steps m
1. Set X_0 := O ∈ R^{n×p}, S_1 := O_p
2. For j = 1, 2, …, m Do
3.   M_j := X_j^T A X_j
4.   B_{j+1} := A X_j − X_j M_j − X_{j−1} S_j^T
5.   Compute the compact QR factorization B_{j+1} = X_{j+1} S_{j+1}
6. EndFor
Output: Block Lanczos decomposition (3)
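A minimal numpy sketch of the reduction (3), patterned on Algorithm 1 with full reorthogonalization added (the function name, the reorthogonalization strategy, and the random test data are illustrative assumptions):

```python
import numpy as np

def block_lanczos(A, X1, m):
    """Sketch of block Lanczos tridiagonalization (cf. Algorithm 1).

    Returns Q = [X_1, ..., X_{m+1}] with orthonormal columns and the block
    tridiagonal T of size p(m+1) x pm with A @ Q[:, :p*m] ~= Q @ T."""
    n, p = X1.shape
    Q = np.zeros((n, p * (m + 1)))
    T = np.zeros((p * (m + 1), p * m))
    Q[:, :p] = X1
    X_prev = np.zeros((n, p))
    S_prev = np.zeros((p, p))
    for j in range(m):
        Xj = Q[:, j*p:(j+1)*p]
        W = A @ Xj - X_prev @ S_prev.T      # three-term recurrence
        Mj = Xj.T @ W                       # symmetric diagonal block
        W = W - Xj @ Mj
        # full reorthogonalization against all previous block vectors
        W = W - Q[:, :(j+1)*p] @ (Q[:, :(j+1)*p].T @ W)
        Xnext, Snext = np.linalg.qr(W)      # compact QR: W = X_{j+1} S_{j+1}
        T[j*p:(j+1)*p, j*p:(j+1)*p] = Mj
        T[(j+1)*p:(j+2)*p, j*p:(j+1)*p] = Snext
        if j + 1 < m:                       # symmetric leading submatrix
            T[j*p:(j+1)*p, (j+1)*p:(j+2)*p] = Snext.T
        Q[:, (j+1)*p:(j+2)*p] = Xnext
        X_prev, S_prev = Xj, Snext
    return Q, T
```

Reorthogonalization is included because, as reported in Section 4, the computed results do not agree with the theory without it.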

Block Golub-Kahan bidiagonalization
A large nonsymmetric and possibly rectangular matrix A ∈ R^{ℓ×n} can be reduced to a small lower block bidiagonal matrix by application of a few steps of the BGKB algorithm. This reduction method can be used to determine an approximate solution of the minimization problem

min_{X∈R^{n×p}} ||AX − B||_F,  (6)

where the block vector B ∈ R^{ℓ×p} is error-contaminated and can be written as (2). We assume for notational simplicity that 1 ≤ n ≤ ℓ. Introduce the compact QR factorization B = P_1 R_1, where P_1 ∈ R^{ℓ×p} has orthonormal columns and R_1 ∈ R^{p×p} is upper triangular. Then, m ≪ n/p steps of the BGKB algorithm applied to A with initial block vector P_1 give the decompositions

A W_m = U_{m+1} C_{m+1,m},   A^T U_m = W_m C_{m,m}^T,  (7)

where the matrices U_{m+1} = [P_1, …, P_{m+1}] ∈ R^{ℓ×p(m+1)} and W_m = [Z_1, …, Z_m] ∈ R^{n×pm} have orthonormal columns and

C_{m+1,m} =
[ L_1                    ]
[ R_2   L_2              ]
[       R_3    ⋱         ]
[              ⋱    L_m  ]
[                 R_{m+1}]  ∈ R^{p(m+1)×pm}

is lower block bidiagonal with lower triangular diagonal blocks L_j ∈ R^{p×p} and upper triangular subdiagonal blocks R_j ∈ R^{p×p}. The matrix U_m consists of the m first block columns of U_{m+1}, and C_{m,m} is the pm × pm leading principal submatrix of C_{m+1,m}. We assume that the number of steps, m, is small enough so that the decompositions (7) with the stated properties exist. The main steps required to compute these decompositions are summarized in Algorithm 2.

Algorithm 2. Block Golub-Kahan bidiagonalization
Input: matrix A ∈ R^{ℓ×n}, block vector B ∈ R^{ℓ×p}, number of steps m
1. Compute the compact QR factorization B = P_1 R_1
2. Compute F_1 := A^T P_1 and the compact QR factorization F_1 = Z_1 L_1^T
3. For j = 1, 2, …, m Do
   (a) G_{j+1} := A Z_j − P_j L_j
   (b) Compute the compact QR factorization G_{j+1} = P_{j+1} R_{j+1}
   (c) If j < m Then
       i. F_{j+1} := A^T P_{j+1} − Z_j R_{j+1}^T
       ii. Compute the compact QR factorization F_{j+1} = Z_{j+1} L_{j+1}^T
       EndIf
4. EndFor
Output: Block Golub-Kahan decompositions (7)

We will use the connection between the BGKB of A and the BLT of A^T A in our analysis of the decompositions (3) and (7). Multiplying the left-hand side decomposition of (7) by A^T from the left-hand side gives

A^T A W_m = A^T U_{m+1} C_{m+1,m} = W_{m+1} C_{m+1,m+1}^T C_{m+1,m} =: W_{m+1} T_{m+1,m}.  (8)

Thus, this decomposition is analogous to (3). In particular, the matrix T_{m+1,m} is block tridiagonal with block size p × p and its leading pm × pm submatrix is symmetric. We conclude that (8) is a BLT of A^T A with initial block vector Z_1. Since T_{m+1,m} is block tridiagonal, Equation (8) shows that the block columns Z_j of W_m satisfy a three-term recurrence relation. Moreover, the block columns Z_1, Z_2, …, Z_m form an orthonormal basis for the block Krylov subspace

K_m(A^T A, Z_1) = ℛ([Z_1, (A^T A)Z_1, …, (A^T A)^{m−1}Z_1]).

The block LSQR method applied to the solution of (6) solves at step m the minimization problem

min_{X∈K_m(A^T A, Z_1)} ||AX − B||_F = min_{Y∈R^{pm×p}} ||C_{m+1,m} Y − E_1 R_1||_F,  (9)

where the right-hand side is obtained by substituting decomposition (7) into the left-hand side. Assume for the moment that the matrix C_{m+1,m} is of full rank, and denote the solution of the problem appearing on the right-hand side of (9) by Y_m. Then, X̂_m = W_m Y_m is the solution of the problem appearing on the left-hand side of (9), which is an approximate solution of (6).
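The decompositions (7) can be sketched in numpy as follows; the function name, the reorthogonalization, and the test data are illustrative assumptions.

```python
import numpy as np

def bgkb(A, B, m):
    """Sketch of block Golub-Kahan bidiagonalization (cf. Algorithm 2).

    Returns U (l x p(m+1)), W (n x pm) with orthonormal columns and the
    lower block bidiagonal C (p(m+1) x pm) such that
    A @ W ~= U @ C  and  A.T @ U[:, :p*m] ~= W @ C[:p*m, :].T  (cf. (7))."""
    l, n = A.shape
    p = B.shape[1]
    U = np.zeros((l, p * (m + 1)))
    W = np.zeros((n, p * m))
    C = np.zeros((p * (m + 1), p * m))
    P, _R1 = np.linalg.qr(B)                 # B = P_1 R_1
    U[:, :p] = P
    Z, Lt = np.linalg.qr(A.T @ P)            # F_1 = Z_1 L_1^T
    W[:, :p] = Z
    for j in range(m):
        L = Lt.T                             # lower triangular diagonal block
        C[j*p:(j+1)*p, j*p:(j+1)*p] = L
        G = A @ Z - P @ L
        G -= U[:, :(j+1)*p] @ (U[:, :(j+1)*p].T @ G)   # reorthogonalize
        P, Rnext = np.linalg.qr(G)           # G_{j+1} = P_{j+1} R_{j+1}
        U[:, (j+1)*p:(j+2)*p] = P
        C[(j+1)*p:(j+2)*p, j*p:(j+1)*p] = Rnext
        if j + 1 < m:
            F = A.T @ P - Z @ Rnext.T
            F -= W[:, :(j+1)*p] @ (W[:, :(j+1)*p].T @ F)
            Z, Lt = np.linalg.qr(F)          # F_{j+1} = Z_{j+1} L_{j+1}^T
            W[:, (j+1)*p:(j+2)*p] = Z
    return U, W, C
```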

Tikhonov regularization
The block tridiagonal or lower block bidiagonal matrices in the reduced problems (5) and (9), respectively, might be numerically rank deficient. This often is the case when these matrices are large, because the singular values of the matrix A "cluster" at the origin. It follows that the reduced problems may have to be regularized before solution. We will apply Tikhonov regularization to the reduced problems obtained by BLT and BGKB; we provide details for the former only (i.e., for the problem appearing on the right-hand side of (5)). Tikhonov regularization applied to this setting gives a minimization problem of the form

min_{Y∈R^{pm×p}} { ||T_{m+1,m} Y − E_1 S_1||_F^2 + μ ||Y||_F^2 }.  (10)

For a given value of the regularization parameter μ > 0, the solution of (10) can be expressed as

Y_μ = (T_{m+1,m}^T T_{m+1,m} + μ I_{pm})^{−1} T_{m+1,m}^T E_1 S_1.  (11)

There are several techniques for determining a suitable value of μ, including the discrepancy principle, generalized cross-validation, and the L-curve criterion; see References 7-13 for discussions on these and other methods. In the computed examples of this article, we will use the discrepancy principle, which was first discussed by Morozov 14. This approach to determining μ requires that a bound δ ≥ ||E||_F for the error E in B, cf. (2), be known, and prescribes that μ > 0 be chosen so that the solution (11) of (10) satisfies

||T_{m+1,m} Y_μ − E_1 S_1||_F = ηδ,  (12)

where η > 1 is a user-chosen constant that is independent of δ; when the available estimate of ||E||_F is deemed accurate, the parameter η is generally chosen to be close to unity. We note that there is a μ > 0 that satisfies (12) only if the number of steps, m, is large enough. It follows from (10) that the left-hand side of (12) is an increasing function of μ > 0. In our examples, we choose m as small as possible so that min_Y ||T_{m+1,m} Y − E_1 S_1||_F < ηδ. Then a zero-finder is applied to solve (12) for μ > 0; see Reference 2 for further details. Thus, the discrepancy principle is used both to determine the number of steps m and μ > 0. Similar derivations and analogous expressions for the norm of the residual errors can be obtained when applying the BGKB algorithm.
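The combination of (10)-(12) can be sketched on a generic reduced problem. The bisection below is a simple stand-in for the zero-finder of Reference 2 (it exploits that the residual norm grows with μ); all names and the bracket for μ are illustrative assumptions.

```python
import numpy as np

def tikhonov_discrepancy(T, rhs, delta, eta=1.1):
    """Solve min ||T Y - rhs||_F^2 + mu ||Y||_F^2 (cf. (10)-(11)) and pick
    mu by the discrepancy principle ||T Y_mu - rhs||_F = eta * delta (cf. (12)).
    Simple bisection on log10(mu); a sketch, not the zero-finder of Ref. 2."""
    def residual(mu):
        Y = np.linalg.solve(T.T @ T + mu * np.eye(T.shape[1]), T.T @ rhs)
        return np.linalg.norm(T @ Y - rhs), Y
    target = eta * delta
    lo, hi = -16.0, 16.0                 # bracket for log10(mu)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        r, Y = residual(10.0 ** mid)
        if r < target:
            lo = mid                     # residual grows with mu: increase mu
        else:
            hi = mid
    return 10.0 ** mid, Y
```

A root of (12) exists here because the residual at μ → 0 is at most δ < ηδ, while for large μ it approaches ||rhs||_F.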

BLT AND BGKB APPLIED TO LINEAR DISCRETE ILL-POSED PROBLEMS
This section first discusses the convergence of the subdiagonal and diagonal block entries of the matrices T_{m+1,m} and C_{m+1,m} in (3) and (7), respectively, with increasing block index. Unless otherwise stated, and with a slight abuse of notation, here and in the following, we will denote by A the square matrix of order n + q, whose leading principal submatrix of order n is the coefficient matrix A appearing in (3), padded with 0 ≤ q < p rows and columns of zeros, where p is the block size used in the block Lanczos or block Golub-Kahan algorithms. For A symmetric, the following proofs use the spectral factorization

A = W Λ W^T,  (13)

where the matrix W = [w_1, w_2, …, w_{n+q}] ∈ R^{(n+q)×(n+q)} is orthogonal and Λ = diag[λ_1, λ_2, …, λ_{n+q}] with

|λ_1| ≥ |λ_2| ≥ … ≥ |λ_{n+q}| ≥ 0.  (14)

We will use the notation

E_i = [O_p, …, O_p, I_p, O_p, …, O_p]^T ∈ R^{(n+q)×p},

where I_p is the ith block, and where 0 < r := (n + q)/p with q being the smallest nonnegative integer such that r ∈ N.
Theorem 1. Assume that the block Lanczos method applied to the symmetric and positive semidefinite matrix A ∈ R^{(n+q)×(n+q)} with initial block matrix X_1 ∈ R^{(n+q)×p} with orthonormal columns does not break down, that is, that r := (n + q)/p steps of the method can be carried out. Let the eigenvalues of A be ordered according to (14), and let S_2, S_3, …, S_{m+1}, m ≤ r, be the subdiagonal blocks of the matrix T_{m+1,m} determined by m steps of the block Lanczos method; cf. (4). Then

||S_{m+1} S_m ⋯ S_2||_2 ≤ λ_1 λ_2 ⋯ λ_m.  (15)
Proof. Introduce the monic polynomial p_m(t) = ∏_{j=1}^m (t − λ_j) defined by the m largest eigenvalues of A. Using the spectral factorization (13), we obtain

||p_m(A)||_2 = max_{1≤k≤n+q} |p_m(λ_k)| = max_{m<k≤n+q} ∏_{j=1}^m (λ_j − λ_k) ≤ ∏_{j=1}^m λ_j,

where the inequality follows from the fact that all λ_j are nonnegative. Hence,

||p_m(A)||_2 ≤ λ_1 λ_2 ⋯ λ_m.  (16)

Application of r steps of the block Lanczos method gives the decomposition A = Q_r T_{r,r} Q_r^T, where T_{r,r} ∈ R^{(n+q)×(n+q)} is symmetric block tridiagonal. We have E_{m+1}^T p_m(T_{r,r}) E_1 = E_{m+1}^T T_{r,r}^m E_1, because p_m(t) = t^m + lower-order terms and E_{m+1}^T T_{r,r}^k E_1 = O_p for k < m by the block tridiagonal structure of T_{r,r}. Thus,

||E_{m+1}^T T_{r,r}^m E_1||_2 = ||E_{m+1}^T p_m(T_{r,r}) E_1||_2 ≤ ||p_m(T_{r,r})||_2 = ||p_m(A)||_2.  (17)

The above inequality follows by direct computations. We are going to show by induction over m that

E_{m+1}^T T_{r,r}^m E_1 = S_{m+1} S_m ⋯ S_2  (18)

for any m < r. When m = 1, Equation (18) becomes E_2^T T_{r,r} E_1 = S_2, which holds by the definition (4) of the blocks of T_{r,r}. When m = 2, let us consider

E_3^T T_{r,r}^2 E_1 = (E_3^T T_{r,r})(T_{r,r} E_1).  (19)

The first two factors in (19) are E_3^T T_{r,r} = [O_p, S_3, M_3, S_4^T, O_p, …, O_p], while the remaining two factors are T_{r,r} E_1 = [M_1, S_2^T, O_p, …, O_p]^T. It follows that the expression (19) can be written as E_3^T T_{r,r}^2 E_1 = S_3 S_2. More generally, by induction, assume that (18) is valid for 2 ≤ m < r − 1. This means

E_{m+1}^T T_{r,r}^m E_1 = S_{m+1} S_m ⋯ S_2.  (20)

We would like to show that (18) is valid for 2 ≤ m + 1 < r. From the block tridiagonal structure of T_{r,r}, the block row E_{m+2}^T T_{r,r} has its first nonvanishing block, S_{m+2}, in position m + 1, while the block column T_{r,r}^m E_1 has nonvanishing blocks only in positions 1, 2, …, m + 1. Hence,

E_{m+2}^T T_{r,r}^{m+1} E_1 = (E_{m+2}^T T_{r,r})(T_{r,r}^m E_1) = S_{m+2} E_{m+1}^T T_{r,r}^m E_1 = S_{m+2} S_{m+1} ⋯ S_2.

Hence, (18) is valid. Combining (16)-(18) proves the theorem. ▪

We are interested in problems for which the right-hand side of (15) decreases to zero as m increases. This holds for all symmetric linear discrete ill-posed problems that we have encountered in various applications, and depends on the eigenvalues of A "clustering" at the origin. In fact, we found the bound (15) to be quite sharp. We give a numerical illustration of the latter in Section 4.
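The bound of Theorem 1 is easy to probe numerically in the scalar case p = 1, where the subdiagonal blocks are the Lanczos coefficients; the synthetic positive semidefinite matrix with geometrically decaying eigenvalues below is an illustrative assumption.

```python
import numpy as np

# p = 1 check of (15): |s_{m+1} s_m ... s_2| <= lambda_1 ... lambda_m.
rng = np.random.default_rng(1)
n = 40
lam = 2.0 ** -np.arange(n)                 # lambda_1 >= lambda_2 >= ... >= 0
Wmat = np.linalg.qr(rng.standard_normal((n, n)))[0]
A = Wmat @ np.diag(lam) @ Wmat.T           # symmetric positive semidefinite

q = rng.standard_normal(n)
q /= np.linalg.norm(q)
Q = [q]
betas = []                                 # subdiagonal entries s_2, s_3, ...
for j in range(10):
    w = A @ Q[-1]
    for v in Q:                            # full reorthogonalization
        w -= (v @ w) * v
    beta = np.linalg.norm(w)
    betas.append(beta)
    Q.append(w / beta)

for m in range(1, 11):
    assert np.prod(betas[:m]) <= np.prod(lam[:m]) * 1.01
```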

Corollary 1.
Let the matrix A ∈ R^{(n+q)×(n+q)} be symmetric and positive semidefinite. Assume that the eigenvalues of A "cluster" at the origin so that the right-hand side of (15) decreases to zero as m increases, and that the block Lanczos method applied to A with initial block vector X_1 with orthonormal columns does not break down. Furthermore, assume that, for all j > s, the condition (21) holds for some constant C independent of j and s. Then, both the diagonal and subdiagonal block matrices M_j and S_j of the block tridiagonal Lanczos matrix T_{r,r}, defined by (3), converge to O_p as j increases.
Proof. We first remark that when we let the index j increase in (21), we also may have to increase m in (3). By Theorem 1, we have the bound

||S_{m+1} S_m ⋯ S_2||_2 ≤ λ_1 λ_2 ⋯ λ_m.

The product ∏_{j=1}^m λ_j converges to zero as m increases. It follows that ||S_{m+1} S_m ⋯ S_2||_2 converges to zero as m increases. Therefore, in view of (21), the subdiagonal blocks S_j of T_{r,r} approach O_p as j increases. We turn to the block diagonal entries M_j of the matrix T_{r,r}. Let ε > 0 be arbitrarily small. Since A = Q_r T_{r,r} Q_r^T, the matrices A and T_{r,r} are similar. Therefore, the eigenvalues of the matrix T_{r,r} "cluster" at the origin, which is the only cluster point. Split the matrix T_{r,r} = T̃_{j,j} + Ẽ_j, where T̃_{j,j} is the symmetric block tridiagonal matrix obtained by setting the subdiagonal blocks of T_{r,r} in the block rows j + 1, j + 2, …, r to zero (the corresponding superdiagonal blocks also are set to zero), the matrix Ẽ_j contains the blocks so set to zero, and j is chosen so that ||Ẽ_j||_2 ≤ ε. Since the eigenvalues of T_{r,r} "cluster" at the origin, it follows from the Bauer-Fike theorem that the eigenvalues of the matrix T̃_{j,j} "cluster" in the interval [−ε, ε]. For some δ > 0 arbitrarily small, there is an index s, depending on δ, such that all eigenvalues of the blocks M_k are in the interval [−ε − δ, ε + δ] for all k ≥ s. Hence, ||M_k||_2 ≤ ε + δ for all k ≥ s. Since ε and δ can be chosen arbitrarily small, this shows that the diagonal blocks M_j converge to the zero matrix as j increases. ▪

Corollary 1 is stated in Reference 15 for block size p = 1 without the condition (21). Numerical experiments with a large number of discrete ill-posed problems indicate that this condition does not have to be imposed; we conjecture that this is the case.
The following example illustrates that condition (21) is required if the matrix A is not a discretization of an ill-posed operator equation.
Example. Let p = 1 and consider the symmetric tridiagonal matrix T_{2m} with subdiagonal entries S_{2j} = 1, j = 1, 2, …, m, and S_{2j+1} = 10^{−j}, j = 1, 2, …, m − 1, and diagonal entries M_j equal to the sum of the subdiagonal and superdiagonal entries in the same row. Then, T_{2m} satisfies the conditions of Corollary 1 except for (21). Its eigenvalues cluster at the origin and at 2. Since the eigenvalues of T_{2m} cluster at two points, the matrix is not a discretization of a linear operator of an ill-posed problem. Neither the diagonal nor subdiagonal entries of T_{2m} converge to zero for increasing index number as m increases.
□

We observe that the decrease of the subdiagonal blocks S_j of T_{r,r} to the zero matrix follows from the clustering of the eigenvalues of A. It is not necessary that they cluster at the origin. This can be seen by replacing the matrix A in Corollary 1 by A + cI_n for some constant c ∈ R.
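The matrix of the Example above can be generated and its spectrum inspected numerically; a sketch, with m = 8 as an arbitrary illustrative choice:

```python
import numpy as np

m = 8
# subdiagonal entries s_2, ..., s_{2m}: s_{2j} = 1, s_{2j+1} = 10^{-j}
sub = []
for j in range(1, m + 1):
    sub.append(1.0)                 # s_{2j}
    if j < m:
        sub.append(10.0 ** -j)      # s_{2j+1}
sub = np.array(sub)                 # length 2m - 1

# diagonal entries: sum of the sub- and superdiagonal entries in the same row
diag = np.zeros(2 * m)
diag[0] = sub[0]
diag[-1] = sub[-1]
diag[1:-1] = sub[:-1] + sub[1:]

T = np.diag(diag) + np.diag(sub, 1) + np.diag(sub, -1)
ev = np.linalg.eigvalsh(T)
# every eigenvalue lies near one of the two cluster points 0 and 2
assert np.all(np.minimum(np.abs(ev), np.abs(ev - 2.0)) < 0.35)
```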
We turn to symmetric indefinite matrices.

Theorem 2.
Let the eigenvalues {λ_j}_{j=1}^{n+q} of the symmetric matrix A ∈ R^{(n+q)×(n+q)} be ordered according to (14). Assume that the block Lanczos method applied to A with initial matrix X_1 does not break down. Then

||S_{m+1} S_m ⋯ S_2||_2 ≤ ∏_{j=1}^m (|λ_j| + |λ_{m+1}|).  (22)

Proof. Let p_m(t) be the monic polynomial of the proof of Theorem 1. Then, just like in that proof,

||S_{m+1} S_m ⋯ S_2||_2 ≤ ||p_m(A)||_2 = max_{m<k≤n+q} ∏_{j=1}^m |λ_j − λ_k|.

Due to the ordering (14), it follows that the eigenvalues λ_{m+1}, λ_{m+2}, …, λ_{n+q} are contained in the interval [−|λ_{m+1}|, |λ_{m+1}|]. Therefore,

|λ_j − λ_k| ≤ |λ_j| + |λ_{m+1}|,   1 ≤ j ≤ m,  m < k ≤ n + q.

In addition, we have shown in (18) that E_{m+1}^T T_{r,r}^m E_1 = S_{m+1} S_m ⋯ S_2. Combining these relations proves (22). ▪

Assume that the eigenvalues of A cluster at the origin. Then, Theorem 2 shows that the quantity ||S_{m+1} S_m ⋯ S_2||_2 decreases to zero, because the factors |λ_{m+1}| + |λ_k| decrease to zero as m and k increase, with 1 ≤ k ≤ m. Moreover, the more block Lanczos steps are taken, the tighter is the bound for the norm of the product of the subdiagonal block matrices of the matrix T_{r,r}.
We can obtain sharper bounds if more information about the spectrum of A is available. For instance, if all but a few eigenvalues of A are known to be nonnegative, then only factors with negative eigenvalues have to be modified as in Theorem 2, resulting in improved bounds for ||S m + 1 S m … S 2 || 2 . In the next corollary, we derive a simpler, but cruder, bound than (22).

Corollary 2.
Let the eigenvalues {λ_j}_{j=1}^{n+q} of the symmetric matrix A ∈ R^{(n+q)×(n+q)} be ordered according to (14). Assume that the block Lanczos method applied to A with initial block vector X_1 with orthonormal columns does not break down. Then

||S_{m+1} S_m ⋯ S_2||_2 ≤ (|λ_1| + |λ_{m+1}|)^m.

Recall that a block vector X ∈ R^{(n+q)×p} with orthonormal columns is invariant under A if A ℛ(X) ⊆ ℛ(X), where ℛ(X) is the range space of X. Thus, there is a matrix M ∈ R^{p×p} such that AX = XM. Let ε > 0. We say that the block vector X with orthonormal columns is ε-invariant under A if there is a matrix M ∈ R^{p×p} such that ||AX − XM||_2 ≤ ε.

Proof. The bound follows from (22), since |λ_j| ≤ |λ_1| for all j. Furthermore, we have

AX_j = X_{j−1} S_j^T + X_j M_j + X_{j+1} S_{j+1},

so that ||AX_j − X_j M_j||_2 ≤ ||S_j||_2 + ||S_{j+1}||_2. Because M_j and S_j approach O_p as j increases, we can conclude that the Lanczos block vectors X_j are ε-invariant under A with M = O_p for j large. ▪

We conclude this subsection by deriving an estimate that is analogous to the one in Theorem 1 for nonsymmetric and potentially rectangular matrices that require the use of the BGKB partial factorization. This estimate involves both the diagonal and lower diagonal blocks of the matrix C_{m+1,m} in (7), and leverages the fact that (8) is a block Lanczos decomposition of A^T A, analogous to (3). Here, we assume that A ∈ R^{ℓ×(n+q)}, with ℓ ≥ n + q, is a matrix whose first n columns contain the coefficient matrix A appearing in (7), padded with q ≥ 0 columns of zeros; q is the smallest nonnegative integer such that r := (n + q)∕p ∈ N.

Corollary 3. Let A ∈ R^{ℓ×(n+q)} have the singular values σ_1 ≥ σ_2 ≥ … ≥ σ_{n+q} ≥ 0, and assume that the BGKB algorithm applied to A with initial block vector P_1 does not break down, that is, that r := (n + q)/p steps of the method can be carried out. Then

||(L_{m+1}^T R_{m+1})(L_m^T R_m) ⋯ (L_2^T R_2)||_2 ≤ σ_1^2 σ_2^2 ⋯ σ_m^2.  (23)

Proof. The block entries of the block tridiagonal matrix T_{m+1,m} in (8) can be expressed as

M_j = L_j^T L_j + R_{j+1}^T R_{j+1},   S_j = L_j^T R_j.

Since the block subdiagonal entries of the matrix T_{m+1,m} are L_j^T R_j, and the eigenvalues of A^T A are σ_j^2, the result follows from Theorem 1. ▪

Note that, since the singular value decomposition of A can be characterized in terms of the eigendecompositions of A^T A and AA^T, the bound (23) can only be given in terms of both the diagonal and lower diagonal blocks of the matrix C_{m+1,m}.
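A bound of the form (23) can be probed in the scalar case p = 1, where the blocks L_j and R_j reduce to the Golub-Kahan coefficients l_j and r_j; the synthetic singular values below are an illustrative assumption.

```python
import numpy as np

# p = 1 check of (23): |l_{m+1} r_{m+1} ... l_2 r_2| <= sigma_1^2 ... sigma_m^2.
rng = np.random.default_rng(3)
n = 30
sigma = 3.0 ** -np.arange(n)                 # decaying singular values
U = np.linalg.qr(rng.standard_normal((n, n)))[0]
V = np.linalg.qr(rng.standard_normal((n, n)))[0]
A = U @ np.diag(sigma) @ V.T

b = rng.standard_normal(n)
P = [b / np.linalg.norm(b)]                  # p_1
f = A.T @ P[0]
L = [np.linalg.norm(f)]                      # l_1
Z = [f / L[0]]                               # z_1
R = []                                       # r_2, r_3, ...
for j in range(8):
    g = A @ Z[-1] - L[-1] * P[-1]
    for v in P:                              # reorthogonalize
        g -= (v @ g) * v
    r = np.linalg.norm(g); R.append(r); P.append(g / r)
    f = A.T @ P[-1] - R[-1] * Z[-1]
    for v in Z:
        f -= (v @ f) * v
    l = np.linalg.norm(f); L.append(l); Z.append(f / l)

# subdiagonal entries of the tridiagonal matrix for A^T A are s_j = l_j r_j
for m in range(1, 9):
    prod_s = np.prod([L[i] * R[i - 1] for i in range(1, m + 1)])
    assert prod_s <= np.prod(sigma[:m] ** 2) * 1.01
```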
We are interested in problems for which the right-hand side of (23) decreases to zero as m increases. This holds for all linear discrete ill-posed problems that we have come across in many applications.

COMPUTED EXAMPLES
To illustrate the properties discussed in the previous sections, we applied the symmetric block Lanczos method and the BGKB method to a set of test matrices that stem from the discretization of ill-posed problems. The numerical experiments were carried out using MATLAB R2017a with about 15 significant decimal digits, on a Xeon E-2244G computer (8 cores, 16 threads) with 16 GB RAM. Although the results of many numerical tests validating the use of BLT and BGKB are already provided in References 4 and 2, respectively, the ones presented in this section illustrate the theoretical results in this article, include new comparisons with other direct and sequential solution methods, and span a wider set of test problems. A first set of experiments uses fairly small square test matrices of order 200 (unless otherwise stated), so that computing the eigendecomposition and the SVD is computationally affordable; indeed, the properties discussed in this article can be observed already for quite small matrices. The symmetric test matrices are listed in the upper part of Table 1, and the nonsymmetric ones in the bottom part of the same table. Among the symmetric matrices, gravity is positive definite, deriv2 is negative definite, and phillips is indefinite. In the case of the nonsymmetric test matrix tomo, we set the size to 400 × 400, because of the very slow decay of its singular values. All matrices but one (i.e., lotkin from MATLAB's gallery) in this set of experiments are from the Regularization Tools package 16. More precisely, we use the test problems from Reference 16 to define matrices A, the first column of B_true, and the associated error-free solution x_0 in the first column of the block vector solution X_true. The other columns of X_true are obtained by setting x_i = x_{i−1} + y^2 for i = 1, 2, …, p − 1, where y is a vector obtained by discretizing a function of the form (1/2) cos(t/3) + 1/4 at equidistant points on the interval −6 ≤ t ≤ 6.
Consequently, the other columns of B_true are obtained by taking b_i = A x_i for i = 1, 2, …, p − 1. The solution of the lotkin example is the same as for the phillips example. The contaminated data block vector is given by (2) with

E = ν ||B_true||_F Ĕ ∕ ||Ĕ||_F,

where the random block vector Ĕ ∈ R^{n×p} models Gaussian white noise with mean zero and variance one, and ν is a chosen noise level. In our experiments, we let ν ∈ {10^{−6}, 10^{−4}, 10^{−2}}. Unless otherwise stated, the block size is p = 5. As prescribed by Algorithms 1 and 2, the BLT and BGKB algorithms are initiated with the block vector B. One reorthogonalization step is carried out; the process is repeated if needed. The computed results do not agree with the theory developed in the previous section when no reorthogonalization is carried out. The quantity K ≤ r denotes the number of BLT or BGKB steps performed for each test problem. The last experiment models image deblurring of a color image, and uses some of the functionalities available within the IR Tools package 17.

T A B L E 1 Solution of symmetric linear systems: The errors E_BLT and E_TEIG are optimal for truncated block Lanczos iteration and truncated eigenvalue decomposition; the errors E_BGKB and E_TSVD are optimal for BGKB and truncated singular value decomposition (TSVD)
In the first set of experiments we use an error-free initial block, both in the BLT and BGKB algorithms, that is, we set B = B true . We verified that the graphs in Figures 1-5 do not change significantly for noise levels up to 10 −2 .
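The noise model used above can be sketched as a small helper; the function name is an illustrative assumption.

```python
import numpy as np

def add_noise(B_true, nu, rng):
    """Contaminate B_true with scaled Gaussian white noise so that the
    noise level ||E||_F / ||B_true||_F equals nu, cf. (2)."""
    E = rng.standard_normal(B_true.shape)
    E *= nu * np.linalg.norm(B_true) / np.linalg.norm(E)
    return B_true + E, E
```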
We first illustrate the properties derived in Section 3. Figure 1 displays, in logarithmic scale, the values taken by the left-hand side and right-hand side in the inequalities (15), (22), and (23), as functions of the number of iterations. Iterations were carried out until breakdown, that is, m = 1, 2, …, K ≤ r. The graphs show that for symmetric discrete ill-posed problems the decay of the subdiagonal blocks of T_{m+1,m} to zero may be much faster than suggested by the bounds (15) and (22). It follows that the ability of the Lanczos block vectors to approximate the space spanned by the principal eigenvectors often is stronger than indicated by the bounds (15) and (22). The same holds true for the BGKB method. We also remark that round-off errors introduced during the computation of the eigenvalues and subdiagonal blocks of the matrices T_{m,m}, m = 1, 2, …, may affect the graphs. In any case, when m is large, the matrix T_{m,m} has eigenvalues of "tiny" absolute value.
We next illustrate that the subspaces ℛ(Q_k) generated by the block Lanczos method (3) essentially contain subspaces of eigenvectors of A associated with the eigenvalues of largest absolute value. In addition, we show the convergence of the largest eigenvalues (in absolute value) of the matrices T_{k,k} in (3) to the largest eigenvalues (in absolute value) of A as k increases. Here, T_{k,k} ∈ R^{pk×pk} denotes the matrix obtained by neglecting the last block row of the matrix T_{k+1,k} ∈ R^{p(k+1)×pk} in (3), with m replaced by k. The block Lanczos method is applied until breakdown occurs. For each k, consider the eigenvalues {λ̆_i^{(k)}}_{i=1}^{pk} of T_{k,k}, which are commonly referred to as Ritz values of A. We compare the Ritz values of largest absolute value to the corresponding eigenvalues of the matrix A. For each step k of the block Lanczos algorithm, we compute the relative difference R_k between the ⌈pk/3⌉ eigenvalues of largest absolute value of T_{k,k} and the corresponding eigenvalues of A, where ⌈α⌉ denotes the integer closest to α ≥ 0. Figure 2 shows excellent agreement between the first ⌈pk/3⌉ Ritz values of A and the corresponding eigenvalues already for small k. We turn to a comparison of subspaces determined by the span of the Lanczos block vectors of A associated with the Ritz values of largest absolute value. For each k, consider Q_k ∈ R^{n×pk} made up of the first k block columns of the matrix Q_m in (3). Partition the eigenvector matrix of A, cf. (13), according to W = [W^{(1)}_i, W^{(2)}_{n−i}], where the columns w_j (j = 1, …, i) of W^{(1)}_i ∈ R^{n×i} are the first i eigenvectors, and the columns of W^{(2)}_{n−i} ∈ R^{n×(n−i)} are the remaining eigenvectors. The columns of W^{(1)}_i and W^{(2)}_{n−i} span orthogonal subspaces. Let 𝒫_k = I_n − Q_k Q_k^T be the orthogonal projector onto ℛ(Q_k)^⊥, the subspace orthogonal to the range of Q_k.

We consider the quantities

R^w_{k,i} = ||𝒫_k W^{(1)}_i||_2.

The value of R^w_{k,i} is small when span{w_j}_{j=1}^i is approximately contained in span{q_j}_{j=1}^{pk}, that is, when the solution subspace generated by the block Lanczos vectors essentially contains the space generated by the first i eigenvectors. The graphs in the left-hand side column of Figure 3 depict R^w_{k,i} for k = ⌈n/(2p)⌉ (k = K if a breakdown occurred) and i = 1, 2, …, pk, for the symmetric test matrices. They show that, for a fixed k, only a fraction of the eigenvectors are well approximated by pk Lanczos vectors.
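The quantity R^w_{k,i} is the sine of the largest principal angle between the two subspaces; it can be computed generically as follows (the function name is an illustrative assumption, and any orthonormal bases may be supplied).

```python
import numpy as np

def subspace_containment(Qk, W1):
    """R = ||(I - Qk Qk^T) W1||_2 for matrices Qk, W1 with orthonormal
    columns; small when range(W1) is nearly contained in range(Qk)."""
    P = W1 - Qk @ (Qk.T @ W1)   # apply the projector onto range(Qk)^perp
    return np.linalg.norm(P, 2)
```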
A few comments on the graphs of Figure 4 are in order. The left-hand side graphs show that the span of the first ⌈pk/3⌉ eigenvectors of A is numerically contained in the span of the first pk Lanczos vectors already for quite small values of k.

F I G U R E 2 The graphs in the left-hand side column display the relative difference R_k versus k between the ⌈pk/3⌉ eigenvalues of largest absolute value of the symmetric test matrices and the corresponding Ritz values. The right-hand side column shows the behavior of R_k versus k for nonsymmetric problems

We remark that this is not true if we compare the spaces spanned by the first pk eigenvectors of A and by the first k Lanczos block vectors. Graphs in the right-hand side column, which compare the span of the first ⌈pk/2⌉ eigenvectors of A with the span of the first pk Lanczos vectors, look similar to the graphs in the left-hand side column, but display slower convergence.
We turn to nonsymmetric matrices A. Introduce the singular value decomposition

A = U Σ V^T,   Σ = diag(σ_1, σ_2, … , σ_n) ∈ R^{ℓ×n},   σ_1 ≥ σ_2 ≥ … ≥ σ_r > σ_{r+1} = … = σ_n = 0.    (24)

Thus, U ∈ R^{ℓ×ℓ} and V ∈ R^{n×n} are orthogonal matrices, and r is the rank of A. The block Lanczos method in the above experiments is replaced by the block Golub-Kahan method (7). The latter method is applied until iteration pK ≤ r, when breakdown occurs. The graphs in the right-hand side column of Figure 2 show the relative differences between the singular values σ̆^{(k)}_i of C_{k+1,k} and the corresponding singular values of A.

FIGURE 3  Distances R^w_{k,i} (resp. R^{(u,v)}_{k,i}), versus i = 1, 2, … , pk, between the subspaces spanned by the first i eigenvectors (resp. singular vectors) of the symmetric (resp. nonsymmetric) test matrices, and the subspaces spanned by the corresponding k Lanczos (resp. Golub-Kahan) block vectors. Here, k = ⌈n/(2p)⌉, unless a breakdown occurred.

FIGURE 4  The graphs in the left-hand side column display the distances R^w_{k,⌈pk/3⌉}, versus k = 1, 2, … , ⌈n/p⌉, between the space spanned by the ⌈pk/3⌉ principal eigenvectors of the symmetric test matrices and the space spanned by the first k Lanczos block vectors. The right-hand side column shows the behavior of R^w_{k,⌈pk/2⌉}.

Let U and V be the orthogonal matrices in the singular value decomposition (24) of A, let ℓ = n, and partition these matrices similarly to what was done for the eigenvector matrix for symmetric matrices A; that is, we let U = [U^{(1)}_i, U^{(2)}_{n−i}] and V = [V^{(1)}_i, V^{(2)}_{n−i}], where the submatrices U^{(1)}_i and V^{(1)}_i contain the first i left and right singular vectors, and the submatrices U^{(2)}_{n−i} and V^{(2)}_{n−i} contain the remaining n − i left and right singular vectors, respectively. To investigate the convergence of subspaces, we introduce the orthogonal projectors

P_{U_k} = U_k U_k^T,   P_{W_k} = W_k W_k^T,

where the matrices U_k and W_k contain the first k block columns of the matrices U_m and W_m, respectively, in the decompositions (7). To measure the distance between the spaces spanned by the singular vectors of A and those spanned by the vectors computed with the BGKB method, we define the merit index

R^{(u,v)}_{k,i} = max{ ||(I − P_{U_k}) U^{(1)}_i||_2, ||(I − P_{W_k}) V^{(1)}_i||_2 }.

FIGURE 5  The graphs in the left-hand side column display the distance R^{(u,v)}_{k,⌈pk/3⌉}, versus k = 1, … , ⌈n/p⌉, between the space spanned by the first ⌈pk/3⌉ singular vectors of the nonsymmetric test matrices and the first k Golub-Kahan block vectors. The right-hand side column shows the behavior of R^{(u,v)}_{k,⌈pk/2⌉}.

The quantities R^{(u,v)}_{k,i} are displayed, for k = ⌈n/(2p)⌉ (k = K in case a breakdown occurred) and i = 1, 2, … , pk, in the right-hand side column of Figure 3. The figures illustrate that the subspaces spanned by the first few columns determined by the block Lanczos and block Golub-Kahan algorithms are close to the subspaces spanned by the first few eigenvectors and singular vectors, respectively, of the matrix A. Figure 5 depicts graphs for the quantities R^{(u,v)}_{k,⌈pk/3⌉} and R^{(u,v)}_{k,⌈pk/2⌉}, for k = 1, 2, … , ⌈n/p⌉ (k = 1, 2, … , K, if a breakdown occurred).
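The BGKB decomposition and the subspace distances discussed above can be sketched numerically. The following is a minimal illustration of the idea, not the implementation used in the experiments: it carries out a few block Golub-Kahan steps with reorthogonalization on a matrix with rapidly decaying singular values and measures how far the leading right singular subspace sticks out of the computed block Krylov subspace (all function names are ours).

```python
import numpy as np

def block_golub_kahan(A, B, steps):
    """Partial block Golub-Kahan (BGKB) bidiagonalization -- a sketch.

    Returns stacked left blocks [U_1, ..., U_{steps+1}] and right blocks
    [W_1, ..., W_steps], each with orthonormal columns. Two passes of
    reorthogonalization keep the blocks numerically orthogonal.
    """
    n = A.shape[1]
    p = B.shape[1]
    U, _ = np.linalg.qr(B)                    # U_1: orthonormal basis of range(B)
    Us, Ws = [U], []
    W_prev, R_prev = np.zeros((n, p)), np.zeros((p, p))
    for _ in range(steps):
        Wt = A.T @ U - W_prev @ R_prev.T      # block recurrence, right side
        for _ in range(2):                    # "twice is enough" reorthogonalization
            for Wj in Ws:
                Wt -= Wj @ (Wj.T @ Wt)
        W, L = np.linalg.qr(Wt)
        Ws.append(W)
        Ut = A @ W - U @ L.T                  # block recurrence, left side
        for _ in range(2):
            for Uj in Us:
                Ut -= Uj @ (Uj.T @ Ut)
        U, R = np.linalg.qr(Ut)
        Us.append(U)
        W_prev, R_prev = W, R
    return np.hstack(Us), np.hstack(Ws)

def subspace_distance(Wk, V1):
    """Spectral-norm distance ||(I - W_k W_k^T) V1||_2 of range(V1) from range(W_k)."""
    return np.linalg.norm(V1 - Wk @ (Wk.T @ V1), 2)
```

When the singular values of A decay quickly, a handful of block steps typically already gives a small distance to the leading right singular subspace, consistent with the behavior seen in Figures 3 to 5.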
We finally illustrate the performance of the BLT and BGKB algorithms when applied to the solution of discrete ill-posed problems. Since we assume that the desired solution X_true is known, we first use it to show that the solution subspaces determined by these algorithms can give approximations of X_true of as high quality as the solution subspaces defined by the truncated eigenvalue or singular value decompositions of A. Subsequently, we present examples that regularize the discrete ill-posed problem by Tikhonov regularization as described in Section 2.3. The latter examples do not require knowledge of X_true and show how applications to real-world discrete ill-posed problems can be carried out.
In the experiments reported in Table 1, the test matrices are of size 200 × 200 (400 × 400 for the tomo matrix), and the block size is p = 5. We measure the accuracy of the approximation X of X_true determined by each regularization method by the relative error

||X − X_true||_F / ||X_true||_F.

Note: The truncation indexes k_BLT, k_TEIG, k_BGKB, and k_TSVD are determined by the discrepancy principle (12). The test matrix is of size 1000 × 1000 for gravity, and of size 1024 × 1024 for tomo. Abbreviations: BGKB, block Golub-Kahan bidiagonalization; BLT, block Lanczos tridiagonalization; TEIG, truncated eigenvalue decomposition; TSVD, truncated singular value decomposition.

The image encoded in B is contaminated by Gaussian white noise E of level ||E||_F / ||B_true||_F = 10^{−2}. Exact and corrupted images are displayed in the leftmost and central frames of Figure 7, respectively.
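For concreteness, the truncation-index selection by the discrepancy principle can be sketched as follows. This is our own illustration (not the code behind Table 1), shown for TSVD: the index k is increased until the residual drops below η·δ, where δ estimates the noise norm ||E||_F and η > 1 is a safety factor.

```python
import numpy as np

def tsvd_solution(A, B, k):
    """Truncated-SVD approximation of the minimal-norm solution of
    min ||A X - B||_F, keeping the k largest singular triplets.
    (Recomputing the SVD on every call is wasteful; fine for a sketch.)"""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ B) / s[:k, None])

def discrepancy_index(A, B, delta, eta=1.1):
    """Smallest truncation index k with ||A X_k - B||_F <= eta * delta:
    the discrepancy principle, with delta an estimate of the noise norm
    ||E||_F and eta > 1 a user-chosen safety factor."""
    for k in range(1, min(A.shape) + 1):
        X = tsvd_solution(A, B, k)
        if np.linalg.norm(A @ X - B, 'fro') <= eta * delta:
            return k, X
    return min(A.shape), X
```

The same selection rule applies verbatim to the BLT- and BGKB-based approximations by replacing `tsvd_solution` with the projected solver.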
The leftmost frame of Figure 8 displays, in logarithmic scale, the upper bound given in (23) as a function of the number of iterations. Despite this problem being large-scale, the quantities on the right-hand side of (23) can be easily computed by exploiting the Kronecker product structure of A.
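The Kronecker-structure shortcut mentioned here can be made concrete: when A = A1 ⊗ A2, matrix-vector products and the Frobenius norm never require forming A explicitly. A minimal sketch under that assumption (the factor names A1, A2 are our own illustration):

```python
import numpy as np

def kron_apply(A1, A2, x):
    """Compute (A1 kron A2) @ x without forming the Kronecker product,
    using the identity (A1 kron A2) vec(X) = vec(A2 @ X @ A1.T),
    where vec stacks columns (column-major order)."""
    m1 = A1.shape[1]
    m2 = A2.shape[1]
    X = x.reshape(m2, m1, order='F')   # "unvec" x into an m2-by-m1 matrix
    return (A2 @ X @ A1.T).reshape(-1, order='F')

def kron_fro_norm(A1, A2):
    """||A1 kron A2||_F = ||A1||_F * ||A2||_F."""
    return np.linalg.norm(A1) * np.linalg.norm(A2)
```

This reduces the cost of one product from O(n1 n2 m1 m2) to roughly O(n2 m1 m2 + n1 n2 m1) operations.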
The remaining frames of Figure 8 display the values of the relative error and of the regularization parameter versus the number of iterations, both for the regularization method based on BGKB combined with Tikhonov regularization (see Section 2.3) and for a classical regularization method based on GKB, that is, Golub-Kahan bidiagonalization with block size one, combined with Tikhonov regularization; see, for example, References 9,17,18 for discussions of this solution method. Running the methods based on GKB and BGKB took 5.4 and 1.7 s, respectively (note that, in order to compare approximation subspaces of the same dimension, 150 GKB iterations and 50 BGKB iterations were performed).
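The Tikhonov step applied to the projected problem amounts to a small regularized least-squares problem. A generic sketch of that step, under our own naming, with the projected matrix C and projected right-hand side g assumed given:

```python
import numpy as np

def tikhonov_projected(C, g, mu):
    """Solve min_y ||C y - g||_2^2 + mu * ||y||_2^2 by rewriting it as the
    augmented least-squares problem min || [C; sqrt(mu) I] y - [g; 0] ||_2.
    Here mu > 0 is the regularization parameter, chosen, e.g., so that
    the discrepancy principle is satisfied."""
    k = C.shape[1]
    M = np.vstack([C, np.sqrt(mu) * np.eye(k)])
    rhs = np.concatenate([g, np.zeros(k)])
    y, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return y
```

Because C is only of order pk (or k for block size one), this step is cheap and can be repeated for many values of mu while searching for the parameter.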

CONCLUSION
This article applies a few steps of the block Lanczos or BGKB methods to large discrete ill-posed problems to determine an approximate solution by solving a projected problem of fairly small size. The eigenvalues or singular values of the projected matrix are found to approximate the corresponding eigenvalues or singular values of the large matrix quite accurately. This suggests that, in order to determine a solution of a given large discrete ill-posed problem, it often suffices to use a partial Lanczos block tridiagonalization or a partial Golub-Kahan block bidiagonalization, instead of computing partial spectral or singular value decompositions. This is advantageous because the computation of a partial Lanczos block tridiagonalization or a partial Golub-Kahan block bidiagonalization is much cheaper. Computed examples provide illustrations.