A tensor bidiagonalization method for higher-order singular value decomposition with applications

The need to know a few singular triplets associated with the largest singular values of third-order tensors arises in data compression and extraction. This paper describes a new method for their computation using the t-product. Methods for determining a few singular triplets associated with the smallest singular values are also presented. The proposed methods generalize available restarted Lanczos bidiagonalization methods for computing a few of the largest or smallest singular triplets of a matrix. The methods of this paper use Ritz and harmonic Ritz lateral slices to determine accurate approximations of the largest and smallest singular triplets, respectively. Computed examples show applications to data compression and face recognition.

1. Introduction. The last twenty years have seen an immense growth in the amount of data that is collected for analysis, but it remains a challenging problem to extract useful information from the available data. This difficulty arises, e.g., in machine learning, data mining, and deep learning; see, e.g., Arnold et al. [1]. The extraction of useful information from data represented by a matrix is often facilitated by the singular value decomposition of the matrix. Typically, only a few of the largest singular triplets, i.e., the largest singular values and the associated right and left singular vectors, are required to extract useful information from the matrix. A restarted Lanczos bidiagonalization method for computing accurate approximations of these singular triplets is described in [5], and R code written by Bryan W. Lewis is available at [6].
In many recent applications the given data are represented by a multidimensional array. These arrays, known as tensors, are natural generalizations of matrices. Several approaches to define tensor-tensor products and tensor-matrix products are described in the literature, including the n-mode product [9,25], the t-product [22,31], and the c-product [21,30]. Generalizations of the singular value decomposition (SVD) to tensors are described in [25] using the n-mode product (the so-called HOSVD), and in [21,22] using the tensor c-product and t-product. The need to compute the SVD or a partial SVD of a tensor arises in a variety of applications, including image restoration, tensor completion [10], robust tensor principal component analysis [13], tensor compression [3], and recognition of color faces [17,18]. These applications require knowledge of the largest singular values and associated lateral tensor singular slices.
It is the purpose of this paper to introduce a new restarted tensor Lanczos bidiagonalization method for third-order tensors using the t-product for approximating a few of the largest singular values and associated lateral tensor singular slices. This method generalizes the approach described in [5] from matrices to tensors. We remark that the Lanczos bidiagonalization method (also known as the Golub-Kahan bidiagonalization method) for third-order tensors using the t-product has been described in [15,16,22,32]; however, that bidiagonalization method differs from the one of the present paper.
In [5] the authors also describe a restarted Lanczos bidiagonalization method for the computation of a few of the smallest singular values and associated singular vectors of a large matrix by determining harmonic Ritz values. This paper presents an analogous scheme for third-order tensors.

2. The tensor t-product. This section reviews results by Kilmer et al. [22,23] and uses notation employed there and by Kolda and Bader [25]. A third-order tensor is an array A = [a_ijk] ∈ R^{ℓ×p×n}. Matrices and vectors are tensors of order two and one, respectively. A slice or frame of a third-order tensor A is a section obtained by fixing one of the three indices. Using MATLAB notation, A(i,:,:), A(:,j,:), and A(:,:,k) denote the ith horizontal, the jth lateral, and the kth frontal slice of A, respectively. The lateral slice A(:,j,:) is also denoted by A_j, and the frontal slice A(:,:,k) is an ℓ×p matrix that is sometimes denoted by A^(k). A fiber of a third-order tensor A is defined by fixing two of the three indices. The fiber A(i,j,:) is called a tube of A. We will use capital calligraphic letters A to denote third-order tensors, capital letters A to identify matrices, boldface lower case letters a to denote tubes, and lower case letters a to denote scalars. Further, K_n^{ℓ×p} = R^{ℓ×p×n} denotes the space of third-order tensors of size ℓ×p×n, K_n^ℓ = R^{ℓ×1×n} stands for the space of lateral slices of size ℓ×n, and K_n = R^{1×1×n} denotes the space of tubes with n entries. For a third-order tensor A ∈ K_n^{ℓ×p} with frontal slices A^(i), i = 1, ..., n, we define:
• The block circulant matrix associated with A:

  bcirc(A) = [A^(1) A^(n) ... A^(2); A^(2) A^(1) ... A^(3); ... ; A^(n) A^(n−1) ... A^(1)] ∈ R^{ℓn×pn}.   (2.1)

• The operator unfold applied to A gives the ℓn×p matrix made up of its frontal slices,

  unfold(A) = [A^(1); A^(2); ... ; A^(n)].

  We also will need the inverse operator fold such that fold(unfold(A)) = A.
• The block diagonal matrix associated with A is defined as

  bdiag(A) = diag(A^(1), A^(2), ..., A^(n)) ∈ R^{ℓn×pn}.
Definition 1 ([23]). Let A ∈ K_n^{ℓ×q} and B ∈ K_n^{q×p} be third-order tensors. The t-product of A and B is defined by

  A ∗ B := fold(bcirc(A) unfold(B)) ∈ K_n^{ℓ×p}.
The block circulant matrix (2.1) can be block-diagonalized by using the discrete Fourier transform (DFT) as follows:

  (F_n ⊗ I_ℓ) bcirc(A) (F_n^H ⊗ I_ℓ) = bdiag(Â),

where F_n ∈ C^{n×n} is the discrete Fourier matrix, F_n^H denotes its conjugate transpose, Â stands for the Fourier transform of A along each tube, I_ℓ ∈ R^{ℓ×ℓ} denotes the identity matrix, and ⊗ is the Kronecker product. The tensor Â can be computed with the fast Fourier transform (FFT) algorithm; see [23] for details. Using MATLAB notation, we have Â = fft(A, [ ], 3).
The inverse operation can be evaluated in MATLAB with the command A = ifft(Â, [ ], 3). Hence, the t-product C = A ∗ B can be evaluated as

  Ĉ^(i) = Â^(i) B̂^(i),  i = 1, 2, ..., n,   (2.2)

where Â^(i), B̂^(i), and Ĉ^(i) are the ith frontal slices of the tensors Â, B̂, and Ĉ, respectively, and C = ifft(Ĉ, [ ], 3). As already pointed out by Kilmer et al. [22], one can use symmetry properties of the DFT when applied to real data to reduce the computational effort when evaluating the t-product with the FFT. This is described by the following result, which can be found, e.g., in [33].
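As a concrete illustration, the Fourier-domain evaluation (2.2) can be sketched in Python/NumPy. The helper names `t_product`, `bcirc`, and `unfold` are our own, the indexing is 0-based, and this is an illustrative sketch rather than the paper's implementation:

```python
import numpy as np

def t_product(A, B):
    """t-product A * B of A (l x q x n) and B (q x p x n): frontal-slice
    matrix products in the Fourier domain along the third (tube) axis."""
    Ah = np.fft.fft(A, axis=2)
    Bh = np.fft.fft(B, axis=2)
    Ch = np.einsum('ijk,jlk->ilk', Ah, Bh)   # slice-wise products (2.2)
    return np.real(np.fft.ifft(Ch, axis=2))

def bcirc(A):
    """Block circulant matrix (2.1) built from the frontal slices of A."""
    l, p, n = A.shape
    M = np.zeros((l * n, p * n))
    for i in range(n):
        for j in range(n):
            M[i*l:(i+1)*l, j*p:(j+1)*p] = A[:, :, (i - j) % n]
    return M

def unfold(A):
    """Stack the frontal slices vertically into an (ln) x p matrix."""
    return np.vstack([A[:, :, k] for k in range(A.shape[2])])

# The two formulations agree: fold(bcirc(A) unfold(B)) equals A * B.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 5))
B = rng.standard_normal((3, 2, 5))
C = t_product(A, B)
err = np.max(np.abs(bcirc(A) @ unfold(B) - unfold(C)))
```

The discrepancy `err` between the bcirc/unfold definition and the FFT evaluation is at machine-precision level, which is exactly the diagonalization property stated above.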
Lemma 1. Given a real vector v ∈ R^n, the associated DFT vector v̂ = F_n v satisfies

  v̂_1 ∈ R,  conj(v̂_i) = v̂_{n−i+2},  i = 2, 3, ..., ⌊n/2⌋ + 1,

where conj denotes the complex conjugation operator. It follows that for a third-order tensor A ∈ K_n^{ℓ×p}, the frontal slices of Â satisfy

  Â^(1) ∈ R^{ℓ×p},  conj(Â^(i)) = Â^(n−i+2),  i = 2, 3, ..., ⌊n/2⌋ + 1.

This shows that the t-product of two third-order tensors can be determined by evaluating just about half the number of products involved in (2.2). Algorithm 1 describes the computations.
Algorithm 1 Computation of the t-product C = A ∗ B using the FFT and the DFT symmetry.
Input: A ∈ K_n^{ℓ×q}, B ∈ K_n^{q×p}.
Output: C = A ∗ B ∈ K_n^{ℓ×p}.
1: Â = fft(A, [ ], 3); B̂ = fft(B, [ ], 3).
2: for i = 1 to ⌈(n+1)/2⌉ do
3:  Ĉ^(i) = Â^(i) B̂^(i).
4: end for
5: for i = ⌈(n+1)/2⌉ + 1 to n do
6:  Ĉ^(i) = conj(Ĉ^(n−i+2)).
7: end for
8: C = ifft(Ĉ, [ ], 3).
The following definition is concerned with the t-product of a third-order tensor and a tube.

Definition 2. Let A ∈ K_n^{ℓ×p} and b ∈ K_n. Then C := A ∗ b ∈ K_n^{ℓ×p} is obtained by applying the inverse DFT along each tube of Ĉ, where each frontal slice of Ĉ is determined by the standard matrix product of each frame of Â with the corresponding Fourier coefficient of b, i.e., Ĉ^(i) = Â^(i) b̂^(i), i = 1, 2, ..., n.

A third-order tensor A ∈ K_n^{ℓ×p} can be written as A = [A_1, A_2, ..., A_p] with A_j = A(:, j, :); thus, for the tensors A ∈ K_n^{ℓ×q} and B ∈ K_n^{q×p}, the t-product A ∗ B can be expressed as A ∗ B = [A ∗ B_1, A ∗ B_2, ..., A ∗ B_p].

The Frobenius norm of a third-order tensor A ∈ K_n^{ℓ×p} is given by ‖A‖_F = (Σ_{i,j,k} a_{ijk}²)^{1/2}, and the inner product of two third-order tensors of the same size A, B ∈ K_n^{ℓ×p} is defined as ⟨A, B⟩ = Σ_{i,j,k} a_{ijk} b_{ijk}. We have the relations ⟨A, A⟩ = ‖A‖_F² and ‖A‖_F = n^{−1/2} ‖Â‖_F.

We recall for later use the definitions of some special tensors and operations:
• The identity tensor I_ℓ ∈ K_n^{ℓ×ℓ} is the tensor whose first frontal slice is the ℓ×ℓ identity matrix and all other frontal slices have zero entries only.
• The transpose of a real third-order tensor A ∈ K_n^{ℓ×p}, denoted by A^H ∈ K_n^{p×ℓ}, is the tensor obtained by first transposing each one of the frontal slices of A, and then reversing the order of the transposed frontal slices 2 through n; see [23]. Let the third-order tensors A and B be such that the products A ∗ B and B^H ∗ A^H are defined. Then, similarly to the matrix transpose, the tensor transpose satisfies (A ∗ B)^H = B^H ∗ A^H.
• A tensor Q ∈ K_n^{ℓ×ℓ} is said to be orthogonal if and only if Q^H ∗ Q = Q ∗ Q^H = I_ℓ.
• A square third-order tensor A ∈ K_n^{ℓ×ℓ} is invertible if there is a third-order tensor B ∈ K_n^{ℓ×ℓ} such that A ∗ B = B ∗ A = I_ℓ. In this case B is said to be the inverse of A, and is denoted by A^{−1}.
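The transpose and identity tensors above are easy to realize along the same Fourier-domain lines. The following Python sketch (with helper names of our own choosing) checks the transpose rule (A ∗ B)^H = B^H ∗ A^H and the identity property numerically:

```python
import numpy as np

def t_product(A, B):
    Ah, Bh = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum('ijk,jlk->ilk', Ah, Bh), axis=2))

def t_transpose(A):
    """Transpose each frontal slice, then reverse slices 2, ..., n."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, -1:0:-1]], axis=2)

def t_identity(l, n):
    """First frontal slice is the l x l identity; the rest are zero."""
    I = np.zeros((l, l, n))
    I[:, :, 0] = np.eye(l)
    return I

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3, 6))
B = rng.standard_normal((3, 2, 6))
lhs = t_transpose(t_product(A, B))           # (A * B)^H
rhs = t_product(t_transpose(B), t_transpose(A))  # B^H * A^H
idem = t_product(A, t_identity(3, 6))        # A * I = A
```

The slice-reversal in `t_transpose` corresponds to complex conjugation in the Fourier domain, which is why the familiar matrix transpose identities carry over.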

Definition 3 ([22]). Let A_i ∈ K_n^ℓ for i = 1, 2, ..., p be the lateral slices of the tensor A ∈ K_n^{ℓ×p}. A t-linear combination of these slices is defined as

  A_1 ∗ b_1 + A_2 ∗ b_2 + ... + A_p ∗ b_p,

where the b_i for i = 1, 2, ..., p are tubes in K_n. Moreover,

  span{A_1, A_2, ..., A_p} = { Σ_{i=1}^p A_i ∗ b_i : b_i ∈ K_n }.

The tensor singular value decomposition (t-SVD) associated with the t-product, introduced by Kilmer and Martin [23], generalizes the classical SVD of a matrix. It is described in the next theorem.

Theorem 4 ([23]). Let A ∈ K_n^{ℓ×p} be a third-order tensor. Then it can be represented as the t-product of three third-order tensors,

  A = U ∗ S ∗ V^H,   (2.3)

where U ∈ K_n^{ℓ×ℓ} and V ∈ K_n^{p×p} are orthogonal tensors, and S ∈ K_n^{ℓ×p} is an f-diagonal tensor, i.e., each frontal slice of the DFT of S is a diagonal matrix.
Algorithm 2 summarizes the computation of the t-SVD of a third-order tensor with the aid of the FFT.
Algorithm 2 The t-SVD of a third-order tensor.
Input: A ∈ K_n^{ℓ×p}.
Output: U ∈ K_n^{ℓ×ℓ}, S ∈ K_n^{ℓ×p}, V ∈ K_n^{p×p}.
1: Â = fft(A, [ ], 3).
2: for i = 1 to n do
3:  [Û^(i), Ŝ^(i), V̂^(i)] = svd(Â^(i)).
4: end for
5: U = ifft(Û, [ ], 3), S = ifft(Ŝ, [ ], 3), V = ifft(V̂, [ ], 3).
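A Python/NumPy sketch of this computation follows. It exploits the conjugate symmetry of Lemma 1, so only about half of the frontal SVDs are computed and the factors come out real; the helper names are our own and the code is illustrative, not the authors' implementation:

```python
import numpy as np

def t_product(A, B):
    Ah, Bh = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum('ijk,jlk->ilk', Ah, Bh), axis=2))

def t_transpose(A):
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, -1:0:-1]], axis=2)

def t_svd(A):
    """t-SVD A = U * S * V^H via SVDs of the Fourier-domain frontal slices."""
    l, p, n = A.shape
    Ah = np.fft.fft(A, axis=2)
    Uh = np.zeros((l, l, n), complex)
    Sh = np.zeros((l, p, n), complex)
    Vh = np.zeros((p, p, n), complex)
    for i in range(n // 2 + 1):              # first half of the slices
        U, s, Vt = np.linalg.svd(Ah[:, :, i])
        Uh[:, :, i], Vh[:, :, i] = U, Vt.conj().T
        for j, sv in enumerate(s):
            Sh[j, j, i] = sv
    for i in range(n // 2 + 1, n):           # remaining slices by symmetry
        Uh[:, :, i] = Uh[:, :, n - i].conj()
        Sh[:, :, i] = Sh[:, :, n - i].conj()
        Vh[:, :, i] = Vh[:, :, n - i].conj()
    to_real = lambda T: np.real(np.fft.ifft(T, axis=2))
    return to_real(Uh), to_real(Sh), to_real(Vh)

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 4, 3))
U, S, V = t_svd(A)
rec = t_product(t_product(U, S), t_transpose(V))          # reconstructs A
sig = [np.linalg.norm(S[i, i, :]) for i in range(4)]      # tube norms sigma_i
```

The reconstruction `rec` matches `A` to machine precision, and the tube norms `sig` come out in decreasing order, as required of the singular tubes below.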
The factorization (2.3) can be expressed as

  A = Σ_{i=1}^{min(ℓ,p)} U_i ∗ s_i ∗ V_i^H,

where the s_i = S(i, i, :) are singular tubes, and U_i = U(:, i, :) and V_i = V(:, i, :) are left and right lateral tensor singular slices, respectively, for i = 1, 2, ..., min(ℓ, p). The triplets {s_i, U_i, V_i}_{i=1:min(ℓ,p)} will be referred to as singular triplets of the tensor A. The singular tubes are ordered so that their norms σ_i = ‖s_i‖_F are decreasing with i, i.e., σ_1 ≥ σ_2 ≥ ... ≥ σ_{min(ℓ,p)} ≥ 0.
Note that we also have the relations A ∗ V_i = U_i ∗ s_i and A^H ∗ U_i = V_i ∗ s_i for i = 1, 2, ..., min(ℓ, p). We remark that the latter relations have to be modified if A has complex-valued entries. We note for future reference that

  S(i, i, 1) = (1/n) Σ_{j=1}^n Ŝ(i, i, j).   (2.4)

In the following, we will need the notion of rank of a third-order tensor.

Definition 5. Let A ∈ K_n^{ℓ×p} be a third-order tensor. Then its tubal rank is defined as

  rank_t(A) = card{ i : σ_i ≠ 0 },

where σ_i is the norm of the singular tube s_i of A and card stands for the cardinality.
The next result generalizes the Eckart-Young theorem for matrices to third-order tensors. It is important in the context of data compression.
Theorem 6 ([3,23]). Let the t-SVD of a third-order tensor A ∈ K_n^{ℓ×p} be given by A = U ∗ S ∗ V^H. For 1 ≤ k ≤ min{ℓ, p}, define the truncated t-SVD

  A_k = Σ_{i=1}^k U_i ∗ s_i ∗ V_i^H.

Then

  A_k = arg min_{X ∈ M} ‖A − X‖_F,

where M is the set given by M = { X ∗ Y : X ∈ K_n^{ℓ×k}, Y ∈ K_n^{k×p} }.

The matrix QR factorization also can be generalized to tensors.

Theorem 7 ([23]). Let A ∈ K_n^{ℓ×p}. Then A can be factored as

  A = Q ∗ R,   (2.5)

where Q ∈ K_n^{ℓ×ℓ} is an orthogonal tensor and R ∈ K_n^{ℓ×p} is an f-upper triangular tensor, i.e., each frontal slice of the DFT of R is an upper triangular matrix. The factorization (2.5) is referred to as the t-QR factorization of A.
Algorithm 3 summarizes the computation of the t-QR factorization (2.5). The function qr in line 3 of the algorithm computes a QR factorization of the matrix Â^(i) ∈ C^{ℓ×p}; thus Â^(i) = Q̂^(i) R̂^(i), where the matrix Q̂^(i) ∈ C^{ℓ×ℓ} is unitary and the matrix R̂^(i) ∈ C^{ℓ×p} has an upper triangular leading principal submatrix of order min{ℓ, p}.
Algorithm 3 t-QR factorization of a third-order tensor.
Input: A ∈ K_n^{ℓ×p}.
Output: Q ∈ K_n^{ℓ×ℓ}, R ∈ K_n^{ℓ×p}.
1: Â = fft(A, [ ], 3).
2: for i = 1 to n do
3:  [Q̂^(i), R̂^(i)] = qr(Â^(i)).
4: end for
5: Q = ifft(Q̂, [ ], 3), R = ifft(R̂, [ ], 3).
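Algorithm 3 admits the same Fourier-domain realization as the t-SVD. A hedged Python sketch (our own helper names; the DFT symmetry is used as in Algorithm 1 so the factors are real):

```python
import numpy as np

def t_product(A, B):
    Ah, Bh = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum('ijk,jlk->ilk', Ah, Bh), axis=2))

def t_transpose(A):
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, -1:0:-1]], axis=2)

def t_qr(A):
    """t-QR A = Q * R via QR factorizations of the Fourier-domain slices."""
    l, p, n = A.shape
    Qh = np.zeros((l, l, n), complex)
    Rh = np.zeros((l, p, n), complex)
    Ah = np.fft.fft(A, axis=2)
    for i in range(n // 2 + 1):
        Qh[:, :, i], Rh[:, :, i] = np.linalg.qr(Ah[:, :, i], mode='complete')
    for i in range(n // 2 + 1, n):           # conjugate symmetry
        Qh[:, :, i] = Qh[:, :, n - i].conj()
        Rh[:, :, i] = Rh[:, :, n - i].conj()
    return np.real(np.fft.ifft(Qh, axis=2)), np.real(np.fft.ifft(Rh, axis=2))

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3, 4))
Q, R = t_qr(A)
rec = t_product(Q, R)                     # reconstructs A
gram = t_product(t_transpose(Q), Q)       # should be the identity tensor
```

Since each Fourier-domain slice of Q is unitary, the tensor Q is orthogonal in the sense defined above, which the Gram tensor check confirms.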
Following Kilmer et al. [22], we define orthogonality of lateral tensor slices. Let X and Y be two lateral tensor slices in K_n^ℓ, and define the inner product of these slices as

  ⟨X, Y⟩ = X^H ∗ Y ∈ K_n.

The lateral slices in the set

  {Q_1, Q_2, ..., Q_p},   (2.6)

with p ≥ 2, are said to be orthogonal if

  ⟨Q_i, Q_j⟩ = α_i e_1 if i = j, and 0 otherwise,

where e_1 is the tube in K_n whose first element is 1 and whose remaining elements vanish, and the α_i, i = 1, 2, ..., p, are nonvanishing scalars. Furthermore, if α_i = 1 for all i = 1, 2, ..., p, then the set (2.6) is said to be orthonormal. Following [22], we observe that any lateral slice X ∈ K_n^ℓ can be normalized as

  X = Y ∗ a,  with ‖Y‖ = 1 and a ∈ K_n.   (2.7)

Here the tensor norm of a nonzero lateral slice X is defined as ‖X‖ = ‖⟨X, X⟩‖_F / ‖X‖_F, and ‖X‖ = 0 for X = 0. Note that Y has unit norm if and only if ⟨Y, Y⟩ = e_1; see [22] for more detail. Algorithm 4 summarizes the normalization process. The MATLAB function randn in the algorithm generates a vector in R^ℓ with normally distributed pseudorandom entries with mean zero and variance one.
Algorithm 4 Normalize(X).
Input: X ∈ K_n^ℓ.
Output: Y ∈ K_n^ℓ of unit norm and a ∈ K_n that satisfy (2.7).
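In the Fourier domain, the normalization (2.7) amounts to scaling each frontal slice of X̂ to unit 2-norm. The following Python sketch of this idea is our own (the random fallback for a zero slice mirrors the use of randn mentioned above):

```python
import numpy as np

def t_product(A, B):
    Ah, Bh = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum('ijk,jlk->ilk', Ah, Bh), axis=2))

def t_transpose(A):
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, -1:0:-1]], axis=2)

def normalize(X, tol=1e-12):
    """Return Y of unit norm and a tube a such that X = Y * a."""
    l, _, n = X.shape
    Xh = np.fft.fft(X, axis=2)
    Yh = np.zeros_like(Xh)
    ah = np.zeros(n)
    for j in range(n):
        nrm = np.linalg.norm(Xh[:, :, j])
        if nrm > tol:
            Yh[:, :, j] = Xh[:, :, j] / nrm
            ah[j] = nrm
        else:                        # zero slice: random direction, zero weight
            v = np.random.randn(l, 1)
            Yh[:, :, j] = v / np.linalg.norm(v)
    Y = np.real(np.fft.ifft(Yh, axis=2))
    a = np.real(np.fft.ifft(ah)).reshape(1, 1, n)
    return Y, a

rng = np.random.default_rng(4)
X = rng.standard_normal((6, 1, 5))
Y, a = normalize(X)
recon = t_product(Y, a)                    # recovers X
inner = t_product(t_transpose(Y), Y)       # <Y, Y>, should equal e_1
```

For real X, the slice norms inherit the conjugate symmetry of the DFT, so both Y and the tube a come out real, and ⟨Y, Y⟩ = e_1 confirms that Y has unit norm.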
3. Tensor Lanczos bidiagonalization for computing the largest and smallest singular triplets. This section describes the Lanczos bidiagonalization process for tensors using the t-product, and discusses how approximations of the largest and smallest singular triplets of a large third-order tensor A ∈ K_n^{ℓ×p} can be computed.

3.1. The tensor Lanczos bidiagonalization algorithm.
The Lanczos bidiagonalization process was introduced for matrices by Golub and Kahan [14] and therefore is sometimes referred to as the Golub-Kahan bidiagonalization process. For a matrix A ∈ R^{ℓ×p}, this process is closely related to the symmetric Lanczos process applied to the real symmetric matrices AA^T and A^T A, or alternatively to the symmetric matrix [O A; A^T O]. Lanczos bidiagonalization algorithms have been applied to solve numerous problems, such as large-scale least squares problems [28], the approximation of the largest or smallest singular triplets of a large matrix [5,19,24], and Tikhonov regularization of large linear discrete ill-posed problems; see, e.g., [11,12]. We note that the bidiagonalization method described in [28] and applied in [11,12] reduces a large matrix A to a small lower bidiagonal matrix, while in [5] the matrix A is reduced to a small upper bidiagonal matrix. We will review the latter approach.
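To fix ideas, here is a minimal Python sketch of the matrix-case process with an upper bidiagonal projection, in the spirit of [5] (an illustrative implementation of ours with full reorthogonalization, not the code of [5,6]):

```python
import numpy as np

def lanczos_bidiag(A, p1, m):
    """m steps of Golub-Kahan bidiagonalization with full reorthogonalization.
    Returns P, Q with orthonormal columns, upper bidiagonal B = Q^T A P, and
    beta, p_next such that A P = Q B and A^T Q = P B^T + beta p_next e_m^T."""
    l, p = A.shape
    P = np.zeros((p, m)); Q = np.zeros((l, m)); B = np.zeros((m, m))
    P[:, 0] = p1 / np.linalg.norm(p1)
    q = A @ P[:, 0]
    alpha = np.linalg.norm(q)
    Q[:, 0] = q / alpha
    B[0, 0] = alpha
    for i in range(m):
        r = A.T @ Q[:, i] - alpha * P[:, i]
        r -= P[:, :i+1] @ (P[:, :i+1].T @ r)     # reorthogonalize
        beta = np.linalg.norm(r)
        if i == m - 1:
            return P, Q, B, beta, r / beta
        P[:, i+1] = r / beta
        B[i, i+1] = beta
        q = A @ P[:, i+1] - beta * Q[:, i]
        q -= Q[:, :i+1] @ (Q[:, :i+1].T @ q)     # reorthogonalize
        alpha = np.linalg.norm(q)
        Q[:, i+1] = q / alpha
        B[i+1, i+1] = alpha

rng = np.random.default_rng(5)
A = rng.standard_normal((8, 6))
P, Q, B, beta, p_next = lanczos_bidiag(A, rng.standard_normal(6), 4)
```

On a generic random matrix both decompositions hold to machine precision; a vanishing beta (breakdown) is not handled in this sketch.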
Application of m ≪ min{ℓ, p} steps of the Lanczos bidiagonalization process to the matrix A ∈ R^{ℓ×p} with the initial unit vector p_1 ∈ R^p generically produces two matrices P_m = [p_1, ..., p_m] ∈ R^{p×m} and Q_m = [q_1, ..., q_m] ∈ R^{ℓ×m} with orthonormal columns. The columns of P_m and Q_m form orthonormal bases for the Krylov subspaces

  K_m(A^T A, p_1) = span{p_1, A^T A p_1, ..., (A^T A)^{m−1} p_1},
  K_m(A A^T, q_1) = span{q_1, A A^T q_1, ..., (A A^T)^{m−1} q_1},

respectively, where q_1 = A p_1 / ‖A p_1‖_2. A matrix interpretation of the recursion relations of the Lanczos process gives the matrix relations

  A P_m = Q_m B_m,   (3.1)
  A^T Q_m = P_m B_m^T + β_m p_{m+1} e_m^T,   (3.2)

where e_m = [0, ..., 0, 1]^T ∈ R^m, β_m ≥ 0 is a scalar, and p_{m+1} ∈ R^p. The matrix B_m ∈ R^{m×m} is upper bidiagonal and satisfies B_m = Q_m^T A P_m. When considering bidiagonalization of a third-order tensor A using the t-product, the scalars and the columns of the matrices P_m and Q_m in the matrix decompositions (3.1) and (3.2) become tubes and lateral slices, respectively, in the decompositions determined by the tensor Lanczos bidiagonalization process. The application of m steps of tensor Lanczos bidiagonalization to the third-order tensor A ∈ K_n^{ℓ×p} generically computes two tensors P_m ∈ K_n^{p×m} and Q_m ∈ K_n^{ℓ×m} whose lateral slices form bases for the tensor Krylov subspaces

  K_m(A^H ∗ A, P_1) = span{P_1, A^H ∗ A ∗ P_1, ..., (A^H ∗ A)^{m−1} ∗ P_1},
  K_m(A ∗ A^H, Q_1) = span{Q_1, A ∗ A^H ∗ Q_1, ..., (A ∗ A^H)^{m−1} ∗ Q_1},

where P_1 ∈ K_n^p is a lateral slice of unit norm, and the lateral slice Q_1 ∈ K_n^ℓ is of unit norm and proportional to A ∗ P_1. Algorithm 5 describes the tensor Lanczos bidiagonalization algorithm.
Algorithm 5 Tensor Lanczos bidiagonalization using the t-product.
Input: A ∈ K_n^{ℓ×p}, number of steps m ≤ min{ℓ, p}, P_1 ∈ K_n^p with unit norm.
Output: P_m = [P_1, P_2, ..., P_m] ∈ K_n^{p×m} and Q_m = [Q_1, Q_2, ..., Q_m] ∈ K_n^{ℓ×m} with orthonormal lateral slices, an upper bidiagonal tensor B_m ∈ K_n^{m×m} with diagonal tubes α_i and superdiagonal tubes β_i, and R_m ∈ K_n^p.
1: [Q_1, α_1] = Normalize(A ∗ P_1).
2: for i = 1 to m do
3:  R_i = A^H ∗ Q_i − P_i ∗ α_i.
4:  Reorthogonalize: R_i = R_i − Σ_{j=1}^i P_j ∗ ⟨P_j, R_i⟩.
5:  if i < m then
6:   [P_{i+1}, β_i] = Normalize(R_i).
7:   W = A ∗ P_{i+1} − Q_i ∗ β_i.
8:   Reorthogonalize: W = W − Σ_{j=1}^i Q_j ∗ ⟨Q_j, W⟩.
9:   [Q_{i+1}, α_{i+1}] = Normalize(W).
10:  end if
11: end for

We remark that Algorithm 5 differs from the tensor bidiagonalization algorithms described in [22,32] in that the former produces an upper bidiagonal tensor B_m, while the latter determine a lower bidiagonal tensor. The use of an upper bidiagonal tensor in the present paper is inspired by the choices in [5,14]. Algorithm 5 is said to break down when one of the tensor slices R_i or Q_{i+1} vanishes. We comment below on this situation, but note that breakdown is exceedingly rare.
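The tensor process can be sketched in Python/NumPy along the lines of the matrix sketch above. All helper names are ours, the listing assumes no breakdown (no vanishing tube), and it is a simplified illustration rather than the authors' implementation:

```python
import numpy as np

def t_product(A, B):
    Ah, Bh = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum('ijk,jlk->ilk', Ah, Bh), axis=2))

def t_transpose(A):
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, -1:0:-1]], axis=2)

def normalize(X):
    Xh = np.fft.fft(X, axis=2)
    ah = np.linalg.norm(Xh, axis=(0, 1))      # slice-wise norms
    Yh = Xh / ah                              # assumes no zero slice
    Y = np.real(np.fft.ifft(Yh, axis=2))
    a = np.real(np.fft.ifft(ah)).reshape(1, 1, -1)
    return Y, a

def t_lanczos_bidiag(A, P1, m):
    """m steps of tensor Lanczos bidiagonalization: A * P = Q * B with an
    upper bidiagonal tensor B (diagonal tubes alpha_i, superdiagonal beta_i)."""
    l, p, n = A.shape
    P = np.zeros((p, m, n)); Q = np.zeros((l, m, n)); B = np.zeros((m, m, n))
    P[:, 0:1, :] = P1
    Qi, a = normalize(t_product(A, P1))
    Q[:, 0:1, :] = Qi; B[0, 0, :] = a.ravel()
    for i in range(m - 1):
        R = t_product(t_transpose(A), Q[:, i:i+1, :]) - t_product(P[:, i:i+1, :], a)
        for j in range(i + 1):                # reorthogonalize against P_j
            Pj = P[:, j:j+1, :]
            R = R - t_product(Pj, t_product(t_transpose(Pj), R))
        Pn, b = normalize(R)
        P[:, i+1:i+2, :] = Pn; B[i, i+1, :] = b.ravel()
        W = t_product(A, Pn) - t_product(Q[:, i:i+1, :], b)
        for j in range(i + 1):                # reorthogonalize against Q_j
            Qj = Q[:, j:j+1, :]
            W = W - t_product(Qj, t_product(t_transpose(Qj), W))
        Qi, a = normalize(W)
        Q[:, i+1:i+2, :] = Qi; B[i+1, i+1, :] = a.ravel()
    return P, Q, B

rng = np.random.default_rng(6)
A = rng.standard_normal((7, 5, 3))
P1, _ = normalize(rng.standard_normal((5, 1, 3)))
P, Q, B = t_lanczos_bidiag(A, P1, 4)
```

On this random example the relation A ∗ P_m = Q_m ∗ B_m and the orthonormality of the lateral slices of P_m hold at machine-precision level; a vanishing β would make `normalize` divide by zero and is not handled in this sketch.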
Theorem 8. Generically, Algorithm 5 determines the decompositions

  A ∗ P_m = Q_m ∗ B_m,   (3.3)
  A^H ∗ Q_m = P_m ∗ B_m^H + R_m ∗ E_m^H,   (3.4)

where P_m ∈ K_n^{p×m} and Q_m ∈ K_n^{ℓ×m} have orthonormal lateral slices. The tensor E_m ∈ K_n^m is the canonical lateral slice whose elements are zero except for the first element of the mth tube, which equals 1, and R_m ∈ K_n^p is determined by Algorithm 5 such that P_m^H ∗ R_m = O. The tensor B_m ∈ K_n^{m×m} is upper bidiagonal, each of whose frontal slices is an upper bidiagonal matrix. Thus,

  B_m = [ α_1 β_1          ;
              α_2 β_2       ;
                  ...  β_{m−1} ;
                      α_m ],

where the α_i and β_i are tubes in K_n. The proof is closely related to the proof of the existence of the relations (3.1) and (3.2), and the properties of the matrices involved; the latter relations are used in [5]. The relations (3.3) and (3.4) follow immediately from the recursion relations of Algorithm 5, and the orthonormality of the lateral slices of P_m and Q_m can be shown by induction.
The Lanczos bidiagonalization process may suffer from loss of orthogonality of the lateral slices of the tensors P_m and Q_m. Therefore, reorthogonalization is carried out in Algorithm 5. We remark that reorthogonalization makes the algorithm more costly both in terms of storage and arithmetic floating point operations. The extra cost may be acceptable as long as the number of steps m is fairly small; see [5,34] for discussions of the matrix case.
Let R_m be the residual lateral slice determined by Algorithm 5, and let [P_{m+1}, β_m] = Normalize(R_m), so that

  R_m = P_{m+1} ∗ β_m.   (3.5)

In the rare event that some β_j, 1 ≤ j < m, vanishes, Algorithm 5 breaks down. Then the singular tubes of B_j are singular tubes of A, and the left and right lateral tensor singular slices are obtained as described below. When no breakdown takes place, we can express equation (3.4) as

  A^H ∗ Q_m = P_{m+1} ∗ B_{m,m+1}^H,

where P_{m+1} = [P_m, P_{m+1}] is obtained from P_m by appending the lateral slice P_{m+1} defined in (3.5), and B_{m,m+1} ∈ K_n^{m×(m+1)} is obtained from B_m by appending the lateral slice β_m ∗ E_m.

We turn to the connection between the partial Lanczos bidiagonalization of a third-order tensor and the partial Lanczos tridiagonalization of the tensor A^H ∗ A. This connection will be used later. Multiplying (3.3) from the left by A^H, we get

  A^H ∗ A ∗ P_m = P_m ∗ T_m + R_m ∗ α_m ∗ E_m^H,   (3.6)

where T_m is the symmetric tridiagonal tensor defined by T_m = B_m^H ∗ B_m. Then (3.6) is a partial tensor Lanczos tridiagonalization of A^H ∗ A with initial lateral slice P_1 = P_m ∗ E_1. The lateral slices of P_m form an orthonormal basis for the tensor Krylov subspace K_m(A^H ∗ A, P_1). Similarly, multiplying (3.4) from the left by A, we obtain

  A ∗ A^H ∗ Q_m = Q_m ∗ B_m ∗ B_m^H + A ∗ R_m ∗ E_m^H.   (3.7)

It follows that the lateral slices of Q_m form an orthonormal basis for the Krylov subspace K_m(A ∗ A^H, Q_1).

3.2. Approximating singular tubes and singular lateral slices. We describe an approach to approximate the largest or smallest singular triplets (singular tubes and associated left and right lateral singular slices) of a large tensor A ∈ K_n^{ℓ×p} using restarted partial tensor Lanczos bidiagonalization. Since the tensor A is large, computing its k largest or smallest singular triplets by determining the t-SVD of A is very expensive. The idea is to approximate the extreme singular triplets of the tensor A by determining the extreme singular triplets of the bidiagonal tensor B_m, where m is small. Let {s_i, U_i, V_i}, 1 ≤ i ≤ m, denote the singular triplets of B_m. They satisfy

  B_m ∗ V_i = U_i ∗ s_i,  B_m^H ∗ U_i = V_i ∗ s_i,  i = 1, 2, ..., m.   (3.12)

The k ≤ m largest singular triplets of A are approximated by the triplets

  s^A_{i,m} = s_i,  U^A_{i,m} = Q_m ∗ U_i,  V^A_{i,m} = P_m ∗ V_i,  i = 1, 2, ..., k.

For i = 1, 2, ..., k, we have

  A ∗ V^A_{i,m} = U^A_{i,m} ∗ s^A_{i,m},
  A^H ∗ U^A_{i,m} = V^A_{i,m} ∗ s^A_{i,m} + R_m ∗ E_m^H ∗ U_i.

To accept {s^A_{i,m}, U^A_{i,m}, V^A_{i,m}} as an approximate singular triplet of A, the remainder term R_m ∗ E_m^H ∗ U_i should be small enough. We can bound the remainder term by using (3.5). Analogously as in [5], we require for 1 ≤ s ≤ n that

  |(β_m ∗ E_m^H ∗ U_i)^(s)| ≤ δ_1 |(s^A_{i,m})^(s)|   (3.9)

for a user-chosen parameter δ_1 > 0, where (s^A_{j,m})^(s) denotes the sth element of the jth approximate singular tube of A. Using eq. (2.4), these elementwise conditions can be combined into a single condition in terms of Frobenius norms, with a user-specified parameter δ > 0.
To keep the storage requirement fairly small for large-scale problems, we would like the number of steps m of the tensor Lanczos bidiagonalization process to be small. However, when m is small, it may not be possible to approximate the desired singular triplets sufficiently accurately using the available Krylov subspaces. A remedy for this situation is to restart the tensor Lanczos bidiagonalization process. The idea is to repeatedly update the initial lateral slices used for the tensor Lanczos bidiagonalization process, and in this way determine a sequence of increasingly more appropriate Krylov subspaces, until the k desired singular triplets have been found with the required accuracy. We remark that restarting techniques have been used for computing a few desired singular triplets or eigenvalue-eigenvector pairs of a large matrix, where properties of Ritz vectors, harmonic Ritz vectors, and refined Ritz vectors have been exploited; see, e.g., [5,19,20,35,36] for details.
3.3. Augmentation by Ritz lateral slices. Assume that we would like to approximate the k largest singular triplets of A ∈ R^{ℓ×p×n}. To this end, we carry out m > k steps of tensor Lanczos bidiagonalization as described in the previous subsection. The approximate right singular lateral slice V^A_{i,m} is a Ritz lateral slice of A^H ∗ A associated with the Ritz tube (s^A_{i,m})² = s^A_{i,m} ∗ s^A_{i,m} for i ∈ {1, 2, ..., m}. In what follows we show some results that help us to approximate the largest or smallest singular triplets of a third-order tensor. The idea behind these results is to find equations that are analogous to (3.3) and (3.4), such that the reduced tensor contains the k approximate singular tubes among its first k elements on the diagonal, the right projection tensor contains the k right Ritz lateral slices among its first k lateral slices, and the left projection tensor contains the k left Ritz lateral slices among its first k lateral slices. The following theorem will be helpful.
Theorem 9. Assume that m steps of Algorithm 5 have been applied to the third-order tensor A ∈ K_n^{ℓ×p}, and suppose that β_m in (3.4) is nonvanishing. Then for k < m, we have

  A ∗ P_{k+1} = Q_{k+1} ∗ B_{k+1},   (3.17)
  A^H ∗ Q_{k+1} = P_{k+1} ∗ B_{k+1}^H + β_{k+1} ∗ P_{k+2} ∗ E_{k+1}^H,   (3.19)

where P_{k+1} ∈ K_n^{p×(k+1)} and Q_{k+1} ∈ K_n^{ℓ×(k+1)} have orthonormal lateral slices, and the first k lateral slices of P_{k+1} are the first k Ritz lateral slices of A. B_{k+1} ∈ K_n^{(k+1)×(k+1)} is an upper triangular tensor, P_{k+2} ∈ K_n^p is a lateral slice that is orthogonal to P_{k+1}, β_{k+1} ∈ K_n, and E_{k+1} is the canonical element under the t-product.

Proof. Let the Ritz lateral slices V^A_{i,m} for 1 ≤ i ≤ k be associated with the k Ritz tubes of A. Introduce the tensor

  P_{k+1} = [V^A_{1,m}, ..., V^A_{k,m}, P_{m+1}],

where P_{m+1} is given by (3.5). Then, using the fact that A ∗ V^A_{i,m} = U^A_{i,m} ∗ s^A_{i,m} for i = 1, 2, ..., k, we obtain the first k lateral slices of A ∗ P_{k+1}. Orthogonalizing the term A ∗ P_{m+1} against {U^A_{i,m}}_{i=1:k} gives

  A ∗ P_{m+1} = Σ_{i=1}^k U^A_{i,m} ∗ ρ_i + R_k,

where R_k is orthogonal to {U^A_{i,m}}_{i=1:k}, and the ρ_i for i ∈ {1, 2, ..., k} are given by ρ_i = ⟨U^A_{i,m}, A ∗ P_{m+1}⟩.

and introduce the tensors
On the other hand, since F_{k+1} is orthogonal to P_{m+1} as well as to the lateral slices {V^A_{i,m}}_{i=1:k}, the parameter γ in (3.18) is determined by this orthogonality. Consequently, F_{k+1} = β_{k+1} ∗ P_{k+2}, where β_{k+1} and P_{k+2} are determined by the normalization of F_{k+1}. The orthogonality of P_{k+1} and Q_{k+1} now follows from the orthogonality of the sequences {V^A_{i,m}} and {U^A_{i,m}}, respectively, given by (3.7). In the preceding theorem we assumed β_m to be nonvanishing. If, instead, β_m vanishes, then the singular tubes of B_m are singular tubes of A, and the left and right singular lateral slices of A can be determined from those of B_m. Similarly, if β_{k+1} in (3.18) vanishes, then the singular tubes of B_{k+1} are singular tubes of A, and the singular lateral slices of A can be determined from P_{k+1} and Q_{k+1}.
If β_{k+1} is nonvanishing, then we append new lateral slices to P_{k+1} and Q_{k+1} repeatedly until iteration m − k. This is the subject of the following theorem.
Theorem 10. Assume that m steps of Algorithm 5 have been applied to A and that eqs. (3.17) and (3.19) hold. If the tubes β_{k+1} are nonvanishing for 1 ≤ k < m, then we have the relations

  A ∗ P_m = Q_m ∗ B_m,
  A^H ∗ Q_m = P_m ∗ B_m^H + β_m ∗ P_{m+1} ∗ E_m^H,

where P_m ∈ K_n^{p×m} and Q_m ∈ K_n^{ℓ×m} have orthonormal lateral slices, B_m ∈ K_n^{m×m} is upper triangular, β_m ∈ K_n, P_{m+1} ∈ K_n^p is orthogonal to P_m, and E_m ∈ K_n^m is the canonical element under the t-product. The first k lateral slices of P_m and Q_m are the same as those of the tensors P_{k+1} and Q_{k+1}, respectively, given in Theorem 9.
Proof. Let the tensors P_{k+1} and Q_{k+1} defined in (3.17) and (3.19), respectively, be represented by their lateral slices, and let the lateral slice P_{k+2} be given as in Theorem 9. Normalizing the residual associated with P_{k+2} and applying (3.11) yields the lateral slice Q_{k+2}. To determine the lateral slice P_{k+3}, we normalize the corresponding residual slice. We can continue this procedure until iteration m − k and then obtain tensors P_m and Q_m with orthonormal lateral slices that satisfy the relations of the theorem. This gives the desired result.
If we would like to compute the smallest singular triplets of A, then we can use the same theorem, but instead of working with the first k right singular lateral slices V^A_{i,m}, 1 ≤ i ≤ k, we use the last k right singular lateral slices in (3.12). The computations are analogous to those described above.

3.4. Augmentation by harmonic Ritz lateral slices.
When the smallest singular values of a matrix A are clustered, their computation by the restarted Lanczos bidiagonalization method described above may require many iterations. In this situation it may be beneficial to instead compute approximations of the smallest singular values of A by seeking to determine approximations of the largest singular values of the matrix (A^T A)^{−1} without explicitly computing this matrix. This was done for the matrix case by computing harmonic Ritz vectors; see [5,27]. Harmonic Ritz vectors furnish approximations of eigenvectors of A^T A associated with the corresponding harmonic Ritz values.
In the case of tensors, harmonic Ritz lateral slices furnish approximations of eigenvectors of A^H ∗ A associated with harmonic Ritz tubes of A^H ∗ A. The harmonic Ritz tubes θ̃_j of A^H ∗ A associated with the partial tensor tridiagonalization defined in (3.6) are the eigentubes of a generalized eigenvalue problem (3.22) involving T_m. The eigenpair {θ̃_j, ω̃_j} can be computed without explicitly forming the inverse of T_m: using the relation T_m = B_m^H ∗ B_m, the relation (3.22) can be written in terms of the bidiagonal tensor B_{m,m+1}. In this subsection, we denote the singular triplets of B_{m,m+1} by {s̃_i, Ũ_i, Ṽ_i}, with the first k of them being the smallest singular triplets. Recall that we are interested in determining approximations of the smallest singular triplets of A. The k smallest singular triplets of B_{m,m+1} form the tensors Ũ_k = [Ũ_1, ..., Ũ_k], S̃_k, and Ṽ_k = [Ṽ_1, ..., Ṽ_k]. We obtain from the above equations that {θ̃_j, ω̃_j} is an eigenpair of (3.22). It follows that the harmonic Ritz lateral slice associated with θ̃_j is a t-linear combination of the lateral slices of P_m. We turn to the computation of the residual of harmonic Ritz lateral slices. Using eqs. (3.6) and (3.24), we obtain relations from which this residual can be expressed. We now proceed analogously as in the previous subsection, i.e., we use the smallest harmonic Ritz eigentubes of B_{m+1,m}^H ∗ B_{m+1,m} and associated eigenslices to approximate the k smallest singular triplets of A. This yields relations that are analogous to (3.3) and (3.4). The following theorem provides the details.
Theorem 11. Apply m steps of Algorithm 5 to the third-order tensor A and assume that the tensor B_m in (3.3) and (3.4) is invertible. Then, for k = 1, ..., m − 1, we have relations analogous to (3.3) and (3.4), in which the projection tensors P̃_{k+1} and Q̃_{k+1} have orthonormal lateral slices and B̃_{k+1} ∈ K_n^{(k+1)×(k+1)} is an upper triangular tensor. The first k lateral slices of P̃_{k+1} are a t-linear combination of the first k harmonic Ritz lateral slices of A, P̃_{k+2} ∈ K_n^p is orthogonal to P̃_{k+1}, and E_{k+1} is the canonical lateral slice under the t-product.

Proof. Let {Ṽ_i}_{i=1:k} be the first k harmonic Ritz lateral slices of A. Using (3.25) and (3.26), we define the tensor J_{k+1}. Using the reduced t-QR factorization of J_{k+1}, we get J_{k+1} = Q̃ ∗ R̃, where R̃ is an f-upper triangular tensor; this factorization can be computed by a simple modification of Algorithm 3. Using the orthogonality of the computed lateral slices, where α̃_{k+1} is the tube obtained from the normalization of the corresponding residual, it follows from (3.31) and (3.32) that the first relation holds with an upper triangular tensor B̃_{k+1}, as it is the t-product of two upper triangular tensors. To show (3.28), we first notice that the relation (3.33) yields the second relation, where B̃_{k,k+1} ∈ K_n^{k×(k+1)} is the subtensor of B̃_{k+1} obtained by removing the last horizontal slice of B̃_{k+1}.

Algorithm 7 Facial recognition using tensor Lanczos bidiagonalization with Ritz augmentation.
1: Input: Training set of N images forming the tensor X, mean image X̄, test image I_0 with its associated lateral slice X_0 = vec(I_0); m, the number of steps of the tensor Lanczos bidiagonalization algorithm; k, the number of desired left singular slices.
2: Output: Closest image in the database.
3: [U_k, S_k, V_k] = t-LBR(X, m, k) using Algorithm 6.
4: Project X onto U_k to get P = U_k^H ∗ X.
5: Project the mean-adjusted test image onto U_k and determine the closest column of P.

5. Numerical experiments. This section illustrates the performance of Algorithm 6 for detecting the largest or smallest singular triplets when applied to synthetic data, tensor compression, and facial recognition. All computations are carried out on a laptop computer with a 2.3 GHz Intel Core i5 processor and 8 GB of memory using MATLAB 2018a.

5.1.
Examples with synthetic data. We use synthetic data generated by the MATLAB command randn(ℓ, p, n), which generates a tensor A ∈ R^{ℓ×p×n} whose entries are normally distributed pseudorandom numbers with mean zero and variance one. Table 5.1 displays the error in the four largest approximate singular tubes computed by augmentation by Ritz lateral slices (referred to as Ritz in the table) and by the partial Lanczos bidiagonalization/Golub-Kahan algorithm (referred to as GK in the table) as described in [17], but using the t-product. These errors are given by ‖S(i, i, :) − Σ(i, i, :)‖_F for i = 1, 2, 3, 4 with m = 20. Table 5.2 shows the number of iterations required when using augmentation by Ritz lateral slices to approximate the four largest singular triplets for tensors of different sizes and different numbers of Lanczos bidiagonalization steps m.
Table 5.1: The Frobenius norm ‖S(i, i, :) − Σ(i, i, :)‖_F, where S(i, i, :) denotes the singular tubes computed by either augmentation by Ritz lateral slices (Ritz) or by partial Lanczos bidiagonalization, also known as partial Golub-Kahan bidiagonalization (GK), and Σ(i, i, :) stands for the singular tubes determined by the t-SVD method, with m = 20 and i = 1, 2, 3, 4, for tensors of size 100×100×3, 500×500×3, 1000×1000×3, 100×100×5, and 500×500×5.

Table 5.1 shows the Ritz augmentation method to yield much higher accuracy than the GK method. Figures 5.1 and 5.2 display the values of some frames of the first 10 singular tubes of third-order tensors of sizes 100×100×3 and 1000×1000×5, respectively, computed by Ritz augmentation using Algorithm 6, the t-SVD, and partial Lanczos bidiagonalization (GK). Each tube is denoted by S(k, k, :) ∈ K_n, where n is equal to 3 or 5, and k = 1, 2, ..., 10. In other words, for a fixed i with 1 ≤ i ≤ n, we plot S(k, k, i) for k = 1, 2, ..., 10. As mentioned above, the ith computed singular triplet is accepted as an approximate singular triplet when the stopping criterion (3.9) is satisfied for i = 1, ..., k, where k is the number of desired singular triplets and the U_i are left singular lateral slices of the current tensor B_m; see eq. (3.9).
Figure 5.3 shows the evolution of the error computed by (3.9) for the first three singular triplets determined by Algorithm 6 when applied to a third-order tensor of size 1000×1000×3 with m = 20. Figures 5.1 and 5.2 illustrate that Algorithm 6 with the Ritz augmentation method gives more accurate approximations than the GK method. In particular, the frontal slices of each tube computed with Algorithm 6 are very close to the corresponding frontal slices of the tubes determined by the t-SVD, independently of the size of the third-order tensor.

For each image, we compute the k largest singular triplets using Ritz augmentation in Algorithm 6 (referred to as "Ritz") for different numbers k of desired singular triplets. Figure 5.7 displays the relative error of the compressed images for k = 5, 10, 15, 25, obtained with Ritz augmentation (Ritz) and with the t-SVD method. This error is measured by ‖A − A_k‖_F / ‖A‖_F, where A denotes the tensor that represents the original image and A_k = Σ_{i=1}^k U_i ∗ s_i ∗ V_i^H its truncated t-SVD approximation.

Each image in the data set is of size 100×100×3 pixels, and we use 3 randomly chosen images of each person as test images. The remaining 600 images form our training set and define the tensor X ∈ R^{10000×600×3}. We applied Algorithm 7 and compared the results with those obtained by the t-SVD, and also with results obtained by the Golub-Kahan (GK) algorithm using the t-product. The performance of these methods is measured by the identification rate

  Identification rate = (number of correctly matched images / number of test images) × 100 (%).   (5.2)
(5.2) Figures 5.9     Table 5.5 reports CPU times for Algorithm 7 for m " 10 (Ritz) and for the t-SVD method for different values of the truncation index k.The results show Algorithm 7 to be very effective both in terms of accuracy and CPU time compared to the t-SVD and the classical Golub-Kahan methods.
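The identification rate (5.2), together with a simple matching rule, can be sketched as follows. The feature vectors would be the coefficients of the images projected onto the computed left singular lateral slices; the matching rule used here (nearest training feature in Euclidean distance) is our illustrative assumption and not necessarily the exact rule of Algorithm 7.

```python
import numpy as np

def match_nearest(train_feats, train_labels, test_feats):
    """Label each test feature vector with the label of the nearest
    training feature vector (Euclidean distance)."""
    d = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=2)
    return np.asarray(train_labels)[np.argmin(d, axis=1)]

def identification_rate(predicted, truth):
    """Identification rate (5.2): percentage of correctly matched test images."""
    predicted, truth = np.asarray(predicted), np.asarray(truth)
    return 100.0 * np.count_nonzero(predicted == truth) / truth.size
```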
6. Conclusion and extensions. This paper presents two new methods for approximating the largest or smallest singular triplets of large third-order tensors using the t-product. We use restarted Lanczos bidiagonalization for third-order tensors to develop the Ritz augmentation method, which determines the largest singular triplets. Moreover, we propose the harmonic Ritz augmentation method to compute the smallest singular triplets. These methods are applied to data compression and face recognition.

Definition 1 ([23]). Let A ∈ K_n^{ℓ×q} and B ∈ K_n^{q×p}. Then the t-product A ∗ B ∈ K_n^{ℓ×p} is given by A ∗ B = fold(bcirc(A) unfold(B)).

where B_m is the upper bidiagonal tensor in K_n^{m×m} with diagonal tubes α_1, ..., α_m and superdiagonal tubes β_1, ..., β_{m−1}, the α_i and β_i being tubes in K_n.

Proof. The relations (3.3) and (3.4) follow immediately from the recursion relations of Algorithm 5. The orthonormality of the lateral slices of P_m and Q_m can be shown by induction.
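The recursion relations of Algorithm 5 are the t-product analogue of matrix Golub-Kahan (Lanczos) bidiagonalization; under the t-product, the tensor recursion decouples in the Fourier domain into an ordinary matrix bidiagonalization of each frontal slice. The following is a hedged matrix-level sketch with full reorthogonalization (the paper's Algorithm 5 operates on tensors and may organize the reorthogonalization differently):

```python
import numpy as np

def golub_kahan(A, m, seed=0):
    """Run m steps of Golub-Kahan bidiagonalization on a matrix A.

    Returns P (n1 x m) and Q (n2 x m) with orthonormal columns and an upper
    bidiagonal B (alphas on the diagonal, betas on the superdiagonal) such
    that P.T @ A @ Q = B. Full reorthogonalization keeps the bases clean.
    """
    n1, n2 = A.shape
    P, Q, B = np.zeros((n1, m)), np.zeros((n2, m)), np.zeros((m, m))
    q = np.random.default_rng(seed).standard_normal(n2)
    q /= np.linalg.norm(q)
    beta, p_prev = 0.0, np.zeros(n1)
    for j in range(m):
        Q[:, j] = q
        p = A @ q - beta * p_prev                 # new left direction
        p -= P[:, :j] @ (P[:, :j].T @ p)          # reorthogonalize against P
        alpha = np.linalg.norm(p)
        p /= alpha
        P[:, j], B[j, j] = p, alpha
        r = A.T @ p - alpha * q                   # new right direction
        r -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ r)  # reorthogonalize against Q
        beta = np.linalg.norm(r)
        if j + 1 < m:
            B[j, j + 1] = beta
            q = r / beta
        p_prev = p
    return P, B, Q
```

The tubes α_i and β_i of the tensor recursion play the role of the scalars alpha and beta above, one pair per Fourier-domain slice.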

Figure 5.1: On the left, we display the values of the first frontal slices (frames) of the first 10 singular tubes computed by the t-SVD, Ritz augmentation, and partial Lanczos bidiagonalization (GK) for synthetic data of size 100 × 100 × 3 with m = 20; on the right, we plot the third frontal slices of these tubes, i.e., S(k, k, i) with k = 1, 2, ..., 10 and i = 1, 3.


Figure 5.2: The left-hand pane shows the values of the first frontal slices (frames) of the first 10 singular tubes computed by the t-SVD, Ritz augmentation, and partial Lanczos bidiagonalization (GK) for synthetic data of size 1000 × 1000 × 5 with m = 20. The right-hand pane displays the third frontal slices of these tubes, i.e., S(k, k, i) for k = 1, 2, ..., 10 and i = 1, 3.

Figure 5.3: Evolution of the remainder term for a third-order tensor of size 1000 × 1000 × 3 when computing the first three singular triplets by Algorithm 6 with Ritz augmentation.

Tables 5.3 and 5.4 show that harmonic Ritz augmentation gives higher accuracy than Ritz augmentation when computing the smallest singular triplets. Figures 5.4 and 5.5 depict the Frobenius norm of the remainder term R_m ∗ E_m^H ∗ U_i at each iteration of Algorithm 6 with Ritz augmentation and with harmonic Ritz augmentation when approximating the last two singular triplets for m = 20.

Figure 5.4: The Frobenius norm of R_m ∗ E_m^H ∗ U_i obtained by Algorithm 6 with Ritz augmentation when approximating the two smallest singular triplets of a synthetic tensor of size 500 × 500 × 5 with m = 20, at each iteration, for i = 499, 500.

Figure 5.5: The Frobenius norm of R_m ∗ E_m^H ∗ U_i obtained by harmonic Ritz augmentation when approximating the last two singular triplets of a synthetic tensor of size 500 × 500 × 5 with m = 20, at each iteration, for i = 499, 500.

Figures 5.4 and 5.5 show that the error ∥R_m ∗ E_m^H ∗ U_i∥_F associated with Ritz augmentation in Algorithm 6 converges in a smoother way than the corresponding error for harmonic Ritz augmentation. Both errors converge to zero as the number of iterations increases.

Figure 5.6 displays examples of image compression using two color images: "house" of size 256 × 256 × 3 and "Hawaii" of size 1200 × 1200 × 3.

Figure 5.7: Relative compression error (5.1) for the images "house" and "Hawaii" obtained with Algorithm 6 with Ritz augmentation (Ritz) and with the t-SVD method.
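For concreteness, the relative compression error (5.1), ∥A − A_k∥_F / ∥A∥_F, can be evaluated with a truncated t-SVD formed slice-wise in the Fourier domain. The sketch below shows one standard way to build A_k and is only an illustration; Algorithm 6 instead assembles A_k from the computed approximate singular triplets.

```python
import numpy as np

def tsvd_compress(A, k):
    """Truncated t-SVD approximation A_k of a real n1 x n2 x n3 tensor,
    keeping the k largest singular values of each Fourier-domain frontal
    slice, together with the relative error ||A - A_k||_F / ||A||_F of (5.1).
    """
    Ahat = np.fft.fft(A, axis=2)
    Ak_hat = np.empty_like(Ahat)
    for i in range(A.shape[2]):
        u, s, vt = np.linalg.svd(Ahat[:, :, i], full_matrices=False)
        Ak_hat[:, :, i] = (u[:, :k] * s[:k]) @ vt[:k, :]   # rank-k slice
    Ak = np.fft.ifft(Ak_hat, axis=2).real   # imaginary part vanishes (roundoff)
    return Ak, np.linalg.norm(A - Ak) / np.linalg.norm(A)
```

For a color image stored as an n1 × n2 × 3 tensor, plotting the returned error against k reproduces the kind of curves shown in Figure 5.7.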

Figure 5.7 shows that the relative errors obtained with Algorithm 6 with Ritz augmentation and with the t-SVD are almost the same. This means that the approximate singular tubes and the right singular lateral slices computed by Ritz augmentation closely approximate those of the t-SVD.

Figure 5.8 shows an example of images of one person in the data set.

Figure 5.8: An example of a person with different facial expressions and orientations.

Figures 5.9 and 5.10 show results obtained for k = 1 and k = 5 for two different persons. The mean image is defined as in Algorithm 7.

Figure 5.11: Identification rates for different truncation indices k obtained by the Ritz augmentation, t-SVD, and Golub-Kahan methods.

Figures 5.9 and 5.10 show that Algorithm 7 performs well for some values of the truncation index k. In Figure 5.11, we plot the identification rate (5.2) obtained with Algorithm 7 (Ritz augmentation), with GK for m = k, and with the exact t-SVD method for the 150 test images.

Table 5.2: Number of iterations (iter) needed by the Ritz augmentation method to determine the four largest singular tubes of third-order tensors of different sizes with m = 10, 20. The columns with header "time" show the CPU time in seconds.

Table 5.3 displays the error in the four smallest singular tubes computed by Ritz augmentation and harmonic Ritz augmentation for m = 20, and compares with results determined by the t-SVD method. Table 5.4 shows the number of iterations and the required CPU time (in seconds) for these methods when m = 20.

Table 5.3: The Frobenius norm ∥S(i, i, :) − Σ(i, i, :)∥_F, where S(i, i, :) denotes the singular tubes determined by Ritz augmentation or harmonic Ritz augmentation for m = 20, and Σ(i, i, :) are the tubes computed by the t-SVD method, for the four smallest tubes, i.e., for i = n − 3, n − 2, n − 1, n.

Table 5.4: CPU time in seconds and number of iterations required by Algorithm 6 with Ritz augmentation and harmonic Ritz augmentation for m = 20 to compute the four smallest singular triplets of synthetic third-order tensors of different sizes.