A fast direct matrix solver for surface integral equation methods for electromagnetic wave scattering from non-penetrable targets


Jian-Gong Wei, Zhen Peng, and Jin-Fa Lee

ElectroScience Laboratory, Department of Electrical and Computer Engineering, Ohio State University, Columbus, Ohio, USA

Corresponding author: J.-G. Wei, ElectroScience Laboratory, Department of Electrical and Computer Engineering, Ohio State University, 1330 Kinnear Rd., Columbus, OH 43212, USA. (wei.140@osu.edu)


[1] The implementation details of a fast direct solver are described herein for solving the dense matrix equations that arise from the application of surface integral equation methods to electromagnetic wave scattering from non-penetrable targets. The proposed algorithm exploits the smoothness of the far field and computes a low-rank decomposition of the off-diagonal coupling blocks of the matrices through a set of skeletonization processes. Moreover, an artificial surface (the Huygens' surface) is introduced for each clustering group to efficiently account for the couplings between well-separated groups. Furthermore, a recursive multilevel version of the algorithm is presented. Asymptotically, the algorithm would not alter the bleak worst-case complexity for electrically large electromagnetic (EM) problems, namely $O(N^3)$ in CPU time, where $N$ denotes the number of unknowns; nevertheless, through numerical examples we found that the proposed multilevel direct solver can scale as well as $O(N^{1.3})$ in memory consumption and $O(N^{1.8})$ in CPU time for moderately sized EM problems. Note that our conclusions are drawn from the sample examples that we have conducted and should not be taken as a true complexity analysis for general electrodynamic applications. However, for the fixed-frequency (h-refinement) scenario, where the discretization size decreases, the computational complexities observed agree well with the theoretical predictions; namely, the algorithm exhibits $O(N)$ and $O(N^{1.5})$ complexities for memory consumption and CPU time, respectively.

1. Introduction

[2] Computation of electromagnetic (EM) wave scattering from non-penetrable targets has been an active research topic for many years, and it can be argued that surface integral equation (SIE) methods are best suited to such applications. Particularly, in recent years, many fast and efficient integral equation methods have extended the SIEs to EM wave problems thousands of wavelengths in dimension. Among them, we mention the multilevel fast multipole method (MLFMM) [Rokhlin, 1985; Coifman et al., 1993; Rokhlin, 1990; Song and Chew, 1995; Lu and Chew, 1994], which reduces the memory and CPU complexities to $O(N)$ and $O(N \log N)$, respectively, though a strict decomposition of the Green's function is required. Unfortunately, the FMM, or its multilevel version MLFMM, suffers from the "sub-wavelength breakdown problem" [Dembart and Yip, 1998] and is heavily kernel dependent. Other methods, such as the pre-corrected fast Fourier transform (pFFT) [Phillips and White, 1997] and the adaptive integral method (AIM) [Bleszynski et al., 1996; Wang et al., 1998], accelerate the matrix-vector multiplications by substituting for the current basis functions of the original problem new equivalent current sources that reside on a regular grid, thus facilitating the use of the FFT algorithm.

[3] Although we have witnessed significant advancements in fast integral equation methods in recent years, most of them mainly address the speed of the matrix-vector multiplications. The overall success therefore still relies on the availability of a robust and effective preconditioner for the integral equation methods. Even though there have been substantial developments in this regard [Peng et al., 2011], a preconditioner that guarantees convergence of the iterative matrix solution process remains largely elusive. Direct solvers for integral equation methods are another important and interesting branch; they are sometimes favored over their iterative counterparts, especially for solving ill-conditioned matrix equations. Moreover, they often exhibit high efficiency for multiple right-hand sides (RHSs) owing to the small constant in front of the complexity asymptote when dealing with problems of small or moderate electrical size. However, the conventional direct solver, based on the LU factorization, scales as $O(N^2)$ and $O(N^3)$ for memory consumption and factorization time, respectively. The inherent high complexities of conventional LU direct solvers severely limit their application to practical EM problems. To circumvent these difficulties, several fast direct solvers have been proposed in the literature. In Shaeffer [2008], the author reported solving a one-million-unknown problem using the MultiLevel Adaptive Cross Approximation (ML-ACA) algorithm. Also, in Adams et al. [2008], a local-global solution method that separates the radiating currents from their non-radiating counterparts was reported to achieve $O(N^{1.3})$ complexity in memory consumption for electrically large problems. Additionally, Heldring et al. [2007] discussed a compressed block decomposition (CBD) method and demonstrated a complexity of $O(N^{1.5})$ for memory consumption. Other work, conducted in Winebrand and Boag [2007] and Boag [2007], adopts a non-uniform grid (NG) based matrix compression method; it introduces a non-redundant coarse spherical non-uniform sampling grid to effectively skeletonize the coupling process and compresses the matrix using Schur's complement. Chai and Jiao [2011] claimed to find the $\mathcal{H}^2$ representation of the inverse of the dense matrix in an error-controllable manner and reported a linear complexity for both CPU time and memory consumption; however, we disagree with the complexity analyses presented in Chai and Jiao [2011] and remain unconvinced of the performance reported. Moreover, one recently published work, Li et al. [2012], shares some similarities with the algorithm proposed in this paper: it also seeks a unique mapping matrix for each group to represent the couplings.

[4] In this paper, a fast direct solver, based largely on the algorithms outlined in Greengard et al. [2009] and Martinsson and Rokhlin [2005], is presented to solve SIE matrix equations for electrodynamic applications. The algorithm utilizes a low-rank decomposition of the off-diagonal coupling blocks of the dense matrices [Cheng et al., 2005]. Moreover, a multilevel version in conjunction with a Huygens' surface to account for couplings between well-separated groups is discussed in detail. Although we believe that the algorithm will not alter the worst-case complexities of matrix solutions in SIEs for electrically large problems, it can be very efficient for many practical numerical examples. Particularly, during h-refinement, where the discretization size decreases to improve the accuracy, the complexities observed are $O(N)$ for memory and $O(N^{1.5})$ for CPU time. These complexities agree well with the theoretical predictions in Martinsson and Rokhlin [2005] for smooth integral kernels on general two-dimensional surfaces. Several numerical results are included to validate the algorithm. Additionally, numerical experiments are conducted for the fixed mesh size scenario, where the frequency increases, as well as for the fixed frequency scenario, where the mesh size decreases (h-refinement) [Wei et al., 2011].

2. Problem Statement

[5] The following boundary value problem can be established for a scattering problem from a non-penetrable target (with the time factor $e^{j\omega t}$ suppressed, $j = \sqrt{-1}$):

$$\nabla \times \nabla \times \mathbf{E} - k_0^2\, \mathbf{E} = 0 \;\; \text{in } \Omega, \qquad \lim_{r \to \infty} r \left[ (\nabla \times \mathbf{E}) \times \hat{\mathbf{r}} + j k_0\, \mathbf{E} \right] = 0 \tag{1}$$

As illustrated in Figure 1, an incident plane wave $\mathbf{E}^{inc}$ impinges on the boundary $\Gamma_s$ of a PEC object $\Omega_s$. The complement of the scatterer is denoted by $\Omega (= \mathbb{R}^3 \setminus \Omega_s)$, where $\Gamma_s$ is the bounding surface of $\Omega_s$. The wave number is denoted by $k_0 = \omega\sqrt{\mu_0 \varepsilon_0}$, where $\mu_0$ and $\varepsilon_0$ are the permeability and the permittivity of free space, respectively. Subsequently, we have the following boundary condition to be satisfied:

$$\pi_t(\mathbf{E}) = 0 \quad \text{on } \Gamma_s \tag{2}$$

where $\pi_t$ denotes the tangential trace operator on the surface, defined as $\pi_t(\mathbf{E}) := \hat{\mathbf{n}} \times (\mathbf{E} \times \hat{\mathbf{n}})|_{\Gamma_s}$ with $\hat{\mathbf{n}}$ the outward unit normal. Expanding the total field, $\mathbf{E} = \mathbf{E}^{inc} + \mathbf{E}^{sca}$, we have

$$\pi_t(\mathbf{E}^{sca}) = -\pi_t(\mathbf{E}^{inc}) \quad \text{on } \Gamma_s \tag{3}$$

Moreover, both the electric field $\mathbf{E}$ and the magnetic field $\mathbf{H}$ in the exterior region $\Omega$ can be obtained through the Stratton-Chu representation formulae in terms of the surface electric current density $\mathbf{J} := \hat{\mathbf{n}} \times \mathbf{H}$. Consequently, the electric field integral equation (EFIE) can be written as:

$$\pi_t(\mathbf{E}^{inc})(\mathbf{x}) = j k_0 \eta_0 \left[ \pi_t \int_{\Gamma_s} G(\mathbf{x},\mathbf{y})\, \mathbf{J}(\mathbf{y})\, dS_y + \frac{1}{k_0^2}\, \nabla_{\Gamma(x)} \int_{\Gamma_s} G(\mathbf{x},\mathbf{y})\, \nabla_{\Gamma(y)} \cdot \mathbf{J}(\mathbf{y})\, dS_y \right] \tag{4}$$

where $G(\mathbf{x},\mathbf{y}) = e^{-j k_0 |\mathbf{x}-\mathbf{y}|} / (4\pi |\mathbf{x}-\mathbf{y}|)$ is the free-space Green's function, and $\nabla_{\Gamma(x)}$ and $\nabla_{\Gamma(y)}$ are the surface gradient operators acting on the observation and the source coordinates, respectively.

Figure 1.

Illustration of the boundary value problem.

[6] The magnetic field integral equation (MFIE) can also be derived for closed-surface targets as:

$$\frac{1}{2}\, \mathbf{J}(\mathbf{x}) - \hat{\mathbf{n}} \times \mathrm{pv}\!\int_{\Gamma_s} \nabla G(\mathbf{x},\mathbf{y}) \times \mathbf{J}(\mathbf{y})\, dS_y = \hat{\mathbf{n}} \times \mathbf{H}^{inc}(\mathbf{x}), \quad \mathbf{x} \in \Gamma_s \tag{5}$$

where $\mathbf{J} = \hat{\mathbf{n}} \times \mathbf{H}$ on $\Gamma_s$, pv. stands for principal value, $\nabla G$ is taken with respect to the observation coordinate, and $G$ is the free-space Green's function defined above. Note that the MFIE formulation, equation (5), is only applicable to closed-surface non-penetrable targets.

[7] The combined field integral equation (CFIE) combines the EFIE and the MFIE to yield:

$$\mathrm{CFIE} = \alpha\, \mathrm{EFIE} + (1 - \alpha)\, \eta_0\, \mathrm{MFIE} \tag{6}$$

where $\eta_0 = \sqrt{\mu_0 / \varepsilon_0}$ is the free-space intrinsic impedance. The CFIE formulation mitigates the notorious internal resonances by treating $\Gamma_s$ as an impedance surface [Vouvakis et al., 2007], and thus renders the resonance frequencies complex so long as $\Re(\alpha) \neq 0$.

3. Theory of Direct Solver

3.1. Single Level Direct Solver

[8] Fast algorithms for the direct solution of integral equations are usually developed by exploiting the redundancy in the couplings between well-separated groups. This is mainly due to the fact that the discretization size employed in the SIEs is usually much smaller than required by the Nyquist sampling rate. Consequently, the resulting meshes are often overkill for computing the radiation between well-separated groups, leading to rank deficiency in the off-diagonal matrix blocks. Herein, we compute the inverse of the system matrix by exploiting the rank deficiency of the coupling matrices hierarchically.

[9] In the application of the method of moments, one normally starts by discretizing the geometry, followed by a proper choice of basis functions (commonly the Rao-Wilton-Glisson (RWG) basis functions [Rao et al., 1982]) to span the unknown surface electric current density $\mathbf{J}$. Explicitly, $\mathbf{J} \approx \sum_i x_i \boldsymbol{\lambda}_i$, where $\boldsymbol{\lambda}_i$ denotes the $i$th RWG basis function. In order to systematically take advantage of the rank deficiency of the couplings, a hierarchical decomposition of the original geometry is desirable. For problems in 3D, a hierarchical octree is constructed and the RWG basis functions are sorted into separate boxes at each level of the octree according to their coordinates. Subsequently, the RWG basis functions are reordered such that the first $n_1$ unknowns belong to box number 1, the next $n_2$ unknowns belong to box number 2, etc. Testing the EFIE and/or MFIE with proper testing functions [Harrington, 1968] results in a dense linear matrix equation of the form $Ax = b$. To facilitate the discussion, we shall assume that the unknowns are partitioned into three groups (see Figure 2). As a consequence, the impedance matrix equation can be written accordingly as:

$$\begin{bmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \tag{7}$$

where $A_{ij}$ denotes the coupling matrix between box $i$ and box $j$, $x_i$ denotes the coefficients of the basis functions in box $i$, and $b_i$ represents the right-hand-side (RHS) vector of box $i$.
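As a concrete illustration of this hierarchical clustering and reordering, the short Python sketch below sorts DoF centers (e.g., RWG edge midpoints) into the boxes of a uniform octree; the helper `octree_order` and the box-key encoding are hypothetical illustrative choices, not the authors' implementation.

```python
import numpy as np

def octree_order(centers, levels):
    """Sort DoF centers into uniform octree boxes, level by level, and
    return a permutation that makes each box's DoFs contiguous."""
    lo = centers.min(axis=0)
    size = (centers - lo).max() + 1e-12          # root box edge length
    keys = []
    for lvl in range(1, levels + 1):
        nbox = 2 ** lvl                          # boxes per dimension at this level
        ijk = np.minimum((centers - lo) / size * nbox, nbox - 1).astype(int)
        keys.append(ijk[:, 0] * nbox**2 + ijk[:, 1] * nbox + ijk[:, 2])
    # np.lexsort uses its last key as the primary one: coarsest level first
    order = np.lexsort(keys[::-1])
    return order, keys
```

With this reordering, `centers[order]` lists the DoFs box by box, producing the block structure of equation (7) at every level of the tree.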

Figure 2.

Partitioning of the problem geometry into 3 groups.

[10] Assume that the off-diagonal sub-matrices $A_{ij}$, $i \neq j$, are of low rank and, subsequently, can be decomposed as (with $k_i < n_i$, $k_j < n_j$):

$$A_{ij} = L_i\, S_{ij}\, R_j^T, \qquad L_i \in \mathbb{C}^{n_i \times k_i}, \;\; S_{ij} \in \mathbb{C}^{k_i \times k_j}, \;\; R_j \in \mathbb{C}^{n_j \times k_j} \tag{8}$$

where $n_i$, $n_j$ are the numbers of unknowns and $k_i$, $k_j$ are the effective ranks of box $i$ and box $j$, respectively. This decomposition [Greengard et al., 2009] proves beneficial since $L_i$ and $R_j$ are associated only with boxes $i$ and $j$, respectively. Unlike other data-sparse representation techniques such as the ACA, which generates a separate low-rank decomposition for each distinct coupling pair, the $L_i$ matrix produced by the proposed skeletonization process can be shared by $A_{ij}$ for all $j \neq i$. Yet another interesting characteristic of equation (8) is that the entries of $S_{ij}$ are comprised of original entries of $A_{ij}$. The reduced set of basis functions employed for $S_{ij}$ is thus named the "skeletons".
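To illustrate the rank deficiency numerically, the following sketch (our own, with a scalar Helmholtz kernel standing in for the vector EFIE/MFIE kernels) computes the singular values of the coupling block between two well-separated point clusters sampled well below the wavelength:

```python
import numpy as np

def helmholtz_kernel(X, Y, k=2 * np.pi):
    """Scalar free-space Green's function e^{-jkR}/(4 pi R), used here as
    a stand-in for the vector integral-equation kernels (illustration only)."""
    R = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    return np.exp(-1j * k * R) / (4 * np.pi * R)

# Coupling block between two well-separated clusters (wavelength = 1):
# its singular values decay rapidly, so the block is numerically low rank.
rng = np.random.default_rng(1)
src = rng.random((300, 3))                        # cluster in a unit box
obs = rng.random((300, 3)) + np.array([5.0, 0.0, 0.0])   # well separated
s = np.linalg.svd(helmholtz_kernel(obs, src), compute_uv=False)
print("effective rank at 1e-3 tolerance:", int(np.sum(s > 1e-3 * s[0])))
```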

[11] To obtain the decomposition in equation (8) for block $i$, one concatenates all the $A_{ij}$ sub-matrices, or all the $A_{ji}^T$ sub-matrices, with $i \neq j$:

$$A_i^{recv} = \left[ A_{i1} \, \cdots \, A_{i(i-1)} \;\; A_{i(i+1)} \, \cdots \, A_{i N_g} \right], \qquad A_i^{rad} = \left[ A_{1i}^T \, \cdots \, A_{(i-1)i}^T \;\; A_{(i+1)i}^T \, \cdots \, A_{N_g i}^T \right], \quad \text{both in } \mathbb{C}^{n_i \times (N_{tot} - n_i)} \tag{9}$$

where $N_{tot}$ is the total number of DoFs and $N_g$ is the number of geometrical partitions at the current level. Each of these matrices is then subjected to a thresholded QR decomposition as proposed in Cheng et al. [2005]. Subsequently, the matrices $L_i$ and $R_i$ can be computed separately and, accordingly, the incoming (receiving) and the outgoing (radiating) skeletons: namely, for each group there would be one skeleton responsible for receiving and one for radiating. However, an alternative approach enables one to consolidate the $L$ and $R$ matrices by concatenating all $A_{ij}$ and $A_{ji}^T$ sub-matrices. We write:

$$A_i = \left[ A_i^{recv} \;\; A_i^{rad} \right] \in \mathbb{C}^{n_i \times 2(N_{tot} - n_i)} \tag{10}$$

As a consequence, only one QR decomposition needs to be performed for each group, and

$$A_{ij} = L_i\, S_{ij}\, L_j^T \tag{11}$$

Specifically, we have:

$$A_i^T P_R = Q \left[ R_{11} \;\; R_{12} \right] \tag{12}$$

where $P_R$ is a permutation matrix with $P_R^T = P_R^{-1}$, and $R_{11}$ is a $k_i \times k_i$ upper triangular matrix. The numerical rank $k_i$ can be determined from the diagonal entries of $R = [R_{11} \;\; R_{12}]$.

[12] Solving $R_{11} T = R_{12}$ for the interpolation matrix $T$ yields

$$P_R^T A_i = \begin{bmatrix} I \\ T^T \end{bmatrix} A^{RS} \tag{13}$$

where $I$ is the $k_i \times k_i$ identity matrix and $A^{RS}$ consists of the first $k_i$ rows of $P_R^T A_i$. Consequently, the first $k_i$ rows of the permuted $A_i$ matrix identify the skeleton indices, while the remaining $(n_i - k_i)$ rows are linear combinations of these $k_i$ skeletons through the matrix $T^T$. After the QR decomposition, the $L_i$ matrix is readily available:

$$L_i = P_R \begin{bmatrix} I \\ T^T \end{bmatrix} \tag{14}$$

It is worth pointing out that the skeletonization process is highly parallelizable, since each group can be processed independently.
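To make equations (12)-(14) concrete, the following sketch implements the row skeletonization with SciPy's pivoted QR; the function name, tolerance handling, and data layout are our illustrative assumptions rather than the authors' code.

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

def skeletonize_rows(Ai, tol=1e-3):
    """Row skeletonization of Ai via pivoted QR (eqs. (12)-(14)):
    returns skeleton row indices `skel` and the matrix Li such that
    Ai ~= Li @ Ai[skel, :]."""
    # Pivoted QR on Ai^T: the columns of Ai^T are the rows of Ai.
    Q, R, perm = qr(Ai.T, mode='economic', pivoting=True)
    diag = np.abs(np.diag(R))
    k = max(1, int(np.sum(diag > tol * diag[0])))   # numerical rank k_i
    # Solve R11 T = R12 for the interpolation matrix T (eq. (13)).
    T = solve_triangular(R[:k, :k], R[:k, k:])
    Li = np.zeros((Ai.shape[0], k), dtype=Ai.dtype)
    Li[perm[:k], :] = np.eye(k)     # skeleton rows reproduce themselves
    Li[perm[k:], :] = T.T           # remaining rows are combinations (T^T)
    return perm[:k], Li
```

Here `skel` (the first $k_i$ pivots) supplies the skeleton indices stored for the $S$ matrix, and `Li` plays the role of $L_i$ in equation (14).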

[13] The matrix equation (7) can be rewritten as:

$$A_{ii}\, x_i + L_i \sum_{j \neq i} S_{ij}\, y_j = b_i, \qquad i = 1, \ldots, N_g \tag{15}$$

with $y_j = L_j^T x_j$. From equation (15), it is obvious that the solution of the original problem, $x$, can be expressed in terms of the new vector $y$ of much reduced dimension $N_k^{(1)}$ (assuming the effective ranks are much smaller than the numbers of unknowns in the blocks). Written explicitly, we have

$$x_i = A_{ii}^{-1} \Big( b_i - L_i \sum_{j \neq i} S_{ij}\, y_j \Big) \tag{16}$$

Evidently, the computation of the solution vector $x$ hinges on the vector $y$. To compute it, equation (15) can be rewritten in block form as (with $A_D$ and $L_D$ the block diagonal matrices of the $A_{ii}$ and $L_i$ blocks, and $S$ the matrix of skeleton coupling blocks $S_{ij}$, $S_{ii} = 0$):

$$\begin{bmatrix} A_D & L_D S \\ -L_D^T & I \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} b \\ 0 \end{bmatrix} \tag{17}$$

Left-multiplying equation (17) by

$$\begin{bmatrix} I & 0 \\ L_D^T A_D^{-1} & I \end{bmatrix} \tag{18}$$

results in

$$\begin{bmatrix} A_D & L_D S \\ 0 & I + L_D^T A_D^{-1} L_D S \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} b \\ L_D^T A_D^{-1} b \end{bmatrix} \tag{19}$$

Equation (19) can be simplified by removing the redundant equations, yielding:

$$(E + S)\, y = C\, b \tag{20}$$

where $E_{ii} = (L_i^T A_{ii}^{-1} L_i)^{-1}$ and $C_i = E_{ii} L_i^T A_{ii}^{-1}$. Substituting equation (20) into equation (16) gives:

$$x = A_D^{-1} b - A_D^{-1} L_D\, S\, (E + S)^{-1} C\, b \tag{21}$$

with $E$ the block diagonal matrix of the $E_{ii}$ blocks and $C = C_D$ the block diagonal matrix of the $C_i$ blocks. Moreover, from the simple fact that $S = (E + S) - E$, equation (21) can be rewritten as:

$$x = \left[ A_D^{-1} - A_D^{-1} L_D C_D \right] b + A_D^{-1} L_D\, E\, (E + S)^{-1} C\, b \tag{22}$$

where $A_D$, $L_D$, $C_D$ are all block diagonal matrices whose diagonal blocks are $A_{ii}$, $L_i$, $C_i$, respectively.

[14] Finally, the approximate $A^{-1}$ can be expressed as:

$$A^{-1} \approx D + B\, (E + S)^{-1}\, C, \qquad A^{-1} \in \mathbb{C}^{N \times N}, \;\; (E + S) \in \mathbb{C}^{N_k^{(1)} \times N_k^{(1)}} \tag{23}$$

where $B = A_D^{-1} L_D E$, $C = C_D$, $D = A_D^{-1} - A_D^{-1} L_D C_D$, $N$ is the number of RWG basis functions, and $N_k^{(1)}$ is the sum of the effective ranks of all groups, which in many applications can be notably less than $N$.

[15] The use of equation (23) leads to a significant reduction in memory consumption, since the matrices $B$, $C$, $D$, and $E$ in equation (23) are all block diagonal. The $S$ matrix is a dense off-diagonal matrix, namely $S_{ii} = 0$. However, considering that its entries are just a permuted subset of the entries of the original MoM impedance matrix in equation (7), one only needs to store the indices of the skeleton DoFs. This is crucial for the multilevel implementation. Even for the four block diagonal matrices mentioned above, not all the entries need to be stored explicitly: one obvious approach stores only the $A_{ii}^{-1}$, $E_{ii}$, and $L_i$ matrices, from which all operations involved in the direct solver can be fully accounted for.
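As a concrete single-level sketch of how equation (23) is applied while storing only $A_{ii}^{-1}$, $L_i$, and $E_{ii}$ per group, consider the following; the data layout (`blocks`, and a dense skeleton coupling matrix `S` with zero diagonal blocks) is our illustrative assumption, not the authors' implementation.

```python
import numpy as np

def apply_compressed_inverse(blocks, S, b_blocks):
    """Apply x = D b + B (E+S)^{-1} C b (eq. (23)), keeping only
    (A_ii^{-1}, L_i, E_ii) tuples in `blocks`."""
    ks = [Eii.shape[0] for _, _, Eii in blocks]
    offs = np.concatenate(([0], np.cumsum(ks)))
    # (C b)_i = E_ii L_i^T A_ii^{-1} b_i, i.e. C = C_D is block diagonal
    Cb = [Eii @ (Li.T @ (Ainv @ bi))
          for (Ainv, Li, Eii), bi in zip(blocks, b_blocks)]
    # Assemble E + S (E block diagonal) and solve the reduced problem
    ES = S.astype(complex)
    for i, (_, _, Eii) in enumerate(blocks):
        ES[offs[i]:offs[i+1], offs[i]:offs[i+1]] += Eii
    y = np.linalg.solve(ES, np.concatenate(Cb))
    # x_i = A_ii^{-1}(b_i - L_i (S y)_i), using (S y)_i = (C b)_i - E_ii y_i
    x = []
    for i, ((Ainv, Li, Eii), bi) in enumerate(zip(blocks, b_blocks)):
        yi = y[offs[i]:offs[i+1]]
        x.append(Ainv @ (bi - Li @ (Cb[i] - Eii @ yi)))
    return x
```

Note that the identity $(Sy)_i = (Cb)_i - E_{ii} y_i$, which follows from equation (20), lets the sketch touch the $S$ matrix only inside the reduced solve.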

3.2. Skeletons and Skeletonization

[16] The skeletons, or skeleton DoFs, are the reduced set of DoFs that are capable of representing the receiving and/or radiating behavior of a group. In this section we describe the skeletons and the related skeletonization process visually through Figure 3. Taking the wedge geometry illustrated in Figure 3a as an example, the DoFs are partitioned into groups whose supports are depicted in distinct colors in Figure 3b. The skeletonization process is then invoked to identify the skeleton of each group, i.e., the reduced set of DoFs, whose supports are colored correspondingly in Figure 3c. Thereafter, the indices of these reduced DoFs (the skeletons) are stored and utilized to assemble the $S$ matrix.

Figure 3.

(a–c) Skeletonization.

3.3. Multilevel Direct Solver

[17] The single level direct solver takes advantage of the rank deficiency of the off-diagonal coupling matrices and represents the solution of the original problem in terms of a reduced set of unknowns, i.e., the skeletons. Note that this reduced set of unknowns is a subset of the original unknowns, which implies that the redundancy present among the first-level partitions may still exist among the couplings at coarser levels. This can be seen clearly from the expression of the $(E + S)$ matrix in equation (23), where $S$ is the coupling matrix between the skeleton unknowns.

[18] It is therefore logical to extend the algorithm to further compress the $(E + S)$ matrix in equation (23). Mathematically, one simply substitutes the $(E + S)$ matrix obtained in equation (23) for the original $A$ matrix. Subsequently, we cluster the DoFs of the current children groups into groups at the next coarser level, i.e., children groups that share the same parent in the tree structure are merged. This re-grouping process is illustrated in Figure 4: the skeleton DoFs resulting from the previous level, as in Figure 4a, are re-grouped into coarser-level groups, as in Figure 4b.

Figure 4.

(a–c) Multilevel skeletonization.

[19] In terms of matrix operations, this procedure is equivalent to clustering the diagonal matrix groups as illustrated in Figure 5. After the first-level direct solver, the resulting $(E + S)$ matrix can be represented by the first square in Figure 5, where gray blocks represent matrices that are recalculated and updated, while dark blocks represent matrices that are not changed beyond row and column permutations. Subsequently, the red square lines cluster the groups into the next level and generate a new $(E + S)$ matrix.

Figure 5.

Matrix representation of the direct solver process. Gray blocks represent matrices that are updated, dark blocks represent matrices that are not changed beyond row and column permutation.

[20] Next, we apply the algorithm to this re-clustered $(E + S)$ matrix. The supports of the resulting skeletons are plotted in Figure 4c. Mathematically, the corresponding expression can be written explicitly as:

$$\left( E^{(1)} + S^{(1)} \right)^{-1} \approx D^{(2)} + B^{(2)} \left( E^{(2)} + S^{(2)} \right)^{-1} C^{(2)} \tag{24}$$

Note that the dimension of $(E^{(2)} + S^{(2)})$, i.e., $N_k^{(2)}$, is further compressed. Continuing this algorithm recursively, one ends up with a multilevel version of the direct solver and consequently achieves gains in a telescoping manner. The multilevel direct solver comes to a halt and switches to a direct LU factorization when it reaches the coarsest level, e.g., the level with no more than 8 groups left.
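Substituting equation (24) back into equation (23) makes the telescoping explicit; after two levels, for instance, the approximate inverse reads

$$A^{-1} \approx D^{(1)} + B^{(1)} \left[ D^{(2)} + B^{(2)} \left( E^{(2)} + S^{(2)} \right)^{-1} C^{(2)} \right] C^{(1)},$$

where the superscripts denote the level, so that only the innermost, much smaller $(E + S)$ system is subjected to the LU factorization at the coarsest level.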

3.4. Skeletonization Using a Huygens' Surface

[21] The skeletons revealed in this algorithm are the effective basis functions that are capable of accounting for both the near field and far field couplings. For group $i$, whose support of DoFs is depicted by the red triangulation in Figure 6, the near field exhibits a higher degree of oscillation/variation. However, the couplings between group $i$ and the groups residing outside group $i$'s Huygens' surface are smooth. Consequently, for DoFs residing in groups outside group $i$ (whose supports are colored black), this process can be accelerated using the Huygens' surface. The use of the Huygens' surface to compute the far field couplings can be justified through the Huygens' principle: any field induced by sources inside the Huygens' surface can be reproduced by equivalent sources residing on the Huygens' surface [Greengard et al., 2009].

Figure 6.

Near field and far field region of group i.

[22] We remark that the Huygens' surface can be of any shape as long as the Huygens' principle is respected. In the current implementation, the Huygens' surface of a group is simply a box whose size is set proportional to $d_i$, the box size at the $i$th level, as in Figure 7. As a consequence, the skeletonization process can be accelerated via Huygens' surfaces through the following steps. First, these surfaces, denoted by $\Gamma_a$, are discretized based on the precision required [Greengard et al., 2009], as in Figure 7, and basis functions are assigned to the corresponding triangulations. The following matrix $\tilde{A}_i$ can then be assembled:

$$\tilde{A}_i = \left[ A_{i,nb} \;\; A_{i,\Gamma_a} \;\; A_{\Gamma_a,i}^T \right] \in \mathbb{C}^{n_i \times (N_i^{nb} + 2K)} \tag{25}$$

where $N_i^{nb}$ is the number of basis functions in the neighboring groups of group $i$ and $K$ is the number of basis functions on $\Gamma_a$; $A_{i,nb}$ denotes the near-field coupling matrix in which the DoFs of group $i$ serve as receivers and the DoFs of the neighboring groups serve as transmitters, while $A_{i,\Gamma_a}$ and $A_{\Gamma_a,i}^T$ denote the couplings in which group $i$ receives from, and radiates toward, the basis functions on $\Gamma_a$, respectively. Therefore, the adoption of the Huygens' surface $\Gamma_a$ separates the far field from the near field.

Figure 7.

Huygens' surface of group i.

[23] Subsequently, the $\tilde{A}_i$ matrix is subjected to the pivoted QR decomposition as in equation (12). For an $m \times n$ matrix, the complexity of a pivoted QR decomposition scales as $O(u^2 v)$, where $u = \min\{m, n\}$ and $v = \max\{m, n\}$. For the direct solver, it is almost always valid that $(N_{tot} - n_i) \gg (N_i^{nb} + 2K)$. Thus, the QR decomposition of $\tilde{A}_i$ is notably more efficient than that of its counterpart $A_i$ in equation (10).
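The proxy construction of equation (25) can be sketched as follows, reusing `helmholtz_kernel` and `skeletonize_rows` from the earlier sketches; the cube sampling in `box_surface_points` is a crude illustrative discretization of $\Gamma_a$, not the authors' triangulation, and for this scalar symmetric kernel the third block merely mirrors the second (it is kept to match the structure of equation (25)).

```python
import numpy as np

def box_surface_points(center, half, n=6):
    """Sample points on the six faces of a cube: a crude stand-in for
    the discretized Huygens' surface Gamma_a."""
    t = np.linspace(-half, half, n)
    u, v = np.meshgrid(t, t)
    u, v = u.ravel(), v.ravel()
    faces = []
    for axis in range(3):
        for sign in (-half, half):
            p = np.zeros((u.size, 3))
            p[:, axis] = sign
            p[:, (axis + 1) % 3] = u
            p[:, (axis + 2) % 3] = v
            faces.append(p)
    return np.concatenate(faces) + center

def proxy_skeletonize(Xi, Xnb, Xproxy, tol=1e-3):
    """Skeletonize group i against its true near-field neighbors plus the
    Huygens' surface, instead of against all remaining groups (eq. (25))."""
    Ai = np.hstack([helmholtz_kernel(Xi, Xnb),        # near-field couplings
                    helmholtz_kernel(Xi, Xproxy),     # receive from Gamma_a
                    helmholtz_kernel(Xproxy, Xi).T])  # radiate toward Gamma_a
    return skeletonize_rows(Ai, tol)
```

The design point is exactly the complexity statement above: the pivoted QR now acts on $N_i^{nb} + 2K$ columns instead of $2(N_{tot} - n_i)$.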

4. Numerical Results

4.1. Sphere

[24] The first example to validate the proposed algorithm is plane wave scattering from a PEC sphere of 4 m radius at 300 MHz. It is discretized using an average mesh size of 0.1λ, which gives rise to 74,169 unknowns. The entire problem geometry is decomposed hierarchically via an octree structure, resulting in 3 levels of partitions. At the leaf level, there are in total 272 non-empty groups, which corresponds to approximately 272 DoFs per group on average. The CFIE with α = 0.5 and a QR threshold of $10^{-3}$ is adopted. The computed far fields are plotted against those computed from the Mie series in Figure 8. On an Intel Xeon platform with X5450 CPUs at 3.00 GHz, the solution takes 4 hours 55 min 14 s and 6.23 GB of memory using double precision arithmetic, whereas the direct LU factorization, we estimate, would take 595 hours 21 min 32 s.

Figure 8.

Computed RCS of a 4λ-radius sphere compared against the Mie series results.

4.2. Complexity and Error Studies

[25] Two numerical experiments are conducted to study the computational complexities of the proposed direct solver, in terms of both memory consumption and CPU time. Specifically, we aim to determine the computational complexities under two scenarios: the first fixes the mesh size with respect to the wavelength and increases the frequency; the second fixes the operating frequency while decreasing the mesh size with respect to the wavelength. In all cases reported herein, the results are obtained using the CFIE with double precision, and the tolerance of the QR decomposition is set to $10^{-3}$. Moreover, we have parallelized the proposed algorithm using OpenMP, and the computational results reported here are computed with 8 threads. The first complexity study considers a PEC sphere of 1.5 m radius discretized using a mesh size of 0.1λ at varying frequencies. The computational statistics are listed in Table 1, in which $N$ denotes the number of DoFs of the problem and $n$ denotes the size of the matrix that is subjected to the direct LU factorization at the final level. Complexities of approximately $O(N^{1.3})$ for memory consumption and $O(N^{1.8})$ for CPU time are observed. Note that this conclusion is based on the numerical results for this particular example; however, it does demonstrate the common trend that we have observed for many other examples. Since we can only apply the proposed direct solver, even with the aid of OpenMP parallelization, to problems of moderate electrical size, we should not extrapolate the results to the asymptotic complexities of electrically large structures, where the unknowns may number in the hundreds of millions. Nevertheless, the computational complexities illustrated here are still very encouraging, since for electrically large EM problems the proposed direct solver will often be used in conjunction with a Krylov iterative solver as an effective preconditioner.

Table 1. Computational Statistics for Fixed Mesh Size Study

Frequency (MHz) | Levels | N       | n      | Memory (GB) | CPU (s)  | εsol    | εfac    | εapp
300             | 3      | 10,242  | 3,515  | 0.46        | 4.900e+2 | 6.17e-4 | 6.43e-4 | 4.82e-4
375             | 3      | 16,020  | 4,640  | 0.86        | 1.158e+3 | 6.53e-4 | 6.48e-4 | 4.83e-4
500             | 4      | 28,806  | 6,450  | 1.87        | 3.391e+3 | -       | 7.64e-4 | 5.68e-4
600             | 4      | 41,415  | 7,716  | 2.89        | 6.201e+3 | -       | 8.75e-4 | 7.23e-4
1000            | 5      | 116,253 | 13,925 | 10.59       | 4.135e+4 | -       | 1.07e-3 | 8.84e-4
2000            | 6      | 465,012 | 28,565 | 64.02       | 5.034e+5 | -       | 2.12e-3 | 9.23e-4
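The exponents quoted above can be recovered directly from the data in Table 1 by a log-log least-squares fit, for example:

```python
import numpy as np

# Fit memory ~ N^p and CPU ~ N^q from the Table 1 data.
N   = np.array([10242, 16020, 28806, 41415, 116253, 465012])
mem = np.array([0.46, 0.86, 1.87, 2.89, 10.59, 64.02])      # GB
cpu = np.array([4.900e2, 1.158e3, 3.391e3, 6.201e3, 4.135e4, 5.034e5])
p = np.polyfit(np.log(N), np.log(mem), 1)[0]
q = np.polyfit(np.log(N), np.log(cpu), 1)[0]
print(f"memory exponent ~ {p:.2f}, CPU exponent ~ {q:.2f}")  # ~1.3, ~1.8
```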

[26] Second, for the same sphere at 30 MHz, we study the computational complexities with respect to decreasing mesh size (h-refinement). Computational statistics are obtained for mesh sizes ranging from 0.01 to 0.0015 wavelengths, and complexities of $O(N)$ for memory and $O(N^{1.5})$ for CPU time are observed from Table 2. The observed complexities for both memory and CPU time agree well with the predictions in Martinsson and Rokhlin [2005].

Table 2. Computational Statistics for Fixed Frequency Study

h (λ)  | Levels | N       | n      | Memory (GB) | CPU (s)  | εsol    | εfac    | εapp
0.01   | 3      | 10,242  | 2,644  | 0.31        | 6.800e+2 | 3.70e-3 | 3.42e-3 | 3.87e-4
0.008  | 3      | 16,020  | 3,396  | 0.52        | 1.387e+3 | 4.62e-3 | 4.25e-3 | 4.01e-4
0.006  | 4      | 28,806  | 4,597  | 1.03        | 3.413e+3 | -       | 5.80e-3 | 4.08e-4
0.005  | 4      | 41,415  | 5,474  | 1.50        | 5.881e+3 | -       | 6.47e-3 | 4.12e-4
0.003  | 5      | 116,253 | 9,256  | 4.32        | 2.783e+4 | -       | 1.06e-2 | 4.30e-4
0.0015 | 6      | 465,012 | 18,971 | 17.02       | 2.290e+5 | -       | 2.42e-2 | 5.65e-4

[27] Furthermore, three error measures are defined to investigate the accuracy of the solution: (1) the solution error $\varepsilon_{sol} = \|x - x_{exact}\| / \|x_{exact}\|$, where $x_{exact}$ denotes the solution computed from the usual LU factorization; (2) the factorization error $\varepsilon_{fac}$, which measures the discrepancy between $A$ and $\tilde{A}$, the approximation to $A$ produced by the skeletonization process; and (3) the approximation error $\varepsilon_{app}$. Ten sets of independent and identically distributed (i.i.d.) vectors are generated and used in the error calculations; the maximum errors are reported in Tables 1 and 2 for the fixed mesh size and fixed frequency scenarios, respectively.
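Since the precise norms in these definitions were lost in reproduction, the following sketch adopts one plausible instantiation of the three measures using i.i.d. Gaussian vectors; `solve_approx` (applying the compressed inverse of equation (23)) and `apply_approx` (applying the skeletonized matrix $\tilde{A}$) are assumed callables, not the authors' routines.

```python
import numpy as np

def error_metrics(A, solve_approx, apply_approx, n_trials=10, seed=0):
    """Monte-Carlo estimates of eps_sol, eps_fac, eps_app over i.i.d.
    random vectors; the norms used here are plausible reconstructions,
    not the authors' exact definitions."""
    rng = np.random.default_rng(seed)
    e_sol = e_fac = e_app = 0.0
    for _ in range(n_trials):
        b = rng.standard_normal(A.shape[0])
        x_exact = np.linalg.solve(A, b)          # reference LU solution
        x = solve_approx(b)                      # x ~= A^{-1} b, eq. (23)
        Ax = A @ x_exact
        e_sol = max(e_sol, np.linalg.norm(x - x_exact)
                    / np.linalg.norm(x_exact))
        e_fac = max(e_fac, np.linalg.norm(apply_approx(x_exact) - Ax)
                    / np.linalg.norm(Ax))
        e_app = max(e_app, np.linalg.norm(A @ x - b) / np.linalg.norm(b))
    return e_sol, e_fac, e_app
```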

[28] In Tables 1 and 2, we observe that the approximation errors are all below the specified tolerance of $10^{-3}$. However, both the solution and factorization errors increase monotonically as the number of DoFs increases, especially in the fixed frequency case, where smaller discretization sizes translate directly into larger matrix condition numbers in the absence of any low frequency regularization techniques [Stephason and Lee, 2009]. Moreover, the largest singular value of the coupling matrices of any group grows as $O(h^{-1})$, since the smallest coupling capacitance between two neighboring groups decreases as $O(h)$; whereas the small singular values of the off-diagonal blocks are almost independent of the mesh size, since they are predominantly associated with the MFIE operator. Behavior of a similar nature was also reported in the literature [e.g., Adams et al., 2008]. This phenomenon reinforces the statement that the condition number of the matrix affects the performance of direct solution algorithms, especially those that rely on the rank deficiency of the system matrix [Adams et al., 2008]. So even for direct solvers, it is still desirable to have formulations that yield well-conditioned system matrices.

4.3. Machine Gun

[29] In this section, we consider a more complex and somewhat practical target. The current distribution on a free-standing machine gun subjected to an EM plane wave at 2 GHz is shown in Figure 9. The target gives rise to 74,310 unknowns after discretization, with 201 non-empty groups established at the first level and 370 DoFs per group on average. Multiscale structures like the machine gun discussed here usually yield ill-conditioned matrix equations that are costly for Krylov iterative methods to solve. In such circumstances, the proposed direct solver is highly desirable, particularly when it is combined with the newly developed integral equation domain decomposition method (IEDDM) [Peng et al., 2011] as an effective solver for sub-domains that involve singularities and/or small features. As an example, we mount the machine gun on a tank and calculate the EM scattering with the same incident plane wave. The entire target, after discretization, results in 1,027,554 unknowns using an average mesh size of h = 0.1λ. The current distribution is plotted in Figure 10. In this example, the proposed direct solver serves as the solver for the machine gun sub-domain, and therefore the convergence of this sub-domain is guaranteed. In the initial run, it took 5 hours 25 min to obtain the inverse representation of the impedance matrix for the machine gun domain alone. For IEDDM iterations other than the first, the response of this sub-domain is obtained by simply applying the existing inverse representation to the RHSs.

Figure 9.

Current distribution on a free-standing gun.

Figure 10.

Current distribution of a gun mounted on a tank.

4.4. Mock-Up Aircraft

[30] For the final example, electromagnetic wave scattering from a mock-up air platform at 75 MHz is considered. The positive-x-polarized plane wave impinges from the positive z direction. For this problem, the surface is discretized with an average mesh size of approximately λ/40, which gives rise to 60,495 unknowns. It is usually difficult for mesh generation software to guarantee the quality of each and every triangle for geometries of such complexity. However, the presence of ill-shaped triangles of inferior quality greatly affects the convergence behavior of iterative solvers, even though the percentage of these bad triangles is usually very small. Additionally, the condition number of the EFIE deteriorates [Stephason and Lee, 2009] when it is applied to such a low frequency application. The combination of these technical difficulties results in the failure of Krylov iteration methods: specifically, we employed the CGS method, and the relative residual stagnated around 0.1 after a few hundred iterations. To combat the aforementioned difficulties, we apply the proposed direct solver, which results in a 4-level partitioning with 221 non-empty groups and 273 DoFs per group on average at the leaf level. Figure 11 plots the computed current distribution on the platform, without the dielectric radome on the nose of the air platform. Moreover, the mono-static far field pattern is also calculated and included in Figure 12. On an Intel Xeon platform with 8 X5450 CPUs at 3.00 GHz, it takes 3 hours 11 min 53 s to obtain the inverse representation, and then 27 min 24 s to calculate the responses for 360 mono-static incident angles in the ϕ = 0° plane.

Figure 11.

Current distribution of a mock-up platform at 75 MHz.

Figure 12.

Mono-static far field pattern of a mock-up platform at 75 MHz; observation plane ϕ = 0°.

5. Conclusion

[31] In this paper, a hierarchical direct solver algorithm is developed to solve integral equations for 3D electromagnetic wave scattering from non-penetrable targets. The proposed algorithm utilizes the skeletonization process to effectively compress the rank-deficient off-diagonal blocks, which correspond to the couplings between groups. Huygens' surfaces are also introduced to account for the far field couplings efficiently, and thus further accelerate the algorithm. It has been demonstrated that the condition number of the system matrix still affects the solution errors of direct solvers. Despite the limitations, for problems of small or medium electrical size, the multilevel version of the proposed algorithm features $O(N^{1.3})$ and $O(N)$ complexities for memory consumption, and $O(N^{1.8})$ and $O(N^{1.5})$ complexities for CPU time, for the fixed mesh size and h-refinement scenarios, respectively. We emphasize that the complexities reported herein are based on a finite number of problems of moderate electrical size that we have studied, and should not be extrapolated asymptotically to electrically large electrodynamic problems. Finally, the proposed direct solver, when combined with the newly developed IEDDM, provides a versatile tool for solving complex electromagnetic problems with multiscale geometrical features.


[32] The authors thank an anonymous reviewer for sharing insights on the largest and smallest singular values of the coupling matrices in the CFIE formulation for the fixed frequency studies.