Some issues related to the novel spectral acceleration method for the fast computation of radiation/scattering from one-dimensional extremely large scale quasi-planar structures

Authors


Abstract

[1] The novel spectral acceleration (NSA) algorithm has been shown to produce an O(Ntot) efficient iterative method of moments for the computation of radiation/scattering from both one-dimensional (1-D) and two-dimensional large-scale quasi-planar structures, where Ntot is the total number of unknowns to be solved. This method accelerates the matrix-vector multiplication in an iterative method of moments solution and divides contributions between points into “strong” (exact matrix elements) and “weak” (NSA algorithm) regions. The NSA method is based on a spectral representation of the electromagnetic Green's function and appropriate contour deformation, resulting in a fast multipole-like formulation in which contributions from large numbers of points to a single point are evaluated simultaneously. In the standard NSA algorithm the NSA parameters are derived on the basis of the assumption that the outermost possible saddle point, ϕs,max, along the real axis in the complex angular domain is small. For given height variations of quasi-planar structures, this assumption can be satisfied by adjusting the size of the strong region Ls. However, for quasi-planar structures with large height variations, the adjusted size of the strong region is typically large, resulting in significant increases in computational time for the computation of the strong-region contribution and degrading overall efficiency of the NSA algorithm. In addition, for the case of extremely large scale structures, studies based on the physical optics approximation and a flat surface assumption show that the given NSA parameters in the standard NSA algorithm may yield inaccurate results. In this paper, analytical formulas associated with the NSA parameters for an arbitrary value of ϕs,max are presented, resulting in more flexibility in selecting Ls to compromise between the computation of the contributions of the strong and weak regions. In addition, a “multilevel” algorithm, decomposing 1-D extremely large scale quasi-planar structures into more than one weak region and appropriately choosing the NSA parameters for each weak region, is incorporated into the original NSA method to improve its accuracy.

1. Introduction

[2] Quasi-planar structures (QPS) play an important role in many electromagnetic (EM) applications, including radiation and scattering from rough surfaces, microstrip structures, microwave integrated circuits, and optical gratings. Several efficient numerical methods have been proposed recently [Tsang et al., 1995; Johnson, 1998; Chou and Johnson, 1998, 2000; Jandhyala et al., 1998a, 1998b; Bindiganavale et al., 1998; Torrungrueng et al., 2000; Torrungrueng and Johnson, 2001a, 2001b, 2001c; Hu and Chew, 2000; Valero-Nogueira and Rojas, 2000] to obtain an accurate description of EM scattered fields for electromagnetically large structures. One of these methods, the novel spectral acceleration (NSA) algorithm, has been shown to be an extremely efficient O(Ntot) iterative method of moments (MOM) for the computation of scattering from both one-dimensional (1-D) and two-dimensional (2-D) large-scale QPS [Chou and Johnson, 1998; Torrungrueng et al., 2000; Torrungrueng and Johnson, 2001a, 2001b, 2001c; Chou and Johnson, 2000; Valero-Nogueira and Rojas, 2000]. This method accelerates the matrix-vector multiplication in an iterative MOM solution and divides contributions between points into “strong” (exact matrix elements) and “weak” (NSA algorithm) regions. The NSA method is based on a spectral representation of the electromagnetic Green's function and appropriate contour deformation, resulting in a fast multipole-like formulation in which contributions from large numbers of points to a single point are evaluated simultaneously. Unlike traditional multipole methods, however, only one large group of points is considered for the calculation of weak region contributions, resulting in a more efficient computation. As in the steepest descent fast multipole method [Jandhyala et al., 1998a, 1998b], to further improve the efficiency of the spectral expansion, the angular spectral integration path is deformed in the complex angular plane, resulting in a smaller domain of integration with less rapid oscillation along the path. The multiplication is performed in a forward sweep followed by a backward sweep, with the weak region continuously increasing in size as the multiplication proceeds in one direction. Because of the use of forward and backward sweeps, the NSA approach is well suited for incorporation into the “forward-backward” (FB) iterative method [Kapp and Brown, 1996; Holliday et al., 1996; Toporkov et al., 1998] but can also be used in any standard iterative method. Details of the NSA algorithm are provided by Chou and Johnson [1998].

[3] In the NSA method the most important issue is to determine the appropriate NSA parameters, which include the tilt angle (δ) of the deformed contour in the complex angular domain, the domain of integration ([−ϕmax, ϕmax]), and the integration step size (Δϕ). In the original paper [Chou and Johnson, 1998] these parameters are derived on the basis of the assumption that the outermost possible saddle point, ϕs,max, along the real axis in the complex angular (ϕ) plane is small. For a given surface height variation this assumption can be satisfied by adjusting the size of the strong region. However, for QPS with large height variations the adjusted size of the strong region is typically large, resulting in significant increases in computational time for the strong-region contribution and degrading overall efficiency of the NSA algorithm. In addition, for the case of 1-D extremely large scale structures, studies based on the physical optics (PO) approximation and a flat surface assumption show that the given 1-D NSA parameters of Chou and Johnson [1998] may yield inaccurate results. The inaccuracy arises because the complex radiation function (plane wave spectrum) of a source group far separated from the receiving element decays rapidly along the deformed contour away from the origin in the ϕ plane, requiring a higher sampling rate in ϕ to retain accuracy. Analytical results obtained from asymptotic evaluations of the radiation integral associated with the PO and flat surface assumptions suggest that the very large weak region associated with 1-D extremely large scale QPS should be decomposed into more than one separate weak region, and appropriate NSA parameters must be determined separately for each weak region to improve accuracy. Thus the new proposed scheme for the NSA algorithm can be classified as a “multilevel” algorithm. Note that the multilevel NSA algorithm is distinct from standard multilevel algorithms applied in other methods [Brandt, 1991; Lu and Chew, 1994; Song and Chew, 1995].

[4] This paper is organized as follows: Section 2 presents the derivation of analytical formulas associated with the 1-D NSA parameters for an arbitrary value of ϕs,max. The “multilevel” NSA algorithm for 1-D extremely large scale QPS is illustrated in section 3. Some numerical results are presented in section 4, and a summary and conclusions can be found in section 5. An exp(−iωt) time harmonic convention is assumed and suppressed throughout this paper, and the propagation constant is defined as k = ω√(μϵ), where ω is the radian frequency and ϵ and μ are the permittivity and permeability of free space, respectively.

2. Derivation of General Formulas Associated With the 1-D NSA Parameters

[5] For convenience in discussion, consider as an example of a quasi-planar structure a one-dimensional (1-D) finite rough surface profile L, z = f(x), as shown in Figure 1. Let ρ and ρ′ denote a field point and a source point on the rough surface L, respectively, where ρ = x x̂ + z ẑ and ρ′ = x′ x̂ + z′ ẑ. Let Ls be a neighborhood distance within which interactions between points are classified as strong and outside of which interactions are classified as weak. Note that Lx is the surface length obtained from the projection of the surface profile L onto the x axis; the maximum and minimum surface height variations are denoted as zmax and zmin, respectively. Without loss of generality, consider only the 1-D NSA algorithm in the forward sweep for which x − x′ > 0 as illustrated in Figure 1. In the weak region the 1-D NSA algorithm employs the angular spectral representation of the two-dimensional (2-D) scalar free space Green's function, H0(1)(k|ρ − ρ′|), along the original contour Cϕ:

$$ H_0^{(1)}(k|\rho - \rho'|) = \frac{1}{\pi} \int_{C_\phi} I(\phi)\, d\phi \tag{1} $$
$$ I(\phi) = e^{ik[(x - x')\cos\phi + (z - z')\sin\phi]} = e^{ikR\cos(\phi - \phi_s)} \tag{2} $$

where R = |ρ − ρ′| = [(x − x′)² + (z − z′)²]^(1/2) and the saddle point for a given pair of source and observation points is ϕs = tan−1 [(z − z′)/(x − x′)]. For convenience, first consider the topology in the complex ϕ plane for a single pair of points ρ and ρ′ on a flat surface. For a flat surface the steepest descent path (SDP) CSDP (Δz = 0) passes through its saddle point at the origin as shown in Figure 2. From an asymptotic analysis, most of the contribution occurs on portions of the SDP near a saddle point on the real axis. As the distance from the saddle point increases along the SDP, the integrand I(ϕ) is exponentially attenuated so that the contributions become negligible. Thus it is numerically advantageous to deform the original contour Cϕ to the SDP contour for a flat surface CSDP (Δz = 0) as illustrated in Figure 2.

Figure 1.

A 1-D finite rough surface profile L, z = f(x).

Figure 2.

The original contour Cϕ, the deformed contour Cδ, and the 1-D NSA parameters in the complex ϕ plane.

[6] However, when coupling between many pairs of points is considered, as in the weak region contribution to the receiving point shown in Figure 1, there is no longer a unique SDP for a rough surface along which only attenuation of the integrand is obtained away from a single saddle point. Figure 2 also shows the distribution of all possible saddle points that can exist for a given surface profile. Thus the deformed contour Cδ, as shown in Figure 2, must be chosen as a compromise between extreme exponential growth and rapid oscillation of the integrand I(ϕ). The efficiency of the spectral expansion can be improved by deforming the original contour Cϕ to the deformed contour Cδ. Employing Cauchy's residue theorem [Churchill and Brown, 1990] with the fact that there are no singularities encountered between Cϕ and Cδ for x − x′ ≥ Ls, H0(1)(k|ρ − ρ′|) can be rewritten as

$$ H_0^{(1)}(k|\rho - \rho'|) = \frac{1}{\pi} \int_{C_\delta} I(\phi)\, d\phi \tag{3} $$

Let ϕs,max and ϕmax be the outermost possible saddle point and the upper limit of integration along the real axis of Cδ, respectively, where ϕs,max = tan−1(Δzmax/Ls) and Δzmax = zmax − zmin. Note that the linear part of Cδ extends from −ϕmax to ϕmax, and the rest, where the integrand I(ϕ) behaves like an exponentially decaying function, is deformed to connect with Cϕ. Along the linear portion the contour Cδ can be expressed as

$$ \phi = \phi_R\,(1 - i\tan\delta), \qquad -\phi_{\max} \le \phi_R \le \phi_{\max} \tag{4} $$

where δ ∈ (0, π/4] and ϕR denotes the real part of the complex angle ϕ. Note that δmax = π/4 rad, which is the tilt angle of the steepest descent path CSDP for the flat surface (Δz = 0) passing through its saddle point at the origin.
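The contour deformation can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the integrand I(ϕ) = exp[ikR cos(ϕ − ϕs)] with a 1/π prefactor and a linear contour ϕ = ϕR(1 − i tan δ), and it truncates the integration at ±π/2, where the integrand is already negligible for kR = 10. The function name `hankel0_spectral` and the trapezoidal rule are illustrative choices; the reference value is the tabulated H0(1)(10) = J0(10) + iY0(10).

```python
import cmath
import math

def hankel0_spectral(kR, phi_s=0.0, delta=math.pi / 4, phi_max=math.pi / 2, n=2001):
    """Trapezoidal estimate of (1/pi) * integral of exp(i*kR*cos(phi - phi_s))
    along the assumed linear part of C_delta, phi = phi_R * (1 - i*tan(delta))."""
    slope = 1.0 - 1j * math.tan(delta)      # d(phi)/d(phi_R) on the tilted line
    h = 2.0 * phi_max / (n - 1)             # uniform step in phi_R
    total = 0.0j
    for m in range(n):
        phi = (-phi_max + m * h) * slope
        weight = 0.5 if m in (0, n - 1) else 1.0
        total += weight * cmath.exp(1j * kR * cmath.cos(phi - phi_s))
    return total * h * slope / math.pi

# Tabulated reference: H0^(1)(10) = J0(10) + i*Y0(10)
H0_AT_10 = complex(-0.2459357645, 0.0556711673)

if __name__ == "__main__":
    estimate = hankel0_spectral(10.0)
    print(estimate, abs(estimate - H0_AT_10))
```

Because the integrand decays like exp(−kR sin ϕR sinh ϕR) along the π/4 line, a short real-angle interval is enough here; for smaller kR or a smaller tilt angle the neglected end portions of the contour would matter and this truncation would no longer be adequate.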

[7] Consider the magnitude |I(ϕ)| and phase ψ(ϕ) of the integrand I(ϕ) in (3) for a single saddle point ϕs:

$$ |I(\phi)| = \exp\left[ kR\,\sin(\phi_s - \phi_R)\,\sinh(\phi_R \tan\delta) \right] \tag{5} $$
$$ \psi(\phi) = kR\,\cos(\phi_s - \phi_R)\,\cosh(\phi_R \tan\delta) \tag{6} $$

Without loss of generality, consider only ϕs > 0, i.e., assuming that Δz = z − z′ > 0 in the rest of this paper. As discussed in section 1, the 1-D NSA method involves three parameters: the tilt angle δ of the deformed contour Cδ, the domain of integration [−ϕmax, ϕmax], and the integration step size Δϕ. These parameters can be derived by considering the integrand I(ϕ) for a pair of source and field points in the worst case scenario, i.e., when ϕs = ϕs,max, because the magnitude |I(ϕ)| and phase ψ(ϕ) of the integrand I(ϕ) usually vary most rapidly for this configuration. In this section, first an analytical formula associated with δ for an arbitrary value of ϕs,max is derived, and then the derivation of analytical formulas associated with ϕmax and Δϕ follows.

2.1. Analytical Formula for δ

[8] One criterion for selecting a possible value of δ, denoted as equation image, can be obtained by limiting the maximum value of |I(ϕ)| along the contour Cδ to exp(amax), where amax is a given positive constant. Numerical tests show that a value of amax = 5.0 yields a reasonable value for δ. Figure 3 illustrates a typical plot of |I(ϕ)| versus ϕR along the deformed contour Cδ with a single saddle point ϕs = ϕs,max. Note that the maximum value of |I(ϕ)|, denoted as |I(ϕ)|max, occurs at ϕR = ϕR,o, and ϕR,o is always located between the origin and ϕs,max. Employing standard calculus, ϕR,o can be determined by maximizing equation (5) using ϕs = ϕs,max to obtain the following nonlinear equation for ϕR,o:

equation image

where ϕR,o ∈ [0, ϕs]. For a given pair of observation and source points the saddle point ϕs is fixed. Note that (7) has two unknowns, ϕR,o and equation image, so that another equation is required to uniquely specify equation image. Setting |I(ϕ)|max = exp(amax) yields another nonlinear equation:

equation image

where ϕR,o ∈ [0, ϕs]. Typically, it is found that for a given surface a possible pair of points possessing the largest saddle point ϕs,max, i.e., x − x′ = Ls and z − z′ = Δzmax, approximately yields |I(ϕ)|max at ϕR = ϕR,o. Thus, for a given amax, equation image can be determined by solving (7) and (8) simultaneously with ϕs = ϕs,max and R = Rw, where Rw is the corresponding distance between a pair of points possessing ϕs,max, defined as Rw = (Ls² + Δzmax²)^(1/2). Note that (7) and (8) can be solved simultaneously via a root-finding technique such as Muller's method [Press et al., 1992] by first guessing a value of equation image, denoted as equation image, where equation image, and then solving for ϕR,o numerically using (7). Substitute this ϕR,o into (8), and then solve for equation image numerically, where this new equation image can be used as an initial guess for the next iteration. This procedure repeats iteratively until equation image converges to the desired value within a specified tolerance. Finally, once equation image is known, the tilt angle δ is obtained as follows:

equation image

since δmax = π/4 rad as discussed in section 2. Equations (7) and (8) can be simplified for two limiting cases:

Figure 3.

Plot of |I(ϕ)| versus ϕR along the deformed contour Cδ.

2.1.1. Case 1: Small ϕs,max

[9] For small ϕs,max (0 ≤ ϕs,max ∼ 0.3 rad) it is implied that ϕR,o is also small since ϕR,o ∈ [0, ϕs,max]. Performing a Taylor's series expansion and keeping only the first term, (7) and (8) with ϕs = ϕs,max and R = Rw can be simplified to the following equations:

equation image
equation image

respectively. It should be pointed out that equation image given by Chou and Johnson [1998] is derived on the basis of the assumption that ϕs,max is small and the approximation that ϕR,o occurs at the intersection between the steepest ascent path (SAP) passing through ϕs,max (employing a linear approximation of the SAP) and the linear part of the contour Cδ as given in (4). It is found that equation image from both approaches match well when equation image approaches π/4 rad.

2.1.2. Case 2: Large ϕs,max

[10] In this case, equation image is expected to be small to avoid excessive exponential increase of |I(ϕ)| along the contour Cδ. Performing the Taylor's series expansion and keeping only the first term, (7) and (8) with ϕs = ϕs,max and R = Rw can be approximated by the following equations to solve for ϕR,o and equation image:

equation image
equation image

respectively, where D(ϕR,o) = kRw ϕR,o sin (ϕs,max − ϕR,o) and ϕR,o ∈ [0, ϕs,max]. Note that once ϕR,o is known by solving (12), equation image can be obtained via (13). To further simplify (12), performing the Taylor's series expansion and keeping up to the second term (since the difference ϕs,max − ϕR,o may not be small enough to neglect the second term), the following cubic equation for ϕR,o is obtained:

equation image

Solving (14) analytically [Abramowitz and Stegun, 1972], keeping only the real solution, and discarding a pair of complex conjugate roots, one obtains

equation image

where equation image, v = 1.5 ϕs,max, and ϕR,o ∈ [0, ϕs,max]. To obtain a more accurate value of equation image, one may use equation image and ϕR,o given in (13) and (15), respectively, as an initial guess in solving (7) and (8) simultaneously. It is interesting to point out that (12) and (13) can be simplified to (10) and (11), respectively, for the case of small ϕs,max. Thus equation image obtained from (12) and (13) is expected to work well for both small and large ϕs,max. Numerical results also show that equation image from (12) and (13) usually provides a good estimate for the exact equation image.
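The selection of δ described in section 2.1 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: it assumes the contour ϕ = ϕR(1 − i tan δ), so that ln|I(ϕ)| = kRw sin(ϕs,max − ϕR) sinh(ϕR tan δ); the peak condition is then tanh(ϕR tan δ) = tan δ tan(ϕs,max − ϕR) and the magnitude condition is ln|I(ϕ)|max = amax. Nested bisection is swapped in for the alternating Muller iteration mentioned in the text, and the names `solve_tilt_angle` and `_bisect` are hypothetical.

```python
import math

def _bisect(f, lo, hi, iters=80):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0.0) == (f_mid < 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def solve_tilt_angle(delta_z_max, L_s, a_max=5.0, wavelength=1.0):
    """Return (delta, phi_Ro) in radians for the worst-case point pair."""
    k = 2.0 * math.pi / wavelength
    phi_s = math.atan2(delta_z_max, L_s)        # outermost saddle point
    kRw = k * math.hypot(L_s, delta_z_max)      # k * R_w

    def peak_location(t):
        # Stationary point of ln|I|: tanh(t*p) = t*tan(phi_s - p), p in [0, phi_s]
        if phi_s == 0.0:
            return 0.0
        return _bisect(lambda p: math.tanh(t * p) - t * math.tan(phi_s - p),
                       0.0, phi_s)

    def peak_exponent(t):
        p = peak_location(t)
        return kRw * math.sin(phi_s - p) * math.sinh(t * p)   # ln|I|max

    if peak_exponent(1.0) <= a_max:             # cap at delta_max = pi/4
        return math.pi / 4, peak_location(1.0)
    t = _bisect(lambda s: peak_exponent(s) - a_max, 1e-9, 1.0)
    return math.atan(t), peak_location(t)

if __name__ == "__main__":
    delta, phi_ro = solve_tilt_angle(5.0, 1.0)  # lengths in wavelengths
    print(math.degrees(delta), phi_ro)
```

For Δzmax = 5.0λ and Ls = 1.0λ this reproduces, to within rounding, the tilt angle of 19.406° quoted in section 2.3, which supports the assumed forms of the two conditions.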

2.2. Analytical Formula for ϕmax

[11] The maximum of the domain of integration ϕmax can be determined analytically by considering the distribution of |I(ϕ)| along the deformed contour Cδ. It is found that ϕmax corresponding to ϕs = ϕs,max and R = Rw is usually the largest, and |I(ϕ)| for this case is exponentially decaying and less than 1 outside the interval [0, ϕs,max] as illustrated in Figure 3. Thus ϕmax can be determined by limiting the value of |I(ϕ)| along Cδ to exp(−b), where b is typically equal to 6. A possible value of ϕmax, denoted as equation image, is then obtained by solving the following nonlinear equation:

equation image

where equation image > ϕs,max. Note that (16) can be solved for equation image via a root-finding technique such as Muller's method. The condition equation image > ϕs,max is imposed to avoid another possible invalid solution of equation image, denoted as ϕl (ϕl < 0), as shown in Figure 3, and the analytical formula for ϕmax is given as follows:

equation image

where the maximum value of ϕmax is equal to π/2 rad as shown in Figure 2. For the case of very large ϕs,max (ϕs,max close to π/2 rad) it should be pointed out that ϕmax is usually equal to π/2 rad, and the domain of integration should include the portion of the vertical lines at ±π/2 rad as shown in Figure 2 to allow the integrand I(ϕ) to decay sufficiently. This increase in the domain of integration can be avoided by properly adjusting the size of the strong region Ls to trade off the computation of the contributions from the strong region and the weak region. No results including the vertical portion of the contour at ±π/2 rad are included in this paper.
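The determination of ϕmax can be sketched in the same spirit. The example below assumes ln|I(ϕ)| = kRw sin(ϕs,max − ϕR) sinh(ϕR tan δ) along the linear part of Cδ, searches for the point beyond ϕs,max where this exponent reaches −b, and caps the result at π/2. The function name `integration_limit` and the use of bisection instead of Muller's method are illustrative choices.

```python
import math

def integration_limit(delta_z_max, L_s, delta, b=6.0, wavelength=1.0):
    """Return phi_max (radians): the point beyond the outermost saddle point
    where |I| has decayed to exp(-b), capped at pi/2."""
    k = 2.0 * math.pi / wavelength
    phi_s = math.atan2(delta_z_max, L_s)
    kRw = k * math.hypot(L_s, delta_z_max)
    t = math.tan(delta)

    def log_mag(p):
        # Assumed exponent of |I| on the tilted line
        return kRw * math.sin(phi_s - p) * math.sinh(t * p)

    half_pi = math.pi / 2
    if log_mag(half_pi) > -b:       # never reaches exp(-b) before pi/2
        return half_pi
    lo, hi = phi_s, half_pi         # log_mag decreases from 0 on this interval
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if log_mag(mid) > -b:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Rough surface example of section 2.3 (lengths in wavelengths)
    print(integration_limit(5.0, 1.0, math.radians(19.406)))
    # Flat surface with a 10-wavelength strong region
    print(integration_limit(0.0, 10.0, math.pi / 4))
```

For the large saddle point of the section 2.3 example the cap at π/2 is reached, consistent with the remark that ϕmax is usually π/2 rad when ϕs,max is very large; for the flat case the limit is close to (b/kLs)^(1/2).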

2.3. Analytical Formula for Δϕ

[12] The integration step size Δϕ can be determined by considering the variation of the integrand I(ϕ) (see equation (2)) along the deformed contour Cδ with ϕs = ϕs,max and R = Rw and sampling I(ϕ) according to its highest-frequency component. Let fmax be the maximum frequency of I(ϕ) (cycles per radian) obtained by taking the discrete Fourier transform (DFT) of I(ϕ) (via the fast Fourier transform (FFT) algorithm) and searching for its maximum frequency component. The number of sampling points used to determine fmax in the FFT computation is increased until no significant aliasing effects are observed. Assuming that I(ϕ) is a band-limited function, the integration step size ΔϕFFT can then be expressed as

$$ \Delta\phi_{FFT} = \frac{1}{N_s f_{\max}} \tag{18} $$

where Ns is the number of samples per cycle and Ns is typically equal to 8. For illustration, consider the case of Δz = Δzmax = 5.0λ and x − x′ = Ls = 1.0λ, where λ is the electromagnetic wavelength in free space. Using (7)–(9), it is found that the appropriate tilt angle δ for this case is equal to 19.406°. Figures 4a and 4b plot the real and imaginary parts of I(ϕ), denoted as IR(ϕ) and II(ϕ), respectively, along Cδ versus the real angle ϕR. From the plots it is observed that both IR(ϕ) and II(ϕ) are quite oscillatory. In addition, the plots of the corresponding magnitudes of the DFT of IR(ϕ) and II(ϕ) versus the frequency variable f are also shown in Figures 4c and 4d, respectively. Note that both IR(ϕ) and II(ϕ) are band-limited functions, and thus their maximum frequency can be determined by finding the frequency such that its DFT magnitude normalized by the maximum DFT magnitude is less than a given tolerance (typically 0.001). Using this procedure, it is found that the maximum frequencies for IR(ϕ) and II(ϕ), denoted as fmax,R and fmax,I, respectively, are equal to 7.0 cycles/rad. Thus the maximum frequency fmax of I(ϕ), which is the maximum of fmax,R and fmax,I, is equal to 7.0 cycles/rad, and the integration step size Δϕ via the “FFT” approach is equal to 1.023°.

Figure 4.

(a)–(d) Plots of the real and imaginary parts of I(ϕ) versus ϕR and their discrete Fourier transforms versus f.

[13] For the case of large ϕs,max, which frequently occurs in practice when Ls is optimized for a given Δzmax, the phase component of I(ϕ), exp[iψ(ϕ)], typically varies more rapidly than the magnitude term |I(ϕ)|. Thus it is reasonable to consider only the variation of exp[iψ(ϕ)] instead of I(ϕ) to determine Δϕ. It should be pointed out that |I(ϕ)| acts as a band-pass filter, i.e., suppressing the components outside the interval [ϕl, ϕmax], as illustrated in Figure 3. Following the same procedure as in section 2.2 results in the following equations to solve for ϕl:

equation image
equation image

where equation image is a possible value of equation image and the parameter b is defined in section 2.2. Performing a first-order Taylor's series expansion on ψ(ϕR) at a local point equation image, where

equation image

and equation image, ψ(ϕR) and equation image can be approximated as

equation image
equation image
equation image

where

equation image
equation image

Note that the first term equation image in (24) is a constant and the second term equation image is a linear phase term. Thus Δϕ can be determined analytically by considering only the highest-frequency component contained in the second term; this occurs when equation image is maximum. If qmax denotes max equation image, where equation image, the approximate analytical formula of Δϕ is

$$ \Delta\phi = \frac{2\pi}{N_s\, q_{\max}} \tag{27} $$

The value qmax can be determined by simply using a numerical searching procedure or using a standard maximization/minimization technique as described by Press et al. [1992]. It should be pointed out that the FFT approach takes both magnitude and phase variations of I(ϕ) into account at the cost of the computation of the Fourier transform of I(ϕ), and ΔϕFFT is therefore employed as a reference solution. For the case of large ϕs,max, numerical tests show that Δϕ obtained from (27) yields reasonable results as compared to ΔϕFFT in (18) even though the variation of the magnitude term |I(ϕ)| is neglected. For the case considered in the FFT approach, ϕs,max is equal to 78.69°, which is quite large, and Δϕ in (27) can be shown to be 1.398°, which is close to ΔϕFFT = 1.023°. Note that once ϕmax and Δϕ are known, the total number of plane waves QTOT employed in the 1-D NSA algorithm is

equation image

where ⌈·⌉ denotes the ceiling operator, i.e., rounding its argument to the nearest integer towards plus infinity.
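The two step-size estimates of section 2.3 can be compared with a short sketch. It assumes the phase ψ(ϕR) = kRw cos(ϕs,max − ϕR) cosh(ϕR tan δ) along the linear part of Cδ, takes qmax as the largest |dψ/dϕR| found by a simple numerical scan over [ϕl, ϕmax], and uses Δϕ = 2π/(Ns qmax) for the phase-based estimate and ΔϕFFT = 1/(Ns fmax) for the reference. The lower limit ϕl is located by stepping left from the origin until ln|I| falls to −b; the function names and the scan resolution are illustrative.

```python
import math

def step_size_fft(f_max, N_s=8):
    """Reference step size from the highest frequency (cycles per radian)."""
    return 1.0 / (N_s * f_max)

def step_size_phase(delta_z_max, L_s, delta, phi_max=math.pi / 2,
                    b=6.0, N_s=8, wavelength=1.0, n_scan=20001):
    """Phase-only step size 2*pi/(N_s*q_max), with q_max = max |d(psi)/d(phi_R)|."""
    k = 2.0 * math.pi / wavelength
    phi_s = math.atan2(delta_z_max, L_s)
    kRw = k * math.hypot(L_s, delta_z_max)
    t = math.tan(delta)

    def log_mag(p):
        return kRw * math.sin(phi_s - p) * math.sinh(t * p)

    # Lower edge phi_l of the band where |I| exceeds exp(-b)
    phi_l, step = 0.0, 1e-4
    while phi_l > -phi_max and log_mag(phi_l) > -b:
        phi_l -= step

    def dpsi(p):
        # Derivative of psi = kRw*cos(phi_s - p)*cosh(t*p) with respect to p
        return kRw * (math.sin(phi_s - p) * math.cosh(t * p)
                      + t * math.cos(phi_s - p) * math.sinh(t * p))

    h = (phi_max - phi_l) / (n_scan - 1)
    q_max = max(abs(dpsi(phi_l + m * h)) for m in range(n_scan))
    return 2.0 * math.pi / (N_s * q_max)

if __name__ == "__main__":
    dphi = step_size_phase(5.0, 1.0, math.radians(19.406))
    print(math.degrees(dphi), math.degrees(step_size_fft(7.0)))
```

With Δzmax = 5.0λ, Ls = 1.0λ, and δ = 19.406°, the phase-based estimate comes out near the 1.398° quoted in the text, and fmax = 7.0 cycles/rad gives the quoted 1.023°.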

3. The 1-D “Multilevel” NSA Algorithm

[14] For the case of 1-D extremely large scale quasi-planar structures the 1-D NSA parameters given by Chou and Johnson [1998] may yield inaccurate results because the complex radiation function I(ϕ) of a source group far separated from the receiving element decays rapidly along the deformed contour Cδ away from the origin in the complex angular ϕ plane, so that a significantly higher sampling rate in ϕ is needed. To maintain the accuracy of the 1-D NSA algorithm without significantly degrading its efficiency, the very large weak region needs to be decomposed into two or more weak regions, and appropriate 1-D NSA parameters must be determined separately for each weak region. In this section, a PO approximation for surface currents and a flat surface assumption are employed to determine appropriate sizes for each weak region. These choices allow useful analytical expressions to be obtained; the flat surface assumption is reasonable because all saddle points are in the neighborhood of the origin of the ϕ plane for observation and source point pairs with source points in the additional weak regions.

[15] Without loss of generality, consider a transverse electric (TE, or horizontally polarized) plane wave impinging on a flat perfect electric conductor (PEC) half plane as illustrated in Figure 5, where

equation image

kx = k sin θi, kz = k cos θi, θi is an incident angle measured from the z− axis and E0 is the amplitude constant of the incident field. In Figure 5, xn is a receiving point on the half plane, Ls is the size of the strong region, and the weak region of size Lw extends from the origin to the source point xw. Using the PO approximation, the electric field of the weak region at the receiving point xn, denoted as equation image, can be expressed as

equation image

where xn − x′ > 0, x′ ∈ [0, xw], equation image and

equation image

Employing equation (1) with R = xn − x′ ≥ Ls and ϕs = 0, i.e.,

equation image
equation image

Ew(xn) defined in (30) can be rewritten as

equation image

Interchanging the spatial domain integration and the contour integration, and then performing the spatial domain integration analytically, Ew (xn) can be expressed in terms of the plane wave spectrum of the flat surface F(ϕ) as follows:

equation image

where

equation image
equation image
equation image
equation image

In contrast to section 2, which was based on analysis of spectral integral properties for a single source-observation point pair, the analytical form for F(ϕ) in (37) captures contributions from all source points in the weak region. The result obtained consists of effective contributions to the field at observation point xn from the nearest source point in the weak region x = xw [G1(ϕ)] and from the farthest source point x = 0 [G2(ϕ)].

Figure 5.

A flat PEC half plane illuminated by a TE plane wave.

[16] In the 1-D NSA algorithm, Ew(xn) in (35) is evaluated numerically along the deformed contour Cδ instead of Cϕ, where δ = π/4 rad for the flat surface case. Using (4) with δ = π/4 rad, the contributions G1(ϕ) and G2(ϕ) along Cδ (δ = π/4) simplify to

equation image
equation image

where κ1 = kLs, κ2 = kxn, and Dg(ϕ) = k(cos ϕR cosh ϕR + i sin ϕR sinh ϕR) + kx. Note that κ2 ≫ κ1 for large weak regions. The contributions G1(ϕ) and G2(ϕ) decay exponentially away from the saddle point along Cδ because sin ϕR sinh ϕR > 0 along Cδ (δ = π/4). Thus, in the neighborhood of the saddle point ϕs = 0, G1(ϕ) and G2(ϕ) can be approximated as

equation image
equation image

The distributions of G1(ϕ) and G2(ϕ) are thus approximately Gaussian and centered at the origin; the half widths of Gi(ϕ), denoted as equation image, can be determined by setting

equation image

where i = 1 or 2 and τ is a positive constant (typically equal to 10). Solving for equation image in (44) yields

equation image

Performing the same analysis as above, it can also be shown that equation image and equation image correspond to the half widths of the integrand of the Hankel function If(ϕ) along the deformed contour Cδ (δ = π/4), where x′ = xw and 0, respectively. Thus, based on the PO and flat surface assumptions, one can consider only the source points located at the edges of the weak region instead of the whole source group, as far as the width of the integrand F(ϕ) is concerned. Figure 6 plots equation image, where i = 1 or 2, and the difference between these two versus ϕR, where Ls = 10λ and xn = 5000λ. Using (45) with τ = 10, it is found that equation image and equation image. As expected, equation image, and the difference varies most rapidly in the neighborhood of the origin.

Figure 6.

Plots of equation image, where i = 1 or 2, and their difference versus ϕR.
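The half widths discussed above can be illustrated numerically. The sketch below is an assumption-laden reconstruction, not equation (45) itself: it takes the Gaussian approximation |Gi| ∝ exp(−κi ϕR²) implied by sin ϕR sinh ϕR ≈ ϕR² near the origin and defines the half width as the point where this factor falls to exp(−τ), which gives (τ/κi)^(1/2). The function names are hypothetical.

```python
import math

def half_width(kappa, tau=10.0):
    """Half width of exp(-kappa * phi_R**2) at the exp(-tau) level (assumed form)."""
    return math.sqrt(tau / kappa)

def exact_decay_exponent(kappa, phi_r):
    """Exponent of the exact decay factor exp(-kappa*sin(phi_R)*sinh(phi_R))
    on the pi/4 contour, for comparison with the Gaussian model."""
    return -kappa * math.sin(phi_r) * math.sinh(phi_r)

if __name__ == "__main__":
    k = 2.0 * math.pi                        # wavelength normalized to 1
    for label, distance in (("nearest edge, Ls", 10.0), ("farthest edge, xn", 5000.0)):
        kappa = k * distance
        w = half_width(kappa)
        # The exact exponent at the Gaussian half width should be close to -tau
        print(label, w, math.degrees(w), exact_decay_exponent(kappa, w))
```

For Ls = 10λ and xn = 5000λ the two half widths differ by a factor of (500)^(1/2), roughly 22, which is why a single uniform step size fine enough for the narrow contribution is wasteful for the wide one.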

[17] Let equation image be appropriate angular step sizes employed in sampling F(ϕ), G1(ϕ), and G2(ϕ), respectively. Because of the Gaussian distribution of Gi(ϕ), it is found that

equation image

where i = 1 or 2, yields accurate results. If F(ϕ) is sampled using a uniform sampling rate, Δϕ must be set to be equation image to obtain accurate results, and as the weak region increases, Δϕ decreases in inverse proportion to equation image, which degrades the efficiency of the 1-D NSA algorithm significantly.

[18] If the strong region is also electromagnetically large (kLs ≫ 1), Ew (xn) can be evaluated asymptotically using the method of steepest descent [Felsen and Marcuvitz, 1973], and the following asymptotic result is obtained:

equation image

The first and second terms in (47) correspond to the contributions G1(ϕ) and G2(ϕ), respectively. Because κ2 ≫ κ1, the contribution from G2(ϕ) is much less than the contribution from G1(ϕ) in (47). To maintain the efficiency of the 1-D NSA algorithm, it is thus reasonable to consider only the contribution from G1(ϕ) for F(ϕ), i.e., to trade off accuracy for efficiency, and this procedure is employed in the development of the original NSA algorithm [Chou and Johnson, 1998].

[19] One way to maintain the accuracy of the 1-D NSA algorithm without significantly degrading its efficiency is to decompose the very large weak region into more than one weak region as illustrated in Figure 7. Figure 7 shows the decomposition of the very large weak region of size Lw into M weak regions of size Lw,j, where j = 1, 2,…, M and Lw = Σ_{j=1}^{M} Lw,j. It will be shown later that for a given accuracy, Lw,j+1 ≫ Lw,j, and each weak region except the first requires the same number of plane waves. For the new proposed scheme shown in Figure 7, it is preferable to rewrite F(ϕ) as

equation image

where

equation image

xw,0 = xw, and xw,M = 0. Next consider the procedure for determining the size of each weak region Lw,j.

Figure 7.

Decomposition of the very large weak region of size Lw into M weak regions of size Lw,j, where j = 1, 2,…, M and Lw = Σ_{j=1}^{M} Lw,j.

[20] For illustration, consider the first weak region of size Lw,1 located between xw,1 and xw,0 as shown in Figure 7. As pointed out earlier in this section, considering only the source points located at the edges of the weak region of interest is sufficient to obtain information about the width of the entire source group in that weak region. In this case, the width of the integrand F1(ϕ) associated with the first weak region is governed by the edge points xw,1 and xw,0. Let Ww,1 be the width of the integrand I(ϕ; R = xn − xw,1, ϕs = 0) along Cδ, which is given by (see equation (45))

equation image

If the edge point xw,1 is not too far separated from the observation point xn, the angular step size Δϕ1 for the flat surface case, employed in accurately sampling the integrand F1(ϕ), is equal to the angular step size associated with the integrand I(ϕ; R = Ls, ϕs = 0); that is,

equation image

For a given accuracy the largest size of the first weak region Lw,1 for the flat surface case can be determined by setting Δϕ1 to be a fraction of Ww,1; that is,

equation image

where α is a positive constant (α ≥ 1). Solving for Lw,1 results in

equation image

Note that for a given Ls, Lw,1 depends only on α^(−2); i.e., a smaller α value yields larger Lw,1 values with less accuracy and vice versa. Typically, α = 8 yields quite accurate results, and Lw,1 is found to be 59.5 Ls. Using the same procedure as described above for other weak regions, it can be shown that the appropriate sizes of each weak region are given by the following equation:

equation image

where Lw,0 is defined to be Ls. From (54) with α = 8, it is clear that Lw,j ≫ Lw,j−1; for example, Lw,1 = 59.5 Ls, Lw,2 = 3599.75 Ls, Lw,3 = 217,784.875 Ls, and Lw,4 = 13,175,984.9375 Ls. In addition, the contribution from the jth weak region of interest decreases at the rate of equation image as predicted by (47), where κw,j = k Σ_{n=0}^{j−1} Lw,n and j = 1, 2,…, M. Thus, for 1-D extremely large scale QPS in most practical problems, only a few weak regions are necessary to obtain accurate results.
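The growth of the weak-region sizes can be reproduced with a short sketch. It rests on one observation drawn from the numbers quoted above for α = 8: each region is 59.5 times the total distance already covered (strong region plus preceding weak regions), so the covered distance grows by a factor of 60.5 per level. That factor is taken here as an input named `growth`, a hypothetical parameter; its general dependence on α is not reconstructed.

```python
def weak_region_sizes(L_s, M, growth=60.5):
    """Sizes L_w,1..L_w,M, assuming each region is (growth - 1) times the
    distance already covered (L_s plus all preceding weak regions)."""
    sizes = []
    covered = L_s                       # plays the role of L_w,0 = L_s
    for _ in range(M):
        L_wj = (growth - 1.0) * covered
        sizes.append(L_wj)
        covered += L_wj
    return sizes

def regions_needed(L_s, L_w, growth=60.5):
    """Smallest number of weak regions whose total size is at least L_w."""
    count, covered = 0, L_s
    while covered < L_s + L_w:
        covered *= growth
        count += 1
    return count

if __name__ == "__main__":
    print(weak_region_sizes(1.0, 4))        # in units of L_s
    print(regions_needed(10.0, 4990.0))     # 5000-wavelength surface, L_s = 10
```

Under this assumption the first four sizes match the values listed in the text, and a surface of 5000λ with Ls = 10λ is covered by two weak regions, in line with the remark that only a few regions are needed in practice.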

[21] Although the flat surface assumption was used in deriving the preceding equations, Lw,1 can still be determined via (52) for rough surfaces except that Δϕ1 derived for the case of rough surfaces must be employed, i.e., using Δϕ1 as given in (18) or (27) for the case of large ϕs,max. Thus, solving for Lw,1 in (52) yields

equation image

and the appropriate sizes of the other weak regions for the case of rough surfaces are still given as in (54), except that j runs from 2 to M. As in the case of the flat surface, only a few weak regions (or even one) are required for most practical problems to obtain accurate results. Appropriate sizes for each weak region have now been determined completely. Next, consider the formulas associated with the 1-D NSA parameters δj, ϕmax,j and Δϕj, where j = 1, 2,…, M, employed in the 1-D “multilevel” NSA algorithm.

3.1. Analytical Formula for δj

[22] For the first weak region, the tilt angle δ1 is given as in (9) (also see (11) and (13) for the two limiting cases). For other weak regions, δj is set to be the tilt angle for the flat surface, i.e.,

equation image

due to the fact that the saddle points associated with these weak regions are in the neighborhood of the saddle point of the flat surface; that is, the flat surface is a good approximation to obtain δj.

3.2. Analytical Formula for ϕmax,j

[23] The maximum of the domain of integration of the first weak region ϕmax,1 is given as in (17). For other weak regions, on the basis of the PO and flat surface assumptions, it can be shown that the associated integrands Fj(ϕ) (see (49)) along the deformed contour Cδ (δ = π/4) are distributed approximately as the superposition of wide and narrow Gaussians centered at the origin. In addition, the support of each Fj(ϕ) is governed by the source point located at the leading edge (closer to the observation point xn as shown in Figure 7), and thus ϕmax,j is given by the following equation (see equation (45)):

equation image

where κw,j = k Σ_{n=0}^{j−1} Lw,n. Note that ϕmax,j decreases significantly from one weak region to the next since Lw,j+1 ≫ Lw,j.

3.3. Analytical Formula for Δϕj

[24] For the first weak region the appropriate angular step size Δϕ1 is given as in (18) or (27) for the case of large ϕs,max, and the number of plane waves QTOT,1 is given as in (28). Because of the Gaussian-like distribution of Fj(ϕ) for other weak regions as discussed in section 3.2, the angular step size Δϕj is given as follows (see equation (46)):

equation image

Note that the number of plane waves QTOT,j required for each weak region (except the first weak region) is the same, and it is given as follows:

equation image
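The κ-dependence of ϕmax,j and Δϕj described above can be illustrated numerically. The following is only a sketch: it assumes the Gaussian-width forms ϕmax,j = sqrt(a/κw,j) and Δϕj = sqrt(b/κw,j)/c, and the constants a = 10, b = 5, and c = 22 are not taken from the equations of this section but are chosen so that the sketch reproduces the second-region values quoted in section 4 (ϕmax,2 = 3.913° and Δϕ2 = 0.126° for Ls = 1.5λ and Lw,1 = 339.7λ). All function and parameter names are hypothetical.

```python
import math


def weak_region_parameters(kappa, a=10.0, b=5.0, c=22.0):
    """Return (phi_max, delta_phi) in radians for electrical distance kappa.

    Assumed Gaussian-width forms (not copied from the paper's equations):
        phi_max   = sqrt(a / kappa)
        delta_phi = sqrt(b / kappa) / c
    The defaults a = 10, b = 5, c = 22 are chosen to reproduce the
    second-region values quoted in section 4.
    """
    phi_max = math.sqrt(a / kappa)
    delta_phi = math.sqrt(b / kappa) / c
    return phi_max, delta_phi


def kappa_from_wavelengths(distance_in_wavelengths):
    """kappa = k * distance, with k = 2*pi/lambda."""
    return 2.0 * math.pi * distance_in_wavelengths


if __name__ == "__main__":
    # Case A of section 4: Ls = 1.5 lambda, Lw,1 = 339.7 lambda.
    kappa = kappa_from_wavelengths(1.5 + 339.7)
    phi_max, delta_phi = weak_region_parameters(kappa)
    print(f"phi_max,2   = {math.degrees(phi_max):.3f} deg")
    print(f"delta_phi,2 = {math.degrees(delta_phi):.3f} deg")
    # The ratio below does not depend on kappa, which is consistent with
    # the sample count being the same for every weak region after the first.
    print(f"2*phi_max/delta_phi = {2.0 * phi_max / delta_phi:.1f}")
```

Because both quantities scale as κw,j^(−1/2) under these assumptions, their ratio is a constant, which matches the statement that QTOT,j is the same for every weak region beyond the first.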

4. Numerical Results

[25] To investigate the validity of the 1-D NSA parameters derived in sections 2 and 3, the 1-D NSA parameters are evaluated numerically for two different surface heights: Δzmax = 5.0λ (case A) and Δzmax = 15.0λ (case B). First, it is interesting to see the variations of the 1-D NSA parameters for the first weak region as a function of the size of the strong region Ls. Figure 8 illustrates the 1-D NSA parameters versus Ls (in λ) for case A. Figure 8a plots the tilt angle δ1 (in degrees) calculated by using the exact and approximated formulas as described in section 2.1 versus Ls. From the plot, the δ1 based on the small ϕs,max approximation (see equations (9) and (11)) has some discrepancy with the exact δ1 (see equations (7)–(9)) for small Ls, i.e., large ϕs,max. Note that for a given roughness, ϕs,max decreases as Ls increases as shown in Figure 8c. As Ls increases, the δ1 based on the small ϕs,max approximation and the exact δ1 approach the π/4 tilt angle for the flat surface, and finally they are equal to π/4 rad (45°) at Ls ≈ 7.0λ (ϕs,max = 35.535°). In addition, the δ1 based on the large ϕs,max approximation (see equations (9) and (13)) and the exact δ1 are in good agreement for all values of Ls, even for moderate and large Ls, corresponding to moderate and small ϕs,max, respectively. Thus the δ1 based on the large ϕs,max approximation seems to provide a good estimate for the exact δ1.

Figure 8.

(a–d) Plots of the 1-D NSA parameters in the first weak region versus the size of the strong region Ls for the case of Δzmax = 5.0λ.

[26] Figure 8b exhibits a plot of the integration step size Δϕ1 (in degrees) versus Ls for both the “FFT” approach (see equation (18)) and the approximate formula as given in (27). Note that Δϕ1 calculated by these approaches follows the same trend only when Ls is small, because the approximate Δϕ1 is derived on the basis of the assumption that the phase component of the integrand I(ϕ) varies more rapidly than its magnitude, which is valid only when ϕs,max is large; that is, Ls is small. In general, ΔϕFFT provides better results and should be employed in the 1-D NSA algorithm. Furthermore, Figure 8c plots the maximum domain of integration ϕmax,1 and the maximum saddle point ϕs,max (in degrees) versus Ls. Note that ϕmax,1 is always greater than ϕs,max as expected, and both ϕmax,1 and ϕs,max decrease as Ls increases.

[27] Figure 8d plots the number of plane waves QTOT,1 computed by using ΔϕFFT versus Ls. Note that QTOT,1 tends to decrease as Ls increases; that is, there is a trade-off between the weak region computation, which is related to the number of plane waves, and the strong region computation, which depends on the size of the strong region. In practice, this plot serves as a tool in selecting the appropriate Ls to optimize the 1-D NSA algorithm.

[28] Figures 9a and 9b plot the size of the first weak region Lw,1 (see equation (55)) and the relative error (in percent) in computing the 2-D scalar free space Green's function in the angular spectral domain for Δx = Ls (the shortest horizontal distance between the observation point and the source point in the first weak region) and Δx = Ls + Lw,1 (the farthest horizontal distance between the observation point and the source point in the first weak region) versus Ls for case A, respectively. Both Lw,1 and the relative error are computed using a sampling rate determined from the FFT algorithm (ΔϕFFT). From Figure 9a, Lw,1 tends to increase as Ls increases; that is, a larger strong region yields a larger size of the first weak region, as expected. From Figure 9b the relative error for Δx = Ls is almost constant (0.015% on average), and thus it is implied that the derived 1-D NSA parameters work quite well. In addition, the relative error for Δx = Ls + Lw,1 is about 1.0%, indicating that the choice of Lw,1 is acceptable. Although this relative error is larger than at distance Δx = Ls, the contribution from the point at Δx = Ls + Lw,1 is much smaller than that from Δx = Ls, so that a larger level of error may be tolerable. It will be shown later in this section that by incorporating the second weak region into the NSA algorithm the accuracy can be improved. Next, consider the 1-D NSA parameters for the rougher surface case B.

Figure 9.

Plots of (a) the size of the first region Lw,1 and (b) the relative error (in percent) in computing the free space Green's function in the angular spectral domain versus Ls for the case of Δzmax = 5.0λ.

[29] Figure 10 plots the 1-D NSA parameters in the first weak region versus Ls for case B. Figure 10a demonstrates the plot of δ1 (in degrees) versus Ls. As in case A, the exact δ1 and the δ1 based on the large ϕs,max approximation are in very good agreement for all values of Ls. In addition, the δ1 based on the small ϕs,max approximation is slightly different from the exact δ1 for small Ls (large ϕs,max), and the former tends to approach the latter as Ls increases. Figures 10b, 10c, and 10d plot Δϕ1 (in degrees), ϕmax,1, and QTOT,1 versus Ls, respectively. Similar behaviors to those in case A are observed. Comparing Figures 8d and 10d, it is observed that for a given ϕs,max, QTOT,1 tends to increase as Δzmax increases; that is, as the roughness increases, more plane waves are required to obtain the same accuracy. Thus QTOT,1 depends on both ϕs,max and Δzmax in general.

Figure 10.

(a–d) Plots of the 1-D NSA parameters in the first weak region versus Ls for the case of Δzmax = 15.0λ.

[30] Figures 11a and 11b plot Lw,1 and the relative error (in percent) for Δx = Ls and Δx = Ls + Lw,1 versus Ls for case B, respectively. As in case A, Lw,1 tends to increase as Ls increases. Comparing Lw,1 between cases A and B, one can see that for a given ϕs,max, Lw,1 seems to increase as Δzmax increases. This is due to the fact that Lw,1 depends on Δϕ1 (see equation (55)) and Δϕ1 tends to decrease as the roughness increases. From Figure 11b the average relative errors for Δx = Ls and Δx = Ls + Lw,1 are about 0.025% and 1.0%, respectively. Thus the derived 1-D NSA parameters work reasonably well for both cases. It should be pointed out that for a very flat surface case, the derived 1-D NSA parameters also work very well, and the tilt angle δ1 for the first weak region for this case is equal to π/4 rad as expected.

Figure 11.

Plots of (a) Lw,1 and (b) the relative error (in percent) in computing the free space Green's function in the angular spectral domain versus Ls for the case of Δzmax = 15.0λ.

[31] It is also of interest to see the accuracy improvement of the 1-D NSA algorithm by incorporating the multilevel approach. Figures 12a and 12b plot the relative error (in percent) using the one-level and two-level NSA algorithms versus the horizontal distance between source and observation points Δx = x − x′ (in λ) for case A and case B, respectively. Unlike previous figures, which were plotted versus the strong region size Ls, Figure 12 illustrates the relative error in the one-level FB/NSA method as the distance between source and observation points increases given that NSA parameters have been derived for a fixed value of Ls. The two-level FB/NSA result employs NSA parameters for the second weak region chosen based on the distance Ls + Lw,1 (assuming that Δzmax ≪ Ls + Lw,1) as described in section 3, and uses these same parameters as Δx increases beyond Ls + Lw,1. In the plots in Figure 12 the minimum Δx is equal to Ls in the 1-D NSA algorithm. Parameters for cases A and B are Ls = 1.5λ (ϕs,max = 73.281°) and Ls = 4.0λ (ϕs,max = 75.057°), respectively. The 1-D NSA parameters for case A are given as follows: δ1 = 21.429°, ϕmax,1 = 89.553°, Δϕ1 = 0.974°, Lw,1 = 339.7λ, δ2 = π/4 rad, ϕmax,2 = 3.913°, Δϕ2 = 0.126°, and Lw,2 = 20,302.6λ. For case B the 1-D NSA parameters are given as follows: δ1 = 7.276°, ϕmax,1 = π/2 rad, Δϕ1 = 0.43°, Lw,1 = 1740.2λ, δ2 = π/4 rad, ϕmax,2 = 1.73°, Δϕ2 = 0.055°, and Lw,2 = 103,777.7λ. From both plots it is obvious that incorporating the second weak region yields appreciable reduction of the relative error. Note that as the surface size increases, additional weak regions are required to maintain the accuracy. However, as pointed out in section 3, Lw,j ≫ Lw,j−1, and the contribution from the jth weak region of interest decreases at the rate of equation image as predicted by (47) based on the PO and flat surface assumptions, where κw,j = k Σ_{n=0}^{j−1} Lw,n and j = 1, 2,…, M. Thus only a few weak regions (or even one) are required for most practical problems to obtain accurate results.

Figure 12.

Plots of the relative error (in percent) from the one-level and two-level NSA algorithms versus Δx = x − x′ (in λ): (a) Δzmax = 5.0λ and (b) Δzmax = 15.0λ.

[32] To illustrate the accuracy of the 1-D FB/NSA method with the new NSA parameters, consider a deterministic 1024λ PEC rough surface illuminated by a transverse magnetic (TM or vertically polarized) tapered plane wave at 14 GHz with the taper parameter G = 5 [Thorsos, 1988] at an incident angle of 85°. For the 1024λ surface size chosen, incident fields are approximately 54.3 dB down at the surface edges, and thus the edge effects are negligible. A deterministic realization of a zero-mean Gaussian stochastic process described by a Gaussian spectrum is used; the spectrum is described by

equation image

where W(kf) represents the spectrum amplitude in m3, kf represents the spatial wave number of the surface in rad/m, σh refers to the root-mean-square (RMS) of surface heights in meters, and l is the surface correlation length in m. For case A the parameters of the Gaussian spectrum W(kf) are chosen as follows: σh = 0.7λ and l = 1.414 σh, while for case B, σh = 2.1λ and l = 1.414 σh. The surface realization chosen satisfies Δzmax = 5λ for case A and 15λ for case B.
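Two of the numbers in this setup can be checked by hand, and the sketch below does so under stated assumptions. First, it assumes the incident field carries a Gaussian amplitude taper exp(−(x/g)²) with g = L/G, evaluated on the mean plane at the edges x = ±L/2; this reproduces the quoted edge level of about 54.3 dB for G = 5. Second, it assumes the commonly used Gaussian spectrum W(kf) = σh² l/(2√π) exp(−kf² l²/4), which has the stated units of m³ and integrates over kf to σh²; this form is an assumption and may differ from the equation above by normalization convention. Function names are illustrative.

```python
import math


def edge_taper_db(taper_parameter):
    """Field level (dB) at the surface edges x = +/- L/2.

    Assumes a Gaussian amplitude taper exp(-(x/g)**2) with g = L/G,
    evaluated on the mean plane z = 0.
    """
    exponent = -(taper_parameter / 2.0) ** 2
    return 20.0 * math.log10(math.exp(exponent))


def gaussian_spectrum(kf, sigma_h, corr_len):
    """Assumed Gaussian roughness spectrum W(kf) in m^3."""
    amplitude = sigma_h ** 2 * corr_len / (2.0 * math.sqrt(math.pi))
    return amplitude * math.exp(-(kf * corr_len) ** 2 / 4.0)


def spectrum_integral(sigma_h, corr_len, num_steps=20000, span=12.0):
    """Trapezoidal integral of W over kf in [-span/l, span/l]."""
    k_lim = span / corr_len
    dk = 2.0 * k_lim / num_steps
    total = 0.0
    for i in range(num_steps + 1):
        weight = 0.5 if i in (0, num_steps) else 1.0
        total += weight * gaussian_spectrum(-k_lim + i * dk, sigma_h, corr_len)
    return total * dk


if __name__ == "__main__":
    wavelength = 299_792_458.0 / 14.0e9   # metres at 14 GHz
    sigma_h = 0.7 * wavelength            # case A RMS height
    corr_len = 1.414 * sigma_h            # case A correlation length
    print(f"edge taper for G = 5: {edge_taper_db(5.0):.1f} dB")
    ratio = spectrum_integral(sigma_h, corr_len) / sigma_h ** 2
    print(f"integral of W divided by sigma_h^2: {ratio:.6f}")
```

The second check is a quick way to confirm that a chosen spectrum normalization is consistent with the specified RMS height before generating a surface realization.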

[33] Figure 13 compares normalized bistatic radar cross sections in decibels (dB) from the standard 1-D FB and the 1-D one-level FB/NSA methods for cases A (Figure 13a) and B (Figure 13b). The normalized bistatic radar cross section (RCS) σαβ(θs, θi) for a deterministic surface due to a tapered incident plane wave is defined as

equation image

where Eαβs is the α-polarized scattered field of the β-polarized incident wave, Sβi is the time-averaged Poynting vector of the β-polarized incident wave, η is the free space intrinsic impedance, θs refers to the scattered angle, Lf is an infinite flat surface profile, and equation image is a unit vector pointing into the flat surface from the free space region. The NSA parameters for both cases are given as in the previous paragraph. Note that only one level is sufficient for RCS accuracy for both cases. The plots show a high level of agreement for RCS predictions between the standard FB and FB/NSA algorithms for cases A and B. Both the FB and FB/NSA methods required three iterations to converge to within 1% accuracy in all cases; total CPU time for the standard FB method was 450.0 s in cases A and B, while total CPU times with the FB/NSA method were 12.63 s (approximately 36 times less) and 28.83 s (approximately 16 times less) for cases A and B, respectively. CPU times presented were obtained with a Pentium III 700 MHz computer with 128 MB RAM.

Figure 13.

Normalized bistatic RCS in decibels from the standard 1-D FB and 1-D one-level FB/NSA methods. (a) Case A: Δzmax = 5.0λ. (b) Case B: Δzmax = 15.0λ. Scattering angle θs is defined so that −85° is backscattering.

[34] Finally, it is of interest to investigate the computational efficiency of the 1-D FB/NSA method, particularly for surfaces with large height variations. A TM tapered plane wave incident field with taper parameter G = 5 [Thorsos, 1988] at an incident angle of 45° is applied in this case. Figures 14a and 14b illustrate CPU time per iteration (in seconds) and memory (in megabytes) of the one-level FB/NSA method versus number of unknowns Ntot, respectively, compared between cases A and B. Results are plotted on a log-log scale, and the maximum number of unknowns considered is 262,144. From Figure 14a, CPU time per iteration versus number of unknowns is a straight line with unity slope for both cases. CPU time per iteration for case B is greater than for case A because of the larger neighborhood distance and larger total number of plane waves. From Figure 14b the memory requirement versus number of unknowns is also a straight line of unity slope for larger numbers of unknowns. It is observed that for a small number of unknowns, case B requires more memory than does case A. However, as the number of unknowns increases, the memory requirement for both cases is almost identical. It can be concluded from these plots that the 1-D FB/NSA method is an extremely efficient iterative method, and that it indeed yields O(Ntot) for both computational cost and memory storage requirements even in the case of large surface height variations.
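Reading a scaling exponent off a log-log plot, as done above, amounts to fitting a straight line to log(cost) versus log(Ntot). The sketch below shows one way to do this with an ordinary least-squares fit. The data in it are synthetic placeholders generated from an exactly linear law, not measurements from this study, and the function name is illustrative.

```python
import math


def loglog_slope(sizes, costs):
    """Least-squares slope of log(cost) versus log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(c) for c in costs]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Synthetic placeholder data, NOT measurements from the paper:
    # problem sizes doubling up to 262,144 with an exactly linear cost.
    sizes = [2 ** p for p in range(12, 19)]
    costs = [3.0e-5 * n for n in sizes]
    print(f"fitted exponent: {loglog_slope(sizes, costs):.3f}")
```

A fitted exponent close to 1 over the larger problem sizes is what a unity-slope line on the log-log plot corresponds to; measured timings would replace the placeholder lists.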

Figure 14.

Comparison of computational efficiency of the 1-D one-level FB/NSA method between case A and case B: (a) CPU time per iteration versus number of unknowns and (b) CPU memory versus number of unknowns.

5. Summary and Conclusions

[35] In this paper, the original NSA algorithm has been generalized for the fast computation of radiation/scattering from 1-D extremely large-scale quasi-planar structures. Analytical formulas associated with the 1-D NSA parameters for an arbitrary value of ϕs,max are also presented, resulting in more flexibility in selecting Ls to compromise between the computation of the contributions of strong and weak regions. The plot of the number of plane waves QTOT versus the strong region size Ls serves as a tool to assist a user in selecting Ls to optimize the 1-D NSA algorithm. Numerical results illustrate that the derived 1-D NSA parameters work well for both slightly rough and very rough surfaces. In addition, a theoretical study based on the PO and flat surface approximations leads to the “multilevel” concept to improve the accuracy of the original NSA algorithm in the case of 1-D extremely large scale QPS. It is found that only a few weak regions or even one are required for most practical problems to obtain the desired accuracy, but the accuracy of the original 1-D NSA algorithm indeed can be improved when incorporating the “multilevel” algorithm. Numerical results show that the 1-D FB/NSA method yields accurate results with a significant reduction of CPU time compared to the standard 1-D FB method even in the case of large surface height variations. In addition, the computational cost and memory storage requirements of the 1-D FB/NSA method remain O(Ntot) for very rough surfaces. The procedure illustrated in this paper can also be extended to the 2-D NSA algorithm as considered by Torrungrueng et al. [2000] and Torrungrueng and Johnson [2001a, 2001b, 2001c].
