[17] Retrieving the parameters of the ducts from the propagation losses is a complex problem. Two different inversion methods are presented in this paper. The first is a GA included in the SAGA code developed by *Gerstoft et al.* [2003a, 2003b]. The second, the LS-SVM, is a learning machine selected for its computational speed once trained. This latter method is based on a pregenerated and preprocessed database. Since the main part of the computation is performed prior to operational use, the LS-SVM inversion runs in real time.

#### 3.1. Genetic Algorithm

[18] Genetic algorithms (GA) start with the selection of a population of *q* member models. Each model consists of a bit string for each uncertain or unknown parameter (parameters are thus discretized in a GA, in contrast to simulated annealing methods, which usually work with continuous variables).

[19] The “fitness” of each member is the value of the objective function for that particular model. On the basis of the fitness of the members, “parents” are selected, and a set of “children” is produced through a randomized process. These children replace the least fit members of the original population, and the process iterates to develop an overall fitter population. Child models are formed by applying operators to the parents.

[20] Figure 4 shows the GA principle. Each child population P_{i+1} is produced in three steps. First, the fittest members of the parent population P_{i} are the most likely to be selected as “parents.” Then, crossover takes part of the bit string corresponding to a parameter from one parent and supplements it with part of the string for the same parameter from the other parent. The operation is applied individually to every parameter string (multipoint crossover), resulting in parameter perturbations in all directions.

[21] Mutation follows crossover and randomly changes bit values in the parameter strings. Bit changes occur with a low probability (usually 0.05). The small changes imposed on the new generation through these occasional bit flips help the optimization process escape from local minima. The three steps are applied to each successive population until an overall fitter final population P_{k} is obtained. A more detailed description of genetic algorithms and their application to parameter estimation is given by *Gerstoft* [1994].
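The selection, crossover, and mutation steps above can be sketched as follows. This is a minimal, hypothetical illustration: the bit-counting objective stands in for the misfit between observed and modeled propagation losses, and the population size, string lengths, and fitness-proportional selection scheme are assumptions, not SAGA's actual settings.

```python
import numpy as np

rng = np.random.default_rng(0)

N_BITS = 8    # bits per discretized parameter (illustrative)
N_PARAMS = 3  # e.g. z_b, M_d, z_thick
Q = 20        # population size (illustrative)
P_MUT = 0.05  # mutation probability per bit, as in the text

def fitness(pop):
    # Placeholder objective: number of 1-bits. In the inversion this
    # would be the fit between observed and modeled propagation losses.
    return pop.sum(axis=1)

def step(pop):
    fit = fitness(pop)
    # Selection: fitter members are more likely to become parents.
    idx = rng.choice(Q, size=Q, p=fit / fit.sum())
    parents = pop[idx]
    children = parents.copy()
    # Multipoint crossover: for each parameter string, swap the tail
    # of one parent with the tail of the other at a random cut point.
    for i in range(0, Q - 1, 2):
        for p in range(N_PARAMS):
            cut = rng.integers(1, N_BITS)
            s = slice(p * N_BITS + cut, (p + 1) * N_BITS)
            children[i, s], children[i + 1, s] = (
                parents[i + 1, s].copy(), parents[i, s].copy())
    # Mutation: flip bits with low probability to escape local minima.
    flips = rng.random(children.shape) < P_MUT
    children[flips] ^= 1
    # Replacement: children replace the least fit of the old population.
    keep = np.argsort(fit)[Q // 2:]
    return np.vstack([pop[keep], children[:Q - keep.size]])

pop = rng.integers(0, 2, size=(Q, N_PARAMS * N_BITS), dtype=np.uint8)
for _ in range(50):
    pop = step(pop)
```

Keeping the fitter half of each generation (a simple elitism choice made here for clarity) guarantees that the best fitness never decreases between iterations.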

#### 3.2. LS-SVM

[23] LS-SVM is a supervised learning machine. The aim of its training process is to obtain an approximation of the nonlinear function *f* that maps the vector of the propagation losses **L** at different ranges to the vector of the duct parameters **M**: **M** = *f*(**L**).

[24] Figure 5 displays the process used to generate the *N*-element training database. First, *N* three-dimensional parameter sets **M**^{tr} describing *N* different surface-based ducts are drawn using Latin hypercube sampling [*McKay et al.*, 1979]. Then, from these parameter values, *N* propagation loss vectors **L**^{tr} are obtained by carrying out *N* propagation simulations using the PWE solved by the split-step Fourier (SSF) propagation method [*Barrios*, 1992]. This wave propagation method is accurate and takes the refractive index variations into account. Thus, the training database is the set (**M**^{tr}, **L**^{tr}).

[25] A Latin hypercube can generate a multidimensional training database more efficiently than a regular sampling of each parameter *z*_{b}, *M*_{d}, and *z*_{thick} [*Loh*, 1996]. This method comes from the design of experiments theory [*Vivier*, 2002] and generalizes Leonhard Euler's Latin square. The principle of generating the *N*-sized three-dimensional database of duct parameters, illustrated in Figure 6, is as follows. In the three-dimensional space, each variable (*z*_{b}, *M*_{d}, *z*_{thick}) ranges from its minimum to its maximum value along its own dimension, and each range is divided into *N* equal intervals. For the first draw, one interval is randomly selected on each of the three dimensions, yielding the intervals *I*_{zb}^{1}, *I*_{Md}^{1}, *I*_{zthick}^{1}. A vector value (*z*_{b}^{1}, *M*_{d}^{1}, *z*_{thick}^{1}) is then drawn within these intervals. For the second draw, the intervals *I*_{zb}^{2}, *I*_{Md}^{2}, *I*_{zthick}^{2} are drawn among the remaining intervals, i.e., excluding *I*_{zb}^{1}, *I*_{Md}^{1}, *I*_{zthick}^{1}. The process is repeated *N* times, until the last remaining intervals *I*_{zb}^{N}, *I*_{Md}^{N}, *I*_{zthick}^{N} are used. All random draws are uniform.
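The interval bookkeeping described above amounts to drawing one random permutation per dimension, which guarantees that each of the *N* intervals is used exactly once. A hedged sketch follows; the parameter bounds are hypothetical placeholders, not the paper's actual ranges for *z*_{b}, *M*_{d}, and *z*_{thick}.

```python
import numpy as np

rng = np.random.default_rng(1)

def latin_hypercube(n, bounds):
    """Return n samples in len(bounds) dimensions (shape n x d), with
    exactly one sample per interval per dimension."""
    d = len(bounds)
    # One random permutation per dimension assigns a distinct interval
    # (0..n-1) to each draw, so no interval is reused.
    cells = np.stack([rng.permutation(n) for _ in range(d)], axis=1)
    # Uniform draw inside the selected interval of each dimension.
    unit = (cells + rng.random((n, d))) / n
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    return lo + unit * (hi - lo)

# Illustrative bounds for (z_b, M_d, z_thick); not the paper's values.
samples = latin_hypercube(100, [(0.0, 300.0), (0.0, 100.0), (0.0, 200.0)])
```

Compared with a regular grid, which would need *N*³ propagation simulations to cover three parameters at resolution *N*, this scheme covers every marginal interval with only *N* samples.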

[26] Now that the database is generated, it has to be processed to obtain a nonlinear approximation of the aimed function. In learning theory, this step is the training of the system. Once the training database is generated, the system must be trained in order to obtain an approximation of the function *f* in the form

*M*_{output} = ∑_{j=1}^{N} *α*_{j} exp(−∥**L**_{input} − **L**_{j}^{tr}∥^{2}/*σ*_{K}^{2}) + *B*, (5)

where *σ*_{K}^{2} is the width of the Gaussian function. It is a parameter of the inversion system. The support vectors *α* = (*α*_{j})_{j=1…N} and the bias *B* are the values to optimize. The optimization is carried out during the training of the system, where the function is tested on the training database. *M*_{output} is scalar and represents *z*_{b}, *M*_{d}, or *z*_{thick}, so there is a function *f* for each parameter.

[27] Figure 7 shows the training process of LS-SVM. The system “learns” the best approximation of the function *f* by optimizing *α* and *B* on the training database itself. The Gram matrix Ω_{ij} = exp(−∥**L**_{i}^{tr} − **L**_{j}^{tr}∥^{2}/*σ*_{K}^{2}), with *i*, *j* ∈ {1, …, *N*}, is introduced. The optimized support vectors and bias are computed by solving the system

[ 0  **1**^{T} ; **1**  **Ω** + *γ***I**_{d} ] [ *B*^{opt} ; *α*^{opt} ] = [ 0 ; **M**^{tr} ], (6)

where **1** is the *N*-vector of ones and *γ* ∈ ]0, +∞[ is the second parameter of the inversion algorithm: *γ* defines the trade-off between the accuracy on the training database and the ability of the function to find solutions outside the training database. It is called the regularization parameter; *α*^{opt} are the support vectors, and *B*^{opt} is the bias of the function at the optimum.

[28] In theory, the propagation losses are mapped into a higher-dimensional space using the Gaussian function in (5) and (6) in order to capture the nonlinearity of the function *f*. Note that this Gaussian function can be replaced by another kernel function [*Mercer*, 1909]. The final system (6) is a Karush-Kuhn-Tucker system obtained by solving a ridge regression problem as an optimization problem under equality constraints. The ridge regression is carried out on the whole training database in order to obtain an approximation of the aimed function. For details, see *Suykens et al.* [2002].

[29] Determining *α*^{opt} and *B*^{opt} in (5) constitutes the training of the system. It requires inverting the *N* × *N* matrix **Ω** + *γ***I**_{d}. Once the system is trained, the inversion process is fast: for an observed vector of propagation losses **L**_{input}, evaluating equation (5) to determine the duct parameter *M*_{output} takes less than 1 s. A nonlinear and real-time approximation of the aimed function *f*(**L**_{input}) = *M*_{output} is obtained.
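The training step (solving system (6)) and the prediction step (evaluating equation (5)) can be sketched as follows. The synthetic database below stands in for the PWE-generated propagation losses, and the values of *σ*_{K}^{2} and *γ* are illustrative, not tuned; the full Karush-Kuhn-Tucker system is solved directly rather than eliminating *B* first.

```python
import numpy as np

rng = np.random.default_rng(2)

def gram(L_a, L_b, sigma2):
    # Omega_ij = exp(-||L_i - L_j||^2 / sigma_K^2), Gaussian kernel.
    d2 = ((L_a[:, None, :] - L_b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma2)

def train(L_tr, M_tr, sigma2, gamma):
    """Solve the (N+1)x(N+1) KKT system (6) for bias B and supports alpha."""
    n = L_tr.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = gram(L_tr, L_tr, sigma2) + gamma * np.eye(n)
    b = np.concatenate([[0.0], M_tr])
    sol = np.linalg.solve(A, b)
    return sol[0], sol[1:]  # bias B_opt, support vectors alpha_opt

def predict(L_in, L_tr, alpha, B, sigma2):
    """Evaluate equation (5): a fast kernel expansion over the database."""
    return gram(L_in, L_tr, sigma2) @ alpha + B

# Synthetic database: 200 "loss vectors" at 10 ranges -> one scalar
# parameter, using a stand-in nonlinear function for f.
L_tr = rng.random((200, 10))
M_tr = np.sin(L_tr.sum(axis=1))
B, alpha = train(L_tr, M_tr, sigma2=1.0, gamma=1e-3)
M_hat = predict(L_tr[:5], L_tr, alpha, B, sigma2=1.0)
```

As the text notes, all of the expensive work (the *N* × *N* matrix inversion) happens once at training time; each subsequent inversion is a single kernel expansion. One such model would be trained per output parameter (*z*_{b}, *M*_{d}, *z*_{thick}).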