Quantitative processing of broadband data as implemented in a scientific split‐beam echosounder

The use of quantitative broadband echosounders for biological studies and surveys can offer considerable advantages over narrowband echosounders. These include improved spectral‐based target identification and significantly increased ability to resolve individual targets. An understanding of current processing steps is required to fully utilise and further develop broadband acoustic methods in marine ecology. We describe the steps involved in processing broadband acoustic data from raw data to frequency dependent target strength ( TSf ) and volume backscattering strength ( Svf ) using data from the EK80 broadband scientific echosounder as examples. Although the overall processing steps are described and build on established methods from the literature, multiple choices need to be made during implementation. To highlight and discuss some of these choices and facilitate a common understanding within the community, we have also developed a Python code which will be made publicly available and open source. The code follows the steps using raw data from two single pings, showing the step‐by‐step processing from raw data to TSf and Svf . This code can serve as a reference for developing custom code or implementation in existing processing pipelines, as an educational tool and as a starting point for further development of broadband acoustic methods in fisheries acoustics.


I. INTRODUCTION
Management of fisheries resources typically requires knowledge of age and abundance structure over many years.For numerous fish species and stocks, abundance estimates can be obtained by spatially extensive surveys using quantitative echosounder systems, which measure the backscatter from fish (Simmonds and MacLennan, 2005).An important property of scientific echosounders used for fisheries surveys is accurate measurement of the backscatter amplitude -this is of lesser importance for other echosounder applications, such as bathymetry, presence/absence of objects, and spatial properties of objects.Conventional echosounders generate an acoustic pulse with a narrow bandwidth (several kHz at most) and when several are operated simultaneously at widely spaced frequencies (such as 18,38,70,120,200,and 333 kHz) can help to categorize the backscatter into species or target categories.This is termed the multi-frequency approach (Holliday and Pieper, 1980;Korneliussen and Ona, 2002).
The application of acoustic pulses with a wide and continuous frequency range (broadband pulses) to fisheries applications is an obvious enhancement.Broadband pulses provide better frequency resolution and coverage than multi-frequency systems, i.e., a continuous frequency coverage over a wide frequency band, and with appropriate processing can provide significantly better along-beam resolution and a higher signal to noise ratio than narrowband pulses (Chu and Stanton, 1998;Ehrenberg and Torkelson, 2000).These benefits can lead to improved backscatter categorisation (Korneliussen et al., 2018 , 2016; Stanton et al.,   2012; Traykovski et al., 1998) and hence more accurate abundance estimates.
There have been several scientific broadband echosounder systems developed for laboratory use (Chu et al., 1992;Conti and Demer, 2003;Forland et al., 2014), some prototype or custom-made systems (Barr et al., 2002; Briseño-Avena et al., 2015; Foote et al., 2005;   Imaizumi et al., 2009; Simmonds et al., 1996; Zakharia et al., 1989 , 1996) and some commercially available systems (Denny and Simpson, 1998;Ehrenberg and Torkelson, 2000;Gordon and Zedel, 1998;Stanton et al., 2010;Zedel et al., 2003).None of these systems have achieved widespread use in quantitative fisheries surveys presumably because the commercially-available systems have lacked one or more of the features that are very useful for an operational quantitative fisheries survey echosounder.These comprise (a) a large dynamic range receiver that does not saturate in normal conditions; (b) operating frequencies that are useful for detecting fish at several hundred metres range (between approximately 20 and 200 kHz); (c) split-aperture for detection of angle of arrival of echoes from individual organisms and calibration spheres; (d) capability to perform in-situ calibrations of the amplitude response; (e) a relatively easy transition from existing survey echosounders; (f) useful at typical ship survey speeds (e.g., 5 m s -1 ).A recent addition to the set of commercially-available broadband echosounders is the Simrad EK80, which meets all of these features.
The purpose of this paper is to present a systematic and comprehensive explanation of how to derive quantitative broadband data from recorded broadband signals.Without loss of generality, we use the Simrad EK80 as an example since it is currently the most commonly used broadband echosounder in the fisheries acoustics field.By presenting the design goals, implementation details, and recommended procedures and processing required to obtain quantitative broadband data, the authors hope to encourage and facilitate the realistic use of broadband signals in fisheries acoustics.
Our presentation uses nomenclature and approaches that are commonly used for narrowband echosounder systems, which were derived from radar processing (Cook and Bernfield, 1967).In particular, the expressions for target strength (TS) and volume backscattering strength (S v ) (MacLennan et al., 2002) are presented in a similar manner for broadband signals as for narrowband signals.

B. Internal signal processing
The controlling computer program generates a short-duration digital transmit signal (a ping), y tx (n), that is converted to an analogue electric signal, y tx,e (t), by the transceiver and applied to the transducer to generate the transmitted acoustic signal y tx,a (t) (Fig. 1) where n is a sample index in the discrete time domain and t is time for the analog signal.Any returning acoustic signal, y rx,a (t), is received by each transducer sector, u, and converted to an analog electric signal, y rx,e (t, u), in the transducer and received by corresponding receiver channels, u, in the transceiver.For a split-aperture echosounder system, there are typically three or four channels in the system (a minimum of three are necessary for estimating the angle of arriving echoes).Here, we focus on a four-channel system.
The received electric signal, y rx,e (t, u), from each channel, u, is pre-amplified, filtered by an analog anti-aliasing filter, and digitized in the transceiver at a frequency of f s , creating the digital signal, y rx,org (n, u).
To remove noise and reduce the quantity of data, the sampled signal from each channel is filtered and decimated in multiple stages, v, using complex bandpass filters, h fl (i, v), and decimation factors, D(v).The individual filter coefficients for each filter and decimation stage are indexed by i.The output signal from each channel, u, from each filter and decimation stage, v, is then given by: where y rx (n, u, 0) is set to y rx,org (n, u), being the signal before decimation, * indicates convolution and N v is the total number of filter stages.The output signal from the final filter and decimation stage, y rx (n, u, N v ), is shortened to y rx (n, u) for convenience.For the output signal, y rx (n, u), the decimated sampling rate, f s,dec , is given by: . (2) The characteristics of the bandpass filter and decimation factors are chosen with regard to the desired operating bandwidth, noise suppression levels, impulse response duration, and other common filter characteristics, with the aim of maintaining sufficient information in the data.The filtered and decimated complex samples from each transducer channel, y rx (n, u), are considered raw data and are recorded together with additional data such as from position and motion sensors and system configuration data in raw data files for display and analysis by processing software.

C. Pulse compression
To increase signal-to-noise ratio and resolution along the acoustic beam a matched filter may be applied to the raw data samples (Turin, 1960).This technique is also known as pulse compression (Klauder et al., 1960).One approach for a matched filter is to use a normalized version of the ideal transmit signal as the replicate signal, filtered and decimated using the same filters and decimation factors as applied in Eq. 1.The normalized ideal transmit signal, ỹtx (n), is given by: where max is the maximum value of y tx (n).The filtered and decimated output signal, ỹtx (n, v), from each filter stage, v, using the normalized ideal transmit signal, ỹtx (n), as the input signal, is given by: where ỹtx (n, 0) is set to ỹtx (n) and ↓ indicates decimation by the factor D(v).The output signal from the final filter and decimation stage, ỹtx (n, N v ), is used as the matched filter and is denoted as y mf (n).
To perform pulse compression the received signal, y rx (n, u), is convolved with a complex conjugated and time-reversed version of the matched filter signal with the matched filter signal, and here also normalized with the l 2 -norm of the matched filter to maintain received signal power.The pulse compressed signal, y pc (n, u), then becomes where ||y mf || indicates the l 2 -norm of y mf , also known as the Euclidean norm.The received power samples are then used to estimate target strength and volume backscattering strength.
For estimating received power samples, the mean signal, y pc (n), over all transducer sectors, N u , will be used: Compensation of echo strength for position in the acoustic beam requires an estimate of the echo arrival angle.This is obtained using the split-aperture method (Burdic, 1991), which for broadband pules can be implemented with the angle values contained in the complexvalued y pc (n) data, in combination with knowledge of transducer sector geometry.The principle is demonstrated with a transducer that is divided into four quadrants (Fig. 2).In this example the summed signals from four halves (1+2, 2+3, 3+4, 4+1) are calculated as: where fore, aft, star(board), and port indicate the relevant transducer halves.

D. Auto correlation function
The auto correlation function of the matched filter signal, y mf,auto (n), that will be used in a later processing step is defined as:

E. Power and angle samples
The transceiver measures voltage over a load, z rx,e , connected in series with the transducer impedance, z td,e .When calculating various acoustic properties a system gain parameter will be used which assumes a matched receiver load.The total received power, p rx,e (n), from all transducer sectors for a matched receiver load (Fig. 3) is given by: Forward/aft and port/starboard phase angles of target echoes are estimated by combining the transducer half signals thus: where y θ (n) is the electrical angle along the minor axis of the transducer (positive in the forward direction when ship-mounted) and y φ (n) the electrical angle along the major axis of the transducer (positive to starboard when ship-mounted), where complex signals are represented in the form e j2πf t , where j = √ −1.The physical echo arrival angles (θ and φ) are then given by: where γ θ and γ φ are constants that convert from phase angles to physical echo arrival angles and are derived from the transducer geometry and f c the centre frequency of the chirp pulse where log 10 is the logarithm with base 10 and r 0 is 1 m.
The power-budget equation (i.e., sonar equation) for a single target (Formulation D, Lunde and Korneliussen, 2016) at frequency f is: where P rx,e,t (f ) is the Fourier transform of the received electric power in a matched load for a signal from a single target at frequency f , r is the range to the target, α the acoustic absorption at frequency f , p tx,e the transmitted electric power, λ the acoustic wavelength, and g the transducer gain along the main acoustic axis, incorporating beam compensation based on the estimated target bearing, (θ, φ).
The point scattering strength, S p (n), is estimated by applying Eq. 18 to the received digitized power samples using the on-axis gain value with f set to the centre frequency of the broadband pulse, f c : noting that S p (n) is an average over frequency of all echoes from single or multiple targets received at sample n.
Based on the point scattering strength samples and the phase angle samples, single targets can be detected, and range and bearing to the single targets can be estimated.
From the pulse compressed data, y pc (n), the samples before and after the peak echo level corresponding to an echo from a single target are extracted to obtain the signal, y pc,t (n), noting that the number of samples before the peak can differ from those after the peak.
From the autocorrelation function of the matched filter signal, y mf,auto (n), the equivalent samples around the peak are extracted to create the reduced autocorrelation signal of the matched filter signal, y mf,auto,red (n).Depending on the target scattering characteristics and the distance to any adjacent single targets, the number of samples around the peak echo level in y pc,t (n) that contain the majority of the echo energy can be more or less than the total number of samples around the peak of y mf,auto (n).If the number of samples around the target is more than the total number of samples around the peak of y mf,auto (n) all samples around the peak of y mf,auto (n) are used.If the number of samples around the target is less than the total number of samples around the peak of y mf,auto (n), this lower number is used to create a reduced autocorrelation signal, y mf,auto,red (n).
The discrete Fourier transforms of the target signal, Y pc,t (m), and the reduced auto correlation signal, Y mf,auto,red (m), are given by: where DFT indicates the Fourier transform of length N DFT and m the sample index in the frequency domain.The nomalized discrete Fourier transform of the target signal, Ỹpc,t (m), is then calculated by: Ỹpc Assuming, as a first approximation, that the impedances of the transceiver and transducer are independent of frequency, the received power into a matched load, P rx,e,t (m), is then estimated by: noting that any variation of impedance with frequency will be reflected in the g 0 obtained from the calibration process.

IV. VOLUME BACKSCATTERING STRENGTH
Echoes from multiple scatterers can be quantified using volume backscattering strength, S v , being the density of backscattering cross sections, and is given by: where V is the volume occupied by the scattering targets.The power-budget equation for multiple targets is then: where P rx,e,v (f ) is the received electric power in a matched load for the signal from a volume at frequency f , c the sound speed, and t w the duration of the time window, excluding the zero-padded portion if applied, used for evaluating the frequency spectrum.Note that r is the range to the centre of the range volume covered by t w , and the two-way equivalent beam angle, ψ, is a function of frequency that is derived from an empirical estimate of ψ at the nominal frequency, f n : Volume backscattering samples compressed over the operational frequency band are estimated by applying Eq. 26 to the received digitized power samples using the on-axis gain value with f set to the centre frequency of the broadband pulse, f c : noting that S v (n) is an average over frequency of all echoes received at sample n.In this case, the time window, t w , is the effective pulse duration, τ eff , resulting from pulse compression.
The effective pulse duration is defined as the pulse duration at transmit power p tx,e which produces the same energy as the actual transmitted pulse, and is estimated from digitized data via: where p tx,auto is the square of the absolute value of the matched filter autocorrelation function, and the summation is calculated over a duration of twice the nominal pulse duration, 2τ .For an ideal system, i.e., no tapering at the rising and trailing edges of the transmitted signal, the effective pulse duration is the same as the transmit pulse duration.
To estimate S v as a function of frequency a Fourier transform is used, repeatedly applied via a sliding window in range.However, the duration of this sliding window can be so long that the difference in spherical spreading loss compensation (r 2 , implemented as the 20 log 10 (r) term in Eq. 28) from the beginning of the window to the end can be significant, particularly for short range measurements.Thus, compensation for spreading loss is performed before applying the discrete Fourier transform.Absorption loss compensation is also range dependent (and frequency dependent), but since absorption loss compensation is insignificant for typical operating frequencies at short ranges and the difference in absorption loss compensation between the beginning and the end of the sliding window is insignificant at longer ranges, compensation for absorption loss is performed after applying the discrete Fourier transform.
Compensation of spherical spreading loss requires compensation of received power by a factor of r 2 , and hence compensation of amplitude by a factor of r: where y pc,s (n) is the pulse compressed signal compensated for spherical spreading.A discrete Fourier transform is performed on the range compensated pulse compressed sample data using a normalized sliding Hanning window, w(i).The duration, t w , of the sliding window is chosen as a compromise between along-beam range resolution and frequency resolution.
We suggest that it be at least twice the pulse duration and for computational efficiency reasons should result in a number of samples, N w , which is a power of 2.
The normalised Hanning window, w, is given by: and the discrete Fourier transform of the windowed data, Y pc,v (m), is then obtained from: where u(i) is the step function and n is the sample data index for the centre of the sliding window.The discrete Fourier transform of the auto correlation function of the matched filter signal, Y mf,auto (m), also needs to be evaluated at the same frequencies: The normalized discrete Fourier transform of the windowed data, Ỹpc,v (m), is then given by: and received power into a matched load, P rx,e,v (m), is estimated from: Finally, the discretized estimate of S v (f ), S v (m), is given by: S v (m) = 10 log 10 (P rx,e,v (m)) + 2α(m)r − 10 log 10 p tx,e λ 2 ct w ψ(m)g 2 0 (m) 32π 2 . (37)

V. ILLUSTRATIVE EXAMPLES
A frequency modulated pulse scattered by a metallic sphere will exhibit frequencies at which very little energy is returned due to destructive interference (Stanton and Chu, 2008).This is visible in the TS (Fig. 4a, 4b) and agrees well with theoretical estimates of the backscatter from spheres (MacLennan, 1981).The amplitude of the backscatter signal also clearly shows these nulls (Fig. 4c), which are readily visible here due to the use of a linear chirp where time through the pulse corresponds to specific frequencies.The marked increase in range resolution is apparent once pulse compression has been applied (Fig. 4d), as are the temporal effects of the pulse compression operation.
A metallic sphere is a rather simple and ideal scatterer and we also present S v from a school of Atlantic mackerel (Scomber scombrus) (Fig. 5a).The trend for increasing S v with frequency is well-known (Korneliussen, 2010) and is consistent with the trend observed in this example.In contrast to data from isolated scatterers, such as metallic spheres, the benefit of pulse compression on the backscatter from an object that generates many overlapping echoes is not immediately obvious (Fig. 5b, 5c), although in regions where the fish density decreases (e.g., top left of the school), single target echoes become visible.

VI. DISCUSSION
Obtaining quantitative broadband data is more complicated than for narrowband echosounders, due in part to the need to account for frequency dependence in most variables and parameters, and the need for an increased understanding of digital signal processing techniques.For example, the Simrad EK80 currently provides complex demodulated (Hasan, 1983) voltages for each transducer sector rather than derived quantities (e.g., the envelope of a pulse compressed signal and the frequency response of single targets and volumes)these outputs must instead be calculated as proposed in this paper.This was an implementation decision made to allow for more flexible use of the echosounder data, and to ease the development of new processing methodologies.This decision has several disadvantages, such as the significantly increased data quantity and markedly higher amount of computation required to simply display an echogram.These can be ameliorated to some degree by processing the data into a more directly useful form before storage.The advantages of having access to unprocessed data were considered to out-weigh these disadvantages due to the potential benefits of more sophisticated uses of the acoustic data, especially for a tool that is only beginning to be applied to the field of fisheries acosutics.
The methodology to process broadband data has been presented in a general form in the previous sections without any accord to engineering limits.However, any physical implementation introduces operational constraints.As an example, a system designed for shipboard installations with ample access to electrical and computer processing power and large data storage will typically have different operational contraints compared to a system intended for autonomous platforms where electrical power and computing resources can be severely limited.In addition, the transducer is a significant constraint on the operational parameters of an echosounder and is usually the main determinant of the usable transceiver operating parameters (such as transmit bandwidth, maximum transmit power, pulse duration, ping rate, etc).
The use of broadband signals in fisheries acoustics is a developing area and we anticipate many valuable enhancements will occur in the coming years.For example, the use of TS(f ) and S v (f ) to improve acoustic target classification (Bassett et al., 2018;Korneliussen et al., 2018), and the potential of the high range resolution from pulse compression to observe small-scale fish behviours (Skaret et al., 2020) and to detect objects adjacent to boundaries (Lavery et al., 2017).The basic formulation for calculating TS(f ) and S v (f ) presented here provides the foundation for future enhancements.
The formulation presented in this paper results in several frequency dependent parameters, such as transducer gain, two-way equivalent beam angle, and the water absorption coefficient, that are required to quantitatively estimate TS(f ) and S v (f ) from received broadband signals.Methods to estimate these are not within the scope of this paper, but common practise is to use the conventional sphere backscatter calibration methodology (Demer et al., 2015) slightly enhanced for broadband (Hobaek and Forland, 2013;Lavery et al., 2017).We note that these methods do not provide an operational method to estimate τ eff or ψ(f ), especially for ship-mounted transducers, and that empirical measurements of these parameters are necessary to fully calibrate both narrowband and broadband echosounders.
The processing equations and methodology presented in this paper have been implemented in version 1.12.4 and earlier of the Simrad EK80 software.
II. SIGNAL FLOW ANDINITIAL PROCESSING A. System overview A basic echosounder system consists of a transducer, a transceiver, and a computer program that controls the operation of the transceiver and records received signals.During transmission the program defines the signals which are created as electric signals in the transceiver, converted to acoustic signals by the transducer and transmitted into the water.The acoustic signals propagate through the water, are reflected or scattered by objects in the water, and propagate back to the transducer.During reception the transducer converts the received acoustic signals to electric signals, which are received, pre-amplified, filtered, digitized, and processed in the transceiver, and then transferred to the controlling program for further data processing and storage (Fig. 1).Many types of transmit signals are feasiblethis paper considers only the transmission of linear frequency modulated signals (also known as linear chirps).

Figure 1 .
Figure 1.Signal and data flow in the Simrad EK80 system.An echosounder ping starts with the definition of a transmit signal (upper left) and ends with file storage (lower middle) and display and analysis (lower right).Note that all signals are complex-valued time-series.

Figure 2 .
Figure 2. Transducer divided into four quadrants.The labels are directions often used when a transducer is mounted on a ship.

Figure 3 .
Figure 3. Equivalent circuit diagram of transducer/transceiver with system impedances.

Figure 4 .
Figure 4. (color online) Target strength of 57.2 mm (a) and 22.0 mm (b) diameter tungsten carbide (with 6% cobalt binder) calibration spheres, theoretical (solid line) and measured and derived using Eq.(24) (dotted line).Echogram of non pulse compressed backscatter [dB re 1 W] (c) and pulse compressed backscatter [dB re 1 m 2 ] (d) from 57.2 mm (lower band) and 22.0 mm (upper band) diameter tungsten carbide spheres using a 2.048 ms duration linear frequency modulated pulse (160-260 kHz).The range from the spheres to the transducer varied during the time period shown.

Figure 5 .
Figure 5. (color online) (a): Mean frequency response (solid line) and standard error (dashed line) from a school of Atlantic mackerel (Scomber scombrus) from three simultaneously operating linear frequency modulated echosounder channels (90-170, 160-260, and 280-450 kHz).Echogram of backscattered power [dB re 1 W] (b) and pulse compressed S v [dB re 1 m −1 ] (c) from the 160-260 kHz pulse.The school is the trapezoid-shaped object between 25 and 90 m below the surface.The echo from the seafloor is visible at 130 m depth.