Design of optimum filters for signal processing with silicon drift detectors

Funding information: KETEK GmbH

In order to improve noise filtering for high-resolution X-ray spectroscopy using silicon drift detectors, optimum finite impulse response filters are calculated and tested experimentally. Common matched filter theory cannot be applied for this problem since the filters need to fulfill several requirements regarding their time domain transfer function, like, for example, the presence of a flat-top, zero-filter area. Therefore, the utilization of digital penalized least mean square method with silicon drift detectors is presented. Adaptations to the method are applied in order to suit the considered application targeting low dead-time at high count rates using modern silicon drift detectors with fast charge amplifiers. The workflow for filter calculation is presented: Signal data are acquired using a spectroscopy setup in order to obtain pulse shape and noise information. Desired constraints regarding the time-domain transfer function of the filters are written out, and limits for the precision of their fulfillment are derived. Optimization is carried out, and resulting filters are presented. In order to experimentally test the calculated filters, implementation in hardware is done. Fe spectra are acquired and evaluated. Noise reduction is rated by calculation of residual electronic noise in the recorded Fe spectra. Comparison to the noise reduction of trapezoid filters is done. Improvement up to (5.2 ± 0.3)% was found using the calculated optimum filter, proving the successful utilization of digital penalized least mean square method for signal processing with modern silicon drift detectors.


1 | INTRODUCTION
High-resolution X-ray spectroscopy has a wide range of scientific and industrial applications, especially for nondestructive material analysis. 1 In the field of energy-dispersive detection of X-rays, excellent energy resolution and high count rate capability are achieved by silicon drift detectors. 2,3,4 When X-rays are absorbed in the active detector volume, free charge carriers are generated. By integration of the released charge from the detector using an application-specific integrated circuit (ASIC), step-like voltage signals are generated. 5 The magnitude of a voltage step is proportional to the energy of the corresponding X-ray photon. For state-of-the-art signal processing, the output voltage signals of silicon drift detectors are converted into digital values by an analog-to-digital converter (ADC), and pulse height analysis is done in a digital signal processing unit like a field programmable gate array (FPGA). 6 Statistical disturbances, and in particular electronic noise, are superimposed on the desired X-ray signals. Since the disturbances lead to inaccuracies in the signal height determination, energy resolution worsens and the quality of the material analysis suffers. 6,7 In order to reduce the influence of noise, digital filters are applied during signal processing. Besides optimizing the signal-to-noise ratio, the filters need to fulfill several requirements regarding their time-domain transfer function. Indispensable properties of the filter are a flat-top in the signal response in order to avoid ballistic deficits, finite filter duration to minimize pile-up effects, and zero area to guarantee independence of the baseline from direct current (DC) voltage. 7 The most common type of filter is the trapezoid filter, which provides good reduction of white noise, fulfills the requirements, and can be implemented very efficiently in digital electronics. 6 However, the trapezoid filter provides nonoptimal noise reduction in certain cases, for example in the presence of flicker noise, and cannot take into account unknown experimental disturbances such as pickup noise caused by electromagnetic interference. 8 In this work, a method for the calculation of alternative filters providing optimum signal-to-noise ratio will be demonstrated, targeting the improvement of the energy resolution achieved by silicon drift detector systems.

2 | DPLMS METHOD FOR OPTIMUM FILTER CALCULATION
Filters achieving maximum signal-to-noise ratio for signals superimposed by stochastic noise can be described and calculated using matched filter theory. 9 In some applications, including high-resolution X-ray spectroscopy, however, the filters furthermore have to fulfill certain constraints regarding their transfer function since signal processing is done in time domain. Besides the optimization of the signal-to-noise ratio, demands on the filters might be, for example, the finite duration of the step response, the presence of a flat-top in the filter output, and specific values of the filter area. Common methods for designing optimal filters under constraints in the transfer function are, for example, the Wiener method, the discrete-time Fourier transform (DFT) method, the least mean square (LMS) method, and the digital penalized least mean square (DPLMS) method. 10 In this work, the DPLMS method, which has been introduced in Reference 11, is chosen. This method takes into account the real noise in the system directly from time-domain signal data. There is no need for transformation into frequency domain, modeling of noise sources, or deconvolution calculations. Influences from the preamplifier, the analog front-end, the ADC quantization noise, and also unknown sources of disturbances are taken into account while calculating the filter with the best possible signal-to-noise ratio. Also, DPLMS offers the possibility to weight constraints regarding the transfer function in order to guarantee their fulfillment up to the needed precision without further nonessential degradation of the noise reduction. In the following, the basic principle of the method will be recapitulated before adaptations are presented in section 3.

2.1 | Filter description
For digital signal processing in high-resolution X-ray spectroscopy, a common type of filter is the finite impulse response (FIR) filter. 6 This type of filter is inherently stable, and the finite duration of its impulse response leads to a well-defined length of the filter output for the step-like signals of semiconductor X-ray detectors. During application, a trade-off between noise reduction and dead-time due to pile-up effects often has to be found. For an FIR filter, the length of the step response can easily be adjusted by the number of filter taps. The filter taps are weighted with coefficients that define the transfer function. 12 The task while designing optimum filters, therefore, is finding the coefficients that lead to minimum noise in the filter output while fulfilling the requirements on the transfer function.
In order to describe the FIR filter, a vector $\vec{x}$ containing the N coefficients is defined:

$$\vec{x} = (x_1, x_2, \ldots, x_N) \tag{1}$$

In the following, a cost function ε to be optimized will be derived. ε is a function of the filter coefficient vector $\vec{x}$ and has its global minimum at the desired optimum filter $\vec{x}_{\mathrm{opt}}$:

$$\vec{x}_{\mathrm{opt}} = \arg\min_{\vec{x}} \, \varepsilon(\vec{x}) \tag{2}$$

For this purpose, it accumulates the filter output noise and the deviations from the constraints. The cost function should be continuous and differentiable and is constructed to have quadratic form.

2.2 | Representation of the output noise
First, an expression for the evaluation of the filter's noise reduction is needed. A sample Ψ on the output of the FIR filter with a length of N taps can be expressed as the discrete convolution of an input signal sequence $\vec{i}$ and the N filter coefficients in $\vec{x}$:

$$\Psi = \sum_{n=1}^{N} x_n \, i_n \tag{3}$$

A sequence of input signal samples $\vec{i}$ can be described as the superposition of a desired noiseless signal $\vec{s}$ and unwanted noise $\vec{d}$:

$$\vec{i} = \vec{s} + \vec{d} \tag{4}$$

It is assumed that the noise sequence has an average value of zero and is uncorrelated to the signal. The variance of the filter output Var[Ψ] is by definition the expectation of the squared deviation of Ψ from its mean E[Ψ]:

$$\mathrm{Var}[\Psi] = E\left[(\Psi - E[\Psi])^2\right] = E\left[\left(\sum_{n=1}^{N} x_n \, i_n - \sum_{n=1}^{N} x_n \, E[i_n]\right)^2\right] \tag{5}$$

Since the expectation of the signal is the desired noiseless signal, using Equation (4), the expression $E[i_n]$ in Equation (5) can be substituted by $E[i_n] = s_n = i_n - d_n$:

$$\mathrm{Var}[\Psi] = E\left[\left(\sum_{n=1}^{N} x_n \, d_n\right)^2\right] \tag{6}$$

The variance of the filter output, therefore, is only a function of the filter coefficients and of the noise on the input. Due to its quadratic form, it is a suitable measure for the filter output noise in the cost function. For optimum noise reduction, the variance of the filter output is minimal:

$$\mathrm{Var}[\Psi] \rightarrow \min \tag{7}$$

Therefore, a suitable term $\varepsilon_{\mathrm{n}}$ for noise reduction in the cost function to be minimized is:

$$\varepsilon_{\mathrm{n}} = E\left[\left(\sum_{n=1}^{N} x_n \, d_n\right)^2\right] \tag{8}$$
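As an illustration of the noise term, the following minimal Python sketch estimates the output variance of a given coefficient vector from noise samples. The synthetic white noise, the array shapes, and the helper name `output_noise_variance` are assumptions made for this example and are not taken from the measurement setup.

```python
import numpy as np

def output_noise_variance(x, noise_windows):
    """Estimate Var[Psi] for coefficients x from zero-mean noise windows.

    noise_windows has shape (M, N): M noise segments of N samples each
    (hypothetical layout chosen for this sketch).
    """
    psi = noise_windows @ x        # one filter output sample per window
    return np.mean(psi ** 2)       # E[(sum_n x_n d_n)^2], zero mean assumed

rng = np.random.default_rng(0)
N = 48
# Synthetic white noise with unit variance (assumption, not measured data).
noise = rng.normal(0.0, 1.0, size=(20000, N))

# 48-tap trapezoid: 16 taps of +1/16, 16 zero taps, 16 taps of -1/16.
x_trap = np.concatenate([np.full(16, 1 / 16), np.zeros(16), np.full(16, -1 / 16)])
var_trap = output_noise_variance(x_trap, noise)
```

For uncorrelated unit-variance noise the estimate approaches the sum of the squared coefficients, which is 0.125 for this trapezoid. With measured noise traces, correlations between samples enter the estimate automatically.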

2.3 | Representation of punctual time constraints
Punctual time constraints are introduced in order to avoid ballistic deficits by a flat-top in the output signal of the filter. 7 In general, punctual time constraints aim for a defined value $C_k$ in the filter output of the reference signal $\Psi_{\mathrm{ref}}$ at a certain point k:

$$\Psi_{\mathrm{ref},k} = C_k \tag{9}$$

For optimum fulfillment of a punctual constraint, the deviation of the filter output at k from the defined value $C_k$ is minimal:

$$\left|\Psi_{\mathrm{ref},k} - C_k\right| \rightarrow \min \tag{10}$$

From Equation (10), a quadratic expression $\varepsilon_{p,k}$ for the deviation of a punctual constraint can be derived:

$$\varepsilon_{p,k} = \left(\Psi_{\mathrm{ref},k} - C_k\right)^2 \tag{11}$$

For every desired punctual time constraint, such a term is added to the cost function. The length of the filter flat-top, therefore, can be controlled by the number of punctual constraints.

2.4 | Representation of area constraints
The sum of coefficients defines the gain for DC input and the area under the impulse response of an FIR filter. 13,12 In case the area of an FIR filter with N taps should aim for a defined value A, an area constraint can be applied:

$$\sum_{n=1}^{N} x_n = A \tag{12}$$

For optimum fulfillment of an area constraint, the deviation between the desired and the actual filter area is minimal:

$$\left|\sum_{n=1}^{N} x_n - A\right| \rightarrow \min \tag{13}$$

From Equation (13), a term for the quadratic deviation of the filter area from the desired area A can be derived and added to the cost function:

$$\varepsilon_a = \left(\sum_{n=1}^{N} x_n - A\right)^2 \tag{14}$$

2.5 | Formulation of the cost function
The quadratic expressions for the output noise, punctual constraints, and area constraints are added to the cost function ε. In order to adjust the effectiveness of the constraints, weighting factors are applied: $\alpha_1, \alpha_2, \ldots, \alpha_K$ for the K punctual constraints and β for the area constraint. With Equation (8), Equation (11), and Equation (14), the cost function becomes:

$$\varepsilon = \varepsilon_{\mathrm{n}} + \sum_{k=1}^{K} \alpha_k \, \varepsilon_{p,k} + \beta \, \varepsilon_a \tag{16}$$

The cost function might be expanded by further expressions for additional constraints, for example in the frequency domain. 11 In order to find the optimum filter, the filter coefficients that minimize the cost function have to be found.
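The structure of such a weighted cost function can be sketched in a few lines of Python. This is a minimal illustration under assumed array layouts; the function name `dplms_cost` and its arguments are hypothetical and do not reproduce the implementation used in this work.

```python
import numpy as np

def dplms_cost(x, noise_windows, ref_windows, targets, alphas, beta, area=0.0):
    """Penalized least-squares cost for FIR coefficients x (illustrative).

    noise_windows: (M, N) zero-mean noise segments as seen by the N taps
    ref_windows:   (K, N) reference-signal samples seen by the taps at the
                   K constrained output points
    targets:       (K,) desired output values C_k
    alphas, beta:  weights of the punctual constraints and the area constraint
    """
    eps_noise = np.mean((noise_windows @ x) ** 2)   # output noise variance
    eps_punct = (ref_windows @ x - targets) ** 2    # one term per punctual constraint
    eps_area = (np.sum(x) - area) ** 2              # deviation from desired area A
    return eps_noise + np.sum(alphas * eps_punct) + beta * eps_area

# Toy check: with all coefficients zero, only the punctual penalties remain.
N, K = 8, 3
x0 = np.zeros(N)
noise = np.random.default_rng(1).normal(size=(100, N))
ref = np.ones((K, N))
cost_zero = dplms_cost(x0, noise, ref, np.ones(K), np.full(K, 2.0), beta=5.0)
```

In the toy check the cost equals the sum of the K weighted squared targets. Raising a weight makes the corresponding penalty dominate, which is how the precision of a constraint is traded against noise reduction.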

3 | ADAPTATIONS TO DPLMS
DPLMS as presented in the original source is adapted to the application and the setup of this work. Theory of the method and exemplary results using HPGe detectors were shown in Reference 11. In this work, a state-of-the-art silicon drift detector with an ASIC charge amplifier is used. Compared to silicon drift detectors with JFET-based readout, PIN diode photodetectors, or Si(Li) detectors, silicon drift detectors with ASICs offer fast readout and superior noise characteristics. 5 This provides the possibility of using short digital filters to improve signal throughput at high count rates. In the setup used, signals from the detector are digitized DC-coupled by a high-performance 16-bit ADC. For signal processing, an FPGA with low power consumption and small form factor is used in order to be applicable for battery-powered, mobile setups such as hand-held X-ray fluorescence analyzers. The outsourcing of the filter synthesis from the signal processing unit, therefore, is preferred.

3.1 | Punctual time constraints for filter step response
In this work, punctual time constraints are used to achieve a flat-top in the filter output in order to avoid a ballistic deficit. 7 The DPLMS method applies punctual time constraints to the filter output for a reference signal (section 2.3). Therefore, filter coefficients are found that lead to a flat-top in the filter output of the reference signal. However, this does not guarantee a proper flat-top for every single signal trace in cases where the signal shapes differ. For silicon drift detectors, signal rise-times vary due to the different drift times of electrons in the semiconductor material. X-ray absorptions near the anode lead to short drift times and fast signals, while X-ray absorptions near the edges lead to long drift times and slow signals. 14,15 For small detectors (e.g., 20 mm² active area) with a fast-readout ASIC, signal rise-times typically vary between 20 ns and 80 ns, and for large detectors (e.g., 80 mm² active area) signal rise-times are typically in the range between 20 ns and 250 ns. The obtained reference signal has a signal rise-time corresponding to the average signal rise-time of the detector. Therefore, it is not suitable to apply time constraints for a flat-top to the reference signal as is done in DPLMS. For X-ray signals that are faster than the average signal, a flat-top in the filter output is not guaranteed by the method. These fast X-ray signals might suffer from ballistic deficits causing an error in pulse height analysis. The method, therefore, is modified by a new way of applying the punctual time constraints for the flat-top: punctual time constraints are applied to the ideal step response of the filter. An ideal step of the length T, which is equal to the length of the signal traces, is defined:

$$s_{\mathrm{step},t} = \begin{cases} 0, & t < t_0 \\ 1, & t \geq t_0 \end{cases} \tag{17}$$

where t is the discrete time index ranging from 1 to T and $t_0$ denotes the tap at which the step occurs. A sample on the filter output for the ideal step is:

$$\Psi_{\mathrm{step}} = \sum_{n=1}^{N} x_n \cdot s_{\mathrm{step},n} \tag{18}$$

Similar to Equation (9), punctual time constraints can be applied to the filter output:

$$\Psi_{\mathrm{step},k} = C_k \tag{19}$$

Unlike the reference signal, the ideal step has no signal rise-time and the flat-top will have a well-defined length. The flat-top length is independent of the reference signal rise-time and will also be suitable for the fastest X-ray signals. A proper flat-top, therefore, can be generated by applying at least as many punctual constraints to the filter output for the ideal step as the longest rise-time of the detector measured in ADC-taps.
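For an ideal unit step, each output sample of an FIR filter is the running sum of the coefficients that have already passed the step. The short Python sketch below uses this to inspect the flat-top of a trapezoid filter; the trapezoid coefficients and the helper name `step_response` are chosen for illustration only.

```python
import numpy as np

def step_response(x):
    """Output of an FIR filter with coefficients x for an ideal unit step."""
    return np.cumsum(x)   # convolution with a unit step is a running sum

# 48-tap trapezoid: 16 taps of +1/16, 16 zero taps, 16 taps of -1/16.
x_trap = np.concatenate([np.full(16, 1 / 16), np.zeros(16), np.full(16, -1 / 16)])
resp = step_response(x_trap)

# Output samples that equal the step height form the flat-top.
flat_top = np.flatnonzero(np.isclose(resp, 1.0))
flat_len = flat_top.size
```

The response rises linearly, stays at one while the zero coefficients pass the step, and returns to zero at the end because the coefficients sum to zero. Punctual constraints on the ideal step response fix exactly these running sums.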

3.2 | Consideration of multiple signals
DPLMS has been introduced in the original source for filter synthesis on the digital signal processing unit in operation. 11 However, in this work, the filter optimization is done off-line on a computer. Reasons for this are:
• Performing an optimization off-line offers higher computing power, access to toolboxes with more advanced optimization algorithms, and better control over the calculation and the results.
• The signal processing unit does not need to offer the resources to perform an optimization. Therefore, a device with small form factor, low power consumption, and low cost can be chosen.
• In application (especially in industry), predictable and reproducible operation of the signal processing unit is often required. Filter synthesis during application is undesirable in these cases.

When calculating the filter off-line, multiple noisy signal traces are available at the same time. This offers new ways to deal with the noisy signals. In case of real-time optimization by iterating over the latest samples, the influence of the signal traces varies over time. In this work, another approach is chosen in order to optimize the filter for all noisy samples simultaneously. When M is the number of available signal traces, the cost function ε can be calculated for every signal trace $s_1$ to $s_M$. A new optimization target E is defined as the sum of all cost functions:

$$E = \sum_{m=1}^{M} \varepsilon_m \tag{20}$$

All of the cost functions $\varepsilon_1$ to $\varepsilon_M$ are sums of quadratic terms and, therefore, are positive or equal to zero. In order to consider all signal traces, E is minimized:

$$E \rightarrow \min \tag{21}$$

In this way, the filter coefficients which minimize the sum of all cost functions are found. The influence of the signal traces, therefore, is independent of their order. Instead, the filter with the best overall performance for all given signals is sought.

3.3 | Access to advanced optimization methods
Since the optimization in this work is done off-line on a computer (section 3.2), filter calculation does not need to be done in real time. Furthermore, higher computing power and optimization toolboxes are accessible in order to minimize the cost function. Therefore, more advanced optimization algorithms without limitations due to the computing power of the signal processing unit might be employed. The present optimization problem is nonlinear and can be solved with an unconstrained method, since DPLMS incorporates the constraints as penalty terms in the cost function. Possible algorithms, for example, might be gradient-based, with or without step-size control, or gradient-free, deterministic or stochastic. The optimization algorithm can be chosen in order to achieve proper convergence for the given problem.

4 | PROCEDURE OF OPTIMUM FILTER CALCULATION
In this section, the practical calculation of optimum filters with several lengths for a given X-ray spectroscopy setup containing a silicon drift detector will be described.
As the original source of DPLMS focuses on the mathematical description of the filter optimization, the practical steps are described in more detail in the following. Major steps in order to calculate optimum filters are the acquisition of signal data using a suitable setup and the choice of a suitable optimization algorithm. Furthermore, since DPLMS fulfills constraints only approximately, proper fulfillment limits will be derived and the corresponding weighting factors identified.

4.1 | Acquisition of signal data
In order to carry out the calculation of optimum filters, signal data are acquired first. Figure 1 shows schematically the signal chain of the X-ray spectroscopy setup used. The setup consists of a commercially available VIAMP system from KETEK, which contains a silicon drift detector with 20 mm² active area and a "CUBE" ASIC as a charge amplifier. 16 The silicon drift detector is operated at a chip temperature of 238 K using the thermoelectric cooler of the detector module. The preamplifier provides a ramped reset-type signal with a gain of 5 mV/keV on the output. This signal is fed into a circuit board for signal processing containing an analog front-end, an ADC, and an FPGA. On the analog front-end, a 15 MHz Bessel-type low-pass filter is applied to the signal as antialiasing filter. Although it has rather poor attenuation in the stop band, the advantage of the Bessel filter is the smooth transient response due to a linear phase. 17 This ensures minimal distortion and overshoot for the step-like X-ray signals. In order to digitize the filtered detector signal, a 16-bit pipeline-type ADC with 80 MHz sampling rate is used. 18 For signal processing, a Xilinx Artix-7 FPGA is employed. 19 Advantages of an FPGA over other digital computing devices are the parallel processing and calculation at a fast rate, cost efficiency, and real-time capability. 20 The detector is irradiated with a ⁵⁵Fe source, and ADC signal data are acquired using the FPGA. A photo of the experimental setup is shown in Figure 2. Signal traces with a length of 32,768 ADC-taps are transferred with full 16-bit precision to the computer. On the computer, X-ray signals with a length of 800 ADC-taps are extracted, the signal position is centered to ADC-tap 400, and step heights are normalized to one. X-ray signals containing resets or multiple pulses are rejected. Overall, 20,514 valid X-ray signal traces are extracted in this way and an average signal is calculated (Figure 3). The signal has the typical step-like shape and a 10/90 signal rise-time of approximately six ADC-taps or 75 ns. In the following, this average signal is used as reference signal for filter optimization.

4.2 | Choice of optimization algorithm
Calculation of optimum filters is done using Python 3.6. 21 From the SciPy toolbox, the package "Optimization and Root Finding" is deployed for the minimization of the cost function. 22 This package provides functions for minimizing objective functions with various solvers for nonlinear problems and provides control over convergence criteria and optimization results. Best convergence was found experimentally when using the quasi-Newton method of Broyden, Fletcher, Goldfarb, and Shanno ("BFGS algorithm"). 23 The BFGS method is a gradient-based algorithm using first derivatives only. Like all gradient-based algorithms, it finds local minima next to a starting value. In order to verify that the algorithm has found the global minimum of the cost function, every optimization is done repeatedly with different, randomly chosen starting values. For every filter coefficient, a random number between −1 and 1 is chosen using the NumPy package "random." 24 The optimization results for multiple different starting values are checked for equality within numerical precision.
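The multi-start procedure can be sketched with `scipy.optimize.minimize` and the BFGS method. The toy problem below is an assumption made for illustration: a 12-tap filter, unit-variance white noise (so the noise term reduces to the sum of squared coefficients), four flat-top constraints on the ideal step response, a zero-area constraint, and arbitrary weights. It is not the cost function or data of this work.

```python
import numpy as np
from scipy.optimize import minimize

N = 12                                   # toy filter length (assumption)
top_idx = np.arange(4, 8)                # step-response points constrained to 1
# Row k of S selects the coefficients summed up at output point top_idx[k].
S = (np.arange(N)[None, :] <= top_idx[:, None]).astype(float)
alpha, beta = 100.0, 100.0               # illustrative weighting factors

def cost(x):
    r = S @ x - 1.0                      # flat-top deviations
    return x @ x + alpha * (r @ r) + beta * np.sum(x) ** 2

def grad(x):
    r = S @ x - 1.0
    return 2.0 * x + 2.0 * alpha * (S.T @ r) + 2.0 * beta * np.sum(x) * np.ones(N)

rng = np.random.default_rng(42)
solutions = []
for _ in range(5):
    x_start = rng.uniform(-1.0, 1.0, size=N)     # random start values in [-1, 1)
    res = minimize(cost, x_start, jac=grad, method="BFGS")
    solutions.append(res.x)

# Largest coefficient difference between any run and the first run.
spread = max(np.max(np.abs(s - solutions[0])) for s in solutions)
```

A small `spread` indicates that all starting values converged to the same minimum. Supplying the analytic gradient via `jac` is a design choice of this sketch; `minimize` can also approximate the gradient numerically.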

4.3 | Limits of constraint fulfillment
The DPLMS method offers the possibility to adjust the strength of constraints by the choice of weighting factors. A higher weighting factor for a constraint improves the precision of its fulfillment but might worsen the noise reduction of the filter. Therefore, limits for the desired precision of fulfillment have to be set. In the following, a method for the derivation of precision limits for the desired constraints, based on estimates of the influence of nonperfect constraint fulfillment, will be presented.
FIGURE 1 Signal chain of the X-ray spectroscopy setup

4.3.1 | Deviation of area constraint
First, the influence of a nonzero area filter on the X-ray energy spectrum will be considered. A constant c at the input of an FIR filter will generate a filter output value of

$$\Psi = \sum_{n=1}^{N} x_n \cdot c = c \cdot \sum_{n=1}^{N} x_n \tag{22}$$

The constant c is seen on the output multiplied by the sum of the filter coefficients. Therefore, the sum of the coefficients of an FIR filter gives the DC gain of the filter. In signal analysis of silicon drift detectors, step-like signals from X-ray interactions are randomly distributed on a ramped signal. The heights of the X-ray signals in the filter output should ideally be independent of the DC level. This can be achieved by constraining the filter area to zero:

$$\sum_{n=1}^{N} x_n = 0 \tag{23}$$

For a filter with nonzero area, the measured energy of an X-ray signal on the lower limit of the ramped signal will differ from the measured energy of the same X-ray signal on the upper limit of the ramped signal. Due to the random distribution of signals on the ramped signal, this leads to the broadening of the peak in the energy spectrum.

FIGURE 2 Photo of the X-ray spectroscopy setup

FIGURE 3 Normalized reference X-ray signal obtained by the detection system

In the present setup, a ramped signal with a voltage span (peak-to-peak) of $U_{PP}$ = 1.5 V and a gain of G = 5 mV/keV is used. 16 The maximum absolute energy deviation ΔE between two identical X-ray signals due to a nonzero area is

$$\Delta E = \frac{U_{PP}}{G} \cdot \left|\sum_{n=1}^{N} x_n\right| \tag{24}$$

ΔE is an additive error, which is independent of the signal height. Therefore, the relative influence of the peak broadening is highest for low energies. In order to determine an acceptable filter area, a worst-case consideration is done at the lowest detectable energies. Low-energy efficiency of silicon drift detectors is limited by the transmission of the entrance window. For an 8 μm Be entrance window, transmission exceeds 5% for X-rays with energies higher than 630 eV. 25 The energy resolution of an ideal noiseless setup is limited by the statistical Fano noise caused by ionization processes in the semiconductor material to

$$\mathrm{FWHM}_{\mathrm{Fano}} = 2\sqrt{2 \ln 2} \cdot \sqrt{F \cdot w \cdot E} \tag{25}$$

where F is the Fano factor, w is the pair creation energy, and E is the X-ray energy. 7 For silicon detectors, the Fano factor is F = 0.115 and the pair creation energy is w = 3.65 eV. 26 For X-rays with an energy of E = 630 eV, the Fano-limited energy resolution is approximately $\mathrm{FWHM}_{\mathrm{Fano}}$(630 eV) = 38 eV. It will be assumed that an additional broadening by 10% of the Fano limit will have negligible influence on the actual energy resolution. With this assumption, an additional broadening of 3.8 eV can be tolerated. Using Equation (24), an acceptable filter area is

$$\left|\sum_{n=1}^{N} x_n\right| \leq \frac{3.8\ \mathrm{eV} \cdot G}{U_{PP}} \approx 1.3 \cdot 10^{-5} \tag{26}$$

For higher energies, the Fano limit increases by a square root law, while the error due to the nonzero area of the filter stays constant. Therefore, this is a worst-case consideration, fully neglecting the additional electronic noise which leads to additional degradation of the energy resolution.
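The arithmetic behind the area limit can be reproduced with a few lines of Python. The sketch only uses the values quoted in the text (F, w, the 630 eV worst case, the 10% tolerance, the 1.5 V span, and the 5 mV/keV gain); the variable names are chosen for this example.

```python
import math

F = 0.115            # Fano factor of silicon
w_ev = 3.65          # pair creation energy in eV
E_ev = 630.0         # lowest energy considered, in eV

# Fano-limited FWHM: 2*sqrt(2*ln 2) * sqrt(F * w * E)
fwhm_fano_ev = 2.0 * math.sqrt(2.0 * math.log(2.0)) * math.sqrt(F * w_ev * E_ev)

tolerated_ev = 0.10 * fwhm_fano_ev           # 10% of the Fano limit

u_pp_v = 1.5                                  # ramp span in V
gain_v_per_kev = 5e-3                         # 5 mV/keV
span_ev = u_pp_v / gain_v_per_kev * 1000.0    # ramp span expressed in eV

max_area = tolerated_ev / span_ev             # acceptable |sum of coefficients|
print(f"FWHM_Fano = {fwhm_fano_ev:.1f} eV, max area = {max_area:.2e}")
```

The ramp span corresponds to 300 keV, so a tolerated broadening of a few eV translates into an allowed coefficient sum on the order of 10⁻⁵.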

4.3.2 | Fluctuation of the maximum value
Energy values are extracted from the filter by searching the maximum in the filter output in a certain time interval after an X-ray signal has been detected. The maximum of the filter output, therefore, should equal the height of the step-like X-ray signal. This can be achieved by constraining the maximum of the filter output for the ideal unit step to one. Due to the finite rise-time of the X-ray signals, this maximum value should be held for at least the duration of the longest signal rise-time plus one ADC clock cycle. 6 Therefore, the number of punctual constraints equals the rise-time of the slowest signals in ADC-taps plus one. For ideal constraint fulfillment, the filter will give the same maximum value for every signal rise-time of the detector, because the extracted maximum does not depend on where within the flat-top it occurs. For nonideal fulfillment of the punctual time constraints, however, there will be a fluctuation of the maximum value in the filter output for an ideal unit step. The maximum value for X-ray signals of the same height, therefore, will be a function of the signal rise-time. For the randomly distributed signal rise-times of a silicon drift detector, the nonideal fulfillment leads to a broadening of peaks in the energy spectrum. Due to the linearity of FIR filters, fluctuations in the filter output for a unit step will scale linearly with the heights of X-ray signals, while the optimum energy resolution given by the Fano limit only scales with a square-root law (Equation (25)). Therefore, the relative error due to fluctuation of the maximum value increases for high energies. High-energy efficiency of a silicon drift detector is limited by the absorption probability of X-rays in the semiconductor. For a 450 μm thick Si detector, the quantum efficiency for X-rays with an energy higher than 40 keV drops below 5%. 25 The ideal energy resolution of a silicon detector given by the Fano limit at 40 keV is 305 eV or approximately 0.76% (Equation (25)). When again accepting additional fluctuations of 10% of the Fano noise, the relative error between any two points of the flat-top has to be smaller than 0.076%. Therefore, the largest acceptable fluctuation of the maximum value in the filter output for the unit step is $7.6 \cdot 10^{-4}$.
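Why the highest energy is the worst case for the flat-top can be seen by evaluating the tolerated relative fluctuation at a few energies. The Python sketch below uses the Fano expression with F = 0.115 and w = 3.65 eV and the 10% tolerance from the text; the function names and the intermediate energy of 5.89 keV are chosen for illustration.

```python
import math

def fwhm_fano_ev(energy_ev, fano=0.115, w_ev=3.65):
    """Fano-limited FWHM in eV for a silicon detector."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * math.sqrt(fano * w_ev * energy_ev)

def max_relative_fluctuation(energy_ev, fraction=0.10):
    """Largest flat-top fluctuation, relative to the step height, that adds
    at most `fraction` of the Fano FWHM at the given energy."""
    return fraction * fwhm_fano_ev(energy_ev) / energy_ev

# The limit tightens with energy because the Fano width grows only as sqrt(E).
limits = {e: max_relative_fluctuation(e) for e in (630.0, 5890.0, 40000.0)}
for energy, limit in limits.items():
    print(f"{energy:8.0f} eV: {limit:.2e}")
```

The tolerated fluctuation shrinks as the energy rises, so the value at 40 keV sets the requirement for the whole detectable range.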

4.3.3 | Choice of weighting factors
As a result of sections 4.3.1 and 4.3.2, the desired constraints are a filter area less than $1.3 \cdot 10^{-5}$ and a fluctuation of the filter top less than $7.6 \cdot 10^{-4}$. The duration of the filter top should be at least one tap longer than the rise-time of the slowest X-ray signal since X-ray signals are asynchronous to the ADC clock. Since the slowest X-ray signals found in the ADC traces have a signal rise-time of 180 ns, the flat-top length is set to 16 ADC-taps. Therefore, K = 16 punctual constraints with the target values $C_1 = 1, C_2 = 1, \ldots, C_{16} = 1$ are introduced into the cost function. The punctual constraints are applied to the ideal step response of the filter. The weighting factors $\alpha_k$ and β relative to the noise term in the cost function have to be chosen in order to fulfill the constraints with the desired precision. Proper choices of the weighting factors were found by iteratively increasing the weighting factors until the desired constraint fulfillment, as derived in sections 4.3.1 and 4.3.2, was achieved. Table 1 shows the determined values for the weighting factors. Higher weighting factors are needed for short filters due to the poorer noise reduction and the resulting higher variance in the filter output.
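The flat-top length follows directly from the slowest rise-time and the ADC sampling rate. The small Python sketch below reproduces the number for the values given in the text; rounding the rise-time up to whole ADC-taps before adding one tap is an assumption of this sketch, and the function name is hypothetical.

```python
import math

def flat_top_taps(max_rise_time_ns, sampling_rate_mhz):
    """Number of punctual constraints: slowest rise-time in ADC-taps,
    rounded up (assumption), plus one tap for the asynchronous ADC clock."""
    rise_taps = math.ceil(max_rise_time_ns * sampling_rate_mhz / 1000.0)
    return rise_taps + 1

K = flat_top_taps(180.0, 80.0)   # 180 ns rise-time at 80 MHz sampling
print(K)
```

At 80 MHz one ADC-tap lasts 12.5 ns, so 180 ns spans 14.4 taps, which gives 15 full taps plus one.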

5 | RESULTS AND DISCUSSION
Filters with lengths of 24 taps (300 ns), 32 taps (400 ns), 48 taps (600 ns), 64 taps (800 ns), and 80 taps (1,000 ns) are calculated using the presented method. These filter lengths are typically used in applications with high X-ray rates that aim for a high signal throughput in order to minimize data acquisition time. Results for the optimization of the 600 ns filter will be shown and discussed in the following subsection. Experimental tests are carried out for all filters and compared to trapezoid filters of the same length since trapezoid filters are known for excellent noise reduction at short filter lengths. 8

5.1 | Coefficients of 600 ns filter
As an example, the filter with a length of 600 ns is presented in detail here. Figure 4 shows the 48 filter coefficients in comparison with the coefficients of the trapezoid filter. For the trapezoid filter, the first 16 coefficients are 1/16 = 0.0625, the coefficients 16-31 are zero, and the coefficients 32-47 are −1/16 = −0.0625. The filter found by the optimization has larger coefficient values at the edges of the filter and of the gap. In between, the filter coefficients show a smooth course. The optimum filter output for the reference input signal is shown in Figure 5. It shows a steeper rise at the start of the signal due to the larger coefficients near the edges. The overall filter output length is equal due to the same number of filter taps. While the coefficients of the trapezoid filter between 16 and 31 are exactly zero, the coefficients of the optimum filter are finite. This causes a fluctuation of the maximum value in the filter step response. This is shown in Figure 6 for the 600 ns filter. The fluctuation found on the top of the filter step response is $7.5 \cdot 10^{-4}$ and thus smaller than the allowed fluctuation of $7.6 \cdot 10^{-4}$. The filter area is $1.2 \cdot 10^{-6}$ and therefore below the allowed maximum of $1.3 \cdot 10^{-5}$ derived in section 4.3.1. The filter fulfills all constraints with the desired precision. In order to understand the differences between trapezoid and optimum filter in frequency domain, Figure 7 shows the noise spectral density of the system as well as the magnitude responses of the 600 ns filters.

FIGURE 4 Coefficients of trapezoid and optimum filter for 600 ns filter length

FIGURE 5 Output for the reference signal (Figure 3) of 600 ns trapezoid and optimum filter

FIGURE 6 Flat-top in the step response of 600 ns trapezoid and optimum filter

FIGURE 7 Noise spectral density of the system and magnitude response of trapezoid and optimum filter of 600 ns length

The noise curve was determined experimentally by obtaining time domain signal traces of system noise using the ADC and the FPGA described in section 4.1. One hundred noise traces with a length of 32,768 ADC-taps were acquired and transformed into frequency domain using a DFT algorithm on a computer, and the mean noise spectral density was calculated. Observed system noise is a combination of white noise and flicker noise, superimposed with pickup noise caused by electromagnetic interference on the circuit board. 6 Noise is band-limited by the 15 MHz Bessel-type low-pass filter in the analog front-end. The filter magnitude responses in Figure 7 were calculated by applying the formula for the frequency response of an FIR filter to the coefficients shown in Figure 4. 12 Due to the symmetric filter coefficients, the magnitude response of the trapezoid filter has zeros at the edges of the side lobes. Damping within the side lobes increases for higher frequencies, causing the good rejection of white noise of the trapezoid filter. The magnitude response of the optimum filter, on the other hand, has no zeros. The attenuation for high frequencies is lower compared to the trapezoid filter. However, reduction of flicker noise below 5 MHz and of pickup noise is improved using the optimum filter.
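The frequency response of an FIR filter is the discrete-time Fourier transform of its coefficients. The Python sketch below evaluates it for the 48-tap trapezoid at 80 MHz sampling rate; the chosen test frequencies and the helper name `fir_frequency_response` are assumptions made for this illustration.

```python
import numpy as np

def fir_frequency_response(x, freqs_hz, fs_hz):
    """H(f) = sum_n x_n * exp(-j*2*pi*f*n/fs), evaluated at freqs_hz."""
    n = np.arange(len(x))
    phase = -2j * np.pi * np.outer(freqs_hz, n) / fs_hz
    return np.exp(phase) @ x

fs = 80e6   # ADC sampling rate
# 48-tap trapezoid: 16 taps of +1/16, 16 zero taps, 16 taps of -1/16.
x_trap = np.concatenate([np.full(16, 1 / 16), np.zeros(16), np.full(16, -1 / 16)])

freqs = np.array([0.0, 1.0e6, 2.5e6, 5.0e6])
mag = np.abs(fir_frequency_response(x_trap, freqs, fs))
```

For this trapezoid the magnitude vanishes at DC, because the coefficients sum to zero, and at 2.5 MHz and 5 MHz, where its side lobes have their edges. Applying the same function to an optimized coefficient vector would show whether such zeros are retained.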

5.2 | Implementation in hardware
The calculated optimum filters are integrated in an FPGA design for digital signal processing. A block diagram of functions within the digital signal processing is shown in Figure 8. Input data from the ADC are distributed to a pulse detection filter and to an energy filter. The output of the pulse detection filter is evaluated in a trigger logic. Within this block X-ray pulses are detected and pile-up inspection is done. A pulse height determination block is extracting pulse amplitudes from the output of the energy filter using the pulse timing information from the trigger logic. A multichannel analysis of pulse amplitudes is done in the following block. This spectrum data can be read out using a host communication block. Optimum and trapezoid filter are implemented in the energy filter block. The implementation of the trapezoid filter on the FPGA can be done very efficiently. Due to the equal absolute values of all filter coefficients only one multiplication operation per clock cycle is needed. For the optimum filter, however, implementation on the FPGA is more complex. In each clock cycle for every filter tap, the multiplication of ADC data with filter coefficients is needed since the coefficients vary arbitrarily. For this purpose, the 25x18 multiplier in the "DSP48E1" slices of the Xilinx FPGA is used. 27 ADC data have a width of 16 bits, and the coefficients are chosen to be represented with 25-bit signed-type in order to use one multiplier per filter tap. For filter calculation, the Xilinx IP core "FIR compiler" is used and configured to be clocked with single rate since the frequency of the ADC equals the frequency of the FPGA clock. 28 Due to the higher computing effort, the optimum filters cause higher latency than the trapezoid filters. Additional latency compared to the trapezoid filter is in between 32 and 89 clock cycles depending on the filter length. This is compensated by delaying pulse detection for the additional latency. 
The overall hardware resource utilization of the FPGA design is 9701 lookup tables (LUTs), 14,071 flip-flops, 13 block RAMs with 36 Kbit each, and 84 DSP slices. The clock frequency is 80 MHz.
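The fixed-point arithmetic described in this section can be illustrated in software. The following Python sketch is not the FPGA design itself: the coefficient values are made up, and the rule for scaling them to the signed 25-bit range is an assumption made for illustration. It quantizes a small zero-area filter with a flat-top gap, applies it to an ideal step using one multiplication per tap, and normalizes the result to ADC counts.

```python
COEFF_BITS = 25  # signed coefficient width stated in the text
DATA_BITS = 16   # ADC sample width stated in the text


def quantize_coefficients(coeffs, bits=COEFF_BITS):
    """Map float coefficients to signed integers of the given width.

    Assumption of this sketch: the largest magnitude is scaled to the
    largest positive value of the signed range.
    """
    max_int = 2 ** (bits - 1) - 1
    peak = max(abs(c) for c in coeffs)
    return [round(c * max_int / peak) for c in coeffs]


def fir_direct(samples, int_coeffs):
    """Direct-form FIR filter: one multiplication per tap and output value."""
    n_taps = len(int_coeffs)
    out = []
    for n in range(n_taps - 1, len(samples)):
        acc = 0
        for k, c in enumerate(int_coeffs):
            acc += c * samples[n - k]  # int_coeffs[0] weights the newest sample
        out.append(acc)
    return out


# Hypothetical filter: positive lobe, two-tap gap, negative lobe (zero area)
coeffs = [0.9, 1.0, 1.1, 0.0, 0.0, -1.1, -1.0, -0.9]
int_coeffs = quantize_coefficients(coeffs)
gain = sum(c for c in int_coeffs if c > 0)

# Ideal step: baseline of 1000 ADC counts, step height of 2000 ADC counts
samples = [1000] * 20 + [3000] * 20
heights = [y / gain for y in fir_direct(samples, int_coeffs)]

print("quantized coefficients:", int_coeffs)
print("maximum filter output in ADC counts:", max(heights))
```

Because the positive and negative lobes of this example are mirror images, the quantized coefficients still sum to zero and the baseline is rejected exactly. For arbitrary coefficients, rounding can leave a small residual area, which would have to be checked against the required constraint precision.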

| Evaluation of noise reduction
In order to experimentally evaluate the noise reduction and to compare the performance of the calculated filters with that of the trapezoid filter, the silicon drift detector is irradiated with a 55Fe X-ray source. Spectrum data are acquired in relatively long runs of 10 min in order to decrease statistical influences. Five runs each were done using the optimum filter and the trapezoid filter of the same length. The rate of X-ray signals was chosen to be 100,000 counts per second, which is a typical value for applications desiring fast spectrum acquisition while using short filters. Signal throughput is equal for the optimum and the trapezoid filter since the nominal filter length and the applied pile-up criteria are the same.
FIGURE 8 Block diagram of functions within the digital signal processing
Figure 9 shows, as an example, two spectra acquired with the 600 ns filter. The spectrum acquired using the optimum filter matches well with the spectrum acquired using the trapezoid filter. No unwanted artifacts due to bad filter shape can be observed. Deviations in the shape of the spectra only occur in energy regions with a low number of counts, due to statistical fluctuations, and for unresolved pile-ups, due to the different filter shapes. However, the widths of the peaks in the spectra differ due to the different noise reductions of the filters. In order to quantitatively evaluate the noise reduction of the filters, the full width at half maximum (FWHM) of the Mn-Kα peak at 5.89 keV is determined by fitting a Gaussian peak to the spectrum. 29 By quadratic subtraction of the Fano-noise contribution FWHM_Fano from the measured width of the Mn-Kα peak FWHM_exp, the FWHM of the electronic noise FWHM_el can be calculated in order to get an energy-independent noise measure 7:
$$\mathrm{FWHM}_{\mathrm{el}} = \sqrt{\mathrm{FWHM}_{\mathrm{exp}}^{2} - \mathrm{FWHM}_{\mathrm{Fano}}^{2}}$$
Table 2 shows the determined values for the FWHM of the electronic noise in the spectra. Each of the entries represents the mean and standard deviation of the five repeated measurements. Noise reduction improves with increasing filter length.
Superior noise reduction is achieved by the optimum filter at all filter lengths. FWHM_el is reduced by (2.0 ± 0.1)% (300 ns filter) to (5.2 ± 0.3)% (1,000 ns filter) when using the optimum filter. The improvement increases with filter length, with the exception of the 800 ns filter. A possible reason for this trend is the larger number of degrees of freedom in the optimization due to the larger number of filter coefficients. Furthermore, the trapezoid filter becomes less optimal for longer filters as the influence of flicker noise increases. 8 The results in Table 2 show the successful utilization of the digital penalized least mean square method for signal processing with modern silicon drift detectors. However, under lab conditions the improvements in electronic noise filtering achieved by the optimum filters do not exceed several percent compared to trapezoid filters. Trapezoid filters therefore already offer near-optimum reduction of electronic noise in the setup used, since the noise has a high portion of white noise. Nevertheless, the presented filters can be useful for the detection of characteristic X-rays at low energies, for example, the Kα peaks of Ne, F, and O, since the influence of electronic noise on these spectral lines is high. The presented workflow can also be used in noisier environments, that is, in applications dealing with higher amounts of pick-up noise that can be suppressed with optimized filters.
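The quadratic subtraction of the Fano contribution used in this section can be sketched in a few lines of Python. The Fano factor (0.115) and the mean electron-hole pair creation energy in silicon (3.65 eV) are assumed typical values and are not taken from this work. The two peak widths are hypothetical placeholders, not entries from Table 2.

```python
import math

FANO_FACTOR = 0.115    # assumed typical value for silicon
PAIR_ENERGY_EV = 3.65  # assumed mean energy per electron-hole pair in eV
SIGMA_TO_FWHM = 2.0 * math.sqrt(2.0 * math.log(2.0))  # Gaussian sigma to FWHM


def fwhm_fano(energy_ev):
    """FWHM of the Fano-limited line broadening at the given photon energy."""
    return SIGMA_TO_FWHM * math.sqrt(FANO_FACTOR * PAIR_ENERGY_EV * energy_ev)


def fwhm_electronic(fwhm_exp_ev, energy_ev):
    """Quadratic subtraction of the Fano contribution from the measured FWHM."""
    fano = fwhm_fano(energy_ev)
    if fwhm_exp_ev < fano:
        raise ValueError("measured FWHM is below the assumed Fano limit")
    return math.sqrt(fwhm_exp_ev ** 2 - fano ** 2)


def relative_improvement(noise_reference, noise_optimum):
    """Reduction of the electronic noise in percent of the reference value."""
    return 100.0 * (noise_reference - noise_optimum) / noise_reference


MN_KA_EV = 5890.0       # Mn-K-alpha energy as given in the text
fwhm_trapezoid = 140.0  # hypothetical measured peak width in eV
fwhm_optimum = 138.5    # hypothetical measured peak width in eV

el_trapezoid = fwhm_electronic(fwhm_trapezoid, MN_KA_EV)
el_optimum = fwhm_electronic(fwhm_optimum, MN_KA_EV)
improvement = relative_improvement(el_trapezoid, el_optimum)

print(f"Fano FWHM: {fwhm_fano(MN_KA_EV):.1f} eV")
print(f"Electronic noise FWHM: {el_trapezoid:.1f} eV vs {el_optimum:.1f} eV")
print(f"Relative improvement: {improvement:.1f} %")
```

The example also illustrates why FWHM_el is the more sensitive measure: a small change in the measured peak width corresponds to a noticeably larger relative change in the electronic noise once the fixed Fano contribution is removed.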

| SUMMARY AND OUTLOOK
High-resolution X-ray spectroscopy using silicon drift detectors requires detection setups with minimum influence of electronic noise on the spectra to achieve low detection limits, for example, in nondestructive material analysis. To extract X-ray energy values from the output signal of a silicon drift detector, FIR filters are applied to the digitized signal using an FPGA. Since the noise reduction of the FIR filter has a high impact on the energy resolution achieved by the detection system, this work presents the calculation of optimum filters for silicon drift detectors. Besides maximizing the signal-to-noise ratio, the filters have to fulfill time-domain constraints such as finite duration, flat-top, and zero area. The digital penalized least mean square (DPLMS) method is a known method for the constrained optimization of filters. Adaptations are made to the method regarding the consideration of signal rise-time, parallel filter optimization for multiple noisy signals, and the optimization algorithm used. The workflow for optimum filter calculation is presented: Signal data are acquired using a spectroscopy setup in order to gain information about pulse shape and noise. Constraints regarding the time-domain transfer function of the filters are written out to prevent ballistic deficits and dependency of the baseline on the DC level. A method to derive limits for the minimum precision of constraint fulfillment is presented, considering the influence of nonperfect constraint fulfillment at low and high X-ray energies based on the limits of the detection efficiency of the silicon drift detector. Filters with lengths between 300 and 1,000 ns are calculated for applications operating at high X-ray photon rates in order to reduce data acquisition time. The resulting filter shapes are presented and experimentally tested by implementation in hardware.
FIGURE 9 55Fe spectra acquired using the 600 ns trapezoid and optimum filter
55Fe spectra are recorded using the optimum filters, and the influence of electronic noise is calculated. The noise reduction is compared to that of trapezoid filters, since trapezoid filters are known for excellent noise reduction at short filter lengths. Nevertheless, the optimum filters are found to improve the noise reduction by between (2.0 ± 0.1)% for the 300 ns filter and (5.2 ± 0.3)% for the 1,000 ns filter. Therefore, successful utilization of the digital penalized least mean square method for signal processing with modern silicon drift detectors is shown, and numerical results of the improvements are presented. The presented filters might be useful for the detection of low-energy characteristic X-rays in material analysis. Since the improvement of noise reduction under lab conditions is relatively low, optimum filters will be calculated and tested in noisier environments in future work using the presented workflow. In some applications, the presence of pick-up noise, for example, generated by X-ray tube electronics, cannot be avoided, and the utilization of optimum filters might be more beneficial. In future work, longer optimum filters will also be calculated and tested for use in applications operating at lower X-ray photon rates. In these applications, short filters are not required, and the improved noise reduction of longer filters can be exploited. The method might also be deployed under conditions with different noise behavior. Examples are large-area silicon drift detectors with increased leakage current or silicon drift detectors operated at higher chip temperature. In both cases, a higher noise level and an increased portion of flicker noise can be seen. For differing noise behavior, the filter shapes are expected to adapt to the noise environment, for example, to become more cusp-like for higher portions of flicker noise.
Furthermore, optimum filters might be calculated and deployed not only for the acquisition of energy values but also for pulse detection and baseline correction.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.