Image quality evaluation for FIB‐SEM images

Focused ion beam scanning electron microscopy (FIB‐SEM) tomography is a serial sectioning technique where an FIB mills off slices from the material sample that is being analysed. After every slicing, an SEM image is taken showing the newly exposed layer of the sample. By combining all slices in a stack, a 3D image of the material is generated. However, specific artefacts caused by the imaging technique distort the images, hampering the morphological analysis of the structure. Typical quality problems in microscopy imaging are noise and lack of contrast or focus. Moreover, specific artefacts are caused by the FIB milling, namely, curtaining and charging artefacts. We propose quality indices for the evaluation of the quality of FIB‐SEM data sets. The indices are validated on synthetic and experimental data of different structures and materials.

vicinity. Clearly, a certain image quality is a prerequisite for correct segmentation. This general fact gains particular importance in the case of FIB-SEM images of highly porous media. In these, material from deeper layers is visible through the pore phase (shine-through artefacts), which additionally hampers correct segmentation into pore space and solid component. [9][10][11][12][13] However, their successful application depends even more critically on the image quality.
Resolution and brightness of the images depend on the electron beam acceleration voltage [14]. Increasing the electron beam size can blur the SEM images [15]. Increasing the dwelling time, that is, the amount of time that the electron beam illuminates a single pixel, decreases the noise. In practice, there are many more degrees of freedom and the material and the electron beam interact in a complex way. FIB-SEM imaging parameters are chosen by experience and optimized by trial and error to yield images as good as needed at reasonable effort. As the imaging process is time-consuming and might be destructive, insufficient image quality should be detected early. Objective quality indices, applicable to a few slices at the beginning of the imaging process, can help to decide whether the imaging should be continued.
[17][18] In contrast, methods for evaluating the quality of microscopic images are limited and, to the best of our knowledge, there are no dedicated methods for FIB-SEM image stacks. An index frequently used for SEM images is the signal-to-noise ratio [19,20]. However, its estimation from an SEM signal is very difficult because it depends on the characteristics of the specimens as well as on the SEM operating conditions [21]. There is no standard that could be transferred between measurements of different samples. [24][25] For individual SEM images, recent no-reference methods [26,27] evaluate blurring by combining gradients and grey value statistics. Also, Wang et al. [28] employ a neural network trained on 650 high- and low-quality SEM images of ants, metal, stamens, colloids and minerals (details described in Li et al. [29]) to classify SEM images into 'good' and 'bad' quality. Their focus is on richly textured images of separate objects, in contrast to the homogeneous material samples studied here. Finally, Koho et al. [30] sort microscopic images according to quality based on image statistics in both the spatial and the frequency domain.
When evaluating image quality, multiple criteria should be considered separately. Clearly, images should be noise-free and feature sharp edges and a good grey value contrast between the components. FIB-SEM-specific artefacts are curtaining and charging. Curtaining artefacts are caused by the ion beam milling through phases with different densities and appear as vertical, thin, uneven bands on one or a few consecutive slices (see Figure 1C,E). Charging is a commonly known problem in SEM imaging of electrically insulating samples [31]. High-energy electrons hit the non-conductive sample, charge builds up rapidly on the material's surface and in deeper layers, and finally causes artefacts such as the bright regions in Figure 1(F). Even worse, charging disturbs both the electron and the ion beam. As a consequence, the signal deteriorates visibly, for example, in the low contrast in Figure 1(I), and the FIB cannot mill off the material completely, leaving lamellae of material that block the view on subsequent slices [32]. Often, quality evaluation is based on comparing the input image with a given high-quality reference image. In contrast, no-reference methods do not require such a reference image and are therefore more suitable for our setting.
[34][35] For measuring the severity of curtaining and charging artefacts, we introduce dedicated novel quality indices. We test the five indices extensively on simulated and experimental FIB-SEM data sets with varying characteristics. The indices allow for quick and efficient evaluation of FIB-SEM images and coincide well with visual impression, which in turn is strongly correlated with the chances to obtain a good segmentation of the spatial structure. Finally, we suggest ways to correct the detected flaws and show how the success of the correction can, again, be judged based on the indices.
This paper is organized as follows: In Section 2, we describe the synthetic and real FIB-SEM image data used, define our indices for image quality evaluation, and suggest quick remedies for flaws reported by them. In Section 3, we apply the indices and remedies where needed. Finally, conclusions are drawn in Section 4.

MATERIALS AND METHODS
In Section 2.

Image data
In the following two sections, we describe the FIB-SEM stacks used throughout this paper. Obviously, it is

Synthetic images
Besides the real images, we use synthetic FIB-SEM data obtained using the simulation tool described by Prill and Schladitz [36]. The microstructure is generated as a realization of a Boolean model [37][38][39], which is a random closed set model given by the union of grains centred at the points of a homogeneous Poisson point process. Here, the grains are spherical and have a constant radius of 9 voxels and the Boolean model has a porosity of 65%.
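The Boolean model of balls described above can be sketched in a few lines. This is only an illustrative sketch, not the simulation tool of Prill and Schladitz: the function name `boolean_model_spheres`, the window size and the seed are hypothetical choices. It uses the standard relation between the porosity $p$ and the intensity of the Poisson process for a Boolean model, $p = \exp(-\lambda \cdot \tfrac{4}{3}\pi r^3)$, to derive the number of balls from the target porosity.

```python
import numpy as np

def boolean_model_spheres(shape, radius, porosity, seed=0):
    """Return a binary volume: True = solid (union of balls), False = pore."""
    rng = np.random.default_rng(seed)
    ball_volume = 4.0 / 3.0 * np.pi * radius**3
    # Boolean model: porosity = exp(-intensity * grain volume).
    intensity = -np.log(porosity) / ball_volume
    # Sample centres in a window enlarged by the radius so that balls
    # centred outside the volume can still reach into it (edge correction).
    extent = np.array(shape, dtype=float)
    n_centres = rng.poisson(intensity * np.prod(extent + 2 * radius))
    centres = rng.uniform(-radius, extent + radius, size=(n_centres, 3))

    solid = np.zeros(shape, dtype=bool)
    for c in centres:
        # Bounding box of the ball, clipped to the volume.
        lo = np.maximum(np.floor(c - radius).astype(int), 0)
        hi = np.minimum(np.ceil(c + radius).astype(int) + 1, shape)
        z, y, x = np.ogrid[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
        inside = (z - c[0])**2 + (y - c[1])**2 + (x - c[2])**2 <= radius**2
        solid[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] |= inside
    return solid

# Parameters from the text: radius 9 voxels, porosity 65%.
volume = boolean_model_spheres((96, 96, 96), radius=9, porosity=0.65, seed=1)
print("realized porosity:", 1.0 - volume.mean())
```

The realized porosity fluctuates around the target value because both the number of balls and their positions are random; larger windows reduce this fluctuation.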
For the simulation of the FIB-SEM imaging, we assume that the balls are made from carbon while the complement of the ball system is air. FIB-SEM images are generated by simulating electron diffusion through the microstructure. Electron paths are simulated using the Monte Carlo method of MONSEL II by Lowney [40]. Various acceleration methods introduced by Prill and Schladitz [36] allow for a simulation of physically sound FIB-SEM stacks in reasonable time. These synthetic data enable objective comparison of processing methods as well as training of machine learning methods (see Salzer et al. [8], Fend et al. [13] and Roldán et al. [41]). Depending on the imaging parameters used in the simulation, the synthetic images also feature variations in contrast, blur and noise, as visible in Figure 2.

Image quality indices
Due to the sequential imaging and the FIB sectioning in between, image quality in a 3D stack can differ considerably from slice to slice. Hence, we assess image quality separately for each slice and therefore introduce quality indices for 2D images in the following. By a 2D image, we understand a function $f : (\mathbb{Z}^+)^2 \cap W \to \mathbb{R}$, where $W \subset \mathbb{R}^2$ is a compact rectangular window, such that for a point $(x, y) \in (\mathbb{Z}^+)^2 \cap W$ the image grey value at that point is given by $f(x, y)$. A pixel is defined by the triple $(x, y, f(x, y))$, and for simplicity, we will only write $(x, y)$ below. If $0 \le x < m$ and $0 \le y < n$, we say that the image $f$ has size $m \times n$.
In the following, we introduce indices indicating noise, blur, missing grey value contrast, curtaining and charging. All five indices are constructed to have range [0,1]. The value 1 is reached by a perfect image only, whereas values close to 0 are a sign of serious flaws.

Noise index
Empirical studies concerning the type of noise in SEM images suggest that the noise follows a Gaussian distribution [42]. Hence, the key to characterization of the noise level as well as noise removal is the estimation of the parameters, in particular the variance, of the noise distribution [34,43,44]. Liu et al. [34] classify estimation methods as filter-based, patch-based and statistical. [47] Patch-based approaches decompose the image into image patches and estimate the noise level from selected patches [34,48]. Zoran and Weiss's [49] statistical approach estimates noise variance by modelling the change of kurtosis values due to adding noise to the image. For a summary of further methods, we refer to the reviews of Pyatykh et al. [48] and Liu et al. [34]. Pyatykh et al. [48] introduced a patch-based approach based on a principal component analysis. When evaluated on two benchmark data sets, their algorithm resulted in the highest accuracy and was observed to be much faster than competing methods with similar accuracy. Here, we use Liu et al.'s [34] adaptation of the method, which is similarly accurate but further reduces the run time. In the following, we briefly sketch the method. Liu et al. [34] provide a MATLAB implementation.
We assume that the observed image $f = g + \varepsilon$ is an additive decomposition of a noise-free image $g$ and Gaussian white noise $\varepsilon$ with mean 0 and variance $\sigma^2$ that is independent of $g$. For estimating the noise variance, square image patches of size $d \times d$ are defined. We follow the recommendation of Liu et al. [34] by setting $d = 7$. Sliding the patch centre over the image, an input image of size $m \times n$ will generate a sample of $s = (m - d + 1) \cdot (n - d + 1)$ patches. In vectorized form, we get $y_i = z_i + \varepsilon_i$, $i = 1, \dots, s$, with $\varepsilon_i \sim \mathcal{N}(0, \sigma^2 I)$ and independent $z_i$ and $\varepsilon_i$. For simplicity, the patches are assumed to be uncorrelated, even though this is not the case for overlapping patches. Estimation of the noise variance is now based on the following idea: Assume that an $r$-dimensional random vector $Y$ can be decomposed into $Y = Z + E$ such that $Z$ and $E$ are independent and $E \sim \mathcal{N}(0, \sigma^2 I)$. Denoting the covariance matrix of a vector $X$ by $\Sigma_X$, we get $\Sigma_Y = \Sigma_Z + \sigma^2 I$. Let $u \in \mathbb{R}^r$ be a unit vector. We project the data $Y$ onto the axis spanned by $u$ and compute the variance of the result. Due to the independence of $Z$ and $E$, we get $\mathrm{var}(u^T Y) = \mathrm{var}(u^T Z) + \sigma^2$.
This implies that the direction $u_{\min}$ of minimal variance is identical for $Y$ and $Z$. The direction $u_{\min}$ is obtained as the eigenvector associated to the minimal eigenvalue $\lambda_{\min}(\Sigma_Y)$ of $\Sigma_Y$. Thus, $\lambda_{\min}(\Sigma_Y) = \lambda_{\min}(\Sigma_Z) + \sigma^2$. While $\lambda_{\min}(\Sigma_Y)$ can be estimated from the empirical covariance matrix of a noisy image, estimation of $\lambda_{\min}(\Sigma_Z)$ is not possible. However, if we can assume that the data $Z$ are contained in a subspace of $\mathbb{R}^r$ of dimension smaller than $r$, we get $\lambda_{\min}(\Sigma_Z) = 0$ such that $\hat{\sigma}^2 = \hat{\lambda}_{\min}(\Sigma_Y)$.
For image patches, this assumption is fulfilled if the images do not contain too fine texture details (see Liu et al. [34]). The microscopy images considered here are expected to fulfil this condition.
In case of doubt, fine texture patches can be detected and removed from the sample. To do so, Liu et al. [34] use the trace of the gradient covariance matrix of the image patch as a measure of texture strength. A patch is discarded if its texture strength is higher than a texture strength threshold. This threshold is chosen as the $(1 - \delta)$-quantile ($\delta = 10^{-6}$) of a gamma distribution whose parameters can be estimated from the image. In particular, the scale parameter depends on the noise level $\sigma^2$. Strongly textured patches influence the noise level estimate $\hat{\sigma}^2$ and, hence, the threshold. To remove this influence, an iterative procedure is chosen: Noise level estimation, threshold selection and deletion of patches are successively repeated until the estimated noise level $\hat{\sigma}^2$ does not change any more.
To define a noise index with values between 0 and 1, we consider an upper bound $t$ for the noise standard deviation, such that the noise index of images with $\hat{\sigma} > t$ is set to 0, while a noise-free image ($\hat{\sigma} = 0$) receives the value 1. Our threshold selection is based on the $3\sigma$ rule for the Gaussian distribution, which states that more than 99% of the distribution mass is contained within three standard deviations of the mean. We consider a grey value range of [0,1]. A Gaussian kernel with standard deviation $\sigma$ centred at the central grey value 0.5 will fit in the interval [0,1] if $3\sigma \le 0.5$, hence $\sigma \le 0.167$. For larger $\sigma$, a significant amount of noisy grey values will exit the allowed grey value range of [0,1]. This motivates our choice of $t = 0.16$. See Figure 3 for a toy example illustrating the noise index.
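The core of the patch-based estimate, the smallest eigenvalue of the covariance matrix of all 7 × 7 patches, can be sketched as follows. This is a simplified illustration and not the MATLAB implementation of Liu et al.: the iterative removal of strongly textured patches is omitted, and the function names are hypothetical. The mapping from the estimated standard deviation to the index is assumed here to be linear between 0 and the upper bound 0.16; only the two end points of that mapping follow from the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def estimate_noise_sigma(image, patch_size=7):
    """Noise standard deviation from the minimal patch-covariance eigenvalue."""
    patches = sliding_window_view(image, (patch_size, patch_size))
    data = patches.reshape(-1, patch_size * patch_size)  # one row per patch
    cov = np.cov(data, rowvar=False)                     # empirical covariance
    lam_min = np.linalg.eigvalsh(cov)[0]                 # eigenvalues ascending
    return float(np.sqrt(max(lam_min, 0.0)))

def noise_index(sigma_hat, upper_bound=0.16):
    """ASSUMPTION: linear mapping, 1 for no noise, 0 at or above the bound."""
    return float(max(0.0, 1.0 - sigma_hat / upper_bound))

# Toy example: a smooth ramp (patches lie in a low-dimensional subspace)
# with additive Gaussian noise of standard deviation 0.05.
rng = np.random.default_rng(0)
ramp = np.linspace(0.2, 0.8, 200)[None, :] * np.ones((200, 1))
noisy = ramp + rng.normal(0.0, 0.05, ramp.shape)
sigma_hat = estimate_noise_sigma(noisy)
print(sigma_hat, noise_index(sigma_hat))
```

Because the smallest eigenvalue of an empirical covariance matrix is biased downwards for finite samples, the estimate in such a toy example tends to fall slightly below the true value.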

Blur index
Gradient, variance and frequency domain-based metrics to measure the blur have been proposed by Crété-Roffet et al. [35], Windisch and Kozlovszky [50,51] and Erasmus and Smith [52]. [54][55][56][57][58] Recent no-reference methods for blurring evaluation in SEM images are proposed by Wang et al. [26] and Li et al. [27]. Wang et al. [26] present an elaborate algorithm for blurring measurement that decomposes the original image into two images, one of which is a cartoon part. In Li et al.'s [27] method, a grey scale erosion of the image is computed first. Edges are then detected by a Sobel filter followed by noise filtering, such that the structural edges are preserved. Finally, the maximum and the average gradient are computed and combined into a blurring index.
Due to the low degree of texture details of our images, we decided to adapt the simple gradient approach of Crété-Roffet et al. [35], comparing the gradients in the original image $f$ and a blurred version $b(f)$. If $f$ is sharp, then the gradients of $f$ and $b(f)$ differ more strongly than if $f$ is blurred. Crété-Roffet et al. [35] blur by horizontal and vertical linear mean filters to detect directed motion blur. As this is not expected in microscopy images, we replace these by a single square mean filter. Figure 4 shows an example.
Formally, the approach is defined as follows: Starting from an input image $f$, a blurred version $b(f)$ is obtained by applying a mean filter with mask size 9 × 9 pixels. Next, we consider the absolute values of the gradient images of $f$ and $b(f)$, denoted by $g_f$ and $g_{b(f)}$, respectively. The gradient images are compared pixelwise, the result is summed over all pixel values, and the sums are combined into the blur metric. This index has values between 0 and 1, where 0 means poor and 1 excellent.
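A minimal sketch of a gradient-comparison blur index in the spirit of Crété-Roffet et al. is given here. The exact combination of the gradient images is an assumption: the sketch sums the gradient magnitude that is lost under the 9 × 9 mean filter and relates it to the total gradient magnitude of the input, which yields a value in [0,1] that is larger for sharper images. The function names and the treatment of constant images are hypothetical choices.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def mean_filter(image, size=9):
    """Square mean filter with replicated borders."""
    pad = size // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    return sliding_window_view(padded, (size, size)).mean(axis=(-2, -1))

def gradient_magnitude(image):
    gy, gx = np.gradient(image.astype(float))
    return np.hypot(gx, gy)

def blur_index(image, size=9):
    """ASSUMED form: share of gradient magnitude removed by re-blurring."""
    g_f = gradient_magnitude(image)
    g_b = gradient_magnitude(mean_filter(image, size))
    lost = np.maximum(g_f - g_b, 0.0)  # gradient lost by blurring, per pixel
    total = g_f.sum()
    # A constant image has no edges to judge; returning 0 is an assumption.
    return float(lost.sum() / total) if total > 0 else 0.0

# Toy example: a sharp step edge versus the same edge after blurring.
step = np.zeros((64, 64))
step[:, 32:] = 1.0
print(blur_index(step), blur_index(mean_filter(step)))
```

Re-blurring an already blurred image changes its gradients only a little, so the index drops, which matches the qualitative behaviour described in the text.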

Contrast index
Differences in intensity create image contrast, allowing individual features and structural details to become visible.
In microscopy, the contrast depends on several factors such as the chemical composition and microstructure geometry of the material to be imaged, the spatial resolution and microscope characteristics. Several contrast metrics for images have been proposed, see, for example, the summary in Olsen et al. [33]. Basically, one can distinguish global contrast metrics that characterize contrast in the whole image and local metrics that evaluate contrast in the neighbourhood of each pixel. The latter is of interest for natural images which may show an object of interest in front of a rather homogeneous background. Here, we are interested in images of spatially homogeneous microstructures. Hence, we restrict attention to global contrast metrics which measure to which extent the available grey value range has been exploited but do not consider the spatial distribution of contrast in the image. One such measure is the Michelson contrast, which is defined as the ratio between the difference and the sum of the maximum and minimum grey values. A drawback of this measure is that it can be easily biased even by single noisy pixels.
A metric that is more robust with respect to changes of single pixels is the root mean squared (RMS) contrast, that is, the standard deviation of the grey values of an image $f$ of size $m \times n$, $\mathrm{RMS}(f) = \sqrt{\frac{1}{mn} \sum_{x=0}^{m-1} \sum_{y=0}^{n-1} \left(f(x, y) - \bar{f}\right)^2}$, where $\bar{f}$ is the mean grey value. To obtain a standardized measure, we assume that the grey value range has been scaled to the interval [0,1]. [60][61][62] It is experimentally proven to be in accordance with human perception. The RMS takes values between 0 (constant image) and 0.5 (image with grey values 0 and 1 in equal proportion) [65]. The maximum RMS value observed on all image data available in this study is approximately 0.3, which roughly coincides with the value expected for a uniform distribution of grey values. Setting this value as an upper limit, we define a contrast index with range [0,1] by $I_{\mathrm{con}} = \min(\mathrm{RMS}(f)/0.3, 1)$.
A value of $I_{\mathrm{con}}$ close to 0 represents almost constant grey values, hence poor contrast, while for high contrast images we will get $I_{\mathrm{con}} \approx 1$. See Figure 5 for some examples.
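The contrast index is simple enough to state directly in code. The sketch assumes grey values already scaled to [0,1] and uses the population standard deviation; the function names are illustrative. It also checks the two reference values mentioned in the text: an image with grey values 0 and 1 in equal proportion has RMS 0.5, and uniformly distributed grey values have RMS $1/\sqrt{12} \approx 0.289$, close to the upper limit 0.3.

```python
import numpy as np

def rms_contrast(image):
    """Standard deviation of the grey values (population form, 1/(m*n))."""
    return float(np.std(image))

def contrast_index(image, rms_max=0.3):
    """Contrast index min(RMS / 0.3, 1) for grey values scaled to [0, 1]."""
    return min(rms_contrast(image) / rms_max, 1.0)

constant = np.full((64, 64), 0.5)
binary = np.zeros((64, 64))
binary[:, 32:] = 1.0                          # grey values 0 and 1, half each
uniform = np.linspace(0.0, 1.0, 64 * 64).reshape(64, 64)

print(contrast_index(constant))   # constant image: no contrast
print(rms_contrast(binary))       # 0.5, so the index is capped at 1
print(rms_contrast(uniform))      # close to 1/sqrt(12)
```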

Curtaining index
Curtaining artefacts appear in the $y$-axis direction, and their shape varies from thin and deep stripes to thick and shallow bands. In the literature, the focus is on removal of stripe artefacts rather than their quantification. See Section 2.3.4 for details. [67][68] We adopt ideas of Münch et al. [68] to derive an index quantifying curtaining.
Vertical stripe artefacts in an image $f$ produce frequency components in the horizontal direction $x$ in the Fourier transform $\hat{f}$ of $f$. Curtaining information can be emphasized by computing the directional gradient along the $x$-direction prior to the Fourier transform. Figure 6(A,B) shows an image with curtaining artefacts and its $x$-gradient. The stripe information is condensed to the abscissa in Figure 6(C).
Frequency decurtaining methods perform well on frequencies near the origin. The stripes caused by curtaining are parallel to the $y$-axis. Thus, in frequency space, we have to look for peaks along the $x$-axis. Therefore, to quantify the degree of curtaining, we use a window elongated in $x$-direction. From our experience, a region of width 120 pixels and height 30 pixels is sufficient to collect the frequencies of interest and observe the curtaining-induced peaks.
For this region, we define a binary image $B$ of the same size as the box as follows. For each column of $\hat{f}$, the maximal grey value within the box is detected. Its location is marked by a white pixel (value 1) in $B$. The remaining pixels of $B$ are set to black (value 0). Examples are shown in the top of Figure 7. In the next step, the row sums of $B$ are computed and normalized by the width of the box to obtain a resultant vector $v$. For images with ideal stripes, the index obtains the minimum value of 0. In this case, the maximum locations of $\hat{f}$ are in the centre of the image, and the resultant binary image $B$ consists of a horizontal line in the middle. For clean images, we assume that the maximum locations are uniformly distributed. Thus, the binary image consists of scattered white pixels with an expected number of four pixels per row. As we consider the sum of three normalized entries of $v$, the maximum value is given by $1 - 12/120 = 0.9$. Therefore, to exploit the full index range [0,1], we rescale the index by setting $I_{\mathrm{cur}} = \tilde{I}_{\mathrm{cur}}/0.9$, where $\tilde{I}_{\mathrm{cur}}$ denotes the index before rescaling. With this setting, $I_{\mathrm{cur}} = 0$ means strong curtaining and $I_{\mathrm{cur}} = 1$ means free of artefacts. For instance, the toy example images (Figure 5) show two straight vertical stripes resembling strong curtaining artefacts. Thus, the frequency domain exhibits horizontal frequencies corresponding to the stripes. Curtaining values in these cases are close to zero. The
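The construction above can be sketched as follows. Several details are assumptions made for illustration only: the index before rescaling is taken to be one minus the sum of the three central entries of the normalized row-sum vector (the rows around vertical frequency zero), the box is centred on the origin of the shifted spectrum, and values above 1 are clipped. The function name `curtaining_index` is hypothetical.

```python
import numpy as np

def curtaining_index(image, box_width=120, box_height=30):
    """Sketch of a frequency-domain curtaining index (1 = no stripes)."""
    h, w = image.shape
    assert h >= box_height and w >= box_width, "image smaller than the box"
    # Emphasize vertical stripes by the gradient in x-direction (axis 1).
    grad_x = np.gradient(image.astype(float), axis=1)
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(grad_x)))
    cy, cx = h // 2, w // 2                    # zero frequency after the shift
    box = spectrum[cy - box_height // 2: cy + box_height // 2,
                   cx - box_width // 2: cx + box_width // 2]
    # Binary image: mark the location of the maximum of each column.
    binary = np.zeros(box.shape, dtype=int)
    binary[np.argmax(box, axis=0), np.arange(box.shape[1])] = 1
    v = binary.sum(axis=1) / box.shape[1]      # normalized row sums
    centre = box_height // 2                   # row of vertical frequency zero
    # ASSUMPTION: raw index = 1 - sum of the three central entries of v.
    raw = 1.0 - v[centre - 1: centre + 2].sum()
    # Expected raw value for uniformly scattered maxima: 0.9 for 30 rows.
    scale = 1.0 - 3.0 / box_height
    return float(np.clip(raw / scale, 0.0, 1.0))

rng = np.random.default_rng(0)
stripes = np.tile(rng.random(128), (128, 1))   # purely vertical stripes
noise = rng.random((128, 128))                 # no preferred direction
print(curtaining_index(stripes), curtaining_index(noise))
```

For the striped toy image, all column maxima fall on the central row of the box, so the index is close to 0, whereas for the isotropic noise image the maxima are scattered and the index is close to 1.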

Charging index
Charging artefacts develop when a material cannot adequately conduct the charges generated by the interaction of the material with the electron beam during SEM scanning.
The ratio between electrons absorbed from the sample to electrons emitted as secondary electrons, backscattered electrons and Auger electrons determines the charges that are building up on the sample surface. These can be both positive and negative. An electrical potential on the sample surface generates bright or dark spots and causes image distortions as electron production is artificially changed (see Figure 1C,D,F). Note that the low contrast in Figure 1(I) is caused by charging, too, as the electrons are deflected. This is, however, reported by the contrast index defined above. We extract information from the image histogram, as the histogram of an image with charging typically has a significant number of pixels in the upper region. Thus, our index is based on grey values exceeding the 90th percentile $q_{90}$ of the image grey value distribution (see Figure 8). From these grey values, we define the charging index with range [0,1], where 0 is bad and 1 excellent.

Remedies
This section summarizes options for dealing with quality flaws detected by the suggested indices. It is neither intended to be a comprehensive review of all available methods nor a strict guideline. We rather collect practically helpful information on severity of the problem, abundance and costs of remedies. Whenever possible, optimization of the imaging set-up to avoid distortions should be preferred to later trying to enhance the acquired stack by image processing techniques. We restrict ourselves to 2D processing methods as our quality indices operate slice-wise. Moreover, successful application of 3D algorithms to FIB-SEM stacks typically demands pre-processing of the whole stack, in particular alignment and correction of grey value fluctuations. Processing the whole stack, however, contradicts our intention of providing tools for early intervention.

Denoising
Noise can be reduced experimentally by increasing the dwell time, the beam current or the acceleration voltage as well as by using other detection schemes. However, increasing the dwell time or the beam current will potentially lead to increased charging, and an increase in acceleration voltage will reduce the surface sensitivity, potentially increasing blurring. Alternatively, if the sample is exhibiting limited conductivity, it can be beneficial to change the scan strategy by averaging multiple scans from the same area.
In image processing, denoising is probably the main goal of classical image filtering. In general, denoising reduces high-frequency image components like edges, textures and noise and thus degrades the image. Obtaining a denoised image without degradation is a challenging open task [70]. Linear filters like the mean filter remove noise, but blur the image at the same time. Non-linear filters, such as the median, a weighted median [71] or more general rank value filters as, for example, described by Heygster [72], preserve details of the original image better. They substitute each pixel's grey value with the respective quantile or trimmed (or truncated) quantile of grey values within the filter mask. González-Ruiz et al. [47] propose a dedicated method for denoising FIB-SEM images of biological structures which employs local filtering to preserve the original structure within the image. We suggest a simple 3 × 3 or 5 × 5 median filter, which is usually sufficient for noise removal and preserves structural details quite well. Variational methods that denoise by minimizing a suitably defined energy function are harder to implement but allow incorporating knowledge on image content and imaging method (see, e.g., Teuber et al. [73]).
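As an illustration of the suggested remedy, a small median filter can be written with NumPy alone. This is a sketch, not the implementation used for the results; the function name and the border handling (replicated edges) are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter(image, size=3):
    """Median filter with a size x size mask and replicated borders."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))

# Toy example: a single outlier pixel is removed, a step edge is preserved.
image = np.zeros((16, 16))
image[:, 8:] = 1.0
image[4, 3] = 1.0                  # isolated bright pixel in the dark half
filtered = median_filter(image, size=3)
print(filtered[4, 3], filtered[4, 7], filtered[4, 8])
```

In contrast to a mean filter of the same size, the step edge keeps its position and height, which is why the median is a reasonable default for mild noise.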

Sharpening
Experimental approaches to sharpen the images can be to reduce the beam diameter, either by more accurate focusing and stigmation, which should be part of the routine microscope adjustment for every new image area, or by choosing optical conditions of the electron gun and the condenser system to yield a smaller beam. This second option is connected to a reduced intensity and thus a reduced signal-to-noise ratio. Furthermore, reducing the acceleration voltage can improve the surface sensitivity and thus help to sharpen surface features in the images.
[76] Blind deblurring methods use masks, too. However, these masks are computed from the blurred image [77,78]. We suggest the popular non-blind filtering with an unsharp mask [79]: First, derive a detail image $f_{\mathrm{detail}}$ consisting of the details that are removed by smoothing the blurred input image $f$ by the Gaussian filter $h_\sigma$. That is, $f_{\mathrm{detail}} = f - h_\sigma(f)$. The sharpened image is then obtained as the pixelwise weighted sum of the original $f$ and the detail image: $f_{\mathrm{sharp}} = f + \lambda f_{\mathrm{detail}}$, where $\lambda$ is the weight of the detail added to the original image. We use MATLAB's function imsharpen with default parameters $\sigma = 1$ and $\lambda = 0.8$ as suggested by Mathworks [80]. Noise or edge effects can occur as side effects but can be diminished by selecting a suitable mask or filtering locally on regions of the image.
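The unsharp mask formula can be illustrated by a short NumPy sketch. It is not a reimplementation of MATLAB's imsharpen, which offers further options; the kernel truncation at three standard deviations, the border handling and the function names are assumptions. The defaults mirror the Gaussian standard deviation 1 and the detail weight 0.8 quoted in the text.

```python
import numpy as np

def gaussian_filter(image, sigma):
    """Separable Gaussian smoothing with replicated borders."""
    radius = int(np.ceil(3 * sigma))           # ASSUMPTION: truncate at 3 sigma
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(image.astype(float), radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def unsharp_mask(image, sigma=1.0, amount=0.8):
    """Sharpened image: original plus weighted detail (original - smoothed)."""
    detail = image - gaussian_filter(image, sigma)
    return image + amount * detail

# Toy example: a step edge is steepened and shows over- and undershoot.
step = np.zeros((32, 32))
step[:, 16:] = 1.0
sharp = unsharp_mask(step)
print(sharp.min(), sharp.max())
```

The over- and undershoot next to the edge is the side effect mentioned in the text; grey values may need to be clipped back to the valid range afterwards.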

Contrast enhancement
The image contrast should always be maximized experimentally prior to acquiring the final image by tuning the gain and offset of the detector. Care should be taken in particular when imaging many consecutive slices, as the contrast might change in some of the upcoming slices and oversaturation or intensity cut-offs need to be prevented. In image processing, there is a wide variety of contrast-enhancing methods. Direct methods like those of de Haan [81], Cheng and Xu [82] and Beghdadi and Le Negrate [83] enhance images by maximizing some quality measure. Lacking a generally applicable metric for maximizing contrast in SEM images, we choose an indirect approach like those of Sherrier and Johnson [84], Polesel et al. [85] and Arici et al. [86]. Indirect methods exploit the dynamic range of the image's grey values for enhancing. [94][95][96] For a thorough review of contrast correction methods, we direct the interested reader to Kaplan et al. [97]. Ordinary histogram equalization has limitations in images with regions that are significantly brighter or darker than most of the image. Adaptive histogram equalization (AHE) methods overcome these limitations by splitting the image into subregions and equalizing the histogram for each of those. AHE enhances the local contrast and preserves edges, but may emphasize noise in quasi-constant regions. Contrast Limited (CL)AHE is a variant reducing this problem. Zuiderveld [98] applied CLAHE successfully on FIB-SEM images. We therefore suggest CLAHE, too. The algorithm is available as MATLAB function adapthisteq, and is controlled by the parameters clipLimit and distribution [99]. The former is a contrast factor preventing oversaturation in homogeneous areas. The latter specifies a distribution family (uniform, Rayleigh and exponential) that models the desired histogram shape.

Decurtaining
In heterogeneous samples, it is difficult to completely prevent curtaining during FIB cutting. It can be reduced by applying a reasonable surface coating to prevent surface features from adding to curtaining effects. Furthermore, it might be worthwhile changing the sample orientation to perform FIB cutting such that particularly inhomogeneous regions are located at the bottom of the volume of interest. If the microscope is equipped with a rocking stage, this can be used to continuously tilt the sample while cutting to smear out and reduce curtaining.
As an algorithmic remedy, the first dedicated decurtaining algorithm by Münch et al. [68] uses a discrete wavelet transform to separate the image's vertical components on several scales. Subsequently, a filter in Fourier space removes the stripes. The inverse wavelet transformation, recursively applied, yields the decurtained image. The user chooses the wavelet family, the decomposition depth and the spectral filter. The algorithm works very well for stripes stretching over the image's complete vertical dimension. It is available as part of a set of ImageJ plug-ins maintained by the group for 3D-Microscopy at EMPA [100]. Variational methods [101][102][103] minimize a stripe-penalizing cost function via primal-dual techniques. We implemented the algorithm proposed by Liu et al. [103] as a MATLAB function. For suggestions on the choice of the parameters, we refer to Liu et al. [103]. Overall, the method performs very well on FIB-SEM data, but may introduce undesired blurring if the parameters are not chosen optimally. The algorithms of Münch et al. [68], Liu et al. [103] and Fitschen et al. [102] are implemented in ToolIP provided by Fraunhofer ITWM, Department of Image Processing [104], too.

Dealing with charging
Avoiding charging is probably the most challenging aspect to handle experimentally when imaging non-conductive samples, especially as it is not always clear which features are due to charging and which features are inherent in the sample. If the region of interest is not too deep, application of a conductive coating connected to the ground can reduce charging. Otherwise, reducing the acceleration voltage and/or the beam current reduces charging, but also the image intensity and thus the signal-to-noise ratio. Changing the dwell time and the scan strategy can be efficient to reduce charging, as can changing the detection scheme: BSE imaging is less sensitive to charging than SE imaging. Luckily, when aiming at segmentation (or reconstruction) of the solid component of a porous material, charging is not a true concern as it just renders the brighter pixels belonging to the solid even brighter. Figure 9 shows an example.
FIGURE 10 Representative slice from the infiltrated silica balls data set from Figure 1(C) and Figure 9(A), enhanced as described in Section 3.3, together with the quality indices for this slice.
Sim et al. [31] suggest so-called exponential contrast stretching, which transforms the original image's grey value distribution into a given distribution, as treatment of charging artefacts in microscopic images. Wan Ismail et al. [105] follow Sim et al. [31], but replace the exponential by a Rayleigh distribution to prevent image oversaturation.

General procedure
When trying to improve image quality, noise should be considered first, as it is usually rather easy to remove and other flaws like curtaining can only be detected if the noise does not deteriorate the stripes too strongly. Image quality evaluation and enhancement should therefore always start by noise removal. Subsequently, the image is evaluated using the indices revealing additional potential quality problems.
All enhancement methods can have side effects. Thus, any image processing needs to be applied with prudence. From our experience, after noise removal, contrast, blur, curtaining and charging should be treated in that order. Contrast enhancement may emphasize noise and charging in the image slightly, without however compromising the overall image quality seriously. Sharpening increases contrast, too, and helps to detect curtaining artefacts, but may also amplify the noise. Decurtaining may blur the image. Sharpening should therefore be considered to complement decurtaining.
As a rule of thumb, any enhancement should improve the overall quality of the image. If a transformation decreases one or more indices significantly, then either the method induces collateral effects or it reveals hidden artefacts. An example of the former is the trade-off between noise and blurring. The latter happens for instance when denoising decreases the curtaining index as in Figure 10(B).
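The rule of thumb can be turned into a simple automatic check that compares the index values before and after an enhancement step. The sketch below is illustrative only: the function name, the dictionary layout, the example values and the tolerance of 0.05 for a "significant" decrease are hypothetical and would have to be tuned for a given data set.

```python
def flag_regressions(before, after, tolerance=0.05):
    """Names of indices that dropped by more than `tolerance` (hypothetical)."""
    return sorted(
        name for name, value in before.items()
        if value - after.get(name, value) > tolerance
    )

# Made-up index values: denoising improves the noise index but reveals
# curtaining that was hidden by the noise.
before = {"noise": 0.40, "blur": 0.70, "contrast": 0.60, "curtaining": 0.90}
after = {"noise": 0.85, "blur": 0.68, "contrast": 0.60, "curtaining": 0.55}
print(flag_regressions(before, after))
```

A flagged index does not necessarily mean that the enhancement failed; as stated above, it may also indicate an artefact that only became detectable after the step.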

RESULTS
In this section, we apply the indices for objective quality evaluation of FIB-SEM images first to the synthetic images from Section 2.1.2, then to the real images presented throughout this paper. Finally, the full enhancement workflow as suggested in Section 2.4 is applied using methods from Section 2.3.

Sanity check based on synthetic SEM images
The synthetic FIB-SEM images from Section 2.1.2 mimic different dwell times and vary from poor to excellent quality as a function of dwell time, as visible in Figure 2. The low dwell time images are noisy as expected and the noise index reports this accordingly, as Table 1 shows. Increasing the dwell time improves the contrast. The contrast index reflects this properly. Longer dwell times yield less noise, which in turn lets the image appear sharper. Thus, the blur index decreases slightly. The charging index varies notably because of overexposed halos around the solid phase, which are misinterpreted as charging artefacts. The curtaining index remains nearly constant. Results are summarized in Table 1. All values reported by the indices agree well with visual impression.

Results on experimental data
In this section, we use the indices to evaluate the quality of the experimental images from Section 2.1.1 featuring various characteristics and artefacts. We add a few complementary images to demonstrate specific effects. Results are summarized in Table 2. Values near zero reveal serious quality problems such as strong noise in the infiltrated silica balls in Figure 9(A). As is to be expected, higher dwell times yield larger noise index values for Figure 9(B,C). Moreover, contrast, blurring and curtaining indices differ considerably as dwelling time varies. Contrast and sharpness decrease as consequences of less noise. The curtaining index works well only for higher dwell time, as curtaining artefacts are hidden by noise otherwise.
In general, index values close to zero indicate serious quality concerns while high index values indicate good quality. For instance, the polystyrene balls, porous carbon and zirconium dioxide (Figure 1A,D,F) are sharp with minimal noise, as the noise index values show. Moreover, the ZrO2 image features exceptionally high contrast and consequently the highest contrast index value. This is not only due to the deep dark pores but also because of strong charging artefacts.
Rather simple methods as suggested in Section 2.3 can enhance the SEM images considerably. However, they usually affect more than one of the indices. After each enhancement, the image should be inspected. Special attention is due if an enhancement causes critical index values near zero. Indices should not be interpreted separately as they interact closely. Interpretation and comparison of the quality indices is easier for images with similar content, that is, similar structures imaged under similar conditions, as in the case shown in Figure 9.
Clearly, the indices are limited in several ways, one being that the index is assigned to the whole image.enhancement without applying any processing is proper cropping.Similarly, the curtaining index will mistake thin vertical structures for imaging artefacts.
In general, the quality of all SEM images within a 3D FIB-SEM stack should stay the same as the imaging parameters remain unchanged.Thus, the quality indices have to be calculated for a few slices only to check the imaging parameters.However, during long FIB-SEM measurements, the sample can run out of focus and the risk of charging rises as a consequence of continuous exposure to the electron beam.Curtaining is a local phenomenon affecting just one or a couple of consecutive SEM slices, too.Continuous monitoring of the indices could help to detect deteriorating image quality fast.
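Such continuous monitoring could be sketched as follows. The example assumes that one index value per slice is already available and compares each slice with the mean of the first few slices. The function name `find_deteriorated_slices`, the number of reference slices, the allowed relative drop of 20% and the example values are hypothetical choices for illustration, not part of the method described in this paper.

```python
def find_deteriorated_slices(index_values, n_reference=3, max_relative_drop=0.2):
    """Return slice positions whose index fell noticeably below the reference.

    The reference is the mean index value of the first `n_reference` slices.
    A slice is reported if its value is lower than the reference by more than
    `max_relative_drop` (relative to the reference). Both parameters are
    hypothetical defaults.
    """
    if n_reference < 1 or len(index_values) < n_reference:
        raise ValueError("need at least n_reference index values")
    reference = sum(index_values[:n_reference]) / n_reference
    threshold = reference * (1.0 - max_relative_drop)
    return [i for i, value in enumerate(index_values) if value < threshold]


# Made-up curtaining index values for ten consecutive slices.
curtaining_per_slice = [0.80, 0.82, 0.78, 0.81, 0.79, 0.45, 0.50, 0.80, 0.77, 0.60]
print(find_deteriorated_slices(curtaining_per_slice))
```

In this made-up series, the two consecutive low values mimic a local curtaining event, while the last value mimics a slow deterioration towards the end of the measurement.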

Image processing for quality enhancement
In this section, we exemplify the quality enhancement suggested in Section 2.3 for the infiltrated silica balls data set featured in Figure 9(A). Figure 10 shows a slice representative of the whole stack. Figure 10(A) is the unprocessed original, identical to Figure 9(A). Figure 10(B-D) are enhanced versions. The original is a bit noisy and moderately affected by curtaining and charging.
First, we denoise by a 3 × 3 median filter as described in Section 2.3.1. Note that our focus here is clearly on suggesting a set of generally applicable quality indices and ensuring their consistency. As a consequence, the example here is chosen primarily for being particularly instructive, as it features several quality problems and the effectiveness of the suggested remedies can be visually perceived.
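For readers unfamiliar with the 3 × 3 median filter used in the first step, the following minimal pure-Python sketch shows the principle on a small image stored as a list of rows. The function name `median_filter_3x3` and the border handling by edge replication are assumptions of this sketch; in practice, an optimized library routine would be used.

```python
from statistics import median


def median_filter_3x3(image):
    """Apply a 3 x 3 median filter to a 2D list of grey values.

    Border pixels are handled by replicating the nearest edge pixel, which is
    an assumption of this sketch.
    """
    rows, cols = len(image), len(image[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    window.append(image[rr][cc])
            result[r][c] = median(window)
    return result


# A flat toy image with a single bright outlier pixel.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
denoised = median_filter_3x3(noisy)

# A toy image with a vertical step edge.
edge = [[0, 0, 100, 100] for _ in range(4)]
edge_filtered = median_filter_3x3(edge)
```

The isolated outlier is removed while the step edge stays in place, which is why the median filter is a common first choice for removing impulse-like noise.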

DISCUSSION AND CONCLUSIONS
We suggest objective indices measuring the quality of microscopic images with respect to several characteristics, namely, noise, contrast, blurring, curtaining and charging. The first three apply generally while the latter two are dedicated to FIB-SEM images. The indices are indeed motivated by FIB-SEM imaging where they can help to detect suboptimal imaging parameters early in the slicing and imaging process.
The definitions of our indices involve a couple of rather empirically found constants. However, they are chosen based on more than a dozen synthetic and more than 50 real FIB-SEM image stacks of a wide variety of materials and structures imaged at four institutions using four devices. We show that the indices capture the intended features and rank images correctly. We suggest means to alleviate quality flaws reported by the indices. For removing noise, lack of contrast and blur, a wealth of well-known image processing methods is available. We concentrate here on simple, easily accessible tools. For decurtaining, we discuss dedicated algorithms. We show the effects of these improvements and that the indices reflect the corresponding changes in the images accordingly. A systematic study on the severity of detected flaws and their impact on the possibility and the quality of 3D reconstructions is however beyond the scope of this paper and subject of further research. Cautionary examples show that the indices should always be considered as an ensemble as, for example, a lot of noise can in fact hide artefacts or suggest a high contrast. Moreover, their interpretation depends critically on the imaged structure as, for example, no index can differentiate bright vertical edges from curtaining.
As mentioned, the indices have been tested on a much wider variety of FIB-SEM images than actually shown here. They are applicable to FIB-SEM images in general. In particular, they are not restricted to secondary electron SEM images but apply to images obtained from any SEM detector. Clearly, the quality of the images depends on the chosen detection scheme, and different detectors will react very differently to the various artefacts. The indices report such variations without any adaptation. However, the interpretation of their values might vary. Moreover, we believe they can be valuable for other imaging methods as well.
Image processing, in particular segmentation or classification, is nowadays dominated by machine learning and in particular deep learning (DL) methods. The investigations presented here have been motivated by the challenge to reconstruct highly porous structures from FIB-SEM stacks. DL solutions for this semantic segmentation task have been suggested, too, for example, by Fend et al. [13], Sardhara et al. [106] and Osenberg et al. [107].

13652818, 2024, 2. Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/jmi.13254 by Karlsruher Institut F., Wiley Online Library on [25/01/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License.

FIGURE 1 Scanning electron microscopy (SEM) slices from exemplary focused ion beam (FIB)-SEM data sets of porous and non-porous materials used throughout this paper.

impossible to represent the full range of materials, structures and SEM imaging modes even approximately. Instead, we chose a portfolio that is both practically relevant and instructive. Synthetic FIB-SEM stacks are used to illustrate how the quality indices work.

2.1.1 FIB-SEM
Real data, selected to represent the diversity in terms of quality and structures, is presented in the next sections. Concerning the quality, characteristics such as contrast, noise, blurring and image artefacts vary along the data sets. Furthermore, data sets contain several structures including highly porous and non-porous materials. The selected data were provided by the following institutions: Institute of Nanotechnology and Karlsruhe Nano Micro Facility at Karlsruhe Institute of Technology (KIT), Fraunhofer Institute for Ceramic Technologies and Systems (IKTS), Max Planck Institute for Polymer Research (MPIP) and Chair of Functional Materials at Saarland University (CFMSU).

Figure 1(A) shows an image of sintered polystyrene balls acquired by MPIP using an FEI Helios NanoLab 660 microscope. A protective Pt layer of thickness 1 µm and area 12 × 8 µm² was deposited on the sample. A volume of 12 × 8 × 6 µm³ was imaged with cubic voxels of size 35 nm.

FIGURE 2 Simulated scanning electron microscopy (SEM) images of a Boolean model with varying dwell times.

Figure 1(B,C,H,I) features infiltrated silica balls at two different dwelling times, an infiltrated silica monolith and a silica monolith. They were obtained by KIT using an FEI Strata 400S Dual Beam at 5 kV, with cubic voxels of size 20 nm. An FEI Helios NanoLab 600 microscope was used by CFMSU for acquiring the images of the porous carbon structure in Figure 1(D), the AlSi alloy in Figure 1(E) and the etched aluminium foil in Figure 1(G) with cuboidal voxels of 83 × 106 × 250 nm³ and cubic voxels of 10 and 20 nm, respectively. Finally, the ZrO2 in Figure 1(F) was imaged at IKTS using a crossbeam NVision 40 Field Emission Scanning Electron Microscope by Carl Zeiss at 1.5 kV with a voxel size of 3 × 3 × 6 nm³. All images shown in the following are secondary electron signal SEM images.

FIGURE 3 Toy example image with added Gaussian noise with mean zero and varying variance and corresponding noise indices.

texture part. The blurring is measured by quantifying spectral and gradient image information from the decomposed images. Compared to the images in Wang et al. [26], the FIB-SEM images shown here show relatively few texture details such that a decomposition does not seem necessary.

FIGURE 4 Toy example images and gradients. Top: Original image with sharp edges and two mean filtered versions. Filter mask size 3 × 3 and 5 × 5 pixels for (B) and (C), respectively. Bottom: gradient images.

FIGURE 5 Toy example image from Figure 3, here with varied grey value range. Grey value histograms and contrast indices. Versions (A) and (B) are restricted to low and high grey values, respectively. Version (C) uses the full 8-bit range of grey values.

FIGURE 6 Scanning electron microscopy (SEM) image with curtaining artefacts, gradient and Fourier transform.

A predominant peak in the centre of the resultant signal suggests an image with curtaining artefacts, whereas resultant signals without a predominant peak suggest images without these artefacts, as shown in Figure 7(B). We propose a curtaining index based on the global maximum of the resultant signal $S$, $\operatorname{global\,max}(S)$, plus the values obtained at the left and right of this maximum, denoted as $S_{+}$ and $S_{-}$. These neighbours are included as they allow reporting curtaining artefacts that slightly deviate from the ideal stripe shape. The initial curtaining index is defined as

$$\tilde{Q}_{\mathrm{cur}} = 1 - \left(\operatorname{global\,max}(S) + S_{+} + S_{-}\right).$$

FIGURE 7 Binary boxes (top) and normalized frequencies (bottom) of images with and without curtaining artefacts. Binary boxes of affected images feature horizontal line patterns and a dominant peak at the centre of the resultant signal (A), whereas those of curtaining-free images contain scattered points and no dominant peak (B). In the plots of the normalized frequencies, the global maximum and its left and right neighbours are highlighted. The aluminium foil (C) features vertical edges leading to overestimated curtaining. Rotation such that the edges are perfectly vertical (D) further emphasizes this effect.
where ℎ() is the absolute histogram count at the grey value  for an image grey value range of [0,1].Original images are 8-bit, and thus there are 256 different possible grey values.Here, we consider a histogram discretization to the range [0, 1] by using the MATLAB function 2.69We consider a normalized version of  by dividing the function by the total number of pixels of  with grey values above  90 .We denote this normalized version as  * .We propose the charging index by considering the mean of  * in the upper half interval [0.5, 1] as grey values for charging are expected in this interval.As a last step, we transform the obtained mean value by scaling the interval [0.5, 1] to the interval [0,1].Then, the charging index is given by

FIGURE 9 Representative slices from focused ion beam scanning electron microscopy (FIB-SEM) stacks of infiltrated silica balls imaged using different dwelling times. Contrast and blur indices decrease as dwelling times increase. The curtaining index values are similar for the noisy, low-contrast image (A) (also shown in Figure

TABLE 2 Quality index values for the experimental data from Figure 1. The polystyrene balls and the infiltrated silica monolith images are good quality examples and the index values agree with this visual impression. The remaining examples have one or more quality problems revealed by lower index values. For example, the infiltrated silica balls yield low contrast and suffer from curtaining, whereas for the porous carbon charging artefacts are the main concern.

FIGURE 11 Reconstruction of the representative slices from Figure 10. Effects of noise and curtaining artefacts are visible in reconstructions (A), (B) and (C). Subimage (D) shows the reconstruction after the noise, contrast and decurtaining treatment as suggested in Section 2.4.

3 . 1 .
The denoised Figure 10(B) not only yields a higher noise index but also lower blur, contrast and curtaining indices. The contrast enhancement by CLAHE as described in Section 2.3.3 nearly recovers the contrast of the original image, leaving the curtaining as the main problem of Figure 10(C). Consequently, we remove the curtaining using the method of Liu et al. [103] to obtain the final enhanced Figure 10(D). Benefits of the quality enhancement are reflected in the reconstruction featured in Figure 11(D). The data set is reconstructed by simple global thresholding with a manually selected threshold. Thresholding is a valid approach for the reconstruction of the data set as the grey values of the solid spheres and the infiltrated material are different.
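The global thresholding used for the reconstruction amounts to a single comparison per pixel. The following minimal Python sketch illustrates it on a toy slice; the function name `global_threshold`, the threshold value of 128, the grey values and the convention that pixels brighter than the threshold form the foreground are assumptions made for illustration only.

```python
def global_threshold(image, threshold):
    """Binarize a 2D list of grey values: 1 where value > threshold, else 0."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


# Toy 8-bit slice with a bright phase (about 200) and a dark phase (about 50).
toy_slice = [
    [52, 48, 201, 198],
    [50, 199, 203, 47],
    [49, 51, 200, 202],
]
segmented = global_threshold(toy_slice, threshold=128)
```

Noise or curtaining stripes that cross the threshold end up directly in such a segmentation, which is why the enhancement steps matter for the reconstruction.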
Our quality indices are not opposed to DL methods. They can contribute to understanding and improving results of these methods by offering explanations for unsatisfying results and an easy and fast way to check whether the training data are really representative. Of course, ML could be used to evaluate image quality directly, too. However, this requires sufficient and representative training data, which are hard to gather as humans perceive image quality features individually. The presented indices are objective and valuable for quick and reliable quality evaluation based on the first couple of slices of an FIB-SEM stack.

ACKNOWLEDGEMENTS
The authors acknowledge support by the German Academic Exchange Service DAAD, and by the German Federal Ministry of Education and Research through project REPOS 03VP0049. The authors thank Michael Engstler (Chair of Functional Materials, Saarland University), Matthias Klingele (Department of Microsystems Engineering - IMTEK, University of Freiburg), Sören Höhn (Fraunhofer Institute for Ceramic Technologies and Systems IKTS) and Regina Fuchs (Max Planck Institute for Polymer Research) who provided the data sets used for the study. We thank Niklas Rottmayer of RPTU Kaiserslautern-Landau and Fraunhofer ITWM for the decurtaining algorithms. Open access funding enabled and organized by Projekt DEAL.

ORCID
Diego Roldán https://orcid.org/0000-0002-5556-1564
Claudia Redenbach https://orcid.org/0000-0002-8030-069X
Katja Schladitz https://orcid.org/0000-0003-4903-3180
Sabine Schlabach https://orcid.org/0000-0002-1640-5067

TABLE 1 Index values for the synthetic data from Figure 2. Noise index values increase significantly with increasing dwelling time. Contrast improves. Blurring index values decrease moderately. Charging values decrease significantly due to the increased number of electrons. There is no curtaining, and the corresponding index does not vary considerably.

Table 2 reports
charging in the porous carbon.The bright regions are however outside of the actual sample.A simple