Mask or Enhance: Data Curation Aiding the Discovery of Piezoresponse Force Microscopy Contributors

Piezoresponse force microscopy (PFM) is routinely used to probe the nanoscale electromechanical response of ferroelectric and piezoelectric materials. However, many challenges remain in the interpretation of the recovered signal. Specifically, many non-ferroelectric contributions affect the measured response, ranging from electrostatics, to charge injection and trapping, and topographic cross-talk. Recently, machine learning (ML) has been utilized to identify multiple contributors within complex data systems, such as PFM response. A substantial advancement in ML approaches for PFM techniques is offered by dimensional stacking, which enables encoding of physical and/or chemical correlations within the materials' response across different data dimensions spanning varying ranges. However, dimensional stacking requires appropriate scaling of each dimension (before ML analysis) to minimize undesired information loss. Here, the impact of clustering globally and locally scaled parameters in polarization switching experiments via resonant PFM (RPFM) is discussed. Specifically, dimensional stacking of scaled parameters can mask or enhance ferroelectric and non-ferroelectric behaviors, and aid identification of various physical phenomena contributing to the measured RPFM response. This study highlights the importance of data curation for ML, and its role in identifying signal contributors to scanning probe microscopy (SPM)-based techniques with multidimensional data, such as resonant and/or spectroscopic SPM.


Introduction
Within the past few decades, advancements in micro- and nano-scale technology have enabled the miniaturization of devices, from computers and personal electronics to a substantial number of other systems, ranging from sensors and actuators to transducers and energy harvesters. [1–4] A prominent factor in facilitating the continued technological drive toward miniaturization lies in the accurate understanding of materials' functional properties (whether electronic, ferroelectric, piezoelectric, optical, or magnetic) at the length scales required for their deployment. [5,6] Scanning probe microscopy (SPM) and its derivative techniques are some of the most versatile characterization tools for studying functionalities with nanometer-scale resolution. [7] At their core, all SPM techniques are based on the interactions between the tip of a microcantilever and the surface of interest, where the acquired signal depends on the specific setup and application. For instance, electrochemical strain microscopy measures local deformations induced by electric voltage in mixed ionic-electronic conductors, [8,9] while magnetic force microscopy aims at studying the magnetic properties of surfaces. [10,11] Piezoresponse force microscopy (PFM) is the leading SPM-based method for evaluation of piezoelectric response, imaging of the polarization state, and measurement of switching and relaxation characteristics in ferroelectric materials. [12] In PFM, a voltage is usually applied through the tip to the sample surface; alternatively, in thin films, the tip can be grounded while the voltage is applied to the sample through the bottom electrode. The resulting electric field within the material generates a surface deformation, primarily through the converse piezoelectric effect, resulting in deflection of the microcantilever, which is simplified as out-of-plane or in-plane amplitude (A) and phase (φ). [13] Traditionally, A and φ are combined and represented as the piezoresponse, PR = A cos φ.
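As a minimal illustration of the piezoresponse definition above, the following sketch combines amplitude and phase into PR = A cos φ. The function name, the use of degrees for the phase, and the numerical values are assumptions made for this example only.

```python
import math


def piezoresponse(amplitude, phase_deg):
    """Combine amplitude and phase (assumed here to be in degrees) into PR = A cos(phi)."""
    return amplitude * math.cos(math.radians(phase_deg))


# Two hypothetical pixels with equal amplitude but opposite out-of-plane polarity
pr_in_phase = piezoresponse(1.0e-3, 0.0)        # phase of 0 degrees
pr_out_of_phase = piezoresponse(1.0e-3, 180.0)  # phase of 180 degrees
```

The sign of PR carries the polarity, while its magnitude carries the amplitude, which is why two parameters collapse into a single one.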
The signal-to-noise ratio in PFM can be improved if the measurement is performed near the cantilever's contact resonant frequency (ω). In doing so, additional information on the viscoelastic properties of the tip-sample interface is gained through tracking of ω, and on the energy dissipation per measurement cycle via the quality factor (Q). Dual AC resonance tracking PFM (DART-PFM) and band-excitation or resonant PFM (BE-PFM or RPFM) are examples of such an approach. [14–16] Spectroscopic measurements can further augment the acquired information through superposition of a dc-bias waveform onto the measurement signal, offering insights into the localized switching behavior of a ferroelectric material (switching spectroscopy), [17] or polarization relaxation, [18] among others. Such measurements can be repeated over a grid of points within a specified area, resulting in multidimensional datasets that map variations in A, φ, ω, and Q as a function of the applied voltage across the region.
Although the PFM signal was originally attributed only to piezoelectricity, it has become clear over the last decade that many physical and chemical phenomena can contribute to the measured PFM response. [19] For instance, both short- and long-range electrostatic interactions (between the cantilever tip and the sample, and between the cantilever body and the sample surface, respectively) can significantly contribute to the measured surface displacement in spectroscopic PFM experiments. [20] Secondary effects, such as charge injection and trapping as well as ionic migration within the sample, can also contribute to surface displacements directly and/or through creation of internal fields resulting in additional PFM response. [21] Higher-order effects, such as flexoelectricity (electric polarization induced by a strain gradient) and electrostriction (strain dependent on the square of the electric field), can also induce further surface displacements. [22,23] Even the environmental conditions can play a role in the PFM signal, for example, through a water meniscus modulating the electrostatics and electrodynamics at the tip-sample contact, thus directly affecting polarization stability and switching. [24] Lastly, topographic features can also result in enhancement or reduction of the above-mentioned phenomena. [25–27] Separation of the various signal contributors has been the subject of many studies over the last few years; [28–30] however, many such approaches are influenced by the user (and their bias in the interpretation of the results), or rely on the averaged response from the sample surface, thus resulting in substantial information loss.
To address these limitations, machine learning (ML) has proven itself a valuable tool for facilitating interpretation of multidimensional datasets, including those generated by resonant SPM methods like RPFM. [27,31–38] Although less susceptible to user bias, ML approaches are still mathematical algorithms by definition, agnostic of the underlying physical and chemical phenomena. Hence, their outputs are not necessarily representative of the physical response of the sample. Most recently, dimensional stacking, or the concatenation of data along specific dimensions, has been proposed as a method to provide chemical and/or physical constraints, ultimately offering a better basis for interpretation of these algorithms' output. [18,39] However, stacking parameters without appropriate scaling could result in over-weighting or under-weighting of specific information (i.e., information stored in specific parameters). This concept is particularly important for RPFM parameters, which differ in units and span varying orders of magnitude. Here we discuss systematic approaches for scaling and stacking the RPFM data, and their direct consequences on masking and/or enhancing ferroelectric and non-ferroelectric contributors to the switching response. We demonstrate this separation of RPFM signal contributors by using a k-means clustering algorithm applied to a grid switching spectroscopy RPFM (SS-RPFM) experiment on a [001]-cut 0.6Pb(Mg1/3Nb2/3)O3-0.4PbTiO3 relaxor-ferroelectric solid solution single crystal. Specifically, we discuss the effects of topographic cross-talk, and how electrostatic contributions within domains of differing polarity can be identified directly, and/or minimized when analyzing the physical response.

Results and Discussion
The RPFM surface scan of the probed area before the switching experiments is shown in Figure 1. The phase (φ) map clearly highlights two sets of domains of opposite (out-of-plane) polarity, with a mean phase shift of approximately 180° between them. The domain separation is consistent with the A, ω, and Q maps, which further reveal the domain walls. Additionally, topographic crosstalk is observed, most strongly in the A map among others, as diagonal striations across the scanned area. These grooves are consistent with the height retrace scan of the same region, and are due to surface polishing during sample preparation. The post-SS-RPFM scans were performed on a wider area and show evidence of persistent switched polarization in only a few locations. A discussion of these observations is not critical for this work, but is available in Section S1, Supporting Information (Figure S1, Supporting Information). No sign of irreversible surface modification due to the SS-RPFM experiments was observed.
The switching experiments were performed across a 50 point × 50 point grid. At each location, three cycles of a bipolar triangular waveform, with a maximum applied bias of V_Bias = ±12.65 V, were applied. For each voltage increment, four parameters were extracted, namely A, φ, ω, and Q. A discussion of the cantilever scan direction and the structure of the experimental dataset is available in Section S1, Supporting Information (Figures S2-S4, Supporting Information). Thus, the complete experiment resulted in a multidimensional data set with a challenging representation of the acquired information. Specifically, for such multidimensional data sets, it is difficult to visually correlate variations in the response parameters across the sample surface with any sample features. Here, we leverage a clustering algorithm, k-means, often used as the first step in ML approaches to SS-RPFM data analysis, to evaluate different physical contributions to the response. K-means is a particularly convenient ML technique, given its minimal complexity, fast output, and low computational cost when applied to SPM data. Specifically, this algorithm partitions a set of observations into groups sharing similarities, known as clusters, by minimizing the squared Euclidean distance between the features of each observation and those of its cluster centroid, [40] as shown in Equation (1):

$$\underset{S}{\arg\min} \sum_{i=1}^{N} \sum_{x \in S_i} \lVert x - k_i \rVert^{2} \qquad (1)$$

where S denotes the sets of clustered observations, N is the desired number of clusters, x is the d-dimensional vector representation of an observation, and k_i is the mean representation of observations for a cluster center (i.e., the centroid). In our experiment, S corresponds to the probed locations, x is the corresponding local response of an SS-RPFM parameter (e.g., A, φ, PR, ω, or Q), and d is the number of measurements in x. Since the number of contributors in a generic SS-RPFM experiment is not known a priori, a scree plot can be used to estimate N. [41] This method is heuristic in determining the optimal number of components, and observing groups solely from the "elbow" of the plot may be insufficient; nonetheless, it is a useful first approach to identify the most prominent signal contributors. [39] PR has often been the preferred parameter for analysis of SS-RPFM loops, given its succinct representation of both amplitude and phase, and the scree plot for this parameter shows an elbow at approximately N = 3 components (Figure 2a). The corresponding k-means analysis at N = 3 (Figure 2b) identifies regions within the scanned area consistent with the previously observed domains of opposite polarity (k_2 and k_3), plus an initial region that is seemingly independent of domain polarity (k_1). Such initial regions can be assigned to prevalent non-ferroelectric electro-chemo-mechanical (ECM) contributions (including charge injection and trapping, electrostatic effects, etc.), which are generally present during "settling" of the cantilever and tip contact with the surface, and are often ignored in analyses in the literature. [42] This effect was further confirmed through a secondary SS-RPFM grid study on the same sample with a reversed scan direction (Figure S5, Supporting Information). Furthermore, a small feature that can be correlated with the presence of a surface scratch is observed within this initial region.

Figure 2. a) Scree plot for PR, showing the inertia (i.e., within-cluster sum-of-squares distance) as a function of the number of clusters N. b) Cluster analysis at the "elbow" of the scree plot (i.e., with N = 3), highlighting ferroelectric domains (k_2, k_3), as well as a non-ferroelectric contribution (k_1). c) K-means clustering analysis of PR with N = 10. The components identify proximity to domain walls (k_8, k_9), as well as non-ferroelectric contributions for each domain polarity (i.e., k_3, k_5, and k_10 for one domain polarity, and k_2, k_4, k_6, and k_7 for the other domain polarity). Topographic effects (diagonal streaks) are visible in the top half of the map. The clusters were organized in order of appearance along the scan direction of the cantilever tip (top to bottom). For each PR cluster centroid, the first cycle is dotted, the second one is dashed, and the third one is plotted as a solid line.

Figure 3. K-means clustering of each SS-RPFM parameter analyzed independently. Noticeably, A and φ show distinctive clusters associated with ferroelectric domains, while ω and Q show clusters associated with topographic effects and possible electro-chemo-mechanical contributions. In each cluster centroid, the first cycle is dotted, the second one is dashed, and the third one is plotted as a solid line.

In order to better capture all the contributors to the response, an over-fitting analysis can often be performed, with an N much larger than the value suggested by the scree plot's elbow. [39] Here, an N = 10 clustering analysis reveals additional ferroelectric and non-ferroelectric contributors (Figure 2c). The cluster map highlights ferroelectric domains and the influence of the domain walls (k_8, k_9). A prominent non-ferroelectric behavior is also highlighted by the separation of domains of the same polarity into multiple clusters (e.g., k_2, k_4, k_6, and k_7). Additionally, diagonal lines coinciding with topographic features are observed in the first half of the cluster map. Similar to the N = 3 case, there remains some ambiguity in evaluating whether the first cluster (i.e., k_1) exclusively belongs to one or multiple domains. Furthermore, within that same region, the domain walls are not visible, in contrast to the lower portion of the cluster map. These observations suggest that while the use of PR is a fast approach to identify some contributors to the response, the overall reduced information density (in combining two parameters into one) increases ambiguity in the interpretation of clustering results.
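The k-means objective of Equation (1) and the scree-plot heuristic can be sketched in a few lines of dependency-free Python. This is an illustrative sketch only, not the implementation used in this work: the deterministic farthest-point seeding, the function names, and the toy two-feature observations are all assumptions introduced here so that the example is reproducible.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def farthest_point_init(points, n_clusters):
    """Deterministic seeding (an assumption of this sketch, chosen for reproducibility)."""
    centroids = [list(points[0])]
    while len(centroids) < n_clusters:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def assign(points, centroids):
    """Label each observation with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            for p in points]


def kmeans(points, n_clusters, max_iter=100):
    """Lloyd's iterations; returns labels, centroids, and inertia (the Equation (1) sum)."""
    centroids = farthest_point_init(points, n_clusters)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(old)  # keep the previous centroid if a cluster empties
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


# Hypothetical observations: three well-separated groups of two-feature "loops"
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
          [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
          [10.0, 0.0], [10.1, 0.0], [10.0, 0.1]]

labels3, centroids3, inertia3 = kmeans(points, 3)
# Scree data: inertia as a function of the number of clusters N
scree = {n: kmeans(points, n)[2] for n in range(1, 6)}
```

Plotting `scree` against N would show the sharp drop in inertia up to N = 3 followed by a plateau, which is the "elbow" referred to in the text. For real SS-RPFM data, each observation would be the full loop of a parameter at one grid location.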
In order to overcome the above limitations, A and φ can be evaluated separately. The analysis can also be further augmented through the inclusion of the ω and Q data. As previously mentioned, while A and φ are strongly affected by the ferroelectric character of the sample, ω and Q are more susceptible to the tip-surface interactions. The sensitivity of the different parameters to different physical phenomena is clearly shown in an independent k-means analysis of each (Figure 3). Intriguingly, the scree plot for all parameters (Figure S9, Supporting Information) still identified the "elbow" at approximately N = 3, confirming the need for a more reliable cluster number estimation method. The cluster map for A is similar to the PR analysis from Figure 2b, highlighting two ferroelectric domains as well as the initial ECM regime within the scanned region where the signal is independent of the domain polarity, suggesting the dominant effect of amplitude in the combined piezoresponse signal. The map for φ similarly identifies domains of opposite polarity, and additionally yields the domain walls, with the latter being indistinguishable from the ECM region. Neither ω nor Q shows any feature easily relatable to the ferroelectric characteristics of the sample, that is, domain polarity; instead, both parameters are sensitive to topographic features, clearly identified by the diagonal lines across the entire cluster map, and to possible ECM effects extending over the whole scanned area. The observed differences in the A, φ, ω, and Q cluster maps underline the wealth of distinct information stored in these parameters. To further capitalize on the full information density stored in these parameters, they should be analyzed simultaneously, an approach enabled by dimensional stacking.
Dimensional stacking concatenates data from multiple parameters into a new single dimension, effectively resulting in increased physical correlation within the data. However, we recall that the objective of the k-means function (Equation (1)) is to minimize the sum of the squared distances between the observations and their respective cluster centroid. Thus, its Euclidean distance metric will inherently prioritize features spanning the largest numerical values. For instance, as shown in the Supporting Information, direct concatenation of the parameters in the International System of Units as reported by the instrument (Figure S6, Supporting Information) results in a cluster map quite different from that obtained from the data scaled to common units (Figure S7, Supporting Information). The former prioritizes ω, while the latter prioritizes A in their respective cluster separation. In the former case, for example, ω's values are in the hundreds of thousands of Hz, substantially bigger than A values in the thousandths of V range, φ in the hundreds of degrees, and Q, which is unit-less but in the higher tens to lower hundreds. Therefore, it is imperative to appropriately scale all dimensions to create comparable ranges if each parameter is to be treated equally. Two of the most popular scaling methods are z-standardization, Equation (2), and (min-max) normalization, Equation (3): [43,44]

$$z = \frac{x - \mu}{\sigma} \qquad (2)$$

$$x_{\mathrm{norm}} = \frac{x - \min(x)}{\max(x) - \min(x)} \qquad (3)$$

In Equation (2), μ is the mean within the population (i.e., all the probed locations in the discretized region), σ is the population's standard deviation, and z is the z-score, which indicates how many standard deviations away a measurement is from the mean. Hence, the mean is scaled to zero, and all other data are indicated by their respective z-scores. Z-standardization has a major drawback for dimensionally-stacked SS-RPFM parameters: scaling to unit variance (division by σ) works best for input data with a Gaussian distribution. This premise is not valid for all SS-RPFM parameters (Figure S8, Supporting Information). For instance, φ is by default represented by two separate peaks corresponding to in- and out-of-phase signals. Consequently, φ's standardized range would be substantially different from that of a parameter following a left-skewed distribution. Thus, z-standardization is more suitable for a single parameter analysis (without dimensional stacking). In contrast, min-max normalization, Equation (3), ensures each parameter will scale to the same range (i.e., [0, 1]), regardless of its distribution, by using the absolute minimum, min(x), and absolute maximum, max(x), for any given input (x). However, min-max normalization is particularly sensitive to outliers, which could effectively compress features of interest within the response (as shown for example in Figure S9, Supporting Information).
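The two scaling methods of Equations (2) and (3), and the outlier sensitivity of min-max normalization, can be illustrated with a short sketch. The function names and the resonance-frequency readings (including the single outlier) are hypothetical values chosen for this example.

```python
import statistics


def z_standardize(values):
    """Equation (2): subtract the population mean and divide by the population std."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max_normalize(values):
    """Equation (3): map the absolute minimum to 0 and the absolute maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical resonance-frequency readings (Hz); the last value is an outlier
freq = [340.0e3, 341.0e3, 342.0e3, 343.0e3, 400.0e3]

freq_z = z_standardize(freq)
freq_norm = min_max_normalize(freq)
# The four regular readings are compressed into the lowest 5% of the [0, 1] range
compressed_span = freq_norm[3] - freq_norm[0]
```

Here a single outlier leaves only a 0.05-wide interval for all the remaining readings, which is the compression effect described in the text.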
In general, all scaling methods can be applied either globally (i.e., across all probed locations simultaneously) or locally (i.e., individually at each probed location). The impact of global and local scaling on the ultimate ML analyses of SS-RPFM parameters is discussed in detail further below, and in Section S4, Supporting Information (Figures S9-S11, Supporting Information). In practice, any scaling approach performed locally discards more information than global scaling, because pixel-to-pixel variations are lost. In order to preserve these variations, we first adopt a modified global normalization, with the intent of limiting outlier effects, through use of the global 2nd and 98th quantiles as minimum and maximum, respectively, for ω and Q. For A, the global 98th quantile is used as the maximum, while the minimum is set to 0, which is expected to correspond to polarization switching at the coercive voltage. For φ, the minimum and maximum values were identified through the peaks of the parameter's bimodal distribution across the scan, corresponding to the in-phase and out-of-phase signals. This specific normalization approach for φ has the additional advantage of minimizing the influence of instrumental effects, which can result in phase shifting or wrapping. [45,46] Thus, the peaks in this scaled parameter's distribution will map to approximately 0 and 1. Overall, the proposed modified normalization approach ensures a more comparable range for all SS-RPFM parameters, as suitable for subsequent concatenation.
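A minimal sketch of the modified global normalization described above is given below. It is not the code used in this work: the linear-interpolation quantile, the choice not to clip values falling outside the quantile limits, the function names, and the test data are assumptions of this example. The phase peaks are passed in by the user, for instance after reading them from a histogram.

```python
def quantile(values, q):
    """Linear-interpolation quantile for q in [0, 1] (one common convention)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def scale_to_range(values, lo, hi):
    """Map lo to 0 and hi to 1; values outside [lo, hi] are not clipped here."""
    return [(v - lo) / (hi - lo) for v in values]


def normalize_quantile(values, q_lo=0.02, q_hi=0.98):
    """Resonance frequency and Q: 2nd and 98th quantiles as minimum and maximum."""
    return scale_to_range(values, quantile(values, q_lo), quantile(values, q_hi))


def normalize_amplitude(values, q_hi=0.98):
    """Amplitude: minimum fixed at 0, maximum at the 98th quantile."""
    return scale_to_range(values, 0.0, quantile(values, q_hi))


def normalize_phase(values, peak_lo, peak_hi):
    """Phase: the two peaks of the bimodal distribution map to 0 and 1."""
    return scale_to_range(values, peak_lo, peak_hi)


# Hypothetical pooled (global) data: 101 evenly spaced readings from 0 to 100
pooled = [float(i) for i in range(101)]
scaled = normalize_quantile(pooled)
scaled_amplitude = normalize_amplitude(pooled)
scaled_phase = normalize_phase([-90.0, 90.0], -90.0, 90.0)
```

For a global scaling, `pooled` would contain the values of one parameter from all probed locations; for a local scaling, the same functions would be applied to the loop of each location separately.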
Using the thus scaled data, various combinations of stacked parameters are considered, with iterations on the number of clusters that best represent the effects of interest. Based on the observations from the clustering results of the single parameters (Figure 3), A and φ were stacked to highlight the simplest ferroelectric components (Figure 4a). Similarly, the non-ferroelectric contributors were enhanced by stacking ω and Q (Figure 4b). Using the different components identified in the A-φ and ω-Q stacking in Figure 4a,b, we can build maps of the various contributors to the SS-RPFM response (Figure 4c). Specifically, within the A-φ k-means analysis output, k_2 and k_4 correspond to different domain polarities, while k_3 corresponds to domain wall locations, as verified through a visual correlation with the original RPFM scans in Figure 1. Similarly, a direct correlation of k_5 from the ω-Q stacked k-means analysis with the height scan in Figure 1 helps to identify this contributor to the SS-RPFM response as due to topographic cross-talk. The remaining components are assigned generally to electro-chemo-mechanical contributions. Given the improved separation of these regions in the analysis of the stacked ω-Q compared to A-φ, the former are used in Figure 4c, and throughout the following discussion.
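The effect of scaling on stacked parameters can be illustrated numerically. In the sketch below, two hypothetical pixels carry two-point "loops" of amplitude and resonance frequency; the values, the scaling limits, and the function names are assumptions made for this example. The quantity of interest is the share of the squared Euclidean distance (the metric in Equation (1)) contributed by the amplitude.

```python
def stack(*loops):
    """Concatenate the per-parameter loops of one pixel into a single feature vector."""
    stacked = []
    for loop in loops:
        stacked.extend(loop)
    return stacked


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def scale(values, lo, hi):
    """Min-max style scaling with externally supplied (e.g., global) limits."""
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical pixels: amplitude in V, resonance frequency in Hz
pixel_a = {"A": [1.0e-3, 2.0e-3], "omega": [340.0e3, 341.0e3]}
pixel_b = {"A": [2.0e-3, 4.0e-3], "omega": [340.5e3, 341.5e3]}

# Stacking in raw instrument units: the frequency dominates the distance
raw_total = sq_dist(stack(pixel_a["A"], pixel_a["omega"]),
                    stack(pixel_b["A"], pixel_b["omega"]))
amp_share_raw = sq_dist(pixel_a["A"], pixel_b["A"]) / raw_total

# Stacking after scaling each parameter to a comparable range (assumed limits)
a_scaled = [scale(p["A"], 0.0, 4.0e-3) for p in (pixel_a, pixel_b)]
w_scaled = [scale(p["omega"], 340.0e3, 342.0e3) for p in (pixel_a, pixel_b)]
scaled_total = sq_dist(stack(a_scaled[0], w_scaled[0]),
                       stack(a_scaled[1], w_scaled[1]))
amp_share_scaled = sq_dist(a_scaled[0], a_scaled[1]) / scaled_total
```

In raw units the amplitude contributes a negligible fraction of the distance, so a clustering algorithm would effectively see only the resonance frequency; after scaling, both parameters contribute at comparable levels.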
In order to gather more insight into the actual physical phenomena behind the electro-chemo-mechanical contributors, we leverage a method previously reported for identifying electrostatic contributions. [46–48] Specifically, the electrostatic forces at the probe tip-sample junction due to the application of a dc-bias can contribute to the recorded surface displacement. Such electrostatic forces are expected to result in a linear difference between the On-field and Off-field PR hysteresis curves. The results of a k-means clustering on such PR differential loops (PR_On-field − PR_Off-field) are shown in Figure 5. Each component identified in this analysis is indeed strongly linear in nature. Additionally, the resulting distribution map of k_1 through k_3 resembles the distribution of components k_1 and k_2 from the concatenated ω-Q analysis (Figure 4b,c), including locations noticeably affected by their proximity to topographic features. These observations indicate the strong presence of electrostatic forces as the major electro-chemo-mechanical contributor to the SS-RPFM response. Additional clustering analyses near the elbow of the scree plot for this parameter (N = 3 to 6) are consistent with these results (Figure S12, Supporting Information). Last, we note that the linear component in Figure 5 slightly decreases in overall intensity from approximately the top to the bottom of the scan, indicating a reduction of the electrostatic contributions to the response throughout the scan.

Figure 5. K-means clustering of the PR differential (PR_On-field − PR_Off-field) with N = 4 components. All components show a largely linear response, which is associated with the electrostatic interactions between the probe tip and the sample surface. The first cycle is dotted, the second one is dashed, and the third one is plotted as a solid line.
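The linearity of the PR differential can be checked with an ordinary least-squares fit. The sketch below is a hypothetical illustration: the bias values and the On-field and Off-field loops are invented, with the On-field loop constructed as the Off-field loop plus a term linear in bias, and the function names are assumptions of this example.

```python
def pr_differential(pr_on, pr_off):
    """Point-by-point difference between On-field and Off-field piezoresponse."""
    return [on - off for on, off in zip(pr_on, pr_off)]


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, with R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical data (arbitrary units): a step-like Off-field loop, and an On-field
# loop equal to the Off-field loop plus 0.2 times the bias
bias = [-10.0, -5.0, 0.0, 5.0, 10.0]
pr_off = [-1.0, -1.0, 1.0, 1.0, 1.0]
pr_on = [-3.0, -2.0, 1.0, 2.0, 3.0]

diff = pr_differential(pr_on, pr_off)
slope, intercept, r_squared = linear_fit(bias, diff)
```

An R² close to 1 for the differential loop, as obtained here by construction, is the signature of a predominantly linear (electrostatic-like) contribution.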
Having identified and separated locally the four major contributors to the SS-RPFM response (i.e., domains, domain walls, topographic effects, electrostatic regimes), we can now use these to evaluate the impact of the non-ferroelectric contributors on the measured (ferroelectric) response. To this end, locations associated with topographic features and domain walls are removed from the analysis, and the initial region corresponding to k_1 in the ECM regime (Figure 4b) is ignored, given the limited statistical information. Thus, by leveraging the labeled contributions in Figure 4c, a map of the remaining electrostatic regimes within each domain polarity can be obtained (Figure 6). The median parameter responses, computed from representative 2 pixel × 2 pixel areas, are used to portray the overall behavior within each region (Figure 6). The most prominent difference between the two domain polarities (L_1, L_2, L_3 vs. L_4, L_5, L_6) is due to the asymmetry of the amplitude response. However, for each domain polarity, this asymmetry is reduced from the beginning to the end of the scan, while saturation in the amplitude signal increases. Additionally, the mean value of ω shifts substantially during the scan (top to bottom of the probed region). The above two observations are consistent with a reduction of electrostatic forces during the experiment, as observed in the differential PR curves and in previous literature reports. [39,49] Indeed, Q also shows minimal shifts within the different ECM regions, indicating a conservative force contribution, consistent with prevalent effects of electrostatic forces.
The same clustered map can be used to observe the effects of topographic features. It is intriguing to note that many pixels in proximity of the topographic groove lines (white contrast in Figures 4c and 6) are associated with a higher ECM component than the immediate regions where these features are embedded (e.g., observation of k_4-type behavior in regions otherwise assigned to k_2 or k_3 in Figure 4b). Indeed, the locations matching topographic trenches (k_5 in Figure 4b) have the lowest mean ω and Q values, which is closest to the behavior observed in the ECM regions at the bottom of the scan, that is, k_4 in Figure 4b. This observation is reminiscent of previous reports on highly defective PbZr0.2Ti0.8O3 thin films, where the locations within surface trenches and in proximity thereof were found to exhibit similar behaviors. [27] With the above discussions in mind, it is clear that one of the major effects of direct or indirect (e.g., in proximity of topographic features) electrostatic contributions to the PFM response is the change in the mean resonance frequency (ω) response. Therefore, it is worth considering whether removal of such variation across pixels would result in (at least partial) masking of electrostatic effects in SS-RPFM measurements. Indeed, local normalization, even in the modified approach using the 2nd and 98th quantiles described herein, would enable masking of these effects. Figure 7 shows a k-means analysis with N = 3 components, performed on all four SS-RPFM parameters (A, φ, ω, and Q) dimensionally stacked together after modified local normalization, as discussed earlier herein. In contrast with the results of single parameter clustering (Figure 3), or the A-φ and ω-Q stacked analyses (Figure 4), the cluster map in Figure 7 shows the presence of only ferroelectric domains and domain walls, without electrostatic or topographic features.
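The masking effect of local scaling can be sketched with two hypothetical resonance-frequency loops that share the same shape but differ in their mean value, as would be the case for pixels at the top and bottom of a scan. For brevity this example uses plain minimum and maximum values rather than the 2nd and 98th quantiles; that simplification, the function names, and the data are assumptions of this sketch.

```python
def min_max(values, lo, hi):
    """Scale values so that lo maps to 0 and hi maps to 1."""
    return [(v - lo) / (hi - lo) for v in values]


def scale_locally(loops):
    """Scale each pixel's loop with its own limits (removes pixel-to-pixel offsets)."""
    return [min_max(loop, min(loop), max(loop)) for loop in loops]


def scale_globally(loops):
    """Scale every loop with limits pooled over all pixels (keeps offsets)."""
    pooled = [v for loop in loops for v in loop]
    lo, hi = min(pooled), max(pooled)
    return [min_max(loop, lo, hi) for loop in loops]


# Hypothetical resonance-frequency loops (Hz): same shape, shifted mean
loop_top = [340.0e3, 341.0e3, 342.0e3, 341.0e3]
loop_bottom = [345.0e3, 346.0e3, 347.0e3, 346.0e3]

local_top, local_bottom = scale_locally([loop_top, loop_bottom])
global_top, global_bottom = scale_globally([loop_top, loop_bottom])
```

After local scaling the two loops become identical, so a clustering algorithm can no longer separate them by their mean shift; after global scaling the shift is retained and remains available as a feature.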
Furthermore, increasing the number of components to N = 4 and N = 5 results in no substantial change of components (Figures S13 and S14, Supporting Information), but pixels with outliers within the hysteretic curves (often observed in proximity of the coercive voltage) are identified within one of the domain polarities. An additional clustering analysis with N = 5 components was performed on the SS-RPFM parameters, excluding the affected parameter, to confirm its influence on the observed pixels (Figure S15, Supporting Information). These results are in excellent agreement with previous work showing that a local mean removal (before global z-standardization) can result in masking of electrostatic effects, while highlighting the presence of ferroelectric domains. [39] Overall, the methodology described herein is one more step toward addressing a fundamental limitation of unsupervised machine learning: no analysis based only on mathematical models can fully capture the complex underlying physical and chemical effects contributing to the signal measured in SPM, and specifically PFM, experiments. Hence, in addition to the introduction of chemical and physical correlations, another noteworthy benefit of dimensional stacking lies in the opportunity delineated herein for masking or enhancing various physico-chemical phenomena within a minimal number of components. In contrast, the single parameter analysis (e.g., PR, Figure 2) is not only inadequate to identify all the contributors within the full scanned area (e.g., topographic effects only visible in the top half), but its output is also often too complex to analyze within a single, coherent physical and/or chemical model.
We note that the approach recommended with respect to data curation (leveraging dimensional stacking with systematic global scaling to identify contributors, followed by local scaling to mask or enhance features) is valid well beyond switching RPFM analysis: this approach can be extended not only to any other spectroscopic RPFM technique, but also to any multidimensional SPM characterization (e.g., electrochemical strain microscopy, ESM) or multidimensional data set created by use of different characterization methods, as shown previously. [31,50]

Figure 7. K-means clustering with N = 3 components of all four SS-RPFM parameters, dimensionally stacked after modified local normalization. Non-ferroelectric effects are effectively minimized, revealing previously unobserved ferroelectric contributions (i.e., domains and proximity to domain walls) in the initial part of the scan. The first cycle is dotted, the second one is dashed, and the third one is plotted as a solid line.

Conclusions
In summary, we demonstrated a systematic approach to unravel and classify the physical contributors to the switching RPFM response, through the increased information density provided by analysis of all the parameters acquired during the experiments. Appropriate scaling of parameters, resulting in comparable quantitative ranges, enabled dimensional stacking of multiple parameters for machine learning algorithms. Through appropriate choice of stacked parameters via local or global scaling, ferroelectric and non-ferroelectric behaviors were either masked or enhanced. Furthermore, the proposed systematic approach enabled identification or "fingerprinting" of specific contributors, for example, electrostatic tip-sample interactions, and their signature for each of the SS-RPFM parameters. In conclusion, through careful consideration of parameter scaling, and by capitalizing on the information density stored in SS-RPFM parameters, we have demonstrated practical tools for characterizing the signal contributors that occur during spectroscopic resonant-SPM techniques. Indeed, the proposed methods are valid for analysis of any multidimensional data set through machine learning algorithms, beyond SPM-based techniques, and beyond clustering methods.
Last, we reiterate that the onus remains on the subject expert (user) to classify the algorithm's output components and label the behaviors of interest through fundamental knowledge of the expected underlying phenomena. The process is iterative both in choice of number of components for ML analysis, and in the actual stacked dimensions (parameters). Furthermore, neither global nor local (modified) normalization is intended to be complete in identifying all components exclusively. Rather, an iterative process should include a first step of global scaling in order to identify ferroelectric and major non-ferroelectric contributors, complemented by subsequent local scaling processes to enhance or mask phenomena of interest. The scope of the proposed methodology, and in general machine learning tools, is not in the perfect reproduction of the observed "curves," but rather in enabling understanding of the underlying physics, unmarred by instrumental artifacts and overlapping (but spurious or not of interest) contributions.

Experimental Section
The local electromechanical response of a [001]-cut 0.6Pb(Mg1/3Nb2/3)O3-0.4PbTiO3 relaxor-ferroelectric solid solution single crystal was probed via RPFM. The sample's surface was rinsed with isopropyl alcohol followed by de-ionized water, and dried with argon gas before testing. The study was performed on an Asylum Research Cypher S atomic force microscope in ambient conditions. An Olympus platinum-coated tip (OMCL-AC240TM-R3) with a nominal resonant frequency of 70 kHz and a spring constant of 2 N m^-1 was utilized, with an instrument setpoint of V_setpoint = 1.5 V. The switching measurements were carried out over a 50 point × 50 point grid, covering a 5 μm × 5 μm area. At each point, three cycles of a bipolar pulsed triangular waveform were applied through the cantilever tip to the sample, with a maximum applied bias of V_Bias = ±12.65 V, well exceeding the material's coercive field, to induce polarization switching in both directions. A step size of V_step = 0.33 V was implemented, resulting in a total of 450 On-field measurements (i.e., signal recorded during application of the DC bias) and 450 Off-field measurements (i.e., signal recorded immediately after removal of the DC bias). During RPFM, a chirp signal of V_ac = 1.25 V was applied through the cantilever to the sample across a band of frequencies (from 315 to 365 kHz). The measured signal was deconvoluted and fit to a simple harmonic oscillator model, extracting four parameters: the effective out-of-plane displacement, or amplitude (A), the phase relative to the applied signal (φ), the resonant frequency of the tip-sample contact (ω), and the quality factor (Q). The V_Bias scale is the same in all figures, with axis limits at ±14 V, and the tick marks correspond to −10, 0, and 10 V, respectively.

Supporting Information
Supporting Information is available from the Wiley Online Library or from the author.