Structure of an Ultrathin Oxide on Pt3Sn(111) Solved by Machine Learning Enhanced Global Optimization

Abstract Determination of the atomic structure of solid surfaces typically depends on comparison of measured properties with simulations based on hypothesized structural models. For simple structures, the models may be guessed, but for more complex structures there is a need for reliable theory‐based search algorithms. So far, such methods have been limited by the combinatorial complexity and computational expense of sufficiently accurate energy estimation for surfaces. However, the introduction of machine learning methods has the potential to change this radically. Here, we demonstrate how an evolutionary algorithm, utilizing machine learning for accelerated energy estimation and diverse population generation, can be used to solve an unknown surface structure—the (4×4) surface oxide on Pt3Sn(111)—based on limited experimental input. The algorithm is efficient and robust, and should be broadly applicable in surface studies, where it can replace manual, intuition based model generation.


GOFEE details
For the evolutionary structure search, we employed the GOFEE method, which is detailed in Ref. 31. The machine-learned energy landscape was constructed based on the global fingerprint feature from Oganov and Valle. 50 A Gaussian process regression model was built utilizing the same kernel as in Ref. 31. The kernel has two squared exponential terms with different characteristic length scales, whose values, together with that of the maximal covariance, were found via optimization of the marginal likelihood.
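As an illustration of the kernel form described above, the following minimal sketch evaluates a covariance built from two squared exponential terms with different length scales. The function name, the mixing parameter `weight`, and all default values are assumptions made for this example and are not taken from Ref. 31; in practice the length scales and the maximal covariance would be obtained by maximizing the marginal likelihood.

```python
import math


def double_se_kernel(x, y, amplitude=1.0, weight=0.5,
                     length_short=1.0, length_long=10.0):
    """Covariance between two feature vectors x and y.

    Sum of two squared exponential terms with different characteristic
    length scales. `amplitude` plays the role of the maximal covariance;
    `weight` (hypothetical) sets the relative share of the long-range term.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))  # squared Euclidean distance
    short = math.exp(-d2 / (2.0 * length_short ** 2))
    long_range = math.exp(-d2 / (2.0 * length_long ** 2))
    return amplitude * ((1.0 - weight) * short + weight * long_range)
```

With this form the covariance of a structure with itself equals `amplitude`, and the long-range term keeps distant structures weakly correlated after the short-range term has decayed.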
In the present work, a sample of prior DFT structures replaces the population used in the original work. 31 The sample is constructed with the k-means++ clustering method using the Euclidean distance between the global fingerprint features as the distance measure between structures. All DFT structures with energies within ∆E_sample of the most stable structure found so far are used in the clustering. We use ∆E_sample = 5 eV, as structures of higher energy are not expected to represent interesting regions of configuration space. The sample size, N_sample, was set to 10 for the surface oxide structures, as this proved efficient. Previous use of the GOFEE method has been successful with a similar population size. For illustrative purposes, N_sample = 5 is used in Fig. 1. Introducing the present k-means-based sampling method in place of the original evolving-population method eliminates the parameter k_max, which decides how different a new population member must be from existing population members to be adopted. Figure S1 displays a comparison of the two methods, showing that the k-means-based method is both faster and more reliable than the population-based method (for two different choices of k_max). Figure S2 illustrates the actual composition of a sample from one of the conducted GOFEE searches for Sn11O12.
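The sampling step can be sketched in plain Python as follows. This is an illustrative reimplementation under stated assumptions, not the code used in this work: the function names, the fixed number of Lloyd iterations and the random seed are hypothetical, and picking the lowest-energy member of each cluster follows the description in the caption of Figure S2.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda c: sq_dist(point, centers[c]))


def kmeans_sample(features, energies, n_sample=10, delta_e=5.0,
                  n_iter=20, seed=0):
    """Return indices of one low-energy structure per k-means cluster."""
    rng = random.Random(seed)
    e_min = min(energies)
    # Keep only structures within delta_e of the best structure so far.
    idx = [i for i, e in enumerate(energies) if e - e_min <= delta_e]
    k = min(n_sample, len(idx))

    # k-means++ seeding: new centers are drawn with probability
    # proportional to the squared distance to the nearest existing center.
    centers = [list(features[rng.choice(idx)])]
    while len(centers) < k:
        d2 = [min(sq_dist(features[i], c) for c in centers) for i in idx]
        total = sum(d2)
        if total == 0.0:
            break  # all remaining points coincide with a center
        r = rng.random() * total
        acc = 0.0
        for i, w in zip(idx, d2):
            acc += w
            if acc > r:
                centers.append(list(features[i]))
                break
        else:
            centers.append(list(features[idx[-1]]))  # round-off fallback

    # Lloyd iterations: assign points, then move centers to cluster means.
    for _ in range(n_iter):
        labels = [nearest(features[i], centers) for i in idx]
        for c in range(len(centers)):
            members = [features[i] for i, l in zip(idx, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]

    # Final assignment and lowest-energy member of each cluster.
    labels = [nearest(features[i], centers) for i in idx]
    best = {}
    for i, l in zip(idx, labels):
        if l not in best or energies[i] < energies[best[l]]:
            best[l] = i
    return sorted(best.values(), key=lambda i: energies[i])
```

Because every structure in the energy window takes part in the clustering, no similarity threshold such as k_max is needed to keep the sample diverse.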
In each episode, 200 new independent candidates were constructed via rattling of the atoms in the sampled structures. From these, the best candidate according to E_LCB = E − κσ, where E and σ are the model energy and uncertainty, respectively, was chosen. For the constant κ, we used the value 2.
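The acquisition step amounts to picking the candidate with the lowest value of the lower confidence bound. A minimal sketch, with a hypothetical function name and made-up numbers in the usage note, could read:

```python
def select_by_lcb(energies, sigmas, kappa=2.0):
    """Return (index, E_LCB) of the candidate minimizing E - kappa * sigma.

    energies: model-predicted energies of the candidates
    sigmas:   model uncertainties of the same candidates
    kappa:    exploration constant (the text uses kappa = 2)
    """
    lcb = [e - kappa * s for e, s in zip(energies, sigmas)]
    best = min(range(len(lcb)), key=lcb.__getitem__)
    return best, lcb[best]
```

For predicted energies of −1.0, −0.8 and −1.2 eV with uncertainties of 0.1, 0.5 and 0.05 eV, the second candidate is selected: its predicted energy is the highest of the three, but its large uncertainty gives it the lowest bound (−1.8 eV).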
In the DFT evaluation step of GOFEE, we employed the original double-step procedure, in which two single-point DFT calculations are performed. First, one is done for the candidate structure as it emerges from the acquisition. Next, another single-point DFT calculation is done for the structure modified by F⃗∆x, where F⃗ is the DFT force just calculated for the first structure and ∆x is a step length. This procedure seeks to provide data for the machine-learned landscape that encode the proper direction of the energy gradient, and hence enables efficient relaxation in this surrogate energy landscape.
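The geometry used for the second single-point calculation can be sketched as below. The function name and the default step length are hypothetical (the value of ∆x is not stated here), and the displacement is taken literally as the force multiplied by ∆x, so ∆x acts as a scaling factor rather than a strict length; the original implementation may normalize the force differently.

```python
def force_step(positions, forces, step_length=0.1):
    """Displace every atom along its DFT force.

    positions, forces: lists of [x, y, z] per atom
    step_length:       scaling factor applied to the force (assumed value)
    Returns a new list of positions; the inputs are left unchanged.
    """
    return [
        [p + step_length * f for p, f in zip(pos, frc)]
        for pos, frc in zip(positions, forces)
    ]
```

Evaluating both the original and the displaced structure gives the model two nearby energies along the downhill direction, which is how the procedure conveys gradient information without training on forces directly.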

Population
In Fig. 1 of the main text, two-dimensional tin oxide nano-clusters were used as an accessible example to present the new population scheme. To support this discussion, we here present the application of the method to the oxidized Pt3Sn(111) surface. This is depicted in Fig. S2.

DFT calculations
GOFEE searches were carried out on a fixed support consisting of two layers of Pt3Sn(111).
DFT evaluations during the searches were performed using the Atomic Simulation Environment (ASE) 51 with the grid-based projector-augmented wave (GPAW) code 52,53 in plane-wave mode with an energy cutoff of 400 eV and a (2 × 2 × 1) k-point grid. The generalized gradient approximation (GGA) with the PBE functional 54 was used to describe the exchange-correlation interaction. The four best candidates for each composition were subsequently transferred to a five-layer support and relaxed, fixing only the bottom two layers, using a (4 × 4 × 1) k-point grid and a 500 eV energy cutoff until all atomic forces were below 0.05 eV/Å. Fig. 3 reports the best candidate for each composition.
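The two levels of accuracy and the force criterion quoted above can be summarized in a small sketch. The dictionary layout and function names are hypothetical, and interpreting "all atomic forces" as the per-atom force norm is an assumption (it matches the common convention in ASE optimizers).

```python
import math

# Settings quoted in the text; the dictionary layout is illustrative only.
SEARCH = {"layers": 2, "cutoff_eV": 400, "kpts": (2, 2, 1)}
REFINE = {"layers": 5, "cutoff_eV": 500, "kpts": (4, 4, 1),
          "fmax_eV_per_A": 0.05}


def max_force(forces):
    """Largest per-atom force norm in eV/Å."""
    return max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)


def is_relaxed(forces, fmax=REFINE["fmax_eV_per_A"]):
    """True when every atomic force norm is below fmax."""
    return max_force(forces) < fmax
```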
Four independent GOFEE searches were carried out for each of the 16 compositions, to al-

Stability comparison
To compare the thermodynamic stability across the different compositions, and following Reuter and Scheffler, 55 the surface free energy γ was calculated for each structure, as
where A is the surface area in the computational cell, E_slab is the total energy of the relaxed surface slab, N_Sn and N_O denote, respectively, the number of tin and oxygen atoms in the slab, and finally µ_Sn and µ_O denote the corresponding chemical potentials. Contributions from the pressure and entropy terms are neglected. 55 Similarly, temperature and pressure contributions are neglected when evaluating the chemical potential of tin, such that µ_Sn(T, p) = µ_Sn. With this, the surface free energy can be simplified to
To estimate µ_Sn, we assume the surface to be in equilibrium with the Pt3Sn bulk phase, which supplies the tin atoms and turns into bulk Pt7Sn. This gives
Finally, assuming the O2 atmosphere to form an ideal gas reservoir, the chemical potential of oxygen is taken from Ref. 55 to be
where E_O2 is the energy of the isolated molecule and ∆µ_O(T, p0) is the size of the temperature and pressure dependent contribution to the chemical potential at temperature T and pressure p0, which is tabulated in Ref. 55. The surface free energy, γ, is related to the free energy per (4 × 4) cell, as given in Fig. 3, by

Surface X-ray diffraction
SXRD measurements were performed at the I07 beamline at Diamond Light Source, where the sample was prepared and characterized in an ultra-high vacuum system equipped with facilities for ion sputtering, annealing and low-pressure gas exposure. 56 The sample was a 6 mm diameter Pt3Sn(111) single crystal (Mateck, GmbH) prepared by cycles of Ar+ sputtering (1 keV) and annealing (∼800 °C, 5 min) followed by oxygen exposure (10^−5 mbar, ∼600 °C, 15 min) to form the (4 × 4) surface oxide. Pt3Sn(111) prepared under these conditions consists of a combination of bulk-like grains and tin-depleted inclusions consisting essentially of Pt(111) terminated by a characteristic (√3 × √3)R30° surface alloy. 11 These phases, as well as superstructures formed on them, can be distinguished in SXRD due to the difference in lattice parameter, and only diffraction signals from the Pt3Sn(111) surfaces were used for the analysis here. SXRD measurements were acquired at room temperature. Crystal truncation rods and superstructure rods were measured with stationary L-scans using a Pilatus 100K area detector.
Structure factors were extracted by integration of the 2D peak in each image corresponding to the intersection of the diffraction rod with the Ewald sphere, using a 'seed-skew' peak search algorithm as described by Drnec et al. 57 Structure factors were derived from raw intensities after application of the appropriate polarization and Lorentz correction factors. 57,58 In-plane structure factors were extracted from rocking scans taken at L = 0.5.
Fitting of SXRD data was performed using a kinematic computation of the surface structure factors 59 and least-squares minimization as implemented in SciPy. 60 Atomic coordinates were constrained to p3 symmetry during optimization, with the initial positions derived from the DFT-optimized coordinates after symmetric averaging. The formation of p3 overlayers on the p3m1 Pt3Sn(111) surface implies the presence of two mirrored domains producing overlapping diffraction rods, and the fitted curves are computed as incoherent superpositions of the structure factors for the two domains. Individual intensity factors were included in the fitting of each rod, and a single Debye-Waller parameter was assumed for all atoms. Due to their relatively small contribution to the overall X-ray scattering cross sections, the oxygen coordinates were constrained to those of the DFT optimization, except for an allowed vertical relaxation of the O layer as a whole. For the purpose of fitting, constant uncertainties of 20% in the measured values were assumed. The resulting rod fit yielded a reduced χ^2 = 0.7. In-plane structure factors were fitted afterward, using the final, fixed coordinates from the rod fitting and allowing variation of only an overall intensity factor and an in-plane Debye-Waller parameter. The correspondence between experimental and simulated patterns for the structure is good, with R = 0.15. The best-fit structure is depicted in Figure S3, with coordinates presented in Table S1.

STM and AFM measurements
Scanning tunneling microscopy (STM) characterization was performed using an Omicron VT STM located at the MAX IV Laboratory, Lund, Sweden. Measurements were acquired at room temperature with an etched W tip in constant-current mode. The sample was prepared in the same manner as in the SXRD experiments.
Atomic force microscopy (AFM) was performed at the Vienna University of Technology (TU Wien) using an Omicron LT-STM equipped with a qPlus sensor, a W tip, and a custom preamplifier. 61 Measurements were acquired at ∼5 K after a similar sample preparation procedure. At short tip-sample distances (bottom images in Fig. S4), frequency shift images showed an array of protrusions matching what is imaged by STM, and the interactions of the tip with these atoms were exclusively repulsive.
For the AFM simulations, the Sn11O12 structure was modeled with DFT calculations using the Vienna Ab initio Simulation Package (VASP) 62,63 with the generalized gradient approximation (GGA) in the Perdew, Burke, and Ernzerhof parametrization. 54 The slab contained 48 Pt atoms, 27 Sn atoms, and 12 O atoms, placed in an 11.29 × 11.29 × 35.0 Å^3 cell, with an 11.29 × 11.29 × 23.5 Å^3 volume of vacuum. Reciprocal space was sampled with a 4 × 4 × 1 mesh of k-points. The 6s^1 5d^9 Pt, 5s^2 5p^2 Sn, and 2s^2 2p^4 O valence electrons were explicitly modeled, while the remaining (core) electrons were modeled using the corresponding element-specific pseudo-potentials. The criterion for ionic convergence was set to 0.01 eV/Å, together with the strong criterion for electronic convergence. The electrostatic potential was generated such that it includes the ionic potential, the Hartree contribution and the exchange-correlation potential.
Constant-height AFM images were simulated by calculating the force acting on the virtual AFM tip over a range of distances from the surface. The force maps were subsequently transformed into frequency shifts according to the small oscillation amplitude approximation. 64 Three different simulation methods were applied: (i) probing the DFT-optimized electrostatic potential above the surface with a unit point charge; (ii) the Probe Particle Model with an empirical Lennard-Jones potential modeling the interaction between the surface atoms and a negatively charged, oxygen-terminated tip; 65 and (iii) explicitly calculated DFT force-distance curves between a CO molecule above the top-protruding Sn atom of the surface, according to a procedure implemented in ref. 39 (here an increased vacuum volume was used). In each case the resulting simulated AFM image corresponds well to the measured AFM images. The simulated image shown in Fig. 4c in the main text was created with method (iii) 3.9 Å away from the protruding Sn atoms.
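In the small oscillation amplitude limit, the frequency shift is proportional to the force gradient along the surface normal, ∆f ≈ −(f0 / 2k) ∂F/∂z, with f0 the resonance frequency and k the stiffness of the sensor. The sketch below applies this relation to a force-distance curve with finite differences; the function name is hypothetical, and the sensor parameters must be supplied by the user since none are quoted here.

```python
def frequency_shift(z, force, f0, k):
    """Frequency shift along a force-distance curve.

    z:     tip heights (strictly increasing), e.g. in m
    force: vertical force at each height, e.g. in N
    f0:    resonance frequency of the sensor in Hz
    k:     stiffness of the sensor, in units of force per length
    Uses central differences in the interior and one-sided differences
    at the two end points. Requires at least two points.
    """
    n = len(z)
    shifts = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        gradient = (force[hi] - force[lo]) / (z[hi] - z[lo])
        shifts.append(-f0 / (2.0 * k) * gradient)
    return shifts
```

A repulsive force that grows as the tip approaches has a negative gradient ∂F/∂z and therefore yields a positive frequency shift, in line with the bright contrast over the protruding Sn atoms in Figure S4.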

Figure S1: Comparison of k-means-based sampling with an evolving population. The search for the Sn11O12 structure was restarted 300 times using either the new k-means-based sampling or the original evolving population. For the latter, two different criteria for how similar a structure may be to other population members were used, allowing kernel elements between any two structures of at most 0.99 or at most 0.999. Success is considered achieved when a structure has a total energy within 0.2 eV of that of the global minimum (GM) energy structure. The shaded regions represent 95% confidence intervals. The new sample-based method is both faster and more reliable. For instance, ∼67% of the independent restarts find the GM in less than ∼600 episodes, while for the evolving-population approach, ∼800 episodes are needed for the same success rate. Similarly, the new method achieves 86% success after 1000 episodes, while the old method achieves just short of 75% success. The structure searches were conducted with GOFEE implemented in a modified code base. They used 24 candidates and one DFT calculation per episode. The DFT calculations were sped up by using 1 k-point and a 300 eV energy cutoff for the plane waves.
In Fig. S2, the clustering and subsequent population extraction have been applied to the first 500 structures of a GOFEE search for the Sn11O12 surface composition. The figure shows the data clustered into families of related structures, along with example structures for some of these families. The families include variations in the number and arrangement of protruding Sn atoms, variations in the placement of the oxide layer with respect to the underlying metal surface, and alternative arrangements of Sn such as those in a square configuration with alternating up and down Sn, resembling strips from bulk SnO.

Figure S2: Sketch of the adopted population scheme applied to data from the first 250 iterations (500 structures) of a GOFEE search on the Sn11O12 surface composition. The scheme considers all structures evaluated so far with an energy within ∆E_sample = 5 eV of the currently lowest-energy structure found. The upper left plot depicts a feature space representation of these structures, projected onto two dimensions using principal component analysis (PCA), and colored according to energy. In the center, the same structures are colored according to a clustering performed in the full feature space using the k-means algorithm. Example structures are shown for some of the clusters, with atoms in the slab dimmed to highlight structural differences. The population is formed by selecting the lowest-energy structure from each cluster. The five enumerated example structures are part of the population for this particular data set and clustering. The enumeration is according to energy, with structure 1 being lowest in energy. This is coincidentally also the global minimum structure for this composition. The PCA dimensionality reduction captures 76% of the variance in the data, with the remainder (the other dimensions) accounting for the apparent overlap of clusters in the figure.

Figure S3: Structure of the (4×4) Sn11O12 surface oxide. (a) Top view of the (4×4) phase, showing the unit cell and the locations of 3-fold symmetry axes. (b) Side view of the (4×4) phase, shown with expanded z-coordinates to highlight corrugation within the layers. (c) Individual layers in the structure. Labels indicate symmetrically distinct atoms with coordinates given in Table S1.
Fractional coordinates are given with respect to the hexagonal unit cell indicated in Fig. S3, with a = b = 11.29 Å, c = 6.92 Å, α = β = 90°, γ = 120°. Grayed values indicate parameters that were fixed during fitting. In the case of the oxygen atoms, lateral positions were fixed to those of the DFT structure, while a single displacement parameter was allowed to shift the positions in the vertical direction.
Fits to SXRD data were also attempted for a platinum-skin model, where Sn in the topmost metallic layer is substituted with Pt. The fit result was notably worse, with χ^2 increasing by 70%, to 1.1, and R = 0.26 for the in-plane structure factors. Although experiments indicate that a similar (4×4) phase is formed on pure platinum surfaces, 9 which is probably the same Sn11O12 phase found here, under our experimental conditions the oxide forms atop bulk-terminated Pt3Sn(111).
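The two figures of merit quoted for the SXRD fits can be illustrated with the sketch below. Both definitions are assumptions: the reduced χ^2 is written in its usual form with the constant 20% relative uncertainty stated for the rod fits, and R is taken as the conventional crystallographic residual, which is not defined explicitly here. Function names and the numbers in the usage note are made up.

```python
def reduced_chi2(f_obs, f_calc, n_params, rel_sigma=0.20):
    """Reduced chi-squared with sigma_i = rel_sigma * f_obs_i.

    n_params is the number of fitted parameters; the number of data
    points must exceed it.
    """
    chi2 = sum(((fo - fc) / (rel_sigma * fo)) ** 2
               for fo, fc in zip(f_obs, f_calc))
    return chi2 / (len(f_obs) - n_params)


def r_factor(f_obs, f_calc):
    """Residual R = sum(|F_obs - F_calc|) / sum(|F_obs|) (assumed form)."""
    num = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return num / sum(abs(fo) for fo in f_obs)
```

For measured structure factors of 10, 20, 30 and 40 and calculated values of 11, 18, 33 and 40 with one fitted parameter, this gives a reduced χ^2 of 0.25 and R = 0.06. A reduced χ^2 below 1, as obtained for the rod fit, suggests that the assumed 20% uncertainty is somewhat conservative.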

Figure S4: AFM/STM measurements with varying tip heights. Sequence of AFM images acquired in constant-height mode as the tip was stepped successively closer to the sample surface. Tip heights are given relative to that acquired at the smallest distance. Short-range interactions are exclusively repulsive between the tip and the protruding Sn, as indicated by the increasingly positive frequency shift (bright contrast).

Table S1: Atomic coordinates of the Sn11O12 structure.