Unraveling Crystallization Mechanisms and Electronic Structure of Phase‐Change Materials by Large‐Scale Ab Initio Simulations

Ge–Sb–Te (“GST”) alloys are leading phase‐change materials for digital memories and neuro‐inspired computing. Upon fast crystallization, these materials form rocksalt‐like phases with large structural and vacancy disorder, leading to an insulating phase at low temperature. Here, a comprehensive description of crystallization, structural disorder, and electronic properties of GeSb2Te4 based on realistic, quantum‐mechanically based (“ab initio”) computer simulations with system sizes of more than 1000 atoms is provided. It is shown how an analysis of the crystallization mechanism based on the smooth overlap of atomic positions kernel reveals the evolution of both geometrical and chemical order. The connection between structural and electronic properties of the disordered, as‐crystallized models, which are relevant to the transport properties of GST, is then studied. Furthermore, it is shown how antisite defects and extended Sb‐rich motifs can lead to Anderson localization in the conduction band. Beyond memory applications, these findings are therefore more generally relevant to disordered rocksalt‐like chalcogenides that exhibit self‐doping, since they can explain the origin of Anderson insulating behavior in both p‐ and n‐doped chalcogenide materials.

The electronic structure of the recrystallized phase is of high interest, since it determines the properties of the SET state of memory cells. The stable crystalline phase of GST is trigonal and consists of layers of Ge, Sb, and Te atoms. [41][42][43] However, upon fast crystallization of amorphous GST, a metastable cubic rocksalt-like phase is obtained: Te atoms occupy one sublattice, whereas Ge, Sb, and vacancies appear to be arranged in a random fashion on the second sublattice. [44][45][46][47][48] AIMD crystallization simulations have corroborated this picture. [24][25][26][27][28][29][30] The large amount of disorder in the rocksalt-like phase has a substantial impact on the electronic states-and thus on transport properties. At room temperature, GST shows relatively high p-type conductivity, due to self-doping. However, low-temperature transport measurements revealed that GST and related chalcogenide crystals are in fact Anderson insulators and that an insulator-metal transition can be induced by thermal annealing. [49][50][51][52] Our previous DFT simulations of GST crystals indicated that vacancy clusters induce localization of the electronic states near the edge of the valence band, and that the ordering of vacancies into layers reduces the total energy of the system and drives both a rocksalt-to-trigonal structural transition and an insulatorto-metal phase transition. [53][54][55] In these computational works, models of disordered GST were created using a quasi-random number generator and, in some cases, manually adjusted to generate different distributions of vacancies in the structure. However, a complete characterization of the electronic properties of a truly ab initio generated model of GST-that is, one obtained by full large-scale AIMD crystallization, corresponding to the SET state in devices-is still missing.
In the present work, we create precisely such large-scale models of rocksalt-like GST, obtained by AIMD crystallization simulations that start from the melt-quenched amorphous phase. We elucidate the atomic details of crystal growth with a kernel-based similarity metric that was initially developed in the context of atomistic machine learning (ML). Our electronic structure calculations reveal that, besides vacancy clusters affecting the valence band, two additional sources of disorder induce Anderson localization in the conduction band. Beyond GST, these generic localization mechanisms lead to Andersoninsulating behavior in both p-and n-doped disordered chalcogenide crystals.

Crystallization of GeSb 2 Te 4 using Large-Scale AIMD Simulations
We consider two independently-generated 1008-atom models of rocksalt-type GeSb 2 Te 4 in an orthorhombic supercell of size 2.954 × 2.558 × 4.178 nm 3 (144 Ge, 288 Sb, and 576 Te atoms and 144 vacant sites), corresponding to the experimental crystalline density (a = 6.03 Å in the conventional cubic unit cell). The density was fixed during the whole crystallization process. This scheme is relevant to PCRAM devices, wherein the crystalline matrix surrounding the amorphous region constrains the volume available during crystallization. This model size was shown to be large enough to capture strong localization of electrons in ref. [53]. With two (111) crystalline layers being fixed, the system was first randomized above 2000 K for 10 ps, and then quenched down to 1100 K in 6 ps. It was held at 1100 K for 30 ps and then quenched to 300 K with a cooling rate of 10 K ps −1 to create a planar amorphous-crystalline interface. Subsequently, the models were re-heated and kept at ≈630 K for about 600 ps, during which they both crystallized into a cubic rocksalt phase. The two models are denoted as "model 1" and "model 2" in the following. We present some snapshots of the two crystallization trajectories in Figure 1. In both models, crystallization is dominated by the growth from the interface and no nucleation event takes place during the simulation. This is because nucleation is a stochastic process, and rapid crystal growth from the interfaces led to complete crystallization before a sizable nucleus could form.
For structural analysis, we calculate the value of the "'dotproduct"' bond order parameter, 4 dot q , [56] for each atom. This parameter is used to quantify crystallinity and thus to distinguish crystal-like and amorphous-like environments. We define an atom to be part of the crystal-like region if its value of 4 dot q is larger than 0.45: this value was shown to discriminate well between amorphous and crystalline environments ( Figure S1, Supporting Information), as also reported in refs. [17,19]. The growth velocity v g is then determined by computing the total volume of the crystalline region V and the total area S of the interface between the amorphous and the crystalline region at each time step. The quantities V and S are obtained by summing up, respectively, the volumes and the relevant surface areas of the Voronoi polyhedron [57] around the atoms that belong to the crystal ( Figure S2, Supporting Information). We note that some atoms are occasionally identified as crystallike in the non-crystalline region, due to thermal fluctuations (Figure 1), which in fact should be considered as "noise" during the crystallization. Hence, a correction is performed on these atoms when calculating V and S. Averaging over time and two structural models, we obtain a growth velocity of ≈1.5 m s −1 at 630 K ( Figure S2, Supporting Information). The narrow growth interfaces and high atomic mobility in the amorphouslike parts ( Figure S1, Supporting Information) enable frequent atomic impinging and attaching on the crystal-like boundary, leading to fast growth. These observations are consistent with our previous crystallization simulations on Ge 2 Sb 2 Te 5 and Ag 4 In 3 Sb 67 Te 26 PCMs. [14,58] The large-scale structural models provide additional details about the crystallization process that were not clearly captured in previous AIMD simulations of smaller model size. In particular, the amorphous-crystal interfaces become notably extended along the growth direction during crystal growth, due to the formation of some crystalline "protrusions," which hints at roughness of the growth front at the nanoscale ( Figure 1). Furthermore, both recrystallized models contain a non-negligible amount of antisite defects (Ge/Sb atoms on Te sites, or vice versa). The formation of "wrong bonds" was also observed in previous crystallization simulations of GST with and without an embedded nucleus. [26][27][28] Unlike local structural disorder, these antisite defects, as chemical disorder, cannot be directly identified by 4 dot q , which only relies on bond lengths and angles.

Kernel-Based Structural Analysis for the Recrystallization Process
We develop a protocol based on the smooth overlap of atomic positions (SOAP) [59] similarity kernel to quantify progressive changes in both the crystallinity (here referred to as "SOAP crystallinity") and the chemical disorder (antisite defects) during the crystallization process. Initially used as a descriptor in the fitting of ML based interatomic potentials, [59] the SOAP formalism has also been shown to be useful for analyzing many aspects of atomistic structure, [60,61] including the similarity between amorphous and crystalline materials. [62][63][64][65] To obtain the "fingerprint" of an atomic environment (within a sphere of cutoff r c ) for a given atom, the local density is constructed via a  sum of Gaussian functions with broadness σ at placed on each neighbor of the central atom (Figure 2a). We note that both r c and σ at need to be carefully calibrated ( Figure S3, Supporting Information). The chemical order within this atomic environment is of particular interest, and densities of different species give rise to what we call the "element-resolved local atomic density" (ELAD), in (nominal) analogy to the element-resolved atomic structures of Ge 2 Sb 2 Te 5 presented in ref. [46], and the element-resolved local lattice distortion discussed in ref. [66]. The similarity k between two atomic environments can be computed by matching separately these n ELADs of different species, and then averaging (Figure 2b). We note that two or more atomic species can be taken into account in one ELAD, if these species are chemically equivalent to each other-as we do here for Ge and Sb, which both occupy cation-like sites largely interchangeably with one another, whereas Te is defined as the other ELAD because the swap between Te and cation-like atoms leads to antisite defects (Figure 2b).
To now define a "SOAP crystallinity" for atoms during crystallization, a straightforward way is to calculate the SOAP similarity between the current state and the final state of each individual atom. This crystallinity measure, which we call self k is based on the fully recrystallized structure from the same AIMD simulation (referred to as "REC" model; Figure S4, Supporting Informationa). Another, more general way of defining a "SOAP crystallinity" is to calculate the average similarity of a given atom with respect to all atoms of the same species in a selected ensemble of reference structures. This quantity, which we call k , requires the choice of a separate reference ( Figure 2c). Given that both models crystallized into a single cubic phase and that our AIMD-simulated REC models contain local distortions and antisite defects, we here use the k crystallinity and consider the reference structures to be three idealized rocksalt-like GeSb 2 Te 4 models with randomly distributed Ge/Sb/vacancies, independently constructed with the help of a pseudo-random number generator (referred to as "RNG" model; Figure S4b, Supporting Information).
The crystallization process of model 1 is shown in Figure 3a with the SOAP crystallinity k indicated by color coding. Both amorphous-like particles and local structural distortions can well be distinguished from the crystal-like atoms based on the SOAP crystallinity ( Figure 3a). More importantly, some atoms (color coded in black), though seemingly crystal-like in the crystalline matrix, attain relatively low k values-indeed, these are antisite defects in the recrystallized model ( Figure 3b). This observation emphasizes that the SOAP crystallinity measure k can reveal chemical order in addition to the geometric order characterized in Figure 1. The distribution of k for crystal-like (c-), amorphous-like (a-), and antisite atoms ( Figure S5, Supporting Information) evidences that these three types of atoms can be well distinguished based on k . The three peaks (labeled as 1, 2, and 3) in the k profiles also match the characteristics of c-, a-, and antisite atoms, respectively ( Figure 3c). These observations suggest that k can serve as a good indicator of both geometric and chemical order, used here in characterizing the crystallization process of GST.
Adv. Mater. 2022, 34, 2109139 Figure 2. SOAP-based characterization of the recrystallization process. a) Construction of local atomic density for a given atom. A Gaussian function is placed on each atomic position within a sphere of cutoff r c . The chemical information can be analyzed by constructing the element-resolved local atomic density (ELAD). b) The calculation of similarity between two atomic environments. In this work, Ge and Sb (cation) atoms are considered in one ELAD, and Te (anion) atoms in the other one. The similarity k between the two environments can be computed via a dot product (indicated by the dashed arrows) of the power-spectrum vectors p of these two ELADs, and then averaged over all pairs. c) The principles of self k and k crystallinity measures. self k characterizes the similarity of a given atom in the recrystallized model with respect to the fully recrystallized structure ("REC"); here, a simple (single) dot-product kernel is shown for simplicity. k is computed as an average over the similarities of this atom with all individual atoms of the same element in the reference structure (here, "RNG"). Figure 3d, we corrected for the "noise" atoms and calculated the volume V of crystalline atoms, the area S of the crystal-amorphous interfaces, and thus the growth velocity v g for model 1, but now basing the analysis on the SOAP crystallinity k (with respect to RNG), and we compare the results to the 4 dot q case (Figure 3d). The corresponding result for model 2 is shown in Figure S6, Supporting Information. We note that the crystalline volumes calculated based on SOAP crystallinity are almost the same as those obtained using 4 dot q . However, the computed areas of crystal-amorphous interfaces are larger when using SOAP, suggesting slightly rougher growth interfaces if the analysis is based on SOAP crystallinity. The estimated growth velocity, averaged over two models, is ≈1.2 m s −1 as calculated using SOAP, slightly smaller than that calculated by 4 dot q , ≈1.5 m s −1 ( Figure S2, Supporting Information). Overall, our results are in good agreement with previous computational work on Ge 2 Sb 2 Te 5 (1 m s −1 at ≈600 K) [29] as well as the estimated value obtained from ultrafast differential scanning calorimetry experiments. [40] In short, our SOAP crystallinity analysis reveals the evolution of the antisite defects during crystallization in a quantitative way, enriching the observation of structural changes with chemical information. These antisite defects are known to be energetically unfavorable: [67][68][69] indeed, by manually swapping all of the antisite Ge/Sb atoms with nearby antisite Te atoms, the total energy of the two recrystallized models is reduced by 20 and 17 meV per atom, respectively. Thus, these defects are expected to be healed upon proper thermal annealing. Most likely for this reason, antisite defects were not captured in previous thin film experiments based on scanning transmission  . SOAP-based characterization of the crystallization process. a) Snapshots of the crystallization simulation for model 1, with the k similarity (with respect to RNG) indicated by color coding. The atoms in black are identified as antisite defects in the recrystallized model. b) Enlarged atomic images for the areas labeled as "A" and "B" on the right-hand side of (a), containing antisite defects. c) The k profiles of atoms in the 200 ps, 400 ps, and 600 ps configurations, in which the three peaks (labeled as 1, 2, and 3) match well with the distributions of a-, c-and antisite atoms in (b). d) The volumes of the crystalline region V, areas S of the interface between the amorphous and the crystalline region, and crystal growth velocities v g as a function of time for model 1, calculated using SOAP crystallinity k (purple lines) in contrast to 4 dot q (green lines). A correction for "noise" atoms is performed before calculating these quantities (Experimental Section). Volume and area data are smoothed using a Savitzky-Golay filter with a time window of 20 ps. The time window between the two dashed lines in (d) is used to calculate the growth velocity. electron microscopy and energy-dispersive X-ray mapping, in which rocksalt-type GST was obtained by long-term thermal annealing (≈30 min) at moderately high temperatures (150-180 °C) [46,47,49] (see sketch in Figure 4a). Due to the much longer time scale involved, it is not feasible to assess such annealing processes using direct AIMD simulations. However, in PCRAM devices, the SET state is typically obtained by applying electrical pulses of tens of nanoseconds. Antisite defects could form during such rapid phase transition, but their impact on the electrical performance of the SET state of GST remains elusive. Our large-scale crystallization simulations provide an opportunity to investigate the effects of antisite defects (as well as of other sources of disorder) on the electronic properties of nanosecond-recrystallized GST, which is relevant to applications in fast-switching memory devices (Figure 4a). Smaller models with strong finite-size effects imposed by periodic boundary conditions would not allow one to distinguish between localized and extended states, given the relatively large localization radius of the states in the tail of the conduction and valence band.

As shown in
We furthermore expect these large-scale crystallization simulations to become useful in the development of ML-based interatomic potentials for PCMs. Our models contain 1008 atoms, and so they constitute (to our best knowledge) the largest simulation systems for any direct AIMD description of atomistic mechanisms in PCMs to date. In the near future, moving to substantially larger system sizes may be envisioned with the increasing availability of ML-based models, which are "trained" on quantum-mechanical reference data. [70][71][72] Indeed, neuralnetwork potential driven crystallization simulations have been reported for GeTe [73] and very recently for elemental Sb, [74] and a Gaussian Approximation Potential model was developed and applied for the ternary compound Ge 2 Sb 2 Te 5 . [75,76] High-quality AIMD simulations are expected to be beneficial (and indeed required) for the long-term goal of developing more comprehensive ML potentials for GST materials, explicitly including the GeSb 2 Te 4 composition: both in terms of providing information about what (small-scale) reference data may be needed to represent the (large-scale) growth interface most efficiently, thereby providing guidance on the construction of the reference database, and in terms of providing physical property predictions against which future ML potentials may be validated.

Effects of Defects on the Electronic Structure and Localization
The ab initio generated crystallized models reported herein allow for a direct study of their electronic properties. Prior to electronic-structure calculations, the two recrystallized models were shortly heated at higher temperatures to remove the mismatch at the interface where the two growth fronts meet; subsequently, the systems were quenched to close to 0 K in 40 ps, and the geometries were further optimized by DFT calculations. The resulting structure of model 1 is shown in Figure 4  significant number of Te atoms with 3 nearest neighbor vacancies (17.7%), many of which are located inside the vacancy cluster highlighted in Figure 4b. There are even a few Te atoms with four neighboring vacancies. Regarding antisite defects, model 1 shows 22 anti-Te atoms (i.e., Te atoms on cation sites) and 18 anti-Ge/Sb atoms, corresponding to 4.0% of the total number of atoms (3.8% for model 2). As highlighted by the homopolar bonds in Figure 4c, these antisite defects are statistically distributed, owing to the rapid and stochastic atomic migration during the fast crystal growth process. We note that upon quenching from the melt, the amorphous state also contains a relatively high fraction of homopolar bonds, ≈11.9% ( Figure S8, Supporting Information), and that spontaneous structural relaxation on longer time scales would lead to a gradual reduction of the number of homopolar bonds. [77][78][79] Accordingly, large structural models are required to obtain good statistics, and the analysis of two independent simulations (cf. Figure 1) provides additional support in this regard. Figure 4d shows the inverse participation ratio (IPR) of the Kohn-Sham states, providing an estimate for the inverse of the number of atoms on which the state is localized (see Experimental Section). In the thermodynamic limit ("infinite" system size), the IPR is thus finite for localized states, whereas it is equal to zero for extended wave functions. For our models with finite size, the IPR values are non-zero even for delocalized states. Nevertheless, the system size is sufficiently large to clearly distinguish between strongly localized and extended states. The model exhibits large IPR values near the top of the valence band and the bottom of the conduction band. As shown in Figure 4e, the highest occupied molecular orbital (HOMO) turns out to be localized inside the vacancy cluster mentioned above, in line with our previous work. [53] Some additional charge-density pockets appear in the right-hand side of the model where antisite defects are found, resulting in a smaller IPR value of the HOMO state as compared to those of RNG models without such defects (note, however, that the IPR values in the tail of the valence band generally depend on the shape and size of the vacancy clusters as well). [53,80] Regarding the lowest unoccupied molecular orbital (LUMO) state, it has a higher IPR value and displays a cigar-shape localized state (see Figure 4f). Note that the two localization regions in the figure are connected due to the periodic boundary conditions. Its charge density is mostly localized in a region that contains an Sb-Te chain, as well as two antisite Te defects. We refer to such regions as "electron-rich," since, for an idealized solid where each atom retains its atomic electronic configuration, the average number of p electrons per site within these regions is larger than 3. In a real solid forming electronic bands, these regions correspond to an excess of positive nuclear charge, which leads to the formation of electronic bound states near the bottom of the conduction band. For instance, a Te antisite defect has a similar effect as a P impurity in silicon (except for the smaller localization radius). Most states with high IPR values in the conduction band are localized inside these regions.
It is difficult to distinguish the contribution of antisite defects from that of Sb-Te chains in this recrystallized model, due to the large amount of disorder. To disentangle the role of both, we build new and more idealized models, which show that both ingredients can separately lead to Anderson localization of electrons at the tail of the conduction band. First, we construct several RNG models without antisite defects. After geometrical relaxation, these models show similar energy values and comparable degree of lattice distortion compared to our previous calculations. [53,80] Some of these models possess states at the tail of the conduction band with relatively large IPR values, besides the strongly localized states at the edge of the valence band. We show the DOS/IPR and LUMO state of one such model in Figure 5a. These unoccupied states display an elongated shape that is localized along Sb-Te chains terminated by Ge atoms or vacancies. To further simplify the analysis of these states and avoid the inherent stochasticity of the RNG models, we manually construct a GST model that accommodates only one Sb-Te chain. As shown in Figure 5b, the resulting stoichiometric model, Ge 285 Sb 2 Te 288 , contains a chain comprised of 4 Sb-Te bond pairs, which is terminated by two vacancies. Clearly, two states at the tail of conduction band show high IPR values. The charge density of one of these states is shown in Figure 5b. The charge distribution of these states is indeed mainly localized in this "electron-rich" area.
Next, we manually create some antisite defects in RNG-GeSb 2 Te 4 models. One such model is shown in Figure 5c. Its LUMO state appears to be mostly localized along the Te-Te bonds. We note that the possibility of electron localization by these defects in rocksalt GST has been discussed in ref. [55], in which high pressure was utilized to tailor such disorder. As already mentioned, in disordered models, there is an interplay between antisite defects and Sb-Te chains. To single out the effect of the former defects, we thus consider the parent compound GeTe. It was shown by DFT calculations using relatively small supercell models that antisite Te defects generate additional energy states in the gap region. [67] Here, we include one antisite pair (Te Ge and Ge Te ) in an orthorhombic supercell with 576 Ge and 576 Te atoms. As shown in Figure 5d, the electronic structure of the relaxed models shows localized states at the tail of the conduction band. The wave functions of these unoccupied states are mainly localized along the bonding directions of the Te-Te "wrong bonds." It is possible to explain these findings by considering the electronic structure of single Ge Te and Te Ge defects. As reported in ref. [67], a Ge Te antisite does not lead to electron localization, whereas a Te Ge defect induces a localized state at the very edge of the conduction band. In principle, it is possible that the Ge Te defect leads to the formation of a shallow state with exceedingly large localization length, far larger than our simulation cell, due to effective screening stemming from the large dielectric constant of GST. The investigation of this point would require the use of extremely large supercells, which are still beyond reach for ab initio methods. A scaling analysis confirms the genuine electron localization at the tail of conduction band induced by both Sb-Te chains and antisite defects ( Figure S8, Supporting Information). We also include chargedensity scans of this state along the x, y, and z axis for the largest 3360-atom models ( Figures S9-S11, Supporting Information) and 2304-atom models (Figures S12-S14, Supporting Information): the scans are performed by computing the electron density at several planes perpendicular to the relevant axis.
We summarize the localization in GST as follows. There are three localization sources: <1> vacancy clusters, <2> Sb-Te chains, and <3> antisite Te defects (Figure 6; Figure S17,  Supporting Information). The localization induced by vacancy clusters is 3D, and is typically found at the top of the valence band. However, the other two localization sources mostly induce quasi-1D cigar-shape localization at the bottom of the conduction band. As regards the antisite defects, they are energetically unfavorable and induce localized empty states in the conduction band. Nevertheless, it is noted that the abundance of antisite defects could also affect the localization properties of the occupied states-see, for instance, the HOMO state in the recrystallized model 1.
Our new findings regarding the impact of "electron-rich" configurations on the electronic wave functions are valid beyond the specific compositions of GST PCMs, and are expected to contribute to the understanding of transport in disordered rocksalt-like chalcogenides that exhibit n-type transport, such as SnBi 2 Te 4 [81] and PbSb 2 Te 4 . [82] To corroborate this claim, we have carried out simulations of disordered SnBi 2 Te 4 and PbSb 2 Te 4 , which show that "electron-rich" areas, respectively Bi-Te chains in SnBi 2 Te 4 and Sb-Te chains in PbSb 2 Te 4 , also induce cigar-shape localization in the conduction band ( Figure S15, Supporting Information). It has recently been predicted that many ternary and binary chalcogenides can form a metastable, disordered rocksalt-like phase. [80] Our present work now suggests that these compounds could be Anderson insulators irrespective of the self-doping tendency. Indeed, previous thin film experiments on n-type rocksalt chalcogenides revealed Anderson localization, as they showed negative temperature coefficient of resistance values, despite the relatively smaller resistivity as compared to p-type rocksalt chalcogenides, such as GST and SnSb 2 Te 4 . [81]

Conclusion
We have reported direct AMD simulations, with more than 1000 atoms per cell, of the crystallization of the GeSb 2 Te 4 phase-change memory material. In line with experimental findings, fast crystallization from the amorphous phase and subsequent annealing lead to a strongly disordered rocksalt-like phase, exhibiting Anderson localization. The simulations also yield further insight into the crystallization process and into the electronic structure, which would not be directly accessible from experiment alone (nor would it be accessible from small-scale ab initio simulations). Our results show that localization near the edge of the valence band is mostly induced by vacancy clusters corresponding to electron-deficient regions. The resulting electronic wave functions mainly show contributions from the p-orbitals of the undercoordinated Te atoms in the clusters. Our simulations also indicate that other types of defects can lead to Anderson localization in the conduction band or to midgap states. These defects include "electron-rich" regions originating from the local abundance of Sb atoms (Sb-Te chains) or Te antisites. We stress that our models are stoichiometrically precise, that is, they have an exact Ge:Sb:Te ratio along the GeTe-Sb 2 Te 3 pseudo-binary line. Non-stoichiometric defects (e.g., excess vacancies) are present in real samples and are responsible for self-doping effects leading to p-type transport in Ge-Sb-Te, Sn-Sb-Te, Sb 2 Te 3 , and related alloys, and to n-type transport in Pb-Sb-Te, Sn-Bi-Te, Bi 2 Te 3 , and related alloys. Our ab initio calculations suggest that all these alloys could be Anderson insulators, as localized states are predicted to be present at the edges of both valence and conduction band, being due to different sources of disorder. These findings open up the possibility of employing a combination of p-and n-type disordered chalcogenides [80,83] in novel multilevel devices, where multiple states with different resistivity are obtained by tuning the degree of disorder.
We also envision that SOAP crystallinity measures, as demonstrated in this work, may surpass structural analysis with specific bond order parameters in many aspects when characterizing crystallization mechanisms in different functional material systems. With the construction of element-resolved local densities, the chemical ordering is revealed in addition to the geometrical ordering. The quantified similarity between two atomic environments provides a straightforward measure for the crystallinity which is amenable to chemical interpretation. Importantly, the choice of reference is rather flexible: for materials with a well-known recrystallized form, we can reference to structures from data sources such as the Inorganic Crystal Structure Database; multiple structures with slightly different local environments can be combined to enable better sampling (e.g., of the randomly distributed Ge/Sb/vacancy sites in cubic GST). If the recrystallized form of a material is a priori unknown, for example, in water, metallic glass, and other amorphous solids, [84][85][86] we could characterize the MD-simulated recrystallization using self k with respect to its final crystallized state, which is independent from the choice of an external reference. In addition, one might search for an appropriate reference through unsupervised ML. [61,87] We expect that in the future, SOAP-based analyses will provide a considerably more general approach for studying various microscopic mechanisms that are relevant in PCMs, including crystal nucleation [27][28][29] and pressure-induced crystallization, [88,89] the effects of impurities and dopants, [90][91][92] and heterostructure interfaces. [93,94] In the latter cases, different reference structures could be defined for the various chemical species involved, allowing for SOAP-based analyses of the local environments of the dopant atoms as well as their effect on the structure of the PCM phase overall (and how this structure is different from the pristine bulk phase, e.g., as simulated here for GeSb 2 Te 4 ). With continuing advances in the atomistic modeling of PCMs and other functional materials, such tools for structural characterization will likely be of growing interest in the future.

Experimental Section
Computational Details: AIMD simulations were carried out using the "second-generation" Car-Parrinello scheme [95] which combined the efficiency of Car-Parrinello molecular dynamics with the accuracy and large timesteps of Born-Oppenheimer molecular dynamics. This scheme was implemented in the Quickstep code of the CP2K package. [96,97] In this code, periodic boundary conditions were enforced, and a Gaussian and plane waves scheme was used. The localized basis sets allowed very efficient calculation of the total energy and the Kohn-Sham matrix that scaled linearly with the system size, and the use of the orbital transformation for plane waves led to good parallel performance for large-scale calculations. A triple-zeta plus polarization Gaussian-type basis set was used to expand the Kohn-Sham orbitals and plane waves with a cutoff of 300 Ry to expand the charge density. The generalized gradient approximation was employed to the exchange-correlation potential [98] and scalar-relativistic Goedecker pseudopotentials. [99] The Brillouin zone was sampled at the Γ point and the time step was set to 2 fs. The simulations were conducted in the canonical ensemble (NVT) and the temperature was controlled by a stochastic Langevin thermostat. The IPR of a state Ψ α is defined as Σ i |Ψ α,i | 4 /( Σ i |Ψ α,i | 2 ) 2 , where the Ψ α,i 's indicate the expansion coefficients of Ψ α with respect to the basis set. An isovalue of 0.01 a.u. was employed for the visualization of all the Kohn-Sham states presented in the main text and Supporting Information. The amorphous models were generated by a standard melt-quench protocol as is commonly used in the field, [100][101][102] and the corresponding structural analyses are provided in Figure S8, Supporting Information.
Growth Velocity Calculations: The growth velocity v g was calculated using the same method as discussed in the previous work, [29] and the v g as a function of time t could be seen as the temporary variation of crystalline volumes V, divided by the crystal-amorphous interface areas S, shown as follows: The crystal-like particles could be determined based on the distributions of quantified crystallinity ( 4 dot q or SOAP crystallinity), in which a selection value could clearly separate the crystal-like and amorphous-like particles. As shown in Figures S2,S5, Supporting Information, the selection value was set to be 0.45 for 4 dot q crystallinity (i.e., atoms were defined as crystal-like particle when 4 dot q > 0.45 and amorphous-like ones when 4 dot q < 0.45), and 0.57 when using k similarity (with respect to RNG). It was noted that some crystallike (or amorphous-like) atoms were separated from the crystalline (or non-crystalline) matrix. The presence of these atoms, mainly on account of thermal fluctuations, could be seen as a "noise" during the recrystallization process, leading to an inaccurately estimated crystalline volume and an enlarged growth interface. Hence, as a correction for this "noise," the crystalline volumes V were manually calibrated and the areas of growth interface S were deducted, created by these "noise" atoms, before calculating v g . The raw data of volumes V and areas S were smoothed using a Savitzky-Golay filter with a time window of 20 ps for the calculation of growth velocity.
SOAP Similarity Kernel: To quantify both the chemical and geometric similarity of two atomic environments (ξ A and ξ B ) up to a cutoff r c , the SOAP approach [59] was used to construct suitable kernels. An ELAD that includes different species (Ge/Sb or Te) separately in the local atomic environment was considered in this work (Figure 2a), noting the similarity to "alchemical" SOAP-based kernels in ref. [ 60 ]. A Gaussian function with broadness σ at was placed on each atomic position within the cutoff, and the neighbor density was then expanded in a basis of radial functions R n (r) and spherical harmonics ( ) Y r l m as: [59] with the combination coefficients c nlm , as well as n and l as convergence parameters running up to a suitable maximum (n ≤ 16 and l ≤ 16 in this work). The similarity between two atomic environments was obtained by calculating the dot product of the power-spectrum vectors p α of two ELADs of the same element (or, here, group of elements, namely Ge/Sb). [60] The authors then average over n species: Given the reference structures to be three RNG models, the SOAP crystallinity k of an individual atom x in the recrystallized model could thus be calculated with respect to the reference structures, defined as an average over the similarities of this atom to all the N atomic environments of the same species in the reference structure: The calculation of the SOAP power-spectrum vectors was carried out as implemented in the QUIP/quippy and GAP code (https://github.com/ libAtoms/QUIP). It was noted that the hyperparameters (cutoff r c and broadness σ at ) of the SOAP kernel needed to be carefully chosen to enable a proper distinction between crystal-like and amorphous-like atoms. As seen in Figure S3, Supporting Information, the distribution of k similarity of crystal-like and amorphous-like atoms was calculated with respect to RNG, and found overlapping distributions for moderate to large r c values (<8.5 Å). This suggested that a longer-ranged SOAP kernel was needed to distinguish crystal-like and amorphous-like regions in GST. The results were found to converge when using r c larger than 9.5 Å; hence, r c = 9.0 Å was used in this work. Likewise, it was also found that too small (<0.3 Å) or too large (>0.4 Å) settings for σ at led to a mix-up of crystal-like and amorphous-like particles ( Figure S6, Supporting Information), and thus set σ at = 0.3 Å.

Supporting Information
Supporting Information is available from the Wiley Online Library or from the author.