Predicting and Rationalizing the Soret Coefficient of Binary Lennard‐Jones Mixtures in the Liquid State

The thermodiffusion behavior of binary Lennard‐Jones mixtures in the liquid state is investigated by combining the individual strengths of non‐equilibrium molecular dynamics (NEMD) and equilibrium molecular dynamics (EMD) simulations. On the one hand, boundary‐driven NEMD simulations are useful to quickly predict Soret coefficients because they are easy to set up and straightforward to analyze. However, careful interpolation is required because the mean temperature in the measurement region does not exactly reach the target temperature. On the other hand, EMD simulations attain the target temperature precisely and yield a multitude of properties that clarify the microscopic origins of Soret coefficient trends. An analysis of the Soret coefficient suggests a straightforward dependence on the thermodynamic properties, whereas its dependence on dynamic properties is far more complex. Furthermore, a limit of applicability of a popular theoretical model, which mainly relies on thermodynamic data, was identified by virtue of an uncertainty analysis in conjunction with efficient empirical Soret coefficient predictions, which rely on model parameters instead of simulation output. Finally, the present study underscores that a combination of predictive models and EMD and NEMD simulations is a powerful approach to shed light onto the thermodiffusion behavior of liquid mixtures.


Introduction
In the nineteenth century, Ludwig [1] and Soret [2] observed independently from each other that a temperature gradient in an isotropic mixture can induce a concentration gradient (Figure 1). This coupling phenomenon between heat and mass flow anisms of thermal diffusion and also allow for the prediction of Soret and other transport coefficients. The Lennard-Jones interaction potential is particularly attractive because it is a computationally efficient and yet realistic molecular model, thus helping to rapidly generate a fundamental understanding of molecular processes. Furthermore, Lennard-Jones mixtures can exhibit a rich vapor-liquid equilibrium behavior, [29][30][31] which can facilitate constructing new theoretical models and testing application limits of existing ones. Knowing the phase diagram is in itself central to studying thermodiffusion because phase changes are typically to be avoided, that is, gas bubbles and solid precipitates are unwanted when investigating thermal diffusion in the liquid state.
Thermodiffusion can be simulated in various ways by molecular simulations, where non-equilibrium molecular dynamics (NEMD) simulations are a well-established standard approach. These simulations closely mimic experimental scenarios for thermodiffusion measurements, making them the ideal candidates to construct benchmark data. NEMD simulations monitor the response to system perturbations out of equilibrium. Synthetic NEMD simulations [32][33][34] add external forces on molecules, thus altering their dynamics in an artificial way. On the contrary, boundary-driven NEMD methods [35][36][37][38][39][40][41][42] aim at paralleling experimental situations much more directly by their conceptual set-up. These methods impose different temperatures within the simulation volume that give rise to temperature gradients, which, in turn, exert thermodynamic forces on the molecules in an indirect and natural way. Typically, a narrow region along one spatial direction is assigned as a heat source volume, and a corresponding heat sink region is placed half the simulation volume apart. The difference between the various boundary-driven methods is the way in which the temperature is controlled in the two regions.
Another approach to simulate thermal diffusion and to calculate the Soret coefficient is based on the spontaneous fluctuations of the microscopic fluxes that are present in equilibrium molecular dynamics (EMD) simulations through the Green-Kubo formalism. Although the corresponding flux expressions for mixtures were thoroughly derived in the 1980s, [43,44] only a few molecular simulation studies on thermal diffusion employing EMD techniques can be found in the literature. [32,33,[45][46][47][48][49][50] Further-more, EMD methods have been mostly used for validation purposes, that is, together with NEMD simulations. [33,[45][46][47][48]50] Presumably, this originates from the difficulties inherent to EMD simulations. First, cross-correlation functions are weaker and thus more challenging to sample than auto-correlation functions. Second, reliable information on the thermodynamic factor and the partial molar enthalpy is required to calculate the Soret coefficient. However, EMD simulations are valuable to gain insight into the sign and magnitude of the individual contributions to thermal diffusion processes, as recognized by Miller et al. [49] In this work, the strengths of NEMD and EMD simulations were combined to reliably predict and understand the Soret coefficient of three different Lennard-Jones mixtures in the liquid state. [29,30] Furthermore, a popular theoretical model, [27] which mainly relies on thermodynamic input data, and a regression model to predict the Soret coefficient were tested, investigating their applicability limits. Our results and insights underscore that NEMD and EMD simulations in conjunction with theoretical models build a strong framework to foster understanding of thermodiffusion in liquid mixtures.

Methodology
When two or more processes occur simultaneously in a system, they may couple, causing new effects. [51] Examples of such cross-phenomena are the Soret and Dufour effects. The thermodiffusion or Soret effect addresses the macroscopic motion of molecules that results in a concentration gradient caused by a temperature gradient. The reciprocal phenomenon, that is, a temperature gradient caused by a concentration gradient, is known as Dufour effect.

Non-Equilibrium Thermodynamics
Linear non-equilibrium thermodynamics (NET) provides a compact framework to describe the behavior of systems subject to various driving forces and is fundamentally based on four postulates: [14] 1) the quasi-equilibrium postulate, requiring that gradients are not too large; 2) the linearity postulate, stating that all fluxes are linear functions of all relevant driving forces; 3) Curie's postulate, constraining which fluxes and forces are coupled; and 4) Onsager's reciprocal relations, requiring symmetric coefficient matrices in the flux-force relations.
Considering the Soret and Dufour effects only, the equations for heat flux J Q and mass flux of component 1 J m 1 in a binary mixture can be written in the NET framework as [52] where T is the temperature, p is the pressure, and 1 and w 1 are the chemical potential and mass fraction of component 1, respectively. L ij are phenomenological Onsager coefficients that describe the proportionality between the thermodynamic forces and the fluxes that they induce. [53] The auto-correlated properties L 11 and L QQ are the mass-mass and heat-heat phenomenological coefficients. The cross-correlated phenomenological coefficients L 1Q = L Q1 follow Onsager's reciprocity relations. [54] On the other hand, experimentalists usually employ the following constitutive equations [46,52] to describe heat and mass fluxes where is the thermal conductivity, m is the mass density of the mixture, and D S T and D D T represent the thermal diffusion coefficients of Soret-and Dufour-type, respectively.
When the steady state is reached (i.e., J m 1 = 0) and a constant mass density can be assumed, Equation (5) can be rearranged to where x i is the mole fraction of component i and J 1 the molar flux of component 1. Another frequently determined and closely related quantity is the thermal diffusion factor [52] T The relations between phenomenological coefficients and transport coefficients can be determined by comparing the NET phenomenological Equations (2) and (3) with the constitutive Equations (4) and (5). Consequently, the thermal diffusion coefficients of Soret-and Dufour-type are thus given by [46] From Onsager's reciprocity relations, it follows that The Fick diffusion coefficient is related to the mass-mass phenomenological coefficient L 11 by where is the molar density of the mixture. M i is the molar mass of component i, k B is the Boltzmann constant, and Γ the thermodynamic factor. The Soret coefficient is thus [46] S T = 1 The key elements of the Langevin NEMD simulations are two temperature control regions. One of those regions (center) is maintained at a higher temperature T high , whereas the outermost regions that are connected via periodic boundary conditions are kept at a lower temperature T low . The temperature T and the mole fractions x i as functions of the Cartesian z-direction are obtained from histograms using 50 bins. b) The profiles of T and x 1 obtained from Langevin NEMD simulations are linear. Two regions (green) can be used to determine slopes for further analysis to calculate the Soret coefficient.

Non-Equilibrium Molecular Dynamics
NEMD simulations closely mimic experimental scenarios to study thermodiffusion because these simulations exhibit regions of high and low temperature within the simulation volume. Furthermore, linear concentration profiles and linear temperature profiles in between these regions are obtained at steady state, the latter implying a constant thermal conductivity. [35] The thermal conductivity of pure Lennard-Jones fluids and binary mixtures increases with temperature and density, [35,[55][56][57] the two effects possibly compensating each other. Thus, a narrow region in the center of the simulation volume is defined as a heat source and a corresponding region as a heat sink at the edges of the simulation volume in z-direction (Figure 2a). The temperature profile T(z) and the mole fraction profiles x i (z) were determined on the basis of particle trajectories obtained with LAMMPS [58] (stable version: October 29, 2020). Both temperature-control regions were of equal size, having a width of 0.3 1 , unless stated otherwise, where 1 is the Lennard-Jones size parameter of component 1. The resulting width was typically ≈1% of the simulation volume. Two Langevin thermostats [59] were used, where one thermostat heated one temperature-control region and the other thermostat cooled the other T-control region, so that we refer to this method as Langevin NEMD.
The Langevin NEMD simulations were typically carried out with 1200 molecules. The z-direction of the simulation volume was divided into 50 bins of constant width Δz = l z ∕50 (Figure 2a). Each molecule i was assigned to one of the bins j, according to its position z i : j = int(z i ∕Δz). For each bin j, the kinetic energy of all molecules located in the bin E kin (z j ) was determined where N(z j ) is the number of molecules in bin j, m i , and v i the mass and the velocity of molecule i, respectively, and v is the center-of-mass velocity of the system. Since the total momentum of the system was set to zero, v vanishes. The local temperature in bin j T(z j ) was calculated by [35] T The number of degrees of freedom 3 N(z j ) should be adjusted to account for fixing the total momentum of the entire system, by subtracting three in total, but distributed over the entire temperature profile. Since 50 bins were typically used, we would need to subtract approximately 3∕50 = 0.06 from the number of degrees of freedom in each bin. As the number of degrees of freedom was typically (3 × 1200)∕50 = 72 in each bin with N(z j ) ≈ N∕N bins = 1200∕50, the adjustment would have been on the order of 0.1% and thus negligible. In this context, we investigated the influence of system size (i.e., the number of molecules) on the Soret coefficient-and thus on the temperature profile-while keeping the number of bins constant, and found no measurable influence when using larger systems, in agreement with previous work. [60] Usually, four different temperature differences ΔT between the region of high T high and low temperature T low were considered. The relative imposed temperature differences (T high − T low )∕(0.5 ⋅ (T high + T low )) were set to 13.3%, 26.6%, 40%, and 53.3%, respectively. The largest temperature gradient observed in the simulations was 0.026 such that violation of local equilibrium is not expected. [61,62] After a certain response time, a steady state was observed, where the temperature gradient induced a mole fraction gradient. The simulation time of 3500 to 5000 Lennard-Jones units spent to establish the steady state was selected based on transient density profiles reported by Bonella et al. [63] The mole fraction of component k in bin j x k (z j ) was calculated as the ratio of the number of molecules of component k in bin j N k (z j ) over the total number of molecules in bin j N(z j ) The slopes of the mole fraction dx k ∕dz and temperature profiles dT∕dz as well as the average mole fraction of component kx k provide all required input data to calculate the Soret coefficient of component k using Equation (6). For each ΔT, ten simulations with independent initial configurations were performed in order to average the individual Soret coefficient data and obtain reliable uncertainty estimates via the sample standard deviation. [64,65] A transition regime was observed, in which the temperature and mole fraction profiles were still influenced by the temperature-control regions ( Figure 2b). Therefore, all data points that were closer than ≈ 0.05 × l z to the center of each thermostat layer were discarded. Thus, roughly 20% of the profile data were discarded in order to maximize the likelihood to analyze linear profiles only.
Unless it is specified otherwise, the average temperature in the profile fit region V fit was identified as the temperature of the NEMD simulations, that is, T = T(z)| V fit . The average values from profiles agree very well with the corresponding global mean temperature obtained from all molecules in the entire NEMD simulation volume (Figure 3a). Similarly, the mean mole fraction of component 1 in the profile fit region of the NEMD simulations x 1 (z)| V fit was close to the globally imposed value of x 1 = 0.5 mol mol −1 (Figure 3b).
Newton's equations of motion were integrated with a velocity-Verlet algorithm [66,67] during the production phase. More details about the NEMD simulations and the corresponding computational workflow are provided in Section S1.1, Supporting Information.

Equilibrium Molecular Dynamics
All EMD simulations were carried out with the ms2 package. [68] The mass-mass, heat-heat, and heat-mass phenomenological coefficients defined by the NET framework [52] can be sampled with EMD simulations, employing the Green-Kubo formalism. [69,70] In this framework, the phenomenological massmass phenomenological coefficient L 11 is given by [44] where t is the time. The microscopic mass flux J m i is defined as Here, m i is the mass of a molecule of component i, v k i (t) the center of mass velocity vector of molecule k at some time t, N i is the number of molecules of component i, and the angular brackets indicate the canonical (NVT) ensemble average. After setting the total momentum to zero, ⟨v⟩ deviated from zero during simulation only within machine error such that J m 1 + J m 2 = 0. The Green-Kubo expression for the heat-mass phenomenological coefficient L Q1 is In EMD simulations, the heat flux J Q cannot be directly determined-in contrast to the internal energy flux J E which can be directly computed. Thus, the associated internal energy-mass phenomenological coefficient L E1 was calculated by where the internal energy flux J E is given by Here, u kl ij is the inter-molecular potential energy and r kl ij is the distance vector between molecules k and l. The indices i and j denote the components of the mixture. The dyadic product is represented by : and I is the unitary tensor. In a binary mixture, L Q1 can be determined from L E1 if the partial molar enthalpies h i of both components are known [71] Because of Onsager's reciprocity L E1 = L 1E . Thus, both crosscorrelated coefficients were independently sampled during simulation and averaged to improve statistics.

Partial Molar Enthalpy
In this work, the partial molar enthalpy was calculated from EMD simulations in two steps. First, the residual molar enthalpy of the binary mixture h res was calculated for ten different mole fractions in the isobaric-isothermal (NpT) ensemble over the composition range x 1 = 0.5 ± 0.05 mol mol −1 . Subsequently, an appropriate function h = f (x 1 ) was fitted by a least squares optimization to the simulation data to generate an expression for the molar enthalpy The partial molar enthalpy h i was then obtained from

Thermodynamic Factor
To determine the Fick diffusion coefficient from the massmass phenomenological Onsager coefficients according to Equation (10), the thermodynamic factor Γ is required It describes the thermodynamic non-ideality of a mixture and can be obtained from Kirkwood-Buff integrals [72] G ij where g ij (r) is the radial distribution function, while 1 and 2 are the partial densities i = x i . The truncation method by Krüger et al. [73] and the corrections of the radial distribution function by Ganguly and van der Vegt [74] were employed because of convergence issues. [75]

Models
Equimolar (i.e., x 1 = x 2 = 0.5 mol mol −1 ) binary liquid mixtures of molecules that interact via the Lennard-Jones potential were considered where ij and ij are the Lennard-Jones energy and size parameter for the interaction between molecule pairs belonging to components i and j, and r kl ij denotes the distance between molecules k and l. The interaction parameters if i = j i and i are provided in Table 1. The parameters for the interaction between different components were determined with the Lorentz-Berthelot combining rules [76] unless stated otherwise: ij = 0.5 ( i + j ) and ij = √ i j . Furthermore, the potential was explicitly evaluated up to a cutoff radius r cut = 2.5 × max( 1 , 2 ), [39] and analytic tail corrections were applied to pressure and energy terms. [77,78] Small time step sizes Δt were chosen to integrate Newton's equations of motion [79] (Table 1).
All results are presented in Lennard-Jones units, where the mass m 1 , the Lennard-Jones energy 1 , and Lennard-Jones size parameters of component 1 1 were used for reducing all quantities: m * = m∕m 1 , E * = E∕ 1 , and l * = l∕ 1 , where E denotes energy and l length or distance. The reduced time and temperature are t * = t √ 1 ∕(m 1 2 1 ) and T * = Tk B ∕ 1 , respectively. For the sake of brevity, the asterisks on the reduced quantities are omitted in the remainder of this work.

Results
First, we present NEMD results because the corresponding Soret coefficients serve as a benchmark for the subsequent EMD simulations. For both approaches, specific issues, pitfalls, and necessary procedures are highlighted to obtain reliable Soret coefficient data. This is followed by a comparison and discussion of the data for all Lennard-Jones models considered in this work, where two lean prediction models are also included. The first model is based on thermodynamic and approximate kinetic data, [27]  The color from blue to orange indicates an increasing relative temperature difference and thus a rising T gradient in the simulation volume. Literature values [32,33,35,41,49] (gray diamonds) and their average value [41] (black square) are provided for comparison.
whereas the second one is a regression model that relies on Lennard-Jones and mass parameters of the force field. [39]

Validation
The Langevin NEMD method and the corresponding analysis workflow was validated on the basis of literature data for the mixture argon/krypton. [32,33,35,41,49] The present Soret coefficient data (Figure 4) agree well with data from literature, which were obtained using a broad variety of approaches: NEMD with synthetic NEMD simulations, [32,33] the heat exchange method, [35] as well as the modified heat exchange method and reverse NEMD simulations [41] and Green-Kubo-based EMD simulations. [49] The average temperature that was observed in the central profile-fit region of our simulation volume varied with the imposed temperature difference. The colors on the depicted data points correspond to the observed relative temperature difference ΔT∕T. It can be concluded that the present workflow is reliable but that it  Table 1). The color from blue to orange indicates an increasing relative temperature difference ΔT∕T in the Langevin NEMD simulations. Open symbols represent linear interpolations of S T and its standard deviation around the desired target temperature T target . Corresponding plots for the other two Lennard-Jones mixtures are provided in the Supporting Information ( Figure S5). leaves some imprecision regarding the target temperature, which is on the order of 1%.
The Langevin NEMD procedure was also validated based on selected Lennard-Jones systems, [39,71] and the agreement was usually excellent ( Figure S2, Supporting Information). Furthermore, the Langevin NEMD method and the enhanced heat exchange (eHEX) algorithm [40] as implemented in LAMMPS [58] yield consistent results ( Figure S4, Supporting Information). Therefore, it can be concluded that the Langevin NEMD procedure reliably yields Soret coefficient data for Lennard-Jones systems.

Interpolation
The Soret coefficient from a given Langevin NEMD simulation is associated with the mean temperature in the fit region, which is the common practice when handling the rather complicated temperature-dependency of thermodiffusion quantities. [14] Typically, Soret coefficient data from NEMD simulations should be extrapolated to the zero-force regime, [32,39,80] where a linear function can be used. [39] However, MacGowan and Evans estimated zero-force transport coefficients by weighted averaging if a distinct force dependence of a given coefficient could not be detected. [32] On the one hand, the variation of the Soret coefficient is typically small for different thermodynamic driving forces around a given state point in this work (Figure 5). On the other hand, the average value of the temperature in the gradient fit region does not exactly match the target temperature (i.e., mean value of imposed high and imposed low temperature), but the deviations are small: a few percent. Furthermore, the Soret coefficient data determined with different gradients and at different temperatures usually exhibit consistent temperature trends (Figure 5). Therefore, comparison to EMD predictions, in which the target temperature is reached exactly, was facilitated by interpolation. Specifically, S T and its standard deviation were linearly in-terpolated in the small window around the target temperature to provide the most likely values at the target temperature.

Screening Other Influences
Validation on the basis of literature data and solving the interpolation issue were central to verifying the reliability of our Langevin NEMD method. However, other technical and model-related factors that can possibly influence the Soret coefficient were also sounded out. Specifically, it was reassured that the Langevin damping constant, the width of the temperature control regions, the system size, and the cutoff radius have no systematic impact on S T (Section S2.4, Supporting Information), which leads us to the conclusion that the Langevin NEMD method proposed here yields reliable and robust results.

EMD: Obtaining Causal Data
The EMD approach based on correlation functions and the Green-Kubo formalism is challenging for dense liquids because the cross-correlations required to calculate thermal diffusion are at least one order of magnitude smaller than the auto-correlation functions [44] required to determine the self-diffusion coefficient. In a first step, EMD simulations were performed using the same simulation parameters as for the NEMD simulations ( Table 1). The resulting Soret coefficient data are provided in Figure 8 and Table S1, Supporting Information. In a second step, the simulation parameters were optimized to yield consistent data with the NEMD simulations.

Finite Size Effects and Cutoff Radius
It is well known that small molecular systems under periodic boundary conditions are associated with systematic errors when diffusion coefficients are calculated. In this context, the size dependence of the mass-mass L 11 and internal energy-mass L E1 phenomenological coefficients was studied for one of the considered systems (i.e., mixture B2). A series of simulations with varying system size containing 600, 1200, 2400, 4800, and 8000 molecules was performed. As expected, a clear system size dependence was observed for the mass-mass phenomenological coefficient L 11 (Figure 6a). Its value in the thermodynamic limit, obtained by extrapolation to infinite size, was found to be 5.5% higher than its sampled value for 1200 molecules. Therefore, L 11 had to be corrected to account for finite size effects when working with small systems, for example, as proposed in ref. [81]. A finite size correction of L 11 based on a modified version of the correction factor by Yeh and Hummer [82,83] for binary mixtures yields an improvement, but L 11 remains underestimated by 4%. On the other hand, no clear system size dependence could be inferred for the internal energy-mass coefficient L E1 despite its diffusive nature (Figure 6b).
The prediction of the thermodynamic factor by Kirkwood-Buff integration is also subject to system size effects, as can be seen in Figure 7. Because this formalism was originally derived for the grand canonical ( VT) ensemble, it has to be extrapolated to the thermodynamic limit when the NVT ensemble is used. The correction method by Krüger et al. [73] was successfully employed in a previous work [75] to determine extrapolated values when dealing with small systems. However, when sufficiently large systems are employed, this methodology can lead to an over-correction of the thermodynamic factor (Figure 7).
Because of finite size-related issues, additional EMD simulations were performed with larger volumes for all studied mixtures. For this purpose, relatively large systems with 8000 molecules were employed and the cutoff radius was extended to half of the edge length of the simulation volume l∕2. For mixture A2, the difference between the extrapolated and sampled mass transfer coefficient was within its statistical uncertainty (i.e., ≈ 2%). Further, the sampled thermodynamic factor differed by +2.5% from its extrapolated value. Therefore, the sampled values of the mass-mass phenomenological coefficient and the thermodynamic factor were employed directly without further finite size corrections for the calculation of the Soret coefficient. On the other hand, the coefficients sampled in a smaller simulation volume, containing 1200 molecules, do require finite size corrections. Figure 8 shows the Soret coefficient of the studied Lennard-Jones mixtures for different system sizes (i.e., 1200 and 8000 molecules). As can be seen, there is a very good agreement be- Figure 7. a) Thermodynamic factor of the mixture B2 as a function of the inverse edge length of the simulation volume l −1 , including a linear fit (dashed line) and the resulting extrapolated value in the thermodynamic limit (ochre cross). b) Thermodynamic factor sampled for systems with 1200 (solid circles) and 8000 (solid squares) molecules, respectively, and their corrected values (empty symbols) following Krüger et al. [73] as functions of the calculated Soret coefficient. The ochre cross represents the linearly extrapolated thermodynamic factor of mixture B2, as obtained from (a). The open symbols represent coefficients calculated with the corrected mass-mass phenomenological coefficients [81] L 11 and the corrected thermodynamic factor [73] Γ. The ochre symbols represent data calculated with 1200 molecules and a cutoff radius of l∕2. Figure 9. Range of the calculated Soret coefficient when the magnitude of the mass-mass phenomenological coefficient (ochre) changes by ± 10% and when the partial molar enthalpy difference (h 1 ∕M 1 − h 2 ∕M 2 ) (blue) changes by ± 10%. The original Soret coefficient data with uncertainties (solid squares) were added for comparison.
tween the Soret coefficient based on the simulations with 1200 molecules and those obtained from the simulations with 8000 molecules for the mixtures B and C. The peculiarly strong differences found for mixture A could be traced back to the use of a relatively short cutoff in the simulations with 1200 molecules. When the cutoff radius was changed to l∕2, the agreement with the 8000 molecules simulation was significantly improved. A strong dependence of the cross-correlated phenomenological coefficient L E1 on the cutoff radius was found especially for mixture A1. Changes of other transport properties due to the cutoff radius (e.g., diffusion coefficients, shear viscosity, and thermal conductivity) were found to be within their statistical uncertainties. Negligible changes were also found for the thermodynamic equilibrium properties. Solely, the average pressure was 15% lower for the simulations with a smaller cutoff radius.

Sensitivity
Two factors were identified to highly affect the Soret coefficient value when predicted from EMD simulations: the mass-mass phenomenological coefficient and the difference between the partial molar enthalpies of both components (h 1 ∕M 1 − h 2 ∕M 2 ). For the majority of considered mixtures, a change of a few percent of either of these quantities leads to a large change of the Soret coefficient. For example, Figure 9 shows the resulting changes of the Soret coefficient when the mass-mass phenomenological coefficient L 11 and the enthalpy difference (h 1 ∕M 1 − h 2 ∕M 2 ) are varied by ±10%. The strong variation of the Soret coefficient is related to L Q1 . As can be seen in Figure 10, the contribution of the enthalpy on the heat-mass phenomenological coefficient is large for most of the studied mixtures. Furthermore, in cases where terms of Equation (20) have a very similar magnitude a small change of the second term may even lead to a change of sign of the Soret coefficient.
Because of the strong sensitivity of the EMD simulations for the studied mixtures, the results obtained from the larger systems featuring 8000 molecules are regarded as the more ade- quate values for comparison with NEMD results. The numerical results are listed in Table S2, Supporting Information.

Other Properties
EMD simulations for the calculation of the Soret coefficient are challenging and have to be performed with carefully chosen parameters. However, all relevant quantities for gaining a microscopic understanding of the heat and mass transfer processes can be calculated directly with EMD: the Fick and the thermal diffusion coefficients as well as the thermal conductivity or the degree of coupling q = L Q1 ∕(L 11 L QQ ) 0.5 . These quantities are listed in Table S3, Supporting Information. Figure 11 shows the final data set with Soret coefficients of Lennard-Jones component 1 S T from NEMD and EMD simulations as functions of temperature for two different average pressures and three Lennard-Jones mixtures (A, B and C; cf. Table 1). None of the systems studied gives rise to a sign change in the Soret coefficient and thus to a switch in thermophoretic behavior when the temperature or pressure is altered. The temperature and pressure ranges investigated are small and the data sparse. The observation is yet important because the change from positive to negative thermodiffusion behavior is subject of many studies due to the technological significance of this phenomenon.

NEMD versus EMD: Rationalizing S T
The pressure impact on S T is consistent across the different systems; however, the temperature influence can be different. In all cases, an increase in pressure from 0.09 to 1.8 yields a decrease of the Soret coefficient of component 1: by 30%, 417%, and 53% for mixtures A, B, and C, respectively. By contrast, S T increases with rising temperature from 0.75 to 1 at p = 0.09 for mixture B by 50%, whereas it decreases by 44% and 6% for mixtures A and C, respectively. However, half of these relative change figures  [27] (small purple symbols) are also included as well as predictions on the basis of Reith's and Müller-Plathe's regression model (ochre solid line). [39] exhibit large uncertainties when relying on a 2 d S T threshold to achieve 95% confidence. For the temperature-induced S T changes, the uncertainties are 11%, 67%, and 37% and, for the pressure-induced S T changes, they are 15%, 622%, and 23% for mixtures A, B, and C, respectively. We used the S T uncertainties S T from the NEMD simulations to obtain the uncertainties d S T and applied conventional uncertainty propagation rules, which yields, for example, for the temperature-induced relative S T change where the factor ±1 depends on whether S T (T 1 ) ≷ 0. NEMD and EMD simulations usually yield consistent results because the 1 S T uncertainties always overlap-with exception of mixture B, state point 2. The statistical uncertainties of S T from EMD simulations are larger than those from NEMD simulations, which can be explained by the relatively small signal-to-noise ratio inherent to the cross-correlation functions. For mixtures A, all 1 S T uncertainties are small, whereas those of mixtures B and C might seem moderate or large. This, however, is only a plotting effect because the vertical axis range in Figure 11 is more narrow for mixture B and much more narrow for mixture C.

Simulations versus Theoretical Models: Efficiently Predicting S T
Several theoretical models [25,27,84,85] for predicting S T leverage readily accessible thermodynamic information. In particular, the model by Shukla and Firoozabadi [27] is successful for gaseous and liquid mixtures. [86] The Soret coefficient is given by [27,86] where u i and v i are the partial molar internal energy and the partial molar volume of component i, respectively. The componentspecific factor i , which is the cohesive energy of the liquid divided by the activation energy supplied for molecular motion, [27] ranges typically between three and five for liquids. [27,86] The predictions are included in Figure 11 using i = 3.5 (cf. ref. [86] and Section S2.6, Supporting Information). Since none of the state points chosen for the three considered Lennard-Jones mixtures is in the proximity of the critical point, the model should work well. [27] We performed a detailed uncertainty propagation analysis [87] for the Soret coefficient given by the Shukla-Firoozabadi model (Section S2.6, Supporting Information). This could explain some of the discrepancies observed between NEMD, EMD, and predictive model results. For mixtures A and C, the model predicts the correct sign and most of the temperature and pressure dependences of S T . The difference between NEMD simulation and model prediction relative to the model uncertainty is, on an average, however as large as 285% for these two mixtures. While uncertainties cannot be made fully responsible, they could contribute significantly to the deviations. In order to better understand the precise source of the model uncertainty, its relative contributions were calculated (Section S2. 6  ith and Müller-Plathe [39] might help shed light onto the reason for the prediction failure. The Reith-Müller-Plathe model consists of three additive quadratic functions, each of which was separately fitted to a series of NEMD-derived Soret coefficients. In each series, a single model parameter of component 1, m 1 , 1 , or 1 , was systematically varied so that the impact of one model parameter was encoded in one quadratic function. The resulting overall Soret coefficient, which was successfully validated on realistic systems such as isotopic argon-argon, argon-krypton, and xenon-methane, is given by [39] S RMP ) + 111.9 The applicability ranges for the different contributions are limited by originally fitting within m 1 ∕m 2 ≤ 8, 1 ∕ 2 ≤ 1.25, and 1 ∕ 2 ≤ 1.75. Moreover, the Soret coefficient is not given in Lennard-Jones units but in (10 −3 K −1 ). Hence, S RMP T has to be adjusted to the present unit system.
The regression model by Reith and Müller-Plathe can reliably predict the sign of the Soret coefficient. Furthermore, the magnitude of S RMP T is reasonable, considering the fact that this prediction does not incorporate any temperature or pressure/density dependence. Another drawback of this model is the assumption that there are no coupling effects among the Lennard-Jones mass, size, and energy parameters. However, it has been found that such effects play a non-negligible role for the prediction of thermodiffusion of equimolar Lennard-Jones mixtures. [88] The insight that was gained from these predictions is that mass and Lennard-Jones size parameter have opposing effects on the Soret coefficient. The lower mass of component 1 alone would result in negative thermodiffusion, whereas the smaller size parameter would trigger positive thermodiffusion. It thus depends on which effect is stronger and hence which type of model parameter ratio is larger. In the case of mixture B, the mass ratio ultimately dictates the negative sign. The parameters used in the model by Shukla and Firoozabadi [27] were originally determined for real systems, for which size and mass typically correlate. It is therefore possible that these parameters require a re-optimization if size and mass strongly diverge, as seems to hold for mixture B. Very recently, Hoang and Galliero [89] came to a similar conclusion on the lack of reliability of the Shukla-Firoozabadi model for the prediction of thermal diffusion factors for Lennard-Jones mixtures that are asymmetric in size and mass.
Finally, note that another possibility to describe the effects of mass, size, and energy parameter ratios on transport properties of Lennard-Jones mixtures is the van der Waals one-fluid model. [76,90] This model considers the mixture to be a hypothetical pseudo-compound that approximates its equilibrium and transport properties. [91] However, this approach has significant limitations when dealing with strongly asymmetric mixtures [56,91] so that is not applied to the present systems.

Conclusion
We used Langevin NEMD alongside EMD simulations to investigate the Soret coefficient of binary Lennard-Jones mixtures. The Langevin NEMD simulations served as benchmarks because EMD predictions based on the Green-Kubo formalism are challenging due to a weak signal-to-noise ratio of the cross-correlation functions, the strong system-size dependence of the mass-mass phenomenological coefficient, and the uncertainties related to the calculation of the thermodynamic factor. Once optimal settings for the EMD simulations were determined, the data-set from the EMD simulations was successfully used to rationalize the behavior of the Soret coefficient. Furthermore, the EMD data were leveraged to test the applicability of a lean theoretical model, [27] which allows for predictions of the Soret coefficient mainly based on thermodynamic data.
We could identify a limitation of the theoretical model by Shukla and Firoozabadi. [27] Importantly, a detailed uncertainty analysis, which has, to the best of our knowledge, not been performed for the model yet, and complementing insights from a regression model, [39] which leverages model parameters of the mixture chosen, helped us pinpointing the probable reason. The theoretical model is unable to account for opposing effects of mass and size on the Soret coefficient. A parameter that is mainly kinetic in nature plays here a major role in dictating the sign of the Soret coefficient and thus in the applicability of the analytical model. More elaborate models such as those by Artola et al. [92] could be used instead, but this model requires, for example, more transport data.
Extending the combination of NEMD and EMD simulations to more complex systems is expected to provide insight into practical applications of thermodiffusion. The NEMD method was successfully applied to study aqueous alkali halide solutions [93] as well as to ternary [94,95] and quaternary [94] liquid mixtures of organic molecules. For both methods, the sampling of species with low concentration poses a challenge in these applications.

Supporting Information
Supporting Information is available from the Wiley Online Library or from the author.