Dynamic Structural Changes and Thermodynamics in Phase Separation Processes of an Intrinsically Disordered–Ordered Protein Model

Abstract Elastin‐like proteins (ELPs) are biologically important proteins and models for intrinsically disordered proteins (IDPs) and dynamic structural transitions associated with coacervates and liquid–liquid phase transitions. However, the conformational status below and above coacervation temperature and its role in the phase separation process is still elusive. Employing matrix least‐squares global Boltzmann fitting of the circular dichroism spectra of the ELPs (VPGVG)20, (VPGVG)40, and (VPGVG)60, we found that coacervation occurs sharply when a certain number of repeat units has acquired β‐turn conformation (in our sequence setting a threshold of approx. 20 repeat units). The character of the differential scattering of the coacervate suspensions indicated that this fraction of β‐turn structure is still retained after polypeptide assembly. Such conformational thresholds may also have a role in other protein assembly processes with implications for the design of protein‐based smart materials.


Design and cloning of the ELP constructs
For the generation of (VPGVG)20, (VPGVG)40, and (VPGVG)60, the "one-vector-toolbox-platform" (OVTP) was applied as a modular cloning approach. OVTP was used for cloning of the highly repetitive sequences of the ELP constructs and their constituents, starting with the assembly of synthetic oligonucleotides. OVTP provides a basic plasmid backbone with special restriction sites where individual DNA blocks can be incorporated, replaced, or substituted, and the resulting plasmid can be used directly as an E. coli expression vector. DNA blocks for different ELPs were excised with SacI and EarI from a library of existing sequence blocks and assembled into the plasmid backbone as described in Huber et al. [12c] Correct assembly was confirmed by sequencing (GATC GmbH). The final plasmids used for protein expression were named pET28 NMBxL-His-V20, pET28 NMBxL-His-V40, and pET28 NMBxL-His-V60.

Polypeptide production and purification
E. coli BL21 (DE3) cells were used for the expression of the desired protein constructs. Bacteria were grown while shaking at 250 rpm in LB medium at 37 °C until they reached an OD600 of 0.7. At this point, expression of the ELP-coding gene was induced with 1 mM IPTG (final concentration). After shaking for 6-7 h at 20 °C and 180 rpm, cells were harvested by centrifugation at 4,000 g for 30 min. After resuspension of the bacteria in lysis buffer (50 mM Tris-HCl pH 8.0, 500 mM NaCl, 20 mM imidazole), 1 mM TCEP (final concentration) and 10 µg mL⁻¹ lysozyme per liter of culture volume were added, and the suspension was incubated on ice for 30 min. Cell lysis was induced by freezing the cells in liquid nitrogen twice, followed by sonication and centrifugation of the lysate at 10,000 g for 40 min. Finally, the supernatant was loaded onto a Ni-NTA column (Macherey-Nagel), washed, and eluted with 50 mM Tris pH 7.5, 500 mM NaCl, and 50-300 mM imidazole. For purity analysis and identification, a standard 10% Tris/glycine SDS-PAGE was applied (Figure S1).

Determination of the ELP concentration
After purification, ELPs were dialyzed against milliQ water using a 12-14 kDa MWCO cellulose mixed ester membrane (Roth). After dialysis, the ELP solution was shock frozen in liquid nitrogen and placed in a pre-cooled ALPHA 2-4 LSC freeze-dryer (Christ, Osterode, Germany). The ELP solution was freeze-dried at −20 °C to −80 °C and 10⁻⁶ mbar for 48 h. The polypeptide dry weight was used to determine the amount of the sample. After weighing, the lyophilized polypeptide sample was resuspended in 10 mM NaH2PO4/Na2HPO4 buffer containing 20 mM NaCl (pH 7.5) at a final concentration of 1 mg mL⁻¹ (stock solution for all further analyses).
Table S1. ELP properties calculated with Expasy [40] and the concentration used for CD measurements in 10 mM NaH2PO4/Na2HPO4, 20 mM NaCl, pH 7.5.

CD spectroscopy
CD and UV/visible absorbance spectra in the spectral range of 180-350 nm were recorded on a J-810 Spectropolarimeter (Jasco, Tokyo, Japan) equipped with a Model PFD-425S Peltier element (Jasco) at different temperatures between 10 and 80 °C. ELP solutions (9.7 µM) in deionized water were placed in a 1 mm quartz cuvette (Hellma, Müllheim, Germany). The spectra were measured using a scanning speed of 50 nm min⁻¹, a step size of 0.5 nm, a bandwidth of 1 nm, a response time of 1 s, and an accumulation of 3 scans. The spectra were background-corrected by subtracting water spectra recorded at the corresponding temperatures.

Matrix least-squares (MLS) global fitting
The MLS analysis was performed in MATLAB (MathWorks, Natick, MA, USA) in a similar way as described previously for the analysis of other dynamic processes. [20][21][22][23][41] In brief, a matrix A containing the temperature-dependent spectra was subjected to singular value decomposition (SVD), delivering three matrices U, S, and V according to the decomposition $A = U \times S \times V^{T}$, where the columns of U contain the orthonormal spectra, i.e., spectra containing the spectral contributions that are correlated throughout the whole data set, the columns of V contain the associated temperature-dependent coefficients, and S is a diagonal matrix with the singular values. [42] Assuming that the activation energy between the conformational states is low, [43] we neglected kinetic effects in spectral fitting. The open parameters ΔH and ΔS were thus obtained from non-linear least-squares fitting of a Boltzmann model (see Results and Discussion) to the coefficients in V. The fitted parameters delivered the partition between the two states for (VPGVG)20 and the partition between three states for (VPGVG)40 and (VPGVG)60, respectively, which allowed obtaining the associated amplitude spectra and pure species spectra after matrix inversion. While (VPGVG)20 was fitted using least-squares as the only convergence criterion, for the fitting of (VPGVG)40 and (VPGVG)60 an approach similar to the maximum a posteriori method was employed. Here, the similarity of the first two pure spectra, which are expected to correspond to disordered and β-turn structure, to the pure spectra obtained for (VPGVG)20 was included as a second convergence criterion (noise-free spectra from Gaussian fitting served as reference spectra).
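The decomposition step can be illustrated with a minimal Python/NumPy sketch. The original analysis was performed in MATLAB; the data below are synthetic stand-ins (two Gaussian bands mixed by a sigmoidal population), and all variable names are assumptions made for illustration only.

```python
import numpy as np

# Synthetic stand-in data (assumption): two Gaussian "pure" spectra mixed by a
# sigmoidal temperature-dependent population. These are NOT the measured spectra.
wavelengths = np.arange(185.0, 260.5, 0.5)           # nm, 0.5 nm step
temperatures = np.arange(10.0, 85.0, 5.0)            # deg C
spec_a = -np.exp(-((wavelengths - 198.0) / 8.0) ** 2)
spec_b = 0.5 * np.exp(-((wavelengths - 210.0) / 10.0) ** 2)
frac_b = 1.0 / (1.0 + np.exp(-(temperatures - 40.0) / 8.0))

# Columns of A are the spectra at each temperature (wavelengths x temperatures).
A = np.outer(spec_a, 1.0 - frac_b) + np.outer(spec_b, frac_b)

# Thin SVD: A = U @ diag(s) @ Vt, where Vt is the transpose of V.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T  # columns of V: temperature-dependent coefficients

print("leading singular values:", s[:4])
```

Because the synthetic matrix is built from exactly two components, only the first two singular values are significantly different from zero; for measured data, the number of significant components has to be judged against the noise level.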
As a measure for spectral similarity, we used cosine similarity (cosθ), which, independently of the relative size of the spectra, delivers similarity values between 1 (identical) and 0 (dissimilar):
$$\cos\theta = \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert\mathbf{a}\rVert\,\lVert\mathbf{b}\rVert},$$
where $\mathbf{a}$ and $\mathbf{b}$ are the two spectra being compared. In the minimization function, the similarity term and the least-squares term are multiplied by the regularization parameters λsim and λlsq, respectively. As temperature-dependent weighting led to different amounts of spectral information in (VPGVG)40 compared to (VPGVG)60, different regularization parameters were applied: λsim = $10^{4}$ and λlsq = $10^{5}$ for (VPGVG)40; λsim = $10^{5}$ and λlsq = $10^{5}$ for (VPGVG)60.
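A minimal Python sketch of the similarity measure and of a combined minimization function is given here for orientation. The cosine similarity follows its standard definition; the exact form of the similarity term is not specified in the text, so the penalty (1 − cosθ) and the function names `cosine_similarity` and `objective` are assumptions.

```python
import math

def cosine_similarity(a, b):
    """Standard cosine similarity of two spectra sampled on the same grid."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def objective(residuals, fitted_spectra, reference_spectra, lam_lsq, lam_sim):
    """Weighted sum of a least-squares term and a similarity penalty.

    Assumption: the similarity term is taken as (1 - cos(theta)) summed over
    the spectra that are compared with their references.
    """
    lsq = sum(r * r for r in residuals)
    sim_penalty = sum(
        1.0 - cosine_similarity(f, g)
        for f, g in zip(fitted_spectra, reference_spectra)
    )
    return lam_lsq * lsq + lam_sim * sim_penalty

# Scaling a spectrum does not change its cosine similarity to the original.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

With this form, the penalty vanishes when the fitted pure spectra have the same shape as the reference spectra, regardless of their amplitude, so that only the least-squares term drives the fit.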

Data weighting
As the quality of the data varied strongly with wavelength and temperature, the data in A were weighted by wavelength-dependent functions (for example, intensities at <185 nm were ignored) and temperature-dependent functions in the SVD and the subsequent fitting steps. Noise functions for (VPGVG)20, (VPGVG)40, and (VPGVG)60 were obtained from Fourier-filtering of each temperature-dependent CD spectrum with a 4 nm cut-off and taking the average over all spectra.
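One possible reading of the Fourier-filtering step is sketched below in Python/NumPy: each spectrum is low-pass filtered, and the remainder with periods shorter than the cut-off is taken as noise. The function name `noise_from_fourier_filter`, the sharp frequency cut, and the use of the absolute remainder are assumptions, not the documented procedure.

```python
import numpy as np

def noise_from_fourier_filter(spectrum, step_nm=0.5, cutoff_nm=4.0):
    """Return |spectrum - low-pass(spectrum)| for one spectrum.

    Components with periods shorter than `cutoff_nm` are treated as noise
    (assumption). The FFT treats the spectrum as periodic, so real spectra
    may need baseline handling at the edges.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    coeffs = np.fft.rfft(spectrum)
    freqs = np.fft.rfftfreq(len(spectrum), d=step_nm)   # cycles per nm
    low_pass = np.where(freqs <= 1.0 / cutoff_nm, coeffs, 0.0)
    smooth = np.fft.irfft(low_pass, n=len(spectrum))
    return np.abs(spectrum - smooth)

# Synthetic check: a slow band (32 nm period) plus a fast ripple (2 nm period).
x = np.arange(128) * 0.5                      # 64 nm window, 0.5 nm step
slow = np.sin(2.0 * np.pi * x / 32.0)
fast = 0.1 * np.sin(2.0 * np.pi * x / 2.0)
noise = noise_from_fourier_filter(slow + fast)

# Wavelength-dependent noise function: average over all spectra in a series.
spectra = np.vstack([slow + fast, 0.5 * slow + fast])
noise_function = np.mean([noise_from_fourier_filter(s) for s in spectra], axis=0)
```

Averaging the per-spectrum noise over all temperatures then gives a wavelength-dependent noise function that can serve as the basis for the wavelength weighting.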

To account for the temperature-dependent uncertainty of the CD data (for example, spectra recorded above ITT are considerably smaller in amplitude and therefore less reliable), the CD data were weighted by temperature-dependent uncertainty functions in singular value decomposition (SVD) and matrix least-squares (MLS) global fitting. Temperature-dependent noise levels were obtained from averaging the results from Fourier-filtering (see above) over all wavelengths. To obtain the uncertainty functions, the temperature-dependent noise levels were additionally weighted by a similarity function: in the case of (VPGVG)20, the relative deviation from the average absorbance spectrum; in the case of (VPGVG)40 and (VPGVG)60, the cosine similarity of each absorbance spectrum and the average absorbance spectrum of (VPGVG)20 (a spectrum considered as scattering-free).
While the first criterion weights the most reliable data points, the second favors a fitting with a structurally meaningful result.

Fitting model and back calculation of the pure spectra
Each temperature-dependent CD spectrum is modeled as the Boltzmann-weighted sum of M amplitude spectra (a0, a1,…aM) for M considered equilibrium species:
$$A(\lambda, T) = \sum_{j} p_j(T)\, a_j(\lambda)$$
Here, $p_j(T)$ is the Boltzmann distribution as given in equation (1) in the main text. The amplitude spectra are obtained from: [42]
$$D = A \times P^{T+},$$
where D is a matrix whose columns correspond to the amplitude spectra a0, a1,…aM, A is the matrix containing the temperature-dependent CD spectra, and $P^{T+}$ is the pseudoinverse of the transpose of matrix P. The columns of P contain the Boltzmann distributions p0, p1,…pM, each calculated with the corresponding ΔH and ΔS from non-linear least-squares fitting of the Boltzmann model to the temperature-dependent SVD coefficients.
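The back calculation can be sketched in Python/NumPy as follows. Equation (1) of the main text is not reproduced here, so the population formula below (Boltzmann weights from ΔG = ΔH − TΔS relative to a reference state 0) is an assumed standard form, and the thermodynamic values and spectra are hypothetical, not fitted results.

```python
import numpy as np

R = 8.314462618  # gas constant, J mol^-1 K^-1

def boltzmann_populations(T_kelvin, dH, dS):
    """Populations p_j(T) for states with dH[j], dS[j] relative to state 0.

    Assumed form: p_j = exp(-dG_j / RT) / sum_k exp(-dG_k / RT),
    with dG_j = dH_j - T * dS_j and dH[0] = dS[0] = 0.
    """
    T = np.asarray(T_kelvin, dtype=float)[:, None]
    dG = np.asarray(dH, dtype=float)[None, :] - T * np.asarray(dS, dtype=float)[None, :]
    weights = np.exp(-dG / (R * T))
    return weights / weights.sum(axis=1, keepdims=True)   # temperatures x states

# Hypothetical two-state parameters with a transition midpoint at 313.15 K.
T = np.arange(283.15, 354.0, 5.0)
dH = [0.0, 80e3]                 # J mol^-1 (illustrative)
dS = [0.0, 80e3 / 313.15]        # J mol^-1 K^-1 (illustrative)
P = boltzmann_populations(T, dH, dS)

# Synthetic amplitude spectra (columns of D_true) and the modeled data A = D P^T.
wavelengths = np.arange(185.0, 260.5, 0.5)
D_true = np.column_stack([
    -np.exp(-((wavelengths - 198.0) / 8.0) ** 2),
    0.5 * np.exp(-((wavelengths - 210.0) / 10.0) ** 2),
])
A = D_true @ P.T

# Back calculation: D = A x pseudoinverse(P^T).
D = A @ np.linalg.pinv(P.T)
```

For noise-free synthetic data, D reproduces the input spectra exactly; with measured data, the pseudoinverse yields the least-squares estimate of the amplitude spectra for the given populations.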

Simulation and fitting of differential scattering spectra
For the fitting of putative contributions from differential scattering, we simulated differential scattering spectra from the pure species spectra of disordered and β-turn structure obtained from MLS global fitting of (VPGVG)40 and (VPGVG)60 and a representative absorbance spectrum. The contribution from differential scattering Δs to the CD spectrum of a suspension as derived from statistical fluctuation theory is given by: [44] ∆ ( ) ∝ 2 ( 0 ( )) Here, V is the volume element where the scattering occurs, n0(λ) is the refractive index spectrum of the solvent, λ is the wavelength, c̄ is the average concentration in the volume element, ∂n(λ)/∂c is the gradient of the refractive index n of the solution with respect to the concentration c of the scatterer, and ∂Δn(λ)/∂c is the difference refractive index gradient for left minus right circularly polarized light. The scattering intensity also depends on ⟨(∂c)²⟩, the mean quadratic fluctuation in a volume V. It is proportional to c̄ and (1−q), where q is the probability of a light beam to meet a particle in a volume element V.
The wavelength-dependent refractive index components ∂n(λ)/∂c and ∂Δn(λ)/∂c (in molar absorptivity units) were calculated by Kramers-Kronig transform (KKT) of the Gauss-fitted absorptive spectra using the KKT tool by Lucarini et al. in MATLAB. [45] The fitting of pure CD spectra and the differential scattering spectra to the 80 °C spectra of (VPGVG)40 and (VPGVG)60 was also performed in MATLAB.

Dynamic light scattering (DLS)
Light scattering of solutions of (VPGVG)20, (VPGVG)40, and (VPGVG)60 was measured at 20 °C on a Zetasizer ZS instrument (Malvern Panalytical, Malvern, UK) with the measurement position set at 1.05 mm and a scattering angle of 173°. To obtain optimal conditions for the determination of particle size distributions, the laser transmission was set to automatic attenuation.

Confidence of matrix least-squares (MLS) global fitting
The fitting residuals r(λi,Tj) as shown in Figure S5, Figure S6, and Figure S7 represent the difference between the raw data (non-weighted) and the fitting model given in equation (S1).
The R-values given in the text are calculated by: The R-values together with cosine similarity (cosθ) from the comparison of the pure species spectra obtained for disordered and β-turn structure from the fitting of (VPGVG)40 and (VPGVG)60, respectively, to the disordered and β-turn structure spectra obtained for (VPGVG)20 are summarized in Table S2.
Figure S5. Fitting residuals for (VPGVG)20. As no residual spectrum contains spectral features different from noise, the temperature-dependent process is perfectly described by the model.
Figure S6. Fitting residuals for (VPGVG)40. As expected, minor non-random features appear in the residuals above ITT (70 and 80 °C).
Figure S7. Fitting residuals for (VPGVG)60. Due to the dramatic decrease of signal intensity above ITT, residuals calculated for spectra measured at 45, 50, 60, 70 and 80 °C clearly show non-random features.