FRET Dyes Significantly Affect SAXS Intensities of Proteins

Structural analyses in biophysics aim at revealing a relationship between a molecule’s dynamic structure and its physiological function. Förster resonance energy transfer (FRET) and small-angle X-ray scattering (SAXS) are complementary experimental approaches to this. Their concomitant application in combined studies has recently opened a lively debate on how to interpret FRET measurements in the light of SAXS data with the popular example of the radius of gyration, commonly derived from both FRET and SAXS. There still is a lack of understanding in how to mutually relate and interpret quantities equally obtained from FRET or SAXS, and to what extent FRET dyes affect SAXS intensities in combined applications. In the present work, we examine the interplay of FRET and SAXS from a computational simulation perspective. Molecular simulations are a valuable complement to experimental approaches and supply instructive information on dynamics. As FRET depends not only on the mutual separation but also on the relative orientations, the dynamics, and therefore also the shapes of the dyes, we utilize a novel method for simulating FRET-dye-labeled proteins to investigate these aspects in atomic detail. We perform structure-based simulations of four different proteins with and without dyes in both folded and unfolded conformations. In-silico derived radii of gyration are different with and without dyes and depend on the chosen dye pair. The dyes apparently influence the dynamics of unfolded systems. We find that FRET dyes attached to a protein have a significant impact on theoretical SAXS intensities calculated from simulated structures, especially for small proteins. Radii of gyration from FRET and SAXS deviate systematically, which points to further underlying mechanisms beyond prevalent explanation approaches.


Introduction
In the past decades, an enormous variety of protein structures has been accumulated experimentally by employing sophisticated high-resolution techniques such as X-ray crystallography or nuclear magnetic resonance spectroscopy (NMR). [1] With cellular function, however, being dictated by the interplay between static structures and dynamic conformational changes, alternative methods have been catching up so as to elucidate the dynamic nature of the structure-function paradigm. Förster resonance energy transfer (FRET) and small-angle X-ray scattering (SAXS) are particularly popular approaches to this and complementary to the aforementioned methods. SAXS can be used to study average structures of various systems and enables even time-resolved analyses of conformational transitions in direct response to altered external conditions. [2] A solution of biomolecules is exposed to X-rays and the integrated scattered intensity is recorded in the small-angle regime, which contains information on structural features of the solute molecules. FRET provides access to time-resolved distance information on, e.g., folding dynamics, [3] intermediate structures, [4,5] and function-related conformational transitions. [6] After labeling specific molecular sites with fluorescent dyes, the distance-dependent energy transfer efficiency between them is measured.
Both FRET and SAXS are widely applied to the analysis of unfolded and intrinsically disordered proteins (IDPs). [7,8] The characteristics of such systems are of great interest because of their relevance to folding and the physiological prevalence of partly and entirely unstructured proteins: despite lacking a definite structure, IDPs fulfill important functional roles. Polymer physics is applied to understand the dynamics of unstructured proteins with their high conformational diversity and to relate their properties to folding and function, and the validity of such approaches has been studied extensively in the context of FRET and SAXS. [8][9][10][11] More recently, there has been an ongoing discussion on how to interpret FRET measurements in the light of SAXS data, especially for IDPs and unfolded ensembles. A popular structural quantity derived from both FRET and SAXS is the radius of gyration $R_g$, a measure of overall molecular size. Important questions are how to relate and interpret values of $R_g$ obtained by either method, and to what extent FRET dyes influence SAXS intensities in concomitant applications. Recent studies find that FRET implies IDPs to be compacted in water compared with high denaturant concentrations, whereas this compaction could not be validated by SAXS; this is known as the SAXS-FRET controversy. [12][13][14] For globular proteins, theory and simulation predict the dimensions of unfolded conformations to decrease as the denaturant concentration is lowered. Whereas interpretation of single-molecule FRET data supports this prediction, SAXS data point to the opposite. [15] Based on theoretical considerations, simulations, and new experimental data, Thirumalai et al. found that sizes of unfolded states of globular proteins have to decrease as the denaturant concentration goes down, and stated compaction of unfolded proteins to be universal. [15] In this context, water's critical role as a solvent further comes to the fore. [9,15,16] These findings are in accordance with results by Reddy et al., who studied the SAXS-FRET controversy in coarse-grained simulations including denaturant using the example of Ubiquitin. [11] A possible explanation for these seemingly contradictory observations is a decoupling of size and shape fluctuations, leading to the conclusion that FRET and SAXS do not measure the same quantity but are complementary approaches. [17] Fuertes et al. hypothesize that proteins are subject to a sequence-specific decoupling of the end-to-end distance $R_e$ measured by FRET and the radius of gyration $R_g$ deduced from SAXS, and that, as heteropolymers, proteins may exhibit diverse $R_g$-$R_e$ relationships. [18] Other studies assume the analysis methods to be the primary source of the apparent discrepancies. [19] Based on combined FRET and SAXS studies of unfolded proteins and IDPs, Borgia et al. suggest SAXS measurements to be basically model-free, whereas interpretation of FRET data always relies on a model such as a Gaussian or excluded-volume chain to relate $R_g$ and $R_e$. [20] Zheng et al. performed explicit-solvent MD simulations of a 79-residue IDP, revealing potential discrepancies between FRET and SAXS for this particular system. [21] It remains unclear, however, whether and to what extent FRET dyes influence SAXS measurements, and how distinct calculation methods for $R_g$ differ in their results. Molecular simulations are well suited to clarify these issues. They can be applied to study the influence of FRET dyes on scattering patterns and give access to all variants of $R_g$.
Here, we illuminate the interplay of combined FRET and SAXS from a computational simulation point of view. Molecular simulations are a valuable complement to experiments and, depending on their complexity, provide insightful information down to the atomistic dynamics of a system. FRET does not directly access quantitative information about molecular distances, but measures the energy transfer efficiency between the dyes. This efficiency depends not only on the separation distance of the dyes but also on their relative orientation and dynamics, which are best observed in molecular simulations. We consider a novel method for simulating FRET-dye-labeled proteins using native structure-based models (SBMs) on the atomistic level. [22,23] Based on energy landscape theory and the principle of minimal frustration, [24][25][26][27] SBMs probe dynamics arising from the system's native geometry. [28] In this way, force field complexity is drastically decreased without loss of substantial information on the system's characteristics, resulting in improved sampling and high computational efficiency. In particular, such models enable thorough sampling of large conformational ensembles such as intrinsically disordered or unfolded systems. Using the simulation protocol by Reinartz et al., [23] we calculate theoretical SAXS curves from molecular simulations of four different proteins with and without dyes. By comparing these intensities, we investigate the influence of FRET dyes on SAXS scattering curves for both folded and unfolded ensembles. Furthermore, we derive and compare different variants of the radius of gyration as a particularly popular quantity accessible in both FRET and SAXS. In doing so, we hope to contribute to elucidating the relationship and interplay between the experimental methods of FRET and SAXS.

Förster Resonance Energy Transfer
Förster resonance energy transfer (FRET) [30] is a mechanism describing non-radiative energy transfer between two light-sensitive molecules. An electronically excited donor may transfer energy to an acceptor via non-radiative dipole-dipole coupling. The efficiency of this energy transfer depends on the inverse sixth power of the distance between donor and acceptor. FRET is consequently extremely sensitive to small distance changes in the nanometer range and is also referred to as a "spectroscopic ruler". [31] By labeling specific protein residues with suitable dyes as illustrated in Figure 1, different conformations become distinguishable and conformational changes can be observed directly through changes in spatial dye separation. Experimentally, the FRET efficiency $E$ is measured, which depends on the inter-dye distance $R_{DA}$ as [32]

$$E = \frac{1}{1 + \left(R_{DA}/R_0\right)^6}. \tag{1}$$

The Förster radius $R_0$ is the donor-acceptor distance at which $E$ equals 0.5. It depends on the relative orientations of donor and acceptor, represented by the dipole orientation factor $\kappa^2$, as $R_0^6 \propto \kappa^2$. [32] Rotational dye diffusion is usually assumed to be fast with respect to the lifetime of the excited state, yielding a constant value of $\kappa^2 = 2/3$ in the "isotropic averaging regime". [32] In contrast, dye molecules are modeled explicitly at atomistic resolution in the structure-based protocol for simulation of dye-labeled proteins by Reinartz et al. [23] applied here. Thus, $\kappa^2$ can be calculated directly from such simulations without further approximations.

Full Paper. Isr. J. Chem. 2020, 60, 725-734
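The efficiency relation and the orientation factor described above can be illustrated with a short sketch. It assumes the standard textbook definition of the orientation factor, $\kappa^2 = (\hat{\mu}_D\cdot\hat{\mu}_A - 3(\hat{\mu}_D\cdot\hat{r})(\hat{\mu}_A\cdot\hat{r}))^2$, with transition dipole directions $\hat{\mu}_D$, $\hat{\mu}_A$ and the donor-acceptor connection vector $\hat{r}$. All function names, and the isotropic reference radius `r0_iso`, are hypothetical and not taken from the simulation protocol of Ref. [23].

```python
import math


def fret_efficiency(r_da, r0):
    """Transfer efficiency E = 1 / (1 + (R_DA / R_0)^6)."""
    return 1.0 / (1.0 + (r_da / r0) ** 6)


def _unit(v):
    """Normalize a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def kappa_squared(mu_d, mu_a, r_vec):
    """Orientation factor from donor and acceptor transition dipoles and the
    donor-to-acceptor connection vector (3-vectors of arbitrary length)."""
    d, a, r = _unit(mu_d), _unit(mu_a), _unit(r_vec)
    kappa = _dot(d, a) - 3.0 * _dot(d, r) * _dot(a, r)
    return kappa * kappa


def rescaled_forster_radius(r0_iso, kappa2):
    """R_0 for a given kappa^2, using R_0^6 proportional to kappa^2.
    r0_iso is a (hypothetical) reference radius valid for kappa^2 = 2/3."""
    return r0_iso * (kappa2 / (2.0 / 3.0)) ** (1.0 / 6.0)
```

For a single simulation frame, one would evaluate `kappa_squared` from the dye coordinates, rescale the Förster radius accordingly, and insert the result into `fret_efficiency`. How the transition dipoles are defined on the dye atoms is left open here.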

Small-angle X-ray Scattering
Small-angle X-ray scattering (SAXS) is an efficient tool for low-resolution structural characterization of dissolved biomolecules. [2,33] A solution of proteins is exposed to X-rays with wavelength $\lambda$. The integrated intensity from elastic scattering is measured in the small-angle regime as a function of momentum transfer $q = 4\pi \sin\theta / \lambda$, where $2\theta$ is the scattering angle. SAXS records the scattering intensity averaged over the entire conformational ensemble and all possible orientations of the solute molecules. Ideally, this isotropic intensity distribution is proportional to the spatially averaged scattering from a single particle. The net solute scattering, in turn, is related to the electron density difference between solute and solvent.
The spherically averaged scattering intensity $I$ of a molecule modeled as a collection of elementary scatterers, e.g., atoms or amino acids, can be calculated via the Debye equation: [34]

$$I(q) = \sum_{i}\sum_{j} f_i(q)\, f_j(q)\, \frac{\sin(q r_{ij})}{q r_{ij}}, \tag{2}$$

where $r_{ij}$ is the distance between two scatterers $i$ and $j$, and $f_i$ and $f_j$ are the corresponding form factors. Different parts of such an intensity pattern provide information about different structural features. However, it is important to note that the signal-to-noise ratio of experimentally measured intensities decreases rapidly with increasing momentum transfer $q$.
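As a minimal sketch of the Debye double sum, the following function evaluates the intensity at a single value of $q$ for a set of point scatterers. It makes the simplifying assumption of constant, $q$-independent form factors, which is not what dedicated SAXS programs use. The function name is hypothetical.

```python
import math


def debye_intensity(coords, form_factors, q):
    """Debye sum I(q) = sum_i sum_j f_i f_j sin(q r_ij) / (q r_ij)
    for point scatterers with constant form factors."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        # Diagonal term (i == j): r_ij = 0 and sin(x)/x -> 1.
        total += form_factors[i] ** 2
        for j in range(i + 1, n):
            x = q * math.dist(coords[i], coords[j])
            sinc = 1.0 if x == 0.0 else math.sin(x) / x
            # Each off-diagonal pair appears twice in the double sum.
            total += 2.0 * form_factors[i] * form_factors[j] * sinc
    return total
```

The cost grows quadratically with the number of scatterers, which is one reason why methods based on multipole expansion are attractive for all-atom models.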
For small $q$, the intensity can be described by the Guinier approximation: [35]

$$I(q) \approx I(0)\, \exp\left(-\frac{q^2 R_g^2}{3}\right). \tag{3}$$

Accordingly, $R_g$ can be extracted from the slope of the curve in a Guinier plot. Note that the Guinier approximation is only valid for $q R_g < 1.3$ for globular proteins [2] and in an even smaller range for elongated structures.
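The Guinier analysis amounts to a straight-line fit of $\ln I(q)$ against $q^2$ with slope $-R_g^2/3$. The sketch below performs an unweighted least-squares fit over all points up to a user-chosen `q_max`. The function name and the fixed cutoff are assumptions for illustration; in practice the cutoff has to be checked against the validity range $q R_g < 1.3$ after the fit.

```python
import math


def guinier_fit(q, intensity, q_max):
    """Fit ln I(q) = ln I(0) - (R_g^2 / 3) q^2 for all points with q <= q_max.
    Returns (R_g, I0). Raises ValueError if the fitted slope is positive."""
    xs = [qi * qi for qi in q if qi <= q_max]
    ys = [math.log(ii) for qi, ii in zip(q, intensity) if qi <= q_max]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.sqrt(-3.0 * slope), math.exp(intercept)
```

Because the fit region depends on the unknown $R_g$, a practical workflow would repeat the fit with an adjusted `q_max` until the validity criterion is satisfied.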

Structure-based Models
Gō-type or structure-based models (SBMs) provide a minimalistic description of biomolecular dynamics arising from the native geometry. Giving access to biologically relevant timescales, computationally efficient SBMs provide rich information on the system's characteristics. Successful applications cover a wide range of protein dynamics such as folding pathways [36][37][38][39][40][41] and kinetics. [42] SBMs are also employed for structure prediction, [43][44][45][46] integrative structural modeling of experimental data from, e.g., SAXS [47] or cryo-EM, [48] and investigation of transition state ensembles. [49,50] Founded on energy landscape theory and the principle of minimal frustration, protein dynamics are modeled based on the assumption that native interactions are generally stabilizing, whereas non-native interactions are only included to preserve excluded volume. [24][25][26][27][28] The essential part lies in the so-called contact potential. Each native contact, defined by a pair interaction between atoms spatially close in the native state, is assigned an attractive potential, whereas a purely repulsive excluded-volume term is included for all atom pairs. As a result, an overall energetic drive toward the native structure outweighs kinetic traps that would originate from non-native interactions.
We use an all-atom SBM taking into account all heavy atoms of the protein [22] as implemented in eSBMTools. [38] With native bond lengths $r_0$, bond angles $\theta_0$, and proper and improper dihedral angles $\phi_0$ and $\chi_0$, the simplified potential reads: [51]

$$V_{SBM} = \sum_{\text{bonds}} K_b (r - r_0)^2 + \sum_{\text{angles}} K_a (\theta - \theta_0)^2 + \sum_{\text{impropers}} K_i (\chi - \chi_0)^2 + \sum_{\text{dihedrals}} K_d\, F_D(\phi - \phi_0) + \sum_{\text{contacts}} C_G\left(r_{ij}, r_{ij}^0\right) + \sum_{\text{non-contacts}} K_{nc} \left(\frac{\sigma_{nc}}{r_{ij}}\right)^{12}, \tag{4}$$

where $F_D$ denotes the periodic dihedral function. Numerical values of the energetic weights $K$, the excluded volume for Pauli repulsions, and the functional form of the Gaussian contact potential $C_G$ can be found in Supplementary Information S2.1 (see also Refs. [23] and [52]).

Figure 1. Tenth type III domain of fibronectin (¹⁰FNIII, PDB code: 1TTG [29]) with AF 546 (AF546, blue) and AF 647 (AF647, red) dyes attached at residues 11 and 86, respectively. The Cα atoms of these residues are shown as blue and red spheres, respectively. Inter-dye distance $R_{DA}$ and Cα distance $R_{C\alpha}$ are marked.

Simulation of Dye-labeled Proteins
To simulate protein systems with dye pairs attached, we apply a novel structure-based simulation protocol developed by Reinartz et al. [23] In this method, quantum-chemical calculations are initially carried out to obtain three-dimensional dye structures from available chemical structures. Subsequently, linkers are added to bind the dyes to the protein. The dyes are parametrized for inclusion into the SBM, where the only interaction considered is excluded-volume repulsion. [23] In a last step, they are attached preferably orthogonally to the protein surface. Simulations are run in GROMACS v4.5.4 [53] using the structure-based potential introduced in Eq. (4) and molecular dynamics parameters as described in Ref. [23] (see also Supplementary Information S2).

Proteins
As a first test system, we use the 94-residue tenth type III module of fibronectin (¹⁰FNIII, PDB code: 1TTG [29]) depicted in Figure 1. Fibronectin is a homodimeric glycoprotein of the extracellular matrix. It plays a major role in cell adhesion, growth, migration, and differentiation, and is important for wound healing and embryonic development. [54] Altered expression, degradation, and organization of this protein have been associated with several pathologies, including cancer and fibrosis. [55]
Chymotrypsin inhibitor 2 (CI-2, PDB code: 2CI2 [56]) is a widely studied and well-understood 83-residue serine proteinase inhibitor from barley seeds. It was among the first proteins to have its folding/unfolding transition state extensively characterized by the protein engineering method. [57,58] Its denatured state and folding were subsequently characterized by NMR and hydrogen exchange. [59][60][61]
We study the globular 66-residue cold shock protein from Thermotoga maritima (CspTm, PDB code: 1G6P [62]) as a third system. Upon rapid temperature decrease, many bacteria produce small cold shock proteins. During cold shock, the efficiency of transcription and translation is reduced due to stabilization of nucleic acid secondary structure. Cold shock proteins are thought to counteract this as nucleic acid chaperones by preventing the formation of messenger RNA secondary structure at low temperature.
Cytolysin A (ClyA) of Escherichia coli is a pore-forming hemolytic toxin. This protein exists as a monomer of 303 residues (PDB code: 1QOY [63] ) and undergoes a conformational change to the protomer before assembling into a dodecameric pore (PDB code: 2WCD). [64]

Dyes
We use two dye pairs from the Alexa Fluor (AF) family of fluorescent dyes, [65] which are frequently applied as cell and tissue labels in fluorescence microscopy. The excitation and emission spectra of the AF series cover the visible spectrum and extend into the infrared. Individual members are numbered according to their approximate excitation maxima (in nm). The first pair consists of the AF 488 dye with C₅ linker (AF488) and the AF 594 dye with C₅ linker (AF594), the second of the AF 546 dye with C₅ linker (AF546) and the AF 647 dye with C₂ linker (AF647).
Additionally, we use the Biotium dye CF680R (B680) for simulations with three dyes. [66] Figures 1 and 2 show examples of the studied systems. A detailed list and depictions of all composite systems can be found in Supplementary Information S1. For structures and parameters of the dyes, see Ref. [23].

Calculation of SAXS Profiles from Structural Models
From a computational simulation perspective, theoretical calculation of accurate scattering patterns from atomic positions is a key factor for successful analysis and interpretation of SAXS data. Existing methods are either implicit- or explicit-solvent. One drawback of the computationally more efficient and widely used implicit-solvent methods is their dependence on several non-trivial free parameters, the most prominent being the excess density of the solvation shell. Given experimental data, the latter can be determined by a least-squares fit of the forward-calculated curve, at the risk of overfitting. Otherwise, it is set to 10 % to 15 % of the bulk water electron density. [67] It is important to note that it may take different values for folded and unstructured proteins depending on their specific solvation properties. [68] We apply the popular implicit-solvent method CRYSOL, which uses multipole expansion to evaluate spherically averaged scattering patterns from biomolecular structures. [67] To simulate the primary hydration layer, the solvation shell is approximated by a border layer of 3 Å effective thickness and excess density $\delta\rho$ with respect to the average density of free bulk water, $\rho_0$ = 0.334 e Å⁻³. [67] According to Henriques et al., $\delta\rho$ can substantially influence SAXS curves forward-calculated from structural models, especially for unfolded proteins, and small variations in $\delta\rho$ can change computed radii of gyration by 5 % to 10 %. [68] They report that the CRYSOL default value of 0.03 e Å⁻³ yields suboptimal results and generally suggest lower solvation shell contrasts between 0.01 e Å⁻³ and 0.02 e Å⁻³. Whilst a value of 0.0125 e Å⁻³ is recommended for folded proteins, specifying a single density contrast is not valid for disordered proteins. [68] To assess the influence of $\delta\rho$ on $R_g$ for the systems studied here, we conduct a sensitivity analysis and compare derived values of $R_g$ for different values of $\delta\rho$ in the range of 0.00 e Å⁻³ to 0.03 e Å⁻³. Results can be found in Supplementary Information S6. As expected, different values slightly affect the SAXS-derived $R_g$, which generally increases with $\delta\rho$. With the exception of $\delta\rho$ = 0.00 e Å⁻³, which neglects the solvation shell completely, we find the overall trend discussed in Section 3 to be preserved among different values of $\delta\rho$.

Radius of Gyration
A popular structural feature derived from both FRET and SAXS is the radius of gyration $R_g$, a measure of a molecule's spatial extent. GROMACS calculates it as [53]

$$R_g = \sqrt{\frac{\sum_i m_i r_i^2}{\sum_i m_i}}, \tag{5}$$

where $r_i$ is the distance of atom $i$ to the molecular center of mass and $m_i$ is the atomic mass specified in a GROMACS parameter file.
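The mass-weighted definition above is straightforward to evaluate for a single structure. The following sketch is an independent illustration of the formula, not the GROMACS implementation; the function name is hypothetical.

```python
import math


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration: sqrt(sum_i m_i r_i^2 / sum_i m_i),
    with r_i measured from the center of mass."""
    total_mass = sum(masses)
    # Center of mass, one Cartesian component at a time.
    com = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    ]
    weighted_sq = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)
```

Applied to the same frame once with and once without the dye atoms in `coords` and `masses`, this yields the two reference variants compared in this work.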

Determination of Different R g Variants
To analyze different $R_g$ variants in the context of FRET and SAXS, we first calculate a "true" reference value $R_{g,gmx}$ from the molecular model of the protein without dyes using GROMACS (see Eq. (5)). Second, we consider the corresponding value $R_{g,gmx}^{+dyes}$ computed from the molecular model with dyes.
Analysis of the Guinier region in SAXS provides two additional values, $R_{g,saxs}$ and $R_{g,saxs}^{+dyes}$. Due to the occasionally narrow Guinier region, these may contain errors, in particular for large elongated systems such as the unfolded ClyA monomer and protomer (see also Supplementary Information S5).
For unfolded proteins, we calculate $R_g$ as done previously in experimental work. [20] The proteins are assumed to behave like excluded-volume chains [23,69] (see also Supplementary Information S3). To mimic dyes attached to the chain termini, we analyze truncated systems. Based on simulations of the respective full systems, we only consider atoms of residues between the dye positions and neglect the remaining residues in all calculations. We then extract the end-to-end Cα distance $R_e$ (corresponding to the Cα distance between dye-labeled residues) to calculate the apparent radius of gyration for excluded-volume chains. [20] FRET is often assumed to measure the distance between dye-labeled Cα atoms, thus ignoring contributions of the dyes' linkers. Modeling the linkers as chain extensions of a certain additional sequence length $L$, the inter-dye distance $R_{DA}$ can be rescaled to a corresponding Cα distance via the factor [20]

$$f = \left(\frac{N_{inter\text{-}dye}}{N_{inter\text{-}dye} + L}\right)^{\nu}$$

with the number of considered residues between dye-labeled sites $N_{inter\text{-}dye}$, scaling exponent $\nu$, and additional sequence length $L$ for the dye pairs from Ref. [23]. With this factor, we obtain the apparent radius of gyration $R_{g,R_{DA}}^{app}$ from the rescaled inter-dye distance $f R_{DA}$ in the same way as $R_{g,C\alpha}^{app}$ from $R_e$.
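The linker correction can be sketched as follows. The sketch assumes that distances in the chain scale as $N^{\nu}$, so that removing an extension of $L$ residues from a segment of $N_{inter\text{-}dye}$ residues rescales distances by $(N_{inter\text{-}dye}/(N_{inter\text{-}dye}+L))^{\nu}$. This functional form, the function names, and any numerical values passed in are assumptions of the illustration, not parameters from Refs. [20] or [23].

```python
def linker_rescaling_factor(n_inter_dye, extra_length, nu):
    """Scaling-law factor (N / (N + L))^nu, assuming distances scale as N^nu."""
    return (n_inter_dye / (n_inter_dye + extra_length)) ** nu


def rescale_inter_dye_distance(r_da, n_inter_dye, extra_length, nu):
    """Map an inter-dye distance onto the corresponding C-alpha distance
    under the assumed scaling law."""
    return linker_rescaling_factor(n_inter_dye, extra_length, nu) * r_da
```

With `extra_length = 0` the factor is exactly one, and it decreases monotonically as the assumed linker contribution grows, so the correction always shortens the measured inter-dye distance.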

Influence of FRET Dyes on SAXS Measurements
We first examine the direct impact of FRET dyes on SAXS measurements and derived quantities. Recall that experimental SAXS data always reflect an average over all possible solute conformations and orientations. To mimic this ensemble average, we compute a representative intensity curve for each system from 5000 structures spaced equidistantly over one simulation. Having calculated the individual intensity of each structure with CRYSOL, [67] we determine the mean and standard deviation of the resulting array of intensities to obtain an ensemble-averaged curve and to assess the agreement of SAXS curves from different conformations. We proceed in the same way for simulations of all folded and unfolded systems, each with and without dyes.
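The averaging step described above can be sketched in a few lines, assuming that the per-frame intensity curves have already been computed on a common $q$ grid by an external program. The function names are hypothetical, and the population standard deviation (division by the number of frames) is one possible convention.

```python
import math


def equidistant_indices(n_frames, n_samples):
    """Indices of n_samples frames spread evenly over n_frames
    (requires n_samples <= n_frames)."""
    step = n_frames / n_samples
    return [int(k * step) for k in range(n_samples)]


def ensemble_mean_std(curves):
    """Point-wise mean and population standard deviation of per-frame
    intensity curves that share one q grid."""
    n = len(curves)
    n_q = len(curves[0])
    mean = [sum(c[k] for c in curves) / n for k in range(n_q)]
    std = [
        math.sqrt(sum((c[k] - mean[k]) ** 2 for c in curves) / n)
        for k in range(n_q)
    ]
    return mean, std
```

The standard deviation obtained this way describes the spread of single-frame intensities at each $q$ point, not a statistical error of the mean.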
From these representative intensity curves, SAXS-derived radii of gyration are computed and compared to a "true" reference given by the average $R_{g,gmx}$ of each simulated ensemble, where single-frame values are calculated according to Eq. (5). To further investigate how the dyes influence the systems, we additionally study $R_{g,gmx}$ distributions, which are discussed in detail in Supplementary Information S4.
As shown in Figure 3a, the resulting intensity curves of ¹⁰FNIII with and without dyes exhibit considerable differences for the folded ensemble. Because dyes change both size and shape of a system, this is to be expected. Although rather small, a difference is still observable for the unfolded ensemble due to an increased chain size with dyes attached. Similar curve shapes indicate that the dyes have a minor but still visible influence in this case. For the larger ClyA monomer, the differences in Figure 3b are almost insignificant, in accordance with our expectations.
Figure 3. […] More details on the systems can be found in Supplementary Information S1. Average intensities (solid lines) versus momentum transfer $q$ along with each intensity-curve distribution's standard deviation over corresponding single-frame intensities directly calculated from individual simulation snapshots (shaded area). It is important to note that this standard deviation is to be interpreted as the intensity distribution width at a particular $q$ point rather than an actual "error" in the sense of statistical uncertainties or systematic deviations as they would occur in experimental data. In accordance with $R_g$ distributions (see Supplementary Information S4), the standard deviation may be considered a measure of conformational heterogeneity in the underlying simulated ensemble, which also shows in the fact that standard deviations for unfolded systems are consistently larger than those for folded systems. Curves are depicted for both systems in the folded states without (green) and with dyes (red) and in the unfolded states without (blue) and with dyes (orange).

Figure 4. (a) […] Figure 3 to the Guinier representation. $R_g$ errors resulting from the linear regression in the Guinier analysis are on the order of 10⁻² to 10⁻⁴ $R_g$ and can be found in Supplementary Information S5. (b) The dimensionless Kratky plot gives information about the protein's conformations. Both plots are shown for ¹⁰FNIII in folded states (green, red) and unfolded states (blue, orange) without and with dyes, respectively. More details on the systems can be found in Supplementary Information S1.

All derived plots for ¹⁰FNIII are shown in Figure 4. The Guinier approximation in Figure 4a is only valid in a certain region where $\ln(I(q)/I(0))$ versus $q^2$ can legitimately be approximated by a straight line in a linear fit. The slopes, from which the respective radii of gyration are extracted, differ between folded and unfolded states as well as between the systems with and without dyes. The Kratky plot in Figure 4b exhibits a distinct peak for the folded ensemble and a plateau for the unfolded ensemble, illustrating how this kind of analysis can be used to study molecular folding. In the folded case, the broader peak for the system with dyes suggests a less compact structure compared to the dye-free protein.
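The transformation to the dimensionless Kratky representation can be sketched as below, using the common convention of plotting $(q R_g)^2\, I(q)/I(0)$ against $q R_g$. The function name is hypothetical, and $R_g$ and $I(0)$ are assumed to be known from a preceding Guinier analysis. For an ideal Guinier-like compact particle, this curve peaks at $q R_g = \sqrt{3}$ with height $3/e \approx 1.10$, which serves as the usual reference for a folded globular protein.

```python
def dimensionless_kratky(q, intensity, r_g, i0):
    """Return (x, y) with x = q * R_g and y = (q * R_g)^2 * I(q) / I(0)."""
    x = [qi * r_g for qi in q]
    y = [xi * xi * ii / i0 for xi, ii in zip(x, intensity)]
    return x, y
```

Because both axes are normalized by $R_g$ and $I(0)$, curves of systems with different sizes, such as a protein with and without dyes, can be compared in terms of shape alone.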

Comparison of R g Variants
Ratios of all $R_g$ variants to the "true" $R_{g,gmx}$ are presented in Figure 5 for the folded systems. $R_{g,gmx}$ naturally depends on the protein only. With dyes attached, the systems appear larger, manifesting in a greater $R_{g,gmx}^{+dyes}$ with respect to $R_{g,gmx}$. As expected, the smaller the system, the more significant this effect is. SAXS-derived $R_g$ values show a similar shift, i.e., $R_{g,saxs}^{+dyes}$ exceeds $R_{g,saxs}$. An exception is the ClyA protomer with dyes at positions 56/252, where $R_{g,gmx}^{+dyes}$ is actually lower than $R_{g,gmx}$. This can be explained by a center-of-mass shift due to the dyes in favor of a reduced radius of gyration (see Figure 2). As evident from CspTm and the ClyA monomer and protomer, different dye positions affect $R_g$ only marginally. In contrast, dye types seem to have a more pronounced effect, as shown for CI-2. We find that $R_g$ variants derived from SAXS apparently overestimate $R_g$ for small systems, whereas they underestimate it for larger systems in some cases. For the smaller proteins ¹⁰FNIII, CI-2, and CspTm, SAXS-derived $R_g$ values are consistently larger than those calculated with the mass-weighted formula in Eq. (5). This holds for systems with and without dyes as well as for folded and unfolded conformations, in full accordance with expectations. This overestimation could be caused by the CRYSOL method for calculating SAXS profiles from structural models, which takes the hydration shell into account, or arise from neglecting hydrogen atoms in the molecular model.
The only exception to this typical behavior is ClyA in its elongated monomer and protomer configurations. Here, all values lie in a very narrow range, and SAXS-derived $R_g$ values are similar to or slightly smaller than the corresponding $R_{g,gmx}$. We assume this counter-intuitive behavior to be caused by a rather narrow Guinier region, which likely results in a greater error in the linear regression.
Analogous results for the unfolded systems are depicted in Figure 6. As apparent from CI-2 and CspTm, $R_{g,gmx}$ is affected by both dye types and positions here, suggesting a subtle but perceivable influence of the dyes on the chain dynamics. Just as for the folded systems, $R_{g,gmx}^{+dyes}$ is consistently larger than $R_{g,gmx}$. This effect is related to the dyes' distance in the protein sequence and can be illustrated using the examples of the ClyA monomer and protomer. With dyes attached to the termini affecting the occupied volume to a greater extent than dyes attached in the middle, the observed shift increases with the dye separation. The more peripheral the dye-labeling positions in a protein sequence, the more the dyes with their linkers increase the dimensions of a system as reflected by $R_g$, in particular for completely elongated unfolded conformations. For the smaller systems CI-2, CspTm, and ¹⁰FNIII, $R_{g,saxs}$ and $R_{g,saxs}^{+dyes}$ show the expected tendency just as for the folded case. For the larger ClyA monomer and protomer, we find the SAXS-derived values of $R_{g,saxs}$ and $R_{g,saxs}^{+dyes}$ to be almost identical to the respective references $R_{g,gmx}$ and $R_{g,gmx}^{+dyes}$.
Finally, we analyze $R_g$ variants obtained from end-to-end distances, presented in Figure 7. Here, we only consider the residues between the dye positions to mimic labeling at the termini. The ratios of $R_{g,gmx}^{+dyes}$ and $R_{g,saxs}^{+dyes}$ with respect to $R_{g,gmx}$ are in good agreement and both shifted to higher values as before (see Figure 6). $R_{g,saxs}$, $R_{g,C\alpha}^{app}$, and $R_{g,R_{DA}}^{app}$ are all very similar to $R_{g,gmx}$. $R_{g,R_{DA}}^{app}$ is consistently larger than $R_{g,saxs}$, pointing to a small systematic difference in the quantities accessible to FRET and SAXS. Note that investigating IDPs in varying denaturant concentrations as done experimentally [20] and in explicit-solvent MD simulations [21] is not yet possible within the structure-based simulation protocol. However, this […]

Figure 5. $R_g$ variants for the folded systems with respect to $R_{g,gmx}$ (green line) given at the bottom in Å. We study ¹⁰FNIII, CI-2 with two different dye pairs (AF546/AF647 and AF488/AF594), CspTm with AF488/AF594 at three different labeling positions, and ClyA monomer, protomer, and dodecamer with AF488/AF594/(B680) at different labeling sites. More details on the systems can be found in Supplementary Information S1. $R_g$ values calculated from atomic structures with dyes ($R_{g,gmx}^{+dyes}$, red) and those derived from SAXS curves without ($R_{g,saxs}$, blue) and with dyes ($R_{g,saxs}^{+dyes}$, orange) are depicted. $R_g$ errors derived from the Guinier linear regression are listed in Supplementary Information S5.

Figure 6. $R_g$ variants for the unfolded systems with respect to $R_{g,gmx}$ (green line) given at the bottom in Å. The systems studied are ¹⁰FNIII, CI-2 with two different dye pairs (AF546/AF647 and AF488/AF594), CspTm with AF488/AF594 at three different labeling positions, and ClyA monomer and protomer with AF488/AF594/(B680) at different labeling sites. More details on the systems can be found in Supplementary Information S1. $R_g$ values calculated from atomic structures with dyes ($R_{g,gmx}^{+dyes}$, red) and those derived from SAXS curves without ($R_{g,saxs}$, blue) and with dyes ($R_{g,saxs}^{+dyes}$, orange) are depicted. $R_g$ errors derived from the Guinier linear regression are listed in Supplementary Information S5.

Figure 7. $R_g$ variants for different truncated systems in the unfolded states with respect to $R_{g,gmx}$ (green line) given at the bottom in Å. We study ¹⁰FNIII, CI-2 with two different dye pairs (AF546/AF647 and AF488/AF594), CspTm with AF488/AF594 at three different labeling positions, and ClyA monomer and protomer with AF488/AF594 at different labeling sites. More details on the systems can be found in Supplementary Information S1. $R_g$ values calculated from atomic structures with dyes ($R_{g,gmx}^{+dyes}$, red), those derived from SAXS curves without ($R_{g,saxs}$, blue) and with dyes ($R_{g,saxs}^{+dyes}$, orange), and apparent values calculated from the Cα end-to-end distance ($R_{g,C\alpha}^{app}$, brown) and the inter-dye distance ($R_{g,R_{DA}}^{app}$, purple) are shown. $R_g$ errors derived from the Guinier linear regression are listed in Supplementary Information S5.

Conclusion
We find that FRET dyes attached to a protein significantly affect SAXS measurements on that system, as the dyes change both its size and shape. This effect is particularly pronounced for small proteins. For unfolded ensembles, the difference is small but observable, while it is almost insignificant for larger systems. Systems appear larger with dyes than without, manifesting in a larger radius of gyration. In line with our expectations, the smaller the protein, the more significant this effect is. Dye types also affect $R_g$. For unfolded ensembles, dye positions further affect the derived values, and our findings suggest a subtle but observable influence of FRET dyes on the chain dynamics. This means that, when performing both FRET and SAXS measurements on the same system, these effects have to be taken into account in the data analysis methods applied. We find that $R_g$ variants derived from SAXS apparently overestimate $R_g$ for small systems, whereas they slightly underestimate it for some of the larger systems. As expected, SAXS-derived variants shift to higher values for systems with dyes attached. All $R_g$ values derived by FRET and SAXS are in good agreement, consistent with prior work suggesting that the analysis methods are the primary source of the discrepancies observed. [19,20] However, we find the FRET-derived $R_g$ variant to be consistently larger than the SAXS-derived value, pointing to a small systematic difference in the quantities accessible to FRET and SAXS.