## Introduction

Over the past decade, there has been increasing awareness of the importance of unfolded states of proteins *in vivo*.1–3 These partially or fully unfolded states include the nascent forms of proteins that fold to stable three-dimensional structures, states that arise from transient unfolding, and the functional forms of some proteins that have little or no tendency to fold under physiological conditions. This latter class of proteins, referred to as intrinsically disordered proteins (IDPs), has attracted particular attention, as they appear to contradict the traditional paradigm of protein structure–function relationships.4–8

One of the defining features of unfolded proteins as a class is the scaling of the average dimensions of the molecules with chain length. Hydrodynamic measurements by Tanford in the 1960s demonstrated a simple power-law relationship between the overall sizes of denaturant-unfolded proteins and their chain lengths.9 If the average size is expressed as the radius of gyration (*R*_{g}), the relationship has the form:

$$R_{g} = R_{0} N^{\nu} \tag{1}$$

where *N* is the number of amino-acid residues, *R*_{0} is a constant, and ν is a scaling exponent. Relationships of this form are characteristic of disordered polymers, and ν is commonly referred to as the Flory exponent.10, 11 In the intervening decades, this relationship has been confirmed and extended for a large number of proteins, using a variety of experimental techniques.12 Small-angle X-ray scattering (SAXS) has proven particularly useful, as this method yields a more direct measurement of the radius of gyration than do hydrodynamic methods. For proteins unfolded in denaturants, such as guanidinium chloride (GuHCl) and urea, current estimates place ν at about 0.6, very close to the value predicted by polymer theory for well-solvated random coils when excluded-volume effects are taken into account.13
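As a minimal sketch of how ν and *R*_{0} can be estimated from a set of (*N*, *R*_{g}) pairs, the power law of Eq. (1) can be linearized as ln *R*_{g} = ln *R*_{0} + ν ln *N* and fitted by least squares. The function name `fit_power_law` is hypothetical, and the input values are synthetic, generated from Eq. (1) with illustrative parameters (*R*_{0} = 2.0 Å, ν = 0.6) rather than taken from any measurement.

```python
import math


def fit_power_law(lengths, radii):
    """Fit ln(R_g) = ln(R_0) + nu * ln(N) by least squares; return (R_0, nu)."""
    xs = [math.log(n) for n in lengths]
    ys = [math.log(r) for r in radii]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    nu = sxy / sxx
    r0 = math.exp(mean_y - nu * mean_x)
    return r0, nu


# Synthetic, noise-free illustration only: generated from Eq. (1) with
# assumed parameters R_0 = 2.0 Å and nu = 0.6 (not experimental data).
chain_lengths = [50, 100, 200, 400, 800]
rg_values = [2.0 * n ** 0.6 for n in chain_lengths]

r0_fit, nu_fit = fit_power_law(chain_lengths, rg_values)
print(f"R_0 = {r0_fit:.3f} Å, nu = {nu_fit:.3f}")
```

With real data, the scatter about the fitted line matters as much as the exponent itself, since a single protein lying off the line may reflect experimental conditions rather than a different scaling regime.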

At present, the relationship between average dimensions and chain length for intrinsically disordered proteins is less well defined than for denaturant-unfolded globular proteins. Relatively few SAXS studies of IDPs have been published,14–16 but the hydrodynamic radii (*R*_{h}) of more than 30 IDPs have been measured by either size-exclusion chromatography or NMR-based diffusion measurements.17, 18 For both denaturant-unfolded proteins and IDPs, *R*_{h} has been found to follow a power-law relationship with respect to chain length, similar to that observed for the radii of gyration for denatured proteins. Direct comparison of the two parameters is not straightforward, however, as they depend differently on molecular shape and the distribution of conformations in an ensemble.19, 20 For the IDPs, there is significantly more deviation from the correlation, and many of these proteins appear to be more compact than denaturant-unfolded proteins of the same length. On the other hand, Kohn *et al*.13 identified five IDPs for which the radius of gyration (determined from SAXS measurements) was larger than for comparably sized denaturant-unfolded proteins and excluded these proteins from the data set used to estimate ν. These observations have led to the view that there may be multiple subclasses of IDPs with different degrees of compaction.17, 18

The range of results observed for IDPs highlights a general difficulty in interpreting the average dimensions of individual unfolded proteins. If the measured value of *R*_{g} or *R*_{h} for a given protein is found to be consistent with the measurements of other proteins, it is tempting to conclude that the molecule is well described as a random coil. However, a discrepancy between observed and predicted dimensions may be due to differences in experimental methods or conditions, aggregation, or other experimental artifacts. In addition, it is now widely appreciated that the average dimensions of unfolded molecules can be quite insensitive to the presence of nonrandom local structure.21

One way to address these difficulties is to examine the internal scaling relationships of individual proteins, which can be measured without reference to other molecules. If the distance between two residues in a disordered chain is *R*_{i,j}, then the root-mean-square (RMS) average distance is predicted to follow the relationship:

$$\left\langle R_{i,j}^{2} \right\rangle^{1/2} \propto N_{i,j}^{\nu} \tag{2}$$

where ν is the same exponent as in Eq. (1), provided that the number of intervening residues, *N*_{i,j}, is sufficiently large. This relationship can be tested, and the value of ν estimated, by experimentally measuring distances between individual atom pairs within a protein, for instance by Förster resonance energy transfer (FRET).22, 23 Carrying out measurements of this type for a series of labeled proteins is a major undertaking, however, and the individual measurements may be overly sensitive to local structures or interactions with the fluorescent probes.
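The internal scaling relationship can be illustrated numerically with a deliberately simple model. The sketch below generates ideal, self-crossing random walks on a cubic lattice, computes the RMS distance between sites separated by a given number of steps, and fits the exponent on a log-log scale. This model is an assumption chosen for simplicity: it ignores excluded volume, so the expected exponent is 0.5 rather than the value of about 0.6 cited for denaturant-unfolded proteins, and it is not the simulation method used in this study. All function names are hypothetical.

```python
import math
import random

# The six unit steps available on a simple cubic lattice.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def random_walk(n_steps, rng):
    """Return the site positions of an ideal (self-crossing) lattice walk."""
    x = y = z = 0
    positions = [(x, y, z)]
    for _ in range(n_steps):
        dx, dy, dz = rng.choice(STEPS)
        x, y, z = x + dx, y + dy, z + dz
        positions.append((x, y, z))
    return positions


def rms_distance(chains, separation):
    """RMS distance between all site pairs (i, i + separation) in all chains."""
    total = 0.0
    count = 0
    for chain in chains:
        for i in range(len(chain) - separation):
            x1, y1, z1 = chain[i]
            x2, y2, z2 = chain[i + separation]
            total += (x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2
            count += 1
    return math.sqrt(total / count)


def fit_exponent(separations, distances):
    """Least-squares slope of ln(distance) versus ln(separation)."""
    xs = [math.log(s) for s in separations]
    ys = [math.log(d) for d in distances]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


rng = random.Random(0)  # fixed seed so that repeated runs agree
chains = [random_walk(200, rng) for _ in range(400)]
separations = [10, 20, 40, 80]
distances = [rms_distance(chains, s) for s in separations]
nu_est = fit_exponent(separations, distances)
print(f"estimated nu = {nu_est:.2f}")
```

Averaging over every pair with the same separation, rather than over a handful of labeled positions, is what makes the estimate stable here; an experiment based on a few probe pairs has no such averaging.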

An alternative approach to assessing the internal scaling exponent is based on small-angle scattering of X-rays (SAXS) or neutrons (SANS). In most published scattering studies of unfolded proteins, the data have been used to estimate only the average radius of gyration, usually from Guinier plots derived from the scattering at very small angles, but the scattering at larger angles contains valuable additional information. At very small angles, the intensity of scattering from a particle of any shape is predicted to follow a relationship of the form:

$$I(q) = K \exp\left(-\frac{q^{2} R_{g}^{2}}{3}\right) \tag{3}$$

where *K* is a constant, and *q* represents the scattering angle expressed as the scattering vector magnitude, *q* = (4π sin θ)/λ; θ is one-half of the scattering angle, and λ is the X-ray or neutron wavelength.24 This relationship is valid for *q* ≲ 1/*R*_{g}, depending somewhat on the shape of the particle. In the Guinier plot, ln(*I*) is plotted versus *q*^{2}, and the average radius of gyration is estimated from the slope of the linear region of the plot. For an ensemble of conformations, the radius of gyration derived from the Guinier plot represents an RMS average of the values for the individual molecules. At larger angles, scattering from many particle types is described by25

$$I(q) \propto q^{-D_{m}} \tag{4}$$

where *D*_{m} is a constant. This relationship gives rise to a linear region in a plot of log(*I*) versus log(*q*). The negative of the slope of this region, *D*_{m}, reflects the distribution of interatomic distances on different length scales and can be interpreted as a mass fractal dimension.26–29
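The two analyses described above, the Guinier fit at very small angles and the log-log fit at intermediate angles, both reduce to fitting a straight line. The following sketch applies each to synthetic, noise-free intensities: a Guinier-law curve with an assumed *R*_{g} of 37 Å sampled at *q* ≤ 1/*R*_{g}, and a separate power-law curve with an assumed *D*_{m} of 5/3. The function names, *q* ranges, and parameter values are illustrative assumptions; a measured profile would contain both regimes in one curve, along with noise and background that are ignored here.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); return (slope, intercept)."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def guinier_rg(q_values, intensities):
    """R_g from the slope of ln(I) versus q^2, where slope = -R_g^2 / 3."""
    slope, _ = linear_fit([q * q for q in q_values],
                          [math.log(i) for i in intensities])
    return math.sqrt(-3.0 * slope)


def fractal_dimension(q_values, intensities):
    """D_m as the negative of the slope of log(I) versus log(q)."""
    slope, _ = linear_fit([math.log10(q) for q in q_values],
                          [math.log10(i) for i in intensities])
    return -slope


# Synthetic low-angle data (assumed R_g = 37 Å), restricted to q <= 1/R_g.
rg_true = 37.0
low_q = [k / (20.0 * rg_true) for k in range(1, 21)]
low_i = [math.exp(-(q * rg_true) ** 2 / 3.0) for q in low_q]

# Synthetic intermediate-angle data (assumed D_m = 5/3), q in 1/Å.
d_m_true = 5.0 / 3.0
mid_q = [0.05 * 1.1 ** k for k in range(15)]
mid_i = [q ** (-d_m_true) for q in mid_q]

rg_fit = guinier_rg(low_q, low_i)
d_m_fit = fractal_dimension(mid_q, mid_i)
print(f"R_g = {rg_fit:.1f} Å, D_m = {d_m_fit:.2f}")
```

In practice the choice of fitting window is the delicate step in both cases: the Guinier fit must stay within the range where the approximation holds, and the power-law fit must be confined to the region that is actually linear on the log-log plot.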

The significance of the fractal dimension can be visualized by imagining a sphere enclosing a portion of an object of interest [Fig. 1(A)]. For fractal objects, that is, objects that display self-similarity over at least a limited range of scales, the mass, *m*, enclosed by the sphere increases with radius, *r*, according to a power-law relationship:

$$m \propto r^{D_{m}} \tag{5}$$

For a one-dimensional object (a line), the exponent is 1; for a plane, *D*_{m} = 2; and for a solid, *D*_{m} = 3. The values of *D*_{m} are not limited to integers, however, and many fractal objects have *D*_{m} less than the number of spatial dimensions they occupy. An object described by a self-crossing random walk has a fractal dimension of 2. More generally, the fractal dimension for a polymer is related to the Flory exponent, ν, according to *D*_{m} = 1/ν. (For globular particles, the exponent in the scattering relationship [Eq. (4)] is 4, rather than 3, because of contributions from the solvent interface.)
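The integer cases of the mass-radius relationship can be checked with a small counting exercise. In the sketch below, unit masses are placed on the points of a line, a square lattice, and a cubic lattice; the mass enclosed by a sphere of radius *r* centered on a lattice point is counted for several radii, and the exponent is taken as the slope of log(*m*) versus log(*r*). The lattices, radii, and function names are illustrative assumptions, and because the lattices are discrete the fitted exponents only approximate 1, 2, and 3.

```python
import itertools
import math


def enclosed_mass(dimension, radius):
    """Count unit-mass lattice points lying within `radius` of the origin."""
    reach = int(radius)
    limit = radius * radius
    grid = itertools.product(range(-reach, reach + 1), repeat=dimension)
    return sum(1 for point in grid if sum(c * c for c in point) <= limit)


def mass_exponent(dimension, radii):
    """Least-squares slope of ln(m) versus ln(r), an estimate of D_m."""
    xs = [math.log(r) for r in radii]
    ys = [math.log(enclosed_mass(dimension, r)) for r in radii]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


radii = [4, 8, 16, 32]
exponents = {d: mass_exponent(d, radii) for d in (1, 2, 3)}
for d, value in exponents.items():
    print(f"{d}-dimensional lattice: fitted exponent = {value:.2f}")
```

The same counting procedure applied to the sites of a random-walk chain would be expected to give a noninteger-capable exponent near 2, which is the sense in which *D*_{m} measures how densely a chain fills space.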

Because *D*_{m} can be directly estimated from the scattering profile, SAXS and SANS provide information about the internal scaling of interatomic distances, without reference to other proteins or specific models. Although the fractal dimension is not a measure of overall size, such as *R*_{g} and *R*_{h}, *D*_{m} is sensitive to favorable or unfavorable solvation of the chain, with larger values reflecting unfavorable interactions with solvent and more compact conformations.27, 28, 30–32

The ability of SAXS or SANS to distinguish between folded and unfolded proteins is illustrated in Figure 1 (B–D), in which different representations of scattering data for a folded and an unfolded protein of similar molecular weight are shown. In each panel, the solid curve is the calculated SAXS profile for a small globular protein, bovine ribonuclease A (RNase A, *M*_{r} = 13,700), whereas the dashed profile was calculated for a simulated ensemble of the IDP used in this study, the bacteriophage λ N protein (λN, *M*_{r} = 12,300) described below. In Panel B, the full scattering profiles are represented, showing the more rapid falloff of scattering intensity predicted for the unfolded protein. The Guinier representation is shown in Panel C, where the more negative slope for the unfolded protein reflects the larger average radius of gyration of the unfolded ensemble (37 Å) compared with that of the globular protein (15 Å). A log(*I*) versus log(*q*) representation is shown in Panel D, showing the difference in the exponent, *D*_{m}. Importantly, the slope of the linear region at intermediate values of *q* is independent of the size of the molecule and reflects only the internal scaling exponent.

In the present study, SAXS was used to determine both the radius of gyration and the fractal dimension of λN under a range of solution conditions. Previous studies have shown that λN is an IDP, displaying very little structure in the absence of other molecules. *In vivo*, λN functions as a transcriptional antitermination factor and interacts with a specific RNA sequence (the Box B segment) and multiple proteins in the transcription complex. On binding the Box B RNA, the amino-terminal segment of λN takes on an α-helical conformation, and another λN segment binds to a bacterial host protein (NusA) in an extended conformation. Relatively little is known about the conformational properties of the other regions of λN in the transcription complex, and it is possible that the disordered nature of the protein plays a role in facilitating structural changes in this dynamic complex. The SAXS data presented here, together with a computational simulation, confirm that the isolated protein is extensively unfolded, with the scaling exponent of a well-solvated polymer, and that it is remarkably insensitive to changes in solution conditions.