Nanomaterial Functionalization Modulates Hard Protein Corona Formation: Atomistic Simulations Applied to Graphitic Materials

The protein corona is an obstacle to exploiting exotic properties of nanomaterials in clinical and biotechnological settings. The atomic‐scale dynamic formation of the protein corona at the bio‐nano interface is impenetrable using conventional experimental techniques. Here, molecular dynamics simulations are used to study the effect of graphene‐oxide (GO) functionalization on apolipoprotein‐cIII (apo‐c3) adsorption. An analysis pipeline is developed, encompassing binding energy calculations to protein structure analyses employing uniform manifold approximation and projection (UMAP) dimensionality reduction and clustering. It is found that apo‐c3 is denatured by GO adsorption, driven by the large energetic contributions of electrostatic interactions; enthalpic contributions of such binding events outweigh the intraprotein bond enthalpy required to maintain the protein tertiary structure. Through denaturing and exposing buried hydrophobic residues, the protein backbone is stabilized by forming β‐bridges, which serve as binding motifs for protein–protein interactions that drive further protein aggregation on the nanomaterial surface. In contrast, adsorption on double‐clickable azide‐ and alkyne‐double functionalized GO (C2GO), apo‐c3 largely retains its tertiary structure. Binding with the nanomaterial surface is dominated by weaker van der Waals interactions that are dispersed over the protein surface, where charged protein residues are sterically hindered by azide functional groups. The apo‐c3 C‐terminus remains unchanged upon C2GO adsorption, conserving its lipid‐binding function.


Introduction
Graphene-oxide (GO) is a semi-ordered 2D material that shares many novel mechanical and electronic properties with graphene. It is utilized in applications including ion trap-nanovector, [20] where click reactions conjugate targeting moieties on azide and trimethylsilyl (TMS)-alkyne functional groups. [21] In vitro, both C2GO and GO show varying HC character when exposed to serum proteins, which subsequently impacts their biological identities. [15] An experimental qualityby-design screening platform was used to unpick the HC character and evaluate its relationship with biological fate, cellular uptake and cytotoxicity. Using this, proteins that reduced material dependent toxicities were identified to play a role in diminishing cytotoxicity, including apolipoprotein C-III (apo-c3). [15] Protein coronas are known to temporally evolve, leaving the total amount of protein constant but varying their composition according to binding affinity, known as the Vroman effect. [22] Upon the introduction of a nanomaterial to a biological medium, highly abundant proteins such as albumin, immunoglobulin G and fibrinogen bind to the nanoparticle surface in the early stages of the Vroman effect, followed by their replacement with high binding affinity proteins such as apolipoproteins and coagulation factors in the late stages of the Vroman effect. [22][23][24][25][26] The Vroman effect has previously been observed in computational and experimental studies of peptide, cellulose and fatty-acid binding on GO. [27] Previously, molecular dynamics (MD) has been used to study the interactions of GO with peptides to understand conformational transitions of amyloid-beta during adsorption, [28] the hydration pattern of an adsorbed toy-model alpha-helix [29] and verification of enzyme active-site deformation following adsorption. [30] To the best of our knowledge, no MD simulations have so far studied protein adsorption on accurate models of GO, and have instead randomly placed oxidized functional groups on the GO surface. Accurate modeling of GO should reflect the semiordered structure of GO; composed of inhomogeneous regions of oxidized and unoxidized domains, where amorphous alcohol and epoxy groups make up the oxidized regions. [31] Ab initio MD simulations of GO show that semi-ordered models of GO are the most stable structures in vacuum as well as in liquid water. [32] Furthermore, we have recently shown the importance of accurate functionalization and accounting for steric strain and edge functional groups in large scale MD simulations of GO, using generalized and bespoke electronic structure MD forcefield design. [33] In this work, we use MD simulations to study protein denaturing through adsorption on GO and C2GO, to understand HC conformation-activity relationships through binding free energy, contact map, protein structure and solvent exposure analyses. We apply clustering and dimensionality reduction techniques to decode the spatio-temporal MD data that is hard to interpret into distinct protein secondary structure conformations.

Binding on the Bio-Nano Interface
In both GO and C2GO systems, apo-c3 readily adsorbs to the substrate and reaches a stable conformation over the course of the adsorption trajectory. To investigate the binding of apo-c3 to the GO and C2GO interface, we compute contact maps and free energies of binding that underpin the changes in protein structure following adsorption. These analyses probe interfacial dynamics that may play a large role in the formation of a protein corona around a nanomaterial upon its introduction to a biological medium. To identify adsorption of apo-c3 to the graphitic sheets we calculated the minimum distance over time between any heavy (non-hydrogen) atom of the protein to any heavy atom of the sheet ( Figure S1, Supporting Information). We also compute the minimum distance of each residue to the graphitic sheets over time. These are combined to produce a heatmap of contact probability between apo-c3 and GO/C2GO ( Figure 1B). The binding free energy is calculated according to the molecular mechanics with Poisson-Boltzmann and surface area solvation (MM-PBSA) method, implemented using g_mmpbsa. [34] The average binding free energy components for the GO-apo-c3 and C2GO-apo-c3 systems are given in Table 1. The binding free energy components are decomposed to per-residue contributions ( Figure 1A). In this way we can acquire a better understanding of each amino-acid's contribution to the binding free energy of the protein-nanomaterial complex. Per-residue binding free energy contributions are color-coded by their interaction strength (blue to red) in the protein structure of the final configuration ( Figure 1C) for GO and C2GO, where each largely contributing amino acid residue has been illustrated and labeled explicitly.
The binding of apo-c3 to the graphitic substrate is driven by amino acids with the highest binding affinities from the beginning of the adsorption process. Apo-c3 residues 60-70 initiate binding with GO ( Figure S1, Supporting Information), most likely due to the high binding affinity of TRP65 to π-π stack with pristine graphene domains on the GO surface, as well as GLU63 binding with positively charged (tertiary alkyl, phenol and carboxylic acid) carbon atoms ( Figure 1A). From 100 ns apo-c3 binding to GO is driven by positively charged (HIS18, LYS21) and polar (THR20) residues ( Figure 1A) interacting with oxygen containing surface functional groups. Accordingly, the binding free energy of GO-apo-c3 has a higher net contribution of electrostatic interactions when compared with C2GO-apo-c3, contributing -208 kJ/mol to the total binding free energy (Table 1).
In contrast, binding to C2GO is driven by both of the extreme N-and C-terminal ends of apo-c3, which also have the highest binding affinity to the C2GO substrate according to the MM-PBSA binding free energy analysis. The N-terminus serves as the main binding domain with C2GO, driven by negatively charged (GLU4, ASP5) and polar (SER10, GLN13) residues interacting with a TMS group and positively charged (tertiary alkyl, phenol and carboxylic acid) surface and edge carbon atoms ( Figure S3, Supporting Information). Meanwhile, charged amino acid residues contribute a net positive change in binding free energy of apo-c3 to C2GO ( Figure 1A), stabilizing C2GO-apo-c3 binding and a conserved apo-c3 tertiary structure. Charged amino acids are the only residues in apo-c3 that sterically hinder neighboring amino acids from binding to the C2GO interface, and this is wholly achieved through azide functional groups, with the exception of ARG40 which is sterically hindered by a silanol group. However, the role of azide groups is not exclusively restricted to an agent for steric hindrance of amino acid side chains to the graphitic surface, as they drive the binding of hydrophobic amino acids in the extreme C-terminal end of apo-c3 ( Figure 1A,B). Unlike GO-apo-c3, C2GO-apo-c3 binding is dominated by van der Waals interactions, contributing -463.2 kJ mol -1 to the total binding free energy (Table 1).
In the case of GO-adsorbed apo-c3 binding is dominated by electrostatic interactions (Table 1) with a locus around two binding hotspots-LYS21, TRP65 ( Figure 1A)-which stabilizes unfolding of the protein. This is due to the enthalpic contribution of this binding outweighing the bond enthalpy of the intraprotein interactions maintaining the tertiary structure. Due to the changes to the apo-c3 native state tertiary structures induced by adsorption, apo-c3 has a significantly higher conformational entropy. The energetic stabilization of a denatured protein is attributed to the conformational entropy of amino acid side chains, which would be a barrier to recovering the native state tertiary structure. [35] In contrast, the enthalpic contribution in C2GO-adsorbed apo-c3 is more dispersed over the surface ( Figure 1A) and is on average dominated by weaker (van der Waals) interactions (Table 1), thus the intraprotein interactions maintaining the tertiary structure are not compromised by any energetic spikes caused by strong binding in any location on the complex interface.

Protein Structure
Changes to protein structure during adsorption-leading to protein denaturing or deformation-are evaluated using secondary structure, intramolecular hydrogen bonding and solvent-accessible surface area (SASA) analyses. Protein native contacts indicate changes in secondary structure due to adsorption, their temporal evolution is calculated using the define secondary structure of proteins (DSSP) algorithm [36] and protein intra-molecular hydrogen bonds.

DSSP and Hydrogen Bonds
To understand the dynamic progression of denaturing in GO-adsorbed apo-c3, DSSP and hydrogen bond analyses indicate the sequential loss of secondary structure and number of intramolecular hydrogen bonds respectively. The number of protein intramolecular hydrogen bonds over time ( Figure 2B) complement the DSSP results ( Figure 2A) at a higher resolution and are invariant to the DSSP algorithm criteria for defining strands, helices or coils. It shows that over time, unlike C2GO-adsorbed apo-c3, GO-adsorbed apo-c3 has a dynamic character, sporadically gaining or losing hydrogen bonds. These high spatiotemporal hydrogen bond frequencies stabilize intrachain contacts by recovering the loss of hydrogen bonds, indicating structural reorganization is taking place during adsorption.
Loss of secondary structure in C2GO-adsorbed apo-c3 is transient, with temporary loss of α-helix character spanning residues 30-40 ( Figure 2A). In contrast, GO-adsorbed apo-c3 α-helix character is lost irreversibly (Figure 2B), concomitantly with the formation of β-turns whose lifetime range from tens to hundreds of ns and persist throughout adsorption. A decrease of α-helices accompanied by an increase of β-strands elsewhere in the protein is a characteristic observed in experimental secondary-structure analysis of denatured protein libraries using vacuum-ultraviolet circular dichroism. [37] Within the initial 150 ns of adsorption, α-helices spanning residues 1-10, 30-33, and 40-50 transition to coils (Figure 2A). Subsequently at 200 ns, the C-terminus loses its α-helix character in residues 55-65, the lost hydrogen bonds ( Figure 2B) are recovered through forming β-turn motifs ( Figure 2A) and stabilizing the apo-c3 protein backbone ( Figure 2C) for the remainder of adsorption until 400 ns. β-turns remain stable in a solvated state, potentially serving as recognition motifs of protein-protein interaction, [38] until binding takes place with other proteins. Both L4-S7 and W54-T56 transitioned from an alpha-helix to type-I and type-II β-turns, respectively, whereas S48-K51 transitioned from a coil to a type II β-turn.

Clustering
UMAP dimensionality reduction is a useful approach to map high dimensional data of MD protein backbone coordinates trajectories to a lower dimensional protein configuration space. [39] This is due to the better performance of UMAP in preserving local and global structure relationships compared with conventional dimensionality reduction techniques such as principle component analysis or t-distributed stochastic neighbor embedding (t-SNE). A projection of the high dimensional spatiotemporal MD data into a low-dimensional space is used to identify distinct protein structures, bridging pathways and global relationships between distinct configurations. It is also a practical tool for visualizing what is otherwise a vast dataset.
The number of clusters in the protein configuration space reflects the number of ensembles the protein backbone adopts, namely the distinct structures of apo-c3 during the entire adsorption process. Using this and the visualization of protein structures corresponding to each cluster, we can see that apo-c3 has higher structural variance when adsorbing on GO than on C2GO, which have 13 and 10 clusters, respectively (Figure 3). GO-adsorbed apo-c3 clusters show how the protein sequentially undergoes a process of reorganization (Figure 3) driven by interchain hydrogen bonding interactions ( Figure 2B) due to the binding interactions with the GO surface ( Figure 1A).
The temporal evolution of apo-c3 adsorption follows the cluster numbers in the protein configuration space (Figure 3). The pathways between clusters are synonymous with the DSSP analysis, corresponding to transitions between configuration states where apo-c3 secondary structure either gains or loses α-helix, β-bridge, or coil character. In both cases of GO and C2GO adsorption, the reorganization process requires transitions to discrete intermediary states (clusters) before converging to a stable complex (Figure 3). The stable complex may or may not recover its secondary structure following these transitions, corresponding to whether it has or has not been denatured via the adsorption process. This is indeed the case in GO-adsorbed apo-c3, which has undergone drastic structural changes and retains some of its N-terminus ( Figure 3) but loses the C-terminus secondary structure. In contrast C2GO-adsorbed apo-c3 does recover most of its secondary structure, where different cluster pathways transition between the highly conserved native backbone (Figure 3), which is indicative of binding deformations instead of protein denaturing. Note that the images of the protein backbone annotating the clusters only approximate secondary structure character and are not as accurate as the stateof-the-art characterization of the secondary structure using DSSP analysis (Figure 2).

Solvent Exposure
As a result of structural changes induced by adsorption, the hydration of the apo-c3 amino acids change and contribute to potential protein aggregation in vivo. As the structure of apo-c3 changes during adsorption on GO, solvent exposure is increased significantly over multiple regions of the amino acid sequence (Figure 4A). Some residues will have a larger SASA after adsorption due to change in conformation. Some residues will have a smaller SASA after adsorption either due to a change in conformation leading to new protein-protein contacts or due to interaction of the residue with the nanomaterial in question. There is a correlation with increased solvent exposure in regions where apo-c3 forms β-turns, as analyzed by DSSP and illustrated in (Figure 2). The β-turns remain in this solvated state, potentially serving as recognition motifs for protein-protein interactions until binding to other proteins in vivo or in vitro. [38] As well as increased solvent exposure of the GO-adsorbed apo-c3 C-terminal helix (residues 47-65), the EVRPTSAVAA minimotif (residues 70-79) has a consistently higher solvent exposure than the native state of apo-c3 in solution ( Figure 4A,B). These exposed hydrophobic residues keep the adsorbed complex in a disordered state until it can coalesce with its protein/lipid environment. In contrast, the solvent exposure of apo-c3 adsorbed to C2GO is limited to isolated short sequences leaving the conformation of most of the C-terminal helix region (residues 47-65) unchanged ( Figure 4A). Previous work has found that the C-terminal region of apo-c3 is essential for mediating lipid binding, with the N-terminal region (residues 1-40) playing a limited secondary role. [40][41][42] Solvent exposure analysis of the C-terminal helix expresses the conservation of its lipid-binding function from a physio-chemical perspective. These results may in part delineate the experimentally observed preferential uptake of corona-coated C2GO to corona-coated GO by J774 cells. [15]

Conclusions
This work has dissected the protein structural changes induced by interactions of functional groups and protein residues on the bio-nano interface. Through adsorption on GO, apo-c3 shows denaturing of the secondary structure over large swathes of the protein sequence. Whereas C2GO-adsorbed apo-c3 has limited deformation, with most variance displayed in the N-terminus. The C-terminus of apo-c3 has previously been found to be responsible for lipid binding, with the N-terminus playing a limited secondary role. [40][41][42] These findings correlate with experimental evidence of the increased cellular uptake of C2GO, as compared to GO, hard protein corona. [15] Apo-c3 is found to denature upon adsorption to GO, according to solvent exposure, DSSP and conformational clustering analyses. Following adsorption on GO, apo-c3 forms 3 separate β-turns that serve as binding motifs for proteinprotein interactions. These can aid in subsequent aggregation of serum proteins to the corona, contributing to the corona identity in vivo, hence complicating the targetability of the nanomaterial-corona complex.
The functional groups at the binding interface between the nanomaterials and apo-c3 set off a series of dynamic events that result in large-scale secondary structure changes. Contact distances, heatmaps, and binding free energy calculations collectively describe the driving forces of the changes in protein structure following adsorption. The introduction of azide and TMS functional groups was sufficient to stop the denaturing of apo-c3 and the formation of β-turns that serve as proteinprotein interaction binding motifs. We find that contact with azide functional groups correlate strongly with contact to other surface functional groups and therefore work cooperatively to maximize the binding surface at the bio-nano interface. Consequently, the C2GO-adsorbed protein is stabilized and is not pushed to structural deformation as is the case in GO-adsorbed protein. The results from this analysis pipeline suggest the causes of protein aggregation and cellular uptake are mediated by the aforementioned changes in protein structure.
This work indicates that the dynamic and functional role of adsorbed proteins may be a more important probe to understanding the protein corona, rather than quantifying a static image of its constituent proteins. MD simulations are a valuable tool for such an investigation and our analysis pipeline serves as a transferable method for understanding the structure/function relationship of dynamic protein-nanomaterial adsorption.

Experimental Section
Nanomaterial Structure: A Python package [43] was modified to generate azide and trimethylsilyl (TMS)-alkyne functional group conjugated rectangular graphene-oxide flakes. [44] The package improved on the commonly applied protocol, of randomly placing oxidized functional groups, which was now known to be an incomplete model of grapheneoxide structure. [32,33] Instead, the two-phase nature of oxidized and unoxidized graphene domains observed in microscopy experiments [45][46][47][48] was recreated in accordance with the recent study of accurate large scale modeling of graphene oxide. [33] Molecular Dynamics: MD simulations were performed in GROMACS version 2020.1 on the ARCHER2 AMD EPYC Zen2 (Rome) 64 core CPUs at 2.2GHz or Nvidia V100 GPUs. The all-atom OPLS forcefield was used to simulate the classical simulations. A position restraint algorithm was applied to all nonsolvent atoms during equilibration. Protein X-ray diffraction structures for the apo-c3 structure (PDB code: 2JQ3) was used for all simulations. [49] The apo-c3 structure was placed at least 10 Å from the nanomaterial surface for each of the two major accessible surface areas of apo-c3, giving two replicas per nanomaterial. For each nanomaterial, the most energetically favorable orientation was determined via MM-PBSA and report on the analysis of these replicas in the Results section of this manuscript. The ≈130 Å cubic simulation box was fully solvated with TIP3P water molecules. The system net charge was neutralized by adding four sodium ions into the system. The system was relaxed energetically using steepest-descent energy minimiaation for 50 000 steps with an energetic step size of 0.01 kJ mol -1 . The minimization was terminated after the maximum energetic contribution was lower than a threshold of 1000.0 kJ mol -1 nm -1 . NVT and NPT equilibration was performed for 100 ps using two separate modified velocity-rescaling thermostat-with a stochastic term to ensure generating the canonical ensemble-coupling temperature to velocities for graphene and solvent molecules, [50] where a temperature of 300 K was maintained and 1 bar using the Parrinello-Rahman barostat. The Verlet cut-off scheme was employed to generate pair lists and the electrostatic interactions were calculated using the Particle-Mesh Ewald algorithm. Both electrostatic and van der Waals interactions were cut off beyond 1.2 nm. All bonds involving hydrogen atoms were constrained using the LINCS algorithm. Production simulations were run for 400 ns using a timestep of 1 fs, convergence of the nanomaterial-apo-c3 complex was evaluated using RMSD over time ( Figure S4, Supporting Information), which was supported by the secondary structure (DSSP) analysis (Figure 2), as well as the cluster configurations obtained via UMAP/ HDBSCAN (Figure 3). Analysis was performed using the MDAnalysis package [51][52][53] and its analysis modules. [54,55] To identify hydrogen bonds the hydrogen bond analysis tool [55] implemented in MDAnalysis was used. The MDTraj [56] implementation of DSSP was used. Analysis methods are outlined below, including contact maps, MM-PBSA binding energy calculations [34] , UMAP dimensionality reduction [57,58] and HDBSCAN clustering, [57] and SASA solvent exposure [56,59] .
Contact maps: Contact maps between apo-c3 and the graphitic sheets were calculated using a hard cutoff of 5.0 Å between any two heavy (nonhydrogen) atoms. Data over the final 50 ns were used for the contact maps, as during this period apo-c3 was stably adsorbed to both GO and C2GO. Contact maps were constructed between GO/C2GO and each specific residue of apo-c3 ( Figures S2 and S3, Supporting Information), as well as between GO/C2GO and each amino acid regardless of its position in the sequence ( Figure 1B).
PBSA Binding Energies: The last 10 ns of MD trajectories were post processed using MM-PBSA analysis implemented using g_mmpbsa, [34] to obtain a relative order of binding of apo-c3 to the different graphitic nanosheets. The energetic components of the binding free energy were the changes in the system potential energy in vacuo, the polar and nonpolar solvation energies. The potential energy accounts for bonded (bond, angle, and torsion) and non-bonded (van der Waals and electrostatic) energetic terms. Polar solvation energy is the electrostatic contribution to the solvation free energy and was estimated using the Poisson-Boltzmann equation. The non-electrostatic contribution to the solvation free energy accounts for forces between solute and solvent, which were calculated using the solvent accessible surface area (SASA). [34] UMAP Dimensionality Reduction: The atomic coordinates of apo-c3 were used to obtain an ensemble of distinct structural conformations when adsorbed to the graphitic sheets. The structures were first centered and aligned along their backbone atoms, using every tenth frame (50ps) from the trajectory. Then the UMAP [39] algorithm was used to embed the atomic coordinates of heavy atoms of apo-c3 into a 2D space. UMAP constructed a graph of the points in the high-dimensional space, then optimized a low-dimensional representation of the graph such that the topological distance was preserved to a degree in the embedding. [39] Therefore, similar conformations that were close in the high-dimensional space will also be close in the reduced space. The n_neighbours and min_dist hyperparameters were set to 15 and 0.0, respectively, with the latter being a requirement if the points in the reduced space were to be clustered. HDBSCAN [57] , implemented in scikit-learn [58] , was used to cluster the apo-c3 conformations in the embedded space. Representative structures of each conformation were identified by calculating the mean structure of a conformation, then finding the structure with the smallest RSMD from this mean structure.
SASA Solvent Exposure: To quantify the solvent exposure of amino acids during adsorption, the solvent accessible surface area of each residue of apo-c3 was calculated over time. To understand how the conformational changes lead to exposure of residues that were buried in the native state, the SASA time-series were normalized by the mean SASA of each residue of the protein in solution. The final 50ns of apo-c3 solution was used for determining the mean SASA of each residue. Values greater than and less than 1 indicate, respectively, increased and decreased exposure to the solvent ( Figure 4). The SASA was calculated using the "rolling-ball" Shake-Rupley algorithm, [59] implemented in MDTraj. [56] This approximated the SASA by effectively rolling a ball over each atom of the protein to define a surface, then examining how much of the each atom's surface was exposed to solvent as opposed to overlapping with surfaces of neighboring atoms. The radius of the probe was typically set to 1.4 Å-approximately the radius of a water molecule.

Supporting Information
Supporting Information is available from the Wiley Online Library or from the author.