A Mass‐Spectrometry‐Based Modelling Workflow for Accurate Prediction of IgG Antibody Conformations in the Gas Phase

Abstract
Immunoglobulins are biomolecules involved in defence against foreign substances. Flexibility is key to their functional properties in relation to antigen binding and receptor interactions. We have developed an integrative strategy combining ion mobility mass spectrometry (IM‐MS) with molecular modelling to study the conformational dynamics of human IgG antibodies. Predictive models of all four human IgG subclasses were assembled and their dynamics sampled in the transition from extended to collapsed state during IM‐MS. Our data imply that this collapse of IgG antibodies is related to their intrinsic structural features, including Fab arm flexibility, collapse towards the Fc region, and the length of their hinge regions. The workflow presented here provides an accurate structural representation in good agreement with the observed collision cross section for these flexible IgG molecules. These results have implications for studying other nonglobular flexible proteins.

2 from the lowest charge state species. All CCSexp values were converted to CCSHe in PULSAR.

High-Resolution Native Mass Spectrometry
A Thermo Q-Exactive mass spectrometer (Thermo Fisher Scientific, Germany) modified for detection of high molecular weight ions was used for IgG1 glycosylation analysis. Data were obtained in positive ion mode with an acquisition window of m/z 1000 to 15000. Ions were desolvated in the HCD cell at 100 V. Additional settings were as follows: capillary voltage = 0.8-1.0 kV; source temperature = 60 °C; max injection time = 100 ms; S-lens RF = 150; resolution = 17500. Spectra were obtained with 10 microscans, averaged over 50 scans.
Data were processed using XCalibur 2.1 software (Thermo Fisher Scientific, Germany) and glycoforms were assigned manually.

Generating initial models of IgG1, IgG2 and IgG4
All homology modelling was performed using MODELLER [3]. IgG1 (UniProt accessions: P01834 and P01857) was modelled using PDBs 1HZH (human) and 1IGY (mouse) as templates (Supplementary Figure 2). The Fab of 1HZH (chains B and D), which lacks a covalent connection to the rest of the molecule, was extracted and aligned to the Fab of 1IGY in order to recover an extended, solution-like conformation of IgG1. The structure of IgG2 (UniProt accessions: P01834 and P01859) was modelled using PDBs 1IGT (mouse; whole molecule), 4L4J (human; Fc) and 2QSC (human; Fabs). Two additional disulphides were inserted into the hinge to generate a representative model of human IgG2. 200 models each of IgG1 and IgG2 were generated and evaluated based on their discrete optimised protein energy (DOPE) score [4]. Missing residues of the human IgG4 crystal structure were regenerated automatically in MODELLER using PDB 5DK3 (human) as template. Glycan structures were not modelled into any of the IgG molecules for two reasons. Firstly, each IgG exhibits numerous glycoforms, which would dramatically increase the number of starting models in our study, both for the Fab arm sampling and for the gas phase simulations. Secondly, deglycosylation of IgG molecules results in no significant difference in experimental CCS.
These observations have led us to believe that the added complexity of including glycan structures does not offer significant benefits to our modelling workflow.
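The DOPE-based evaluation of the 200 candidate models can be illustrated with a minimal sketch. It assumes that the model with the lowest (most favourable) DOPE score is the one retained, which the text does not state explicitly. The function name, file names and scores below are hypothetical and serve only as an illustration; no MODELLER calls are made.

```python
def best_model_by_dope(scores):
    """Return (name, score) of the model with the lowest DOPE score.

    `scores` maps a model name to its DOPE score; lower is more favourable.
    Assumption: the lowest-scoring model is the one carried forward.
    """
    if not scores:
        raise ValueError("no models to rank")
    return min(scores.items(), key=lambda item: item[1])


# Hypothetical scores, for illustration only (not data from this study).
example_scores = {
    "igg1_model_001.pdb": -151234.5,
    "igg1_model_002.pdb": -152010.2,
    "igg1_model_003.pdb": -150877.9,
}
best_name, best_score = best_model_by_dope(example_scores)
```

In practice the dictionary would be filled from the scores reported by the modelling run for each of the 200 models.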

Homology modelling of IgG3
We acquired fragments of the structure from PDBs 4HAF (human; Fc), and 4HDI and 1CLZ (mouse; Fab), and manually built the hinge structure using 11 CYS-CYS pairs interspersed with six tri-proline helices (Supplementary Figure 2). All other hinge residues were automatically added with MODELLER. Glycans were not modelled for IgG3, as described in the homology modelling procedure for IgG1, 2 and 4. The IgG3 model was then subjected to 100 ns of explicit solvent molecular dynamics simulation in GROMACS 5.1.3 [5] with the CHARMM27 (modified CHARMM22 for proteins) forcefield [6]. The atomistic homology model of IgG3 was placed in a triclinic simulation box (178 × 169 × 252 Å) with an edge buffer of 10 Å to account for flexibility and prevent interactions with periodic images. Disulphide bonds were manually checked to ensure correct bonding. A total of 243,611 TIP3 water molecules and 2 chloride counterions were added, neutralising the system charge. We then performed energy minimisation using a steepest-descent algorithm, followed by equilibration in the isochoric-isothermal (300 K, τ = 0.1 ps) and isobaric-isothermal (1.0 bar, τ = 2.0 ps) ensembles for 1 ns each. Equilibration employed the "V-rescale" (modified Berendsen) thermostat and the Parrinello-Rahman barostat. The LINCS algorithm was employed to constrain bonds. Finally, production simulation of the system was continued for 100 ns at constant temperature and pressure. For non-covalent interactions, we utilised particle mesh Ewald (PME) with a grid spacing of 0.16 nm for long-range electrostatic interactions, and the Verlet cut-off scheme for van der Waals calculations. The RMSD evolution of the simulation was monitored and reviewed after 100 ns of simulation to ensure appropriate convergence of the IgG3 structure.
To extract a single representative model of IgG3, we clustered models from the final 50 ns of the simulation and identified the centroid model of the major conformation.
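The extraction of a representative structure can be sketched as follows. The clustering algorithm and cut-off used in this work are not stated, so the sketch assumes a simple neighbour-counting scheme on a precomputed pairwise RMSD matrix: the frame with the most neighbours within the cut-off is taken as the centroid of the largest cluster. The function name, the cut-off and the toy matrix are hypothetical.

```python
import numpy as np


def major_cluster_centroid(rmsd, cutoff):
    """Return (centroid_index, member_indices) of the largest cluster.

    Assumption: neighbour-counting clustering. Each frame's neighbours
    are the frames within `cutoff` RMSD of it (itself included); the
    frame with the most neighbours is the centroid.
    """
    rmsd = np.asarray(rmsd, dtype=float)
    if rmsd.ndim != 2 or rmsd.shape[0] != rmsd.shape[1]:
        raise ValueError("rmsd must be a square matrix")
    neighbours = rmsd <= cutoff
    counts = neighbours.sum(axis=1)
    centroid = int(np.argmax(counts))
    members = np.flatnonzero(neighbours[centroid])
    return centroid, members


# Toy 4-frame RMSD matrix (nm), for illustration only.
toy_rmsd = [
    [0.0, 0.1, 0.2, 1.0],
    [0.1, 0.0, 0.1, 1.0],
    [0.2, 0.1, 0.0, 1.0],
    [1.0, 1.0, 1.0, 0.0],
]
centroid_idx, members = major_cluster_centroid(toy_rmsd, cutoff=0.15)
```

Here frame 1 has the most neighbours and is returned as the centroid, with frames 0 to 2 as cluster members.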

Fab arm conformational sampling
For conformational sampling of the Fab arms of each of IgG1-4, we first identified the residues which constituted their upper hinges. For each IgG heavy chain, these were IgG1: D446-T450, IgG2: E437-K439, IgG3: E219-T230 and IgG4: E437-P443. The conformational space of each upper hinge and its Fab was then sampled using a rapidly exploring random tree (RRT) algorithm available from the Integrative Modelling Platform (IMP) [7]. This procedure sets the top-most disulphide of each hinge as the tree root and randomly tests whether each node (connected atom) can rotate to a new position that is also permitted by the residue's torsional space. Conformations which do not result in steric clashing or overlap are exported as structures within the ensemble. Both Fab arms are sampled simultaneously, with 10,000 models generated in total for each of IgG1-4.
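The general idea of RRT sampling can be illustrated with a toy sketch in torsion-angle space. This is not the IMP implementation: it is a generic RRT in which a conformation is a tuple of torsion angles, and a user-supplied `is_valid` function stands in for the steric clash and torsional checks. The function names, step size and validity rule are all hypothetical.

```python
import math
import random


def wrap(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def angular_distance(a, b):
    """Euclidean distance between two torsion vectors, respecting periodicity."""
    return math.sqrt(sum(wrap(x - y) ** 2 for x, y in zip(a, b)))


def rrt_sample(root, is_valid, n_nodes, step=10.0, seed=0, max_tries=100000):
    """Grow a tree of valid torsion vectors starting from `root`.

    Each iteration draws a random target, finds the nearest tree node,
    moves at most `step` degrees from that node towards the target, and
    keeps the new node only if `is_valid` accepts it.
    """
    rng = random.Random(seed)
    nodes = [tuple(root)]
    parents = [-1]  # the root has no parent
    tries = 0
    while len(nodes) < n_nodes and tries < max_tries:
        tries += 1
        target = tuple(rng.uniform(-180.0, 180.0) for _ in root)
        nearest = min(range(len(nodes)),
                      key=lambda i: angular_distance(nodes[i], target))
        dist = angular_distance(nodes[nearest], target)
        if dist == 0.0:
            continue
        scale = min(1.0, step / dist)
        new = tuple(wrap(n + scale * wrap(t - n))
                    for n, t in zip(nodes[nearest], target))
        if is_valid(new):
            nodes.append(new)
            parents.append(nearest)
    return nodes, parents


def within_limits(torsions, limit=120.0):
    """Hypothetical stand-in for a clash check: every torsion within +/- limit."""
    return all(abs(t) <= limit for t in torsions)


nodes, parents = rrt_sample(root=(0.0, 0.0, 0.0), is_valid=within_limits, n_nodes=50)
```

Every accepted node corresponds to one exported conformation; in a real application the validity test would rebuild the atomic coordinates and reject clashing structures.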

Gas phase molecular dynamics simulations of IgG1-4
The lowest-CCS conformations of IgG1-4, including two models selected from the pool of the 50 lowest-CCS models, were subjected to gas phase molecular dynamics simulations using GROMACS 5.1.3 [5]. Since there is no method of determining the experimental charge sites, we pre-charged our IgG models using a localised charge model. This model reflects the lowest observed experimental charge state (21+ for IgG1, IgG2 and IgG4; 22+ for IgG3). Charges were applied to a randomly selected distribution of basic residues (lysine, histidine and arginine) found within a residue depth of 5 Å from the protein surface, using a combination of the DEPTH server [8] and in-house scripts. Pre-charging was repeated where charges were placed too close to each other or prevented the structure from collapsing. We did not consider charged acidic residues or neutral salt bridges because there is no method of accounting for these interactions experimentally. All acidic residues (aspartate and glutamate) remained neutral in our gas phase simulations.
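The random placement of charges can be sketched as below. This is a minimal illustration and not the in-house scripts used in this work: it filters basic residues by residue depth, draws a random subset, and redraws if any two chosen sites are closer than a minimum separation. The residue dictionary layout, the `min_separation` parameter (no threshold is given in the text) and the toy coordinates are assumptions.

```python
import math
import random

BASIC_RESIDUES = {"LYS", "ARG", "HIS"}


def pick_charge_sites(residues, n_charges, max_depth=5.0,
                      min_separation=0.0, seed=None, max_attempts=1000):
    """Randomly choose `n_charges` basic residues near the surface.

    `residues` is a list of dicts with keys "name", "depth" (Angstrom)
    and "xyz". The draw is repeated until all chosen sites are at least
    `min_separation` Angstrom apart (hypothetical criterion).
    """
    rng = random.Random(seed)
    candidates = [r for r in residues
                  if r["name"] in BASIC_RESIDUES and r["depth"] <= max_depth]
    if len(candidates) < n_charges:
        raise ValueError("not enough candidate sites")
    for _ in range(max_attempts):
        chosen = rng.sample(candidates, n_charges)
        if all(math.dist(a["xyz"], b["xyz"]) >= min_separation
               for i, a in enumerate(chosen) for b in chosen[i + 1:]):
            return chosen
    raise RuntimeError("no acceptable charge distribution found")


# Toy residues, for illustration only: ten surface lysines in a row,
# one acidic residue and one buried arginine that must be excluded.
toy_residues = (
    [{"name": "LYS", "depth": 3.0, "xyz": (10.0 * i, 0.0, 0.0)} for i in range(10)]
    + [{"name": "ASP", "depth": 2.0, "xyz": (5.0, 5.0, 0.0)}]
    + [{"name": "ARG", "depth": 9.0, "xyz": (0.0, 50.0, 0.0)}]
)
sites = pick_charge_sites(toy_residues, n_charges=3, min_separation=15.0, seed=1)
```

For the simulations described here, `n_charges` would be 21 or 22 depending on the subclass, and the depth values would come from the DEPTH server output.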
Simulations were performed using the OPLS forcefield due to the availability of protonated arginine topologies. All disulphide bonds were manually checked to ensure correct bonding.
Energy minimisation was performed for 50,000 iterations using a steepest descent minimiser, followed by 500 ps of simulation with position restraints on all bonds. Simulations were carried out at a temperature of 300 K, regulated using the Berendsen thermostat (τ = 0.1 ps).
Pressure coupling and periodic boundary conditions were switched off due to the in vacuo nature of the simulations. A cut-off scheme of infinite distance was used for Coulombic and van der Waals interactions. Each model was equilibrated briefly at the target temperature for 1 ns and then simulated for a further 10 ns to produce the collapsed topologies. RMSD, radius of gyration and CCS were monitored throughout all simulations.

Gas phase simulations of IgG4 for charge states 22-25+
All simulations were carried out as detailed above. Each simulation began from an identical pre-collapsed model of IgG4 (produced by Fab arm sampling). 22, 23, 24 and 25 charge sites were selected randomly using in-house scripts. The charge site distributions differed between simulations. Each simulation was performed for a total of 10 ns, and the average CCS and the CCS variation over the last 1 ns of simulation time were calculated.
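The final-window statistics can be computed as in the following sketch. It assumes that "CCS variation" refers to the sample standard deviation, which the text does not specify. The function name and the toy trajectory are hypothetical.

```python
import numpy as np


def final_window_stats(time_ns, ccs, window_ns=1.0):
    """Mean and sample standard deviation of CCS over the last `window_ns`.

    Assumption: "variation" is taken to be the standard deviation (ddof=1).
    """
    time_ns = np.asarray(time_ns, dtype=float)
    ccs = np.asarray(ccs, dtype=float)
    if time_ns.shape != ccs.shape or time_ns.size == 0:
        raise ValueError("time and CCS arrays must be non-empty and equal in length")
    tail = ccs[time_ns >= time_ns.max() - window_ns]
    spread = float(tail.std(ddof=1)) if tail.size > 1 else 0.0
    return float(tail.mean()), spread


# Toy trajectory sampled once per ns, for illustration only.
toy_time = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_ccs = [80.0, 78.0, 76.0, 74.0, 72.0, 70.0, 69.0, 68.0, 67.0, 66.0, 64.0]
mean_ccs, sd_ccs = final_window_stats(toy_time, toy_ccs)
```

With this toy input only the frames at 9 and 10 ns fall inside the window; a real trajectory would contribute many more frames.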

CCS Calculation of Computational Models
All CCS values for IgG structures were calculated as CCSHe using the IMPACT software [9]. CCS values were obtained by scaling the projection approximation (PA) from IMPACT by a factor of 1.14 to account for PA underestimation. PA measurements from IMPACT have been calibrated to values calculated with the trajectory method, with a root mean square relative error of less than 1% [9]. This linear scaling of the PA has been successful in approximating the experimental CCS of large protein complexes [10]. While other direct CCS approximation methods such as exact hard sphere scattering (EHSS) [11], the trajectory method (TJM) [12] and the projection superposition approximation (PSA) [13] are available, the number of models generated in our study (a minimum of 50,000) required a high-throughput calculation method such as IMPACT [9].
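The scaling step, together with the ranking of models by CCS, can be written as a short sketch. Only the factor of 1.14 comes from the text; the function names, model names and PA values are hypothetical, and the PA values are assumed to have been computed beforehand (for example with IMPACT).

```python
PA_SCALING = 1.14  # empirical PA-to-CCS scaling factor quoted in the text


def scaled_ccs(pa):
    """Convert a projection approximation value to an estimated CCS."""
    return PA_SCALING * pa


def lowest_ccs(pa_by_model, n):
    """Return the `n` models with the smallest scaled CCS as (name, ccs) pairs."""
    ranked = sorted(pa_by_model.items(), key=lambda item: item[1])
    return [(name, scaled_ccs(pa)) for name, pa in ranked[:n]]


# Hypothetical PA values in square Angstrom, for illustration only.
example_pa = {"model_a": 6000.0, "model_b": 5500.0, "model_c": 6400.0}
top_two = lowest_ccs(example_pa, 2)
```

Because the scaling is linear, ranking by PA and ranking by scaled CCS give the same order, so the sort can be done on the raw PA values.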