Improving the quality of co-evolution intermolecular contact prediction with DisVis

The steep rise in protein sequences and structures has paved the way for bioinformatics approaches to predict residue-residue interactions in protein complexes. Multiple sequence alignments are commonly used in contact predictions to identify co-evolving residues. These contacts, however, often include false positives (FPs), which may impair their use to predict three-dimensional structures of biomolecular complexes and affect the accuracy of the generated models. Previously, we have developed DisVis to identify FPs in mass spectrometry cross-linking data. DisVis assesses the accessible interaction space between two proteins consistent with a set of distance restraints. Here, we investigate if a similar approach could be applied to co-evolution predicted contacts in order to improve their precision prior to using them for modeling. We analyze co-evolution contact predictions with DisVis for a set of 26 protein-protein complexes. The DisVis-reranked and the original co-evolution contacts are then used to model the complexes with our integrative docking software HADDOCK using different filtering scenarios. Our results show that HADDOCK is robust with respect to the precision of the predicted contacts due to the 50% random contact removal during docking and can enhance the quality of docking predictions when combined with DisVis filtering for low precision contact data. DisVis can thus have a beneficial effect on low quality data, but overall HADDOCK can accommodate FP restraints without negatively impacting the quality of the resulting models. Other more precision-sensitive docking protocols might, however, benefit from the increased precision of the predicted contacts after DisVis filtering.


| INTRODUCTION
What is the prediction quality of protein complexes for which isolated structures are available but protein-protein interface (PPI) information is not? Unfortunately, there is still a low probability of predicting (or identifying) the correct PPI in those cases, and this has been one of the main challenges for the structural bioinformatics field for the past decades. The steady increase of protein sequences in data banks such as UniProt 1 and major technical advances in the structural biology field 2 have been important factors for the enhanced prediction accuracy of protein complexes over the past years. 3 With the rise in experimental data, software is now being developed to leverage the large quantity of sequences and structures by mining them, for example via co-evolution or machine learning (ML) algorithms. 4-6 The release of AlphaFold2 6 has demonstrated that ML approaches can compete with or even outperform the state-of-the-art software packages in the protein structure-prediction field. 7 Besides protein structures, recent studies are also exploring AlphaFold2's predictive power for protein-protein 8,9 and protein-peptide 10,11 complexes.
Co-evolution has been proven to be an important tool to identify residues at potential PPIs. 12 Identifying co-evolving residue pairs requires the availability of multiple sequence alignments (MSAs) of orthologous sequences. When applied to the prediction of intermolecular contacts, an additional complexity comes from predicting the correct pairing of the sequences of the two proteins when considering multiple paralogues. The predicted intermolecular contacts derived from co-evolving residues can then be used in de novo modeling of protein complexes. 5,13 Although this technique has mainly been used for prokaryotic systems, recent findings suggest eukaryotic complexes could also benefit from applying co-evolution prediction approaches. 12,14,15 Independent of the protein system, one major challenge in co-evolution predictions remains the presence of false positive (FP) contacts. Although FP contacts are deduced from MSAs in the same way as correct contacts, they do not describe the physiological protein-protein interface. When such contact data are used to model protein complexes, these false positives can negatively affect the modeling results as they potentially steer the model away from the correct solution, reducing the prediction accuracy. This is a more general problem, which also occurs, for example, in cross-linking mass spectrometry (XL-MS) data. To deal with this problem we have previously developed DisVis, available both as a web server 16 and as a Python package, 17 which, given the 3D structures of the components of a complex, can assess the accessible interaction space defined by a set of distance restraints and identify possible false positives. Similarly, the identification and removal of FPs in co-evolution predicted contacts through DisVis could potentially improve the modeling of protein complexes based on residue-residue contact information.
Here, we use 26 protein-protein complexes and co-evolution contact predictions selected from the work of Green et al. 18 to evaluate whether DisVis analysis can help in FP removal. We then assess the impact of using this information (original co-evolution or DisVis-filtered contacts) on the quality of the docking results using our integrative modeling software HADDOCK, which allows defining distance restraints to guide the docking. We show that DisVis filtering increases the precision of the predicted contacts and that HADDOCK is not very sensitive to this precision increase in the predicted contacts, as it is able to generate correct models even in the presence of a significant number of FP contacts.

| Dataset preparation
The study by Green et al. 18 was used to extract 26 protein dimers together with their respective top 20 co-evolution predicted interface contacts obtained through EVcomplex 5 (Figure 1). The protein complexes were selected according to the total number of true residue-residue contacts predicted within the top 10 intermolecular co-evolution contacts of each system, information extracted from the supporting material of Green et al. 18 We selected cases having a top 10 contact precision ranging from 20% to 100%, ensuring an equal distribution over contact precision ranges (Table 1, Figures S1 and S2): five complexes per total number of true contacts (2, 4, 6, 8, or 10) were included. An additional complex with 9 true contacts was added, resulting in a total of 26 complexes. Besides the precision range in the top 10 contacts, the quality of the experimental structures was considered for dimer selection as well. Structures were selected to have the best resolution achievable within the dataset (lower bound 1.45 Å, upper bound 3.5 Å). Third, the combination of UniProt IDs within each dimer was selected to be unique within the dataset in order to generate a diverse range of protein complexes.
Finally, if possible, protein complexes (25 out of 26) were selected to not contain DNA in the experimentally resolved system. The structures of the monomers were prepared for use in DisVis and HADDOCK using a Python script to rename the protein chains (chain A and B) with pdb-tools 19,20 (pdb_chain and pdb_tidy). While those structures have exactly the same backbone conformation as in the experimental reference complex, their side chains were perturbed and optimized using SCWRL4 by Green et al.; 18 they thus represent a semi-unbound conformation for docking purposes.

| DisVis scoring of co-evolution predicted intermolecular contacts
For each dimer within our dataset, the top-20 co-evolution predicted contacts (see Data availability statement section) were used as input for DisVis through its web server implementation (https://wenmr.science.uu.nl/disvis). The two monomer structures together with a list of predicted intermolecular contacts were submitted with the complete scanning option settings (1 Å voxel size and 9.72° scanning angle). Predicted residue-residue contacts that involved residues absent in the available 3D protein structures were removed from the co-evolution contact lists prior to DisVis calculations (see Section 2.4.1 and Table 1: Present contacts). The upper distance limit for the co-evolution contacts was set to 10 Å between Cα atoms (in their work, Green et al. used 8 Å between Cβ-Cβ atoms), as only Cα atoms are considered during the rotational scan, a choice implemented to reduce computational costs. 17 The DisVis-calculated z-scores were used to rank the residue-residue contacts.
The z-score is calculated for each distance restraint by taking into account each DisVis modeled complex which meets at least one of the distance criteria included by the user. For each complex that meets this requirement, all violated restraints are calculated and stored. This results in a violation matrix in which the violation data of all approved complexes are combined. Each row of the matrix represents the number of consistent restraints, from 1 to N, and each column describes the frequency of restraint violation per distance restraint in which at least N restraints are consistent. This violation matrix is used to calculate the z-score per restraint:

$$z_i = \frac{\bar{v}_i - \bar{v}}{\sigma}$$

where $\bar{v}_i$ is the average of column i of the violation matrix, and $\bar{v}$ and $\sigma$ describe the violation matrix average and standard deviation, respectively. 17 The resulting z-scores were ordered for this study from low (negative z-score) to high (positive z-score), that is, from least to most likely to be a false positive.
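The z-score definition above can be illustrated with a minimal Python sketch. It assumes the violation matrix is available as a list of rows and that the standard deviation is the population standard deviation over all matrix entries; the function name and the example matrix are hypothetical and are not taken from the DisVis code base.

```python
from statistics import fmean, pstdev


def disvis_zscores(violation_matrix):
    """Return one z-score per restraint (one per column of the matrix).

    Assumption: rows correspond to "at least 1..N consistent restraints",
    columns to restraints, entries to violation frequencies.
    """
    columns = list(zip(*violation_matrix))
    column_means = [fmean(col) for col in columns]            # average of column i
    all_values = [x for row in violation_matrix for x in row]
    matrix_mean = fmean(all_values)                           # matrix average
    matrix_std = pstdev(all_values)                           # population std (assumed)
    if matrix_std == 0:
        raise ValueError("all entries are identical; z-scores are undefined")
    return [(m - matrix_mean) / matrix_std for m in column_means]


# Hypothetical 3 x 3 violation matrix for three restraints.
example = [
    [0.2, 0.1, 0.9],
    [0.1, 0.0, 0.8],
    [0.0, 0.0, 0.7],
]
z = disvis_zscores(example)
```

In this toy example the third restraint is violated far more often than the other two and therefore receives the only positive z-score, marking it as the most likely false positive.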
From the DisVis-reranked co-evolution contacts, the top 10 and 5 were extracted for use as distance restraints in HADDOCK.The entire set of 20 contacts (i.e., without DisVis filtering) was also considered as well as DisVis filtered data, using a z-score threshold of 0.5 or 1.0.
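The two selection schemes described here (a fixed top-N after reranking, or a z-score threshold) can be sketched as follows. This is an illustrative sketch only: contacts are represented as hypothetical (residue in chain A, residue in chain B) tuples, the z-scores are invented for the example, and the threshold is applied as "z-score lower than the cutoff".

```python
def rerank_by_zscore(contacts, zscores):
    """Order contacts from lowest to highest z-score (least to most likely FP)."""
    paired = sorted(zip(zscores, contacts), key=lambda pair: pair[0])
    return [(contact, z) for z, contact in paired]


def select_top(ranked, n):
    """Keep the n best-ranked contacts."""
    return [contact for contact, _ in ranked[:n]]


def select_below(ranked, z_cutoff):
    """Keep contacts whose z-score is lower than the cutoff."""
    return [contact for contact, z in ranked if z < z_cutoff]


# Hypothetical contacts and z-scores, for illustration only.
contacts = [(10, 45), (12, 80), (33, 7), (54, 61)]
zscores = [0.7, -1.2, 0.1, 1.4]

ranked = rerank_by_zscore(contacts, zscores)
top2 = select_top(ranked, 2)
below_half = select_below(ranked, 0.5)
below_one = select_below(ranked, 1.0)
```

A fixed top-N always yields the same number of restraints per complex, whereas the threshold yields a variable number that depends on the z-score distribution of each complex.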

| Docking protocols
The docking calculations were performed using a local installation of HADDOCK 2.4. The models are scored with the HADDOCK itw scoring function, a linear combination of energy terms (shown here with the default itw weights of HADDOCK 2.4):

$$\mathrm{HADDOCK}_{\mathrm{score}} = 1.0\,E_{\mathrm{vdw}} + 0.2\,E_{\mathrm{elec}} + 1.0\,E_{\mathrm{desolv}} + 0.1\,E_{\mathrm{AIR}}$$

where $E_{\mathrm{elec}}$ and $E_{\mathrm{vdw}}$ correspond to the electrostatics and van der Waals intermolecular energies, respectively, calculated with the OPLS force field. 22 The desolvation energy, $E_{\mathrm{desolv}}$, is a solvent accessible surface area-dependent empirical term, 23 which estimates the energetic gain or penalty of burying specific side chains upon complex formation. $E_{\mathrm{AIR}}$ represents the energy term assigned to the ambiguous interaction restraints (AIRs) (in this case the predicted contacts). 21 Default settings were used for all HADDOCK runs, 21 except for the random removal of restraints (see Table 2). The DisVis-reranked distance restraints or the original 20 co-evolution contacts were included as input in the ambiguous restraints class. Ten different docking protocols (Table 2) were performed with HADDOCK. Intermolecular co-evolution distance restraints were defined with a lower and an upper distance bound between Cβ atoms of the two chains, except for glycine residues, for which the Cα atom was selected. This definition of the distance restraints is the same as in the docking calculations performed by Green et al. 18 Besides co-evolution restraints, additional intramolecular Cα-Cα distance restraints were included during docking for protein chains in which parts of the protein structure were missing, to keep the domains together during the refinement stage (note that this is done automatically in the web server). These restraints were calculated with the restrain_bodies.py script from haddock-tools (https://github.com/haddocking/haddock-tools).
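As an illustration of the restraint definition used in this work (3 Å lower bound, 7 Å upper bound, Cβ atoms, Cα for glycine), the sketch below writes one restraint line per contact. It assumes the CNS-style `assign` statement read by HADDOCK, in which the three numbers are the target distance, the lower-bound correction, and the upper-bound correction, so that the allowed range runs from (distance minus lower correction) to (distance plus upper correction). The function name and the example residues are hypothetical.

```python
def cb_restraint(chain_a, resid_a, resname_a, chain_b, resid_b, resname_b,
                 lower=3.0, upper=7.0):
    """Format one intermolecular distance restraint as a CNS-style line.

    Assumed syntax: assign (sel1) (sel2) d dminus dplus,
    with allowed range [d - dminus, d + dplus].
    """
    # Glycine has no C-beta atom, so the C-alpha atom is used instead.
    atom_a = "CA" if resname_a == "GLY" else "CB"
    atom_b = "CA" if resname_b == "GLY" else "CB"
    return (
        f"assign (segid {chain_a} and resid {resid_a} and name {atom_a}) "
        f"(segid {chain_b} and resid {resid_b} and name {atom_b}) "
        f"{upper:.1f} {upper - lower:.1f} 0.0"
    )


# Hypothetical contact between GLY 10 of chain A and LEU 45 of chain B.
line = cb_restraint("A", 10, "GLY", "B", 45, "LEU")
```

Encoding the upper bound as the target distance with a zero upper correction keeps the 3 to 7 Å window explicit in a single line per contact.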

| Contact precision
The contact precision was calculated for each complex as a function of the number of contacts considered (based on the original or DisVis rankings). The precision p was defined as the number of true contacts divided by the total number of contacts considered:

$$p = \frac{TC}{TC + FC}$$

where TC stands for true contact, a contact of which both residues are present at the interface of the reference complex within an 8 Å distance cutoff of each other, considering all heavy atoms. False contacts (FC) are those for which the shortest distance between any heavy atoms exceeds this 8 Å cutoff. Subsequently, the average contact precision $\bar{p}$ was calculated over all complexes.

TABLE 1 Structure information of the 26 heterodimeric protein-protein complex dataset used in this study.
Note: Each entry describes one dimer with its corresponding PDB ID, the chains that have been used in DisVis and HADDOCK, the resolution of the experimental structure, the Green ID 18 equivalent, the number of true residue-residue contacts and corresponding top-10 and top-20 precision (%) according to calculations performed by Green et al., 18 and the number of predicted contacts (Present Green 10/20) in the Green top-10 and top-20 for which the corresponding residues are present in the experimental protein structures (Figure S2).
a The unique ID number of each complex used by Green et al. 18 in their supplementary information.
b Number of true contacts in the top 10 and top 20 co-evolution contacts according to the definition used by Green et al., 18 using the experimental structures and a heavy-atom interface cutoff of 8 Å calculated with haddock-tools (https://github.com/haddocking/haddock-tools) to identify the true contacts. The contact precision is shown in brackets.
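The precision definition used in this section can be sketched in a few lines of Python. The sketch assumes that the minimum heavy-atom distance of every predicted contact in the reference complex has already been measured; the function names and the example distances are hypothetical.

```python
from statistics import fmean


def is_true_contact(min_heavy_atom_distance, cutoff=8.0):
    """A contact is true if its shortest heavy-atom distance is within the cutoff (Å)."""
    return min_heavy_atom_distance <= cutoff


def precision_curve(ranked_distances, cutoff=8.0):
    """Precision p = TC / (TC + FC) after considering the top 1, 2, ..., n contacts."""
    curve = []
    true_contacts = 0
    for n, distance in enumerate(ranked_distances, start=1):
        if is_true_contact(distance, cutoff):
            true_contacts += 1
        curve.append(true_contacts / n)
    return curve


def average_precision(curves, n):
    """Average precision over all complexes when the top n contacts are considered."""
    return fmean(curve[n - 1] for curve in curves)


# Hypothetical minimum heavy-atom distances (Å) of ranked contacts for two complexes.
complex_1 = precision_curve([4.2, 6.9, 15.3, 7.5])
complex_2 = precision_curve([12.0, 5.1, 3.8, 22.4])
mean_top4 = average_precision([complex_1, complex_2], 4)
```

Evaluating the curve for the original and for the DisVis-reranked order of the same contacts gives the kind of comparison made between the two rankings.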

| Interface root-mean square deviation and success rate calculation
The quality of each complex was determined by calculating the interface root-mean-square deviation (i-RMSD), which is obtained by aligning the backbone atoms at the protein-protein interface of both protein chains on the reference complex, using all residues making contacts within a 10 Å cutoff with the partner molecule. The quality of each model is rated according to the critical assessment of predicted interactions (CAPRI), with an i-RMSD of ≤1 Å denoted as high, ≤2 Å as medium, and ≤4 Å as acceptable quality. 24 We did not consider the fraction of native contacts in this study since in our experience with HADDOCK the limiting factor for defining the quality of a model is the i-RMSD (i.e., a model will never be "downgraded" in quality because of a lower fraction of native contact value).
These model quality ratings are used to calculate the success rates per tested condition.The success rate is defined as the percentage of targets for which a model of acceptable (or better) quality has been generated within the top N (N = 1, 5, 10, 20, 50, 100, and 200) ranked models based on the HADDOCK itw score.
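The quality classes and the success rate defined above can be sketched as follows. The sketch assumes that, for each target, the i-RMSD values of the models are listed in HADDOCK score order; the target names and values are hypothetical.

```python
def capri_class(irmsd):
    """Classify a model by i-RMSD (Å) only, as done in this study."""
    if irmsd <= 1.0:
        return "high"
    if irmsd <= 2.0:
        return "medium"
    if irmsd <= 4.0:
        return "acceptable"
    return "incorrect"


def success_rate(irmsd_per_target, top_n, threshold=4.0):
    """Percentage of targets with at least one model within the threshold in the top N."""
    hits = sum(
        1
        for ranked_irmsds in irmsd_per_target.values()
        if any(value <= threshold for value in ranked_irmsds[:top_n])
    )
    return 100.0 * hits / len(irmsd_per_target)


# Hypothetical i-RMSD values (Å), ordered by HADDOCK score rank.
example = {
    "target_1": [5.2, 3.1, 0.9],
    "target_2": [6.0, 7.5, 8.1],
    "target_3": [1.5, 2.0, 9.0],
    "target_4": [4.5, 4.2, 3.9],
}
top1 = success_rate(example, top_n=1)
top3 = success_rate(example, top_n=3)
top3_medium = success_rate(example, top_n=3, threshold=2.0)
```

Lowering the threshold to 2 Å or 1 Å gives the success rate for medium or better and for high-quality models, respectively.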

| RESULTS
The 26 protein complexes (Figure 1) used in this study are taken from the dataset published by Green et al. 18 with 2, 4, 6, 8, and 10 true contacts in the top-10 co-evolution predicted contacts according to the true contact definition used by Green. Five complexes for each number of true contacts were included in our dataset as well as one additional complex with 9 true contacts (Table 1). The number of true contacts in the top-20 for these 26 complexes ranges from 2 to 18 (precision of 10%-90%) (Table 1 and Figure S3). The co-evolution intermolecular contacts from Green et al. 18 were reranked by DisVis via their z-score to identify potential false positives from the predicted contacts. Different selections of co-evolution restraints were tested (Table 2) to assess the impact of contact precision on the docking performance and model quality.

| Reranking predicted co-evolution contacts with DisVis enhances the precision of the top 10
Co-evolution intermolecular contacts produced by EVcomplex form the starting point for the DisVis analysis of this study. Twenty co-evolution contacts per complex were assessed by DisVis and reranked according to their obtained z-score (see Section 2). Subsequently, the DisVis-reranked contacts were compared to the original co-evolution results. The average contact precision (Figure 2) shows that a difference in precision is already present between the co-evolution and DisVis-reranked contacts in the top 1 (precision of DisVis-reranked 88 ± 32% vs. 81 ± 40% for the original contacts). For the top 10 contacts, the difference in precision is 6% (DisVis-reranked 67 ± 29% vs. 61 ± 27% for the original contacts). Including more contacts, up to the maximum of 20 considered, lowers the precision further to 47%.

| The number of contacts considered rather than their precision enhances HADDOCK's performance
In order to test the impact of DisVis reranking on the quality of the models generated by HADDOCK, the top 5 and top 10 contacts were used as input for docking calculations (indicated by orange vertical lines in Figure 2), both using the original EVcomplex ranking and the DisVis reranking of contacts. In addition, as a reference, docking was performed using the original top-20 EVcomplex predictions. For the two z-score-filtered protocols (Table 2), the number of contacts selected varies between 8 and 17 for the 0.5 cutoff, and between 12 and 20 for the 1.0 cutoff. The success rates for these different sets of restraints (Table 2) are shown in Figure 3, calculated over the 200 HADDOCK-ranked models after final refinement (itw). The ranking of models is based on the HADDOCK scoring function, which consists of a linear combination of energy terms (see Section 2). Five docking conditions were tested: using the top 5 contacts (EVcomplex or DisVis-reranked) as distance restraints without random contact removal (i), the top 10 contacts (EVcomplex or DisVis-reranked) without random removal of contacts (ii), the top 10 contacts (EVcomplex or DisVis-reranked) with a 50% random removal of provided contacts (iii), and as a reference the top 20 contacts (EVcomplex) with 50% random contact removal (iv) and without (v) (see Section 2 and Figure 3). The random removal of restraints is done per model (1000 models are generated per docking run), meaning that models will be generated based on different combinations of restraints within a docking run.
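The per-model random removal described above can be mimicked with a short simulation. This is a sketch of the idea only, assuming that each model independently keeps a random half of the restraints; the actual partitioning scheme inside HADDOCK may differ, and the function name is hypothetical.

```python
import random


def restraints_per_model(restraints, n_models=1000, removal_fraction=0.5, seed=0):
    """Draw, for every model, a random subset of restraints to keep."""
    rng = random.Random(seed)
    n_removed = int(len(restraints) * removal_fraction)
    n_kept = len(restraints) - n_removed
    return [rng.sample(restraints, n_kept) for _ in range(n_models)]


# Twenty placeholder restraints, one subset of ten per model.
all_restraints = list(range(1, 21))
subsets = restraints_per_model(all_restraints)
n_distinct = len({frozenset(subset) for subset in subsets})
```

Because each model sees a different combination of restraints, some models are generated from subsets that happen to contain few false positives, which is the intuition behind the robustness reported here.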
The first condition with five restraints and no random restraint removal (EV5-0 and DisVis5-0) includes a set of contacts with the highest contact precision compared to the top-10 and top-20 contacts. EV5-0 and DisVis5-0 perform similarly well in the top-10 success rate for high- and medium-quality models. However, EV5-0's predictions surpass the DisVis setup when it comes to the percentage of acceptable quality predictions. Even though the accuracy of the top-5 restraints used in these protocols is significantly higher than the top-10 contacts for both EVcomplex and DisVis-reranked setups (Figure 2), EV5-0 and DisVis5-0 are outperformed by the other protocols.
Next, the top-10 contacts were included in four protocols (DisVis10-0, DisVis10-50, EV10-0, and EV10-50) to investigate the impact of random removal of restraints on the docking performance.
The DisVis10-50 and EV10-50 protocols (10 restraints and 50% random removal) achieve the best performance with respect to the setups without random removal, reaching an acceptable or higher quality success rate of 85% for the top-10 HADDOCK-scored models.
A similar trend can be observed when a cluster-based analysis is performed (Figure S4). Hence, turning on random restraint removal (the default setting) improves the docking performance (Figure 3), making HADDOCK robust to the presence of false positives.
The DisVis-reranked performance was also compared to the EVcomplex results, both with random removal of restraints turned on (DisVis10-50 vs. EV10-50). When considering the top-5 predicted models, the success rate of the high- and medium-quality models between the two setups is comparable. However, DisVis outperforms the EVcomplex restraints with a success rate of 85% versus 73% for the number of acceptable models in the top 5, suggesting DisVis reranking can have a quality-enhancing effect on co-evolution predicted data when used for docking.
However, none of the top-5 and top-10 DisVis-reranked or EVcomplex setups outperform the EVcomplex condition using 20 restraints and 50% random removal (Figure 3). The inclusion of 20 restraints during docking results in 35% high-quality structures, 65% medium and 85% acceptable models according to the CAPRI criteria. This finding suggests that although an accuracy improvement within the top-5 and top-10 residue-residue contacts due to contact reranking with DisVis improves the input data for HADDOCK, using a lower precision contact list with more contacts actually outperforms shorter contact lists with higher precision (Figure 4 and Figure S4). A comparison of the success-rate results obtained with 20 EVcomplex contacts and 50% or 0% random contact removal confirms the beneficial impact of random removal on the prediction quality (Figure 3).
Therefore, HADDOCK appears to be robust with respect to contact precision and benefits from a combination of contact quantity and 50% random contact removal.

| Better precision does lead to both better quality and better ranking of models
When analyzing the impact of the precision of residue-residue contacts on the quality of the resulting models in terms of i-RMSD values, it becomes apparent that they are correlated. In Figure 4, the docking results of the best performing protocol, EV20-50 (EVcomplex top 20 contact restraints with 50% removal), are shown in dark blue. The performance of the two runner-up protocols, EV10-50 and DisVis10-50, is depicted in light blue and yellow. We observe a moderate anti-correlation between contact precision and the i-RMSD of the top-1 docked model or the model with the best i-RMSD (correlation coefficients between −0.51 and −0.56 depending on the data set). More interesting is the fact that, irrespective of the dataset, HADDOCK is able to reliably predict acceptable models in the top ranked models, starting around a contact precision of 0.4 (although acceptable models are already obtained in some cases for precisions as low as 0.2) (Figure 4B). A comparison of Figure 4A,B also shows that the ranking of models improves with the precision, with the top models being of acceptable or better quality when the precision reaches 0.5-0.6.

FIGURE 2 Residue-residue contact precision. Average precision of co-evolution and DisVis-reranked residue-residue contacts calculated over a dataset of 26 dimers. The average precision of the co-evolution predicted contacts is represented by a black line, while the DisVis result is shown in blue. Orange lines highlight the two top cutoffs used as input for docking calculations.
A comparison of the results obtained by using 10 (both DisVis and EV contacts) or 20 contacts (Figure 4) shows that for a similar precision, having more contacts does lead to better quality models in general (the dark blue points are in most cases lower than the others).
This effect is more apparent at lower precisions and can also be observed in the clustered HADDOCK results (Figure S5). These findings indicate that HADDOCK is robust with respect to the precision of contacts and benefits from longer (up to 20 here) contact lists, being able to generate and reliably identify acceptable models down to about 30% precision.
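The anti-correlation between contact precision and i-RMSD discussed in this section is a Pearson correlation, which can be computed as sketched below. The numbers are synthetic placeholders chosen only to show the sign of the relationship; they are not data from this study, and the function name is hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Synthetic example: contact precision per complex and i-RMSD (Å) of the top-1 model.
precision = [0.2, 0.4, 0.6, 0.8, 1.0]
irmsd_top1 = [9.0, 5.0, 3.5, 2.0, 1.5]
r = pearson(precision, irmsd_top1)
```

A negative coefficient means that complexes with higher contact precision tend to yield models with lower i-RMSD.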

| DISCUSSION
In this study, we have investigated the effect of residue-residue contact filtering on protein-protein docking by comparing the docking results of DisVis-reranked contact restraints to the original co-evolution contacts. Although the DisVis analysis was limited in this work to 20 contacts (those provided by Green et al. 18), it can be extended to larger numbers of contacts. Because of the available true contact distribution of the studied dataset, we could analyze how contact quality impacts the docking results by comparing a low-quality and a high-quality subset. These subsets were defined by using the true contact precision in the Green top 10 (see Section 2). Of the 26 complexes, 10 fall into the low-quality category with 20%-40% true contacts in the top 10, and 11 into the high-quality category with a true contact precision of 80%-100%. Unsurprisingly, success rate analysis for these two groups (20%-40% vs. 80%-100%) shows that the original co-evolution contacts with 50% random removal perform best for the 80%-100% precision category (Figure S6). Overall, the random contact removal (enabled by default in HADDOCK) appears to be crucial to counterbalance the presence of false positives, as each of the 1000 docking attempts uses a different set of 50% of the contacts, leading to a robust performance of HADDOCK in regard to contact precision (Figure 3).
The performance-enhancing combination of a large set of distance restraints with medium precision and 50% random removal is also shown in Figure 4B. In this graph, the results clearly demonstrate that while overall interface precision is reduced in the dataset for the 20 contacts setup, HADDOCK generates higher quality models with 20 contacts than with 10 contacts at the same interface precision. The difference in ranking performance (Figure 4A) also shows that while 10 contacts appear to require a contact precision of ~60% to predict an acceptable model at the top 1, 20 contacts achieve a similar quality starting from ~40% precision.
We have also investigated if selecting contacts based on a z-score criterion rather than a predefined number of contacts would improve their quality. Removing contacts with a z-score higher than 0.5 results in an average of 10.6 ± 2.7 contacts per complex with an average precision of 68% ± 29%. Compared to the original top-20 co-evolution set (Table S2), including an average of 19.2 ± 1.2 contacts with an average precision of 48% ± 25%, this is an improvement in precision of 20 percentage points (68% vs. 48%). The same analysis performed on the low- and high-quality subsets separately leads to 8.8 ± 1.0 contacts per complex for the low-quality set with a precision of 51% and 12.6 ± 2.9 for the high-quality set with an average precision of 83% (Figure S6 and Table S2). Compared to the original top-20 co-evolution contacts subsets (Table S1) this is an improvement in precision of 24 percentage points (51% vs. 27%) and 18 percentage points (83% vs. 65%) for the low- and high-quality datasets, respectively. Hence, z-score filtering can positively impact the precision of the contact dataset, especially when the contact set has a low precision initially (Figure S3). This is confirmed by the docking results for the low-quality contacts set (20%-40%) (Table 1) when only the contacts with a DisVis z-score lower than 0.5 out of the 20 contacts are included (Figure S7). While a z-score cutoff of 0.5 improves the average precision of the remaining contacts in the low-quality contacts set, a removal of z-score values higher than 1.0 does not seem to be able to filter the contact data sufficiently (Table S2), resulting in a similar docking performance as the original co-evolution contact set (Figure S7). In the case of the high-quality dataset (80%-100% precision), DisVis reranking and z-score filtering seem to have a mild or no impact on contact precision improvement in the top 10 contacts, as the number of true contacts is high and therefore reranking of true contacts will not enhance contact precision (Figure S6). The docking success rate compared to the EV20-50 protocol seems, however, to be negatively affected by DisVis z-score filtering (Figure S7). This trend could be due to the removal of true contacts (Figure S6) via z-score filtering, as the average contact precision at 20 contacts with a z-score filter of 1 or 0.5 lies below the contact precision reached by the original co-evolution contacts (Figure S6).

FIGURE 3 Comparison of co-evolution and DisVis-reranked docking success rates for the 26 dimers dataset. Success rate of co-evolution and reranked DisVis contact lists used as input for protein-protein docking. Three sets of contact lists, 5, 10, and 20, were used to assign distance restraints in HADDOCK. When using the top 5 contacts, all five contacts were included in the docking protocol; hence no random removal was applied. For the top 10, 50% of the included contacts were randomly removed upon docking in #10-50% and none were removed in #10-0%. The fourth condition represents the docking results using 20 distance restraints with 50% random removal, while the fifth condition considers 20 distance restraints without contact removal. Seven bars have been plotted per condition, denoting the top 1, 5, 10, 20, 50, 100, and 200 structures according to the HADDOCK itw score. The assignment of a high, medium or acceptable label to a protein complex represents its accuracy in i-RMSD, with high being ≤1 Å (dark green), medium ≤2 Å (light green) and acceptable ≤4 Å (light blue) (Table S1).
In a real-world scenario of experimental data or co-evolution data for which a complex structure is not available, the quality of the contacts cannot be assessed before docking. Therefore, comparing and/or combining the top-10 HADDOCK itw-scored structures for both identified approaches, using the original contact data (with 50% random removal) and the DisVis-filtered contacts with z-score < 0.5 (and 50% random removal) (EV20-50 and DisVis20-50 < 0.5), could provide a way to check the consistency of the solutions between the runs and possibly refine all solutions together for the best docking performance, discarding the restraints (in this way the score would only reflect the quality of the interface). The highest contact precision for a set of 20 contacts should be obtained by reranking the contacts with DisVis and applying a cutoff, for example, the top 5 or top 10.

Intermolecular contacts derived from co-evolution analysis provide a valuable source of information to guide the modeling of protein-protein complexes by docking. These can be used to guide the docking process (as done, e.g., in HADDOCK) or as filters to score the generated models. The presence of false positives within the predicted contact data can, however, hamper the docking performance, both in terms of quality and number of acceptable models generated.
Here, we have shown that DisVis can reduce the number of FPs in co-evolution contact data by taking into consideration the spatial restrictions imposed by protein structures and the defined contacts.
This precision enhancement can have a positive effect on the docking results depending on the software and approach used. HADDOCK itself is robust to the presence of false positive contacts and overall benefits most from a large set of interface contacts combined with 50% random removal of restraints.

The docking protocol in HADDOCK 2.4 consists of three stages. 21 In the first stage (it0), rigid body docking is performed with the distance restraints defined between the two chains guiding the docking. From the 1000 (default) generated models, the top 200 based on the it0 HADDOCK score progress to the next step. The second stage (it1) consists of a semi-flexible simulated annealing in torsion angle space during which flexibility at the interface is introduced stepwise, first along the side chains and later for both side chains and backbone. By default, the flexible interface is defined automatically for each model from an analysis of residues that are in close contact between the chains. All structures from it1 are transferred to the final step of the docking protocol (itw), which in HADDOCK 2.4 consists of a final energy minimization (previous versions of HADDOCK performed a very short optimization by molecular dynamics simulation in explicit solvent; this option is still available but turned off by default in version 2.4). Finally, the models are scored based on the HADDOCK itw scoring function, which is a linear combination of energetic terms.

FIGURE 1 Dataset of 26 dimers used in this study. In each dimer the two chains are highlighted in yellow and blue. The PDB ID as well as the resolution of the experimental structure in Ångström are depicted. The representation of the shown protein complexes was obtained by using PyMOL. 25

The docking protocols differ in the number and type of restraints considered and the percentage of restraints randomly discarded for each model. The latter option makes HADDOCK potentially less sensitive to wrong (e.g., false positive) restraints. Intermolecular co-evolution distance restraints were defined as distances of 3 Å (lower bound) to 7 Å (upper bound).

In order to test the impact of DisVis reranking on the quality of the models generated by HADDOCK, two contact list cutoffs were used as input for docking calculations: top 5 and top 10.

TABLE 2 HADDOCK protocols tested with the original or DisVis-reranked co-evolution restraints.
a Percentage of random removal of restraints. This random removal is done for each model calculated.
b Co-evolution intermolecular contacts directly taken from Green et al. 18
c Co-evolution intermolecular contacts taken from the DisVis reranking.
d Co-evolution intermolecular contacts taken from the DisVis reranking by applying a z-score cutoff: z-score lower than 0.5 or 1.0, for protocols 8 and 9, respectively.

FIGURE 4 Contact precision versus interface root-mean-square deviation (i-RMSD). (A) Residue-residue contact precision versus the i-RMSD of the top 1 predicted model per complex, using the HADDOCK itw scoring function. The dark blue circles represent the docking results obtained by using the top-20 EVcomplex contact restraints (Pearson correlation of −0.51). Its linear regression fit is plotted in the same color. The light blue and yellow data points show the HADDOCK results from the docking runs performed with the top-10 EVcomplex contacts and the top-10 DisVis-reranked contacts with 50% random removal, which have a Pearson correlation of −0.51 and −0.56, respectively. The linear regression fit for the top-10 EVcomplex and DisVis results combined (light blue and yellow data) is highlighted in green. The dashed black line depicts the 4 Å CAPRI cutoff for docked models with acceptable quality. (B) Residue-residue contact precision versus the model with the best i-RMSD per complex, using the HADDOCK itw scoring function. The dark blue circles represent the docking results obtained by using the top-20 EVcomplex contact restraints (Pearson correlation of −0.51). Its linear regression fit is plotted in the same color. The light blue and yellow data points show the HADDOCK results from the docking runs performed with the top-10 EVcomplex contacts and the top-10 DisVis-reranked contacts with 50% random removal, which have a Pearson correlation of −0.53 and −0.56, respectively. The linear regression fit for the top-10 EVcomplex and DisVis results combined (light blue and yellow data) is highlighted in green. The dashed black line depicts the 4 Å CAPRI cutoff for docked models with acceptable quality.
Although HADDOCK, with its 50% random removal of restraints (the default setup), favors a large set of interface contacts over high interface precision for a small set of contacts, other software or approaches might well benefit from improved precision contact data resulting from DisVis filtering, especially if those contacts are used for scoring purposes rather than to guide the docking. While this work concentrated on co-evolution data, the acquired insights should also be relevant for other types of distance-based information.