A comparison of successful and failed protein interface designs highlights the challenges of designing buried hydrogen bonds


  • P. Benjamin Stranges,

    1. Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599
  • Brian Kuhlman

    Corresponding author
    1. Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599
    2. Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599
    • Department of Biochemistry and Biophysics, 120 Mason Farm Rd., Chapel Hill, NC, 27599-7260


The accurate design of new protein–protein interactions is a longstanding goal of computational protein design. However, most computationally designed interfaces fail to form experimentally. This investigation compares five previously described successful de novo interface designs with 158 failures. Both sets of proteins were designed with the molecular modeling program Rosetta. Designs were considered a success if a high-resolution crystal structure of the complex closely matched the design model and the equilibrium dissociation constant for binding was less than 10 μM. The successes and failures represent a wide variety of interface types and design goals including heterodimers, homodimers, peptide-protein interactions, one-sided designs (i.e., where only one of the proteins was mutated) and two-sided designs. The most striking feature of the successful designs is that they have fewer polar atoms at their interfaces than many of the failed designs. Designs that attempted to create extensive sets of interface-spanning hydrogen bonds resulted in no detectable binding. In contrast, polar atoms make up more than 40% of the interface area of many natural dimers, and native interfaces often contain extensive hydrogen bonding networks. These results suggest that Rosetta may not be accurately balancing hydrogen bonding and electrostatic energies against desolvation penalties and that design processes may not include sufficient sampling to identify side chains in preordered conformations that can fully satisfy the hydrogen bonding potential of the interface.


The computational design of new protein–protein interactions has proven to be a difficult challenge.1, 2 Experimental measurements have shown that the majority of designed interactions either do not form tight complexes (KD < 10 μM)3 or bind in a conformation different from the design model.4 However, there has been exciting progress as a handful of de novo designed interfaces have been shown to bind with submicromolar binding affinities and adopt the intended binding orientation.5–9 Here, we compare models that interact as predicted to those that fail with the goal of identifying common themes between the two sets. For instance, are current design methods more or less likely to succeed when designing interfaces enriched in polar or nonpolar amino acids?

One recent study sought to improve design selection methodology by asking the computational protein docking community to establish metrics that discriminate designed proteins that were known not to bind from natural interfaces.3 Some of the best discriminating metrics showed that the designs had unfavorable solvation energy at the interface and poor electrostatic complementarity between the two proteins in the complex. However, most metrics failed to distinguish natural small hydrophobic interfaces from designed small hydrophobic interfaces.

Here, we focus on the differences between failed and successful designs in addition to comparing design models with native interfaces. We choose to investigate designs made with the molecular modeling program Rosetta because we have access to a large number of design models and many of the recent successful designs were made using Rosetta. Our data set contains five successful interface designs and 153 failures. The successes and failures represent a wide variety of design goals including the creation of both heterodimers and homodimers. In all cases, the design models were created using Rosetta's rotamer optimization algorithms and full atom energy function to optimize contacts at the target interface. The Rosetta energy function emphasizes short range forces including steric repulsion, London dispersion forces, hydrogen bonding, and bond torsion strain.10 Solvent is modeled implicitly with the pair-wise additive desolvation model from the EEF1 force field.11

In general, we find that the designs are smaller and more hydrophobic than native protein interactions. Though most designs fail to form experimentally, the ones that successfully interact are dominated by hydrophobic packing interactions. All attempts to design polar, hydrogen bond-rich interfaces have failed to produce proteins that bind. We address possible causes of, and solutions to, the discrepancies between designed and native protein–protein interfaces.


Abbreviations: ΔGbind, calculated binding energy of two proteins; HA, hemagglutinin; REU, Rosetta energy unit; ΔSASA, change in solvent accessible surface area upon binding.


Definition of a successful design

For the purpose of this study, the computational interface designs were divided into three categories: strong success, weak success, and failure. A strong success is defined as a high affinity interaction (KD ≤ 10 μM) where the X-ray crystal structure of the complex closely matches the computational prediction. A weak success has at least a moderate affinity (KD ≤ 100 μM) and either mutational or NMR chemical shift data suggest the interface forms as designed. A failed design does not meet the previous criteria. Figure 1 shows several successful and failed de novo protein interface design models. Table I shows a summary of how many designs satisfy either definition of success. A complete list of structures used is given in Supporting Information Table SI.
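The three categories can be written as a small decision rule. The following is a minimal sketch, not the authors' procedure: the function name, argument names, and the convention of passing `None` for undetectable binding are illustrative assumptions, while the two KD thresholds come from the definitions above.

```python
from typing import Optional

STRONG_KD_UM = 10.0   # strong success requires KD <= 10 μM
WEAK_KD_UM = 100.0    # weak success requires KD <= 100 μM


def classify_design(kd_um: Optional[float],
                    crystal_matches_model: bool = False,
                    supporting_evidence: bool = False) -> str:
    """Return 'strong success', 'weak success', or 'failure'.

    kd_um: measured KD in μM, or None if no binding was detected
           (an assumed convention for this sketch).
    crystal_matches_model: an X-ray structure closely matches the design model.
    supporting_evidence: mutational or NMR chemical shift data suggest the
                         interface forms as designed.
    """
    if kd_um is None:
        return "failure"
    if kd_um <= STRONG_KD_UM and crystal_matches_model:
        return "strong success"
    if kd_um <= WEAK_KD_UM and supporting_evidence:
        return "weak success"
    return "failure"
```

Under this sketch, a tight binder without any structural or mutational support is still counted as a failure, which mirrors the requirement that affinity alone does not establish that the interface forms as designed.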

Figure 1.

Examples of successful (left) and unsuccessful (right) protein interface design models. Separate chains of successful designs are shown in purple and gray; the different chains of failed models are colored green and brown; dashed black lines represent interface-spanning hydrogen bonds that involve side chains. The structures shown are example design models of β-strand-mediated interfaces (A, B), two models targeting flu HA (C, D), and the design of helical secondary structure to bind a target protein. The successful models shown are βdimer1 (A), HB36 (C), and GLhelix-4 (D). [Color figure can be viewed in the online issue, which is available at wileyonlinelibrary.com.]

Table I. The Numbers of Experimentally Tested Computational Protein–Protein Interfaces Examined in This Work

Design goal                              No. tested   Expressed/soluble   Strong success   Weak success
PAK1 binders (ref. 14)                   10           6                   0                1
GTPase binders                           6            6                   0                0
Gαi binding peptides (ref. 5)            11           11                  1                0
Ubiquitin or UbcH7 binders               5            5                   0                0
Metal-mediated homodimers (ref. 8)       8            6                   1                0
Metal-mediated heterodimers              6            5                   0                1
β-strand-mediated homodimers (ref. 7)    10           10                  1                1
FNIII to SH3-domain                      3            3                   0                0
Flu-hemagglutinin binders (refs. 6, 9)   88           73                  2                0

Note: The number of expressed/soluble proteins represents the number of the total that could actually be expressed in the experimental system and did not aggregate. Strong successes are high affinity interactions (KD ≤ 10 μM) where an X-ray crystal structure matches the design model. Weak successes (KD ≤ 100 μM) have moderate to high affinity and other experimental evidence that the interface forms as designed. Citations are given when available.

There are five examples in our list of models that meet the criteria for a successful protein–protein interface design. The first is the design of the structure and sequence of a peptide that binds to Gαi1 (GLhelix-4).5 Two others are proteins that were redesigned to bind influenza hemagglutinin (HB36 and HB80).6, 9 The final two are redesigns of natural monomeric proteins to form symmetric homodimers via Zn2+ binding (MID1)8 or β-strand pairing (βdimer1).7 We chose to use the MID1 H12E mutant throughout the rest of this investigation because the crystal structure was a closer match to the design model. Some successful interface designs that are not included in Table I are a novel helical tetramer from Harbury et al.12 and two large assemblies designed by King et al.13 We chose not to include these designs in our analysis because the other design goals were the construction of dimers.

Four designed interfaces are classified as weak successes. Three of them lack a crystal structure of the modeled complex: a low affinity binder to PAK1,14 a Zn2+-mediated heterodimer, and a β-strand-mediated homodimer (βdimer2).7 The designed interaction of Prb and Pdar is classified as a weak success because the crystal structure of the complex shows a binding orientation rotated 180° relative to the computational model.4

The designed interfaces are small

Natural dimeric interfaces sample a broad range of contact areas, with a change in solvent accessible surface area (ΔSASA) of 850–10,000 Å² (up to 7000 Å² for heterodimers) (Supporting Information Fig. S1).15 The designs sample much smaller interfaces ranging from 850 to 2400 Å², with the majority of designs having an interface area between 1000 and 1600 Å² (Supporting Information Fig. S1 inset). Successful designs are represented over the range of designed interface sizes and the crystal structures of the successful designs show a similar ΔSASA to the design models. There have been no successful dimer designs where the interface area is over 1600 Å², suggesting that better sampling and additional effort are required to recapitulate the sizes of native complexes.

Native proteins have a similar interface energy density as calculated with Rosetta (ΔGbind/ΔSASA) across all sizes of interfaces [Fig. 2(A,B)] while the designed interfaces vary in energy density depending on the size of the interface [Fig. 2(C)]. Larger designed interfaces tend to have a less favorable ΔGbind/ΔSASA than smaller ones. This observation suggests that, unlike native complexes, current sampling and design strategies are unable to create high-quality contacts across large interfaces. It should be noted that most of the protocols used to produce the computational designs allow rigid body motion between the protein chains but not substantial backbone rearrangement. Crystal structures of the successful designs maintain a similar ΔGbind/ΔSASA to the computational models (Supporting Information Fig. S2). However, ΔGbind/ΔSASA cannot be used to clearly separate failed from successful designs. Both the failed and successful designs have ΔGbind/ΔSASA values that are similar to native interfaces, which reflects the fact that ΔGbind is optimized during the design process.

Figure 2.

Interface energy density (ΔGbind/ΔSASA) as computed by Rosetta for designed and natural interfaces. The change in SASA upon binding versus ΔGbind/ΔSASA is shown for native heterodimers (A), native homodimers (B), and all designed interfaces (C). Large points with gray interiors represent the successful design models. Least-squares lines were fit to each set of interfaces. The correlation coefficient for the design models is r = 0.33.

Success is not determined by packing quality

Proteins designed with Rosetta can exhibit lower packing quality than natural proteins.16 Atomic packing defects at modeled protein interfaces can indicate that a complex is unlikely to form experimentally.17 We analyzed the design models to determine if poor packing quality was responsible for the failure of designed interfaces to form. Two measures of packing contacts at a protein interface were used to interrogate the designs and native structures: a shape complementarity score18 and the RosettaHoles score.19 Residues at designed and native interfaces do not show a difference in packing quality (Fig. 3). Furthermore, the successful designs do not cluster towards better shape complementarity [Fig. 3(A)] or a better RosettaHoles score [Fig. 3(B)]. Rotamer optimization and minimization of crystal structures of the natural interfaces do not significantly alter the packing quality, and the crystal structures of the successful designs span a similar range of packing quality to the design models (Supporting Information Fig. S3). Thus, these two metrics are not sufficient to distinguish which designs were likely to form from those that failed.

Figure 3.

Packing quality measures for the design models and Rosetta-minimized natural interfaces. Two independent measures of packing quality are shown: (A) the shape complementarity score18 for the interface and (B) the RosettaHoles score for residues at the interface. For each metric a value of 1.0 represents perfect packing, while lower values represent packing defects. Lines represent the successful design models.

Natural protein–protein interfaces contain hot-spot residues that contribute a large amount to the binding energy of the complex.20 We employed an alanine-scanning method in Rosetta to determine if designed interfaces exhibited a similar trend. The designed heterodimeric interfaces have a similar number of hot-spot residues (defined as ΔΔGbind > +2.0 REU) to natural proteins, further demonstrating that Rosetta can form packing interactions similar to natural proteins (Supporting Information Fig. S4).
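The hot-spot count reduces to a threshold on per-residue alanine-scan values. The sketch below assumes the ΔΔGbind values have already been computed elsewhere; the function name, the dictionary layout, and the residue labels are hypothetical, and only the +2.0 REU cutoff is taken from the text.

```python
from typing import Dict

HOT_SPOT_CUTOFF_REU = 2.0  # residues with ΔΔGbind above this value count as hot spots


def count_hot_spots(ddg_by_residue: Dict[str, float]) -> int:
    """Count residues whose alanine mutation costs more than the cutoff.

    ddg_by_residue maps a residue label (hypothetical format, e.g. 'A45')
    to its computed ΔΔGbind in REU upon mutation to alanine.
    """
    return sum(1 for ddg in ddg_by_residue.values() if ddg > HOT_SPOT_CUTOFF_REU)


# Illustrative values only, not data from the study.
example_scan = {"A45": 2.5, "A47": 0.3, "B12": 2.0, "B30": 3.1}
n_hot_spots = count_hot_spots(example_scan)
```

The comparison is strict, so a residue at exactly +2.0 REU is not counted, matching the "greater than" wording of the definition.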

Successful designs have few polar interactions

Successful Rosetta designed interfaces have a low amount of polar area at their interface compared with many of the other computational models and native interfaces. The amount of ΔSASA that polar atoms contribute to an interface normalized by the total ΔSASA of the interface shows that the successful designs all have below 40% of their interface made up of polar atoms (Fig. 4). The successful design that has the largest fraction of polar interface area, βdimer1, has six main-chain to main-chain hydrogen bonds across the interface which account for a large amount of buried polar area. The remainder of the βdimer1 interface is predominantly hydrophobic.
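The polar fraction described here is the polar share of the total buried area. The sketch below is an illustration under stated assumptions: the text does not list which atoms count as polar, so treating nitrogen and oxygen as polar is an assumption of this example, and the function name and input layout are hypothetical.

```python
from typing import Iterable, Tuple

# Assumption for this sketch: nitrogen and oxygen atoms are treated as polar.
POLAR_ELEMENTS = {"N", "O"}


def polar_fraction(atom_dsasa: Iterable[Tuple[str, float]]) -> float:
    """Fraction of the interface ΔSASA contributed by polar atoms.

    atom_dsasa: (element symbol, ΔSASA in Å² buried upon binding) for each
    interface atom.
    """
    records = list(atom_dsasa)  # allow generators to be passed in
    total = sum(dsasa for _, dsasa in records)
    if total <= 0:
        raise ValueError("total interface ΔSASA must be positive")
    polar = sum(dsasa for element, dsasa in records if element in POLAR_ELEMENTS)
    return polar / total


# Illustrative values only: 20 Å² of polar area out of 100 Å² buried in total.
fraction = polar_fraction([("C", 30.0), ("N", 10.0), ("O", 10.0), ("C", 50.0)])
```

A value below 0.40 would place a model in the range occupied by the successful designs discussed above.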

Figure 4.

Polar content of designed and natural interfaces. The polar fraction of interface area is shown for designs versus heterodimers (A) and homodimers (B). Successful designs are highlighted by lines. An asterisk above a line denotes the value for the crystal structure; lines without an asterisk denote the successful design models.

Overall, Rosetta designed interfaces have less contribution from polar interactions at the interface than natural dimers. This was noticed previously when comparing proteins designed to bind HA to natural heterodimers.3 Even after including the design models from our lab, Rosetta designed heterodimers still have lower polar content at the interface compared with natural heterodimers [Fig. 4(A)]. The amount of polar interface area for designed and natural homodimers is similar [Fig. 4(B)]. The proteins designed to bind HA targeted a hydrophobic region on HA, thus raising the possibility that those designs skew the data set to disfavor successful polar interactions. However, both the designed HA binders and the proteins designed in our lab include designs that span a range of polar content ranging from ∼30–50% of the interface area (Supporting Information Fig. S5).

The design of polar residues at an interface can result in the burial of a polar atom without a hydrogen-bonding partner. Native interfaces tend to have no more than two buried, unsatisfied, polar atoms per 1000 Å² of interface [Fig. 5(A)]. The design models have a similar number of buried unsatisfied polar atoms to native interfaces. The crystal structures of three of the strong successes (GLhelix-4, MID1, and βdimer1) have no buried unsatisfied polar atoms at the interface [Fig. 5(A)]. The design models for these three interfaces also have no buried unsatisfied polar atoms (Supporting Information Fig. S6A). The two successful HA binders (HB36 and HB80) have buried unsatisfied polar atoms at their interface (Supporting Information Fig. S6A). Following affinity maturation, the crystal structure of these interfaces shows a drop in the number of buried unsatisfied polar atoms compared with the design models [Fig. 5(A)]. This reduction could indicate one way in which directed evolution was able to raise the affinity of the interaction. All design models that had more than two buried polar groups without a hydrogen bond partner in the design model failed to form high affinity complexes.

Figure 5.

Buried polar atoms and buried hydrogen bonds at interfaces. Values for Rosetta minimized crystal structures are shown by lines. A: The number of buried polar atoms without a hydrogen-bonding partner per 1,000 Å² of interface area. B: The total energy of buried hydrogen bonds involving side chains at the interface as a fraction of total binding energy (ΔGbind). HB36 and HB80 acquired an additional buried hydrogen bond due to mutations introduced by affinity maturation.

One of the most striking differences between successful designs, unsuccessful designs, and native complexes is the amount of binding energy, as calculated by Rosetta, contributed by buried hydrogen bonds involving side chains. The successful designs all have few or zero buried hydrogen bonds [Fig. 5(B)]. The design models of GLhelix-4, HB36, and HB80 each have one buried hydrogen bond across the interface (Supporting Information Fig. S6B). A buried hydrogen bond in GLhelix-4 was not observed in the crystal structure. MID1 and βdimer1 have no buried side chain hydrogen bonds across the interface. One new buried hydrogen bond was introduced to HB36 and HB80 during affinity maturation of the computational design. In cases where multiple buried hydrogen bonds were present in the design model, the designed complex failed to form. This is not because buried hydrogen bonds are forbidden by the rules of physical chemistry: many native interfaces have multiple buried hydrogen bonds, and a significant portion of their binding energy as calculated by Rosetta is derived from hydrogen bonds.

Successful interface designs made with computational programs other than Rosetta have also had interfaces that are predominantly hydrophobic. For example, the de novo designed helical bundle RH4 (ref. 12) has a polar ΔSASA fraction of 0.23, which is lower than the successful Rosetta designs described here as well as many native interfaces. It also has no buried unsatisfied polar atoms and no buried hydrogen bonds.


These results indicate that successful Rosetta designed protein–protein interactions differ from unsuccessful designs and native interactions in the polar makeup of the interface. Designed interactions tend to be more hydrophobic and smaller than most natural protein–protein interfaces. The successful designs have less polar area at the interface when compared to most design models and few buried hydrogen bonds or unsatisfied polar atoms. Burying polar atoms, even those modeled to form hydrogen bonds, appears detrimental to the success of a computational interface design.

The scarcity of polar interactions in the successful designs highlights the difficulty of designing polar interactions at protein–protein interfaces. There have been several examples of the successful design of new hydrogen bonds at a natural interface;21–23 however, these redesigns have lower affinity than the wild type interaction. New hydrogen bonds can increase the affinity of a natural interaction in some cases, typically by designing an interface-spanning salt bridge at the edge of an interface.24, 25 However, there are no buried salt bridges in the successful designs investigated here. Another strategy for increasing affinity involves replacing a polar residue with a nonpolar one, or a small hydrophobic residue with a larger one.26 None of the examples of successful novel interface design derive a large portion of their interface from polar interactions. Unlike computational methods, nature is able to make protein interfaces with substantial polar area and hydrogen bond interactions [Fig. 4(A,B)].

There are more examples of successful computational redesign of natural protein–protein interactions for increased affinity24–29 or altered specificity21–23, 30, 31 than of the design of a new protein interface. Energy and search functions are able to optimize the local interactions required for binding in the context of a known partner. The design of a novel interface requires searching for alignments of two proteins and the addition of new residue interactions without a native like context to help direct the simulation.2 A search strategy that is able to orient two protein scaffolds into an arrangement similar to a native conformation could turn the difficult problem of novel interface design into the more tractable one of redesign of native interactions.

Inaccuracies in the Rosetta energy function could account for the failure of polar designs. Rosetta does not explicitly model water during design, in part because a previous effort to model with explicitly solvated rotamers did not yield improvements in computational benchmarks.32 Natural homodimers and heterodimers contain about 10 water molecules per 1000 Å² of interface area. On average 30% of these waters are buried from bulk solvent.33 The inability to account for waters at polar interfaces could prevent computational methods from finding a sequence that allows for binding. We are unable to draw conclusions about the solvent content of the designed interfaces because the crystal structures for HB36, HB80, and GLhelix-4 were not determined at a resolution high enough to allow for accurate water placement near the designed interface. Another reason that polar designs fail could be the design of poor hydrogen bonding geometries. The Rosetta hydrogen bond energy function used to make the design models does not include a term to ensure that a hydrogen bond donor is in the plane of lone-pair electrons on acceptor carbonyl oxygens.34 Some of the failed designs have interface-spanning hydrogen bonds that are more than 60° away from optimal sp2 acceptor geometry. In addition, none of the design protocols that produced the models investigated here made use of long-range electrostatic interactions. Rosetta employs a coarse-grained energy term that favors the proximity of residues with opposite charges. Including a more complex electrostatic potential can improve prediction of ΔΔGbind in Rosetta.9 An alternative target function, for instance optimizing for ΔGbind instead of total energy, could also be a way to approach future design goals.35

Another reason that polar designs could be failing is insufficient sampling and stabilization of a binding-competent conformation. Fleishman et al. previously noted that interface residues in natural complexes tend to favor a similar rotamer in both the bound and unbound form, whereas designed interfaces did not favor the bound rotamer in the unbound state.36 In addition, residues with three or four dihedral angles tend to undergo rotameric shifts upon binding,37 suggesting that sampling large rotamer libraries for these residues might be necessary at protein–protein interfaces. Failed designs are rarely investigated further to determine other possible reasons the design did not interact with the target. Mutating a large number of residues in the design process could destabilize the designed protein or alter the intrachain contacts such that the designed protein's conformation does not match the model. In fact, some MID1 structures show noticeable backbone rearrangements from the starting structure.8 Experimental determination of structures of failed designs could help inform new design methodology.

The successful protein–protein interaction designs outlined here show that it is now possible to design interactions using a variety of strategies as long as the interaction is small and hydrophobic. In addition, residues in either α-helices or β-strands dominate all successful designs [Fig. 1(A,C,D)]. Three important challenges in de novo computational protein–protein interface design remain: (1) the design of an interaction where over 40% of the atoms at the interface are polar and several buried hydrogen bonds are made; (2) the design of a single interface larger than 1,600 Å²; (3) the design of a loop-based interaction. The absence of a successful loop-mediated design is surprising given the prevalence of loops in interfaces from phage display38 and the development of methods to accurately design and model loops.39, 40 To achieve these goals it is likely that there will need to be improvements in conformational search methods and in energy functions for protein design.

Materials and Methods

Set of designed interfaces

The computational models used in this analysis represent a wide array of interface design goals (Table I). The design models fall in two main categories: (i) design of one protein chain to bind to a natural target and (ii) design of both chains involved in an interaction to create a novel heterodimer or homodimer. The majority of the designs, 140 out of 158, fall into the first category. These predominantly consist of interfaces of a scaffold designed to bind some target of interest such as a small GTPase, PAK1,14 proteins involved in ubiquitin transfer, and influenza hemagglutinin from Fleishman et al.6 Another 11 models represent the design of both the structure and sequence of a peptide to bind Gαi1.5 The second category is comprised of 18 redesigns of natural proteins to form homodimers mediated by metal binding8 or β-strands,7 and 11 models from Karanicolas et al. where both interface forming chains are designed to form a new heterodimer.4

Of the 59 designs from our laboratory, 52 expressed successfully in E. coli. All designs made by our group are available in Supporting Information. Seventy-three of the 88 proteins designed to bind HA expressed successfully using yeast surface display. All of the designed pairs from Karanicolas et al. expressed successfully.

The interfaces used for the native dataset were taken from those chosen by Zhanhua et al.15 This set is comprised of high-resolution X-ray crystal structures (resolution < 2.5 Å) of 170 homodimers and 156 heterodimers. Of these, 167 homodimers and 152 heterodimers were read by Rosetta and used in this analysis (Supporting Information Table SII).

Computational evaluation of protein interfaces

The natural and designed interactions were all minimized with Rosetta to make energy evaluations between them comparable. The minimized and X-ray crystal structures were then evaluated for several metrics including computed binding energy (ΔGbind = E_AB − E_A − E_B, where E_AB is the energy of the complex and E_A and E_B are the energies of the separated chains), buried solvent accessible surface area upon binding (ΔSASA), and buried unsatisfied polar atoms at the interface (discussed below). A full description of the computational protocols used and the command lines is given in Supporting Information.
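As a worked illustration of these metrics, the binding energy is conventionally the energy of the complex minus the energies of the two separated chains, and the interface energy density is that value divided by ΔSASA. The sketch below uses hypothetical function names and made-up numbers; it does not call Rosetta and is not the protocol from the Supporting Information.

```python
def binding_energy(e_complex: float, e_chain_a: float, e_chain_b: float) -> float:
    """ΔGbind in REU: energy of the complex minus the separated chains."""
    return e_complex - e_chain_a - e_chain_b


def energy_density(dg_bind: float, dsasa: float) -> float:
    """Interface energy density ΔGbind/ΔSASA in REU per Å²."""
    if dsasa <= 0:
        raise ValueError("ΔSASA must be positive")
    return dg_bind / dsasa


# Illustrative values only, not data from the study.
dg = binding_energy(e_complex=-350.0, e_chain_a=-180.0, e_chain_b=-150.0)  # -20.0 REU
density = energy_density(dg, dsasa=1250.0)                                 # -0.016 REU/Å²
```

More negative values indicate a more favorable computed interaction, so a less negative density on a large interface corresponds to the weaker contacts per unit area described for the larger designs.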

Polar burial definition

Rosetta calculates SASA using the Le Grand and Merz method.41 The SASA for a polar atom is the sum of the SASA for that atom plus the SASA for any bound hydrogens. A polar atom is defined as buried if the total SASA for that atom is less than 0.1 Å². If a buried polar atom does not have a hydrogen-bonding partner, defined as having a hydrogen-bond energy of less than 0.0 REU, then that atom is considered buried and unsatisfied. A hydrogen bond is defined as buried if the SASA for the two involved polar atoms is less than 3.0 Å². Based on distances observed from low B-factor waters to protein atoms42 we chose to use atomic radii from Reduce43 and a water probe radius of 1.2 Å to find buried polar atoms and hydrogen bonds. Buried and unsatisfied hydrogen bonds for the natural interfaces were calculated based on the conformation in the crystal structure because it has been observed that repacking a structure with Rosetta can increase the number of buried unsatisfied polars.16
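These burial criteria can be expressed as a few threshold checks. The sketch below is an illustration, not Rosetta code: the `PolarAtom` record and all function names are hypothetical, SASA and hydrogen-bond energies are assumed to be precomputed, and the 3.0 Å² hydrogen-bond cutoff is interpreted here as applying to the combined SASA of the two atoms, which is one reading of the definition.

```python
from dataclasses import dataclass
from typing import Iterable

BURIED_ATOM_SASA = 0.1   # Å²; atom SASA plus SASA of its bound hydrogens
BURIED_HBOND_SASA = 3.0  # Å²; assumed to apply to donor + acceptor combined


@dataclass(frozen=True)
class PolarAtom:
    name: str
    sasa: float          # Å², heavy atom plus any bound hydrogens
    hbond_energy: float  # REU; a value below 0.0 means a partner exists


def is_buried_unsatisfied(atom: PolarAtom) -> bool:
    """Buried (SASA < 0.1 Å²) and without a hydrogen-bonding partner."""
    buried = atom.sasa < BURIED_ATOM_SASA
    satisfied = atom.hbond_energy < 0.0
    return buried and not satisfied


def hbond_is_buried(donor: PolarAtom, acceptor: PolarAtom) -> bool:
    """Buried hydrogen bond under the combined-SASA reading of the cutoff."""
    return donor.sasa + acceptor.sasa < BURIED_HBOND_SASA


def unsatisfied_per_1000(atoms: Iterable[PolarAtom], dsasa: float) -> float:
    """Buried unsatisfied polar atoms per 1000 Å² of interface area."""
    if dsasa <= 0:
        raise ValueError("ΔSASA must be positive")
    count = sum(1 for atom in atoms if is_buried_unsatisfied(atom))
    return 1000.0 * count / dsasa
```

The normalization in `unsatisfied_per_1000` matches the per-1000 Å² scale used when comparing design models with native interfaces.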


The authors would like to thank those who shared designed models and unpublished experimental results with us, including Bryan Der, Ramesh Jha, Steven Lewis, and Deanne Sammond.