Keywords:

  • crystal structure;
  • canonical structure;
  • cis-isomer;
  • CDR conformation;
  • structure prediction

ABSTRACT

Despite sequence diversity, five out of six hypervariable loops in antibodies assume a limited number of conformations called canonical structures. Their correct identification is essential for successful prediction of antibody structure. This in turn requires regular updates of the classification of canonical structures to match the expanding experimental database. Antibodies with the eight-residue CDR-L3 represent the second most common type of antibodies after those with the nine-residue CDR-L3. We have analyzed all crystal structures of Fab and Fv with the eight-residue CDR-L3 and identified three major canonical structures covering 82% of a nonredundant set. In most cases, the canonical structure is defined by the absence or presence and position of a proline residue within the CDR. Proteins 2014; 82:1668–1673. © 2014 Wiley Periodicals, Inc.


INTRODUCTION

The antigen-binding sites of antibodies are formed by six loops, three each from the variable domains of the light chain (VL) and of the heavy chain (VH). These “complementarity-determining regions” or CDRs[1] are hypervariable in sequence; however, five of them assume a limited number of main-chain conformations, called “canonical structures.”[2] These conformations are determined by the CDR length and by the presence of key amino acid residues at specific positions either within the CDRs or in the framework regions. The specific pattern of residues that determines each canonical structure forms a “signature” by which a canonical structure can be recognized in the sequence of an immunoglobulin of unknown structure and can, therefore, be predicted from sequence alone.[3]

CDR-L3 connects two C-terminal β-strands of the VL domain and forms a wide loop anchored at positions 90 and 97, which are part of the β-scaffold [Fig. 1(A)]. CDR-L3 typically contains nine residues between the invariant residues Cys88 and Phe98 with a cis-proline occupying position 95. Due to junctional diversity in V-J recombination, a significant fraction of antibodies contains only eight residues in CDR-L3. Structurally, the deletion occurs at position 95 resulting in the sequences either without Pro, or with a Pro at positions 94 or 96.

Figure 1. Canonical structures for CDR-L3. A: The most typical nine-residue canonical structure L3-9-cis7 (yellow) superimposed on L3-8-NP (blue). B: L3-8-NP (blue) and L3-8-P7 (orange). C: L3-8-NP (blue) and L3-8-P6 (green). D: L3-8-NP (blue) and L3-8-NP-sub (pink). For non-Pro residues only backbone atoms are shown. Pro95 in L3-9-cis7 and Pro94 in L3-8-P6 are cis, Pro96 in L3-8-P7 is trans. Main-chain hydrogen bonds are shown by dashed lines.

The 8-residue CDR-L3 canonical structures were systematically classified in 2009,[4] when 14 crystal structures were assigned to four types: 3A, 3B, 6, and 7. Two types, 3A and 7, contained only two members each and could hardly be called “canonical,” but structures determined since then have confirmed the assignment.

In the most recent and comprehensive classification,[5] 19 nonredundant structures with the 8-residue CDR-L3 were grouped into three clusters on the basis of the backbone dihedral angles φ and ψ. Two clusters, L3-8-1 and L3-8-cis6, were clearly defined with an average φ/ψ deviation from the median structure of only 10°. However, the third cluster, L3-8-2, was loosely defined by four structures with an average φ/ψ deviation of 41° and apparently included all structures of Types 3A and 3B.

In our analysis of the eight-residue CDR-L3, we expanded the database by including all Fab and Fv crystal structures available to date, the number of which nearly tripled over the last 3 years. This allowed us to identify two main canonical structures that cover 70% of all observed conformations. The remaining structures were grouped into three categories, one of which is new. Inspection of electron density maps reveals several cases where the canonical type was previously assigned incorrectly because of errors in modeling CDR-L3.

MATERIALS AND METHODS

Antibody structures (Fab and Fv) with the 8-residue CDR-L3 were selected from the Protein Data Bank[6] using the IMGT server.[7] In total, 132 structures determined by X-ray crystallography (121 Fab and 11 Fv) were identified. All of them have kappa light chains. Taking into account sequence identity, this set was reduced to 66 nonredundant structures. For each unique sequence, the structure determined at the highest resolution was selected. Out of these, five low-resolution structures (>3 Å) were excluded. Electron density was manually inspected to ensure that the CDR-L3 conformation was defined unambiguously. Two structures (1tzh and 3dgg) were removed from the set because of poor electron density for CDR-L3. Three more structures (1eo8, 1fn4, 1kcr) were removed because of poor quality as indicated by a large number of outliers in the Ramachandran plot including CDR-L3 residues. Electron density for these three structures was not available because structure factors were not deposited in the PDB. Application of all filters resulted in a set of 56 structures.
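The selection steps described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure actually used: the record type `Entry`, the field `sequence_key` (standing in for whatever defines a unique sequence), and the function name are hypothetical, and the manual inspection of electron density is represented simply by a set of excluded identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    pdb_id: str
    sequence_key: str   # hypothetical label identifying a unique sequence
    resolution: float   # in Å; smaller is better


def select_reference_set(entries, max_resolution=3.0, excluded_ids=()):
    """Keep the best-resolved entry per sequence, then apply the filters."""
    best = {}
    for entry in entries:
        current = best.get(entry.sequence_key)
        if current is None or entry.resolution < current.resolution:
            best[entry.sequence_key] = entry
    kept = [
        e for e in best.values()
        if e.resolution <= max_resolution and e.pdb_id not in excluded_ids
    ]
    return sorted(kept, key=lambda e: e.pdb_id)


# Toy data with made-up identifiers, not real PDB entries.
toy = [
    Entry("aaaa", "seq1", 2.5),
    Entry("bbbb", "seq1", 1.9),   # same sequence, better resolution
    Entry("cccc", "seq2", 3.4),   # worse than the 3 Å cutoff
    Entry("dddd", "seq3", 2.0),   # removed after manual inspection
]
print([e.pdb_id for e in select_reference_set(toy, excluded_ids={"dddd"})])
```

With the counts given in the text, the same sequence of filters reduces 66 nonredundant structures by 5, 2, and 3 entries to the final set of 56.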

The Chothia antibody numbering scheme[2, 8] is used throughout the paper. The CDR-L3 definition according to both Kabat[1] and Chothia[2, 8] includes residues between Cys88 and Phe98. In the eight-residue CDR-L3, the residue at position 95 is absent.
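As an illustration of this numbering, the following minimal sketch maps a CDR-L3 sequence onto positions 89 to 97, skipping position 95 for the eight-residue loop. The function name and constants are hypothetical helpers introduced here for clarity; insertions and other loop lengths are not handled.

```python
# Positions spanned by CDR-L3 (between Cys88 and Phe98).
NINE_RESIDUE_POSITIONS = [89, 90, 91, 92, 93, 94, 95, 96, 97]
# In the eight-residue loop, position 95 is absent.
EIGHT_RESIDUE_POSITIONS = [p for p in NINE_RESIDUE_POSITIONS if p != 95]


def number_cdr_l3(sequence):
    """Map a CDR-L3 sequence (one-letter codes) to numbered positions."""
    if len(sequence) == 9:
        positions = NINE_RESIDUE_POSITIONS
    elif len(sequence) == 8:
        positions = EIGHT_RESIDUE_POSITIONS
    else:
        raise ValueError("only eight- and nine-residue CDR-L3 are handled")
    return dict(zip(positions, sequence))


numbered = number_cdr_l3("KQSYNLRT")  # CDR-L3 sequence of 1q9k
print(numbered[94], numbered[96])     # prints "L R"
```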

The CDR conformations were described in terms of the backbone dihedral angles φ and ψ. For visual convenience, the CDR sequences were mapped onto the Ramachandran plot divided into six regions (Fig. 2) following North et al.[5] Canonical structures were assigned manually on the basis of φ/ψ patterns only. Spatial orientation with respect to other CDRs was not taken into account. All crystallographic calculations were performed with the CCP4 suite of programs.[9] Protein structures were inspected using Coot.[10] Figures were created with PyMOL, version 0.98 (DeLano Scientific, LLC).
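For readers who want to reproduce such a description, a dihedral angle can be computed from four atomic positions as sketched below. This is a generic textbook formula written with the standard library only, not the CCP4 implementation. By the usual convention, φ of residue i is taken over C(i-1), N(i), CA(i), C(i) and ψ over N(i), CA(i), C(i), N(i+1); the peptide angle ω over CA(i-1), C(i-1), N(i), CA(i) is close to 0° for cis and 180° for trans.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)   # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)   # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Idealized geometry: a cis-like arrangement gives an angle near 0,
# a trans-like arrangement gives an angle near 180 (in absolute value).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```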

Figure 2. Regions of the Ramachandran plot according to North et al.[5] A for α-helix, B for β-sheet, P for polyproline II, L for the left-handed helix, D for the δ-region, G for the γ-region.

RESULTS

Although less frequent than the nine-residue CDR-L3, the eight-residue version is the second most common among the PDB structures. The conformations of CDR-L3 observed in the crystal structures available to date cluster into two large groups that together cover 39 out of 56 nonredundant structures. The two conformations exhibit a common wide loop encompassing residues 91–96 (without 95) and differ at positions 94–96, which fall either in the “GB” or in the “PP” region of the Ramachandran plot. The former group, with the “GB” conformation, contains no Pro residues and shows a strong preference for Leu at position 94 and for Arg or an aromatic residue at position 96 (Table 1). The group includes 25 structures, which is nearly half of the reference dataset. This canonical structure was identified as the main cluster L3-8-1 by North et al.[5] and was known before as Type 6.[4] As the group is characterized by the lack of proline residues, we labeled it L3-8-NP, i.e., “No-Pro” (Table 2).
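The difference between the two main conformations can be read directly from their letter codes in Table 1. The sketch below is an illustrative helper (its name and the position list are introduced here, not taken from any published tool) that reports the positions at which two eight-letter conformation codes differ. The comparison is case-sensitive, so a lowercase letter marking a cis residue counts as a difference.

```python
# Positions of the eight-residue CDR-L3 (position 95 is absent).
EIGHT_RESIDUE_POSITIONS = [89, 90, 91, 92, 93, 94, 96, 97]


def differing_positions(code_a, code_b):
    """Return the positions at which two conformation codes differ."""
    if len(code_a) != 8 or len(code_b) != 8:
        raise ValueError("expected eight-letter conformation codes")
    return [
        pos
        for pos, a, b in zip(EIGHT_RESIDUE_POSITIONS, code_a, code_b)
        if a != b
    ]


# Most common code of L3-8-NP versus most common code of L3-8-P7 (Table 1).
print(differing_positions("BPDABGBB", "BPDABPPB"))  # prints [94, 96]
```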

Table 1. Canonical Structures for the 8-Residue CDR-L3

Structure  Resolution (Å)  Source     Antigen  Sequence  Conformation  rmsd (°)

L3-8-NP (rmsd = 9.6°)
1a0q-L     2.3             Mouse      hap      LQYYNLRT  BPDABGBB      5.7
1c5d-L     2.4             Rat        prt      LQYGNLYT  BPDABGBB      6.1
1eap-A     2.5             Mouse      hap      LQYYNLRT  BPDABGBB      15.1
1h3p-L     2.6             Mouse      prt      KQSYSLYT  BBDABGBB      6.8
1il1-B     2.2             Mouse      pep      QQYYHYRT  BBDABGBB      8.2
1jrh-L     2.8             Mouse      prt      QQYWSTWT  BPDABGBB      5.3
1q9k-A     1.9             Mouse      hap      KQSYNLRT  BPDABGBB      5.1
1q9o-A     1.8             Mouse      hap      KQSYNLRT  BPDABGBB      5.2
2adf-L     1.9             Mouse      prt      LQYDNLRT  BPDABGBB      5.1
2ck0-L     2.2             Mouse      prt      KQSYNLYT  BBDABGBB      13.7
2g5b-A     2.3             Mouse      pep      KQSYNLRT  BPDABGBB      5.9
2i9l-A     3.1             Mouse      prt      KQSYNLWT  BPDABGBB      9.6
3b9k-L     2.7             Rat        prt      LQYDTLYT  BPDABGBB      11.3
3cmo-L     2.3             Mouse      pep      QQYSKLFT  BPDABGBB      5.4
3dur-A     1.9             Mouse      hap      KQSYNLRT  BPDABGBB      7.4
3dus-A     1.9             Mouse      hap      KQSYNLRT  BPDABGBB      8.2
3hzm-A     1.8             Mouse      hap      KQSYNLRT  BPDABGBB      5.4
3i02-A     2.6             Mouse      hap      KQSYNLRT  BPDABGBB      6.3
3ijh-A     2.1             Mouse      hap      KQSNNLRT  BPDABGBB      7.9
3o2d-L     2.2             Mouse      prt      QQYYSYRT  BPDABGBB      3.9
3okd-A     1.8             Mouse      hap      KQSYNLRT  BPDABGBB      6.0
4dgv-L     1.8             Human      pep      QQRSNWIT  BBDABGBB      10.6
4fz8-L     2.7             Human      prt      MQALQAVG  BBDAPGBB      22.0
4jr9-L     2.6             Mouse      prt      LQYNSLLT  BPDAPGBB      16.4
4m61-A     1.6             Mouse      nuc      HQHLSSWT  BBDABGBB      11.4

L3-8-NP-sub (rmsd = 23.2°)
1ors-A*    1.9             Mouse      prt      HQFHRSLT  BBBDDBBB      31.1
2j88-L     2.6             Mouse      prt      QHHYGTRT  BPDADPBP      22.8
4dvr-L     2.5             Human      prt      QQANSFFT  BPDADPBP      19.1
4irz-L     2.8             Humanized  prt      LQYDNLWT  BBBADBPB      17.1

L3-8-P7 (rmsd = 9.8°)
1dql-L     2.6             Human      pep      LQQNSNWT  BPDABPPB      12.2
1pz5-A     1.8             Mouse      pep      SQTTHVPT  BBDABPPB      11.1
1qkz-L     1.9             Mouse      pep      SQSTHFPT  BBDABPPB      10.4
1t4k-A*    2.5             Mouse      prt      KQSYDLPT  BPDAPPPB      11.5
2xtj-B*    2.7             Human      prt      QQFDGDPT  BPDABPPB      6.6
3hi6-L     2.3             Human      prt      QQSYSTPS  BPDABPPB      9.2
3iet-A     2.2             Mouse      pep      SQSTHVPT  BBDABPPB      7.7
3qeh-B     2.6             Human      prt      MQAKESPT  BBDABPPB      9.3
3qpx-L     2.0             Rat        prt      QQYNSRDT  BPDAPPPB      9.7
3raj-L     3.0             Mouse      prt      QQYWSTPT  BBDABPPB      8.8
3uo1-L     1.6             Mouse      pep      SQSTHVPT  BBDABPPB      10.5
4kuz-L     2.7             Mouse      prt      QQYYSYPT  BBDABPPB      11.3
4kq3-L     1.9             Human      prt      QQYSDDPT  BPDABPPB      7.1
4j1u-A     2.6             Mouse      prt      KQSYDLPT  BPDABPPB      10.2

L3-8-P7-sub (rmsd = 29.7°)
1a7o-L     2.0             Mouse      prt      QHFWSTPT  BBDBGPPB      30.4
1keg-L     2.4             Mouse      nuc      FQGSLVPT  BBABGBPB      21.1
1yqv-L     1.7             Mouse      prt      QQWGRNPT  BBBPABPB      27.9
3oz9-L     1.6             Mouse      prt      HQWSGFYT  BBBBLBPB      38.5
3vw3-L     2.5             Mouse      nuc      FRGSHVPT  BPABGBPB      22.3
3w9d-B     2.3             Human      prt      QQYGSSPT  BBDPABPB      27.9
4hfw-L     2.6             Human      prt      QKTLRTWT  BPDBGPPP      30.7
4kph-L     2.6             Mouse      prt      HQWSSYPT  BBBBABPB      34.7

L3-8-P6 (rmsd = 9.4°)
1e6o-L     1.8             Mouse      prt      QQWNYPFT  BPABPaLP      7.3
2fat-L     1.8             Mouse      prt      QQWNYPFT  BPABPaLP      6.4
3l5y-L     2.8             Humanized  prt      QQHDYPYT  BPAPPaLP      13.1

Not classified
3phq-A     2.0             Mouse      hap      QHSRELRT  BPABGALP
3mcl-L     1.7             Mouse      pep      QNWRSSPT  BBAAPBpA

Antigen: hap, hapten; pep, peptide; prt, protein; nuc, nucleic acid. The conformation is letter-coded as specified in Figure 2. Lowercase letters indicate residues in the cis-conformation. RMSD is the root-mean-square deviation of φ and ψ from the mean values over eight residues of CDR-L3. PDB entries marked with an asterisk were rerefined in this study using X-ray data deposited in the PDB.

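The rmsd column of Table 1 is a deviation in dihedral-angle space rather than in Cartesian coordinates. A minimal sketch of such a measure is given below. It assumes that angular differences are wrapped into the range from -180° to 180° before squaring, which the text does not state explicitly, and it takes the reference (mean) angles as an input instead of computing them. The function names are illustrative.

```python
import math


def angle_diff(a, b):
    """Difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def dihedral_rmsd(angles, reference):
    """RMS deviation (degrees) over paired lists of (phi, psi) tuples."""
    if len(angles) != len(reference) or not angles:
        raise ValueError("expected two non-empty lists of equal length")
    squares = []
    for (phi, psi), (phi_ref, psi_ref) in zip(angles, reference):
        squares.append(angle_diff(phi, phi_ref) ** 2)
        squares.append(angle_diff(psi, psi_ref) ** 2)
    return math.sqrt(sum(squares) / len(squares))


# Hypothetical angles: 170 and -170 degrees are only 20 degrees apart.
print(angle_diff(170.0, -170.0))                        # prints -20.0
print(dihedral_rmsd([(170.0, 0.0)], [(-170.0, 0.0)]))   # about 14.1
```

Without the wrapping step, angles near ±180° (common in the β region) would produce spuriously large deviations.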
Table 2. Relationship Between Classifications of Canonical Structures for the 8-Residue CDR-L3

Kuroda et al.[4]  North et al.[5]  This work        Sequence features
Type 6 (6)        L3-8-1 (13)      L3-8-NP (25)     no Pro, Leu94
N/A               N/A              L3-8-NP-sub (4)  no Pro; His90
Type 3B (4)       N/A              L3-8-P7 (14)     Pro96 (trans)
Type 3A (2)       L3-8-2 (4)       L3-8-P7-sub (8)  Pro96 (trans)
Type 7 (2)        L3-8-cis6-1 (2)  L3-8-P6 (3)      Pro94 (cis)

The number of structures is in parentheses.

In the “PP” group, residues 94–96 adopt a polyproline conformation [Fig. 1(B)]. Residue 96 is almost always (with two exceptions) a proline in the trans-configuration. The group includes 14 structures (Table 1) and corresponds to Type 3B,[4] although it was not identified as a separate cluster by North et al.[5] We label it L3-8-P7 to indicate Pro at position 96 (position 7 within the CDR) as a characteristic feature of the group.

Two members of the group, 1t4k[11] and 2xtj,[12] were assigned to L3-8-P7 after certain corrections in the models. Both structures deposited in the PDB show peptide 93–94 flipped with respect to the canonical structure, resulting in the “LP” conformation for residues 94–96. However, inspection of the electron density maps indicates that both models should be corrected (Fig. 3). In addition to the peptide flip, Gly93 in 2xtj should be modeled in the trans rather than cis configuration. This and other examples emphasize the need for a curated, error-free structural database that could be used for structure classification and as a template source for antibody modeling.

Figure 3. Electron density (2Fo-Fc omit map contoured at 1.2 RMSD) for CDR-L3 with the correct conformation shown in green. A: 1t4k.[11] B: 2xtj.[12]

The third group has canonical structure L3-8-cis6-1 as described by North et al.,[5] which corresponds to Type 7.[4] It includes three structures and is characterized by cis-Pro at position 94, which is the sixth position within the CDR [Fig. 1(C)]. Importantly, all reference structures with Pro94 have it in the cis-configuration and all have the same CDR-L3 canonical structure, which we label L3-8-P6.
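The proline-based signatures of the three groups described above can be summarized as a simple sequence rule. The sketch below is a deliberately naive illustration, not a validated predictor: the function name is hypothetical, only the presence and position of Pro are examined, and the sub-groups and the unclassified structures are not distinguished.

```python
def predict_l3_8_class(sequence):
    """Guess the canonical structure of an eight-residue CDR-L3 from Pro."""
    if len(sequence) != 8:
        raise ValueError("expected an eight-residue CDR-L3 sequence")
    # 1-based positions of proline within the CDR.
    pro_positions = [i + 1 for i, aa in enumerate(sequence.upper()) if aa == "P"]
    if not pro_positions:
        return "L3-8-NP"        # no proline
    if pro_positions == [6]:
        return "L3-8-P6"        # Pro at position 94 (cis in all known cases)
    if pro_positions == [7]:
        return "L3-8-P7"        # Pro at position 96 (usually trans)
    return "unassigned"


print(predict_l3_8_class("KQSYNLRT"))  # prints L3-8-NP
print(predict_l3_8_class("SQSTHVPT"))  # prints L3-8-P7
print(predict_l3_8_class("QQWNYPFT"))  # prints L3-8-P6
```

The rule reproduces the assignment for most entries in Table 1 but not all. For example, 1dql (LQQNSNWT) and 3qpx (QQYNSRDT) are listed under L3-8-P7 although they contain no proline, and 3mcl (QNWRSSPT) has Pro96 but remains unclassified because the proline is in the cis-conformation.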

Although the majority of antibodies adopt one of the three canonical structures, there are examples of certain deviations from the canonical patterns. There are 12 structures with minor deviations that can be put together in the corresponding sub-groups labeled as L3-8-NP-sub and L3-8-P7-sub (Table 1). L3-8-NP-sub includes four structures that are characterized by a flip of peptide 93–94 with respect to canonical structure L3-8-NP [Fig. 1(D)]. None of the structures contains Pro residues.

Members of the L3-8-P7-sub group have residues 94–96 in a polyproline conformation “PP” or “BP”. The differences to L3-8-P7 occur at positions 92 and 93 where the peptide bond flip changes the conformation of residue 92 from α-helical in L3-8-P7 to β-sheet in L3-8-P7-sub. There are eight structures assigned to L3-8-P7-sub, six of which have trans-Pro96 typical for L3-8-P7.

Two structures, 3phq[13] and 3mcl,[14] remain unclassified. Both were determined at high resolution, and the electron density for CDR-L3 is very clear. Like the members of the L3-8-NP group, 3phq has no prolines in CDR-L3 and has a characteristic Leu at position 94. However, it deviates significantly from that canonical structure. On the other hand, the CDR-L3 conformation in 3phq is quite similar to L3-8-P6 despite the fact that there is no proline at the sixth position. For comparison, the backbone root-mean-square deviation (RMSD) from 1q9k[13] (L3-8-NP) and 1e6o[15] (L3-8-P6) is 2.1 Å and 1.0 Å, respectively.

The second unassigned structure, 3mcl, differs from all other structures in that Pro96 is in the cis-conformation. It is also the only example with Asn at position 90, which is usually Gln or His. Whether the two structures represent new canonical conformations remains to be seen.

DISCUSSION

Following the introduction of the canonical structure concept,[2, 16] systematic classifications of the CDR conformations have been updated regularly as the structural database has grown. The analysis performed in 2002 identified two canonical structures for the eight-residue CDR-L3 depending on the presence or absence of Pro at position 96.[17] Sequences with Pro96 were considered to form one canonical structure (known as Type 3), whereas sequences without Pro and with Leu at position 94 were proposed to form another canonical structure (Type 6). Although only two examples of the latter, 1eap[18] and antibody CRIS-1 (not in PDB),[19] were available at that time, this category has since become the most populated among antibodies with “short” CDR-L3 (Table 2).

The 2009 classification divided the classical Type 3 canonical structure into subtypes 3A and 3B based on the dihedral angles for residues 91–93.[4] We followed the same rationale in defining subgroup L3-8-P7-sub (equivalent of subtype 3A) to include a few structures that do not fit L3-8-P7 exactly (equivalent of subtype 3B). In addition to Types 3 (A and B) and 6, Kuroda et al.[4] introduced Type 7 for the structures with Pro at position 94 rather than 96 as in Type 3. Although they did not explicitly state as a requirement that Pro94 should be in the cis-conformation and Pro96 in trans, there have been no exceptions to this rule to date. We believe that this is a consequence of geometric constraints within the 5-residue loop 91–96, which must fit the fixed positions 90 and 97.

With more structures available now, the major canonical class L3-8-NP includes about twice as many structures as in 2011 and about four times as many as in 2009. Analysis of the expanded set of structures also prompted us to introduce a no-Pro sub-group, L3-8-NP-sub, that was not identified previously. It currently includes four structures, three of which lack a typical Leu at position 94. They differ from L3-8-NP in the orientation of the peptide bond preceding residue 94. There may be various factors that define the orientation of peptide 93–94. For instance, 2j88[20] has an unusual His at position 90, which, unlike Gln90, cannot form stabilizing hydrogen bonds to the main-chain atoms of residue 93. In the case of 1ors,[21] the network of hydrogen bonds is biased due to the chelation of a chloride ion at CDR-L3, which may be a reason for the flipped peptide 93–94. This was not apparent until the structure was rerefined using X-ray data deposited in the PDB. The N-terminal Gln was modeled in a pyroglutamate form, and a chloride ion was incorporated to coordinate the amino groups of Ile2, Gln90, Arg93, and Ser94. In the case of 4irz,[22] the humanization process could potentially have an effect on the CDR conformation. It seems that members of the L3-8-NP-sub group exhibit some disruption of intra-CDR interactions, which may shift a fine balance towards the observed conformation. The reasons for the disruption are not always clear.

To address the question of whether a unique CDR sequence unambiguously defines its canonical structure, we have analyzed all 132 PDB entries with eight-residue CDR-L3 (the redundant dataset). Several Fab structures were determined in both the bound and unbound states, which provides the base for comparisons. CDR-L3 is sandwiched between CDR-L1 and CDR-H2 and is usually in contact with CDR-H3. Residue 92 lies against Tyr32 of VL whereas residues 94–96 pack against Trp47 of VH. Binding of an antigen often causes rearrangement of CDR-H3 and adjustment of the VL/VH packing angle, both of which could in principle affect the conformation of CDR-L3. Yet, we found that “short” CDR-L3 is remarkably rigid and retains the conformation in the interactions with antigens and neighboring CDRs.[23] In one example (3okd vs. 3okm),[24] association of Fab S25–39 with the bacterial sugar Kdo causes a dramatic induced fit of CDR-H3 and an adjustment of the VL/VH angle by 7°; however, CDR-L3 maintains the conformation (canonical structure L3-8-NP).

We also compared different antibodies with identical sequences in CDR-L3 and came to the same conclusion that the environment is not a major determinant of the CDR conformation. Two members of the canonical group L3-8-P7, 3iet[25] and 3uo1,[26] have identical CDR-L3 sequences but completely different VHs. Moreover, the bound antigens are different and the side chain of the invariant Trp47 is flipped over in 3uo1. Nevertheless, CDR-L3 retains its conformation with an RMSD of 0.35 Å for all main-chain atoms of the CDR.

It is worth noting that not only the conformation of CDR-L3 but also its orientation with respect to the rest of the Fv is preserved under different circumstances. This feature of the “short” CDR-L3 likely stems from the fact that only four residues of the loop are outside the β-sheet structure.

While the CDR sequence generally defines the CDR conformation, there may be other determinants, such as the underlying germline genes, that favor one canonical structure over the other. Within the reference dataset, we found no correlation between the germline and the canonical structure. Likewise, there is no clear preference of the antigen type (hapten, peptide, protein) for any particular canonical structure. Although all hapten antibodies fall into the category L3-8-NP (Table 1), it should be noted that those antibodies are not completely independent and represent only two groups of related antibodies. One group was raised against chlamydial lipopolysaccharide,[13, 24] and the other comprises catalytic antibodies induced by a hapten transition-state analog.[18] After all, there are only 54 antibodies distributed over five groups, which is probably too few to draw statistically significant conclusions.

Correct identification and use of canonical structures is a keystone of antibody modeling, as was demonstrated in the Antibody Modeling Assessments.[27, 28] Classification of canonical structures is an evolving process, and great progress has been made recently with the growing number of Fab structures in the PDB. This work undoubtedly needs to continue for the benefit of antibody engineering.
