A new generation of protein display scaffolds for molecular recognition

Authors

  • Ralf J. Hosse,

    1. Preventative Health National Research Flagship, Parkville, Victoria 3052, Australia
  • Achim Rothe,

    1. CSIRO Molecular and Health Technologies, Parkville, Victoria 3052, Australia
  • Barbara E. Power

    Corresponding author
    1. Preventative Health National Research Flagship, Parkville, Victoria 3052, Australia
    2. CSIRO Molecular and Health Technologies, Parkville, Victoria 3052, Australia
    • CSIRO Molecular and Health Technologies, Protein Design for Diagnostics, 343 Royal Parade, Parkville, Victoria 3052, Australia; fax: +61-3-9662-7314.

Abstract

Engineered antibodies and their fragments are invaluable tools for a vast range of biotechnological and pharmaceutical applications. However, they are facing increasing competition from a new generation of protein display scaffolds, specifically selected for binding virtually any target. Some of them have already entered clinical trials. Most of these nonimmunoglobulin proteins are involved in natural binding events and have amazingly diverse origins, frameworks, and functions, including even intrinsic enzyme activity. In many respects, they are superior to antibody-derived affinity molecules and offer an ever-extending arsenal of tools for, e.g., affinity purification, protein microarray technology, bioimaging, enzyme inhibition, and potential drug delivery. As excellent supporting frameworks for the presentation of polypeptide libraries, they can be subjected to powerful in vitro or in vivo selection and evolution strategies, enabling the isolation of high-affinity binding reagents. This article reviews the generation of these novel binding reagents, describing validated and advanced alternative scaffolds as well as the most recent nonimmunoglobulin libraries. Characteristics of these protein scaffolds in terms of structural stability, tolerance to multiple substitutions, ease of expression, and subsequent applications as specific targeting molecules are discussed. Furthermore, this review shows the close linkage between these novel protein tools and the constantly developing display, selection, and evolution strategies using phage display, ribosome display, mRNA display, cell surface display, or IVC (in vitro compartmentalization). Here, we predict the important role of these novel binding reagents as a toolkit for biotechnological and biomedical applications.

The increasing availability of structural data and genomic sequences paves the way for new opportunities in the field of combinatorial library construction. A protein library, by definition, contains up to billions of molecules consisting of an underlying constant scaffold and randomized variable regions that differ from each other. Because of this enormous variety, achieved by either rational or combinatorial protein engineering, it is possible to isolate library members binding specifically to virtually any new target (Bradbury and Marks 2004). These selected binding reagents may subsequently become invaluable tools to be used in a wide range of biotechnological and biomedical applications, e.g., affinity purification (Lamla and Erdmann 2004), protein microarray technology (Renberg et al. 2005), bioimaging (Wikman et al. 2004), enzyme inhibition (Amstutz et al. 2005), and potential targeted drug delivery (Heyd et al. 2003; Nicaise et al. 2004). Most of the prototype molecules for these novel library frameworks are involved in binding interactions with proteins or other ligands under natural conditions (Table 1). Binding is often facilitated by exposed loops that link stable structural elements like α-helices or β-sheets of the underlying scaffold. Randomization and substitution efforts are focused on these loops as they are accessible for ligand binding and likely to accommodate sequence variation without changing the overall structure and stability of the protein. Based on the architecture of their backbone, scaffold proteins can be assigned to one of three groups: (1) scaffolds consisting of α-helices (Fig. 1A–D), (2) small scaffolds with few secondary structures or an irregular architecture of α-helices and β-sheets (Fig. 1E), and (3) predominantly β-sheet scaffolds, representing the majority of proteins used for library display (Fig. 1F–I). Furthermore, the scaffolds differ in the presence or the absence of stabilizing disulfide bonds linking spatially separated strands of the protein, a distinction that has consequences for the choice of expression system.

The nature of combinatorial changes to the framework can be very diverse: randomization of only one to several, noncontiguous loops (contributing to a large binding interface) (Roberts et al. 1992a; Koide et al. 1998; Xu et al. 2002), restricted sequence variation (Zhao et al. 2004) versus full variation (Schneider et al. 1999), or substitutions of only a few selected residues relevant for ligand binding and biomolecular recognition (Cicortas et al. 2004). In some cases, even permissive positions in the secondary structural elements of α-coils or β-sheets have carefully been randomized (Nord et al. 1995; Lehtio et al. 2000; Binz et al. 2004), given that they were solvent accessible at the surface of the molecules. Furthermore, peptide libraries can be presented by supporting proteins in a structurally constrained manner, thus protecting these peptides from proteolytic cleavage. The topologies of the resulting binding surfaces differ greatly, ranging from protruding fingers to flat and crevice-like surfaces as shown in Figure 2. As modifications of the amino acid sequence may potentially lead to misfolding or instability of the scaffold itself and upper limits of tolerated variability exist, the residues to be randomized have to be chosen carefully. Wiederstein and Sippl (2005) computationally addressed this question. They estimated the stability of protein structures under in silico sequence variation, predicting putative mutable target regions for randomization and calculating the resulting fractions of stable mutants. Seven of the reviewed scaffolds were analyzed: the fibronectin type III domain, a lipocalin, a knottin, cytochrome b562, a kunitz-type protease inhibitor, the Z-domain, and the carbohydrate binding module CBM4-2. Except for cytochrome b562, the investigators obtained strong correlations between in silico and respective experimental in vitro data (Wiederstein and Sippl 2005).

This review describes a variety of protein scaffolds currently used for combinatorial library display, focusing specifically on nonantibody related frameworks. We introduce the most recent proteins used for the display of peptide libraries and describe developments in scaffold engineering. Furthermore, we discuss display and selection techniques as well as the applications of the resulting high affinity binding reagents.

Scaffolds with α-helical frameworks

Affibodies (Z domain)

Although helical coiled coils and helix bundles belong to the most abundant structural protein motifs, the number of library display scaffolds based on α-coil architecture is very limited compared to the numerous examples for β-sheet frameworks. Nevertheless, a class of high-affinity binding reagents, termed “affibodies” (Fig. 1A), has been successfully obtained by library construction and affinity selection during the last decade. In fact, affibodies were among the first non-β-sheet protein backbones used for library construction. Derived from bacterial cell surface receptors, they represent an engineered version (Z domain) of one of the five stable three-α-helix bundle domains from the immunoglobulin Fc-binding region of staphylococcal protein A (Nilsson et al. 1987). This Z domain is highly soluble, proteolytically and thermally stable, effectively produced in Escherichia coli, and does not contain disulfide bonds. The first combinatorial library based on the Z domain and cloned into a phagemid vector was described in 1995 (Nord et al. 1995). Thirteen surface residues involved in Fc-binding were randomized in two out of three α-helices using codons of different degeneracy. These phage libraries were screened against Taq DNA polymerase, human insulin, and human apolipoprotein. Selected clones were expressed in E. coli, showed micromolar dissociation constants for their targets, and maintained the secondary structure of the native Z domain (Nord et al. 1995, 1997). Since then, the affinity and avidity of affibodies have been further increased by α-helix shuffling of a primary Taq DNA polymerase binding affibody (selective re-randomization of six amino acid positions in one of the two α-helices binding Taq DNA polymerase) and multimerization of a resulting second generation affibody (Gunneriusson et al. 1999), respectively. Two structures of affibodies complexed with a respective target protein have been published (Högbom et al. 2003; Wahlberg et al. 2003) and biophysical characterization of the affibody ZSPA-1, a binder to protein A, has been performed, suggesting a molten globule conformation upon target binding (Lendel et al. 2004). Today, these robust and small (6 kDa) molecules are being developed for a broad range of biotechnological and biotherapeutic applications ranging from protein microarrays (Renberg et al. 2005) to targeting the extracellular domain of the HER2 receptor in breast cancer (Wikman et al. 2004; Steffen et al. 2005).
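To give a sense of scale for randomizing thirteen positions, the following Python sketch computes the theoretical protein diversity and, for one common degenerate codon scheme, how many codons, amino acids, and stop codons it encodes. The NNK scheme (N = any base, K = G or T) is an assumption chosen for illustration; the text above states only that codons of different degeneracy were used. The function names are hypothetical.

```python
# Illustrative sketch only: NNK is an assumed codon scheme, not necessarily
# the one used for the Z-domain library described in the text.

BASES = "TCAG"
# Standard genetic code, codons ordered T, C, A, G at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def protein_diversity(n_positions: int) -> int:
    """Number of distinct sequences if each position can be any of 20 amino acids."""
    return 20 ** n_positions


def nnk_summary() -> tuple:
    """Return (codons, distinct amino acids, stop codons) encoded by NNK."""
    codons = [a + b + c for a in BASES for b in BASES for c in "GT"]
    encoded = [CODON_TABLE[codon] for codon in codons]
    amino_acids = {aa for aa in encoded if aa != "*"}
    return len(codons), len(amino_acids), encoded.count("*")


def fraction_without_stop(n_positions: int) -> float:
    """Fraction of NNK-randomized clones with no stop codon at any position."""
    n_codons, _, n_stops = nnk_summary()
    return ((n_codons - n_stops) / n_codons) ** n_positions


if __name__ == "__main__":
    print(f"Theoretical diversity, 13 positions: {protein_diversity(13):.3e}")
    print("NNK (codons, amino acids, stops):", nnk_summary())
    print(f"Stop-free fraction, 13 NNK positions: {fraction_without_stop(13):.3f}")
```

The theoretical diversity for thirteen fully randomized positions (20^13, about 8 × 10^16) exceeds what a library of up to billions of members can sample, which is one reason why restricted randomization and iterative re-randomization are attractive.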

Immunity proteins

A novel all-α-coil scaffold is the E. coli colicin E7 immunity protein (ImmE7) (Fig. 1B). It belongs to a group of immunity proteins that bind very tightly to their respective DNase colicins (bacteriocins), produced by enterobacteria for nonspecific endonucleolytic cleavage of single- and double-stranded DNA of competing bacteria. In its natural role, complexation of ImmE7 (9.8 kDa) with the 61-kDa colicin E7 (ColE7) inactivates the cytotoxic activity of the latter inside the producing cell. Once released from the complex, the DNase domain of ColE7 is able to penetrate other bacteria. The 87-residue ImmE7 protein contains no cysteines and folds into a four-helix bundle topology (α-helices I–IV). α-Helices I and III are followed by two loops (Dennis et al. 1998). Very recently, the cognate inhibitor of colicin E9 (ColE9), the immunity protein 9 (Im9), was selected by in vitro compartmentalization (IVC) for inhibition of ColE7 (Bernath et al. 2005). Randomization of Im9 was carried out by error-prone PCR and DNA shuffling. The selected Im9 variants showed between 33% and 97% inhibition of ColE7. Interestingly, these variants showed mutated residues identical to those determining the selectivity of the natural counterpart ImmE7, while retaining the residues that are conserved throughout the family of immunity protein inhibitors.

Cytochrome b562

Cytochrome b562 (Fig. 1C) is another example of a four-helix bundle protein that was used to randomly alter ligand binding properties (Ku and Schultz 1995). This protein has a well-packed hydrophobic core providing a stable framework structure. In two loops connecting the α-helical framework, the investigators randomized five and four residues, respectively. This library was phage-displayed and selected against the bovine serum albumin conjugate of N-methyl-p-nitrobenzylamine derivative 1 (BSA-1). Isolated mutants bound an epitope consisting of the ligand N-methyl-p-nitrobenzylamine derivative 1 and the carrier protein with micromolar dissociation constants (Kd). They did not bind to BSA alone or to an ovalbumin conjugate of the ligand. In comparison to a monoclonal antibody that selectively bound the ligand with a Kd of 290 nM, the limitations of the cytochrome scaffold in binding relatively small ligands became obvious. It was reasoned that the two hypervariable surface loops may not provide a cavity of sufficient size and structural diversity to tightly bind the ligand alone and that larger antigens might be more suitable for this type of scaffold. Recently, apocytochrome b562 has been used for selection of stable mutants by phage display and proteolysis (Chu et al. 2002; Feng and Bai 2004).

The peptide α2p8

In 2000, Barthe and colleagues (Barthe et al. 2000) synthesized a 38-amino acid peptide, α2p8, as an α-helical scaffold particularly promising for its stability, permissiveness of sequence mutations, ease of chemical synthesis, and biological expression. α2p8 is an α-helical hairpin, stabilized by two interhelical disulfide bridges. A short loop links the two helices. This hairpin represents the two N-terminal helices of the human p8MTCP1 protein (a small 8-kDa protein encoded by the human oncogene MTCP1) and keeps its native structure after being excised from its protein context. So far, however, no use of α2p8 in combinatorial library construction has been reported.

Repeat proteins

A modular approach to library construction has been performed using repeat-motif proteins with favorable biophysical properties. Repeat-motif proteins offer a different strategy for library design as the size of the binding interface can be varied depending on the number of randomized repeats. Ankyrin repeat domains (Fig. 1D), one of the most common modular protein–protein interaction motifs, are one well-studied example. They consist of repetitive structural units of 33 residues comprising a β-turn followed by two anti-parallel α-helices and a loop linking up to the turn of the next repeat. Library constructions of designed ankyrin repeat proteins (DARPins) with up to four repeats between N- and C-terminal capping repeats have been described (Binz et al. 2003), and the crystal structure of one library member has been solved (Kohl et al. 2003). Lacking disulfide-bonding cysteine residues, they are well-suited for cytoplasmic, high-level expression in E. coli. Artificial consensus sequence motifs have been designed and nonconserved residues suitable for restricted randomizations have been determined (Mosavi et al. 2002). Interestingly, they were clustered over the β-turn, α-helix, and loop region, representing the binding surface under natural conditions. In one study, libraries were assembled with two or three randomized ankyrin repeat domains comprising six diversified residues (Binz et al. 2004). N- and C-terminal capping repeats were engineered to shield the hydrophobic core of the scaffold, thus avoiding insolubility and aggregation. By ribosome display, binders with nanomolar affinities were selected for E. coli maltose binding protein and the two eukaryotic mitogen-activated protein kinases JNK2 and p38 (Binz et al. 2004). 
Noteworthy is the occurrence of randomly scattered framework mutations after several rounds of selection as an inherent trait of this in vitro method that requires each molecule to undergo hundreds of PCR cycles during library generation and selection. However, these framework mutations can lead to the selection of improved scaffolds with superior display and expression characteristics. The most recent application of ankyrin repeat libraries was performed by Amstutz and colleagues (Amstutz et al. 2005), who selected high affinity inhibitors of aminoglycoside phosphotransferase (3′)-IIIa (APH). One of these inhibitors has been co-crystallized with the target protein, elucidating the allosteric inhibition mechanism (Kohl et al. 2005). This intracellular kinase mediates resistance to aminoglycoside antibiotics in pathogenic bacteria. In vitro and in vivo assays showed complete enzyme inhibition underlining the great potential of DARPins for modulation of intracellular protein function (Amstutz et al. 2005).

An implementation of this modular library construction strategy has also been performed on the basis of a consensus repeat sequence from leucine-rich repeat proteins (mammalian ribonuclease inhibitor family) with up to 14 repeats in total by Stumpp et al. (2003). Furthermore, a novel strategy to generate combinatorial libraries using the modular nature of repeat proteins has been published by the same investigators in the same year (Forrer et al. 2003). Only recently, leucine-rich repeats were found to mediate the adaptive immune response of the sea lamprey (Pancer et al. 2004). This is a noteworthy finding showing that not only the immunoglobulin fold is used for diversification and selection under natural conditions and that other protein architectures may be as suitable for library construction, selection, and generation of high affinity binding reagents as antibodies have proved to be for the last decades.

Scaffolds with irregular secondary structures

Insect defensin A

This group of scaffolds comprises proteins with an irregular architecture of α-coils and β-sheets, or proteins with few secondary structures. A novel scaffold based on insect defensin A has been proposed for the construction of a conformation-constrained peptide library (Zhao et al. 2004). A reconstructed version of this small protease-resistant protein has been designed in order to eliminate antimicrobial activity and to increase the productivity in a bacterial expression system. This optimized scaffold, 1ICA29, has a molecular weight of 3 kDa and consists of 29 amino acids. It forms an α-helix and two β-strands, has two loop regions with tolerance for substitutions, and is stabilized by two disulfide bridges. The reconstruction did not change the original tertiary structure of the molecule. Furthermore, it could be displayed on phage in fusion with coat protein III. A total of seven residues were randomized within the two loops of the peptide. Using phage display, the investigators panned the libraries against four target proteins of different size and origin successfully enriching for binders to each of them (Zhao et al. 2004). However, neither the expression of specific clones nor the determination of their binding affinities has been reported yet.

Kunitz domain inhibitors

A group of small, irregular protease inhibitors with few secondary structures, namely “Kunitz domain inhibitors” (Fig. 1E), has been used for modeling specificity to novel protease targets and has even entered clinical trials (Williams and Baird 2003). These stable proteins of ∼60 amino acids are stabilized by three disulfide bonds and act as reversible inhibitors of trypsin or serine proteases. Most important for library generation, they possess loop regions that can be mutated without destabilizing the structural framework. Examples of Kunitz domain inhibitor scaffolds are bovine pancreatic trypsin inhibitor (BPTI) (Roberts et al. 1992a,b), human pancreatic secretory trypsin inhibitor (PSTI) (Rottgen and Collins 1995), Alzheimer's amyloid β-protein precursor inhibitor (APPI) (Dennis and Lazarus 1994a,b), the leech-derived trypsin inhibitor (LDTI) (Tanaka et al. 1999; Campos et al. 2004), the mustard trypsin inhibitor II (MTI II) (Volpicella et al. 2001; Ceci et al. 2003), and the periplasmic E. coli protease inhibitor ecotin (Laboissiere et al. 2002; Stoop and Craik 2003). Steps toward the development of a high-affinity, high-specificity protein drug have already been undertaken for DX-88 (Williams and Baird 2003), selected from a phage display library based on variants of the first Kunitz domain of human lipoprotein associated coagulation inhibitor (LACI). DX-88 binds with high affinity to human plasma kallikrein, a serine protease that is an important mediator in the pathophysiology of Hereditary Angioedema (HAE). The start of a clinical phase III trial was planned for the first half of 2005 (http://www.dyax.com).

PDZ domain

Also, the PDZ domain, a protein-recognition module involved in signaling networks, has been used to engineer artificial PDZ variants recognizing C-terminal peptides of a range of target proteins different from their natural binding partners (Schneider et al. 1999). This molecule of ∼100 residues contains three α-helices and five β-strands. Schneider and colleagues (1999) used the PDZ domain of the Ras-binding protein AF-6 as a scaffold, and chose an approach which differs from the randomization and screening procedures of most studies reviewed here. Randomization throughout the PDZ domain was achieved by PCR mutagenesis, and screening was performed in vivo using the intracellular yeast two-hybrid system. Thus, no subsequent rounds of panning were performed. A single round of mutagenesis and screening yielded specifically binding clones, suggesting that this approach can be applied to intracellular targets. Recently, a mutant PDZ domain of the mammalian serine protease Omi, which targets the Myc oncoprotein and was subsequently demonstrated to induce cell death, was also isolated using the yeast two-hybrid system (Junqueira et al. 2003). By computer-aided rational design, Reina et al. (2002) generated PDZ domains to recognize new target sequences to be used in Western blotting, affinity chromatography, and pull-down experiments.

Charybdotoxin

Scorpion toxins represent another source of small and stable protein scaffolds. Charybdotoxin from Leiurus quinquestriatus hebraeus may serve as a well-defined example. It is a 37-residue motif consisting of an anti-parallel triple-stranded β-sheet, a short α-helix, and three stabilizing disulfide bonds. As it represents a perfect example of cysteine-stabilized α-helical (CSH) and β-sheet (CSB) motifs, this fold has been termed “cysteine-stabilized αβ motif” (Cornet et al. 1995). Very similar structures are adopted not only by scorpion toxins but also by the insect defensins described above and by plant γ-thionins. Charybdotoxin has been used as an underlying scaffold for several grafting experiments (Pierret et al. 1995; Vita et al. 1998). More recently, Li and colleagues (2001) constructed a β-turn loop library of a charybdotoxin-based miniprotein scaffold. They displayed this library on phage and selected against HIV-1 gp120, a protein which binds to the CD4 receptor on T-cells, leading to conformational changes in the viral envelope that trigger viral entry into host cells. Selectants specifically bound gp120 and were competed by soluble CD4, suggesting that this approach may provide a path to inhibitors of HIV entry into T-cells.

PHD finger

The plant homeodomain (PHD) finger protein from the transcriptional cofactor Mi2β has been examined as a scaffold for the presentation of selected binding functions (Kwan et al. 2003). This small protein has a well-structured core that contains two zinc ions and no disulfide bonds. Two variable and flexible loops (loops 1 and 3) seem to be tolerant to mutagenesis, expansion, and loop grafting. A binding site for the co-repressor CtBP2 was grafted onto the domain producing an engineered PHD domain specifically binding CtBP2 in a GST pull-down assay and in the yeast two-hybrid system.

TEM-1 β-lactamase

TEM-1 β-lactamase is in many respects different from the other scaffolds described in this section so far. This protein is much larger (263 residues) and has a protein backbone consisting of numerous α-helices and β-sheets, a disulfide bond, and, of course, intrinsic enzyme activity. Phage-displayed TEM-1 β-lactamase libraries were constructed by inserting random peptides close enough to the active site that complex formation with a target protein could affect enzyme activity. These libraries were first selected for retained β-lactamase activity. The active libraries were then used for biopanning on monoclonal antibodies against the prostate specific antigen (PSA) or on streptavidin in order to develop homogeneous immunoassays as binding of the antigen markedly affected enzyme activity (Legendre et al. 1999). In a subsequent study, the same investigators isolated β-lactamase clones binding to horse spleen ferritin and β-galactosidase. Affinity maturation of a ferritin-specific clone resulted in Kd values between 10 and 20 nM for this protein (Legendre et al. 2002).

Scaffolds with β-sheet frameworks

Antibodies are the natural prototype of specifically binding β-sheet proteins with specificity mediated through hypervariable loop regions, so-called complementarity-determining regions (CDRs). However, this review is focused on scaffolds of nonimmunoglobulin origin. Thus, the numerous examples of engineered antibodies are not further detailed here. The literature provides an abundance of recent reviews (Hudson and Souriau 2003; Bradbury and Marks 2004; Kipriyanov and Le 2004), book chapters (Marks and Bradbury 2004), and books (Barbas et al. 2001) presenting library construction and affinity selection of whole antibodies and their fragments. But even with antibody-derived scaffolds left aside, β-sheets are still the dominating secondary structural elements in natural binding proteins used for library construction and grafting experiments.

10th fibronectin type III domain (10Fn3)

The human 10Fn3 domain (Figs. 1F, 2A), a small β-sheet domain, has been shown to be a suitable scaffold for peptide display. Residues in two surface loops have been randomized and mutant proteins have been selected by phage display (Koide et al. 1998), whereas Xu and colleagues (Xu et al. 2002) randomized three loops and selected by mRNA display. More recently, this scaffold was used for the successful isolation of binders against the SH3 domain of human c-Src (Karatan et al. 2004). The fibronectin variant with the highest affinity could even be used for immunoprecipitation of full-length c-Src from murine fibroblast cell extracts.

CTLA-4

In 1999, Nuttall et al. showed that the human cytotoxic T-lymphocyte associated protein 4 (CTLA-4), comprising CDR-like loops similar to antibodies, is permissive to loop replacements. The resulting variant molecules no longer bound their native ligands but maintained their correctly folded framework. Furthermore, they could be expressed in soluble form in the periplasm of E. coli and functionally displayed on phage as fusions with gene III protein (Nuttall et al. 1999). These results identified CTLA-4 as a candidate human library scaffold, and this was confirmed by Hufton et al. (2000). Subsequently, the CDR3-like loop of this scaffold was randomized, and binders to a human integrin were selected by phage display (Nuttall et al. 1999), while libraries with randomized CDR1- and CDR3-like loops were selected by ribosome display against lysozyme (Irving et al. 2001).

T-cell receptors

Very recently, human T-cell receptors (TCRs) were displayed on phage stabilized by a nonnative interchain disulfide bond and selected against two different peptide–human leukocyte antigen (pHLA) complexes (Li et al. 2005). High-affinity TCRs were obtained by directed evolution. The investigators demonstrated a dramatic ∼10^6-fold affinity increase compared to the natural affinities of TCRs to pHLAs of ∼1–100 μM. In another study, yeast display technology and flow sorting were used to screen TCR libraries of CDR mutants or random mutants. High-affinity T-cell receptors were obtained, and the positions of the mutations indicate that virtually every CDR loop can contribute to the antigen specificity of a T-cell (Chlewicki et al. 2005).
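The size of such an affinity gain can be made concrete with a short calculation. The Python sketch below converts a fold improvement into a new dissociation constant and into the corresponding change in binding free energy, using the standard relation ΔΔG = −RT ln(fold). The temperature of 298.15 K and the function names are assumptions made for illustration; they are not taken from the cited studies.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def improved_kd(kd_molar: float, fold: float) -> float:
    """Dissociation constant after a given fold improvement in affinity."""
    return kd_molar / fold


def ddg_kcal_per_mol(fold: float, temp_k: float = 298.15) -> float:
    """Change in binding free energy for a fold decrease in Kd.

    Negative values mean tighter binding. The default temperature
    is an assumed value (25 degrees C), not one reported in the source.
    """
    return -R_KCAL * temp_k * math.log(fold)


if __name__ == "__main__":
    fold = 1e6
    for kd in (1e-6, 100e-6):  # natural TCR-pHLA range quoted in the text
        print(f"Kd {kd:.0e} M -> {improved_kd(kd, fold):.0e} M")
    print(f"ddG for a {fold:.0e}-fold gain: {ddg_kcal_per_mol(fold):.2f} kcal/mol")
```

Under these assumptions, a 10^6-fold gain moves the quoted micromolar range into the picomolar range and corresponds to roughly 8 kcal/mol of additional binding free energy.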

Knottins

Members of the so-called “knottin family” (Fig. 1G) are structurally related to the above-described scorpion toxin charybdotoxin, and have served as scaffolds for peptide libraries. The knottins, some of which also function as protease inhibitors, are small, 25- to 35-residue proteins. They comprise conserved disulfide bonds, leading to a characteristic knotted topology, and interspersed variable peptide loops. The most comprehensive data source for knottins is “The knottin web” (http://knottin.cbs.cnrs.fr; http://knottin.com) (Gelly et al. 2004). A constrained library based on the trypsin inhibitor from the squirting cucumber Ecballium elaterium (EETI-II) has been constructed by randomizing six residues in the first loop and screened by mRNA display (Baggio et al. 2002). Furthermore, the C-terminal cellulose-binding domain (CBD) of cellobiohydrolase I from the fungus Trichoderma reesei (Figs. 1G, 2C), another member of the knottin family, was employed as a library scaffold by various investigators (Smith et al. 1998; Lehtio et al. 2000) for phage display selection against different macromolecular targets. Selective binders with novel binding activity against alkaline phosphatase and porcine α-amylase (PPA), respectively, have been obtained.

A novelty among knottin scaffolds is Min-23, a derivative of the 28-residue EETI-II (Souriau et al. 2005). It is believed to be one of the shortest peptides containing the previously mentioned CSB motif. Min-23 has an N-terminal truncation of five amino acids, is devoid of an active site, and contains only two stabilizing disulfide bonds instead of the three found in the parent protein. Nevertheless, it represents an autonomous folding unit consisting of a small triple-stranded β-sheet and exhibits a high Tm of ∼100°C. Because it is small, stable, and easy to synthesize, Min-23 is considered a good starting point for grafting recognition sites or active sites onto it for the production of new bioactive molecules. Souriau et al. (2005) recently showed that the core of its CSB motif is permissive to loop insertions. Peptide epitopes from hemagglutinin or Gla-protein have been inserted and, furthermore, a phage library has been constructed by insertion of a randomized sequence on a β-turn of Min-23. This library of >10^8 different clones was panned against seven different targets, resulting in the isolation of 21 new specific binders.

Neocarzinostatin

Neocarzinostatin, an enediyne-binding chromoprotein isolated from Streptomyces carzinostaticus and in use as an anti-tumoral agent, might represent a scaffold for potential drug delivery. This protein consists of seven β-strands in two sheets forming a β-sandwich, a topology similar to the immunoglobulin fold. Two loops form a deep crevice where the organic neocarzinostatin chromophore is bound. The enediyne ring of this chromophore is responsible for its potent anti-tumoral and antibiotic activity, due to its DNA cleaving activity. Furthermore, the neocarzinostatin protein component (NCS) may act as a transporter for small molecules. By randomizing up to 13 side chains pointing toward the binding crevice of the NCS scaffold and generation of three different libraries, Heyd and colleagues (Heyd et al. 2003) selected specific testosterone binders by phage display. These variants were efficiently expressed in E. coli and showed nanomolar Kd values for streptavidin-bound testosterone, but only micromolar Kd values for free soluble testosterone. Moreover, due to topological similarity to an immunoglobulin fold, NCS comprises sites for the generation of new binding specificities, well suited for targeting macromolecular surfaces: Its two loops, structurally equivalent to the CDR1 and CDR3 of an antibody, could be randomly substituted and selected for recognition of tumor-associated antigens facilitating a targeted delivery of its antitumoral chromophore or other small molecule drugs. Nicaise and colleagues (2004) showed that this might be possible. They transferred the CDR3 of the VHH chain of a camel anti-lysozyme immunoglobulin to the equivalent site in the corresponding loop of neocarzinostatin. The structure of this engineered NCS–CDR3 was similar to the wild-type NCS. Furthermore, it was stable, efficiently produced, and bound specifically to lysozyme (Nicaise et al. 2004).

Carbohydrate binding module 4-2

Cicortas et al. (2004) investigated the use of a thermostable carbohydrate binding module (CBM4-2) (Fig. 1H), derived from the Rhodothermus marinus xylanase Xyn10A, as a novel scaffold. The 18-kDa protein CBM4-2 has a β-sandwich structure formed by 11 strands and contains no disulfide bonds. A combinatorial library based on restricted variation at 12 positions located in the carbohydrate binding site was generated. In order to maintain the structure of wild-type CBM4-2, only substitutions by related residues were performed. The library was screened against three different carbohydrate polymers as well as against a human IgG4 antibody using phage display. Isolated binders showed high expression levels in E. coli and displayed high melting transition temperatures.

Tendamistat

An additional example of a β-sheet sandwich scaffold is represented by tendamistat, a 74-residue inhibitor of α-amylase from Streptomyces tendae. Several publications describe the use of this disulfide bond-containing protein for phage display library construction. The loops of tendamistat in particular seem to be permissive to randomization (McConnell and Hoess 1995; Li et al. 2003). Takahashi and colleagues (Takahashi et al. 2003) randomized the isolated α-amylase-binding loop 1 of tendamistat, attached fluorescent labels, and immobilized these library peptides onto microtiter plates. Binding of various proteins to these plates was analyzed based on fluorescence changes, and different recognition patterns could be detected, suggesting that this system may be useful for the development of a peptide microarray.

Lipocalins

As indicated for neocarzinostatin, it has proven difficult to find scaffolds suitable for isolating variants that recognize low molecular weight or hapten-like ligands with high affinity. However, this has been achieved with scaffolds from the lipocalin protein family (Beste et al. 1999; Schlehuber et al. 2000). Remarkably, a human scaffold from the same protein group of so-called “anticalins” could likewise be used for the isolation of binders to the protein target hemoglobin (Vogt and Skerra 2004). Lipocalins (Figs. 1I, 2B) are 160- to 180-residue polypeptides involved in storage or transport of hydrophobic and/or chemically sensitive organic compounds (Flower 1996; Nygren and Skerra 2004). They consist of a β-barrel of eight anti-parallel β-strands, which form a conical structure. The entrance to the ligand-binding pocket is composed of four hypervariable loops connecting the β-strands in a pair-wise fashion at the open end of this central folding motif. For library construction, the target molecule determines the residues subjected to randomization, e.g., cavity randomization (of residues in the ligand-binding pocket) for low molecular weight ligands and loop randomization for protein targets. A 174-residue member of the lipocalin family, the bilin-binding protein (BBP) from the butterfly Pieris brassicae, was used for library generation to isolate variants binding to small molecules other than its natural ligand biliverdin IXγ (Fig. 2B). Sixteen residues across all four loops of the ligand-binding pocket were mutated, and a phage library was selected against fluorescein (Beste et al. 1999) and digoxigenin (Schlehuber et al. 2000). Specific binders with nanomolar Kd values were obtained. These anticalins bound fluorescein and digoxigenin as true haptens independent of the carrier proteins used during selection. The crystal structures of these anticalins, FluA and DigA16, respectively, have been solved (Korndörfer et al. 2003a,b).
Recently, a human member of the lipocalin family, apolipoprotein D (ApoD), served as a scaffold for randomization of an increased number of loop residues (24 amino acids). Binders against human hemoglobin were selected from the resulting library (Vogt and Skerra 2004). This human scaffold might well pave the way for anticalin-based therapeutics. A review on lipocalins and their engineered versions (anticalins) was published early this year (Schlehuber and Skerra 2005).
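
The step from 16 to 24 randomized positions has a large effect on theoretical sequence space. As a rough illustration, the sketch below computes 20^n for these two library designs, assuming unrestricted randomization with all 20 amino acids at every position, and compares the result with a hypothetical library of 10^10 clones. The library size and all function names are assumptions made for this example, not figures from the cited studies.

```python
AMINO_ACIDS = 20  # standard proteinogenic amino acids


def theoretical_diversity(n_positions: int, alphabet: int = AMINO_ACIDS) -> int:
    """Number of distinct sequences when n_positions are fully randomized."""
    if n_positions < 0:
        raise ValueError("n_positions must be non-negative")
    return alphabet ** n_positions


def max_coverage(n_positions: int, library_size: int) -> float:
    """Upper bound on the sampled fraction of sequence space.

    Assumes, optimistically, that every clone in the library is unique.
    """
    return min(1.0, library_size / theoretical_diversity(n_positions))


# Hypothetical library size, chosen only as an order-of-magnitude example.
ASSUMED_LIBRARY_SIZE = 10**10

designs = [
    ("BBP, 16 positions (Beste et al. 1999)", 16),
    ("ApoD, 24 positions (Vogt and Skerra 2004)", 24),
]

for name, n in designs:
    diversity = theoretical_diversity(n)
    coverage = max_coverage(n, ASSUMED_LIBRARY_SIZE)
    print(f"{name}: {diversity:.2e} possible sequences, "
          f"at most {coverage:.1e} of them sampled")
```

Under these assumptions, any physical library samples only a minute fraction of the possible variants, which is one reason selection is usually followed by affinity maturation.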

Another member of the lipocalin family is the bovine heart fatty acid-binding protein (FABP). It is a stable 14.7-kDa protein and offers the possibility of N-terminal library generation, since the four N-terminal, solvent-exposed amino acid residues are not involved in any secondary structure. Lamla and Erdmann (2003, 2004) substituted these residues with 15 randomized amino acids and performed in vitro selection by ribosome display to isolate streptavidin-binding peptides of low nanomolar affinity. A 15- and a 9-residue version of the best binder, termed “Nano-tag15” and “Nano-tag9,” respectively, have been further developed as a basis for affinity tag detection and purification of recombinant proteins.

Green fluorescent protein

One application of high-affinity binders is their use as probes for bio-imaging purposes when linked to a fluorochrome, a radioisotope, or an enzymatic function. It is even better if the scaffold itself is fluorescent and the engineered version recognizes the target protein of choice. Consequently, the β-barrel green fluorescent protein (GFP) was used in several studies to accommodate randomized residues in the intracellular environment. Abedi et al. (1998) made a first attempt at evaluating GFP for library construction and described two sites in the loops connecting its β-strands (in addition to the N and C termini) that were found to display a variety of peptides in a manner compatible with autofluorescence. In a more recent study, a random peptide library was fused to the C terminus of GFP and screened for peptides inhibiting tumor cell growth, and four peptide sequences exhibiting antiproliferative effects were obtained (Hitoshi et al. 2003).

Conclusion

Combinatorial peptide libraries, first realized by arachnids for the development of neurotoxins almost 400 million years ago (Sollod et al. 2005), as well as the complex evolution of the vertebrate immune system, can now be mimicked by combinatorial biochemistry and protein engineering within days and weeks. Although we now know that antibodies are not the only source of diversified proteins employed by nature for an adaptive immune response (Pancer et al. 2004), they have been the natural prototype of specifically binding proteins used as diversity carrying scaffolds for library construction during the last decades. In many applications, however, the constant regions of whole antibodies are not necessarily required. Thus, there has been an increasing emphasis on systems based on antibody fragments (Holliger and Hudson 2005), a trend which is also reflected by the variety of small scaffolds of nonimmunoglobulin origin ranging from 263 residues for TEM-1 β-lactamase to 23 residues for Min-23. At present, domain antibodies (dAbs) are the smallest known antigen-binding fragments of antibodies (Holt et al. 2003). Furthermore, antibody engineering is expanding to novel antibody lineages of sharks (Nuttall et al. 2001, 2003, 2004; Dooley et al. 2003) and camels (Omidfar et al. 2004; Rahbarizadeh et al. 2004). The biotechnological and clinical applications of the diverse antibody formats are numerous; however, there is increasing competition with nonimmunoglobulin scaffolds reviewed in this article.

The emerging field of protein engineering has led to a wide range of different nonimmunoglobulin scaffolds with widely diverse origins and characteristics. In fact, more than 30 of them have been used as alternatives to antibodies for the construction of protein libraries and grafting experiments (Binz and Plückthun 2005). Some of them are comparable in size to a scFv of an antibody (∼30 kDa), e.g., TEM-1 β-lactamase, T-cell receptors, or green fluorescent protein, while the majority of them are much smaller. The smallest scaffolds include the knottin Min-23 (23 residues), a designed version of insect defensin A (29 residues), and the scorpion toxin charybdotoxin (37 residues). In contrast, modular scaffolds based on repeat proteins, e.g., the 33-residue ankyrin repeat motif, vary in size depending on the number of repetitive units.

How do these nonimmunoglobulin scaffolds perform in comparison to antibodies and their fragments, especially in terms of production, biophysical characteristics, and target recognition? Many alternative scaffolds are still in an early phase of proof-of-principle evaluation, some are being developed for practical biotechnological applications, and only a handful have advanced further toward preclinical or even clinical trials. Antibodies, however, are clearly in the lead regarding the profound knowledge that has been gained from the overwhelming number of different antibodies and related formats successfully being exploited in the biotechnological and pharmaceutical industries. Nevertheless, several advantages of this new generation of scaffolds over recombinant antibodies seem to be obvious. Many of them, especially the scaffolds based on α-helical frameworks or repeat motifs but also some β-sheet frameworks like 10Fn3 or CBM4-2, are stable without disulfide bonds. This enables cheap and efficient production in the reducing cytoplasm of bacteria. Affibody molecules can even be produced by chemical synthesis (Engfeldt et al. 2005; Renberg et al. 2005). Furthermore, unlike IgGs, these scaffolds are single-domain proteins and do not require post-translational modifications, which makes them easier to produce. In terms of stability and folding properties, many of the small nonimmunoglobulin scaffolds, disulfide bond-containing or not, show exceptional thermal stability and superior robustness in affinity chromatography.

An important challenge for these novel scaffolds is the recognition of a wide range of different target molecules, especially of small molecular weight targets, peptides, and post-translationally modified proteins. Small molecular weight targets have been successfully recognized by anticalins (Beste et al. 1999; Schlehuber et al. 2000), whereas several studies on, e.g., cytochrome b562 (Ku and Schultz 1995) or neocarzinostatin (Heyd et al. 2003) have revealed limitations in binding small ligands. Not many alternative scaffolds have been shown to bind peptides yet. The few that do recognize peptides are usually restricted to a specific context, and their affinities are micro- to nanomolar. PDZ domains can be selected for binding to novel C-terminal peptides (Schneider et al. 1999; Reina et al. 2002; Ferrer et al. 2005). Src homology 2 (SH2) domains mediate protein interactions by recognizing phosphotyrosine residues in peptides, and library construction and selection yielded binders to several phosphopeptides (Malabarba et al. 2001). In contrast, Src homology 3 (SH3) domains bind to peptides that fold into a polyproline helix conformation (Hiipakka and Saksela 2002; Panni et al. 2002). As far as recognition of carbohydrates and glycoproteins is concerned, CBM4-2 may serve as an example. Library variants specific for different carbohydrate polymers as well as for a glycoprotein (human IgG4) have been successfully selected (Cicortas et al. 2004), although it is not clear yet whether the selected CBM4-2 variants recognized carbohydrate or protein epitopes. However, it is clear that the majority of alternative library scaffolds still show obvious limitations with regard to the range of ligands they can recognize, whereas binders to haptens, peptides, carbohydrates, and proteins with subnanomolar dissociation constants could readily be obtained from one and the same single-chain Fv antibody library (Söderlind et al. 2000).

The most relevant differences between the various scaffolds apart from size are in disulfide cross-linking, overall topology of the binding interface, extent of permissiveness for substitutions, and origin (human vs. nonhuman). Applications range from tools in biotechnology to diagnostic and therapeutic reagents. For the development of diagnostic biosensors, high binding affinities, selectivity, and exceptional thermal, chemical, and protease in vitro stability play a major role. In contrast, in vivo pharmacokinetic characteristics like serum stability, tissue penetration, blood clearance, and target retention are important for therapeutic applications.

As immune responses to therapeutic proteins have always been an issue (Carter 2001; Koren et al. 2002), immunogenicity of potential biopharmaceuticals deserves careful examination. On one hand, efforts are made to render nonhuman therapeutic proteins as similar to their human counterparts as possible. For antibodies, numerous examples of humanization (grafting mouse CDRs onto human acceptor scaffolds), deimmunization (removal of T-cell epitopes) (Roque-Navarro et al. 2003), and production of human proteins in transgenic mice have been published (Hudson and Souriau 2003). On the other hand, a human scaffold for library construction might be less immunogenic right from the start. However, even an entirely human scaffold does not guarantee a protein that elicits no human immune response, especially if it is an intracellular protein. Randomization of amino acids during library construction can potentially introduce novel T-cell epitopes. Even single point mutations can render a human protein immunogenic. Furthermore, human scaffolds might cause some autoimmune response (Binz et al. 2005). Thus, strategies have been developed to decrease immunogenicity by protein PEGylation (Chapman 2002) or prediction of T-cell epitopes (Schirle et al. 2001; Flower 2003). In the end, only clinical trials will provide a reliable picture of a protein's immunogenic behavior in humans.

Affinity reagents can also be conjugated to reporter enzymes, fluorochromes, or radiolabels, and can deliver various payloads to cells expressing the targeted marker. Once again, immunogenicity issues related to these conjugates have to be carefully considered.

During the last decade, much expertise has been accumulated, enabling us to choose the ideal scaffold, antibody-derived or other, to construct new libraries or make use of existing ones, well suited for a given target and application. This variety will provide a solid foundation for the successful isolation and exploitation of high-affinity binders in the fields of research, diagnosis, and therapy.

Table 1. Examples of alternative protein display scaffolds used for library display

| Name | Scaffold | Secondary structure | Residues/cross-links | Randomized elements | Selected references | Examples of binding affinities [Kd] for (target)^d |
| --- | --- | --- | --- | --- | --- | --- |
| Affibodies | Z-domain of protein A | 3 α-helices | 58/none | 13 residues in 2 helices | Nord et al. 1995, 1997; Gunneriusson et al. 1999 | 30–50 nM after affinity maturation (Taq polymerase) Gunneriusson et al. 1999; 50 nM (extracellular domain of HER2/neu) Wikman et al. 2004 |
| Immunity proteins | ImmE7 | 4-helix bundle | 87/none | 2 loops | Chak et al. 1996 | N.D. |
| Cytochrome b562 | | 4-helix bundle | 106/bound heme | 2 loops | Ku and Schultz 1995 | 5–22 μM (N-methyl-p-nitrobenzylamine-BSA) Ku and Schultz 1995 |
| α2p8 | | α-helical hairpin | 38/2 S-S | N.D. | Barthe et al. 2000 | N.D. |
| Repeat-motif proteins | Ankyrin repeat | 2 α-helices, β-turn, variable repeat numbers^a | 33/none | β-turn, 1 α-helix and loop | Mosavi et al. 2002; Binz et al. 2004 | 4.4–22 nM (maltose binding protein), 2.1 nM (JNK2), 3.7 nM (p38) Binz et al. 2004; 28.6–0.5 nM (APH) Amstutz et al. 2005 |
| Insect defensin A (1ICA29) | | α-helix, 2 β-strands, loops | 29/2 S-S | 2 loops | Zhao et al. 2004 | N.D. |
| Kunitz domains | BPTI/APPI | α-helices, β-sheets | 58/3 S-S | 1–2 loops | Roberts et al. 1992a, 1992b; Dennis and Lazarus 1994a, 1994b | 1.0–2.8 pM^e (human neutrophil elastase) Roberts et al. 1992a; 10–500 nM^f (TF-FVIIa) Dennis and Lazarus 1994a; 2–20 nM^f (TF-FVIIa) Dennis and Lazarus 1994b; 11 pM^g (plasma kallikrein) Stoop and Craik 2003 |
| PDZ-domains | Ras-binding protein AF-6 | 3 α-helices, 5 β-strands | 100/none | Entire domain by PCR mutagenesis | Schneider et al. 1999 | 160–240 nM (different C-terminal peptides) Schneider et al. 1999 |
| Scorpion toxins | Charybdotoxin | Triple-stranded β-sheet, short α-helix | 37/3 S-S | Grafting of functional sites onto β-sheet | Vita et al. 1998 | N.D. |
| 10Fn3 | | β-sandwich of 7 β-strands | 94/none | 2–3 loops | Koide et al. 1998; Xu et al. 2002 | 1–24 nM^h (TNF-α), 20 pM^h after affinity maturation (TNF-α) Xu et al. 2002 |
| CTLA-4 (extracellular domain) | | V-like Ig β-strands | 136/2 S-S | 1–2 loops | Nuttall et al. 1999; Irving et al. 2001 | N.D. |
| Knottins | Min-23 | Triple-stranded β-sheet | 23/2 S-S | β-turn | Souriau et al. 2005 | N.D. |
| | Cellulose binding domain | Triple-stranded β-sheet | 36/2 S-S | 11 residues distributed over β-sheets and loops | Lehtio et al. 2000 | 15 μM (porcine α-amylase) Lehtio et al. 2000 |
| Neocarzinostatin | | β-sandwich | 113/2 S-S^b | Up to 13 residues pointing toward the binding crevice | Heyd et al. 2003 | 21 nM (streptavidin-bound testosterone), 7–55 μM (free testosterone)^i Heyd et al. 2003 |
| CBM4-2 | | β-sandwich | 168/none | 12 residues in carbohydrate binding site | Cicortas et al. 2004 | N.D. |
| Tendamistat | | β-sheet | 74/2 S-S | 1–2 loops | McConnell and Hoess 1995; Li et al. 2003 | N.D. |
| Anticalins | Apolipoprotein D | β-barrel and loops | 178/2 S-S | 24 residues in 4 loops | Vogt and Skerra 2004 | 2.16 μM (hemoglobin) Vogt and Skerra 2004 |
| | Bilin-binding protein | β-barrel and loops | 174/2 S-S | 16 residues in center of binding site | Beste et al. 1999 | 35.2 nM^j (fluorescein) Beste et al. 1999 |
| | FABP | β-barrel | 133/none | N terminus^c | Lamla and Erdmann 2003, 2004 | 4 nM/17 nM^k (streptavidin) Lamla and Erdmann 2003, 2004 |

Table 1 provides an overview of protein scaffolds of nonimmunoglobulin origin. Examples from diverse protein sources may serve as references. Whereas some protein families have already given rise to several alternative scaffolds, libraries, and isolated affinity binders, for others there is only one successful application known to date. For more detailed descriptions, see text.

^a The number of 33-residue ankyrin repeats in DARPins can vary depending on the library. Furthermore, the ankyrin repeats are shielded by additional N- and C-terminal capping repeats.

^b In some of the libraries, only one of the two disulfide bonds in wild-type neocarzinostatin remained, as the two corresponding cysteines (Cys37 and Cys47) were randomized.

^c The bovine fatty acid binding protein has been used as a carrier for an N-terminal peptide library. The four natural N-terminal residues were replaced by 15 randomized amino acid residues, which were subsequently further reduced to nine residues.

^d Values derived by surface plasmon resonance (SPR) unless otherwise stated.

^e Kd values were determined using a fluorometric assay and the method of Green and Work (1953).

^f Apparent equilibrium dissociation constants (Ki) were determined by enzyme inhibition assay (Seymour et al. 1994).

^g Apparent equilibrium dissociation constants (Ki) were determined by enzyme inhibition assay (Eggers et al. 2001).

^h Dissociation constants were determined by radioactivity titration as described in Xu et al. (2002).

^i The dissociation equilibrium constants for free testosterone have been determined by fluorescence titration.

^j Determined by fluorescence titration (Voss and Skerra 1997).

^k For Nano-tag15 and Nano-tag9, respectively.

N.D., Not done.

Figure 1.

Representative protein display scaffolds selected for novel molecular recognition by library construction or grafting experiments. Scaffold proteins in A–D consist of α-coils, the small Kunitz domain inhibitor depicted in E shows an irregular α-coil and β-sheet architecture, whereas F–I show scaffolds predominantly consisting of β-sheet frameworks. α-Helices are depicted in red; β-sheets, in blue; disulfide bonds, in orange; and positions subjected to random or restricted substitutions, in yellow. The PDB IDs used to generate this figure are given in parentheses: (A) Affibody: Z-domain of protein A (1Q2N), (B) immunity protein: ImmE7 (1CEI), (C) cytochrome b562 (1M6T), (D) repeat-motif protein: ankyrin repeat protein (1SVX), (E) Kunitz-domain inhibitor: Alzheimer's amyloid β-protein precursor inhibitor (1AAP), (F) 10th fibronectin type III domain (1FNA), (G) knottin: cellulose binding domain from cellobiohydrolase Cel7A (1CBH), (H) carbohydrate binding module CBM4-2 (1K45), and (I) anticalin FluA: bilin-binding protein (1T0V) with cavity randomization for fluorescein binding.

Figure 2.

Topologies of different binding surfaces. Scaffold proteins show different topologies to bind their natural ligands. The choice of scaffold depends on the nature of the target protein used in the selection process. (A) Fibronectin type III domains are found in many animal proteins involved in ligand binding. The loops of the 10th fibronectin type III domain (1TTG) are structurally analogous to the CDRs in immunoglobulin VH domains. (B) The ligand-binding cleft in the lipocalin scaffold bilin-binding protein (1BBP) is formed by four loops (depicted with its natural ligand biliverdin IXγ). (C) The cellulose binding domain of Cel7A cellobiohydrolase from T. reesei (1CBH) belonging to the knottin family has a tip-shaped topology. Protein regions involved in natural binding and suitable for amino acid randomization are depicted in red; disulfide bonds, in orange. The PDB IDs used to generate this figure are given above, in parentheses.

Acknowledgements

We are grateful to Jenny Carmichael (CSIRO Molecular and Health Technologies) for her assistance with the figures used in this review.
