PYRIN domains were identified recently as novel protein modules found at the N-termini of proteins involved in apoptotic and inflammatory signaling pathways (Masumoto et al. 1999; Bertin and DiStefano 2000; Inohara and Núñez 2000; Hlaing et al. 2001; Martinon et al. 2001; Masumoto et al. 2001). The PYRIN homology region corresponds to the highly conserved residues 1–95 of human, mouse, and rat pyrin. Mutations within the human pyrin gene cause familial Mediterranean fever, an autosomal recessive disorder characterized by short, recurrent bouts of fever (Aksentijevich et al. 1997; Bernot et al. 1997). In vitro stimulation of monocytes with proinflammatory agents induces pyrin expression, consistent with the suggestion that it has a role in regulating the inflammatory response.
The N-terminal location of the PYRIN domain in the nucleotide-binding domain/leucine-rich repeat (NBD/LRR) family members NBS1/NALP2 (Bertin and DiStefano 2000; Martinon et al. 2001) and CARD7/DEFCAP/NAC/NALP1 (Chu et al. 2001; Hlaing et al. 2001; Martinon et al. 2001) has led to the suggestion that this domain likely mediates protein–protein interactions, because the related proteins Apaf-1 (Zou et al. 1997) and Nod1/CARD4 (Bertin et al. 1999; Inohara et al. 1999) each contain a caspase recruitment domain (CARD), a well-known protein–protein interaction module (Hofmann et al. 1997), N-terminal of their NBDs. Consistent with this hypothesis, PYRIN domains have also been identified at the N-termini of zebrafish caspase-13 (Inohara and Núñez 2000) and the CARD-containing protein ASC/TMS1/PYCARD (Masumoto et al. 1999; Conway et al. 2000; McConnell and Vertino 2000; Martinon et al. 2001; Masumoto et al. 2001). Caspase-13 (Humke et al. 1998; Inohara and Núñez 2000) and other initiator caspases have CARD or death effector domain (DED)-containing prodomains that mediate their interactions with upstream activating adaptor proteins (Cohen 1997; Hofmann 1999). In turn, the bipartite domain structure of ASC, which has an N-terminal PYRIN domain and a C-terminal CARD, is reminiscent of adaptor proteins such as FADD (Boldin et al. 1995; Chinnaiyan et al. 1995) and RAIDD (Ahmad et al. 1997; Duan and Dixit 1997). These proteins have an N-terminal DED or CARD, respectively, and C-terminal death domains (DD), and mediate a variety of protein–protein interactions in apoptotic signaling (Aravind et al. 1999; Hofmann 1999).
The DD, DED, and CARD modules are important protein–protein interaction domains found in many proteins involved in apoptosis and inflammation, including receptor, adaptor, effector, and inhibitor proteins (Aravind et al. 1999; Hofmann 1999). These domains are required for the transmission and regulation of signals from receptors to effectors, such as caspases, via homotypic interactions in which DDs interact with DDs, DEDs interact with DEDs, and CARDs interact with CARDs. Interestingly, despite their low degree of sequence similarity, these homotypic interaction domains have been shown to share a common three-dimensional fold, classified as the death domain-fold in SCOP (Murzin et al. 1995) (note that even within the families the sequences can be so divergent that statistically significant sequence similarities cannot be detected readily by conventional sequence comparison methods). The death domain-fold corresponds to an antiparallel six-helix bundle with Greek-key topology and internal pseudo-twofold symmetry. The functional and structural similarity between the death domain-fold superfamily members has led to the suggestion that they probably evolved from a common ancestor (Aravind et al. 1999; Hofmann 1999). More recently, Aravind et al. and Martinon et al. have indicated that the PYRIN domains are also related evolutionarily to the death domain-fold superfamily and, therefore, have the same six-helix fold, although only limited supporting data have been reported (Aravind et al. 2001; Martinon et al. 2001). Martinon et al. have further shown that PYRIN domains are capable of interacting specifically with other PYRIN domains, consistent with earlier speculation that these domains mediate protein–protein interactions (Martinon et al. 2001). 
In the present work we use efficient secondary structure prediction methods (Rost and Sander 1994; King and Sternberg 1996; Frishman and Argos 1997), potential-based fold recognition (Sippl and Weitckus 1992), molecular modeling, and experimental data from circular dichroism (CD) spectroscopy to provide compelling evidence that PYRIN domains adopt the death domain-fold.
Results and Discussion
Initial sequence analysis of the originally reported PYRIN domain sequences (Bertin and DiStefano 2000), using the homology search tools Pfam and BLAST, failed to identify any statistically significant sequence homology with previously categorized protein domains or proteins of known three-dimensional structure. This analysis did, however, identify three additional human proteins as containing N-terminal PYRIN domains: the interferon-inducible gene products absent-in-melanoma-2 (AIM2), interferon-inducible protein-16 (IFI16), and myeloid cell nuclear differentiation antigen (MNDA), and their murine homologues, interferon-activatable proteins 203, 204, and 205 (Landolfo et al. 1998; Geng and Choubey 2000). The presence of PYRIN domains in AIM2 and IFI16 was reported previously by Aravind et al. (2001).
A multiple sequence alignment of the identified PYRIN domains is shown in Figure 1A. As noted for the death domain-fold superfamily (Hofmann 1999), the level of identity within the PYRIN domain sequences is rather low. Pairwise sequence alignments show that the three interferon-inducible proteins, AIM2, IFI16, and MNDA, are the most similar, with sequence identities in the range of 29–49%; the remaining pairwise alignments yield sequence identities of 8–28% (Table 1). Nevertheless, the multiple sequence alignment does reveal a distinct pattern of hydrophobic residues that likely form the core of the PYRIN domain fold.
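As an illustration of how pairwise identities of this kind can be computed from two rows of an alignment, the following minimal Python sketch may be helpful. It is not the procedure used to generate Table 1: the function name, the convention of skipping gapped columns, and the toy sequences are all assumptions made for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of an alignment.

    Columns in which either row has a gap are skipped, and the denominator
    is the number of remaining columns. This convention is an assumption;
    other definitions divide by the length of the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # ignore columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy aligned fragments (invented for illustration, not taken from Figure 1A).
seq_1 = "MKLL-EALEE"
seq_2 = "MRLLQEGL-D"
identity = percent_identity(seq_1, seq_2)
print(f"{identity:.1f}% identity")
```

Because the choice of denominator changes the reported value, identities computed with different conventions are not directly comparable.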
Secondary structure predictions obtained using three different programs were in good general agreement, with each method predicting almost exclusively helical secondary structure. The consensus of the individual predictions is indicated in Figure 1A.
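One simple way to combine several per-residue predictions into a consensus is a column-wise majority vote, sketched below in Python. This is an assumption about how such a consensus might be formed, not a description of the server's actual algorithm; the three-state alphabet (H, E, C), the `?` symbol for undecided positions, and the toy prediction strings are hypothetical.

```python
from collections import Counter


def consensus_secondary_structure(predictions: list[str], undecided: str = "?") -> str:
    """Column-wise majority vote over equal-length prediction strings.

    A state is accepted only when a strict majority of the methods agree;
    otherwise the `undecided` symbol is written at that position.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        state, votes = Counter(column).most_common(1)[0]
        consensus.append(state if votes * 2 > len(predictions) else undecided)
    return "".join(consensus)


# Toy predictions from three hypothetical methods (H = helix, E = strand, C = coil).
method_a = "CHHHHHHCCE"
method_b = "CCHHHHHHCC"
method_c = "CHHHHHCCCH"
consensus = consensus_secondary_structure([method_a, method_b, method_c])
print(consensus)
```

Requiring a strict majority leaves positions where all three methods disagree explicitly unassigned, rather than arbitrarily favoring one method.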
Because the PYRIN domains showed no statistically significant sequence homology to proteins of known three-dimensional structure, we used a potential-based fold recognition or “threading” method to determine which protein folds give the best alignments with the sequences. Each of the PYRIN domain sequences was threaded against a comprehensive fold-library using the program ProFit (ProCeryon Biosciences, Inc.). The resulting sequence-structure alignments were ranked based on a combined energy score derived from residue–residue and residue–solvent interactions (pair/surf), a sequence similarity score (seq), or a normalized combination of both pair/surf and seq. In all cases death domain-fold family members were represented among the top 20 pair/surf scores. Most of the other high-ranking folds were eliminated as possibilities if the fold recognition models contained significant amounts of β-sheet (in contrast to the secondary structure prediction), or if they did not correspond to complete compact domains (because we assume that the PYRIN domain sequences should correspond to compact globular folded domains). Additionally, the death domain-fold is the only candidate fold, based on SCOP classifications (Murzin et al. 1995), that is ranked in the top 20 pair/surf scores for all the PYRIN domain sequences tested. This is a strong indication that the homologous PYRIN domains share the death domain-fold. Other folds that are commonly featured among the top-ranking candidates include the calcium-binding EF-hand, the DNA/RNA-binding three-helical bundle, and the α-α-superhelix fold (which includes ankyrin and armadillo repeats). The death domain-fold sequences also score well when threaded against the α-α-superhelix fold, in particular the ankyrin repeats, although proteins with this fold are typically longer than the ∼90 residue death domain-fold sequences (W.J. Fairbrother, unpubl.).
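The criterion that a fold must appear among the top-ranked hits for every query sequence amounts to a set intersection over the per-sequence rankings. The Python sketch below illustrates that filtering step only; it does not reproduce ProFit output, and the function name, query labels, and ranked lists are invented for the example.

```python
def folds_in_every_top_n(rankings: dict[str, list[str]], n: int = 20) -> set[str]:
    """Return the fold classes found in the top-n hits of every query.

    `rankings` maps a query name to the fold classes of its hits, ordered
    from best to worst score.
    """
    top_sets = [set(hits[:n]) for hits in rankings.values()]
    return set.intersection(*top_sets) if top_sets else set()


# Hypothetical ranked fold classes for three query sequences (toy data).
rankings = {
    "query_1": ["death domain-fold", "EF-hand", "alpha-alpha superhelix"],
    "query_2": ["EF-hand", "three-helical bundle", "death domain-fold"],
    "query_3": ["alpha-alpha superhelix", "death domain-fold", "three-helical bundle"],
}
shared = folds_in_every_top_n(rankings, n=3)
print(shared)
```

In this toy data set only one fold class survives the intersection, which mirrors the logic of the argument above without implying anything about the actual scores.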
The pair/surf scores and identities obtained for the PYRIN domain sequences versus the death domain-folds in the fold-library are compiled in Table 2. To provide a suitable baseline for interpreting these results, we also threaded the sequences corresponding to the death domain-fold family members in the fold library against the same structures (Table 3). With a couple of exceptions, the pair/surf scores and identities obtained for the PYRIN domains threaded against the death domain-folds are in the same range as those observed for threading the DD sequences against the CARD folds and vice versa, indicating that the PYRIN domain sequences are as consistent with the death domain-fold as sequences known to adopt this fold. Of the exceptions, the pyrin sequence versus the ICEBERG (1dgn) fold is clearly misaligned (based on the other sequence-structure alignments), with the N-terminal half of the sequence being aligned with the C-terminal half of the structure; reducing the weight of sequence comparison information to zero when calculating the sequence-structure alignment results in an alignment consistent with the others.
To test the compatibility of the PYRIN domain sequences with the death domain-fold we constructed an homology model of the CARD7 PYRIN domain using Apaf-1 CARD (1cy5), caspase-9 CARD (3ygs.P), p75 DD (1ngr), and mFADD DD (1fad) as template structures. Our choice of the CARD7 PYRIN domain was based on a prior interest in this protein and the fact that we could readily produce the PYRIN domain for experimental verification. Following structural superposition of the template proteins using the program ProSup (ProCeryon Biosciences, Inc.), the initial sequence-structure alignments generated using ProFit were optimized to provide simultaneous agreement with the structurally aligned template sequences and the consensus secondary structure prediction; in each case improved alignments were obtained by reducing the weighting of the sequence information used in the ProFit calculation. The optimized sequence alignment between the CARD7 PYRIN domain and the template sequences is illustrated in Figure 1B. The process of optimizing the sequence-structure alignments significantly reduced the level of sequence identity between the target sequence and the template structures relative to that obtained originally using the default ProFit parameters (Table 2); for the alignment in Figure 1B, the identity with the PYRIN domain sequence ranges from only 8% for the two DDs to 12% for the two CARDs, while the similarity ranges from 12% for the mFADD DD to 19% for the caspase-9 CARD. The agreement between the secondary structure deduced from this alignment and the consensus predicted secondary structure given in Figure 1A, however, is good. The low level of homology between the target and template sequences, together with the fact that the template structures do not share a high homology (Table 3), made this a challenging homology modeling exercise; the use of fold recognition tools to optimize the sequence-structure alignments was a critical step in model construction.
Structurally conserved regions (SCRs) corresponding to the helical secondary structure elements were defined initially, and coordinates for the PYRIN domain were subsequently assigned using HOMOLOGY (Molecular Simulations, Inc.). Because the interhelical loops in the template structures have different lengths (Fig. 1B) and conformations, the loop coordinates for CARD7 were assigned by finding peptide segments in the PDB that fit the model's spatial environment or by generating segments de novo. Following local geometry optimization the model was subjected to global minimization. The final model (Fig. 2) has good stereochemistry as determined using the program PROCHECK (Laskowski et al. 1993), with no residues found in the disallowed region of the Ramachandran plot (86.1% of residues are found in the “most favored” region, 12.7% in the “additional allowed” region, and only one residue is found in the “generously allowed” region). The overall 3D-1D self-compatibility score of the model calculated using Profiles-3D, S/Scalc = 1.02, where Scalc is the score expected for a correct structure having the same sequence length (Lüthy et al. 1992), indicates that the model structure is compatible with the sequence. This value compares favorably with those obtained for the experimentally determined template structures, where S/Scalc = 0.97, 1.17, 0.94, and 0.86 for Apaf-1 CARD (1cy5), caspase-9 CARD (3ygs.P), p75 DD (1ngr), and mFADD DD (1fad), respectively. In the PROSAII hide and seek analysis (Sippl 1993) the model was recognized as the most favorable fold on the basis of a combined z-score calculated from Cα–Cα and Cβ–Cβ interactions. The PROSAII energy profile analysis, using a window size of 10 residues, identified only one residue (Lys47) as having an unfavorable positive energy; the energy profiles obtained look typical of correctly folded protein models and compare favorably with profiles calculated for the death domain-fold template structures.
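The idea behind a windowed energy profile can be sketched generically: per-residue energies are averaged over a sliding window, and residues whose averaged energy is positive are flagged as unfavorable. The Python example below is only a schematic of that idea, not the PROSAII implementation; the centered window truncated at the chain termini, the function names, and the toy energies are all assumptions.

```python
def windowed_profile(energies: list[float], window: int = 10) -> list[float]:
    """Average per-residue energies over a sliding window.

    The window is centered on each residue and truncated at the chain ends
    (an assumption made for this sketch).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    left = window // 2
    right = window - left - 1
    smoothed = []
    for i in range(len(energies)):
        start = max(0, i - left)
        stop = min(len(energies), i + right + 1)
        segment = energies[start:stop]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def unfavorable_positions(profile: list[float], threshold: float = 0.0) -> list[int]:
    """Return 1-based residue positions whose smoothed energy exceeds the threshold."""
    return [i + 1 for i, value in enumerate(profile) if value > threshold]


# Toy per-residue energies in arbitrary units (not derived from the model).
toy_energies = [-1.0, -1.0, 5.0, -1.0, -1.0]
profile = windowed_profile(toy_energies, window=3)
flagged = unfavorable_positions(profile)
print(profile, flagged)
```

Larger windows smooth out isolated unfavorable residues, which is why the window size should always be reported alongside a profile.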
Consistent with the Profiles-3D and PROSAII analyses, the hydrophobic core of the CARD7 PYRIN domain structure is well packed. Residues contributing to the hydrophobic cores of the PYRIN domain model and the template structures are indicated in Figure 1B; the positions of many of these residues are conserved in these diverse sequences. Most of the CARD7 residues found in the hydrophobic core of the model structure are highly conserved as hydrophobic residues in the other PYRIN domain sequences (Fig. 1B), consistent with the other PYRIN domains also adopting the death domain-fold.
To further define the structure of PYRIN domains, the N-terminal 100 residues of CARD7 were expressed and purified. Our initial goal was to verify the structure prediction by NMR spectroscopy; unfortunately, the protein exhibited extensive self-association and was soluble only to a concentration of 20–30 μM (10 mM sodium acetate, pH ∼4), precluding detailed NMR-based structural analysis. Observation of upfield-shifted methyl resonances in 1D-1H NMR spectra and good spectral dispersion in 2D-1H/15N-HSQC spectra, however, confirmed that the protein was folded (data not shown). Note that self-association has also been observed for a number of domains sharing the death domain-fold, including Fas DD (Huang et al. 1996), mFADD DD (Jeong et al. 1999), FADD DED (Eberstadt et al. 1998), RAIDD CARD (Chou et al. 1998), ICEBERG CARD (Humke et al. 2000), and caspase-1 and -2 CARDs (W.J. Fairbrother, unpubl.). A CD spectrum of the CARD7 PYRIN domain was therefore acquired and compared with spectra of the known death domain-fold family members, caspase-1 CARD, hFADD DD, and TNFR1 DD (Fig. 3). CD spectroscopy is complementary to NMR measurements as an experimental probe of the overall fold of a protein. The CD spectrum of the CARD7 PYRIN domain is typical of a predominantly α-helical protein, having a maximum at ∼193 nm and minima at ∼209 and 222 nm. Additionally, the CD spectrum is similar in shape and intensity to those of the CARD and DD proteins, indicating that the PYRIN domain and the death domain-fold proteins have similar amounts of helical secondary structure. The experimental data are thus consistent with the prediction that the PYRIN domains adopt a death domain-like fold.
In conclusion, we have presented results from secondary structure prediction and fold recognition calculations that indicate the recently identified PYRIN domains are structurally related to the death domain-fold superfamily of proteins. Molecular modeling of the PYRIN domain of CARD7, together with CD spectra of the same domain, strongly supports this conclusion. Members of the death domain-fold superfamily function as adaptor modules in apoptosis and inflammatory signaling pathways by mediating specific protein–protein interactions. Recognition that PYRIN domains in related apoptotic and inflammatory proteins share the same fold as the death-domain family is consistent with recent reports that this domain also mediates homotypic protein–protein interactions (Martinon et al. 2001). Efforts to identify specific binding partners for PYRIN domains are ongoing.
Materials and Methods
Sequence homology searches were performed using the homology search tools Pfam (Bateman et al. 2000) and BLAST (Altschul et al. 1990) with a database comprising proteins from the National Biomedical Research Foundation's Protein Information Resource (NBRF/PIR), the SWISSPROT database from EMBL, translations of annotated coding regions in both the GenBank and EMBL nucleotide sequence databases, and selected sequences from the RCSB Protein Data Bank (PDB).
Multiple sequence alignments of the identified PYRIN domains were performed using Clustal X (1.8) (Jeanmougin et al. 1998) or Clustal W (Thompson et al. 1994).
A consensus secondary structure prediction for the aligned PYRIN domain sequences was derived from predictions performed using the programs DSC (King and Sternberg 1996), PHD (Rost and Sander 1993, 1994), and PREDATOR (Frishman and Argos 1996, 1997), as implemented on the NPS@ web server (http://pbil.ibcp.fr/NPSA/npsa_prediction.html) (Combet et al. 2000).
Fold recognition or “threading” experiments were carried out to identify protein three-dimensional structures consistent with the PYRIN domain sequences. The program ProFit (ProCeryon Biosciences, Inc.) was used to screen the PYRIN domain sequences against a fold-library of 4,749 structures (Flöckner et al. 1995, 1997; Sippl and Flöckner 1996). The sequence/structure compatibility was evaluated based on a combination of knowledge-based potentials computed from pair-wise residue–residue interactions and protein residue–solvent interactions, together with sequence information based on the BLOSUM40 amino acid substitution matrix.
An homology model of the PYRIN domain of CARD7 was constructed using the HOMOLOGY module of InsightII 98.0 (Molecular Simulations, Inc.). A consensus alignment between the target sequence and four template structures was based on the alignments generated by ProFit, a structural alignment of the homologous template structures using the program ProSup (ProCeryon Biosciences, Inc.) (Domingues et al. 2000; Koppensteiner et al. 2000), and the consensus-predicted secondary structure. Final energy minimization was carried out using the program DISCOVER (Molecular Simulations, Inc.) and the all-atom AMBER forcefield (Weiner et al. 1986). The resulting model was validated using the programs Profiles-3D (Lüthy et al. 1992), as implemented in InsightII, PROSAII (Sippl 1993), and PROCHECK (Laskowski et al. 1993).
Protein expression and purification
The coding region of the CARD7 PYRIN domain, corresponding to residues 1–100, was subcloned by PCR amplification into the bacterial expression vector pET15b (Novagen) using NdeI and BamHI sites. The resulting plasmid was transformed into BL21(DE3)pLysS competent cells (Stratagene). The cells were grown to OD600 = 0.7, induced with ∼1 mM IPTG, grown for an additional 2 h at 37°C, and then harvested by centrifugation at 5,000 rpm for 20 min. The cell pellet was stored at −80°C. Cells were then homogenized and lysed in 6 M guanidinium–HCl, 50 mM Tris (pH 8.0), and sonicated briefly. Lysed cells were spun at 14,000 rpm for 30 min, and the supernatant was loaded onto a 4-mL Ni-NTA column. The column was washed with 8 M urea, 50 mM Tris (pH 6.3), and the protein was eluted in 8 M urea, 50 mM Tris, 250 mM imidazole (pH 8.0). The fractions containing the target protein were pooled and dialyzed into 2 M urea, 150 mM NaCl, and 50 mM Tris (pH 7.8). Thrombin (1 unit/mg) was added to remove the His-tag. Once the digestion was complete, the pH was adjusted to 4.0 with acetic acid. The protein was dialyzed into 0.1% acetic acid six times, then lyophilized. Electrospray mass spectrometry was used to verify the protein as the CARD7 PYRIN domain with the addition of three residues (GSH) at the N-terminus as introduced by the vector.
CD spectra were acquired using an Aviv model 202 circular dichroism spectrometer. Samples were prepared in 10 mM sodium acetate, pH 4.0, at protein concentrations of 1–3 μM. Samples of hFADD DD (residues 89–208 + 21-residue N-terminal His-tag) and TNFR1 DD (residues 313–416 + 21-residue N-terminal His-tag) were provided by Robert Kelley (Genentech, Inc.). Caspase-1 prodomain (residues 2–134) was provided by Stephanie Shriver and Borlan Pan (Genentech, Inc.). Spectra were recorded from 260 to 190 nm in a 0.5-cm pathlength cuvette at 25°C. Reference spectra of buffer alone were subtracted from the protein spectra, which were then converted to mean residue ellipticity.
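The buffer subtraction and conversion to mean residue ellipticity can be sketched with the standard relation [θ] = θ / (10 · c · l · n), where θ is the observed ellipticity in millidegrees, c the molar protein concentration, l the pathlength in cm, and n the number of residues, giving units of deg cm² dmol⁻¹. The Python example below is a minimal illustration under stated assumptions: the raw ellipticity values are invented, the 2 μM concentration is simply a value within the stated 1–3 μM range, and normalizing by 103 residues (100 plus the three vector-derived residues) rather than by the number of peptide bonds is a choice made for the example.

```python
def subtract_buffer(sample_mdeg: list[float], buffer_mdeg: list[float]) -> list[float]:
    """Subtract the buffer reference spectrum from the sample spectrum, point by point."""
    if len(sample_mdeg) != len(buffer_mdeg):
        raise ValueError("spectra must be sampled at the same wavelengths")
    return [s - b for s, b in zip(sample_mdeg, buffer_mdeg)]


def mean_residue_ellipticity(theta_mdeg: float, conc_molar: float,
                             path_cm: float, n_residues: int) -> float:
    """Convert observed ellipticity (mdeg) to mean residue ellipticity.

    Result is in deg cm^2 dmol^-1 per residue.
    """
    if conc_molar <= 0 or path_cm <= 0 or n_residues <= 0:
        raise ValueError("concentration, pathlength, and residue count must be positive")
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Invented raw readings at two wavelengths (mdeg); not measured data.
sample = [-10.8, -5.65]
buffer = [-0.5, -0.5]
corrected = subtract_buffer(sample, buffer)

# Assumed parameters: 2 uM protein, 0.5-cm cuvette, 103 residues.
mre = [mean_residue_ellipticity(t, 2e-6, 0.5, 103) for t in corrected]
print([round(value) for value in mre])
```

Because the result scales inversely with concentration, errors in protein concentration at low micromolar levels translate directly into errors in the reported intensities.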
We thank Ben Hitz, Maria Teresa Pisabarro, and Nicholas Skelton for helpful discussions and advice, Stephanie Shriver and Borlan Pan for the CD spectrum of caspase-1 prodomain, Robert Kelley for supplying hFADD DD and TNFR1 DD, and James Bourell for mass spectrometry.