Comparative genomics of the Mill family: a rapidly evolving MHC class I gene family

Authors

  • Yutaka Watanabe,

    1. Department of Biosystems Science, School of Advanced Sciences, The Graduate University for Advanced Studies (Sokendai), Hayama, Japan
  • Takako Maruoka,

    1. Department of Biosystems Science, School of Advanced Sciences, The Graduate University for Advanced Studies (Sokendai), Hayama, Japan
    2. College of Bioresource Sciences, Nihon University, Fujisawa, Japan
  • Lutz Walter,

    1. Abteilung Immungenetik, Universität Göttingen, Göttingen, Germany
  • Masanori Kasahara

    Corresponding author
    1. Department of Biosystems Science, School of Advanced Sciences, The Graduate University for Advanced Studies (Sokendai), Hayama, Japan
    • Department of Biosystems Science, School of Advanced Sciences, The Graduate University for Advanced Studies (Sokendai), Shonan Village, Hayama 240-0193, Japan Fax: +81-46-858-1544

Abstract

Mill (MHC class I-like located near the leukocyte receptor complex) is a novel family of class I genes identified in mice that is most closely related to the human MICA/B family. In the present study, we isolated Mill cDNA from rats and carried out a comparative genomic analysis. Rats have two Mill genes orthologous to mouse Mill1 and Mill2 near the leukocyte receptor complex, with expression patterns similar to those of their mouse counterparts. Interspecies sequence comparison indicates that Mill is one of the most rapidly evolving class I gene families and that non-synonymous substitutions occur more frequently than synonymous substitutions in its α 1 domain, implicating the involvement of Mill in immune defenses. Interestingly, the α 2 domain of rat Mill2 contains a premature stop codon in many inbred strains, indicating that Mill2 is not essential for survival. A computer search of the database identified a horse Mill-like expressed sequence tag, indicating that Mill emerged before the radiation of mammals. Hence, the failure to find Mill in humans indicates strongly that it was lost from the human lineage. Our present work provides convincing evidence that Mill is akin to the MICA/B family, yet constitutes a distinct gene family.

Abbreviations:
CEA: Carcinoembryonic antigen
CP: Connecting peptide
CYT: Cytoplasmic tail
LRC: Leukocyte receptor complex
Mill: MHC class I-like located near the LRC
RACE: Rapid amplification of cDNA ends
RH: Radiation hybrid
RT: Reverse transcription
SP: Signal peptide
TM: Transmembrane segment
UTR: Untranslated region

1 Introduction

Classical MHC class I genes, known as class Ia genes, are highly polymorphic and are expressed almost ubiquitously 1, 2. They encode the heavy chain of class I molecules that present antigenic peptides to CD8+ T cells, thereby enabling the immune system to destroy abnormal cells that synthesize viral or other foreign proteins. On the other hand, non-classical class I genes, collectively called class Ib, are oligomorphic or monomorphic and usually have restricted tissue distribution with low cell surface expression 3. Accumulated evidence indicates that class Ib genes have diverse functions ranging from specialized antigen presentation 4–8 to the transport of IgG 9, pheromone detection 10, 11 and lipid mobilization and catabolism 12. Whereas class Ia genes are encoded exclusively in the MHC, a significant proportion of class Ib genes are encoded outside the MHC 13.

We recently described a new family of mouse class Ib genes encoded outside the MHC 14. Because this gene family is located near the leukocyte receptor complex (LRC), we named it Mill (MHC class I-like located near the LRC). Although the biologic function of the Mill family, which is made up of two members designated Mill1 and Mill2, is currently not understood, it has interesting features suggesting a role in innate immunity. First, among the known class I genes, Mill genes are most closely related to the MICA/B family encoded in the HLA. Second, the Mill family is apparently absent in humans and, conversely, mice do not have the MICA/B family 15. Third, MILL molecules lack residues essential for the docking of peptides, strongly suggesting that they do not bind peptides. Fourth, Mill genes are poorly transcribed in most adult tissues, suggesting that some stimuli may induce their expression. To understand the evolutionary dynamics of the Mill family and its relationship to the MICA/B family, we isolated Mill cDNA from rats and carried out a comparative genomic analysis.

2 Results

2.1 Isolation of rat Mill cDNA

To isolate rat Mill cDNA, we first performed Blast searches of the rat draft genome assembly, using mouse Mill1 and Mill2 cDNA sequences as queries. These searches identified two clones likely to contain rat Mill genes: clone CH230–389A5 containing Mill1 (AC126511.2) and clone CH230–105N7 containing Mill2 (AC110846.4). Using this sequence information, we isolated full-length cDNA coding for MILL1 and MILL2 from WKAH/HkmSlc rats (Fig. 1). A 3′-rapid amplification of cDNA ends (RACE) revealed that the 3′-untranslated region (UTR) of Mill1 occurs in two distinct forms. One form had a 3′-UTR of 1,453 bp with a canonical polyadenylation signal preceding the poly(A) tail (AB113960). Another form lacking a typical polyadenylation signal had a 3′-UTR of 1,006 bp (AB126063). The 3′-UTR of Mill2 was made up of 1,493 bp with a canonical polyadenylation signal preceding the poly(A) tail (AB113961).

Rat MILL1 is made up of 405 amino acids, the N-terminal 32 residues of which were predicted to constitute the signal peptide (SP). Rat MILL2 has 373 amino acids, the N-terminal 29 residues of which were predicted to form the SP. The calculated molecular masses of the mature MILL1 and MILL2 polypeptides are 43,062.69 and 38,253.60, respectively. Residues 357–373 of MILL1 were predicted to constitute the transmembrane segment (TM). An alternative possibility for the TM is residues 331–345. If this latter possibility is correct, the cytoplasmic tail (CYT) of MILL1 contains di-leucine motifs that might function as sorting or internalization signals. In rat MILL2, residues 319–341 were predicted to form the TM. Thus, the putative CYT of rat MILL2 has only three residues. Rat MILL1 has four potential N-linked glycosylation sites in the extracellular domains as compared to three in mouse MILL1. Like mouse MILL2, rat MILL2 has three potential N-linked glycosylation sites.
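The search for potential N-linked glycosylation sites can be illustrated with a short scan for the N-X-T/S motif named in the legend of Fig. 1. This is a minimal sketch, not the procedure used in the study: the function name and the toy peptide are hypothetical, and excluding proline at the X position is a common convention that is assumed here rather than stated in the text.

```python
import re

# N-X-S/T sequon. Excluding proline at X is an assumption (common convention);
# the figure legend only states "NXT/S". The lookahead allows overlapping hits.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of the Asn, matched tripeptide) for each hit."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide for illustration; this is not a MILL sequence.
toy_peptide = "MKNGSAANPTLLNNTS"
print(find_sequons(toy_peptide))  # [(3, 'NGS'), (13, 'NNT'), (14, 'NTS')]
```

Positions are reported relative to the start of whatever sequence is passed in, so counts for the extracellular domains would require trimming the SP, TM and CYT beforehand.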

Phylogenetic analysis using the amino acid sequences of the α 1–α 3 domains showed that rat MILL1 and MILL2 are orthologs of mouse MILL1 and MILL2, respectively (Fig. 2A), indicating that Mill1 and Mill2 duplicated before the divergence of rats and mice. Furthermore, this tree confirmed our previous conclusion 14 that, of all known class I genes, the Mill family is most closely related to the human MICA/B family.

Figure 1.

Deduced amino acid sequences of rat MILL1 and MILL2. The sign ‘–’ and asterisks indicate identity with the top sequence and absence of residues, respectively. Predicted SP are underlined. Potential N-linked glycosylation sites (NXT/S) are also underlined. Putative TM are boxed. Triangles indicate the locations of exon/intron boundaries. Mouse Mill2 has an additional intron at the end of the exon coding for the α 3 domain. Horse MILL has a frameshift mutation (denoted as X) in the α 2 domain. Horizontal bars (S1–S7) and # indicate the predicted locations of β -strands in the α 3 domain and conserved cysteine residues likely to form disulfide bridges, respectively.

Figure 2.

 Neighbor-joining trees of representative mammalian MHC class I molecules based on the amino acid sequences of the α 1–α 3 domains (A), the α 1 domain (B), the α 2 domain (C) and the α 3 domain (D). Numbers at the nodes represent the bootstrap values (shown only in main branches and only when they exceed 80). H–, M– and R– stand for human, mouse and rat, respectively. Names of other animals are spelled out.

2.2 The Mill family displays a high level of sequence divergence between rats and mice

The deduced rat MILL1 and MILL2 molecules showed overall amino acid identities of 65.4% and 62.4% to mouse MILL1 and MILL2, respectively. This level of sequence conservation is among the lowest in all known class I molecules (Table 1). Of the class I gene families that share orthologous copies between rats and mice, Mill was clearly the least conserved. Indeed, the extent of sequence conservation in MILL (71–74% amino acid identity in the α 1–α 3 domains) was comparable to that normally found in the comparison of human and rodent class I molecules. Long branch lengths connecting the Mill genes of rats and mice (Fig. 2A) also testify to the high level of sequence divergence in the Mill family.

Among the α 1–α 3 domains, the α 3 domain of MILL1 was the least conserved (Table 1); this was mainly because this domain of mouse MILL1 contained many residues characteristic of MILL2 (Fig. 1). Except for this, the α 1 domain showed the highest sequence divergence. Furthermore, non-synonymous substitutions were more frequent than synonymous substitutions in the α 1 domain (Table 2). When calculations were made using both Mill1 and Mill2 genes, the excess of non-synonymous substitutions in the α 1 domain was found to be statistically significant, indicating that the α 1 domain is under diversifying selection. By contrast, synonymous substitutions were more frequent than non-synonymous substitutions in the α 2 and α 3 domains.

To more precisely address the evolutionary dynamics of the Mill family, we made neighbor-joining trees using individual extracellular domains (Fig. 2B–D). In the α 1 domain, rat MILL1 was most closely related to mouse MILL1, and the same was true for MILL2, consistent with the idea that rats and mice share orthologous Mill genes (Fig. 2A). However, in the α 2 domain, rat MILL1 was more closely related to rat MILL2 than it was to mouse MILL1 (Fig. 2C). Likewise, the α 3 domain of mouse MILL1 was more closely related to that of mouse MILL2 than it was to that of rat MILL1 (Fig. 2D). These observations suggest that the α 2 and α 3, but not α 1, domain sequences were homogenized by gene conversion-like events in each species. It is notable that the Mill family clustered with the MICA/B family in all domain-by-domain comparisons, although we obtained bootstrap values over 80 only in the α 3 domain.

Table 1. Amino acid sequence comparison of rat and mouse class I molecules a)
  a) Scores stand for percentage of identical amino acid residues for the indicated domain(s). Accession numbers of mouse sequences: MILL1, AB086265; MILL2, AB086267; AZGP, Q64726; CD1D, I49581; FCGRT, I56197; HFE, P70387; H2-M3, NP_038847; MR1, JC5663; Qa-1 (H2-T23), NM_010398; H2-TL, P14433; PROCR, A55945; H2-K, I49713; H2-D, NP_034510; and H2-L, I54069. Accession numbers of rat sequences: MILL1, AB113960; MILL2, AB113961; AZGP, NM_012826; CD1D, NM_017079; FCGRT, X14323; HFE, NM_053301; M3, NM_022921; MR1, XM_213919; Qa-1 (RT-BM1), AF029240; TL (RT1-N1), M74822; PROCR, ENSRNOG00000019330; RT1-A1, X90375; and RT1-A2, X90376. NA, not available; ‘–’, not applicable.

Table 2. Non-synonymous (dN) and synonymous (dS) substitution rates in the rodent Mill family a)

2.3 The α2 domain of rat Mill2 contains a premature stop codon in many inbred strains

Comparison of the rat draft genome assembly and our cDNA sequences revealed multiple nucleotide changes in both Mill genes. Furthermore, the former predicted a stop codon in the α 2 domain of Mill2. Because the genomic and our cDNA sequences were derived from BN/SsN and WKAH/HkmSlc rats, respectively, these nucleotide differences could represent sequence polymorphisms. To test this possibility, we isolated Mill1 and Mill2 cDNA by RACE from BN/SsNSlc rats. The sequences of these cDNA (AB113962 and AB113963) perfectly matched the exonic sequences of the genomic clones, confirming that Mill1 and Mill2 have sequence polymorphisms and that Mill2 contains the stop codon in BN/SsN rats. We performed reverse transcription (RT)-PCR using cDNA from several tissues as templates to examine whether rat Mill2 generates splicing variants that lack the α 2 domain; no such variants were identified in WKAH/Hkm or BN/SsNSlc rats (data not shown). Therefore, Mill2 of BN/SsNSlc rats is a pseudogene that retains transcriptional activity.

We extended our analysis of Mill polymorphism by determining exonic sequences of Mill1 and Mill2 from an additional 12 inbred rat strains (Table 3). We found four alleles for Mill1 and five alleles for Mill2. Interestingly, 12 of the 14 rat strains had an identical premature stop codon, indicating that Mill2 is a pseudogene in many rat strains. In total, seven different combinations of Mill1 and Mill2 alleles were identified.

Table 3. Polymorphism of Mill1 and Mill2 in inbred rat strains a)
  a) The signs ‘–’ and ‘*’ indicate identity with the top sequence and absence of residues, respectively. Database accession numbers: AB113964–AB114035.
  b) Codon numbers of mature MILL proteins.
  c) Domains where the codons are located.


2.4 Expression analysis

Mouse Mill genes are poorly transcribed in most adult tissues. While Mill2 is transcribed in most adult and neonatal tissues at low levels, Mill1 is transcribed in selected tissues such as neonatal skin and adult muscle 14. To examine whether these features of the Mill family are conserved across species, we carried out an expression analysis of Mill1 and Mill2 by RT-PCR in WKAH/HkmSlc rats (Fig. 3). As in mice, rat Mill2 transcripts were detectable in all tissues examined. By contrast, the distribution of rat Mill1 transcripts was more restricted, as in mice; Mill1 transcripts were detectable in the skin of neonatal rats as well as in esophagus, tongue, skin, muscle, uterus, ovary, testis and epididymis of adult rats. Interestingly, unlike mouse Mill1, rat Mill1 was expressed in reproductive organs such as testis and epididymis.

Figure 3.

 Expression profiles of rat Mill1 and Mill2. Expression patterns were analyzed by RT-PCR using first strand cDNA prepared from various tissues of 5-day-old neonatal (A) or adult (B) rats. β -actin was used as a positive control.

2.5 Structures of the rat Mill genes are similar, but not identical, to those of their mouse counterparts

We deduced the structures of the rat Mill genes by comparing the cDNA sequences of WKAH/HkmSlc with the genomic sequences (Fig. 4A). Rat Mill1 has six or seven exons, depending on isoforms, and spans 26 kb. A unique feature of the mouse Mill family is the occurrence of an exon between the exons coding for the SP and for the α 1 domain 14. The presence of this small exon, which can be spliced out in mouse Mill2, but not in Mill1, adds 28 and 14 amino acid residues to the N-termini of mouse MILL1 and MILL2, respectively. Rat Mill1 also has this exon coding for 26 amino acids. Thus, exons 1, 3 and 4 encode the SP, α 1 domain and α 2 domain, respectively. The rest of the coding sequence is encoded by a single exon as in mouse Mill1. However, the 3′-end of Mill1 has different organizations between rats and mice. In mouse Mill1, the 3′-UTR is encoded in a single exon, along with the α 3 domain, the connecting peptide (CP), TM and CYT. By contrast, the 3′-UTR of rat Mill1 is split into two or three exons. As described above, rat Mill1 generates two types of cDNA that differ in their 3′-UTR. In one form (AB113960), both exons 6 and 7 are utilized to encode the bulk of the 3′-UTR; in another form (AB126063), only exon 6 is used to code for the bulk of the 3′-UTR. Thus, the two forms of 3′-UTR result from alternative splicing.

Rat Mill2 has five exons and spans 19 kb. Although we were unable to identify transcripts with an insertion between the SP and the α 1 domain, the rat Mill2 gene has a sequence stretch that resembles exon 2 of mouse Mill2 (location indicated with an asterisk in Fig. 4A). A notable feature of rat Mill2 is that, similar to mouse and rat Mill1, but unlike mouse Mill2, the α 3 domain, CP, TM and CYT are encoded in a single exon. In mouse Mill2, exon 5 coding for the α 3 domain is separated from exon 6 coding for the CP, TM and CYT by a 104-bp intron. In rat Mill2, the GT dinucleotides that function as a splice donor site in mouse Mill2 are mutated to GC, and the AG dinucleotides that function as a splice acceptor site in mouse Mill2 are mutated to GG (Fig. 4B). Thus, the region that is split into exon 5, intron 5 and exon 6 in mouse Mill2 occurs as a single exon in rat Mill2. Interestingly, the region of rat Mill2 corresponding to intron 5 of mouse Mill2 has a single nucleotide insertion compared to mouse. This insertion maintains the reading frame of rat Mill2 and results in the insertion of 35 amino acids in the CP (Fig. 1).
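The reading-frame argument above can be checked with simple arithmetic: a 104-bp region plus one inserted nucleotide gives 105 bp, which is a multiple of three and corresponds to 35 codons. The sketch below assumes that the whole intron-equivalent region is retained in the rat transcript and that it contains no other length differences; the function name is hypothetical.

```python
MOUSE_INTRON_5_BP = 104  # length of mouse Mill2 intron 5, from the text
RAT_INSERTION_BP = 1     # single-nucleotide insertion in rat, from the text


def retained_codons(region_bp: int, insertion_bp: int) -> int:
    """Return the number of in-frame codons added by a retained region.

    Raises ValueError when the combined length would shift the reading frame.
    """
    total = region_bp + insertion_bp
    if total % 3 != 0:
        raise ValueError(f"{total} bp is not a multiple of 3; frame would shift")
    return total // 3


print(retained_codons(MOUSE_INTRON_5_BP, RAT_INSERTION_BP))  # 35
```

Without the insertion, 104 bp would leave a remainder of 2 and the downstream TM and CYT would be read out of frame.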

2.6 Mill1 and Mill2 have multiple transcription initiation sites

To determine transcription initiation sites, we conducted 5′-RACE experiments using the GeneRacer™ kit. Of 12 Mill1 5′-RACE clones sequenced, one, four and seven clones started 25, 29 and 36 bp upstream from the adenine residue of the translation initiation codon, respectively (Fig. 4C). Thus, rat Mill1 has at least three transcription initiation sites, with an adenine residue at position –36 serving as a major transcription start site. Similar experiments revealed that the transcription of rat Mill2 occurs at 15, 23, 26 and 33 bp upstream from the adenine residue of the translation initiation codon. The major transcription initiation site represented by 8 out of 14 5′-RACE clones was the adenine residue located 26 bp upstream from the translation initiation codon. The promoter region of class Ia genes contains a series of conserved upstream DNA sequences 16. None of these elements was found in the promoter region of rat or mouse Mill genes. The upstream region of rat Mill1 contains potential binding motifs for NF-κB and AP-1 at positions –626 to –617 and –390 to –382, respectively (Fig. 4A). These motifs were also found at similar positions in mouse Mill1.

Figure 4.

 (A) Organizations of the rat Mill genes. Numbered boxes indicate exons. Coding regions are shown as solid boxes. Stippled boxes indicate 5′- and 3′-UTRs. Segments of 50–100% identity between the rat and mouse sequences were plotted based on the coordinates of the rat sequences, using the PIPmaker program. The mouse sequences were obtained from the Celera database through use of the Celera Discovery System. (B) The genomic sequence of mouse Mill2 encoding intron 5 and its adjacent region, and the corresponding sequence of rat Mill2. Filled triangles indicate the exon/intron boundaries. Conserved dinucleotides at the splice sites, GT/AG, are underlined. The sign ‘–’ and asterisks indicate identity with the top sequence and absence of residues, respectively. (C) Transcription initiation sites of rat Mill1 and Mill2. Arrowheads on top of the sequence indicate the start positions of the 5′-RACE clones. The number of clones that started at each position is indicated in brackets. Nucleotide position 1 is the adenine residue of the translation initiation codon indicated by double underlines. Exon 1 is shown in capital letters.

2.7 Mill1 is embedded in the carcinoembryonic antigen (CEA) gene cluster and Mill2 is adjacent to this cluster

We physically mapped rat Mill1 and Mill2 using a rat/ hamster radiation hybrid (RH) panel. The vector data were 0010000000000000000001100010000010010000000000000001001001000000000001000001000011010011000100010010111000 for Mill1 and 0000000010000000000001100010000010010000000000000001001001000000000001000001000011010011000100010010111000 for Mill2. Both genes could be mapped to rat chromosome 1 between markers D1Rat23 and D1Rat97 (not shown) and, thus, near the LRC which is also localized on chromosome 1.
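RH retention vectors such as those above can be compared position by position: markers that lie close together tend to be retained or lost in the same hybrid cell lines, so a small number of discordant positions is consistent with tight linkage. The following is a minimal sketch of that comparison only, not the mapping algorithm of the RH servers used in this study. The function name, the toy vectors and the treatment of characters other than 0 and 1 as unscored are assumptions.

```python
def rh_discordance(vec_a: str, vec_b: str) -> tuple[int, int]:
    """Compare two RH retention vectors of equal length.

    '0' = marker absent, '1' = marker present in a hybrid line. Any other
    character is treated as unscored and skipped (an assumption).
    Returns (number of hybrids scored in both, number of discordant hybrids).
    """
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must cover the same hybrid panel")
    informative = 0
    discordant = 0
    for a, b in zip(vec_a, vec_b):
        if a in ("0", "1") and b in ("0", "1"):
            informative += 1
            if a != b:
                discordant += 1
    return informative, discordant


# Hypothetical toy vectors for illustration; not the Mill1/Mill2 data.
print(rh_discordance("0010110", "0000111"))  # (7, 2)
```

The Mill1 and Mill2 vectors reported in the text can be passed to the same function as plain strings.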

Localization of rat Mill1 and Mill2 to the vicinity of the LRC is in good accord with the NCBI Rattus norvegicus genome Build 2 assembly (Fig. 5). In this assembly, Mill1 and Mill2 are approximately 100 kb apart from each other, with the same transcriptional orientation. Mill1 is embedded in the cluster of the CEA gene family. Mill2 is located adjacent to, but outside the telomeric border of the CEA family. Inspection of the Mus musculus Ensembl assembly (version 18.30.1) indicates that the mouse Mill region has basically similar organizations (Fig. 5). In mice, Mill1 and Mill2 are ∼500 kb apart from each other, both having the same transcriptional orientation as rat Mill1 and Mill2. Most of the organizational differences between rat and mouse can be ascribed to the distribution of two categories of genes: genes for CEA that do not show orthology between rats and mice, and computationally predicted gene or gene fragments. In the mouse, but not the rat Mill region, we identified two pseudogenes of the Mill family designated Mill-ps1 and Mill-ps2. Mill-ps1 has two vestigial exons encoding the α 1 and α 2 domains, and Mill-ps2 has three vestigial exons encoding the α 1–α 3 domains.

Figure 5.

 Organization of the Mill region in rats and mice. Physical map of a 1-Mb region containing Mill1 and Mill2 is shown for each species. Pregnancy-specific glycoproteins (Psg) are members of the CEA family. Gene names not written in italics are provisional ones. Arrows indicate orientations of transcription. (t) The corresponding transcripts have been identified at the mRNA level; (c) computationally predicted genes or gene fragments for which no information on expression is available.

2.8 Mill is not a rodent-specific class I gene family

A computer search of GenBank identified a horse expressed sequence tag (BM781326) presumed to encode a member of the Mill family. Using 5′- and 3′-RACE, we obtained a composite full-length cDNA sequence (AB126064). Translation of this clone revealed that it contained a 1-bp deletion in the presumptive α 2 domain. The presence of this frameshift mutation was confirmed by PCR using the genomic DNA isolated from three horses as a template (data not shown). In Fig. 1, we show the hypothetical sequence of horse MILL, assuming that there was no frameshift mutation. Phylogenetic analysis showed that this horse sequence qualifies as a member of the Mill family (Fig. 2).

3 Discussion

Mill is a novel family of MHC class I genes identified in mice 14. Our previous work indicated that this gene family is most likely absent in man and that, of all known class I genes, Mill1 and Mill2 are most closely related to the human MICA/B family. These observations, coupled with the absence of MICA/B in mice, raised the possibility that Mill might be a rodent counterpart of human MICA/B. In the present study, we isolated Mill cDNA from rat and horse and performed a comparative genomic analysis of the Mill family.

One striking feature of the Mill family that became apparent from the present study is the low interspecies sequence conservation. Among the class I genes that share orthologous copies between rats and mice, Mill1 and Mill2 were clearly the least conserved (Table 1). The sequence divergence of the Mill family even exceeds that of class Ia genes which do not share orthologous copies between rats and mice. The only class I gene family that might show higher sequence divergence between rats and mice is the Raet family 17, 18. Although the absence of information on rat Raet cDNA does not allow us to perform rigorous comparison of the Raet family in rats and mice, our preliminary analysis suggests that this gene family, which probably does not share orthologous copies, shows less than 50% overall sequence identity at the amino acid level.

Interspecies comparison of the Mill family revealed that non-synonymous substitutions occur more frequently than synonymous substitutions in the α 1, but not in the α 2 or α 3 domain (Table 2), indicating that sequence divergence is positively selected in the α 1 domain. Two other observations also point to the functional importance of this domain. First, the α 1 domain sequence of rat MILL1 is only 44% identical to that of rat MILL2. By contrast, the amino acid sequences of rat MILL1 and MILL2 are 85% and 66% identical in the α 2 and α 3 domains, respectively, suggesting the involvement of the α 1 domain in the functional specialization of MILL1 and MILL2. Second, α 1 is the only extracellular domain apparently free from gene conversion-like events (Fig. 2B–D). This is consistent with the idea that the homogenization of this domain is functionally disadvantageous.

A limited survey of laboratory mice suggested that the Mill family is oligomorphic with a low level of allelic divergence 14. More extensive analysis of Mill polymorphism in rats confirms this suggestion (Table 3). Notably, 12 out of the 14 rat strains analyzed have a stop codon in Mill2, indicating the presence of two major Mill2 allelic lineages. Interestingly, the Mill2 pseudogene retains transcriptional activity in BN/SsN rats, suggesting that the inactivation of Mill2 took place recently. Although we found no null mutations in our previous survey of Mill genes in laboratory mice, Mus spretus has a stop codon in the α 1 domain of Mill1 14. These results indicate that neither Mill1 nor Mill2 is essential for survival.

The failure to find human genes orthologous to Mill 14 raised two possibilities concerning the origin of this gene family. One is that Mill emerged in the rodent lineage after the divergence of the primate and rodent lineages. Another is that Mill emerged before this divergence, but was lost from the human genome. Our previous ‘Zoo’ blot analysis suggested that some non-rodent mammals might have Mill gene(s) 14. The present work provides convincing evidence that Mill is not a rodent-specific class I gene family (Fig. 1, 2). While the phylogenetic relation of modern placental mammals has been a matter of considerable debate, recent evidence indicates that primates and rodents constitute one of the four principal placental lineages that is distinct from that containing odd-toed ungulates 19, 20. Thus, the existence of Mill in odd-toed ungulates and rodents indicates that humans lost this gene family secondarily. Therefore, Mill is a class I gene family closely related to, but distinct from the MICA/B family. In rats and mice, Mill1 is embedded in the CEA cluster, and Mill2 is adjacent to this gene cluster (Fig. 5). Thus, a series of unequal crossover events involving CEA genes 21 may have accidentally eliminated Mill genes in the human lineage.

The comparative genomic analysis reported here enabled us to identify several unique properties of the Mill family such as poor sequence conservation across species (a feature characteristic of immunity-related genes 22), diversifying selection operating on the α 1 domain, poor expression in most tissues, the occurrence of null alleles in many inbred strains of rats, and the existence in non-rodents. All of these features suggest that Mill performs some specialized immune functions required only in certain species or, more likely, some redundant functions part of which are executed by other molecules.

4 Materials and methods

4.1 Animals

Inbred strains of the laboratory rat (Rattus norvegicus) were used in this study. Two strains, WKAH/HkmSlc and BN/SsNSlc, were purchased from Japan SLC, Inc. (Hamamatsu, Japan). Strains BB/OKGun, BUF/Gun and LOU/Gun are maintained in the animal facility of the Abteilung Immungenetik, Universität Göttingen, Germany, whereas strains BDV/Ztm, BDX/Ztm, BH/Ztm, BS/Ztm, DA.1H/Ztm, F344/Ztm, LEW.1LM1/Ztm, PVG/Ztm and R21/Ztm were obtained from Dr. Hans-J. Hedrich, Institut für Versuchstierkunde und Zentrales Tierlaboratorium, Medizinische Hochschule Hannover, Germany. Horse spleen was purchased from Funakoshi Co., Ltd. (Tokyo, Japan).

4.2 Isolation of rat Mill cDNA

A computer search of the draft genome assembly (version 3.1) released by the Rat Genome Sequencing Consortium (http://www.hgsc.bcm.tmc.edu/projects/rat/) identified genomic clones containing Mill1 and Mill2. Based on this sequence information, we designed primers and isolated by PCR the coding regions of rat Mill1 and Mill2 cDNA. The primers were 5′-TCAGGAGTCTCAGAAAGGCAGA-3′ and 5′-CTTGCTGTCTTGGGTGTTCGT-3′ for Mill1, and 5′-CCACACAAAGGAGGAAAATGG-3′ and 5′-TGGAAGCTTCTTATCCTTTAGGC-3′ for Mill2. The cDNA template for PCR was synthesized using total cellular RNA isolated from heart or skin of adult female BN/SsNSlc and WKAH/HkmSlc rats 23. The 5′- and 3′-UTR of rat Mill1 and Mill2 were isolated by RACE using the GeneRacer™ kit (Invitrogen, Carlsbad, CA), which selectively identifies full-length mRNA by removing the 7-methylguanine cap and replacing it with a specific oligo 24. For 5′-RACE, gene-specific primers for Mill1 and Mill2 were 5′-GGCAACTCAGGCAACTTCAGA-3′ and 5′-TGCACTCCTCCTGTCTCCATT-3′, respectively. Gene-specific primers for 3′-RACE were 5′-TCAGGAGTCTCAGAAAGGCAGA-3′ and 5′-GCTCTGGTGGCTTCTGTTCTT-3′ for Mill1, and 5′-CCACACAAAGGAGGAAAATGG-3′, 5′-GCTGCTCCTGACAGAGTCACA-3′ and 5′-TGCACCCAAAGCTGACCTAGT-3′ for Mill2. PCR reactions were set up in triplicate to eliminate PCR errors.

4.3 Expression analysis

We analyzed expression profiles of rat Mill genes by RT-PCR. Total RNA was extracted from various tissues of WKAH/HkmSlc and BN/SsNSlc rats. The cDNA templates were synthesized using the Omniscript reverse transcriptase kit (QIAGEN, Tokyo, Japan). Primer pairs were designed in separate exons so as not to amplify contaminating genomic DNA. The primers were 5′-GCAGTTCGGCTGTGGATAATG-3′ and 5′-GGCAACTCAGGCAACTTCAGA-3′ for Mill1, 5′-GCTGCTCCTGACAGAGTCACA-3′ and 5′-TGCACTCCTCCTGTCTCCATT-3′ for Mill2, and 5′-TGTAACCAACTGGGACGATAT-3′ and 5′-CTTTTCACGGTTGGCCTTAG-3′ for β -actin. Conditions for PCR were initial denaturation at 95°C for 2 min, then 30 cycles of 95°C for 40 s, 56 or 58.5°C (58.5°C for Mill1 and Mill2 and 56°C for β -actin) for 30 s, 72°C for 40 s, and a final extension for 5 min at 72°C. Neonatal tissues were obtained from 5-day-old WKAH/HkmSlc rats.

4.4 Analysis of Mill polymorphism

DNA fragments of Mill1 and Mill2 were amplified by PCR from genomic DNA isolated from 12 inbred rat strains as described 14. The primers for Mill1 were 5′-GCAGCTATTGATGCTTGTGGA-3′ and 5′-CCACTGGTCTGTTCAATCAGC-3′ for the α 1 domain, 5′-GGCTCCACTGAGAAATGAGGT-3′ and 5′-ATTCGGACCACCACAAATACC-3′ for the α 2 domain, and 5′-GCGAATATGAACAGGGTGGTT-3′ and 5′-TGGGTGGGTGACTACCTTTCT-3′ for the α 3/CP/TM/CYT domains. The primers for Mill2 were 5′-CCACAGCCGTACAGTTCTCTG-3′ and 5′-GCCCCATGGAAAGTGTTAGTG-3′ for the α 1 domain, 5′-TAAGCCATGATTCTGGGAAGG-3′ and 5′-CCGGATCCTTCTGTCTTGTCT-3′ for the α 2 domain, and 5′-GTGCTGTACTCCCCAGTCCTC-3′ and 5′-CGGGCTAGAAGGAAAACCAG-3′ for the α 3/CP/TM/CYT domains. PCR products were subjected to direct sequencing. PCR reactions were set up in duplicate to avoid PCR errors.

4.5 Cloning of horse Mill cDNA

We designed PCR primers based on the horse expressed sequence tag (BM781326) and isolated full-length cDNA using the GeneRacer™ kit. The cDNA template for PCR was synthesized from total cellular RNA extracted from horse spleen. Gene-specific primers were 5′-CTCACATCCTCCGTGTCCTT-3′ and 5′-TACAGGGACCGTGATGTTCA-3′ for 5′-RACE, and 5′-ACTCTGACCTGGCTTCTGGA-3′ and 5′-TGAACATCACGGTCCCTGTA-3′ for 3′-RACE. Cycling conditions for the first round of PCR were 5 cycles of 30 s at 94°C, 2 min at 72°C, 5 cycles of 30 s at 94°C, 2 min at 70°C, then 20 cycles of 30 s at 94°C, 30 s at 60°C, 2 min at 68°C. Cycling conditions for the second round of PCR were initial denaturation of 2 min at 94°C, 25 cycles of 30 s at 94°C, 30 s at 60°C, 2 min at 68°C, and a final extension of 10 min at 68°C.

4.6 RH mapping

Mapping of the rat Mill genes was performed using the T55 rat/hamster RH cell panel 25 obtained from Research Genetics (Huntsville, AL). The primer pairs used for mapping were 5′-GCAGCTATTGATGCTTGTGGA-3′ and 5′-CCACTGGTCTGTTCAATCAGC-3′ for Mill1, and 5′-CCACAGCCGTACAGTTCTCTG-3′ and 5′-GCCCCATGGAAAGTGTTAGTG-3′ for Mill2. These primer pairs amplified bands of 480 bp (Mill1) and 478 bp (Mill2) in rat, but not in Chinese hamster DNA samples. PCR conditions for RH mapping were initial denaturation at 94°C for 3 min followed by 35 cycles of 30 s at 94°C, 30 s at 58°C and 40 s at 72°C. Analysis of the RH data was carried out online at the Bioinformatics Research Center, Medical College of Wisconsin, USA (http://rgd.mcw.edu/RHMAPSERVER/#instruction) and the Otsuka GEN Research Institute, Japan (http://ratmap.ims.u-tokyo.ac.jp/cgi-bin/RH/rhNgv.pl).

4.7 Phylogenetic analysis

Amino acid sequences were aligned with the Clustal X program 26. The alignment was then adjusted by eye to maximize sequence similarity. The distance matrix was obtained by calculating Poisson-correction distances for all pairs of sequences. Neighbor-joining trees were constructed with the MEGA program (version 2.1) using the pairwise deletion option 27. The reliability of branching patterns was assessed by bootstrap analysis (5,000 replications).
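The distance step described above can be sketched as follows. The Poisson-correction distance is d = -ln(1 - p), where p is the proportion of aligned positions that differ, and pairwise deletion means that positions with a gap in either sequence are ignored for that pair only. This is an illustrative sketch, not MEGA's implementation; the function name, the gap character and the toy sequences are assumptions.

```python
import math


def poisson_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Poisson-correction distance between two aligned protein sequences.

    Positions where either sequence has a gap are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # pairwise deletion: ignore this column for this pair
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    p = differences / compared  # proportion of differing sites
    if p >= 1.0:
        raise ValueError("distance is undefined when all sites differ")
    return -math.log(1.0 - p)


# Hypothetical toy alignment: 1 difference in 10 ungapped sites, so p = 0.1.
print(round(poisson_distance("ACDEFGHIKL", "ACDEFGHIKV"), 4))  # 0.1054
```

A matrix of such pairwise distances is the input that a neighbor-joining procedure then uses to build the tree.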

4.8 Data analysis

Comparison of the Mill region in rats and mice was performed using the NCBI Rattus norvegicus genome Build 2 assembly (version 1), the Mus musculus Ensembl assembly (version 18.30.1) and the Celera database (Celera Genomics, Rockville, MD). SP was predicted using the SignalP version 2.0.b2 server (http://www.cbs.dtu.dk/services/SignalP-2.0/). TM was predicted using the TMpred program (http://www.ch.embnet.org/software/TMPRED_form.html). Percent identity plots were constructed with the PIPmaker program (http://bio.cse.psu.edu/pipmaker/). Mean numbers of nucleotide substitutions per 100 synonymous sites (dS) and per 100 non-synonymous sites (dN) were calculated using the modified Nei and Gojobori method implemented in the MEGA program (version 2.1).
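To make the dN and dS quantities concrete, the sketch below counts synonymous and non-synonymous sites and differences between two aligned coding sequences and applies the Jukes-Cantor correction. It is deliberately simpler than the method used in the study: it follows the original (unmodified) Nei and Gojobori counting scheme, skips codon pairs that differ at more than one position instead of averaging over mutational pathways, counts mutations to stop codons as non-synonymous, and returns values per site rather than per 100 sites. All function names and the toy sequences are hypothetical.

```python
import math
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon: str) -> float:
    """Synonymous sites in a codon: each synonymous single-base change adds 1/3."""
    total = 0.0
    for i in range(3):
        for base in BASES:
            if base == codon[i]:
                continue
            mutant = codon[:i] + base + codon[i + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                total += 1.0 / 3.0
    return total


def jukes_cantor(p: float) -> float:
    """Correct a proportion of differences for multiple hits."""
    if p >= 0.75:
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(seq_a: str, seq_b: str) -> tuple[float, float]:
    """Return (dN, dS) per site for two aligned, in-frame coding sequences."""
    syn_diff = nonsyn_diff = n_codons = 0
    s_sites = 0.0
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        ca, cb = seq_a[i:i + 3].upper(), seq_b[i:i + 3].upper()
        if ca not in CODON_TABLE or cb not in CODON_TABLE:
            continue  # gaps or ambiguous bases
        if "*" in (CODON_TABLE[ca], CODON_TABLE[cb]):
            continue  # stop codons are not compared
        n_diff = sum(x != y for x, y in zip(ca, cb))
        if n_diff > 1:
            continue  # simplification: multi-hit codons are skipped
        n_codons += 1
        s_sites += (syn_sites(ca) + syn_sites(cb)) / 2.0
        if n_diff == 1:
            if CODON_TABLE[ca] == CODON_TABLE[cb]:
                syn_diff += 1
            else:
                nonsyn_diff += 1
    n_sites = 3 * n_codons - s_sites
    if s_sites == 0 or n_sites == 0:
        raise ValueError("not enough comparable sites")
    return jukes_cantor(nonsyn_diff / n_sites), jukes_cantor(syn_diff / s_sites)


# Hypothetical toy sequences: one synonymous and one non-synonymous change.
dn, ds = dn_ds("GGGGGGTTT", "GGAGGGTTA")
print(f"dN = {dn:.3f}, dS = {ds:.3f}")
```

A dN value that exceeds dS for a domain is the pattern the text interprets as evidence of diversifying selection; assessing statistical significance requires a separate test that is not sketched here.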

Acknowledgements

We acknowledge Motoko Sumasu and Taeko Nagata for their technical assistance. This work was supported by Grants-in-Aid for Scientific Research from The Ministry of Education, Culture, Sports, Science and Technology of Japan; the Joint Research Project of Sokendai (Soken/K01–4); Uehara Memorial Foundation; and The Naito Foundation.
