The primary reason for generating and studying animal models is to gain insight into human disease and gene function. The mouse, with its fully sequenced genome, almost complete genetic orthology with the human, and numerous genetic manipulation tools, has become the principal model organism for such purposes [Rosenthal and Brown, 2007].
A wealth of readily accessible mouse phenotype data already exists thanks to the researchers who have generated and characterized mouse mutants, and the manual curation of the literature and submitted data carried out by the Mouse Genome Informatics (MGI) group of the Jackson Laboratory and stored in their Mouse Genome Database (MGD) [Blake et al., 2011]. At the time of writing, MGD contains 26,945 phenotyped mutant alleles representing 11,356 markers, including heritable phenotypic markers, deletions, inversions, and other complex genomic mutations, in addition to mutations in 8,124 protein coding and RNA genes. Hence, phenotype data is already available for a large proportion of mouse genes although, for many, the published phenotypes focus on a specific area of research rather than a broad phenotypic characterization of the mutant. Throughout this decade, phenotype data will be made available for all mouse protein-associated genes due to the efforts of the International Mouse Phenotyping Consortium (IMPC; http://www.mousephenotype.org) [Abbott, 2010]. The IMPC will implement a high-throughput phenotyping pipeline to characterize strains carrying the null mutations produced by the systematic efforts of the International Knockout Mouse Consortium (IKMC) [Ringwald et al., 2011; Skarnes et al., 2011]. The results will be available from the IMPC portal, and the Mammalian Phenotype Ontology (MPO) [Smith and Eppig, 2009] annotated data will also be deposited at MGI as part of a merged dataset with the mouse phenotype data from publications and submissions. The approach taken by the IMPC of performing the same wide spectrum of phenotype assays on every line will hopefully provide more comprehensive coverage of the mouse phenome than the current approach of curating literature-reported observations.
Until recently, systematic use of mouse phenotype data by the human clinical and research communities to identify candidate disease genes and gain knowledge of protein function has been a rarity, despite the potential power of such an approach. For example, Kitsios et al. (2010) identified mouse models for human genome-wide associations, which provide concordance of evidence and novel insights into the roles of the candidate genes. Another study identified candidates for autism spectrum disorders using mouse phenotype data, some of which overlapped with the results from a global copy number variation study [Meehan et al., 2011].
Where associated or candidate genes already exist for a human disorder, it is trivial to recover the mouse ortholog and any associated phenotype data. Similarly, if something is known about the function or pathway involved in the disease, then searches for genes with that function or pathway involvement can reveal candidates.
However, for many disorders only the observed phenotype is known, and here the ability to identify equivalent phenotypes in model organisms with a known genotype becomes critical. The main impediment to this is the lack of direct mappings between the terms used for human disease and mouse phenotypes [Schofield et al., 2010]. MGI addresses this issue by manually curating disease associations for published mouse models. However, this is a huge effort and is likely to be unscalable for the IMPC project. In addition, most publications involving mouse models are focused on a particular disease and do not address whether a mouse could be a good or even better model for another disease. The systematic phenotype analysis performed by Meehan et al. identified several models in MGD that had not previously been associated by the manual curation effort. Clearly, there is a requirement for computational methods for associating mouse models with human diseases, and systematic analysis of both human and mouse datasets.
The first step to automation is capturing the phenotype data in a computable form using ontologies and controlled vocabularies. The mouse community is already in a good position, as MGD and many other mouse databases use the well-established MPO. Although termed “Mammalian,” MPO has primarily been used to capture mouse phenotypic data at MGD and rat phenotypic data at the Rat Genome Database (RGD) [Twigger et al., 2007]. Other model organism databases, such as ZFIN [Sprague et al., 2008] for zebrafish and FlyBase [Tweedie et al., 2009] for Drosophila, do not use a “precomposed” species-specific phenotype ontology but rather use a “postcomposed” Entity-Quality (EQ) approach. In this, the Q variable comes from the phenotype and trait ontology (PATO) and the E variable from one of the Open Biomedical Ontologies (OBO) such as Gene Ontology (GO), ZFA (zebrafish anatomy), or FBbt (Flybase anatomy ontology). For example, motor neuron degeneration is represented in the mouse by MP:0000938 (motor neuron degeneration) and in zebrafish by the combination of ZFA:0009052 (motor neuron) and PATO:0000639 (degenerate). The latter approach is termed a postcomposed approach, as the terms are joined postcuration to form a human readable text description [Gkoutos et al., 2009; Washington et al., 2009].
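The difference between the precomposed and postcomposed styles can be sketched as a small data structure. This is a minimal illustration only, not the curation format of any database mentioned here; the class name `EQ`, the `LABELS` lookup, and the `describe` helper are hypothetical, while the term IDs are those quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EQ:
    """A postcomposed phenotype: an entity term plus a PATO quality term."""
    entity: str   # e.g. an anatomy or GO term ID
    quality: str  # a PATO term ID


# Labels for the IDs quoted in the text (illustrative lookup only).
LABELS = {
    "MP:0000938": "motor neuron degeneration",
    "ZFA:0009052": "motor neuron",
    "PATO:0000639": "degenerate",
}


def describe(eq: EQ) -> str:
    """Join the entity and quality labels into a readable description."""
    return f"{LABELS[eq.entity]}: {LABELS[eq.quality]}"


# Precomposed (mouse): a single term ID carries the whole phenotype.
mouse_phenotype = "MP:0000938"

# Postcomposed (zebrafish): the phenotype is an entity/quality pair.
zebrafish_phenotype = EQ(entity="ZFA:0009052", quality="PATO:0000639")

print(LABELS[mouse_phenotype])
print(describe(zebrafish_phenotype))
```

The precomposed form needs only one identifier per annotation, whereas the postcomposed form keeps the entity and quality separately addressable.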
Use of ontologies to capture human phenotype data is a more recent activity, stemming from the development of the Human Phenotype Ontology (HPO) [Robinson et al., 2008]. Like MPO, HPO uses a precomposed approach; for example, HP:0007373 (atrophy/degeneration involving motor neurons) would be used for the motor neuron degeneration example above.
The mixed use of pre- and postcomposed approaches and different ontologies would appear to hinder any cross-species phenotype querying. Lexical (text matching) based approaches can be used as demonstrated in PhenomicDB [Groth et al., 2006] and PhenoHM [Sardana et al., 2010] but will require nontrivial solutions where the same concept is described with different words (synonyms) or where the same word can refer to different concepts (homonyms) in the human and model organism communities. For instance, to a human reader MP:0000573 (enlarged hind paws) and HP:0001833 (large feet) clearly represent largely equivalent biological concepts, but to a computer using a purely lexical approach this association would be lost. In addition, the full semantic power of the ontologies is lost using a lexical approach. For example, the phenotypic consequences of the same genetic abnormality may be related but subtly different in diverse species: PAX6 mutations result in “small eyed” mice, “opaque cornea” in humans, a “malformed retina” in zebrafish, and “eyeless” Drosophila. A lexical, computational approach could not identify these related phenotypes but a semantic approach, using the structure and relationships of the phenotype ontologies and logical definitions, will identify that all involve “eye abnormalities.” Similarly, the human clinical community and the various model organism resources can annotate the same phenotype at different resolutions. This will present problems to a lexical approach but can be solved by the subsumptive power of an ontology approach.
A more logically rigorous approach to compare phenotypes in different species is to use a set of species-agnostic ontologies as the building blocks for logical definitions of terms in precomposed species-specific ontologies. This approach is implemented by (1) generating EQ statements (known as logical definitions or equivalence axioms) for each of the terms used in the precomposed phenotype ontologies such as MPO and HPO, and (2) linking between the ontologies used in the EQ statements. Most of the ontologies used in the logical definitions are applicable to both species, but anatomy presents a special problem, so the task is simplified to linking across the species-centric anatomical ontologies. Taking our example above of enlarged hind paws and large feet, the logical definition of the MPO term involves the PATO term “increased size” (PATO:0000586) and Mouse Anatomy term “foot” (MA:0000044) while the HPO logical definition involves the same PATO term and the human-centric Foundational Model of Anatomy term “foot” (FMA:9664). In this case, MA has already been made species-agnostic to some extent by referring to the foot of the hind limb rather than hind paw, so it is obvious that the MA and FMA terms refer to the same concept. However, this is often not the case and we tackle this problem by using a bridging multispecies anatomy ontology to map between the individual species anatomy terms. Methods to generate these bridging ontologies range from manually assisted automated matching, for example, the UBERON unified metazoan anatomy ontology [Mungall et al., 2010], to relations based on the nearest common evolutionary ancestor of the structure in question, for example, the Vertebrate Bridging Ontology [Ravensara et al., 2011].
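The two steps can be illustrated with the foot example. The sketch below is a simplification under stated assumptions: logical definitions are reduced to (quality, entity) pairs, and the bridging class `BRIDGE:foot` is a hypothetical placeholder rather than a real UBERON or Vertebrate Bridging Ontology identifier. The MP, HP, PATO, MA, and FMA IDs are those given in the text.

```python
# Step 1: logical definitions (EQ statements) for the precomposed terms,
# reduced here to (quality, entity) pairs.
LOGICAL_DEFS = {
    "MP:0000573": ("PATO:0000586", "MA:0000044"),  # enlarged hind paws
    "HP:0001833": ("PATO:0000586", "FMA:9664"),    # large feet
}

# Step 2: a bridge from species-centric anatomy terms to a shared class.
# "BRIDGE:foot" is a hypothetical placeholder ID used for illustration.
BRIDGE = {
    "MA:0000044": "BRIDGE:foot",
    "FMA:9664": "BRIDGE:foot",
}


def normalise(term_id):
    """Rewrite a term's logical definition using species-agnostic anatomy."""
    quality, entity = LOGICAL_DEFS[term_id]
    # Entities without a bridge mapping are left unchanged.
    return quality, BRIDGE.get(entity, entity)


def equivalent(term_a, term_b):
    """Two terms match if their normalised definitions are identical."""
    return normalise(term_a) == normalise(term_b)


print(equivalent("MP:0000573", "HP:0001833"))
```

A purely lexical comparison of "enlarged hind paws" and "large feet" would miss this match, whereas the normalised definitions are identical. A real reasoner would also use subsumption rather than exact equality.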
This logical definition approach generated promising results in identifying gene candidates and animal models of human disease using 11 manually annotated diseases with known genes [Washington et al., 2009]. Recently, a cross-species network built from the phenotype ontologies, logical definitions and UBERON has been shown to recall orthologues, genes involved in the same pathway and gene–disease associations [Hoehndorf et al., 2011].
We now have HPO annotations of almost all clinical OMIM entries representing Mendelian diseases and logical definitions available for a large proportion of the HPO and MPO terms. We can therefore extend our approach to nearly all known Mendelian diseases. Here, we present this extension using new semantic matching software (OWLSim) and report high recall of known disease genes. We describe a new Web tool (MouseFinder), which allows anyone to mine the results of this analysis for the identification of new candidates for human disease and present some intriguing examples of this.
HPO annotations of OMIM diseases, and the HPO ontology itself, were downloaded from http://www.human-phenotype-ontology.org/index.php/downloads.html. Known OMIM disease to gene associations are recorded in morbidmap and were downloaded from http://www.omim.org/downloads. MPO was obtained from http://obo.cvs.sourceforge.net/viewvc/obo/obo/ontology/phenotype/mammalian_phenotype.obo. MPO annotations of mouse models (MGI_PhenotypicAllele.rpt and MGI_GenePheno.rpt), MGI asserted disease models (ALL_OMIM.rpt) and OMIM human gene to MGI gene mappings (HMD_OMIM.rpt) were downloaded from the MGI ftp site (ftp://ftp.informatics.jax.org/pub/reports). Note that we used the MGI_GenePheno.rpt file, recently made available by MGI, rather than the larger MGI_PhenoGenoMP.rpt file used by most of the previously published studies. The latter file contains all phenotyped models, including those with multiple mutated genes, where it is unclear which mutation is causative; conditional mutations, which need further crossing to disrupt the gene; and mutations of nongene markers and complex/cluster/region markers (including deletion regions and inversions).
All files were downloaded on August 7, 2011, processed, and the contents stored in a simple database schema. The database stores the mappings from HPO-annotated Mendelian diseases recorded in OMIM, through to mouse genes via orthology and thence to mutant allele and mouse model phenotype annotations. In total, 5,035 OMIM diseases (1,858 with known gene association(s) and 3,177 with no known gene) and 1,791 OMIM genes with HPO annotation, along with the MPO annotations of 24,904 mouse models and 8,124 mouse genes, are stored in the database (Fig. 1). In addition, 2,624 associations between OMIM diseases and particular models from MGI curation of the literature are also captured (Fig. 1).
OWLTools (freely available from http://owlsim.org) was used to prepare OWL representations of the human and mouse phenotype annotation at the genotype and the gene level. OWLTools provides convenience methods on top of the Java OWL API [Horridge 2009], and includes the OWLSim package used for all the semantic comparisons described here. OWLSim uses the same metrics as described in Washington et al. (2009), and is, in fact, largely a reimplementation of the same system using a different underlying ontology model. Our previous approach was implemented on top of a relational database system called OBD, whereas OWLSim is implemented on top of the OWL API and does not require an underlying database to run the semantic comparisons. This makes OWLSim easier to set up and faster to run.
SimJ scores similarity as the ratio of shared attributes to total attributes, that is, the Jaccard index. In the case of OWLSim, the attributes being compared are inferred attributes (for a full technical description see owlsim.org):

$$\mathrm{SimJ}(p, q) = \frac{|a_p \cap a_q|}{|a_p \cup a_q|}$$

where $a_p$ is the set of inferred attributes of phenotype $p$.
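As a minimal sketch of this ratio, assuming the inferred attributes of each phenotype have already been computed and are available as plain Python sets (the function name `sim_j` and the toy attribute labels are illustrative and are not part of OWLSim):

```python
def sim_j(attrs_p, attrs_q):
    """Ratio of shared attributes to total attributes (Jaccard index)."""
    union = attrs_p | attrs_q
    if not union:
        # Two empty attribute sets share nothing to compare.
        return 0.0
    return len(attrs_p & attrs_q) / len(union)


# Toy inferred attribute sets; the labels are invented for illustration.
mouse_attrs = {"foot", "limb", "increased size"}
human_attrs = {"limb", "increased size", "abnormal morphology"}

print(sim_j(mouse_attrs, human_attrs))
```

Two of the four distinct attributes are shared, so the toy score is 0.5; identical sets score 1 and disjoint sets score 0.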
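The information content (IC) of a description is the negative log of the number of features annotated with that description over the total number of annotations in the dataset:

$$\mathrm{IC}(d) = -\log\left(\frac{n_d}{N}\right)$$

where $n_d$ is the number of features annotated with description $d$ and $N$ is the total number of annotations in the dataset.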
In the case of OWLSim, IC is calculated for the least common subsuming (LCS) phenotype of the HPO–MPO pair which is the most specific set of all shared attributes (the algorithm to identify the LCS is, again, more fully described at owlsim.org). The IC method provides a measure of how unusual or “surprising” the set of attributes in common is and the higher the score, the less frequent is the LCS. Thus, a match in which the combination of attributes in common is rare, or involves highly specific terms, will score more highly than those involving more frequent or less granular terms.
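The effect of annotation frequency on IC can be sketched as follows. The base of the logarithm is not stated here, so base 2 is an assumption of this sketch, as are the function name and the toy counts.

```python
import math


def information_content(n_annotated, n_total):
    """Negative log of the fraction of annotations carrying a description.

    Base 2 is an assumption of this sketch; any base preserves the ranking.
    """
    if not 0 < n_annotated <= n_total:
        raise ValueError("need 0 < n_annotated <= n_total")
    return -math.log2(n_annotated / n_total)


# Toy counts: a rare, specific LCS versus a frequent, general one.
rare = information_content(1, 1024)
common = information_content(512, 1024)

print(rare, common)
```

A description attached to every annotation carries no information (IC of 0), while the rarer description receives the higher score, which is why matches on specific terms rank above matches on general ones.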
For each human–mouse comparison, we aggregate the measurements of individual HPO-MPO best matching pairs to give:
1. avgIC: average IC score across all the pairs;
2. maxIC: maximum IC score across all the pairs;
3. avgSimJ: average SimJ across all the phenotype pairs;
4. maxSimJ: maximum SimJ across all the phenotype pairs.
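The aggregation of best matching pairs into these four measures can be sketched as below. This is a simplified, one-directional version under stated assumptions: each human phenotype is matched to its best-scoring mouse phenotype separately for IC and for SimJ, and the input layout, function name, and term labels are hypothetical rather than taken from OWLSim.

```python
from statistics import mean


def aggregate(pair_scores):
    """Summarise best matches for one human-mouse comparison.

    pair_scores maps each human phenotype to a dict of
    {mouse phenotype: (ic, simj)} scores.
    """
    # Best-matching mouse phenotype per human phenotype, per metric.
    best_ic = [max(ic for ic, _ in m.values()) for m in pair_scores.values()]
    best_simj = [max(sj for _, sj in m.values()) for m in pair_scores.values()]
    return {
        "avgIC": mean(best_ic),
        "maxIC": max(best_ic),
        "avgSimJ": mean(best_simj),
        "maxSimJ": max(best_simj),
    }


# Toy scores; the term labels and values are invented for illustration.
scores = {
    "HP:a": {"MP:x": (4.0, 0.5), "MP:y": (2.0, 0.25)},
    "HP:b": {"MP:x": (1.0, 0.25), "MP:y": (2.0, 0.75)},
}

print(aggregate(scores))
```

The averaged measures reward models that match across the whole clinical picture, whereas the maximum measures reward a single highly specific match.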
The mappings of human diseases/genes to mouse models/genes along with the various measures of semantic similarity were stored in the same database for further analysis of the results and are displayed in the MouseFinder tool.
Recall of Known Disease Genes and Models from Mouse Phenotype Data
To test the potential of OWLSim human to mouse phenotype matches to recall genuine disease associations, we took advantage of OMIM morbid map, which records known disease causative genes. In our database, 1,858 of the 5,035 OMIM disease records have a disease association with one or more of 1,791 human genes (Fig. 1). Where mouse models involving mutants of these genes have been phenotyped, we should be able to recall these disease associations with high specificity and sensitivity using just the phenotype comparison methodology. The subset of OMIM records with an associated gene with a mouse ortholog that has been phenotyped includes 1,514 OMIM diseases and 1,989 distinct disease to gene associations for 1,253 unique human genes.
Figure 2 shows the results from the OWLSim phenotype comparison of HPO-annotated human diseases and MPO-annotated mouse mutant lines for recall of any of the associated gene(s) for the 1,514 OMIM diseases. About 58% of associations were recalled, with most appearing in the top 50 hits. The maxSimJ metric performed best, followed by maxIC, avgIC, and then avgSimJ. Figure 2 also shows the results of 1,000 random runs. For each random run, n mouse models were randomly selected for each OMIM disease and assessed for a mutation in the known disease-associated gene, to calculate the expected level of recall in the top n hits if there were no biological association between the HPO-annotated diseases and MPO-annotated mouse models. This was repeated 1,000 times and the average result plotted in Figure 2. In all cases, OWLSim recalls the disease gene associations at significantly higher levels than random. For example, the 53% recovery of the disease–gene associations in the top 10 hits by maxSimJ has a p value < $10^{-325}$, assuming a binomial distribution.
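The random baseline can be sketched as a small Monte Carlo simulation. This is an illustration of the idea only, not the code used for the analysis: recall is counted per disease rather than per association, and the function name, input layout, and toy identifiers are all hypothetical.

```python
import random


def random_recall(disease_genes, model_genes, n, runs=1000, seed=0):
    """Expected recall in the top n hits when models are picked at random.

    disease_genes maps each disease to the set of its known genes;
    model_genes lists the mutated gene of every phenotyped mouse model.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total = 0.0
    for _ in range(runs):
        hits = 0
        for known in disease_genes.values():
            # Draw n models without replacement, ignoring phenotype.
            sample = rng.sample(model_genes, n)
            if any(gene in known for gene in sample):
                hits += 1
        total += hits / len(disease_genes)
    return total / runs


# Toy data; the gene and disease identifiers are invented.
models = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"]
diseases = {"D1": {"G1"}, "D2": {"G2", "G3"}, "D3": {"G9"}}

print(random_recall(diseases, models, n=2))
```

Comparing the observed recall at each n against this baseline indicates how much of the recall is attributable to phenotype similarity rather than chance.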
Another test of the effectiveness of OWLSim is to utilize the manual literature curation that the MGI group performs to assert that particular mouse lines are models of a human disease. For these associations we again assessed our success at recalling these models (Fig. 3). About 65% of these models could be retrieved, with most appearing in the top 50 hits. About 20% were recalled as the top or joint top hit using maxSimJ. Again, the recall is much higher than that shown by 1,000 runs where mouse models were randomly chosen for each disease.
We can also test the recall using phenotype annotations projected onto the gene level rather than the disease and mouse model (genotype) level used above. The 1,253 HPO-annotated human genes were compared to their MPO-annotated mouse orthologs using OWLSim phenotypic comparisons to assess whether the mouse ortholog was recalled at significantly higher levels than expected by chance (Fig. 4). Compared to the 1,000 runs where the orthologs were randomly chosen, the recall was significantly higher. This time the avgIC metric performed best except for recall as the top hit or in the top 3. About 78% of the orthologs could be recalled at highly significant levels; for example, the 48% found in the top 50 by avgIC has a p value < $10^{-325}$. To give a measure of sensitivity versus specificity, receiver operating characteristic (ROC) analysis was carried out on the avgIC ordered data from this human–mouse gene analysis. A highly significant area under the curve (AUC) value of 0.82 was obtained.
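For readers unfamiliar with ROC analysis, the AUC can be computed directly from scored candidates as the probability that a true ortholog outscores a non-ortholog. The sketch below uses this pairwise (Mann-Whitney) formulation; the function name and toy values are illustrative and this is not necessarily the procedure used to obtain the value reported above.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores are similarity values (e.g. avgIC); labels are truthy for
    true associations and falsy otherwise. Ties count as half a win.
    """
    pos = [s for s, is_true in zip(scores, labels) if is_true]
    neg = [s for s, is_true in zip(scores, labels) if not is_true]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy scores: two true orthologs and two unrelated genes.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]

print(roc_auc(toy_scores, toy_labels))
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect separation, which gives a scale for interpreting a value such as 0.82.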
Comparison with PhenomeNET
PhenomeNET [Hoehndorf et al., 2011] uses the same set of annotations, ontologies and logical definitions to compare human and mouse phenotype data. Although there are considerable similarities between the two approaches, the algorithms differ in a few key respects. PhenomeNET relies exclusively on subsumption between classes when calculating the least common ancestor, whereas OWLSim makes use of other ontology relationships. In addition, OWLSim can generate class expressions on the fly, whilst PhenomeNET relies on there being phenotype classes explicitly precoordinated in advance. PhenomeNET calculates the average of all pairs of phenotypes, whereas the default algorithm in OWLSim is the average of best matches.
To compare the two approaches, we analyzed the recall rates of known disease genes using OWLSim and the data available from the PhenomeNET site (files generated on 16 September, 2011, at http://bioonto.gen.cam.ac.uk/phenomenet). As shown in Figure 5, the recall rates using OWLSim were considerably higher except outside the top 500, as OWLSim has a cutoff of 500 matches. The improved recall may be due to the algorithmic approach used and/or the fact that our OWLSim analysis makes use of simple lexical matching in addition to the ontological cross products, whereas PhenomeNET uses a purely semantic approach. In addition, PhenomeNET also covers yeast, zebrafish, C. elegans and Drosophila phenotypes, and does not have the overhead of running the pairwise phenotype comparisons and storing the results in a database.
MouseFinder Web Tool
Our Web tool, MouseFinder (www.mousemodels.org), provides access to the phenotype comparisons described above. Users can identify a particular OMIM disease by browsing or searching by the disease name or OMIM ID, or by any of the associated genes or HPO terms. Once a disease is selected, a ranked list of the matching mouse models is displayed, ordered by avgIC as default (Fig. 6A). Each row shows the allelic composition and genetic background of the mouse model, along with the mutated gene and the rank and score according to the chosen similarity measure. The disease, gene and mutant allele fields link out to more detailed data on the OMIM and MGI Websites. Further tabs show the models ranked by maxIC, avgSimJ, or maxSimJ, and the final tab reveals any known associated OMIM genes from morbid map or known, published mouse models of the disease as curated by MGI. Where a known OMIM gene exists, a red box in the gene symbol column indicates its mouse ortholog. Similarly, any MGI asserted mouse models are indicated by a green tick to the left of each row of results. The results can be restricted to hits involving matches to a limited set of the HPO-annotated terms using the HP button at the top of the window.
In the example shown in Figure 6 for Craniosynostosis, Type 1 (MIM# 123100), a model involving the known causative gene (Twist1) is the top hit when ranked by avgIC. An MGI curated mouse model involving Axin2 is the 10th best hit, as indicated by the green tick in the screenshot. Expanding the top Twist1 match reveals further detail on the HPO and MPO annotation of the disease and mouse model, along with the phenotype terms and the IC and SimJ measures for each paired match (Fig. 6B). The MPO annotation of the mouse model (premature suture closure) matches the craniosynostosis (premature closure of the cranial sutures), turricephaly (high head resulting from premature closure of the cranial sutures) and dolichocephaly (long head resulting from premature closure of the cranial sutures) clinical features.
Novel Candidates for Human Diseases
The rationale for developing our approach and MouseFinder is, of course, to identify novel candidates for human diseases. The recall analysis described above suggests that, for OMIM diseases with no known gene, the real causative gene should be present high up in our rankings. To explore this, we took the 468 OMIM diseases with a mapped locus but no known causative gene and looked for MouseFinder hits in the top 10 results by avgIC where the human orthologue maps to the correct genomic position. About 9% of diseases had a candidate mapping to the correct locus in the top 10 hits (Table 1). This success rate was well above that seen in 1,000 runs where the disease to locus mappings were randomized, strongly suggesting MouseFinder is discovering candidates worthy of further study for these uncharacterized diseases.
Table 1. Candidate Genes for OMIM Diseases with a Mapped Locus but No Known Associated Gene(s)
The candidates shown appear in the top 10 hits by OWLSim using the avgIC metric.
An example of one of the candidates is shown in Figure 7. Here, a mouse model (Artn<tm1Jmi>/Artn<tm1Jmi> with a genetic background of 129X1/SvJ * FVB/N) is the top hit for Ptosis, hereditary congenital 1 (MIM# 178300). The causative locus for this disease has been mapped to 1p34.1-p32 and ARTN, the human orthologue of Artn, maps to 1p34.1. The clinical feature of congenital ptosis (drooping eyelids) matches the mouse phenotype of blepharoptosis (drooping eyelids) and clearly warrants further investigation of ARTN as a candidate for this disease. Ptosis is thought to result from damage to the eyelid muscle (levator palpebrae superioris), the superior cervical sympathetic ganglion, or the oculomotor nerve (CNIII), which controls this muscle. ARTN is expressed in the nucleus of the oculomotor nerve in the pre- and perinatal period along with neurturin and persephin, all three being members of the GDNF family [Quartu et al., 2007]. Intriguingly, this mouse model was published as proof that ARTN is a neurotropic factor for developing sympathetic neurons [Honma et al., 2002] and the mice also show abnormalities in sympathetic ganglion morphology (in the small superior cervical ganglion) and sympathetic neuron morphology. This further strengthens the case for ARTN being the causative gene at this locus.
Conclusion and Future Directions
In this article, we have described a novel approach and tool for the identification of candidate disease genes for human disease. The recall of known disease gene associations at highly significant rates demonstrates that we can start to fully utilize model organism phenotype data for this purpose. As shown above for a form of hereditary ptosis, MouseFinder can identify plausible candidates for the human disease using only the clinical phenotypic features. It should be borne in mind that for many diseases we have no information available regarding the protein function, biochemical pathway involvement or expression pattern of the affected gene. In these cases, our phenotype approach represents a viable alternative to the classical computational methods of candidate gene selection using Gene Ontology (GO) or pathway enrichment studies, or expression data analysis. However, an integrated approach using phenotype data alongside these other lines of evidence (when available), as well as the mapped locus, would of course improve the success rates in identifying disease genes. Future efforts will be focused on developing this integrated analysis.
Despite the significant recall shown by all OWLSim analyses, there still remain some known disease gene and mouse model associations that were not recovered when using OWLSim to compare human and mouse phenotype annotations. At the genotype-annotated level, some 40% of known gene associations were not recalled, at the gene level 22% of associations were missed, and for the MGI asserted models 35% were not recovered. This could be for a number of reasons, including:
i. Need for improvement in the recently developed ontologies and logical definitions. Our analysis produces a tractable set of missed phenotype relationships that can be used for improvements by the groups developing HPO, MPO, and the logical definitions. In addition, methods are being developed to automatically evaluate and improve the ontologies and logical definitions [Köhler et al., 2011].
ii. Limitations of the OWLSim approach, which we can investigate and improve in the future.
iii. Informative phenotype assays not yet having been carried out on the mouse model.
iv. Under-representative annotation of the human disease and mouse models. For example, for 4% of the disease genes we were trying to recall, the only MPO annotations for the mouse orthologs were “no abnormal phenotype” or “embryonic or postnatal lethal.” It will not be possible to recover these associations until more MPO annotation becomes available as a result of further curation or experimental work to generate further models and phenotype data.
v. For an unknown number of cases, the mouse will prove not to be a good model for the particular human disease.
As highlighted by some of the articles in this special issue, this is an exciting time with rapid developments occurring in mouse phenotyping through the IMPC [see Schofield et al., 2010] and collection of human phenotype data through projects such as Orphanet. These new projects will generate a wealth of new phenotype data as well as physical mouse resources for the community to generate additional data. These initiatives can only improve the recall rates, and we envisage accurate, integrated phenotype querying across species becoming an essential tool for the clinical research community.
The authors declare no conflict of interest in this work. We thank the whole of the OMIM, HPO, and MGI teams for the curation that made this work possible.