A clinician's guide to omics resources in dermatology

Summary
With recent advances in high-throughput technologies, we are now in an era where large-scale datasets from biological samples and individual diseases can be analysed using omics methodologies. These include genomics, transcriptomics, proteomics, metabolomics, lipidomics and epigenomics. Omics approaches have been developed to deliver a holistic understanding of systems biology, to identify key biomarkers, and to aid in the interpretation of molecular, biochemical and environmental interactions. Navigating through the plethora of online datasets to find useful and concise information for comparison of data can be complex and overwhelming. The purpose of this article is to review the current repositories and databases, and to evaluate their application in dermatological research and their relevance to clinical practice. For this study, an extensive review of online platforms used in dermatology research was undertaken. Online resources for genetic disease information, genetic disease connection platforms for patients and researchers, clinical interpretation of variants, genome and DNA databases, and omics data repositories and resources were collected. This study provides a comprehensive overview of relevant databases that will aid clinicians and scientists using omics data in dermatology.


Introduction
Traditionally, clinicians have relied on an in-depth understanding of a patient's medical history and clinical examination to make an accurate diagnosis and to establish an appropriate treatment plan. However, in the 21st century there has been a great advance in scientific technologies, primarily spearheaded by innovations in genetics. 1 Decreases in DNA sequencing costs and increases in the speed and accuracy of reading DNA have moved genetics and quantitative genetic data into mainstream dermatology. 1 We are now in an age where the global scientific community has accumulated vast amounts of publicly available data, including clinical, epidemiological and molecular information.
Collectively, these advances provide a platform allowing dermatology datasets to be analysed for prospective improvements in diagnostics, therapeutics and insights into dermatological diseases. Specifically, data obtained from patients, biological samples or a particular disease can be compared with those from other datasets generated via high-throughput omics approaches, such as genomics (the study of gene expression changes at a genome-wide level), transcriptomics (the study of the complete set of RNA transcripts that are produced by the genome), proteomics (the study of dynamic protein products and their interactions), metabolomics (the study of cellular metabolites and their processes), lipidomics (the study of the structure and function of lipids) and epigenomics (the study of how cells control gene activity without changing the DNA sequence). 1 These multiomics approaches have been developed in an attempt to provide precision medicine for patients, while also endeavouring to deliver a holistic understanding of systems biology and an interpretation of genotype-phenotype interactions. 2 In a clinical setting, the challenge in using omics data is to generate concise, meaningful information without being overwhelmed by the complexity of the data. Navigating through this process can be challenging, with > 700 web resources providing access to many thousands of systems (for example, http://pathguide.org/). In this review, we navigate through the tools and repositories that provide accurate dermatology datasets and explain how they can be applied in clinical dermatology.

Resources

Genomics
There have been huge advances in next-generation sequencing (NGS) technologies, which have led to data generation for genomes (single nucleotide polymorphisms, copy number variants and rare variants). 2 Because of its diagnostic benefits and reduced costs, genetic testing is becoming more widely used in diagnostic laboratories, with many physicians integrating genomics into their management plans. 1 To accurately interpret genetic findings, there has been substantial investment into establishing genetic disease databases, with web-based sites such as Online Mendelian Inheritance in Man 3 and Orphanet 4 constructed to maintain updated information on published skin diseases and their associated gene mutations (Table 1). The Genetic and Rare Diseases Information Center, 5 Orphanet 4 and GeneReviews 6 offer clinicians an overview of diseases, and provide patients with lay information on their disease, on where to find expert centres for management of genetic diseases and on genetic counselling. Despite the refinements in NGS technologies, the diagnostic yield of exome and genome sequencing is estimated at approximately 30%, leaving the majority of tested individuals undiagnosed. 7 This shortfall has led to the development of genomic matchmaking, where two or more parties looking for cases with similar phenotypes and variants in the same candidate gene can expedite the gene-discovery process. Currently, platforms including MyGene2 8 and GeneMatcher 9 allow for submission of data from approved researchers or clinicians (Table 1). In addition, the Database of Chromosomal Imbalance and Phenotype in Humans Using Ensembl Resources (DECIPHER) shares publicly available patient data on rare diseases, detailing > 45 000 copy number and sequence variants, and including > 150 000 phenotype observations. 10 Additional sites, including Genome Connect, 11 GeneticAlliance 12 and MatchMaker Exchange, 13 have been created for upload of clinical and genetic information by patients and clinicians for unsolved cases.
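The matchmaking idea described above can be illustrated with a minimal sketch. It assumes each case is reduced to one candidate gene and a set of phenotype terms, and it scores phenotype overlap with a simple Jaccard index. The `Case` class, the `find_matches` function, the similarity threshold and the example cases are all hypothetical and are not taken from any of the platforms named here, which use their own matching methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    case_id: str
    candidate_gene: str
    phenotypes: frozenset  # phenotype terms describing the patient


def phenotype_similarity(a: Case, b: Case) -> float:
    """Jaccard overlap of the two phenotype term sets (0.0 to 1.0)."""
    union = a.phenotypes | b.phenotypes
    if not union:
        return 0.0
    return len(a.phenotypes & b.phenotypes) / len(union)


def find_matches(query: Case, cases: list, min_similarity: float = 0.3) -> list:
    """Return (case_id, similarity) for cases sharing the query's candidate gene."""
    matches = []
    for other in cases:
        if other.case_id == query.case_id:
            continue
        if other.candidate_gene != query.candidate_gene:
            continue
        score = phenotype_similarity(query, other)
        if score >= min_similarity:
            matches.append((other.case_id, score))
    # Most similar cases first
    return sorted(matches, key=lambda m: m[1], reverse=True)


# Invented example cases; gene symbols and phenotype labels are placeholders.
query = Case("case-A", "GENE1", frozenset({"blistering", "nail dystrophy", "alopecia"}))
registry = [
    Case("case-B", "GENE1", frozenset({"blistering", "nail dystrophy"})),
    Case("case-C", "GENE2", frozenset({"blistering", "nail dystrophy", "alopecia"})),
    Case("case-D", "GENE1", frozenset({"hyperkeratosis"})),
]
matches = find_matches(query, registry)
print(matches)
```

In this invented registry only case-B is returned: case-C has the same phenotype but a different candidate gene, and case-D shares the gene but none of the phenotype terms.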
When interpreting clinical gene variants from genetic testing, websites such as ClinGen, 14 ClinVar, 15 Leiden Open Variation Database 16 and gene2phenotype 17 report variants found in patient samples and provide assertions regarding their clinical significance. Genome browsers such as Ensembl 18 and the University of California Santa Cruz Genome Browser 19 can be used as graphical viewers that facilitate inquiry-driven data mining, including gene predictions, mRNA alignments and variation data. The Genome Aggregation Database 20 allows users to compare individual data with a population database, helping to distinguish potentially pathogenic variants from the common, benign variation in the human genome. In addition, repositories of large molecular, clinical and epidemiological datasets provide an opportunity to compare data easily and, with the promise of precision medicine, can be used for hypothesis generation in addition to hypothesis testing (Table 2).
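The comparison of individual data against a population database can be sketched as a simple allele-frequency filter. This is an illustration only: it works on a local dictionary rather than querying any real database, the variant keys, frequencies and the `max_af` threshold are invented, and a rare variant is merely a candidate for further interpretation, not a pathogenic one by default.

```python
def filter_rare_variants(patient_variants, population_af, max_af=0.001):
    """Keep variants absent from the reference table or rarer than max_af."""
    rare = []
    for variant in patient_variants:
        af = population_af.get(variant)  # None when the variant is not catalogued
        if af is None or af < max_af:
            rare.append(variant)
    return rare


# Hypothetical variants keyed as "chromosome-position-ref-alt".
patient_variants = ["1-1000-A-G", "2-2000-C-T", "3-3000-G-A"]

# Hypothetical population allele frequencies (fraction of alleles carrying the variant).
population_af = {
    "1-1000-A-G": 0.25,      # common, so likely benign background variation
    "2-2000-C-T": 0.00002,   # very rare in the reference population
}

rare_variants = filter_rare_variants(patient_variants, population_af)
print(rare_variants)
```

Here the common variant is discarded, while the very rare variant and the one absent from the reference table are kept for clinical review.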

Transcriptomics
Transcriptomics technologies, such as expression arrays and RNA sequencing, can provide an analysis of differentially expressed genes as a transcriptional response of the genome to various environmental stimuli. 21 Online resources, such as The Genotype-Tissue Expression project 22 and The Human Cell Atlas, 23 have created platforms that link DNA sequencing and multi-tissue RNA sequencing across donor samples and all human cell types.
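As a rough illustration of what 'differentially expressed' means, the sketch below computes a log2 fold change between mean expression in lesional and control samples. The gene names, expression values, pseudocount and cut-off are all invented for the example. Real analyses rely on dedicated statistical models with correction for multiple testing, which this sketch does not attempt.

```python
import math
from statistics import mean


def log2_fold_change(case_values, control_values, pseudocount=1.0):
    """log2 ratio of mean expression; the pseudocount avoids division by zero."""
    return math.log2((mean(case_values) + pseudocount) / (mean(control_values) + pseudocount))


# Hypothetical normalised expression values per gene: (lesional, control).
expression = {
    "GENE_UP": ([7, 7, 7], [1, 1, 1]),
    "GENE_DOWN": ([3, 3], [15, 15]),
    "GENE_FLAT": ([5, 5], [5, 5]),
}

fold_changes = {
    gene: log2_fold_change(case, control)
    for gene, (case, control) in expression.items()
}

# Flag genes whose expression at least doubles or halves (|log2FC| >= 1).
flagged = sorted(gene for gene, lfc in fold_changes.items() if abs(lfc) >= 1.0)
print(fold_changes, flagged)
```

A log2 fold change of 2 corresponds to a four-fold increase and a value of -2 to a four-fold decrease, which is why the first two invented genes are flagged and the third is not.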

Proteomics
Assessing proteins at the cellular and tissue level can more accurately define the disease state compared with transcriptomics or genomics alone. 24 Proteomics can lead to biomarker discovery for disease diagnostics, prognostics, prediction of treatment response and development of novel therapeutic targets. For instance, the Human Skinatlas site 24 provides a comprehensive proteomic overview for measurement and mapping of 11 000 functional and structural proteins from healthy skin samples.

Metabolomics and lipidomics
Metabolomics offers an opportunity not only to assess disease pathogenesis and treatment, but also to understand associations with inflammatory pathways, the pathogenic role of skin metabolites and the gut microbiome, and the downstream effects of environmental factors. Using mass spectrometry (MS)-based high-throughput technologies, individual data can be compared with online repositories such as METLIN 25 and The Human Metabolome Database (HMDB). 26 Lipidomics is a newly emerging discipline that studies cellular lipids through large-scale analysis of MS-based data on platforms such as the HMDB. 26 Given that the skin barrier is composed mainly of a lipid-enriched extracellular matrix including ceramides, cholesterol and free fatty acids, interest in the biological processes of lipid metabolism in dermatological diseases is increasing. Currently, databases such as the LIPID MAPS® Structure Database 27 provide a collection of lipids identified through lipid experimentation, computational analysis and relevant lipids manually curated from public sources.
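Comparing MS data with a reference repository typically involves matching observed mass-to-charge (m/z) values against reference values within a tolerance expressed in parts per million (ppm). The sketch below shows that arithmetic only. The compound names, reference values and the 10 ppm tolerance are placeholders rather than entries from METLIN, the HMDB or LIPID MAPS, and practical annotation would also consider adducts, isotopes, retention time and fragmentation spectra.

```python
def ppm_error(observed_mz, reference_mz):
    """Signed mass error of an observed peak relative to a reference, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_peaks(observed_peaks, reference_masses, tolerance_ppm=10.0):
    """Return (observed m/z, compound name) pairs that agree within the tolerance."""
    matches = []
    for mz in observed_peaks:
        for name, reference_mz in reference_masses.items():
            if abs(ppm_error(mz, reference_mz)) <= tolerance_ppm:
                matches.append((mz, name))
    return matches


# Hypothetical reference table (compound name -> reference m/z).
reference_masses = {"compound_A": 200.0000, "compound_B": 500.0000}

# Hypothetical observed peaks from one sample.
observed_peaks = [200.0010, 500.0100, 350.0000]

peak_matches = match_peaks(observed_peaks, reference_masses)
print(peak_matches)
```

The first peak differs from compound_A by about 5 ppm and is matched. The second differs from compound_B by about 20 ppm and is rejected, and the third has no nearby reference value.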

Epigenomics
Although the human genome is largely preserved in all human cell types, there are considerable changes within the epigenomic landscape that underwrite the downstream expression of genes and biological functions. Currently, databases such as the Encyclopedia of DNA Elements Project 28 provide core gene sets corresponding to major cell types, mapping of patterns of human transcription factors and their key biological features. 28 Similarly, the Roadmap Epigenomics Project 29 was created to support reference epigenome production and to coordinate epigenomics data.

Conclusion
The shift from clinical examination to integrative data analysis can provide a more comprehensive approach to complex disease management. Furthermore, with the recent introduction of omics technologies at a single-cell level, which have the capacity to profile gene expression patterns and the interactions between individual cells, understanding of human skin and skin diseases is rapidly advancing. The future challenge now lies in the ability of these resources to integrate data across multiple omics fields, 30 and to translate this information into individualized care for patients. To further improve precision medicine, the integration of various biobanks into a global platform is required, with harmonization of sample collection, storage, analytical methods and study protocols to secure reproducibility of future research.

Learning points
• Owing to recent advances in high-throughput technologies, large-scale datasets can now be used for precision medicine.
• Clinicians and scientists can obtain samples from patients with a particular disease and analyse datasets generated by omics approaches.
• Obtaining meaningful information from omics resources can be overwhelming and complex.
• Understanding how to access and use dermatology datasets and repositories can aid in clinical data interpretation and patient management.