Keywords:

  • antibody;
  • biomarker;
  • cancer;
  • plasma profiling;
  • proteomics;
  • tissue microarray

Abstract.

  1. Top of page
  2. Abstract.
  3. The Human Protein Atlas project
  4. Antibody generation and validation
  5. Using the Human Protein Atlas database
  6. Immunopathology
  7. Proteomic profiling of serum and plasma
  8. Blood-based biomarker discovery
  9. Future perspectives
  10. Conflict of interest statement
  11. Acknowledgements
  12. References

Pontén F, Schwenk JM, Asplund A, Edqvist P-HD (Uppsala University, Uppsala; and KTH – Royal Institute of Technology, Stockholm; Sweden). The Human Protein Atlas as a proteomic resource for biomarker discovery (Review). J Intern Med 2011; 270: 428–446.

The analysis of tissue-specific expression at both the gene and protein levels is vital for understanding human biology and disease. Antibody-based proteomics provides a strategy for the systematic generation of antibodies against all human proteins to combine with protein profiling in tissues and cells using tissue microarrays, immunohistochemistry and immunofluorescence. The Human Protein Atlas project was launched in 2003 with the aim of creating a map of protein expression patterns in normal cells, tissues and cancer. At present, 11 200 unique proteins corresponding to over 50% of all human protein-encoding genes have been analysed. All protein expression data, including underlying high-resolution images, are published on the free and publicly available Human Protein Atlas portal (http://www.proteinatlas.org). This database provides an important source of information for numerous biomedical research projects, including biomarker discovery efforts. Moreover, the global analysis of how our genome is expressed at the protein level has provided basic knowledge on the ubiquitous expression of a large proportion of our proteins and revealed the paucity of cell- and tissue-type-specific proteins.


The Human Protein Atlas project

Background

The successful completion of the Human Genome project has demonstrated the power of systematic efforts to map the building blocks of human life and provided a foundation for the next major challenge: to explore and understand the instructions embedded in the genome. The current version of the Ensembl database [1] (build 62.37) includes 20 790 predicted protein-coding human genes, and the major repository for protein information, Universal Protein Resource (UniProt) [2], includes approximately 20 300 reviewed human protein entries, of which only approximately 66% have evidence at the protein level. Gene expression in both tissues [3] and cell lines [4] has mainly been studied using mRNA transcript levels and large sets of microarray data, as global expression analyses at the protein level have long been hampered by a lack of specific probes for the majority of proteins encoded by the human genome. To fill this void, there is an ongoing effort in the field of proteomics to generate specific antibodies against the human proteome. Antibody-based proteomics provides an advantageous strategy for systematic generation and use of specific antibodies to explore the proteome [5]. Based on such a strategy, the Human Protein Atlas project was initiated in 2003.

Strategy of the Human Protein Atlas project

The Human Protein Atlas project has been set up to generate specific antibodies systematically on a global scale and to use these antibodies to study the corresponding proteins and protein isoforms in a wide range of assays, including the determination of protein expression patterns in cells and tissues (Fig. 1). This effort relies on an antibody-based proteomics strategy that combines high-throughput generation of affinity-purified polyclonal antibodies with immunohistochemistry (IHC)-based protein profiling on tissue and cell microarrays containing normal human tissues, cancer tissues and cell lines. Furthermore, subcellular protein expression patterns are determined in cell lines using immunofluorescence (IF) and confocal microscopy. The ultimate goal of this effort is threefold: (i) to generate and validate antibodies against all human proteins (a major isoform for each protein-encoding gene), (ii) to create an information database including a map of protein expression patterns in cells and human tissues and (iii) to utilize affinity reagents and protein expression data to identify and explore biomarkers of clinical relevance. All results, including underlying high-resolution images, are presented in a free, publicly available ‘gene-centric’ database, the Human Protein Atlas (http://www.proteinatlas.org). The atlas contains the sequences of the antigens used and antibody validation data, in addition to all images and expression data from protein profiling using both in-house-generated antibodies and antibodies provided by external sources.

Figure 1.  Schematic flow chart of the Human Protein Atlas. For each gene, a signature sequence (PrEST) is defined from the human genome sequence, and following RT-PCR, cloning and production of recombinant protein fragments, subsequent immunization and affinity purification of antisera results in monospecific antibodies. The produced antibodies are tested and validated in various immunoassays. Approved antibodies are used for protein profiling in cells (immunofluorescence) and tissues (immunohistochemistry) to generate the images and protein expression data that are presented in the Human Protein Atlas.

The Human Protein Atlas web portal

The first version of the Human Protein Atlas was launched in 2005 and has been followed by annual releases enhancing the coverage of the human proteome. For each new release, the number of antibodies has increased and new features have provided added functionality. The first version (1.0) of the Human Protein Atlas contained 275 internally generated antibodies and 443 external antibodies obtained from various commercial antibody manufacturers. For each antibody, 576 high-resolution images of immunohistochemically stained tissues were generated and subsequently annotated by certified pathologists [6]. The second version (2.0) of the atlas, released in 2006, contained 1514 antibodies corresponding to 1344 genes. As a new feature, IHC staining of sections from a cell microarray containing 59 human cells and cell lines was added, generating an additional 118 images for each antibody [7]. Protein expression levels in cells were analysed using automated image-analysis software [8]. Version 3.0 of the atlas was released in 2007 and the number of antibodies and genes doubled compared with version 2. Over 2 million images and data from a total of 1774 in-house-generated antibodies and 1241 external antibodies were made publicly available on the Human Protein Atlas portal. This version also included a new search function that allowed for queries involving tissue profiles in normal and cancer tissues [9]. Furthermore, a new subcellular atlas based on IF and confocal microscopy was introduced and included as part of the Human Protein Atlas [10]. In addition to the labelled antibody, three further probes were included to stain the nucleus, the endoplasmic reticulum and the cytoskeleton. All images were generated to allow the visualization of these four probes simultaneously or separately. In the next version 4.0 of the atlas, the number of antibodies again doubled to include 6122 antibodies with the majority (63%) generated in-house. 
Altogether, proteins encoded by 5087 genes, corresponding to approximately 25% of all predicted human genes, were included [11]. The earlier ‘antibody-centric’ database structure, based on the antibodies in the first three versions of the protein atlas, was changed to a ‘gene-centric’ structure to include information on all genes predicted by Ensembl. In this version, a putative list of members of various protein classes, both functional classes such as kinases, transcription factors and G-protein-coupled receptors and project-related classes such as candidate genes for cancer, plasma proteins and CD markers, was introduced. In the following two new versions, additional antibodies and protein profiling data were added, and version 6.0 contained over 11 200 antibodies and more than 9 million images.

The current version of the Human Protein Atlas

With the release of the Human Protein Atlas version 7.0, an important objective was reached with the inclusion of 10 118 protein-coding genes corresponding to over 50% of the 19 627 human entries as defined by UniProt. In addition to containing protein profiles for more than half of the human protein-coding genes, this version introduced the concept of annotated protein expression for paired antibodies [12]. Paired antibodies are defined as two independent antibodies directed against different, nonoverlapping epitopes on the same protein. Paired antibodies can thus be used to validate each other's staining patterns. The IHC staining pattern across all normal tissues is compared, and a new annotated protein expression score is manually created for each cell type in each tissue and organ. This annotated protein expression can be viewed as a best estimate for the distribution and relative abundance of each protein in various normal cells. An estimation of the degree of knowledge-based certainty of the curated expression profile is also used to assign a reliability score. This reliability score is based on the similarity of immunostaining for the paired antibodies; available published information, bioinformatics predictions and additional experimental evidence, such as western blot analysis, transcript profiling and/or siRNA knockdown, are also taken into account. The current version 8.0, released in May 2011, includes 11 260 genes with protein expression profiles based on 14 506 antibodies.

A tool for pathology

The Human Protein Atlas can also be used as a tool for pathology with opportunities to identify different types of biomarkers with in silico-based methods [13]. Because the atlas contains a large number of cancer samples (tumours from 216 patients), the available protein profiles provide an excellent starting point for identifying new potential cancer biomarkers. Although the analyses of genomic DNA, transcribed genes and expressed proteins all add important information to the histological features detected by microscopy, there is a great need for improved stratification of individual tumours into categories that share biological traits with influence on patient outcome and therapeutic response [14, 15]. The realization of personalized cancer therapeutic regimens depends upon the successful identification and translation of informative biomarkers for use in clinical decision making, and antibody-based proteomics plays a key role within the cancer biomarker discovery and validation pipeline to facilitate the high-throughput evaluation of candidate markers [16]. Cancer is a heterogeneous and highly complex disease associated with alterations in cellular protein expression patterns leading to net cell growth and ‘antisocial behaviour’ of tumour cells. The dysregulated expression patterns in tumour cells are not the consequence of alterations in single genes and proteins but rather the effect of multiple proteins and signalling pathways that have been altered. Protein profiling of cancer-associated signatures thus requires the possibility of analysing several proteins in tumour tissue samples, and it is essential to analyse a multitude of tumour types in order to widen the screening for cancer biomarkers. The major forms of human cancer analysed in the Human Protein Atlas are represented by 12 different tumours for each tumour type. 
This enables the perception of a potential protein signature for each given type of cancer and a starting point for further analyses of cancer type–specific proteins. Extended analyses of clinically well-defined tumour tissue samples from clinical biobanks containing large retrospectively collected patient cohorts with long-term follow-up are required to test and validate the diagnostic, prognostic and predictive value with regard to the treatment of a candidate cancer marker.

A tool for global analysis of protein expression

In addition to creating a map of protein expression patterns and providing a starting point for biomarker discovery, the systematic exploration of the human proteome has provided new insight into global protein expression patterns in tissues [17, 18] and cells [19, 20]. The spatial distribution and relative abundance of proteins were analysed in the different cell populations of all major human tissues and organs, including the brain, liver, kidney, lymphoid tissues, heart, lung, skin, gastrointestinal tract, pancreas, endocrine tissues and the reproductive organs. This enabled a description of the proteomic landscape in human tissues, according to which normal cell types could be subdivided into groups that are in good agreement with the current concepts of histology and cellular differentiation. At a global level, a large fraction of proteins (>65%) were expressed in most of the analysed cells and tissues, with very few proteins (<2%) detected in only a single cell type. The overall results suggest that tissue specificity is achieved by precise regulation of protein levels in space and time and that the cellular phenotypes that define different tissues are acquired by controlling not which proteins are expressed but how much of each is produced [17]. In a similar analysis of cell lines in which IHC-based protein profiling using image analysis was employed, the results suggested that cell lines, although likely to share an ‘in vitro protein signature’, retain traits and characteristics of their progenitor cells sufficient to generate a hierarchical cluster according to cellular origin and phenotype. In particular, haematological cell lines constitute good model systems for their respective progenitor cell types. Moreover, previously uncharacterized proteins of interest for further investigation into cellular phenotypes were revealed in this study [20].
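
The kind of breadth summary quoted above (proteins seen in most cell types versus in a single cell type) can be illustrated with a minimal Python sketch. It is not the analysis used in the cited studies: the function names, the input layout (protein name mapped to the set of cell types in which it was detected) and the 50% cut-off standing in for ‘most’ are all assumptions made for illustration.

```python
from typing import Dict, Set


def classify_breadth(detected: Dict[str, Set[str]], n_cell_types: int,
                     broad_threshold: float = 0.5) -> Dict[str, str]:
    """Label each protein by the number of cell types it was detected in.

    broad_threshold is an assumed stand-in for 'most of the analysed cells'.
    """
    if n_cell_types <= 0:
        raise ValueError("n_cell_types must be positive")
    labels = {}
    for protein, cell_types in detected.items():
        n = len(cell_types)
        if n == 0:
            labels[protein] = "not detected"
        elif n == 1:
            labels[protein] = "single cell type"
        elif n / n_cell_types > broad_threshold:
            labels[protein] = "broad"
        else:
            labels[protein] = "restricted"
    return labels


def fraction_with_label(labels: Dict[str, str], label: str) -> float:
    """Fraction of all proteins that carry the given label."""
    if not labels:
        raise ValueError("no proteins to summarize")
    return sum(1 for value in labels.values() if value == label) / len(labels)
```

Applied to real annotation data, `fraction_with_label(labels, "broad")` and `fraction_with_label(labels, "single cell type")` would correspond to the two percentages discussed in the text, subject to how ‘most’ is defined.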

Antibody generation and validation

The complexity of the human proteome is massive because of the existence of various isoforms and protein modifications, including splice variants, glycosylation, phosphorylation and proteolytic cleavage. The strategy for mapping the human proteome employed in the Human Protein Atlas project relies on the high-throughput generation of antibodies targeted against at least one major isoform for each protein encoded by the human genome. The obtained expression profiles corresponding to one gene can thus be representative of a large range of possible isoforms derived from a particular gene product. A major challenge is to validate the protein-specific binding for each antibody and to discriminate specific from nonspecific binding. This is especially challenging and important for antibodies generated against unknown proteins, where independent experimental data are lacking. To meet this challenge, a number of assays are routinely used for validating and scoring the reliability of specific binding to the intended target protein for each antibody.

Generation of antibodies

The antibody generation process starts by identifying protein-coding sequences translating into a stretch of 50–150 amino acids with the criterion of being as unique as possible with respect to all other protein-coding genes in the genome [21]. The identified low-homology region is termed a ‘protein epitope signature tag’ (PrEST) [22]. The selection of low-homology regions for producing an antigen enhances the possibility of generating unique antibodies against the corresponding protein and thus decreases the risk of antibodies with cross-reactivity to other proteins. In addition to regions of high homology, transmembrane regions, which are hydrophobic and less immunogenic, are avoided in the PrEST. Following primer synthesis, RT-PCR and cloning, recombinant PrEST protein fragments are produced in Escherichia coli and purified. Up to four different PrEST proteins are generally defined for each gene, and each PrEST clone and protein product is validated by sequence verification and by determining the molecular mass using electrospray mass spectrometry. The purified PrEST proteins are used as immunogens to produce polyclonal antisera. The polyclonal antibodies are further affinity-purified using the PrEST protein to yield monospecific antibodies (msAbs) [22–24]. The aim is to produce at least two validated antibodies against nonoverlapping epitopes on the same protein.
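
The idea of choosing a low-homology stretch can be sketched in Python as a sliding-window search. This is only an illustration: the project's actual selection software is not described here, shared k-mers are used as a crude stand-in for sequence homology, transmembrane filtering is not modelled, and the function names and default parameters are hypothetical.

```python
from typing import Iterable, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def pick_low_homology_window(target: str, others: Iterable[str],
                             window: int = 50, k: int = 5) -> Tuple[int, float]:
    """Return (start, shared_fraction) for the window of the target protein
    that shares the smallest fraction of k-mers with all other proteins.

    Ties are resolved in favour of the earliest window.
    """
    if k > window or len(target) < window:
        raise ValueError("need k <= window <= len(target)")
    background: Set[str] = set()
    for seq in others:
        background |= kmers(seq, k)
    best_start, best_score = 0, float("inf")
    for start in range(len(target) - window + 1):
        window_kmers = kmers(target[start:start + window], k)
        shared = len(window_kmers & background) / len(window_kmers)
        if shared < best_score:
            best_start, best_score = start, shared
    return best_start, best_score
```

In practice the window length would be varied across the 50–150 amino acid range mentioned above, and an alignment-based similarity measure would replace the k-mer overlap.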

Antibody approval for protein profiling

Antibodies showing specific binding on protein arrays containing the immunized target are further tested through a series of antibody-based assays, including western blot, IHC and IF. Whether the antibodies are approved or not is based on the concordance of the results with bioinformatic predictions, published data (if available) and comparative IHC staining with paired antibodies. Depending on the degree of consistency across validation platforms and available data from external sources, the antibodies are excluded or assigned a reliability score of high, medium, low or very low. Approximately 20 new antibodies are processed in this manner each day, and about 50% are approved for protein profiling. In addition to in-house-generated PrEST antibodies, a large number of external antibodies have also been tested in a similar fashion and the results from approved external antibodies have been included in the Human Protein Atlas.

High-throughput IHC

Antibodies that have been approved for use are applied for IHC staining of a set of nine different tissue microarrays (TMAs) [25] containing human tissues, cancers and cell lines. A TMA is a paraffin block in which hundreds of separate tissue cores 1 mm or less in diameter from donor blocks are arranged in a matrix on a single paraffin receiver block from which 200–300 sections can be obtained. This technique provides the possibility to immunohistochemically stain a large number of tissue specimens simultaneously. The TMAs used for basic protein profiling in the Human Protein Atlas contain samples from 48 different normal human tissues, 20 different cancer types, 47 different human cell lines and 12 haematopoietic cell types from patients [6, 7, 26]. The 48 normal tissues are in triplicate samples representing 144 individuals, and the cancer tissues are sampled in duplicate from 216 different patients. The immunostained TMA sections are scanned to generate high-resolution digital images representing each core from the stained TMA sections. The digital images are used for manual annotation by certified pathologists, and the outcome of IHC is scored with respect to the intensity of immunoreactivity, the fraction of immunostained cells and cellular localization of immunoreactivity [26]. Annotation of cell lines is achieved via automated image analysis, and a similar scoring system is applied to describe the level and extent of immunoreactivity in the samples [8].
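
The sample counts given above can be checked against the per-antibody image counts reported for the first atlas versions (576 tissue images and 118 cell images). The short Python calculation below is a worked check, not part of the original pipeline; the assumption that each cell sample is present in duplicate is inferred from 118 images for 59 samples and is not stated explicitly in the text.

```python
# Figures taken from the text.
NORMAL_TISSUES = 48
NORMAL_REPLICATES = 3        # triplicate samples, 144 individuals
CANCER_PATIENTS = 216
CANCER_REPLICATES = 2        # duplicate cores per patient
CELL_SAMPLES = 47 + 12       # cell lines + haematopoietic patient cell types

# Assumption: duplicate spots per cell sample (inferred, not stated).
CELL_REPLICATES = 2

normal_cores = NORMAL_TISSUES * NORMAL_REPLICATES    # 144
cancer_cores = CANCER_PATIENTS * CANCER_REPLICATES   # 432
tissue_images = normal_cores + cancer_cores          # 576
cell_images = CELL_SAMPLES * CELL_REPLICATES         # 118

print(f"tissue images per antibody: {tissue_images}")
print(f"cell images per antibody:   {cell_images}")
```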

Annotated protein expression

If there are data from two or more paired antibodies directed against the same target protein, these expression profiles are manually curated to yield an ‘annotated protein expression’. The concept of annotated protein expression was launched with the version 7.0 release of the Human Protein Atlas [12]. The rationale is to derive the most reliable estimation of true protein expression by considering factors including previous independent published expression data, antibody validation and other experimental data, and the annotated protein expression provides the means to award the most reliable or consistent antibodies more weight in the comparison. Approximately 20% of the 10 100 genes with protein expression profiles published in version 7.0 were curated using annotated protein expression and thus contain protein profiles with a higher level of validity. The long-term objective is to provide annotated protein expression based on two or more independent antibodies for the complete human proteome.
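
Annotated protein expression is curated manually, so no algorithm can reproduce it. Purely to illustrate the idea of giving more reliable antibodies more weight, the following Python sketch combines per-cell-type staining levels from several antibodies into a weighted consensus. The four-level scale, the numeric weights and the function name are assumptions for illustration and are not taken from the Human Protein Atlas.

```python
from typing import Dict, List, Tuple

# Assumed ordinal scale for staining levels.
LEVELS = ["negative", "weak", "moderate", "strong"]


def weighted_consensus(profiles: List[Tuple[Dict[str, str], float]]) -> Dict[str, str]:
    """Combine antibody profiles (cell type -> level) using reliability weights.

    Only cell types scored by every antibody are included in the result.
    """
    if not profiles:
        raise ValueError("at least one profile is required")
    if any(weight <= 0 for _, weight in profiles):
        raise ValueError("weights must be positive")
    total_weight = sum(weight for _, weight in profiles)
    shared = set.intersection(*(set(profile) for profile, _ in profiles))
    consensus = {}
    for cell_type in sorted(shared):
        mean = sum(LEVELS.index(profile[cell_type]) * weight
                   for profile, weight in profiles) / total_weight
        consensus[cell_type] = LEVELS[int(mean + 0.5)]  # round half up
    return consensus
```

For two antibodies scoring a cell type ‘strong’ (weight 2) and ‘moderate’ (weight 1), the weighted mean is 8/3 and the consensus is ‘strong’; a curator would additionally weigh published data and other experimental evidence, which this sketch ignores.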

Subcellular localization of protein expression

Alongside the validation and IHC pipeline, antibodies are also used for confocal IF analyses on three cell lines of different origins (glioma, osteosarcoma and epithelial carcinoma) [10, 27]. The cells are stained using a standard set-up in which secondary antibodies are used to detect the antibody of interest (green) and two reference primary antibodies against the cytoskeleton (red) and endoplasmic reticulum (yellow). Nuclei are stained blue by DAPI. Using oil-immersion objectives, two high spatial resolution images from each cell line are obtained, typically containing 6–12 cells. The images are manually annotated in terms of subcellular localization, staining intensity and staining characteristics.

Using the Human Protein Atlas database

With the release of version 7.0 of the Human Protein Atlas in November 2010, a milestone for the project was reached as the halfway point in the number of genes with published protein expression profiles was accomplished [12]. This version also included major changes in the website design, data presentation and the user interface. One important aspect was to enhance possibilities for customized searches in the database and provide a more user-friendly and intuitive search tool. Improvements in the search function facilitate queries for proteins expressed (or not expressed) in specified organ systems, tissues or cell types, cancers and/or subcellular localizations. It is also possible to filter searches based on protein class, chromosomal location of gene and parameters related to antibody validation and reliability scores. The Human Protein Atlas database can thus be used as a powerful resource to address specific queries regarding the human proteome and serve both as a reference and as a starting point for various biomedical research projects [13, 16].

Browsing the Human Protein Atlas

When visiting http://www.proteinatlas.org, an initial search is needed to browse the database. A basic search for Pax6, a transcription factor known to be important during the development of eyes, CNS and endocrine glands [28–30], is used as an example to generally describe how to browse the website. A search for ‘pax6’ returns our example gene at the top of the hit list. The result shows that the expression profile for Pax6 is based on two antibodies, one external (CAB034143) and one in-house-generated antibody (HPA030775). Colour codes are displayed to show the validation results for these two antibodies based on IHC, IF, western blotting and PrEST-array analysis (Fig. 2a). Because the data for Pax6 are based on more than one antibody (paired antibodies), an annotated protein expression curation has been performed and the reliability of protein expression data subsequently scored. In addition, the number of cell types expressing Pax6 is given together with the subcellular location. Expression of Pax6 appears to be quite selective (in 19 of 66 cell types), and IF analysis demonstrated expression in the nucleus but not in the nucleoli.

Figure 2.  Browsing the Human Protein Atlas using Pax6 as an example. (a) Search summary report for the Pax6 gene with a list of synonyms, a list of antibodies used for protein expression profiling, antibody validation summary, reliability score of the annotated protein expression, tissue specificity and subcellular location. (b) The Pax6 summary page provides brief summaries of the underlying data, including gene and protein information, antibody information, subcellular localization, expression in normal tissues and organ systems, expression in cancer and expression in cell lines. (c) Distribution of Pax6 in normal tissues. White, yellow, orange and red coloured boxes indicate relative staining intensities, and blue coloured boxes indicate the level of annotated protein expression. (d) Preview images of Pax6 expression in the lower part of the stomach with a brief annotation summary. (e) High-resolution image (microscope view) of one of the images shown in (d) reveals strong Pax6 expression in a fraction of glandular cells. (f–j) Pax6 expression levels in selected normal tissues reveal strong staining in pancreatic islet cells (f), granular layer cells of the cerebellum (g) and respiratory epithelial cells of the nasopharynx (h), moderate to low staining intensity in cells of the testis (i) and negative staining in adrenal glandular cells (j).

Browsing normal tissues

Selecting ‘Pax6’ leads to a summary page displaying background information and a summary of Pax6 protein expression patterns (Fig. 2b). This page summarizes the data from the six sublevels of data that are available: (i) gene and protein information, (ii) data on the generation and validation of the antibodies, (iii) subcellular localization, (iv) expression profiles in normal tissues and organ systems, (v) expression in cancer and (vi) expression in cell lines. Browsing to the different sublevels provides access to more detailed information. For Pax6, the normal tissue and organ summary shows expression profiles in blue colour codes, which indicates that the expression of Pax6 has been curated using annotated protein expression. Underlying annotation details and images are found by selecting the ‘More tissue data’ link where the immunostaining patterns of the individual antibodies are compared side by side (Fig. 2c). In this view, expression patterns of individual antibodies are colour-coded in white, yellow, orange and red to indicate the level of immunohistochemical staining, and the corresponding annotated protein expression score is indicated in shades of blue. By selecting a specific type of tissue, e.g. the lower part of the stomach, preview versions of the images underlying the annotation are shown (Fig. 2d), and by selecting these images, the corresponding high-resolution image is presented and can be viewed and navigated as if under a microscope (Fig. 2e). High levels of Pax6 protein expression are detected in a fraction of the glandular cells of the gastrointestinal tract (Fig. 2c,e), pancreatic islets, granular cells of the cerebellum and respiratory epithelial cells of the nasopharynx (Fig. 2c,f–h), whereas lower levels are found in the cells of the testis (Fig. 2c,i). Most other tissues were negative for Pax6, as exemplified by the adrenal gland (Fig. 2c,j). These data are in agreement with previously reported results of Pax6 expression in tissues such as the pancreas and cerebellum [31–33].

Browsing cancer tissues

The data and images representing cancer tissues and cell lines can be accessed by browsing through the corresponding sublevels in the same way as described earlier for normal tissues. However, neither cell lines nor cancer tissues have been curated with annotated protein expression, and expression data are thus presented solely based on IHC staining. Analogous to the normal tissue view, the cancer summary page provides a side-by-side comparison of individual antibodies stained across the 20 different cancer types. A useful feature in this view is the ability to identify tumours based on overall staining score, intensity, quantity or subcellular locations of the staining pattern. By selecting one or multiple boxes representing, for example, ‘Intensity strong’ and/or ‘Quantity <25%’, the preview images corresponding to the set criteria will be highlighted. The selection is also remembered, and the matching thumbnail images are highlighted in purple when viewing the high-resolution images.

Advanced searches

Using the ‘Fields’ function, it is possible to design queries based on specific search strings. Searches can be customized to include or exclude (i) protein classes (all proteins are stratified into 18 different classes with several additional subclasses), (ii) subcellular protein localization patterns (19 different patterns annotated based on IF), (iii) distribution pattern in the human body (12 organ systems, more than 40 tissues and over 60 different normal cell types), (iv) expression pattern in cancer (20 different types of human cancer) and (v) expression pattern in cell lines (56 different cell lines and patient cell samples) (Fig. 3a). Searches can also be set to include or exclude genes/proteins based on chromosomal location, antibody validation and annotated protein expression. Combined with the possibility of defining thresholds for staining intensities and/or fractions of stained cells or tissues, a large number of search permutations are possible.

Figure 3.  Advanced searches in the Human Protein Atlas database using the ‘Fields’ function. (a) Using ‘Fields’, very specific queries can be designed based on, e.g., protein class (18 choices), subcellular localization (19 choices), organ expression (12 choices), tissue expression (46 choices), cancer expression (20 choices) or cell line expression (59 choices). For any ‘Fields’ option, a number of additional choices will be available after the initial selection (not shown). (b) Example of a search for proteins expressed in heart muscle but not in skeletal muscle that are also candidate cardiovascular disease genes and plasma proteins. (c) Example search for transcription factors expressed in any cell type of the testis, but not expressed in female tissues.

In a first example using the ‘Fields’ function, the aim is to identify plasma proteins that are markers for cardiovascular disease and expressed in cardiomyocytes but not in myocytes from skeletal muscle (Fig. 3b). By selecting ‘Fields’ on the Human Protein Atlas start page, drop-down menus appear where ‘Organ expression’ is selected in the first box and ‘Cardiovascular system’ in the adjacent box. ‘Fraction’ and ‘Tissues’ are set to ‘=’ and ‘100%’, respectively, and in the Expression menu, ‘Weak’, ‘Moderate’ and ‘Strong’ are marked to select positive cases regardless of intensity score. By selecting ‘Add’, this first search string is saved and additional criteria can be set by selecting ‘Fields’ again. By selecting ‘Tissue expression’, ‘Skeletal muscle’ and ‘Negative’ followed by ‘Add’, both basic expression criteria are set. In the two consecutive steps, ‘Fields’ is used to add ‘Protein class’, ‘Candidate cardiovascular disease genes’ and ‘Protein class’, ‘Plasma proteins’ (Fig. 3b). When all criteria have been added, selecting ‘Search’ will yield a hit list. This particular search returned 19 hits, all of which are proteins known to be involved in cardiovascular disease, e.g. cardiac troponin T and apolipoprotein E [34, 35]. To exclude these known proteins in another search for plasma proteins, the ‘NOT’ operator can be used instead of ‘AND’ for the ‘Candidate cardiovascular disease genes’ criterion.
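
The logic of such a combined query, including the switch from ‘AND’ to ‘NOT’, can be mimicked on local data with a few composable filters. The Python sketch below is an illustration only and does not call or represent the Human Protein Atlas website or any official interface: the `Protein` record, the helper functions and the simplification of the organ-level criterion to a single ‘heart muscle’ tissue are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Sequence, Set


@dataclass
class Protein:
    """Hypothetical local record: protein classes and tissue -> staining level."""
    name: str
    classes: Set[str] = field(default_factory=set)
    expression: Dict[str, str] = field(default_factory=dict)


Criterion = Callable[[Protein], bool]


def expressed_in(tissue: str,
                 levels: Sequence[str] = ("weak", "moderate", "strong")) -> Criterion:
    """Positive staining in the tissue at any of the given levels."""
    return lambda p: p.expression.get(tissue) in levels


def negative_in(tissue: str) -> Criterion:
    """Tissue was analysed and scored negative (missing data does not match)."""
    return lambda p: p.expression.get(tissue) == "negative"


def in_class(protein_class: str) -> Criterion:
    return lambda p: protein_class in p.classes


def negate(criterion: Criterion) -> Criterion:
    """Counterpart of replacing 'AND' with 'NOT' for one criterion."""
    return lambda p: not criterion(p)


def search(proteins: Iterable[Protein], criteria: Sequence[Criterion]) -> List[str]:
    """Names of proteins that satisfy every criterion (criteria are ANDed)."""
    return [p.name for p in proteins if all(c(p) for c in criteria)]
```

With a list of such records, the first search corresponds to `search(records, [expressed_in("heart muscle"), negative_in("skeletal muscle"), in_class("Candidate cardiovascular disease genes"), in_class("Plasma proteins")])`, and wrapping the disease-gene criterion in `negate(...)` gives the follow-up search that excludes known candidates.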

In a second example, the aim is to identify transcription factors expressed in testis but not expressed in female genitalia. Using the ‘Fields’ function, ‘Tissue expression’, ‘Testis’, ‘Any cell type’ is selected with ‘Moderate’ and ‘Strong’ expression. After adding these criteria, ‘Organ expression’ is selected to enable setting the same criteria to a larger set of tissues in one search string. ‘Breast and female reproductive system’, ‘=’, ‘100%’, ‘Negative’ is selected followed by ‘Add’. Finally, ‘Protein class’, ‘Transcription factors’ is added as the last criterion (Fig. 3c). This search returned 16 hits, including the well-known testis-specific cAMP-responsive element modulator (CREM) protein [36]. The list includes several less well-known genes, and at the bottom of the list, there are three unknown zinc finger proteins that appear to be almost exclusively expressed in testis (ZFP64, ZNF597 and ZNF843). It should be noted that these data are based on IHC staining with only one antibody for each of these zinc finger proteins and thus present a degree of uncertainty as other, independent experimental or bioinformatic data supporting the findings are lacking. However, these novel, specifically expressed and previously uncharacterized proteins provide a starting point for further studies and demonstrate the exciting prospect of novel in silico discoveries by mining the Human Protein Atlas.

Innumerable variants of queries can be posed using the search function. Searches for proteins related to cancer projects are the most obvious application, owing to the large amount of cancer tissue represented in the database and the ease of designing queries to list, for example, membrane proteins expressed in a particular cancer type or kinases expressed in cancer but not in the corresponding normal tissue. However, cancer is not the only field in which the Human Protein Atlas database can serve as a resource for new scientific leads. Those interested in diabetes may look for proteins expressed in pancreatic islets but not in the exocrine pancreas, and neuroscientists may want to explore the proteome of astrocytes compared with neurons in the hippocampus. Likewise, scientists using cell lines to explore a particular organelle or protein class may find searches based on the cell and subcellular atlas useful for obtaining new data.

Immunopathology

Clinical relevance of biomarkers

Personalized medicine requires the discovery, validation and application of unambiguous prognostic, predictive and pharmacodynamic biomarkers to guide therapeutic decisions [37, 38]. High-throughput screening methods, using genomic and transcriptomic profiling, have greatly increased our knowledge of the molecular basis of tumorigenesis, cancer progression and therapeutic response [39, 40]. The ambition of individualized treatment regimens can thus be viewed as an achievable goal. Microscopic evaluation of tissue sections taken from a tumour remains the gold standard for determining a cancer diagnosis, including the establishment of the tumour type in most cases. However, there is a great need for better stratification of tumours to optimize patient handling and therapeutic intervention. The heterogeneous and complex nature of cancer can, in part, be untangled by gene sequencing and other emerging molecular biological technologies; however, adding a layer of information regarding protein expression on top of morphology appears to be beneficial for tumour stratification in a clinical setting. IHC remains an invaluable method and provides such a tool for the visualization of protein expression patterns in cells from a section of tumour tissue. The role of antibodies is most likely to involve diagnostic, prognostic and predictive biomarker development. Despite the relative success of antibody-based assays for both the oestrogen receptor and human epidermal growth factor receptor 2 (Her2) in breast cancer, the development of clinically implemented assays has not nearly kept pace with the rate of biomarker discovery [41].

To date, most of the approximately 200–300 antibodies used in clinical pathology are used for diagnostic purposes and, to a lesser extent, for grading of malignancy and stratification of different tumour types. The exception is haematological malignancies, for which antibody-based classification is required for the subtyping of lymphoma and leukaemia. Only rarely are prognostic markers used in clinical routine, and only exceptionally are biomarkers routinely used for the purpose of personalized medicine, by predicting or determining which therapeutic regimen(s) a patient would benefit from [16, 41]. The most prominent exceptions are overexpression of Her2 in breast cancer for determining susceptibility to treatment with trastuzumab [42, 43] and the detection of KIT (CD117) expression for diagnosing gastrointestinal stromal tumours susceptible to imatinib treatment [44, 45]. With only a small repertoire of biomarkers for use in personalized medicine, there is an unmet need to identify sets of markers to subclassify both disease states and patients for accurate assessment of prognosis and for selecting the most favourable regimens for treatment [14, 15, 37].

TMA-based biomarker discovery

Tissue-based clinical research requires specimens from a large set of patients, in the form of frozen or paraffin-embedded tissue blocks. To enable efficient data collection in combination with high-throughput IHC, the TMA technique has proven powerful for paraffin-embedded material, as it allows for the detection of molecular targets in a multitude of samples simultaneously [7, 25, 46]. This technique also conserves the original tissue samples as only a small part of the specimen is needed to generate representative tissue cores for TMA construction. The TMA technology also requires small amounts of reagents and decreases errors because of experimental variability, as all samples are analysed using a single protocol for the entire TMA section. Depending on the design and number of cores sampled from each block of donor tissue, analyses of core specimens can provide a representation of the entire original tissue specimen with >95% accuracy [47].

Within the Human Protein Atlas project, the database is actively mined for potential biomarkers with the aim of identifying protein expression patterns that indicate whether a particular protein could be used as a biomarker. The focus is to identify and validate cancer biomarker candidates that can fulfil currently unmet clinical needs in pathology and oncology. To address such clinical needs, clinical questions are defined and appropriate patient cohorts determined to enable the collection and assembly of tumour material and clinical data for creating cancer-research TMAs with corresponding comprehensive clinical databases. These specifically designed cancer-research TMAs are used for extended analysis of protein expression patterns to test and validate candidate proteins as diagnostic, prognostic and/or predictive cancer biomarkers (Fig. 4, upper row).

Figure 4.  Strategy for tissue-based biomarker discovery: identification of clinical questions and unmet needs for novel biomarkers. Based on the clinical question, patient cohorts are defined and the corresponding tissue microarrays (TMAs) and clinical databases are constructed. The database is mined for potential biomarkers and putative candidates are selected based on, e.g., heterogeneous expression in cancers. A selected candidate antibody is scrutinized with respect to specificity to detect its target protein based on the basic validation performed in the Human Protein Atlas pipeline, extended validation platforms and available external data. The reproducibility and quality of immunohistochemistry staining is tested on screening TMAs before selected antibodies are used with clinical material. The first clinical cohort will fail or qualify the protein as a potential biomarker, and promising results are verified in independent clinical cohorts. Monoclonal antibodies are raised against verified biomarker proteins by using, e.g., information derived from epitope mapping of the msAbs used to detect the protein.

Biomarker candidate selection and antibody verification

The protein expression pattern across 12 individual tumour samples for each cancer type is analysed in the basic screening performed on all approved antibodies in the Human Protein Atlas [13]. This set-up enables the identification of differentially expressed proteins, a prerequisite for finding markers of biological relevance and for determining correlations between protein expression and clinical parameters such as disease-free survival. Proteins exclusively expressed in a particular form of cancer or cell type are equally interesting as such markers could be potential diagnostic markers. Proteins showing differential expression in cancer cells as compared to corresponding normal cells suggest an alteration associated with the malignant phenotype and could provide a marker for distinguishing benign from malignant lesions. All novel proteins with the above-mentioned expression patterns provide a lead for further investigation and require in-depth studies to explore functional aspects. Extended studies in larger well-defined patient cohorts are also needed before a candidate can be established as a useful biomarker.

Identified potential biomarker proteins are scrutinized prior to being selected for further analyses. Foremost, the available technical validation of the antibody used to detect the protein must provide convincing evidence of target-specific binding. Experimental data including the known or presumed function and interaction partners of the target protein are also considered. Identified candidate biomarkers are tested in a series of additional technical verification steps to determine the specificity and reliability of the antibody, including the use of siRNA to knockdown the corresponding transcript and additional immunoassays to ascertain the functionality and reproducibility of the antibody (Fig. 4, lower row).

Qualification of biomarker value

Verified candidate antibodies are subsequently used to analyse the expression pattern in specific cancer-research TMAs, typically containing tumour tissues from 200 to 300 patients. The outcome of immunostaining is analysed and annotated by a certified pathologist to obtain a score based on the intensity and fraction of positive tumour cells for each patient sample. The annotation data are imported to the corresponding clinical database and used for biostatistical analysis to detect possible associations between levels of protein expression and clinical parameters. Using this approach, proteins identified as potential biomarkers are further investigated in a series of independent cancer-research TMAs to verify the original results (Fig. 4, upper row). If the results are reproducible in several different and independent patient cohorts, the biomarker is deemed to be of clinical significance.

For a biomarker to be truly useful in IHC and other applications, the antibodies used to detect the target protein must selectively bind to the intended protein with the highest level of certainty. The in-house-generated, affinity-purified polyclonal antibodies have the advantage that they are normally directed to several binding sites (epitopes) of the target protein, but they have the disadvantage that binding to multiple epitopes increases the risk of cross-specificity towards other proteins. The preferred option for launching a new biomarker is therefore to generate renewable, highly characterized, single epitope-specific monoclonal antibodies that selectively bind to the target protein. This can be accomplished by epitope mapping of the original PrEST antibody to define the epitopes with highest specificity [48], and the corresponding peptides can be synthesized and subsequently used as antigens for producing monoclonal antibodies (Fig. 4, lower row).

Biomarker examples

The exploitation of the Human Protein Atlas database in combination with the analysis of protein expression patterns in specific cancer TMAs has proven a successful strategy for biomarker discovery efforts. Cell type–specific proteins are rare, and the detection of such novel proteins in cells related to cancer provides a starting point for exploring the potential of a new diagnostic biomarker. The homeobox transcription factor SATB2 was identified with a selective pattern of expression and, within cells of epithelial lineages, SATB2 expression is restricted to glandular cells lining the lower gastrointestinal tract. The expression of SATB2 protein is largely preserved in cancer cells of colorectal origin. The results from a recent study of over 2000 tumours showed that SATB2 is a sensitive and highly specific marker for colorectal cancer with distinct positivity in 85% of all colorectal cancers. Moreover, SATB2 and cytokeratin 20 combined identified 97% of all tumours of colorectal origin [49].

Recently, the RNA-binding protein RBM3 and the rate-limiting enzyme in the mevalonate pathway HMG-CoAR were identified as potential biomarkers of human cancer. RBM3 and HMG-CoAR were initially identified as potential biomarkers through searches in the database, and the extent of protein expression was subsequently found to correlate with clinically relevant parameters in several types of cancer. Both RBM3 and HMG-CoAR were found to be independent predictors of recurrence-free survival in patients with breast and epithelial ovarian cancer [50–54].

Patients with tumour cells expressing high levels of RBM3 have a better overall prognosis compared with patients with tumours showing lower expression levels. Moreover, in vitro experiments have demonstrated a role of RBM3 with respect to cisplatin sensitivity. The cisplatin-resistant ovarian cancer cell line A2780/Cp70 has relatively lower levels of RBM3 compared with the cisplatin-sensitive parental cell line A2780, and siRNA knockdown of RBM3 in the latter resulted in an increased resistance to cisplatin [55, 56]. This indicates that patients whose tumours express high levels of RBM3 could benefit from cisplatin treatment, whereas alternative drugs may be considered in patients with a lack of or low RBM3 expression.

Similarly, decreased levels of HMG-CoAR have been coupled to decreased in vitro sensitivity to tamoxifen and, intriguingly, HMG-CoAR has also been identified as a predictor of response to adjuvant tamoxifen treatment in breast cancer, regardless of oestrogen receptor status [52]. Moreover, the activity of HMG-CoAR can be inhibited by statins that are known to have antineoplastic effects [57–59], and studies on breast and ovarian tumours suggest that HMG-CoAR may prove useful as a surrogate marker of response to statin treatment in these cancers [51, 53]. Taken together, these recent findings offer an emerging panel of biomarkers for personalized medicine with respect to cisplatin, tamoxifen and statin treatments in breast and ovarian cancers.

Several additional promising cancer biomarkers have been explored, and protein expression patterns of use for clinical cancer research have been suggested for a number of tumour types. First, for colorectal cancer, a high expression of tumour-associated trypsin inhibitor was shown to correlate with liver metastasis and poor prognosis [60]. There was also a correlation between low expression of tryptophanyl-tRNA synthetase in tumour tissue and increased risk of recurrence and shorter survival [61]. Furthermore, growth differentiation factor 15 was a negative prognostic marker with high expression in tumour tissue and high plasma levels correlating with an increased risk of recurrence and reduced overall survival [62]. Second, in malignant melanoma, decreased expression of the protein syntaxin-7 in the cells of melanocytic lineage appeared to be associated with more aggressive tumours [63]. Additionally, SOX-10 was identified as a transcription factor selectively expressed in melanocytic cells, with highest levels in benign lesions and lowest levels of expression in melanoma metastases [64]. Also, the amplification of topoisomerase I was associated with more advanced tumours and poor prognosis [65]. Third, in prostate cancer, GAD-1 showed specific expression in both benign and malignant prostatic tissues, suggesting a role as a prostate-specific tissue biomarker [66]. Moreover, the three novel markers somatic cytochrome c, intestinal cell kinase and inhibitor of nuclear factor-κB kinase subunit beta showed higher expression in prostate tumours as compared to benign prostatic tissue [67].

Antibody-based proteomics and the Human Protein Atlas resource can also be used to complement more basic tumour biology studies, by providing expression data from human tissues and clinical cancer samples. In a recent tumour biology study using a mouse model, large numbers of granulin-expressing bone marrow–derived haematopoietic cells were found in the tumour stroma of breast cancers responding to instigating signals. These cells were shown to induce a local inflammatory response and remodel the extracellular milieu through paracrine interactions with resident fibroblasts. This study also showed that the expression of granulin in human breast cancer was strongly correlated with the triple negative/basal-like breast tumour subtypes and that breast cancer patients with tumours positive for granulin staining had a significantly worse outcome in terms of overall survival [68].

Studies of diseases other than cancer can also benefit from the data and reagents generated within the Human Protein Atlas project. In a recent effort to develop new tools for measuring beta cell mass to use as an end-point in diabetes-related studies, a screen for new cell surface markers with specific expression on beta cells as compared to surrounding cell types was performed. In this first study, several new candidates were characterized and tetraspanin-7 was identified as a promising candidate for future development of new PET tracers for beta-cell imaging [69]. In a study of epilepsy, the disc large homologue-5 gene encoding for the synapse-associated protein 102 was found to have a strictly neuronal pattern of expression and the results suggested a role of this protein in cortical hyperexcitability and epileptogenicity of malformations of cortical development [70].

Proteomic profiling of serum and plasma

Serum and plasma profiling set-up

Current analysis efforts within the Human Protein Atlas are being broadened to include the use of body fluids as sample material. The initial focus is on samples derived from blood, as blood is, in most cases, the preferred sample type for clinical diagnostics, and methods have been established for studying its cell and cell-free subproteomes [71]. Because blood is continuously exposed to the environment outside the circulation through cellular turnover, secretion or disease-related leakage, it is assumed that all proteins are likely to be present in blood at a certain time-point and under certain (disease) conditions [72]. This, however, has hampered efforts to clearly define plasma proteins [73–75], as has the finding of variations in protein levels between individuals of different gender and from different age groups. Even though blood collection is preferable to taking tissue biopsies, the different serum and plasma collection procedures can have an impact on the study outcome [76, 77]. Within blood-derived samples, the protein composition and the broad concentration range of different proteins pose a challenge to any proteomic method.

Technological aspects of high-throughput blood-sample profiling

To extend the Human Protein Atlas initiative to nonsolid sample material, appropriate methods for the implementation of the in-house-generated antibodies are needed to meet the throughput requirements in terms of number of analytes investigated in large numbers of samples. Thus, the use of microarray technology offers the required properties [78]. Conceptually, these arrays are built either by immobilizing capture reagents to analyse many parameters in a given sample or by immobilizing samples and applying binding reagents to determine one parameter in all samples in parallel [79].

There are two major types of microarray platforms available: planar and bead-based microarrays. Planar systems utilize robotics to deposit droplets onto functionalized microscopic slides to create an ordered spot layout, which is processed by experimental steps of blocking, incubation and washing, before fluorescence signals are read by a laser scanner and the acquired images are translated into intensity values. The bead-based system utilizes colour-coded particles, which are functionalized, mixed and analysed during the experimental procedure by a flow cytometric system to assign the measured interactions to a colour code [80, 81]. Whilst the two methods are complementary [82], the planar systems enable up to several thousand parameters to be included in one experiment but are suboptimal in terms of sample throughput and assay turnaround time, both of which are affected by the scanning and image analysis steps. The bead-based system enables a higher sample throughput by using microtiter plates and magnetic beads with instant data output, but the drawback is that only 500 colour codes are currently available. With the technological platforms in use at present, the immunoassays performed on these systems need to allow (i) flexibility to analyse many different parameters/antibodies at the same time, (ii) sensitivity to measure proteins of low abundance, (iii) simplicity to enable cost-efficient and streamlined sample and experimental handling and (iv) versatility to add, remove and alter the included parameters without major experimental adjustments.

Today, sandwich immunoassays are most commonly used to determine quantitative information on protein levels, and the possibility of multiplexing the analysis has proven to consume minute amounts of reagents and samples [83]. Nevertheless, these sandwich assays suffer from a low level of flexibility and versatility, as the two antibodies used within each of the pairs must not cross-react with the other sandwich pairs included in the test [84]. Other set-ups make use of directly labelling the protein content of a sample to introduce a tag, which allows the detection of captured targets without the need for a secondary antibody. Even though the sensitivities achieved in thoroughly optimized sandwich assays are superior, the label-based method is more flexible and versatile and allows the discovery and semiquantitative determination of protein profiles.

To match the need for analysing many samples using many antibodies, we have developed microarray-based workflows that can utilize both bead-based and planar systems [82], that circumvent the need to purify unreacted labelling reagents [85] and that can be modulated using heat to enhance the assay performance [86] (Fig. 5). The assay works with serum or, preferably, plasma [76], and recent technological updates of the bead-based system enable the analysis of 384 antibodies on 384 samples in a day (J. M. Schwenk, unpublished data). The assays consume only about 2 μg antibody for up to 1000 samples, and in the final incubation mixture, 0.1 μL neat fluid is sufficient to obtain profiles from all 384 antibodies.
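The throughput and reagent figures quoted above can be checked with simple arithmetic. The short Python sketch below only restates the numbers given in the text; the variable names are illustrative, and the per-sample antibody figure assumes that the full 1000 samples are reached.

```python
# Figures taken from the text.
ANTIBODIES_PER_RUN = 384
SAMPLES_PER_RUN = 384
ANTIBODY_UG_PER_1000_SAMPLES = 2.0

# Each antibody is read in each sample, so one day yields antibodies x samples data points.
assays_per_day = ANTIBODIES_PER_RUN * SAMPLES_PER_RUN

# 2 ug spread over 1000 samples; 1 ug = 1000 ng.
antibody_ng_per_sample = ANTIBODY_UG_PER_1000_SAMPLES * 1000 / 1000

print(assays_per_day)          # 147456
print(antibody_ng_per_sample)  # 2.0
```

A single day of the 384 × 384 format therefore corresponds to roughly 147 000 individual immunoassays, with about 2 ng of each antibody consumed per sample.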

Figure 5.  Plasma profiling strategy. The illustration encompasses the three stages within the Human Protein Atlas plasma profiling strategy, where the first phase (top row) describes the generation, validation and selection of antibodies. These then enter the second phase (middle row), in which they are implemented to produce the antibody bead arrays. Study design follows antibody array testing, and here analysis considerations include sample randomization and replication. In the application phase (bottom row), a multidisease cohort is screened using the arrays to identify protein profiles that separate case and control groups. Verification of these profiles is then achieved in larger and independent sample collections with a single disease focus, before clinically relevant sandwich assays are developed to qualify biomarker candidates in extended studies.

Blood-based biomarker discovery

Biomarker candidate selection

In contrast to the cancer biomarker discovery efforts using TMAs in which candidate biomarkers are selected based on tissue expression profiles, the biomarker discovery effort using serum and plasma samples can be performed as both directed, hypothesis-driven and undirected, hypothesis-free analyses [87]. Thus far, only mass-spectrometry-based methods have been considered as undirected, discovery-based screening approaches, as disease–protein correlations are determined based on profiling experiments. In overcoming the obstacle of a lack of affinity reagents by access to more than 10 000 antibodies, the affinity-based approach can offer an emerging alternative to MS in this respect. The workflow begins with the first stage and the selection of antibodies to define the direction of the discovery route. In undirected screening approaches, reagents can be selected based on their quality with regard to specificity validation [11, 22] and concentration. In directed approaches, antibodies are additionally chosen based on their target protein or gene product.

Bead array production and evaluation

The second stage in the workflow is to produce and test the arrays (Fig. 5, middle row). The following experimental steps are performed in microtiter plates, and we use a liquid handling system to support the throughput starting by preparing concentration-normalized antibody solutions to couple the beads. As we utilize magnetic particles, procedures for washing and bead array assembly are straightforward. Bead collections are then combined to create an array in suspension. These arrays are tested for relative antibody coupling efficiency and number of beads per colour code, and the first analysis in a sample reveals the first protein profiles.

In parallel to bead array preparation, the samples are transferred from plates and distributed onto a second plate to create a randomized sample layout to allow different studies or sample cohorts to be combined. The samples are then diluted, labelled with biotin and stored frozen. For use, labelled material is thawed, prepared in experimental buffer, heat-treated and cooled to room temperature and then combined with the beads for incubation. Finally, the beads are washed and reporter molecule (streptavidin) is added, and intensity values for each antibody and sample are determined using flow cytometry. The measured values are medians from counting at least 50 events per bead colour code. With this method, up to 150 000 immunoassays can be performed per day.
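The read-out step described above, in which each reported value is the median over at least 50 counted events per bead colour code, can be sketched as follows. This is a minimal illustration and not the instrument software: the input format (pairs of bead identifier and intensity) and the choice to return `None` for under-counted bead codes are assumptions.

```python
from collections import defaultdict
from statistics import median

MIN_EVENTS = 50  # minimum counted events per bead colour code, as stated in the text


def summarize_well(events, min_events=MIN_EVENTS):
    """Return the median intensity per bead code for one sample well.

    events: iterable of (bead_id, intensity) pairs (assumed input format).
    Bead codes with fewer than min_events events are reported as None.
    """
    by_bead = defaultdict(list)
    for bead_id, intensity in events:
        by_bead[bead_id].append(intensity)
    return {
        bead_id: (median(values) if len(values) >= min_events else None)
        for bead_id, values in by_bead.items()
    }


# Toy data: bead code 1 has 51 events (intensities 1..51), bead code 2 only 10.
events = [(1, float(i)) for i in range(1, 52)] + [(2, 100.0)] * 10
summary = summarize_well(events)
print(summary)
```

Using the median rather than the mean makes the per-antibody value less sensitive to a few outlying beads, which is one plausible reason for this design choice.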

Application of studies

The third stage is the actual study design and the required verification and qualification steps (Fig. 5, lower row). Here, the clinical question and the selection of prototypical case samples and matched controls are defined. Making use of the microtiter plate size, many different studies can be included to allow for profiling within and across disease categories. This not only enables the identification of targets that are unique for one disease (e.g. prostate cancer) but also reveals candidate markers of disease category (e.g. cancer) or markers that may be influenced by systemic effects (e.g. inflammation). From each sample group, the collected protein profiles are compared by statistical methods with those of the respective control group. It is important that the comparisons are based on samples collected using the same protocols to exclude protein profiles that are altered owing to sample collection or storage procedures.

Technical verification

Discovered protein marker candidates need to be technically verified. Initially, replicated measurements allow for the determination of the variation in obtained profiles, and by re-analysing the samples using the employed set-up, the reproducibility of the results can be judged. Other strategies include applying and comparing profiles from multiple antibodies generated against the same target protein. In addition, related (proteomic) platforms could serve as a technical verification for one another to exclude platform bias, for example by switching between planar and bead-based arrays [82]. Experimentally, competitive assays offer a valuable procedure to confirm that binding events are related to the proteins of interest [88]. In addition to array methods, western blots can be used to determine the molecular mass of a protein target detected in a given specimen as well as to indicate differences in protein abundance when comparing different samples. To identify proteins captured by a specific and immobilized reagent, the interaction can be displayed via bead-based immunoprecipitation, whereupon the captured binding partners can be eluted and monitored using western blot analysis or MS.
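The replicate-based check mentioned at the start of this section can be illustrated with the coefficient of variation (CV), one common way of summarizing the spread of repeated measurements. The text does not specify which metric is used, so the choice of CV, the function name and the toy intensities below are assumptions.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of replicate intensities."""
    return stdev(values) / mean(values)


# Hypothetical replicate intensities for two antibodies in the same sample.
replicates = {
    "antibody_X": [100.0, 110.0, 90.0],
    "antibody_Y": [50.0, 50.0, 50.0],
}

cv = {name: coefficient_of_variation(vals) for name, vals in replicates.items()}
print(cv)
```

Antibodies whose replicate CV exceeds a threshold chosen by the study would be candidates for re-analysis before their profiles are interpreted further.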

Biological verification

To reproduce technically verified findings, it is necessary to assess the value of the candidate in independent sample cohorts. This is to exclude the possibility that the findings are associated with specific cohorts or with the effects of the procedures for sample collection, preparation, storage or handling. Thus, larger sets of samples from one or several cohorts should be considered at this stage. As another initial step of biological validation, the cohort size should be increased to improve the power of the analysis and to address technical errors associated with the discovery phase [89, 90]. Any later phase of biological verification can include the initially employed sample cohort(s), making use of a sample cohort overlap for further technical interpretation. In addition to an increased size, a higher degree of diversity and definition of the verification cohort is required to accelerate the biological confirmation and eventually achieve further disease-specific subclassifications or to account for other unrelated benign pathologies [91]. As the biomarkers advance through the verification phases, there is a growing need for increasing sample numbers from unrelated collections.

Qualification

With regard to qualifying and translating verified biomarker candidates discovered using different methods into clinical studies, the sandwich set-up is the assay format of interest, as the secondary affinity molecule adds specificity to the test [83]. This assay combines specificity with sensitivity and enables a quantitative analysis of proteins [92], and the required experimental equipment and expertise are currently more commonly available than for microarrays. This format is mainly of interest for this late stage in the workflow because it is challenging to find good pairs of antibodies and recombinant proteins that resemble the antigens in the sample, and to combine these in any given composition, as cross-reactivity between the detection antibodies interferes with target analysis [93]. Nevertheless, sandwich assays are used routinely in clinical laboratories, and therefore, using this method, the study can be extended to locations in which patients would benefit from an improved diagnosis.

Additional considerations

There is a growing initiative towards more standardized sample collection and storage in biobanks [94–96] to provide samples in formats that allow faster processing in research laboratories. In parallel with these standardization efforts, any proteomic platform must not only meet high-throughput requirements but also take into account the value of precious samples already stored or being collected in biobanks. Studying clinical material stored in biobanks to discover, validate and qualify potential biomarker targets requires adaptations such as the introduction of bar-coded sample tracking, plate formats and a common sample decoding language.

Future perspectives

The Human Protein Atlas project has a pivotal role in antibody-based proteomics. Validated antibodies against all human proteins can be envisioned, together with a map showing where within a cell our proteins are expressed and how proteins are distributed in the human body. Analyses of alterations in protein expression patterns in disease tissues, such as tumours, provide a better understanding of the biology underlying various disease states, as well as a basis for developing novel diagnostic and therapeutic tools to meet the promises of future personalized medicine. Following the near completion of a first draft of the Human Protein Atlas, containing data corresponding to major isoforms of the approximately 20 000 protein-encoding human genes, there are several possible areas for expansion of the project: first, adding protein expression data for different isoforms, including post-translational modifications and splice variants; second, expanding the number of cell types and tissues to be analysed and including cell types and tissues from additional disease states; third, expanding basic knowledge regarding the underlying transcriptome using next-generation sequencing techniques, e.g. RNA-seq, to study the temporal and spatial relation between RNA and protein in normal development and disease. As all data in the Human Protein Atlas are freely and publicly available, the information can be utilized in various research projects and also integrated into other databases.

References
