The vision of personalized medicine is a compelling one for the future of medical care in general and cancer care in particular. It foresees the use of molecular data to better classify disease, facilitate the development and validation of new targeted therapies, treat patients with more specificity and efficacy but fewer adverse events, and more accurately determine disease predisposition. This vision drives most of the major strategic initiatives of the National Institutes of Health and the National Cancer Institute (NCI), and it is the principle behind recently drafted congressional bills such as the “Genomics and Personalized Medicine Act of 2007.”

An issue that is intuitively obvious but rarely emphasized is that biospecimens of high quality are the sine qua non of personalized medicine. The molecular data that are the envisioned basis for personalized medicine must be derived from cells and tissues because, quite simply, that is where the molecules reside. Biospecimens such as cells, tissue, blood, and plasma are common requirements for patient management in current, standard-of-care medical practice. In cancer medicine they are the gold standard of diagnosis, staging, and prognostic/predictive factor analysis. In a world of personalized medicine, however, biospecimens will take center stage as the critical link between the clinic and the patient. To realize the vision of personalized medicine, patient- and disease-specific molecular data must be derived from biological specimens in an accurate and reproducible manner. This in turn requires that the biospecimens themselves be annotated, collected, processed, and, if necessary, stored and/or distributed according to standards that safeguard their quality. Otherwise, on either an individual patient or a medical enterprise level, the vision is reduced to the well-known aphorism of “garbage in, garbage out.”

The reality of personalized medicine is on the horizon. The first steps have been taken in the evolution of tumor classification, disease prognosis, molecularly targeted treatment, and response to therapy based on molecular features. These represent some of the most important advances in cancer medicine over the past decade, but their development has been utterly dependent on the availability of high-quality biospecimens. High-profile examples are described below.

  • Investigators using tumor specimens from the NCI Cooperative Breast Cancer Tissue Resource showed that the HER-2/neu receptor is amplified in 20% to 30% of breast cancer cases. An antibody developed to this receptor (trastuzumab [Herceptin]) was found to be effective against breast cancers with HER-2/neu overexpression and is now standard of care for such tumors.

  • Imatinib mesylate (Gleevec) is a drug originally developed for the treatment of chronic myelogenous leukemia by targeting the BCR-ABL protein. After conducting molecular profiling studies on biospecimens collected from different tumor types, investigators found that a mutant form of KIT, a protein related to BCR-ABL, is responsible for the progression of gastrointestinal stromal tumors (GISTs). This led to the hypothesis that imatinib could be used to treat GISTs, and subsequent clinical trials confirmed that the drug has unprecedented effectiveness in this disease.

  • Gefitinib (Iressa), an anticancer drug that has been on the market for several years, was designed to target the epidermal growth factor receptor (EGFR) thought to contribute to lung cancer. Although the drug was approved by the US Food and Drug Administration, only about 10% of lung cancer patients typically responded to it. Using samples of tumor tissue from strong responders, investigators found that these cancers carried specific EGFR mutations and that this genetic marker could be used to predict gefitinib responsiveness.

Advanced technologies for molecular analysis are now exquisitely sensitive and specific, and many have the capacity for high throughput. Thus, the capacity for performing the type and amount of molecular profiling needed for personalized medicine is within reach. However, great strides in the development of molecular analysis technology have significantly raised the bar for molecular analyte quality and standardization. For example, powerful technologies that can detect the difference in the phosphorylation state of a single protein in a biologic sample demand increased rigor in biospecimen acquisition and handling to achieve reproducible results. Even if the analysis method were faultless, false results could be produced by poor quality samples. The source of confounding variation would be the analyte itself, with error introduced upfront during specimen processing.

The usefulness of biospecimens for the types of translational research needed to move medicine into the personalized era is limited by variation in the ways biospecimens are annotated with clinical data and consented for scientific use. Even biospecimens of the highest physical quality will be of little or no use in translational research if they lack high-quality clinical data or are not properly consented. The question looms as to whether or not enough biospecimens of sufficient quality currently exist to support the research needed to drive the development of personalized medicine, and if not, what is being done to address this problem.

Biospecimen collections have been stored by individual laboratories, private companies, community hospitals, academic medical centers, and government institutions for various objectives since the mid-19th century. According to an authoritative analysis published by RAND Corporation in 1999, there are over 300 million biospecimens stored in hundreds of biorepositories across the US [1]. That number is likely far greater today. However, the full potential of these repositories as scientific resources has never been realized because the biorepositories employ widely varying procedures for quality control, storage, annotation, and patient consent. This variation makes it difficult, or even impossible, for the research community to compare and validate results derived from the use of these biospecimens.

The variation in biospecimen quality is especially problematic for genomic and proteomic analysis technologies given their extraordinary sensitivity. Thus, despite the large numbers of biospecimens banked across the country, insufficient numbers of high-quality samples presently exist to support major scientific initiatives using these technologies. One example of this dilemma is The Cancer Genome Atlas (TCGA) pilot project. TCGA is a recently launched, large-scale team science initiative jointly sponsored by the NCI and the National Human Genome Research Institute (NHGRI). The pilot aims to perform broad genomic analysis of 3 cancer types using a combination of major technology platforms, including large-scale gene sequencing, in a coordinated fashion. The data from TCGA will be made publicly available for all researchers to use in developing new diagnostic, therapeutic, and preventative strategies for cancer. If the pilot phase is successful, it is envisioned that TCGA will be expanded to include all major cancer types. Initial preparation for TCGA included intensive efforts to locate specimen collections that met the technical demands of the analysis platforms to be used, but only a small number of qualifying specimen collections ultimately could be identified. Among them, no single repository had sufficient numbers of samples of any given tumor type to service the entire project. For the NCI and NHGRI, this experience has underscored the pressing need for biospecimens that meet the quality requirements of sophisticated molecular technologies that, in turn, enable transformative research initiatives like TCGA.

The associated clinical data obtained from medical records must be accurate, complete, and standardized across biospecimen collections to facilitate studies that link molecular profiles to patterns of disease progression and outcome. The integrity and quality of biospecimen materials and data are becoming increasingly important as the research community sets its sights on answering the bigger questions in cancer with larger, population-scale studies.

Both within the US and internationally, there are a number of ongoing efforts to create large-scale resources in which biospecimens are collected, stored and distributed under a new system of standards, quality control, data sharing, and access. These efforts include:

  • The United Kingdom (UK) National Cancer Tissue Resource, which comprises a large-scale network of acquisition and processing centers for tumor biospecimens.

  • The UK Biobank, which is recruiting up to half a million participants between the ages of 45 and 69 to contribute blood samples, lifestyle details, and medical histories to create a large biospecimen resource for epidemiological studies.

  • Biobank Japan, which is creating a large-scale DNA repository, with blood samples and associated clinical information from over 300,000 individuals.

  • Kaiser Permanente, which has launched a large-scale campaign among more than 2 million subscribers with the goal of soliciting 500,000 volunteers to provide blood or saliva specimens for genetic analysis to study how lifestyle, environmental factors, and genes interact to contribute to diseases such as diabetes, asthma, and cancer.

  • The US Department of Veterans Affairs (VA), which has launched a pilot project to gather 100,000 biospecimens with the aim of linking certain diseases to the genetic makeup of the biospecimen donors. Due to its large size and well-established electronic health record system, the VA is well positioned to successfully expand this effort.

Key to increased patient participation in building these and other large-scale biospecimen collections will be the implementation of guarantees of medical record privacy, legislation and enforcement of protections against genetic discrimination, and the creation of very clear informed consent guidelines for the use of biospecimens and their data.

The NCI has recently led several efforts to establish best practices for biospecimen resources, with the aim of ensuring that the quality standards required for personalized medicine research are recognized and implemented throughout the research enterprise. These include the establishment of the NCI Office of Biorepositories and Biospecimen Research; the development of the National Biospecimen Network Blueprint (providing a framework of recommendations for standardized informed consent, privacy of Health Insurance Portability and Accountability Act-protected patient information, biorepository operating procedures, researcher access to biospecimens, and quality control for data and specimens); hosting of the first International Summit on Harmonization of Biorepositories; and issuance of the Draft First-Generation Guidelines for NCI-supported Biorepositories, now revised in accordance with public comment and renamed NCI Best Practices for Biospecimen Resources. While large biorepositories are clearly increasing in number and incorporating standardized procedures for biospecimen and data handling, much work remains to be done to achieve the vision of national and/or global resources that can support the scale and precision of molecular analyses needed to accelerate progress toward personalized medicine.


  1. Eisenman E, Haga SB. Handbook of Human Tissue Sources: A National Resource of Human Tissue Samples. Washington, DC: RAND; 1999.