Arraying the Orchestration of Allograft Pathology

*Corresponding author: Minnie Sarwal, MD, MRCP, PhD, e-mail: msarwal@stanford.edu

Abstract

Microarrays, or gene chips, are exciting investigative tools for analyzing expression changes across thousands of genes in concert in tissues and cells of interest. Despite the relatively recent application of microarrays to transplant research, they hold great promise for unraveling the staging of rejection, stratifying patients towards more individualized treatment regimes, and discovering noninvasive biomarkers for monitoring of intragraft events. Bioinformatics tools are being developed to sift through the large data sets generated as ‘genomic fingerprints’ of the underlying biologic pathways. Gene clustering and class prediction tools allow discovery of diagnostic and prognostic molecular signatures of health and disease. Oligonucleotide-based microarrays also have utility in genotyping polymorphic markers. This report reviews the current literature of microarray use in transplantation research, compares currently available array platforms, and discusses future application of this technology to clinical organ transplantation.

Abbreviations:
PCR: polymerase chain reaction
aRNA: antisense RNA produced by amplifying RNA using T7 polymerase
cDNA: copy-DNA made by reverse transcription of RNA
RT-PCR: reverse transcriptase-PCR, also known as quantitative-PCR
Tm: melting temperature of DNA:DNA duplex formed during hybridization
SNPs: single nucleotide polymorphisms
HLA: highly polymorphic human leukocyte antigens used in tissue typing
CAN: chronic allograft nephropathy
AR: acute rejection
TGF-β: transforming growth factor-beta
IFN-γ: interferon-gamma
DT: drug toxicity
SAM: significance analysis of microarrays
FDR: false discovery rate
PAM: prediction analysis of microarrays
PCNA: proliferating cell nuclear antigen

Microarrays: New Tools for Transplantation Research

Even with recent advances in functional genomics after the full human genome sequence was published (1), only a small handful of the 30 000 human genes have been characterized in human renal disease, and more recently in human renal transplantation (2–4). One of the most significant advances in biomedical research has been the development of DNA microarrays capable of generating genome-wide profiles of mRNA expression or mutational signatures. These new tools have been used to explore gene expression and regulation and to characterize the genetic diversity underlying disease. The earliest uses of DNA arrays included only a small number of targeted genes. Filter-based ‘macroarrays’ in the 1980s evolved towards miniaturization in the 1990s to ‘microarrays’ (5), aiming to increase the number of genes available for analysis in a single experiment while also reducing sample requirements for each hybridization. Significant improvements in surface chemistries, probe attachment methods, robotics and signal detection have made possible miniaturization of arrays that have thousands of capture sequences (Figure 1).

Figure 1.

Evolution of array technology over the past 30 years. (A) The advent of microplate-based technology revolutionized how enzymatic assays were conducted in the 1970s because it permitted the use of smaller reagent volumes (cost savings) and robotic liquid handling (improved precision). The parallel-analysis format was replicated in the 1980s for hybridization-based assays using membrane filter ‘macroarrays’ containing cDNA clones hybridized in parallel with a mixture of radioactively labeled cDNAs. The technology migrated toward further miniaturization onto glass or other solid supports at much higher probe density with the advent of ‘microarray’ technology in the 1990s. The use of fluorescent dyes (Cy3 and Cy5) and laser-based scanning reduced dependence on radioactive detection without sacrificing sensitivity. (B) A newly emerging advance in microarray profiling is the development of plate-based ‘arrays of microarrays’ where up to ∼1000 probes can be spotted onto the grid of a standard 96-well microplate. This platform approach permits automated protocols for RNA sample preparation, amplification, labeling and hybridization to be developed and optimized for real clinical samples, improving assay reproducibility while reducing the risk of sample mix-up and processing errors.

The term DNA microarray refers to a high-density array of oligonucleotides or PCR-products (also known as cDNA or copy-DNA) immobilized onto a solid support such as glass slides. The immobilized DNA selectively retrieves genes or sequences of interest when the array is hybridized to a mixture of complementary sequences. Because hybridization is concentration driven, signal intensity is generally proportional to the relative abundance of the message (mRNA) in the hybridizing sample. Currently, more than 50 000 genes can be spotted on a single slide and each array may be custom designed to focus on individual metabolic pathways.

Most DNA microarray systems use a two-color hybridization scheme to visualize and measure gene expression levels reproducibly across multiple samples. Typically, RNA is fluorescently labeled with either a green dye (Cy3) or a red dye (Cy5) before hybridization. A common reference sample, labeled with the green dye, is used to normalize signals across multiple experiments and is competitively hybridized with test samples on each DNA microarray. The ratio of red to green fluorescence (R/G ratio) is measured by scanning the microarray slide at wavelengths specific to each dye. The relative abundance of test vs. reference mRNA is then indicated by the color of the individual spots: red if test mRNA is more abundant than reference mRNA, green if reference mRNA is more abundant, and yellow if they are equally abundant. Scanning the arrays produces digital color images in which signal intensities correlate with transcript abundance; these can be visualized and compared with various software tools (Table 1). All array-based studies yield relative expression information, so independent quantitative methods are used to confirm expression differences. These tests include quantitative PCR (for quantitation of mRNA levels) and immunohistochemistry (for identifying the cellular location and abundance of the gene product).
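The red/green comparison described above is usually handled on a log2 scale, so that a two-fold increase and a two-fold decrease are symmetric around zero. The following is a minimal sketch of that calculation, not the method of any particular scanner software; the function names, the example intensities, and the two-fold cut-off used to call a spot red or green are all illustrative assumptions.

```python
import math

def log2_ratio(red, green):
    """Return log2(R/G) for one spot; both intensities must be positive."""
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(red / green)

def spot_color(red, green, threshold=1.0):
    """Call a spot red, green or yellow from its log2(R/G).

    threshold=1.0 (a two-fold difference) is an assumed cut-off,
    chosen only for illustration.
    """
    m = log2_ratio(red, green)
    if m >= threshold:
        return "red"      # test RNA more abundant than reference
    if m <= -threshold:
        return "green"    # reference RNA more abundant than test
    return "yellow"       # roughly equal abundance

# Hypothetical background-corrected intensities: gene -> (Cy5 test, Cy3 reference)
spots = {
    "geneA": (8000.0, 1000.0),
    "geneB": (500.0, 2000.0),
    "geneC": (1500.0, 1500.0),
}
calls = {gene: spot_color(r, g) for gene, (r, g) in spots.items()}
print(calls)
```

In practice the raw intensities would first be background-corrected and normalized across the array before ratios are interpreted.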

Table 1.  Useful websites for array analysis
Resource | URL
Significance Analysis of Microarrays | http://www-stat.stanford.edu/~tibs/ElemStatLearn/
Prediction Analysis of Microarrays | http://www-stat.stanford.edu/%7Etibs/PAM/Rdist/index.html
R: Statistics for Microarray Analysis | http://ihome.cuhk.edu.hk/~b400559/arraysoft_rpackages.html
Cluster and Treeview (Eisen Lab.) | http://rana.lbl.gov/EisenSoftware.htm
Xcluster (Sherlock Lab., Stanford) | http://genetics.stanford.edu/~sherlock/cluster.html#formats
BRB ArrayTools (NIH informatics tools) | http://linus.nci.nih.gov/BRB-ArrayTools.html
KEGG Kyoto gene classification database | http://www.genome.ad.jp/kegg/
GoMiner gene classification | http://discover.nci.nih.gov/gominer/index.jsp
Argon Laboratories metabolic pathways | http://wit.mcs.anl.gov/WIT2/
Gene Ontology (GO) Consortium | http://www.geneontology.org/
ExPASy biochemical pathways | http://www.expasy.org/cgi-bin/search-biochem-index
GenMAPP metabolic pathways | http://www.genmapp.org/
GeneCards human gene database | http://bioinformatics.weizmann.ac.il/cards/
Stanford microarray database | http://genome-www5.stanford.edu/
SOURCE clone & gene information database | http://source.standord.edu
Pat Brown's lab.: pioneers in microarrays | http://brownlab.stanford.edu/
DRAGON microarray bioinformatics tools | http://pevsnerlab.kennedykrieger.org/dragon.htm
Protein data bank | http://www.rcsb.org/pdb
NCBI Entrez tutorials | http://www.ncbi.nlm.nih.gov/Education/index.html
The Wellcome Trust Sanger Institute | http://www.sanger.ac.uk/
European Molecular Biology Lab. | http://www.ebi.ac.uk/embl
Swiss Institute of Bioinformatics | http://www.expasy.ch/
Protein Information Resource | http://pir.georgetown.edu
National Library of Medicine | http://www.nlm.nih.gov/
EMBL genome browser | http://www.ensembl.org/
NIH list of genome sites | http://linus.nci.nih.gov/pilot/links.htm
UK list of genome sites | http://www.hgmp.mrc.ac.uk/GenomeWeb/

Global expression profiling with microarrays has become a driving tool for discovering novel genes in functional pathways or in diseases of interest. Hypothesis-driven, directed expression studies that use a candidate gene approach to analyze a limited repertoire of genes are now being replaced by hypothesis-generating studies. Although only 35–40% of the ∼30 000 human genes are functionally characterized, DNA microarray profiling does not require prior knowledge of the gene pathways involved in a disease. In the analysis of the results, transcriptional similarities across thousands of genes of known and unknown function are assessed simultaneously. Global profiling of either expression patterns or patterns of gene polymorphisms in association studies (Figure 2) has the potential to unravel the structured and coordinated physiological processes linked to complex immunological circuits in organ transplantation. It is thus a powerful discovery tool for identifying the interplay among multiple gene pathways in various immune responses such as acute rejection, vascular or cellular rejection, chronic injury, infection, drug toxicity, and tolerance.

Figure 2.

Oligonucleotide-based arrays for genotyping genetic variants in drug metabolizing genes. Short oligonucleotide probes (15-mer) flanking common mutations in 11 drug metabolizing enzyme genes were designed and spotted onto glass slides. (A) Amine-modified probes were covalently attached to the array and represent each of the two common allelic forms for 47 mutations in these genes. To test the specificity of hybridization-based discrimination, a mixture of dye-labeled sequences complementary to the probes on the array was hybridized using high-stringency conditions and allele-discrimination buffer (Uni-Hyb; Array-It, Sunnyvale, CA). (B) Cy-5 (red) was used on all wild-type test sequences and Cy-3 (green) on all mutant sequences. Probe specificity can be identified by relative signal intensities in each of the two color channels (yellow signal indicates nonspecific discrimination of the alleles and white signal indicates signal saturation). Specificity and hybridization efficiency can be improved by modifying the length or composition of the arrayed allele-specific probes in future designs of the array. The first two rows contain experimental controls and all allele-specific probes were spotted in duplicate. The specific mutations tested include polymorphisms in CYP2D6 (21 SNPs), CYP2C19 (3 SNPs), CYP2A6 (1 SNP), MPX (1 SNP), NAT2 (5 SNPs), NAT1 (5 SNPs), EPHX (5 SNPs), TAP2 (1 SNP), TPMT (2 SNPs), NQO1 (2 SNPs) and GPX2 (1 SNP).

As DNA microarrays can reveal dynamic patterns of expression of tens of thousands of genes in a single experiment (6), they are being widely applied to bridge the gap between the basic and clinical sciences. Currently, there are two main microarray platform types: those that use cDNA spots gridded on glass slides (6) and those that use synthetic capture oligonucleotides (e.g. Affymetrix, Agilent, Applied Biosystems, GeneXP Biosystems arrays). The relative merits of these platform types and the biological applications that they are best suited to address are summarized in Table 2. While cDNAs were the first type of probe to be used on microarrays, oligonucleotide arrays offer greater versatility, as both expression profiling and genotyping studies can be conducted. In addition, cDNA probes are generally too long to detect alternatively spliced mRNA variants effectively. The length of these probes is advantageous, however, for differentiating individual genes in large super-gene families where large segments of the genes may be repeated across all members of the family. A reduced ability to detect members of the p450 super-gene family or HLA genes, for example, is a limitation of short oligonucleotide arrays in expression studies. In addition, cDNA arrays can be adapted to new experiments and customized to include new probe sets much more readily than arrays generated by photolithographic methods such as the Affymetrix array.

Table 2.  Array format comparison
Feature | cDNA | Oligonucleotide | Long-oligos
Probe length on the array | 0.5–3 kb | 15–25 bases | 45–70 bases
Use in mRNA profiling | ++++++
Genotyping | -++++
Detect splice variants | -+++++
Comparative genome hybridization | ++++++
Requires multiple probes per target | -+++++
Spot consistency | -+++++
Current availability | ++++++
Manufacturability | -+++++
Batch-to-batch consistency | -++++++
Ability to customize | ++-+++
Typically uses spotting technology | Yes | No | Yes
Template required for genome-wide profiling | 5 μg | 16 μg | 5 μg
Cost per array | Moderate | Highest | Lowest
Base system cost | $50K | $150K | $50K

Improvements in DNA synthesis methods as well as surface and attachment chemistries now permit longer oligonucleotide probes to be spotted on microarrays. This improves both hybridization consistency and batch consistency. Probe construction is simplified, as no PCR amplification or clean-up is required. In addition, spot consistency is improved (the high viscosity of cDNA probes causes spot artifacts, resulting in batch inconsistencies that may need to be normalized out of the experimental results), hybridization is improved (probe binding can be better Tm balanced), and the arrays can be made at a lower cost. Thus, more and more cDNA-based studies are migrating to long oligonucleotide probe-based arrays.
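To illustrate what Tm balancing means for a set of long oligonucleotide probes, the sketch below estimates each probe's melting temperature with the widely used GC-content approximation for sequences longer than about 13 bases, Tm = 64.9 + 41 × (G + C − 16.4) / N, and reports the spread across a probe set. This approximation ignores salt and probe concentration and is not the design method of any array manufacturer; the function names and example sequences are hypothetical.

```python
def basic_tm(seq):
    """Approximate Tm (deg C) from GC content; intended for oligos > 13 bases.

    Uses Tm = 64.9 + 41 * (G + C - 16.4) / N. Salt and probe
    concentration are ignored, so treat the result as a rough guide.
    """
    seq = seq.upper()
    if not seq or any(base not in "ACGT" for base in seq):
        raise ValueError("sequence must contain only A, C, G, T")
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)

def tm_spread(probes):
    """Difference between the highest and lowest estimated Tm in a probe set."""
    tms = [basic_tm(p) for p in probes]
    return max(tms) - min(tms)

# Hypothetical 50-mer probes with different GC content
probes = [
    "ATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCATGCAT",  # about 50% GC
    "ATATATGCATATATGCATATATGCATATATGCATATATGCATATATGCAT",  # lower GC
]
for p in probes:
    print(round(basic_tm(p), 1))
print(round(tm_spread(probes), 1))
```

A narrow Tm spread means that one hybridization temperature suits all probes on the array, which is harder to achieve with cDNA probes of widely varying length and composition.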

Another recent advance in the field is the use of microwell-plate sized ‘arrays of microarrays’ in a standard 8 × 12-cm plate format commonly employed in robotic fluid handling (Figure 1B). In these plates, 96 individual samples can be processed simultaneously using automated methods for sample handling. Standard laboratory robots dispense fluids and perform all the steps in sample preparation, amplification, labeling, washing, and hybridization. Sample requirements for the well-based array hybridization plates are similar to those required for genome-wide profiling on microarray slides, as larger reaction volumes are required to prevent evaporation during overnight hybridizations. To keep the per-sample cost as low as possible, competitive hybridization is eliminated and experiments are conducted using only a single dye. Most single-color microarray applications make use of internal control probes to normalize expression results. Expression signals are normalized to invariant ‘housekeeping’ genes and spiked controls, a method commonly employed in quantitative RT-PCR-based expression assays.
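The normalization to invariant housekeeping genes described above can be sketched as follows. This is a simplified illustration under stated assumptions: each signal is divided by the mean signal of the chosen control probes in the same well, and the gene names used as controls (GAPDH, ACTB) and all intensity values are hypothetical choices made only for the example.

```python
from statistics import mean

def normalize_to_housekeeping(signals, housekeeping):
    """Scale single-channel signals by the mean of the housekeeping probes.

    signals: dict mapping probe name -> raw intensity for one sample
    housekeeping: names of probes assumed to be invariant across samples
    """
    reference = mean(signals[name] for name in housekeeping)
    if reference <= 0:
        raise ValueError("housekeeping reference must be positive")
    return {name: value / reference for name, value in signals.items()}

# Hypothetical raw intensities from one well of a single-dye plate array
well = {"GAPDH": 1000.0, "ACTB": 3000.0, "IFNG": 500.0}
normalized = normalize_to_housekeeping(well, ["GAPDH", "ACTB"])
print(normalized["IFNG"])
```

Spiked controls of known concentration could be treated in the same way, by adding their probe names to the reference list.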

Microarray Studies in Transplantation

Fundamental studies in solid organ transplantation have characterized specific functional roles for individual genes in the immunological cascade leading to organ rejection. Nevertheless, significant redundancy in the immune system is suggested by clinical variability in acute rejection outcomes, in the progression of chronic rejection, and by the potential to acquire graft tolerance. Underlying all of these complex systems are concordant expression and co-regulation of multiple genes in known and novel molecular pathways. Ascertaining their individual roles in adapting the effector response is now possible with gene expression profiling using microarrays.

Microarray-based studies in animal models of transplant dysfunction and acceptance

Controlled animal transplant models are invaluable for improving our understanding of genes regulating human responses to allograft injury. Mueller et al. (7) recently reported a study of acute rejection in cardiac transplants using mouse models of allogeneic, syngeneic, and alymphoid vascularized transplants. This study identified a temporal pattern of transcriptional changes in acute rejection. The genes identified included those involved in the innate immune response, cell stress, defense and metabolism. Saiura et al. (8) examined a rat-to-mouse concordant cardiac transplantation model, and identified a strong interferon-γ (IFN-γ) response in both allografts and xenografts. In addition, cardionatrin and atrial natriuretic factors were expressed at higher levels in the xenografts relative to the rat isografts. Pro-inflammatory and IFN-γ-inducible genes were also identified in a different study on tolerized mice (cardiac allograft model tolerized by anti-CD80 and anti-CD86 monoclonal antibodies) (9). Stegall et al. (10) used oligonucleotide arrays to serially follow and analyze rat heterotopic heart transplant models, and demonstrated an inflammatory gene response in rejection. Down-regulation of decorin, an antagonist of transforming growth factor-β (TGF-β), was hypothesized to be a pathway for stimulation of the fibrogenic graft response to this injury. Together these observations hint at the extensive redundancy of the immune system in response to graft rejection.

Cross-species comparative genomics between well-characterized and controlled animal transplant models and human patient studies is an alternative approach to following biological patterns of health and disease. A free and publicly available website (http://expression.gnf.org) enables the research community to query mouse and human data sets based on gene name, protein family and accession number. Table 1 lists additional websites that provide useful tools or data collections to support interpretation of microarray studies.

Microarray-based studies in human transplantation

Only a few published studies in human organ transplantation have used microarrays, but the field is likely to grow exponentially in the near future. Mechanisms of acute transplant rejection and chronic allograft nephropathy (CAN) injury remain important pathways requiring additional investigation. This information would permit patients to be assigned to clinically relevant categories that stratify them by risk, therapeutic response, and prognosis. Comprehensive expression analysis across carefully selected clinical samples may reveal unexpected pathways of immune injury, possibly far removed from the current standard of hypothesis-based gene analysis. One of the earliest studies, of seven renal allograft rejection samples (2), identified IFN-γ-driven inflammatory and cytokine-rich pathways in acute rejection. This study did not show increased expression of granzyme, perforin, and granulysin, three cytotoxic T-lymphocyte (CTL) signaling antigens, which had previously been demonstrated as markers of acute rejection (11,12). However, this difference could have been owing to the small study size or the underlying heterogeneity in the rejection process.

Scherer et al. (13) analyzed 6-month protocol biopsies to ascertain a set of 10 genes which may predict the development of CAN 1 year post-transplantation. Some of the identified genes play a role in cellular remodeling, fibrogenesis (FGF-2), and immune activation (prolactin receptor). The levels of TGF-β varied over time in patients who developed CAN: initially the gene decreased but later it increased (13). This intriguing observation remains to be confirmed in further studies. Expression profiling of pediatric renal transplant recipients with both anemia and acute rejection (11) was undertaken to identify underlying mechanisms. Contrary to current opinion, the anemia was found to be largely secondary to erythropoietin deficiency. Additional key pathways identified suggest that deficiencies of iron and folate availability, reduced expression of specific hemoglobin chains, and reduced red cell survival all contribute to the fall in hematocrit observed. Based on this study, vitamin and iron supplements could augment erythropoietin treatment of transplant patients with anemia (11).

A novel steroid-free immunosuppressive protocol in pediatric renal transplantation was also recently studied by DNA microarray analysis (14). The objective of this study was to identify differences in expression profiles between steroid-free and steroid-based treatment regimes that might account for improved outcomes in the steroid-free arm. A markedly increased expression of CTL genes in stable non-rejecting steroid-free recipients (simultaneous protocol biopsies were negative for acute rejection) was observed early post-transplantation (14). These results were validated by quantitative RT-PCR and suggest that increased immune surveillance may be a mechanism for stable engraftment in the steroid-free patients. The study also suggests that noninvasive monitoring for acute rejection may need to be standardized for immunosuppression regimens lacking some of the ‘backbone’ immunosuppressive drugs such as steroids and calcineurin inhibitors.

Clinically, it is well recognized that acute rejection is variable in etiology and clinical course. Consequently, graft outcome cannot be predicted using available clinical, histological and genetic markers. Our recent results profiling mRNA levels in 67 pediatric renal transplant biopsies suggest that previously unrecognized molecular heterogeneity might underlie some of the variability in the clinical course and treatment response of acute renal allograft rejection (4). Specifically, three distinct molecular signatures were identified that involve more than 1300 differentially expressed genes and cluster acute rejection (AR) cases into three distinct groups designated AR-I, AR-II, and AR-III (Figure 3A). AR-I has aggressive T- and B-cell, macrophage, and NK-cell infiltration and activation and exhibits a strong IFN-γ and NF-κB response. In contrast, AR-II is a milder form of rejection and has a gene expression signature similar to the innate immune response to infection and to drug toxicity (DT). These patient samples co-cluster with other samples with DT reactions in the absence of rejection. Finally, AR-III is an immunologically quiescent rejection despite Banff evidence of tubulitis and may be acute rejection captured late and on its way to spontaneous recovery.

Figure 3.

Clustering of 67 renal biopsies by global mRNA profiling and a minimum gene set that classifies acute rejection into three acute rejection (AR) classes. (A) Expression data from 1340 differentially expressed genes were used to cluster 67 pediatric renal biopsy samples. Acute rejection samples cluster into three of four branches of this tree. Prediction analysis of microarrays (PAM) class prediction was used to identify the minimum gene set that differentiates acute rejection biopsies. Genes are ranked by degree of differential expression across user-selected sample groups. In this analysis, the 1340 differentially expressed genes from 12 AR-I, nine AR-II, and five AR-III biopsies were used as the learning set and a fivefold expression cut-off identified 97 informative genes. (B) Functional clusters of the informative genes are illustrated, and the AR samples group consistently with the comprehensive expression profiles. (C) Based on this small subset of genes, the called phenotype and PAM classification scores are in 96% concordance with our previous report (4). One AR-I sample, while loosely clustering with other AR-I biopsies, has an intermediate PAM score and phenotypes as AR-II in this analysis. Like many of the AR-II patients, this patient also had a urinary tract infection at the time of biopsy.

These acute rejections vary in the extent and cell-specific expression patterns of infiltrating lymphocytes. A surprising finding was a robust signature for immunoglobulin and B-cell-specific genes; this was corroborated by an unexpected finding of B-cell dense (CD20+) clusters localizing to the interstitial compartment in the absence of tubulitis. A pure humoral rejection response in these samples could not be established, as these biopsies had variable C4d staining. Survival analysis of graft function recovery following the rejection episode, and of graft loss over the period of follow-up, revealed a significant (p < 0.001) association with B-cell cluster density in this ‘learning’ data set, as well as in an independent data set of 31 additional acute rejection biopsies.

Since publication of this study, we have used prediction analysis of microarrays (PAM) (15) to identify 97 genes represented in this dataset that all have a greater than fivefold difference in expression level, and have classified our learning set of 26 AR samples with 96% concordance to an assigned phenotype (Figure 3B). Clustering of the samples based on these genes is in agreement with PAM class prediction scores (Figure 3C). Another key discovery in this study is that AR expression overlaps with the innate immune response to infection, as evidenced by cluster analysis and by differential expression of several TGF-β-modulated genes including RANTES, MIC-1, several cytokines, chemokines, and cell-adhesion molecules (Table 3). The most highly differentially expressed genes selected in the class prediction analysis show significant enrichment for ion transporters, MHC antigens (both class I and class II), apoptotic genes, and T- and B-cell-specific genes (Figure 4). Interestingly, we observe more B-cell-specific genes than T-cell-specific genes in the classification gene set, with the highest expression in AR-I. This may reflect increased B-cell density in a larger number of the AR-I samples, suggesting that B cells may be serving as efficient antigen-presenting cells for indirect allorecognition.

Table 3.  Gene expression differences in acute rejection (AR) classes
Gene or gene family | Pathway or cell | AR-I / AR-II / AR-III / CAN
Annexins: ANXA11, ANXA3, ANXA4, ANXA5 | Apoptosis | ++++++
CASP10 Caspase 10, apoptosis-related protease | Apoptosis | +++−
STK17B Apoptosis-inducing kinase | Apoptosis | ++++++
CD20 B-cell antigen | B cell | ++++++−+−
Immunoglobulin: IGHM, IGL, IGKC, IGHG3 | B cell | +++++++−+−
CACNB1 Calcium channel | Calcium | +−−−−++++
S100A10 Calcium-binding protein A10 (Annexin II ligand) | Calcium | +++−−−−−−
TACSTD1 Calcium signal transducer 1 | Calcium | +++−−−−−−
ITGB2 Integrin, beta 2 (Also called CD18 or p95) | Cell adhesion | ++++++−−−−
PECAM1 Platelet/endothelial cell adhesion molecule (CD31) | Cell adhesion | ++++++−−−−
VCAM1 Vascular cell adhesion molecule 1 (CD106) | Cell adhesion | ++++++−−−−−−
CDKN2A Cyclin-dependent kinase inhibitor 2A | Cell cycling | +++++++
Cyclins: CCNB1, CCNB2, CCNA2 | Cell cycling | +−+++++
Cyclins: CCNG2, CCNI, CCNG1 | Cell cycling | +++++++−+−
MHC Class I: HLA-A, HLA-B, HLA-C, HLA-E | HLA genes | +++++++−−−−
MHC Class II: HLA-DR, HLA-DQ, HLA-DMA, HLA-DRB4 | HLA genes | +++++++−−−−
Chemokines: MIG, MIP-1, CCR5, CX3CR1, DARC | Immune response | +++−
Cytokines: SCYB10, SCYA5, SCYA3, SCYA13, SCYA2 | Immune response | +++−
Interleukins: IL2RB, IL6R, IL16, IL15R | Immune response | +++++++
DEFA1 Defensin, alpha 1 | Innate immunity | +−−−++++++
DEFB1 Defensin, beta 1 | Innate immunity | +++−−−−−−−
SCYA2 Small inducible cytokine A2 (MCP-1) | Innate immunity | ++++−+−+−
SCYA5 Small inducible cytokine A5 (RANTES) | Innate immunity | ++++−+−+−
MST1 Macrophage stimulating 1 | Signaling | +−+++−−−−−−
STAT1 Signal transducer (INFa induced) | Signaling | +++++
STAT6 Signal transducer (IL-4 induced) | Signaling | +++++−−−−−−
CD69 Antigen (p60, early T-cell activation antigen) | T cell | +−−−−++++
MAL Mal, T-cell differentiation protein | T cell | +−+++−−−−
NFATC3 Nuclear factor of activated T cells | T cell | +−++−−−−
FXYD3 FXYD domain-containing ion transport regulator 3 | Transport regulator | +−−−−++++
ATP5B ATP synthase, H+ transporting | Transporters | +++++
ATP5C1 ATP synthase, H+ transporting | Transporters | +++++

Figure 4.

Functional composition of the minimum gene set that differentiates acute rejection (AR) cases into three phenotypic classes. Prediction analysis of microarrays (PAM) class prediction software was used to analyze the differentially expressed genes within the acute rejection cases. An expression threshold of 2.5 on the log2 scale (corresponding to genes with a greater than fivefold average expression difference among samples) narrows the gene list to 97 genes. See Sarwal et al. (4) and the accompanying web supplement (http://www-genome.stanford.edu/rejection/) for more details on the samples and all expression data used in this analysis.
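The relationship between a log2 threshold and a fold change can be checked directly: a difference of 2.5 on the log2 scale corresponds to 2^2.5, or about 5.7-fold, which is consistent with the "greater than fivefold" description. The sketch below shows this arithmetic and a simple filter on class-mean differences. It is only an illustration of the threshold; it is not the nearest shrunken centroid procedure that PAM itself implements, and the function names and example values are hypothetical.

```python
def fold_change_from_log2(threshold_log2):
    """Convert a log2 expression difference into a linear fold change."""
    return 2.0 ** threshold_log2

def informative_genes(class_means_log2, threshold_log2=2.5):
    """Keep genes whose class means differ by more than the log2 threshold.

    class_means_log2: dict mapping gene -> list of mean log2 expression
    values, one per class (e.g. AR-I, AR-II, AR-III).
    """
    return [
        gene
        for gene, means in class_means_log2.items()
        if max(means) - min(means) > threshold_log2
    ]

print(round(fold_change_from_log2(2.5), 2))

# Hypothetical class means on the log2 scale for two genes
example = {
    "gene1": [3.0, 0.0, -0.2],   # range 3.2, exceeds the threshold
    "gene2": [1.0, 0.5, 0.0],    # range 1.0, filtered out
}
print(informative_genes(example))
```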

Specific genes associated with clinical events or outcomes might be considered as potential targets for individualized drug therapy. For example, the association between CD20+ lymphocyte infiltration and graft loss suggests that early treatment of these patients with anti-CD20 monoclonal antibody (Rituximab®, Genentech Inc., San Francisco, CA, USA) may be beneficial. The role of cell-cycle inhibitors such as Rapamycin could be questioned in AR-III patients, especially given that the CAN samples that co-cluster with AR-III have chronic graft injury and negative intragraft staining for proliferating cell nuclear antigen (PCNA). The CAN signature was surprisingly homogeneous in this study, with no identifying features across samples with regard to causation or course of progression. Chronic allograft nephropathy may therefore need to be sampled relatively early and serially post-transplantation; once established, it may have little value for deciphering underlying molecular causes.

Limitations of Microarray Analysis

Although DNA microarrays offer several advantages, the technology currently has many specific limitations. Some of the common limitations and their solutions are listed in Table 4. Here we discuss a few of the more prominent problems in greater depth.

Table 4.  Limitations in microarray analysis and approaches to solve them
Problem | Solution
Data variability, especially for genes with very low expression levels | Run replicate arrays on each sample to reduce false-positives
Small sample amounts, which limits replication | Use of amplified RNA
Unequal labeling efficiency of fluorescent dyes | Reciprocal labeling to confirm observations
Prone to obtain false-positive correlations | Use high-stringency statistical cut-off and multiple methods to confirm associations
Provides no information regarding protein expression levels and function | Confirm with other biochemical analysis for protein expression (e.g. immunohistochemistry, protein arrays)

Sampling source

Transplant tissue biopsy samples used for microarray analysis contain a mixture of different cell types. Thus, with the exception of cell-specific genes (e.g. E-selectin), the cellular source of the mRNA is unknown, which limits our ability to interpret the cellular signatures. To address this concern, laser-capture microdissection of cellular subtypes of interest followed by RNA amplification and microarray analysis has been attempted with success (16). An additional method is to study genes of interest at the protein level, either by immunohistochemistry in specific samples of interest or by the use of tissue microarrays (17). The latter allow the simultaneous examination of hundreds of tissues of interest with numerous different antibodies per sample. Comprehensive systems for high-throughput analysis and storage of tissue microarray data are available at http://genome-www.stanford.edu/TMA/index.shtml.

Sample amplification

The amount of total RNA extracted from either blood or biopsy samples is generally insufficient for DNA microarray hybridization, so the RNA must be amplified to produce larger quantities of antisense RNA (aRNA) for subsequent hybridization. Amplification may be performed in one or two successive rounds depending on the amount of starting material available. Biopsy samples often require two rounds of amplification to produce sufficient aRNA for labeling, although recent improvements in amplification protocols permit a single round of amplification to be performed with adequate yield. Typically, the amplification protocol produces sufficient aRNA for up to three hybridizations, each using 5 µg of RNA.

Controlling variability in expression measurements

Variability in microarray measurements can be significant, especially for genes with low expression levels. Replication is recommended to establish a high degree of confidence in the results and to reduce the number of potential false-positive results. However, this may be difficult owing to the high cost of processing microarrays or an insufficient amount of sample available for study. Factors specific to microarray experiments that add to data variability include: 1) insufficient total RNA from samples, which requires amplification steps that may introduce bias (see further discussion later in this review); 2) unequal efficiency of fluorescent dye labeling during reverse transcription, which adds to potential bias in results; and 3) cross-hybridization or adverse secondary structure, which can reduce the ability of certain DNA elements on the array to detect specific transcripts or cause them to fail altogether. Alternative and more conventional techniques for measuring RNA abundance such as Northern blotting, RNase protection or real-time PCR may be used to verify a subset of results.
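One common way to control for unequal dye labeling efficiency is reciprocal labeling (a dye swap), in which the same test and reference samples are hybridized twice with the dyes exchanged. The sketch below is a minimal illustration of how the two log ratios can be combined so that a gene-specific dye bias cancels; the function name, argument names and intensities are assumptions made for the example, not a description of any specific published protocol.

```python
import math

def dye_swap_log_ratio(test_cy5, ref_cy3, test_cy3, ref_cy5):
    """Combine a forward and a dye-swapped hybridization for one gene.

    Forward array: test labeled with Cy5 (red), reference with Cy3 (green).
    Swapped array: test labeled with Cy3, reference with Cy5.
    Returns the average log2(test/reference), in which an additive
    dye bias on the log scale cancels out.
    """
    forward = math.log2(test_cy5 / ref_cy3)   # log2(R/G) = log2(test/ref)
    reverse = math.log2(ref_cy5 / test_cy3)   # log2(R/G) = log2(ref/test)
    return (forward - reverse) / 2.0

# Hypothetical intensities for one gene on the two arrays
m = dye_swap_log_ratio(test_cy5=4000.0, ref_cy3=1000.0,
                       test_cy3=2000.0, ref_cy5=1000.0)
print(m)
```

If the forward and swapped arrays disagree strongly for a gene, that gene is a candidate for dye-related artifact and is worth confirming by an independent method.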

Further functional analyses

DNA microarrays provide results on mRNA expression levels, which do not necessarily correlate with protein expression levels (18). Post-translational modifications that alter protein activity, such as phosphorylation, cannot be measured using these arrays. Thus, array results provide an incomplete view of the functional significance of the differentially expressed genes. Techniques for protein analysis such as Western blotting, two-dimensional polyacrylamide gel electrophoresis, radio-ligand receptor binding, chromatographic separation and detection, as well as mass spectrometry remain indispensable for elucidating protein levels or function (19). Rapid advances are now being made in enabling technologies (i.e. consistent antibody library production and cost-effective slide production methods). These advances should make possible the development and use of protein microarrays as a key platform to complement DNA microarrays.

Analysis of Microarray Data

The large data sets generated in a typical microarray experiment require sophisticated data analysis software (e.g. 50 samples and 30 000 genes yield 1.5 million expression measurements). Three broad steps are involved in data analysis: data normalization, data filtering, and pattern identification. Data must first be normalized so that expression levels can be compared accurately. The data set is then reduced by eliminating spots of poor quality or genes expressed below a defined threshold. Finally, clustering and visualization programs are used to identify fundamental gene expression patterns inherent in the massive data sets (unsupervised analysis). Such patterns suggest possible biological or clinical relevance when strict statistical significance measures are used to interpret the results. Genes associated with specific diseases can readily be identified using supervised analysis tools. One such tool is significance analysis of microarrays (SAM), in which normalized significance scores and false discovery rates (FDR) are computed and reported. When very high-stringency cut-offs are selected (i.e. the 90th percentile for FDR and q-score < 5%), the risk of identifying a false-positive association is minimized. However, genes with low baseline expression and small but significant changes that may be important in the process of rejection will be missed if only highly significant data are retrieved in array studies. Data analysis software tools such as SAM are currently available from either public or private sources (e.g. http://genome-www4.stanford.edu/MicroArray/SMD/restech.html) (20) (Table 1) or from commercial suppliers. As with other types of scientific experiments, microarray experiments are subject to random fluctuations resulting from experimental procedures, which need to be separated from inherent biological variation. To mitigate the effects of noise in the data, it is advantageous to perform replicate experiments and to include multiple probes in the arrays.
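The first two steps described above, normalization and filtering, can be sketched as follows. This is a minimal illustration and not the procedure used by SAM or by any particular software package. It assumes two-color log2 ratios arranged as genes (rows) by arrays (columns), with `None` marking spots flagged as poor quality. Per-array median centering is used as one simple normalization choice, and the function names and thresholds (`min_present`, `min_abs_log2`, `min_arrays`) are hypothetical values chosen for the example.

```python
from statistics import median

def normalize_arrays(log_ratios):
    """Median-center each array (column) so arrays are comparable.

    log_ratios: list of rows (genes); each row holds one log2 ratio per
                array, or None for a spot flagged as poor quality.
    Returns a new matrix; the input is left unchanged.
    """
    centered = [row[:] for row in log_ratios]
    for j in range(len(log_ratios[0])):
        valid = [row[j] for row in log_ratios if row[j] is not None]
        shift = median(valid)
        for row in centered:
            if row[j] is not None:
                row[j] -= shift
    return centered

def filter_genes(genes, matrix, min_present=0.8, min_abs_log2=1.0, min_arrays=2):
    """Keep genes measured on enough arrays and changed on enough arrays.

    A gene is kept if at least `min_present` of its spots are valid and
    at least `min_arrays` of them have |log2 ratio| >= `min_abs_log2`.
    """
    kept = {}
    for gene, row in zip(genes, matrix):
        present = [v for v in row if v is not None]
        if len(present) / len(row) < min_present:
            continue  # too many poor-quality spots
        if sum(abs(v) >= min_abs_log2 for v in present) < min_arrays:
            continue  # expression change below the chosen threshold
        kept[gene] = row
    return kept

# Invented toy data: 5 genes measured on 3 arrays.
genes = ["geneA", "geneB", "geneC", "geneD", "geneE"]
raw = [
    [2.5, 2.0, 3.0],
    [0.5, 0.0, 1.0],
    [0.5, None, None],
    [-1.5, -2.0, -1.0],
    [0.5, 0.0, 1.0],
]
normalized = normalize_arrays(raw)
kept = filter_genes(genes, normalized)
print(sorted(kept))  # ['geneA', 'geneD']
```

Only the filtered genes would then be passed on to clustering or to a supervised tool. Looser thresholds retain more low-expression genes at the cost of more noise, which is the trade-off noted in the text.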

Classification accuracy in microarray analysis is complicated by the fact that there are many more attributes (genes) than objects (samples) that we are trying to classify (21). Methods that have been implemented to discover and simplify sample classification include clustering methods, compound covariate prediction, fuzzy logic, shrunken centroid gene list filtering, and neural networks. Detailed comparison of these methods is beyond the scope of this review, although we have recently surveyed their use in the context of transplantation research elsewhere (22). Two tools that have been implemented by our group for class prediction are PAM (15) and ArrayTools (21). Both ArrayTools and PAM run cross-validation during the prediction process using a training set of arrays (ArrayTools using a leave-one-out method and PAM using 10 iterations across the data).
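Leave-one-out cross-validation of a class predictor can be sketched as below. This is an illustration only and does not reproduce the PAM or ArrayTools implementations: it uses a plain nearest-centroid classifier without the centroid shrinkage that PAM applies, and the class labels ("AR", "stable"), function names and expression values are hypothetical and invented for the example.

```python
import math

def centroid(samples):
    """Mean expression of each gene across a list of samples."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def nearest_centroid_predict(train_x, train_y, sample):
    """Assign `sample` to the class whose centroid is closest (squared Euclidean)."""
    by_class = {}
    for x, y in zip(train_x, train_y):
        by_class.setdefault(y, []).append(x)
    best_label, best_dist = None, math.inf
    for label in sorted(by_class):
        c = centroid(by_class[label])
        dist = sum((a - b) ** 2 for a, b in zip(sample, c))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(samples, labels):
    """Hold out each sample in turn, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

# Invented toy data: 6 samples, 3 genes each (log2 ratios).
samples = [
    [2.0, 1.5, 0.1], [1.8, 1.7, -0.2], [2.2, 1.4, 0.0],    # hypothetical "AR"
    [0.1, -0.1, 0.2], [-0.2, 0.2, -0.1], [0.0, 0.1, 0.1],  # hypothetical "stable"
]
labels = ["AR", "AR", "AR", "stable", "stable", "stable"]

print(leave_one_out_accuracy(samples, labels))
print(nearest_centroid_predict(samples, labels, [1.9, 1.6, 0.0]))
```

Because genes far outnumber samples, any gene selection step would need to be repeated inside each cross-validation round; selecting genes on the full data set first would make the estimated accuracy overly optimistic.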

Repositories for gene expression data

Microarray data sets are too large to be published in their entirety in scientific journals. It is, however, imperative that array study data be available to the entire scientific community to advance research. Standard data formats to permit sharing of microarray-based expression data have now been defined and widely adopted (23). Further, several major journals (23–25) now require that all raw data be archived in public or private gene expression databases such as the Stanford Microarray Database (SMD), ExpressDB, the Gene Expression Database (GXD) and Gene Expression Omnibus (GEO). For up-to-date journal lists see http://www.ncgr.org/genex/ and http://www.biologie.ens.fr/en/genetiqu/puces/bddeng.html. So far, however, there is no unified expression archive system comparable to the GenBank and EMBL sequence databases.

Concluding Remarks

The complex interplay of various genes orchestrating the immune system results in a myriad of responses in transplant recipients, varying from rejection to tolerance. Currently, too much or too little medication is often given because therapy is not individualized. Unfortunately, this results in confounding clinical outcomes ranging from rejection to malignancy. Resolution of these issues may be possible by using either mRNA expression profiling or global genotyping assays to find genes that help predict how much immunosuppression is needed. Mutational profiles of informative genetic polymorphisms in drug-metabolizing enzymes or other genes could then be added to the blood group and HLA-typing panels that are now the mainstay of transplant treatment. By studying differences that exist in peripheral blood or in the graft and its local environment (urine in kidney transplants, bronchial lavage in lung transplants, bile in liver transplants, etc.), informative biomarkers may be discovered to improve treatment programs. The behavior of genes orchestrated in the post-transplant setting may be informative in monitoring rejection susceptibility, ascertaining ‘transplant donor–recipient compatibility’, assessing risk and stratifying cases of acute rejection or chronic injury, as well as predicting long-term graft outcomes and acceptance. Ultimately, microarray technology may be most valuable in the clinical transplant setting for targeted diagnosis, treatment, monitoring and prognosis, using a limited repertoire of genes cherry-picked for their informativeness from across the entire human genome.

Acknowledgments

We gratefully acknowledge the assistance of Szu-Chuan Hsieh in data generation for the acute rejection study. Sheryl Shah and Rosa Liu provided administrative assistance on preparation of the manuscript and Karen Woodward (GeneXP Biosciences, Inc.) provided the 96-well ‘array of array’ plate image for Figure 1. The experiments summarized in Figure 2 were conducted by Curtis Kautzer and Xing-Jian Lou under the direction of ESM at diaDexus, LLC (Santa Clara, CA). Permission to publish the results from this feasibility study was graciously given to her by Dr John Burczak. We would also like to acknowledge the support of the NIH and the Packard Foundation for the microarray work conducted in the Sarwal laboratory. Technical assistance on array data storage and analysis by Dr Gavin Sherlock and staff of the Stanford Microarray Database is also gratefully acknowledged.
