Molecular Diagnosis of T Cell-Mediated Rejection in Human Kidney Transplant Biopsies


  • J. Reeve
    1. Alberta Transplant Applied Genomics Centre, University of Alberta, Edmonton, Alberta, Canada
    2. Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, Alberta, Canada
  • J. Sellarés
    1. Alberta Transplant Applied Genomics Centre, University of Alberta, Edmonton, Alberta, Canada
    2. Servei de Nefrologia, Hospital de la Vall d'Hebron, Barcelona, Spain
  • M. Mengel
    1. Alberta Transplant Applied Genomics Centre, University of Alberta, Edmonton, Alberta, Canada
    2. Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, Alberta, Canada
  • B. Sis
    1. Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, Alberta, Canada
  • A. Skene
    1. Alberta Transplant Applied Genomics Centre, University of Alberta, Edmonton, Alberta, Canada
  • L. Hidalgo
    1. Alberta Transplant Applied Genomics Centre, University of Alberta, Edmonton, Alberta, Canada
    2. Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, Alberta, Canada
  • D. G. de Freitas
    1. Department of Renal Medicine, Manchester Royal Infirmary, Manchester, UK
  • K. S. Famulski
    1. Alberta Transplant Applied Genomics Centre, University of Alberta, Edmonton, Alberta, Canada
    2. Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, Alberta, Canada
  • P. F. Halloran (corresponding author)
    1. Department of Medicine, University of Alberta, Edmonton, Alberta, Canada
    2. Alberta Transplant Applied Genomics Centre, University of Alberta, Edmonton, Alberta, Canada

Corresponding author: Philip F. Halloran


Histologic diagnosis of T cell-mediated rejection is flawed by subjective assessments, nonspecific lesions and arbitrary rules. This study developed a molecular test for T cell-mediated rejection. We used microarray results from 403 kidney transplant biopsies to derive a classifier assigning T cell-mediated rejection scores to all biopsies, and compared these with histologic assessments. The score correlated with histologic lesions of T cell-mediated rejection (infiltrate, tubulitis). The accuracy of the classifier for the histology diagnoses was 89%. Very high and low molecular scores corresponded with unanimity among three pathologists on the presence or absence of T cell-mediated rejection, respectively. The molecular score had low sensitivity (50%) and positive predictive value (62%) for the histology diagnoses. However, histology showed similar disagreement between pathologists—only 45–56% sensitivity of one pathologist with diagnoses of T cell-mediated rejection by another. Discrepancies between molecular scores and histology were mostly when histology was ambiguous (“borderline”) or unreliable, e.g. in cases with scarring or inflammation induced by tissue injury. Vasculitis (isolated v-lesion TCMR) was particularly discrepant, with most cases exhibiting low TCMR scores. We propose new rules to integrate molecular tests and histology into a precision diagnostic system that can reduce errors, ambiguity and interpathologist disagreement.


Abbreviations: ABMR, antibody-mediated rejection; AKI, acute kidney injury; TCMR, T cell-mediated rejection


T cell-mediated rejection (TCMR) remains important as a prototype for T cell-mediated inflammatory diseases, a target for immunosuppressive drug development, an end point for drug trials and a feature of the early transplant course, particularly with new treatment protocols [1,2]. However, histologic diagnosis of TCMR by the Banff classification has serious limitations. It was derived as “acute rejection” before antibody-mediated rejection (ABMR) was defined, and relies on nonspecific lesions, arbitrary rules and subjective interpretations [3] that have never been validated against an objective independent test. The principal rule for diagnosing TCMR requires semiquantitative assessment of two lesions, interstitial inflammation (i-score) and tubulitis (t-score), using arbitrary thresholds. Both lesions are observed in other conditions such as acute kidney injury (AKI) and primary renal diseases [3-5]. In protocol biopsies, both correlate with previous injury, but not with rejection [6], because the inflammatory response to injury can produce these lesions. Many biopsies with inflammation and tubulitis below the thresholds are called ‘borderline’. Furthermore, these lesions must be assessed in unscarred areas, rendering the diagnosis of TCMR impossible in biopsies with advanced atrophy-scarring. The second rule for diagnosing TCMR relies on isolated v-lesions: intimal arteritis without sufficiently high i- and t-lesions (v > 0 and (i < 2 or t < 2)). This rule is suspect because v-lesions can result from ABMR and AKI [7,8], and will probably change at the next Banff meeting [8].

In addition, TCMR diagnoses are poorly reproducible within the rules. Furness et al. found that agreement among pathologists using the same rules was not only limited but resistant to improvement [9,10]. These authors concluded that ‘…international variation in histologic grading is large, underrecognized, difficult to improve, and almost certainly of major clinical relevance. Urgent steps are required to improve this area of clinical practice.’

Given the absence of a reliable gold standard, molecular phenotyping presents an alternative approach, one that is already in clinical use in other areas of medicine, particularly cancer [11-14]. Molecular studies of transplant biopsies using RT-PCR [15,16] and microarrays [17-19] have shown promise, but none distinguished between TCMR and ABMR. The importance of this distinction is underscored by the discovery that most ABMR is missed by the Banff criteria requiring C4d staining [20-22]. As a result, all earlier studies must now be reinterpreted: many biopsies previously called ‘rejection’ or TCMR are actually mixed TCMR/ABMR.

This study aimed to develop a molecular test for TCMR. We performed microarrays on 403 kidney transplant indication biopsies that had been assigned histology diagnoses incorporating both the Banff classification and new knowledge such as C4d-negative ABMR. The molecules that best distinguished TCMR from other diseases were used to develop classifier equations that assigned TCMR scores to every biopsy by cross-validation. We compared the scores to histologic lesions and diagnoses, and studied the discrepancies in relationship to known limitations of histology. The assumption was that TCMR can be detected by a molecular signal that will clarify biopsies that are ambiguous or unclassifiable by histology, e.g. scarred biopsies. This approach allowed us to estimate the potential error rate in the histologic diagnosis of TCMR, and to suggest how the molecular test can be used to improve biopsy assessment.


Patient population, specimens and data collection

Written informed consent was obtained from all study patients. The study was approved by the University of Alberta Health Research Ethics Board (Issue # 5299), by the University of Illinois, Chicago (protocol # 2006-0544) and by the University of Minnesota (protocol # 0606M87646). Consenting patients undergoing a renal transplant biopsy for clinical indication (dysfunction or proteinuria) as standard of care between 09/2004 and 11/2008 were included. In addition to the cores for standard histology, one extra core was collected for microarray analysis and processed as previously described [23]. Eight normal kidney specimens were also obtained, from native nephrectomies for renal carcinoma.

Histology and diagnostic classification

C4d staining was performed by indirect immunofluorescence on frozen sections using a monoclonal anti-C4d (Quidel, San Diego, CA, USA). Biopsies were classified using a modified Banff classification [24] including C4d-negative ABMR and probable ABMR. C4d-negative ABMR was defined by donor-specific antibody (DSA), nondiffuse C4d staining (C4d 0 - C4d 2) [2] and at least one microcirculation lesion: peritubular capillaritis (ptc > 1), glomerulitis (g > 0), thromboses, or transplant glomerulopathy (cg > 0). Probable ABMR was similar but included anti-HLA antibody that was not donor specific (NDSA). Mixed rejection was diagnosed when both ABMR and TCMR were present. Glomerulonephritis (GN) included vasculitis and other glomerular diseases. Biopsies classified as ‘no major abnormalities’ had a ci-score (interstitial fibrosis) <2 and no specific disease features. AKI was defined as biopsies for clinical indications before 42 days posttransplant that lacked disease features. ‘Atrophy-fibrosis’ was defined as a ci-score of 2 or 3, without specific disease features. ‘Others’ included uncommon entities: C4d deposition with no pathology (n = 5), thrombotic microangiopathy (n = 3), suspicious viral nephropathy (n = 3), transplant glomerulopathy (n = 2), posttransplant lymphoproliferative disorder (n = 1), oxalosis (n = 1), obstruction (n = 1), acute pyelonephritis (n = 1), tubulo-interstitial nephritis (n = 1). Two biopsies were of marginal adequacy. All biopsies were scored by pathologists ‘A’ and ‘B’, and a subset of 245 was scored by a third pathologist ‘C’. This subset was selected by availability of virtual slides without knowledge of previous diagnoses. The virtual slides included scans of H&E, PAS and trichrome sections, which pathologist A or B selected for digitization during their review as being representative of all diagnostically relevant lesions in each case.

The reference standard diagnoses reflected the assessment of histology by two pathologists (‘A’ and ‘B’), interpreted by two transplant nephrologists (DDF and JS). Diagnoses were assigned before molecular classifiers were created, independent of molecular measurements.

HLA antibody screening

Of 403 biopsies, 363 had serum available at the time of biopsy for HLA antibody testing. The testing method varied among centers. Antibody specificities were determined either by Luminex single antigen beads, mostly by FlowPRA® single antigen I and II beads (One Lambda, Canoga Park, CA, USA) after a positive screening test using FlowPRA® beads, or by positive flow cytometry or CDC-AHG crossmatch. Specificities to Cw were not assigned as DSA. Donor HLA typing for DP or DQA1 was not performed.

Microarray analyses

With Robust Multiarray Averaging (RMA), every time a new sample is added, the batch of samples must be renormalized, resulting in slight alterations of the existing values for all microarrays. To avoid this, we used a modified RMA, implemented in the Bioconductor ‘RefPlus’ package, which allows a single reference dataset to be normalized with RMA, with subsequent samples normalized against this reference set without altering the original set's values. Our reference set was 200 randomly selected biopsies from this population.
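The idea of normalizing later arrays against a frozen reference can be illustrated for the quantile step alone. The following is a simplified sketch, not the RefPlus implementation: RMA also includes background correction and probe-set summarization, which are omitted here, and the function names and toy intensities are hypothetical.

```python
from statistics import mean

def reference_quantiles(reference_samples):
    """Average the sorted intensities across reference arrays, rank by rank."""
    sorted_arrays = [sorted(sample) for sample in reference_samples]
    return [mean(values) for values in zip(*sorted_arrays)]

def normalize_against_reference(sample, ref_quantiles):
    """Replace each value in a new array by the frozen reference quantile of its rank.

    Ties are broken by position (stable sort); the reference is never modified.
    """
    order = sorted(range(len(sample)), key=lambda idx: sample[idx])
    normalized = [0.0] * len(sample)
    for rank, idx in enumerate(order):
        normalized[idx] = ref_quantiles[rank]
    return normalized

# Toy reference set: 2 arrays x 3 probes (illustrative values only).
reference = [[2.0, 4.0, 6.0], [3.0, 5.0, 7.0]]
frozen = reference_quantiles(reference)        # computed once from the reference set
new_sample = [9.0, 1.0, 5.0]                   # an array processed later
normalized_sample = normalize_against_reference(new_sample, frozen)
```

Because `frozen` is computed once, adding a new array changes only that array's values, which is the property the modified RMA is used for here.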

The 54,675 probe sets on the arrays were subjected to nonspecific interquartile range (IQR) filtering across 403 samples. Probe sets with IQR less than 0.5 log2 units were removed from further analyses. All analyses and graphics were done using the ‘R’ software package, version 2.12.1 (64-bit), with various libraries from Bioconductor 2.8. Microarray expression files are posted on the Gene Expression Omnibus website (GSE36059).
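The nonspecific filtering step can be sketched as follows. This is a minimal illustration using Python's standard library rather than the R code used in the study; the probe names, the toy values and the choice of quantile convention (`method="inclusive"`) are assumptions.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range (Q3 - Q1) of log2 expression values across samples."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

def filter_by_iqr(expression, min_iqr=0.5):
    """Keep probe sets whose IQR across samples is at least min_iqr log2 units."""
    return {probe: vals for probe, vals in expression.items() if iqr(vals) >= min_iqr}

# Toy log2 expression values for two hypothetical probe sets across five samples.
expression = {
    "probe_low_variation": [5.0, 5.1, 5.2, 5.3, 5.4],
    "probe_high_variation": [4.0, 5.0, 6.0, 7.0, 8.0],
}
kept = filter_by_iqr(expression)
```

The filter is called nonspecific because it uses no diagnostic labels, so applying it to all samples before cross-validation does not leak class information.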

Classifier construction and cross-validation

We used linear discriminant analysis (LDA), implemented in the ‘classification’ function of the ‘CMA’ library of ‘R’, to build the molecular classifier. The 403 biopsies were split randomly into 10 approximately equally sized groups (folds). One fold was left out as a test set, while the other nine were used to train the classifier (Figure 1). Some biopsies were withheld from the training sets because of ambiguous or uncertain diagnostic status: borderline; mixed rejection; biopsies in which TCMR was diagnosed by isolated v-lesions; and biopsies called TCMR by only one of the two pathologists (‘A’ and ‘B’) (reasoning that such samples were less likely to be true TCMR). The top 20 probe sets in each training set were selected by Welch's t-test comparing histologic TCMR to non-TCMR. The classifier equation, comprising the top 20 probe sets and their coefficients, was applied to every biopsy in the left-out test set, with no biopsies withheld. This was repeated for each of the 10 folds, so that every biopsy was in a test set once and was assigned a single TCMR score. The entire procedure was repeated a further 99 times, assigning 100 TCMR scores to every biopsy, a method known as replicated K-fold cross-validation [25,26].
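The probe-set selection step inside each training set can be sketched as below. This is an illustrative simplification: it ranks by the absolute Welch t statistic instead of the p-value (which also depends on the Welch degrees of freedom), and the function names and toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

def top_probe_sets(expression, is_tcmr, n_top=20):
    """Rank probe sets by |t| between TCMR and non-TCMR training samples."""
    ranked = []
    for probe, values in expression.items():
        tcmr = [v for v, flag in zip(values, is_tcmr) if flag]
        other = [v for v, flag in zip(values, is_tcmr) if not flag]
        ranked.append((abs(welch_t(tcmr, other)), probe))
    ranked.sort(reverse=True)
    return [probe for _, probe in ranked[:n_top]]

# Toy training set: three TCMR and three non-TCMR samples, two hypothetical probe sets.
is_tcmr = [True, True, True, False, False, False]
expression = {
    "probe_up_in_tcmr": [9.0, 9.5, 10.0, 5.0, 5.5, 6.0],
    "probe_unchanged": [5.0, 6.0, 7.0, 5.5, 6.5, 6.0],
}
selected = top_probe_sets(expression, is_tcmr, n_top=1)
```

Running the selection on training samples only, separately for each fold, is what keeps the test-set scores free of selection bias.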

Figure 1.

Flowchart depicting the multiple 10-fold cross-validation procedure.

The median of the 100 scores for each biopsy was used as its final score. The classifier output is a score between 0.0 and 1.0, reflecting the degree of certainty associated with a prediction of TCMR. The withholding rules described above resulted in 28 TCMRs and 304 non-TCMRs being included in the training sets, with 71 biopsies removed. All 403 biopsies, plus the 8 normal kidney specimens, were included in the test sets. All aspects of classifier training, including gene selection, were done from scratch within each training set. No information from any test sample was used to build the corresponding training-set derived equations.
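The replicated K-fold procedure can be sketched as follows. This is a minimal outline under stated assumptions, not the CMA-based pipeline used in the study: `fit_and_score` is a hypothetical callback standing in for feature selection plus LDA, and the toy data and threshold classifier exist only to make the example runnable.

```python
import random
from statistics import median

def replicated_kfold_scores(sample_ids, trainable, fit_and_score, k=10, repeats=100, seed=0):
    """Median test-set score per sample over `repeats` random k-fold splits.

    fit_and_score(train_ids, test_ids) -> {test_id: score} must fit using train_ids only.
    """
    rng = random.Random(seed)
    all_scores = {sid: [] for sid in sample_ids}
    for _ in range(repeats):
        shuffled = list(sample_ids)
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]   # k roughly equal folds
        for test_ids in folds:
            held_out = set(test_ids)
            # Ambiguous samples are withheld from training but are still scored.
            train_ids = [s for s in shuffled if s not in held_out and s in trainable]
            for sid, score in fit_and_score(train_ids, test_ids).items():
                all_scores[sid].append(score)
    return {sid: median(vals) for sid, vals in all_scores.items()}

# Toy data: one hypothetical feature per biopsy; b9 and b10 play the ambiguous cases.
values = {f"b{i}": float(i) for i in range(20)}
labels = {sid: v >= 10 for sid, v in values.items()}
trainable = {sid for sid in values if sid not in {"b9", "b10"}}

def toy_fit_and_score(train_ids, test_ids):
    """Stand-in classifier: threshold midway between the two training class means."""
    pos = [values[s] for s in train_ids if labels[s]]
    neg = [values[s] for s in train_ids if not labels[s]]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return {s: 1.0 if values[s] > threshold else 0.0 for s in test_ids}

scores = replicated_kfold_scores(list(values), trainable, toy_fit_and_score, k=5, repeats=10)
```

Each biopsy appears in exactly one test fold per repetition, so it accumulates one score per repetition, and the median summarizes them.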


Study population

We studied 403 consecutive prospectively collected biopsies for clinical indications, taken 6 days to 36 years posttransplant from 315 consenting kidney transplant recipients (Table 1). Because every biopsy was included, this population represents the full spectrum of conditions encountered in patients presenting for indication biopsies.

Table 1. Demographics comparing patients and their biopsies: TCMR versus all other conditions
| | All patients (n = 315) | Patients with a biopsy diagnosed as TCMR (n = 22) | All other patients (n = 293) | p-Value (TCMR vs. all others) |
| --- | --- | --- | --- | --- |
| Mean recipient age | 43 (5–81) | 48 (22–69) | | ns |
| Recipient gender (% male) [n = 315] | 196 (62%) | 14 (64%) | 182 (62%) | ns |
| Race [n = 315] | | | | |
| Caucasian | 201 (64%) | 14 (64%) | 187 (64%) | ns |
| Black | 33 (10%) | 1 (5%) | 32 (11%) | ns |
| Other | 81 (26%) | 7 (32%) | 74 (25%) | ns |
| Primary disease [n = 315] | | | | |
| Diabetic nephropathy | 64 (20%) | 2 (9%) | 62 (21%) | ns |
| Hypertension/large vessel disease | 29 (9%) | 2 (9%) | 27 (9%) | ns |
| Glomerulonephritis/vasculitis | 119 (38%) | 9 (41%) | 110 (38%) | ns |
| Interstitial nephritis/pyelonephritis | 20 (6%) | 1 (5%) | 19 (6%) | ns |
| Polycystic kidney disease | 46 (15%) | 6 (27%) | 40 (14%) | ns |
| Others | 15 (5%) | 1 (5%) | 14 (5%) | ns |
| Unknown etiology | 22 (7%) | 1 (5%) | 21 (7%) | ns |
| Mean donor age | 40 (2–70) | 42 (19–69) | 40 (2–70) | ns |
| Donor gender (% male) [n = 263] | 121 (46%) | 9 (45%) | 112 (46%) | ns |
| Donor type (% deceased donor transplants) [n = 310] | 152 (49%) | 10 (45%) | 142 (49%) | ns |

| Clinical characteristics at time of biopsy [n = 403] | All biopsies (n = 403) | Biopsies with TCMR (n = 35) | Biopsies other than TCMR, all others (n = 368) | p-Value (TCMR vs. all others) |
| --- | --- | --- | --- | --- |
| Median and range time from transplant to biopsy (mo) | 17 (0.2–428) | 4.9 (0.3–129) | 20 (0.2–428) | 0.001 |
| Indication for biopsy | | | | |
| Primary nonfunction | 10 (2%) | 2 (6%) | 8 (2%) | ns |
| Rapid deterioration of graft function | 96 (24%) | 8 (23%) | 88 (26%) | ns |
| Slow deterioration of graft function | 150 (37%) | 11 (31%) | 139 (41%) | ns |
| Stable impaired graft function | 71 (18%) | 7 (20%) | 64 (19%) | ns |
| Investigate proteinuria | 38 (9%) | 4 (11%) | 34 (10%) | ns |
| Follow-up from previous biopsy | 14 (3%) | 1 (3%) | 13 (4%) | ns |
| Others | 9 (2%) | 0 (0%) | 9 (3%) | ns |
| Indication unknown | 15 (4%) | 2 (6%) | 13 (4%) | ns |
| Maintenance immunosuppressive regimens at biopsy | | | | |
| MMF, tacrolimus, steroid | 176 (44%) | 15 (43%) | 161 (44%) | ns |
| MMF, cyclosporine, steroid | 101 (25%) | 7 (20%) | 94 (26%) | ns |
| Others | 126 (31%) | 13 (37%) | 113 (31%) | ns |

All biopsies were read by two pathologists, with a subset of 245 biopsies read by a third pathologist. Each biopsy was assigned histologic diagnoses using a new reference standard classification adapted by incorporating C4d-negative ABMR [21] and probable ABMR [27]. The reference standard diagnoses were based on independent assessments by pathologists A and B, integrated by two transplant nephrologists (see the Methods section).

The reference standard histologic diagnostic groups are shown in Table 2. The time of biopsy posttransplant was earlier for TCMR than for other diagnoses (4.9 versus 20 months).

Table 2. Diagnoses assigned by the reference standard system based on histology, C4d staining and DSA
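| Histology-DSA diagnosis | Early (<1 year) | Late (>1 year) | All |
| --- | --- | --- | --- |
| C4d-positive ABMR | 3 | 14 | 17 |
| C4d-negative ABMR | 3 | 45 | 48 |
| Probable ABMR | 1 | 9 | 10 |
| C4d-positive mixed rejection | 0 | 9 | 9 |
| C4d-negative mixed rejection | 2 | 11 | 13 |
| Polyoma virus nephropathy | 10 | 2 | 12 |
| Acute kidney injury | 50 | 0 | 50 |
| No major abnormalities | 33 | 39 | 72 |
| Other uncommon diagnoses | 12 | 8 | 20 |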

Agreement among pathologists

Agreement between pathologists A and B (i.e. the accuracy of one for predicting the other's diagnoses) on a diagnosis of TCMR/mixed versus non-TCMR (calling isolated v-lesions non-TCMR) in the 403-biopsy dataset was 81%. The agreement of the null model (diagnosing all samples as non-TCMR) with pathologist A was 90%. This demonstrates that overall agreement or accuracy is not the most relevant statistic when the ratio of the classes is far from 50:50. We therefore present the sensitivities (the proportion of biopsies called TCMR/mixed by one pathologist that are also called TCMR/mixed by another pathologist) as a more useful measure of diagnostic consensus.
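The contrast between accuracy and sensitivity under class imbalance can be shown with a short sketch. The reader calls below are toy data chosen to give a 10% TCMR rate, not the study's calls; only the six pairwise values are taken from Table 3.

```python
from statistics import mean

def accuracy(calls_x, calls_y):
    """Percent of biopsies on which two sets of calls agree."""
    return 100 * sum(x == y for x, y in zip(calls_x, calls_y)) / len(calls_x)

def sensitivity(calls_test, calls_ref):
    """Percent of biopsies called TCMR by the reference reader that the test reader also calls TCMR."""
    test_calls_on_ref_positives = [t for t, r in zip(calls_test, calls_ref) if r]
    return 100 * sum(test_calls_on_ref_positives) / len(test_calls_on_ref_positives)

# Toy data: reader A calls 10 of 100 biopsies TCMR; the null model calls none.
reader_a = [True] * 10 + [False] * 90
null_model = [False] * 100

null_accuracy = accuracy(null_model, reader_a)        # high despite detecting nothing
null_sensitivity = sensitivity(null_model, reader_a)  # exposes the failure

# Mean of the six pairwise sensitivities (%) reported for the 245-biopsy subset in Table 3.
pairwise = [59, 40, 40, 48, 37, 46]
average_pairwise_sensitivity = mean(pairwise)
```

Sensitivity is directional: B versus A uses A's positive calls as the denominator, so it generally differs from A versus B.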

The sensitivity of a diagnosis of TCMR/mixed was 67% for pathologist A versus B, and 45% for B versus A (Table 3). When biopsies with isolated v-lesions were called TCMR, the sensitivities fell to 45% and 35%, respectively. In the 245-biopsy subset read by all three pathologists, the average sensitivity among pairs of pathologists was 45%. Unanimity was uncommon: in the 75 biopsies judged by at least one pathologist to have TCMR, all three agreed in only 12 (16%).

Table 3. Sensitivity of one pathologist's diagnoses¹ for those of another, with regard to a diagnosis of TCMR or mixed rejection²

| Comparison | Sensitivity (%)³ |
| --- | --- |
| In 403 biopsies assessed by two pathologists | |
| B vs. A | 45 |
| A vs. B | 67 |
| In 245 biopsies assessed by three pathologists | |
| C vs. A | 59 |
| C vs. B | 40 |
| B vs. A | 40 |
| A vs. B | 48 |
| B vs. C | 37 |
| A vs. C | 46 |

  1. For this table, the diagnoses were made by strictly applying the Banff rules to the lesion scores recorded by the pathologists.

  2. Isolated v-lesions were not considered to be sufficient to diagnose TCMR. Sensitivity was either reduced or unchanged when isolated v-lesions were considered to be TCMR.

  3. Sensitivity, e.g. B versus A = 100% × (# biopsies agreed to be either TCMR or mixed by both B and A)/(# biopsies called TCMR or mixed by A).

Molecules selected to distinguish TCMR from other diseases

In 1000 class comparisons (10 folds in each of 100 repetitions), a total of 61 probe sets ranked among the top 20 by Welch's t-test p-value at least once. Five probe sets were selected 1000 times; 12 at least 900 times; and 26 at least 100 times. The 30 probe sets most frequently selected are shown in Table 4. A detailed analysis of these molecular findings will be published separately (manuscript in preparation).

Table 4. Molecules selected in 1000 class comparisons in the classifier algorithm
| Affymetrix ID | Description | Common gene name | TCMR¹ | Left out² | Others³ | Nephrectomy controls | Proportion of times used by classifier |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 206761_at | CD96 antigen | CD96 | 34 | 21 | 16 | 14 | 1 |
| 220485_s_at | signal-regulatory protein beta 2 | SIRPB2 | 77 | 38 | 27 | 24 | 1 |
| 235735_at | tumor necrosis factor (ligand) superfamily, member 8 | TNFSF8 | 73 | 41 | 31 | 27 | 1 |
| 236226_at | B and T lymphocyte associated | BTLA | 54 | 29 | 22 | 18 | 1 |
| 238629_x_at | Similar to Olfactory receptor 2I2 (LOC346170) | OR2I1P | 118 | 72 | 53 | 46 | 1 |
| 237753_at | interleukin 21 receptor | IL21R | 50 | 28 | 19 | 16 | 0.995 |
| 206545_at | CD28 antigen | CD28 | 27 | 16 | 13 | 11 | 0.989 |
| 236099_at | Transcribed sequences | | 61 | 35 | 27 | 25 | 0.962 |
| 210354_at | gamma interferon (human) | IFNG | 66 | 36 | 26 | 23 | 0.954 |
| 204852_s_at | protein tyrosine phosphatase, nonreceptor type 7 | PTPN7 | 131 | 81 | 63 | 56 | 0.934 |
| 210116_at | SH2 domain protein 1A | SH2D1A | 53 | 28 | 20 | 15 | 0.931 |
| 239196_at | ankyrin repeat domain 22 | ANKRD22 | 152 | 90 | 69 | 62 | 0.902 |
| 206486_at | lymphocyte-activation gene 3 | LAG3 | 58 | 32 | 26 | 24 | 0.892 |
| 228737_at | TOX high mobility group box family member 2 | TOX2 | 30 | 18 | 15 | 13 | 0.853 |
| 227030_at | IKAROS family zinc finger 3 (Aiolos) | IKZF3 | 184 | 118 | 88 | 69 | 0.708 |
| 1569225_a_at | sex comb on midleg-like 4 (Drosophila) | SCML4 | 53 | 26 | 19 | 12 | 0.621 |
| 219385_at | SLAM family member 8 | SLAMF8 | 146 | 89 | 60 | 54 | 0.531 |
| 206134_at | ADAM-like, decysin 1 | ADAMDEC1 | 168 | 55 | 21 | 10 | 0.488 |
| 205242_at | chemokine (C-X-C motif) ligand 13 | CXCL13 | 325 | 63 | 26 | 13 | 0.433 |
| 229437_at | microRNA host gene 2 (nonprotein coding); microRNA 155 | MIR155HG | 76 | 37 | 22 | 13 | 0.304 |
| 207777_s_at | SP140 nuclear body protein | SP140 | 55 | 32 | 24 | 20 | 0.301 |
| 215925_s_at | CD72 antigen | CD72 | 118 | 63 | 45 | 33 | 0.219 |
| 240070_at | T cell immunoreceptor with Ig and ITIM domains | TIGIT | 46 | 28 | 21 | 17 | 0.217 |
| 217147_s_at | T cell receptor interacting molecule | TRIM | 27 | 17 | 13 | 11 | 0.122 |
| 1552584_at | interleukin 12 receptor, beta 1 | IL12RB1 | 78 | 49 | 37 | 27 | 0.104 |
| 227458_at | programmed cell death 1 ligand 1 | PDCD1LG1 | 143 | 79 | 53 | 34 | 0.101 |
| 205758_at | CD8 antigen, alpha polypeptide (p32) | CD8A | 426 | 211 | 109 | 65 | 0.057 |
| 220423_at | phospholipase A2, group IID | PLA2G2D | 84 | 52 | 44 | 43 | 0.053 |
| 1558972_s_at | thymocyte selection pathway associated | THEMIS | 25 | 15 | 11 | 9 | 0.048 |

  1. TCMR: samples called TCMR by both pathologists excluding isolated-v biopsies (cases with intimal arteritis and no interstitial inflammation).

  2. Left out: cases with mixed rejection [n = 22], intimal arteritis and no interstitial inflammation [n = 7] and cases called TCMR by only one of the pathologists [n = 40].

  3. Others: all other biopsies for clinical indications.

Relating molecular scores to histologic diagnoses

The y-axis of Figure 2 represents TCMR scores for 403 indication biopsies, which are grouped on the x-axis by their reference standard histology diagnoses, with isolated v-lesions classified as TCMR. Eight normal kidney samples were added for comparison. Early biopsies (before 1 year) are represented by gray symbols, late biopsies by black symbols. To compare TCMR scores with histologic diagnoses, the scores were considered positive for TCMR above a cut-off of 0.1, based on visual inspection of the scores. Defining histologic TCMR as either TCMR or mixed, and everything else as non-TCMR, the classifier statistics are: accuracy 89%; sensitivity 50%; specificity 95%; positive predictive value (PPV) 62%; negative predictive value (NPV) 92%.

Figure 2.

Relationship between the TCMR score and the histological reference standard diagnoses. Circles and solid vertical lines represent the median and interquartile range (IQR) of the TCMR score over the 100 classifier iterations. The biopsies are represented by their time period posttransplantation: early (<1 year: gray circles) and late (>1 year: black circles). Ordering within each histological stack is random. The horizontal line at 0.1 divides the samples into high and low TCMR scores; this threshold was used for the calculation of accuracy statistics. Neph = nephrectomies.

Variability in TCMR scores from sampling variance—the random splits into ten folds in 100 iterations of classification—is represented by solid vertical bars in Figure 2. Biopsies with very high or low scores showed little variance; those with intermediate scores had considerably more.

Association between molecular TCMR scores and histologic lesions

Across 403 biopsies, the TCMR score was strongly associated with ‘i’ and ‘t’ lesions (Table 5), and less strongly with ‘v’. A weak association with peritubular capillaritis (‘ptc’), which is more related to ABMR, is expected because ptc lesions are sometimes induced by TCMR [28].

Table 5. Association of TCMR scores (high [> 0.1] vs. low [< 0.1]) with histology lesions (p-values for two-tailed Mann–Whitney tests)
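| Histologic lesions | Abbreviation | All biopsies [n = 403]: mean lesion score, high TCMR score [n = 47] | All biopsies [n = 403]: mean lesion score, low TCMR score [n = 356] | p-Value | TCMR, mixed and borderline [n = 99]: mean lesion score, high TCMR score [n = 36] | TCMR, mixed and borderline [n = 99]: mean lesion score, low TCMR score [n = 63] | p-Value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| TCMR-related lesions | | | | | | | |
| Interstitial inflammation | i score | 1.5 | 0.3 | 4 × 10^-18 | 1.8 | 0.8 | 5 × 10^-7 |
| Tubulitis | t score | 1.8 | 0.5 | 2 × 10^-16 | 2.2 | 1.4 | 3 × 10^-4 |
| TCMR/ABMR-related lesion | | | | | | | |
| Intimal arteritis | v score | 0.4 | 0.0 | 1 × 10− | | | |
| ABMR-related lesions | | | | | | | |
| Peritubular capillaritis | ptc score | 0.8 | 0.4 | 7 × 10− | | | |
| Glomerulitis | g score | 0.2 | 0.3 | 0.57 | 0.3 | 0.4 | 0.84 |
| Transplant glomerulopathy | cg score | 0.1 | 0.4 | 6 × 10− | | | |
| Mesangial matrix expansion | mm score | 0. | | | | | |
| Interstitial fibrosis lesions | ci score | 1.2 | 1.2 | 0.69 | 0.9 | 1.2 | 0.06 |
| Tubular atrophy | ct score | 1.3 | 1.3 | 0.38 | 1.2 | 1.3 | 0.52 |
| Arterial fibrous intimal thickening | cv score | 0. | | | | | |
| Arteriolar hyalinosis | ah score | 0.4 | 1.1 | 2 × 10^-5 | 0.3 | 1.0 | 1 × 10^-4 |
| Median time from transplant to biopsy (months) | | 8. | | | | | |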

Within the 99 histologic TCMR, mixed and borderline biopsies, high TCMR scores were only associated with high i-scores and t-scores.

There was no association of TCMR scores with atrophy-scarring. Biopsies with high TCMR scores displayed less arteriolar hyalinosis (‘ah’), probably because hyalinosis occurs late and TCMR is predominantly early. It is also possible that less hyalinosis in TCMR biopsies reflects underexposure to calcineurin inhibitors, i.e. under-immunosuppression.

The molecular TCMR score correlates with the number of pathologists diagnosing TCMR

The TCMR score was low when pathologists agreed TCMR was absent, high when they agreed TCMR was present and intermediate when they disagreed. Comparing pathologists A and B in 403 biopsies (not considering isolated v-lesions as TCMR), the mean TCMR scores in biopsies called TCMR/mixed by two, one, or zero pathologists were 0.66, 0.33 and 0.02 respectively. With isolated v-lesions included as TCMR, the scores were 0.51, 0.14 and 0.02, respectively.

In 245 biopsies read independently by three pathologists (Figure 3), TCMR scores were again highest when all pathologists agreed TCMR was present (0.78); lower when only two called TCMR (0.32), lower still with only one (0.10) and very low when all agreed TCMR was absent (0.007).

Figure 3.

Venn diagram showing the relationship between the molecular T cell-mediated rejection (TCMR) score and the agreement among three pathologists in the 245 biopsy subset. Numbers in italics show the average molecular TCMR score in the biopsies. Numbers with no parentheses are the intersections of the number of biopsies diagnosed as TCMR by the three pathologists. Biopsies with either i2t2 TCMR or mixed rejection were considered TCMR. Isolated v-lesion TCMRs were not counted as TCMR.

The relationship of the TCMR scores to isolated v-lesions

Most biopsies called histologic TCMR/mixed on the basis of isolated v-lesions had low TCMR scores: 19/24 biopsies (79%) with isolated v-lesions had TCMR scores ≤0.1 (Figure 4). This supports the conclusion by the Banff group that v-lesions can be induced by conditions other than TCMR, namely ABMR and acute kidney injury [8]. The five biopsies with high TCMR scores had all been called at least i2 t2 by the other pathologist, as opposed to only four of the 19 biopsies with low TCMR scores. This suggests that ‘isolated’ v-lesions with high TCMR scores are actually typical (‘i2t2’) TCMR that has been missed by the pathologist.

Figure 4.

TCMR scores in biopsies whose TCMR status was due solely to isolated v-lesions, i.e. those with v > 0 and (i < 2 and/or t < 2).
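The lesion-score conditions used in this section can be written as two small predicates. This is a sketch of the definitions as stated in the text (v > 0 with i < 2 or t < 2 for isolated v-lesions, and at least i2 t2 for typical TCMR); the function names are hypothetical and this is not a full implementation of the Banff classification.

```python
def meets_i2t2(i, t):
    """Typical ('i2t2') TCMR pattern: both i-score and t-score are at least 2."""
    return i >= 2 and t >= 2

def is_isolated_v(i, t, v):
    """Isolated v-lesion: intimal arteritis without sufficiently high i- and t-scores."""
    return v > 0 and (i < 2 or t < 2)

# Proportion of isolated v-lesion biopsies with TCMR scores <= 0.1, from the counts in the text.
low_score_fraction = 19 / 24
low_score_percent = round(100 * low_score_fraction)
```

For any biopsy with v > 0, exactly one of the two predicates is true, which is why a second reader scoring the same biopsy as at least i2 t2 moves it out of the isolated v-lesion group.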

Agreement between histologic and molecular diagnoses

Summarizing the results in Figure 2, and considering histology showing TCMR/mixed as positive, the main groups are: histology positive, score positive 29; histology positive, score negative 28; histology negative, score positive 18; and histology negative, score negative 328 (Table 6). Thus 28/57 (49%) of biopsies called TCMR by histology did not have a molecular signal, and 18/47 (38%) of biopsies with a molecular TCMR signal were not called TCMR by histology.
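The classifier statistics follow directly from these four counts when histology is treated as the reference. The arithmetic is shown below as a short sketch; the variable names are illustrative, and small differences from the percentages quoted in the text can arise from rounding.

```python
# Counts from the text, with histology (TCMR or mixed) as the reference standard.
true_pos = 29    # histology positive, score positive
false_neg = 28   # histology positive, score negative
false_pos = 18   # histology negative, score positive
true_neg = 328   # histology negative, score negative

total = true_pos + false_neg + false_pos + true_neg

accuracy = (true_pos + true_neg) / total
sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
ppv = true_pos / (true_pos + false_pos)      # positive predictive value
npv = true_neg / (true_neg + false_neg)      # negative predictive value

for name, value in [("accuracy", accuracy), ("sensitivity", sensitivity),
                    ("specificity", specificity), ("PPV", ppv), ("NPV", npv)]:
    print(f"{name}: {100 * value:.1f}%")
```

The high accuracy and NPV are driven largely by the 328 concordant negatives, whereas sensitivity and PPV depend only on the much smaller positive groups.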

Reconciling discrepancies between TCMR scores and histology diagnoses

TCMR score negative, histology TCMR positive (n = 28): Nineteen biopsies with histologic TCMR/mixed based on isolated v-lesions are probably histology false positives. In one case, treatment before the biopsy may have extinguished the molecular changes but left histologic lesions [7]. Two biopsies taken on days 8 and 14 had severe AKI, which can induce tubulitis and inflammation and cause false positive histologic TCMR diagnoses [7].

TCMR score positive, histology TCMR negative (n = 18): Seven ‘borderline’ biopsies with high TCMR scores are probably true TCMR [29]. In nine biopsies, advanced atrophy-fibrosis (ci2 or ci3) may have made a histologic diagnosis of TCMR impossible. One biopsy with polyoma virus nephropathy (PVN) had histologic TCMR lesions but the pathologists declined to diagnose TCMR because of PVN. One biopsy (called ‘AKI’ by histology) was the subject of specimen numbering confusion, and the histology and microarray specimens may be from different patients.

We reclassified these results based on TCMR scores and known deficiencies of histology, using three proposed rules that if confirmed in ongoing studies would resolve most discrepancies:

  1. Biopsies with potential for false positive histology diagnoses (isolated v-lesions, AKI, or pretreatment) with a low TCMR score are not TCMR;
  2. Biopsies with potential for false negative histology diagnoses (heavy scarring, borderline) with a high TCMR score have TCMR;
  3. Biopsies with PVN and high TCMR scores may have both PVN and TCMR.
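The three proposed rules can be expressed as a small decision function. This is an illustrative sketch only: the `Biopsy` fields, the output labels and the handling of cases the rules do not cover ("unresolved") are assumptions, and the 0.1 cutoff is the one used in the text for accuracy statistics.

```python
from dataclasses import dataclass

CUTOFF = 0.1  # TCMR scores above this value are treated as high

@dataclass
class Biopsy:
    tcmr_score: float
    histology_tcmr: bool           # histology called TCMR or mixed
    isolated_v: bool = False
    aki: bool = False
    pretreated: bool = False
    heavy_scarring: bool = False   # ci2 or ci3
    borderline: bool = False
    pvn: bool = False

def integrated_call(b):
    """Combine histology and the molecular score using the three proposed rules."""
    high = b.tcmr_score > CUTOFF
    if b.pvn and high:
        return "possible PVN with TCMR"                      # rule 3
    if b.histology_tcmr and not high and (b.isolated_v or b.aki or b.pretreated):
        return "not TCMR"                                    # rule 1
    if not b.histology_tcmr and high and (b.heavy_scarring or b.borderline):
        return "TCMR"                                        # rule 2
    if b.histology_tcmr == high:
        return "TCMR" if high else "not TCMR"                # concordant cases
    return "unresolved"                                      # not covered by the rules

example_call = integrated_call(Biopsy(tcmr_score=0.02, histology_tcmr=True, isolated_v=True))
```

Discordant biopsies that match none of the listed situations are deliberately left unresolved here, since the rules as stated address only known weaknesses of histology.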

The results (Table 6) illustrate how discrepancies between the scores and histology can be reconciled based on known limitations in histology, producing a new diagnostic system that incorporates both molecules and histology, reducing ambiguity and error in the current diagnostic system.

Table 6. Comparing the histology TCMR diagnoses to the TCMR scores considering borderline as not TCMR


This study developed a molecular test for TCMR and compared it with histology. We confirmed serious disagreement among pathologists in applying the current histology system [9]: in 74 biopsies diagnosed as TCMR or mixed by any of three pathologists, only 12 (16%) were diagnosed TCMR by all three. To address this, we assigned one reference standard diagnosis to each biopsy, and used these diagnoses to create classifiers that assigned a TCMR score to each biopsy. The TCMR scores correlated with the lesions of TCMR and the number of pathologists diagnosing TCMR. The disagreements between histology and the TCMR scores were primarily in situations where histology has known limitations: the ambiguous ‘borderline’ category, false negatives in scarring and PVN, and false positives in isolated v-lesions, kidney injury and after treatment of rejection. The molecules distinguishing TCMR from other conditions echoed previous studies [29,30] and will be analyzed in a separate report (manuscript in preparation). Thus the molecular TCMR score emerges as a test that can be used with histology to create a new diagnostic system that correctly classifies scarred biopsies, identifies which isolated v-lesions actually represent TCMR, and improves the assessment of biopsies with inflammation due to AKI or after antirejection treatment.

The absence of a gold standard in disease studies is a major problem that must be addressed whenever precision diagnostic tests are developed. One consequence of a flawed gold standard is that a new diagnostic test, even if excellent, will appear inferior (‘reference set bias’) [31]. This is evident here, where histologic TCMR was diagnosed in only 29/47 biopsies with positive scores. Acknowledging previous proposals addressing flawed gold standards [32,33], we studied the discrepancies in detail. Most discrepancies were in situations where histology cannot decide on the diagnosis (borderline) or tends to overcall (isolated v-lesions, AKI, posttreatment) or undercall (scarring, PVN) TCMR. Undercalling TCMR is an inherent problem in scarring because i- and t-scores cannot be assigned in scarred tissue. Overcalling occurs because isolated v-lesions, tubulitis and interstitial inflammation can result from other diseases [6,34] and because lesions can persist after treatment [7]. Reluctance to diagnose TCMR in biopsies with polyoma virus nephropathy (PVN) is understandable but nevertheless inaccurate if criteria for TCMR are present.

While it would be desirable to confirm true TCMR by some independent criteria, this is not possible. We initially believed that response to treatment could be used, but in our observational study, functional improvement after TCMR was similar to that after other diagnoses, because clinicians apply multiple strategies, often simultaneously, e.g. steroids, volume correction and calcineurin inhibitor reduction [35]. Moreover, TCMR episodes do not always improve after therapy, particularly late episodes. It is impossible to perform a randomized trial of TCMR treatment because it is unethical to leave this serious condition untreated. We propose instead a new generation of observational studies supported by TCMR score assessment, eventually leading to controlled trials. For example, we are validating the TCMR score in an independent set of biopsies in the new observational INTERCOM study (ClinicalTrials.gov NCT01299168), which differs from all previous studies in the specific acknowledgement of C4d negative ABMR.

Although a cutoff of 0.1 was imposed for calculating accuracy, sensitivity, etc., the TCMR score is a continuous value expressing the classifier's degree of certainty in diagnosing TCMR. As such, values around the cutoff are equivalent to ‘borderline’, although it is impossible to stipulate how far from this line the range should extend. Even a fairly broad range would reduce the use of ‘borderline’ to about a dozen biopsies, an improvement over the 42 diagnosed by histology.
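A band around the cutoff could be expressed as in the minimal Python sketch below. Only the 0.1 cutoff comes from the text. The half-width of 0.05 and the function name are hypothetical placeholders, since the text notes that the extent of the range cannot be stipulated.

```python
CUTOFF = 0.1  # cutoff stated in the text


def classify(score: float, half_width: float = 0.05) -> str:
    """Map a continuous TCMR score to a three-way call.

    half_width is a hypothetical parameter: scores within this distance
    of the cutoff are reported as 'borderline'.
    """
    if abs(score - CUTOFF) <= half_width:
        return "borderline"
    return "TCMR" if score > CUTOFF else "non-TCMR"


for s in (0.0, 0.12, 0.5):
    print(s, classify(s))
```

Widening or narrowing `half_width` changes how many biopsies fall in the uncertain band without changing the underlying scores.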

The TCMR score not only adds precision to TCMR diagnosis but also provides estimates of variability. Sources of variation include intrinsic sampling error in biopsies, different cores being read by the microarray and histology, pathologist disagreement and variation in the generation of classifier scores. Core-to-core variation is unavoidable in needle biopsies, but we believe that molecular changes are less variable than histology. Our ethics approval for this study did not permit multiple research cores, but when we tested multiple cores from transplant nephrectomies the results were reassuring (unpublished observations). The TCMR score itself has variability due to different combinations of samples in the training sets, different pathologists labeling those samples, and different choices of classifier algorithms. However, the use of 1000 class comparisons and 100 scores to estimate the probability of TCMR provides both robustness and an estimate of the variance that pathologists and clinicians will find useful.

The present algorithm can easily be applied to any new sample. The ‘RefPlus’ method allows new microarray chips to be normalized against our database of existing samples, and previously derived classifiers from 100 random splits of the data can be used to generate the 100 scores and their median for each new sample. From one microarray, other measurements can be derived in addition to the TCMR score, expressing both disease-specific and general assessments, e.g. the rejection score [7], ABMR score (currently in press), risk score for graft loss [36], extent of acute kidney injury [37] and scarring [38, 39]. All scores can be generated within minutes of receiving a new microarray file. Molecular diagnoses will be particularly useful in badly damaged or scarred tissues or those with multiple diseases.
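The scoring step for a new sample can be sketched in Python as follows, assuming the expression values have already been normalized and that each stored classifier is a callable returning a score. This is not the authors' implementation: the classifier interface, function name and toy classifiers are all hypothetical, and RefPlus normalization is not shown.

```python
import statistics
from typing import Callable, Dict, Sequence

# Hypothetical interface: a classifier maps normalized expression values
# to a single score.
Classifier = Callable[[Sequence[float]], float]


def summarize_scores(
    expression: Sequence[float],
    classifiers: Sequence[Classifier],
    cutoff: float = 0.1,
) -> Dict[str, float]:
    """Apply every stored classifier to one sample and summarize the scores."""
    scores = [clf(expression) for clf in classifiers]
    q1, _, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
    return {
        "median": statistics.median(scores),  # reported score
        "q1": q1,                             # spread across classifiers
        "q3": q3,
        "fraction_above_cutoff": sum(s > cutoff for s in scores) / len(scores),
    }


# Toy stand-ins for 100 classifiers from random data splits (hypothetical).
toy_classifiers = [
    (lambda x, k=k: (k / 100) * (sum(x) / len(x))) for k in range(1, 101)
]
summary = summarize_scores([1.0, 1.0], toy_classifiers)
print(summary)
```

Reporting the quartiles alongside the median gives a reader a sense of how much the score depends on the particular training split.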

The first application of the TCMR score will be to enhance histology diagnoses in the circumstances where ambiguity and errors are common. Histology is good at distinguishing abnormal from normal tissue, but disagreement is frequent among pathologists in abnormal, inflamed biopsies. This is why more than half of inflamed biopsies were called borderline (42 borderline vs. 35 TCMR) in this study, and why three expert pathologists agreed on only 16% of biopsies in which at least one had diagnosed TCMR. The TCMR diagnostic system was set up in 1991 when the main issue was identifying ‘acute rejection’ in early unscarred biopsies, before ABMR was accepted. In current practice, with few early uncomplicated TCMR episodes but many severely damaged kidneys and late biopsies with mixed rejection and scarring, a new dimension is required. Thus transplant biopsy diagnostics will follow the pattern emerging in molecular diagnostics in cancer, where the molecules do not compete with histology but help extend diagnostic capabilities.


Special thanks to Jessica Chang for statistical analyses, and to Vido Ramassar and Anna Hutton for technical support. We are grateful to Drs. Arthur Matas and Bruce Kaplan for providing biopsies from Minneapolis and Chicago, respectively.

This research has been supported by funding and/or resources from Genome Canada, Genome Alberta, the University of Alberta, the University of Alberta Hospital Foundation, Roche Molecular Systems, Hoffmann-La Roche Canada Ltd., the Alberta Ministry of Advanced Education and Technology, the Roche Organ Transplant Research Foundation, the Kidney Foundation of Canada and Astellas Canada. Dr. Halloran held a Canada Research Chair in Transplant Immunology until 2008 and currently holds the Muttart Chair in Clinical Immunology.


P. F. Halloran holds shares in Transcriptome Sciences Inc., a company with an interest in molecular diagnostics. The other authors have no competing financial interests to disclose as described by the American Journal of Transplantation.