Keywords:

  • Allograft function;
  • kidney transplantation

Abstract

The nonspecific diagnoses ‘chronic rejection’, ‘CAN’, or ‘IF/TA’ suggest neither identifiable pathophysiologic mechanisms nor possible treatments. As a first step to developing a more useful taxonomy for causes of new-onset late kidney allograft dysfunction, we used cluster analysis of individual Banff score components to define subgroups. In this multicenter study, eligibility included being transplanted prior to October 1, 2005, having a ‘baseline’ serum creatinine ≤2.0 mg/dL before January 1, 2006, and subsequently developing deterioration of graft function leading to a biopsy. Mean time from transplant to biopsy was 7.5 ± 6.1 years. Of the 265 biopsies (all with blinded central pathology interpretation), 240 grouped into six large (n > 13) clusters. There were no major differences between clusters in recipient demographics. The actuarial postbiopsy graft survival varied by cluster (p = 0.002). CAN and CNI toxicity were common diagnoses in each cluster (and did not differentiate clusters). Similarly, C4d and presence of donor specific antibody were frequently observed across clusters. We conclude that for recipients with new-onset late graft dysfunction, cluster analysis of Banff scores distinguishes meaningful subgroups with differing outcomes.

Kidney allograft recipients with late deterioration of kidney function have often been labeled as having ‘chronic rejection’ or ‘chronic allograft nephropathy’ (CAN). More recently, the terminology, ‘interstitial fibrosis and tubular atrophy, with no evidence of any specific etiology’ (IF/TA) replaced ‘CAN’ and is used to describe biopsies with fibrosis and atrophy and with no obvious underlying pathogenesis (1). These diagnoses, which have been applied to nearly 50% of recipients with late graft loss, are not specific, and suggest neither identifiable pathophysiologic mechanisms nor possible treatments. As a consequence, it has been difficult to develop intervention trials.

The Long-Term Deterioration of Kidney Allograft Function (DeKAF) study, a multicenter observational study conducted at seven transplant centers in the United States and Canada, was designed with the premise that new-onset late graft dysfunction, if studied early in its course, could be characterized as (or associated with) distinct clinical and histopathologic entities with reproducible phenotypes. DeKAF is focused on current nonspecific diagnoses (e.g. CAN-IF/TA) and not on already well-defined entities (e.g. recurrent disease). The current pathogenetic characterization of most late graft dysfunction (CAN–IF/TA) seems somewhat analogous to differentiating subgroups within the diagnosis of ‘Bright's disease’ 5 decades ago (2). Our goal is to determine if distinct subgroups can be defined on the basis of clinical, laboratory and pathologic information available at the time of new-onset graft dysfunction. Ultimately, these defined phenotypes could provide the basis for specific intervention trials.

The DeKAF study consists of two cohorts of kidney transplant recipients: (a) a cross-sectional cohort of patients transplanted prior to October 1, 2005, and developing late graft dysfunction leading to a biopsy and (b) a prospective cohort of consecutive consenting recipients entered into the study at the time of transplant (3). This report focuses on the cross-sectional cohort. We previously described the demographics of this cohort, and also reported on the local pathologists’ interpretations of biopsies done at the time of development of new-onset late graft dysfunction: CAN and calcineurin inhibitor (CNI) toxicity were the most common diagnoses (3). Importantly, we noted that the local diagnosis of CAN was of no prognostic significance, as patients with or without this diagnosis had similar rates of postbiopsy graft failure (3).

Herein, we describe the use of cluster analysis, which groups patients with similar characteristics, to differentiate subsets within this cohort of recipients with new-onset, late allograft dysfunction (4–6). Clustering, based on centrally read Banff scores, separated recipients with new-onset late graft dysfunction into meaningful subgroups with differing histologic characteristics, outcomes and presumably, responses to potential therapeutic interventions.

Material and Methods

Entry criteria

Recipients were eligible for enrollment in the cross-sectional cohort if they had undergone a kidney or kidney-pancreas transplant prior to October 1, 2005, had a ‘baseline’ serum creatinine ≤2.0 mg/dL before January 1, 2006, and subsequently developed deterioration of graft function which led to a kidney allograft biopsy. Deterioration of function was defined as (1) an unexplained and persistent ≥25% increase of creatinine over baseline (in the absence of potential confounding factors), or (2) new-onset proteinuria (defined as albumin/creatinine ratio ≥0.2 or a protein/creatinine ratio > 0.5). Importantly, the increase of ≥25% in creatinine over baseline could occur over any follow-up interval (e.g. days to years).
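The functional part of these entry criteria can be written as a simple check. The sketch below is illustrative only: the function and parameter names are hypothetical, and it does not encode the transplant and baseline dates or the requirements that the creatinine rise be unexplained and persistent and the proteinuria be new in onset, which need clinical judgment.

```python
def meets_deterioration_criteria(baseline_cr, current_cr, acr=None, pcr=None):
    """Return True if either functional criterion for deterioration is met.

    baseline_cr, current_cr: serum creatinine in mg/dL.
    acr: urine albumin/creatinine ratio (optional).
    pcr: urine protein/creatinine ratio (optional).
    """
    # Criterion 1: creatinine at least 25% above baseline.
    creatinine_rise = current_cr >= 1.25 * baseline_cr
    # Criterion 2: proteinuria (ACR >= 0.2 or PCR > 0.5).
    proteinuria = (acr is not None and acr >= 0.2) or (pcr is not None and pcr > 0.5)
    return creatinine_rise or proteinuria


def is_eligible(baseline_cr, current_cr, acr=None, pcr=None):
    """Combine the baseline creatinine ceiling (<= 2.0 mg/dL) with the deterioration check."""
    return baseline_cr <= 2.0 and meets_deterioration_criteria(baseline_cr, current_cr, acr, pcr)
```

For example, a recipient with a baseline creatinine of 1.4 mg/dL and a creatinine of 2.6 mg/dL at biopsy satisfies the first criterion regardless of proteinuria.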

Enrollment and data collection

At the time of the biopsy, informed consent was obtained and baseline recipient information was recorded. In addition, blood and urine were collected for determination of donor specific antibodies (DSA) and metabolomics, respectively. Follow-up data were prospectively collected every 6 months and at any ‘event’ (e.g. graft loss).

Clinical care

Allograft biopsies were read by the local pathologist, and the pathologic diagnosis was used to guide clinical care and any adjustments to immunosuppressive medications. Clinical care was provided using local protocols.

Central pathology

Representative histologic sections were submitted to a central laboratory for analysis. Sections were stained for C4d (immunoperoxidase) and interpreted as positive if >10% peritubular capillaries stained positive (7). All biopsies were interpreted by the same pathologist (JG), masked to local diagnoses, clinical information and pathology reports, using the Banff 97 classification (8).

Determination of donor specific antibodies

Serum samples (2 mL) collected at the time of each biopsy were processed, frozen at –80 degrees and sent to a central laboratory (UCLA) in batches, along with information regarding the patient and donor HLA types and the patient's pretransplant sensitization status. Sera were tested for anti-donor HLA antibodies, using microparticles with individual purified HLA antigens (Ag) covalently bound as targets (One Lambda, Inc, Canoga Park, CA).

Statistical analyses

  • (a) 
    Inflammation in areas of atrophy. In a preliminary analysis we examined whether inflammation in areas of atrophy (iatr), tubulitis in areas of atrophy (tatr) and peritubular capillary infiltrates (ptc) (each scored by central pathology) correlated with outcome. Iatr and tatr were scored in the same manner as i and t, except being scored in areas of fibrosis and/or atrophy. Scoring of ptc infiltrates was done using the Banff 2007 classification (9). In univariate analysis, each of these correlated with subsequent graft survival (p < 0.001), and were included in our subsequent analyses.
  • (b) 
    Cluster analysis. We used hierarchical cluster analysis of Banff scores (as determined by central pathology) (SAS Proc Cluster) to identify subgroups. Cluster analysis categorizes a collection of observations into subsets (clusters) based on a preselected similarity metric computed from a fixed list of variables (4–6,10–12), and hence is a particularly useful technique for defining individuals with similar characteristics that may reflect unknown pathogenic mechanisms. Ultimately these clusters may become defined clinical entities that predict outcomes, and suggest potential treatments to improve those outcomes. Cluster analysis was applied to the 265 biopsies having complete centrally determined histology at the time of analysis, and whose local clinical diagnosis did not show BK virus nephropathy or recurrent disease. These diagnoses were eliminated a priori as our focus is on differentiating entities among those with nonspecific diagnoses for which, ultimately, intervention trials could be developed. In preliminary analyses, we explored which variables to use for clustering. Our criteria for selecting variables included: (1) availability at the time of diagnosis; (2) reasonable sample size; and (3) minimization of strong correlations between variables used in computing dissimilarity. For example, in our preliminary analyses, the Banff i- and t-scores had an observed Pearson correlation of R = 0.81; therefore, both scores were likely to make similar contributions. Including both of these variables in the clustering algorithm without regard to this correlation would potentially disproportionately emphasize this common characteristic. Therefore, we used only one of these two scores in the clustering analyses. Of note, we excluded ‘v’ as there were only 13 positive cases (v > 0); of these, 11 were v = 1 and 2 were v = 2.

We did pairwise correlations of all candidate clustering variables (Banff scores) and discarded highly correlated variables (Table 1). Those variables in pairs with correlation (R value) greater than 0.5 were considered for elimination, a cutoff chosen as a natural break in the distribution of pairwise correlations. When selecting which variables to eliminate, priority was given to attempting to include the largest number of variables and to include variables with small variances. For this analysis, we did the clustering based on the Banff scores i, g, ct, cv, ah and mm. Of the three nontraditional scores (iatr, tatr and ptc) we chose tatr because it was a reflection of inflammation in areas of atrophy and was not as highly correlated with the i score as was iatr.

Table 1.  Pairwise correlation between individual Banff scores. Correlations ≥0.5 are marked with an asterisk (*)
Pearson Correlation Coefficients, N = 265
     i          t          g          v          ci         ct         cg         cv         ah         mm
i    1.00000    0.80578*   0.10231    0.20176   −0.06620    0.10451   −0.06335   −0.05782   −0.23702   −0.15144
t    0.80578*   1.00000    0.05356    0.15634   −0.19918   −0.03070   −0.16756   −0.13419   −0.31417   −0.23956
g    0.10231    0.05356    1.00000    0.11432    0.07557    0.07704    0.54253*   0.12465    0.16741    0.30029
v    0.20176    0.15634    0.11432    1.00000   −0.09001   −0.03048    0.02788    0.09885   −0.05954   −0.04708
ci  −0.06620   −0.19918    0.07557   −0.09001    1.00000    0.83120*   0.23477    0.27217    0.33866    0.25415
ct   0.10451   −0.03070    0.07704   −0.03048    0.83120*   1.00000    0.16349    0.22131    0.21826    0.20050
cg  −0.06335   −0.16756    0.54253*   0.02788    0.23477    0.16349    1.00000    0.23650    0.33401    0.48670
cv  −0.05782   −0.13419    0.12465    0.09885    0.27217    0.22131    0.23650    1.00000    0.30777    0.19657
ah  −0.23702   −0.31417    0.16741   −0.05954    0.33366    0.21826    0.33401    0.30777    1.00000    0.44186
mm  −0.15144   −0.23956    0.30029   −0.04708    0.25415    0.20050    0.48670    0.19657    0.44186    1.00000
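The screening step described above (flag pairs of Banff scores whose correlation exceeds 0.5, then keep only one member of each pair) can be sketched as follows. This is a minimal illustration, not the authors' SAS code: the function name is hypothetical, the matrix is a six-score excerpt of the Table 1 values, and comparing the absolute value of R against the cutoff is an assumption.

```python
from itertools import combinations

# Excerpt of Table 1 (Pearson R, N = 265) for six of the ten Banff scores.
NAMES = ["i", "t", "g", "ci", "ct", "cg"]
R = [
    [1.00000, 0.80578, 0.10231, -0.06620, 0.10451, -0.06335],
    [0.80578, 1.00000, 0.05356, -0.19918, -0.03070, -0.16756],
    [0.10231, 0.05356, 1.00000, 0.07557, 0.07704, 0.54253],
    [-0.06620, -0.19918, 0.07557, 1.00000, 0.83120, 0.23477],
    [0.10451, -0.03070, 0.07704, 0.83120, 1.00000, 0.16349],
    [-0.06335, -0.16756, 0.54253, 0.23477, 0.16349, 1.00000],
]


def correlated_pairs(names, matrix, cutoff=0.5):
    """List (score_a, score_b, r) for every pair with |r| above the cutoff,
    strongest correlation first."""
    pairs = []
    for a, b in combinations(range(len(names)), 2):
        r = matrix[a][b]
        if abs(r) > cutoff:  # assumption: magnitude, not sign, is what matters
            pairs.append((names[a], names[b], r))
    return sorted(pairs, key=lambda p: -abs(p[2]))


flagged = correlated_pairs(NAMES, R)
for a, b, r in flagged:
    print(f"{a} vs {b}: R = {r:.2f}")
```

On this excerpt the flagged pairs are ci/ct, i/t and g/cg; from each of these pairs the analysis reported here retained ct, i and g, respectively.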

We used the following guidelines to determine the final number of clusters: (1) 90% of the observations must belong to clusters each containing 10 or more observations; (2) between-cluster fraction of the total pair-wise distances between clusters (pseudo-R²) > 0.40; and (3) a local maximum of the pseudo-F statistic. In our preliminary analyses, this identified at most two choices for the total number of clusters to use in a given cluster analysis, i.e. a smaller number of clusters with a large number of recipients in each cluster or a larger number of clusters with fewer recipients in each cluster. Choosing between these two options took into account clinician review of the clustering results, using subjective discernment of cluster differences and comparison with known clinical entities. Additional details on the clustering methodology are in Supporting Information.
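The first two guidelines, and the statistic used in the third, can be computed directly from a set of cluster assignments. The sketch below is an assumption-laden illustration rather than the SAS procedure itself: the function names are hypothetical, and the pseudo-R² and pseudo-F are computed in the common sum-of-squares form (between-cluster over total variation, and the Calinski-Harabasz ratio), which may differ in detail from the pairwise-distance formulation described above.

```python
def fraction_in_large_clusters(cluster_sizes, total_n, min_size=10):
    """Guideline 1: share of all observations that fall in clusters of >= min_size."""
    return sum(n for n in cluster_sizes if n >= min_size) / total_n


def pseudo_stats(points, labels):
    """Return (pseudo_R2, pseudo_F) for a labelled set of numeric score vectors.

    Requires at least two clusters and a nonzero within-cluster sum of squares.
    """
    n, dims = len(points), len(points[0])

    def centroid(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dims)]

    def sum_sq(rows, center):
        return sum((r[d] - center[d]) ** 2 for r in rows for d in range(dims))

    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)

    total_ss = sum_sq(points, centroid(points))
    within_ss = sum(sum_sq(rows, centroid(rows)) for rows in groups.values())
    k = len(groups)
    pseudo_r2 = 1 - within_ss / total_ss
    pseudo_f = ((total_ss - within_ss) / (k - 1)) / (within_ss / (n - k))
    return pseudo_r2, pseudo_f


# Cluster sizes reported in this study: six large clusters among 265 biopsies.
share = fraction_in_large_clusters([94, 40, 49, 14, 29, 14], total_n=265)
print(f"{share:.3f}")  # 240/265, just above the 0.90 threshold
```

Guideline 3 would then be applied by computing `pseudo_stats` for each candidate number of clusters and looking for a local maximum of the pseudo-F value.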

  • (a) 
    Depiction of clusters. We (JC and RL) developed a graphical method to display the frequency distribution of the Banff scores for each cluster. Each cluster is represented as a circle or ‘clock’, with the individual Banff categories shown as numbers on the clock (Figure 1). For each Banff score, the length of a ‘spoke’ within the circle portrays the proportion of recipients in the cluster who have a score >0. For example, in Figure 1, which displays the results for Cluster 1 (see Results), very few biopsies have i, t or g (i.e. short spokes), whereas 100% have ct (i.e. the spoke length is the entire radius of the circle). Furthermore, the score for each Banff parameter is represented by the relative lengths of the dotted, dashed and solid lines. The length of the dotted line represents the proportion of biopsies with Banff grade 1 (i.e. almost all in Cluster 1 have Banff grade 1 ct) (Figure 1); likewise, the lengths of the dashed and solid segments represent the proportions of biopsies with Banff grades 2 and 3, respectively. Black ‘dots’ delineate the breakpoints between subsections for clarity. The cluster clocks provide a way to visualize the Banff scores within each cluster and to see both the similarities and differences between clusters.
  • (b) 
    Graft outcome. For recipients in each cluster, we determined actuarial post-biopsy graft survival and cause of graft loss. In addition for each cluster, we determined local pathologists’ primary and secondary diagnoses, the percent of recipients that had C4d positive biopsies or had circulating DSA, and the time from transplant to biopsy. In evaluating the relationship of clusters to the time to first occurrence of biopsy for cause, we employed survival-analytic methods (13).

Figure 1. Cluster ‘clock’ for cluster 1. Each Banff score is shown around the periphery. The length of each spoke is proportional to the percent of patients in the cluster that have that particular Banff score >0. Dotted lines represent Banff 1; dashed lines, Banff 2; and solid lines, Banff 3. Black dots delineate the breakpoints between the subsections.
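The geometry of one spoke in a cluster ‘clock’ follows directly from the frequency distribution of a Banff score within the cluster. The sketch below shows one way to compute it; the function name and the example scores are hypothetical and are not study data.

```python
from collections import Counter


def clock_spoke(scores):
    """Compute the spoke for one Banff score within one cluster.

    scores: list of integer Banff grades (0-3), one per biopsy in the cluster.
    Returns the total spoke length (fraction of biopsies with grade > 0, as a
    share of the circle's radius) and the dotted, dashed and solid segment
    lengths (fractions with grade 1, 2 and 3, respectively).
    """
    n = len(scores)
    counts = Counter(scores)
    return {
        "length": sum(1 for s in scores if s > 0) / n,
        "dotted": counts.get(1, 0) / n,
        "dashed": counts.get(2, 0) / n,
        "solid": counts.get(3, 0) / n,
    }


# Hypothetical cluster of 10 biopsies, for illustration only.
spoke = clock_spoke([0, 0, 1, 1, 1, 2, 3, 0, 1, 2])
print(spoke)
```

In this made-up example, 7 of 10 biopsies have a score >0, so the spoke covers 70% of the radius, split into dotted (40%), dashed (20%) and solid (10%) segments.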

Results

At the time of analysis, 289 enrolled subjects had complete central pathology readings. Of these, biopsy results from 265 were used in the cluster analysis and form the basis of this report. A total of 24 recipients were excluded because they had specific diagnoses: recurrent disease (n = 15), and BK virus nephropathy (n = 9). For studied recipients, mean serum creatinine prior to January 2006 was 1.4 ± 0.3 mg/dL; mean creatinine at biopsy was 2.6 ± 1.5 mg/dL. Mean time from transplant to biopsy was 7.5 ± 6.1 years (median, 5.7 years; range 0.6–32.5 years). Using the selection criteria described earlier, clustering was based on Banff scores of i, g, ct, cv, mm and ah, and the additional score of tubulitis in areas of atrophy (tatr).

Our analysis resulted in six larger clusters, which included 240 of the 265 biopsies; 25 biopsies segregated to 7 smaller clusters (maximum cluster size, n = 3). Figure 2 shows the 6 clusters (and the ‘n’ for each) with the limited number of Banff scores used in the analysis. Figure 3 shows the same clusters but includes all Banff scores and the scores for tatr, iatr and ptc. Of note, 100% of the biopsies were interpreted as having ct and the vast majority also had ci (Figure 3). Thus we are capturing a population with IF/TA. Table 2 shows, for each Banff score in each cluster, the median score and the 25th and 75th percentile scores. In the supplemental materials, we have provided the frequency distribution of each Banff score by cluster.

Figure 2. Clusters based on Banff i, g, ct, cv, ah, mm, plus tatr* (n = 265); only scores used in clustering are shown. (Twenty-five observations in seven additional small clusters not shown.) *tatr, tubulitis in areas of atrophy.

Figure 3. Clusters based on Banff i, g, ct, cv, ah, mm, plus tatr* (n = 265); all Banff scores plus iatr*, tatr and ptc* are shown. (Twenty-five observations in seven additional small clusters not shown.) *tatr, tubulitis in areas of atrophy; iatr, inflammation in areas of atrophy; ptc, peritubular capillary infiltrates.

Table 2.  Median and interquartile range (IQR) for each Banff score in each cluster
              Cluster 1     Cluster 2     Cluster 3     Cluster 4     Cluster 5     Cluster 6
              (N = 94)      (N = 40)      (N = 49)      (N = 14)      (N = 29)      (N = 14)
Banff score   Median IQR    Median IQR    Median IQR    Median IQR    Median IQR    Median IQR
i             0.0    0      2.0    1      1.0    1      1.5    1      0.0    0      2.0    0
t             0.0    0      2.0    1      0.0    1      2.0    1      0.0    0      1.0    2
g             0.0    0      0.0    0      1.0    1      0.0    0      1.0    1      0.0    1
v             0.0    0      0.0    0      0.0    0      0.0    0      0.0    0      0.0    0
ci            1.0    0      1.0    1      1.0    0      2.0    1      2.0    1      2.0    2
ct            1.0    0      1.0    0      1.0    0      3.0    1      2.0    0      2.0    1
cg            0.0    0      0.0    0      2.0    1      0.0    0      1.0    3      0.0    2
cv            1.0    1      0.0    1      1.0    0      1.0    1      1.0    1      2.0    0
ah            1.0    2      0.0    1      1.0    1      0.0    1      2.0    1      2.0    0
mm            0.0    1      0.0    0      1.0    0      0.0    0      2.0    1      1.0    0
iatr          0.0    1      2.0    1      1.0    1      2.0    0      1.0    1      2.0    0
tatr          1.0    1      2.0    0      1.0    1      2.0    1      2.0    1      2.0    1
ptc           0.0    0      1.0    1      1.0    1      0.0    1      1.0    1      1.0    1

We found no major differences between clusters with regard to: donor or recipient age, race/ethnicity, primary kidney disease, donor source (deceased vs. living), transplant number, transplant era or initial immunosuppressive protocol (p = NS). In addition, baseline mean serum creatinine (defined as creatinine prior to January 2006) did not differ between clusters (Table 3). Clusters having an acute inflammatory component (i and t scores) had higher creatinine levels and lower eGFR at biopsy (Table 3).

Table 3.  Mean creatinine (±SD) prior to January 1, 2006 and mean creatinine and eGFR** at the time of biopsy, by cluster
                Mean creatinine (±SD)               eGFR at biopsy
Cluster   n     Prior to 01/01/06   At biopsy       eGFR     95% CI
1         94    1.4 (0.3)           2.2 (1.1)       33.6     (31, 36)
2         40    1.3 (0.3)           3.5 (2.5)*      25.4*    (21.4, 29.4)
3         49    1.5 (0.3)           2.3 (0.9)       32.5     (28.9, 36.1)
4         14    1.5 (0.3)           3.0 (1.2)*      24.3*    (17.6, 34.6)
5         29    1.5 (0.3)           2.4 (0.8)       29.9     (26.6, 34.6)
6         14    1.5 (0.4)           3.9 (1.9)*      19.1*    (12.4, 25.8)

*p < 0.01 vs. Cluster 1.
**eGFR, estimated (by the modification of diet in renal disease formula) glomerular filtration rate.
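For readers unfamiliar with the modification of diet in renal disease (MDRD) estimate used for eGFR in Table 3, the four-variable form can be sketched as below. This is a hedged illustration: the text does not state which version of the equation was used, so the leading coefficient is left as a parameter (175 is the value commonly used with standardized creatinine assays, 186 the value in the original equation), and the function name is hypothetical.

```python
def mdrd_egfr(scr_mg_dl, age_years, female=False, black=False, coefficient=175.0):
    """Four-variable MDRD estimate of GFR in mL/min/1.73 m^2.

    scr_mg_dl: serum creatinine in mg/dL.
    coefficient: 175 (standardized creatinine) or 186 (original equation);
                 which one applies to Table 3 is an assumption.
    """
    egfr = coefficient * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


# Hypothetical 50-year-old male recipient at two creatinine levels.
print(round(mdrd_egfr(1.4, 50), 1))
print(round(mdrd_egfr(2.6, 50), 1))
```

Because creatinine enters with a negative exponent, the clusters with higher mean creatinine at biopsy necessarily show lower mean eGFR.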

Importantly, actuarial graft survival differed significantly by cluster (log-rank = 24.03, 5 df, p = 0.002) (Figure 4). Cluster 1, a group with Banff 1 ct scores and no inflammation, had the best outcome (significantly better than Clusters 3–6 [Table 4]); Cluster 6, with more severe fibrosis, plus inflammation and arterial lesions, had the worst (significantly worse than Clusters 1–3 [Table 4]). The median postbiopsy follow-up interval and the number of (death-censored) graft losses for each cluster are shown in Table 5. There was no difference between clusters in length of follow-up interval.

Figure 4. Death-censored graft survival (postbiopsy) by cluster (p = 0.0072).

Table 4.  Hazard ratio for time to death-censored graft failure by cluster
Cluster comparison   HR      95% CI           p-Value
2 vs. 1              3.82    (1.39, 10.54)    0.0095
3 vs. 1              2.86    (0.99, 8.25)     0.0522
4 vs. 1              5.57    (1.82, 17.01)    0.0025
5 vs. 1              4.35    (1.46, 12.96)    0.0083
6 vs. 1              10.93   (3.5, 34.12)     <0.0001
3 vs. 2              0.75    (0.29, 1.9)      0.5420
4 vs. 2              1.46    (0.54, 3.9)      0.4529
5 vs. 2              1.14    (0.43, 3)        0.7935
6 vs. 2              2.86    (1.03, 7.98)     0.0446
4 vs. 3              1.95    (0.68, 5.58)     0.2134
5 vs. 3              1.52    (0.55, 4.2)      0.4169
6 vs. 3              3.83    (1.32, 11.08)    0.0133
5 vs. 4              0.78    (0.26, 2.31)     0.6544
6 vs. 4              1.96    (0.63, 6.14)     0.2466
6 vs. 5              2.51    (0.84, 7.51)     0.0991

Table 5.  Mean (±SD) post-biopsy follow-up and number of (death-censored) graft losses, by cluster
Cluster   N     Mean follow-up (mos)   Median follow-up (mos)   # losses
1         94    14.2 ± 6.6             12                       6
2         40    13.9 ± 8.3             12                       10
3         49    13.0 ± 7.0             12                       8
4         14    16.7 ± 10.1            17.1                     7
5         29    12.7 ± 6.9             11.9                     7
6         14    9.0 ± 6.5              7.4                      6
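The hazard ratios, confidence intervals and p-values in Table 4 are linked on the log scale, and that link can be used to sanity-check any row. The sketch below assumes the intervals are Wald intervals, symmetric around log(HR), as is usual for proportional hazards output; that assumption and the function name are ours, not stated in the Methods.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the standard error of log(HR), the Wald z statistic and the
    two-sided p-value from a hazard ratio and its 95% confidence interval."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Row "2 vs. 1" of Table 4: HR 3.82, 95% CI (1.39, 10.54), reported p = 0.0095.
se, z, p = wald_from_ci(3.82, 1.39, 10.54)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Applied to the ‘2 vs. 1’ row, this gives a p-value of roughly 0.0095, in line with the tabulated value; small differences for other rows would be expected from rounding of the published limits.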

In contrast, the local pathologists’ diagnoses did not differentiate clusters (Table 6). A similar proportion of the biopsies in each cluster (40–62%) were interpreted by local pathologists as having a primary or secondary diagnosis of CAN. In addition, 7–45% of the biopsies in each cluster were interpreted as having CNI toxicity. Acute rejection was more commonly diagnosed for Cluster 2.

Table 6.  Local pathologists’ primary or secondary diagnoses for biopsies in the 6 clusters¹
Characteristic                 Cluster 1   Cluster 2   Cluster 3   Cluster 4   Cluster 5   Cluster 6
                               (N = 94)    (N = 40)    (N = 49)    (N = 14)    (N = 29)    (N = 14)
CAN (%)                        53%         40%         54%         50%         62%         57%
Tx glomerulopathy (%)          8%          5%          38%         21%         48%         36%
CNI toxicity (%)               45%         8%          21%         7%          41%         21%
Acute cellular rejection (%)   5%          73%         17%         29%         3%          36%
Ab-mediated rejection (%)      3%          13%         17%         7%          3%          7%

¹25 biopsies were not clustered; of these, 52% had a local pathologist's diagnosis of CAN.
CAN, chronic allograft nephropathy; Tx, transplant; CNI, calcineurin inhibitor toxicity; Ab, antibody.

Of note, in each cluster, ≥29% of recipients were C4d-positive (central lab); also, in each, ≥22% demonstrated DSA (Table 7). Cluster 1, the group with the best postbiopsy outcome, had the fewest biopsies expressing C4d and the lowest percent of recipients with circulating DSA. Time from transplant to for-cause biopsy differed by cluster, with Cluster 2 being the shortest and Clusters 5 and 6 the longest. When adjusted for time from transplant to biopsy, postbiopsy graft survival differences between clusters remained significant (LR χ² = 21.044, 5 df, p = 0.0008). Serum creatinine level at the time of biopsy also differed by cluster (Table 3).

Table 7.  Characteristics at biopsy for each of six computer-generated clusters
Characteristic                 Cluster 1   Cluster 2   Cluster 3   Cluster 4   Cluster 5    Cluster 6
                               (N = 94)    (N = 40)    (N = 49)    (N = 14)    (N = 29)     (N = 14)
Time from tx (mo.)¹            85 ± 65     53 ± 52*    71 ± 53     58 ± 32     134 ± 104*   126 ± 78*
C4d positive (%)               29          50*         49*         50          36           58*
DSA positive (%)               22          42*         63*         55*         54*          54*
PCR ≥60 mg/g (%)               19          35          51          50          55           50

¹Values are mean ± standard deviation.
tx, transplantation; DSA, donor specific antibodies; PCR, urine protein to creatinine ratio.
*p < 0.05 vs. Cluster 1.

Discussion

Historically, most kidney transplant recipients with slow deterioration of graft function either do not undergo allograft biopsy, or have a biopsy late in their course. Biopsies commonly show some degree of IF/TA, with no discernible cause. Graft dysfunction is attributed to ‘chronic rejection’, ‘CAN’ or IF/TA. These nonspecific terms provide no insight into potential pathophysiologic mechanisms, and as a result may give rise to therapeutic nihilism. The 2005 Banff conference on allograft pathology recommended: (a) elimination of the term ‘CAN’ as a description or diagnosis; (b) use of specific diagnoses; and (c) reserving the terminology of ‘interstitial fibrosis and tubular atrophy, no evidence of specific etiology’ (IF/TA) for biopsies in which no specific diagnosis could be identified (1). Yet a PubMed search of the literature since 2007 reveals hundreds of articles continuing to describe ‘CAN’. In a previous analysis of our data, we showed that, in patients with late posttransplant, new-onset graft dysfunction, the local pathologists’ diagnosis of CAN (vs. no CAN) was of no prognostic significance (3).

The goal of the DeKAF study is to identify subgroups (ideally, with individual diagnoses) within the nonspecific categories noted earlier. If clinically meaningful subgroups are defined, it should be possible to postulate pathogenesis and develop intervention trials for each. For our analyses, we excluded biopsies with specific diagnoses (e.g. polyomavirus nephropathy, recurrent disease) for which trials could be initiated today (if promising therapies were available).

Of the cohort studied, all with new-onset late graft dysfunction, 100% had tubular atrophy (Banff ct > 0) and the vast majority had interstitial fibrosis (Banff ci > 0), consistent with the nonspecific diagnoses of IF/TA or CAN. Yet, in these same patients, our cluster analysis, based solely on histologic scoring, distinguished subgroups associated with different outcomes. If validated, these findings will define a new way of characterizing late allograft injury, one that elucidates meaningful histologic, and ultimately clinical criteria which will allow development of clinical trials of interventions to slow or reverse deterioration of function. Eventually, segregation of vague diagnoses like chronic rejection, CAN, or CNI toxicity into discrete diagnostic entities should also allow more precise evaluation of biomarkers.

This is the first demonstration that recipients with new-onset, late graft dysfunction can be grouped into different entities with different prognoses. Unlike the local pathologists’ diagnoses (3), the clusters that emerged from our analysis differed not only in pathological features, but also in outcomes. Given that the local pathologists are part of our study, it is likely that future local readings will be influenced by the emerging results of the study, with greater homogeneity between local and central pathology biopsy interpretations. Thus, there is a compelling reason to publish these preliminary observations at this time, before local pathological interpretation itself changes as a result of the DeKAF study.

It is possible to describe the makeup (and speculate on the pathogenesis) of the individual clusters (Figure 3). For example, Cluster 1 has no evidence of inflammation and has mild interstitial fibrosis and tubular atrophy (low Banff ci and ct scores). Cluster 1 is associated with good graft survival (Figure 4), and might not require intervention therapy. Cluster 2, in addition to interstitial fibrosis and atrophy, has significant inflammation (reflected by Banff i and t scores) and more allograft failure; additional anti-inflammatory interventions might be most appropriate in these patients. Clusters 1 and 2 are both large and have been robust in our analyses, i.e. they have changed little when changing the variables that are included in the clustering analysis (data not shown). The other cluster that has been robust, although smaller, is Cluster 6 (Figure 3), which consists of 14 recipients whose biopsies show a combination of inflammation, severe interstitial fibrosis and tubular atrophy (high Banff ci and ct scores), and evidence of vascular damage (high Banff cv, ah and mm scores). Recipients in this cluster have had poor graft survival (Figure 4). Electron microscopy might further differentiate subgroups, but to do EM evaluation of all late posttransplant biopsies would add considerable expense (14,15).

Overall, 39% of our biopsies were C4d positive (central lab) and 45% of biopsied recipients had circulating DSA (central lab). This is higher than reported in most previous studies (16,17). However, our patients have had their transplants longer and our cohort does not include subjects with stable allograft function. Hidalgo et al. have recently reported that 37% of recipients biopsied for dysfunction (between 7 days and 31 years posttransplant) had circulating DSA (18). It is also interesting that the presence of DSA and the finding of positive C4d staining do not appear to be associated with a unique histologic signature in the cluster analysis (Table 7), suggesting that the hypothesis that most cases of late graft dysfunction are due to DSA may be too simplistic. Indeed, the high frequency of DSA in these patients makes it difficult to determine when DSAs are pathogenic or innocent bystanders, although longer follow-up and more intense analysis may distinguish the relative importance of antibody-mediated injury, and incorporation of antibody status into the clusters themselves may provide additional insight.

To date, our cluster analyses have been based solely on Banff scores plus the additional score of tatr. Future analyses will include other information available at the time of development of graft dysfunction, including C4d staining, DSA (including antibody strength), characteristics of the infiltrate, and clinical factors such as donor source and age, and history of previous rejection episodes. Our expectation is that this will sharpen both the differentiation of clusters and the association of clusters with prognosis.

Analyses of protocol biopsies have shown that recipients with fibrosis and no inflammation (similar to our Cluster 1) have an excellent long-term prognosis, whereas those with both fibrosis and inflammation have worse outcomes (19–21). Our patients differ in that the biopsies were done for new-onset, late allograft dysfunction and our clusters differ in fibrosis (Banff 1, 2 or 3 scores), inflammation, and other histologic features. As noted in Figure 4, clusters with fibrosis plus inflammation (Clusters 2, 3, 4 and 6) differed in outcome; in addition, clusters with fibrosis and no inflammation (Clusters 1 and 5) differed in outcome.

There are limitations to this study. It will be necessary to confirm these clusters and their associations in independent cohorts. The fact that Banff scoring and the data used in the cluster analyses are categorical may be obscuring significant relationships attributable to degrees of severity. Future analyses with continuous variables for different pathological scores may produce different results. In addition, our mean follow-up is relatively short and the number of death-censored graft losses in each cluster is small (Table 5). Longer follow-up may further differentiate outcomes among clusters. Another limitation is the lack of donor kidney biopsies. Lehtonen et al. have noted the importance of donor biopsy in long-term follow-up studies (22), and we have noted that donor pathology can be misinterpreted as chronic rejection or chronic allograft nephropathy in recipients (23). Future clustering analyses, with inclusion of variables such as donor source and donor age may further help distinguish phenotypes.

Outcomes after biopsy differed by cluster, but local treatments may have affected outcomes. As enrollment increases, we will be able to study whether or not any specific treatment regimen had a significant effect. However, our primary goal was not to show that these clusters have different outcomes; our goal is to differentiate subgroups. It may be that two subgroups have exactly the same outcome (with current treatment), but would be candidates for two completely different intervention trials.

Our study is based on observations at one point in time and on a single biopsy in each recipient. It will be necessary to demonstrate that, based on a single biopsy, a recipient can be reproducibly characterized into a specific cluster. Recipients may subsequently develop other causes of graft dysfunction (e.g. noncompliance, recurrent disease). Similarly, given that time from transplant to biopsy differed by cluster, it is possible that we are observing different stages of a process in evolution. However, the time from transplant to biopsy was similar for Clusters 1 to 4, and for Clusters 5 and 6. Another limitation is that because all patients were biopsied for graft deterioration, it is possible that we have not included patients with histologic lesions, but without long-term deterioration. More importantly, it is also possible that we failed to detect early lesions in some patients (14,19,24). However, interventions need to begin at a specific point in time, and our goal in DeKAF is to determine if we can distinguish subgroups when new-onset, late graft dysfunction is first identified, so that intervention trials could be developed. The study is clinically relevant, since in most centers a late posttransplant biopsy is not done until the creatinine increases or proteinuria develops.

Although our results require validation, they suggest that it is possible to distinguish individual clusters associated with different clinical outcomes. As the numbers of recipients and duration of follow-up grow, we anticipate increasingly robust clusters with distinct clinico-pathological identities and predictability of outcomes. This will permit the design of intervention trials with well-defined, reproducible entry criteria, expected rates of outcomes and accurate sample size estimations. Future analyses will expand to incorporate additional clinical and laboratory-based parameters into the current histopathologically based clusters; we anticipate that these additional parameters will lead to even better definition of specific pathophysiologic processes that cause late kidney allograft failure.

Acknowledgments

We would like to thank our local pathologists (William Cook, Lynn Cornell, Gretchen Crary, Ian Gibson, Donna Lager, Ramesh Nair, Behzad Najafian, Kim Solez), who are playing a critical role in this study. We also thank Stephanie Daily for her help in preparation of the manuscript.

Supporting Information

APPENDIX FOR WEB

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

Filename                  Format   Size   Description
AJT_2943_sm_Suppmat.doc            74K    Supporting info item