A data‐driven approach to complement the A/T/(N) classification system using CSF biomarkers

Abstract
Aims: The AT(N) classification system not only improved the biological characterization of Alzheimer's disease (AD) but also raised challenges for its clinical application. Unbiased, data‐driven techniques such as clustering may help optimize it, rendering informative categories on biomarkers' values.
Methods: We compared the diagnostic and prognostic abilities of CSF biomarkers clustering results against their AT(N) classification. We studied clinical (patients from our center) and research (Alzheimer's Disease Neuroimaging Initiative) cohorts. The studied CSF biomarkers included Aβ(1–42), Aβ(1–42)/Aβ(1–40) ratio, tTau, and pTau.
Results: The optimal solution yielded three clusters in both cohorts, significantly different in diagnosis, AT(N) classification, values distribution, and survival. We defined these three CSF groups as (i) non‐defined or unrelated to AD, (ii) early stages and/or more delayed risk of conversion to dementia, and (iii) more severe cognitive impairment subjects with faster progression to dementia.
Conclusion: We propose this data‐driven three‐group classification as a meaningful and straightforward approach to evaluating the risk of conversion to dementia, complementary to the AT(N) system classification.


| INTRODUCTION
Alzheimer's disease (AD) is the most common cause of dementia. 1 Clinical and epidemiological evidence indicates that AD-related pathological changes occur decades before the onset of clinical symptoms. 2 Identifying patients at risk in stages such as mild cognitive impairment (MCI) is essential to provide a critical time window for early clinical management, treatment, care planning, and design of clinical trials. 3,4 Currently, early and differential AD diagnosis is performed by assessing different biomarkers related to the biological processes underlying the disease, such as β-amyloid accumulation and neurofibrillary tangle formation.

Different modalities of AD biomarkers have been implemented
[7] Advances in the study of biomarkers have promoted a redefinition of Alzheimer's disease (AD) as a biological construct involving different pathological pathways and systemic processes. In 2016, Jack et al. 6 proposed a new classification scheme based on biomarkers for three dimensions: AT(N). Under this definition, the AT(N) classification system recognizes three general groups of biomarkers for AD: biomarkers of β-amyloid plaques (A), fibrillar tau (T), and neurodegeneration or neuronal injury (N). 6 Amyloid positron emission tomography (PET), cerebrospinal fluid (CSF) Aβ1-42, and the Aβ1-42/Aβ1-40 ratio correspond to the "A" category; tau-PET and CSF phosphorylated Tau (pTau) correspond to the "T" category; and 18F-fluorodeoxyglucose-PET (FDG-PET), structural brain MRI, and CSF total Tau (tTau) are considered markers of the "N" category.
Thus, the AT(N) classification scheme is presented as an unbiased descriptive system that can be applied to all patients without suspected AD cognitive symptoms.
Despite its wide use in recent years, some studies have raised issues associated with the AT(N) classification system. One study 8 showed that the AT(N) system was inconsistent and highly dependent on the biomarkers used and the stages of the disease, even when using several cut-off points. This work also found a very low correlation between the distinct biomarkers in the whole sample set and across the AD continuum stages, except for CSF pTau and tTau. 8 In addition, another work 9 found a large number of AT(N) profiles when applying the system in a memory clinic, but with substantial overlap in baseline characteristics and patterns of cognitive decline. 9 There are still a few challenges in applying the AT(N) system to the clinic. We hypothesized that applying unbiased, data-driven techniques such as machine learning (ML) algorithms might help optimize it. [11][12] Within ML methods, a large group is known as "unsupervised classification" techniques, which includes clustering algorithms. These algorithms can find patterns within the data to identify new groups or clusters. Although unsupervised classification techniques are less common than supervised classification (prediction of output from input data), clustering applications in a clinical context could discover new subgroups or types of patients. The evaluation of newly discovered clusters offers new knowledge about how the disease occurs, especially when applied in the complex context of highly heterogeneous diseases such as AD. 13 These clustering techniques can be applied to obtain categories based on the values of different biomarkers, similar to what is done by the AT(N) system.
One of the main goals of the AT(N) classification system is to get insight into the evolution of the biomarkers in AD and eventually be able to make an early (even preclinical) biomarker-based diagnosis. Accordingly, we set a very similar goal from the point of view of a data-driven strategy such as clustering. In this work, we present a clustering analysis of the CSF biomarkers representative of the AT(N) system: Aβ1-42, Aβ1-42/Aβ1-40 ratio, pTau, and tTau.
The main objective of our study was to compare the clustering results over all available CSF biomarkers against the AT(N) system.
Therefore, we evaluated the diagnostic and prognostic capacity of the clusters and interpreted their significance from the point of view of AD progression. Finally, we validated our results in two cohorts of AD patients: the Alzheimer's Disease Neuroimaging Initiative (ADNI) and data from our center, representing research and clinical practice settings, respectively.

| Datasets description
We used two data sources: (i) data coming from the San Carlos Clinical Hospital in Madrid, Spain (named HCSC dataset throughout this work), and (ii) data coming from the Alzheimer's Disease Neuroimaging Initiative (ADNI) (adni.loni.usc.edu) database. In ADNI and HCSC data, we queried for cerebrospinal fluid (CSF) biomarkers measurements, sociodemographic information, and clinical diagnoses. CSF measurements included Aβ(1-42) (raw value and ratio with Aβ(1-40); this last one only in the case of the HCSC dataset), phospho-tau (pTau), and total tau (tTau). Table 1 shows the main ing to diagnostic criteria. 14 For more information about the neuropsychological protocol, CSF samples acquisition and analysis, and diagnostic criteria employed in the HCSC dataset, please refer to Appendix S1: Supplementary Information 1.1 and 1.2. Diagnostic criteria are depicted in Table S1.
The ADNI database was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD, and has as its primary goal to test the combination of several neuroimaging techniques, biological markers, and clinical and neuropsychological evaluations to assess the progression of mild cognitive impairment (MCI) and early AD. We obtained data from the key table "ADNIMERGE," which contains a summary of essential features and measurements of subjects enrolled in ADNI distributed in medical visits. We selected only patients from the ADNIMERGE table whose diagnosis was available for 5 years or more (n = 625). Next, we discarded patients who reversed diagnosis during the follow-up (e.g., MCI to CN or dementia to MCI) or started with a dementia diagnosis (n = 581). Lastly, we selected patients with available CSF biomarkers measurements at the baseline visit (t = 0). As a result, 419 patients remained for this analysis. Patients were categorized into cognitively normal (CN) (n = 159), subjective memory complaints (SMC) (n = 50), early mild cognitive impairment (EMCI) (n = 111), and late MCI (LMCI) (n = 99). Please refer to Table S2 for ADNI diagnosis criteria.

| Proposed generalization analysis
We trained two different KMeans models for HCSC and ADNI cohorts because the latter did not include the amyloid ratio present in HCSC. However, to properly generalize the three clusters obtained and discussed in this work regarding CSF biomarkers, we evaluated the clustering results using another smaller cohort from ADNI that included data of the Aβ(1-42)/Aβ(1-40) ratio. For selected patients in this ADNI cohort, we followed the steps described in the first section of Materials and Methods. We employed the HCSC cohort as the discovery cohort (n = 165) and the mentioned ADNI cohort as the validation cohort (n = 174). Next, we performed the following steps. First, with the trained clustering model on the discovery cohort, we labeled the discovery cohort according to cluster assignments. Second, we trained a logistic regression model on biomarkers values in the discovery cohort, using cluster assignments as the classification labels. Finally, we evaluated this classification model on the discovery cohort, obtaining classification performance metrics such as accuracy, the area under the ROC curve (AUC), and Matthews correlation coefficient (MCC).
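As a reference for two of the metrics named above, the following is a minimal sketch of accuracy and the multiclass Matthews correlation coefficient computed from reference and predicted cluster labels. The function names and the toy labels are illustrative assumptions and are not taken from the study's code.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from label sequences."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)  # how often each class occurs in the reference
    pred_counts = Counter(y_pred)  # how often each class is predicted
    classes = set(true_counts) | set(pred_counts)
    cov_tp = correct * n - sum(true_counts[k] * pred_counts[k] for k in classes)
    cov_tt = n * n - sum(true_counts[k] ** 2 for k in classes)
    cov_pp = n * n - sum(pred_counts[k] ** 2 for k in classes)
    denom = sqrt(cov_tt * cov_pp)
    # Convention: return 0.0 when the coefficient is undefined (zero denominator).
    return cov_tp / denom if denom else 0.0


# Toy labels for three clusters (hypothetical, not study data).
reference = [0, 0, 1, 1, 2, 2]
predicted = [0, 0, 1, 1, 2, 1]
toy_accuracy = accuracy(reference, predicted)
toy_mcc = multiclass_mcc(reference, predicted)
```

Unlike accuracy, the MCC accounts for how the errors are spread across classes, which is one reason it is often reported alongside accuracy when cluster sizes are unbalanced.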

| CSF biomarkers clustering
We performed a clustering analysis through KMeans with Euclidean distance. For this, we used as input all the available biomarkers measurements combined and obtained results in the HCSC and ADNI datasets. The biomarkers values were scaled to unit standard deviation. We implemented standardization and KMeans analyses using Scikit-Learn v.1.1.2. 18 The KMeans algorithm needs the number of clusters as input. This number was set from 2 to 10 clusters to evaluate the quality of different clustering solutions using the silhouette index (SI) score. This score is based on the intra- and intercluster distance, that is, the cohesion and separation of clusters, respectively. This value ranges from −1 to 1, where 1 represents a perfectly clustered dataset. Moreover, we described each cluster's diagnosis and AT(N) categories distributions. We also described the clinical characteristics of all clusters and compared them using ANOVA, considering significant p-values < 0.05.
TABLE 1 Sample description of the HCSC and ADNI datasets.
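A minimal sketch of the cluster-number selection described in this section, assuming scikit-learn and NumPy are available. The synthetic three-column data, the helper name `silhouette_by_k`, and the `n_init` and `random_state` settings are assumptions made for illustration; they are not the study's data or configuration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def silhouette_by_k(values, k_range=range(2, 11), seed=0):
    """Return {k: silhouette score} for KMeans fitted on standardized values."""
    scaled = StandardScaler().fit_transform(values)  # zero mean, unit variance
    scores = {}
    for k in k_range:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(scaled)
        scores[k] = silhouette_score(scaled, labels, metric="euclidean")
    return scores


# Synthetic stand-in for three biomarker columns (NOT study data):
# three well-separated groups of 50 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 0.0], [0.0, 10.0, 10.0]])
values = np.vstack([c + rng.normal(scale=1.0, size=(50, 3)) for c in centers])

scores = silhouette_by_k(values)
best_k = max(scores, key=scores.get)  # k with the highest silhouette score
```

On real biomarker data the scores for neighbouring values of k can be close, so the full set of scores is worth inspecting rather than only the maximum.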

| Dementia progression evaluation in ADNI
We evaluated the conversion to dementia probabilities of the newly

| Ethical statement
The study was conducted with the approval of our hospital Ethics Committee and participants (or their legally authorized representative) gave written informed consent.

| Biomarkers clustering in HCSC and ADNI datasets
We obtained clusters for all CSF biomarkers combined in both HCSC and ADNI datasets. In the case of the HCSC dataset, we evaluated two datasets with different amyloid status representatives: one using Aβ(1-42) and another using Aβ(1-42)/Aβ(1-40) ratio. Tables S3 and S4 list the SI values obtained for each solution for HCSC and ADNI datasets, respectively.
The solutions that obtained the highest SI values in the HCSC
Lastly, to properly generalize these three proposed groups, we evaluated the clustering results obtained in HCSC (discovery cohort) with another smaller cohort from ADNI (validation). The classification performance results were the following: 90.80% accuracy, 90.81% F1 score, and 84.43% MCC. This indicates that the three-cluster solution existed in both cohorts and that the proposed CSF categories are generalizable.

| Clusters description in HCSC and ADNI datasets
Due to the unsupervised nature of clustering algorithms, it is necessary to analyze the solutions obtained from a clinical point of view beyond the score obtained from a computational perspective. We started by describing the diagnosis signature of each obtained cluster.
Figure 1 shows within-cluster distribution of diagnoses in the HCSC and ADNI datasets.
The first cluster (Cluster 0) obtained in HCSC (Figure 1A) con-
However, compared to HCSC, there were more A− subjects with positive TN (28.57%). Cluster 1 mainly comprised A+ subjects (68.32%), a significantly lower percentage of A+ than Cluster 1 in the HCSC dataset (92.18%). Cluster 1 also contained around 30% of A− subjects. In Cluster 2, most subjects were positive for all biomarkers (87.95%), and a small percentage (12.05%) corresponded to A−T+(N)+ subjects. Finally, it is noteworthy that in HCSC and ADNI, we did not find A−T−(N)+ or A+T−(N)+ subjects, respectively.
Continuing with the description of the clusters obtained by combining all biomarkers, we evaluated the main characteristics of each cluster, including CSF biomarkers, sex, age at baseline, education years, and MMSE score at baseline (depicted in Table 2).
Concerning the mean values of the biomarkers, Figure 3 shows scatterplots of the CSF biomarker values to compare their distribution in the two datasets evaluated, according to the three clusters obtained with all biomarkers.
In both datasets, amyloid biomarkers were the most clearly defined differentiation between Cluster 0 versus Clusters 1 and 2. The mean values of the respective amyloid biomarkers shown in Table 2 also reflected this result, such as Cluster 0 > Cluster 1 > Cluster 2.
In addition, regarding both datasets, the differentiation between Clusters 1 and 2 was defined by pTau or tTau. Figure 3 shows that the correlation between pTau and tTau in both datasets was very high, reflected in their respective representation against Aβ(1-42), which was practically the same.
However, the clustering results between datasets showed differences in the biomarkers' values distribution. Firstly, the values distribution of Aβ(1-42) against both pTau and tTau differed between datasets. In the case of HCSC, this distribution more closely resembled an "L" shape (Figure 3A), whereas ADNI dataset biomarkers values were more dispersed (Figure 3B). Secondly, related to
On the other hand, regarding ADNI, pTau against tTau scatterplot (Figure 3B) and

| Survival analysis of each biomarker clustering in ADNI
Finally, we conducted survival analyses in ADNI to assess whether the new clusters showed significant differences in dementia progression compared to the AT(N) categories.Figure 4 shows the Kaplan-Meier curves of the clusters and the hazard ratios (HR) and p-values obtained.
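For readers unfamiliar with the Kaplan-Meier estimator behind these curves, the following is a minimal sketch in plain Python. The function name and the toy follow-up data are assumptions for illustration; the sketch covers only the survival curve and not the covariate-adjusted Cox models used to obtain the hazard ratios.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  : follow-up time per subject (e.g. years from baseline)
    events : 1 if conversion to dementia was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        c = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 0)
        if d:
            # Multiply by the conditional probability of not converting at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t leave the risk set afterwards.
        at_risk -= d + c
    return curve


# Toy follow-up data (hypothetical, not study data).
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Computing one such curve per cluster, or per AT(N) category, gives the kind of comparison displayed in a Kaplan-Meier plot.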
The three clusters in ADNI showed significant (p-values <0.005) differences regarding progression to dementia, even between Clusters 1 and 2 (Figure 4A).Cluster 2 represented the most affected, with an increased risk of dementia of 3.64 and 11.51 compared to Clusters 0 and 1, respectively.Regarding the categories of the AT(N) system (Figure 4B), Table S5 shows

A+T+[N+]
). The remaining AT(N) pairs overlapped and were not significantly different for the progression to dementia.
We evaluated how the information of all the biomarkers combined could result in different clustering solutions in two cohorts.Lastly, we compared the optimal clustering solutions against the currently employed AT(N) system categories to assess differences between both methods.
Importantly, two main methodological differences between the two cohorts employed need to be discussed. The first is the amyloid representative: Aβ(1-42)/Aβ(1-40) and Aβ(1-42) in the HCSC and ADNI cohorts, respectively. The ratio better differentiated patients with AD and, as in the AT(N) system, appeared to be a more appropriate biomarker to carry out unsupervised strategies for patient stratification. The second is how patients were recruited in each cohort. The HCSC cohort was obtained from clinical practice, in which the majority of subjects included were patients who came with at least subjective memory complaints or cognitive impairment and were therefore expected to be slightly more advanced in the disease than the ADNI cohort. In contrast, the ADNI cohort is a research cohort in which many more CN subjects appear at the initial visit since it aims to represent the potential deterioration of these patients in a longitudinal study. Despite these methodological differences, we believe that the combination of both cohorts and the replication of the findings in both settings was important to confirm the external validity of the presented study.
One of the key points in this work, compared with the AT(N) system, is that cut-off points are established separately for each biomarker in the latter.Then, joint categories are created without taking into account the interactions that may exist between biomarkers.
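The independent dichotomization described above can be sketched as follows. The cut-off values are hypothetical placeholders (real cut-offs are assay- and cohort-specific), and the function name `atn_profile` is an assumption for illustration. The direction of abnormality follows the usual convention for CSF: low amyloid, high pTau, and high tTau are abnormal.

```python
def atn_profile(abeta42, ptau, ttau, cutoffs):
    """Build an AT(N) label by dichotomizing each CSF biomarker independently."""
    a = "+" if abeta42 < cutoffs["abeta42"] else "-"  # low CSF amyloid is abnormal
    t = "+" if ptau > cutoffs["ptau"] else "-"        # high pTau is abnormal
    n = "+" if ttau > cutoffs["ttau"] else "-"        # high tTau is abnormal
    return f"A{a}T{t}(N{n})"


# Hypothetical cut-offs for illustration only; not the values used in the study.
EXAMPLE_CUTOFFS = {"abeta42": 1000.0, "ptau": 25.0, "ttau": 300.0}

# Two subjects with nearly identical values end up in opposite categories,
# because each threshold is applied without regard to the other biomarkers.
subject_a = atn_profile(999.0, 26.0, 301.0, EXAMPLE_CUTOFFS)
subject_b = atn_profile(1001.0, 24.0, 299.0, EXAMPLE_CUTOFFS)
```

The toy pair illustrates the point made in the text: values that sit just on either side of each threshold are treated as categorically different, whereas a distance-based method would place them close together.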
This dichotomization of biomarkers in the AT(N) system shows inconsistencies when combining different biomarkers 8 and loses prognostic ability compared to using them on a continuous basis. 21 Moreover, clustering techniques like the one applied in this work would not require determining the cut-off points using traditional methods. [23][24] Moreover, it considers the mentioned interaction between biomarkers. Thus, clustering techniques could be useful to obtain valid groups for different cohorts, reducing the variability across centers and the limitations of using cut-off points from other populations [25][26][27] and with a similar computational cost to the AT(N) system.
S5.
of the clinical diagnoses (Figure 1), AT(N) categories (Figure 2), biomarker values distribution (Table 2 and Figure 3), and survival (Figure 4). Accordingly, we propose three categories for the AD diagnosis based on CSF biomarker values combined: (i) not affected by AD, (ii) AD in a very early stage or slow progression, and (iii) AD in a more advanced stage or more rapid progression. Our results are consistent with another clustering work 28 in an independent cohort of 151 patients with AD, in which a group with higher levels of Tau and lower amyloid levels was associated with worse clinical outcomes over time, including faster cognitive decline and higher mortality. The results showed amyloid categorization is dichotomous, as proposed by the AT(N) system. 6 Furthermore, as other works have previously suggested, 29 a much clearer and cleaner binary categorization of patients happens when using Aβ(1-42)/Aβ(1-40) ratio rather than Aβ(1-42) alone. Therefore, it is advisable to use the former whenever possible. Moreover, we found that Tau biomarkers are not dichotomous but rather continuous, significantly different among the three clusters obtained in the two cohorts (Table 2). This fact was also reflected in the SI scores obtained for these biomarkers alone (Tables S3 and S4), which were very similar, probably making tTau a "redundant" biomarker when using CSF biomarkers in the context of AD.
A counterintuitive result is that Tau is a stronger predictor for HCSC than ADNI (Figure 3).One potential first reason could be the different amyloid representatives employed in each case.The amyloid ratio value in the HCSC cohort seemed to provide more information for the characterization of Cluster 0 versus Clusters 1 and 2.
Then, HCSC's Clusters 1 and 2 could be better characterized using tau values, acquiring more importance in the proposed classification. This fact most likely caused the clusters not to be defined in the same way in both scenarios, generating the phenomenon that can be observed in Figure 3. Second, and importantly, we found that Aβ(1-42) presented many saturated points at value 1700 in the ADNI cohort. Another work performed on ADNI included these saturated points as extrapolated values, 17 resulting in a distribution much more similar to the one presented for HCSC in Figure 3. We did not have access to these extrapolated values and directly replaced them with the value of 1700, which clearly affected the clustering results. When we performed the clustering after eliminating the patients presenting this extreme value, we obtained results much more similar to those obtained in the case of HCSC, with pTau and tTau taking higher values according to the cluster progression (Table S6).
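The sensitivity check mentioned here, re-running the clustering without subjects at the saturation value, amounts to a simple filter before clustering. The sketch below assumes records stored as dictionaries with a hypothetical `abeta42` key; only the 1700 limit comes from the text.

```python
SATURATION_VALUE = 1700.0  # saturation value reported in the text for Aβ(1-42) in ADNI


def drop_saturated(records, key="abeta42", limit=SATURATION_VALUE):
    """Keep only records whose amyloid value lies below the saturation limit."""
    return [r for r in records if r[key] < limit]


# Toy records (hypothetical values, not study data).
toy_records = [
    {"id": "s1", "abeta42": 850.0},
    {"id": "s2", "abeta42": 1700.0},  # saturated measurement
    {"id": "s3", "abeta42": 1320.0},
]
kept = drop_saturated(toy_records)
```

Dropping saturated subjects trades sample size for a less distorted amyloid distribution, so the resulting clusters are best compared side by side with those obtained on the full cohort.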
Importantly, we performed a clustering-then-classification analysis to corroborate the three clusters found in both cohorts.This classification model developed is the one we present for classifying new patients in the proposed CSF categories.In the future, it will be necessary to train it with a larger number of samples, evaluating and comparing the results of more and different cohorts.Moreover, the great classification performance obtained (more than 90% accuracy) corroborated the presence of three clusters in both the HCSC cohort (discovery dataset) and in another ADNI cohort that included the amyloid ratio value (validation dataset).Furthermore, and notably, the original ADNI cohort, which did not contain the amyloid ratio, also showed grouping in three similar clusters, which were significantly different regarding their survival.
The survival analysis performed on clustering results from ADNI (Figure 4) supports clustering as a data-driven tool for the prognosis of AD. The three obtained clusters showed a significant difference in the progression to dementia. Conversely, when comparing survival between different AT(N) categories, we found that only 33% of the pairwise comparisons between AT(N) categories differed significantly in the progression to dementia. The remaining AT(N) categories were mixed in survival, similar to recent studies. 30 Overall, our findings suggest that, when employing CSF biomarkers, it may not be necessary to use a fine-grained categorization such as the one the AT(N) system proposes. In addition, the AT(N) categories, because they are assigned independently and separately, may lose information on the relationships between the biomarkers, which is relevant as Tau and amyloid represent key linked pathophysiological processes.
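The 33% figure follows from simple counting: eight AT(N) profiles are possible, the text reports no A+T−(N+) subjects in ADNI, and the seven remaining categories give 21 unordered pairs, of which seven were reported as significant. A minimal sketch of that arithmetic (variable names are illustrative):

```python
from itertools import combinations, product

# All 2^3 = 8 possible AT(N) profiles.
profiles = [f"A{a}T{t}(N{n})" for a, t, n in product("-+", repeat=3)]

# The text reports no A+T-(N+) subjects in ADNI, leaving 7 observed categories.
observed = [p for p in profiles if p != "A+T-(N+)"]

# Every unordered pair of observed categories is one survival comparison.
pairs = list(combinations(observed, 2))

n_significant = 7  # number of significant comparisons reported in the text
share_significant = n_significant / len(pairs)
```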
Our study presents some limitations. First, survival analysis was only evaluated on the ADNI cohort because data from HCSC had a shorter follow-up time. Second, we only evaluated a single clustering algorithm, KMeans, as it is a standard and widely used method, especially using low-dimensionality (few input variables) datasets. In the future, it would be interesting to corroborate the results obtained with clustering methods that also consider non-linear relationships among the different biomarkers and explore whether more clinically relevant clusters appear. Another clustering work using CSF biomarkers applied a Gaussian mixture modeling algorithm, which yielded six clusters. 31 However, this work included more non-AD pathologies and did not report survival differences between the clusters. Third, we only focused on certain CSF biomarkers to increase the applicability of our findings in clinical settings, where biomarkers from multiple sources are generally absent. Other works suggest that including more biomarkers, such as MRI measures 32 or additional CSF biomarkers, 33 will lead to more clusters. Recent studies have proposed to split the "T" category of the AT(N) system into CSF pTau and tau-PET. 34 In this regard, combining CSF with other blood biomarkers and MRI and PET neuroimaging would be of interest in future studies using unsupervised machine learning
demographic and clinical characteristics of the HCSC and ADNI datasets. The HCSC cohort data included 165 patients consulting for memory loss recruited from the Department of Neurology between December 2018 and December 2021. All patients were Spaniards and were native Spanish speakers. All patients were examined with a comprehensive neuropsychological protocol, structural neuroimaging (CT or MRI), and lumbar puncture for CSF biomarkers. Patients were categorized, according to the results from the neuropsychological assessment, longitudinal follow-up, and results from cerebrospinal fluid biomarkers, as follows: subjective memory complaints (SMC) (n = 43), early mild cognitive impairment due to AD (EMCI) (n = 26), late MCI due to AD (LMCI) (n = 63), and MCI without evidence of neurodegeneration (MCI-NN) (n = 33). Suspected non-Alzheimer's neurodegenerative disease patients (frontotemporal dementia and Lewy body disease) were excluded. Patients were considered AD when the diagnosis was supported by altered levels of Aβ(1-42) or altered Aβ(1-42)/Aβ(1-40) ratio and pTau in CSF analysis accord-

dataset were k = 3 clusters and k = 2 clusters for combining all biomarkers using the Aβ(1-42)/Aβ(1-40) ratio and Aβ(1-42) as amyloid representatives, respectively. Given that the clustering solution using the Aβ(1-42)/Aβ(1-40) ratio gave significantly better SI scores than the Aβ(1-42) one, we decided to evaluate the first one in the HCSC dataset. In the case of ADNI, SI scores did not clearly show whether 2, 3, or 4 clusters were the best solution. Therefore, we decided to explore the three-cluster solution of the ADNI dataset, evaluate the differences between cohorts, and evaluate the use of the ratio against the use of Aβ(1-42) jointly with other biomarkers. Accordingly, we next describe the clustering results for k = 3 using the Aβ(1-42)/Aβ(1-40) ratio in the HCSC cohort and Aβ(1-42) in the ADNI cohort.
primarily SMC and MCI-NN subjects. Clusters 1 and 2 presented the majority of MCI patients. Cluster 1 represents an intermediate group, with many more MCI cases than Cluster 0 but fewer than Cluster 2. Regarding the ADNI dataset (Figure 1B), we obtained similar results, although with smaller differences between clusters. In ADNI, the obtained intermediate cluster (Cluster 1), compared to HCSC's Cluster 1, was more similar to Cluster 0 in the CN and SMC subjects' distribution. Thus, this intermediate cluster is more advanced in diagnosis in HCSC, while in ADNI, it is less advanced, more similar to Cluster 0. In addition, we noted that the distribution of EMCI cases remained nearly constant in all clusters (around 20%). In contrast, the distributions of LMCI, SMC, and CN changed more markedly among the different clusters. Finally, we observed that using Aβ(1-42)/Aβ(1-40) ratio rather than Aβ(1-42) as amyloid representative in combination with other CSF biomarkers yielded an output that was much more sensitive for Alzheimer's diagnosis (EMCI and LMCI) and more specific for non-Alzheimer's cases (SMC and MCI-NN).
Next, we obtained the within-cluster distribution of the original AT(N) categories (see Section 2.2 for the cut-off values used for each dataset). Figure 2 shows heatmaps of these distributions for all biomarkers clustering results. In addition, Figure 2 shows AT(N) categories in a specific order, according to the AT(N) classification of the NIA-AA 20 : A−T−(N−) (normal AD biomarkers); A−T−(N+) and A−T+(N±) (pathological change not AD type); A+T−(N−) (AD continuum: Alzheimer-like pathological change); A+T+(N±) (AD continuum: AD); and A+T−(N+) (AD continuum: Alzheimer-like pathological change with suspected concomitant pathology). Regarding the HCSC dataset (Figure 2A), Cluster 0 mainly contained A−T−(N−) subjects (90.41%). Cluster 1 consisted almost entirely (92.18%) of A+ subjects, most of which were positive for all biomarkers (A+T+(N)+, 62.50%). This AT(N) category was followed by 20.31% of A+T−(N−) subjects. Finally, Cluster 2 consisted almost completely of A+T+(N)+ patients (96.43%). Regarding the ADNI dataset (Figure 2B), Cluster 0 was similar to Cluster 0 in HCSC. In the case of ADNI, Cluster 0 was formed entirely by A− subjects.

FIGURE 1 Stacked bar plots representing clinical diagnosis distribution (percentage, %) in each cluster. (A) HCSC dataset; (B) ADNI dataset. CN, Cognitively normal; MCI-NN, mild cognitive impairment without neurodegeneration; SMC, subjective memory complaints; EMCI, early mild cognitive impairment; LMCI, late mild cognitive impairment.
FIGURE 2 Heatmaps showing within-cluster distribution (%) of different AT(N) categories in (A) HCSC and (B) ADNI datasets in clusters obtained with all biomarkers. Note that clustering results used different amyloid representatives: HCSC dataset used Aβ(1-42)/Aβ(1-40) ratio, whereas ADNI results used Aβ(1-42) value.
the previous result, the pTau and tTau means between Clusters 1 and 2 differed between datasets. On the one hand, in HCSC, as pTau and tTau values increased, so did their dispersion (or standard deviation). Table 2 also reflects this result, where the mean values and standard deviations of pTau and tTau meet the following progression: Cluster 2 > Cluster 1 > Cluster 0.
(a, b, c) Note: For continuous variables, values represent mean ± standard deviation. p-values computed using ANOVA for continuous variables and the chi-square test for categorical ones. ANOVA with Tukey post-hoc analysis showed statistically significant differences between Clusters 0 and 1 (a), Clusters 0 and 2 (b), and Clusters 1 and 2 (c).
FIGURE 3 Scatterplots of CSF biomarkers values. Different colors represent cluster assignments in (A) HCSC and (B) ADNI datasets.
0 were greater than Cluster 1, following distribution: Cluster 2 > Cluster 0 > Cluster 1.
the HRs and p-values obtained when comparing all AT(N) categories present by pairs (in the absence of A+T−[N+] individuals). Seven of twenty-one (33%) AT(N) comparisons resulted in statistical significance (p-value <0.005). Among these pairs, five of seven (71.43%) showed increased risk when A biomarker was positive (A−T+[N+] vs. A+T+[N−] and A+T+[N+]; A−T−[N−] vs. A+T+[N−], A+T−[N−], and A+T+[N+]). The remaining AT(N) pairs showed an increased risk when T or TN biomarkers were positive in the presence of A+ (A+T+[N−] vs. A+T−[N−]; A+T−[N−] vs.
Interestingly, our findings suggest clustering CSF biomarkers together, especially when using Aβ(1-42)/Aβ(1-40) ratio as the amyloid representative, led to three clusters. These three clusters appeared in the two cohorts and presented differences in the distribution
FIGURE 4 Kaplan-Meier curves and Cox-PH models' results obtained in ADNI for (A) clustering results and (B) AT(N) categories. Cox-PH models results are shown as hazard ratio (upper 95% CI-lower 95% CI); p-value. Models were adjusted using covariates: age at baseline, years of education, and sex. C0: Cluster 0; C1: Cluster 1; C2: Cluster 2.
For significant differences between AT(N) categories, please refer to Table
The first cluster included individuals with high beta-amyloid and low pTau and tTau levels, consisting mostly of individuals with no AD (A−T−[N−]) or, especially in the case of ADNI, subjects with a pathological change not associated with AD (A−T+[N±] or A−T−[N+]). The second cluster consisted of an intermediate or transitional group, with most individuals having a low beta-amyloid biomarker value and high Tau levels. This intermediate group was the one that showed the greatest differences between the two cohorts. In HCSC, the intermediate cluster was more similar to the most affected (Cluster 2), with amyloid positivity but lower tau levels. ADNI's intermediate cluster was also positive for amyloid but showed Tau levels closer to the least affected cluster (Cluster 0). Finally, the third cluster included subjects with very low beta-amyloid values and extremely high pTau and tTau. Nearly all A+T+(N+) subjects were in the third cluster in both datasets.
CN, Cognitively Normal; EMCI, Early Mild Cognitive Impairment due to AD; LMCI, Late Mild Cognitive Impairment; MCI-NN, MCI without evidence of neurodegeneration; MMSE, Mini-Mental State Examination; SMC, Subjective Memory Complaints.
clusters from the ADNI cohort, including subjects with a subjects' sex, age at baseline, and education years.We considered significant p-values less than 0.05.

TABLE 2 Clinical characteristics of clusters obtained in the HCSC and ADNI datasets.
Table 2 showed pTau and tTau values in Cluster