Attenuated cell cycle and DNA damage response transcriptome signatures and overrepresented cell adhesion processes imply accelerated progression in patients with lower‐risk myelodysplastic neoplasms

Patients with myelodysplastic neoplasms (MDS) are classified according to the risk of acute myeloid leukemia transformation. Some lower‐risk MDS patients (LR‐MDS) progress rapidly despite expected good prognosis. Using diagnostic samples, we aimed to uncover the mechanisms of this accelerated progression at the transcriptome level. RNAseq was performed on CD34+ ribodepleted RNA samples from 53 LR‐MDS patients without accelerated progression (stMDS) and 8 who progressed within 20 months (prMDS); 845 genes were differentially expressed (ІlogFCІ > 1, FDR < 0.01) between these groups. stMDS CD34+ cells exhibited transcriptional signatures of actively cycling, megakaryocyte/erythrocyte lineage‐primed progenitors, with upregulation of cell cycle checkpoints and stress pathways, which presumably form a tumor‐suppressing barrier. Conversely, cell cycle, DNA damage response (DDR) and energy metabolism‐related pathways were downregulated in prMDS samples, whereas cell adhesion processes were upregulated. Also, prMDS samples showed high levels of aberrant splicing and global lncRNA expression that may contribute to the attenuation of DDR pathways. We observed overexpression of multiple oncogenes and diminished differentiation in prMDS; the expression of ZEB1 and NEK3, genes not previously associated with MDS prognosis, might serve as potential biomarkers for LR‐MDS progression. Our 19‐gene DDR signature showed a significant predictive power for LR‐MDS progression. In validation samples (stMDS = 3, prMDS = 4), the key markers and signatures retained their significance. Collectively, accelerated progression of LR‐MDS appears to be associated with transcriptome patterns of a quiescent‐like cell state, reduced lineage differentiation and suppressed DDR, inherent to CD34+ cells. The attenuation of DDR‐related gene‐expression signature may refine risk assessment in LR‐MDS patients.

MED/016; UP Young Researcher Grant Competition, Grant/Award Number: JG_2023_016 upregulated.Also, prMDS samples showed high levels of aberrant splicing and global lncRNA expression that may contribute to the attenuation of DDR pathways.We observed overexpression of multiple oncogenes and diminished differentiation in prMDS; the expression of ZEB1 and NEK3, genes not previously associated with MDS prognosis, might serve as potential biomarkers for LR-MDS progression.Our 19-gene DDR signature showed a significant predictive power for LR-MDS progression.In validation samples (stMDS = 3, prMDS = 4), the key markers and signatures retained their significance.Collectively, accelerated progression of LR-MDS appears to be associated with transcriptome patterns of a quiescent-like cell state, reduced lineage differentiation and suppressed DDR, inherent to CD34+ cells.The attenuation of DDR-related geneexpression signature may refine risk assessment in LR-MDS patients.

What's new?
The mechanisms of transformation from myelodysplastic syndrome to acute myeloid leukemia are still unclear, with some lower-risk MDS patients progressing rapidly despite a good prognosis.This comprehensive analysis of the CD34+ cell transcriptome identified several molecular signatures underlying accelerated progression in lower-risk MDS patients.Early progression (within 20 months from diagnosis) appeared to be associated with decreased cell cycle and metabolic activity, downregulated DNA damage response and dysregulated cell adhesion gene expression.
The suppression of CD34+ cell intrinsic DNA damage response seemed to be interconnected with decreased lineage differentiation, and may predict progression in patients with lower-risk MDS.

| INTRODUCTION
Myelodysplastic neoplasms (MDS) are a heterogeneous group of disorders characterized by clonal hematopoiesis.Patients with MDS are classified according to their risk of transformation to acute myeloid leukemia (AML) using international prognostic scoring systems.The mechanisms of disease progression are still unclear, and some lower-risk MDS patients (LR-MDS) progress rapidly despite a good prognosis. 1Identification of patients at risk of rapid progression and early initiation of effective treatment may significantly improve patient outcomes.
In the era of microarrays, a few classifiers and gene signatures connected to a poor prognosis have been introduced for MDS; however, these have not been implemented in clinical practice. 2,3Currently, despite the massive availability of the RNAseq technique, only a limited number of studies have been published on the MDS transcriptome, especially those focusing on classification and risk prediction. 4,5The heterogeneity of MDS makes these studies difficult.
Recently, we have proposed that DNA damage response (DDR) activation forms an intrinsic antitumor barrier in LR-MDS CD34+ cells that counteracts cellular transformation, and this barrier may be disrupted by RUNX1 mutations. 6Here, as an extension, we aimed to analyze the transcriptome of LR-MDS CD34+ cells regardless of their driver mutations, to identify the molecular mechanisms underlying rapid progression.Using RNAseq, we demonstrated that CD34+ cells from diagnosis of LR-MDS with accelerated progression (within 20 months from diagnosis; progressive MDS; prMDS) exhibited distinct pro-malignant gene expression signatures and attenuation of potential intrinsic tumor-suppressing barrier compared to CD34+ cells of stable LR-MDS patients (stMDS).For later confirmation of our findings, we used a set of validation samples.When confirmed by further studies, these signatures may allow to predict the probability of progression for patients with LR-MDS.

| Patients and samples
We enrolled 61 LR-MDS patients according to the IPSS-R 7 (0-3.5 points).Eight patients (13%) progressed within 20 months, while stable MDS patients were monitored at least 20 months from diagnosis.The progression was defined according to the revised criteria of the International Working Group. 8The median age of the cohort was 65 years (range, 23-85 years).The median follow-up period was 52 months (range, 9-208 months).Three patients underwent hematopoietic stem cell (HSC) transplantation (HSCT) after progression and were followed until the date of HSCT for the purposes of this study.The characteristics of the patients are summarized in Table S1A, Validation sample set consisted of 7 LR-MDS patients (median age was 68, range, 54-75 years, Table S2), for whom full data became available after the completion of the discovery cohort analyses.

| Sampling
Bone marrow (BM) CD34+ cells from diagnostic samples were isolated by magnetic separation on an autoMACS Separator (Miltenyi Biotec, Bergisch Gladbach, Germany).RNA was isolated by acid guanidinium thiocyanate-phenol-chloroform extraction and measured with Qubit 3.0 fluorometer (Life Technologies, Carlsbad, CA).RNA quality was checked using an Agilent 2100 bioanalyzer (Agilent Technologies, Palo Alto, CA).Only high-quality RNA samples with an RNA integrity number of at least 7.5 were used.The samples were ribodepleted prior to library preparation using the RiboCop rRNA Depletion Kit (Lexogen, Wien, Austria).Somatic mutations from BM or peripheral blood were detected by TruSight Myeloid Sequencing Panel (Illumina, San Diego, CA). 6

| RNA sequencing and data analysis
The library was prepared with the NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (New England Biolabs, Ipswich, MA) according to the manufacturers' instructions and sequenced on HiSeq 2500 and NovaSeq (Illumina).The sequencing coverage and quality statistics for each sample are summarized in Tables S3A (discovery cohort) and S3B (validation sample set).The data analysis pipeline has been described previously. 6Briefly, the reads were mapped to the human genome GRCh38.p13.R software 4.0.2packages (described in Ref. 6) and GraphPad Prism 7 software (GraphPad Software, La Jolla, CA) were used for analyses and data visualization.Analysis of protein-protein interactions was performed using the online tool String 11.0; gene set enrichment analysis (GSEA) was performed in GSEA software 3.0.Gene Ontology (GO) and Reactome Pathways were used for functional annotation.To remove redundant GO terms, the REVIGO online tool was used. 9Differential alternative splicing events were detected by rMATS 4.1.2.Mutation analysis from RNAseq data is described in the Supplementary Methods in Data S1.

| Immunohistochemistry
BM formalin-fixed paraffin-embedded sections from 9 LR-MS patients (highlighted in Figure S2A) were stained with CD34, ZEB1 and IDH2 antibodies detailed in the Supplementary Methods in Data S1.

| Statistical analysis
Unpaired t tests, Fisher's exact tests and Mann-Whitney tests were used to compare groups to identify differences between two groups and Cox proportional hazard regression was used for multivariate analysis.Kaplan-Meier curves were generated in GraphPad Prism 7 software.The level of statistical significance was set at 0.05 unless indicated otherwise in the text.Data were assumed to be non-normal (tested by Shapiro-Wilk test).The description of statistical approaches used for gene expression signatures and the cumulative incidence analysis are detailed in the Supplementary Methods in Data S1.Before statistical analysis of validation sample set, batch effect between discovery and validation sample sets was treated using limma package in R software.When assessing differential gene expression of candidate markers, these were first tested in the discovery cohort and then combined with validation samples into groups of all stMDS and prMDS analyzed cases for stronger statistical power of the boxplot constructions.Also, all candidate gene expression correlations were first tested in the discovery cohort but are presented in combination with validation samples for stronger associations.Whenever validation cohort samples are present in figures, they are differentiated from discovery cohort by color or shape.Alternative splicing analysis in rMATS, GSEA, gene signature analysis and cumulative incidence analysis were performed only in the discovery cohort.

| Differentially expressed genes between stMDS and prMDS samples
First, using RNAseq on CD34+ ribodepleted RNA, we aimed to uncover possible transcriptome differences between stMDS and prMDS, who progressed within 20 months.The arbitrary 20 months cut-off was used for the separation of patients with accelerated progression, abnormal for LR-MDS group.We hypothesized that there are pre-existing molecular differences in CD34+ cells at diagnosis that might drive LR-MDS progression, and that these differences (transcriptional signatures) become apparent phenotypically within this 20-month period in the patients with such accelerated progression.
prMDS showed a higher level of consistency in their expression profiles (Figure S2A), and in the principal component analysis (PCA), they formed a marginal group of LR-MDS; transcriptome of some prMDS and stMDS samples partially overlapped (Figure S2B).To confirm that we dichotomized the patients based purely on MDS progression without other unrecognized causes of disease severity, we evaluated expression of two components of WNT signaling, known to be associated with differential outcome in MDS/AML.We observed significant downregulation of LEF1, a marker of poor outcome in MDS, 10 and significant upregulation of LRP6, a marker of AML leukemia stem cells (LSCs) activity 11 in prMDS vs stMDS (Figure S3).The differential expression analysis (DEA) showed 378 significantly (ІlogFCІ > 1, FDR < 0.01) upregulated and 467 significantly downregulated genes in stMDS (Figure 1A).Protein-coding genes (PCGs) included 369 upregulated and 328 downregulated genes in stMDS.We further analyzed those 369 upregulated PCGs that may include PCGs that form the barrier protecting cells from progression.In total, 237 GO Biological processes (BP) were enriched (not shown).Using REVIGO (settings: small) to reduce the list of GO terms, the list was shortened to 70 terms.The most enriched terms were associated with nucleosome assembly, regulation of gene expression, regulation of megakaryocyte differentiation, regulation of immune system processes, response to stress and stimulus, export and secretion from cells, cellular transport, telomere capping, signaling, DNA repair, cell adhesion, apoptosis and protein modification.
In REACTOME, 119 pathways were enriched in stMDS.The most significantly upregulated pathways were connected to the cell cycle, DNA repair, stress responses and chromosome and telomere maintenance (Figure 1B).Moreover, gene expression data showed dysregulation of critical signaling pathways mediated by NOTCH, RHO GTPases, β-catenin and WNT and the MAPK family.The pathways of regulation of rRNA expression, splicing and RNA metabolism were also dysregulated.
Next, we analyzed genes upregulated in prMDS.Although 328 PCGs were upregulated, multiprotein analysis in String showed only a few over-represented pathways.Five GO BP terms significantly enriched in prMDS were related to cell adhesion (Figure 1C).No pathway was upregulated according to the RACTOME database.
A set of validation samples (4 prMDS and 3 stMDS) showed the same pattern in PCA analysis (Figure S4A) and in DEA of LEF1 and LRP6 (Figure S4B).

| Downregulation of cell-cycle and energy metabolism related pathways in prMDS BM CD34+ cells
In GSEA, we ran the analysis through Curated Gene Sets (C2) and Hallmark Gene Sets (H); in stMDS, 485 and 12 gene sets were enriched at FDR < 0.1, respectively.We observed significant overrepresentation of processes related to cell cycle progression and energy metabolism in stMDS CD34+ cells (Figure 1D).Active cell cycling leads to replication stress, and ATR-dependent checkpoint activation is essential for cell-cycle progression to maintain the integrity of replication forks. 12The gene sets for cell cycle, cell cycle checkpoints and activation of ATR in response to replication stress were all significantly overexpressed in stMDS and under-represented in prMDS BM CD34+ cells (Figure 1E).These data, together with overall decreased metabolism signatures in prMDS CD34+ cells when compared to stMDS CD34+ cells (Figures 1D,F, G and S5), are consistent with slow cycling, more quiescent phenotype of prMDS CD34+ cells. 13  metabolism and DDR was significant (FDR < 0.1) in C2 and H datasets (Figure S7B).These data support the hypothesis that features that lead to development (and protect against progression) of LR-MDS manifest largely as a consequence of intrinsic molecular changes in cells that occur in processes of normal HSC ageing.15,16 The unavailability of BM CD34+ cells from age-matched healthy controls for prMDS precluded such comparison, but data from healthy controls and stMDS-age imply that premalignant expression signatures of prMDS are distinct from normal HSC ageing.prMDS CD34+ cells revealed broadly repressed markers of multiple subcategories of DDR among the top 50 downregulated genes when compared to stMDS (Figure S8, Table S4A).H2AFX gene, encoding H2AX, a critical chromatin marker of DDR, was significantly downregulated in prMDS vs stMDS (Figure 2C).Among the significantly downregulated DDR genes in prMDS were also typical markers of DNA repair, such as PCNA and AUNIP, or mitotic phase-related genes, such as CCNB1, MAD2L1, NCAPH and EXO1 (Figures 2C and   S9).In addition, among significantly upregulated genes in prMDS compared to stMDS were genes encoding suppressors of DDR pathways, such as ZBTB8A and ZEB1 (Figure 2D).ZBTB8A (BOZF1) expression is increased in several cancers where it functions as oncoprotein that stimulates progression through inhibiting p21 pathway.17 ZEB1, a known regulator of DDR in cancer, 18 can downregulate p53 activity to suppress DDR.19 This led us to hypothesize that the mechanisms of attenuated DDR signaling in prMDS are associated with decreased cycling and other altered intrinsic pathways that are inherent to CD34+ cells and that these mechanisms could be also governed by aberrant splicing and epigenetic mediators, such as long noncoding RNAs (lncRNAs).

| Aberrant splicing and lncRNA expression are increased in prMDS BM CD34+ cells and target DDR pathways
To fully address the transcriptomic differences between stMDS and prMDS, we focused on aberrant splicing and differential expression of lncRNAs.Differential alternative splicing analysis revealed 1356 significant events (FDR < 0.05, absolute inclusion level difference >10%).
PCA showed a great diversity in alternative splicing events (Figure 3A).We observed a higher number of alternative splicing events in prMDS (977 vs 378 events) (Figures 3B and S10) and found increased use of alternative 3 0 and 5 0 splice sites (A3SS, A5SS), but the highest increase was seen in the retention of introns (RI) (Figures 3B   and S10).In contrast, skipped exon (SE) events and usage of mutually exclusive exons (MXE) were decreased in prMDS compared to stMDS.Across all aberrantly spliced genes, several key hematopoietic regulators, such as EZH2, MDM2, PDK1 and DNMT3A, were present.The top 10 alternatively spliced genes according to the absolute value of the inclusion level difference are listed in Table S5.
Genes aberrantly spliced in prMDS were part of pathways that were downregulated in this group according to DEA.These were metabolic processes including: nucleic acid, amino acid and protein metabolism; cellular response to DNA damage and stress; gene expression; histone modification; regulation of mRNA splicing; and protein transport (Figure 3C).Cellular pathways from GO BPs associated with aberrantly spliced genes in stMDS did not contain cellular responses to stress and responses to DNA damage (Figure 3D); these GO BPs were specifically targeted in prMDS.As aberrant splicing impairs the functionality of multiple cellular processes during carcinogenesis, including tumor-inhibitory pathways 20,21 ; these data are consistent with the proposed impaired tumor-suppressing mechanisms in prMDS.
Furthermore, a total of 629 lncRNAs were upregulated and 75 were downregulated (FDR < 0.05) in prMDS (Figure 3E).Antisense RNAs and lincRNAs were the most upregulated groups of lncRNAs, while processed pseudogenes were the most downregulated group.The expression of lncRNAs was heterogeneous in stMDS throughout the patients, but we observed a greater uniformity and mainly upregulation in lncRNA expression in prMDS (Figure S11).In accordance with previously described lncRNAs with prognostic significance in MDS, [22][23][24] we observed upregulation of MEG3 and KIAA0125 and downregulation of TCL6 in prMDS.Multiple other lncRNAs upregulated in prMDS compared to stMDS have a known function in cancer (Table S4B).Among them, LINC00340 (CASC15) correlated positively with ZEB1 (Figure 3F) and inversely with CDKN1A expression (Figure S12) in our LR-MDS patient cohort, in agreement with LINC00340 proposed role in other cancer types. 25MALAT1 was published as the most highly expressed oncogenic lncRNA in multiple myeloma, where it directly binds to PARP1, a central protein in the base excision repair pathway 26 ; in our study, MALAT1 correlated inversely with PARP1 expression (Figure 3F).ZEB1-AS1 showed moderate positive correlation with ZEB1 (Figure S12, see also Table S4B for relevant references).
Overall, these data suggest that upregulated global lncRNA expression in prMDS samples may contribute to the DDR pathway attenuation.

| Gene expression signatures reveal overexpression of multiple oncogenic PCGs and diminished differentiation in prMDS BM CD34+ cells
The heatmap of the top 50 upregulated genes in prMDS vs stMDS samples, ranked by GSEA (Figure S8), included several PCGs, previously identified as markers of leukemic progression and/or poor MDS/AML survival, such as SPNS2, TEC, MN1 and others (Figure 4A, see Table S4C for the list of these PCGs and relevant references).One of the genes, NEK3 (Figure 4A, see also Figure S4B for NEK3 in validation sample set), has not previously been linked to MDS/AML progression, but its overexpression is associated with poor prognosis in patients with gastric cancer and the encoded protein plays an important role in cell migration, cell proliferation and cell viability. 27Other PCGs, such as RPAP2 and LRP6 belong to the key molecules of AML LSCs. 11,28 further investigated the commitment to specific hematopoietic lineages as previously described, 4 where a decrease in the expression of genes specifically expressed in erythroblasts and megakaryocyte/ erythrocyte progenitors and upregulation of genes related to immature progenitor cells was associated with shorter survival and leukemic transformation in myelodysplasia samples.Indeed, prMDS had significantly reduced expression of genes of more mature hematopoietic lineages (Figures 4B and S13, Table S6).
These data collectively suggest reduction in markers for mature hematopoietic lineages and expression of LSC phenotype in BM CD34+ cells of prMDS vs stMDS.

| ZEB1 is significantly overexpressed in prMDS BM CD34+ cells and anticorrelates with CDH1 expression
Next, we addressed the overrepresented GO terms associated with cell-matrix or cell-cell adhesion in prMDS (Figure 1C).A key factor which controls alterations in cell-cell adhesion and its connection with DDR in tumorigenesis is ZEB1. 18As mentioned, we observed significant upregulation of ZEB1 in prMDS compared to stMDS (Figure 2D, see also Figure S4B for ZEB1 in validation sample set), which is consistent with a previous study that showed ZEB1 expression significantly associated with poor overall survival of AML. 29 We examined whether differences in  S7).In one biopsy from prMDS patient with ZEB1 transcript level above median, ZEB1 immunoreactivity was absent in CD34+ cells but was positive in multiple nucleated cells of granulocytic and monocytic lineage (Figure S14).
This, together with above mentioned coregulations of ZEB1 with some lncRNAs underscores a complex regulation of ZEB1 expression in LR-MDS.
ZEB1 plays a pivotal role in transcriptional repression of CDH1, the gene encoding E-cadherin 30 and we observed significantly downregulated CDH1 in prMDS (Figure 4D).There was a significant moderate negative correlation of expression levels of ZEB1 and CDH1, suggesting their functional relationship in LR-MDS (Figure 4D).Another PCG associated with cell adhesion, migration and invasiveness of cancer cells including AML, SPNS2, 31 was significantly upregulated in prMDS (Figure 4A, see also Figure S4B for SPNS2 in validation sample set); in fact, it was the most upregulated PCG among the top 50 upregulated genes in the prMDS vs stMDS samples ranked by GSEA (Figure S8, Table S4C).These data confirmed dysregulated cell adhesion gene expression in prMDS.

| DDR gene expression signature for LR-MDS patient stratification
To propose a gene expression signature predictive of progression of LR-MDS patients, we first identified two groups of genes from the expression profiles of the top 50 up-and downregulated genes in GSEA (Figure S8).From the top 50 upregulated genes in prMDS, we selected genes with known function in leukemic progression and/or poor MDS/AML survival (13 genes, Table S4C) and tested them independently for predictive power on progression (Supplementary Methods in Data S1).The cumulative incidence of MDS progression of the 61 patients (discovery cohort) revealed significant curve separations (P < .05)for 11 genes with the best F I G U R E 3 Alternative splicing events and lncRNA expression in LR-MDS CD34+ cells.(A) PCA based on the inclusion levels of the differential splicing events separated the stMDS and prMDS samples.(B) Visualization of alternative splicing events in prMDS and stMDS.The higher number of alternative splicing events was observed in prMDS.The highest difference was detected in the retention of introns (RI, red dots).SE, skipped exon; RI, retention of introns; A3SS, A5SS, alternative 3 0 and 5  P-values for RPAP2, MN1 and DOCK1 (Figure 5A).We also tested other oncogenes among the top 50 upregulated genes in prMDS not previously associated with MDS/AML for their potential biomarker role in predicting LR-MDS progression.The best significance of curve separation was calculated for NEK3 gene, encoding for NIMA-related kinase 3 (Figure 5B).Expression of NEK3 was also an independent prognostic factor (P < .001)for progressionfree disease in multivariate analysis of important clinical and genetic factors (Table S8).
As described above, downregulation of DDR genes expression was the main characteristic feature distinguishing prMDS from stMDS in our cohort.Therefore, we chose 19 DDR genes (Table S4A) from the top 50 downregulated genes in prMDS (Figure S8) and tested them individually for the cumulative incidence of progression; significant curve separation (P < .05)was achieved for 17 genes (Figure S15).The entire 19 DDR gene set contained mainly cell cyclerelated genes, reflecting non-cycling status of prMDS BM CD34+ cells, not known to be mutated in MDS and AML.To assess whether an expression-based DDR signature could predict LR-MDS patients' progression, we calculated probability of a progression-free disease (with mortality before MDS progression as a competing risk, using regularized Cox regression analysis as described in Supplementary Methods in Data S1) based on expression of these 19 DDR-associated genes.Indeed, our analysis revealed that LR-MDS patients with high DDR gene expression had a significantly higher probability of progression-free disease (P < .001)compared to those with low DDR gene expression (Figure 5C).Thus, DDR gene expression signature appears to be a significant predictor of LR-MDS progression in our cohort.

| Correlations of gene expression profiles suggest a key role for ZEB1 in LR-MDS transcriptional regulatory network
We asked whether ZEB1, a known master regulator (repressor or transcriptional activator, depending on cell context 32 of key processes dysregulated in prMDS, ie, DDR, EMT and quiescence 18,33,34 ) might act as a transcriptional regulator in LR-MDS.We searched for overlap between the ZEB1 target genes in the ChIP-seq datasets (ENCODE transcription factor dataset 35 ) and both, DDR signature genes downregulated in prMDS (Table S4A) and genes upregulated in prMDS with known roles in leukemic progression and/or poor MDS/AML survival (Table S4C).Among the 19 DDR-associated genes selected for the "DDR signature," 14 genes are listed as putative ZEB1 targets.All were found to be significantly anticorrelated with ZEB1 expression in our LR-MDS samples (anticorrelations with H2AFX and H2AZ1 are shown in Figure S16).Four of these genes, CCNB1, MAD2L1, NCAPH and EXO1, which are related to mitotic phase and whose Pearson correlation coefficient (R) exceeded À.64 (Figure 5D), have been previously described to have ZEB1 binding sites in their putative promoter sequences in breast cancer cells, but in which ZEB1 serves as a transcriptional activator of these genes. 36ong the genes predicting poor prognosis in MDS/AML that were positively correlated with ZEB1 expression was ANGPT1, encoding angiopoietin-1 (Figure 5E).ANGPT1 was shown to be a direct target positively regulated by ZEB1 in triple-negative breast cancer cells, 37 but its potential transactivation by ZEB1 in MDS or AML progenitors has not been investigated.We also observed significant positive correlation of expression between ZEB1 and LRP6 (Figure 5E).
LRP6, encoding the coreceptor of Wnt/β-catenin signaling, 11 is a ZEB1 target according to publicly available (ChIP)-seq datasets from the ENCODE project.
According to the ENCODE project datasets, NEK3 is also among the ZEB1 target genes.Indeed, a significant moderate positive correlation between ZEB1 and NEK3 was observed in LR-MDS CD34+ cells (Figure S16), but such a possible relationship in LR-MDS progenitors requires further functional analyses.

| DISCUSSION
MDS is a very heterogeneous disease.So far, several transcriptomic studies have been performed to understand the  pathogenesis of MDS and predict the outcome. 4,5However, to our knowledge, there is no transcriptomic study targeting exclusively only LR-MDS to study the slight but important differences associated with accelerated progression.Here, we provide a comprehensive transcriptome analysis of LR-MDS samples and identify the differences between patients with stable disease and those at risk of rapid progression.We have previously demonstrated that RUNX1-mutated LR-MDS patients are at risk of rapid progression and that RUNX1 mutations presumably disturb anti-tumor cellular defense pathways and contribute to disease progression. 6The work presented here builds on the previous study and examines the differences regardless of driver mutations.
For the purpose of this study, we stratified our cohort of patients into stMDS and prMDS groups; the cut-off point for accelerated progression and inclusion in the prMDS group was set to 20 months from diagnosis.Our initial analyses (transcriptome consistency between prMDS samples, PCA, overlap with recently published markers of WNT signaling) suggested that the division of LR-MDS patients based on such a threshold of accelerated progression might be associated with distinct gene expression signatures of BM CD34+ cells.Our initial discovery cohort consisted of 61 LR-MDS patients, the validation sample set included 7 samples; a higher number of analyzed patients was precluded by unavailability of additional diagnostic CD34+ samples.
In the discovery cohort, we found 697 significantly dysregulated PCGs between stMDS and prMDS.These results show a large heterogeneity within LR-MDS patients, even though they represent one group for treatment decisions, and are consistent with the previously published heterogeneity of progenitor subpopulations within LR-MDS BM CD34+ cells. 38In stMDS, we found transcriptional signatures connected to actively cycling CD34+ cells poised for erythro-megakaryocytic differentiation, with high metabolic activity and with the intrinsic ability to activate stress defense pathways.In contrast, only GO BPs for cell adhesion pathways were upregulated in prMDS.In addition, we found that several oncogenes known to be involved in leukemia progression and/or poor survival in MDS/AML are significantly upregulated in prMDS BM CD34+ cells.Interestingly, the NEK3 gene, not previously associated with MDS/AML, revealed the best prognostic value for predicting LR-MDS progression.NEK3 is a member of the serine/threonine NEK kinase family and its overexpression is associated with progression and poor prognosis in patients with breast cancer and gastric cancer, 27 but its role in myeloid leukemogenesis remains to be clarified.
The most characteristic transcriptomic signature of prMDS CD34+ cells distinguishing them from stMDS cells was the expression pattern of quiescent-like or slowly proliferating progenitors with LSC markers, and the signatures of processes that are associated with or regulate non-cycling malignant stem cells: that is, suppressed DDR and dysregulated adhesion pathways. 39It is likely that the aforementioned changes in the cellular composition of CD34+ subpopulations between prMDS vs stMDS CD34+ cells (ie, their differentiation status) determine to some extent the differences in gene expression signatures. 38We propose that prMDS CD34+ cells bear cell-autonomous and perhaps also noncellautonomous defects to establish a tumor-suppressing DDR barrier; 2][43] Nevertheless, none of the 22 DDR genes selected for sequence analysis were mutated in prMDS; the four mutant TP53 samples were among the stMDS cases.Abnormal splicing in MDS is mainly related to the presence of mutated splicing factors, 41,42,44 and there are limited data regarding their clinical implications.In a study by Yang et al, 45 almost 30% of genes were aberrantly spliced in MDS samples compared to healthy controls.
The aberrantly spliced genes were related to cell proliferation, cell adhesion and protein degradation, and a higher degree of aberrantly spliced genes correlated with shorter OS and time to leukemic transformation.We revealed that stMDS and prMDS significantly differ in their alternatively spliced genes; genes abnormally spliced in prMDS were enriched in specific GO terms, involving DDR-related pathways.Several key hematopoietic regulators were also aberrantly spliced.The highest difference between stMDS and prMDS was seen in the retention of introns, which was significantly higher in prMDS.RI, frequent in multiple cancers (including MDS), has been described as a mechanism of tumor suppressor inactivation. 21,43rthermore, we observed an overall upregulation of lncRNA expression in prMDS CD34+ cells.A published lncRNA-based riskscoring system for MDS patients, independent of the patients' mutational profiles, demonstrated the potential of lncRNA profiling to improve current MDS risk stratification. 46In our study, the F I G U R E 5 Dysregulated expression of DDR genes and genes previously associated with myeloid malignancies is significantly associated with accelerated progression in LR-MDS.(A) Cumulative incidence of LR-MDS progression with death as a competing risk according to the expression of three genes with a known function in leukemia stem cell function/leukemic progression and/or poor MDS/AML survival (Table S4C) from the top 50 upregulated genes in prMDS (Figure S8).(B) Cumulative incidence of LR-MDS progression with death as a competing risk according to the NEK3 expression.(C) Probability of a progression-free disease with mortality prior to MDS progression as a competing risk based on DDR gene expression signature.DDR gene expression signature appears to be a significant predictor of LR-MDS progression.The signature contains expression levels of 19 DDR genes (Table S4A) from the top 50 downregulated genes in prMDS (Figure S8).(D) ZEB1 transcriptional correlations with its putative targets, using a combined set of patients from the discovery cohort (n = 61, black dots) and validation samples (n = 7, red dots).
Pearson correlation analysis revealed a strong negative correlation of ZEB1 with the expression of four DDR genes NCAPH, CNNB1, MAD2L1 and EXO1 and (E) a strong positive correlation with the expression of ANGPT1 and LRP6 genes.CPM, counts per million; P, P-value; R, correlation coefficient.
most significantly upregulated lncRNA in prMDS was KIAA0125; its higher expression has previously been associated with concordant upregulation of HSC-associated genes at higher-risk MDS and has emerged as an independent unfavorable prognostic marker for OS and LFS in MDS. 22Indeed, lncRNAs belong to important epigenetic regulators of tumor-promoting and tumor-restraining pathways and contribute to the regulation of expression of multiple oncogenes and tumor-suppressor genes. 47Because epigenetic silencing of DDR genes and pathways is a common cause of DDR deficiency in cancer, 38 we correlated two of the significantly overexpressed oncogenic lncRNAs in prMDS, LINC00340 (CASC15) and MALAT1, with their potentially interacting components of DDR pathway (ZEB1 and CDKN1A in the case of LINC00340, and PARP1 in the case of MALAT1), and confirmed earlier findings. 25,26These data suggest that lncRNA expression in prMDS may trigger silencing of some DDR factors, thus likely contributing to the overall suppression of DDR expression in prMDS.
To further explain the mechanisms of disease progression in prMDS with attenuated DDR pathways, we addressed the possible impact of overrepresented GO terms associated with cell-matrix or cell-cell adhesion in prMDS.First, we focused on ZEB1 gene expression.ZEB1 controls alterations in cell-cell adhesion and E-cadherin expression; thus, it is a key factor of the first steps in epithelial-mesenchymal transition (EMT).Moreover, its activation prevents the induction of DDR (and thus the tumorigenesis barrier) in the early steps of tumorigenesis. 18Indeed, we observed a significant upregulation of ZEB1 in prMDS, consistent with previously published ZEB1 expression significantly associated with poor overall survival of AML. 29 ZEB1 is a transcriptional repressor of CDH1, the gene encoding E-cadherin, 30 and we observed significantly downregulated CDH1 in prMDS.Disruption of E-cadherin mediated cell-cell adhesion is frequently observed in hematological malignancies. 48,49An earlier report on CD34+ cell populations from MDS patients has shown that the promoter of CDH1 is often hypermethylated, resulting in decreased protein expression. 50Earlier analyses also interconnected overexpression of ZEB1 with downregulation of H2AX; H2AX silencing in HCT116 human colon cancer cells promoted mesenchymal-like characteristics including upregulation of EMT transcription factors, including ZEB1. 51pression of histone H2A variants, participants in DDR pathways, was largely suppressed in prMDS, and significantly anticorrelated with ZEB1 expression.In addition, an association between the downregulation of DDR repair genes and the dysregulated expression of EMT transcription factors, including ZEB1, potentially counteracting the mesenchymal to epithelial transition, has been described during an early phase of reprogramming mouse embryonic fibroblasts. 52All these data collectively suggest a close coregulation between suppressed DDR gene expression with a transcriptional program of cell-cell adhesion and mesenchymal cell fate promoting leukemogenesis/progression in prMDS.
The attenuated DDR, alteration of cell adhesion and relative quiescence of prMDS CD34+ cells led us to hypothesize, that ZEB1 might be an important transcriptional regulator in LR-MDS.
This could be supported by ZEB1 IHC detection in the patients' biopsies and the high transcriptional association of ZEB1 with multiple DDR, cell adhesion/migration and cell cycle progression genes, putative ZEB1 targets proposed in the literature and/or biomedical databases.Indeed, these literature/database data were obtained from human cell types other than CD34+ HSPCs (MDS) and in different cell contexts, 25,30,[32][33][34][35][36]51,52 and the expression data shown here represent only basis for further functional studies. Anothr regulatory role of ZEB1 in LR-MDS progression could be its proposed stimulation of AGPN1, encoding the angiogenic factor angiopoietin-1.Such coregulation may have functional implications similar to the ZEB1 role in tumor stromal endothelial cells.53 The expression of the LSC phenotype in prMDS, characterized by LRP6 gene, 11 could result from Wnt/β-catenin/ZEB1 corregulation.
Wnt/β-catenin signaling is known to directly induce ZEB1 transcription, 54 and Lrp6 is a downstream target of ZEB1 in endothelial progenitors. 33deed, our observed coregulation between ZEB1 and LRP6 expression in LR-MDS CD34+ cells should be corroborated by further studies.
In conclusion, our data show transcriptional signatures of active cycling and functional DDR signaling in stMDS CD34+ cells, which corresponds to an enhanced intrinsic ability of stMDS CD34+ cells to suppress malignant progression through antitumor barrier activation.This is consistent with current understanding that proliferation in premalignant disease states generates replicative stress and upregulation of DNA repair processes, which evokes checkpoint activation, participating in an anti-transformation barrier and thus preventing pathways leading to tumor development. 12The accelerated progression phenotype of prMDS appears to be a consequence of the downregulation of DNA damage checkpoints/repair, and hence allows grounds for increased genomic instability.This cell-autonomously and perhaps also noncell-autonomously mediated suppression of DDR barrier seems to be interconnected with dedifferentiation and a quiescentlike gene expression program which promotes MDS progression.
Our findings also point to appropriate therapeutic approaches for prMDS patients.We propose that specific therapeutic approaches can be exploited to selectively target malignant progenitors in stMDS and prMDS; we present a DDR gene expression signature with potential implication in prognostic stratification of LR-MDS patients, independent of the mutational status.Previous studies have documented that the quiescent cell state and a transcriptional program resembling that observed for prMDS CD34+ cells represent features of preexisting failure to respond to azacitidine monotherapy. 13,55On the other hand, regimens combining azacitidine with proliferation/myeloid differentiation-inducing growth factors 56 Figure S1.Samples of seven healthy individuals were used as healthy controls (median age was 45 years; range, 29-69 years).

F
I G U R E 1 Legend on next page.

3 |
BM CD34+ cells of prMDS have attenuated DNA damage checkpoint and DDR pathways DDR is closely interconnected with cell cycle progression and DNA damage checkpoints are activated to regulate various DNA-repair mechanisms; therefore, most DNA repair pathways are attenuated in non-dividing cells.14We compared differential expression profiles of the DNA damage checkpoint and DNA repair pathways between stMDS and prMDS.GSEA analyses revealed significant underrepresentation for G2/M DNA damage checkpoint and DNA repair pathways in prMDS BM CD34+ cells compared to those of stMDS (Figures2A and S6A).Then, we analyzed individual subcategories for single-strand break repair (SSR) and double-strand break repair (DSR) pathways; although stMDS expression patterns for these DNA repair gene sets were highly heterogeneous in the patients, prMDS patients often shared the same profiles of largely downregulated genes (Figures2B and S6B-E).The downregulation of DNA repair genes in prMDS were primarily seen in mismatch repair/ nucleotide excision repair (MMR/NER) genes and Fanconi anemia (FA) genes (Figures2B and S6E).These data are consistent with the less proliferative, more quiescent-like cell state in prMDS, as MMR/NER SSR and FA DSR are primarily employed by proliferating HSCs for repair.13We questioned whether the crucial regulatory networks of cell cycle progression, checkpoint activation and DNA repair signature in stMDS BM CD34+ cells significantly differ from those of healthy individuals.Thus, we performed DEA between healthy control (n = 7; median 45 years, range, 29-69 years; average 49 years) and selected age-matched stMDS (stMDS-age) (n = 13; median 53 years, range, 23-69 years; average 49 years) samples.We observed 70 up-and 209 downregulated genes in stMDS-age (FDR < 0.05).Only the splicing pathways were upregulated, and the immune pathways were downregulated (FigureS7A).In GSEA, no gene set for cycling, F I G U R E 1 Expression analysis of prMDS vs stMDS.(A) Volcano plot of differentially expressed genes (PCGs and lncRNAs) between CD34+ BM cells of prMDS and stMDS.The red points indicate significantly dysregulated genes.The most upregulated genes in prMDS (ie, downregulated in stMDS) are toward the right; the most downregulated genes in prMDS (ie, upregulated in stMDS) are toward the left.x-axis: logFC, logarithm of fold changes; y-axis: Àlog10 of FDR value; FDR, false discovery rate, red points: FDR < 0.01.(B) The most upregulated Reactome pathways in stMDS.(C) GO Biological Process terms significantly enriched (FDR < 0.01) in prMDS.(D) Gene sets enriched (FDR < 0.1) in stMDS from Hallmark dataset by GSEA.ES, enrichment score; NES, normalized enrichment score; P, P-value.(E) Selected pathways enriched (FDR < 0.1) in stMDS from C2 gene sets by GSEA.(F) IHC detection of IDH2 protein expression (brown) with hematoxylin nuclear counterstain (blue) in BM trephine biopsies of LR-MDS patients.Left: A section from stMDS patient no.V2092 with IDH2 expression above the median showing nucleated cells with medium to strong mitochondrial staining for IDH2 (details seen in the inset).Right: Rare IHC staining for mitochondrial IDH2 from prMDS patients no.V1834 with IDH2 expression below the median.IDH2 transcription levels and detailed evaluation of IDH2 protein expression in individual samples are described in FigureS5B.(G) Differential expression of PKLR (left, a glycolytic marker) and NDUFS2 (a core subunit of mitochondrial complex I) in stMDS vs prMDS samples.For boxplots generated from RNAseq data, we used the combined values of all samples, from the initial discovery cohort (n = 61, black dots) and the validation sample set (n = 7, red dots); P-value (shown in the graphs) is counted for all samples.Student's t test with Bonferroni correction.

F
I G U R E 2 Legend on next page.

F I G U R E 2
Attenuated DNA damage checkpoints and DDR pathways in prMDS CD34+ cells.(A) GSEA identified G2/M DNA damage checkpoint and DNA repair gene sets as significantly enriched in stMDS compared to prMDS.ES, enrichment score; NES, normalized enrichment score; P, P-value; FDR, false discovery rate.(B) Heatmap representations of the differential expression of genes in selected DNA repair gene categories: single-strand break mismatch repair (SSR MMR), double-strand break repair-Fanconi anemia pathway (DSR FA).Red indicates upregulation, blue indicates downregulation of gene expression, and the color intensity indicates the level of differential expression.Columns in the heatmaps represent individual samples.(C) Boxplots depicting expression of four genes of the 19-gene DDR signature in stMDS and prMDS cohorts.(D) Boxplots depicting expression of ZBTB8A and ZEB1 in stMDS and prMDS.For boxplots generated from RNAseq data, we used the combined values of all samples, from the initial discovery cohort (n = 61, black dots) and the validation sample set (n = 7, red dots); P-values (shown in the graphs) were counted for all samples.Student's t test with Bonferroni correction.

F
I G U R E 3 Legend on next page.
expression in LR-MDS BM CD34+ cells are reflected at its protein level.Immunohistochemistry (IHC) staining indicated correlation of CD34+ cell transcripts of ZEB1 and ZEB1 nuclear protein levels in LR-MDS BM CD34+ cells in the most examined samples (Figure 4C, Table

F
I G U R E 4 Legend on next page.
Expression below median (n = 30) Expression below median (n = 30) Expression above median (n = 31) Expression above median (n = 31) Expression above median (n = 31) Expression below median (n = 30) F I G U R E 5 Legend on next page.
could represent candidate treatment options for prMDS.AUTHOR CONTRIBUTIONS Monika Kaisrlikova: Processed the samples and performed the experiments; Processed and interpreted the data; Drafted the article.David Kundrat: Performed the bioinformatic analyses.Pavla Koralkova: Processed the samples and performed the experiments; Interpreted the data; Reviewed the article.Iva Trsova: Processed the samples and performed the experiments.Zuzana Lenertova: Processed the Differences in cellular states of CD34+ cells from prMDS and stMDS, defined by transcriptional programs.(A)Differentialexpression of SPNS2 and NEK3 in stMDS vs prMDS samples.For boxplots generated from RNAseq data, we used the combined values of all samples, from the initial discovery cohort (n = 61, black dots) and the validation sample set (n = 7, red dots); P-value (shown in the graphs) is counted for all samples.Student's t test with Bonferroni correction.(B)Heatmap of z score means from expression values of genes related to hematopoietic lineages.The asterisks flag indicates the level of significance from the t test, the values are listed in TableS6, and a heatmap for all related genes is provided in FigureS13.CLP, common lymphoid progenitor; CMP, common myeloid progenitor; EB, erythroblast; GMP, granulocyte monocyte progenitor; HSC, hematopoietic stem cell; MEP, megakaryocyte erythrocyte progenitor; MK, megakaryocyte; MPP, multipotent progenitor.(C)RepresentativeIHC double staining of CD34/ZEB1 protein in BM trephine biopsies of LR-MDS patients.Upper panels: Individual sections from stMDS patients no.V1528 and V2092 with ZEB1 transcript levels below the median (see also TableS7) demonstrated membranous/cytoplasmic staining for CD34 (red) with absent/rare nuclear/cytoplasmic ZEB1 staining (brown, see the insets).Bottom panels: Specific membranous and diffuse cytoplasmic CD34 staining (red) with moderate to strong ZEB1 protein co-expression (brown, see the insets) in individual sections from prMDS patients no.V1834, V2243 and V997 and stMDS patient no.V1921, all with ZEB1 transcript levels above the median.Multiple other nucleated, non-CD34+ cells with hematoxylin nuclear counterstains (blue), showed both nuclear and/or cytoplasmic staining for ZEB1.ZEB1 transcript levels and detailed evaluation of ZEB1 protein expression in individual samples are described in TableS7.(D) Differential expression of CDH1 gene in stMDS and prMDS patients and Pearson correlation analysis of the expression levels of CDH1 and its repressor ZEB1.For boxplots and correlation analysis, we combined values of both cohorts, from the initial discovery cohort (n = 61, black dots) and the validation sample set (n = 7, red dots).Student's t test with Bonferroni correction, P-values shown in the graphs; CPM, counts per million; R, correlation coefficient.