High‐resolution single‐cell transcriptomic survey of cardiomyocytes from patients with hypertrophic cardiomyopathy

Abstract Hypertrophic cardiomyopathy (HCM) is a common inherited cardiovascular disease, which can cause heart failure and lead to death. In this study, we performed high‐resolution single‐cell RNA‐sequencing of 2115 individual cardiomyocytes obtained from HCM patients and normal controls. Signature up‐ and down‐regulated genes in HCM were identified by integrative analysis across 37 patients and 41 controls from our data and published human single‐cell and single‐nucleus RNA‐seq datasets, which were further classified into gene modules by single‐cell co‐expression analysis. Using our high‐resolution dataset, we also investigated the heterogeneity among individual cardiomyocytes and revealed five distinct clusters within HCM cardiomyocytes. Interestingly, we showed that some extracellular matrix (ECM) genes were up‐regulated in the HCM cardiomyocytes, suggesting that they play a role in cardiac remodelling. Taken together, our study comprehensively profiled the transcriptomic programs of HCM cardiomyocytes and provided insights into molecular mechanisms underlying the pathogenesis of HCM.


| INTRODUCTION
Hypertrophic cardiomyopathy (HCM) is a common inherited cardiovascular disease characterized by left ventricular hypertrophy and is often caused by mutations in genes encoding sarcomere proteins. 1,2Despite significant advances in understanding the genetic underpinnings of HCM, there remains an incomplete understanding of the molecular basis that underlies the cellular processes relevant to the pathogenesis of HCM.
Single-cell RNA-sequencing (scRNA-seq) have provided new opportunities to more deeply understand human diseases including cardiac diseases. 3,46][7][8][9] Since the cell mass of a cardiomyocyte is much larger than its nucleus, single-cardiomyocyte RNA-seq should be able to detect more transcripts per cell which should be complementary to these single-nucleus RNA-seq studies and may provide a more comprehensive characterization of the transcriptome of HCM cardiomyocytes. 10More recently, a single-cardiomyocyte RNA-seq study for human HCM samples has been reported, but the sequencing depth is still limited. 11Due to the heterogeneity of HCM, analysis based on limited cases may result in incomplete molecular characteristics.Therefore, integrative analyses are required to leverage the available resources and identify conserved transcription modules among different datasets.
However, since these datasets were generated from different labs using various platforms, cross-sample integration remains challenging.
To address these issues, in this study, we performed high-resolution single-cardiomyocyte RNA-seq, investigated the transcriptional programs related to the remodelling of HCM cardiomyocytes, and integrated with published single-cell and single-nucleus RNA-seq datasets of human HCM cardiomyocytes.

| High-resolution single-cardiomyocyte RNAseq for HCM
To elucidate the transcriptional remodelling in cardiomyocytes of HCM, we performed high-resolution scRNA-seq on 2115 individual cardiomyocytes obtained from HCM patients (n = 7) and normal controls (NC, n = 2).After isolation of cardiomyocytes, scRNA-seq was performed using our previously modified STRT-seq method 12 (Figure 1A).A total of 1971 high-quality cells were retained for single-cell analysis after stringent filtration based on the number of expressed genes and the fraction of mitochondrial counts (Figure S1a).We also eliminated potential cell contaminants to ensure the data quality (see also Methods, Figure S1b-g).As a result, an average of 6061 genes and 92,369 transcripts (counted as the unique molecular identifier, UMIs) were detected in each cell, which were significantly higher than other published single-cardiomyocyte (one study) and single-nucleus (two studies) RNA-seq data 8,9,11 (Figure 1B).
Our data also exhibited a lower fraction of mitochondrial counts than the published single-cardiomyocyte RNA-seq data, which also indicated the high quality of our data (Figure 1B).By performing principal component analysis (PCA), cells from the HCM patients and the control subjects were clearly separated into two clusters (Figure 1C).The expression of typical cardiomyocytes markers such as TNNT1 and ACTN1, and HCM markers such as NPPA and NPPB validated the cell type and the states (Figure 1D).

Examination of differentially expressed genes (DEGs)
between the HCM and NC cardiomyocytes showed that 130 genes were significantly up-regulated and 61 genes were down-regulated in the cardiomyocytes of the HCM patients (adjusted p-value < 0.05, fold change > 2, Figure 1E,F, Table S2).
As expected, NPPA and NPPB, which are well-known marker genes of HCM, were prominently up-regulated (Figure 1D-F).
The up-regulation genes also included TPM3, ACTN1 and TNNT1, which are related to cardiac muscle function and preferentially expressed in the foetal heart, indicating a reprogramming of gene expression in HCM from adult backward to foetal pattern 13 (Figure 1C-E).By calculating Pearson's correlation coefficient of gene expression levels, we found co-regulation of these genes among single cells within each individual (Figure S2).The causal genes of HCM showed relatively slight (such as MYH7, MYBPC3, MYL2, TPM1) or even not significantly difference (such as MYL3, TNNT2, TNNI3 and ACTC1) when comparing cardiomyocytes from all HCM patients and the controls, which highlighted the heterogeneity between HCM patients and complexity of HCM pathogenesis (Figure S3a,b).

| Biological pathways and regulatory networks of HCM cardiomyocytes
To investigate which biological pathways were associated with the up-regulated and down-regulated genes in HCM cardiomyocytes, we performed pathway analysis including Gene Ontology (GO) enrichment and Gene Set Enrichment Analysis (GSEA).For the up-regulated genes in HCM, the enriched GO terms included 'heart development', 'muscle contraction', 'actin cytoskeleton organization' and 'regulation of cell adhesion', which were also shown by the enriched GSEA pathway 'hypertrophic cardiomyopathy' and 'cell adhesion molecules' (Figure 2A,B).These enriched terms and pathways should reflect the up-regulation of heart contraction, reactivation of some foetal genes, and reorganization of the cellular cytoskeleton and adhesion of the adaptive and maladaptive hypertrophy 6 (Figure 2A,B, Table S4).Consistent with recent reports, we found that ACE2 (angiotensin-converting enzyme 2 receptor), a gene that encodes the main SARS-CoV-2 binding receptor, was markedly up-regulated in HCM 14 (Figure 1C-E).Interestingly, our analysis showed that a total of 21 genes in the 'SARS-Cov-2 Infection' pathway (R-HSA-9694516) were up-regulated in HCM (Figure 2C).For example, we found that NRP1, which is another host factor of SARS-Cov-2 that can enhance SARS-Cov-2 infection, was also up-regulated in HCM. 15 Another gene SDC2 involving in the cellular entry of SARS-CoV-2 was also up-regulated. 16The up-regulated genes also included several genes relating to the immune response of SARS-CoV-2 infection (e.g., HLA genes and MASP1) (Figure 2D).For the down-regulated genes in the HCM cardiomyocytes, the enriched GO terms included 'heart contraction' and 'cell maturation', reflecting a foetal-like reprogramming in HCM (Figure 2A).The GSEA analysis showed that the 'oxidative phosphorylation' and 'ribosome' pathways were down-regulated, which have been shown to occur in the late stage of hypertrophy 6 (Figure 2B).Interestingly, several metallothioneins (e.g., MT1X, MT1E, MT3 and MT2A) that can bind metals and scavenge free radicals such as superoxide and hydroxyl radicals, 17 were significantly downregulated in the HCM cardiomyocytes (Figure S3c,d), which may lead to their susceptibility to oxidative stress.Consistently, it has been reported that zinc supplementation can induce cardiac metallothionein and prevent diabetic cardiomyopathy in mice. 18xt, we performed the ligand-receptor pair analysis to examine cell-cell interactions among cardiomyocytes.Our results revealed that several ligand-receptor pairs were activated in the HCM cardiomyocytes, pointing to the up-regulation of genes relating to cell adhesion and integrin signalling, cardiac homeostasis and growth factors (e.g., VEGF, IGF, FGF and HBEGF with their respective receptors functioning in cell growth and proliferation) during HCM progression (Figure 2E, Table S3).
To explore the regulatory networks involved in HCM, single-cell regulatory network inference, and clustering (SCENIC) analysis was performed to score the activity of gene regulatory networks of cardiomyocytes in HCM compared with NC. 19,20 We then used the regulon activity generated from SCENIC analysis to cluster all the cells and visualize them by t-distributed stochastic neighbour embedding (TSNE), which showed that cardiomyocytes from HCM and controls were well separated (Figure 3A, Figure S4a), indicating that the gene regulatory network markedly altered in cardiomyocytes of HCM.
Comparison of the regulon activity between HCM and NC identified activation of transcription factors (TFs) in HCM including KLF5 which was associated with cardiomyopathy, 21,22 ETV1 which is involved in atrial remodelling, 23,24 and TCF4 which participates in the WNT signalling pathway (Figure 3B,C).TFs of CEBPB and CEBPD, which regulate genes involved in immune and inflammatory responses, were down-regulated in HCM (Figure 3B, Figure S4b).A consistent distribution pattern for AUC scores of these regulon activities and the expression level of these TFs was found in HCM compared with control cardiomyocytes, which further validated the alteration of these TFs (Figure S4c).By analysing the target genes of these activated TFs, we found that the up-regulated genes of HCM which regulating muscle contraction or relaxation, like NPPA, TPM3, TNNT1 were included.
Meanwhile, the regulon target gene module score (see Methods) of ETV1, TCF4 and KLF5 were up-regulated in HCM, and that of CEBPB and CEBPD were down-regulated (Figure 3D).Based on the SCENIC results, 105 differential activated regulons (DARs) were identified in HCM, among these activated regulons, 31 were overlapped with the 150 differential expressional transcriptional factors.Interestingly, only 14 DARs were identified in the controls with 2 DARs being overlapped with the 13 differential expressional transcriptional factors, again suggesting that HCM was characterized by dominant gene upregulation (Figure S4d).To validate the result of our SCENIC analysis, we calculated the expression of these TFs and the module score of related regulons in other published human single-cell and singlenucleus RNA-seq data of cardiomyocytes from HCM, 8,9,11 which revealed consistent patterns with our results (Figure 3E,F).Together, our data revealed alterations of HCM cardiomyocytes in gene expression, biological pathways, ligand-receptor pairs and TF networks.

| Signature marker genes of HCM cardiomyocytes revealed by comparative analysis
Given the strong heterogeneity and complex genetic background of HCM patients, analysis of materials from limited individuals from different ethics may result in incomplete and biased estimation of molecular characteristics of HCM.To this end, we tried to integrate all published single-cell RNA-seq datasets for HCM.As a result, a total of 4 datasets, including 2 single-cardiomyocyte RNA-seq (including this study) and 2 single-nucleus RNA-seq data were collected, which constituted a meta dataset of 65,083 cells (nuclei) from 37 HCM patients and 94,240 cells (nuclei) from 41 normal controls (Figure 4A). 8,9,11To investigate batch effects among datasets derived from various patients and platforms, consensus genes from different datasets (n = 16,392) were classified into 10 groups based on their expression levels and used to perform unsupervised hierarchical     clustering (Figure 4B).We showed that the overall distributions of gene expression were very different between singlecardiomyocyte and single-nucleus RNA-seq data, which was also consistent with the higher Pearson's correlation coefficients within the two groups (Figure 4C).And we demonstrated that this difference may result in preference when performing differentially expression analysis.For example, the prominent up-regulation of known HCM marker genes (including NPPA, NPPB, TPM3 and RTN4) in HCM compared with NC were only captured in the two single-cardiomyocyte RNA-seq datasets while showing slight differences in the two single-nucleus RNA-seq datasets (Figure 4D).
In addition, we also found that the sensitivity of DEGs identification may be reduced in low depth scRNA-seq data, as the DEGs identified by Wehrens et al. 11 were generally expressed with higher levels than DEGs identified by other datasets (Figure S5).
Together, these results demonstrated the difference between scRNA-seq and snRNA-seq when applied to cardiomyocytes, which also highlight the advantages of high-resolution scRNAseq data.
To overcome the batch effects among datasets, we then chose to perform multiple comparisons between DEGs identified by each dataset (adjusted p-value < 0.05; fold change > 1.5), aiming to find common features of HCM cardiomyocytes.Only DEGs nominated by at least two datasets were remained for further analysis.In this way, we identified 653 up-regulated and 185 down-regulated genes as the signature marker genes of HCM.(Figure 4E; Figure S6c; Table S5).To reveal the modularity of these marker genes, we performed coexpression analysis across single cells and identify gene modules for each dataset, respectively (Figure 4F; Figure S6a,d; Table S6).Benefiting from our high-resolution scRNA-seq data, 8 robust modules were identified among the up-regulated signature genes, which was more than in other datasets.By calculating the differential module scores in these 8 modules in HCM compared with NC for each dataset, we observed that UP_M1 and UP_M2 were the most pervasive altered modules in all datasets, which mainly enriched in pathways such as 'muscle structure development' and 'regulation of MAPK cascade'.
And we also found that the UP_M6 was the most conserved module among datasets which was related to calcium ion import, indicating the dysregulation of ion channels in HCM (Figure 4G,H, Figure S6b).
Three modules of down-regulated signature genes were also identified in our data (Figures S6c-e and S7).Together, by integrating all available resources of scRNA-seq data for human HCM, our results provided a comprehensive molecular feature profiling of cardiomyocytes from HCM.

| Heterogeneity and subpopulation of cardiomyocytes from HCM
We next explored the cell heterogeneity of the cardiomyocytes.After the correction of batch effects derived from different individuals by the Harmony algorithm, five clusters of the HCM cardiomyocytes and three clusters of the NC cardiomyocytes were identified, respectively (Figure 5A,B).Differential expression analysis revealed significant gene expression differences among subpopulations (Figure 5C; Table S7).GO analysis for the differentially expressed genes of each HCM cluster showed that, and GAS6).It also enriched in genes related to energy homeostasis, such as CKB, CKM and mitochondrial genes, which were important markers for myocardial infarction, 25 indicating a relationship between HCM and myocardial infarction.Additionally, Cluster 5 was related to RNA processing (e.g., DDX17 and RNPC3) and chromatin organization (e.g., KMT2C and KDM5A), suggesting unknown mechanisms involved in HCM that should be further investigated (Figure 5D,E).
To further reveal the different usage of HCM clusters in transcriptional programs, we calculated the module score of previously identified signature gene modules for each cluster.We showed that cluster 2 exhibited the highest score of up-regulated signature genes, consistent with its high-level expression of HCM markers (Figure 5F).The preferential activation of gene modules in HCM clusters was consistent with their characteristics.For example, C1 showed the highest score of the ribosome module (UP_M7), and the modules related to muscle and heart development (UP_M1 and UP_M2) were both activated in C2 (Figure 5G).Taken together, these results showed the heterogeneity of cardiomyocytes from HCM.

| Up-regulation of ECM genes in cardiomyocytes of HCM
7][28] Previous studies highlighted the central role of the activated myofibroblasts and TGF-β signalling pathway in cardiac fibrosis. 26Unexpectedly, we found that many   S11).

Up-regulated Genes
In the normal heart tissue, positive expression of all five genes can be seen in the interstitial region, while the cardiomyocytes largely displayed no or weak staining.In the HCM patients, the expression of POSTN was strongly increased in some interstitial regions, which was consistent with previous studies (Figure 6D, However, the results showed that, despite that the mice clearly exhibited cardiac hypertrophy and fibrosis, the expressions of FN1 and CTGF were not elevated, while the DCN expression was slightly increased in the interstitial region (Figure S12, see Section 3).
Together, these results indicated that some ECM genes, that is, FN1 and CTGF, were up-regulated and located in the cytoplasm of the cardiomyocytes in some HCM patients.

| Validation in large bulk RNA-seq dataset
Since cardiomyocytes constitute the majority masses of the myocardium, we supposed that the signature features of cardiomyocytes from HCM identified in this study could be applied to sequencing data from bulk samples, which may facilitate researches about the pathogenesis of HCM.To this end, we downloaded the large-scale bulk RNA-seq data from the Myocardial Applied Genomics Network (MAGNet, www.med.upenn.edu/magnet),which contains 28 HCM samples, 166 DCM samples, 6 PPCM (peripartum cardiomyopathy) samples and 166 non-failing controls.PCA of these data revealed a clear separation of NC and other samples in the second principle component (PC2) (Figure S13a).We then sought to verify the expression of known HCM markers.As expected, both NPPA and NPPB were significantly elevated in all three groups compared with NC, while the ACTN1 only significantly up-regulated in HCM, and ACTA1 was not increased in PPCM (Figure S13b).Analysis of signature gene sets revealed high correlations between pathological state and module score, as the signature up-regulated genes were positively correlated with PC2 (Pearson's r = 0.64, p-value = 2.2eÀ43), while signature downregulated genes were negatively correlated (Pearson's r = À0.85 pvalue = 2.3eÀ102) (Figure S13c,d).The score of each gene module also changed consistently, with UP_M1, UP_M2 and UP_M6 showing the most significant up-regulation in HCM (Figure S13e).
What's more, we also verified the expression of key TFs and their regulons identified by our SCENIC analysis in MAGNet dataset, which further confirmed our results (Figure S13f,g).These results suggested that the molecular mechanism identified in our singlecardiomyocytes analysis could also be validated in larger experiments.

| DISCUSSION
In this study, we used high-resolution single-cardiomyocyte RNA-seq for comparing gene expression patterns of the HCM and NC cardiomyocytes.Combining with the published human single-cell and singlenucleus RNA-seq datasets, we systematically investigated the key transcriptional alterations of HCM cardiomyocytes.We also addressed the heterogeneity of HCM cardiomyocytes by performing single-cell clustering and analysing the preferential usage of gene modules among subpopulations.Our immunofluorescence staining results interestingly showed that some ECM genes, that is, FN1 and CTGF, were up-regulated and located in the cytoplasm of the cardiomyocytes in some HCM patients.
With the rapid development of single-cell RNA-seq technologies, various approaches have been established and used in a wide range of fields.However, the differences across platforms, especially when applied to specialized cell types such as cardiomyocytes, were required to estimate.In this study, we demonstrated the distinct gene expression patterns between scRNA-seq and snRNA-seq data when comparing our data with other published human HCM datasets.Due to the large cell size of cardiomyocytes, the cytoplasm of cardiomyocytes may contain more mRNA molecules than other cell types, while these mRNAs may not be sufficiently captured by single-nucleus methods.This was consistent with previous reports that snRNA-seq may result in the underrepresentation of some cell types or depletion of a small set of genes. 29,30However, snRNA-seq of cardiomyocytes is easier to operate and can achieve higher throughput with fluid-or droplet-based devices.Thus, these two methods are complementary.
Using high-resolution scRNA-seq, our results revealed the common gene expression changes of the HCM cardiomyocytes compared with other studies, 8,11 including the prominent up-regulation of NPPA, NPPB and biological pathways such as 'muscle contraction' and 'heart development'.Consistent with the previously reported elevated expression of ACE2 in HCM 14 and the potential link between HCM and COVID-19, 31 we showed that about 20 genes related to SARS-Cov-2 infection and immune response were up-regulated in HCM.
We also found that ACE2 was co-regulated with NPPB in our data.
Using SCENIC analysis, several differential activated TFs between HCM and NC were identified, including three TFs (ETV1, TCF4 and KLF5), which were validated in other datasets and large-scale bulk RNA-seq data.Interestingly, the target genes of ETV1, TCF4 and KLF5 also included NPPA, TPM3, RTN4 and TNNT1, indicating a potential link between the TFs and marker genes of HCM.Further research about the underlying regulatory mechanism of these TFs and their target genes may provide more information about the pathogenesis of HCM.
Importantly, we further identify the conserved transcriptional programs across datasets by performing integrative analyses.To overcome the batch effects between datasets, a differential-feature-based strategy were used to identified common signatures of HCM cardiomyocytes.Moreover, our high-resolution scRNA-seq allowed us to further detect robust modules from these signature genes, which provided a more comprehensive understanding of the transcriptional programs in the HCM cardiomyocytes.
The heterogeneity of cardiomyocytes in both normal and disease status has been reported in recent single-cell research.In this study, using our high-resolution single-cell RNA-seq data, five distinct clusters of HCM cardiomyocytes were identified.Consistently, a cluster with high-level expression of NPPA and NPPB (HCM_C2) was reported by two other studies 9,11 and the cluster showed response to growth factor (HCM_C3), featured by expression of FGF12 was also identified in Liu et al. 9 Additionally, clusters related to viral mRNA translation (HCM_C1), energy homeostasis (HCM_C4) and chromatin remodelling (HCM_C5) were also captured in our data, reflecting the advantages of high-resolution data.
Previous bulk RNA-seq studies on HCM have observed the upregulation of ECM genes in the HCM heart tissue; however, the cardiomyocyte and the myofibroblast were analysed in a mixed manner in these studies. 28,32,33Here, our scRNA-seq data showed that the expression levels of some ECM genes were elevated in cardiomyocytes of HCM, which were confirmed by reanalyzing previous datasets.Our immunostaining results interestingly showed that the elevation of FN1 and CTGF were mainly located in the cytoplasm of the cardiomyocytes in two of three HCM patients.As the ECM genes generally do not re-enter the cell after secretion, these results thus allowed us to conclude that these ECM genes were up-regulated in the HCM cardiomyocytes, which validated the scRNA-seq results.
Consistent with our results, a previous study also reported the upregulation of CTGF in the cytoplasm of cardiomyocytes in dilated cardiomyopathy patients. 34Also, several studies have reported that the elevation of CTGF during mouse heart hypotrophy is mainly derived from cardiomyocytes. 34,356][37] The accumulation of FN1 in the cytoplasm of human HCM cardiomyocytes has not been previous reported.In two of three HCM patients, the expression of FN1 predominantly located in the cytoplasm of the cardiomyocytes, and it appeared to accumulate in the endoplasmic reticulum of HCM cardiomyocytes, which may increase endoplasmic reticulum stress in cardiomyocytes. 38In another HCM patient, in contrast, FN1 was clearly located in the interstitial region.This suggested that the cytoplasm location of the FN1 protein was stage-specific, possibly occurring in the late stage of human cardiomyocyte hypertrophy, and this also served as a control supporting the specificity of the FN1 antibody.We did not replicate the expression patterns of FN1, CTGF and DCN in a knock-in mouse HCM model, using the same antibodies and same conditions for the human sample experiments.This is a new HCM mouse model made by us, which has not been published.In 16 weeks after birth, the HCM mice exhibit clear cardiac hypertrophy and the Masson's trichrome staining demonstrates clear myocardial fibrosis of the heart tissue in the knock-in mice.It should be noted that, both FN1 and CTGF were not elevated even in the interstitial region in this model.This may suggest that our findings are unique to human HCM, but more mouse models and disease stages should be examined to make this conclusion.

F
I G U R E 1 Single-cardiomyocyte analysis reveals transcriptome alterations in HCM cardiomyocytes.(A) Schematic diagram of the summary of this study.(B) Comparison of quality control metrics between our data and published single-cell or single-nucleus RNA-seq of cardiomyocytes from HCM. (C) The clustering result of cells from HCM patients and NC subjects was visualized with PCA.HCM, hypertrophic cardiomyopathy; NC, normal control.(D) The expression of representative cardiomyocyte markers (TNNT1, TPM3 and ACTN1) and HCM markers (NPPA, NPPB and ACE2).(E) Heatmap showing significantly differentially expressed genes (DEGs) in CMs between HCM patients and NC subjects (adjusted pvalue < 0.05; fold change > 2).(F) Volcano plot of DEGs between HCM and NC related to (E).

F
P-value) Negative regulation of immune system process Regulation of transforming growth factor beta production Response to mechanical stimulus Actomyosin structure organization Transmembrane receptor protein Ser/Thr kinase signaling I G U R E 2 Enrichment analysis and pathway analysis in HCM.(A) Gene ontology (GO) enrichment result of DEGs between HCM and NC.(B) GSEA enrichment result of DEGs between HCM and NC in the KEGG signalling pathway.(C) Heatmap of differentially expressed genes between HCM and NC in the 'SARS-CoV-2 Infection' (R-HSA-9694516) pathway.(D) Representative genes from the 'SARS-CoV-2 Infection' and 'SARS-CoV-2 activates/modulates innate and adaptive immune responses' (R-HSA-9705671) pathway.(E) Ligand-receptor pair expression levels among CMs from HCM patients and NC subjects.
F I G U R E 3 SCENIC analysis in HCM.(A) TSNE visualization with top 30 PCA components of SCENIC regulon AUC scores.(B) Differentially activated regulons between HCM and NC cardiomyocytes.(C) Expression of representative up-regulated regulons in HCM cells, including TCF4, ETV1 and KLF5 regulons.(D) Gene module scores of target gene regulons of TCF4, ETV1, KLF5, CEBPB and CEBPD in NC and HCM, respectively.(E) Expression of TCF4, ETV1, KLF5, CEBPB and CEBPD in four HCM datasets, coloured by average expression levels.The size of the dot represented the percentage of expressed cells.(F) Relative gene module scores of target gene regulons of TCF4, ETV1, KLF5, CEBPB and CEBPD in four HCM datasets.Each dot was coloured by the relative module score between HCM and NC, the size of the dot represented the Àlog10(pvalue), two-sided t-test was used to determine the significance levels.
cluster 1 was characterized by ribosome genes (e.g., RPL30, RPL23A) and enriched in the 'Viral mRNA Translation' pathway and 'SARS-CoV-2-host interactions' (RPS25, UBB), which pointed to its association with the SARS-CoV-2 susceptibility of HCM patients.Cluster 2 exhibited the highest levels of canonical HCM markers NPPA and NPPB among all five clusters, and was relatively enriched in the 'hypertrophic cardiomyopathy' and 'cardiac muscle contraction' pathways.Cluster 3 showed activation of the growth factor pathway, including the high expression of FGF12.Cluster 4 was enriched in genes of the 'heart contraction' (e.g., ACTC1, MYL3 and ACTA1) and 'diabetic cardiomyopathy' (e.g., TNNI3, MYH7 ECM genes, including LUM, DCN, FN1, CTGF and COLIA2, as well as the 'extracellular matrix organization' pathway, were prominently upregulated in all the subpopulations of HCM cardiomyocytes, especially in cluster 4 (Figure6A-E and Figures 1E,F and 2A,B).We confirmed that a large proportion of up-regulated DEG genes intersecting between our data and previous scRNA studies on both human DCM patients and mouse cardiac hypertrophy models were ECM genes and the ECM pathway was commonly activated in the HCM cardiomyocytes of published HCM scRNA-seq or snRNA-seq datasets (FiguresS8 and S9).5

4
Legend on next page.To validate the up-regulation of the ECM genes in HCM cardiomyocytes, we performed immunofluorescence staining of FN1, CTGF, DCN, POSTN and VIM on heart tissues from HCM patients and NC (n = 3 for each group) (Figure 6B-D, Figures S10 and

Figure S11 )
FigureS11); the expression of VIM also appeared to be slightly increased (FiguresS10 and S11).The expressions of FN1, CTGF and DCN were evidently increased in the HCM heart tissues.Interestingly, in two HCM patients (Patient #347 and #350), we found that the elevated expressions of FN1 and CTGF were mainly located in the cytoplasm of the HCM cardiomyocytes, and the elevated expression of DCN was located both in the interstitial region and the cytoplasm (Figure6B,C, FigureS10).Strong expression of FN1 was found in the perinuclear space of the hypertrophic cardiomyocytes, suggesting that it was accumulated in the endoplasmic reticulum (Figure6B, FigureS11).CTGF displayed an intriguingly asymmetric pattern in the cytoplasm, and the expression was also found in the nuclei (Figure6C, FigureS11).Another HCM patient (Patient #470) did not show such evident cytoplasm FN1 and CTGF gene expression patterns, and prominent FN1 expression was clearly located in the interstitial region (Figure6B, FigureS11).To quantify the expression of these ECM genes within the cardiomyocytes, we limited the region of interest (ROI) for measurements based on the immunofluorescence signal of cardiac troponin T (CTNT), which selectively expressed inside the cardiomyocytes (FigureS11).As a result, the mean cardiomyocytic fluorescence intensities of FN1, CTGF and DCN were significantly elevated in HCM, with a fold change of 8.27, 8.38 and 12.86, respectively.The factions of signal in ROI of FN1, CTGF and DCN were also increased by 21.4%, 22.5% and 18.1%, respectively.Although POSTN was strongly elevated in the interstitial region in HCM patients, its expression in the cardiomyocytes was only slightly increased (fold change = 7.73; increased faction = 4.1%), which was consistent with the RNA expression data (fold change = 1.55, p-value = 4.1eÀ13) (Figure6E).We also performed immunofluorescence staining for FN1, CTGF and DCN in a double mutant knock-in mouse HCM model (see Methods).

F I G U R E 4
Comparative analysis revealed signature gene modules in HCM.(A) Summary of four datasets used in comparative analysis.The sequenced cell number of HCM and NC with the related number of samples in the parentheses were listed.*For the Wehrens et al. dataset, the NC data was derived from Wang et al. (1400 cells; n = 7) and Litvinukova et al. (27,604 cells; n = 14).(B) Heatmap showed the binned expression levels of the whole transcriptome of four datasets.(C) Pearson's correlation of binned expression levels of four datasets.(D) Normalized expression levels of NPPA, NPPB, TPM3 and RTN4 of four datasets, visualized with violin plots.(E) Venn diagram and upset plot of up-regulated differential expressed genes identified from four datasets (adjuster p-value < 0.05, fold change > 1.5), respectively.(F) Co-expression heatmap of signature up-regulated genes in cardiomyocytes from HCM of this study, visualized by Pearson's correlation coefficient between genes.Gene modules were classified by unsupervised hierarchical clustering.(G) Relative gene module scores of gene modules in four HCM datasets.Each dot was coloured by the relative module score between HCM and NC, the size of the dot represented the Àlog 10 ( p-value), two-sided t-test was used to determine the significance levels.(H) Representative gene ontology enrichment results of signature up-regulated gene module UP_M1, UP_M2 and UP_M6.

2 F
I G U R E 5 Heterogeneity revealed by subpopulations of cardiomyocytes from HCM. (A) UMAP1 showed the subpopulation of cardiomyocytes from HCM and NC, cells were coloured by clusters.(B) Summary of the individual information of each cell for each cluster.(C) Heatmap showed the differential expressed gene between the five clusters from HCM (adjusted p-value < 0.05; log 2 (fold change) > 0.5).(D) The violin plot showed the normalized expression levels of representative DEGs between HCM subpopulations.(E) Representative gene ontology results of DEGs of HCM subpopulations related to (D).(F) Module score of signature up-and down-regulated genes in all clusters of NC and HCM.(G) Heatmap showed the module score of signature up-regulated modules in HCM subpopulations.

F
I G U R E 6 Up-regulation of ECM genes in cardiomyocytes of HCM.(A) Expression of representative ECM genes in cardiomyocytes of HCM and NC in this study.(B-D) Immunofluorescence staining of FN1 (B), CTGF (C), POSTN (D) in HCM and NC.Individual cardiomyocytes were defined using CTNT (green) and DAPI (blue).FN1, CTGF and POSTN are ECM genes (red).Scale bar, 50 μm.(E) Quantitative statistic of immunofluorescence staining results.The mean fluorescence intensities and fractions of signals in the region of interest (ROI) were calculated.Each dot represented results from three representative fields from an individual.