Spatiotemporal Characterization of Human Early Intervertebral Disc Formation at Single‐Cell Resolution

Abstract The intervertebral disc (IVD) acts as a fibrocartilaginous joint to anchor adjacent vertebrae. Although several studies have demonstrated the cellular heterogeneity of adult mature IVDs, a single‐cell transcriptomic atlas mapping early IVD formation is still lacking. Here, the authors generate a spatiotemporal and single cell‐based transcriptomic atlas of human IVD formation at the embryonic stage and a comparative mouse transcript landscape. They identify two novel human notochord (NC)/nucleus pulposus (NP) clusters, SRY‐box transcription factor 10 (SOX10)+ and cathepsin K (CTSK)+, that are distributed in the early and late stages of IVD formation and they are validated by lineage tracing experiments in mice. Matrisome NC/NP clusters, T‐box transcription factor T (TBXT)+ and CTSK+, are responsible for the extracellular matrix homeostasis. The IVD atlas suggests that a subcluster of the vertebral chondrocyte subcluster might give rise to an inner annulus fibrosus of chondrogenic origin, while the fibroblastic outer annulus fibrosus preferentially expresseds transgelin and fibromodulin . Through analyzing intercellular crosstalk, the authors further find that notochordal secreted phosphoprotein 1 (SPP1) is a novel cue in the IVD microenvironment, and it is associated with IVD development and degeneration. In conclusion, the single‐cell transcriptomic atlas will be leveraged to develop preventative and regenerative strategies for IVD degeneration.


Introduction
The axial skeleton protects our body during daily physical movement, including the vertebral bodies (VBs), intervertebral discs (IVDs), tendons/ligaments, and muscles. IVDs act as fibrocartilaginous joints to anchor adjacent VBs, which is essential for the mechanical stabilization. [1] Mature IVDs consist of the central nucleus pulposus (NP), surrounding annulus fibrosus (AF), and end plate (EP) that adjoins the VB. The NP originates from the embryonic notochord (NC), while the AF and EP are derivates of the mesenchymal sclerotome. [2] The characteristic feature of embryonic IVD formation is the NC-to-NP transition, after which NC cells are thought to begin reducing and even disappear in postnatal human IVDs. [1,3] NC cells have been reported to be the precursors of all NP cells, [4] and the loss of NC cells is closely associated with IVD degeneration (IVDD) onset. [5] IVDD is known to be a major cause of low back pain, which can be severe and often leads to disability. [6] A poor understanding of the cellular and molecular basis underlying IVD formation and degeneration limits the development of disease-modifying therapeutics for IVDD. [1] Recently, single-cell RNA sequencing (scRNA-seq) has provided a powerful alternative to study cellular heterogeneity and to help uncover regulatory relationships among genes during organ development and degeneration. [7] To develop regenerative strategies for IVDD, some recent scRNA-seq studies revealed the cellular heterogeneity within adult mature IVDs in both humans and mice. [4b,8] However, limited information is known about early human IVD formation, especially in regard to the cellular heterogeneity during the embryonic NC-to-NP transition, due to the lack of high-precision and unbiased resolutions for distinguishing cell populations during early human IVD formation.
In this study, we aimed to provide a single-cell view of human early IVD formation. We profiled 177 725 cells from the axial skeleton across 13 human embryos at different gestational weeks, along with 22 838 cells from the mouse axial skeleton. Combined with lineage tracing and spatial transcriptome analysis, we comprehensively characterized the transcriptome features of developing NC/NP cells, vertebral chondrocytes, and AF cells during early IVD formation. We identified two novel NC/NP clusters as well as potential inner AF/EP progenitors, and we decoded the intercellular crosstalk with respect to early IVD formation. Our results provide new cellular-level insights into the transcriptional alterations associated with early IVD formation, and this could be leveraged in the development of preventative and regenerative strategies for IVDD.

Single-Cell Profiling of the Embryonic Axial Skeleton Cell Atlas during Early IVD Formation
Human fetuses were obtained from the First Affiliated Hospital of Sun Yat-sen University from donors who provided informed consent for the use in research of fetal material arising from the termination of their pregnancy. Morphologically normal embryos at the 7th to 11th weeks of gestation, when the NP undergoes a chronological transition from a rod-like notochord to segmented fibrocartilaginous tissue, [9] were staged by an ultrasound system and collected. We microdissected and dissociated the axial skeleton, including the presumptive VB, annulus fibrosus (AF), NP, and other adherent soft tissues (i.e., muscles, tendons, and ligaments), for scRNA-seq. A total of 13 fetuses were analyzed using chromium (10x Genomics, Pleasanton, CA, USA) ( Figure 1A). Based on the morphology of the developing IVDs, these specimens were grouped into three stages: the notochord stage (Noto, week 7, n = 4), the transition stage (Trans, weeks 8 and 9, n = 4), and the nucleus pulposus-like stage (NPL, weeks 10 and 11, n = 5) ( Figure 1B).
The characteristic genes of SOX10 + were enriched in previously reported markers of notochordal precursors (FGF9 and FGF20), [29] cell fate determination (SOX11), and stem cell maintenance (HMGB3 and HMGA1) ( Figure 2B and Table S3, Supporting Information). These molecular signatures were also similarly expressed in Sox10 + ( Figure 2B and Table S4, Supporting Information). The expressions of both SOX10 in the human fetal spine and Sox10 in the mouse embryonic spine decreased with the NC-to-NP transition ( Figure 2D,E). Lineage tracing was then performed to better characterize Sox10 + subpopulation during early IVD formation. First, we crossed Sox10-Cre mice with mT/mG mice, in which transgene expression of tdTomato converts to the expression of enhanced green fluorescent protein (GFP) following Cre-mediated excision. [30] The GFP + Sox10 lineage cells (Sox10 + cells and their descendants) included most NP cells and dorsal root ganglion (DRG) cells in the postnatal mouse axis ( Figure 2F). Next, we used the CreER T2 system to achieve temporal control of Cre recombinase through tamoxifen injection to explore Sox10-expressing cells after the NC-to-NP transition. Tamoxifen administration did not mark any NP cells at either postnatal day 3 (P3) or week 3 (3w) in Sox10-CreER T2 ; mT/mG mice ( Figure 2G), which was similar to the outcome of Shh-expressing NC/NP cells. [31] The notochordal signaling is responsible for neural tube closure, while its defects result in spina bifida and exencephaly. [32] Gene Ontology (GO) term enrichment analysis revealed the capacity of SOX10 + and Sox10 + subpopulations in regulating epithelial tube morphogenesis, mesenchyme development, tube formation, and cartilage development ( Figure 2H). In addition, we found that SOX10 + subpopulation highly expressed BMP7, SPINT2, and CLDN3/4/6/7 (Table S3, Supporting Information), which are critical for neural tube closure. [33] The single-cell regulatory network inference and clustering (SCENIC) [34] showed that SOX10 + subpopulation was enriched in regulons including SOX10, FOXG1, and SIX1 ( Figure S6A, Supporting Information). GO analysis further revealed that the downstream target genes of SOX10 regulon were enriched in "Notch signaling pathway," "neural tube closure," and "positive regulation of activin receptor signaling pathway" (Figure S6B,C and Table S5, Supporting Information). Notch signaling pathway is evolutionarily conserved and essential for midline development, [35] and activin signaling is required for notochordal cell differentiation. [36] Taken together, our results indicated that SOX10 is a novel marker for early NC cells in both humans and mice, and that SOX10 + cells may be potential notochordal stem/progenitor cells.
The characteristic genes of APOA1 + were enriched in retinoic acid binding (APOA1), [37] retinoic acid synthesis (ALDH1A2), and the transportation of retinol (RBP1) ( Figure S7A and Table  S3, Supporting Information). Although retinoic acid is a wellknown teratogen capable of causing neural tube defects, a physiological level of retinoic acid is required for the formation of neural tubes/spinal cords. [38] In mNC subpopulations, we found that the Tcf21 + subpopulation preferentially expressed Wt1 and Wnt5a ( Figure S7B, Supporting Information), which are required for anteroposterior axis specification. [39] GO enrichment analysis showed that the characteristic genes of the Tcf21 + subpopulation were associated with "anterior/posterior pattern specification," "regulation of Wnt signaling pathway," "segmentation," "neural tube development," and "neural tube closure" ( Figure S7C, Supporting Information). Furthermore, gene set variation analysis (GSVA) showed that the Tcf21 + subpopulation was closely associated with "notochord morphogenesis" ( Figure S7D, Supporting Information), the process in which the anatomical structures of the notochord are generated and organized (GO:0048570). These indicated that the Tcf21 + subpopulation might be involved in regulating notochord development and neural tube closure. Meanwhile, GO enrichment analysis showed that the characteristic genes of the Rprm + subpopulation are associated with "epithelial tube morphogenesis," "neural tube development," "neural tube formation," "neural tube closure," and "regulation of retinoic acid receptor signaling pathway" ( Figure S7C, Supporting Information). This indicated that the Rprm + subpopulation might be involved in retinoic acid signaling pathway and neural tube closure.
The NPY + subpopulation was exclusively distributed at the Noto stage and preferentially expressed WT1 and WNT5A (Figure 2A,B and Table S3, Supporting Information), which are required for anteroposterior axis specification. Accordingly, GSVA showed that the NPY + subpopulation was closely associated with www.advancedsciencenews.com www.advancedscience.com "notochord morphogenesis" ( Figure 2I). The pseudotime analysis revealed a developmental trajectory among hNC subclusters ( Figure S7E,F, Supporting Information), in which the NPY + subpopulation was located at the start. The route then seemed to develop into two branches (Branch 1 and 2), the CTSK + and the TBXT + subclusters were mainly distributed at each end. These indicated that NPY + subpopulation might be essential for early IVD formation.
Taken together, we revealed the heterogeneous cellular map of NC/NP during early IVD formation in humans and mice, and then we validated Sox10-expressing early NC cells through lineage tracing.

TBXT + and CTSK + Are Two Major Matrisome Clusters That Regulate Extracellular Matrix (ECM) Homeostasis during Early IVD Formation
Mature NP contains various ECM proteins, including proteoglycans, fibrillar collagens, and elastin fibers, which maintain the height and hydrostatic pressure of IVDs. [1] Within mature NP, randomly organized collagen fibers provide tensile strength, and proteoglycans create a large osmotic swelling pressure. [40] We found that ECM proteins, such as ACAN and COL1A1, dramatically increased with IVD development ( Figure 3A). Our in-house bulk RNA-seq results of human fetal NC captured by laser capture microdissection (LCM) also revealed elevated expressions of collagens (COL2A1 and COL11A1) and proteoglycans (ACAN and FMOD) from nine to 13 weeks of gestational age ( Figure 3B and Table S6, Supporting Information).
ECM genes are categorized into core matrisome (collagens, proteoglycans, and glycoproteins) and matrisome-associated (secreted factors, ECM affiliation, and ECM regulators) groups (matrisomeproject.mit.edu). [41] We first evaluated the average expression of six modules in these seven hNC subpopulations ( Figure S8A, Supporting Information) and compared the expression abundance of ECM genes in these hNC subpopulations ( Figure 3C,D and Figure S8B, Supporting Information). The ECM homeostasis was mainly regulated by both the TBXT + and CTSK + subpopulations, and GO analysis also showed that their characteristic genes were enriched in "ECM organization" ( Figure 3E). More specifically, the collagens were preferentially expressed in the CTSK + subpopulation, and GO analysis also showed that the characteristic genes of CTSK + were associated with "collagen fibril organization" and "collagen metabolic organization" ( Figure 3E). Thus, CTSK + subpopulation might preferentially contribute to collagens in NP tissue, providing tensile strength. The TBXT + subpopulation preferentially expressed ACAN ( Figure 3F and Figure S8C, Supporting Information), which encoded the most abundant proteoglycan, aggrecan, within NP. Thus, the TBXT + subpopulation might mainly contribute to the proteoglycans in NP tissue, creating a large osmotic swelling pressure. In humans, PDGFRA + subpopulation was also enriched in "ECM organization" and "collagen fibril organization" ( Figure S8D, Supporting Information).
Furthermore, SCENIC regulatory network analysis revealed that the TBXT + subpopulation was enriched for regulons such as SOX9, SOX5, and SOX6 ( Figure S6A,D, Supporting Information), which are critical regulators of ACAN expression. [42] Mean-while, GO analysis revealed that the characteristic genes of the CTSK + subpopulation were associated with "response to TGFbeta" (Figure 3E), which is essential for NP cell maintenance and ECM homeostasis. [43] The major ligand that activates TGFpathway in NC/NP cells, TGFB3, [36] was predominantly expressed in the CTSK + subpopulation and elevated with IVD development ( Figure S8C, Supporting Information). It is accepted that the ligand binds to TGFBR2 at the cell surface, which then recruits TGFBR1 to form a heterotetrameric complex with an additional TGFBR1 and TGFBR2. [44] We found that TGFBR2 was preferentially expressed in the CTSK + and the TBXT + subpopulations ( Figure S8C, Supporting Information). These indicate that the CTSK + might modulate ECM homeostasis through the TGFsignaling pathway. Taken together, these bioinformatic clues suggest that the TBXT + subpopulation may mainly contribute to proteoglycan homeostasis under the regulation of SOX9, SOX5, and SOX6, while the CTSK + subpopulation may be mainly involved in collagen homeostasis through the TGF-signaling pathway.
In the human fetal spine, CTSK was scarcely detected at the Noto stage, while its expression increased with ECM formation in developing NC/NP cells ( Figure 3A,G), which was consistent with the elevated expression of CTSK in human fetal LCM specimens ( Figure 3B). Next, we crossed Ctsk-Cre mice with mT/mG mice to further investigate the dynamic expression of Ctsk during early IVD formation ( Figure 3H). We found that GFP + Ctsk lineage cells began to appear at P0 in the NP as well as in the ossification center of the VBs, rather than the EP and AF. Of note, the number of GFP + Ctsk lineage cells increased with IVD development, which was consistent with our abovementioned results. This finding indicated that CTSK is a late marker of developing NC/NP cells and that CTSK + might be responsible for ECM homeostasis in fibrocartilaginous NP.
The mature NP in the IVD plays a role in absorbing compressive forces during normal motion and activities. We found that the CTSK + subpopulation was enriched in "response to mechanical stimulus" ( Figure 3E). We then compared the expressions of different mechanosensitive ion channels and found that CTSK + cells highly express PIEZO2, while TBXT + cells preferentially express PIEZO1 and TRPV4 ( Figure 3I). In mice, T + preferentially expressed Trpv4 ( Figure S8E, Supporting Information). This indicated that CTSK + and TBXT + subpopulations might be responsible for mechanotransduction within developing NC/NP cells.
In addition to proteoglycans and collagens, cytokeratin, which is an important component of intermediate filaments, was reported to be present in the human notochord. [45] We found that KRT15 + subpopulation was enriched for cytokeratin (KRT15, KRT17, and KRT19) and epithelial markers (EPCAM and PERP) (Tables S3 and S4 and Figure S8F, Supporting Information), and GO analysis showed that KRT15 + was enriched in terms associated with "keratinization" and "intermediate filament organization" ( Figure 3E). This indicated that KRT15 + is responsible for the epithelial characteristics of the notochord. [45a] Taken together, we demonstrated the cellular heterogeneity and molecular dynamics related to ECM homeostasis in developing NC/NP cells, and we propose that TBXT + and CTSK + are two major matrisome clusters that act during human early IVD formation.

Temporal and Functional Characterization of Developing Vertebral Chondrocytes
To characterize the dynamics and functions of developing vertebral chondrocytes, we aimed to partition hChon into four subclusters ( Figure 4A; Figure S9A and Table S7, Supporting Information); however, mChon was ultimately only subdivided into Col2a1 + and Col1a1 + clusters due to the limited amount of cell number ( Figure S9B, Supporting Information). hChon1, which is characterized by the expressions of MATN1 and CYTL1, is the major component of VB anlagen, especially at the Noto stage ( Figure 4A,B). The expression of MATN1 was restricted in epiphyseal chondrocytes ( Figure 4C), [46] and the characteristic genes of hChon1 were enriched in terms associated with "cartilage development" and "ECM organization" (Figure S9C, Supporting Information). This finding indicated that hChon1 might form central cartilage primordia. hChon2 preferentially expressed COL1A1, POSTN, and SFRP2 ( Figure 4B and Table S7, Supporting Information), while POSTN ( Figure 4C) and SFRP2 [47] were markers of the perichondrial cells that surrounded and lined chondrogenic condensations. 33 core regulons were filtered to discriminate these hChon clusters by SCENIC analysis ( Figure 4D and Table S8, Supporting Information). The axial cartilage at the Noto stage mainly consisted of hChon1 and hChon2, both of which exhibited strong enrichment of HMGA1, HMGB3, and SOX11 ( Figure 4D). More specifically, SOX11 is widely expressed during early embryogenesis and is essential for stem cell/progenitor survival. [48] HMGA1 binds directly to the SOX9 promoter to induce its expression, [49] and the expression of SOX9 is essential for chondrogenic mesenchymal condensation. [50] hChon3 increased drastically with VB development (Figure 4A) and highly expressed VEGFA (angiogenesis) and HHIP (IHH target gene) ( Figure 4B). The characteristic genes of hChon3 were enriched in "response to hypoxia," "ossification," and "osteoblast differentiation" ( Figure S9C, Supporting Information). Moreover, the HIF1A, RARG, and NFIX regulons were specific to hChon3 ( Figure 4D). HIF1A is a master regulator of hypoxic vascular responses, [51] and RARG is required for the differentiation of hypertrophic chondrocytes; [52] both are necessary for endochondral ossification. These results indicated a potential role of hChon3 in regulating endochondral ossification.
hChon4 was exclusively distributed at the NPL stage (Figure 4A), and was marked by the expressions of GDF5, PIEZO2, and ADGRG6 ( Figure 4B). GDF5 is required for normal joint development in both humans and mice, [53] and Gdf5 −/− mice showed smaller nucleus pulposus regions and irregular endplates. [54] We noticed that hChon4 also preferentially expresses CCN2 (Table S7, Supporting Information), and neonatal Ccn2-GFP mice showed that Ccn2 was expressed in the area where the EP and growth plate form later as well as in the ossification center and NP ( Figure 4E and Figure S9D, Supporting Information). In addition, the loss of Adgrg6 was reported to cause endplate-driven herniation in mice. [55] hChon4 also expresses markers of human skeletal stem cells, such as THY1 (CD90), PDPN, and CD164 ( Figure 4F), since PDPN + CD146 (MCAM) − CD73 + CD164 + was reported to mark human skeletal stem cells. [56] Meanwhile, hChon4 exhibits strong enrichment of EBF1, NFAT5, and SOX9 ( Figure 4D). SOX9 is a pivotal tran-scriptional factor in chondrocytes, and NFAT5 contributes to IVD embryogenesis. [57] Notably, the expression of CTSK was quite low in hChon4 ( Figure 4F), while Ctsk + lineage cells did not comprise AF and EP cells in the mouse postnatal IVD ( Figure 3H). Thus, these data suggest that hChon4 is involved in EP and IAF development, and it might be a kind of chondrogenic progenitor cell.
Taken together, our results temporally and functionally demonstrate the development of vertebral chondrocytes during human early IVD formation, and they further suggest that GDF5 + CCN2 + hChon4 might be a human EP/IAF progenitor.

Temporal Characterization of the Developing Outer AF (OAF)
We found that cluster 6 (in hCT) highly expressed PAX1 and PAX9 through all three stages and increased the expressions of MKX and COL5A1 (Figure 5A,B and Figure S10A, Supporting Information). PAX1/9 are known to be downregulated once prechondrogenic cells mature into chondrocytes and they persist to be expressed in OAFs to regulate their development. [58] MKX was reported to be preferentially expressed in OAF and to promote its maintenance and regeneration. [59] Thus, we defined cluster 6 as the developing OAF. This OAF also showed increased expression of other AF markers, such as TAGLN, FMOD, and BGN ( Figure 5B,C). Compared to hChon4, which is supposed to regulate IAF development, this OAF showed reduced expression of CCN2 ( Figure S10B, Supporting Information). In mice, we found that cluster 4 preferentially expressed Pax1, Mkx, and Pdgfrl (Figure 5D,E), and defined it as the developing AF in mice. The expressions of Tagln and Fmod in mouse OAFs were also validated by immunostaining ( Figure 5F).
As mentioned above, IVD is essential for mechanical stabilization of the axial skeleton. We next focused on the expressions of different mechanosensitive ion channels in the developing OAF at the different stages. We found that the expression of PIEZO2 drastically increased with OAF development ( Figure 5G). Then, when the Piezo2-EGFP-IRES-Cre mouse was crossed with a Rosa-tdTomato reporter mouse (Ai9) to perform lineage tracing, we found that the tdTomato + Piezo2 lineage cells (Piezo2 + cells and their descendants) included some NC/NP cells and most AF cells  ( Figure 5H), since the EGFP signal was too weak to be reliably detected. [60] These results indicate a potential role of PIEZO2 in AF biomechanics.
Taken together, our results characterize the temporal changes of the mesenchyme-derived OAF, which differ from those of the chondrogenic IAF, during early IVD formation.

Signaling Network for the Intercellular Crosstalk between Developing NC/NP Cells and Vertebral Chondrocytes during Early Human IVD Formation
Notochordal signaling is essential for VB segmentation and formation, [61] so we next investigated the cellular interactions www.advancedsciencenews.com www.advancedscience.com among NC/NP cells and vertebral chondrocytes using CellPhoneDB. [62] Relative active bidirectional interactions among these 11 clusters showed highly regulated cellular communications ( Figure 6A) in which BMP, WNT, FGF, and TGFpathways were vital signaling pathways regulating early IVD formation ( Figure S11A, Supporting Information).
First, we determined that TGF-2 (mainly secreted by hChon1 and hChon4) and TGF-3 (mainly secreted by hChon2 and hChon3) modulated matrisome clusters (CTSK + , TBXT + , and PDGFRA + ) via TGFBR3 ( Figure 6B). We also found that hChon3 modulated the developing NC/NP cells via TGFB1-TGFBR3 (Figure 6B). The TGF-family is essential for the proper formation and ECM metabolism of IVD. Specifically, Tgfbr2 conditional knockout mice displayed incomplete and even missing IVDs without affecting vertebral chondrocyte differentiation, [63] and a recent study showed that TGF-3 treatment in vitro enhanced the chondrogenesis of adult NP progenitor cells. [4b] Activating TGF-pathway by TGF-3 administration for 15 d promoted human pluripotent stem cells-derived notochord-like cells to differentiate into nucleus pulposus-like cells, characterized by the expression of ACAN and COL2A1. [36,64] The main receiver cells of TGF-signaling were CTSK + , TBXT + , and PDGFRA + subpopulations, which are defined in this study as matrisome clusters. Meanwhile, we found that the potential EP/IAF progenitors (hChon4) also modulated NC/NP via BMP2-SMO and BMP7-PTPRK ( Figure 6B). BMP2/7 enhances matrix production of bovine NP cells, [65] and recombinant human BMP2 is widely applied in spinal fusion in clinics. [66] These data demonstrate that TGF-and BMP signaling from vertebral chondrocytes modulates the anabolism of developing NC/NP cells. WNT4/5a secreted by developing NC/NP cells, except for TBXT + , modulates developing vertebral chondrocytes via Frizzled receptors (FZD1/8), SMO, and ANTXR1 ( Figure 6C). The developing NC/NP clusters, especially CTSK + , also secrete FGF7/10/18 to modulate developing vertebral chondrocytes via FGFR1/2/3 ( Figure 6C). WNT and FGF are highly conserved signaling pathways required for normal skeletal development, [67] and conditioned medium from notochordal cells was reported to enhance the chondrogenesis of human mesenchymal stem cells, without demonstration of soluble factors. [68] Of note, we observed that SPP1 secreted by developing NC/NP cells, especially TBXT + , modulated hChon3 via CD44 (Figure 6C). Previous bioinformatic analysis reported that SPP1 is involved in the interaction between NP progenitor cells and EP stromal cells at the adult stage, [4b] but no direct evidence has been reported yet. We found that SPP1 is expressed in human developing NC/NP cells as well as in the VB ossification center since SPP1 is a known osteogenic marker ( Figure 6D,E). In addition to being a well-known marker for stem cells, CD44 plays important roles in various physiological processes, including limb development, and chondrocytes with greater CD44 expression show higher chondrogenic capacity. [69] Here, we found that CD44 was ubiquitously expressed in the human fetal spine, including preferentially in the NP, AF, and VB ossification centers ( Figure 6D,F). Meanwhile, we found that blood vessels, marked by the expression of EMCN, predominantly lined the VB instead of the NP or AF ( Figure 6G and Figure S11B, Supporting Information). These results prompted us to investigate the role of SPP1 in regulating chondrogenesis. Recombinant mouse Spp1 (rSpp1) treatment significantly upregulated the expression of the master regulator for chondrogenesis (Sox9 [70] ) and decreased the expressions of key regulators of vascularization (Hif1a, Vegfa [71] ) and hypertrophic markers (Runx2 and Cd9 [72] ) in non-induced ATDC5 chondrogenic progenitor cells after three days ( Figure 6H and Figure S11C, Supporting Information). However, the administration of CD44 monoclonal antibodies blocked these changes ( Figure 6H and Figure S11C, Supporting Information). We further found that the expression of Spp1 was drastically downregulated, while the expressions of Vegfa and Col10a1 (hypertrophic marker [73] ) significantly elevated in the puncture-induced IVDD rat model ( Figure 6I and Figure S11D, E, Supporting Information). These results demonstrate that notochordal SPP1 might suppress hypertrophy and vascularization while stimulating chondrogenesis by interacting with CD44.
Notably, SOX10 + also interacted with hChon3 via SPP1-CD44 ( Figure 6C). Meanwhile, we found that SEMA3E, which is secreted by the NC and required for the maintenance of the avascular midline, [74] was preferentially expressed in SOX10 + cells ( Figure S11F, Supporting Information). SEMA3A, another semaphorin that might repel neuronal and vascular ingrowth in the healthy NP, [75] was preferentially expressed in SOX10 + and CTSK + subpopulations, and the expression of SEMA3A in CTSK + increased with IVD development ( Figure S11F, Supporting Information). These results indicate the potential role of SOX10 + and CTSK + , in addition to that of TBXT + , which maintained the avascular microenvironment within developing NC/NP.  Taken together, our results revealed the existence of complicated cellular crosstalk with a hierarchical signaling pathway between developing NC/NP cells and vertebral chondrocytes and further demonstrated that notochordal signaling, including SPP1 and semaphorins, is vital for IVD formation and health.
We next divided the mCT into seven subpopulations (Figure 7D,E; Figure S12C and Table S11, Supporting Information): mCT1, marked by the expressions of Pax1 and Pax9; mCT2, marked by the expressions of Uncx, Nrp1, Nrp2, and Scx; mCT3, marked by the expressions of Zic1 and Zic2; mCT4, marked by the expressions of Ngfr and Hic1; mCT5, marked by the expressions of Pth1r, Irx5, and Irx3; mCT6, marked by the expressions of Tppp3 and Cd34; mCT7, marked by the expressions of Wt1, Gata6, and Cxcl12. Similarly, mCT1-3 subpopulations were identified as mesenchymal sclerotome according to the expressions of Foxc2, Pax1, and Pax9, and mCT6 subpopulation was identified as tendon stem cells according to the expressions of Tppp3 and Cd34.

Spatiotemporal Characterization of Developing IVD Cells by Spatial Transcriptome Analysis
To better characterize the spatial distribution of developing IVD cells, we performed spatial transcriptome analysis using 10x Genomics visium. The clusters were defined morphologically. At E11.5 and E12.5, the developing axial skeleton included two main subclusters: dense Chon1 and loose Chon2 ( Figure S13A, Supporting Information). Chon1 was marked by the expressions of Pax1, Uncx, and Pax9 (Tables S12-S14, Supporting Information) and was the posterior half-sclerotome derivate. Conversely, loose Chon2 was marked by the expressions of Dcn and Igfbp2 (Tables S12-S14, Supporting Information). At E13.5, the defined AF highly expressed Pax1, Bgn, and Tgfbi, while Chon preferentially expressed Matn1, Comp, and Mia (Tables S12-S14, Supporting Information). We found that Pax1 was mainly expressed in the ventral axial skeleton, while Zic1 was preferentially expressed in the dorsal region as well as in the spinal cord ( Figure S13B, Supporting Information), which was consistent with our above-mentioned results. Meanwhile, HOX genes are known to specify the morphological identity of the vertebrae and spinal cord, [81] and we determined that Hox10/11 genes were predominantly expressed in the caudal region, while Hox4/5 were preferentially expressed in the rostral region ( Figure S14, Supporting Information).
Although the resolution of the 10x Genomics visium was not ideal for capturing developing NC/NP cells (E11.5, 4 spots; E12.5, 8 spots; and E13.5, 14 spots), we succeeded in determining their signature genes (Tables S12-S14, Supporting Information), which was in line with our abovementioned data. Specifically, these NC clusters preferentially expressed canonical notochord markers (Car3, T, Shh, Krt19, and Krt18) as well as novel markers, including Spp1 and Sox10 ( Figure S13C,D, Supporting Information). Low expression of Ctsk was detected in developing NC/NP cells from E11.5 to E13.5 ( Figure S13E, Supporting Information). Interestingly, we found that a previously reported NC/NP marker, Lepr, was preferentially expressed at E11.5 and E12.5 and decreased with IVD development in mice ( Figure S13F,G, Supporting Information), but substantially low expression of LEPR was observed in developing human NC/NP cells ( Figure S13G, Supporting Information), thus indicating the existence of cellular divergence between humans and mice during early IVD formation.
Taken together, our data describe the spatiotemporal distribution of early IVD cells and sclerotomal mesenchymal cells, and they were partially validated by spatial transcriptome analysis during early IVD formation.

Discussion
In this study, we generated a large scRNA-seq dataset, which included a total of 177 725 cells from 13 human fetuses at different gestational weeks to investigate key signaling events underlying human early IVD formation. Samples of the human embryonic axial skeleton ranging from seven to 11 gestational weeks of age were included in this scRNA-seq dataset and further grouped into three developmental stages morphologically. NC cells are thought to disappear starting in human adolescent NPs, [3] although this has been questioned by recent studies. [4b,82] Furthermore, the loss of NC cells from the NP was reported to be closely associated with IVDD onset, [68] and NC cells are known to regulate IVD homeostasis. [83] The majority of recent investigations have focused on identifying new NP stem/progenitor cells in the adult mature IVD, [4b,8a] and some scholars have described the cellular heterogeneity inside the human adult IVD. [8b] Here, we focused on the NC-to-NP transition instead of mature adult IVDs and resolved the spatiotemporal cellular diversity at the singlecell level using transcriptomic profiling in the human embryonic axial skeleton.
We identified two novel markers (SOX10 and CTSK) of developing NC/NP cells in humans and mice, which were then validated via lineage tracing. In addition, SPP1 and KRT15 are novel NC/NP markers in both humans and mice but require further genetic validation. Furthermore, we analyzed two published scRNA-seq datasets of human adult IVDs (GSE165722 and GSE160756; Figure S15A,B, Supporting Information), and checked the expression of hNC signature genes (CTSK, TBXT, SOX10, APOA1, KRT15, PDGFRA, and NPY) ( Figure  S15A-D, Supporting Information). The early hNC markers, www.advancedsciencenews.com www.advancedscience.com such as SOX10, NPY, and APOA1, were hardly expressed in both nondegenerative (GSE160756) and degenerative (GSE165722) NP tissues. TBXT was expressed in the Notochord cluster within nondegenerative NP tissues, while CTSK and PDGFRA were expressed in both nondegenerative and degenerative NP tissues. We then integrated our hNC data with these two datasets (Figure S15E, Supporting Information). The TBXT + , CTSK + , and PDGFRA + subpopulations were distributed closely alongside adult IVD clusters in the integrated UMAP, while the KRT15 + and SOX10 + subpopulations were not. This trend suggests that the TBXT + , CTSK + , and PDGFRA + subpopulations might be still retained in adult stage of IVD development.
To achieve NP-specific genetic experiments, several Creexpressing mouse strains have been generated thus far. [8a,68,84] The Shh-Cre strain was the first reported and most used NC-Cre mouse, [85] but the tamoxifen inducible Shh-CreER T2 strain failed to target the postnatal NP, [31] suggesting that SHH is an early NC marker. In our study, we found that the Sox10-Cre and Sox10-CreER T2 strains showed similar cellular and temporal spectra to those of the Shh-Cre and Shh-CreER T2 strains. In addition, our lineage tracing results confirmed that Ctsk-Cre is suitable for NP-specific genetic experiments, since Ctsk + lineage cells were located in the NP region instead of the AF, EP, or growth plate within the postnatal IVD ( Figure 3H). Lepr-Cre and Uts2r-CreER T2 strains were recently reported to target NP cells in mice, [8a,84] thus providing more NC-Cre strain choices. The Lepr + lineage cells included most NC cells, as shown by Lepr-Cre;Ai9 mice at E12.5, [84] and we also found that Lepr is highly expressed in the mouse T + . However, the expression of LEPR was quite low in human embryonic NC/NP cells. The expression of UTS2R/Uts2r was hardly detected in either our human or mouse embryonic datasets, which might be because Uts2r is a postnatal NP stem/progenitor marker. [8a] In addition, NPY + , PDGFRA + , and APOA1 + clusters were uniquely defined in humans, while Rprm + , Tcf21 + , and Adipoq + clusters were unique in mice. However, canonical NC/NP markers, including TBXT, CA3, SHH, KRT19, KRT18, and KRT8, were detected in both humans and mice. Our results suggest the existence of cellular and molecular diversity in developing NC/NP cells between humans and mice, which provided comparable human and mouse datasets to help with NC/NP-specific genetic experiments.
In addition to being a classical marker of osteoclasts, CTSK was recently proven to be a marker of periosteal stem cells, perichondrial progenitors and tendon progenitors. [86] We defined CTSK + as a matrisome cluster present in both human and mouse developing NC/NP cells and validated its spatiotemporal dynamics by lineage tracing. There has historically been a hypothesis that cartilage-like cells within the NP originate from the surrounding EP or the perichondrium at the periphery of the IVD. [87] Although CTSK was reported to be a marker of periosteal stem cells and perichondrial progenitors in long bones, [86a] our results showed that Ctsk + lineage cells are distributed within the NP instead of the AF, EP, or growth plate, suggesting that these Ctsk + cells to be of notochordal origin. These findings coincide with another theory that, during normal development and aging, all cells within the NP are NC-derived. [88] Tu et al. suggested that THY1 + (CD90 + ) fibroNPCs are the progenitor cells, which could be considered as a potential therapeutic target of degenerative spinal diseases. [8b] They supposed that fibroNPCs may differenti-ate into adhesion NPCs and other NPCs subsets, and homeostatic NPCs were mainly distributed at the end. We further divided the fibroNPCs into two subsets, the CTSK High THY1 High and the CTSK Low THY1 Low clusters ( Figure S15F-H, Supporting Information). Monocle analysis revealed that the CTSK High THY1 High cluster mainly existed at the start of the fibroNPCs trajectory ( Figure  S15I, Supporting Information). Furthermore, the CTSK + subpopulation also highly expressed THY1 (CD90) ( Figure S15J, Supporting Information). Taken together, the late embryonic CTSK + subpopulation might be retained at adult stage and give rise to adult fibroNPCs.
The IVD is constantly subjected to various mechanical forces, and its exposure to excessive mechanical stress is known to cause IVDD. [89] At an early stage, the rising osmotic pressure permits the NC to elongate, [90] while, in mature healthy NPs, hydrostatic and osmotic pressures are altered in response to external loading conditions. [91] Negatively charged proteoglycans and other ECM components provide high osmotic pressure to maintain a hydrated structure within the mature IVD. In this study, we found that mechanosensitive channels, including PIEZO1, PIEZO2, and TRPV4, were preferentially expressed in matrisome NC/NP clusters, such as TBXT + and CTSK + clusters. TRPV4 was the first osmotically sensitive vertebrate ion channel to be discovered, [92] and it has been found to be expressed in a variety of tissues. [93] TRPV4 was activated by hypo-osmotic cell swelling, and its activation promotes chondrogenesis by inducing SOX9 transcription. [94] Recently, the expression of Trpv4 in the NP and IAF at the embryonic stage was validated by the Trpv4 Lacz/WT strain, and Trpv4-dependent Ca 2+ responses regulate the expression of Acan and Prg4 in AF cells. [95] It was also reported that a low magnitude of compression promoted the biosynthesis of mesenchymal stem cells toward NP cells via TRPV4. [96] However, the function of TRPV4 in the NC-to-NP transition has not yet been illustrated. Because TBXT + is enriched in the expression of proteoglycans and preferentially expresses TRPV4, the potential role of TRPV4 in maintaining osmotic pressure in the NC/NP deserves more investigation. PIEZO1 could also be gated by osmotic pressure, [97] and conditional knockout of Piezo1 leads to overhydrated red blood cells with increased osmotic fragility. [98] However, gain-of-function mutations in PIEZO1 are associated with dehydrating red blood cell disease xerocytosis. [99] We previously reported that ECM stiffness accelerated the degeneration of adult NP cells by activating PIEZO1, [89] but understanding the role of PIEZO1, which regulates osmotic pressure in embryonic NC/NP cells, necessitates further experimental evidence. CTSK + is associated with the "response to mechanical stimulus" and preferentially expresses PIEZO2. However, little is known about the function of PIEZO2 in IVD development and degeneration, despite its expression in both NP and AF.
A healthy adult IVD is largely avascular and aneural with sparse innervation and vascularization to the peripheral AF and EP. [100] Neoinnervation and neovascularization are integral steps in the pathogenesis of IVDD, especially the ingrowth of nociceptive nerve fibers, which might be associated with chronic low back pain. [101] Evidence suggests that proteoglycans, such as ACAN, inhibit neural and vascular ingrowth. [102] In addition, SEMA3A may also act as a barrier to neuronal ingrowth within healthy IVDs. [75] However, few additional studies to date have www.advancedsciencenews.com www.advancedscience.com investigated other inhibitory factors on blood vessels in healthy IVDs. Our study revealed that the SOX10 + , TBXT + , and CTSK + clusters may be vital for the maintenance of the avascular midline during IVD formation. Furthermore, we found that rSpp1 enhanced chondrogenesis and inhibited the expression of Vegfa in chondrogenic progenitor ATDC5 cells. These results indicate that SPP1 is an inhibitory factor influencing vascular ingrowth during IVD development. We further found that the expression of Spp1 was drastically decreased in degenerated NPs ( Figure 6I), thus suggesting a potential therapeutic role of SPP1 addition in IVDD.
We found that CD44 is ubiquitously expressed in the human fetal spine, which inspired us to explore the potential roles of SPP1-CD44 in different IVD cells. Human primary nucleus pulposus cells (hNPs), rat annulus fibrosus cells (rAFs), and rat endplate chondrocytes (rEPs) were isolated and cultured. First, the administration of rSpp1 in hNP cells decreased the expressions of hypertrophic markers (CD9 and RUNX2) and vascularization markers (HIF1A and VEGFA) ( Figure S16A, Supporting Information). However, rSpp1 treatment seemed to have no effect on chondrogenesis in hNP cells. CD44 mAb treatment could reverse the reduced expression of CD9 and VEGFA induced by rSpp1. Second, neither rSpp1 nor CD44 mAb treatment had effects on rAF cells ( Figure S16B, Supporting Information). At last, rSpp1 treatment in rEP cells stimulated chondrogenesis, which was reversed by CD44 mAb treatment ( Figure S16C, Supporting Information). rSpp1 treatment also downregulated the expression of vascularization markers (Hif1A and Vegfa), but CD44 mAb did not reverse the tendencies. Taken together, SPP1 may play different roles in different IVD cells not only through interacting with CD44, although CD44 is ubiquitously expressed in the human fetal spine.
The hChon4 (expressing CCN2, GDF5, and PIEZO2) is supposed to give rise to the chondrogenic EP/IAF. The Ccn2-GFP mice showed that Ccn2 was expressed in the NP, the outer layer of IAF adjacent to the OAF, and VB ossification centers at postnatal IVDs ( Figure 4E and Figure S9D, Supporting Information), consistent with another paper reporting that Ccn2-Lacz was expressed in the IAF and NP at E16.5. [103] Thus, Ccn2 could label both EP/IAF and NP in embryonic IVDs. Meanwhile, Christina Munndy and colleagues previously showed that Gdf5 + lineage cells were restricted to the AF region instead of the NP region in the intervertebral joints at P0. [104] Taken together, the hChon4, marked by the expression of both CCN2 and GDF5, is a chondrogenic cluster that might give rise to EP/IAF. Meanwhile, Ctsk was hardly expressed in the hChon4, and we also found that Ctsk-Cre;mT/mG mice labeled only some NP and VB ossification center cells, rather than AF or EP cells ( Figure 3H). At last, the hChon4 expressed markers of human skeletal stem cells, such as THY1 (CD90), PDPN, and CD164 ( Figure 4F), suggesting that it might be a kind of chondrogenic progenitor cell, and more experimental evidence is still needed to prove their "stemness".
In conclusion, we reported scRNA-seq analysis findings of human and murine fetal axial skeletons during early IVD formation combined with NC/NP subcluster lineage tracing. We revealed the heterogeneity and spatiotemporal dynamics of axial skeleton cells and further compared the differences between humans and mice. We reported three novel NC/NP markers (SOX10, CTSK and SPP1) and demonstrated the diverse roles of NC/NP subpopulations during IVD formation, which may be leveraged to develop preventative and regenerative strategies for degenerative spinal diseases.

Experimental Section
Clinical Specimen: This study protocol was approved by the ethics committee of the First Affiliated Hospital of Sun Yat-sen University (Nos. [2015]76 and [2020]320). All of the donors signed written informed consent for the use in research of fetal material arising from the termination of their pregnancy. All the donors were informed about the purpose of the research and there was no compensation offered for donation. All of the embryos (n = 13 for scRNA-seq, n = 20 for in-house bulk RNA-seq and histological verification) were considered structurally normal on ultrasound examination before termination.
Each sample was proceeded and microdissected by a spine surgeon with the assistance of a senior obstetrician under a stereo microscope, following the standard operating procedure. Briefly, the whole axial skeleton and their adjacent tissues (ligaments, ribs and spinal cord) were carefully dissected from the embryo, followed by removal of the ribs (at their costovertebral joints), the spinal cord and the vertebral arches by using microsurgical forceps and scissors. The vertebral column order was recognized by the ribs. At last, the anterior and posterior longitudinal ligaments were gently separated from the spine. The resulting whole fetal axial skeleton was then cut into pieces, followed by trypsin (3 mg mL −1 , 37°C, 10 min) and type I collagenase (5 mg mL −1 in RPMI, 37°C, 60 min) digestion. The filtered (70 μm) cell suspension was then treated with red blood cell lysis buffer to remove red blood cells. The cells were washed twice with wash solution (0.01 × 10 −3 m EDTA and 0.04% BSA in PBS) and resuspended in PBS followed by filtration (40 μm).
scRNA-seq: scRNA-seq was performed by Guangzhou Gene Denovo Honour Biotechnology Co., Ltd., and a 10x Genomics GemCode singlecell instrument was used to generate single-cell Gel Bead-In-EMlusion (GEMs). Single-cell capture libraries were constructed with Chromium Next GEM Single Cell 3' Reagent Kits v3.1 according to the manufacturer's instructions. The remaining biochemical reagents and primers in the post-GEM reaction mixture were removed with silane magnetic beads. R1 (the read 1 primer sequence) was added during GEM incubation, while P5, P7, a sample index, and R2 (the read 2 primer sequence) were added during library construction through end repair, A-tailing, adaptor ligation, and PCR. The final libraries contained the P5 and P7 primers used in Illumina bridge amplification. The Single Cell 3' Protocol produced Illumina-ready sequencing libraries.
Single-Cell RNA Statistical Analysis: Conversion, alignment, and count quantification were performed with the 10x Genomics Cell Ranger software (version 3.1.0). After filtering out reads with low-quality barcodes and unique molecular identifiers (UMIs), reads uniquely mapped to the transcriptome and intersecting an exon at least 50% were considered for UMI counting. Before quantification, the UMI sequences were corrected for sequencing errors, and valid barcodes were identified based on the EmptyDrops method. [105] The cell-by-gene matrices were produced via UMI counting and cell barcode calling. The Seurat package (version 4.1.0) was used for cell normalization and regression based on the expression table according to the UMI counts of each sample and the percentage of mitochondria rate to obtain the scaled data. The Harmony package (https://github.com/immunogenomics/harmony) [106] was applied to correct the batch effect followed by UMAP reduction construction. PCA was constructed based on the scaled data, and the top 2000 highly variable genes were used for UMAP construction. Using the graph-based cluster method (resolution, 0.9), the unsupervised cell cluster result was acquired, and the marker genes were calculated by the FindAllMarkers function under the following criteria: log 2 FC > 0.75, adjusted P-value < 0.01, and pct.1 > 0. 25. To identify the cell type details, clusters of the same cell type were selected for re-UMAP analysis, graph-based clustering, and marker gene analysis.