Cell‐specific network analysis of human folliculogenesis reveals network rewiring in antral stage oocytes

Abstract Although previous studies have explored the gene expression profiles of human oocytes and granulosa cells by single‐cell RNA sequencing (scRNA‐seq), the dynamic regulatory network at a single‐cell resolution during folliculogenesis remains largely unknown. We identified 10 functional modules by WGCNA, four of which were significantly correlated with primary/antral oocyte and antral/pre‐ovulatory granulosa cells. Functional enrichment analysis showed that the brown module, which was correlated with antral oocyte, was enriched in oocyte differentiation, and two core subnetworks identified by MCODE were involved in cell cycle (blue subnetwork) and oogenesis (red subnetwork). The cell‐specific network (CSN) analysis demonstrated a distinct gene network structure associated with the antral follicular stage, which was notably different from other developmental stages. To our knowledge, this is the first study to explore gene functions during folliculogenesis at single‐cell network level. We uncovered two potential gene subnetworks, which may play an important role in oocyte function beginning at the antral stage, and further established their rewiring process at intra‐network/whole transcriptome level. The findings provide crucial insights from a novel network perspective to be further explored in functional mechanistic studies.


| INTRODUC TI ON
Folliculogenesis refers to the biological process by which an ovarian follicle matures through the primordial, primary, secondary, antral and pre-ovulatory developmental stages. 1 The follicle is comprised of granulosa cells (GCs), which surround and support the immature oocyte. Follicle growth is rapidly increased during the antral stage, when it becomes more responsive to female hormones (eg follicle-stimulating hormone), and the end of the development process is marked by separation of the oocyte from the GCs in the pre-ovulatory stage. Although the large majority of follicles never complete the full development process, a small number produce mature oocytes, which are released during ovulation.
The specific gene expression pattern in different follicle maturation stages and their dynamic changes are noted to be a key factor for folliculogenesis regulation. 2,3 Previous studies have used single-cell RNA sequencing (scRNA-seq) to explore the transcriptomes of the human oocytes and GCs at five follicular stages in vivo and identified candidate secretory biomarkers of ovarian reserve in primordial and primary follicles at gene expression level. 2 However, the gene interactions and dynamic network changes throughout the folliculogenesis developmental stages remain largely unexplored.
Recently, a cell-specific network (CSN) analysis approach has been developed to explore the comprehensive gene relationships for scRNA-seq data from a network viewpoint. 4 In contrast to traditional network construction, which examines the relationships between genes at the grouped cell level, the CSN method constructs a separate network for each individual cell, thereby accounting for the cell type heterogeneity. This analysis has been efficiently applied to infer key time-points of gene functions by revealing the network topology changes during human embryo development. 4 Here, we constructed cell-specific networks for ovarian follicle development and revealed a potential network rewiring process in antral stage oocytes. Our findings provide a framework of gene relationships during folliculogenesis at the single-cell level and also lay the foundations to explore characteristic gene functions from a novel network perspective.

| Data sources and sample information
The data set used in the present study was obtained from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih. gov/gds) with Accession Number GSE10 7746. This data set included the scRNA-seq gene expression profiles with cell type information of 151 human follicle cells (80 oocytes and 71 GCs) at five different developmental stages. 2 Cells with fewer than 2400 genes or 500 000 mapped reads were filtered out. In total, 80 oocytes and 71 GCs at five developmental stages passed the filter standards. To ensure the accuracy of estimated gene expression levels, only genes with FPKM > 1 in at least one cell were analysed. An offset value of 1.0 was added for each gene, and the expression levels were log2transformed for further analysis. Cell type information is listed in Table S1.

| Construction of co-expression modules
The top 5,000 highly variable genes were selected using the FindVariableFeatures function in Seurat (v3.1.2), and weighted gene co-expression network analysis (WGCNA) was used to identify functional modules in the co-expression network. 5 WGCNA uses soft thresholding to select an appropriate power for constructing the adjacency matrix from the gene correlation matrix.
The power amplifies the differences in gene co-expression and is selected such that the co-expression network will have a scalefree topology (β = 14, Figure S2B). The adjacency matrix is used to construct a topological overlap matrix, which is then used as input for hierarchical clustering to identify gene functional modules. We set the minimum module size to be 30 genes and each module was represented by its eigengene, defined as the first principal component of a given module. The average gene significance (GS) was calculated to identify correlation between module eigengenes and a certain cell type, and module membership (MM) was defined as the Pearson correlation between each gene and the module eigengene.

| Protein-protein network (PPI) construction and subnetwork identification
The important WGCNA module genes with GS > 0.2 and MM > 0.8 were used as input to construct the PPI network to assess their inter-relationships at protein level. The PPI network was constructed by the Search Tool for the Retrieval of Interacting Genes (STRING) database and Cytoscape software. Molecular complex detection (MCODE) was used to identify the most densely connected subnetworks in the PPI network with the following parameters: degree cut-off = 2, node score cut-off = 0.2, K-core value = 2 and MCODE score > 5. The Pearson correlation coefficient between genes in each subnetwork was calculated by using the R package corrplot.

| Functional enrichment analysis
Functional enrichment analysis was conducted on four WGCNA gene modules and two PPI subnetworks. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis were conducted using the R package ClusterProfiler. Only those terms with adjusted P value < 0.05 were considered as significant. The top 20 records ranked by adjusted P value were extracted when more records were returned.

| Cell communication inference
We used an open-source R package iTALK (https://github.com/ Coolg enome/ iTALK) to identify and illustrate intercellular signalling communications. This approach was designed to profile the ligand-receptor (L-R)-mediated cross-talk signals from scRNAseq data based on a built-in database, which includes 2648 L-R pairs classified into four categories: cytokines/chemokines, immune, checkpoint genes, growth factors and others. We ranked the top 100 L-R pairs based on their expression levels, and the top 20 highly abundant L-R gene pairs were selected for further visualization.

| Immunohistochemistry
We performed immunohistochemistry staining of the ovarian tissue samples of SD rat (8 week) as previously described. 2 Briefly, the paraffin-embedded ovarian tissue sections were deparaffinated and rehydrated in xylene and decreasing graded ethanol.
Subsequently, the sections were demasked in citrate buffer (pH 6.0) for 20 minutes at 98℃ and cooled down to room temperature and incubated in 0.3% H 2 O 2 for 10 minutes. The sections were treated overnight at 4°C with primary antibodies, followed by consecutive incubations with biotinylated secondary antibody and streptavidin-peroxidase conjugate. Colour development was achieved using DAB.

| Cell-specific network construction and network degree matrix transfer
We used MATLAB software to construct a CSN of gene associations for each individual cell in the scRNA-seq data. 4 In the CSN of cell k, there are m gene nodes and the edge corresponding to the pairwise association between genes x and y is estimated by ̂ After hypothesis testing to determine the significance of the edge, the edge weight between genes edge xy (k) was set to either 0 or 1. The network degree matrix (NDM) was then constructed to represent the network features in a lower dimension. For gene x in the network of cell k (Equation 2):

| Construction of gene co-expression network
Based on hierarchical clustering in WGCNA, the oocytes and GCs were correctly divided into two clusters ( Figure S2A). When the power value for constructing the adjacency matrix was set to 12, the independence degree was estimated to be 0.9 and the mean connectivity was close to zero, suggesting a scale-free topology of the co-expression network ( Figure S2B). WGCNA identified ten distinct functional modules ranging in size from 48 genes (purple module) to 935 genes (turquoise module) ( Figure 1A). The eigengene adjacency heat map suggested that these ten modules could be classified into two distinct clusters ( Figure 1B).

| Quantifying gene co-expression module-cell type associations
Modules in the first cluster (including yellow, brown and turquoise modules) were identified to have the highest correlation coefficients between their eigengenes with oocytes; eigengenes of the modules in the other cluster (including pink, purple, magenta, green, black, blue, red modules) had the highest modulecell type correlation coefficients in GCs ( Figure 1B-C). The yellow and brown modules were more specifically associated with primary and antral stage oocytes, respectively. The black and green modules were more specifically associated with antral and preovulatory stage GCs, respectively ( Figure 1D-G). These four modules (yellow, brown, black and green) were considered for further downstream analysis.

| Functional enrichment analysis
xy brown module were predominantly up-regulated in this cell type.
The top 30 high-degree genes in the brown ( Figure 2C), yellow, black and green modules ( Figure S3D-F) were denoted in their respective networks.

| Pivotal subnetwork identification of genes associated with antral stage oocyte
Genes with both high gene significance (GS > 0.2) and high mod-

| CSN rewiring in antral stage oocyte
We performed the network rewiring analysis on the two identified subnetworks. We observed that the subnetwork genes were weakly interconnected in primordial, primary and secondary stages, and that the connections between these genes become much stronger in were strongest between CDC25C, NPM2 and ZP1, although the connections between all the genes involved in these two subnetworks were greatly strengthened during this stage. These findings based on the CSN algorithm suggest a drastic network rewiring process during folliculogenesis ( Figure 6).

| Dynamic changes in network degree along folliculogenesis
To explore the gene connections on the whole transcriptome level, we calculated network degrees of each gene in the two subnetworks.
Additionally, most other genes in the two subnetworks also demonstrated the highest degree in the antral oocytes with the exception of AURKA, KIF4A and MELK ( Figure S4). The largest differences in NDM trends across different subnetwork genes were observed in the pre-ovulatory stage at the conclusion of folliculogenesis. treatments. 8,9 In this study, ten co-expression gene modules were constructed using the most highly variable genes in 151 oocytes and GCs. 2 Four modules were detected as significantly correlated with a specific functions. [10][11][12] Various NLRPs were shown to have stage-dependent expression level during human pre-implantation development, 13 and they are duplicated and functionally diversified in mammalian reproductive systems. 14-16 Tubulin beta-eight class VIII (TUBB8) is a subtype of β-tubulin that only exists in primates, and mutations in TUBB8 are responsible for human oocyte meiotic arrest. 17 As most of these high-degree genes have previously been reported to be involved in reproductive function, 18-23 we selected the brown module genes for further downstream analysis.

Reciprocal communications between human oocytes and GCs
played key roles in folliculogenesis. [24][25][26] We predicted interactions of GCs-oocytes, oocytes-GCs, oocytes-oocytes and GCs-GCs in antral stage follicles and showed the top five L-R pairs involving brown module genes. Growth differentiation factor 9 (GDF9) has been shown to act through miR-375 to affect BMPR2 expression and Smad signalling pathway activation, 27 which ultimately affected the proliferation, spread and apoptosis of bovine cumulus cells. 28 Exploring the effect of the GDF9-miR-375-BMPR2 molecular pathway on human antral stage follicles may be a valuable future research direction.
Another interesting identified gene pair was HDC (ligand) and HRH2 (receptor), which were specifically highly expressed in antral follicles. Transplanted imaginal discs homozygous for an hdc mutation were found to affect oogenesis in the recipient females, 29 although the specific function of HDC and HRH2 remains largely undefined.
Our findings provide novel insight into this gene pair interaction, which may be related to autocrine signalling in oocytes. tion. 33 The drastic changes in TACC3 in the CSN level may imply that the oocytes may assist in the preparation for the subsequent meiotic division process. Further research should provide insight into the detailed mechanisms of these drastic changes.
The replication helicase is comprised of CMG (CDC45, MCM2-7 and GINS) in eukaryotic cells. 34 Treslin (encoded by TICRR), which F I G U R E 7 NDM of two subnetwork genes. Differential network degrees of two subnetwork genes in five follicle stages. Point size represents the total connection scores among the subnetworks that the gene belonged to orchestrates assembly of the CMG in human cells, is essential for incorporation of CDC45 into the replicative helicase and helps to trigger the initiation of DNA replication. 35 GTSE1 is an upstream regulator of microtubule stability, chromosome alignment and spindle pole integrity mediated by KIF4A. 36,37 The cells depleted of GTSE1 display hyper-stabilized spindle microtubules, which in turn diminish accumulation of the chromokines in KIF4A, 38  We note that the general network degrees of these genes were not strictly related to the total connection scores among the subnetworks they belong to. The dynamic tendency of these genes varies, especially in the pre-ovulatory stage. Although these results may stem from the specific connection state, another factor that must be considered is the insufficient pre-ovulatory oocyte numbers (only three cells). More cells are needed to get the unbiased expression information, which directly influenced the CSN connections and network degree values. Furthermore, validation of certain molecular functions should be performed to unveil the complex roles of the genes in the CSN during follicular development.
In summary, our analysis identified two pivotal gene subnetworks, which may play an important role in antral stage oocytes.
Genes in these subnetworks demonstrated a drastic intra-network/ whole transcriptome rewiring process that began in the antral stage.
This is the first study to identify the gene associations/network for oogenesis at single-cell resolution, and it establishes the foundations to reveal advanced mechanisms in antral stage folliculogenesis in a new perspective based on the cell-specific network.

CO N FLI C T O F I NTE R E S T
The authors have declared no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The scRNA-seq data used in this study were available from the GEO data sets with Accession Number GSE10 7746.