Blood transcriptomics identifies immune signatures indicative of infectious complications in childhood cancer patients with febrile neutropenia

Abstract Objectives Febrile neutropenia (FN) is a major cause of treatment disruption and unplanned hospitalization in childhood cancer patients. This study investigated the transcriptome of peripheral blood mononuclear cells (PBMCs) in children with cancer and FN to identify potential predictors of serious infection. Methods Whole‐genome transcriptional profiling was conducted on PBMCs collected during episodes of FN in children with cancer at presentation to the hospital (Day 1; n = 73) and within 8–24 h (Day 2; n = 28) after admission. Differentially expressed genes as well as gene pathways that correlated with clinical outcomes were defined for different infectious outcomes. Results Global differences in gene expression associated with specific immune responses in children with FN and documented infection, compared to episodes without documented infection, were identified at admission. These differences resolved over the subsequent 8–24 h. Distinct gene signatures specific for bacteraemia were identified both at admission and on Day 2. Differences in gene signatures between episodes with bacteraemia and episodes with bacterial infection, viral infection and clinically defined infection were also observed. Only subtle differences in gene expression profiles between non‐bloodstream bacterial and viral infections were identified. Conclusion Blood transcriptome immune profiling analysis during FN episodes may inform monitoring and aid in defining adequate treatment for different infectious aetiologies in children with cancer.


INTRODUCTION
Children with cancer are at increased risk of infection, frequently presenting as fever and neutropenia (FN), due to chemotherapy-induced immune suppression. 1 Early (< 24 h) and accurate identification of children at low risk for severe infection during FN is increasingly recognised as important in reducing unnecessary antibiotic exposure and hospital length of stay and improving quality of life. 2 While some progress has been made in identifying novel blood plasma biomarkers, 3 next-generation whole-genome RNA sequencing of the global immune activation landscape as an indicator of infection status during episodes of FN in children with cancer has not been systematically studied. Unique transcriptional signatures in white blood cells indicate pathogenic processes and may be able to distinguish cases with bacterial or viral infection or fever of unknown origin. The blood leukocyte transcriptome during FN therefore reflects aspects of immune status and may have utility in guiding and personalising treatment. 4,5 Only two studies, one in adults and one in paediatric cancer patients, have been conducted to investigate blood gene expression profiles associated with infections during episodes of FN. 6,7 Limitations of these studies include small cohort sizes, cross-sectional and retrospective analyses rather than prospective longitudinal follow-up, and restricted depth of analysis. In paediatric cancer patients, RNAseq analysis of 43 FN episodes identified a panel of two genes that were significantly differentially expressed in bacterial infection compared to controls, but the analysis was not extended to include viral infection or coinfection. 6 The paediatric study also concluded that the blood transcriptome was not suitable for determining the aetiology of FN because the lack of sufficient circulating immune cells impacted the quality of gene expression analysis. 6
The Australian Predicating Infectious ComplicatioNs In Children with Cancer (PICNICC) study was a large multisite, prospective cohort study designed to validate existing paediatric FN clinical decision rules (CDRs) and to identify novel biomarkers and immune profiles that predict severe infection. 8,9 In the same cohort of patients, we have shown that procalcitonin (PCT) and interleukin (IL)-10 may enhance the accuracy of existing CDRs for the prediction of bacterial infection. 10 In this exploratory study, we compared the transcriptional profiles of peripheral blood mononuclear cells (PBMCs) from children with cancer and FN who had documented infection with those who had unexplained fever. We also investigated how the transcriptomes changed over the 24-h period post admission and initiation of therapy.

RESULTS
Patient characteristics and infection outcomes
Supplementary table 1) between the cohort where transcriptional profiling was performed and the total cohort. Of those with diagnosed bacteraemia, three were Gram-positive and six were Gram-negative organisms. In the non-bloodstream microbiologically defined infection (MDI) group, seven were bacterial and 12 were viral infections.
There were no differences in total white blood cell (WBC) count, absolute neutrophil count (ANC) or proportions of cell populations across episodes with bacteraemia, MDI, clinically defined infection (CDI) and unexplained fever (Supplementary table 2).
Global transcriptional downregulation of immune signalling pathways distinguishes infectious and non-infectious causes of fever, and these differences diminish with treatment
After filtering and normalization, 16 559 genes were included in the differential gene expression analyses of any infection versus unexplained fever. On Day 1, 2601 genes were differentially expressed in episodes with any infection (i.e. bacteraemia, other MDI or CDI) compared to episodes with unexplained fever, with 772 genes up-regulated and 1829 genes down-regulated (Figure 1a). On Day 2, 145 genes were differentially expressed in episodes with any infection compared to unexplained fever, with three genes up-regulated and 142 genes down-regulated (Figure 1b).
Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway analysis and Gene Ontology (GO) were performed to identify signalling pathways associated with the genes differentially expressed between episodes with any infection and episodes with unexplained fever. On Day 1, the top 20 KEGG pathways included those implicated in phagosome and lysosome formation, as well as pathways implicated in response to a diverse spectrum of pathogens including bacteria (Mycobacterium tuberculosis, Salmonella spp., Staphylococcus spp., Escherichia coli) and parasites (Leishmania) (Figure 1c). On Day 2, pathways that involved phagocytosis and pathogen response (i.e. M. tuberculosis, Staphylococcus aureus, Phagosome) as well as an array of autoimmune responses (rheumatoid arthritis, Type 1 diabetes, asthma, systemic lupus and autoimmune thyroid disease) were identified (Figure 1d).
The most common gene ontology terms overrepresented in the differentially expressed genes on Day 1 included vesicle-mediated immune responses (Supplementary figure 1a) and on Day 2  Genes that were up- or down-regulated on Day 2 were also differentially expressed on Day 1 between episodes with any infection and episodes with unexplained fever (Supplementary figure 1c). A common set of 129 genes was differentially expressed on both Day 1 and Day 2. Of these, two genes were up-regulated and 127 were down-regulated (Supplementary figure 1d). Among the top differentially expressed genes identified in all comparisons across both days were ATF3 (activating transcription factor 3), TNFRSF21/DR6 (death receptor 6) and SLC4A3 (solute carrier 4A3anion

Blood transcriptional changes on admission differ among bacterial infection, viral infection and unknown causes of FN episodes
We additionally investigated whether bacteraemia, non-bloodstream bacterial infections and viral infections would elicit different transcriptional changes in immune signalling when compared to episodes of unexplained fever. We identified 1206 differentially expressed genes when comparing bacteraemia to unexplained fever episodes (150 up-regulated, 1056 down-regulated) (Figure 3a). A comparison of bacterial non-bloodstream MDI to unexplained fever revealed 582 differentially expressed genes (76 up-regulated, 506 down-regulated), and 132 genes were differentially expressed between episodes of viral MDI and unexplained fever (2 up-regulated, 130 down-regulated) (Figure 3b and c). Of these, 533 genes were uniquely differentially expressed in bacterial non-bloodstream MDI and 83 genes were uniquely differentially expressed in viral MDI, while 49 genes were differentially expressed in both bacterial and viral MDI episodes compared to episodes with unexplained fever (Figure 3d). No significantly differentially expressed genes were found in a direct comparison of episodes with bacterial non-bloodstream infection and episodes with viral infection.
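The overlap counts reported for Figure 3d follow from simple two-set arithmetic, which the short Python sketch below reproduces. The function name `venn_counts` is hypothetical and used only for illustration; the three input numbers are the counts stated in the text.

```python
# Counts reported in the text (each comparison is versus unexplained fever).
BACTERIAL_MDI_TOTAL = 582  # bacterial non-bloodstream MDI
VIRAL_MDI_TOTAL = 132      # viral MDI
SHARED = 49                # genes differentially expressed in both comparisons


def venn_counts(total_a, total_b, shared):
    """Return (unique to A, shared, unique to B) for a two-set Venn diagram."""
    if shared > min(total_a, total_b):
        raise ValueError("shared count cannot exceed either total")
    return total_a - shared, shared, total_b - shared


unique_bacterial, both, unique_viral = venn_counts(
    BACTERIAL_MDI_TOTAL, VIRAL_MDI_TOTAL, SHARED
)
print(unique_bacterial, both, unique_viral)  # 533 49 83
```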
Longitudinal analysis in bacteraemia episodes shows treatment-associated immune and metabolic restoration
On Day 2, 11 genes were differentially expressed between episodes with and without bacteraemia, all of which were down-regulated (Figure 4a and b, Supplementary table 4). The top 10 Hallmark gene sets and top 10 KEGG pathways included apoptosis, bile acid metabolism and PI3K-Akt signalling (Figure 4c and d).
Comparison of episodes with bacteraemia and episodes with either unexplained fever, CDI or MDI on Day 2 identified five genes, including BOK (BCL2 family apoptosis regulator BOK), which were uniquely differentially expressed in bacteraemia episodes across all comparisons (Supplementary figure 3a).
The gene signatures distinguishing bacteraemia from all other causes of FN on Day 1 (24 genes, see Figure 2a and b) and Day 2 (11 genes, see Figure 4a and b) were compared, and no overlapping genes were identified (Supplementary figure 3b). Comparison of gene expression in bacteraemia episodes on Day 1 versus Day 2 identified 108 differentially expressed genes (67 up-regulated, 41 down-regulated) (Figure 5a). The top 20 KEGG pathways included eight that were up-regulated in Day 1 bacteraemia episodes. Amongst those pathways were necroptosis, NK cell-mediated cytotoxicity and JAK-STAT-mediated immune signalling (Figure 5b). In contrast, 12 pathways were up-regulated in Day 2 bacteraemia episodes, including a range of metabolic pathways involving carbon, thiamine, propanoate and pyruvate metabolism (Figure 5b).

DISCUSSION
This exploratory study of the blood transcriptome of PBMCs in children with cancer and FN identified specific profiles that may aid in categorizing causes of fever on presentation and on Day 2 of hospital admission. Unique gene profiles differentiating episodes with and without infection, and more specifically with and without bacteraemia, were identified. These profiles were independent of total WBC count and ANC; rather, the differences observed between infectious and presumed non-infectious causes of FN were due to more subtle changes in immune cell signalling, not absolute cell numbers. 4 We identified more than 2600 genes that were differentially expressed on Day 1 and distinguished infectious causes of FN from unexplained fever. Amongst the top differentially expressed genes was ATF3 (activating transcription factor 3), a transcription factor that modulates the immune response by negatively regulating inflammatory genes, calcium signalling and lysosome formation 12–14 and has been shown to provide protection against bacterial infections. 15 Phagocytosis and lysosome/vesicle formation are key pathways in early innate responses to infection, and genes involved in these pathways were also differentially regulated between infectious and non-infectious causes of FN in our study. This is in keeping with a study in adult cancer patients with FN, which showed that vesicle-mediated transport and cytokines may help distinguish bacterial causes of fever from other causes. 7 Our transcriptomic data showed that unexplained fever can be distinguished from other causes of FN based on immune gene activity, suggesting that the fundamental cause of these fevers may not be undiagnosed or resolving infection. Amongst the different infective causes of FN, transcriptional gene profiling was able to distinguish bacteraemia from other causes, including viral infection. 
This has important clinical implications as the risk of invasive bacterial infection in children with FN drives broad-spectrum antibiotic exposure. We detected the largest number of differentially expressed genes between bacteraemia and unexplained fever episodes (> 1200 genes). Interestingly, bacterial non-bloodstream MDI had more differentially expressed genes (582 genes) than viral MDI (132 genes) when compared to unexplained fever, potentially indicating less profound systemic immune responses and immune gene signatures in viral infections. While respiratory viruses such as rhinovirus, influenza virus and respiratory syncytial virus are commonly detected during episodes of FN, it remains unclear whether these viruses are always the primary cause of fever. 16 Our transcriptional analysis was able to distinguish viral and bacterial causes (i.e. bacteraemia and bacterial non-bloodstream infections). A direct comparison of viral and bacterial non-bloodstream infections did not identify differentially expressed genes. However, pathway analysis revealed that, while broadly similar pathways were dysregulated when comparing bacterial or viral FN episodes to unexplained fever, the changes were more pronounced in the former comparison and more subtle in the latter. The present 'gold standard' diagnostic test for bacteraemia is blood culture, which can take up to 48 h to identify a causative pathogen. 17 Given that we identified a unique gene signature in bacteraemic patients compared to those with unexplained fever, and unique signatures associated with MDI compared to unexplained fever episodes (Figure 3), we further examined whether particular patterns of differentially expressed genes at the time of hospital admission could distinguish children who have high-risk infections. 
The signature of 24 genes that were differentially expressed in FN episodes with bacteraemia compared to all episodes without bacteraemia included genes shown to have roles in calcium signalling (CABP4) and phospholipid metabolism (ULK2). Calcium and phospholipids are required for initiation of coagulation and platelet activation. 18 This is of particular interest as calcitonin, the active form of procalcitonin (PCT) and a promising blood plasma biomarker for risk stratification in FN, is important in regulating calcium homeostasis in steady state and during severe infection. 19,20 Of the 24 genes identified in episodes with bacteraemia, two genes (SNX24 and RDH10) were uniquely differentially expressed (both down-regulated) when bacteraemia was compared to each individual cause of FN. Retinol dehydrogenase (RDH10) is a key enzyme in retinoic acid (RA) synthesis. 21 Retinoic acid has been shown to decrease inflammatory processes 22 and improve immunocompetence in sepsis 23,24 and is known to regulate bile acid homeostasis, which has been shown to predict outcome in critically ill patients. 25 Taken together, this supports the relevance of retinoic acid and its key enzyme retinol dehydrogenase as potential biomarkers to discriminate causes of FN at the time of hospital admission.
Our prospective longitudinal study design allowed us to compare samples collected at the time of admission to those collected on Day 2, after patients had commenced empiric FN antibiotics. This analysis revealed that the main signalling pathways that were differentially regulated on Day 2 were involved in aspects of cellular signalling, regulation and communication as well as response to organic substances. Longitudinal analysis of transcriptional profiles in bacteraemia episodes from Day 1 versus Day 2 identified that, while genes for apoptosis, bile acid metabolism and coagulation were similarly down-regulated on Day 2, pathways reported to be responsible for immune recovery and metabolic restoration, such as thiamine metabolism and HIF-1α signalling, 26,27 were over-represented on Day 2.
Overall, this suggests that treatment may alter the transcriptional profiles over time and the changes observed may indicate treatment response.
BOK was identified as one of the differentially expressed genes that distinguished bacteraemia from all other causes of FN in individual head-to-head comparisons on Day 2 and could thus be an interesting biomarker gene. BOK's role in programmed cell death and apoptosis is still not clearly defined, 28 but a recent report identified a role for this gene in the control of uridine metabolism. 29 Interestingly, BOK has been proposed to play an essential role in regulating mitochondrial calcium levels. 30 We therefore speculate that BOK's overlapping functions in key pathways impacted by bacteraemia (cell death, metabolism and calcium signalling) might cause its differential expression during FN episodes with underlying bacteraemia.
Although our cohort had only nine bacteraemia episodes, our study is the largest transcriptomic analysis of children with cancer and FN. We obtained sufficient RNA from > 90% of FN episodes and there was no trend towards insufficient RNA in any of the groups, thus overcoming a limitation of a previously reported study. 6 Clinical data informing this study were also collected prospectively as part of a larger cohort study, with international definitions of bacteraemia, MDI and CDI used. While microbiological and molecular testing was done at the discretion of the treating clinician and therefore may have missed some infection diagnoses, all patients did have pre-antibiotic blood cultures taken.
As this was an exploratory study, analyses were performed on stored PBMCs to facilitate coordinated RNA extraction and processing. An independent replication cohort would be useful in validating our data and in defining a minimum set of differentially expressed genes that can discriminate bacteraemia from all other causes of FN. Future prospective studies could then be used to ascertain the positive and negative predictive values of these key transcriptional changes in diagnosing bacteraemia in FN patients. The clinical utility of gene expression profiling is best exemplified in the diagnosis, characterization and prognostic evaluation of many cancers. 31 Further work is needed to determine whether the immune profiles identified using RNAseq are reflected in the plasma proteome or metabolome. This could aid in identifying novel blood biomarkers that are more readily translatable into diagnostic point-of-care tests.

CONCLUSION
Collectively, our data showed that transcriptomic analyses performed on PBMCs collected from children with cancer who develop FN may have utility in predicting the cause of fever. Blood collected at presentation showed transcriptional signatures that allowed differentiation of bacteraemic causes of fever from other causes of fever. Interestingly, this signature is dampened after the institution of appropriate antibiotics and is replaced with different signatures indicating a degree of immune recovery. We postulate that transcriptomic analysis can potentially identify the cause of fever in neutropenic children with cancer and sequential analyses may provide evidence that the correct treatment has been initiated. It will be important to follow up our findings with further studies in replication cohorts and to understand their accuracy in predicting the cause of FN in children with cancer.

METHODS
Patient recruitment and blood sample collection
The present blood immune transcriptome study was embedded in the Australian Predicating Infectious ComplicatioNs In Children with Cancer (PICNICC) study (Australian New Zealand Clinical Trials Registry 12616001440415). 8 Recruitment for the transcriptomic analysis was conducted at two of the eight study sites: 8 Royal Children's Hospital (RCH), Melbourne, and Queensland Children's Hospital (QCH), Brisbane. The analysis of plasma abundance of 33 cytokines, CRP and PCT in the same cohort of patients is reported elsewhere. 10 Children with solid-organ cancer or leukaemia on active treatment who presented to the emergency department (ED) with FN were included. Fever was defined as a single temperature ≥ 38°C, and neutropenia was defined as an absolute neutrophil count (ANC) < 1000 mm⁻³. Children who had a haematopoietic stem cell transplant (HSCT) in the three months prior to recruitment and those already receiving antibiotics were excluded. Demographic and clinical data including outcomes were prospectively collected from electronic and paper-based records and entered into REDCap. 32 Blood from eligible patients was collected at two time points: FN onset (within 0–4 h of ED presentation and prior to the first antibiotic dose) and Day 2 (within 8–24 h of ED presentation). Blood was collected in EDTA tubes and processed by the cancer centre's tissue banks at RCH and QCH. Within 2 h of sample collection, plasma was separated from erythrocytes and PBMCs using Ficoll density gradient centrifugation, and PBMCs were stored in RNAlater at −80°C until thawed for RNA extraction and downstream processing.
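The inclusion and exclusion criteria above can be expressed as a simple predicate. The Python sketch below is illustrative only: the class `Presentation`, its field names and the function `is_eligible` are hypothetical, and treating "in the three months prior" as strictly less than three months is an assumption, since the text does not specify the boundary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Presentation:
    """One ED presentation (hypothetical record layout, not the study's)."""
    temperature_c: float                 # single measured temperature
    anc_per_mm3: float                   # absolute neutrophil count
    on_active_treatment: bool            # solid-organ cancer or leukaemia on treatment
    on_antibiotics: bool                 # already receiving antibiotics
    months_since_hsct: Optional[float] = None  # None if no transplant


def is_eligible(p: Presentation) -> bool:
    """Apply the FN definition and exclusions as described in the text."""
    has_fever = p.temperature_c >= 38.0
    is_neutropenic = p.anc_per_mm3 < 1000
    # Assumption: "in the three months prior" means strictly under 3 months.
    recent_hsct = p.months_since_hsct is not None and p.months_since_hsct < 3
    return (
        p.on_active_treatment
        and has_fever
        and is_neutropenic
        and not p.on_antibiotics
        and not recent_hsct
    )


example = Presentation(temperature_c=38.4, anc_per_mm3=300,
                       on_active_treatment=True, on_antibiotics=False)
print(is_eligible(example))  # True
```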
The causes of fever were prospectively classified as microbiologically documented infection (MDI), clinically documented infection (CDI) or unexplained fever according to international consensus definitions 11 (see Supplementary table 1). Bacteraemia, with a known pathogen or a common commensal cultured on two or more occasions, was considered a subset of MDI. 11 Microbiological and other investigations were performed according to site FN guidelines. Across both sites, this included at least one blood culture set (all patients); urine and nasal swabs for PCR or culture where applicable; chest X-ray; stool culture with Clostridioides difficile toxin assay and viral PCR; and skin or wound swab for culture and viral PCR where applicable.
Patients were managed according to hospital FN guidelines which included early administration of an antipseudomonal beta-lactam or cephalosporin after blood cultures were taken.

Transcriptional profiling
The PBMCs were stored in RNAlater™ Stabilization Solution (Cat# AM7020; Thermo Fisher Scientific, Waltham, MA, USA) and RNA extraction was performed using the Isolate II RNA mini kit according to the manufacturer's instructions (Cat# BIO-52072; Meridian Bioscience, Cincinnati, OH, USA).
An input of 10–100 ng of total RNA was prepared and indexed for Illumina sequencing using the TruSeq RNA Sample Prep Kit (Cat# RS-122-2001; Illumina, San Diego, CA, USA) with RiboGlobin depletion as per the manufacturer's instructions. Each library was quantified using the Agilent TapeStation (RNA ScreenTape [Cat# 5067-5576] on a 2200 TapeStation system [Cat# G2964AA; Agilent Technologies, Waldbrunn, Germany]) and the Qubit™ DNA BR assay kit for the Qubit 3.0® Fluorometer (Cat# Q32850; Thermo Fisher Scientific). The indexed libraries were pooled for single-end sequencing (1 × 75 cycles)  All reads were aligned to the human genome, build hg38, using align from the Rsubread software package v2.0.1. 33 Over 94% of reads were successfully mapped for each sample. The number of reads overlapping genes was summarized into counts using featureCounts 34 from Rsubread. An average of 71% of reads were assigned to genes for each sample. Genes were identified using NCBI RefSeq annotation. Differential expression (DE) analyses were then undertaken using the edgeR 35  Prior to analysis, all genes with no current symbol, ribosomal RNAs, non-protein-coding immunoglobulin genes and haemoglobin genes were removed. Gender-specific genes including XIST and those unique to the Y chromosome were also removed to avoid gender biases. Expression-based filtering for lowly expressed genes was then performed using edgeR's filterByExpr function with default parameters. Library sizes were then normalized using the trimmed mean of M-values (TMM) method. 37 Following filtering and normalization, the data were transformed to log2-counts per million (CPM) with associated precision weights using voom, 38 and the correlation between samples from the same patient was estimated using limma's duplicateCorrelation 39 function. Sample weights were also calculated using limma's voomWithQualityWeights 40 function. Differential expression was then assessed using linear models and robust empirical Bayes moderated t-statistics. To increase precision, the linear models included not only the patient correlation estimate and sample weights, but also a batch effect correction for cancer type. The false discovery rate (FDR) was controlled below 5% using the Benjamini and Hochberg method. 41
Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway analysis, Hallmark gene set analysis and Gene Ontology (GO) analysis were performed to identify signalling pathways associated with the genes differentially expressed between episodes with and without infection. The types of pathway analyses were chosen as indicated for each respective comparison based on how well they represented overall signatures. Analyses of the GO 42,43 terms and KEGG 44 pathways were performed using limma's goana and kegga functions, respectively. 
The analysis of the Hallmark gene sets from the Molecular Signatures Database 45 was achieved using limma's fry function.
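To illustrate how the Benjamini and Hochberg method controls the FDR, the following minimal Python sketch computes adjusted P-values from a list of raw P-values. It is a generic stand-alone implementation of the published step-up procedure, not the code used in this study (the analysis itself was run in R), and the example P-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented example P-values (not study data).
raw = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
adj = benjamini_hochberg(raw)
significant = [p for p in adj if p < 0.05]
print(len(significant))  # 2
```

In this toy example only the two smallest P-values remain below the 5% FDR threshold after adjustment, even though five raw P-values are below 0.05.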
The mean-difference (MD) plots were generated using limma's plotMD function and the heatmaps using the pheatmap CRAN software package v1.0.12. The removeBatchEffect function in limma was used to adjust for the effect of cancer type in multi-dimensional scaling (MDS) plots and heatmaps. Deconvolution of bulk-RNAseq data to identify immune cell subsets was performed using dtangle. 46
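As a rough illustration of the quantities behind the log2-CPM transformation and a mean-difference plot, the Python sketch below computes log2-CPM values for one library and the (average expression, log2 fold change) coordinates for two groups. Several simplifications are assumptions made for this sketch: the offsets of 0.5 on counts and 1 on library size follow the published description of voom, TMM normalization factors and precision weights are omitted, and the x-coordinate is taken as the average of the two group means. All function names and numbers are hypothetical.

```python
import math


def log2_cpm(counts, lib_size):
    """log2 counts per million for one sample (voom-style offsets, assumed)."""
    return [math.log2((c + 0.5) / (lib_size + 1.0) * 1e6) for c in counts]


def md_coordinates(group_a, group_b):
    """Return one (average log-expression, log2 fold change) pair per gene.

    group_a, group_b: lists of per-sample log2-CPM vectors, same gene order.
    """
    n_genes = len(group_a[0])
    coords = []
    for g in range(n_genes):
        mean_a = sum(sample[g] for sample in group_a) / len(group_a)
        mean_b = sum(sample[g] for sample in group_b) / len(group_b)
        coords.append(((mean_a + mean_b) / 2, mean_a - mean_b))
    return coords


# Toy log2-CPM values for two genes (invented, not study data).
infection = [[1.0, 4.0]]
unexplained = [[3.0, 4.0]]
print(md_coordinates(infection, unexplained))  # [(2.0, -2.0), (4.0, 0.0)]
```

A gene plotted below zero on the y-axis, like the first toy gene, would be down-regulated in the first group relative to the second.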

Statistical analyses
Ordinary one-way ANOVA was used to compare immune cell subsets derived from dtangle analysis. Statistical significance was considered when P < 0.05.
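For readers unfamiliar with the test, the sketch below shows how the F statistic of an ordinary one-way ANOVA is computed from group data. It is a generic Python illustration with invented numbers, not the software used in this study, and it stops at the F statistic and degrees of freedom; obtaining the P-value additionally requires the F distribution.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(x for g in groups for x in g) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented cell-proportion-like values for three groups (not study data).
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(round(f_stat, 6), df_b, df_w)  # 13.0 2 6
```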