Proteomic insights into paediatric cancer: Unravelling molecular signatures and therapeutic opportunities

Survival rates in some paediatric cancers have improved greatly over recent decades, in part due to the identification of diagnostic, prognostic and predictive molecular signatures, and the development of risk‐directed therapies. However, other paediatric cancers have proved difficult to treat, and there is an urgent need to identify novel biomarkers that reveal therapeutic opportunities. The proteome is the total set of expressed proteins present in a cell or tissue at a point in time, and is vastly more dynamic than the genome. Proteomics holds significant promise for cancer research, as proteins are ultimately responsible for cellular phenotype and are the target of most anticancer drugs. Here, we review the discoveries, opportunities and challenges of proteomic analyses in paediatric cancer, with a focus on mass spectrometry (MS)‐based approaches. Accelerating incorporation of proteomics into paediatric precision medicine has the potential to improve survival and quality of life for children with cancer.

are cured following contemporary therapies. 4 Survival remains very poor for children with high-grade glioma, diffuse midline glioma and metastatic sarcoma. 5,6 Improvements in survival over past decades can be attributed primarily to advances in treatment, including intensity of chemotherapy and supportive care, and enrolment in clinical trials for most patients. 9][10] Adoption of risk-directed treatment approaches for paediatric cancer is increasing. The aims of risk-directed therapy are to decrease intensity of therapy for patients with a favourable risk profile (which consequently decreases long-term side effects) and, for patients with an unfavourable risk profile, to intensify therapy or use molecularly targeted agents for treatment. [13]

UNLOCKING THE POTENTIAL OF PROTEIN ANALYSIS FOR PAEDIATRIC CANCER
Precision medicine in paediatric cancer has benefited from genomics and transcriptomics, through identification of targetable molecular alterations for existing anticancer drugs, identification of risk stratification biomarkers, and selection of patients for intensification or reduction of therapy. 14,15 Despite these successes, some children do not respond to targeted treatments or develop resistance, and for others no actionable target can be found. 14,16,17 This reveals a need for better biomarkers to improve survival and quality of life. 18,19 Large-scale proteomic studies provide potential avenues for revealing such biomarkers and novel therapeutic targets 20,21 (Figure 1).
[22] Proteomics can reveal insights not obtained from other 'omics. For example, there is an imperfect correlation between the transcriptome and proteome. Genomic and transcriptomic data are not reliably informative of protein stability, post-translational modifications (PTMs) or protein regulatory networks, 23,24 making some discoveries only possible at the proteome level. 25
Childhood cancers typically harbour relatively few genomic changes. 26,27 The incidence of somatic mutations (see Supporting Information-Glossary of Terms) in childhood cancers is up to 14 times lower than in adult cancers. 28 Therefore, proteomic research could be especially promising in paediatric cancer. Moreover, different genomic changes may manifest similarly in the proteome, enabling cancer subgroup identification from protein profiles rather than DNA alterations. 29 Traditionally, cancer diagnosis and treatment decisions that use protein data in the clinic rely on only a small number of tests, typically by immunohistochemistry (IHC). 30 Although IHC is highly valuable, it has critical limitations such as a dependence on specific antibodies, semi-quantitative measurements, and the ability to assess only small numbers of proteins per tumour sample. 30,31 In contrast, mass spectrometry (MS)-based proteomics is antibody-independent, with thousands of proteins measurable in a single sample. However, there is the disadvantage that spatial information is lost in whole tissue proteomics.
After proteins are extracted from a biological sample and digested into peptides for MS, the peptides can be separated by liquid chromatography (LC). The mass spectrometer is commonly coupled directly to the LC eluent for analysis. Other types of MS such as matrix-assisted laser desorption/ionization (MALDI) and surface-enhanced laser desorption/ionization (SELDI) can be used to analyse peptides without LC separation. Digested peptides are then ionized, and their mass-to-charge ratios are determined in the mass spectrometer. MS instruments with two or more mass analyzers can perform tandem MS. In the first stage of tandem MS (MS1), intact peptides (called precursor ions) are ionized and their mass-to-charge ratios measured. In the second stage (MS2), each peptide (precursor ion) is fragmented to generate fragments that can be deciphered to reveal the amino acid sequence of the corresponding intact parent. The two sets of information (called a transition) are combined for identification of that peptide. The presence of proteins is inferred from identified peptides, and peptide quantification enables estimates of protein abundance. There are several types of MS approaches, based on the data acquisition mode, instrument configuration and quantitation method (label-free or labelled). The choice of MS approach and instrument depends on the research goals, sample complexity and available resources. See Table 1 and refs [32][33][34] for an overview of these concepts and discussion of experimental considerations of biospecimen type, 35 protein identification and quantitation, [36][37][38][39][40][41] measurement of PTMs with open search algorithms [42][43][44][45][46][47][48] and kinase activity via Kinobead technology 49 and kinase enrichment analysis. 50
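The digestion and mass-to-charge steps described above can be illustrated with a minimal sketch. It assumes trypsin as the protease (the text does not name one), no missed cleavages and no modifications; the example sequence and the function names `tryptic_digest` and `mz` are hypothetical and chosen only for illustration. Residue masses are standard monoisotopic values.

```python
import re

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # mass of each charge-carrying proton


def tryptic_digest(protein, min_length=6):
    """Cleave after K or R unless the next residue is P (assumed trypsin rule).

    Uses a zero-width split, which requires Python 3.7 or later.
    Peptides shorter than min_length are discarded.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if len(p) >= min_length]


def mz(peptide, charge):
    """Mass-to-charge ratio of an unmodified peptide at the given positive charge."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    toy_protein = "GASPVKTLNDRPQEMHFRWYK"  # hypothetical sequence
    for peptide in tryptic_digest(toy_protein):
        print(peptide, round(mz(peptide, 2), 4))
```

In practice, search engines also consider missed cleavages, modifications and fragment-ion (MS2) evidence; this sketch covers only the precursor (MS1) arithmetic.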

EXISTING MOLECULAR SIGNATURES AND BIOMARKERS FOR PAEDIATRIC CANCER
Large-scale whole-genome sequencing studies of paediatric cancer 28,51,52 have revealed common features of the molecular landscape. For example, there is heterogeneity in the genetic alterations that underlie paediatric cancer, 27 with most tumours having a low somatic mutation burden (below 1 mutation per Mb). 28,51 This low mutation rate is likely due to dysregulation of developmental pathways, the embryonal origin of many childhood cancers, and only a small impact from environmental carcinogens. 27 Some paediatric tumours with very high mutation rates (over 10 mutations per Mb) are related to deficient mismatch repair. 28 Many paediatric cancer driver genes (i.e., genes where mutations confer a selective growth advantage) are exclusively mutated within certain cancer types. 28,51 Approximately half of these genes are not common drivers of adult cancers, suggesting unique pathways to oncogenesis. 51 Cancer driver genes are also less frequently identified in childhood cancers, with approximately half of paediatric tumours harbouring at least one significantly mutated driver gene, compared with over 90% of adult cancers. 28
Paediatric cancers are frequently driven by copy number alterations, structural variants or gene fusions. 51 The frequency of structural variant occurrence differs greatly among paediatric cancer types, often correlating with germline or somatic TP53 mutation. 28 Certain genomic loci are commonly altered across cancer types, such as amplification of MYC and MYCN oncogenes. Gene fusions are common, with the EWS-FLI1 fusion protein in Ewing sarcoma and fusions involving the NTRK gene frequently identified. 27
FIGURE 1 The central dogma of molecular biology is that genetic information mostly flows from DNA to RNA to proteins. Proteoforms highlight the inherent complexity of the proteome, because they represent the multitude of ways in which a single gene can give rise to a family of functionally diverse protein products. Proteomic analysis in childhood cancer is highly valuable for several reasons, including those outlined in this figure.
Childhood cancers are more commonly driven by genetic predisposition syndromes than adult cancers. 53 Predisposing germline variants, often in genes relating to DNA repair, 28 have been identified in around 16% of childhood cancers, 14 though low-level mosaicism and difficulty detecting structural variants mean the incidence may be considerably higher. 27 Predisposition variants can contribute to earlier onset of cancer, influence surveillance protocols for patients and their families, and have implications for treatment. 17,53 For example, retinoblastoma patients with germline RB1 alterations have an increased risk of secondary cancer, with that risk increasing following radiotherapy treatment of their primary tumour. 54
Most molecular markers currently in clinical use assess germline or somatic DNA variants, including single nucleotide mutations or larger structural variants. In the paediatric cancer clinic, these biomarkers are used for risk stratification, diagnosis and treatment decisions. For example, neuroblastoma patients with non-high-risk disease have 5-year event-free survival over 85%, but survival for high-risk disease approximates 50%. 55 Prognosis is determined by factors such as age at diagnosis, disease stage, tumour histology and molecular markers such as tumour ploidy, MYCN amplification and certain segmental chromosomal changes. 56 Alterations in the RAS and TP53 pathways and activation of a telomere maintenance mechanism are also predictive of poor outcome, although these are not routinely clinically considered. 56
Molecular biomarkers that inform prognosis or targeted therapy have also been identified for diagnosing central nervous system tumour subtypes, as reviewed in ref. 53. Translocations generating fusion proteins can be markers of therapeutic targets in paediatric bone tumours and soft tissue sarcomas. However, use of targeted or immune therapies as a first-line treatment in children is generally limited. There are several Phase I and II trials ongoing in difficult-to-treat paediatric solid tumours or relapsed patients to assess possible biomarkers. 53 A few biomarker-driven therapies have been approved for use in paediatric cancers, such as larotrectinib (Vitrakvi) and entrectinib for patients with NTRK gene fusions to inhibit the tyrosine kinase activity of the fusion proteins. 53 In addition to DNA- or RNA-based molecular markers, researchers are also investigating epigenetic markers, such as methylomic assessment of childhood central nervous system tumours to guide subtyping. 57 Research investigating protein abundance patterns or proteomic biomarkers of therapy or staging is limited and not yet integrated into clinical use.
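As a worked illustration of the mutation burden figures quoted in this section (below 1 mutation per Mb for most paediatric tumours, over 10 per Mb for the hypermutated cases linked to deficient mismatch repair), the following sketch converts a somatic mutation count into mutations per Mb. The function names, the "intermediate" label and the example counts are assumptions made for illustration and are not taken from the cited studies.

```python
def mutations_per_mb(n_somatic_mutations, callable_bases):
    """Somatic mutation burden expressed per megabase of callable sequence."""
    return n_somatic_mutations / (callable_bases / 1_000_000)


def burden_category(rate_per_mb):
    """Bin a burden using the two thresholds quoted in the text.

    The 'intermediate' label for 1 to 10 mutations per Mb is an
    illustrative assumption; the text names only the two extremes.
    """
    if rate_per_mb < 1:
        return "low"
    if rate_per_mb > 10:
        return "very high"
    return "intermediate"


if __name__ == "__main__":
    # Hypothetical tumours sequenced over roughly 3,000 Mb of callable genome.
    for count in (1_200, 45_000):
        rate = mutations_per_mb(count, 3_000_000_000)
        print(count, rate, burden_category(rate))
```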

EXPLORING THE PROTEOMIC LANDSCAPE OF PAEDIATRIC CANCER
There are several areas of paediatric cancer research that can be informed by MS-based proteomic analysis (Figure 2). These include using proteomics to identify biomarkers for diagnosis and classification, identification of cancer subtypes, risk stratification, and prediction of survival or treatment response. The majority of examples described below represent proteomic research in its exploratory phases, and they are yet to be incorporated into clinical practice.

TABLE 1 Mass spectrometry and array-based proteomics approaches and related experimental considerations.

Data acquisition modes
1. Data-dependent acquisition (DDA): In DDA-mass spectrometry (MS), the mass spectrometer selects and isolates specific precursor ions (peptides in the case of proteomic analyses) from a sample for fragmentation and identification. Due to its stochastic nature, DDA may miss low-abundance peptides and is prone to higher levels of missing values.
2. Data-independent acquisition (DIA): DIA-MS methods, such as sequential windowed acquisition of all theoretical mass spectra (SWATH), fragment all precursor ions (peptides) within defined mass windows, enabling more comprehensive and reproducible quantitation.
3. Selected reaction monitoring (SRM) or multiple reaction monitoring (MRM): These are targeted MS methods used for precise quantification of specific peptides by monitoring predefined transitions between precursor and fragment ions.

Instrument configurations
1. Liquid chromatography mass spectrometry (LC-MS): Complex samples consisting of tens of thousands of peptides are separated by LC, which is directly connected to the MS. Peptides are ionized and analyzed by an MS instrument such as a triple quadrupole (QQQ), quadrupole time-of-flight (Q-TOF) or orbitrap mass analyzer. These MS methods typically use an electrospray ionization source.
2. Matrix-assisted laser desorption/ionization (MALDI): MALDI-MS uses a laser to ablate and ionize samples that have been dried together with a matrix onto a plate. This technique has also been applied to direct imaging of tissue sections. MALDI is typically coupled to a TOF mass analyzer.

Quantitation methods
3. Stable isotope labelling: The advantage is proteome-wide labelling, but the cost is prohibitive in many cases. An example is stable isotope labelling by amino acids in cell culture (SILAC).
4. Array-based proteomics: These are various alternative modes of proteomics that are typically independent of a mass spectrometer. These include:
• Protein or antibody microarrays.
• Reverse-phase protein arrays (RPPA), which involve printing cell lysates onto a solid support.
• Olink technology, which involves the use of proximity extension assays and a unique panel of paired antibodies to measure protein levels in a multiplexed and highly sensitive manner.
• SOMAScan technology, which uses modified DNA aptamers (called SOMAmers) to target and quantify the levels of thousands of proteins simultaneously in a single sample.

Some experimental considerations
Biospecimen type: Similar numbers of proteins are identified in fresh frozen (FF), FF embedded in optimal cutting temperature (OCT) or formalin-fixed paraffin-embedded (FFPE) tissue, with approximately 50%-80% overlap in common proteins across tissue types.
Protein identification: DIA-MS and DDA-MS will typically enable the identification of thousands of proteins per cancer tissue sample, and these approaches are often used for hypothesis generation. In contrast, targeted MS methods such as SRM and MRM typically measure tens to hundreds of proteins in one experiment, and are often used for validation studies and clinical translation.
Quantitation accuracy: Targeted MS has an advantage of providing the most accurate quantitation, leading to lower coefficients of variation (CV). DIA-MS is typically considered to be more accurate and with fewer missing values than DDA-MS. Data analysis strategy is also a strong driver of reproducibility, with library search providing higher identifications and quantification accuracy than sequence databases, regardless of which acquisition method is used. Labelled MS experiments have been found to produce lower false-positive rates in regard to quantitation, but similar or higher protein sequence coverage and quantitative precision have been observed from label-free MS.
Post-translational modifications (PTMs): MS can be used to measure PTMs such as phosphorylation, glycosylation and acetylation. Recent advances in open search algorithms now enable the detection of numerous PTMs directly from DDA-MS and some DIA-MS data. PTM analysis does not utilize different MS instruments. However, it requires considerably longer time to perform due to steps like enrichment of the modified peptides on affinity beads and two-dimensional high-performance LC to separate the peptides into multiple fractions (e.g., 24) for more MS runs.
Understanding kinase activity: The use of kinase inhibitors for cancer treatment is growing, with many targeted therapies being tested in paediatric cancers and some, such as tyrosine kinase inhibitors, already approved for use. Kinase selectivity profiling is valuable for understanding drug action, and greatest depth of coverage can be achieved when it is performed using Kinobead technology. This is done through a competition set-up in which samples are treated with a kinase inhibitor, followed by affinity enrichment with a Kinobead matrix and measurement of the target peptides by MS to infer a dose-dependent relationship. Kinase enrichment analysis (KEA) tools such as KEA3 can be used to predict the upstream kinases responsible for observed differential phosphorylation from phosphoproteomic and proteomic data.
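The contrast between DDA and DIA described in Table 1 can be sketched in a few lines. This is a deliberately simplified illustration: it assumes DDA picks the `top_n` most intense precursors in a single MS1 scan (real instruments add features such as dynamic exclusion), and that DIA uses fixed-width isolation windows. The function names and the toy precursor list are hypothetical.

```python
def dda_select(precursors, top_n):
    """DDA sketch: keep only the top_n most intense precursors for fragmentation.

    precursors is a list of (mz, intensity) tuples from one MS1 scan.
    """
    ranked = sorted(precursors, key=lambda p: p[1], reverse=True)
    return sorted(mz for mz, _ in ranked[:top_n])


def dia_windows(mz_start, mz_end, width):
    """DIA sketch: fixed-width isolation windows covering [mz_start, mz_end)."""
    windows = []
    low = mz_start
    while low < mz_end:
        windows.append((low, min(low + width, mz_end)))
        low += width
    return windows


def dia_assign(precursors, windows):
    """Every precursor inside a window is co-fragmented, whatever its intensity."""
    return {
        window: [mz for mz, _ in precursors if window[0] <= mz < window[1]]
        for window in windows
    }


if __name__ == "__main__":
    # Hypothetical MS1 scan: (m/z, intensity)
    scan = [(402.2, 9e5), (455.7, 1e4), (512.3, 5e6), (640.9, 2e3), (733.4, 7e5)]
    print("DDA fragments:", dda_select(scan, top_n=2))
    print("DIA fragments:", dia_assign(scan, dia_windows(400, 800, 100)))
```

In this toy scan the low-intensity precursors are skipped by the DDA selection but still fall inside a DIA window, which is the behaviour behind the missing-value difference noted in Table 1.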

Proteomics to aid diagnosis and classification decisions
Proteomics can be used to aid paediatric cancer diagnoses. For example, in a study aiming to classify paediatric AML at diagnosis, two-dimensional gel electrophoresis and MALDI time-of-flight (TOF) MS (Table 1) were used to assess de novo versus post-myelodysplastic syndrome AML. 58 Using bone marrow and peripheral blood samples, several proteins were identified as indicators of leukaemogenesis or AML biomarkers. 58 Similarly, MS measurements of pooled normal retina and retinoblastoma tissues revealed several hundred differentially regulated proteins, using data-dependent acquisition (DDA) by LC-MS with isobaric tags (iTRAQ; Table 1) for relative quantitation. 59 The dysregulated proteins, which included lamin B1 and transferrin receptor, were also found to be overexpressed by IHC and may be retinoblastoma biomarkers.
FIGURE 2 Overview of a mass spectrometry-based proteomic workflow and proteomic analysis approach to illuminate the proteomic landscape of childhood cancer for precision medicine and to enable tailored treatment plans. Proteomic data can be analyzed to aid diagnosis and classification decisions, identify cancer subtypes, perform risk stratification and survival prediction, or to inform predictions of treatment response. Proteomic analysis of liquid biopsy samples can also be adopted for several purposes.
MS can also be used to predict the future likelihood of developing a disease.While the following example concerns metabolomics (the large-scale study of water-soluble small molecule intermediates or products of metabolism to understand metabolic processes and their associated biochemical pathways) rather than proteomics, it is interesting to consider a small MS-based study that identified metabolites from newborn dried blood spots that were significantly predictive of female children who would later develop paediatric AML. 60

Proteomics for identification of cancer subtypes
MS-based proteomic approaches can also be used to identify cancer subtypes, which is clinically useful if these subtypes have prognostic significance, as illustrated by the following three examples.
Proteomic profiling was performed on leukaemic cells from 16 paediatric AML patients, identifying AML subtypes with differing core binding factor gene translocation status and potentially druggable proteome targets. 61 In a multi-omic study of rhabdomyosarcoma, six groups of proteins and phosphoproteins were identified from orthotopic patient-derived xenografts (PDXs), human myoblasts and myotubes, which were differentially expressed in alveolar and embryonal rhabdomyosarcomas and developing muscle. 62 Differential protein abundances revealed a proteomic signature that could distinguish four subtypes of paediatric osteosarcoma formalin-fixed paraffin-embedded (FFPE) samples that were not correlated with clinical determinants. 63
As part of the National Institutes of Health Clinical Proteomic Tumor Analysis Consortium (CPTAC), 64 a large-scale MS-based paediatric brain tumour proteomic study profiled 218 frozen tissue samples from seven childhood brain cancer subtypes with DDA by LC-MS, using tandem mass tag (TMT; Table 1) labelling for quantification. 29 Proteomic and phosphoproteomic data were combined with whole-genome and RNA-sequencing data for multi-omic analysis. Proteomic data revealed brain cancer clusters spanning histological subtypes that correlated with molecular features such as common somatic mutations. For example, some BRAF-wildtype samples showed a similar proteomic profile to BRAF V600E-mutant samples, suggesting potential utility for MEK inhibitors. 29 These data indicate that some treatments may be applied to cancers within a similar proteomic cluster, despite comprising different histologically defined subtypes. The authors also identified a low correlation between pairs of initial and

Proteomics for risk stratification and survival prediction
Accurate risk stratification is important for cancers such as neuroblastoma, where survival differs considerably according to risk profile. In a pilot study aiming to identify spatial peptide heterogeneity in neuroblastoma tissues with divergent risk classifications, MALDI-MS imaging revealed a spatially associated peptide signature that discriminated risk subgroups. 67 Risk stratification has enabled survival improvements in paediatric ALL, where survival now approximates 90%. 68 Several proteomic biomarkers have been identified in paediatric ALL, spanning a range of sample types including cancer cell lines, serum, bone marrow and peripheral blood (reviewed in ref 69). In a cohort of low- and high-risk paediatric ALL, differential proteomic analysis with gel electrophoresis and MALDI-TOF MS enabled identification of several proteins involved in leukaemia prognosis and response to therapy. 70 In another MS-based study, data-independent acquisition (DIA) by LC-MS and label-free quantitation (LFQ; Table 1) was used to identify 86 differentially expressed proteins in high-risk childhood B-ALL, including proteins involved in pre-mRNA splicing, DNA damage and stress response. 71
Paediatric AML has been the focus of several proteomic risk stratification studies. DDA SELDI-TOF MS (Table 1) revealed that subgroups of leukaemic cells correlated with different survival outcomes, with S100A8 expression being predictive of poor survival. 72 On a larger scale, 296 candidate proteins were profiled by reverse-phase protein arrays (RPPA) from 500 paediatric AML cases. 73 RPPA is a non-MS high-throughput antibody-based technique used to measure multiple proteins simultaneously in a single sample 74 (Table 1). Nine protein expression signatures were found to be associated with clinical outcomes including prognosis and responses to specific treatment regimens. 73

Proteomics for prediction of treatment response
Proteomics can guide treatment decisions by predicting drug response or identifying new drug targets. Using the SK-N-SH neuroblastoma cell line, DDA by LC-MS with LFQ was used to identify proteins affected by combinatorial therapy involving 13-cis retinoic acid and K777 cathepsin inhibitor. 75 In a surface and global proteome analysis of Ewing sarcoma PDXs and cell line-derived xenografts, many Ewing sarcoma-associated proteins and new cell surface targets were identified for potential use in targeted immunotherapy. 76 In an example from lipidomics (the large-scale study of the diversity and structure of cellular lipid species in a given cell or organism) and metabolomics, picosecond infrared laser desorption MS was used for medulloblastoma tissues to identify a small molecule-based signature that graded tumours into prognostically important subgroups. 77 This method, completed within 10 seconds of MS during surgery, enabled immediate personalised subgroup-specific treatment of medulloblastoma. 77 Paediatric cancers frequently harbour mutations in epigenetically associated genes. 78 Epigenetic therapies aimed at reversing aberrations in the cancer cell epigenome are a possible approach for overcoming drug resistance. MS can provide information on the nature of epigenome modifications, and therefore presents a potential avenue for research into treatment response, as reviewed in ref. 24.
Cancer cell lines, organoids and PDXs are particularly informative for drug response studies, as these model systems can be used for drug-screening experiments assessing multiple anticancer agents. 21,79 The ProCan-DepMapSanger dataset was generated by a pan-cancer, multi-omic study of 949 human cancer cell lines across 28 tissue types using DIA-MS. 25 The cell lines were also screened against 625 anticancer drugs, with deep learning used for drug response prediction. 25 More than 10% of the cell lines are from paediatric cancers, and ProCan-DepMapSanger data are available at http://cellmodelpassports.sanger.ac.uk 80 and in the PRIDE repository 81 (Table 2). A multi-dimensional analysis of over 300 paediatric cancer cell lines is also available in the Childhood Cancer Model Atlas 82 at https://vicpcc.org.au/resources/ (Table 2).

LIQUID BIOPSY AND PROTEIN ANALYSIS
Though yet to enter routine clinical practice, liquid biopsy is a major focus of paediatric cancer research. 17,83 Liquid biopsy uses biological fluids, such as peripheral blood, plasma, urine or cerebrospinal fluid, to obtain circulating tumour cells, cell-free tumour DNA or RNA, or proteins. It is especially valuable where invasive biopsies of primary or metastatic tumour would be risky or difficult and to avoid exposure to imaging radiation. 84[86][87][88][89][90] Proteins can be measured through liquid biopsy by capturing exosomes or circulating tumour cells. 84 Early detection of hepatoblastomas associated with cancer predisposition is performed in the clinic by regular measurements of serum alpha-fetoprotein. 91
MS-based proteomic analysis of liquid biopsy is currently being investigated in paediatric cancer. For example, in rhabdomyosarcoma, differentially expressed proteins were detected in urine samples using both DDA and DIA by LC-MS, with a subset of proteins identified as possible biomarkers for non-invasive molecular diagnoses. 92 In brain cancer, DDA by LC-MS was used to analyse the proteome of cerebrospinal fluid in control and brain tumour patients. Using parallel reaction monitoring for quantification (Table 1), tumour samples were differentiated from non-tumour controls, and broad subtypes of brain tumour were distinguished. 93 In paediatric ALL, SELDI-TOF MS was used to generate serum profiles from 94 patients and 84 controls and develop a classification model that used proteins and protein fragments as potential biomarkers to distinguish paediatric ALL patients from both healthy controls and paediatric AML patients. 94 MS has also been used to find urine markers of Wilms' tumour, identifying prohibitin as a prognostic marker and therapeutic target. 95 To detect markers of early ageing in childhood cancer survivors, DDA by LC-MS and LFQ of mononuclear cells from peripheral blood samples was used to identify proteins involved in anaerobic metabolism and glucose transport that were more highly abundant in cancer survivors than age-matched controls. 96 To monitor immune and tumour responses to treatment in patients with diffuse intrinsic pontine glioma (DIPG), multiple reaction monitoring (MRM; Table 1) was performed on longitudinal DIPG biospecimens of cerebrospinal fluid and serum. 97 To improve sensitivity, peptide immunoaffinity enrichment of immunomodulatory proteins was incorporated alongside MRM-MS in an assay called immuno-MRM. 97

CHALLENGES, OPPORTUNITIES AND THE ROAD AHEAD FOR PAEDIATRIC CANCER PROTEOMICS
Many studies have observed a low correlation between the transcriptome and proteome. 23,98 Multi-omic analyses have revealed protein expression patterns that were not evident in the transcriptome, 25,29 emphasising the potential added value of proteomic analyses in addition to other 'omic approaches. By further evaluating cancer samples at multiple time points, our understanding of genotype-to-phenotype relationships and response to therapy can be enhanced. 99 However, despite these promising indications, proteomic analyses of paediatric cancer face unique challenges.
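One common way to quantify the transcriptome-proteome relationship mentioned above is a rank correlation between matched mRNA and protein abundances. The sketch below computes a Spearman correlation in plain Python; the choice of Spearman rather than another statistic, the function names and the toy values are assumptions for illustration and do not reproduce any analysis from the cited studies.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    numerator = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denominator = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return numerator / denominator


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Hypothetical log-abundances for five genes measured in the same samples.
    mrna = [5.1, 7.3, 2.2, 9.8, 6.0]
    protein = [4.0, 3.5, 2.9, 8.1, 7.7]
    print(round(spearman(mrna, protein), 3))
```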

Limitations on cohort size, biopsy tissue and clinical data
Childhood cancer is comparatively rare, meaning that cohort sizes for research are often limiting. This is a particular challenge for analyses spanning multi-omic data types that use high-dimensional molecular information (such as large-scale drug response studies), as these require large cohorts for adequate power. A related challenge is the insufficient availability of survival data for research aiming to identify prognostic or predictive markers of survival. Investigating particularly rare cancer types or histologies will require partnerships and collaborations that bridge disciplines and institutions. 29 Another challenge is limited availability of cancer tissue, as core biopsies done by interventional radiologists are generally preferred to avoid morbidities associated with open surgical biopsies. Core biopsies generate smaller quantities of tissue than open biopsies, creating additional difficulties when tissue is required for histopathology as well as several multi-omic assays.
To partially overcome cohort size limitations, research can utilize preclinical models such as cancer cell lines, organoids and PDXs to boost cohort sizes for biomarker discovery. 21 In support of this approach, the proteomic landscape of PDX models of paediatric ALL has been found to be largely reflective of the patient of origin. 100 The advantages and disadvantages of using such model systems for biomarker studies are reviewed in ref. 21

Computational and technical challenges relating to MS-based proteomic data
Many paediatric cancers are characterised by fusion proteins or large structural changes. Detecting these is a challenge for traditional MS-based proteomic methods that rely on a reference protein database or spectral library. These methods are similarly challenged in their identification of somatic mutations in peptide sequences. Analysing these molecular alterations requires the use of bespoke computational pipelines and comprehensive databases of variant-containing peptides. 101 MS-based proteomics faces an additional data interpretation challenge in the presence of missing values. Missing values may indicate proteins that are undetectable because of biological factors as well as incomplete data due to technical artefacts. 102 One avenue to overcome these challenges is to focus on measuring the downstream pathway effects of fusion events, chromosomal alterations or somatic mutations rather than direct detection. Despite these limitations, proteomics also presents significant advantages over other 'omic approaches in its unique ability to measure complexes, PTMs and subcellular localization. 103
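To make the database problem above concrete, the sketch below shows why a fusion protein needs custom entries: only peptides that span the junction between the two partners are diagnostic of the fusion, and these are absent from a reference database built from the unfused proteins. It assumes tryptic cleavage with no missed cleavages; the sequences are invented toy fragments, not real fusion partners, and the function names are hypothetical.

```python
import re


def tryptic_peptides_with_spans(protein):
    """Yield (start, end, peptide) for peptides cleaved after K/R, not before P."""
    start = 0
    for match in re.finditer(r"(?<=[KR])(?!P)", protein):
        end = match.start()
        if end > start:
            yield start, end, protein[start:end]
        start = end
    if start < len(protein):
        yield start, len(protein), protein[start:]


def junction_peptides(partner_5p, partner_3p, min_length=6):
    """Peptides containing residues from both sides of the fusion junction."""
    fusion = partner_5p + partner_3p
    junction = len(partner_5p)  # index of the first residue from the 3' partner
    return [
        peptide
        for start, end, peptide in tryptic_peptides_with_spans(fusion)
        if start < junction < end and len(peptide) >= min_length
    ]


if __name__ == "__main__":
    # Hypothetical partner fragments either side of a breakpoint.
    five_prime = "MAEKLSDGTV"
    three_prime = "PEWNQKAAGR"
    print(junction_peptides(five_prime, three_prime))
```

Peptides returned by such a step would be appended to the search database; a production pipeline would also need to handle missed cleavages, alternative breakpoints and false discovery control, none of which is attempted here.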

Big data and public repositories
Recent advances in MS to enable high-quality and reproducible high-throughput data generation have improved accessibility of proteomic technology, leading to analyses of thousands of proteins within large cancer tissue or cell line cohorts. 25,102 These include several big proteomic datasets of adult cancers via initiatives such as ProCan, 104 CPTAC 64 and the International Cancer Proteogenome Consortium (ICPC) 105 (Table 2), as well as emerging literature interrogating the proteome of paediatric cancer tissue and cell line cohorts. 25,29 There are already several paediatric genomic data sharing platforms, [106][107][108][109] and paediatric proteomic data are also becoming increasingly available on platforms such as the Proteomic Data Commons (PDC). 65 Integrating large proteomic datasets acquired on different platforms or at different times can be technically challenging, but there are ongoing efforts at data harmonization. 102,110,111
Single cell technologies introduce multi-omic measurements of individual cells, which necessarily increases the scale of biological data. Such data provide opportunities for understanding tumour heterogeneity and mechanisms of relapse and resistance. Proteins can be measured in single cells using methods such as cytometry by time-of-flight (CyTOF), CITE-seq (cellular indexing of transcriptomes and epitopes by sequencing) or several spatial proteomics approaches, with potential utility in paediatric cancer as reviewed in ref. 112

Artificial intelligence and advanced computational approaches
The availability of big multi-omic datasets enables the leveraging of machine learning, artificial intelligence and advanced computational methods. 113 These approaches offer distinct advantages over traditional bioinformatic analyses. The enhanced analytical capabilities offered by big data enable the extraction of more nuanced biological insights, which remain elusive when relying on smaller cohorts or datasets spanning more limited 'omic data types. In the long term, the combination of rich proteomic data from a variety of cancer cohorts alongside other multi-omic information will facilitate pan-cancer research that can enable the identification of commonalities across cancer types and features unique to specific cancers. 25 Moreover, foundation models 114 can be developed, which are a specialised form of machine learning architecture pre-trained on large datasets. These can serve as an initial platform for many downstream applications. In the context of paediatric cancer research, a foundation model could be pre-trained using datasets from childhood cancer cell lines, allowing more effective stratification of paediatric cancer tissues or prediction of drug response. Foundation models could also be explored for the application of models trained in adult cancers to paediatric cancers, though biomarkers may differ greatly between adult and childhood cancers. 82 An additional consideration is the requirement for validation of any machine learning model in an independent cohort. A typical proteomic validation cohort should be completely independent of the training datasets, ideally with MS data generated at a different time point, on a different instrument or at a different site.
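The independence requirement for validation cohorts can be illustrated with a small sketch in which samples are split by acquisition site rather than at random, so that no site contributes to both training and validation. The nearest-centroid classifier, the record layout (`features`, `label`, `site`) and the toy values are assumptions chosen to keep the example self-contained; they do not represent any model from the cited studies.

```python
from statistics import mean


def split_by_site(records, training_site):
    """Train on one site and hold out every other site for validation."""
    train = [(r["features"], r["label"]) for r in records if r["site"] == training_site]
    held_out = [(r["features"], r["label"]) for r in records if r["site"] != training_site]
    return train, held_out


def fit_centroids(samples):
    """Mean feature vector per class label (a nearest-centroid classifier)."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*rows)] for label, rows in by_label.items()}


def predict(centroids, features):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(centroids, samples):
    return sum(predict(centroids, f) == y for f, y in samples) / len(samples)


if __name__ == "__main__":
    # Hypothetical two-protein abundance profiles from two acquisition sites.
    records = [
        {"site": "A", "features": [1.0, 0.2], "label": "low"},
        {"site": "A", "features": [1.2, 0.1], "label": "low"},
        {"site": "A", "features": [3.0, 2.1], "label": "high"},
        {"site": "A", "features": [2.8, 2.4], "label": "high"},
        {"site": "B", "features": [1.1, 0.4], "label": "low"},
        {"site": "B", "features": [3.1, 1.9], "label": "high"},
        {"site": "B", "features": [2.0, 1.6], "label": "high"},
    ]
    train, held_out = split_by_site(records, training_site="A")
    model = fit_centroids(train)
    print("held-out accuracy:", accuracy(model, held_out))
```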

CONCLUSIONS
Proteomic research has already revealed several putative biomarkers and molecular signatures from tumour tissue and liquid biopsy of childhood cancer. With clinical proteomics now possible through technological advances enabling large-scale, high-throughput and reproducible MS, we should expect to see novel protein biomarker discoveries in coming years. For these to enter routine clinical practice, the tests will need to be validated and results provided with rapid turnaround. Some of the hardest-to-treat paediatric cancers have not seen significant improvements in survival for several decades. Proteomic research holds promise for the identification of biomarkers in cancers such as these. Even a modest improvement in target discovery or prediction of treatment outcome could have a major impact in childhood cancer, improving survival and reducing side effects from treatments that are ineffective for individual patients.

TABLE 1 (continued)

Instrument configurations
3. Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF): SELDI-TOF MS is related to MALDI-TOF, but in this technique samples are applied to a specially prepared surface with affinity for specific molecules.
4. Capillary electrophoresis mass spectrometry (CE-MS): It combines capillary electrophoresis for separation with MS for detection.

Quantitation methods
1. Label-free quantitation (LFQ): LFQ methods compare the intensity or area-under-the-curve of peptide signals in mass spectra between different samples, typically those acquired by DIA/SWATH or SRM-MS. It does not require chemical modification of the peptides.
2. Isobaric labelling: It enables multiplexed quantitation by labelling peptides from different samples with isobaric tags. Up to 18 samples are run together in a single experiment. This allows for fast and accurate relative quantitation, but the results are difficult to compare between laboratories. Examples include tandem mass tag (TMT) and isobaric tags for relative and absolute quantitation (iTRAQ).
3. Stable isotope labelling: It involves incorporating isotopically labelled amino acids into proteins during cell culture.