Characterisation of paediatric brain tumours by their MRS metabolite profiles

1H‐magnetic resonance spectroscopy (MRS) has the potential to improve the noninvasive diagnostic accuracy for paediatric brain tumours. However, studies analysing large, comprehensive, multicentre datasets are lacking, hindering translation to widespread clinical practice. Single‐voxel MRS (point‐resolved single‐voxel spectroscopy sequence, 1.5 T: echo time [TE] 23–37 ms/135–144 ms, repetition time [TR] 1500 ms; 3 T: TE 37–41 ms/135–144 ms, TR 2000 ms) was performed from 2003 to 2012 during routine magnetic resonance imaging for a suspected brain tumour on 340 children from five hospitals with 464 spectra being available for analysis and 281 meeting quality control. Mean spectra were generated for 13 tumour types. Mann–Whitney U‐tests and Kruskal–Wallis tests were used to compare mean metabolite concentrations. Receiver operator characteristic curves were used to determine the potential for individual metabolites to discriminate between specific tumour types. Principal component analysis followed by linear discriminant analysis was used to construct a classifier to discriminate the three main central nervous system tumour types in paediatrics. Mean concentrations of metabolites were shown to differ significantly between tumour types. Large variability existed across each tumour type, but individual metabolites were able to aid discrimination between some tumour types of importance. Complete metabolite profiles were found to be strongly characteristic of tumour type and, when combined with the machine learning methods, demonstrated a diagnostic accuracy of 93% for distinguishing between the three main tumour groups (medulloblastoma, pilocytic astrocytoma and ependymoma). The accuracy of this approach was similar even when data of marginal quality were included, greatly reducing the proportion of MRS excluded for poor quality. 
Children's brain tumours are strongly characterised by MRS metabolite profiles readily acquired during routine clinical practice, and this information can be used to support noninvasive diagnosis. This study provides both key evidence and an important resource for the future use of MRS in the diagnosis of children's brain tumours.


| INTRODUCTION
Diagnosis of paediatric brain tumours is routinely made by multidisciplinary teams including radiologists, oncologists and histopathologists, with histopathology and molecular characterisation after biopsy or resection considered the gold standard. 1 [...] 3,4 While this modality provides exceptional anatomical and spatial information, conventional sequences provide little information on tumour properties and have limited diagnostic accuracy. 5 [...] 7,8 Diffusion-weighted imaging (DWI), in particular, has become an essential technique for image-based diagnosis. 9 Coupling DWI with machine learning has been shown to discriminate between three major paediatric brain tumours, medulloblastoma (MB), pilocytic astrocytoma (PA) and ependymoma (EP), across a multicentre cohort. 10 Preliminary studies of single-voxel proton MRS have shown how it can aid in the characterisation of adult brain tumours, 11,12 leading to an interest in its use in paediatric neuro-oncology. Early spectroscopy studies were performed at long echo time (TE) with limited metabolite information.
However, it became evident that short TE, providing more metabolite information and greater signal-to-noise ratio (SNR), was better for the characterisation of some tumours. 13 Analysis is now preferentially performed using automated software for metabolite fitting and water referencing. 14,15 As a result, subsequent studies have been able to discriminate between closely related metabolite profiles of brain tumours and have aided the detection of tumour-specific markers. 13,16 Visual interpretation of index cases against mean MR spectra and mean metabolite profiles of different tumour types has been shown to add clinical value, 17 and the combination of these metabolite profiles with machine learning techniques for brain tumour classification has become popular. The most commonly used multivariate techniques include principal component analysis (PCA) and linear discriminant analysis (LDA), 18 and classifiers developed from these often give excellent accuracy for diagnosis. 19,20 While there are many studies demonstrating the diagnostic accuracy of MRS, reports are also published that show the added value of MRS compared with conventional MRI. Most commonly, simple viewing of MRS has been evaluated and shown to improve noninvasive diagnosis when compared with radiological review of conventional MRI, 7 including DWI. 17 It has also been shown that radiologists' diagnostic accuracy improves sequentially when conventional MRI is viewed followed by MRS and then a decision support system giving the results of a machine learning classifier. 21 Currently, such decision support systems are not available in a routine clinical setting.
With the advent of molecular testing, the diagnostic categories for children's brain tumours have changed considerably over the past decade. 22 This has led to some categories that were poorly defined on histology, such as supratentorial primitive neuroectodermal tumours (PNETs), being reclassified into several different diagnoses, some previously recognised, such as glioblastoma multiforme (GBM), and others that are new, such as embryonal tumour with multilayered rosettes. At the same time, some tumour types, such as MB, are composed of molecular subtypes, which only partially match the histological subtypes previously established. 23 This evolving picture provides many challenges for studies of noninvasive diagnosis, particularly as children's brain tumours are not common and large datasets take many years to accrue. Increasing subcategorisation also reduces the numbers in each diagnostic group, making it difficult to obtain sufficient data to build reliable machine learning classifiers. Although there are definite challenges posed by this change in diagnosis, MRS has been shown to improve noninvasive diagnosis and clinical decision-making for children with brain tumours. 17,24 [...] 26,27 This makes it difficult to demonstrate the robustness and benefits of using the technique. 21 Progress has been made with small, multicentre studies showing the diagnostic potential of MRS across multiple centres, at both 1.5 and 3 T, 24,28 and understanding of MRS as a multicentre diagnostic technique is continuing to develop. Inclusion of automated diagnosis, 21 as well as the integration of MRS interpretation in the clinical workflow, 29 has the potential to improve the accuracy of clinical decision-making. Nevertheless, large comprehensive, multicentre datasets for MRS are still unavailable, hampering its clinical use and impact.
The aim of the current study was to determine metabolite profiles of a cohort of paediatric brain tumours using short TE and, where available, long TE spectroscopy at 1.5 and 3 T, thus establishing distinguishing features from a large multiscanner cohort. A further aim was to demonstrate the effectiveness of the compiled cohort in classifying the three main types of paediatric brain tumours, with a view to aiding preoperative diagnosis as an adjunct to conventional and advanced MRI.

| Patients
The current study received ethical approval and parental consent was obtained (5512M[B] and 04/MRE04/41). Single-voxel 1H MRS was acquired during routine MRI for a suspected brain tumour, prior to treatment, from July 2003 to October 2012. In total, 340 children were scanned, with 464 spectra acquired for analysis. Where surgery was undertaken as part of the patient's routine clinical management, tumour diagnosis and grading were obtained from histopathology using the diagnostic criteria relevant at the time. 1 For those cases where surgery was not performed or a tissue diagnosis was inconclusive, diagnosis was made on clinical and imaging findings, after review by the local multidisciplinary team.

| MRS acquisition
Scans were acquired at both 1.5 and 3 T at five principal treatment centres, then collected and analysed retrospectively. At 1.5 T, spectroscopy was performed on five different scanner types at five centres. The scanners used were the Siemens Symphony, Siemens Avanto, GE Signa Excite, Philips Intera and Philips Achieva. At 3 T, all spectroscopy was performed on the Philips Achieva from three centres. The acquisition protocol was agreed at the start of the project, but with sufficient flexibility to allow implementation on all scanners (Table S1). MRS was acquired with a point-resolved single-voxel spectroscopy (PRESS) sequence after conventional MRI including T1- and T2-weighted and T1-weighted postcontrast sequences. These images were used to guide the placement of the MRS voxel within the solid-appearing component of the tumour, avoiding areas of cyst, necrosis or normal-appearing brain tissue. The voxel was placed to avoid close proximity to scalp or other fatty tissue to minimise the risk of lipid contamination of the MRS signal. If the tumour was sufficiently large to allow for some flexibility in voxel placement, contrast enhancement and low apparent diffusion coefficient (ADC) were used to guide optimal position. The MRS parameters at 1.5 T were short TE (23-37 ms), long TE (135-144 ms), TR 1500 ms, 128-256 signal averages, spectral bandwidth 2000 or 2500 Hz and vector length 2048. The parameters at 3 T were short TE (37-41 ms), long TE (135-144 ms), TR 2000 ms, 64-196 signal averages, spectral bandwidth 2000 Hz and vector length 2048. Voxel size ranged from 3.38 to 8 cm3 at 1.5 T and from 2.2 to 8 cm3 at 3 T. B0 shimming and water suppression were undertaken according to the automated methods available on the scanners. Free induction decays made available routinely on each scanner were exported for analysis and are referred to here as the raw data. The signal from each receiver coil is usually combined by the scanner software prior to export.

| Quality control criteria
A pipeline showing the quality control (QC) steps and the resulting cohort numbers is summarised in Figure 1. In total, 363 1.5-T and 101 3-T MRS scans were assessed using the following QC criteria. Spectra failed QC unless TARQUIN reported a full-width half-maximum (FWHM) of less than 0.150 ppm, an SNR of more than 5 and a measure of fit quality (Q) of less than 2.5, or if on visual inspection they exhibited baseline abnormalities or artefacts. The FWHM cut-off was chosen to give reasonable quantification of the tCho and total Cr peaks around 3.2 and 3.0 ppm, respectively, which is important clinically. In all, 140/363 failed the QC checks at 1.5 T, and 43/101 failed at 3 T. Spectra were then excluded if the diagnostic group had fewer than three cases, the diagnostic group was unknown or insufficient diagnostic information was available for inclusion, or finally if the spectrum duplicated another spectrum from the same patient (1.5 T, n = 56; 3 T, n = 31). Tumour types with three or more cases (n = 13) were included in the analysis with 142 short TE and 25 long TE for 1.5 T and 17 short TE and 10 long TE for 3 T, and henceforth patient numbers refer to that cohort. For short TE MRS data at 1.5 T, the diagnostic categories are given in Table 1. There were 25 cases with long TE data at 1.5 T: 10 PAs, 10 MBs and five diffuse intrinsic pontine gliomas (DIPGs). At 3 T, a total of 17 cases had short TE MRS, eight of which were PA and nine MB, and 10 cases had long TE MRS, five of which were PA and five MB.
A relatively high proportion of spectra were excluded for not meeting the QC criteria. The main reasons for this were poor SNR and poor FWHM. These were often a consequence of the tumour itself, with low cellularity leading to poor SNR, and calcification or haemorrhage leading to poor FWHM. Some childhood brain tumours are in locations where it is difficult to perform spectroscopy because of the proximity of bone, such as the suprasellar region or the inferior brain stem.
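The numeric QC thresholds described in this section can be expressed as a simple filter. The following is a minimal sketch only: the class name, field names and the `visual_ok` flag are hypothetical, and the example values are invented for illustration rather than taken from the cohort.

```python
from dataclasses import dataclass

# Thresholds quoted in the text for the TARQUIN-derived quality measures.
FWHM_MAX_PPM = 0.150
SNR_MIN = 5.0
Q_MAX = 2.5


@dataclass
class SpectrumQuality:
    fwhm_ppm: float         # full-width half-maximum in ppm
    snr: float              # signal-to-noise ratio
    fit_q: float            # fit quality measure Q
    visual_ok: bool = True  # hypothetical flag: no baseline abnormality or artefact seen


def passes_qc(s: SpectrumQuality) -> bool:
    """Return True only if all three numeric thresholds and the visual check are met."""
    return (
        s.fwhm_ppm < FWHM_MAX_PPM
        and s.snr > SNR_MIN
        and s.fit_q < Q_MAX
        and s.visual_ok
    )


# Illustrative values only (not study data).
good = SpectrumQuality(fwhm_ppm=0.08, snr=12.0, fit_q=1.4)
broad = SpectrumQuality(fwhm_ppm=0.16, snr=12.0, fit_q=1.4)
print(passes_qc(good), passes_qc(broad))
```

A spectrum fails as soon as any single criterion is not met, which mirrors the wording of the criteria above.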

| Statistical analysis
Statistical analysis was performed using R Statistical Software (version 2.13.0, 2011) 32 on metabolite concentrations (mM). Metabolites with a Cramer-Rao lower bound (CRLB) of less than 30% in at least two subjects were included in the analysis. These criteria resulted in the exclusion of the metabolites Asp and GABA from the final analysis. Lipids and macromolecules at 0.9 ppm (LMM0.9), 1.3 ppm (LMM1.3) and 2.0 ppm (LMM2.0) are reported as combined signals.
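The CRLB inclusion rule can be illustrated with a short sketch. This is not the R code used in the study; the function name, the data layout (one list of %CRLB values per metabolite) and the example numbers are assumptions made for illustration.

```python
def metabolites_to_keep(crlb_percent, threshold=30.0, min_subjects=2):
    """Keep metabolites whose %CRLB is below `threshold` in at least `min_subjects` subjects.

    `crlb_percent` maps a metabolite name to a list of %CRLB values, one per subject.
    """
    kept = []
    for name, values in crlb_percent.items():
        n_reliable = sum(1 for v in values if v < threshold)
        if n_reliable >= min_subjects:
            kept.append(name)
    return kept


# Invented %CRLB values for three hypothetical subjects.
example = {
    "Tau": [12.0, 45.0, 18.0],   # reliable in two subjects: kept
    "GABA": [80.0, 29.0, 95.0],  # reliable in one subject only: dropped
    "mIns": [8.0, 9.0, 11.0],    # reliable in all subjects: kept
}
print(metabolites_to_keep(example))
```

Note that the rule is applied per metabolite across the cohort, not per spectrum, so a retained metabolite may still be poorly determined in individual cases.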
Mean metabolite values were compared between different groups using Mann-Whitney U-tests. For MRS acquired using short TE at 1.5 T, each tumour type was compared with all the other tumour types combined into a single group (All Other). MB and PA were compared at 1.5-T long TE and at 3-T short and long TE. A comparison was also made between tumour types that are often included in a differential diagnosis on conventional MRI sequences, using 1.5-T short TE data (Table S2). A formal correction for multiple comparisons was not undertaken, but a reduced cut-off of p less than 0.01 was used to recognise the challenge and to provide consistency with previous publications. 13 Receiver operator characteristic (ROC) curves were constructed for pairs of tumour groups and the area under the curve (AUC) was determined, with AUCs of more than 0.8 reported. Cut-off values (maximum sensitivity plus specificity) were determined to optimise discrimination.
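To make the cut-off selection concrete, the sketch below computes an AUC and the threshold that maximises sensitivity plus specificity for a single metabolite in two groups. It is an illustration under stated assumptions, not the study's R implementation: the function names are hypothetical, the "positive" group is assumed to have the higher values, and the concentrations are invented.

```python
def auc_pairwise(pos, neg):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    value is larger; ties count as half a pair."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_cutoff(pos, neg):
    """Scan every observed value as a candidate cut-off, calling a case positive
    when its value is >= the cut-off. Returns (cutoff, sensitivity, specificity)
    for the first candidate that maximises sensitivity + specificity."""
    best = None
    for c in sorted(set(pos) | set(neg)):
        sens = sum(1 for p in pos if p >= c) / len(pos)
        spec = sum(1 for n in neg if n < c) / len(neg)
        if best is None or sens + spec > best[1] + best[2]:
            best = (c, sens, spec)
    return best


# Invented concentrations (mM) for two hypothetical tumour groups.
group_high = [5.1, 6.3, 4.8, 7.0]
group_low = [2.0, 3.1, 4.9, 2.7]
print(auc_pairwise(group_high, group_low))
print(best_cutoff(group_high, group_low))
```

Because the cut-off is chosen on the same cases it is evaluated on, its apparent accuracy is optimistic, which is why a cross-validated error rate is the more informative figure.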
Cut-off value accuracy was determined using leave-one-out cross-validation (LOOCV). 33 A diagnostic classifier was constructed to discriminate between the three most common paediatric tumour types, MB, PA and EP, by using metabolite profiles from short TE 1.5-T MRS. Principal components accounting for 95% of the variance were used in the LDA. Classifier accuracy was determined using LOOCV and is reported as the balanced accuracy rate (BAR). 34 There were insufficient 3-T data to build a classifier and test it using LOOCV; however, we performed an exploratory study of field strength dependence, in which the 3-T cases were used as a test set for the 1.5-T classifier. To evaluate interscanner MRS variability, spectra acquired at 1.5 T and short TE from MB on four of the scanners (Siemens Symphony, Siemens Avanto, GE Signa Excite and Philips Intera) were compared using a Kruskal-Wallis test.
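The LOOCV procedure and the balanced accuracy rate can be sketched as follows. To keep the example dependency-free, a nearest-centroid rule is used as a stand-in classifier; it is not the PCA-LDA classifier used in the study. BAR is taken here to be the mean of the per-class correct classification rates, which is the usual definition and an assumption about the cited method. The two-feature profiles are invented.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (NOT PCA-LDA): assign x to the class whose mean
    profile is closest in squared Euclidean distance."""
    sums, counts = {}, {}
    for row, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1

    def dist(label):
        return sum((x[i] - sums[label][i] / counts[label]) ** 2 for i in range(len(x)))

    return min(sorted(sums), key=dist)


def loocv_predictions(xs, ys, predict=nearest_centroid_predict):
    """Hold out each case in turn, train on the remainder, predict the held-out case."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        preds.append(predict(train_x, train_y, xs[i]))
    return preds


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so small classes weigh as much as large ones."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        recalls.append(sum(1 for i in idx if y_pred[i] == label) / len(idx))
    return sum(recalls) / len(recalls)


# Invented two-metabolite profiles for three well-separated groups.
xs = [[4.0, 1.0], [4.2, 1.2], [3.8, 0.9],
      [0.5, 2.0], [0.6, 2.2], [0.4, 1.9],
      [1.0, 6.0], [1.2, 6.3], [0.9, 5.8]]
ys = ["MB"] * 3 + ["PA"] * 3 + ["EP"] * 3
preds = loocv_predictions(xs, ys)
print(preds, balanced_accuracy(ys, preds))
```

In a pipeline with a dimension-reduction step, the reduction would need to be refitted inside each LOOCV fold so that the held-out case does not influence the components.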
To test the appropriateness of the QC criteria used in this study, with relation to effect on classification accuracy, all cases of MB, PA and EP acquired at 1.5 T that failed were rereviewed and placed into three categories: large, intermediate and borderline QC fail. The definitions were:
1. Large QC fail: A spectrum where major metabolite peaks were not discernible.
2. Intermediate QC fail: A spectrum where major peaks were discernible but artefacts or broad peaks were seen in more than one region of the spectrum.
3. Borderline QC fail: A spectrum where the voxel contained a small proportion of normal brain, or the voxel was over a metastatic lesion and/or artefacts were seen around the water peak.
These cases were incorporated into the original classifier as test cases.
FIGURE 1 Flow diagram of patients studied at 1.5 and 3 T. MRS, magnetic resonance spectroscopy; QC, quality control; TE, echo time.

| Metabolic features
Mean metabolite concentrations (mM) for diagnostic groups are given in Table 2 with mean spectra in Figures 2 to 6. All these figures show 1.5-T spectra, except for Figure 3, which shows 3-T spectra. Notable features are reported below; comments apply to short TE MRS at 1.5 T and the comparison is with all other cases in the cohort unless specified.

| Medulloblastoma
MB had higher Cr, tCho, Gly, LMMs and Tau and significantly lower mIns (p < 0.0001) (Figure 4A). The most discriminatory feature of the MB mean spectra was high Tau. Gly makes a greater contribution than mIns in MBs (Figure 3B,D).

| Ependymoma
EP spectra showed higher mIns and Cr (p < 0.01) (Figure 5). Grade III had a lower Cr (p < 0.01) than grade II (Figure 5B,C). The higher LMM1.3 and lower mIns seen in the mean spectra of grade III relative to grade II were not significant.
TABLE 2 Mean absolute concentrations (mM) ± SD of short echo time pretreatment paediatric brain tumours at 1.5 T compared using the Mann-Whitney U-test.

| Central nervous system PNET
No significantly different variables (p > 0.01) were identified when comparing central nervous system (CNS) PNETs with All Other (Figure 4B).
Note that in pilocytic astrocytoma, the peak labelled NAA is known to be consistent with an assignment to macromolecules. Cr, creatine; Glx, glutamine + glutamate; Gly, glycine; Lac, lactate; LMM, lipids and macromolecules; mIns, myo-inositol; NAA, N-acetylaspartate; SD, standard deviation; tCho, total choline; WHO, World Health Organisation.

| Comparison of mean metabolite values between tumours commonly found in a differential diagnosis
Results comparing specific tumours are reported in Table 3. The following comparisons are of particular clinical relevance. MB demonstrated higher Gly compared with ATRT. MB demonstrated higher tCho and Tau and lower mIns when compared with EP. PA demonstrated lower LMM1.3 and LMM2.0 compared with GBM.

| Classification of differential diagnoses using individual metabolites
The accuracy of individual metabolites to discriminate between specific pairs of tumours is given in Table 4 and Figure 7. Myo-inositol was found to be the best discriminator between MB and EP. Creatine, mIns, tCho and LMM2.0 were found to discriminate between PA and EP. Lactate, LMM0.9, LMM1.3 and LMM2.0 were found to discriminate between DIPG and GBM, although all LMMs demonstrated a low specificity. The LOOCV error rate was more than 30% for each pair of comparisons.

| Classification of common brain tumours by metabolite profile
The diagnostic classifier built from short TE MRS at 1.5 T showed good separation between the three tumour types of MB, PA and EP (Figure 8A). MBs were distinguished primarily by discriminant function 1, with high tCho, sIns, Gua and Tau, and PAs with high tNAA and Lac (Figure 8B). PA was distinguished from EP by high tNAA, with EP characterised by high mIns (Figure 8C). LOOCV demonstrated a success rate of 93% with the correct classification of 74 out of 80 cases, all with a posterior probability of more than 80%. The discrimination of the three tumour groups with a posterior probability of more than 80% obtained a BAR of 0.88.
Note that in craniopharyngioma, the peak labelled NAA is known to be consistent with an assignment to macromolecules. Cr, creatine; Glx, glutamine + glutamate; Lac, lactate; LMM, lipids and macromolecules; mIns, myo-inositol; NAA, N-acetylaspartate; SD, standard deviation; Tau, taurine; tCho, total choline; WHO, World Health Organisation.
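As a quick arithmetic check on the classifier figures reported in this section, the overall success rate follows directly from the case counts. The second expression gives the usual definition of a balanced accuracy rate, which is assumed here rather than taken from the text: $K$ is the number of classes, $c_k$ the number of correctly classified cases in class $k$ and $n_k$ the number of cases in class $k$. The per-class counts are not given in the text, so the BAR of 0.88 cannot be recomputed from it.

```latex
\[
\text{success rate} = \frac{74}{80} = 0.925 \approx 93\%
\]
\[
\text{BAR} = \frac{1}{K}\sum_{k=1}^{K}\frac{c_k}{n_k}, \qquad K = 3 \ (\text{MB, PA, EP})
\]
```

Because BAR weights each tumour type equally, it can sit below the overall success rate when the misclassified cases are concentrated in a smaller group.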

| Scanner comparisons
Guanidinoacetate (p < 0.01) was the only metabolite that differed between the four scanner models measured in MBs at 1.5 T (Table S4). No significant difference in SNR was found between scanners (31 vs. 27 vs. 14 vs. 20, p > 0.05, Siemens Symphony, Siemens Avanto, GE Signa Excite and Philips Intera, respectively).
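For readers unfamiliar with the Kruskal-Wallis test used for the scanner comparison, the sketch below computes its H statistic from ranked data. It is a simplified illustration: ties receive average ranks but no tie correction is applied, no p-value is computed (that requires a chi-squared distribution with one fewer degrees of freedom than the number of groups), and the function names and values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_g^2 / n_g) - 3 (N + 1), without tie correction."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)


# Invented values for three hypothetical scanners.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

A rank-based test of this kind makes no normality assumption, which suits metabolite concentrations with the wide within-group variability reported here.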

| DISCUSSION
This large multicentre study reports extensive information on 1H MRS of 13 different types of children's brain tumours and its accuracy as a diagnostic aid. Mean spectra and metabolite data are presented at two TEs, two field strengths and for rare as well as common tumours, providing an extensive resource for those using the technique. Mean metabolite concentrations were shown to differ between each tumour type and the combined group of all other tumours, showing that tumour metabolite profiles strongly reflect tumour type as determined by histological analysis.
Individual metabolites were able to discriminate between some pairs of tumours where there is commonly a diagnostic dilemma on conventional MRI and may provide a useful diagnostic aid. However, the use of the complete set of metabolites provides greater promise in diagnosis than individual metabolites. This is demonstrated by the finding that pairs of the three most common tumour types can be discriminated by individual metabolites with an accuracy of 80% to 91% (Table 4), while a formal test of the diagnostic potential of the whole metabolite profile using a three-way PCA-LDA classification demonstrated a diagnostic accuracy of 95% and can be used on almost the whole range of data qualities acquired in routine practice.
The use of conventional MRI in the preoperative diagnosis of brain tumours is often challenging and can result in several proposed diagnoses. 13,35 The large number of metabolites showing differences in mean concentrations highlights the potential for these to be used in diagnosis.
However, the large variability of values for specific cases about the mean hinders individual metabolites from discriminating between tumour types, confirming the results of smaller studies. 13,37-40 ROC curves demonstrated a number of biomarkers that were highly sensitive and specific for discriminating between specific tumour types as validated using LOOCV. Using whole metabolite profiles to discriminate between tumours is a logical extension to individual metabolites. Visual comparison of MRS with mean spectra from tumour groups is one approach that has been previously reported, 7,17,21 and the mean spectra provide a useful diagnostic aid. The accuracy of this approach needs to be assessed formally and its added value to conventional MRI evaluated. The use of pattern recognition methods for classification of tumours based on MRS has been reported previously. 19,20,41 High levels of accuracy for diagnosing PA, EP and MB in children have been achieved using these techniques from short TE MRS in a single-centre study, 20 but short and long TEs needed to be combined to provide high accuracy in a multicentre study. 19 The current multicentre study reproduces the high accuracy of the single-centre data at short TE, giving a saving in acquisition time. Further improvements should occur with MRS data acquired at 3 T but there are insufficient cases to explore this fully at present. While there is mounting evidence for the accuracy of MRS as a diagnostic aid when used in combination with pattern recognition techniques, the availability of such tools for clinical use is still lacking, and commercial, regulatory approved decision support systems need to be made available.
TABLE 4 Binary classification of tumours with discriminatory biomarkers identified using the Mann-Whitney U-test. Optimal cut-offs for significant variables were determined using ROC curves where an area under the curve (AUC), sensitivity, specificity and leave-one-out cross-validation (LOOCV) error rate were calculated.
FIGURE 9 (A) Example case 1 spectra, and (B) Axial T2-weighted image with voxel placement over an ependymoma suspected to be a medulloblastoma on radiology reporting (2.5 years, male); (C) Example case 2 spectra, and (D) Axial T2-weighted image with voxel placement over a medulloblastoma suspected to be an ependymoma (4 years, male); and (E) Example case 3 spectra, and (F) Axial T2-weighted image with voxel placement over a pilocytic astrocytoma that was suspected to be an ependymoma (4.8 years, male) (as labelled in Figure 8). Cr, creatine; Glx, glutamine + glutamate; LMM, lipids and macromolecules; mIns, myo-inositol; NAA, N-acetylaspartate; tCho, total choline.

| Scanner comparisons
No cases had MRS acquired on more than one scanner, but comparing MRS on different scanners showed that for one specific tumour type (MB) the mean concentrations of Gua and Cr were different (p < 0.05). Both of these metabolites have an MRS peak close to water, making them subject to variation with level of shimming and water suppression. Modelling of Cr in TARQUIN has a correction term to allow for this known effect. Cr is also known to vary widely in MB and the differences could be due to population differences. 23

| Quality control
Although MRS data fail stringent QC measures in approximately one-third of cases, spectra failing QC do not necessarily fail because of a limitation of the technique, as the characteristics of the tumour also play an important role. A total of 39% of 1.5-T short TE spectra and 48% of 3-T short TE spectra that failed QC were acquired from PAs. The low cellularity of low-grade tumours and their cystic nature result in reduced SNR, 42,43 and this in itself can be diagnostic information. Similarly, it is often not possible to shim effectively over tumours containing regions of haemorrhage, leading to grossly broadened peaks. Importantly, spectra in the intermediate and borderline QC fail categories perform well on the metabolite profile classifier, with classification success rates being 95% with and 93% without these cases included. This finding demonstrates that less stringent QC criteria could be used, allowing inclusion of almost all cases. Furthermore, all three cases with very poor quality MRS could have been avoided; two were performed on a scanner known to produce artefacts, while the third had a poor voxel placement. Intermediate and borderline QC failures could be avoided in many cases by care with voxel placement, the addition of saturation bands and setting an appropriate number of averages. Applying these measures should enable clinically interpretable MRS to be obtained for the vast majority of children's brain tumours.
All metabolites that had a %CRLB of less than 30% in at least two cases were included in the analysis and reported in the results. Whether a particular metabolite has low %CRLB in a specific case depends on the concentration of that metabolite and the quality of the MRS, in particular the SNR. In cases where these conditions were present, some metabolites that are challenging to quantify, for example, Glu, Gln and Lac, were determined with acceptable %CRLB.

| Study limitations
The study does suffer from significant limitations, both technical and related to study design. From a technical perspective, a fixed MRS processing protocol has been used, without accurately known T1 and T2 values of metabolites and water for relaxation time correction. T2 values for water and metabolites will vary with field strength and even between tumours, leading to some inaccuracy in the estimated concentrations produced by the default values assumed in TARQUIN. Particular care should be taken when comparing metabolite values between the different field strengths. In addition, tumour heterogeneity within the voxel has not been taken into account, with partial volume effects of necrosis and high cellularity regions creating variability in water reference values due to proton density and T1 relaxation effects. Not taking account of these parameters could lead to a loss of useful diagnostic information and create variability that hinders classification. However, determining these parameters for each tumour is not currently feasible in a routine clinical environment. Metabolite ratios obviate the need to make assumptions about water concentrations and water relaxation rates and are an alternative approach.
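The point about metabolite ratios can be illustrated briefly: any common scaling applied to all concentrations, such as that introduced by the water reference, cancels when each metabolite is divided by a reference metabolite. The sketch below assumes Cr as the reference, which is a common convention but an assumption here; the function name and the values are hypothetical.

```python
def ratios_to_reference(concentrations, reference="Cr"):
    """Return each metabolite divided by the reference metabolite.

    `concentrations` maps metabolite name -> estimated value in any common unit.
    """
    ref = concentrations[reference]
    if ref <= 0:
        raise ValueError("reference metabolite must have a positive value")
    return {name: value / ref
            for name, value in concentrations.items()
            if name != reference}


# Invented values; the second profile is the first with a common scaling error of 2x.
profile = {"Cr": 4.0, "tCho": 2.0, "mIns": 6.0}
rescaled = {name: 2.0 * value for name, value in profile.items()}
print(ratios_to_reference(profile))
print(ratios_to_reference(rescaled))
```

The trade-off is that a ratio also inherits any tumour-related variation in the reference metabolite itself, so ratios and water-referenced concentrations carry different sources of uncertainty.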
We have elected to use a fixed predetermined metabolite basis set because, at the time of MRS acquisition and processing, the diagnosis would not be known and an adaptive basis set would be challenging to implement clinically. However, there are situations where the basis set may not contain key metabolites present in the tumour and mis-assignments occur. A pertinent example of this is the assignment of a peak around 2 ppm to NAA in PAs, a peak for which there is evidence that a macromolecular component exists. 44 While the metabolite concentrations are not all strictly correct in these circumstances, their use in providing a noninvasive diagnosis should not be significantly impaired.
Taurine levels in the brain do decrease with age but the main tumour types occur throughout childhood and so this does not explain the discriminatory nature of the metabolite in MBs.
Undertaking MRI in children can be challenging. All scans were performed in specialist centres with access to sedation and general anaesthetic where deemed clinically appropriate. Scans in which significant movement artefact was detected on conventional imaging did not proceed to acquisition of spectroscopy and so it is unlikely that movement artefact is a reason for degradation of spectroscopy data.
The relative lack of 3-T data is a limitation of this study and further acquisition and analysis at this field strength should be undertaken. There were few 3-T scanners available in most centres at the start of the study period, and even towards the end, 1.5-T scanners were often used in preference. This was most likely because the acute presentation of the cases made finding an available scanner a higher priority than the advantages of 3 T, together with the requirement for spinal imaging, which was often better at the lower field strength.
Metabolite profiles are likely to vary across each tumour to some extent, making the MRS dependent on voxel placement. Many tumours were sufficiently small that there was little flexibility in voxel placement, reducing the variability due to intratumoural heterogeneity. Ideally, multiple voxels would be placed in large heterogeneous tumours, but this was not practical within the clinical setting.
Chemical shift displacement effects could lead to certain metabolites having their values altered; however, the use of relatively small voxels and their placement well within the solid component of the tumour should mitigate this. At higher field strengths, the effect is more pronounced, but smaller voxels tended to be used. The availability of mean spectra for a variety of tumour types as a comparator for new cases is an important facilitator for clinical practice and has been shown to aid decision-making. 17,21,29 In the future, we envisage that machine learning classifiers will become an increasingly important diagnostic aid. These should be incorporated into a computerised clinical decision support system, which would enable them to be incorporated into clinical workflows. However, this requires appropriate evaluation and can be subject to regulatory approval.
From a study design perspective, there are challenges and limitations posed by the length of time that the data were acquired over and the time that has elapsed since data collection was completed. Children's brain tumours are relatively rare, and large datasets can only be acquired from multiple sites over a prolonged period, during which diagnostic criteria and MRS acquisition techniques inevitably change. With the advent of molecular tests, some diagnoses based on histopathology are now recognised to be incorrect, while whole diagnostic categories such as supratentorial PNET are no longer recognised. Higher field strength scanners are increasingly used clinically with subsequent improvements in MRS quality, particularly when combined with newer pulse sequences such as semi-LASER. However, from a pragmatic viewpoint, diagnoses have only changed for a few cases, many children with brain tumours have MRS acquired at 1.5 T with very few above 3 T, and new pulse sequences have their biggest impact for MRS imaging and higher field strengths. The reporting of such a large multisite dataset of MRS for children's brain tumours is therefore still important scientifically and of clinical relevance.

| CONCLUSION
This study reports MRS metabolite profiles for a large multicentre cohort of childhood brain tumours. Mean spectra, metabolite concentrations and multivariate classifiers aid discrimination between tumour types. Pattern classification of metabolite profiles provides high levels of accuracy in discriminating between the major children's brain tumour types, even for data that do not meet stringent QC measures. Overall, the study should greatly aid clinical implementation of MRS across multiple centres.
TABLE 1 Patient demographics of the paediatric brain tumours in the cohort with short echo time 1H MRS at 1.5 T.
Columns: Subjects; WHO grade; Age (years) (mean ± SD); Age range (years); Sex (F/M)

FIGURE 3 Mean 3-T spectra for (A) Pilocytic astrocytoma at short echo time (TE) (n = 8), (B) Medulloblastoma at short TE (n = 9), (C) Pilocytic astrocytoma at long TE (n = 5), and (D) Medulloblastoma at long TE (n = 5), with a solid black line indicating the mean spectra and SD indicated by the shaded region.

Figure 9 demonstrates three cases where the suspected diagnosis given on the radiological report differed from that indicated by visual inspection and classification of the MRS. The diagnosis proposed by MRS was confirmed by biopsy in each of these cases.

FIGURE 7 Dot density plots corresponding to discriminatory metabolites reported in Table 4. Cut-off values are shown with dotted lines. These include (A) Ependymoma (EP) versus pilocytic astrocytoma (PA): myo-inositol, (B) EP versus medulloblastoma (MB): myo-inositol, (C) EP versus PA: creatine, (D) Diffuse intrinsic pontine glioma (DIPG) versus glioblastoma multiforme (GBM): lactate, (E) DIPG versus GBM: lipids and macromolecules (MMs) at 0.9 ppm, (F) DIPG versus GBM: lipids and MMs at 1.3 ppm, and (G) DIPG versus GBM: lipids and MMs at 2.0 ppm.
specific tumour types as validated using LOOCV. A future study should be undertaken to validate the findings prospectively and determine clinical impact.