A systematic review of evidence for and against routine surveillance imaging after completing treatment for childhood extracranial solid tumors

Abstract

Background: Regular off‐treatment imaging is often used to assess for recurrence of disease after childhood cancer treatment. It is unclear if this increases survival, or what burden surveillance places on patients, families, or health‐care services. This systematic review examines the impact of routine surveillance imaging after treatment of pediatric extracranial solid tumors.

Methods: Collaborative patient and public involvement informed the design and interpretation of this work. Thirteen electronic databases, conference proceedings, and trial registries were searched alongside reference list checking and forward citation searching from 1990 onwards. Studies were screened and data were extracted by two researchers. Risk of bias was assessed using a modified ROBINS‐I tool. Relevant outcomes were overall survival, psychological distress indicators, number of imaging tests, cost‐effectiveness, and qualitative data regarding experiences of surveillance programs. PROSPERO (CRD42018103764).

Results: Of 17 727 records identified, 55 studies of 10 207 patients were included. All studies used observational methods. Risk of bias for all except one study was moderate, serious, or critical. Data were too few to conduct meta‐analysis; however, narrative synthesis was performed. Surveillance strategies were varied and poorly reported, and involved many scans and substantial radiation exposure (eg, neuroblastoma, median 133.5 mSv). For most diseases, surveillance imaging was not associated with increased overall survival, with the probable exception of Wilms tumor. No qualitative or psychological distress data were identified.

Conclusions: At present, there is insufficient evidence to evaluate the effects of routine surveillance imaging on survival in most pediatric extracranial solid tumors. More high‐quality data are required, preferably through randomized controlled trials with well‐conducted qualitative elements.


| Searches
Electronic searches were undertaken from 1990 onwards, reflecting the current era of survival in childhood cancer. Published and unpublished studies were sought, and no language or study design restrictions were applied; as such, randomized controlled trials, quasi-randomized studies, and prospective and retrospective cohort studies were all eligible for inclusion, as described in the protocol. 5

Box 1 Study inclusion and exclusion criteria
Inclusion criteria:
- Population: Children or young people up to 25 years who had completed treatment for a malignant extracranial solid tumour and had no evidence of active and ongoing disease at end of treatment (or results for this subgroup)
- Intervention: programme of surveillance imaging aiming to detect relapse of previously treated childhood cancer
- Comparators: routine clinical review, another surveillance programme (using imaging or laboratory measures) or none (some studies reported this comparison as detection of relapses by surveillance compared to by symptoms)
- Outcomes:
  a. Primary: Overall Survival (age at time of death or time from original diagnosis)
  b. Secondary: psychological distress indicators, number of imaging tests, cost-effectiveness, qualitative data relating to experiences of surveillance imaging, other harms of imaging (as identified by the studies themselves)

Exclusion criteria:
- Case studies
- Studies from Low and Middle-Income Countries (LMIC); only studies performed in high-income countries were included, to reflect the treatment and surveillance strategies in these settings
- Surveillance solely related to patients with cancer predisposition syndromes
- Surveillance looking predominantly for late effects of treatment

Reference lists of included articles were reviewed and forward citation searching of included articles was performed, using Web of Science.

| Screening and data extraction
For inclusion and exclusion criteria see Box 1. Two authors independently screened the titles and abstracts of studies, dual-screening 10% of the records and single-screening the remaining 90% as agreement was good (96.6%). Disagreements were resolved by consensus, or by recourse to a third author. Data extraction was performed by two authors. Study quality was assessed using a modified ROBINS-I tool, supplemented with potential sources of heterogeneity: patient demographic and clinical characteristics, study era, and geography. 6,7

| Analysis
Key study characteristics were summarized in narrative and tabular forms. Given the degree of clinical heterogeneity and absence of sufficient data, meta-analysis was not appropriate. Narrative synthesis was performed by tumor type, and focused on the key themes of: method of identification of relapse, burden of surveillance programs, and effects on survival.

In total, 17 727 unique records were identified by the search; 17 226 were excluded on title and abstract, and 449 were excluded following full-text review (Figure 1). Review of conference proceedings and reference searches identified three further studies. Review of trial registries identified no ongoing relevant studies.
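The record flow described above can be checked with simple arithmetic. The following is a minimal sketch using only the counts stated in the text; the variable names are illustrative and are not taken from the review.

```python
# Counts as stated in the text (variable names are illustrative only).
records_identified = 17_727
excluded_title_abstract = 17_226
excluded_full_text = 449
added_from_other_sources = 3  # conference proceedings and reference searches

# Records that went forward to full-text review.
full_text_reviewed = records_identified - excluded_title_abstract

# Studies retained from the electronic search after full-text review.
included_from_search = full_text_reviewed - excluded_full_text

# Final number of included studies.
total_included = included_from_search + added_from_other_sources

print(f"Full-text reviewed: {full_text_reviewed}")
print(f"Included from search: {included_from_search}")
print(f"Total included: {total_included}")
```

The total agrees with the fifty-five included studies reported in the mapping summary.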

| Mapping summary
Fifty-five studies, with 10 423 participants, were included ( was unclear whether surveillance imaging was within a clinical trial or part of routine care.

| Risk of bias
Risk of bias was variable, with most studies demonstrating moderate to serious risk of bias (Table 2). Particular issues relate to confounding and lead-time bias (where studies measure survival from time of detection of relapse, rather than from original diagnosis).
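Lead-time bias can be illustrated with a small worked example. The sketch below uses two hypothetical patients with invented times (not taken from any included study) who die at the same time after original diagnosis; the only difference is that one relapse is detected earlier by surveillance.

```python
from dataclasses import dataclass


@dataclass
class RelapsedPatient:
    """Hypothetical patient; all times are in years from original diagnosis."""
    label: str
    relapse_detected_at: float
    death_at: float

    def survival_from_detection(self) -> float:
        # Survival measured from the time the relapse was detected.
        return self.death_at - self.relapse_detected_at

    def survival_from_diagnosis(self) -> float:
        # Survival measured from original diagnosis.
        return self.death_at


# Hypothetical values chosen for illustration only.
by_surveillance = RelapsedPatient("surveillance", relapse_detected_at=1.0, death_at=4.0)
by_symptoms = RelapsedPatient("symptoms", relapse_detected_at=1.5, death_at=4.0)

lead_time = by_symptoms.relapse_detected_at - by_surveillance.relapse_detected_at
apparent_gain = (
    by_surveillance.survival_from_detection() - by_symptoms.survival_from_detection()
)
true_gain = (
    by_surveillance.survival_from_diagnosis() - by_symptoms.survival_from_diagnosis()
)

print(f"Lead time: {lead_time} years")
print(f"Apparent survival gain (from detection): {apparent_gain} years")
print(f"Survival gain (from original diagnosis): {true_gain} years")
```

In this constructed case the apparent gain in post-relapse survival equals the lead time, while survival from original diagnosis is unchanged. This is why measuring survival from original diagnosis is less vulnerable to this bias.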

| Non-Hodgkin's lymphoma
Four studies examined surveillance imaging in 110 patients with non-Hodgkin's lymphoma. [19][20][21][22] Data from three of four studies indicated large numbers of scans, with 806 scans conducted in 66 patients. [20][21][22] Where reported, scanning was associated with a notable radiation dose (median whole body radiation dose of 40.3-91.3 mSv). 20,21 Three relapses occurred within one study population at a median time of 0.25 years. These were detected by symptoms. 21 Surveillance imaging detected no relapses and produced 17 false positive images. 19,20,22

| Hodgkin's lymphoma
Five studies assessed surveillance imaging in 799 patients with Hodgkin's lymphoma. [23][24][25][26][27] Surveillance programs comprised large numbers of images: where reported, 1293 in 291 patients. [24][25][26] Relapse was detected in 111 (13.8%) patients, 51 (45.9%) by surveillance imaging and 60 (54.1%) by clinical signs and symptoms. Thirty-four false positive images were reported in two studies. 24,25 One study reported a median time to relapse of 1.7 years by scan compared to 0.61 years in those detected by clinical signs and symptoms. 26 For those relapses detected after 12 months off-treatment, 5-year survival after relapse was 100% for both groups. 26 Another study reported 5-year survival after relapse of 64.6% ± 10.1% in those detected by surveillance imaging vs 73.8% ± 7.2% in those detected by clinical signs and symptoms (P = .186). 23

| Osteosarcoma
Five studies of three cohorts reported on 247 patients with osteosarcoma. [28][29][30][31][32] Where reported, the number of scans during surveillance programs was large, 2394 for 231 patients. 28 Forty-three patients experienced relapse, with one study providing comparative data on the numbers of patients who experienced relapse detected by surveillance imaging vs symptoms, 7/28 vs 21/28, respectively. 28 Survival data were largely lacking. Korholz et al reported a 5-year overall survival of 67%, without comparative data on the method of relapse detection. 28
One study reported a shorter median time to relapse, 0.28 vs 1.22 years, and a lower 5-year survival, 0% vs 17%, in symptomatic patients compared to those detected by surveillance imaging. 33 Another study also reported a shorter median time to relapse in symptomatic patients than in those detected by surveillance, 1.6 vs 1.9 years, although the difference was not statistically significant (P = .07). 34 Another study found that 5-year overall survival (OS) after relapse was higher in asymptomatic patients vs symptomatic patients, 37% vs 9%. 35

| Wilms tumor
Six studies explored five cohorts of 5074 patients with Wilms tumor. 17,36-40 These experienced 836 relapses; where method of relapse detection was reported, 501 were detected by surveillance imaging and 181 detected by symptoms. Not all patients had method of relapse detection reported.

| Hepatoblastoma
There were 73 patients with hepatoblastoma included in three studies. 17 detected by rise in alpha-fetoprotein (AFP) levels rather than imaging. 17,42 One study did not report the number of patients relapsed but stated that all were detected by rise in AFP levels prior to imaging. 41 In total, 408 imaging studies were performed, although two studies only reported CT data without other imaging types. No study reported data on overall survival. One study found no significant difference in time to relapse between patients detected by surveillance and those detected by symptoms. 42
Survival statistics were variably reported, with most studies reporting less than 5 years of follow-up. Few patients survived following relapse (of 28 relapses, two patients were in CR and three were alive with disease). 43,44,46 Curve-estimated 5-year overall survival after relapse was around 3% in those detected by surveillance and 0% in those detected by symptoms. 45 One study reported a mean of 29.5 CT scans per patient and another reported a median of 35 images (median CED 133.5 mSv) from the time of initial diagnosis to relapse. 44,47

| Retinoblastoma
Two studies assessed 65 patients with retinoblastoma. 48,49 Three patients relapsed, one was detected by surveillance imaging and two by symptoms. A total of 223 scans were conducted and 11 false positive images were reported. Survival data are lacking.

| Soft tissue sarcomas
Four studies examined rhabdomyosarcoma and included 466 patients with 325 relapses. 3,[50][51][52] One study included all pediatric soft tissue sarcomas (235 patients with relapsed disease, of whom 150 had rhabdomyosarcoma). 16 In the studies that only included patients with rhabdomyosarcoma, where method of detection was reported, 85 relapses were detected by surveillance imaging and 140 by symptoms. In Dantonello et al, 90 were detected by surveillance and 139 by symptoms. One study reported 507 scans in 40 patients; scanning frequency data were not provided by the other studies. 50 Two studies reported survival data. 3,51 Neither found a significant difference in overall survival between those detected by surveillance and those detected by symptoms. The survival rate was lower in one study compared to the other (surveillance vs symptoms: 20% vs 11% 3-year survival and 43.3% vs 44.6% 5-year survival, respectively), as the former included progression of disease along with relapse. 3,51 In Dantonello et al, 5-year overall survival from primary surgery was 40% for those detected by surveillance and 29% for those detected by symptoms. 16 However, by 10 years, survival was 21% for those detected by surveillance and 23% for those detected by symptoms. These differences may reflect different biology of disease being detected by surveillance, with these patients surviving longer.
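To give a rough sense of imaging burden, the scan counts quoted in the tumor-specific sections above can be converted to crude scans per patient. This is a minimal sketch that uses only the pairs of scan and patient counts stated explicitly in the text. The figures are simple ratios across studies with differing follow-up and reporting, so they are indicative only and are not results reported by the review.

```python
# (total scans, patients) as quoted in the text for each tumor type.
scan_counts = {
    "Non-Hodgkin's lymphoma": (806, 66),
    "Hodgkin's lymphoma": (1293, 291),
    "Osteosarcoma": (2394, 231),
    "Rhabdomyosarcoma (one study)": (507, 40),
}

# Crude ratio of scans to patients; ignores follow-up duration and modality.
scans_per_patient = {
    tumor: scans / patients for tumor, (scans, patients) in scan_counts.items()
}

for tumor, value in scans_per_patient.items():
    print(f"{tumor}: {value:.1f} scans per patient")
```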

| DISCUSSION
Evidence on the use of surveillance imaging in pediatric extracranial solid tumors is derived exclusively from observational studies. Surveillance strategies are often poorly reported and variable in design, making replication of many studies impossible. The risk of bias for most studies is significant. Evidence gaps were present in all malignancies and the quality of studies was generally low, with particular issues around confounding and lead-time bias. Conclusive statements regarding the survival benefit of surveillance imaging cannot be made based on information identified in this review.
We recognize that reporting combinations of different imaging types together makes it challenging to separate out the roles of each modality for different malignancies. Sadly, much of the available literature includes all imaging types and presenting separate findings is currently impossible.
Notwithstanding these limitations, it is possible to establish that surveillance imaging programs result in a large number of additional imaging investigations, often associated with notable radiation doses. There is a risk of false positive images, including incidental or uncertain findings, which was particularly present in studies of lymphoma. These may be associated with additional distress for patients and families, as well as further investigations. Even with large numbers of tests, surveillance imaging detected only 57% of relapses identified.
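The share of relapses detected by surveillance can be tallied by tumor type from the counts quoted in the results sections. The sketch below covers only those tumor types for which both counts are stated in this text, so the pooled figure it prints is an illustration of the calculation and should not be read as the review's own derivation of the percentage above.

```python
# (relapses detected by surveillance, relapses detected by symptoms),
# as quoted in the results text; tumor types without both counts are omitted.
detected = {
    "Non-Hodgkin's lymphoma": (0, 3),
    "Hodgkin's lymphoma": (51, 60),
    "Osteosarcoma": (7, 21),
    "Wilms tumor": (501, 181),
    "Retinoblastoma": (1, 2),
    "Rhabdomyosarcoma": (85, 140),
    "Soft tissue sarcomas (Dantonello et al)": (90, 139),
}


def share_by_surveillance(surveillance: int, symptoms: int) -> float:
    """Proportion of relapses with a reported method that were found by imaging."""
    return surveillance / (surveillance + symptoms)


for tumor, (surv, symp) in detected.items():
    print(f"{tumor}: {share_by_surveillance(surv, symp):.1%} by surveillance")

total_surveillance = sum(surv for surv, _ in detected.values())
total_symptoms = sum(symp for _, symp in detected.values())
pooled_share = share_by_surveillance(total_surveillance, total_symptoms)
print(f"Pooled (subset only): {pooled_share:.1%} by surveillance")
```

The per-tumor shares make the heterogeneity visible: most Wilms tumor relapses with a reported method were detected by imaging, whereas most osteosarcoma and rhabdomyosarcoma relapses were detected by symptoms.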
Survival outcomes were generally poorly reported. For most malignancies studied, the data available suggested no significant difference in survival between patients whose relapse was detected by surveillance and those whose relapse was detected by symptoms. One exception to this finding is within Wilms tumor, where detection by surveillance imaging does appear consistent with increased survival. However, the number needed to scan is large and the financial costs of surveillance imaging are high. Summary information with the clinical bottom line for each cancer type is provided in Box 2.
It is important to recognize that any differences in survival reported in these nonrandomized studies may be due to the variable biology of relapsed disease rather than an effect of surveillance. As such, randomized studies are necessary in order to truly evaluate the role of routine surveillance imaging in pediatric patients with extracranial solid tumors.
No qualitative data, psychological distress indicator studies, or studies exploring morbidity or burden of relapse treatment, including the risk of secondary malignancies, were identified. As such, the literature captures little of the patient's or family's experience of routine surveillance programs, which may be positive, negative, or both, or of the subsequent treatment of relapse. This is particularly disappointing given our PPI group stressed the importance of these issues. They highlighted to us the importance of understanding the "sawtooth" of anxiety relating to scanning ("scanxiety"), where the anxiety builds to the point of receiving the results of a scan, followed by the relief of a result showing no evidence of disease. They discussed that there may be different anxieties experienced if routine surveillance imaging was not undertaken. They also felt the literature should reflect that knowing about a relapse in advance may not change survival but may alter how life was lived, and thus a deeper exploration of the meanings surrounding surveillance imaging would be a key contribution to the literature in the future. We strongly recommend that high-quality qualitative research should be performed to understand the various roles of follow-up, the meaning assigned to surveillance imaging, and the preferences of patients, parents, and professionals in this setting. This research should include both those undergoing routine disease surveillance and those who are not.

Box 2 Clinical bottom lines
- Lymphomas: Large numbers of scans and false positive imaging are demonstrated in the literature. More research is needed on whether surveillance imaging provides survival benefit.
- Osteosarcoma: Large numbers of scans are conducted. There is a lack of comparative survival data between relapses detected by surveillance vs. symptoms. More research is needed on whether surveillance imaging provides survival benefit.
- Ewing's sarcoma: Surveillance imaging may not detect relapse prior to symptoms. Those detected earlier by symptoms may have more aggressive disease and therefore have a lower survival after relapse. Research using appropriate effect measures is needed to infer a survival benefit.
- Wilms tumour: Most relapses were detected by surveillance imaging and this appears consistent with increased survival. Data on survival benefit were reported post relapse and are at risk of lead-time bias, and thus should be interpreted with caution. The number needed to scan is large and the financial costs are high.
- Hepatoblastoma: Tumour markers detected all relapses in the literature prior to surveillance imaging, though there were few relapses reported. Patients received a large number of imaging studies. There is no evidence on the effect of surveillance imaging on survival.
- Neuroblastoma: Evidence was derived mostly from high-risk patients. The risk of relapse was high and few patients with relapse survived, regardless of the method of detection. Surveillance programmes involved a large number of scans and a significant radiation dose.
- Retinoblastoma: Large numbers of scans were conducted and were associated with false positive images. There is no evidence on the effect of surveillance imaging on the time to detection of relapse or on survival.
- Soft tissue sarcoma: Numbers of scans were high and most relapses were detected by symptoms. Evidence does not support improved survival after relapse of rhabdomyosarcoma in those detected by surveillance imaging. For patients with other soft tissue sarcomas evidence is inconclusive.
- Other tumours: Minimal data are available on the impact of surveillance in rarer diseases and no evidence suggests improved survival with surveillance.
The strengths of this review lie in the robust systematic review methodology, informed by extensive PPI engagement focused on design, interpretation, and dissemination. One key challenge lies in how to address teenage and young adult malignancies in systematic reviews. We excluded studies where the majority of participants were over 25. For some diseases, where the population prevalence straddles this cutoff (eg, Hodgkin's lymphoma and germ cell tumors), this review does not provide all relevant data. Future reviews of these particular malignancies should focus on surveillance across the population.
In addition to this, it is important to recognize that there are challenges in identifying whether patients are symptomatic at the time of surveillance imaging, particularly in retrospective studies. Even if this was identified, this was rarely reported within the literature. We recognize that some patients may have presented with symptoms at the point of surveillance and may have therefore been classified either as symptomatic or detected by surveillance. The effects of this within the data are difficult to predict. It is possible that studies could have consistently classified these patients as one mode of detection over the other. If, for example, patients were more frequently classified as detected by surveillance, it may appear that surveillance images identify more relapses than would be the case in practice. However, it is unlikely to change the duration of survival findings if patients are symptomatic at surveillance visits. Future prospective studies should aim to capture this information so as to inform our understanding of the role of surveillance imaging in this setting.
Concerted effort is required to improve understanding of the risks and benefits of surveillance imaging. We strongly recommend the review of currently held data from research trials or cohorts not currently in the public domain. This may include combining data from multiple studies to inform the research problem. We also recommend that national and international trial bodies consider including the randomization of follow-up policies within future trial platforms so as to provide further information about best surveillance practices. Furthermore, we recognize that surveillance imaging programs should change over time, as both up-front and relapse therapies change, and as the imaging modalities available for surveillance develop, resulting in a different balance of risks and benefits for patients.

| CONCLUSIONS
At present, there is insufficient evidence to evaluate the effects of routine surveillance imaging on survival in most pediatric extracranial solid tumors. More high-quality focused research is needed that uses appropriate effect measures to address the research questions, alongside well-conducted qualitative data. This should be a key research priority considering the substantial impact of imaging on patient experience, and the financial and opportunity costs to health services.
Funding: This study was funded by a Children's Cancer and Leukaemia Group (CCLG) 40th Anniversary Grant. JEM is supported by an NIHR Clinical Lectureship and RSP by an NIHR Postdoctoral Fellowship. Neither NIHR nor CCLG has had any role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

ACKNOWLEDGMENTS
Many thanks to Jennifer Brown and Alexis Llewellyn for helping screen and extract foreign language records. We are also grateful to the patient and public involvement group whose input has been invaluable throughout this work.

CONFLICTS OF INTEREST
JEM is supported by an NIHR Clinical Lectureship and RSP by an NIHR Postdoctoral Fellowship.

AUTHOR CONTRIBUTIONS
JEM and RSP conceived the study idea and obtained funding. JEM, MH, and RSP designed the protocol. MH performed the searches. JEM, RW, and RSP screened, selected, quality assessed, extracted data, and analyzed the study. All authors contributed to and have approved the final manuscript.

DATA AVAILABILITY STATEMENT
This study reports a systematic review for which all data are already available within the public realm in the form of scientific publications, references for which are provided.