Managing vestibular schwannomas with radiosurgery and radiotherapy: AGREE II appraisal of clinical practice guidelines

Vestibular schwannomas (VSs) are rare, benign intracranial tumours that have prompted clinical practice guideline (CPG) creation given their complex management. Our aim was to utilize the Appraisal of Guidelines for Research and Evaluation (AGREE II) instrument to assess if such CPGs on the management of VSs with radiosurgery and radiotherapy are of acceptable quality.


Introduction
Vestibular schwannomas (VSs, also known as acoustic neuromas) are benign intracranial tumours arising from the neoplastic growth of Schwann cells of the vestibulocochlear nerve (cranial nerve VIII). 1 Most VSs arise sporadically, but they are also associated with genetic disorders such as neurofibromatosis 2 or schwannomatosis. 2 They are generally rare, with prevalence estimates of 42 cases per 100,000 persons. 3 Yet recent research suggests that incidence is increasing secondary to the more widespread use of MRI. 3 The presentation and disease course of VSs can vary widely: some patients have unchanging, asymptomatic tumours indefinitely, whereas others experience progression from sensorineural hearing loss or tinnitus to other cranial nerve deficits (i.e. cranial nerves V and VII), cerebellar dysfunction, hydrocephalus and even death from brain-stem compression. 1,4 Appropriate diagnosis and treatment are therefore essential; however, the difficult anatomy of VSs along the internal acoustic canal (IAC) and cerebellopontine angle (CPA) makes management especially complex. 5
Radiosurgery and radiotherapy have emerged as promising treatment modalities for VSs in the last several decades. 6 Stereotactic techniques, in particular, have demonstrated successful long-term tumour control in more than 90% of patients for small-to-intermediate-sized neoplasms. 7 Their high degree of specificity has also shown fewer treatment-related adverse effects on surrounding IAC and CPA structures compared to microsurgery. 8 Importantly, however, these positive outcomes are highly dependent on provision by experienced physicians at centres with high VS volumes. This largely stems from the rarity of VSs and the technical complexity of defining planning target volumes with millimetre-level accuracy and optimizing dose limits. 7 As a result, even if appropriate equipment were available at lower volume medical centres, numerous healthcare professionals have historically felt unequipped to manage such patients. 9 1][12][13][14][15][16][17][18] In principle, these CPGs involve a multidisciplinary group of expert authors who systematically collect and evaluate all available evidence on the topic to recommend best practices. 19 Such directed guidance could then be synthesized directly to improve radiosurgery and radiotherapy management outcomes for VS patients worldwide. In practice, however, this may not always be the case. Whether due to limited published data on the topic or incorrect favouritism of expert opinion over scientific evidence, the guidance provided in CPGs may not always be truly evidence-based. 19 Objective assessment tools are therefore needed to ensure CPG recommendations are trustworthy and truly beneficial.
The Appraisal of Guidelines for Research and Evaluation (AGREE II) instrument is the second, refined version of a quality metric designed for this very purpose. 20 It comprehensively assesses CPGs over 23 key items and six encompassing domains to provide insight into whether guidelines are developmentally sound. Additionally, its evaluation methodology allows for the formation of concrete recommendations capable of improving new or future updates of CPGs. 20 The AGREE II tool has also demonstrated specific success in evaluating CPGs on radiotherapies. 21,22 Consequently, our aim was to utilize the AGREE II instrument to provide the first known quality appraisal of CPGs on the management of VSs with radiosurgery and radiotherapy and leverage our findings to help improve the standard of care.

Systematic clinical practice guideline identification
CPGs were systematically identified following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) protocols. 23 Embase, PubMed and Web of Science databases were used as sources alongside internet searching of relevant professional society websites. Inclusion criteria consisted of CPGs with substantial guidance regarding the management of vestibular schwannomas with radiosurgery and/or radiotherapy for patients of all ages. Non-English articles and publications without an associated full-text manuscript were excluded. When multiple similarly focused editions of CPGs existed, the newest version was preferred. Medical Subject Headings (MeSH) of 'Neuroma, Acoustic' and 'Practice Guidelines as Topic' were used as search terms alongside the following combination of keywords, Boolean operators and truncations: '((acoustic AND neuroma*) OR (vestibular AND schwannoma*)) AND ((clinical* AND practice AND guideline*) OR guideline* OR consensus OR recommendation*)'.
The implementation of this strategy is shown in Figure 1. Organization and duplicate removal were automated using Covidence (Covidence systematic review software, Veritas Health Innovation, Melbourne, Australia). Two authors (DL, EW) independently screened articles, first by titles and abstracts and then by full-text review, using the previously mentioned inclusion and exclusion criteria. Disputes over whether to include certain publications were resolved by a third author (KR). The selected CPGs then had relevant general characteristics extracted prior to AGREE II appraisal.

AGREE II evaluations
Each CPG was independently read and assessed by four experienced reviewers (DL, EW, CF, JH) trained in AGREE II protocols available on the enterprise website (<https://www.agreetrust.org/>). The evaluators assigned scores from 1 (strongly disagree) to 7 (strongly agree) over each of the 23 key items listed in Table 1. Scaled domain scores were then calculated for each domain as

$$\text{Scaled domain score} = \frac{\text{Obtained score} - \text{Minimum possible score}}{\text{Maximum possible score} - \text{Minimum possible score}} \times 100\%$$

Obtained score refers to the sum total of assigned scores from the appraisers across all domain-specific items, minimum possible score refers to the sum total across all domain-specific items assuming scores of 1 (i.e. 12 for domain 1: scope and purpose) and maximum possible score refers to the sum total across all domain-specific items assuming scores of 7 (i.e. 84 for domain 1: scope and purpose). Scaled domain scores were also averaged (with associated standard deviations) across every domain and CPG using Microsoft Excel (Version 16.79.1; Microsoft Corporation). 5][26][27] CPGs were subsequently designated as high, moderate or low quality if average scaled domain scores were ≥60% for ≥5 domains, 3-4 domains and ≤2 domains, respectively.
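The scoring and designation rules described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' workflow (which used Microsoft Excel). The function names `scaled_domain_score` and `quality_designation`, and all ratings shown, are hypothetical and are not study data.

```python
def scaled_domain_score(ratings):
    """Return the AGREE II scaled domain score (in percent) for one domain.

    `ratings` holds one list per appraiser, each containing that appraiser's
    1-7 scores for every item in the domain.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    obtained = sum(sum(appraiser) for appraiser in ratings)
    minimum = 1 * n_items * n_appraisers  # every item scored 1
    maximum = 7 * n_items * n_appraisers  # every item scored 7
    return (obtained - minimum) / (maximum - minimum) * 100.0


def quality_designation(domain_scores, threshold=60.0):
    """Classify a guideline by how many domains reach the threshold."""
    n_passing = sum(1 for score in domain_scores if score >= threshold)
    if n_passing >= 5:
        return "high"
    if n_passing >= 3:
        return "moderate"
    return "low"


# Hypothetical ratings for domain 1 (three items, four appraisers).
domain_1 = [[7, 6, 7], [6, 6, 7], [7, 7, 7], [5, 6, 6]]
print(round(scaled_domain_score(domain_1), 1))  # 90.3

# Hypothetical profile: four perfect domains and two below the threshold.
print(quality_designation([100, 45, 100, 100, 50, 100]))  # moderate
```

With four appraisers and three items, the minimum and maximum possible scores are 12 and 84, matching the values given for domain 1.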

Interrater reliability assessment
Reliability between reviewers was objectively assessed using intraclass correlation coefficients (ICCs). Specifically, ICCs were quantified across each of the six domains with associated 95% confidence intervals (CIs) using a two-way random-effects model. All calculations were performed using the 'psych' R package (Version 2.3.9; Revelle 2023) in RStudio (Version 2023.6.1.524; RStudio Team). Per previous literature, ICCs denoted excellent, good, moderate and poor consistency between reviewers if >0.90, between 0.75 and 0.90, between 0.50 and 0.75, and <0.50, respectively. 28 The null hypothesis was defined by an ICC of 0.00.
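For readers who want to see what a two-way random-effects ICC involves, the sketch below computes the single-rater and average-rater forms from ANOVA mean squares in plain Python. It is an illustration only: the study used the 'psych' R package, the text does not state which of the two forms was reported, and confidence intervals are omitted here. The function names, the toy ratings matrix and the handling of values exactly on a category boundary are assumptions.

```python
def icc_two_way_random(scores):
    """Return (single-rater ICC, average-rater ICC) for a two-way random-effects model.

    `scores` is a list of rows, one per rated target, with one column per rater.
    """
    n = len(scores)     # number of rated targets
    k = len(scores[0])  # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(scores[i][j] for i in range(n)) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    average = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return single, average


def interpret_icc(icc):
    """Map an ICC to the consistency categories used in the text."""
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Toy data (not study data): six rated targets scored by four raters.
toy = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
single, average = icc_two_way_random(toy)
print(round(single, 2), round(average, 2))  # 0.29 0.62
print(interpret_icc(average))  # moderate
```

The average-rater form is always at least as large as the single-rater form when the ICC is positive, so it matters which one is compared against the thresholds.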

Included clinical practice guidelines
][12][13][14][15][16][17][18] Their general properties are shown in Table 2. Notably, two editions of guidelines from the International Stereotactic Radiosurgery Society (ISRS) released in 2017 and 2023 were both included given clear differences in focus. All CPGs were created in the past six years, utilized their evidence base in a similar manner, and originated from a variety of countries. 1][12][13][14][15][16][17][18] The primary focus of each CPG was variable, ranging from general diagnosis and treatment of VS to tumour-specific target volume delineation guidance.

Quality appraisals
The results of the AGREE II appraisal are shown in Table 3. The Congress of Neurological Surgeons/American Association of Neurological Surgeons (CNS/AANS) and European Association of Neuro-Oncology (EANO) guidelines had the highest overall mean scaled domain scores at 83.0% and 80.6%, respectively. Most other CPGs generally performed well with overall mean scores above the aforementioned 60% threshold, but the French Society for Radiation Oncology (SFRO) guideline had the lowest value at 39.8%. The SFRO guideline was the only CPG determined to be of low quality. The EANO guideline received the only high-quality designation, with the seven remaining CPGs all being of moderate quality. Regarding domains, the clarity of presentation domain had the highest mean score of 96.0% and the lowest standard deviation (SD) of 5.7%. The stakeholder involvement and applicability domains had the lowest means of 49.2% and 47.2%, respectively. The remaining three domains all performed generally well with means above 60%; however, the SD varied greatly between them. The highest guideline-specific scaled domain score was 100% and was present in nine different areas, four of which were held by the CNS/AANS guideline alone. The lowest guideline-specific scaled domain score was 4.2% for the editorial independence domain of the SFRO guideline.

Intraclass correlation coefficients
The calculated ICCs for each domain are presented in Table 4. The stakeholder involvement, rigour of

Domain 1: Scope and purpose
1. The overall objective(s) of the guideline is (are) specifically described
2. The health question(s) covered by the guideline is (are) specifically described
3. The population (i.e. patients, public) to whom the guideline is meant to apply is specifically described
Domain 2: Stakeholder involvement
4. The guideline development group includes individuals from all relevant professional groups
5. The views and preferences of the target population (i.e. patients, public) have been sought
6. The target users of the guideline are clearly defined
Domain 3: Rigour of development
7. Systematic methods were used to search for evidence
8. The criteria for selecting the evidence are clearly described
9. The strengths and limitations of the body of evidence are clearly described
10. The methods for formulating the recommendations are clearly described
11. The health benefits, side effects and risks have been considered in formulating the recommendations
12. There is an explicit link between the recommendations and the supporting evidence
13. The guideline has been externally reviewed by experts before its publication
14. A procedure for updating the guideline is provided
Domain 4: Clarity of presentation
15. The recommendations are specific and unambiguous
16. The different options for management of the condition or health issue are clearly presented
17. Key recommendations are easily identifiable
Domain 5: Applicability
18. The guideline describes facilitators and barriers to its application
19. The guideline provides advice and/or tools on how the recommendations can be put into practice
20. The potential resource implications of applying the recommendations have been considered
21. The guideline presents monitoring and/or auditing criteria
Domain 6: Editorial independence
22. The views of the funding body have not influenced the content of the guideline
23. Competing interests of guideline development group members have been recorded and addressed
AGREE II, Appraisal of Guidelines for Research and Evaluation.

Rigour and reliability of protocol
Prior to discussion of findings and their associated implications, it is important to evaluate the basis on which they were found. Beginning with the search, the use of the globally recognized PRISMA protocol ensures that the algorithm (Fig. 1) used herein is thorough and easily reproducible. 29 This comprehensiveness is further demonstrated by the breadth of professional organizations, countries and topics represented among the selected CPGs (Table 2). Similarly, given that the AGREE II tool is the most widely accepted CPG quality appraisal instrument and that the authors closely followed its ideal implementation recommendations (i.e. incorporating the preferred number of four experienced raters), the CPG assessment was also rigorous. 30 Finally, implementing ICC calculation allowed for objective quantification of consistency between reviewers. Having either good or excellent interrater reliability across every domain (Table 4) demonstrates the presented AGREE II appraisals to be objectively reliable.

Similarities and differences in guideline recommendations
Several common themes and key differences between CPG recommendations emerged during the review process. ]15,17,18 Similarly, CPGs uniformly recommended use of thin-slice MRI for treatment planning and 12-13 Gy for stereotactic radiosurgery (SRS) alongside no strong preference between single and fractionated doses or radiosurgical modality (i.e. 1][12][13][14][15][16][17][18] Discrepancies arose when discussing large VS (>3 cm). The ISRS 2023 guideline uniquely recommended single-dose upfront SRS for large VS in ideal candidates while all other CPGs recommended upfront surgery followed by SRS. 12,13,16,17 Analogously, guidance on interval and duration of radiographic follow-up ranged widely. ,18

Areas of strength across CPGs
CPGs scoring highest in domains of clarity of presentation and scope and purpose showcase that current guidelines clearly delineate recommendations, indicate intended patient populations and define covered topics. Clarity of presentation is a valuable CPG strength given the variability of radiosurgical and radiotherapeutic management techniques for VSs. ][12][13][14][15][16][17][18] Similarly, a clear illustration of each guideline's scope and purpose is paramount for multidisciplinary teams that help manage VSs. Radiation oncologists may benefit most from and best communicate information from guidelines focused on target volume delineation, whereas neurosurgeons may be best suited to synthesize and share content from CPGs aimed at combining surgical resection with radiotherapy. 13,14 This subsequently allows tumour board members to each function at their highest level of expertise and efficiency, so CPG developers would do well to continue upholding this high degree of delineation in future iterations.
High scores in rigour of development and editorial independence domains illustrate that CPG creators follow evidence-based practices in forming recommendations, critically evaluate available data and remain unbiased by external influence. Rigour of development is arguably the most critical strength for CPGs to have given their potential to influence clinical practice for VSs. For example, the benefits of treating small, asymptomatic vestibular schwannomas with radiation must be carefully weighed against the risks of hearing loss and cranial nerve dysfunction. 31 Evidence must be systematically collected and objectively assessed before making society-wide recommendations to an entire target population. It is promising that most CPG creators adhered to rigorous development, and it is of the utmost importance that they continue to do so. Editorial independence goes together with this concept since it also ensures data-driven medicine. When developers declare that funding bodies (or the lack thereof) and potential conflicts of interest did not impact recommendations, readers can be assured that optimal patient care was prioritized above all else. This level of transparency should consequently remain standard practice among CPGs.

Areas of weakness across CPGs
The applicability domain having the lowest mean score (alongside generally low scores for it across every CPG) shows a stark lack of current guidance regarding implementation. This is evidenced by all CPGs missing valuable discussion on cost analysis. Radiosurgery and radiotherapy are inherently expensive treatment modalities (i.e. due to equipment and the requirement of multiple specialized healthcare professionals) that are not available at all medical centres. 32 4][35] Furthermore, even if such obstacles did not exist, CPGs still lacked discourse on auditing adherence to guideline recommendations and long-term follow-up for patients. 12][13][14][15][16]18 Therefore, incorporating health economists onto development panels is a valid next step in improving CPGs for the management of VSs with radiosurgery and radiotherapy. 36,37
Closer analysis of the low scores in the stakeholder involvement domain reveals another area of improvement for CPGs. Reviewer-level data showed uniformly low scores in the inclusion of patient perspectives as the primary reason behind the decreased overall mean. This suggests current CPGs have little to no representation of their target population during the development process. Lack of patient insight is a principal flaw as treatment decisions in several guidelines directly rely on patient-reported symptoms like tinnitus. 11,12,15,17,18 The exact reasoning behind this lack of patient representation is unknown but is likely influenced by the already labour-intensive process required to create guidelines, the costliness of systematically gathering public perspectives and the limited patient population with VSs. 3,38,39 Nonetheless, future CPGs should consider conducting focus groups with patients or administering surveys to ameliorate this issue. 40
Finally, despite overall strengths in editorial independence and rigour of development domains, it is important to acknowledge factors that led both to have the highest associated SDs. The phenomenon explaining that of the editorial independence domain is visible in Table 3 itself. By directly comparing the scaled domain scores across the different guidelines, it is clear that some CPGs scored perfectly or near-perfectly on their constituent items (CNS/AANS, EANO, ISRS 2023, European Association of Neurosurgical Societies (EANS)) while others were mixed (Alberta Health Services (AHS), ISRS 2017, European Society for Therapeutic Radiology and Oncology/Advisory Committee for Radiation Oncology Practice (ESTRO/ACROP), Brazilian Society of Otology (SBO)) or nearly had the minimum possible score (SFRO). Guideline creators should therefore mimic the practices of CNS/AANS, EANO, EANS and ISRS 2023 to showcase their own CPGs as similarly transparent and free from ulterior motives. Close inspection of appraiser-level data for the rigour of development domain illuminated low scores in outside review and update procedures for all guidelines excluding the perfect scores of AHS and CNS/AANS. This implies that most CPGs had limited to no engagement with external peer review or listed protocols for updating guidelines. Both deficiencies warrant attention, and future CPG authors should consider mirroring the practices of AHS and CNS/AANS to ensure evidence-based, up-to-date guidelines.

Recommendations for CPG improvement
The overall quality of CPGs regarding the management of VSs with radiosurgery and radiotherapy is acceptable but with clear room for improvement. Only the EANO guideline received a high-quality designation; however, the CNS/AANS guideline had the highest overall score with the highest possible value of 100% in four of the six domains. Therefore, although it received a moderate quality rating on account of lower scores in domains of stakeholder involvement and applicability, these deficiencies were likely more reflective of the current status quo than of the CNS/AANS guideline's individual quality. We therefore recommend using the EANO guideline as a developmental framework for future CPGs with the CNS/AANS guideline being a valid alternative. The remaining moderate and low-quality guidelines would benefit from shoring up weaknesses as follows. First, acknowledging facilitators and barriers to radiosurgical and radiotherapeutic care for VSs via the assistance of health economists may increase applicability. Second, systematically garnering public perspectives could improve stakeholder representation. Third, clearly denoting the lack of influence brought on by funding sources and/or competing interests could continue to strengthen the air of editorial independence surrounding current CPGs. Fourth, seeking opinions of outside experts and providing guideline update procedures could bolster already strong rigour of development.

Limitations
Important limitations should be noted when interpreting the results of this study. Although the search strategy and inclusion criteria were specifically designed to be broad, the exclusion of non-English articles inevitably introduces some amount of selection bias. The AGREE II instrument itself also has several inherent shortcomings. Most importantly, the tool is only designed to evaluate the developmental strength of a guideline rather than the validity of presented recommendations. 20 CPGs designated as low quality should therefore not be discounted altogether and may still provide some valuable pearls of clinical guidance. Reviewer ratings also rely on a Likert scale, which is an intrinsically subjective methodology. 41 This bias was likely mitigated by the good to excellent ICCs calculated for each domain but still exists to some degree. Furthermore, instrument scoring places equal weighting on each of the 23 items despite domains having widely variable numbers of constituent items (Table 1). Small fluctuations between appraiser ratings subsequently have unduly large impacts on domains like editorial independence with two items versus those like rigour of development with eight items. This directly ties into limitations in overall quality designations, as evidenced by the CNS/AANS guideline in this study itself. For example, if its minor deficiencies were more evenly distributed, the inherent design of the AGREE II scaled domain scoring system would have favoured it and allowed it to meet the predetermined 60% threshold for all six domains. Instead, its localized weaknesses in domains of stakeholder involvement and applicability not only blocked this possibility but also prevented it from receiving a high rating altogether.
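The sensitivity difference between small and large domains follows directly from the scaled domain score formula. The short sketch below works through the arithmetic, assuming the four appraisers used in this study. The function name `shift_per_rating_point` is hypothetical.

```python
def shift_per_rating_point(n_items, n_appraisers=4):
    """Percentage-point change in a scaled domain score when one appraiser
    changes one item rating by a single point."""
    # Maximum possible score minus minimum possible score for the domain.
    score_range = (7 - 1) * n_items * n_appraisers
    return 100.0 / score_range


for name, n_items in [("editorial independence", 2), ("rigour of development", 8)]:
    print(f"{name}: {shift_per_rating_point(n_items):.2f} percentage points")
# editorial independence: 2.08 percentage points
# rigour of development: 0.52 percentage points
```

Under these assumptions, a one-point disagreement moves the two-item domain four times as far as the eight-item domain.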
In conclusion, current CPGs regarding the management of VSs with radiosurgery and radiotherapy are of acceptable quality overall but would greatly benefit from concrete improvements in applicability, stakeholder involvement, editorial independence and rigour of development. We recommend CPG authors reference the high-quality EANO guideline as a developmental framework. The CNS/AANS guideline can also serve as a valid alternative. Moreover, CPG creators should consider involving health economists, gathering opinions of target patient populations, transparently denoting no bias from funding sources and competing interests, and establishing protocols for updates and external reviews during development. This way, the rare yet complex care involved in treating VSs with radiation can be improved for patients globally.

Fig. 1. Clinical practice guideline identification algorithm. Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) criteria-based search strategy used to identify relevant clinical practice guidelines.

Table 1 .
AGREE II instrument composition of six domains and 23 key items Domain 1: Scope and purpose 1.

Table 2. Properties of identified clinical practice guidelines on management of vestibular schwannomas with radiosurgery and radiotherapy

Table 3. Scaled domain scores with associated means, standard deviations and guideline quality appraisals
AHS, Alberta Health Services; CNS/AANS, Congress of Neurological Surgeons/American Association of Neurological Surgeons; EANO, European Association of Neuro-Oncology; EANS, European Association of Neurosurgical Societies; ESTRO/ACROP, European Society for Therapeutic Radiology and Oncology/Advisory Committee for Radiation Oncology Practice; ISRS, International Stereotactic Radiosurgery Society; SBO, Brazilian Society of Otology; SD, standard deviation; SFRO, French Society for Radiation Oncology.

Table 4. Intraclass correlation coefficients with associated confidence intervals for each AGREE II domain

© 2024 The Authors. Journal of Medical Imaging and Radiation Oncology published by John Wiley & Sons Australia, Ltd on behalf of Royal Australian and New Zealand College of Radiologists.