The effectiveness of voice therapy on voice‐related handicap: A network meta‐analysis

Treatment approaches for voice therapy are diverse, yet their differential effects are not well understood. Evaluations of treatment effects across approaches are important for clinical guidance and evidence‐based practice.


| INTRODUCTION
Voice problems lasting longer than 1 week affect one in 13 adults annually and present a substantial burden and impact on quality of life. 1,2 Current prevalence rates for dysphonia diagnosis based on insurance claims are approximately 1.7% of the population. 2 Overall prevalence rates might be much higher, because many patients with voice problems do not seek medical care. 2 The treatment effectiveness of voice therapy is typically measured by several outcome variables such as pre- to post-treatment changes in endoscopic, auditory-perceptual, acoustic and aerodynamic assessments. Various voice diagnostic protocols also recommend the patient's self-assessment of the perceived handicap as a standard dimension of the assessment battery. [3][4][5][6][7][8] The Voice Handicap Index (VHI) is an instrument designed to assess self-perceived voice-related handicap using a 30-item questionnaire (VHI-30). 9 It consists of three subscales with statements relating to physical, functional and emotional domains. Each domain includes 10 statements evaluated on a five-point Likert scale.
Among the systematic reviews of voice therapy effectiveness, [35][36][37][38][39][40][41][42][43][44][45] only three research groups performed a meta-analysis on different treatment approaches (ie Vocal Function Exercises, Laryngeal Manual Therapy, and indirect and combined voice therapy). 36,42,43 Several barriers hinder a meta-analysis of voice therapy effectiveness. First, the outcome measurements used to evaluate a treatment effect are inconsistent across studies. Second, many studies used small-scale uncontrolled observational designs that included only small samples or specific populations. Third, there are differences in timing, frequency, and intensity of treatment or home practice. However, a network meta-analysis (NMA) can be used to compare multiple treatments for a given medical or healthcare condition in one analysis.
The NMA includes both direct comparisons of interventions within randomised controlled trials and indirect comparisons across trials based on a common comparator. 49 This statistical approach might be useful for meta-analyses of voice therapy effectiveness because different voice treatments could be compared with each other if the same outcome measure is reported in the same way across studies.
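The idea of an indirect comparison through a common comparator can be illustrated with a minimal sketch. It assumes two independent sets of trials, one comparing treatment A with a control C and one comparing treatment B with C, and derives an A versus B estimate from them (an adjusted indirect comparison). The function name and all numbers are hypothetical and are not taken from the included studies; a full NMA fits the whole network jointly rather than one pair at a time.

```python
import math

def indirect_comparison(md_ac, se_ac, md_bc, se_bc, z=1.96):
    """Indirect estimate of A vs B from A-vs-C and B-vs-C mean differences.

    Assumes the two inputs come from independent sets of trials.
    """
    md_ab = md_ac - md_bc                        # difference of the two direct effects
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)   # variances add for independent estimates
    ci = (md_ab - z * se_ab, md_ab + z * se_ab)  # approximate 95% confidence interval
    return md_ab, se_ab, ci

# Hypothetical inputs (illustration only, not data from this review):
md_ab, se_ab, ci_ab = indirect_comparison(md_ac=-15.0, se_ac=3.0, md_bc=-9.0, se_bc=4.0)
print(f"A vs B: MD = {md_ab:.1f}, SE = {se_ab:.1f}, "
      f"95% CI = ({ci_ab[0]:.1f}, {ci_ab[1]:.1f})")
```

Note that the indirect estimate is less precise than either direct estimate, because the two variances add.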
The aim of the present study was to perform an NMA to estimate treatment effectiveness and establish a ranking of different voice therapy approaches for dysphonia based on randomised controlled/clinical trials (RCTs). While the use of objective outcome parameters from acoustic and aerodynamic voice quantification and laryngeal imaging would be desirable, it is often limited by the availability and comparability of software and hardware, data acquisition, and processing methodology, which also affects the comparability of studies. Therefore, the VHI was chosen as the outcome measure for this NMA because it is one of the most commonly used measures in voice therapy research due to its simplicity and ease of use without instruments or secondary assessments by others.

| METHODS
We followed the Preferred Reporting Items for Systematic Reviews [...]
To identify the studies to be included, the following search terms were used: randomised controlled trial, randomised clinical trial, randomised sham-controlled trial, voice therapy and voice handicap index.
Keypoints
• We compared the effectiveness of various voice treatments using the Voice Handicap Index 30.
• To our knowledge, this is the first network meta-analysis on the treatment effectiveness of voice therapy for dysphonia.
• Five of nine voice treatments resulted in a significant improvement of VHI-30 scores.
• Stretch-and-Flow Phonation was identified by our network meta-analysis as the most effective intervention.

Published studies were included if they evaluated the effectiveness of treatments that targeted non-organic or organic voice disorders in adults and adolescents 16 years or older with a pre-post treatment design. The effectiveness of treatments was evaluated using the total score of the VHI-30. A difference in the total VHI-30 score has been interpreted as clinically meaningful when it is at least 13 points. This score corresponds to an average cut-off score from various test-retest studies of VHI total scores ranging from 8 to 18. 10,[26][27][28] Included studies reported at minimum the number of subjects per group, the mean results between pre- and post-treatment outcomes, and either standard deviation (SD) values of the differences of change or p-values of the pre-post outcomes. Finally, reports in English or German were considered.
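When a study reports only a p-value for the pre-post change, an SD of the change can be back-calculated. The sketch below shows one common way to do this and is not necessarily the exact procedure used here. It assumes a two-sided p-value from a paired test and uses a normal approximation to the test statistic; for small samples a t-distribution with n − 1 degrees of freedom would be more accurate. The function name and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def sd_from_p_value(mean_diff, p_value, n):
    """Approximate SD of pre-post change scores from a two-sided p-value.

    Assumption: the p-value comes from a paired test of the mean change,
    approximated here by a standard normal test statistic.
    """
    if not 0.0 < p_value < 1.0:
        raise ValueError("p_value must lie strictly between 0 and 1")
    if n < 2:
        raise ValueError("n must be at least 2")
    if mean_diff == 0:
        raise ValueError("mean_diff must be non-zero")
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)  # test statistic implied by p
    se = abs(mean_diff) / z                        # standard error of the mean change
    return se * math.sqrt(n)                       # SD = SE * sqrt(n)

# Hypothetical example: mean VHI-30 change of -10 points, p = 0.05, 25 subjects.
sd_change = sd_from_p_value(mean_diff=-10.0, p_value=0.05, n=25)
print(f"Approximate SD of change: {sd_change:.2f}")
```

Because only the magnitude of the mean change enters the calculation, the sign convention for improvement does not affect the recovered SD.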
We excluded studies in which any of the subjects had been diagnosed with a neurological motor speech disorder (eg Parkinson's disease) or that involved subjects who were vocally healthy (eg for prevention) or singers without voice disorders. Furthermore, we did not include studies that used medical or pharmacological treatments in participant groups. Additionally, studies were excluded if adequate descriptions of the voice therapy approaches were not provided or if the voice therapy protocol of the groups did not follow a single primary approach. Finally, studies that incorporated instrumentation in the application of voice therapy (eg voice amplification) were excluded as well.

| Risk of bias assessment
To assess risk of bias of the included studies, the RoB 2 tool was used. 51 The following domains were evaluated to conclude an overall risk of bias (ie low, some concerns or high): randomisation process, deviations from intended interventions, missing outcome data, measurement of the outcome and selection of the reported result.

| Statistical analysis
Where possible, mean pre-post differences (MD) of VHI-30 total scores and their SDs for each treatment arm in each study were extracted directly from the publications. Where SDs were missing, they were calculated from the p-values of the pre-post MD. For the random-effects NMA, the netmeta package for the open statistical programming environment R was used. 52,53 Results were presented as MD between pre-post differences with 95% confidence intervals (CI). Furthermore, we ranked interventions using the P-score, a ranking metric that can be considered a frequentist analogue of the surface under the cumulative ranking curve (SUCRA) from the Bayesian framework and that requires no resampling methods. 54 It is a simple analytical method based on the frequentist point estimates and their standard errors. P-scores rank treatments on a scale from 0 (worst) to 1 (best) and reflect both the size and the uncertainty of the effects. The metric can be interpreted as the mean degree of confidence that a treatment is better than the competing treatments.
Figure 1 shows the details of exclusion and inclusion of studies using a flow chart. Table 1
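The P-score ranking described in the statistical analysis can be sketched in a few lines. For each treatment, the score is the average, over all competitors, of the one-sided normal probability that the treatment is better. The sketch makes two simplifying assumptions that are labelled in the code: the treatment estimates are treated as independent, so the standard error of a pairwise difference is computed from the individual standard errors (netmeta instead takes it from the fitted network), and a smaller (more negative) MD is taken to mean a larger VHI-30 improvement. Treatment labels and numbers are hypothetical.

```python
import math
from statistics import NormalDist

def p_scores(estimates, std_errors, smaller_is_better=True):
    """P-score per treatment from point estimates and standard errors.

    Simplification: estimates are assumed independent, so the SE of each
    pairwise difference is sqrt(se_i^2 + se_j^2).
    """
    names = list(estimates)
    if len(names) < 2:
        raise ValueError("at least two treatments are required")
    phi = NormalDist().cdf
    scores = {}
    for i in names:
        total = 0.0
        for j in names:
            if j == i:
                continue
            diff = estimates[i] - estimates[j]
            if smaller_is_better:
                diff = -diff  # a more negative estimate counts as better
            se_diff = math.sqrt(std_errors[i] ** 2 + std_errors[j] ** 2)
            total += phi(diff / se_diff)  # certainty that i beats j
        scores[i] = total / (len(names) - 1)
    return scores

# Hypothetical MDs versus a shared reference (illustration only):
estimates = {"A": -20.0, "B": -12.0, "C": -5.0}
std_errors = {"A": 4.0, "B": 3.0, "C": 5.0}
scores = p_scores(estimates, std_errors)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: P-score = {score:.3f}")
```

A useful sanity check is that P-scores always average to 0.5 across treatments, because each pairwise certainty and its complement both enter the sum.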

| Control group versus treatment groups
The results of the NMA showed considerable heterogeneity (I² = 42.9%) and are presented as a forest plot in Figure 3.
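For orientation, the I² statistic expresses the share of total variability that is attributable to heterogeneity rather than chance, and it is commonly derived from Cochran's Q and its degrees of freedom. The following sketch uses that standard definition; the Q value and degrees of freedom in the example are hypothetical and are not the values underlying the result reported above.

```python
def i_squared(q, df):
    """I-squared (in percent) from Cochran's Q and its degrees of freedom."""
    if df < 0:
        raise ValueError("df must be non-negative")
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical values (illustration only):
example_i2 = i_squared(q=20.0, df=12)
print(f"I-squared = {example_i2:.1f}%")
```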

| DISCUSSION
To the best of our knowledge, this is the first NMA to assess the treatment effectiveness of different voice therapy approaches in subjects with dysphonia based on RCT study designs. The included studies had a low risk of bias. We used the VHI-30 as the primary outcome variable, as it is a frequently used voice assessment tool and measures the impact of treatment from the perspective of the patient. Stretch-and-Flow Phonation (SFP) showed significant (ie P-score and 95% CI results) and clinically relevant (ie VHI-30 difference score > 13 points) treatment outcomes, which placed it as the superior treatment approach among all those compared. The RV, CVRP and VFE approaches also demonstrated statistically significant improvements in VHI-30 scores (ie MD score ≥ 13 points and a confidence interval that did not cross the null line); of these, VFE was the most frequently evaluated voice therapy method. Interestingly, three of these four approaches share a common framework in that they each have a hierarchical structure with a physiological concept.
For example, the SFP approach trains airflow and laryngeal control with increasing complexity, moving from voiceless airflow, through exaggerated airflow during connected speech with elongated vowels, to connected speech with normal articulation and vowel production and the perception of an easy, effortless airflow. The primary perceptual target of SFP at each level of the hierarchy is airflow movement throughout the vocal tract. It has been applied to individuals with both non-organic and organic voice disorders. 55,56
The RV approach has been investigated in individuals with hyperfunctional (muscle tension) or phonotraumatic voice disorders. 39 Like SFP, the RV approach moves across a framework from low-complexity to high-complexity (eg continuous speech) stimuli. RV is based on easy voice production with vibratory sensations in the facial bones, reflecting a relatively high-intensity glottal source spectrum that yields an easily heard loudness and intense oral air pressure variations that produce the vibratory sensations. 57
In the VFE approach, the perceptual targets include low-intensity phonation with a resonant production. The treatment stimuli consist of four core exercises, which are produced in multiple repetitions and sets for a specified number of weeks. Over time, the number of sets and specific exercises can be tapered to the needs of the individual. A previous meta-analysis of VFE 43 showed results comparable to our NMA, with mean VHI improvements of −11 and −13.00, respectively. That meta-analysis included four studies to calculate the effect size. 43 Both meta-analyses showed that treatment effectiveness as measured by VHI can be expected in patients with either non-organic or organic voice disorders.
CVRP is an eclectic voice therapy approach applied over six weeks. This approach was mainly developed for behavioural dysphonia. It is a model of care, which relates to a consolidation of experiences from the Brazilian Larynx Institute

| Caveats, limitations and future directions
Limitations of the present meta-analysis concern the generalisability of its results, but also provide a direction for future research.
The following section is divided into two categories (ie repre- [...]
Second, multidimensional assessments are needed for the diagnosis and measurement of treatment outcomes in voice disorders. Therefore, a consistent standardisation of diagnostic voice assessments or compliance with already established standards is important for the performance of both voice therapies and high-quality comparative studies on the outcome of different voice therapies. 36,38,43 Over the last two decades, attempts were undertaken to attain consensus and recommendations for voice measurements of laryngeal imaging (eg stroboscopy), 3 and self-evaluation (eg VHI). 7
Third, several voice therapy approaches that were analysed in previous systematic reviews on voice therapy were not included in the present meta-analysis. [35][36][37][38][39][40][41][42][43][44][45] [...] those studies had to be excluded for the present meta-analysis.

| CONCLUSION
The above-stated limitations notwithstanding, the current NMA

ACKNOWLEDGEMENT
The authors thank Dr Gerta Rücker (Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center-University of Freiburg, Germany) for her support in network meta-analysis statistics.