The effectiveness of stabilization appliance therapy among patients with myalgia

Abstract
Background: The efficacy of stabilization appliance therapy for masticatory muscle pain is debated. Therefore, there are currently no clear usage standards. We analyzed patient factors influencing its efficacy and characterized masticatory muscle pain subtypes to determine appropriate therapy candidates.
Methods: This case series study recruited patients diagnosed with local myalgia or myofascial pain and used variables related to temporomandibular disorders in the analysis. We used a temporary appliance to screen patients for sleep bruxism for 2 weeks. Afterwards, we initiated therapy with stabilization appliances. Efficacy was evaluated via tenderness intensity during muscle palpation and the treatment satisfaction score after 2 months of treatment.
Results: We analyzed 62 (91%) patients. Tenderness upon muscle palpation was mitigated in 27 patients. Mitigated tenderness odds ratios were 0.035 for myofascial pain, 0.804 for 15-item Patient Health Questionnaire scores, and 1.915 for facet length. Thirty-nine patients expressed satisfaction; satisfaction odds ratios were 0.855 for 9-item Patient Health Questionnaire scores, 1.606 for facet length, and 4.023 for awake bruxism awareness.
Conclusions: Stabilization appliance therapy is most effective for patients with awake bruxism awareness, local myalgia, long facets, and no psychosocial risk factors.

We analyzed patient factors that might influence the effectiveness of SAT on masticatory muscle pain and aimed to identify the masticatory muscle pain subtypes for which SAT is most appropriate.

| Patients
This case series study comprised 68 patients (14 men and 54 women; mean age: 48.3 ± 14.4 years) out of 71 patients who presented with orofacial pain at the Tokyo Dental College Suidobashi Hospital between March and December 2016 and who were diagnosed with local myalgia or myofascial pain. We did not recruit patients aged <18 years or those who had moderate or severe systemic disease (i.e., American Society of Anesthesiologists physical status class III or above), loss of posterior support, or temporomandibular joint pain.
The exclusion criteria included failure to attend hospital appointments, a deteriorated condition necessitating a treatment switch, and improvements before SAT was started. Written informed consent was obtained from all the participants, and the study protocol was approved by the Ethics Committee of Tokyo Dental College (Ethical Clearance Number 670).

| Assessments
Items that are associated with TMD in the existing literature (Türp & Schindler, 2012) were assessed and used in the analysis of patient factors. During the first examination, we assessed tenderness upon muscle palpation. Tenderness intensity at the most tender point during muscle palpation, used as an indicator of the pain threshold, was evaluated on a visual analog scale (VAS). The number of tender areas (both sides), pain-free mouth-opening range, awareness of awake bruxism, presence of anterior guidance, open bite, muscle fatigue on waking, torus palatinus or mandibularis, and tongue scalloping or lines on the inner surface of the cheeks were also assessed. The muscles were palpated bilaterally at the anterior, middle, and posterior temporalis and at the origin, body, and insertion of the masseter. A palpation pressure of 1 kgf was applied for 2 s; prior to palpation, finger pressure was calibrated using an algometer (an adjustable spring coil with a small pin that touches the examiner's hand when the correct pressure is achieved) to standardize the pressure.
To differentiate the types of myalgia, the duration of the pressure was increased to 5 s. Myofascial pain was diagnosed if spreading pain or referred pain was present, and local myalgia was diagnosed if neither was present. Diagnoses were made by two doctors with over 5 years of clinical experience in orofacial pain who had been trained by the Japanese Orofacial Pain Society. The other patient factors investigated comprised sleep duration, snoring or apnea, smoking, daily alcohol consumption, daily caffeine consumption, duration of computer use, and scores on three self-administered questionnaires. These questionnaires were the 9-item Patient Health Questionnaire (PHQ-9), 15-item Patient Health Questionnaire (PHQ-15), and 7-item Generalized Anxiety Disorder (GAD-7) scale. These assessments screen for depression, somatization, and anxiety, respectively.

| Temporary screening appliances
Temporary screening appliances were used to screen for facets formed during sleep. Facets formed during sleep were observed on the surface of the temporary screening appliances, and the length of each facet was measured (Figure 1). The temporary screening appliances were made from autopolymerizing resin (Facet Resin, GC Corporation, Tokyo, Japan) for nighttime use. This resin was selected because it combines sufficient strength with the appropriate degree of readability for the formation of facets. The appliance's position was adjusted by tapping it until it contacted all teeth equally with canine guidance. After the adjustment was complete, an appliance marker (Facet Resin Marker, GC Corporation) was applied to enable easy observation of the appliance's surface texture. After 2 weeks, the surface texture was assessed, and the lengths of the facets formed by the mandibular canines were measured. However, it should be noted that the effectiveness of temporary screening appliances has not been reported previously, and the detection of bruxism by this method is not certain.
FIGURE 1 Temporary screening appliances. Facet lengths formed by the mandibular canines

| Stabilization appliances
After the screening appliance assessment, SAT was initiated. For each patient, an appliance for nighttime use was made to cover the entire maxillary dentition, and the stabilization appliance's position was adjusted in the same manner as the screening appliance's position (de Leeuw & Klasser, 2013). The adjustment of the stabilization appliance was performed by two dentists. The participants were asked to attend fortnightly appointments at the hospital for stabilization appliance adjustments.

| Potential differences between patients with local myalgia and myofascial pain
Using the data obtained, we investigated the background characteristics of patients with local myalgia and those with myofascial pain and compared them statistically.

| Evaluating the efficacy of SAT
The efficacy of SAT was evaluated in terms of the VAS score indicating the intensity of tenderness during muscle palpation and the treatment satisfaction score 2 months after the start of treatment. A VAS score that was ≥30% lower after treatment than the score before treatment was considered to indicate improvement, and any other score was regarded as a lack of improvement. Treatment satisfaction was self-assessed on a five-point scale as (1) "greatly worsened", (2) "worsened", (3) "no change", (4) "improved", or (5) "greatly improved". A score of 1, 2, or 3 indicated dissatisfaction, whereas a score of 4 or 5 signified satisfaction.
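The two outcome definitions above can be expressed as simple rules. The following is a minimal Python sketch, assuming the five satisfaction responses are scored 1 to 5 in the order listed; the function names and the example patients are hypothetical and are not study data.

```python
def vas_improved(vas_before, vas_after, threshold=0.30):
    """Return True if the VAS score fell by at least `threshold` (a fraction) from baseline."""
    if vas_before <= 0:
        raise ValueError("baseline VAS must be positive to compute a relative change")
    relative_reduction = (vas_before - vas_after) / vas_before
    return relative_reduction >= threshold


def is_satisfied(score):
    """Return True for a satisfaction score of 4 or 5, False for 1, 2, or 3."""
    if score not in (1, 2, 3, 4, 5):
        raise ValueError("satisfaction score must be an integer from 1 to 5")
    return score >= 4


# Hypothetical patients for illustration only (not taken from the study).
patients = [
    {"id": "A", "vas_before": 60, "vas_after": 40, "satisfaction": 4},
    {"id": "B", "vas_before": 60, "vas_after": 50, "satisfaction": 3},
]

for patient in patients:
    improved = vas_improved(patient["vas_before"], patient["vas_after"])
    satisfied = is_satisfied(patient["satisfaction"])
    print(patient["id"], "improved" if improved else "not improved",
          "satisfied" if satisfied else "dissatisfied")
```

Note that the two outcomes are classified independently, so a patient can be satisfied without meeting the 30% criterion, and vice versa.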

| Statistical analysis
SPSS version 24 (IBM, Armonk, NY, USA) was used for statistical analysis, and p < .05 was considered statistically significant.

| Potential differences for the local myalgia patients and the myofascial pain patients
We compared multiple baseline patient factors between patients with local myalgia and those with myofascial pain. The Shapiro-Wilk test was used to test for normality, and the two groups were compared using the Mann-Whitney U-test, Student's t test, and χ² tests. Logistic regression analysis was performed on all items to obtain odds ratios.
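As a reading aid for the odds ratios obtained from logistic regression, the sketch below shows how an odds ratio relates to a regression coefficient and how it scales over several units of a predictor. It assumes, which the text does not state, that each reported odds ratio refers to a one-unit increase in the predictor and that the log-odds change linearly. The function names are hypothetical; the values 0.804 (PHQ-15) and 1.915 (facet length) are the odds ratios for improvement reported in this study.

```python
import math


def coefficient_from_odds_ratio(odds_ratio):
    """Logistic regression coefficient (log-odds per unit) implied by an odds ratio."""
    return math.log(odds_ratio)


def odds_ratio_for_change(odds_ratio_per_unit, units):
    """Odds ratio for a change of `units`, assuming a constant log-odds slope."""
    return odds_ratio_per_unit ** units


# Reported odds ratios for improvement; the per-unit interpretation is an assumption.
phq15_or = 0.804
facet_or = 1.915

print(round(coefficient_from_odds_ratio(phq15_or), 3))  # negative: lower odds of improvement
print(round(odds_ratio_for_change(phq15_or, 5), 3))     # five PHQ-15 points higher
print(round(odds_ratio_for_change(facet_or, 2), 3))     # two units of facet length longer
```

An odds ratio below 1 corresponds to a negative coefficient (lower odds of the outcome as the predictor rises), and an odds ratio above 1 to a positive one.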

| Between-group comparisons of baseline patient factors
We compared patients who exhibited improvements to those who did not and those who expressed satisfaction to those who did not in terms of multiple baseline patient factors. χ² tests were used to compare the following variables between the two pairs of groups: sex; presence of myofascial pain, anterior guidance, open bite, awareness of awake bruxism, muscle fatigue on waking, torus palatinus or mandibularis, tongue scalloping or lines on the inside of the cheeks, and snoring or apnea; smoking; daily alcohol consumption; and daily caffeine consumption. The Mann-Whitney U-test was used to compare the pretreatment baseline VAS score, number of tender areas, as well as PHQ-9, PHQ-15, and GAD-7 scores. The Student's t test was used to compare the baseline sleep duration, duration of computer and smartphone use, pain-free mouth-opening range, and facet length in the screening appliance's canine region. These variables were further analyzed with logistic regression analysis to identify predictors associated with improvements in VAS scores and those associated with satisfaction.

| Before and after stabilization appliance therapy
For all participants, changes in the number of tender areas, VAS score (intensity of tenderness during muscle palpation), and pain-free mouth-opening range across the initial examination and 1 and 2 months after starting SAT were analyzed using the Friedman test.
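To illustrate what the Friedman test computes for three repeated measurements, here is a minimal Python sketch using only the standard library. It is not the SPSS procedure used in the study: it omits the tie correction, relies on the chi-square approximation with 2 degrees of freedom (for which the p value has the closed form exp(-Q/2)), and the VAS values are hypothetical. The function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_three_timepoints(rows):
    """Friedman statistic Q and approximate p value for rows of 3 repeated measures."""
    k = 3
    n = len(rows)
    rank_sums = [0.0] * k
    for row in rows:
        if len(row) != k:
            raise ValueError("each row must hold exactly three measurements")
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    p_value = math.exp(-q / 2.0)  # chi-square survival function for 2 degrees of freedom
    return q, p_value


# Hypothetical VAS scores at baseline, 1 month, and 2 months (not study data).
vas_scores = [
    [60, 50, 30],
    [70, 55, 40],
    [50, 45, 20],
    [80, 60, 55],
]
q, p_value = friedman_three_timepoints(vas_scores)
print(q, round(p_value, 4))
```

With only four hypothetical patients the chi-square approximation is rough; the example is meant to show the within-patient ranking step, not to reproduce the reported p values.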

| Between-group comparisons of patient factors assessed before and after treatment
The Mann-Whitney U-test was used to compare the number of tender areas 2 months after the start of treatment (between both pairs of groups), VAS score at 2 months (between the satisfied and dissatisfied patients), and treatment satisfaction score at 2 months (between the patients who improved and those who did not). Finally, the Student's t test was used to compare the pain-free mouth opening range at 2 months (between both pairs of groups).

| RESULTS
First, it should be noted that this study involved a large number of statistical tests without a coordinated analysis plan, so some findings may reflect Type I errors. The statistical power is also low because of the small sample size.
Of the 68 patients who consented to participate in the study, 3 were excluded because they failed to attend hospital appointments, 2 because their conditions deteriorated so markedly that they were switched to different treatments, and 1 because of an improved condition before SAT was initiated. Thus, the analysis was based on 62 patients (mean age: 48.3 ± 15.2 years). Of them, 12 were men (mean age: 50.5 ± 17.1 years) and 50 were women (mean age: 47.8 ± 14.7 years). Overall, there were significant improvements in the VAS score (p < .001). There was no significant change in the number of tender areas (p = .051) or the pain-free mouth-opening range (p = .183; Table 1).

| Potential differences between patients with local myalgia and myofascial pain
There was a difference in the number of tender areas (p = .029), but there was no difference in other items (Table 2). The odds ratio by logistic regression analysis was 1.295 (Table 3).

| Improvement versus lack of improvement
Improvement was evident in 27 patients (10 men and 17 women; mean age: 51.0 ± 15.1 years) but not in the remaining 35 (2 men and 33 women; mean age: 46.3 ± 15.2 years).

| Between-group comparisons of baseline patient factors
The between-group comparisons, in terms of improvement versus lack of improvement, are shown in Table 4. The improvement rate was significantly higher for men than for women (p = .003). Compared with those who did not experience improvements, those who did were significantly less likely to have myofascial pain (p = .001), had significantly lower PHQ-15 scores (p < .001), and had significantly longer facets (p = .006). Logistic regression analysis of these variables showed that the odds ratios for improvement were 0.035 for myofascial pain, 0.804 for PHQ-15 scores, and 1.915 for facet length.
Sex was not associated with a significant odds ratio (Table 5).
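The comparison of improvement rates by sex can be illustrated with the counts reported above (10 men and 17 women improved; 2 men and 33 women did not). The following Python sketch computes an uncorrected Pearson χ² statistic for a 2 × 2 table using only the standard library. Whether the original analysis applied a continuity correction or an exact test is not stated, so the resulting p value is not expected to match the reported one exactly. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square statistic and p value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival function, 1 degree of freedom
    return chi2, p_value


# Rows: improved, not improved. Columns: men, women (counts from the text).
chi2, p_value = chi_square_2x2(10, 17, 2, 33)
print(round(chi2, 2), p_value < 0.05)

# Improvement rate within each sex.
print(round(10 / 12, 2), round(17 / 50, 2))
```

The statistic is approximately 9.58, and the improvement rates are 10 of 12 men and 17 of 50 women. Because the group of men is small, the association with sex should be read cautiously, which is consistent with sex not reaching a significant odds ratio in the logistic regression.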

| Between-group comparisons of patient factors assessed before and after treatment
Although there was no significant difference in the number of tender areas on initial examination between patients who improved and those who did not (p = .322), the number was significantly lower among the patients who had improved 2 months after the start of treatment (p = .001). There was also no significant difference in the pain-free mouth-opening range on initial examination (p = .062), but at 2 months, the range was significantly greater among the patients who had improved (p = .019). Satisfaction levels were also significantly higher among the patients who had improved (p = .004).

| Between-group comparisons of baseline patient factors
The between-group comparisons, in terms of satisfied versus dissatisfied patients, are shown in Table 4. Compared with the dissatisfied patients, the satisfied patients were significantly more likely to be aware of awake bruxism (p = .030), had significantly lower PHQ-9 scores (p = .041), and had significantly longer facets (p < .001). Logistic regression analysis of these factors showed that the odds ratios for satisfaction were 4.023 for awareness of awake bruxism, 0.855 for PHQ-9 scores, and 1.606 for facet length.
TABLE 1 Before and after SAT

| Between-group comparisons of patient factors assessed before and after treatment
Although there was no significant difference in VAS scores on initial examination (p = .53), VAS scores were significantly lower among satisfied patients than among dissatisfied patients 2 months after the start of treatment (p = .016). There was also no significant difference in the number of tender areas on initial examination (p = .542), but at 2 months, the number among satisfied patients was significantly lower (p = .016). There was no significant difference in the pain-free mouth-opening range at the initial examination (p = .529) or at 2 months (p = .546).

| DISCUSSION
TMD is broadly divided into muscle pain and joint pain, with myalgia, tendonitis, myositis, and spasms categorized as muscle pain.
Under Diagnostic Criteria for TMDs classifications, myalgia is further classified into local myalgia, myofascial pain with spreading pain, and myofascial pain with referred pain. In this study, we divided myalgia into local myalgia and myofascial pain with spreading or referred pain.
TABLE 2 Potential differences between patients with local myalgia and myofascial pain

| Awareness of awake bruxism
Self-reported awake and sleep bruxism are both risk factors for TMD (Huhtela et al., 2016). Awake bruxism includes both tooth clenching and light tooth contact. Habitual tooth contact is evident in 52.4% of patients with TMD and is associated with a 1.944-fold increased probability of continuance or worsening of TMD pain (Sato et al., 2006). Self-reported tooth contact or tooth clenching is also a risk factor for facial pain (Glaros & Williams, 2012). In this study, 54.8% of the participants were aware of having awake bruxism. Our finding that awareness of awake bruxism was significantly more common among satisfied patients than among dissatisfied patients was unexpected, given that the stabilization appliances were only worn at night.
Because many of the highly satisfied patients were aware of having awake bruxism and showed long facets on the screening appliances, the highly satisfied patients might have had both awake and sleep bruxism. Reissmann et al. (2017) reported that patients who were aware of having both awake and sleep bruxism were at an increased risk of developing painful TMD. If the stabilization appliance positively influenced sleep bruxism, perhaps by relieving muscle fatigue and physiological stress caused by grinding, patients may have been highly satisfied due to a reduced risk of painful TMD. Because we did not investigate biological responses during sleep, further research is needed to test this hypothesis. Polysomnography is the standard technique for analyzing sleep bruxism, but it is expensive, requires specialist expertise for analysis and diagnosis, and poses difficulties for patients due to the long period of restraint required and the altered sleep environment. Therefore, we used screening appliances in this study. Although screening appliances do not provide extensive information, they suffice for sleep bruxism screening and, compared with polysomnography, are less burdensome for patients. As the number and timing of events associated with sleep bruxism vary from day to day, another advantage of using screening appliances is that the results reflect the entire period of use.

| Types of sleep bruxism
We found that the patients with improved VAS scores and those who were satisfied had significantly longer facets than those with no improvement and those who were dissatisfied, respectively. Screening 2009) found that muscle activity was most intense during grinding.
This suggests that decreasing the force imposed during grinding relieves muscle fatigue, which might have influenced the VAS improvements and treatment satisfaction scores that we observed.
Furthermore, myalgia is believed to be caused by impaired blood flow due to excessive muscle use and sympathetic reflexes (de Leeuw & Klasser, 2013). Stabilization appliances may reduce the muscle stress and psychological stress associated with bruxism and thereby improve blood flow and health outcomes; however, confirming this requires research that includes heart rate variability analysis, near-infrared spectroscopy, electromyography, and accelerometer use.

| Type of muscle pain
Patients who had significantly improved VAS scores showed more local myalgia than myofascial pain. Myofascial pain is characterized by deep or spreading pain and is associated with factors such as impaired peripheral blood flow (i.e., hypoxia), the pain-inducing action of growth factors, and hypersensitivity of the sympathetic nervous system that can increase sensitivity to palpation or tenderness to pressure, sometimes with referred pain (Maekawa, Clark, & Kuboki, 2002). In other words, central sensitization or peripheral sensitization may be involved. The difference in the number of tender areas between local myalgia and myofascial pain in this study also suggests that sensitization is involved.
TMD is categorized as a type of functional somatic syndrome, alongside fibromyalgia and somatic symptom disorder (Henningsen, Zipfel, & Herzog, 2007), and the diagnostic criteria for myofascial pain and fibromyalgia have many similarities (Wolfe et al., 1990). If TMD-associated myofascial pain is truly like fibromyalgia, then SAT may be less effective for myofascial pain than for local myalgia, as shown in our results. If local myalgia is caused simply by muscle fatigue, then SAT should be more effective for local myalgia than for myofascial pain. However, the existence of psychosocial risk factors and the involvement of central hyperalgesia, such as that observed in fibromyalgia, must be considered when managing myofascial pain.

| Psychosocial risk factors
Psychosocial factors influence the risk of developing chronic lower back pain and other musculoskeletal disorders (Hasenbring, Hallner, & Klasen, 2001), including chronic TMD (Harper, Schrepf, & Clauw, 2016; Slade et al., 2016). We found that high PHQ-9 scores were significantly associated with dissatisfaction with treatment and that high PHQ-15 scores were associated with VAS scores indicating a lack of improvement.
Depression is thought to be closely associated with pain (Wright et al., 2004). Chronic stress due to factors such as pain causes both depression and hyperalgesia (Rivat et al., 2010), and depression impairs the function of the descending pain modulatory system (Stahl, 2002). Screening for depression is regarded as essential in managing chronic lower back pain (Tsuji, Matsudaira, Sato, & Vietri, 2016), and accumulating evidence indicates an association between musculoskeletal pain and depression. For example, patients with TMD due to muscle pain have higher depression scores than patients with other TMD subtypes, suggesting that depression screenings would facilitate pain management for patients with TMD (Bertoli & de Leeuw, 2016). The PHQ-9 is widely used in Japan as a screening tool, and its use has highlighted the association between pain and depression.
High PHQ-9 scores are significantly associated with higher pain intensities in patients with chronic lower back pain (Vietri, Otsubo, Montgomery, Tsuji, & Harada, 2015), and 22% of individuals who have been injured for ≥90 days have PHQ-9 scores indicative of depression (Zhou & Jia, 2016), signifying that managing pain in patients with high PHQ-9 scores is difficult. Such patients therefore require the early adoption of a multifaceted approach, as satisfaction is unlikely to be obtained with SAT alone.
Studies on psychosocial stress and TMD have shown that pronounced somatic symptoms represent a strong risk factor for developing TMD. The PHQ-15 is also effective for evaluating fibromyalgia severity (Häuser, Brähler, Wolfe, & Henningsen, 2014). Fibromyalgia and TMD are both forms of functional somatic syndrome (Henningsen et al., 2007), and because they have many similarities, the PHQ-15 can also be useful for assessing patients with TMD. Fibromyalgia pain is thought to be neuropathic or central pain rather than nociceptive pain, with pain hypersensitivity as a contributing factor. High degrees of sensitivity to pressure, heat, and pinprick stimulation have also been reported in chronic TMD (Greenspan et al., 2011; Greenspan et al., 2013), and central hyperalgesia is thought to be involved. We found that compared with patients whose VAS scores indicated improvements, those whose VAS scores indicated no improvement (i.e., patients with low pain thresholds) had significantly higher PHQ-15 scores and a higher frequency of myofascial pain. The contribution of somatic symptoms and hyperalgesia must be considered when managing myofascial pain in patients with high PHQ-15 scores because in this study, these patients were less likely to experience improvements compared with patients with myofascial pain who had low PHQ-15 scores. These patients may need an alternative approach to SAT.
The GAD-7 is a valuable anxiety screening tool (Löwe et al., 2008). Although its use may be regarded as essential in managing TMD, we found no significant difference in GAD-7 scores between the patients who improved and those who did not or between the satisfied and dissatisfied patients.
Our results showed that higher PHQ-9 scores were associated with lower likelihoods of satisfaction with SAT, whereas higher PHQ-15 scores were associated with lower likelihoods of experiencing improvements, as reflected in VAS scores. In a study of patients with residual pain following knee surgery, Bierke, Häner, and Petersen (2016) found that scoring ≥10 on both the PHQ-9 and PHQ-15 was associated with significantly greater knee pain and a higher likelihood of being dissatisfied with treatment, which is consistent with our findings. Our findings are also consistent with those of other studies that found that various psychosocial risk factors heightened sensitivity to pain and reduced the probability of patients responding to standard treatments (Ohrbach & Dworkin, 1998). In TMD management, it is important to evaluate psychosocial risk factors during both the initial examination and the course of treatment and to consider approaches other than conventional treatments such as conservative therapy, SAT, and physiotherapy.

| CONCLUSION
Various masticatory muscle pain subtypes exist. Therefore, detailed examinations of factors, including psychosocial factors, are essential for effectively treating patients with masticatory muscle pain.
Our results suggest that patients aware of awake bruxism and with local myalgia who formed long facets on their screening appliances would respond better to SAT than those who have myofascial pain or formed short facets. High PHQ-9 scores indicate a reduced likelihood of satisfaction with SAT, and high PHQ-15 scores indicate a reduced likelihood of benefiting from this treatment in terms of VAS scores for tenderness during muscle palpation. Therefore, SAT may be most effective for patients aware of awake bruxism and with local myalgia who form long facets on the screening appliances and who lack psychosocial risk factors.