Over‐the‐counter bite splints: A randomized controlled trial of compliance and efficacy

Abstract Background Occlusal splints are often used to curb the impacts of sleep bruxism (SB) on the dentition, and over‐the‐counter (OTC) options are becoming increasingly popular. OTC splints are usually fabricated at home by patients, but not routinely evaluated by dental professionals. It is unclear how OTC splints compare with more traditional splints that receive dental oversight. Objectives The present randomized controlled study tested how an OTC splint compared with a gold standard bite splint in terms of patient compliance (primary outcome) and efficacy (secondary outcomes). Methods Sixty‐seven subjects were randomly assigned to receive either the OTC (SOVA, N = 35) splint or the gold standard “Michigan” bite splint (MI, N = 32), with 61 completing the study (SOVA, N = 30; MI, N = 31). OTC‐splint subjects were required to fabricate their splints to clinically acceptable standards. Both groups wore the splints nightly for four months. Compliance was measured via daily diary. Efficacy outcomes evaluated stability, retention, periodontal health, night‐time rhythmic masticatory muscle activity (RMMA), and material wear. Results OTC‐splint subjects had difficulty fabricating splints to clinically acceptable standards. The number of night‐time RMMA bursts was significantly greater for the OTC splint group. Compliance and all other efficacy measurements were not significantly different between groups. Conclusions The results support the potential use of OTC splints for curbing the impacts of SB. However, the results strongly suggest that dentists should be actively engaged in overseeing patients' use of self‐fabricated appliances. This clinical trial is registered at ClinicalTrials.gov, Identifier number NCT02340663.


| INTRODUCTION
Bite splints or occlusal appliances are often used to reduce tooth wear caused by sleep bruxism (SB), clenching and grinding (Cunha-Cruz, Pashova, Packard, Zhou, & Hilton, 2010). They serve an important purpose, as tooth wear is a prevalent condition, which increases with age (Cunha-Cruz et al., 2010). Terminating splint use can cause both temporomandibular disorder (TMD) and sleep bruxism (SB) symptom exacerbation (Rehm et al., 2012).
Three commonly recognized appliances exist (Maeda, Kumamoto, Yagi, & Ikebe, 2009): (a) an over-the-counter (OTC), non-adjustable type, (b) an OTC, intra-orally-formed or "boil-and-bite" type, and (c) custom appliances. OTC devices are typically used without dental supervision, whereas the custom appliances involve dental intervention. Studies suggest that custom hard acrylic or boil-and-bite splints have nearly equal efficacy (Klasser, Greene, & Lavigne, 2010); however, evidence suggests that hard acrylic splints are superior to soft or repositioning appliances in managing TMD pain (Fricton et al., 2010).
Custom splints are usually expensive, and replacements are not typically covered by insurance. This is problematic, as splints can wear out with time (Korioth, Bohlig, & Anderson, 1998), despite long-term splint use being necessary in bruxers (Rehm et al., 2012). OTC splints are becoming more popular, as they are inexpensive and convenient.
However, few studies have evaluated OTC appliance efficacy and compliance. This is important, as it is unclear whether private practitioners evaluate OTC appliances routinely with their patients. This study's objective was to compare a specific custom appliance, the "Michigan bite splint," to an OTC boil-and-bite appliance for compliance and measures of efficacy. By necessity, this was a short-term trial; however, we attempted to evaluate many issues in that time interval. This is an FDA-registered RCT, publicly visible at: (https://clinicaltrials.gov/ct2/show/NCT02340663?cond=bruxism&draw=4&rank=38).

| Participants
This randomized controlled study was approved by the University of Michigan Medical IRB (HUM00085489). Figure 1a shows recruitment and retention numbers. Subjects were recruited via announcements posted throughout the University of Michigan School of Dentistry.

FIGURE 1 (a) Summary of participant numbers at selected time points during study. See Table 2 for demographic details. (b) Study sequence. Subjects meeting selection criteria were involved in four appointments. Time periods over which each appointment occurred are indicated. Primary outcome (Diary-based compliance, underlined) was evaluated after app 4. Secondary outcomes (italicized) were evaluated during specific appointments, 2-4, as indicated. app, appointment; IC, informed consent; DMFS, decayed missing and filled surfaces; Perio, evaluation of gingiva and plaque indices; HST, home sleep testing (see text)
Screening occurred in the principal investigator's (PI, author GEG) laboratory between July 21, 2015 and September 2, 2016 and involved 103 candidates, 36 of whom did not meet inclusion-exclusion criteria, decided not to participate, or did not return calls. Subject losses occurred after randomized allocation (three subjects) or between Week 1 and the study's conclusion (three subjects). Thus, 30 SOVA and 31 MI subjects completed the study. This was considered sufficient, based upon a power analysis using SB data in Huynh, Manzini, Rompré, and Lavigne (2007). Using mean bruxing episodes per hour with an occlusal splint = 3.97 versus with a palatal splint = 4.45, an SD = 0.63, an α = .05, and a power (1 − β) = .8, we concluded a study would be sufficiently powered with 28 subjects per group. Inclusion criteria were: (a) ≥18 years old, (b) clinical signs of dental wear, (c) self-report of nocturnal tooth grinding noises, (d) self-report of a bruxing diagnosis by a family dentist, (e) absence of dental and medical conditions, including periodontal disease, cardiovascular disease, sleep apnea, movement, neurologic and sleep disorders, (f) full dentition sans third molars, (g) no active orthodontics nor removable prostheses, (h) no medications known to have movement disorder or sleep disturbance side effects, (i) ability to follow instructions, (j) no jaw function limitations, and (k) ability to report to the clinical laboratory at appointed times. Presence of TM joint noises or myalgia was permitted; however, joint arthritides were not.
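The sample size reported above can be checked with a short calculation. The sketch below is illustrative only: it assumes a two-sided, two-sample comparison of means under the normal approximation, which may differ from the software or formula actually used for the study's power analysis. The function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(mean_a, mean_b, sd, alpha=0.05, power=0.80):
    """Subjects per group for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * (z_(1-alpha/2) + z_power)^2 / d^2,
    where d is the standardized difference between the two means.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    effect = abs(mean_a - mean_b) / sd
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect ** 2)


# Values quoted in the text: 3.97 vs 4.45 episodes per hour, SD = 0.63.
print(n_per_group(3.97, 4.45, 0.63))  # 28
```

Under these assumptions the approximation gives about 27.04, which rounds up to the 28 subjects per group stated in the text.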
Consented subjects (App 1, Figure 1b) were randomly assigned to either an OTC "boil-and-bite" appliance group (SOVA, Akervall Technologies, Saline, MI; Figure 2a), or an acrylic occlusal appliance group, Michigan (MI) appliance (Figure 2b), using a stratified randomization procedure (Suresh, 2011), run in Excel (version 2010) with custom algorithms created by the PI. Groups were matched for gender and presence/absence of TMD signs/symptoms using the Diagnostic Criteria for TMD (TMD-RDC; Dworkin & LeResche, 1992), with the TMD-RDC performed by the PI or one other clinician, both of whom had extensive calibrated training in its use. Other instruments used included the Jaw Function Limitation Scale (JFLS; Ohrbach, Larsson, & List, 2008), TMD Pain Screener, Measure of Symptoms Sleep Scale (MOS; Spritzer & Hays, n.d.), Perceived Stress Scale (PSS; Cohen, Kamarck, & Mermelstein, 1983), and Oral Behaviors Checklist (OBC; Markiewicz, Ohrbach, & McCall, 2006). Baseline intra- and extra-oral exams performed by the PI evaluated dental and medical health. Alginate impressions were taken and poured in dental stone. The stone models were used to fabricate the MI appliance or to assess SOVA appliance fabrication (below). Many of the methods described below appear in summary form in Table 1.
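For readers unfamiliar with stratified randomization, the following is a minimal sketch of one common variant: permuted blocks of two within each stratum defined by gender and TMD presence. It is not the Excel procedure used in the study, whose details are not given here; the block design, the function name `stratified_assign`, and the subject tuples are all assumptions made for illustration.

```python
import random
from collections import defaultdict

ARMS = ("SOVA", "MI")


def stratified_assign(subjects, seed=0):
    """Assign subjects to arms within strata of (gender, has_tmd).

    subjects: iterable of (subject_id, gender, has_tmd) tuples.
    Returns a dict mapping subject_id to "SOVA" or "MI".
    """
    rng = random.Random(seed)

    # Group subject IDs by stratum, preserving enrollment order.
    strata = defaultdict(list)
    for subject_id, gender, has_tmd in subjects:
        strata[(gender, has_tmd)].append(subject_id)

    assignment = {}
    for ids in strata.values():
        # Within each stratum, allocate in shuffled blocks of two so that
        # arm sizes never differ by more than one subject.
        for start in range(0, len(ids), 2):
            block = list(ARMS)
            rng.shuffle(block)
            for subject_id, arm in zip(ids[start:start + 2], block):
                assignment[subject_id] = arm
    return assignment


# Hypothetical enrollment list (not study data).
example = [(1, "F", True), (2, "F", True), (3, "M", False), (4, "M", False)]
print(stratified_assign(example, seed=42))
```

Blocking within strata keeps the two arms balanced on the stratification factors even when enrollment stops early, which is the property the matching described above relies on.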

| Bruxism grading
We used the international consensus criteria for grading bruxism in subjects. Question 1 of the OBC was used to define self-reported bruxism. Bruxing signs were clinically evaluated, and confirmed on mandibular stereolithography (stl) models (Figure 3a; True Definition Scanner, 3M, St. Paul, MN). Scores of no wear (0), wear into enamel (1), and wear into dentin (2) were assigned to each tooth, based upon severest detected wear on each tooth, and subjects were scored using the median value for the mandible. Scoring and grading were done by an investigator blinded to group assignments.
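The subject-level wear score described above is a median over per-tooth scores. A minimal sketch follows; the function name and the example scores are hypothetical, and the 14-tooth mandible simply reflects the inclusion criterion of a full dentition without third molars.

```python
from statistics import median

VALID_SCORES = (0, 1, 2)  # 0 = no wear, 1 = wear into enamel, 2 = wear into dentin


def mandibular_wear_score(tooth_scores):
    """Median of the per-tooth wear scores for one subject's mandible."""
    if any(score not in VALID_SCORES for score in tooth_scores):
        raise ValueError("tooth scores must be 0, 1, or 2")
    return median(tooth_scores)


# Hypothetical subject: seven teeth worn into enamel, seven into dentin.
print(mandibular_wear_score([1] * 7 + [2] * 7))  # 1.5
```

Because the mandible has an even number of scored teeth, the median can fall halfway between two categories, which is how a subject-level score of 1.5 can arise.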

| Splint delivery
SOVA subjects fabricated SOVA devices in the clinic during Appointment 2 (Figure 1b). The MI appliance was fabricated in a professional dental laboratory routinely used by the UM School of Dentistry for fabrication of bite splints. The MI appliance was fabricated by the lab between Appointments 1 and 2 (Figure 1b). At delivery, MI splints were adjusted for fit and to establish bilateral occlusal contacts.

| Compliance (primary outcome)
Daily diaries were filled out at home over the 4-month study period.
Instructions stressed that the diaries be filled out daily. Subjects were instructed to account accurately for nights the splints were not worn.
Compliance was defined by the total number of nights of splint wear and the percent of total nights worn. [...] at Week 1 to determine whether subjects were correctly reporting compliance and to re-educate as necessary. The null hypothesis was that no significant differences in number of nights of splint wear nor in percent of nights worn would exist between groups.

SOVA splint fabrication error categories: (1) water bath temperature not correct; (2) splint not thoroughly warmed; (3) incisal bite not on anterior bar pad of splint blank; (4) lack of snugness against palate (>1 mm gap); (5) lack of snugness against facial/buccal tooth surfaces (<1 mm); (6) lack of sufficient material coverage on facials of anterior teeth, that is, flange is short of gum line; (7) material folded over on itself; (8) marks from lower dentition excessive; (9) material overstretched, that is, major axis > twice length of minor axis of perforations; (10) maximum intercuspation not even in clench; (11) splint falls off cast when inverted or shaken; (12) material orientation issues, viz., asymmetry/rotation, translated laterally or anteroposteriorly on occlusal surfaces; (13) distal of posterior-most tooth not adequately covered; (14) posterior flange of splint extends onto soft tissues. Note: Because error categories were determined a priori, not all error categories were actually observed in the sample. Moreover, no additional error categories were observed or added post hoc.

| Surveys
SOVA subjects completed a survey covering ease of fabrication.
At study's end, all subjects completed a survey on user satisfaction.
Standard surveys used at Appointments 1 and 4 to assess changes over time included the Oral Health Impact Profile (OHIP; Slade & Spencer, 1994), Tampa Scale for Kinesiophobia for TMD (TSK; Visscher et al., 2010), and TMD Pain Screener (PS; Gonzalez, Schiffman, et al., 2011). No hypothesis was formalized for ease of fabrication; however, we anticipated that SOVA subjects would report that splints were easy to fabricate. Null hypotheses for other surveys were: (a) User satisfaction with splints will not be significantly different between subject groups. (b) There will be no significant between-group differences in responses on the three standard surveys at baseline nor at study's end, nor will there be differences in the survey responses through time.

| Efficacy (secondary outcomes)
Efficacy was defined by stability, retention, periodontal health, and estimated RMMA (Table 1). [...] Each task was performed five times in a row (trial, Figure 4a), and two such trials were performed, resulting in ten observations per task. A >10-s rest period occurred between tasks. The null hypothesis was that there would be no significant between-group differences in splint displacements (measured in mm in three-dimensional space) caused by the tasks.
Stability was monitored using a magnetometer motion analysis system and 1.8 micro sensors attached to the chin and splint (Figure 4b, Liberty, Polhemus, Colchester, VT). If the splint was dislodged during tasks, the splint sensor recorded this in three dimensions. The maximum change in this distance per task was used to assess stability.
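The stability measure can be illustrated with a short sketch. The text does not specify exactly which distance was tracked, so the code below makes an assumption: it takes the Euclidean distance between the chin and splint sensors at each sample and reports the largest deviation from the first sample of the task window. The function name and the coordinates are hypothetical.

```python
import math


def max_displacement(chin_xyz, splint_xyz):
    """Largest change (mm) in chin-to-splint sensor distance within one task.

    chin_xyz, splint_xyz: equal-length sequences of (x, y, z) positions in mm.
    Assumption: change is measured relative to the first sample of the window.
    """
    distances = [math.dist(c, s) for c, s in zip(chin_xyz, splint_xyz)]
    baseline = distances[0]
    return max(abs(d - baseline) for d in distances)


# Hypothetical three-sample window (not study data).
chin = [(0.0, 0.0, 0.0)] * 3
splint = [(0.0, 0.0, 50.0), (0.0, 0.0, 50.5), (3.0, 4.0, 50.0)]
print(round(max_displacement(chin, splint), 3))  # 0.5
```

Referencing the splint sensor to the chin sensor, rather than to the room, removes whole-head and jaw movement from the measure, so that only relative movement between the two sensors contributes.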
Subjects were asked to bite hard during tasks, and bite force estimates were made to confirm this. Prior to performing tasks, subjects bit on a custom bite plate (Figure 4c). Subjects bit with mild, medium, and maximum forces, twice with the splint in and twice with the splint out. The bite plate thickness was standardized to 20 mm. A 10-s rest period occurred between bites. An example of one such trial is shown in Figure 4d. The filtered right masseter EMG bursts were used to extract root mean square (RMS), the mean and median power frequencies, and peak amplitude, expressed as a percent of the maximum peak amplitude (LabChart 8, AD Instruments, Colorado Springs, CO). A step-wise linear regression, with a procedure to eliminate variables expressing high collinearity, was used to create an equation to estimate bite forces. The first "splint-in" bite force trial was used as the training set; the remaining three trials served as test sets.
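Three of the EMG burst features named above (RMS, mean power frequency, and median power frequency) can be sketched in a few lines. The study extracted these in LabChart 8; the code below is not that implementation. It uses a direct discrete Fourier transform, which is adequate only for short bursts, and all function names and the synthetic signal are assumptions made for illustration.

```python
import cmath
import math


def rms(signal):
    """Root mean square amplitude of one EMG burst."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def power_spectrum(signal, fs):
    """One-sided power spectrum from a direct DFT; fs is the sampling rate in Hz."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def mean_power_frequency(freqs, power):
    """Power-weighted average frequency."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)


def median_power_frequency(freqs, power):
    """Lowest frequency at which cumulative power reaches half the total."""
    half = sum(power) / 2
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    return freqs[-1]


# Synthetic burst (not study data): a 50 Hz sine of amplitude 2 sampled at 1 kHz.
fs = 1000
burst = [2.0 * math.sin(2 * math.pi * 50 * i / fs) for i in range(200)]
freqs, power = power_spectrum(burst, fs)
print(round(rms(burst), 3), round(mean_power_frequency(freqs, power), 3),
      median_power_frequency(freqs, power))
```

For a pure sine wave both frequency measures coincide with the sine's frequency; in real EMG the spectrum is broad and the two can differ. In practice a fast Fourier transform from a numerical library would replace the direct DFT.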
Test-set results (Figure 4e) were evaluated for precision and accuracy with Lin's concordance coefficient, $\rho_c$ (Lin, 1989).

The null hypothesis was that no significant between-group differences would exist in number of reported splint reseatings following tasks at either Appointment 3 or 4.
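Lin's concordance coefficient, used above to compare measured bite forces with EMG-based estimates, combines precision (correlation) and accuracy (closeness to the identity line) in one number. The sketch below follows the definition in Lin (1989), using variances and covariance with divisor n; the function name and example values are hypothetical.

```python
from statistics import fmean


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient between paired measurements.

    rho_c = 2 * s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2),
    with variances and covariance computed using divisor n.
    """
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical paired values in kg (not study data): estimates biased by +1 kg.
measured = [1, 2, 3, 4, 5]
estimated = [2, 3, 4, 5, 6]
print(round(lin_ccc(measured, estimated), 3))  # 0.8
```

In this example the Pearson correlation is exactly 1, yet the concordance is 0.8, because the constant offset moves the points away from the line of identity. This is why the coefficient is suited to checking estimates against a reference measurement.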
FIGURE 4 (a) [...] Splint data were sampled during the time windows corresponding to each task identified by chin movements. Arrows identify time periods used to construct data for the "Rest" category (see text and Table 4). Letters L, R, A over the Grind trial indicate left, right, and anterior grinding, respectively. Border refers to a task where subjects swept the jaw out to the facial border of the splint by moving the jaw first left laterally, then anteriorly, then right laterally and back to a rest position. (b) Close-up of a microsensor used for tracking chin and splint movements; outputs from two such sensors resulted in the time series shown in (a). (c) Picture of the bite plate used to sample bite force data. The shown perforated thermoplastic was used to provide contact with first molar regions bilaterally, with and without bite splints in place. With thermoplastic in place, vertical dimension was 20 mm and was not adjusted for trials with and without the splints in place. (d) Example of a bite force trial. Upper trace is right masseter EMG; lower trace is output from the force transducer. Rectangles partition trials into four replicates, each of which involved mild, moderate and high bite forces in sequence. The four replicates were: splint in, splint out (tooth trial), splint out, splint in, in that order for all subjects. (e) Scatter plot of actual bite force (abscissa) against EMG-based estimate of bite force (ordinate). Both axes are in kg. Plotted data are from the final three trials, one with splints in place and two trials with splints removed, with each trial including mild, moderate and high bite forces as shown in (d). (f and g) Mean (SD) bite force estimates by group and task. Ordinate in both plots is estimated bite force (kg). Horizontal bars indicate pairwise comparisons that were statistically significant, Bonferroni-corrected at the p < .05 level. Note that left and right lateral components of the Grind task were pooled to create a Lateral Grind category whereas the anterior component of Grind is Protrusive Grind (see also Table 4, which reports splint displacements in mm during the tasks). The lateral and protrusive components are separated in order to report results for "roll" and "pitch" degree-of-freedom dislodgements independently

Tissue health was assessed using the Rustogi modification of the Navy plaque index (PI; Rustogi et al., 1992) [...] group to minimize scoring biases. Periodontal data were taken at Appointments 2, 3 and 4 (Figure 1b). PI and MGI scores were calculated separately for upper and lower arches and also for the buccal and lingual of the upper and lower arches. The null hypothesis was that there would be no significant between-group differences in PI or MGI, nor would there be time-dependent effects.
Finally, RMMA was estimated (BioRadio Recording Unit, Great Lakes NeuroTechnologies, Valley View, OH) in a home sleep study performed at Appointment 4 (Figure 1b and Table 1).
EMG electrodes ([...], Newbury Park, CA) were placed bilaterally on the masseter and thyroideus muscles (Figure 6a,b). A ground electrode was placed on the mastoid process opposite the subject's sleeping side preference (Figure 6a). Body movements and positions were also sampled.
Audio was used to assist with identifying bruxing events and to identify other nocturnal noises, for example, coughing, talking, and so on (Figure 6c). The monitor was worn on an upper arm of the subject's choosing.
Data were recorded at 1 kHz, filtered (EMG, 20-500 Hz bandpass; body movements, 20 Hz low pass), and evaluated in 10-s epochs by investigators trained and calibrated to analyze recordings using published criteria (Carra, Huynh, & Lavigne, 2012). Scorers were blinded to subject group assignment, and they received equal numbers of subjects from each group to reduce rater bias.
Gold standard evaluation of SB uses polysomnography (PSG) and audio-video capture; however, home monitoring is an acceptable alternative (Ahlberg, Savolainen, Paju, et al., 2008), despite moderate false positive rates (Carra, Huynh, & Lavigne, 2015). Also, RMMA is not necessarily SB, and without EEG and EOG leads, recording time is not differentiable from sleep time.

[...] cheek and reverse while prone. Each task was repeated five times alone, then in conjunction with bruxing-like RMMA, below.
Bruxing-like RMMA: clench for 3 s (tonic clench); clench rhythmically five times at 1 Hz (phasic clench); grind rhythmically five times at 1 Hz to the right and to the left (phasic grind). Each RMMA was repeated five times without and then with the body movement artifacts. Investigators were trained to avoid movements other than those called for by each task.

[...] (www.slicer.org; Fedorov et al., 2012) and registered (CMFreg extension of 3D Slicer). Two regions of interest (ROI) were registered, one around first molar-splint contacts, and one around canine-splint contacts (Figure 3b, rectangles). The left side was analyzed unless clearer signs of wear occurred on the right side.
FIGURE 6 Top row shows instructions from the laboratory manual used for calibrated placement of masseter (a) and thyroideus (b) EMG electrodes. Also shown is the audio monitor (c) used to monitor room and subject sounds, see text. Bottom row (d) shows an example of 15 s from a subject's home sleep study. Top trace is right masseter, middle trace is right thyroideus and bottom three traces are body movements. Note the sequence of three masseter bursts, representing an RMMA sequence. Note that this sequence was associated with thyroideus activity and body movements

The Model-to-Model Distance extension in 3D Slicer was used to calculate the signed closest point distances between the aligned surfaces of the baseline and 4-month-old stl models. The average distance (in mm) was calculated at two specific sites within each ROI, one site where tooth-splint contact occurred (contact sites) and an adjacent site where no tooth-splint contact could occur (control sites; Figure 3b). Means for control and test sites were obtained using areas with five-voxel radii (Pick 'n Paint extension in 3D Slicer).
Mean distances between splint surfaces were calculated (Mesh Statistics extension in 3D Slicer). Estimated wear, $w$, was calculated as $w = d_{\text{control}} - d_{\text{test}}$, where $d_{\text{test}}$ = mean between-model distance at the tooth contact site, and $d_{\text{control}}$ = mean between-model distance at the control site (Table 1). The null hypothesis was that no differences in splint material wear would exist between subject groups.
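The wear estimate defined above is a difference of two site means, with the control site absorbing any residual registration offset between the two scans. A minimal sketch follows. The function name and the distance values are hypothetical, and the example assumes that material loss at the contact site appears as a negative signed distance; the actual sign convention depends on how the Model-to-Model Distance output was configured.

```python
from statistics import fmean


def estimated_wear(control_distances, test_distances):
    """Estimated wear w = d_control - d_test, in mm.

    control_distances: signed between-model distances sampled at the control site.
    test_distances: signed between-model distances sampled at the contact site.
    """
    d_control = fmean(control_distances)
    d_test = fmean(test_distances)
    return d_control - d_test


# Hypothetical signed distances in mm (not study data).
control = [0.01, -0.01, 0.00]
contact = [-0.05, -0.03, -0.04]
print(round(estimated_wear(control, contact), 3))  # 0.04
```

Subtracting the contact-site mean from the control-site mean means that a uniform misalignment of the two models, which shifts both sites equally, cancels out of the estimate.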

| Statistical tests
Continuous data were evaluated for normality using Q-Q plots, and tests for skewness and kurtosis. If normality assumptions were violated, data were transformed. If normality was not achieved through transformation, we used non-parametric tests. General linear models (GLM) were used for tests of within- and between-group differences.
A repeated measures design was used for data collected multiple times during the study. Pearson's product moment was used on normally distributed data.
For non-normally distributed data and non-continuous data, the [...]

Table 2 shows initial and final enrollment demographics. There were no significant differences between the two groups. No between-group differences existed in self-report of SB (MWU = 417, p = .462).

| RESULTS
All subjects reported current SB noises and showed clinical signs of wear (5 had scores of 1; 5 had scores of 1.5; 51 had scores of 2). No significant between-group differences existed for TMD by category (Table 3). Also, there were no significant between-group differences in overbite, overjet, maximum pain-free opening, maximum voluntary opening, maximum opening with passive stretch, maximum protrusion or maximum left or right laterotrusions (Table 3).
Twenty SOVA subjects (66.7%) fabricated splints without asking for help, eight (26.7%) asked for help once, and two (6.7%) [...]

Bite force estimates during stability tasks are shown in Figure 4f,g.
Horizontal bars show significance at the p < .05 level.
Retention results,

| DISCUSSION
Ideally, participants randomly assigned to groups are well-matched in an RCT. We carefully matched groups based on gender and presence/absence of TMD signs/symptoms. Fortuitously, the groups were also closely matched across a number of other factors, including ethnicity, clinically-observed tooth wear severity, self-reported bruxing nights/week, and data from several surveys including the JFLS-20, OBC, PSS, MOS, OHIP, and TSK for TMD. There were also no significant between-group differences for more detailed TMD findings (Table 3).
Perhaps the most significant finding was the inability of subjects to form the OTC splint according to instructions. Only 4/31 SOVA splints were clinically acceptable. All subjects were ultimately helped to fabricate clinically acceptable SOVA splints, but this would not occur routinely. It is highly likely that virtually any OTC appliances currently in use are being improperly fabricated. We strongly recommend that dental professionals play pro-active, engaged roles with their patients who possess OTC appliances. It is noteworthy that, based on our findings, the SOVA splint is now available only through dentists and not available OTC.
This study was prompted by the fact that insurers often cover one splint per lifetime; however, rarely do splints last a lifetime.
Because severe bruxers are more likely to require replacements, the unfortunate consequence is that patients who need the benefits the most stand to pay the most out-of-pocket. This suggests why OTC devices are becoming increasingly popular.
Clinical studies of obstructive sleep apnea (OSA) appliances emphasize the need to evaluate compliance and efficacy in terms of mean disease alleviation (Vanderveken et al., 2013) or effectiveness (Sutherland, Phillips, & Cistulli, 2015). One treatment may be more efficacious but have poorer compliance than another treatment.
Whatever the case, our results suggest that something importantly different in SB architecture associated with SOVA versus MI splint wear may be occurring. This might be partly due to the softer material and larger occlusal contact areas provided by the SOVA versus MI splint. Further work needs to be done, ideally with PSG, to assess whether SOVA versus MI splints differentially impact SB architecture.
We developed a few novel methods to help evaluate splints, including motion analysis of stability, estimating bite forces during stability tests, and assessment of material wear. Motion analysis has an established record in the oral motor literature (Gerstner, Lafia, & Lin, 2005; Gerstner, Marchi, & Haerian, 1999; Gerstner & Parekh, 1997; Tanaka, Yamada, Maeda, & Ikebe, 2016; Wilson, Luck, Woods, Foegeding, & Morgenstern, 2016), and so we are reasonably confident that our stability results are objective and accurate. Similarly, previous bite force estimates using EMG data have used variables similar to ours (Van Eijden, Brugman, Weijs, & Oosting, 1990). Thus, the stability results are probably reasonable.
A digital method of evaluating splint wear, similar to ours, was developed by Korioth et al. (1998). The wear seen in the previous study was also low. It is possible that the 4-month time period does not provide sufficient time to identify wear, let alone estimate wear rates. Longer-term studies will likely be more revealing.

FIGURE 7 Compliance results plotted as total nights of splint wear (a), and as percent total days splint was in possession of subject (b). Histograms are means with 1 SD error bars for the MI versus SOVA groups
We measured compliance through self-report and found no significant between-group differences. Evidence suggests that self-report is a reasonable compliance measure, even when compared to embedded micro-sensor methods (Vanderveken et al., 2013). Based on patient satisfaction, there did not seem to be any differences between groups, and reasons for not wearing splints were also similar between groups. Thus, the similar scores for both satisfaction and compliance suggest no significant differences in splint preference.
Several study limitations existed. Firstly, the study was not double-blinded. This would have been difficult to do, given the distinct fabrication methods and appearances of each splint (Figure 2).
Obviously splint fabrication assessment could not be blinded either.
However, single blinding was done for virtually all other study aspects, for example, stability, retention, periodontal assessment, EMG analysis, compliance tabulation, and statistical analyses. Also, where multiple investigators were involved, we allotted equal numbers of subjects from each group to each investigator, thereby reducing the impact of investigator biases on results.
Another study limitation was our use of subjects with "probable" as opposed to "definite" diagnoses of SB.
Given that dentists do not routinely obtain sleep studies on patients with SB, dentists usually treat patients with "probable" SB diagnoses anyway, and in this respect our subjects probably represent the "typical" population treated by dentists for SB. We recognize that this limitation may be a reason for the lack of observed splint surface wear. On the other hand, results of the nocturnal EMG study suggest active SB habits in our subjects.
Other study limitations include the small sample sizes and the abbreviated time period over which the study was performed. Many of the effect sizes demonstrated fairly large confidence intervals, which is likely due to the small study size. Future, larger projects may consider cross-over designs, inclusion of PSG, longer-term splint wear, and use of compliance sensors, among other things.
In conclusion, because OTC splints were difficult to construct, we strongly recommend that OTC splints be monitored by dentists. We recognize the need for inexpensive alternatives. Ideally, a large RCT would demonstrate definitively whether an OTC splint would be a legitimate solution. This would potentially provide an important measure of external validity. It is unlikely that such a large RCT will occur in the near future. Hence, a practical approach would be for dentists to be vigilant with OTC splint fabrication and use, since this appears to be a practical alternative for at least the near future.