Sunnybrook facial grading system: Reliability and criteria for grading


  • This work was funded by an NIH-NINDS 1 R34 NS052014-01A2 grant (Neely), and the Students and Teachers as Research Scientists (STARS) summer research program for high school students (Cherian and Dickerson). The authors have no other funding, financial relationships, or conflicts of interest to disclose.



Objectives/Hypothesis:

Clinical research is quantitative and bound to a written protocol, so the need for precision is great, especially when multicenter trials are planned. The Sunnybrook Facial Grading System (SB) is a well-established tool for assessing facial movement outcomes; however, ambiguities do arise in its application. The purpose of this study was to construct specific grading criteria and to test intra-rater and inter-rater reliability before and after the use of these criteria. The hypothesis was that specific criteria improve reliability, even in naïve observers.

Study Design:

Prospective test of hypothesis.


Methods:

Facial video recordings of 30 subjects with facial paralysis were presented in random order to two naïve raters in four trials. In trials 1 and 2 the raters used the SB system in the usual manner; in trials 3 and 4 they used the specific grading criteria for the SB system.


Results:

The SB system was reliable even with naïve raters, with an intraclass correlation coefficient (ICC) of 0.890 between raters; this improved to 0.927 with the use of specific grading criteria. Additionally, variability of the SB composite scores was greatest in the midrange of scores and was seen predominantly during the voluntary movements of brow raising and lip puckering.
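As an illustration of the kind of statistic reported here, the following is a minimal sketch of an inter-rater ICC computed from a subjects-by-raters table of scores. The abstract does not state which ICC form was used; this sketch assumes the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), in the Shrout and Fleiss notation. The function name `icc_2_1` and the toy scores are hypothetical and are not data from the study.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: list of rows, one row per subject, one column per rater.
    Assumed form; the study's actual ICC model is not stated in the abstract.
    """
    n = len(scores)          # number of subjects
    k = len(scores[0])       # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)

    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for a two-way layout with one observation per cell.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Made-up composite scores (0-100) for five subjects rated by two raters.
toy_scores = [[20, 24], [45, 52], [58, 49], [77, 80], [92, 90]]
print(round(icc_2_1(toy_scores), 3))
```

In this form, systematic differences between raters (the between-raters mean square) lower the coefficient, so it measures agreement on the scores themselves and not only consistency of ranking.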


Conclusions:

To our knowledge, this is the first report of specific criteria for completing the SB system. It is also the first in-depth description of where within the system most of the variance occurs. Laryngoscope, 2010