The FLEP scale in diagnosing nocturnal frontal lobe epilepsy, NREM and REM parasomnias: Data from a tertiary sleep and epilepsy unit


Address correspondence to Dr. Raffaele Manni, Unit of Sleep Medicine and Epilepsy, Institute of Neurology “C. Mondino Foundation,” Via Mondino 2, 27100 Pavia, Italy. E-mail:


Purpose: To test the usefulness of the FLEP scale in diagnosing nocturnal frontal lobe epilepsy (NFLE), arousal parasomnias, and REM sleep behavior disorder (RBD).

Methods: The FLEP scale was applied to 71 subjects (60 male, 11 female; aged 54 ± 21 years) referred to an outpatient sleep and epilepsy unit for diagnostic assessment of nocturnal motor-behavioral episodes, which turned out to be arousal parasomnias (11 subjects), NFLE (14 subjects), or idiopathic RBD (46 subjects), based on the findings of in-lab full-night video-polysomnography with extended EEG montages.

Results: The sensitivity of the scale as a diagnostic test for NFLE was 71.4%, the specificity 100%, the positive predictive value 100%, and the negative predictive value 91.1%. The FLEP scale gave an incorrect diagnosis in 4/71 (5.6%) of the cases, namely NFLE patients with episodes of nocturnal wandering, and uncertain diagnostic indications in 22/71 subjects (30.9%).

Conclusions: The FLEP scale shows high positive and negative predictive values in diagnosing NFLE versus arousal parasomnias and RBD. However, the scale is associated with a real risk of misdiagnosis in some patients and gives uncertain indications in about one-third of cases, mainly RBD. Our investigation highlights the inadequacy of some of the items in the scale. The item investigating wandering, as presently formulated, may be unable to distinguish nocturnal wandering from sleepwalking. The items about “recall” and “clustering” of the events throughout the night may increase the likelihood of mistaking RBD for seizures. Further testing of the reliability of the FLEP scale items appears to be needed.

Semeiological similarities, together with nonspecific surface electroencephalographic findings, make it difficult to distinguish nocturnal frontal lobe epilepsy (NFLE) from parasomnias. The differential diagnosis of NFLE is considered challenging mainly with respect to arousal parasomnias (sleepwalking, sleep terror, and confusional arousal) (Provini et al., 1999; Zucconi & Ferini-Strambi, 2000; Derry et al., 2006a). However, REM sleep behavior disorder (RBD) episodes may also be misdiagnosed as NFLE (Schenck & Mahowald, 2002; Bazil, 2004). Comorbidity, mainly of NFLE with arousal parasomnias (Bisulli et al., 2005), but also of RBD with epileptic seizures (Manni et al., 2006), may further complicate the clinical picture.

Nocturnal video-polysomnography (V-PSG), which, until proven otherwise, remains the gold standard for achieving a definite diagnosis, is an expensive procedure and is not universally available.

Many attempts have been made to identify distinctive clinical profiles of NFLE versus arousal parasomnias, through careful scrutiny of the reported symptoms (Provini et al., 1999; Derry et al., 2006a) and analysis of ictal video recordings (Vignatelli et al., 2007). However, with the exception of nocturnal V-PSG, no single valid and reliable diagnostic procedure or diagnostic algorithm has yet been defined for these disorders.

A new scale, the Frontal Lobe Epilepsy and Parasomnias (FLEP) scale, was recently proposed as a tool for distinguishing NFLE from parasomnias (Derry et al., 2006b), and it has been both welcomed and criticized (Tinuper et al., 2007). Although the scale was validated in a multicenter study in Australia, its usefulness still needs to be verified in clinical settings.

Here we present our data on the usefulness of the FLEP scale in diagnosing nocturnal paroxysmal motor-behavioral episodes in a clinical setting (a tertiary sleep medicine and epilepsy unit).

Patients and Methods

Patients with undefined (epileptic or parasomnic) nocturnal paroxysmal motor-behavioral episodes attending the Sleep Medicine and Epilepsy Unit (an outpatient facility) at the IRCCS “C. Mondino Institute of Neurology” Foundation in Pavia, Italy, in the 3-year period 2004–2006 were screened during a scheduled visit.

Seventy-one of these patients, whose variables are given in Table 1, were included in the study, having met the following inclusion criteria: (1) written informed consent to participate; and (2) a final diagnosis of arousal parasomnia, NFLE or idiopathic RBD, based on one or more nocturnal paroxysmal episodes documented on an in-lab, full-night video-EEG polysomnography (V-EEG PSG) recording with extended EEG montages (full-scalp EEG, positioning of leads according to the International 10–20 System: Fp1, Fp2, F3, F4, F7, F8, C3, C4, P3, P4, T3, T4, T5, T6, O1, O2, common reference, with display system used to allow the rearrangement of EEG traces into various montages). In all cases a detailed clinical history, interictal routine EEG, and neuroradiological brain NMR findings were also available.

Table 1. Demographic and disease-related variables in the subgroups of studied subjects

| | NFLE | Arousal parasomnias | RBD |
|---|---|---|---|
| Gender, male | 11 | 6 | 43 |
| Age, years (mean ± s.d.) | 35 ± 15 | 23 ± 8 | 67 ± 8 |
| Episode types | 5 wandering + PA; 4 PA; 3 wandering; 1 hypermotor seizures; 1 hypermotor seizures + PA | 7 sleepwalking; 3 confusional arousal; 1 pavor | |

PA, paroxysmal arousal.

An Italian version of the FLEP scale was obtained according to the Sackett method (Sackett et al., 1991). The English text of the FLEP scale was translated into Italian by a doctor specialized in sleep medicine and epilepsy (R.M.) and then back-translated into English by a native English speaker with experience in scientific writing (C.W.). Adherence to the original text was assessed by another medical doctor specialized in sleep medicine and epilepsy (L.P.), who is also bilingual.

The scale was filled in by a medical doctor (A.R.) on the basis of the reports given by the patients and their partners or relatives, all of whom were reinterviewed for the specific purposes of the study. A.R., who is trained in sleep and epilepsy diagnosis and treatment, was blind to each patient's identity and final diagnosis.

The whole diagnostic procedure, including V-PSG scoring and interpretation, was carried out by two other physicians (R.M., M.T.) from the unit, who were blind to the FLEP scale scores.

In interpreting the FLEP scale scores, we applied the three discrete ranges of values indicated in the original FLEP scale validation study (Derry et al., 2006b): scores of 0 or less were taken to indicate episodes highly likely to be parasomnias; scores between +1 and +3 to indicate episodes that are potentially epileptic in nature but need further investigation before a definite diagnosis can be made; and scores > +3 to indicate episodes highly likely to be epileptic in nature (NFLE).
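The three score ranges described above can be written as a small lookup. This is only an illustrative sketch: the function name `interpret_flep` and the label strings are hypothetical, and it assumes that total FLEP scores are integers, so that the intermediate range covers +1 to +3.

```python
def interpret_flep(score: int) -> str:
    """Map a total FLEP score to one of the three ranges used in this study.

    Assumes an integer total score; the labels are illustrative wording only.
    """
    if score <= 0:
        # 0 or less: episodes highly likely to be parasomnias
        return "parasomnia highly likely"
    if score <= 3:
        # +1 to +3: possibly epileptic, further investigation needed
        return "uncertain"
    # above +3: episodes highly likely to be epileptic (NFLE)
    return "NFLE highly likely"


if __name__ == "__main__":
    for s in (-9, 0, 1, 3, 5):
        print(s, "->", interpret_flep(s))
```

Note that a score in the intermediate range is not a diagnosis in either direction; in this study it is counted as an uncertain indication.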

Statistical Analysis

Median and range values were used to describe FLEP scale scores as continuous variables in the whole patient series and in the three diagnostic subgroups of patients.

Percentages were used to describe the distribution of the three discrete ranges of values in the whole patient series and in each diagnostic subgroup of patients.

The sensitivity, specificity, positive predictive and negative predictive values of the FLEP scale as a diagnostic test for NFLE were calculated considering the result of the scale to be incorrect (false negative or false positive) only when it gave, respectively, a definite parasomnia score (0 or less) in a case that turned out to be NFLE, or a definite NFLE score (>+3) in a case that turned out to be affected by parasomnias.


Results

The median FLEP scale score was +2 (range −9 to +5) in the NFLE subjects, −4 (range −9 to +1) in the arousal parasomnias group, and −1 (range −5 to +3) in the RBD group. Fig. 1 shows the score distribution in the three groups.

Figure 1.

Distribution of the individual scores in the three groups of subjects studied.

Of the 14 patients in the NFLE group, four gave FLEP scale scores strongly indicative of NFLE (>+3) and six gave scores suggestive of NFLE (between +1 and +3). Four (28.6%) NFLE patients gave a negative score (−3 in two subjects, −4 and −9 in the other two). These four subjects with scores indicative of parasomnias reported episodes of wandering (even out of the bedroom) with complex, purposeful movements. Two of them presented coherent speech without recall.

Ten of the 11 patients in the arousal parasomnias group gave FLEP scale scores strongly indicative of parasomnias (0 or less). Only one (9%) patient in this group (who presented confusional arousals) gave a positive score (+1).

In the RBD group, 31/46 patients gave FLEP scale scores strongly indicative of parasomnias (0 or less). The other 15 patients (32.6%) gave positive scores (range +1 to +3); 12 of these had clear recall of the event (even if not a full awareness of the motor enactment), four reported clustering of episodes (of which there were between three and five per night). Three subjects reported coherent speech with lucid recall; only six subjects reported complex behaviors.

In the whole sample, the FLEP scale gave incorrect diagnostic indications in 4/71 cases (5.6%), and uncertain indications (with the need for further investigations) in 22/71 cases (30.9%) (Fig. 2).
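The whole-sample figures can be retraced from the per-group counts reported in the preceding paragraphs. The sketch below is only a bookkeeping aid; the dictionary layout and variable names are hypothetical, while the counts are those given in the text.

```python
# Per-group counts taken from the Results text, ordered as
# (score <= 0, score +1 to +3, score > +3).
counts = {
    "NFLE": (4, 6, 4),
    "arousal parasomnias": (10, 1, 0),
    "RBD": (31, 15, 0),
}

total = sum(sum(c) for c in counts.values())

# Uncertain indication: any subject scoring in the intermediate range.
uncertain = sum(c[1] for c in counts.values())

# Incorrect indication: NFLE scored as definite parasomnia,
# or a parasomnia scored as definite NFLE.
incorrect = (
    counts["NFLE"][0]
    + counts["arousal parasomnias"][2]
    + counts["RBD"][2]
)

if __name__ == "__main__":
    print(f"incorrect: {incorrect}/{total} = {incorrect / total:.4f}")
    print(f"uncertain: {uncertain}/{total} = {uncertain / total:.4f}")
```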

Figure 2.

Percentage of subjects with FLEP score-based correct or incorrect diagnostic indication.

Considering the indications of the FLEP scale to be incorrect only when the scale gave a clear parasomnia score (0 or less) in cases that actually had NFLE, or a clear NFLE score (>+3) in cases that had a final diagnosis of parasomnia, the scale, as a diagnostic test for NFLE, showed a sensitivity of 71.4% (95% CI: 0.47–0.95), a specificity of 100%, a positive predictive value of 100%, and a negative predictive value of 91.1% (95% CI: 0.827–0.994).
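The following sketch retraces these figures from the counts given in the Results. It rests on two assumptions that the text does not state explicitly: that intermediate scores (+1 to +3) are never counted as errors, with the negative predictive value computed over subjects scoring 0 or less, and that the confidence intervals are normal-approximation (Wald) intervals. Under these assumptions the reported values appear to be reproduced closely. All names in the code are hypothetical.

```python
import math


def wald_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Counts from the Results text.
nfle_total = 14
false_negatives = 4              # NFLE cases scoring 0 or less
parasomnia_total = 11 + 46       # arousal parasomnias + RBD
false_positives = 0              # parasomnia cases scoring above +3
definite_negatives = 10 + 31     # parasomnia cases scoring 0 or less

# Assumed tallying: an NFLE case counts as detected unless it scored <= 0.
detected = nfle_total - false_negatives

sensitivity = detected / nfle_total
specificity = (parasomnia_total - false_positives) / parasomnia_total
ppv = detected / (detected + false_positives)
npv = definite_negatives / (definite_negatives + false_negatives)

sens_ci = wald_ci(detected, nfle_total)
npv_ci = wald_ci(definite_negatives, definite_negatives + false_negatives)

if __name__ == "__main__":
    print(f"sensitivity {sensitivity:.3f}, 95% CI {sens_ci[0]:.3f} to {sens_ci[1]:.3f}")
    print(f"specificity {specificity:.3f}, PPV {ppv:.3f}")
    print(f"NPV {npv:.3f}, 95% CI {npv_ci[0]:.3f} to {npv_ci[1]:.3f}")
```

With only 14 NFLE cases, the Wald interval for sensitivity is wide, and an exact or Wilson interval could give somewhat different bounds.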


Discussion

In the FLEP scale validation study (Derry et al., 2006b), it was stressed that the scale proved to be useful in distinguishing NFLE seizures from arousal parasomnias with cut-off scores to stratify the likelihood of a nocturnal paroxysmal episode proving to be epileptic or parasomnic in nature. Reliable even when used by clinicians with limited experience in the field of epilepsy and sleep disorders, the FLEP scale could limit the need for V-PSG in diagnosing nocturnal paroxysmal episodes.

The aim of our study, conducted in a tertiary sleep disorders and epilepsy unit, was to test the usefulness of the FLEP scale in diagnosing nocturnal motor-behavioral episodes of an initially uncertain nature, which, after a full set of investigations including V-EEG PSG, were shown to be NFLE seizures, arousal parasomnias, or RBD.

The FLEP scale's sensitivity and negative predictive value were found to be lower than those reported in the original validation study.

In NFLE patients, the scale gave incorrect diagnostic indications (NFLE episodes misinterpreted as arousal parasomnias) in 4 out of the 14 cases (28.6%). The four NFLE cases misinterpreted by the FLEP scale were all cases in which the only or prevalent clinical manifestations were episodes of nocturnal wandering. The scores on the “wandering outside the room” and “performing complex, directed behaviors during the events” items were the ones found to contribute most to increasing the total negative score in these cases.

It is not clear whether the aforementioned behaviors are part of the ictal event or whether they should, instead, be considered postictal. Interestingly, it has been underlined in the literature (Tassinari et al., 2005; Terzaghi et al., 2007; Tinuper et al., 2007) that parasomnic phenomena may occur in a context of NFLE seizures and might even constitute phenomena triggered by the seizures themselves without being an integral part of them.

Thus, it is not surprising that nocturnal wandering, which is the NFLE seizure manifestation most similar to the manifestations of arousal disorder (the latter was, in fact, initially termed “agitated somnambulism”) (Plazzi et al., 2005), is the aspect of NFLE that the FLEP scale, in its current format, may fail to identify correctly.

No cases of arousal parasomnias or of RBD were definitively misinterpreted as NFLE.

However, around a third of the parasomnia patients, mainly belonging to the RBD group, had misleading diagnostic indications (their scores, falling within the range of +1 to +3, indicated a relatively high chance that the episodes were epileptic in nature).

The presence of “recall” of the events (which in RBD cases means recall of enacting a fight during a dream) and “clustering” of events during the night account for the positive FLEP scale scores in our RBD patients. Modifying the recall item, so that it specified “except for fighting-enacting dreams”, would have eliminated most of the positive scores we obtained in our patients.

Our data show that the FLEP scale has high positive and negative predictive values in diagnosing NFLE versus arousal parasomnias and RBD. However, the scale is associated with a real risk of misdiagnosis in some patients. The patients at most risk of misinterpretation were NFLE subjects presenting episodes of nocturnal wandering, which were misinterpreted as arousal parasomnias.

Furthermore, in 9% of cases with arousal parasomnias and in 32.6% of those with RBD, the scale gave misleading diagnostic indications (scores in the +1 to +3 range, suggesting that the episodes might be epileptic).

Although it is possible that the FLEP scale interpreted some of the nocturnal paroxysmal episodes incorrectly or ambiguously because of their complex clinical semeiology, our investigation nevertheless highlighted the inadequacy of some of the items in the scale.

The item investigating wandering, as presently formulated, may be unable to distinguish nocturnal wandering from sleepwalking. The items about “recall” and “clustering” of the events throughout the night may increase the likelihood of mistaking RBD for seizures.

Concluding Remarks

The diagnostic power of the FLEP scale based on our data is lower than that found in the original validation of the scale. Differences in the composition of the patient series, namely the overrepresentation of RBD in our sample, may in part account for the differences between the results of the two studies.

Our preliminary observations that some items of the scale do not adequately address NFLE, namely the nocturnal wandering episodes, highlight the need for appropriate statistical investigations (Cohen, 1960; Frankfort-Nachmias & Nachmias, 1992) to analyze, in depth, the reliability of the FLEP scale items. Furthermore, we feel that efforts should be aimed not just at identifying a single reliable method, but also at developing a reliable algorithm able to guide the physician, step by step, through the complexity of these disorders and the decision-making process.
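One statistic commonly used for this kind of item-level reliability analysis is Cohen's kappa (Cohen, 1960), which corrects the observed agreement between two raters for the agreement expected by chance. The sketch below is a minimal illustration only: the function name is hypothetical, and the two rating lists are invented toy data, not data from this study.

```python
from collections import Counter


def cohen_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa for two raters scoring the same subjects on one item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of subjects")
    n = len(rater_a)
    # Observed proportion of subjects on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy ratings of a single yes/no item by two interviewers.
rater_1 = ["yes", "yes", "no", "no", "yes", "no", "no", "yes"]
rater_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes"]

kappa = cohen_kappa(rater_1, rater_2)

if __name__ == "__main__":
    print(f"kappa = {kappa:.2f}")
```

In this toy example the raters agree on 6 of 8 subjects and chance agreement is 0.5, so kappa is 0.5. Applied item by item, such a statistic could indicate which FLEP items (for example, the wandering or recall items) are scored least consistently.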

The FLEP scale, once its limitations have been overcome, could be a useful tool in such an algorithm, helping the physician to better address simple clinical impressions, sometimes suggesting the need for a video or polysomnographic recording and other times making it possible to avoid these investigations.


Acknowledgments

The authors thank Dr Liborio Parrino and Mrs Catherine Wrenn for their participation in the procedure of FLEP validation in Italian according to the Sackett method.

Conflict of interest: The authors confirm that they have read the Journal's position on issues involved in ethical publication and affirm that this report is consistent with those guidelines. The authors disclose no conflicts of interest.