A library of quantitative markers of seizure severity

Abstract
Objective: Understanding fluctuations in seizure severity within individuals is important for determining treatment outcomes and responses to therapy, as well as assessing novel treatments for epilepsy. Current methods for grading seizure severity rely on qualitative interpretations from patients and clinicians. Quantitative measures of seizure severity would complement existing approaches to electroencephalographic (EEG) monitoring, outcome monitoring, and seizure prediction. Therefore, we developed a library of quantitative EEG markers that assess the spread and intensity of abnormal electrical activity during and after seizures.
Methods: We analyzed intracranial EEG (iEEG) recordings of 1009 seizures from 63 patients. For each seizure, we computed 16 markers of seizure severity that capture the signal magnitude, spread, duration, and postictal suppression of seizures.
Results: Quantitative EEG markers of seizure severity distinguished focal versus subclinical seizures across patients. In individual patients, 53% had a moderate to large difference (rank sum r>.3, p<.05) between focal and subclinical seizures in three or more markers. Circadian and longer-term changes in severity were found for the majority of patients.
Significance: We demonstrate the feasibility of using quantitative iEEG markers to measure seizure severity. Our quantitative markers distinguish between seizure types and are therefore sensitive to established qualitative differences in seizure severity. Our results also suggest that seizure severity is modulated over different timescales. We envisage that our proposed seizure severity library will be expanded and updated in collaboration with the epilepsy research community to include more measures and modalities.

Ryan.Faulder2 b; Yu.Guan@warwick.ac.uk; Veronica.Leach c; Shona.Livingstone c; cpapasavva@gmail.com; Rhys.Thomas a; Kevin.Wilson a; Peter.Taylor a
a @newcastle.ac.uk; b @nhs.net; c @ggc.scot.nhs.uk; d @ucl.ac.uk
We confirm that we have read the Journal's position on issues involved in ethical publication and affirm that this report is consistent with those guidelines. None of the authors has any conflict of interest to disclose.
• Quantitative severity markers can be used to investigate fluctuations in seizure severity over time in individual patients.

Introduction
Seizure severity is an important clinical measure for patients with epilepsy that is strongly correlated with quality of life (Bautista and Glen, 2009). However, the best approach for measuring seizure severity remains unclear. Existing scales for measuring seizure severity, including the National Hospital Seizure Severity Scale (NHS3) (Duncan and Sander, 1991; O'Donoghue et al., 1996), the Liverpool Seizure Severity Scale (LSSS) (Baker et al., 1991), and the Seizure Severity Questionnaire (SSQ) (Cramer et al., 2002), are composed of questions on various aspects of seizures including warnings, ictal and postictal phenomena, and resultant injuries. Most scales separate seizures by their clinical classification (Cramer and French, 2001) to reflect differences in severity across different seizure types.
A primary shortcoming of existing measures of seizure severity is their reliance on patient or carer recollection (Cramer and French, 2001). For example, a patient's recollection of their seizure may be impaired as a result of the seizure itself (DuBois et al., 2010; Tatum IV et al., 2001). It is hence challenging to assess changes in severity from seizure to seizure in an unbiased manner for the full range of a patient's seizures. Objective, quantitative tools for measuring severity of individual seizures are therefore needed to understand variations in seizures on different timescales.
EEG-based severity markers are a potential approach to quantifying seizure severity. Past studies have used EEG features such as ictal duration (Ochoa-Urrea et al., 2021) and spatial synchronisation (Ravan et al., 2016) as proxies for seizure severity. The anatomical spread of seizure activity has also been suggested as a measure of seizure severity (Cramer and French, 2001). It is yet to be determined how such measures compare and which to use for each individual patient.
Moreover, various seizure features, which are directly associated with severity, fluctuate over time. For example, focal seizures are more likely to generalise in sleep (Jobst et al., 2001), particularly in temporal lobe epilepsy (Bazil and Walczak, 1997). The extent of postictal suppression also depends on the time of day of seizure occurrence (Lamberts et al., 2013; Peng et al., 2017).
Furthermore, it has been shown that 'seizure spatiotemporal evolutions' (Schroeder et al., 2020) and ictal onset dynamics (Saggio et al., 2020) differ within individuals on circadian or longer timescales. Therefore, monitoring fluctuations in seizure severity could lead to a better understanding of an individual's epilepsy.
To objectively quantify seizure severity, we provide an expandable library of interpretable EEG-based markers of seizure severity. As validation, we test whether seizure severity markers distinguish clinically distinct seizure types (Fisher et al., 2017a) with known differences in severity.
We further show that markers of seizure severity are patient-specific. As a proof of principle, we also demonstrate fluctuations in severity over circadian or longer timescales.

iEEG Pre-processing
We first downsampled all EEG to 256 Hz. Pre-ictal noise was detected using an iterative noise detection algorithm and visual inspection; noisy channels were removed from all seizures (see Suppl. Methods S3.1). The iEEG was re-referenced to a common average reference, notch filtered at 50 and 100 Hz (2 Hz window) to remove line noise, and band-pass filtered between 0.5 and 100 Hz (fourth order, zero phase shift Butterworth).

Seizure Markers
The selection of markers was inspired by seizure detection literature (e.g., Alotaiby et al. (2014); Birjandtalab et al. (2016); Guo et al. (2010)). To quantify different types of features, our library of objective seizure severity markers has three main branches:
• 'peak' markers to measure the peak level of activity that occurs during a seizure
• 'spatial' markers to summarise spread of ictal activity across recording channels
• 'suppression' markers to evaluate post-ictal suppression
Ictal duration was also included as an additional severity marker (Beniczky et al., 2020). Supplementary Table S3.1 gives detailed mathematical definitions of all markers.
Common notation is used throughout the definition of markers: x is the time series for one channel, k is the time point, N is the number of time points in the segment, C is the number of recording channels, and T is the seizure duration in seconds.
For each severity marker (i.e., each matrix) we first summarised markers across time; for each recording channel, the 95th percentile of each marker was calculated (Fig. 1C). The maximum value across channels was then used as the estimated peak activity of the seizure (Fig. 1D). As expected, markers differ across seizure types and patients (Fig. 1E). Once summarised over time (by the 95th percentile of each channel across time) and across channels (maximum value), we log-transformed the measures to normalise their distributions.
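The summary step described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the function names, the linear-interpolation percentile, and the use of the natural logarithm are assumptions, and the marker values are assumed to be strictly positive.

```python
import math


def percentile(values, q):
    """Percentile (q in [0, 100]) using linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of empty input")
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def peak_summary(marker_matrix, q=95.0):
    """Summarise one marker for one seizure.

    marker_matrix[c][w] is the marker value of channel c in time window w.
    Step 1: 95th percentile over time for each channel.
    Step 2: maximum across channels.
    Step 3: log-transform (natural log is an assumption).
    """
    per_channel = [percentile(row, q) for row in marker_matrix]
    return math.log(max(per_channel))
```

Using a high percentile rather than the raw maximum in step 1 keeps a single outlying window from dominating the channel summary.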

Spatial Markers
The extent of the spread of ictal activity across recording channels was captured through spatial markers. For each channel, baseline (pre-ictal) and ictal recordings were divided into 1 second, non-overlapping windows, from which each of eight features (line length, energy, 6× band-powers) were calculated (see Suppl. Methods S3.4 for details). Four markers were derived from the seizure imprint: the proportion of channels with seizure activity at any point in the ictal phase (Fig. 2C, example patients), the proportion of channels with simultaneous seizure activity at the point of maximum recruitment (Fig. 2D, example patients), the time taken from seizure onset to the time of maximum recruitment, and the proportion of the seizure duration taken to reach maximum recruitment.
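As a rough sketch of how the four spatial markers could be read off a seizure imprint, the example below assumes the imprint is a binary channel-by-window matrix covering seizure onset to offset with 1 second windows. The function name, the dictionary keys, and the choice of the first window that reaches maximum recruitment are illustrative assumptions.

```python
def spatial_markers(imprint, window_s=1.0):
    """Derive four spatial markers from a binary seizure imprint.

    imprint[c][w] is 1 if channel c shows seizure activity in window w, else 0.
    The imprint is assumed to span the ictal period only.
    """
    n_channels = len(imprint)
    n_windows = len(imprint[0])

    # Channels recruited at any point during the seizure.
    ever_active = sum(1 for row in imprint if any(row))

    # Number of simultaneously active channels in each window.
    active_per_window = [sum(row[w] for row in imprint) for w in range(n_windows)]
    max_active = max(active_per_window)

    # First window in which maximum recruitment is reached (assumption: ties
    # are resolved by taking the earliest window).
    w_max = active_per_window.index(max_active)
    time_to_max = w_max * window_s
    duration = n_windows * window_s

    return {
        "prop_channels_any": ever_active / n_channels,
        "prop_channels_max": max_active / n_channels,
        "time_to_max_s": time_to_max,
        "prop_duration_to_max": time_to_max / duration,
    }
```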

Suppression Markers
Duration and strength of post-ictal suppression was captured by our suppression markers. Signal range was computed in 0.5-second non-overlapping windows. For each channel, post-ictal ranges were compared against the distribution of preictal ranges. Ranges below the fifth percentile of preictal range were labelled as suppressed (see Fig. 3A for a post-ictal EEG and its corresponding suppression matrix in B). Periods of suppression were labelled as majority suppression or partial suppression based on the proportion of suppressed channels (Fig. 3C). Durations of majority suppression and partial suppression (Fig. 3D & E) were calculated using a 2.5-second moving sum to account for short spikes of activity in suppressed segments. Further details are provided in Suppl. Methods S3.5. The suppression duration was computed as the time following seizure offset with a one-second buffer. A third suppression marker, suppression strength, was defined as the median proportion of channels with suppression across the duration of the post-ictal recording.
Whilst we analysed 120 seconds of postictal activity, duration of suppression may have exceeded this 120 s (Ochoa-Urrea et al., 2021). Therefore, suppression durations of 120 s in the following should be understood as 'at least 120 seconds'.
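The per-channel suppression test and the suppression strength marker can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, the percentile uses linear interpolation, and the moving sum and one-second buffer used for the duration markers are not reproduced here.

```python
import math
import statistics


def window_ranges(signal, fs, window_s=0.5):
    """Signal range (max - min) in non-overlapping windows of window_s seconds."""
    n = int(round(fs * window_s))
    return [max(signal[i:i + n]) - min(signal[i:i + n])
            for i in range(0, len(signal) - n + 1, n)]


def percentile(values, q):
    """Percentile (q in [0, 100]) with linear interpolation (an assumption)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] * (1 - (pos - lo)) + ordered[hi] * (pos - lo)


def suppressed_windows(preictal, postictal, fs, window_s=0.5, q=5.0):
    """Flag post-ictal windows whose range is below the q-th percentile
    of the same channel's pre-ictal ranges."""
    threshold = percentile(window_ranges(preictal, fs, window_s), q)
    return [r < threshold for r in window_ranges(postictal, fs, window_s)]


def suppression_strength(suppression_matrix):
    """Median over windows of the proportion of suppressed channels.

    suppression_matrix[c][w] is True if channel c is suppressed in window w.
    """
    n_channels = len(suppression_matrix)
    n_windows = len(suppression_matrix[0])
    proportions = [sum(row[w] for row in suppression_matrix) / n_channels
                   for w in range(n_windows)]
    return statistics.median(proportions)
```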

Statistical analysis
Statistical analyses were then performed in RStudio. P-values were calculated for reference and visualisation, not to stratify patients for further analyses. [...] FTBTC seizures. Performance of markers was assessed through how well they distinguish these seizure types. We applied two strategies for validation, across and within patients, to separately assess performance of markers in distinguishing clinically distinct seizure types.

Across Patients
For each marker, three hierarchical logistic regression models were compared to assess marker and/or patient effects. Specifically, we created one model considering only random patient effects and two models considering both fixed marker effects and random patient effects (a random intercept model and a random intercept and slope model). The fit of each model was assessed using the Akaike information criterion, the Bayesian information criterion, and deviance. Models with poor fit were deemed inadequate and removed. Assumptions of logistic regression models were checked for each model individually.
The quality of each model as a classifier of seizure type was assessed through the area under the curve (AUC) for receiver operating characteristic (ROC) curves with 100 decision thresholds.
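To make the ROC step concrete, the sketch below computes an AUC from predicted probabilities using a fixed grid of decision thresholds and the trapezoidal rule. The evenly spaced grid on [0, 1] and the function name are assumptions; the source only states that 100 thresholds were used.

```python
def roc_auc(scores, labels, n_thresholds=100):
    """Area under the ROC curve from a fixed grid of decision thresholds.

    scores: predicted probabilities in [0, 1]; labels: 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]

    points = [(0.0, 0.0)]  # a threshold above every score classifies nothing as positive
    for i in range(n_thresholds):
        t = i / (n_thresholds - 1)
        tpr = sum(s >= t for s in pos) / len(pos)
        fpr = sum(s >= t for s in neg) / len(neg)
        points.append((fpr, tpr))

    # Sort by false positive rate, then integrate with the trapezoidal rule.
    points.sort()
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

With a coarse grid the estimate can differ slightly from the exact rank-based AUC when many scores fall between adjacent thresholds.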

Within Patients
Each marker's performance in distinguishing seizure types for each patient was assessed using two-tailed Wilcoxon rank sum tests. Patients were included in within-patient validation if they had a minimum of five seizures, with two or more seizures of each type. The distinction between markers of different seizure types was quantified using the effect size (r) calculated as:
$$r = \frac{|Z|}{\sqrt{N}}$$
where Z is the Z-statistic and N is the total sample size. The r value was bounded between zero and one with values closer to one indicating larger effects. It is common in the literature to consider 0.1 ≤ r < 0.3 as a small effect, 0.3 ≤ r < 0.5 as a moderate effect, and r ≥ 0.5 as a large effect.
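A small worked example of the effect size may help. The sketch below obtains Z from the normal approximation to the rank sum statistic and then divides its absolute value by the square root of the total sample size. The normal approximation without tie or continuity correction is an assumption and may differ from what the R routines used in the study compute; the function names are illustrative.

```python
import math


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_effect_size(a, b):
    """Effect size r = |Z| / sqrt(N) for a Wilcoxon rank sum comparison of a and b."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    ranks = midranks(list(a) + list(b))
    w = sum(ranks[:n1])                       # rank sum of the first group
    mean_w = n1 * (n + 1) / 2                 # expected rank sum under the null
    sd_w = math.sqrt(n1 * n2 * (n + 1) / 12)  # no tie correction (assumption)
    z = (w - mean_w) / sd_w
    return abs(z) / math.sqrt(n)
```

For two fully separated groups of three values each, for example `rank_sum_effect_size([1, 2, 3], [4, 5, 6])`, this gives r of roughly 0.80, a large effect on the scale above.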

Circadian and longer-term modulation of seizure severity
We additionally assessed circadian and longer-term fluctuations in seizure severity. For individual patients we assessed circadian fluctuations using rank circular-linear correlation (Mardia and Jupp, 2000) using the cylcop R package (Hodel and Fieberg, 2021). P-values were calculated through a permutation test with 1000 permutations. Inclusion criteria were that patients must have 20 or more recorded seizures irrespective of the frequencies of each seizure type. This threshold was chosen based on performance of circular-linear correlation on simulated data with varied sample sizes and noise. Long-term fluctuations in severity were assessed using Spearman's rank correlation between markers and the time since first recorded seizure.
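The idea of correlating a marker with time of day and testing it by permutation can be sketched as below. Note the substitution: this example uses the classical (non-rank) circular-linear correlation coefficient built from Pearson correlations with the cosine and sine of the seizure time, not the rank-based version from the cylcop package used in the study. Function names, the 24-hour period, and the fixed random seed are assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def circ_lin_corr(hours, values, period=24.0):
    """Classical circular-linear correlation between clock time and a marker."""
    theta = [2 * math.pi * h / period for h in hours]
    c = [math.cos(t) for t in theta]
    s = [math.sin(t) for t in theta]
    rxc, rxs, rcs = pearson(values, c), pearson(values, s), pearson(c, s)
    r2 = (rxc ** 2 + rxs ** 2 - 2 * rxc * rxs * rcs) / (1 - rcs ** 2)
    return math.sqrt(max(r2, 0.0))


def permutation_test(hours, values, n_perm=1000, seed=0):
    """Observed correlation and permutation p-value (marker values shuffled)."""
    rng = random.Random(seed)
    observed = circ_lin_corr(hours, values)
    shuffled = list(values)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if circ_lin_corr(hours, shuffled) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)
```

Shuffling the marker values while keeping the seizure times fixed preserves the distribution of both variables and breaks only their pairing, which is what the null hypothesis of no circadian modulation requires.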

Code and data availability
The analysis code and data are available on Zenodo.org upon acceptance. The expandable library of severity markers is already available on GitHub (https://github.com/cnnp-lab/seizure_severity_library), and we invite contributions from the community.

Results
We computed each of the 16 proposed seizure severity EEG markers for all 1009 recorded seizures.
We first validated each marker by assessing performance in distinguishing different ILAE seizure classifications both across all patient seizures and within each patient. However, we envisage additional uses of this library and, as an example, demonstrate its potential ability to detect fluctuations in seizure severity over time.

Severity markers distinguish between ILAE clinical seizure types across patients and seizures
To validate our markers we assessed their ability to distinguish focal vs. subclinical seizures and focal seizures with and without impaired awareness across patients. Specifically, for each of the 16 markers, we compared seizure types across all patients using hierarchical mixed effects logistic regression models. Fig. 4 displays the AUC values obtained for each model in all markers when comparing focal vs. subclinical (A) and focal seizures with and without impaired awareness (B).
There were clear patient differences in the marker values; however, the majority of models created with only patient effects were unacceptable classifiers (AUC < 0.7 or model assumptions not met), suggesting that between-patient differences alone did not account for differences between focal and subclinical seizures. In contrast, 14 severity markers yielded excellent classifier performance with random intercept models or random intercept and slope models. As seizure duration is often used to assess seizure severity (Beniczky et al., 2020) we compared the performance of each marker against the performance of duration in distinguishing seizure types (see Fig. [...] seizures, respectively. For TLE patients five markers showed excellent or outstanding performance in random intercept models (Suppl. Table S4.4). For eTLE patients all markers had excellent or outstanding performance in random intercept and/or random intercept and slope models (Suppl. Table S4.5).

Severity markers distinguish between ILAE clinical seizure types within patients
We next validated our markers by quantifying distinctions between ILAE seizure types within individual patients. We analysed effect sizes between seizure types using Wilcoxon rank sum test r-values. Using our inclusion criteria, we could compare focal and subclinical seizures in 15 patients. Patients included in this analysis did not differ in demographics (sex, age, disease duration, and epilepsy diagnosis) relative to the entire cohort. Majority suppression duration could not be validated as many patients did not have sufficient seizures with periods of majority suppression.
Moderate to large effects (r > 0.3, p < 0.05) in three or more markers were found for eight of the 15 included patients (53.3%). A heat-map of r-values is shown in Fig. 5A; Fig. 5B shows a heat-map of r-values only where p < 0.05. The number of focal and subclinical seizures recorded per patient varied (see Fig. 5C). Effects were notably higher in four patients, all of whom were TLE patients, supporting that performance of markers is likely patient-specific. We investigated the effect of various other patient metadata (sex, TLE/eTLE, surgical outcome, disease duration, age, number of recording channels, and number of recorded seizures) on marker performance in Suppl. Table S4.6. Most notably, there was a large effect between spatial markers for patients with TLE compared to eTLE, but none of the other patient features showed consistent or noteworthy effects.
Comparing performance of our markers against seizure duration, in five patients (33%), duration alone was not a useful marker of seizure severity (r < 0.3, p > 0.05). However, in each of these patients, at least three other markers were useful (r > 0.3, p < 0.05) in distinguishing focal and subclinical seizures.

Seizure severity changes across different time scales
Finally, we used our markers to capture fluctuations in seizure severity on circadian and longer timescales in 15 patients. Fig. 6A shows example day-time and night-time seizure iEEG traces from the same patient, U14. In U14, seizures occurring at different times of day appeared to have different characteristics; for example, line length and suppression strength differences are higher in nocturnal seizures (Fig. 6B and C). The association between these markers and seizure times was measured using circular-linear correlation (Mardia and Jupp, 2000). Eight patients (66.7%) had correlations with ρ > 0.2 and p < 0.05 for at least three markers.
We additionally asked if our severity markers also changed over the span of each patient's recording. Fig. 6D shows the absolute Spearman's rank correlation between two example markers and the time of each seizure relative to the start of the recording. This measure captures the strength, but not the direction, of the association between marker values and the time of seizure occurrence. In eight out of 15 patients (53.3%), at least three markers had correlations with ρ > 0.3 and p < 0.05 with the amount of time elapsed since the start of the recording.
Moderate to strong correlations can be seen in a wide range of markers and patients; thus, we conclude that circadian and longer-term changes in EEG severity can be detected in the majority of patients.
We were limited by the time spent in the EMU; therefore, our findings on modulation are proof-of-concept. These results should be interpreted as evidence that our markers could be used to capture fluctuations in severity.

Discussion
We evaluated 16 objective quantitative markers of seizure severity derived from iEEG recordings of patients with refractory focal epilepsy. Our goal was to offer a collection of markers which can be used as output measures for clinical trials, tracking fluctuations in seizure severity, or other applications. Our results demonstrated that almost all severity markers could distinguish focal vs. subclinical seizures across our cohort of 63 patients. Importantly, marker performance was patient-specific, indicating that different groups of patients are best evaluated with a subset of our proposed markers; thus, our approach of providing a severity library for future work to draw from is an important contribution. We also found that severity fluctuated on circadian and longer-term timescales in a patient-specific manner, supporting the use of EEG-based severity markers to investigate temporal modulation of seizure severity. Our work may therefore also facilitate personalised, time-adaptive treatments or enhance our understanding of the chronobiology of seizures.
Existing scales of seizure severity have been used as outcome measures in clinical trials (Beenen et al., 1999; Dagar et al., 2011; Kverneland et al., 2018; Szaflarski et al., 2018). However, scales depend on patients' ability to recall seizures over weeks (Baker et al., 1991; Cramer and French, 2001), leading to concern over their reliability. Many scales also focus on patient risk rather than objective severity. For example, the NHS3 stipulates that seizures occurring in bed are automatically scored zero for falls, potentially underestimating their electrographic and neurobiological severity. No existing scales assess individual seizure severity in an objective quantitative manner, making small changes in severity difficult to capture. Our library of quantitative EEG markers addresses these limitations, providing a complementary approach for measuring and understanding seizure severity.
Our approach to validating our markers was to compare two seizure types that have obvious distinctions in terms of their neurobiological and symptomatic severity: namely subclinical vs. focal seizures. The proportion of subclinical vs. focal seizures within this data (323 vs. 656) agrees with previous literature (Farooque and Duckrow, 2014), suggesting that our seizure type labels are not biased. Previous literature suggests that subclinical and focal seizures have different EEG features (Blume et al., 1984), even within the same patient (Farooque and Duckrow, 2014), thus making it a good standard to compare to. However, our proof-of-principle validation against seizure type is only one of many possible standards; future work could test other standards that are tailored to the research question.
One main finding of this work was that the performance of seizure severity markers derived from iEEG recordings is highly patient-specific. Peak markers tended to perform well as did some spatial markers (proportions of channels measures). The remaining markers varied in their performance, even among patients with better distinctions based on other markers. Results suggest that spatial markers have the highest performance in distinguishing focal seizures with and without impaired awareness. We suggest testing the entire library of markers for each new patient to determine which, if any, are the most appropriate for the desired application.
Different aspects of seizure severity have been repeatedly reported to follow circadian, sleep/wake, and longer timescale modulations. For example, secondary generalisation and post-ictal suppression occur more often in seizures arising from sleep (Jobst et al., 2001; Lamberts et al., 2013; Peng et al., 2017). Subclinical seizures are also reported to follow a circadian pattern (Jin et al., 2017).
Recent studies also reported modulations at circadian and longer timescales within many patients in terms of seizures' electrographic evolutions (Panagiotopoulou et al., 2020; Schroeder et al., 2019) and other seizure properties (Schroeder et al., 2022). In agreement with previous literature, we found evidence that EEG-based seizure severity markers are modulated on circadian and longer timescales, although the effect size of the modulation is patient-specific and weak in some patients.
We suggest that, similar to previous work (Panagiotopoulou et al., 2020), capturing data of the potential modulations and directly relating those to the severity markers in a multivariate model may be insightful.
Limitations
[...] remain meaningful as a proof-of-concept that our markers can be used to detect fluctuations in ictal electrographic activity and, by extension, seizure severity.

Conclusion
In conclusion, we propose 16 EEG markers of seizure severity which can be used to complement existing measures. Most markers were validated against ILAE classification across patients. Marker performance, as measured by their ability to distinguish seizure types and capture fluctuations in seizure severity, is strongly patient-specific. We also detected circadian and longer timescale fluctuations in seizure severity which may be relevant for a range of applications including capturing treatment response and seizure forecasting (Cook et al., 2016; Freestone et al., 2017; Takahashi et al., 2012). Our library therefore contributes to ongoing efforts in characterising seizures over time, seizure prediction, and generally designing novel, personalised treatment plans that manage and mitigate severe seizures.

S2 Patient Metadata
We retrospectively analysed iEEG recordings from a cohort of 63 patients undergoing presurgical evaluation for refractory focal epilepsy. All patients had electrodes surgically implanted as grids and/or strips.
The number of subclinical, focal and FTBTC seizures is listed in Supplementary Table S2

S3 Supplementary Methods

S3.1 Noise Detection
Prior to computation of markers and subsequent analysis, each iEEG recording was assessed for noise. Muscle movements and eye blinks were not concerning here as iEEG electrodes are placed directly onto or into the brain and thus are not susceptible to such sources of noise. However, the data were screened for noise from other potential sources. Line noise was removed using a notch filter at 50 Hz and 100 Hz (with 2 Hz windows).
The preictal segment was used to compute a baseline of electrographic activity, which was used in detection of seizure activity and postictal suppression. For more reliable estimates, noise was algorithmically detected as follows:
1. Raw iEEG time series MAD scored based on variance and min-max range for each channel independently
2. MAD>16 labelled as 'outlier' (channel is noisy)
3. 'Noisy' channels removed
4. iEEG time series common average referenced (CAR)
5. MAD>16 labelled as 'outlier' (channel is noisy)
6. 'Noisy' channels removed
7. 1 Hz high-pass Butterworth 4th order filter used to remove any slow trends
8. Repeat the process with a less lenient threshold of MAD>12.
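One pass of the outlier-labelling step (steps 1 to 3) might look like the sketch below. It is illustrative only: the MAD score is taken here as the absolute deviation from the median divided by the median absolute deviation with no scaling constant, a channel is flagged if either its variance or its range is an outlier, and the function names are hypothetical. Re-referencing, filtering, and the second pass with the threshold of 12 are not shown.

```python
import statistics


def mad_scores(values):
    """Absolute deviation from the median, in units of the median absolute deviation."""
    med = statistics.median(values)
    deviations = [abs(v - med) for v in values]
    mad = statistics.median(deviations)
    if mad == 0:
        # Degenerate case: any non-zero deviation is treated as an extreme outlier.
        return [0.0 if d == 0 else float("inf") for d in deviations]
    return [d / mad for d in deviations]


def noisy_channels(channels, threshold=16.0):
    """Indices of channels whose variance or min-max range is a MAD outlier.

    channels[c] is the time series of channel c; scores are computed across channels.
    """
    variances = [statistics.pvariance(ch) for ch in channels]
    ranges = [max(ch) - min(ch) for ch in channels]
    var_scores = mad_scores(variances)
    range_scores = mad_scores(ranges)
    return [i for i, (v, r) in enumerate(zip(var_scores, range_scores))
            if v > threshold or r > threshold]
```

Calling `noisy_channels(channels, threshold=12.0)` on the cleaned, re-referenced data would correspond to the stricter second pass in step 8.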

Visual check
Visual checks were performed to ensure that noise detected was, indeed, noise and to identify potential noise that was not detected. Following this, markers of seizure severity were computed. Noise in the ictal segment was visually assessed using iEEG traces and power spectral density plots; noisy channels were removed from all recordings. We did not seek or remove noise impacting only the postictal segment.

S3.2 Seizure Severity Markers
We calculated 16 markers of seizure severity based on iEEG recordings. Each marker captures a different aspect of seizure severity; descriptions of the markers and relevant equations are listed in Table S3.1. Our library of objective seizure severity markers has three main branches: peak, spatial, and suppression markers.

S3.3 Peak markers
Signal complexity was captured using line length (Olsen et al., 1994), calculated as (Esteller et al., 2004):
$$LL = \frac{1}{N-1} \sum_{k=1}^{N-1} \left| x_{k+1} - x_k \right|$$
The strength of the EEG signal was captured by calculating the signal's energy (Hamad et al., 2016):
$$E = \frac{1}{N} \sum_{k=1}^{N} \left( x_k - \bar{x} \right)^2$$
where $\bar{x}$ is the mean of the time series.
For each severity marker, we first summarised markers across time; for each recording channel, the 95th percentile of each marker was calculated. We selected the 95th percentile rather than the maximum value to reduce the risk of capturing outlier values which may not have been representative of true seizure activity. The maximum value from this array was then used as the estimated peak activity of the seizure. Each of the peak markers was log-transformed to normalise their distributions. As expected, markers differed across seizure types and patients.
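For readers who prefer code, a minimal sketch of the two window-level features follows. It assumes line length is the mean absolute difference between consecutive samples and energy is the mean squared deviation from the window mean; the normalisation constants are assumptions and may differ from the definitions in Table S3.1.

```python
def line_length(x):
    """Mean absolute difference between consecutive samples of one window."""
    if len(x) < 2:
        raise ValueError("line length needs at least two samples")
    return sum(abs(b - a) for a, b in zip(x, x[1:])) / (len(x) - 1)


def energy(x):
    """Mean squared deviation of the samples from the window mean."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)
```

For the window `[0, 1, 0, 1]` these give a line length of 1.0 and an energy of 0.25; a flat window gives zero for both.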

S3.4 Spatial markers
Spatial markers were designed to capture the extent of spread of ictal activity across recording channels. Seizure activity was detected using the eight features (line length, energy, band-power in six frequency bands) discussed above. Ictal changes in these features were compared to preictal EEG. For each channel, baseline (pre-ictal) and ictal recordings were split into 1 second, non-overlapping windows. Each of the eight features was calculated for all windows. These computations yielded a baseline distribution of values for each feature and channel. We then scored ictal feature values relative to the baseline distribution to derive if and when a channel was invaded by seizure activity.
In detail, the pre-ictal baseline distribution was obtained for each feature and each channel following an automated rejection of pre-ictal spikes or artefacts. We achieved this by removing outliers (in any feature) from the distribution with a median absolute deviation (MAD) score greater than five. To score each ictal window relative to the baseline, we used the MAD score, which scores a given observation in terms of the median absolute deviation from the median. We chose MAD scores over z-scoring as this method is more robust to outliers. Finally, to derive if any given window in a channel displays seizure activity, we obtained the maximum MAD score across all eight features, effectively measuring if the EEG activity deviated from baseline in any feature. Any window with a maximum MAD score greater than five was deemed as potentially displaying seizure activity. This step yields a binary matrix (of size number of channels by number of time windows) indicating potential seizure activity.
To avoid detection of spurious non-seizure activity (e.g. caused by a brief noise or spike), we further validated the binary matrix with a sliding window approach. A symmetric moving sum of length 2 × τ + 1 was applied to the binary matrix. If the sum within each sliding window exceeded τ, this window was labelled as having seizure activity. In other words, a channel and time window is deemed to contain seizure activity only if in its temporal vicinity (τ) more than half of windows also showed potential seizure activity. We calculated τ as 10% of the seizure duration (d). Durations varied from five to 600 seconds in our data; therefore, we bound τ between two and five seconds to prevent extreme window lengths:
$$\tau = \min\left(\max\left(0.1\,d,\ 2\right),\ 5\right)$$
In this work, we combined eight markers to capture the spread of seizure activity. For each recording channel, the scale of abnormality compared to the preictal baseline was calculated in each marker. This list of markers is non-exhaustive; it is possible to increase the number of biomarkers included in this algorithm. Future work could expand our list of features (e.g., HFO activity), and we welcome contributions to the library by the community.
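The sliding-window validation can be sketched for a single channel as follows. This is an illustration under stated assumptions: τ is taken as 10% of the seizure duration clamped to between two and five seconds and rounded to a whole number of 1 second windows, the moving sum is truncated at the edges of the recording, and the function names are hypothetical.

```python
def tau_windows(duration_s):
    """Half-width of the moving sum in 1 s windows: 10% of the seizure
    duration, bounded between 2 and 5 (rounding is an assumption)."""
    return int(round(min(max(0.1 * duration_s, 2.0), 5.0)))


def validate_channel(binary, tau):
    """Keep a window as seizure activity only if the symmetric moving sum
    over 2 * tau + 1 windows centred on it exceeds tau.

    binary[w] is 1 if window w potentially shows seizure activity
    (maximum MAD score above the threshold), else 0.
    """
    n = len(binary)
    validated = []
    for w in range(n):
        lo = max(0, w - tau)          # truncate at the start of the recording
        hi = min(n, w + tau + 1)      # truncate at the end of the recording
        validated.append(1 if sum(binary[lo:hi]) > tau else 0)
    return validated
```

An isolated detection such as `[0, 0, 1, 0, 0, 0, 0]` is removed with τ = 2, whereas a sustained run of detections is retained.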

S3.5 Suppression markers
Duration and strength of post-ictal suppression was captured by our suppression markers. Signal range was computed as x_max − x_min in 0.5-second non-overlapping windows. Periods of suppression (calculated in 0.5 second windows) were labelled as majority or partial suppression based on the proportion of suppressed channels: majority suppression was defined as suppression present in over 80% of recording channels, while partial suppression was defined as suppression between 10% and 80% of the channels. Duration of majority and partial suppression were calculated using a 2.5-second moving sum to account for short spikes of activity in suppressed segments; that is, if a short spike of activity lasted for less than 2.5 seconds, those time points would still be labelled as suppressed. The duration was computed as the time following seizure offset with a one-second buffer.
The proportion of seizures with majority suppression differs across seizure types, as reported in Table S3.2. As expected, the proportion of seizures with majority suppression increases with the increasing severity of seizure types. Most seizures were found to have post-ictal partial suppression; the only seizures without such suppression had majority suppression for the entire postictal period.
The partial suppression marker is likely to suggest suppression in seizures as the threshold for suppression is 5% of the preictal mean; therefore, with multiple channels and 120-time epochs per channel, many instances of suppression will be highlighted by chance. We invite future work to adjust our threshold of 5%; for example, a threshold of 1% of the baseline would encounter fewer false positives.
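The labelling rule for each 0.5 second window can be written compactly. In this sketch the boundary handling (exactly 10% and exactly 80% both counted as partial) and the label strings are assumptions; the 2.5 second moving sum and the one-second buffer used for the duration markers are deliberately left out.

```python
def label_suppression(prop_suppressed, majority=0.8, partial=0.1):
    """Label each window from the proportion of suppressed channels.

    prop_suppressed[w] is the fraction of channels flagged as suppressed in
    window w. Returns 'majority', 'partial', or 'none' per window.
    """
    labels = []
    for p in prop_suppressed:
        if p > majority:
            labels.append("majority")
        elif p >= partial:
            labels.append("partial")
        else:
            labels.append("none")
    return labels
```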

S4.1 Across patients
For each marker, three hierarchical logistic regression models were created to validate markers:
• Random patient (RP) effects: Only patient effects are included in the model. This model was used to determine if the distinction between seizure types is driven by patient differences.
• Fixed marker and random patient effects (random intercept) (RI): Here, both fixed marker effects and random patient effects (in the form of random intercepts) are included. This model captures the performance of markers, whilst considering the difference in marker values across patients.
• Fixed marker and random patient effects (random intercept and slope) (RIS): Here, both fixed marker effects and random patient effects (in the form of random intercepts and slopes) are included. This model captures the performance of markers, whilst considering the difference in marker values and changes in marker values between seizure types across patients.
The performance of each marker was assessed using the area under the curve (AUC) receiver operator curve (ROC).An AUC value of 0.7 or greater was considered acceptable, above 0.8 was considered excellent and above 0.9 was outstanding (Mandrekar, 2010).Supplementary Table S4.1 displays the AUC values for each model type in each marker comparing subclinical vs. focal seizures.Models with poor fit to the data, as shown by large deviance values, were removed from analysis (AUC is shown here as NaN).
When comparing focal aware and impaired awareness seizures, there were clear patient differences in the marker values; however, the majority of models created with only patient effects were unacceptable classifiers (AUC < 0.7) or fit the data poorly, suggesting that between-patient differences alone did not account for differences between focal aware and impaired awareness seizures. In contrast, 14 severity markers yielded excellent classifier performance with random intercept models or random intercept and slope models. Supplementary Table S4.2 lists AUC values. In validating markers against focal and FTBTC seizure classifications, all markers created excellent or outstanding classifiers using random intercept models. Supplementary Table S4.3 lists AUC values. However, the sample size of FTBTC seizures was very small (n=6); therefore, the results of this analysis are indicative of good performance, but further testing on a larger data set is required.
We further divided patients into patients with TLE and eTLE to determine if the performance of markers (focal vs. subclinical) was impacted by the lobe in which seizures began. Tables S4.4 and S4.5 display AUC values for hierarchical logistic regression models for TLE and eTLE patients, respectively. Comparing AUC values from all patients and from TLE and eTLE patients separately, these results suggest that the performance of markers is affected differently by various patient features. This finding supports testing the library of markers on each patient to determine if their performance is adequate for the individual.
We repeated this analysis comparing focal seizures with and without impaired awareness. Fig. S4.2A shows a heat-map of r-values, and Fig. S4.2B shows r-values with an associated p-value less than 0.05. Only six patients met inclusion criteria for this analysis. In one patient (U15), there are large effects with p < 0.05 in at least three markers. For patients U22 and G12, two spatial markers (proportion of channels included and at maximum recruitment) had large effect sizes (r > 0.8, p < 0.05). Unlike our focal vs. subclinical analysis, there is not a clear distinction between TLE and eTLE patients. Further studies with a larger cohort are required to confirm these findings. It was not possible to test differences in effect sizes based on patient metadata as too few patients met inclusion criteria.
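For readers unfamiliar with the rank sum effect size, the sketch below computes a Wilcoxon rank sum z statistic with the normal approximation and converts it to r = |z| / sqrt(N), where N is the total number of seizures in both groups. This conversion is a common convention and is assumed here; the text does not state the exact formula used. The function names and marker values are hypothetical, and no tie or continuity correction is applied.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_effect(group_a, group_b):
    """Return (z, r, p) for a two-sided rank sum test.

    Uses the normal approximation without tie correction (assumption).
    """
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n_a])                      # rank sum of group A
    mean_w = n_a * (n + 1) / 2.0
    sd_w = math.sqrt(n_a * n_b * (n + 1) / 12.0)
    z = (w - mean_w) / sd_w
    r = abs(z) / math.sqrt(n)                 # assumed effect size convention
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided normal p-value
    return z, r, p


# Hypothetical log-transformed marker values for one patient.
focal = [5.1, 6.3, 7.0, 8.2, 9.4]
subclinical = [1.2, 2.0, 2.9, 3.3, 4.8]

z, r, p = rank_sum_effect(focal, subclinical)
print(round(z, 3), round(r, 3), round(p, 4))
```

With fully separated groups of five seizures each, r is large and p falls below 0.05, which would count towards the within-patient criterion (r > 0.3, p < 0.05) given in the abstract.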

S4.3 Capturing fluctuations of seizure severity
In this paper, we used our markers to capture and assess changes in seizure severity on circadian and longer timescales. Circular-linear correlation was used to assess changes of severity across the day; Table S4.7 presents circular-linear correlation values for peak markers, all other markers are

Figure 1: Visualising the workflow for calculating peak markers for example patient U22. A) Intracranial EEG traces for a subclinical (orange) and focal (purple) seizure in an example patient, with a subsection of recording channels for visualisation. B) Heat-maps of the line length marker in one-second epochs for seizures in A. C) 95th percentile of line length measures for each channel across time. D) Bee-swarm representation of the same data as C, with a few more example seizures in this patient. Grey arrows point to the maximum value across channels; this is the peak value for the seizure. E) Log-transformed peak line length values (maximum channel value across 95th percentiles), as indicated by grey arrows in D, in five example patients; each data point represents a seizure.
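The workflow in Figure 1 can be sketched in a few lines of Python. This is an illustrative reading of the caption rather than the library's code: line length is taken as the sum of absolute sample-to-sample differences within each non-overlapping one-second epoch, the percentile uses linear interpolation, and the natural logarithm is used for the transform. All of these choices, and the function names, are assumptions.

```python
import math


def line_length(samples):
    """Sum of absolute differences between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))


def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def peak_line_length(seizure, samples_per_epoch):
    """Log of the maximum, across channels, of each channel's
    95th percentile of per-epoch line length.

    seizure: list of channels, each a list of samples.
    samples_per_epoch: samples in one epoch (the sampling rate
    for one-second epochs).
    """
    channel_p95 = []
    for channel in seizure:
        last_start = len(channel) - samples_per_epoch
        epochs = [
            channel[i:i + samples_per_epoch]
            for i in range(0, last_start + 1, samples_per_epoch)
        ]
        per_epoch = [line_length(epoch) for epoch in epochs]
        channel_p95.append(percentile(per_epoch, 95))
    return math.log(max(channel_p95))


# Toy recording: 2 channels, 2 epochs of 4 samples each.
quiet_channel = [0, 1, 0, 1, 0, 1, 0, 1]
active_channel = [0, 5, 0, 5, 0, 1, 0, 1]

peak = peak_line_length([quiet_channel, active_channel], samples_per_epoch=4)
print(peak)
```

Taking the 95th percentile before the maximum makes the peak less sensitive to a single outlying epoch than a plain maximum over all epochs and channels would be.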

Figure 2: Visualising spatial markers for example patient U22. A & C) Intracranial EEG traces of an example focal/subclinical seizure with a subset of recording channels. B) Corresponding binary map of seizure imprint (yellow indicates seizure activity, green no seizure activity) across time in the same subset of channels as in A & C. E) Swarm plot of the proportion of channels with seizure activity at any point in the seizure for all seizures in five example patients. F) Swarm plot of the proportion of channels with seizure activity at the point of maximum recruitment for all seizures for five example patients.
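Given a binary seizure imprint like the one in panel B, the spatial markers can be derived with simple counting. The sketch below is a hypothetical illustration: the function name, the dictionary keys, and the choice to take the first epoch at which the maximum number of channels is active as the point of maximum recruitment (MR) are assumptions, and how the imprint itself is detected is outside its scope.

```python
def spatial_markers(imprint):
    """Spatial markers from a binary imprint (channels x epochs).

    imprint[c][t] is 1 (or True) when channel c shows seizure
    activity in epoch t.
    """
    n_channels = len(imprint)
    n_epochs = len(imprint[0])

    # Proportion of channels active at any point in the seizure.
    prop_included = sum(1 for channel in imprint if any(channel)) / n_channels

    # Number of simultaneously active channels in each epoch.
    active_per_epoch = [
        sum(imprint[c][t] for c in range(n_channels)) for t in range(n_epochs)
    ]
    max_active = max(active_per_epoch)
    # Assumption: MR is the first epoch reaching the maximum count.
    time_to_mr = active_per_epoch.index(max_active)

    return {
        "prop_included": prop_included,
        "prop_at_mr": max_active / n_channels,
        "time_to_mr": time_to_mr,
        "prop_seizure_to_mr": time_to_mr / n_epochs,
    }


# Toy imprint: 4 channels x 5 epochs.
imprint = [
    [1, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]

markers = spatial_markers(imprint)
print(markers)
```

Here every channel is recruited at some point, yet at most three of the four are active simultaneously, so the two proportion markers differ.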

3.4.1 Validating markers against ILAE seizure classification
International League Against Epilepsy (ILAE) seizure classification (Fisher et al., 2017b) was used as a validation for seizure severity. Our main analyses compared focal vs. subclinical seizures and focal aware vs. impaired awareness seizures; supplementary analyses are shown comparing focal vs.

Figure 3: Visualising suppression markers for example patient U22. A) Intracranial EEG traces of example subclinical (orange) and focal (purple) postictal segments with a subset of recording channels. B) Corresponding binary maps of channels with suppression (< 5% of preictal activity levels) in the same subset of recording channels. C) Proportion of suppressed channels across 120 seconds of postictal activity. Segments of majority suppression and partial suppression are highlighted. D) Swarm plot of (log-transformed) majority suppression duration for all seizures for five example patients. E) Swarm plot of (log-transformed) partial suppression duration for all seizures for five example patients.

Figure 4: Validating markers against ILAE classification across patients. A) Heat-map of AUC values for hierarchical logistic regression models comparing focal and subclinical seizures. B) Heat-map of AUC values for hierarchical logistic regression models comparing focal seizures with and without loss of awareness (LoA).

Figure 5: Validating markers against ILAE classification on a within-patient basis. A) Wilcoxon rank sum test r-values obtained through comparing focal and subclinical seizures. Each row is a patient, and each column is a marker. Patients were sorted by descending r-values within the TLE and eTLE groups. B) Same as A, filtered by p < 0.05. C) Paired bar chart displaying counts of focal and subclinical seizures for each patient included in within-patient validation.

Figure 6: Detecting circadian and longer-term modulation of seizure severity. A) iEEG recordings for a day-time (blue) and night-time (pink) seizure from example patient U14. B) Plot of marker against time of day for line length and postictal suppression strength. Pink background indicates evening/night, whilst blue background indicates daytime. C) Dot plot of scaled circular-linear correlation coefficients between markers and time of day across included patients. P-values < 0.05 obtained through permutation tests are highlighted in black. D) Dot plot of absolute Spearman's rank correlation coefficient between markers and time in EMU across included patients. Correlations with p-values < 0.05 are highlighted in black.
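The circadian analysis in panel C can be illustrated with a short sketch of a circular-linear correlation and a permutation test. The coefficient below is the standard form built from Pearson correlations of the marker with the cosine and sine of the time-of-day angle; whether the library uses exactly this form, and how the "scaled" coefficients in the figure are derived, is not stated here, so treat this as an assumption. The function names, the toy data, and the (count + 1) / (n + 1) p-value convention are also hypothetical; the default of 1000 permutations follows the caption of Table S4.7.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def circ_lin_corr(hours, values):
    """Circular-linear correlation between time of day and a marker."""
    angles = [2.0 * math.pi * h / 24.0 for h in hours]
    cosines = [math.cos(a) for a in angles]
    sines = [math.sin(a) for a in angles]
    r_xc = pearson(values, cosines)
    r_xs = pearson(values, sines)
    r_cs = pearson(cosines, sines)
    numerator = r_xc ** 2 + r_xs ** 2 - 2.0 * r_xc * r_xs * r_cs
    return math.sqrt(numerator / (1.0 - r_cs ** 2))


def permutation_p(hours, values, n_perm=1000, seed=0):
    """Permutation p-value: shuffle marker values across seizure times."""
    rng = random.Random(seed)
    observed = circ_lin_corr(hours, values)
    shuffled = list(values)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if circ_lin_corr(hours, shuffled) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


# Hypothetical seizure times (hours) and marker values peaking near midnight.
hours = [0, 3, 6, 9, 12, 15, 18, 21]
values = [13.1, 12.0, 10.2, 7.7, 7.1, 8.0, 9.8, 12.3]

rho, p_value = permutation_p(hours, values)
print(round(rho, 3), p_value)
```

Because the coefficient is non-negative by construction, a permutation test is a natural way to judge whether an observed value exceeds what chance alignment of seizure times and marker values would produce.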
scalp EEG and other modalities are planned, and with our open code-base on GitHub we welcome contributions from the community. As recordings took place in EMUs, patients were also under non-normal conditions during recordings; anti-seizure medications (ASMs) are often tapered, and patients are potentially under an increased amount of stress. Future work might use continuous recordings to capture the full range of interictal brain dynamics to better estimate spatial and suppression properties of seizures. Future work should also investigate the three-way relationship between severity markers, seizure type, and circadian influences. Further, electrographic activity can fluctuate for weeks following electrode implantation (Ung et al., 2017), although the preictal baseline that we applied for spatial and suppression markers may render those markers less sensitive to such fluctuations. Future work needs to disentangle the biological, technological, and pathological influences on EEG biomarkers; this remains an open challenge for various applications. The results of this work may have been influenced by such fluctuations, especially in modulation analyses. Regardless, our results

Figure S4.1: Validating markers against ILAE classification across patients. A) Heat-map representing the proportion of the bootstrapped duration distribution below the observed AUC for all other markers for random intercept and random intercept and slope models comparing focal vs. subclinical seizures. B) Heat-map representing the proportion of the bootstrapped duration distribution below the observed AUC for all other markers for random intercept and random intercept and slope models comparing focal seizures with and without impaired awareness.

Figure S4.2: Validating markers against focal seizures with and without impaired awareness on a within-patient basis. A) Heat-map of Wilcoxon rank sum r-values comparing focal seizures with and without impaired awareness within patients. B) Same as A, showing only r-values with an associated p-value < 0.05.
and future work: The patients included in this study are presurgical candidates with refractory focal epilepsy; therefore, our library needs to be expanded and tested in other epilepsy syndromes. The use of iEEG allows for good signal quality but does not capture activity beyond a small part of the brain. Electrode placement was determined by clinical need, and therefore the location of electrodes varied across patients. This variability means that spatial markers do not represent the same information in different patients, and thus hierarchical statistical approaches are needed to compare markers across patients. Future work could use simultaneous scalp and intracranial EEG to validate markers of spread based on iEEG in different anatomical regions. Within this work, spatial markers based on activity in regions of interest (ROIs) rather than individual channels were considered; unfortunately, electrode location was not available for all patients. We opted to keep our channel-based spatial and suppression markers to maintain our sample size. Further research including a larger cohort with available electrode location information is required to confirm if spatial markers derived from ROIs could be used to capture seizure severity. Our methods could be extended to sub-scalp EEG with some alterations to account for lower spatial coverage. Although the lower coverage presents a challenge, previous studies report encouraging findings. For example, Parvez and Paul (2015) predicted seizure occurrence using only six recording channels per patient. Furthermore, recordings from only 16 locations captured critical slowing (Maturana et al., 2020), giving evidence that alterations in EEG around seizures can be captured with few electrodes. Extension of our library to

Table S2.1; the seizure types experienced by each individual are listed in Supplementary Table S2.2.
Table S2.1: Table of counts of seizure types in dataset.
Table S2.2: Table of distribution of seizure types for individual patients.
.819 0.772 0.618
Energy 0.768 0.609 0.638
δ band-power 0.795 0.617 0.650
θ band-power 0.688 0.785 0.500
α band-power NaN 0.781 0.635
β band-power 0.778 0.747 0.644
Low-γ band-power 0.857 0.671 0.594
High-γ band-power 0.897 0.631 0.543
Prop. chan. included 0.916 0.523 0.523
Prop. chan. at MR 0.917 0.503 0.502
Time to MR NaN 0.830 0.514
Prop. of seizure to MR NaN 0.849 0.529
Major. suppr. duration 0.873 0.798 0.717
Part. suppr. duration NaN 0.958 0.819
Suppr. strength NaN 0.973 0.870
Duration NaN 0.809 0.551
Table S4.4: Table of area under the curve (AUC) values for TLE-only across-patient validation against ILAE classification for subclinical and focal seizures. Outstanding performance is marked in bold. Unacceptable performance is marked in grey. Shorthand: RP (model only uses random patient effects), RI (model includes both fixed marker and random patient effects using random intercept), RIS (model includes both fixed marker and random patient effects using random intercept and slope).
Table S4.7: Circular-linear correlation between markers and time of day of seizure occurrence for peak markers. Correlations with p < 0.05 based on permutation test with 1000 permutations marked in bold.