A machine-learning tool to identify bistable states from calcium imaging data

Mapping neuronal activation using calcium imaging in vivo during behavioral tasks has advanced our understanding of nervous system function. In almost all of these studies, calcium imaging is used to infer spike probabilities since action potentials activate voltage-gated calcium channels and increase intracellular calcium levels. However, neurons not only fire action potentials, but also convey information via intrinsic dynamics such as by generating bistable membrane potential states. While a number of tools for spike inference have been developed and are currently being used, no tool exists for converting calcium imaging signals to maps of cellular state in bistable neurons. Purkinje neurons (PNs) in the larval zebrafish cerebellum exhibit membrane potential bistability, firing either tonically or in bursts. Several studies have implicated the role of a population code in cerebellar function, with bistability adding an extra layer of complexity to this code. In this manuscript we develop a tool, CaMLSort which uses convolutional recurrent neural networks to classify calcium imaging traces as arising from either tonic or bursting cells. We validate this classifier using a number of different methods and find that it performs well on simulated event rasters as well as real biological data that it had not previously seen. Moreover, we find that CaMLsort generalizes to other bistable neurons, such as dopaminergic neurons in the ventral tegmental area of mice. Thus, this tool offers a new way of analyzing calcium imaging data from bistable neurons to understand how they participate in network computation and natural behaviors. Key Points Summary Calcium imaging – the gold standard of inferring neuronal activity – does not report cellular state in neurons that are bistable, such as Purkinje neurons in the cerebellum of larval zebrafish. We model the relationship between Purkinje neuron electrical activity and its corresponding calcium signal to compile a dataset of state-labelled simulated calcium signals. We apply machine-learning methods to this dataset to develop a tool that can classify the state of a Purkinje neuron using only its calcium signal, which works well on real data even though it was trained only on simulated data. CaMLsort also generalizes well to bistable neurons in a different brain region (ventral tegmental area) in a different model organism (mouse). This tool offers a new way of analyzing calcium imaging data from populations of bistable neurons, thereby facilitating our understanding of how these neurons carry out their functions in a circuit.


Introduction
Calcium imaging enables us to sample the activity of populations of neurons and even whole brains during animal behaviour (Ahrens et al., 2013;Grewe and Helmchen, 2009;Grienberger and Konnerth, 2012), providing us a window into the functioning of the nervous system. When a neuron fires an action potential, somatic calcium levels increase dramatically due to the opening of voltage-dependent calcium channels. Simultaneous electrophysiology and calcium imaging recordings of neuronal activity demonstrate the correspondence of somatic calcium imaging signals with action potential firing (Chen et al., 2013;Kerr et al., 2005;Kwan and Dan, 2012;Tian et al., 2009). Nevertheless, the calcium imaging signal is at best a delayed, low-pass filtered and non-linearly transformed proxy for neuronal activity. Non-linearities are introduced due to the dynamics of calcium release, and the kinetics of calcium binding and unbinding by the sensor molecules (Ali and Kwan, 2020). Yet, this technique is useful in cases where the neurons being studied have low baseline firing rate. However, this is not often the case and several neuronal classes, such as cerebellar Purkinje neurons and cortical interneurons are known to fire at very high spike rates spontaneously (Armstrong and Edgley, 1984;Markram et al., 2004). Further, many neuronal types have interesting intracellular dynamics apart from action potential firing that might strongly modulate calcium imaging signals (Lin et al., 2007;Moreaux and Laurent, 2007). Lastly, confounds may be introduced when more than one type of electrical event leads to somatic calcium increase Ramirez and Stell, 2016).
Purkinje neurons (PNs) show all of the features mentioned above that make the interpretation of calcium signals in these neurons especially difficult. PNs are principal neurons of the cerebellum, integrate thousands of synaptic inputs and are indispensable for cerebellar function (Narayanan and Thirumalai, 2019). Therefore, studying PN population activity during motor behaviours is critical to deciphering cerebellar computation. PNs fire simple spikes and complex spikes -in mammals, climbing fiber (CF) input from the inferior olive leads to calcium entry within the dendrites and several spikelets that ride the depolarization wave (Kitamura and Häusser, 2011;Miyakawa et al., 1992); in fish, the olivary input causes an all-ornone AMPAR-mediated giant synaptic potential with concomitant calcium entry but without any spikelets (Sengupta and Thirumalai, 2015). While the CF input causes a relatively large calcium signal, bursts of simple spikes can also generate calcium signals via the activation of somatic voltage-dependent calcium channels, albeit smaller in amplitude (Knogler et al., 2019;Sengupta, 2015). This means that optically recorded calcium dF/F signals cannot be uniquely assigned to simple spikes or CF inputs.
In addition to the above, PNs exhibit membrane potential bistability -PNs can exist in one of two stable membrane potentials. At a depolarized state, they generate action potentials at a more or less regular rate and are said to be tonic. When hyperpolarized, they generate bursts of action potentials (Sengupta and Thirumalai, 2015). Simple spikes in the bursting state are highly correlated with motor bouts whereas those in the tonic state are not (Sengupta, 2015). This implies that bistability has important ramifications for how motor bouts are represented across the entire PN population. We wondered if calcium imaging can be used to deduce the state of PNs. Since both CF inputs and simple spikes contribute to calcium signals, assigning optically recorded dF/F signals is not straight-forward and deduction of spike rate using existing spike inference algorithms is not possible.
Here, we first performed simultaneous electrophysiological recordings and calcium imaging to obtain ground truth data. Unsurprisingly, we found that none of the usually measured parameters such as peak amplitude, area under the curve etc., of the calcium signal uniquely correlated with cellular state. We therefore built a sorter using convolutional recurrent neural networks to identify cellular state from dF/F data. We have tested our sorter with data that was not part of the training set and validated that it performs at above 90% accuracy and with F1 scores greater than 0.8 in most cases. Our method demonstrates that calcium imaging datasets can be mined using such tools to glean greater insights into neuronal activity states.

Zebrafish cerebellar Purkinje neurons are spontaneously active and exhibit membrane potential bistability
Purkinje neurons (PNs) are the principal neurons in the cerebellar circuit (Fig. 1A). In larval zebrafish, they are spontaneously active and exhibit two kinds of membrane potential events that can be distinguished solely based on their amplitude in both extracellular and intracellular recordings (Sengupta and Thirumalai, 2015) (Fig 1B and 1C). The smaller amplitude events are action potentials and are called simple spikes ( Fig 1B-E, marked by cyan dots), while the larger amplitude events are all-ornone synaptic input from climbing fibers (Fig 1B-E, marked by magenta dots). Moreover, these neurons exhibit membrane potential bistability, i.e. they are found to be at one of two stable membrane potentials and fire either tonically or in bursts, depending on the membrane potential. Both these states can be distinguished visually from extracellular loose-patch recordings based on (i) the firing frequency of simple spikes and (ii) the presence of long intervals of inactivity in the burst mode ( Fig 1B, left) which are not present in the tonic mode (Fig 1B, right). The same features can be used to distinguish between the two modes in whole-cell patchclamp recordings, along with an extra feature -membrane potentials are more hyperpolarized in the bursting mode (Fig 1C, left) as opposed to the tonic mode ( Fig  1C, right). Cells can spontaneously switch between these two states, but can also be taken from one state to the other by current injection -bursting cells become tonic upon depolarizing current injection, whereas tonic cells fire in bursts when injected with hyperpolarizing currents.

Calcium imaging reports both simple spike bursts and individual CF inputs, independent of state
Given that PNs manifest two distinct activity patterns (tonic and bursting) and two types of electrical events (simple spike and CF events), we wondered if calcium imaging signals can reliably report these distinct patterns of activity. To do this, we performed simultaneous wide-field calcium imaging and targeted patch-clamp electrophysiology from zebrafish PNs. We used the carbonic anhydrase (Ca8) enhancer element and a minimal promoter (cfos) to drive expression of the genetically encoded calcium sensor GCaMP5G in PNs (Matsui et al., 2014). Embryos were microinjected at the 1-2 cell stage and those larvae that showed sparse labelling in PNs were selected for targeted patch-clamp electrophysiology and imaging (Fig 2A). We chose to perform extracellular, loose-patch recordings so as to be minimally invasive and to avoid the dialysis of cellular contents including GCaMP. We obtained such recordings from 15 cells in 12 larvae, out of which 9 cells were naturally in the tonic state, and 6 cells, in the bursting state.
Representative traces from this experiment for cells in the bursting and the tonic states are shown in Fig 2B and 2C, respectively. The first observation we made was that many of the calcium transients (marked with a hash symbol in Fig 2B and 2C) had an almost one-to-one correspondence with CF inputs (magenta lines in the rasters) in cells of both states. In tonic cells, most simple spikes are not reported as transients, as one would expect from cells that fire at high frequencies. However, in some cases, calcium transient peaks occurred in the absence of CF inputs, during times of simple spike firing (Fig 2C, see asterisk and cyan lines in raster). Similarly, in bursting cells, we found some calcium transients associated only with a burst of simple spikes (Fig 2B, asterisk). Naturally, then, one would expect to see calcium transients when CF inputs co-occurred with a burst of simple spikes, which is what we observed (marked with open circles in Fig 2B and 2C). Thus, regardless of the state of a cell, we found calcium transients that corresponded to individual CF inputs, bursts of simple spikes or a combination of both.
To quantify the correspondence between calcium imaging signals and simple spikes or CF events, we first measured the average calcium signal triggered with respect to individual simple spikes (Fig 2D(i)) or CF inputs (Fig 2D(ii)). We found that the calcium signal's decay is the dominant response when triggered to individual simple spikes. This may be attributed to the high frequency of simple spikes observed in these cells. When aligned to CF inputs, however, there is a modest increase in calcium response after the CF input, followed by a decay, which is the typical response of a calcium sensor. It is interesting to note here that the baseline calcium signal preceding either of the two events is roughly the same. These general features of the event-triggered calcium signal remained the same when data was sorted by cellular state (Supp. Fig. 2.1) While this method confirms that a calcium transient occurs after a CF input, suggesting a causal link between the CF input's arrival and the measured calcium influx into the cell, it is not useful in the case of simple spikes. Firstly, there are multiple simple spikes per frame -approximately 2-3 spikes per frame when imaged at 30Hz. This is further worsened by the timescales of GCaMP5G's rise and decay, which occurs on the order of 100s of milliseconds. Moreover, one cannot separate out the individual effects of simple spikes and CF inputs because the two events are interspersed and thus, when calculating the triggered average response for one event, there is bleed-through from the other.
As an alternate measure, we decided to look at the number of simple spikes and CF inputs relative to the peak of the calcium signal using peri-event time histograms ( Fig  2E). GCaMP5G has a rise time constant of ~100ms, which means that the time it takes to reach a peak value from the baseline is ~267ms. This time point is marked in the histograms with a vertical dashed line. The y-axes of these histograms represent the z-score for each bin. We calculated the average z-score of bins 200-400ms before the GCaMP peak (which is the window in which the rise in the signal would have started, Fig 2E, yellow shaded area) and the average z-score of bins +/-100ms around the peak (which is roughly when the signal would start decaying, Fig  2E, grey shaded area). It is clear from these that there is a greater count of both simple spikes and CF inputs right before the GCaMP signal starts to rise, which suggests that the occurrence of both kinds of electrical events in a PN could lead to a calcium transient. We checked if this effect was state-dependent, and we find that it is not, since the histograms do not differ much between tonic and bursting cells (Supplementary Figure 2.2).
In summary, it is difficult to infer the underlying event train or even the cellular state of a Purkinje neuron directly from its calcium signal, as there is no one-to-one correspondence between calcium transients and their underlying electrophysiological events, as has also been reported previously (Knogler et al., 2019;Sengupta, 2015).

The calcium signal for a Purkinje neuron can be reconstructed from its electrophysiological recording
Though the calcium signal in PNs could arise from more than one electrical event, we hypothesised that there might be sufficient information in the calcium signal time series to allow us to determine whether it is tonically firing or bursting. One possible approach to this problem could be the application of machine learning algorithms to build a classifier that labels cells as tonic or bursting given a time series of calcium transients. For this, it was first necessary to generate a larger ground truth dataset comprising the electrical activity patterns and their corresponding calcium transients for training the classifier. Mathematically, the calcium signal obtained by imaging can be thought of as the result of the convolution of spike trains with a calcium sensor kernel having a fast rise and a slow decay and described as a difference of exponentials. This transformation has been shown to work well in cells that are not spontaneously active, where every single action potential produces a large increase in intracellular calcium concentration (Rupprecht et al., 2021;Vogelstein et al., 2010;Yaksi and Friedrich, 2006). We hypothesised that a similar transformation could be used to reconstruct the calcium signal profiles of PNs from their electrophysiological recordings alone. Such reconstructed calcium imaging traces with their corresponding electrical recording traces could then be applied to build the classifier.
However, given that these neurons have not one, but two electrophysiological events, each contributing differently to the calcium signal obtained from imaging, we must assign each event a separate kernel (Fig 3A), which we assume to be independent of each other. This is a valid assumption, since simple spikes and CF events arise from distinct intrinsic mechanisms (Sengupta and Thirumalai, 2015). We used simultaneous imaging and electrophysiology recordings to find optimal kernels for simple spikes and CF events. For each PN recording, we obtained simple spike and CF event rasters. Each event raster was then independently convolved with its own exponential response kernel, and the resulting "event-specific signals" were added to produce the final reconstruction ( Fig 3A). We then calculated the correlation coefficient and the mean squared error (MSE) between the reconstructed calcium signal time series and the experimentally obtained calcium imaging time series.
With this description in place, the problem then becomes one of finding optimal coefficients for each response kernel. This optimization was done by gradient descent to minimise the MSE between the ground truth and the reconstruction. The process was performed independently for each cell in the dataset. Then, these optimal kernel combinations were used to make reconstructions for other cells in the dataset, and the kernel that generalised best across all cells, as quantified by least total MSE and highest total cross-correlation between the ground truth and reconstruction (Supplementary Figure 3), was taken to be the best kernel for the next steps. Kernel numbers 7 and 8 were very close to each other in both scores (Supplementary Figure 3), so kernel 7 was chosen for further analysis.
Although this is a simple forward model with calcium dynamics assumed to be linear and time-invariant, we found that this method to approximate the calcium signal works well, no matter the state of the neuron (Fig 3B). The quality of the reconstructions has been quantified by measuring the correlation coefficient between the ground truth and the reconstruction (Fig 3C). To benchmark the correlation coefficient values obtained against those obtained simply by chance, we also calculated the correlation coefficient between the ground truth and a scrambled version of the reconstruction, as well as that between the ground truth and a reconstruction generated after shuffling all the event times in the recording (Fig 3C). In both the latter cases, the correlation disappears, suggesting that the reconstructions generated by this forward model are, indeed, faithful reconstructions.
Given that only groups/bursts of simple spikes produce a calcium transient amplitude comparable to that generated by an individual CF input, one would expect a much smaller response kernel for simple spikes than for the CF input. Indeed, the optimal kernels found by this method show that the CF input response kernel is approximately 7.5 times larger than the simple spike response kernel (Fig 3D).

A state-labelled calcium signal dataset was generated by reconstructing the calcium signal for a compilation of electrophysiological traces
Having a method of converting any given PN electrophysiology trace to its corresponding calcium imaging trace allowed us to generate simulated calcium imaging traces for a compilation of all electrophysiological recordings acquired in our lab ( Fig 4A). To do so, we first pooled all the electrophysiological recordings, both extracellular and whole-cell, made during the period of 2011-2020, for a total of N=173 recordings. Recordings of poor quality, i.e., those with interruptions, or noisy baselines, unusually broad/filtered events, or indistinguishable simple spikes and CF inputs (N=37) were first excluded from the dataset. Retained recordings then had a state assigned to them, including state switches, if there were any. This collection was split into two sets, one with those recordings which had only one state throughout (N=131), and another with those which had a state switch (N=5). Events were detected in all the recordings, and then the calcium signal reconstructed, to finally produce two labelled datasets -a state-labelled reconstructed calcium signal dataset and an independent dataset with switches. The recordings used to generate this dataset were of different durations with a median duration of 60 seconds. In order to get multiple samples per recording, we decided to split the traces into 10second-long non-overlapping chunks, representative samples from which are shown in Fig 4B. This produced a dataset of state-labelled calcium imaging traces (n=928 traces; n=580 bursting, n=348 tonic) that is much larger than could have been obtained by simultaneous imaging and electrophysiology of individual cells. The entire complement of non-overlapping samples is shown as a heatmap in Fig 4C. To test whether there were properties of these traces that could be used to unambiguously classify their source state, we compared various properties of the traces between the two classes -the number of peaks, average peak amplitude, area under the curve, mean, standard deviation and coefficient of variation ( Fig 4D). We found that most of the properties of the traces between the two classes were largely similar. While several of these differences are statistically significant (Table  2), they have small effect sizes, and our aim was not to just find properties that are statistically distinguishable from each other but to be able to classify a trace as having come from either a tonically firing cell or a bursting cell. To do so, we first performed a PCA on the dataset of trace features and found that they had large overlaps ( Fig 4E). Thus, linear methods of separating the two classes weren't successful and we needed to consider more complex models to solve this classification problem.
Machine learning methods have successfully been used to solve such problems when the data points are not linearly separable (He et al., 2016). Thus, we turned to developing a machine learning model to identify the state of a cell from a calcium signal time series.

Training a convolutional recurrent neural network to distinguish state from calcium imaging data alone
To find good machine learning models to solve this classification problem, we performed 5-fold cross validation, which is used to ensure that models generalised well and do not overfit the training data ( Fig 5A). On each fold, only 60% of the training traces were shown to the network in the training phase ( We tested several different machine learning models on this classification problem -Support Vector Machines (SVMs), Deep Neural Networks (DNN) and Convolutional Neural Networks (CNNs). The prediction accuracies and F1 scores (a metric that adjusts for class imbalances as well as whether a prediction was a true/false positive/negative, defined in the methods section) for the various models tested are shown in Table 3. Of these models, the 1D CNN model performed best across all folds. However, these are only on snapshots of 10s-long snippets of the calcium trace, and the networks had more erroneous performance when predicting on the entire time series (data not shown). To overcome this, we decided to use a CNN-LSTM model. LSTM (Long short-term memory) networks (Hochreiter and Schmidhuber, 1997) are a kind of recurrent neural network (RNN), which has the ability to learn features across an entire sequence and not just across neighbouring positions in a trace sequence, like CNNs do.
The architecture of the network is shown in Fig 5B. The input to the network is a trace sampled at 30Hz, which is some multiple of 10 seconds in duration. It is first remapped to values between 0 and 1 using min-max normalisation and then processed by a CNN, which acts as a local feature extractor that also down samples the input data, such that it produces 4 vector outputs at a 1Hz sampling frequency. These outputs are passed to a bidirectional LSTM, one time step at a time. A forward LSTM cell looks at trends in the forward direction ( Fig 5B, blue dotted box and arrow), while a reverse LSTM cell goes through the data in the reverse direction ( Fig  5B, red dotted box and arrow). Each of the two cells produces a "hidden state" as its output. Given that the LSTM is a recurrent neural network, this output is fed back into the network along with the next appropriate time step from the CNN's output, a process which is repeated 9 more times (for every 10 second trace). In essence, the LSTM is detecting trends across the entire sequence. The final hidden states from the forward and reverse LSTMs, produced after passing through the entire trace, are retained and concatenated. These outputs are combined by passing them through a Linear layer, which performs a matrix multiplication to produce an output that is 10 samples long. At this step, each sample represents the likelihood that the trace in question comes from a cell in the bursting mode, predicted at 1 second intervals. The likelihood of each time step having come from the tonic state can then be inferred ( Fig 5B, grayscale boxes on the right). The state with the larger likelihood score is taken to be the final classification at each time step.
Five networks were produced, one for each fold of cross-validation. The median accuracy was 80.15% for the cross-validation phase and 82.02% for the test phase ( Fig 5C, top). The median F1-score was 0.7302 for the cross-validation phase and 0.8071 for the test phase (Fig 5C, bottom). We looked at the posterior scores of the predicted class in both the cross-validation and test stages ( Fig 5D). In almost all the cases, the posterior scores of a network are higher in the true positive and true negative cases as opposed to the false positive and false negative cases. Of all the network folds, Fold 2 (orange) seemed to be particularly promising, because it consistently had high posterior scores, and had much higher scores for true positives and negatives when compared to false positives and negatives. Put together, this means that not only are these networks being trained well and making accurate predictions, but also confident about the predictions they made.
To be sure that networks weren't performing well just because of peculiarities in the dataset used for training and testing, we presented an independent set of calcium signal reconstructions to the network that were not used in any phase of the training. These reconstructions also had state switches in them ( Fig 6A). Networks performed fairly well even in these cases, with median F-1 scores higher than 0.8 (Fig. 6B). Fold 2 had a median F1-score of 1 and its predictions are shown in Fig 6A. The regions highlighted in red are those where the prediction didn't match the ground truth. What is immediately clear from it is that most of the cells' states were correctly identified, except in those cases where the cell switched state mid-recording, which is underrepresented in our training dataset as well.
As a further challenge to the trained networks, we decided to generate artificial spike trains that represent extreme cases of the tonic or bursting class (Supplementary Figure 6), reconstruct the calcium signal for these spike trains and ask the network to classify these completely synthetic data. We found that most folds of the network performed this task well (Fig 6D). Fold 2, in particular, predicted the state of 20/20 traces correctly (Fig 6C), with only one brief misclassified region, marked in the pink shaded area in Fig 6C.

Neural networks trained only on simulated calcium signals correctly identify Purkinje neuron state from real data
Thus far, all the inputs and challenges to the network were made using calcium signal reconstructions. To test the applicability of these networks on data from experiments, we went back to our original simultaneous electrophysiology-imaging data and asked if the network could call the state of those cells correctly based on the dF/F traces alone.
We found that the networks called the state of 7/15 cells correctly throughout the entire trace duration. In two other cells, more than 50% of the entire trace duration was called correctly (Fig 6E). It is important to note, though, that those traces which were incorrectly identified also had posterior scores that were not only low, but also similar for both classes (Fig 6E bottommost trace).
This outcome is interesting, because the networks have only been exposed to simulated calcium signals generated using a first-order generative model in the training and test phases, and yet are able to reasonably assess the state from real data, which have several challenging features -variable sensor expression levels, shot noise and signal bleaching, to name a few.
Among the five optimal networks identified from the 5-fold cross-validation, network #2 performed the best across all the challenges presented to it ( Fig 5C, Fig 5D, and  Fig 6), and thus is what we have decided to retain for future use, and call it CaMLsort_v1 (Calcium imaging and Machine Learning based tool to sort intracellular state). This will be made available as a Python package on GitHub.
CaMLsort not only makes calls about the state of a neuron but also produces class posterior scores, which essentially show how confident it was for each call. Thus, with some subjectivity on the part of the experimenter in combining the calls with the confidence scores, the error rate may be further reduced.

Discussion
In this paper, we describe the development of CaMLsort, a machine learning-based tool to identify the bistable state of Purkinje neurons from calcium imaging data. We develop this network and train it only on a simulated calcium signal dataset, but find that it can be used on experimental data, too.
The ability of neurons to exist in two stable states and exhibit distinct firing patterns/ excitabilities in those states has been documented in invertebrate (Dickinson and Nagy, 1983;Lechner et al., 1996;Marder et al., 1996;Russell and Hartline, 1982) and vertebrate nervous systems (Cowan and Wilson, 1994;Hounsgaard et al., 1984;Loewenstein et al., 2005;Misgeld et al., 1986;Schwindt and Crill, 1980). Among the many systems in which neuronal bistability has been studied, Purkinje neurons of the cerebellum are particularly interesting because they are spontaneously active. Thus, they switch between firing either tonically or in bursts (Sengupta and Thirumalai, 2015), and such behavior is not only unique to Purkinje neurons but has also been reported in many other cell types such as thalamic (Jahnsen and Llinás, 1984) and cortical neurons (McCormick et al., 1985).
Though it is appreciated that bistability increases the information processing capacity of neurons, such as by allowing gating of synaptic inputs (Thirumalai and Jha, 2022), its implication for network computation or behavioural output is less well understood. This is true of Purkinje neurons as well -while it is established that these neurons are bistable and are important for motor behaviours (Engbers et al., 2013;Popa et al., 2018;Sengupta and Thirumalai, 2015), the role of bistability in Purkinje neuron computation or motor behaviour output has thus far not been clear.
We propose that that having more than one stable state allows a cell to toggle in and out of participating in circuit computations. In the cerebellum, specifically, bistability may have significant consequences by modulating PN neuronal synchrony with other PNs. PNs make weak inhibitory synapses onto deep cerebellar nuclear (DCN) cells in mammals and the eurydendroid cells in fish, with a high degree of convergence (Bae et al., 2009;De Zeeuw et al., 2011;Harmon et al., 2020;Ikenaga et al., 2006;Person and Raman, 2012). DCN cell spike times could be effectively controlled via inhibition from several PNs firing synchronously (Person and Raman, 2012). Since PNs fire tonically when synaptically-isolated and CF inputs can trigger bursts (Sengupta and Thirumalai, 2015), the bursting state is a network-embedded state while the tonic state could be thought of as an isolated state. By toggling between these two states PNs can increase or decrease the levels of population synchrony and thereby affect DCN firing differentially. Mapping state distribution across the PN population during various tasks will lead us to an understanding of how PNs synchronize or desynchronize during these tasks and how that in turn shapes cerebellar output and motor behaviors.

Inferring long-term dynamics from calcium imaging data
Calcium imaging has been one of the core experimental techniques in neuroscience for more than three decades and has seen many major developments in sensor quality, microscopy and analytical algorithms and software (Chen et al., 2013;Göbel and Helmchen, 2007;Peron et al., 2015;Renninger and Orger, 2013;Tian et al., 2009;Yang and Yuste, 2017). Yet, the technique is largely used to identify regions of the brain that respond to presented stimuli or correlate with behavioral output under well-defined experimental conditions. This is resultant from the fact that the predominant mode of transient calcium elevation occurs due to the opening of voltage-gated calcium channels during action potential firing (Borst and Helmchen, 1998;Göbel and Helmchen, 2007). Thus, these calcium transients are quite brief, typically lasting only a few milliseconds and report spiking activity of neuronal populations during sensory stimulation and/or motor output. However, it is clear that neurons also exhibit longer time scale dynamics such as bursting, which carries information for downstream neurons (Zeldenrust et al., 2018). As the currently available calcium sensors themselves are much slower compared to the calcium transients, it is obvious that the recorded dF/F signals must manifest signatures of such long-term dynamics. CaMLsort addresses this issue by looking at longer term signatures in the calcium imaging time series data to identify cellular state.

Broader Applications of CaMLsort
The development of CaMLsort has opened up the possibility of a lot of future research in understanding PN bistability and its role in the function of the cerebellum as a whole. We could now experimentally address, for instance, what causes a switch in a PN's state by doing a high-throughput screen for chemical agents that affect bistability. We know that while bistability is a cell-intrinsic property, state switches could be a result of modulation of network activity. Neuromodulators are key candidates for this, as they can alter cell-intrinsic properties as well as network activity, and could thereby affect the bistability of a neuron, as has been demonstrated in multiple experimental systems (Abbinanti et al., 2012;Conway et al., 1988;Hounsgaard et al., 1988;Lechner et al., 1996;Williams et al., 2002).
Another question we can now directly address is how the state of a cell changes over development. We hypothesise that the tonic state could be a developmentally early state, with the burst-like phenotype developing as they get integrated into the developing cerebellar circuit. By birthdating PNs using a photoconvertible protein like Kaede and imaging their activity with GCaMP, we can use CaMLsort to ask whether younger cells tend to be more tonic or vice versa. While this question could potentially be addressed using electrophysiology, CaMLsort allows us to perform this experiment in a more high-throughput and minimally invasive fashion.
We focused on calcium imaging and electrophysiology data obtained from Purkinje neurons of larval zebrafish where we have shown that the tool predicts cellular state with very high accuracy and reliability (Fig. 6). A natural question is whether prediction of cellular state is useful in cell types other than the Purkinje neuron. A number of cell types exhibit bistability including cortical neurons (Cowan and Wilson, 1994;Egorov et al., 2002), striatal neurons (Misgeld et al., 1986), thalamic neurons (Fuentealba et al., 2005) and spinal motoneurons (Hounsgaard et al., 1984). In all of these cases, cellular state dictates how synaptic inputs are integrated. Therefore, predicting cellular state is essential to understanding circuit computation and here CaMLsort will be immensely useful. It is natural to wonder whether CaMLsort will be equally effective in predicting cellular state in different cell types and in different model organisms. While CaMLsort needs to be tested in other cell types and organisms, our classifier is built using kernels derived from a small ground truth dataset and then used to build a larger database of calcium traces from electrophysiological recordings. A similar approach can be used without much more additional work to create parallel electrophysiology-calcium imaging databases of any of the above-mentioned cell types. Taken together, the development of CaMLsort opens up new avenues of research into the functioning of not just the cerebellar circuit, but also other circuits with bistable neurons.

Limitations of CaMLsort
Having said the above, CaMLsort is not free of limitations. For instance, the generative model we use to train the neural network is a linear, time-invariant model and results may improve with the use of more accurate models, such as ones that estimate the true increase in intracellular calcium concentration resulting from simple spikes and CF inputs. Similarly, our model is trained and tested on GCaMP5G, for reasons of resource availability and experimental feasibility. However, newer generations of the GCaMP family of sensors continue to be developed, the latest being the 8th generation, which have much faster dynamics and a higher signal-to-noise ratio than the GCaMP5G we have used. Whether our results depend on the variant of calcium sensor being used is unclear, but we do not expect it to make a significant difference, because CaMLsort performs a window-wise min-max normalisation prior to identifying state. Since the relative ratio of simple spike to CF input calcium influx shouldn't depend on the sensor used, we predict that CaMLsort will work well regardless of the sensor being used. Moreover, the advantage of the method we have described is that a small ground truth dataset might be all that is necessary to retrain CaMLsort for a new sensor.
Lastly, it could be that CaMLsort may not work independent of the imaging method used. However, this is also easily fixed, either by the acquisition of a small groundtruth dataset, or by comparing the traces obtained from imaging to those in the reconstructions dataset, or by denoising the data prior to prediction using CaMLsort.

Experimental Animals
Experiments were performed on Indian wild type (Ind WT) zebrafish (Danio rerio) and were approved by the institutional animal ethics committee (IAEC) and institutional biosafety committee (IBSC). Embryos were obtained by setting up an incross between adult Ind WT zebrafish raised and housed in a ZebTec multi linking system (Tecniplast, Italy) with a pH setting of 7.8 and conductivity of 1200μS. Larvae were kept in a MIR-154 incubator (Sanyo, Japan) with 14:10h light-dark cycle maintained at 28 o C in E3 medium (composition in mM: 5 NaCl, 0.17 KCl, 0.33 CaCl 2 , and 0.33 MgSO 4 , pH 7.8) in standard 90mm Petri dishes (Tarsons, Kolkata, India). The medium was routinely replaced, and starting at 5 dpf, larvae were fed Zeigler Larval diet AP100 (<100 microns) (Pentair AES, FL, United States). Larvae have not yet undergone sex specification by these stages, so experiments performed were agnostic to the sex of the animal.

Transient transgenesis by microinjection
Thin-walled borosilicate capillaries (1.0mm OD, 0.56mm ID; Warner Instruments, Hamden, CT, United States) were pulled using a Flaming-Brown P-97 pipette puller (Sutter Instruments, Novato, CA, United States) to produce long and thin pipettes for microinjection. Indian WT embryos at the 1-2 cell stage were injected with ~10nL of a mixture of tagRFP-T:PC:GCaMP5G plasmid (Matsui et al., 2014) (50 ng/μL) and Tol2 transposase mRNA (50 ng/μl) (Urasaki et al., 2006), using a Picospritzer III (Parker Hannifin, NH, United States) pressure injection apparatus, and raised as described above. Larvae were screened at 5 dpf at an epifluorescence dissection microscope (Olympus SZX-16) and those showing mosaic labelling of Purkinje neurons were selected for experiments, which were performed at 7 dpf at room temperature (25-28 o C).

Electrophysiology
Larvae were anaesthetised in 0.01% MS-222 (Sigma-Aldrich; Missouri, USA) and transferred to a custom-made acrylic recording chamber. Fine tungsten wire (California Fine Wire, CA, USA) was used to pin the larva onto a piece of Sylgard (Dow Corning, Midland, MI, United States) glued to the recording chamber. Two pins were put through the notochord at the level of the swim bladder and the tail, respectively. A third pin was then used to position the larval head dorsal-up. The MS222 was replaced by external solution (composition in mM: 134 NaCl, 2.9 KCl, 1.2 MgCl 2 , 10 HEPES, 10 glucose, 2.1 CaCl 2 , 0.01 D-tubocurarine; pH 7.8; 290 mOsm) and skin on the dorsal surface of the larva was carefully removed using fine no. 5 forceps (Fine Science Tools, Foster City, USA) to expose the brain.
A Flaming-Brown P-97 pipette puller (Sutter Instruments, Novato, CA, United States) was used to pull patch pipettes made of standard-walled borosilicate capillaries with a filament (1.5 mm OD; 0.86 mm ID; Warner Instruments, Hamden, CT, United States), pulled to a final tip diameter of 1-1.5 μ m. For loose patch recordings, pipettes were backfilled with external solution, while for whole-cell recordings, pipettes were backfilled with K-gluconate internal solution (composition in mM: 115 K gluconate, 15 KCl, 2 MgCl 2 , 10 HEPES, 10 EGTA, 4 Mg-ATP; pH 7.2; 290 mOsm). Solutions were filtered using a 0.2μm filter (Millipore, Merck, Germany) before backfilling pipettes, which typically had resistances between 8 and 10 MΩ. Sulforhodamine (Sigma-Aldrich, St.Louis, MO, United States) was added to the patch internal solution at a final concentration of 5μg/mL to facilitate visualisation of cellular morphology post-recording.
Recordings were acquired with a Multiclamp 700B amplifier, Digidata 1440A digitizer and pCLAMP 10 software (Molecular Devices, Sunnyvale, CA, United States). Data were low pass filtered at 2 kHz and sampled at 20-50 kHz with a gain of 1 (whole-cell recordings) or between 20-1000 (loose patch recordings), so as to optimally use the dynamic range of the digitizer.

Simultaneous calcium imaging and electrophysiology
Indian WT larvae microinjected with the Tol2-tagRFP-T:PC:GCaMP5G construct (gift from Dr. Hideaki Matsui, Niigata University, Japan) and screened for sparse labelling were used for this experiment. Purkinje neurons expressing GCaMP5G were targeted for the recordings. Widefield imaging was performed using a 60x waterimmersion objective (1.0 NA) at an Olympus BX61WI microscope. Excitation light was provided using a pE300-ultra (CoolLED, Andover, England). Imaging was performed using an EVOLV EM-CCD camera (Photometrics, Tucson, AZ, United States) at the rig, with an exposure of 30 ms, and an on-chip gain of 170. An ROI was drawn around the targeted cell, so that only an isolated cell was imaged. Imaging was synchronised with electrophysiology by interfacing with the Digidata 1440A.
Image analysis was performed in Fiji/ImageJ (Schindelin et al., 2012). In order to identify cell boundaries, the averaged time series was used as a reference and an ROI drawn around the cell. Bleach correction was performed using the Exponential Fitting Method in the Bleach Correction plugin in Fiji. The average raw pixel intensity values within the ROI for each time point was extracted. The baseline intensity, F0, was taken to be the 5th percentile of these values. dF/F was then calculated for each frame using the formula (F -F0)/F0, where F represents the fluorescence in that frame and F0 the baseline fluorescence. Following this calculation, a Gaussian filter was applied to the time series in MATLAB R2019b (Mathworks, Natick, MA), using its function 'smoothdata', with a window size of 15.

Spike detection and analysis
All electrophysiological analyses were done using custom scripts written in MATLAB, using functions defined in a publicly available repository (https://github.com/wagenadl/mbl-nsb-toolbox) to read electrophysiology data. For extracellular recordings, spikes were identified as events that showed large fluctuations relative to a 25ms rolling window baseline and the peak timing and amplitude were extracted. Events were sorted based on amplitude, with small amplitude events being classified as simple spikes and large amplitude events as climbing fiber (CF) inputs, as per (Sengupta and Thirumalai, 2015). For intracellular recordings, peak detection algorithms inbuilt in MATLAB were used to identify spikes. Once again, events were sorted as simple spikes or CF inputs based on their amplitude relative to baseline.

Calcium imaging reconstruction from electrophysiology data
Simultaneous calcium imaging and extracellular electrophysiology data was used to infer optimal GCaMP5G kernels, which are of the general form A1*exp(-t/ decay ) -A2*exp(-t/ rise ), for simple spikes and CF inputs of zebrafish PNs. Simple spike and CF input timings were first extracted from the electrophysiology data for a single cell. Simple spike timings were convolved with a simple spike kernel, with amplitudes A1s and A2s and CF input timings were convolved with a similar CF input kernel, with amplitudes A1c and A2c. The sum of both these convolutions was taken to be the putative calcium imaging reconstruction, downsampled to match the sampling rate of calcium imaging. The mean squared error (MSE) between the reconstruction and the true calcium imaging trace was calculated. The values for A1s, A2s, A1c and A2c were obtained by minimising the MSE between the true calcium imaging trace and the reconstructed trace using gradient descent. To ensure that local minima were avoided, the gradient descent algorithm was repeated with randomised starting points. The kernels with these coefficients were taken to be the optimal simple spike and CF input kernels, respectively, for that cell.
The 'optimal' kernel would be one that generalised best. This was tested by using the kernels from one cell to reconstruct calcium imaging for other cells for which ground truth data was available. As before, the MSE and also the Pearson correlation coefficient between the reconstructed trace and the true trace were calculated. The final 'optimal' kernel was obtained by taking the kernel which generalised the best, i.e., had minimum MSE and maximum correlation across all cells. To test whether the correlation was real or not, the Pearson correlation coefficient between a scrambled reconstructed trace and the true trace was also calculated, for reference. This process of convolving simple spike and CF timings with their optimal GCaMP5G kernels and summating the resultant traces was then used to generate a calcium imaging trace from both extracellular and intracellular Purkinje cell recordings.

Building a state-labelled calcium imaging dataset
Intracellular and extracellular recordings from larval zebrafish Purkinje neurons acquired during the period 2011-2020 by various members of the lab were compiled. Good quality recordings were those in which the resting membrane potential of the cell was not below -80mV or not above -20mV, were uninterrupted gap-free recordings with no current injections and had two clearly distinguishable event types. Only those recordings which met the aforementioned criteria were retained. Besides poor quality recordings, those recordings with an ambiguous, indeterminate state were also excluded. These recordings were renumbered randomly, to remove grouping of data collected by the same individual. Recordings were then classified as tonic or bursting. If there was a switch in state within the recording, the recordings were labelled as "switch", with the appropriate state switch timing and kind (tonic-tobursting, or bursting-to-tonic) marked. A summary of the recording information is in Table 1.
Here, data was split into two groups -one that contained traces only with a single state throughout the recording, and another that contained traces with switches in state. Once split, the spike detection protocol described above was applied to each group of recordings to detect the simple spikes and CF inputs. These event timings were then used to generate simulated calcium imaging traces as described above. Combining the reconstruction with the assigned state resulted in the state-labelled calcium imaging dataset.

Principal Components Analysis
Principal Components Analysis was performed using the PCA module that is part of the sklearn package in Python (Pedregosa et al., 2011). The input to the model was a vector with the trace features for each 10 second long trace in the reconstructions dataset (number of peaks, peak amplitude, area under the curve, mean, standard deviation and coefficient of variation). A 2-component PCA was performed, and the resulting fit (shown in Figure 4E) explained 88% of the variance in the data.

Training the CNN-LSTM model for classification
The dataset of reconstructed calcium imaging time series was first split into 5 folds to perform cross validation (Fig 5A). The experiments are repeated by considering each fold as a test set, with one of the folds being the validation set and remaining 3 folds to be training sets such that the data split for each experiment is as follows -60% for training, 20% for validation, 20% for testing. This splitting was done at the level of entire traces, such that the training, testing and cross-validation sets all had a balanced representation of traces from both the tonic and bursting classes. We chose to do a 5-fold cross-validation, i.e. the entire training dataset is split into 5 parts, each containing 20% of the data. Four of the folds are used for training and the remaining one is used for cross-validation.
The architecture of the network is shown in Fig 5B. The input to the network is the entire trace sequence at 30Hz, normalised using min-max normalisation for every non-overlapping 10 second period, thus setting the minimum to 0, and maximum to 1. The first layer of processing is a CNN with kernel size 30 and stride size 30. The CNN downsamples the input trace from 30Hz to 1Hz, and produces 4 vector outputs, which act as the input to the LSTM. The LSTM is bidirectional, so it looks at the sequence in both the forward and reverse order to produce an output. The LSTM has 24 hidden layers, so it produces 48 numbers per time step (Fig 5B). The feature dimensions are collapsed by a Linear layer by taking their weighted average. This provides the classification logits at each position, which is passed through a softmax layer. The network is optimised using a cross entropy loss and the Adam optimizer (Kingma and Ba, 2017). The output of a trained neural network, then, is the probability of each timestep (at 1 Hz) belonging to the bursting class, from which the probability of it belonging to the tonic class can be derived. The class with the higher probability is assigned to each time step.
Since the model acts on the entire trace sequence, the amount of independent data for training the model is significantly reduced. To overcome this issue, we used a data augmentation strategy, which allows the network to learn features on arbitrary length sequences. For a sequence of length N, we determine a random segment size n, with n between [k, N], where k is the minimum sequence length. We randomly determine a starting position x, with a constraint that x < N -n. In our case, k=300 samples, and n varies in multiples of k (i.e. n = 300, 600, 900…). In each iteration of the network pass during the training phase, this random sequential segment was taken as the input trace to the model and gradient descent was performed to update weights. This was done for all the traces in the training set, so that the network saw entire traces at least once and in a non-overlapping fashion during the 500 epochs of training. The training is stopped before the full 500 epochs, based on early stopping criteria -if the validation loss does not improve after few epochs. All models were trained using the PyTorch framework in Python 3 (Paszke et al., 2019).
For all evaluations -cross-validation, test and other independent validations, the network had to make predictions for the entire trace duration and not just the nsample long chunks. Networks were evaluated using two metrics -classification accuracies and F1-scores. Classification accuracies are simply the fraction of predictions that were correct. F1-scores, on the other hand, are adjusted for class imbalances and account for the kind of errors made. By definition, an F1-score is the harmonic mean of the precision and recall metrics, where precision is the fraction of all predicted positives that were true positives, and recall is the fraction of all ground truth positives that were true positives.

Generating artificial Purkinje neuron spike trains
Synthetic spike trains of the tonic class were generated using a Poisson statistical model. The only parameter needed for this is an average event rate, λ , which we took to be a random number between 4-10 for simple spikes. Spike trains for bursting neurons do not follow Poisson statistics. Hence, we used an inverse sampling method to generate spike trains of the bursting class. For this, we used the compiled electrophysiology data to obtain the real cumulative distribution of interevent intervals for simple spikes across all bursting cells. A random number, y, is generated from the uniform distribution and it is used to find a value x from the reference cumulative interspike interval (ISI) distribution such that P(ISI <= x) = y.
Thus, by generating events sequentially with a given next interevent interval, we obtained artificial simple spike trains for the bursting class.
CF input event trains were generated using a Poisson statistics model with an average event rate between 0.2-1. The distributions of interevent intervals in the synthetic datasets were compared against the real distributions using histograms and QQ plots (Supplementary Figure 6C-F). Independent reviewers were asked to classify unlabelled spike trains as real or artificial and managed to do so with an accuracy of ~42% on average.  (Danio rerio), showing the location of the cerebellum in the brain (left) and the cell types and circuit architecture (right). Purkinje neurons (green) are the principal neurons in this circuit and receive multiple inputs from afferent sensorimotor nuclei outside the cerebellum via parallel fibers (black) and climbing fibers (magenta). (B) Representative loose patch recordings from larval zebrafish PNs showing the two modes of firing. The bursting mode recording is shown on the left, and the tonic mode is shown on the right. In each mode, electrical events can be distinguished by their amplitude, with small-amplitude simple spikes marked in cyan and the large-amplitude CF inputs marked in magenta. Note that the time scales shown for each mode are different, with a longer duration shown for the bursting mode than the tonic mode. (C) Representative whole-cell current clamp recordings showing the two modes of firing in PNs, with the same colour scheme for events as in (B). These recordings were obtained from different cells from those shown in (  Representative traces obtained during simultaneous imaging and electrophysiology for cells in the bursting (B) and tonic (C) mode. Simple spikes and CF inputs in the trace are identified and marked in the raster above the trace in cyan and magenta, respectively. The simultaneously-recorded change in GCaMP fluorescence (dF/F) is shown below in green. Calcium transients are marked with an asterisk (*) if they correspond to only simple spikes in the electrophysiology channel. Similarly, they are marked with a hash symbol (#) if they correspond only to CF inputs, and with an open circle (o) if they correspond to both simple spikes and CF inputs. (D) Eventtriggered average calcium signals. (i) Simple spike-triggered average calcium signal. The black line represents the mean and the cyan-shaded region is the SEM. n=2293 events from N=15 cells. (ii) CF-input triggered average calcium signal, similar to (i). The black line represents the mean and the magenta region is the SEM. n=111 events from N=15 cells (E) Peri-event time histograms (PETHs) of (i) simple-spikes and (ii) CF inputs relative to peaks in the calcium signal. The vertical dashed lines mark t=-0.267s and t=0s, which is the interval during which the GCaMP5G increases from baseline to peak. The y-axis values represent z-scores of the event counts. The mean z-scores for the yellow and grey shaded regions are given above each shaded region. spikes and CF inputs around the peak calcium signal in each state (A) Peri-event time histogram (PETH) of (i) simple-spikes and (ii) CF inputs relative to peaks in the calcium signal for cells in the bursting mode. The PETHs are coloured cyan and magenta for simple spikes and CF inputs, respectively. The vertical dashed lines mark the period it takes for the GCaMP5G sensor to reach the peak from baseline. The mean z-score for the yellow (-400 to -200 ms relative to the calcium peak) and grey (-100 to +100 ms relative to the calcium peak) regions are given above the respective shaded bars. Data from n=6 cells. (B) PETHs for cells in the tonic mode, similar to (A). Data from n=9 cells.  cells (3 bursting (i) and 3 tonic (ii)) chosen randomly from the dataset. The ground truth trace is shown in green and the reconstruction in purple, and the Pearson's correlation coefficient between the two traces is marked for each pair. (C) Quantification of the quality of reconstructions obtained using the optimal kernel combination. Correlation coefficients were calculated between the ground truth for each cell (n=15 cells; 6 bursting, 9 tonic) and either the true reconstruction, a scrambled version of the reconstruction, or the reconstruction for when simple spike and CF input event times were shuffled. A Kruskal-Wallis test yielded a p-value of 2.108 x 10^-7, which was followed by a post-hoc Dunn's test for individual comparisons, the latter of which is marked on the figure. (D) The optimal calcium sensor kernels for simple spikes (cyan) and CF inputs (magenta). 9 .3 0 6 9 .5 4 0 9 .6 7 3 9 .5 4 1 9 .5 9 8 9 . 6 7 4 9 .6 6 5 9 .6 5 5 8 .9 2 1 9 .2 6 3 9 .0 0 1 9 .4 2 1 9 .1 3 8 9 .0 6 6 9 .6 2 4  Mean-squared errors (MSEs) between the ground truth and reconstructed calcium signal for each cell and kernel combination used. Each row represents a single cell and each column, the kernels used. The "average" kernel's coefficients were the mean of the coefficients of the kernels optimised for each cell. The last row, with a different colormap, represents the total MSE across all reconstructions for a kernel.

Supplementary
The values corresponding to each colour are shown in the colour bars on the right. Bright colours represent lower MSEs. (B) Similar representation as above, but for Pearson's correlation coefficients between the ground truth and reconstructions. Note that the colormaps are reversed as compared to (A), so that brighter colours represent higher correlations. In both (A) and (B), the kernel chosen for all future work (i.e. no 7) is highlighted in bold.  Table 2). (E) Principal Components Analysis of trace properties, with data points marked by state -tonic (red crosses) and bursting (black open circles).

Figure 5 -Design and training of CaMLsort (A)
Flowchart showing how the statelabelled calcium signal dataset was split into training and test datasets using a 5-fold 60-20-20 split into training (white boxes), test (blue boxes) and cross-validation (yellow boxes) samples, respectively. The networks that performed well in the training phase, as assessed by performance on the cross-validation samples, were then challenged with the unseen test data (blue boxes) in each fold. An independent dataset with state switches was also used to further validate the networks obtained (see Fig 6). (B) The architecture of the convolutional recurrent neural networks used to solve the classification problem. The network takes an input time series trace sampled at 30Hz (purple) and that is a multiple of 10 seconds long in duration (here just 10 seconds long). This trace is first normalized using min-max normalization, following which it is passed through a 1-dimensional convolutional neural network (1D CNN), which has 4 kernels with a step size and stride length of 30 each. The resulting 4 traces (grey boxes) are downsampled, local feature-extracted versions of the original trace. All 4 outputs are passed to the LSTM module (central black box) for trend identification. The LSTM is a bidirectional one, which processes traces in both the forward (blue arrow) and reverse (red arrow) directions. A single time step is taken from all the CNN outputs (red and blue dotted boxes) and passed to the respective LSTM cell, which then processes the input to produce a resultant "hidden state" output (red and blue filled boxes). This output is passed back into the LSTM cell along with the next sample from the CNN's output traces, until every time step from it has been processed (here 10 rounds). Once this is done, the final "hidden state" from each LSTM cell is retained and concatenated before their weighted average is taken by a Linear layer. The resulting average is converted to a "posterior score" using an activation function. These posterior scores represent the likelihood of the cell being in the bursting state at each time step. The likelihood of it being in the tonic state can be calculated from this. The final call is taken to be the state which has a higher posterior score. The text in italics represents the net effect of each phase of the neural network. The numbers in parentheses indicate the sampling frequency and/or the duration of the resultant vectors at various stages of processing by the neural network. Similarly, the numbers in bold and italics within square brackets indicate the size of the vectors/matrices at various stages of passing through the network. (C) Average classification accuracy (top) and F1-scores (bottom) at the cross-validation and the test phases, for trained networks. Each fold is represented in a different colour, marked in the legend to the right of panels (C) and (D). (D) Posterior scores for the predicted class in the cross-validation (top) and test (bottom) phases. Each dot represents the mean posterior score for a trained network, with error bars representing standard deviation.