Extracting electrophysiological correlates of functional magnetic resonance imaging data using the canonical polyadic decomposition

Abstract The relation between electrophysiology and BOLD‐fMRI requires further elucidation. One approach for studying this relation is to find time‐frequency features from electrophysiology that explain the variance of BOLD time‐series. Convolution of these features with a canonical hemodynamic response function (HRF) is often required to model neurovascular coupling mechanisms and thus account for time shifts between electrophysiological and BOLD‐fMRI data. We propose a framework for extracting the spatial distribution of these time‐frequency features while also estimating more flexible, region‐specific HRFs. The core component of this method is the decomposition of a tensor containing impulse response functions using the Canonical Polyadic Decomposition. The outputs of this decomposition provide insight into the relation between electrophysiology and BOLD‐fMRI and can be used to construct estimates of BOLD time‐series. We demonstrated the performance of this method on simulated data while also examining the effects of simulated measurement noise and physiological confounds. Afterwards, we validated our method on publicly available task‐based and resting‐state EEG‐fMRI data. We adjusted our method to accommodate the multisubject nature of these datasets, enabling the investigation of inter‐subject variability with regards to EEG‐to‐BOLD neurovascular coupling mechanisms. We thus also demonstrate how EEG features for modelling the BOLD signal differ across subjects.

( ) = max( | | − , 0 ) * sgn( ) (S.1) We used soft-thresholding to sparsify the HRF tensor, such that SSRFs associated with correlations below the 95 th percentile were nulled. Soft-thresholding -as opposed to hard-thresholding -downscales all correlation coefficients regardless of whether they survive the threshold. To clarify this point, let's consider a threshold of 0.25 and correlation coefficients of 0.5 and 1.0. With hard-thresholding, both values would remain intact, i.e. 0.5 and 1.0. With soft-thresholding however, values are reduced by 0.25, and thus become 0.25 and 0.75. In this case, the ratio between thresholded correlation coefficients goes from 1.0/0.5 = 2 for hard-thresholding to 0.75/0.25 = 3 for soft-thresholding. Soft-thresholding thus has the effect of increasing higher correlations relative to lower correlations, thereby emphasizing differences between correlation coefficients. Consequently, the ratio between SSRF amplitudes are also increased for SSRFs associated with higher correlation coefficients.
The intended effect of reweighting the HRF tensor in the way just described was to render CPD more sensitive to SSRFs leading to higher correlations. Moreover, the aforementioned increase in ratio of SSRFs served to sparsify CPD outputs. For clarification, readers may refer to (Hervás et al., 2019) for an explanation of the softthresholding operator as the solution to an L1-regularized least-squares regression problem and its application to multiway partial least-squares.

SM 2 -injecting input noise
The simplest way for accounting for the effect of the Hilbert transform on thermal noise is to inject an additive white Gaussian noise (AWGN) process into the data at an early stage of data processing (before the Hilbert transform). A new AWGN process could be injected for each bootstrap or phase randomization iteration, applying the preprocessing steps every time. However, this approach comes at a great computational cost and memory demand during the bootstrapping and construction of empirical distributions for statistical inference.
Instead, we used the following mathematical relationship which relates the amplitude of a Hilbert transformed sum of complex signals = 1 + 2 (Eq. S.2): where 1 and 1 are the real and imaginary parts of 1 respectively, and 2 and 2 are the real and imaginary parts of 2 . The magnitude of a given complex signal is symbolized as |•|. Similarly, if we consider 1 as the Hilbert transformed LFP data and 2 the Hilbert transformed AWGN process with scaling factor such that 2 = ( 2 + 2 ), the following relation holds (Eq. S.3): According to Equation (S.3), one may compute the squared noise amplitude | 2 | 2 and cross-term 2( 1 2 + 1 2 ) and scale with accordingly. Importantly, rather than computing these two terms at every bootstrap or phase randomisation iteration, the terms | 2 | 2 and 2( 1 2 + 1 2 ) were computed once and replicated by random block resampling at every iteration. In short, the idea is to mimic the process of injecting input measurement noise to the LFP data -with a different AWGN process at every iteration -and performing the Hilbert transform afterwards.

SM 3 -estimating reference hemodynamic response functions
We also obtained reference hemodynamic response functions (HRFs) that served as standards to assess the performance of the proposed HRF estimation scheme. To create these reference HRFs, WGN was directly used as input to the balloon model as opposed to squared LFP signals. Using WGN ensures that the full bandwidth of the system is excited by serially uncorrelated data (Marmarelis & Marmarelis, 1978), serving as an ideal signal to derive a reference HRF. Two reference HRFs were obtained using two separate techniques, the first being the spherical Laguerre basis function expansion technique explained above, and the second technique being the ordinary least-squares method (Westwick & Kearney, 2003). Whereas the first reference HRF reflects the ability of spherical Laguerre basis functions to estimate proper HRFs, the ordinary least-squares method serves in principle as a more accurate estimate of the underlying HRF although potentially less stable. It should be noted that, whereas the number of spherical Laguerre basis functions was set to 3 during the main analysis to avoid overfitting, the number of basis functions when using WGN was set to 5. In this latter case, we did not observe overfitting despite the larger number of basis functions. This assessment was based on qualitatively observing the derived HRF and looking for non-smooth features within the HRF waveform.
Figures S.10D and S.10D.2 show comparisons between HRFs estimated using CPD and reference HRFs. In Figure   S.10D, the balloon model parameters were set such that a post-stimulus undershoot is present in the HRFs. On the other hand, parameters were set in Figure S.10D.2 such that the undershoot is absent. Removing this undershoot was achieved by increasing the autoregulation parameter of the balloon model.

SM 4 -residuals-based block bootstrapping
To assess the robustness of CPD and obtain confidence intervals for the estimated HRFs and spatial-spectral distributions, a residuals-based bootstrapping procedure was performed. As previously mentioned, the HRF and spatial-spectral distributions derived from the CPD decomposition of the HRF tensor were used to obtain a weighted linear combination of LFP signals, yielding a compound LFP signal which was convolved with the estimated HRF. The resulting BOLD signal estimate was compared to the true BOLD signal to obtain the residuals The simulated PPG signal (A) was used to obtain the heart-rate (HR) signal shown in (C). The latter was convolved with the cardiac response function (E) to generate the cardiac SLFO (G, red). The respiratory time-course (B) gave rise to the respiratory flow (RF) signal shown in (D). The latter was convolved with the respiratory response function (F) to output the respiratory SLFO (G, blue). Cardiac and respiratory SLFOs were summed to create the SLFO signal (G, green) which was later added to the BOLD signal as a source of physiological confound.   Frequency-band specific input measurement noise were scaled by and added onto LFP signals to produce noisy LFP signals. Output measurement noise was scaled by and SLFO scaled by , both added to the BOLD signal to produce a noisy BOLD signal. Noisy LFP and BOLD signals were used to construct the HRF tensor to then finally proceed with CPD.

Figure S.7:
Residuals-based block-bootstrapping. Residuals were derived by subtracting the BOLD signal estimate from the BOLD signal. Residuals were segmented into blocks which were resampled with replacement to form bootstrap replicates of the residuals. Individual bootstrap replicates were added back to the BOLD signal to construct a new HRF tensor and perform CPD for each replicate.

Figure S.8:
Statistical inference based on phaserandomisation of the BOLD signal. Pearson correlations were used to quantify the goodness-of-fit of the BOLD signal estimate. The correlation between BOLD signal and BOLD signal estimate (purple) was compared against a null distribution of correlations (orange). This null distribution is constructed by phase-randomising the BOLD signal numerous times. For each phaserandomised BOLD signal, an estimate of the phaserandomised BOLD signal was derived. Every pair of phase-randomised BOLD signal and its estimate were correlated and added to the null distribution.

Figure S.11:
Comparing the modelling capacity of a CPD-based HRF with that of the canonical HRF. BOLD signals were simulated for all 66 nodes of the cortical parcellation and used to extract spatial and spectral features as well as HRF estimated using our proposed CPDbased method. The simulated BOLD signals were then either estimated using all CPD outputs or by replacing the CPD-based HRF by the canonical HRF. For the BOLD signal simulations, the balloon model parameters were selected such that the resulting CPD-based HRF would differ from the canonical HRF to some extent. Specifically, the signal decay parameter of the balloon model was reduced from 1.54 seconds to 0.54 seconds. A) HRF derived using the CPD decomposition for randomly selected cortical region. B) Canonical HRF. C) Simulated BOLD signal (blue) overlaid with BOLD signal estimate (orange) when using CPD-based HRF for same cortical region as in A). D) Simulated BOLD signal (blue) overlaid with BOLD signal estimate (orange) when using canonical HRF for same cortical region as in A). E) Histogram of correlations when using CPD-based HRF for all 66 nodes of the cortical parcellation. F) Histogram of correlations when using canonical HRF for all 66 nodes of the cortical parcellation. corr: correlation.