Neuroscience meets behavior: A systematic literature review on magnetic resonance imaging of the brain combined with real‐world digital phenotyping

Abstract A primary goal of neuroscience is to understand the relationship between the brain and behavior. While magnetic resonance imaging (MRI) examines brain structure and function under controlled conditions, digital phenotyping via portable automatic devices (PAD) quantifies behavior in real‐world settings. Combining these two technologies may bridge the gap between brain imaging, physiology, and real‐time behavior, enhancing the generalizability of laboratory and clinical findings. However, the use of MRI and data from PADs outside the MRI scanner remains underexplored. Herein, we present a Preferred Reporting Items for Systematic Reviews and Meta‐Analysis systematic literature review that identifies and analyzes the current state of research on the integration of brain MRI and PADs. PubMed and Scopus were automatically searched using keywords covering various MRI techniques and PADs. Abstracts were screened to only include articles that collected MRI brain data and PAD data outside the laboratory environment. Full‐text screening was then conducted to ensure included articles combined quantitative data from MRI with data from PADs, yielding 94 selected papers for a total of N = 14,778 subjects. Results were reported as cross‐frequency tables between brain imaging and behavior sampling methods and patterns were identified through network analysis. Furthermore, brain maps reported in the studies were synthesized according to the measurement modalities that were used. Results demonstrate the feasibility of integrating MRI and PADs across various study designs, patient and control populations, and age groups. The majority of published literature combines functional, T1‐weighted, and diffusion weighted MRI with physical activity sensors, ecological momentary assessment via PADs, and sleep. The literature further highlights specific brain regions frequently correlated with distinct MRI‐PAD combinations. 
These combinations enable in‐depth studies on how physiology, brain function and behavior influence each other. Our review highlights the potential for constructing brain–behavior models that extend beyond the scanner and into real‐world contexts.


| INTRODUCTION
Our cognitive and physiological systems are engaged in a complex interplay, coordinated by the brain and continuously adapting to environmental stimuli. This joint system can be approached both from the perspective of the underlying brain mechanisms and from the behavioral point of view. Technological advances, such as magnetic resonance imaging (MRI), have facilitated the noninvasive study of the brain. Different MRI techniques have been employed to investigate the brain's anatomy (Durston et al., 2001), physiology (Frahm et al., 1994), structure (Assaf & Pasternak, 2008), and function (Logothetis, 2008). However, despite the progress made in uncovering brain processes, there are still gaps in our knowledge of how the brain generates behavior (Krakauer et al., 2017). Specifically, the artificial conditions of laboratory and clinical settings can decouple subjects from the environment in which they operate (Nastase et al., 2020), making it challenging to apply laboratory results to real-life situations. Simultaneously, the emergence of portable automatic devices (PADs) has revolutionized the measurement of human behavior and physiology in real-world settings. PADs are compact, wearable, or handheld electronic devices that automatically collect behavioral or physiological data with minimal to no user input. They facilitate digital phenotyping, the quantification of human phenotypes via gadgets in situ (Jain et al., 2015; Torous et al., 2016). Examples of PADs include fitness trackers that gather data on human bodily functions (physiology) and smartphones that collect data on human interactions with the environment or on mental states via self-reports (behavior). Although PADs significantly enhance our understanding of health-related habits within natural settings, a critical challenge remains: unlike MRI scanners, PADs can only provide information about behavior and physiology, but are unable to relate it to intricate brain biology and function. Moreover, many PAD studies are observational, therefore providing mostly correlational information (McGowan et al., 2023; Moura et al., 2022).
The gap in integrating neurological understanding with the environmental context is highlighted by the dual technological evolution, where MRI reveals intricate details of brain function in artificial, controlled settings and PADs capture real-life human behavior and physiology without direct insights into the underlying brain function. While each technology independently contributes valuable information, their isolated use in studies often leads to a fragmented comprehension of how human brain function relates to behavioral and physiological patterns in everyday environments. This need for a more integrative strategy resonates with prior viewpoints (Krakauer et al., 2017; Marom et al., 2009), where it is argued that the current approach of causal manipulation is insufficient for fully understanding the brain's role in behavior, much like how studying feathers alone is not enough to explain how birds fly (Marr, 1978).
Consequently, the integration of PADs in neuroscience research and clinical practice offers a promising path to bridge the gap between brain imaging and real-world behavior, improving the generalizability of laboratory findings, and strengthening the statistical power of evidence supporting the ecological validity and clinical relevance of MRI studies.
Despite these clear advantages, we found only one review paper that discusses the combined application of MRI techniques and ambulatory assessment in examining brain-behavior relationships in natural contexts (McGowan et al., 2023). In that work, the authors emphasize the necessity of frameworks for integrating MRI and PAD data for a deeper understanding of brain-behavior interactions, primarily focusing on the methods employed in analyzing combined MRI-PAD data. To expand on the topic, this systematic literature review focuses on studies that jointly analyzed MRI data and information collected by PADs outside the scanner. We report a wider range of PADs and identify the main brain areas found in the studies, categorized by MRI-PAD combination. These brain areas are analyzed both in a descriptive summary at the level of large regions of interest (ROIs) and in a coordinate-based meta-analysis.
Our approach involved systematically searching for studies incorporating both technologies, followed by meticulous data selection, extraction, and synthesis. The results of this review provide four main contributions: (i) we characterize and summarize the selected studies, synthesizing findings on the feasibility of combining data from PADs with MRI; (ii) we identify frequently used MRI-device combinations as well as underexplored MRI-PAD pairings; (iii) we show the brain areas yielding statistically significant results in those studies; and (iv) we discuss trends and open issues to inform future research.
The structure of the review is as follows. In Section 2, we outline the methodology. In Sections 3.2-3.7, we characterize and synthesize the selected studies. In Section 3.8, we examine the concurrent use of MRI and devices in the studies, both on average and over time. In Section 3.9, we show the statistically significant brain areas identified in the studies for each MRI-device combination. Finally, in Section 4, we discuss identified trends, open issues, and summarize our findings.

| MATERIALS AND METHODS
We conducted a systematic literature review using two databases, PubMed and Scopus. Initially, we developed a protocol for systematic data retrieval and analysis, which was preregistered in the Open Science Framework (Triana et al., 2022). Following this protocol, we analyzed and synthesized the data to provide a summary of relevant information in this research field.

| Objective and research question
This systematic literature review aims to provide a comprehensive understanding of how scientists and clinicians have employed PADs in conjunction with MRI. We ask two questions: (i) What is the current state of research on the integration of MRI and PAD data collected in natural contexts? and (ii) What are the characteristics of these studies?

| Search strategy and selection process
To ensure a comprehensive search, we designed search strings by pairing keywords drawn from the two sets listed in Table 1. Set "A" includes terms commonly used in research studies analyzing brain signals using MRI techniques (e.g., functional MRI [fMRI], diffusion tensor imaging, etc.), and set "B" comprises keywords typically used in digital phenotyping papers (e.g., actigraph, smartphone, etc.). Each string is a combination of a keyword from set "A" and a keyword from set "B" using the operator AND (e.g., "fMRI" AND "sleep").
Once all the possible combinations were established, we automatically retrieved papers from the two databases. These databases were chosen based on their coverage of scientific publications and their APIs, which allowed us to automate the search process. Each combination was treated as a query, generating a list of unique identifiers (PMID or Scopus ID) for the matched papers. Subsequently, we fetched basic research article information, including title, abstract, publication date, DOI, and keywords. Duplicate entries were removed, and the gathered data were systematically organized.
An electronic search was performed on January 31, 2022, covering the last 22 years (2000-2022) of literature. We searched for pairs of terms ("set A keyword" AND "set B keyword") in the title and abstract. Only studies in English were included. Due to database API characteristics, we restricted the document types to journal articles, clinical studies, and congress in PubMed and to article and conference papers in Scopus.
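The keyword pairing and duplicate removal described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the retrieval code used for the review: the keyword lists are small subsets taken from the examples in the text (the full sets are in Table 1), and the record layout with a `doi` field is a hypothetical stand-in for whatever metadata the database APIs return.

```python
from itertools import product

# Illustrative subsets only; the complete keyword sets are listed in Table 1.
SET_A = ["fMRI", "diffusion tensor imaging"]   # MRI-related terms
SET_B = ["actigraph", "smartphone", "sleep"]   # digital phenotyping terms


def build_queries(set_a, set_b):
    """Combine each set-A keyword with each set-B keyword using AND."""
    return [f'"{a}" AND "{b}"' for a, b in product(set_a, set_b)]


def deduplicate(records, key="doi"):
    """Keep the first record seen for each value of `key`.

    Each record is a dict; the 'doi' key is a hypothetical field name.
    """
    unique = {}
    for record in records:
        unique.setdefault(record[key], record)
    return list(unique.values())


queries = build_queries(SET_A, SET_B)
for query in queries:
    print(query)
```

With two set "A" terms and three set "B" terms, the sketch produces six queries. The number of queries grows as the product of the two set sizes, which is why automating the retrieval through the database APIs is practical.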

| Eligibility criteria
Primary research studies were selected if they met the following inclusion criteria: (i) the study analyzed data from human subjects; (ii) the study analyzed data collected with PADs in everyday-life conditions (i.e., not in clinical or laboratory settings); and (iii) the study analyzed MRI data in combination with data from PADs. Papers were excluded if: (i) they were literature reviews or protocols; (ii) they did not clearly state what technology/device was employed (e.g., a study that administered questionnaires but did not mention whether they were paper-based or electronically delivered); (iii) they mentioned a device but did not use the collected data (e.g., a paper where actigraph data were collected but not analyzed); (iv) researchers employed the device under clinical or laboratory settings (e.g., an intervention study where heart rate was monitored using a smartwatch during staff-supervised sessions or a study where heart rate was measured only while scanning); and (v) MRI or device data were used to verify states or as a back-up option (e.g., data from an actigraph that was only used to verify sleep-deprived nights before a scanning session, but not analyzed).
Note that standard computers are not considered PADs. Moreover, for brain imaging, our scope is limited to MRI conducted in a laboratory or clinical setting, and therefore we also exclude EEG devices even when worn outside the laboratory.

| Screening process
The screening process was divided into two parts, abstract screening and full-text inclusion. Three study authors (NMEAH, EG, and AMT) independently screened all the titles and abstracts. Keywords from the query that yielded the paper were also available at this stage to provide additional information. Each author assigned a score of 1 for inclusion, −1 for exclusion, or 0 for uncertainty. After all abstracts
TABLE 1 Keywords and sets used to design the search string.

| Graph analysis
Based on categorized data, we created a network to identify the most common combinations of MRI and devices employed in the selected sample. In this network, nodes represent MRI techniques and PADs (labeled according to their functions). The weight of a link indicates the number of papers using the respective techniques or devices. We organized PADs into two color-coded sets (physiology and behavior) based on what they measure. In this context, physiological measurements refer to the quantification of human bodily functions, such as heart rate, while behavioral measurements cover a range of human actions, interactions, and self-reported states, including sleep patterns and responses to questionnaires. Finally, we used the algorithm of Ahn et al. (2010) to detect link communities for identifying close relationships between MRI techniques and devices. This algorithm detects sets of closely related links (link communities), which allows nodes to belong to multiple communities at the same time.
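The construction of the weighted network and the notion of "closely related links" can be illustrated with a short sketch. It rests on two assumptions that the text does not spell out: every pair of labels appearing in the same paper is linked, and link similarity is computed in the unweighted Jaccard form described by Ahn et al. (2010), which compares the inclusive neighborhoods of the two unshared endpoints of links that share a node. The full algorithm additionally clusters links hierarchically and cuts the dendrogram at maximum partition density; that step is not shown. The toy paper list is hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(papers):
    """Return {(label_a, label_b): number of papers that use both labels}."""
    weights = Counter()
    for labels in papers:
        # Sort so that each unordered pair is always stored the same way.
        for pair in combinations(sorted(set(labels)), 2):
            weights[pair] += 1
    return weights


def inclusive_neighbors(weights):
    """Map each node to the set containing itself and its neighbors."""
    neighbors = {}
    for a, b in weights:
        neighbors.setdefault(a, {a}).add(b)
        neighbors.setdefault(b, {b}).add(a)
    return neighbors


def link_similarity(neighbors, i, j):
    """Jaccard similarity of two links (i, k) and (j, k) that share a node k.

    Unweighted form: compare the inclusive neighborhoods of the unshared
    endpoints i and j.
    """
    n_i, n_j = neighbors[i], neighbors[j]
    return len(n_i & n_j) / len(n_i | n_j)


# Hypothetical toy sample: each entry lists the techniques/devices of one paper.
papers = [
    ["fMRI", "physical activity"],
    ["fMRI", "EMA/ESM"],
    ["fMRI", "physical activity", "sleep"],
    ["DWI", "physical activity"],
]
weights = cooccurrence_weights(papers)
neighbors = inclusive_neighbors(weights)
```

Because communities are formed from links rather than nodes, a node such as fMRI in this toy sample can take part in several communities at once, which matches the overlapping structure described above.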

| Brain area visualization
For each of the included studies, we mapped the reported list of statistically significant brain areas onto the AALv3 atlas (Rolls et al., 2020). First, we conducted automated string matching, followed by manual quality control. Similar to NeuroSynth (Yarkoni et al., 2011) (where peak coordinates from activations are converted into spheres), this approach allows us to generate a brain mask for each study, labeling reported ROIs as significant with ones and nonsignificant areas with zeros. Since not all studies reported coordinates, we opted for a descriptive statistics approach using ROIs. Our goal was to graphically visualize the areas that were most frequently reported across studies, presenting frequency counts of the brain areas reported for each MRI-PAD combination. In addition, we ran a meta-analysis using NiMARE (Salo et al., 2022; Salo et al., 2023) for those few cases where enough coordinates were reported. In both cases, we limited the synthesis of brain maps to T1-weighted and fMRI studies due to the complexity of remapping white matter tracts to the standard MNI space.
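The ROI-level synthesis described above amounts to building one binary mask per study and summing the masks across studies. The following sketch shows that bookkeeping only; the atlas subset and the study reports are hypothetical, and the string matching and manual quality control that precede this step are not reproduced.

```python
def study_mask(reported_rois, atlas_labels):
    """Binary mask over atlas regions: 1 if the study reported the ROI, else 0."""
    reported = set(reported_rois)
    return [1 if label in reported else 0 for label in atlas_labels]


def roi_frequency(studies, atlas_labels):
    """Count, for every atlas region, how many studies reported it."""
    counts = [0] * len(atlas_labels)
    for reported_rois in studies:
        mask = study_mask(reported_rois, atlas_labels)
        counts = [c + m for c, m in zip(counts, mask)]
    return dict(zip(atlas_labels, counts))


# Hypothetical atlas subset and study reports, for illustration only.
atlas_labels = ["Hippocampus_L", "Hippocampus_R", "Insula_L", "Amygdala_L"]
studies = [
    ["Hippocampus_L", "Insula_L"],
    ["Hippocampus_L"],
    ["Amygdala_L", "Hippocampus_R"],
]
frequency = roi_frequency(studies, atlas_labels)
```

Running the same count separately for each MRI-PAD combination would give the per-combination frequency maps that the section describes.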

| Data and software availability
The code developed for this literature review is available at Zenodo (Triana et al., 2023).

| Study selection
Figure 1 shows the diagram of the selection process, according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis statement (Moher et al., 2009). First, 5777 records were automatically retrieved from two databases (PubMed and Scopus). Of these, we discarded 3024 duplicated papers and 1 book. Next, we screened the abstracts of the remaining 2752 papers and agreed to exclude 2406, selecting 346 papers. Of these, 252 papers were excluded because they did not comply with the inclusion criteria, resulting in 94 studies being selected.
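As a quick consistency check, the counts in the selection flow can be reproduced with simple arithmetic. The variable names below are illustrative; the numbers are those reported in the paragraph above.

```python
retrieved = 5777    # records retrieved from PubMed and Scopus
duplicates = 3024   # duplicated papers discarded
books = 1           # book discarded

screened = retrieved - duplicates - books   # abstracts screened
full_text = screened - 2406                 # left after abstract screening
included = full_text - 252                  # left after full-text screening

print(screened, full_text, included)  # prints: 2752 346 94
```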
To calculate interrater agreement, we used the Fleiss' κ statistic, as it is the most commonly used analysis technique in systematic reviews (Belur et al., 2021). We also report the overall rater agreement as a percentage. The value of Fleiss' κ was 0.283, which corresponds to "fair agreement" (Belur et al., 2021). The percentage of agreement (68.6%), however, indicates substantial agreement. This discrepancy between the low value of κ and the higher percentage of agreement is common when the rating distribution is skewed, as is the case in this literature review (see Supplementary Figure S1).
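The discrepancy between a low κ and a high raw agreement under a skewed rating distribution can be reproduced with a small, self-contained implementation of Fleiss' κ. The sketch below uses the standard formula; the example data are hypothetical (two categories and 20 abstracts, whereas the review used three scores and far more abstracts) and are chosen only to make the effect visible. The observed agreement returned here is the mean share of agreeing rater pairs, which may differ from how the percentage of agreement was computed in the review.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least two).
    Returns (kappa, observed_agreement, chance_agreement).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Observed agreement: mean proportion of agreeing rater pairs per item.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each category.
    n_categories = len(counts[0])
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    chance = sum(p * p for p in shares)
    if chance == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")

    return (observed - chance) / (1 - chance), observed, chance


# Hypothetical skewed screening result: 3 raters, categories (exclude, include).
# 17 abstracts are unanimously excluded; 3 abstracts split 2 vs. 1.
skewed = [[3, 0]] * 17 + [[2, 1]] * 3
kappa, observed, chance = fleiss_kappa(skewed)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

In this toy case the observed agreement is 0.90 while κ is slightly negative (about −0.05). Almost all ratings fall into one category, so the agreement expected by chance (0.905) is already as high as the observed agreement.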

| Characterization of the selected studies
The reviewed studies and their main characteristics are listed in Table 2. An extended version of this table can be found in Supplementary Table S1.
Table 3 shows summary information including type, patient inclusion, and number of diagnoses included in each study sample. It also provides an overview of the papers' data collection processes, showing that while researchers primarily collected data for their specific studies, reanalysis of existing data (reused data) is also a common practice.

| Sample size and age
Included studies vary widely in the age of participants, which ranges from 9 to 88 years (Figure 2), irrespective of whether patients were included or not. There are no apparent gaps within this range, demonstrating a broad age representation.
On average, the studies included 157 participants. However, given the influence of extreme values (minimum sample size: 6, maximum sample size: 5272), the median of 58 participants may be a more informative measure, as can be seen in the sample size distribution (see Figure 3a).

| Illnesses targeted
Diagnosed subjects were involved in 41 studies (43.6% of the total papers), with 92.7% of these studies featuring only one diagnosis (Table 3). Notably, three papers (7.3%) incorporated multiple

| Data quality
Figure 3b displays the distribution of excluded subjects according to three categories (MRI, device, and others). The median number of subjects with discarded data due to device-related issues is slightly larger than the median number of subjects discarded due to MRI issues or other problems, such as medical illness or phobias.

| MRI techniques
fMRI is the main MRI technique employed in the reviewed studies, followed by diffusion-weighted imaging (DWI), T1/T2-weighted MRI, and to a lesser extent, angiography. Task-based fMRI is a more commonly employed technique than resting-state fMRI (Table 4).

| Portable automatic devices
Table 4 shows the most common PADs used in the reviewed papers.
Accelerometers are the primary choice for portable data collection, followed by smartphones and other devices, such as cuffs and glucose meters. Beepers, wristwatches, and step counters were also employed, albeit less frequently. The devices can be further categorized by their function or the variable they measure. Figure 3d shows that researchers mainly use PADs to measure physical activity and ask customized questions (hereafter referred to as EMA/ESM). Variables such as sleep, blood pressure, respiration rate, and screen use have also drawn some interest. While variables like glucose, communication patterns, and GPS location have been used only once across the reviewed sample, these studies showcase the feasibility of recording such data with PADs. Further, most of the studies employed the devices for 10 days or less, regardless of whether they collected the data passively (i.e., with no user input) or actively (i.e., with user input). Four studies employed the devices for several months, mostly using passive data collection (see Figure 3c).

| MRI and device combinations
To understand how different combinations of MRI techniques and devices were employed in the selected studies, we constructed a co-occurrence network where nodes represent techniques or devices.
The weight of the link between two nodes was determined by the number of papers combining the respective techniques or devices.
Figure 4a shows the computed network (numeric values available in Supplementary Table S2).
The network visualization reveals that the majority of published
FIGURE 3 (b) Distribution of excluded subjects according to the source of the issue, which could be related to magnetic resonance imaging (MRI), the device and its data collection, or other factors (e.g., unknown medical problems or phobias). (c) Number of days during which a device was used to collect data. The devices are organized into two groups based on the level of interaction required from the subject: active (some interaction is needed, such as answering a question) or passive (no interaction is required, such as heart rate measurements). (d) Number of studies employing a device based on its function to measure specific human behavior or physiological variables. (e) Diagnoses investigated in the included study sample. The studied illnesses were grouped according to their type. The number of studies that investigate an illness according to its group is shown.
and mobility patterns, and EMA/ESM. These combinations highlight the potential for incorporating multiple data streams within a single study, which may be collected by a single device (typically a smartphone).
To quantify the cluster structure apparent in the network visualization, we employed the link-community detection algorithm of Ahn et al. (2010), identifying seven sets of closely related links. Figure 4b shows the co-occurrence network grouped by color-coded link communities. Three large communities emerge (red, blue, and green).
In the fMRI case, Figure 5c shows that in 2011, a larger number of papers combining fMRI and physical activity is reported. However,
Finally, only a few studies have used angiography, with its first year of publication being 2017. Interest in this modality has remained low (Figure 5e).

| Brain areas
We also visualized the list of statistically significant areas in all studies.
The list includes only the areas reported as results from the MRI-PAD combination analysis. Figure 6 shows a summary of brain areas involved across the most common MRI-PAD combinations. Other combinations are shown in Supplementary Figures S2 and S3.
Of the 94 reviewed papers, we found that 44 did not provide ROI coordinates. Within this subset, seven papers omitted coordinates due to their use of whole-brain analyses, and one paper reported no significant findings for MRI-PAD combination analyses.
The full list of coordinates per paper is in Supplementary Table S3.
Subsequently, we conducted a meta-analysis on MRI-PAD combinations for which there were at least four papers reporting coordinates. By selecting this number, we aimed to ensure that there were sufficient data to allow for meaningful statistical analysis, while also maintaining a reasonable standard of representativeness across the studies in our review. Initially, the meta-analysis encompassed all papers corresponding to three MRI-PAD combinations (fMRI combined with EMA/ESM, physical activity, or sleep). However, no significant clusters were found for studies combining fMRI and physical activity. Significant clusters for the combinations of fMRI and EMA/ESM and of fMRI and sleep sensors are shown in Figure 7 and listed in Supplementary Table S4. Following this, we performed a secondary analysis, categorizing the papers according to the type of analysis (general linear model or connectivity analysis). The unthresholded maps yielded by these meta-analyses are shown in Supplementary Figure S4. The maps are also available in NeuroVault https://identifiers.org/neurovault.
FIGURE 7 Meta-analytical brain maps of areas reported using the most common functional magnetic resonance imaging-portable automatic devices (fMRI-PAD) combinations. For each study, we extracted the reported statistically significant region of interest (ROI) coordinates. Then, for each combination, we ran a meta-analysis if at least four studies reported coordinates. Significant clusters were found for studies combining (a) fMRI and EMA/ESM and (b) fMRI and sleep sensors.
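The eligibility rule for the meta-analysis (at least four papers reporting coordinates per MRI-PAD combination) can be expressed as a simple grouping step. The sketch below is illustrative only: the dictionary keys `combination` and `coordinates` are hypothetical, and each paper is assumed to belong to a single combination, which is a simplification.

```python
from collections import defaultdict

MIN_PAPERS = 4  # threshold stated in the text


def eligible_combinations(papers, min_papers=MIN_PAPERS):
    """Return MRI-PAD combinations with enough coordinate-reporting papers.

    Each paper is a dict with hypothetical keys: 'combination'
    (e.g., ("fMRI", "sleep")) and 'coordinates' (a list of x, y, z peaks).
    """
    by_combination = defaultdict(list)
    for paper in papers:
        if paper["coordinates"]:  # skip papers that reported no coordinates
            by_combination[paper["combination"]].append(paper)
    return {
        combination: group
        for combination, group in by_combination.items()
        if len(group) >= min_papers
    }
```

The groups that pass this filter are the ones that would then be handed to a coordinate-based meta-analysis tool such as NiMARE.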
In this study, we systematically reviewed and characterized research that jointly examines data from MRI and PADs used in real-world contexts. These devices are compact, wearable, or handheld electronic devices that automatically collect data that are physiological (e.g., bodily functions) or behavioral (e.g., environmental interactions, mental states via self-reports), with minimal user input. We automatically retrieved relevant papers from two digital libraries (PubMed and Scopus), selecting those that met predefined inclusion criteria. Subsequently, we synthesized and analyzed their methodology and results. Our analysis of the
The graph analysis revealed a cluster of the most common MRI-PAD combinations (Figure 4a). This cluster featured fMRI, T1/T2-weighted MRI, and DWI on the MRI side, which is not surprising considering their extensive use in brain research. Within this cluster, the MRI techniques were frequently paired with PADs that collect information on physical activity, sleep patterns, and thoughts and behaviors through EMA/ESM.
The inclusion of physical activity and sleep patterns in this cluster can be attributed to two key factors. First, their measurement is relatively simple and cost-effective due to the widespread use of accelerometers, which are the most frequently used device in the selected studies (Table 4). This is partly because they have been widely integrated into devices such as actigraphs and smartphones. Second, the well-established clinical relevance of these markers makes them particularly attractive for researchers and clinicians, as the findings can be readily translated into actionable steps to improve health outcomes.
Despite their shared relevance, sleep studies are less represented than physical activity studies, which may be partially explained by PAD limitations in sleep staging measurement (Imtiaz, 2021). Although EMA/ESM requires user input and thus imposes a higher burden compared to passive methods, its versatility and adaptability continue to draw significant attention from researchers. This is evident in the robust connection between fMRI and EMA/ESM in Figure 4a.

| Combination of MRI and physiological measurements is feasible, but underexplored
The graph analysis revealed other intriguing patterns, with three distinct groups identified through link community detection (Figure 4b). These encompass the previously mentioned cluster (red), a group of less well-connected nodes (green), and a cluster focused on common physiological measurements (blue). This last cluster also includes common behavioral measurements (physical activity and sleep) that are known to greatly affect human physiology (Atkinson & Davenne, 2007; Janssen et al., 2020).
Notably, the blue group demonstrates the feasibility of collecting physiological measurements in naturalistic settings with PADs and indicates a promising direction for future research. In particular, the advances in photoplethysmography and its implementations in smartwatches have facilitated the reliable collection of physiological data in real time (Fuller et al., 2020). As cardiac function is significantly influenced by the brain's central autonomic system (Silvani et al., 2016), expanding research on brain-heart interactions is vital. Furthermore, in the clinical context, analyzing brain-heart interactions in naturalistic settings offers promising tools for the prognosis of neurological disorders (Boots et al., 2019; Silvani et al., 2016). The growing diversity in research is also demonstrated by the integration of advanced PADs that can simultaneously measure complex behaviors from multiple data streams, such as mobility and communication patterns. Interestingly, these PADs were included in studies 4 years after the first papers on digital phenotyping were published (Jain et al., 2015; Torous et al., 2016).
Combining MRI and PAD data presents unique challenges and potential measurement issues. One key challenge is integrating these two data types, given their inherent differences in format, resolution, and timing. The integration of smartphone-MRI data, as discussed in McGowan et al. (2023), addresses some of these challenges by proposing a new framework that is adaptable to other PADs beyond smartphones (e.g., fitness trackers). Similar works will be needed in the future, as research shifts from observational to experimental approaches.
Signal artifacts and noise in PAD data pose another substantial challenge, as factors such as motion artifacts, variable environmental conditions, battery life, user engagement, or electronic interference can affect quality (Böttcher et al., 2022; Onnela, 2021). PAD accuracy and reliability remain a concern, especially since issues like data missingness or quality are not comprehensively reported in current MRI-PAD studies. In our review, 54 studies did not comment on PAD data missingness or quality, underscoring the urgent need for more transparent reporting practices to guide and improve the design of future research.
In MRI, movement artifacts, particularly head motion, significantly impact data accuracy (Ciric et al., 2018; Gilmore et al., 2021; Hedges et al., 2022; Oldham et al., 2020; Power et al., 2012). Our review of the literature revealed that in the period from 2007 to 2020, 5 fMRI studies, 10 T1-weighted MRI studies, and 8 DWI studies did not report any measures for correcting head motion, suggesting that this remains an ongoing concern. Notably, in fMRI research, where head motion has been closely scrutinized (Ciric et al., 2018; Lynch et al., 2021; Power et al., 2012), 20 papers specifically excluded data based on head movement parameters, demonstrating a heightened awareness of this factor in certain areas of MRI research.

| The medial prefrontal cortex, hippocampus, and dorsolateral frontal regions are most commonly associated in studies merging MRI and PAD data
The visualization of significant brain regions reported in the selected studies (Figure 6) highlights the medial prefrontal cortex, hippocampus, and dorsolateral frontal regions as the most frequently reported areas. Given their pivotal role in working memory and executive functions such as goal-directed behavior (Friedman & Robbins, 2022), it is expected to find a strong correlation between these regions and behavior measurements derived from real-world PADs. Interestingly, this pattern does not seem to heavily depend on whether the association between brain activity and PADs is due to EMA/ESM, physical activity, or sleep sensors.
The hippocampus and prefrontal cortex play critical roles in various behavioral and cognitive functions, such as memory consolidation, decision-making, and emotional regulation. Dysfunction within this network is associated with several psychiatric disorders, such as schizophrenia, major depressive disorder, and post-traumatic stress disorder (Euston et al., 2012; Sigurdsson & Duvarci, 2016).
Complementary to the brain visualizations, the meta-analyses show significant clusters in the insula, pallidum, and anterior cingulate cortex. Given that most of the studies combining fMRI and EMA/ESM focused on measuring mood, affect, and self-control, it is expected to find areas related to self-awareness, emotion processing, reward, and decision-making (Gu et al., 2013; K. S. Smith et al., 2009). While behavioral measurements from PADs might only reflect a correlational relationship with the functionality of these brain areas, they can serve as a valid proxy for cost-effective diagnostics that do not need to rely on MRI. Additionally, they can be used as reliable outcome measures in future intervention studies and behavioral therapy.

| Age is not a constraint, but adherence may be
Although continuous monitoring in elderly populations might be challenging due to their declining physiological and psychological conditions (Kekade et al., 2018), this review shows that age is not a constraint for including elderly individuals in studies that collect both brain MRI and PAD data (Figure 2). Supporting this finding, Ahmad et al. (2020) and Ma et al. (2021) report that older adults are willing to use technology when they perceive it as useful and easy to use. However, willingness to adopt unfamiliar technologies can be influenced by age (Ma et al., 2021). Similarly, research indicates that children are also receptive to using portable devices for health research purposes (Dimitri, 2019; Nadal et al., 2020).
While chronological age is not an obstacle in the reviewed studies, other factors may affect a subject's participation. One factor is the willingness of subjects to participate in PAD data collection protocols.
We found two studies that reported subjects who refused to enroll in data collection with physical activity PADs (Ahmed et al., 2017; Kokotilo et al., 2010). Although this is a small proportion, being aware of these cases can help researchers factor in potential difficulties during the recruiting phase. This underscores the necessity of including patient and public involvement activities in clinical study designs deploying PADs (Hassan et al., 2017).
Another crucial factor is participant exhaustion, which can be influenced by the chosen PAD and sampling strategy.In general, reviewed studies employed PADs for periods ranging from 1 to 243 days, with most studies using devices for approximately 10 days (Figure 3d).Blood pressure PADs were employed for the shortest time due to their cumbersome characteristics, limitations (Dadlani et al., 2019;Ling et al., 2020), and current clinical practices (Pena-Hernandez et al., 2020).In contrast, fitness trackers are well suited for longer research studies, as they collect data passively and are common in everyday use (Evenson et al., 2015).Finally, EMA/ESM other active data collection strategies may be associated with some burden for subjects, posing a challenge in maintaining engagement (Onnela, 2021).Financial incentives can boost engagement over short periods, but this may not be feasible for large cohorts or noncommercial research.Ultimately, striking a balance between accurate data collection and participant comfort is key for the success of studies involving PADs.
While extending data collection time may benefit many studies, it is also crucial to consider how it may affect data quality.In our sample, the number of subjects excluded due to poor PAD data quality is com-

| Clinical inclusion
Notably, almost half of the reviewed research papers include a group diagnosed with an illness (Table 3). Naturally, as the brain is the primary focus, most attention is directed toward psychiatric and neurological disorders. However, the inclusion of other illnesses indicates a shift in how researchers and clinicians approach disease investigation.
These studies demonstrate the importance of brain-behavior dynamics in various physiological aspects beyond the brain itself.
This broader perspective extends beyond focusing solely on the organ where symptoms appear, to include interactions between different body systems. Many studies could benefit from the combined analysis of MRI and PAD data. For example, cerebrovascular perfusion MRI data may require normalization to baseline physiological measurements in stroke research (Boots et al., 2019). In fact, the BOLD response might be significantly shaped by physiological changes.
Therefore, to enhance the accuracy of fMRI studies, long-term physiological data may be necessary for post-processing results and interpretations (Shmueli et al., 2007).
In the behavioral domain, other examples may include psychiatric imaging studies, where monitoring changes in behavioral patterns (Huckvale et al., 2019) or controlling for symptom severity during a study (Lepage et al., 2020)

| Limitations and future work
There are several limitations to consider. First, the review only covered published studies, excluding protocols, books, conference abstracts, dissertations, and theses. Future reviews could also encompass these. Second, we searched for articles in only two digital databases that allowed for automatic retrieval of papers. Future research may extend the search to other databases. Third, only one person fully read and classified the papers. Future efforts could involve more reviewers examining the papers and employing quality control to verify the extracted information.
The sample of papers reviewed here has some notable limitations that warrant further consideration. First and foremost, the majority of the reviewed research is observational, and therefore correlative and exploratory in nature. As a result, one should be cautious about inferring causative mechanisms based on these observations. Only a few papers (Bakker et al., 2019; Balbim et al., 2021; Burzynska et al., 2017; Kalafatakis et al., 2021; Michielse et al., 2020; Morris et al., 2022; Rodriguez-Ayllon et al., 2020; Servaas et al., 2019; J. L. Smith et al., 2021) have employed controlled trials or interventional studies that test specific hypotheses. Given the growing interest in integrating MRI and PAD data, future research should increasingly emphasize experimental approaches. This shift is crucial for gaining a deeper and more detailed understanding of the interplay between brain function and behavior.
Additionally, while we visually summarized statistically significant brain areas as reported in the studies, these maps should be interpreted with caution. The visual summaries serve an informative purpose, highlighting trends in results from specific MRI and PAD combinations and helping researchers identify the relevant literature in the context of brain areas of interest. Furthermore, the visual summaries are valuable in situations where a large portion of the studies lack reported coordinates, hindering meta-analysis due to insufficient data. As an example, the EMA/ESM method was not categorized into functional domains due to its broad adaptability, as researchers often customize questionnaires to suit their specific study needs (refer to, e.g., Horstman et al., 2022 and Sequeira et al., 2021); categorization into functional categories would require an analysis beyond this review's scope. To provide further context, we include the EMA-measured constructs in Table 2.
Care is also advised when interpreting maps, such as those correlating physical activity and MRI modalities. For example, while sensorimotor areas might be expected to feature prominently in studies using PADs with accelerometers, the specifics of the studies must be considered. Some studies investigated connectivity patterns in selected ROIs excluding sensorimotor areas, employed whole-brain analysis, or used MRI paradigms to target specific cognitive functions.
However, specific studies focusing on sensor regions, such as Kokotilo et al. (2010), which analyzed left arm movement in stroke patients, did find significant activity in these areas.

| CONCLUSIONS
Combining neuroimaging with personalized behavioral and physiological information is becoming crucial for the development of precision medicine (Scala et al., 2023). Traditionally, neuroimaging studies have tended to focus only on late-stage clinical manifestations of disease, with little consideration for the coincident underlying physiology and behavior that likely influence image interpretation (Hampel et al., 2023). Therefore, combining MRI and PAD data may sharpen and refine future methods and models for interpreting imaging data.
Alternatively, the interpretation of a subject's behavioral or physiological activity may be shaped by detailed knowledge of concurrent brain activity, for example, during optimal athletic performance (Furrer et al., 2023). Finally, predictive medicine models within the growing health "omics" revolution are becoming ever more reliant on large data sets from concurrent multimodal sources. Consequently, integrating MRI data with many types of PAD data will be necessary to help harness the full potential of multimodal artificial intelligence in future health technologies (Acosta et al., 2022).
[Table residue: fragments of study titles with their MRI technique, PAD measure, and device columns; rows not recoverable from the extraction.]

diagnosis groups (ranging from two to five). The reported illnesses were classified into five categories, as shown in Figure 3e. Neurological illnesses included cognitive decline, cognitive impairment, Alzheimer's, dementia, and multiple sclerosis. Psychiatric disorders encompassed anxiety, depression, bipolar disorder, anorexia, psychoaffective, emotional distress disorder, psychosis, and schizophrenia. Musculoskeletal illnesses grouped chronic fatigue syndrome, fibromyalgia, and knee osteoarthritis. Systemic medical diseases covered hypertension, obesity, stroke, and chronic obstructive pulmonary disease. Finally, pediatric illnesses included sickle cell disease and Down syndrome.
relationship between cardiovascular measurements and brain structure, also explored by the combination of angiography and blood pressure.

Figure 4 also reveals that physiological and behavioral data streams are used together. For example, data on screen use have been employed alongside data on sleep, physical activity, communication

FIGURE 4 Network of co-occurrences between portable automatic devices and magnetic resonance imaging (MRI) techniques. Link thickness represents the frequency of a particular combination in the selected research papers, with thicker links indicating higher frequency. Node size corresponds to the number of papers using a specific device or technique. (a) Nodes are color-coded based on category: MRI (green), physiology (orange), and behavior (purple). (b) Link communities and node overlaps in the network. Links are colored according to the detected link communities that are also indicated by the shaded areas. Node positions are the same as shown in (a).

researchers started combining MRI and PAD data as early as 2007 by merging fMRI and EMA/ESM data. From 2009 onward, new variables are included such as sleep, physical activity, blood pressure, and heart rate. However, it is not until the last 5 years that we truly see a growing interest in combining MRI and PAD data.
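The co-occurrence network described in the Figure 4 caption can be illustrated with a minimal sketch. The records below are hypothetical toy data, not the reviewed papers, and the function name `cooccurrence` is an assumption made for illustration; the review's own open code may be organized differently. The sketch counts how many papers use each technique or device (node size) and how many papers use each pair together (link thickness).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records (NOT the reviewed papers): each entry lists the
# MRI techniques and PAD measures that one paper combined.
papers = [
    ["fMRI", "EMA/ESM"],
    ["fMRI", "T1-weighted", "physical activity", "sleep"],
    ["DWI", "physical activity"],
    ["fMRI", "EMA/ESM", "sleep"],
]

def cooccurrence(papers):
    """Return (links, nodes) counters for a co-occurrence network.

    nodes[x]      = number of papers using item x (node size)
    links[(x, y)] = number of papers using both x and y (link thickness),
                    with each pair stored in sorted order.
    """
    links = Counter()
    nodes = Counter()
    for items in papers:
        unique = sorted(set(items))  # ignore repeated mentions within a paper
        nodes.update(unique)
        links.update(combinations(unique, 2))
    return links, nodes

links, nodes = cooccurrence(papers)
for (a, b), weight in links.most_common(3):
    print(f"{a} -- {b}: {weight} paper(s)")
```

Counting every pair within a paper, rather than only MRI-PAD pairs, is a simplifying assumption; it also captures cases where two data streams, such as sleep and physical activity, are collected together.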
Frequency of used magnetic resonance imaging (MRI) techniques and portable automatic devices over time. (a) Colors indicate the number of papers employing a specific device and MRI technique. Panels (b)-(d) indicate the number of times a device has been used in combination with T1/T2-weighted MRI, fMRI, DWI, and angiography.

it is not until 2018 that we see a clear interest in merging fMRI with physical activity, sleep, and different EMA/ESM-measured variables. This interest peaks in 2020, when most of the papers employing fMRI are published (Figure 5a). In 2011, T1/T2-weighted MRI was incorporated for the first time into studies examining physiological variables derived from blood pressure and heart rate measurements. Subsequently, the focus expanded to include physical activity and sleep as the main variables of interest in human behavior (Figure 5b), peaking in 2021. DWI appeared for the first time among the reviewed papers in 2014, in conjunction with data from physical activity. Shortly after, the interest in integrating DWI with other variables such as sleep increased, peaking in 2020 and 2021 (Figure 5d).
Brain areas reported across the most common magnetic resonance imaging-portable automatic devices (MRI-PAD) combinations.The colors represent the number of studies that reported a specific brain area as statistically significant for (a) T1-weighted MRI alone, (b) T1-weighted MRI and physical activity, (c) functional MRI (fMRI) alone, (d) fMRI and physical activity, (e) fMRI and EMA/ESM, and (f) fMRI and sleep.
94 selected papers shows a rapidly growing interest in developing new approaches that merge MRI techniques with PAD data. This interest is evident in the growing number of publications since 2007, the diversity of measured behaviors and their corresponding MRI combinations, and the range of sampled characteristics, such as age and illness. This trend indicates an emerging paradigm shift in brain research, moving beyond traditional laboratory or clinical settings to embrace more naturalistic, real-life contexts. Thus far, of all PADs, smartphones have been largely covered in combination with MRI data in the work by McGowan et al. (2023), which reviewed methods to combine these data sources and sketched a new network science framework. Our review expands the range of devices, incorporating more PAD search terms that may be of interest to the community. We also address the characteristics of included studies and discuss the brain areas they cover. Moreover, the open code accompanying this review offers a good foundation for future researchers who would like to expand the review to other aspects. Taken together, both reviews point to the growing interest in combining MRI and PADs for the study of brain-behavior relationships.

4.1 | The most common combinations are fMRI, T1/T2-weighted MRI, and DWI with physical activity, EMA, and sleep

4.3 | There is an increasing interest in combining MRI and PAD technologies to study brain-behavior relationships
Research merging MRI and PAD data has grown in volume and diversity since the first publication in 2007 (Figure 5). In the early years, a few studies focused on combining fMRI with physical activity, EMA/ESM, and sleep. Since then, the number of published papers has increased and more MRI techniques have been incorporated. For example, there has been a rise in the use of DWI in the last 3 years and the inclusion of angiography in the past 5 years.
are vital. In these cases, the combined use of PAD and MRI data could enable estimation of causal links between MRI measurements and symptoms, as well as the detection of early warning signs (Wichers et al., 2020). Finally, since MRI tests are expensive and not prescribed for every individual, understanding the connections between real-world behavior and physiology measured by PADs and MRI brain data becomes crucial. By monitoring a patient's behavior using PADs, we could determine the need for more specialized MRI tests based on the collected PAD data, thus optimizing resources and clinical management.
10970193, 2024, 4, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/hbm.26620 by Aalto University, Wiley Online Library on [14/03/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License

TABLE 2 (Continued)

Summary of study characteristics. The summary includes the research design type, instances where researchers reanalyzed data from existing datasets, the number of studies involving patients, and the variety of diagnoses included.

literature combines fMRI, T1/T2-weighted MRI, and DWI with physical activity sensors, EMA/ESM, and sleep. In contrast, angiography and some physiological variables (respiration rate, heart rate variability, glucose, and temperature) are less frequently employed.

merged with physiological signals. Instead, DWI is preferred with physiological data. Yet, regarding behavioral data, the use of DWI is primarily limited to physical activity and sleep. Among the combinations of physiological signals and MRI techniques, blood pressure with T1/T2-weighted MRI is a particularly noteworthy combination, indicating an interest in the

TABLE 3 Summary of employed MRI techniques and devices.
a Some papers use more than one technique.
b One study used both task and resting-state.
c Other types include cuffs, glucose-meters, etc.

Because of these, sleep measurements are mostly restricted to sleep duration, onset, and offset. Therefore, some researchers prefer the accuracy of poly-