Never‐Ending Learning for Explainable Brain Computing

Abstract Exploring the nature of human intelligence and behavior is a longstanding pursuit in cognitive neuroscience, driven by the accumulation of knowledge, information, and data across various studies. However, achieving a unified and transparent interpretation of findings presents formidable challenges. In response, an explainable brain computing framework is proposed that employs the never-ending learning paradigm, integrating evidence combination and fusion computing within a Knowledge-Information-Data (KID) architecture. The framework supports continuous investigation of brain cognition, utilizing joint knowledge-driven forward inference and data-driven reverse inference, bolstered by pre-trained language modeling techniques and human-in-the-loop mechanisms. In particular, it incorporates internal evidence learning through multi-task functional neuroimaging analyses and external evidence learning via topic modeling of published neuroimaging studies, all of which involve human interaction at different stages. Based on two case studies, the intricate uncertainty surrounding brain localization in human reasoning is revealed. The present study also highlights the potential of systematization to advance explainable brain computing, offering a finer-grained understanding of brain activity patterns related to human intelligence.

method on these articles and then stored the extracted BI provenance with key factors in the sample library.
Resource Constraint Analysis. We have so far verified the framework in the scenario of analyzing multi-task brain images to understand higher-order cognition; it remains applicable when the resources are organized and the purpose is defined to satisfy the following principles. In particular, the resources are organized in the knowledge-information-data architecture, covering multi-source text and neuroimages. However, considering data constraints, the framework might struggle in some specific scenarios:
• Firstly, handling multi-scale data, such as genetics, will be challenging. The core challenge is alignment. Accordingly, the framework should enhance its flexibility and alignment ability for integrating task-free data, multi-scale data, and so forth.
• Secondly, the interactions and associations among cognitive functions across conditions and diseases are other valuable topics, for which the structural and functional alignment problem needs further investigation to reduce the gap between cognitive and clinical findings.
• Thirdly, we aim to further confront challenges in translational research to make the framework available in clinical practice. Therefore, we need to integrate other technologies, such as brain stimulation, providing a novel perspective for decoding cognitive mechanisms. We encourage more scientists from different backgrounds to explore and recognize its value in various scenarios, contributing to a more profound understanding.
Computational Complexity Analysis. In the NEL-based explainable brain computing framework, the computational complexity of the following components needs to be considered:
1. The conceptual Data-Brain construction via language models, such as the learning processes from text data to global graphs;
2. The internal evidence learning, such as the analysis processes of parametric maps;
3. The internal evidence learning, such as the mapping processes from the reported coordinates to parametric maps;
4. The external evidence learning, such as the mining processes of named entity recognition;
5. The evidence combination and fusion computing of the internal and external evidence, including the brain data selection, data alignment to a brain template, and the data fusion processes.
Components 1 to 4 have dynamic complexity, depending on the methods used in the framework. In addition, the upper bound of the computational complexity of the fifth component, "Evidence Combination and Fusion Computing", can be given as follows. Herein, N is the number of computational samples, and P is the number of features with respect to the brain template scale. The computational complexity mainly stems from the alignment of the data to a predefined brain template and the fusion of the selected brain data. As the framework does not require pre-training of the data, the computational complexity is approximately O(NP).
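As an illustration of the O(NP) bound, the alignment and fusion steps can be sketched in a few lines of NumPy. The function and variable names here are hypothetical, and the real framework operates on NIfTI images via nibabel/nilearn rather than plain arrays; this is only a minimal sketch of where the N and P factors come from.

```python
import numpy as np

def fuse_evidence(samples, template_mask):
    """Align each evidence map to a common template mask and fuse by
    voxel-wise averaging. Both steps visit every value once, so the
    total cost is O(N * P) for N samples and P template voxels."""
    aligned = []
    for sample in samples:                      # N iterations
        aligned.append(sample[template_mask])   # O(P) alignment (masking)
    return np.mean(aligned, axis=0)             # O(N * P) fusion

# Toy example: 3 evidence maps reduced to a 5-voxel template.
rng = np.random.default_rng(0)
samples = [rng.normal(size=8) for _ in range(3)]
mask = np.array([0, 2, 3, 5, 7])                # hypothetical template voxel indices
fused = fuse_evidence(samples, mask)
print(fused.shape)                              # (5,)
```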
Computational Sensitivity Analysis. On the basis of the KID architecture, the KID loop supports never-ending learning, which continuously tests the goal hypothesis along with the generation, evolution, and learning of multiple sources surrounding knowledge, information, and data. In the early iterations, the proposed brain computing framework is relatively sensitive to the quality of the input data: high-quality data lead to reliable results quickly. However, as the number of iterations increases, the incorporation of more and more internal and external evidence reduces the sensitivity to data quality. To address this issue, we designed operating rules to reduce the impact of low-quality data. In detail, the framework first processes internal evidence with high confidence (from intra-experiment to inter-experiment evidence), and then processes external evidence with relatively low confidence.
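The confidence-ordered operating rule can be sketched as a simple sort. The tier table and field names below are hypothetical illustrations of the rule, not the framework's actual data structures.

```python
# Hypothetical confidence tiers mirroring the operating rule: internal
# evidence first (intra- before inter-experiment), external evidence last.
CONFIDENCE_ORDER = [
    ("internal", "intra"),
    ("internal", "inter"),
    ("external", "intra"),
    ("external", "inter"),
]

def order_evidence(evidence_items):
    """Sort evidence items so that higher-confidence tiers are fused first."""
    rank = {tier: i for i, tier in enumerate(CONFIDENCE_ORDER)}
    return sorted(evidence_items, key=lambda e: rank[(e["source"], e["scope"])])

items = [
    {"id": "E1", "source": "external", "scope": "intra"},
    {"id": "E2", "source": "internal", "scope": "inter"},
    {"id": "E3", "source": "internal", "scope": "intra"},
]
print([e["id"] for e in order_evidence(items)])  # ['E3', 'E2', 'E1']
```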
Technical Summarization. In the resource layer, the data are organized by the Brain Imaging Data Structure (BIDS) standard; the information is organized in relational tables; and the knowledge is organized by the Resource Description Framework (RDF) in triple tables. During the internal evidence learning process, the raw brain images are analyzed by the general linear model and multivariate pattern analysis methods, while the extracted peak coordinates are mapped to the standard brain template via the Python nilearn library. During the external evidence learning process, a global graph is constructed by the "Relation Extraction By End-to-end Language generation (REBEL)" framework [6], as shown in Figure S1 of the Supplementary Information, guiding the construction of personal subgraphs to execute systematic computing operations, together with the human-in-the-loop operations. In addition, the provenances are learned by the Neuroimaging Data Model (NIDM) and BioBERT, identifying the topics of studies through interaction-based neuroimaging topic modeling. These topic-tagged studies with reported results are further computed during the never-ending learning processes. To perform evidence combination and fusion computing, the program is implemented with the Python libraries nibabel, nilearn, and so forth. In the interaction layer, the human can determine the experimental preference and the start-up evidence corresponding to a proposed hypothesis.
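The mapping of peak coordinates onto a standard brain template rests on the image affine. Below is a minimal sketch of the underlying millimetre-to-voxel conversion that libraries such as nilearn and nibabel perform; the affine values shown are an illustrative 2 mm MNI152-style matrix, not taken from the framework itself.

```python
import numpy as np

def mni_to_voxel(coord_mm, affine):
    """Convert an MNI coordinate in millimetres to voxel indices by
    applying the inverse of the image affine (the affine maps voxel
    indices to world millimetre coordinates, so its inverse maps back)."""
    homogeneous = np.append(coord_mm, 1.0)
    voxel = np.linalg.inv(affine) @ homogeneous
    return np.round(voxel[:3]).astype(int)

# An illustrative 2 mm MNI152-style affine (assumed values).
affine = np.array([
    [-2.0, 0.0, 0.0,   90.0],
    [ 0.0, 2.0, 0.0, -126.0],
    [ 0.0, 0.0, 2.0,  -72.0],
    [ 0.0, 0.0, 0.0,    1.0],
])
print(mni_to_voxel([0.0, 0.0, 0.0], affine))  # [45 63 36]
```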

Figure S1.
Construction of the conceptual Data-Brain. A. The global graph is constructed by pre-trained language modeling techniques, from raw text data to a structured knowledge graph. B. A personal graph is constructed by reorganizing the global graph with prompts, covering the four dimensions of function, experiment, data, and analysis, corresponding to the four aspects of the Brain Informatics methodology: systematic investigation of complex brain science problems; systematic design of cognitive experiments; systematic brain data management; and systematic brain data analysis and simulation.
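The reorganization of the global graph into a personal subgraph by prompts can be sketched with toy triples. All graph content and prompt terms below are hypothetical placeholders; the actual global graph is produced by the language-model pipeline described above.

```python
# Hypothetical sketch: a global graph as (subject, relation, object) triples,
# reorganized into a personal subgraph by dimension-specific prompt terms.
GLOBAL_GRAPH = [
    ("reasoning", "is-a", "cognitive function"),    # function dimension
    ("factorial design", "used-in", "experiment"),  # experiment dimension
    ("fMRI image", "stored-as", "BIDS dataset"),    # data dimension
    ("GLM", "applied-to", "parametric map"),        # analysis dimension
]

DIMENSION_PROMPTS = {
    "function": {"cognitive function"},
    "experiment": {"experiment"},
    "data": {"BIDS dataset"},
    "analysis": {"parametric map"},
}

def personal_subgraph(dimension):
    """Select the triples whose object matches the prompt terms of one
    of the four Brain Informatics dimensions."""
    terms = DIMENSION_PROMPTS[dimension]
    return [t for t in GLOBAL_GRAPH if t[2] in terms]

print(personal_subgraph("function"))  # [('reasoning', 'is-a', 'cognitive function')]
```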

Figure S2.
The template graph of systematic experimental planning. The main experiment (mae), which corresponds directly to the goal hypothesis, is the starting point for systematic experimental planning, and the supplementary experiments are driven by the main experiment as continuous support for evidence combination and fusion computing. The supplementary experiments are further divided into various experimental types, including the similar experiment (sie), parallel experiment, deeper experiment (dee), inspired experiment (ine), missed experiment (mie), and subprocessing experiment (spe). The reasoning rules for the various experimental types are described as follows.
R1: If an experiment is identified as a similar experiment, its task shares similar factors with mae in the function and experiment dimensions. Under these circumstances, sie and mae may have different factors in practice, such as device and brain image parameters from multiple data centers.
R2: If an experiment is identified as a parallel experiment, its task shares similar factors in the function dimension with mae but may have different factors in the experiment dimension, such as digits and symbols.
R3: If an experiment is identified as a deeper experiment, its task is used to further explore hidden mental processes related to mae, but corresponds to different hypotheses with other factors in the function dimension. For instance, calculation-related cognitive activity can be studied through arithmetic tasks. However, such a task is relevant not only to calculation processing but also to the integration of numerical and symbolic processing, which must be further considered.
R4: If an experiment is identified as an inspired experiment, its task is used to test the goal hypothesis, involving different factors in the function dimension from mae but sharing similar factors in the experiment dimension with mae.
R5: If an experiment is identified as a missed experiment, its task does not satisfy the aforementioned criteria but evokes brain activities (such as patterns and indicators) similar to those of mae.
R6: If an experiment is identified as a subprocessing experiment, its task is used to test a single aspect related to the goal hypothesis within a dual-task paradigm. For instance, an experimental design for the association study of emotion and calculation may be regarded as two separate tasks to test the emotional and calculation hypotheses, respectively.
The experimental similarity degree (one value for each of the six supplementary experiment types) between the main experiment and its supplementary experiment is computed by experimental similarity assessment (see the systematic experimental planning approach in Section 4.2).
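The rules R1-R6 can be read as a decision procedure over shared factors. Below is a simplified sketch under stated assumptions: the factor representation, the rule ordering, and the reduction of R3/R5 to explicit flags are all illustrative simplifications, not the paper's actual similarity assessment.

```python
def classify_experiment(main, candidate):
    """Hypothetical sketch of the R1-R6 typing rules: compare a candidate
    experiment with the main experiment (mae) along the function and
    experiment dimensions and return its supplementary-experiment type."""
    same_fun = candidate["function"] == main["function"]
    same_exp = candidate["experiment"] == main["experiment"]
    if same_fun and same_exp:
        return "similar"        # R1: both dimensions share similar factors
    if same_fun:
        return "parallel"       # R2: function matches, experiment differs
    if same_exp:
        return "inspired"       # R4: experiment matches, function differs
    if candidate.get("explores_hidden_process_of") == main["id"]:
        return "deeper"         # R3: probes hidden processes of the main task
    if candidate.get("similar_activity_to") == main["id"]:
        return "missed"         # R5: only the evoked brain activity is similar
    return "subprocessing"      # R6: fallback in this toy sketch

main = {"id": "mae", "function": "reasoning", "experiment": "block-digits"}
cand = {"id": "e1", "function": "reasoning", "experiment": "block-letters"}
print(classify_experiment(main, cand))  # parallel
```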

Table S1.
A fragment of the sample library with the symbiosis of internal evidence. In the sample library, each piece of evidence is regarded as a chain of evidence that contains the functional neuroimaging data, the results of studies, and their context, such as the study purpose, experimental design, and processing methods.

ID       | COG       | EPA       | EPR   | SEN     | Subjects (#)
D81 [13] | Reasoning | Factorial | Block | Digits  | Healthy (23)
D82 [13] | Reasoning | Factorial | Block | Letters | Healthy (23)
…

ID: identifier of the experimental data in the sample library; COG: cognitive function; EPA: experimental paradigm; EPR: experimental protocol; SEN: explicit stimulus; #: number of subjects; MDD: major depressive disorder.
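A record in this sample library can be modelled directly after the table legend; a minimal sketch with hypothetical class and field names:

```python
from dataclasses import dataclass

@dataclass
class EvidenceRecord:
    """One evidence chain in the sample library (fields follow the table
    legend: COG, EPA, EPR, SEN, and the subject group with its count)."""
    id: str        # identifier in the sample library, e.g. "D81"
    ref: str       # literature reference, e.g. "[13]"
    cog: str       # cognitive function
    epa: str       # experimental paradigm
    epr: str       # experimental protocol
    sen: str       # explicit stimulus
    subjects: str  # subject group and number of subjects

d81 = EvidenceRecord("D81", "[13]", "Reasoning", "Factorial",
                     "Block", "Digits", "Healthy (23)")
print(d81.cog, d81.sen)  # Reasoning Digits
```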

Table S2.
Twelve categories of neuroimaging entities obtained from the BI provenance model.These entities indicate the key factors in experiments and analyses that can be used for evidence combination and fusion computing.

Brain Area (BRI)
Definition: A brain area is an area of the human cortex that responds to one or several cognitive tasks during the neuroimaging study.
Example: Motor, language, and learning: functional magnetic resonance imaging of the cerebellum.

Cognitive Function (COG)
Definition: A cognitive function is an ability of the brain to process information during the neuroimaging study.
Example: Control of goal-directed and stimulus-driven attention in the brain.

Medical Problem (MDI)
Definition: A medical problem is an abnormal symptom of subjects during the neuroimaging study.
Example: Clinical and experimental study on adrenomedullin in acute myocardial infarction.

Explicit Stimulus (SEN)
Definition: The explicit stimulus is the sensory channel through which stimuli are presented to subjects during the neuroimaging study.
Example: The simulation and analysis of the biological olfactory neural model.

Experimental Task (TSK)
Definition: An experimental task is a cognitive task that the subject needs to complete during the neuroimaging study.
Example: The external datasets for the color-word Stroop task.

Experimental Paradigm (EPA)
Definition: An experimental paradigm is an experimental setup (i.e., a way to conduct a certain type of experiment) that is defined by certain fine-tuned standards and often has a theoretical background, including categorical, parametric, and factorial designs.
Example: Due to the involvement of two factors in the present study, the group-level analysis was implemented based on a 2-by-3 factorial design.

Experimental Protocol (EPR)
Definition: An experimental protocol involves the management of variables, their presentation, the assignment of respondents, and the statistical procedures of analysis, especially for event-related, block, and mixed designs.
Example: Within each session, stimuli were presented randomly in an event-related design.

Subject (SUB)
Definition: A subject is a person who completes the cognitive task during the neuroimaging study.
Example: The patient indicates whether or not the word was shown previously.

Data Acquisition Device (DAD)
Definition: A data acquisition device is a kind of professional equipment used to record the psychological or physiological data of subjects during the neuroimaging study.
Example: Application of positron emission tomography in the central nervous system.

Analytical Tool and Method (TOL)
Definition: An analytical tool or method is a data analysis algorithm or software package used to mine experimental data during the neuroimaging study.
Example: Data processing and analysis of MRI based on principal component analysis.

Activated Feature (ACF)
Definition: An activated feature is a brain response mined from experimental data during the neuroimaging study.
Example: The peak of the activation coordinate began to decrease.

Brain Networks (BRN)
Definition: A brain network is a kind of brain response mined from experimental data during the neuroimaging study.
Example: An fMRI study of deactivation and default mode network activity in human brain.
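For tagging the output of named entity recognition, the twelve categories above can be kept as a small abbreviation-to-name lookup; a minimal sketch (function name is hypothetical):

```python
# The twelve neuroimaging entity categories of Table S2, keyed by their
# abbreviations, as used to label named-entity recognition output.
ENTITY_CATEGORIES = {
    "BRI": "Brain Area",
    "COG": "Cognitive Function",
    "MDI": "Medical Problem",
    "SEN": "Explicit Stimulus",
    "TSK": "Experimental Task",
    "EPA": "Experimental Paradigm",
    "EPR": "Experimental Protocol",
    "SUB": "Subject",
    "DAD": "Data Acquisition Device",
    "TOL": "Analytical Tool and Method",
    "ACF": "Activated Feature",
    "BRN": "Brain Networks",
}

def label(abbrev):
    """Map an NER tag to its human-readable category name."""
    return ENTITY_CATEGORIES.get(abbrev, "Unknown")

print(label("ACF"), len(ENTITY_CATEGORIES))  # Activated Feature 12
```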
Table S3.Partial categories of neuroimaging interactions obtained from the BI provenance model, where interaction indicates semantic relations between entities.

is-part-of (BRI-BRI)
Definition: The "is-part-of" interaction holds between two "Brain Area" entities and indicates the inclusion relation between brain areas.

reflect (COG-ACF)
Definition: The "reflect" interaction holds between a "Cognitive Function" entity and an "Activated Feature" entity and indicates that the activated feature reflects the "Cognitive Function" in cognitive research.

is-located-in (ACF-BRI)
Definition: The "is-located-in" interaction holds between an "Activated Feature" entity and a "Brain Area" entity and indicates that the brain response appears in the "Brain Area".

perform (SUB-TSK)
Definition: The "perform" interaction holds between a "Subject" entity and an "Experimental Task" entity and indicates that the "Subject" performs the "Experimental Task" in the neuroimaging study.

has-the-medical-problem-of (SUB-MDI)
Definition: The "has-the-medical-problem-of" interaction holds between a "Subject" entity and a "Medical Problem" entity and indicates that the "Subject" suffers from the "Medical Problem".

acquire (TSK-DAD)
Definition: The "acquire" interaction holds between an "Experimental Task" entity and a "Data Acquisition Device" entity and indicates that researchers collect brain data related to the "Experimental Task" through the "Data Acquisition Device".

…

Table S4.
Human reasoning-related neuroimaging articles from PubMed and the PLOS series, which are recognized from the sample library based on similarity assessment during the systematic experimental planning process.

Table S6.
The learned τ-Values of the peak coordinates selected from the last loop, LOOP-23, are given throughout all learned loops, where the selection conditions for peaks are Voxels > 500 and τ-Values > 0.