Developmental neuroimaging studies aim to characterize the neural basis of age-related changes in behavior. Research in our laboratory has examined both the behavioral and neurobiological correlates of adolescent cognition using behavioral assessments of oculomotor function [Asato et al.,2006; Luna et al.,2004] together with functional magnetic resonance imaging (fMRI) [Geier and Luna,2009; Geier et al.,2007; Luna et al.,2001; Scherf et al.,2006; Velanova et al.,2008,2009] and diffusion tensor imaging (DTI) [Asato et al., in press]. Our findings indicate that adolescence is a unique period of development during which mature elemental cognitive and neural processes are often available but are irregularly accessed due to continued immaturities in higher order control networks [Velanova et al.,2009]. Performing our fMRI studies has raised a number of questions regarding optimal approaches for developmental neuroimaging, many of which have been addressed at length elsewhere in discussions of pediatric subject preparation and compliance [e.g., Kotsoni et al.,2006; Poldrack et al.,2002; see also Church et al., this issue], co-registration of child and adult data [e.g., Burgund et al.,2002; Kang et al.,2003; Muzik et al.,2000], movement compensation [e.g., Evans et al.,2009], and the treatment of physiologic noise and inhomogeneity of variance [e.g., Samanez-Larkin and D'Esposito,2008; Thomason et al.,2005]. In what follows, we outline some of the less commonly addressed questions and provide potential (and often alternative) responses. Although many elegant tasks and designs have been used to provide important insights into the neural basis of development, we use oculomotor approaches as a model to discuss issues that are of relevance to the field.
What is an Optimal Behavioral Task for Developmental Neuroimaging?
Developmental functional imaging studies have unique considerations regarding both the use of tasks that integrate age-related variability in performance and tasks that can readily provide a link to the underlying neurobiology of interest. Unlike tasks that are readily used in adult studies where comparisons are within-subject, developmental studies probe differences between groups that may differ in many dimensions beyond the variable of interest. An optimal approach is to minimize extraneous developmentally laden factors. Therefore, it is recommended that one use tasks with simple instructions to control for complexities that are extraneous to the tested variables of interest. Although children have limitations in cognitive control, they can adapt their behavior to make use of the abilities that are in place by the use of strategies, which can result in an overestimation of the capacity of core abilities. This is evident in the great adeptness that children have with playing video games. Finally, given that the aim of neuroimaging is to find links between behavior and brain processes, it is a strength to use tasks whose performance has already been linked to neural processes.
Research in our laboratory has made extensive use of oculomotor tasks during fMRI in an effort to optimize the aforementioned considerations, and we use this as an example of a type of task that addresses possible limitations in developmental neuroimaging [for a review, see Luna et al.,2009]. In particular, our research has focused on measuring the metrics of voluntarily generated eye movements and their functional-anatomic correlates across development and on measures related to the inhibition of reflexive eye movements [e.g., Geier et al.,2009b; Luna et al.,2001; Velanova et al.,2008]. Our emphasis on the use of “basic” oculomotor paradigms such as the antisaccade and prosaccade tasks, delayed response, and countermanding tasks stems from their having, as a minimum, three distinct advantages for developmental imaging research. First, these paradigms are often sufficiently simple to be understood and readily performed by children, particularly when the stimulus-response relationship is direct (e.g., a suddenly appearing visual stimulus elicits an eye-movement) rather than requiring transformations across modalities [Cohen and Ross,1978; Ross et al.,1993]. Although performing these tasks is not simple in itself, given that they require sophisticated cognitive control, the instructions (“look at a light,” “do not look at a light,” or “remember the location of a light”) are straightforward so that limitations in following rules do not impede the ability to assess basic cognitive control. Hence, despite their simplicity, a range of oculomotor tasks permits the assessment of basic cognitive functions in a broad array of domains of interest to cognitive and developmental neuroscientists including inhibitory control, working memory, error regulation, and speed of processing.
Second, many oculomotor tasks are relatively resistant to the use of verbal strategies and, consequently, show minimal practice effects [Dyckman and McDowell,2005; Smyrnis,2008]. This matters because practice effects can vary in magnitude across development in a task-dependent fashion [e.g., Ahonniska et al.,2001], thus complicating the interpretation of developmental effects. Strategy use can undermine the ability to probe the developmental status of cognitive processing. Third, many oculomotor tasks have been exceptionally well-characterized, both in terms of their component cognitive processes and in terms of their underlying neuroanatomy, neurochemistry, and neurophysiology due to extensive single-unit research in non-human primates [Bon and Lucchetti,1990; Bruce et al.,1985; Robinson et al.,1978]. Using a task whose neural basis is well-understood allows for ready comparisons across literatures and can enhance the ability to make brain-behavior associations. Hence, these approaches to optimizing developmental neuroimaging studies are inherent in oculomotor studies.
Integration of these considerations is evident in a range of tasks present in the literature, including child-friendly adaptations of more traditional neuropsychological tasks such as the go-no-go [Rubia et al.,2006; Tamm et al.,2002], flanker [Bunge et al.,2002a], stop-signal [Rubia et al.,2007], Stroop [Adleman et al.,2002; Marsh et al.,2006], and working memory tasks [Nelson et al.,2000; Thomas,1999], all of which attempt to minimize instruction demands. Notwithstanding, it is important to note that strategy development in and of itself is a relevant and compelling area of research that can be investigated directly [Cowan et al.,2006; van Leijenhorst et al.,2006]. Further, we have found particular advantages to tasks that can be repeated and are amenable to presentation in multiple, short runs. This permits acquisition of usable data even among participants prone to head movement (typically pre-adolescent children) [Yuan et al.,2009] and/or sleepiness (adolescents) [Millman,2005; Moore and Meltzer,2008]. Finally, we note that simple tasks with the characteristics we have outlined are often ones that lend themselves to parametric manipulation [Bookheimer,2000; Gaillard et al.,2001], the advantages of which are detailed in other articles in this issue.
On What Basis Should Age Groups be Defined?
Understanding developmental stages is a primary aim of developmental neuroscience. Specifically, we would like to be able to characterize functional-anatomic and structural transitions that have implications for age-related changes in behavior and for understanding developmental trajectories in clinical populations. Developmental periods, however, are difficult to define because they vary across individuals and may differ across specific processes. In developmental fMRI research, child groups typically include individuals whose age extends downward to the youngest age at which subjects can reliably be scanned, usually 7 or 8 years of age. However, several groups have extended the age of child samples to include individuals as young as 4 and 5 years with reasonable success [e.g., Byars et al.,2002; Evans et al.,2009; Yerys et al.,2009]. Therefore, we must recognize that in most developmental fMRI research we are characterizing late childhood.
Although in some studies, children in late childhood are not distinguished from adolescents, neuroscientists are increasingly acknowledging the relevance of treating adolescence as a unique developmental period. In the broader literature, adolescence is typically defined as a period of gradual transition between childhood and adulthood beginning around puberty, which usually takes place between 12 and 17 years of age [Dahl,2004; Dorn et al.,2006; Spear,2000]. Consequently, in developmental fMRI research, adolescent groups (when distinguished from children) have been variously defined as spanning 10–13 years to 17–19 years of age reflecting not only the complexity of the inter-related changes occurring during this developmental period [Dorn et al.,2006] but also its inherent variability [Spear,2000].
In our own research, we define age groups based on performance on the task of interest as measured in large independent samples (n > 100) where age can be considered as a continuous variable and changes in performance can be assessed. For our oculomotor tasks we find that performance of inhibitory and working memory tasks begins to be adult-like by 14–15 years of age and that performance differs among the 8–12, 13–17, and 18–30 year age ranges [Asato et al.,2006; Luna et al.,2004]. To adjust for variability, in our fMRI studies we include 13–17 year olds in our adolescent groups, and younger individuals, aged 8–12 years, in our child group. The adult group includes 18-year-olds up to individuals in their mid-20s but not beyond, to ensure that we do not include individuals in regressive stages of development. Although this is still a suboptimal approach, it provides a rationale from which to derive hypotheses regarding different stages of development such as inverted U-shape functions of systems that are believed to peak in adolescence such as the dopamine system and its effects on reward processing in ventral striatum. Although childhood and adulthood could be distinguished from adolescence based on any number of dimensions, we give priority to identifying periods of cognitive difference within a biologically plausible age-range because these are the transitions about which we are asking questions.
Our studies of core components of cognitive control assess “cool” control functions [Zelazo et al.,2003], which are thought to develop according to a biological time table. However, we still acquire subject-assessed measures of puberty, which we have found account for less of the variability in cognitive performance than chronological age [Asato, 2009]. Nonetheless, pubertal stage has been shown to influence performance of a number of tasks, many of which assess domains of interest in current developmental fMRI research such as reward processing and motivational control [Dahl,2004]. For studies in these and similar “hot” domains we suggest that researchers consider not only age but also pubertal stage when defining developmental groups [Dorn,2006].
Pubertal staging is a challenging endeavor as it involves assessment of interacting hormonal processes and individual characteristics. The most direct measurements of pubertal status are invasive and include estimation of bone age achieved using X-ray and assessment of breast and testicular growth by physician exam. Blood and saliva samples may also be used to measure hormone levels but require use of specialized technology and training. One non-invasive method for determining pubertal stage is the Tanner Maturation Scale (TMS) [Marshall and Tanner,1969,1970], a self-report questionnaire that measures the emergence of secondary sexual characteristics that result from puberty and that occur after the onset of hormonal changes [Brooks-Gunn et al.,1985; Duke et al.,1980; Tanner,1962]. Tanner staging has shown high agreement between physician- and self-assessments [Duke et al.,1980], but the validity of the approach is known to depend on factors including age, gender, ethnicity, race, and weight status [Bonat et al.,2002; Neinstein,1982; Raman et al.,2009; Schlossberger et al.,1992].
Given the variability in performance by age in different domains, thought has to be given to a rationale for the age definition of groups, be it determined by performance, theoretical models, or hypotheses. Optimally, and especially if there is no clear rationale for grouping different ages, age should be considered as a continuous variable. Considering age as a continuous variable, however, requires a larger sample than is typically included in fMRI studies, and statistical analyses that include regression and curvilinear approaches instead of ANOVA. We have used both age groups and age as a continuous variable as a confirmatory approach to support our developmental findings [Velanova et al.,2009]. What should be avoided, though seen in the literature, is definition of groups with large age ranges that include young children and adolescents. This is especially the case for cognitive studies where a large literature indicates that important changes occur at these stages.
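As a minimal sketch of what treating age as a continuous variable can look like, the example below fits linear, quadratic, and inverse-age regressions to a synthetic performance measure and compares their fit. The data are made up (a noise-free inverse-age trajectory), the choice of an inverse model is an assumption made for illustration, and the helper names (`solve`, `ols`, `r_squared`) are hypothetical; this is not the analysis code used in the studies cited above.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


def r_squared(X, y, beta):
    """Proportion of variance in y explained by the fitted model."""
    mean_y = sum(y) / len(y)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data: one participant per year of age, latency in ms (made up).
ages = list(range(8, 31))
latency = [250.0 + 2400.0 / a for a in ages]
mean_age = sum(ages) / len(ages)  # age is centered to keep the fits stable

designs = {
    "linear": [[1.0, a - mean_age] for a in ages],
    "quadratic": [[1.0, a - mean_age, (a - mean_age) ** 2] for a in ages],
    "inverse": [[1.0, 1.0 / a] for a in ages],
}
r2 = {name: r_squared(X, latency, ols(X, latency)) for name, X in designs.items()}
for name, value in r2.items():
    print(f"{name:9s} R^2 = {value:.4f}")
```

With real data, which are noisy and have different numbers of parameters per model, a penalized criterion or cross-validation would be a more appropriate basis for choosing among curve shapes than raw R² alone.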
What is an Appropriate fMRI Design?
Having decided on a task pertinent to one's hypotheses and identified appropriate age groups or age-ranges, the precise form of the functional imaging protocol requires consideration. There are many different designs (block, event-related, rapid event-related, mixed, and self-driven experiment designs) that vary in complexity and in limitations and advantages [for details regarding different fMRI designs see Amaro and Barker,2006]. In developmental neuroimaging studies, the specific question being asked may favor one approach over another. Here, we review the main approaches that have been used in developmental neuroimaging studies: block and event-related designs. The simplest design is the block design, in which two periods of different cognitive engagement are compared. Block designs are particularly useful when characterizing the regions that comprise a circuitry underlying a particular process that can be engaged by passive viewing, such as when imaging developmental differences in face versus object processing [Aylward et al.,2005; Scherf et al.,2007]. Block designs have the advantages of robustness [Rombouts et al.,1997], large values of signal change relative to baseline [Buxton et al.,1998], and superior statistical power [Friston et al.,1999], as the rapid presentation of trials or stimuli of interest has an additive effect on the blood oxygenation level dependent (BOLD) signal, and the peak signal at plateau can be repeatedly sampled. However, not all tasks are amenable to blocked presentation (e.g., oddball tasks or others where trials of interest are presented infrequently) and the detection power of these designs is limited if subjects do not engage the cognitive operations of interest throughout block periods, as may be more likely with pediatric groups. For example, trials on which errors are committed may engage different processes than correct trials and if these are averaged together the results are limited in their interpretability [Murphy and Garavan,2004].
Event-related designs have lower signal-to-noise ratios than block designs, but they allow more specific aspects of behavior and of the BOLD response to be investigated. These designs should be used when one's question concerns developmental differences in brain systems underlying specific behaviors. Slow event-related designs involve presenting extended inter-trial-intervals (ITIs; typically 12-s to 18-s periods of rest or fixation) following each experimental trial to allow time for the trial-related BOLD response to recover [Buckner et al.,1996]. However, long ITIs limit the number of trials that can be presented and introduce fatigue and distraction, resulting in limited use of these designs in developmental studies.
A design that is optimal for developmental studies is rapid event-related imaging in which brief variable ITIs are interposed between trials [Buckner et al.,1998; Dale,1999]. Although their detection power is substantially decreased relative to block designs, these designs are attractive for developmental research because they permit presentation and estimation of the hemodynamic response to multiple intermixed trial types. This includes their permitting post hoc sorting of task trials based on subject responses allowing one to distinguish activation associated with correctly performed versus error trials. In addition, event-related fMRI can allow measurement of activation associated with individual trial components when whole (compound) trials are unpredictably intermixed with partial trials [Ollinger et al.,2001a,b]. Consider, for example, a rewarded antisaccade paradigm in which compound trials consist of three components—an initial “reward assessment” period when an incentive cue indicates whether correct performance on the upcoming trial will be rewarded or not, a “response preparation/reward anticipation” period, when subjects anticipate responding for a reward, and a “saccade response” period when subjects must make an eye movement towards a location on a screen opposite the location of a briefly presented light [see Fig. 1; Geier et al.,2009b]. Immature recruitment of brain regions implicated in a specific cognitive process engaged during a single trial epoch (e.g., anticipation/preparation) can then be assessed. Alternatively, it could be that brain regions supporting processing at multiple trial stages are immature and combine to influence behavior (e.g., initial incentive assessment and response preparation). The use of rapid, event-related fMRI, and deconvolution techniques [e.g., Ward,2006] enables one to uncover not only what kinds of trials show developmental differences but also which specific trial components underlie age-related differences. 
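The trial structure described above can be sketched as a stimulus schedule in which compound trials are unpredictably intermixed with partial trials and separated by brief, variable ITIs. All specific values here (a 1.5-s TR, 1.5-s components, the ITI set, and the trial counts) are assumptions chosen for illustration, as are the names `build_schedule` and `TRIAL_TYPES`; they are not the parameters of the cited paradigm.

```python
import random

TR = 1.5  # assumed repetition time in seconds (hypothetical)

# Assumed component durations in seconds (illustrative only).
COMPONENT_DURATION = {"cue": 1.5, "prep": 1.5, "response": 1.5}

# Compound trials contain all three components; partial trials stop early.
TRIAL_TYPES = {
    "compound": ["cue", "prep", "response"],
    "partial_cue": ["cue"],
    "partial_cue_prep": ["cue", "prep"],
}


def build_schedule(n_compound=28, n_partial_each=6, itis=(1.5, 3.0, 4.5), seed=0):
    """Return (events, run_length_s) for one run with jittered ITIs."""
    rng = random.Random(seed)
    trials = (["compound"] * n_compound
              + ["partial_cue"] * n_partial_each
              + ["partial_cue_prep"] * n_partial_each)
    rng.shuffle(trials)  # unpredictable intermixing of trial types
    events, t = [], 0.0
    for kind in trials:
        for component in TRIAL_TYPES[kind]:
            events.append({"trial": kind, "component": component, "onset": t})
            t += COMPONENT_DURATION[component]
        t += rng.choice(itis)  # brief, variable fixation interval
    return events, t


schedule, run_length = build_schedule()
n_cue = sum(e["component"] == "cue" for e in schedule)
n_response = sum(e["component"] == "response" for e in schedule)
print(f"{n_cue} cue events, {n_response} response events, run = {run_length:.1f} s")
```

Because the partial trials end after the cue or after the preparation period, the cue, preparation, and response regressors are no longer perfectly coupled in time, which is what allows their responses to be estimated separately.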
Event-related studies also have the advantage of providing a time series of the BOLD response, which allows characterization of the hemodynamic response function that describes activity in a given region and can reveal group differences.
Finally, mixed block/event-related designs have been used in developmental fMRI [Burgund et al.,2006; Church et al.,2009; Velanova et al.,2009; Wenger et al.,2004]. These designs permit separation of transient activation associated with individual task trials, and activation that is sustained (and constant) throughout an extended task period (together with activation associated with task-period start and end cues) [Chawla et al.,1999; Donaldson et al.,2001; Visscher et al.,2003]. Sustained activation is thought to reflect the activity of a supervisory control network that operates on an extended time-scale to maintain task goals and to modulate transient processing in the service of those goals [Dosenbach et al.,2006]. A growing body of research, including both standard fMRI and investigations of resting state functional connectivity, suggests that maturation of this sustained task-control network plays an important role in the attainment of adult-level task control [Fair et al.,2007; Velanova et al.,2009]. Indeed, our own work demonstrates that the developmental trajectory of controlled signaling is substantially extended relative to that for trial-related controlled processes, extending through adolescence and beyond the age at which transient inhibitory processing reaches maturity. This finding is particularly informative as it suggests that while cognitive processes that support a single correct response may be available, developmental improvements in performance in adolescence are supported by improvements in the ability to maintain task-level control [Velanova et al.,2009].
What is an Appropriate Baseline?
Because there is no absolute level of activation measured by the BOLD response, fMRI studies depend on comparing activation between conditions, one of which is usually considered an experimental condition, and the second a “baseline” comparison. Various approaches to baseline task selection are represented in the developmental fMRI literature. Several investigators emphasize comparison of conditions that vary as much as possible only in their demand for use of a given cognitive process [O'Shaughnessy et al.,2008]. For example, to isolate activation unique to the processing of faces, comparison tasks have been used that require subjects to process other objects that differ from faces regarding biological status (objects such as shoes, etc.), controlling for similar spatial configuration of elements (houses), controlling for familiarity (e.g., greebles), or changing the configuration of elements in a face [Aylward et al.,2005; Scherf et al.,2008]. We have used this approach in our block design studies by comparing trials that require inhibiting a prepotent eye movement response to a visual target with trials where eye movements must be made to a visual target, in this manner “subtracting” out oculomotor processes and focusing on the inhibitory aspect of the task [Luna et al.,2001; Scherf et al.,2006]. This approach has yielded important information regarding development; however, there are limitations: if there are developmental processes that underlie the control condition, this can undermine the ability to capture all developmental aspects of the experimental condition.
Another approach is to use a rest or fixation baseline condition. Fixation baseline, where subjects are simply asked to look at a cross-hair on the screen, is often used as a control condition because it makes few cognitive demands. However, retaining fixation does involve effort. In the oculomotor system, saccades are closer to a rest state and fixation requires active engagement of pause cells in the superior colliculus [Leigh and Zee,2006] to stop reflexive saccades. Developmental studies have shown that the ability to retain fixation for extended periods of time improves throughout childhood [Paus et al.,1990]. Further studies have demonstrated that low-level baseline conditions such as fixation and rest (whether comprising null events in event-related imaging, or a true low-level condition in blocked designs) can be associated with robust activity relative to more constrained tasks [see, for example, Stark and Squire, 2001]. In our rapid event-related fMRI studies we use fixation periods of varying length to separate trials. Although we acknowledge that there are difficulties with interpreting results relative to a fixation baseline, stemming largely from the uncontrolled nature of fixation, and from the fact that activation during fixation can differ across age groups, similar difficulties exist for all potential baseline tasks and the problem with baseline task-selection becomes one of infinite regress. The advantage of reporting both A–B (experimental vs. control) contrasts by age group together with estimates of A and B (separately for each age group) relative to a constant term associated with fixation is that one's assumptions about constancy across development can be tested, and further, it can assist with characterizing activation within the framework of default mode functioning.
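A small worked example may make the reporting recommendation concrete. The numbers below are invented for illustration (they are not results from any study): two age groups show an identical A–B contrast, yet the separate estimates of A and B relative to the fixation constant reveal that the control condition sits above fixation in one group and below it in the other.

```python
# Hypothetical percent-signal-change estimates relative to the fixation constant.
estimates = {
    "children": {"A": 0.75, "B": 0.25},
    "adults": {"A": 0.25, "B": -0.25},
}

# The A-B (experimental vs. control) contrast, computed separately by age group.
contrast = {group: e["A"] - e["B"] for group, e in estimates.items()}

same_contrast = contrast["children"] == contrast["adults"]
b_sign_differs = (estimates["children"]["B"] > 0) != (estimates["adults"]["B"] > 0)

for group, e in estimates.items():
    print(f"{group:8s} A = {e['A']:+.2f}  B = {e['B']:+.2f}  A-B = {contrast[group]:+.2f}")
print("identical contrast:", same_contrast, "| control differs in sign:", b_sign_differs)
```

Reported alone, the contrast would suggest no developmental difference; reported alongside A and B, it shows that the assumption of a developmentally constant control condition does not hold in this invented case.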
How to Consider Timecourse Data?
The nature of immature responses can be assessed by qualitative and quantitative evaluation of the estimated BOLD time series. For example, by looking at time courses one can determine whether a particular brain area shows a positive or negative-going BOLD response profile relative to baseline, or how long an area is engaged in response to the stimulus; statistical tests can then quantify the significance of those differences. Although a common critique of developmental studies is that possible age-related differences in vascular physiology may undermine the BOLD response and group differences, a range of studies have provided evidence to indicate that this is very unlikely (see Church et al., in this issue). Below, we present a series of steps used by our laboratory and others to extract estimated time courses and make statistical comparisons across age groups and experimental conditions.
We distinguish “modeling the hemodynamic response” and “estimating time courses” as distinct approaches. We use the term “modeling” in reference to analyses that use a pre-specified response shape or function (e.g., SPM gamma variants) in the regression model. In this approach, the underlying shape is fixed and what varies are specific parameters (e.g., magnitude) that are fit to the imaging data. There are numerous options available for which specific model to use, which typically vary based on the number of free parameters, ranging from simple one parameter models (e.g., magnitude in a single parameter gamma function) to more complex models that include multiple parameters (e.g., initial dip, magnitude, duration, and under-shoot). A different approach is to estimate the hemodynamic response (HDR) for a given regressor of interest in a GLM analysis using either multiple delta functions or a series of basis functions. Common to the use of delta and basis functions is that no assumptions are made about the specific shape of the HDR (which can vary across vascular territories), giving the freedom to obtain any shape. This approach thus enables us to characterize and interpret differences in the shapes of time courses, which may be particularly useful for developmental studies. The advantages of using basis rather than delta functions include that the stimuli need not be time-locked to the TR and that fewer parameters need to be estimated, increasing power. Several different basis functions are available in the literature (e.g., finite impulse response basis sets [Lindquist and Wager,2007] and SPM gamma variants [Ward,2006]); more recent work has also used Bayesian methods to optimize estimation [Woolrich et al.,2004]. In our own work, we have used both tent and sine basis functions.
Although use of the tent function is perhaps more common, the sine series approach is useful in that fewer parameters need to be estimated and the estimates are less sensitive to large fluctuations in signal due to outliers. The sine series also makes the assumption that the transitions between estimated time points are smooth, which is likely true of actual blood flow. We have modeled the same dataset using sine and tent basis functions and found that they yield nearly identical results.
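The distinction between delta-function and basis-function estimation can be sketched on simulated data. In the example below, a made-up response (piecewise linear, spanning nine TRs) is added to a constant baseline at jittered onsets, and the response is then recovered twice by least squares: once with one delta-function regressor per time point (nine parameters) and once with five tent functions spaced two TRs apart. The onsets, response shape, knot spacing, and helper names are all assumptions for illustration; the simulation is noise-free, does not show a sine basis, and is not the analysis pipeline used in our studies.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


N_SCANS, N_LAGS = 120, 9   # assumed run length and response window, in TRs
KNOTS = [0, 2, 4, 6, 8]    # assumed tent knots, spaced 2 TRs apart


def tent(lag, knot, width=2):
    """Triangular basis function: 1 at its knot, 0 at the neighboring knots."""
    return max(0.0, 1.0 - abs(lag - knot) / width)


# B maps the 5 tent coefficients onto a 9-point response.
B = [[tent(k, knot) for knot in KNOTS] for k in range(N_LAGS)]

# Made-up "true" response, piecewise linear between the knots.
true_hrf = [sum(b * v for b, v in zip(row, [0.0, 0.8, 1.0, 0.3, 0.0])) for row in B]

# Jittered trial onsets (in TRs), so that overlapping responses can be separated.
gaps, onsets, t = [3, 5, 4, 10, 6], [], 2
while t + N_LAGS <= N_SCANS:
    onsets.append(t)
    t += gaps[len(onsets) % len(gaps)]
onset_set = set(onsets)

# Delta-function design: column k is 1 when a trial began k scans earlier.
X_fir = [[1.0 if (scan - k) in onset_set else 0.0 for k in range(N_LAGS)]
         for scan in range(N_SCANS)]
y = [100.0 + sum(x * h for x, h in zip(row, true_hrf)) for row in X_fir]

# (a) Delta functions: one free parameter per time point, plus a constant.
beta_fir = ols([row + [1.0] for row in X_fir], y)
hrf_fir = beta_fir[:N_LAGS]

# (b) Tent basis: project the delta design onto the tents, then fit.
X_tent = [[sum(row[k] * B[k][j] for k in range(N_LAGS)) for j in range(len(KNOTS))]
          for row in X_fir]
beta_tent = ols([row + [1.0] for row in X_tent], y)
hrf_tent = [sum(b * c for b, c in zip(B[k], beta_tent[:len(KNOTS)]))
            for k in range(N_LAGS)]

print("true :", [round(v, 2) for v in true_hrf])
print("delta:", [round(v, 2) for v in hrf_fir])
print("tent :", [round(v, 2) for v in hrf_tent])
```

Both fits recover the simulated response here because the data contain no noise and the true shape lies within the span of the tent basis; the practical differences between the approaches arise with noisy data, where estimating fewer parameters matters.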
We have taken the following steps to analyze estimated time courses. First, time series data obtained from deconvolution analyses from all subjects and conditions are entered into an omnibus, mixed effects ANOVA. The result of this ANOVA is an uncorrected “main effect of time” image that shows all voxels demonstrating significant modulation across time (i.e., voxels that were active during the task), regardless of trial type. Next, we use an automated search algorithm to select peak voxels. A sphere mask (on the order of ∼9 mm in diameter) can then be placed around the peak. The “main effect of time” image is then corrected for multiple comparisons and sphericity, and exploratory functionally defined regions of interest (ROI) may then be derived from a conjunction of the uncorrected image and the corrected image, or an anatomical ROI applied as a mask. The estimated time courses from each remaining voxel within the ROI are averaged at each time point and across subjects for each given experimental condition. In this manner, one can ensure that the same regions are being considered across subjects. The result is a single, mean time course for a particular ROI and experimental condition.
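The region-definition and averaging steps can be sketched as follows. The example builds a sphere of roughly 9 mm diameter around a peak voxel, restricts it to voxels that survive a (hypothetical) corrected significance mask, and averages the estimated time courses first across voxels within each subject and then across subjects. The 3-mm voxel size, the peak coordinate, the mask, the toy time courses, and the function names are assumptions for illustration only.

```python
from itertools import product


def sphere_voxels(peak, diameter_mm=9.0, voxel_mm=3.0):
    """Voxel coordinates whose centers lie within the sphere around `peak`."""
    radius = diameter_mm / 2.0
    reach = int(radius // voxel_mm) + 1
    voxels = set()
    for dx, dy, dz in product(range(-reach, reach + 1), repeat=3):
        if voxel_mm * (dx * dx + dy * dy + dz * dz) ** 0.5 <= radius:
            voxels.add((peak[0] + dx, peak[1] + dy, peak[2] + dz))
    return voxels


def roi_mean_timecourse(timecourses, roi):
    """Average over ROI voxels within each subject, then across subjects.

    `timecourses` maps subject -> {voxel: [estimate at each time point]}.
    """
    per_subject = []
    for voxel_data in timecourses.values():
        series = [voxel_data[v] for v in roi if v in voxel_data]
        n_points = len(series[0])
        per_subject.append(
            [sum(s[i] for s in series) / len(series) for i in range(n_points)])
    n_points = len(per_subject[0])
    return [sum(s[i] for s in per_subject) / len(per_subject)
            for i in range(n_points)]


peak = (10, 10, 10)                # hypothetical peak voxel (grid indices)
sphere = sphere_voxels(peak)

# Hypothetical corrected mask: here, simply every sphere voxel with x >= 10.
corrected_mask = {v for v in sphere if v[0] >= 10}
roi = sphere & corrected_mask

# Toy estimated time courses for one condition (identical across voxels).
timecourses = {
    "subject_01": {v: [0.0, 1.0, 2.0, 1.0] for v in sphere},
    "subject_02": {v: [0.0, 3.0, 4.0, 1.0] for v in sphere},
}
mean_tc = roi_mean_timecourse(timecourses, roi)
print(len(sphere), "voxels in sphere;", len(roi), "in ROI; mean:", mean_tc)
```

Applying the same set of ROI voxels to every subject is what ensures that the same region is being compared across participants and age groups.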
One issue that we have encountered in our own studies is the presence of later-occurring peaks (greater than ∼10 s) in the estimated time series (see Fig. 2). Such peaks are relatively rare and, we presume, are often discarded as noise artifacts in many studies. It is currently unclear in the literature whether these temporally later peaks have functional significance or if such differences in shape are actually substantial enough in the age ranges discussed to warrant concern. If atypical shapes occur in specific regions while other regions demonstrate a more typical HDR shape or if only one group demonstrates a consistent irregular shape in a specific region across subjects, this warrants examination. It is intriguing to speculate that such secondary peaks may reflect individual subject variability in the recruitment of a specific area, or perhaps underlying processes that, while statistically related to a particular regressor, occur over a more protracted time scale (e.g., dopamine second messenger signaling). Alternatively, these peaks could simply be a result of deconvolution analyses, which do not assume a fixed HDR shape, overfitting the data. One temporary approach that we have taken to account for these data is to reanalyze the time series with repeated measures ANOVA, including only early time points (time zero to ∼10–12 s), and to report results from the whole and partial time series [see Fig. 2, Geier and Luna,2009].
In summary, estimating time courses using basis functions can provide important information about the shape of the hemodynamic response and may inform specific processes underlying developmental differences.
What is an Appropriate Basis for the Definition of ROI?
The inclusion of multiple groups and multiple task levels in developmental fMRI makes it difficult to discern patterns of activation across conditions based solely on whole brain statistical activation maps. A common response, as we have implied, is to conduct statistical analyses of activation parameters within ROI. But, on what basis should one define ROI for investigation in developmental fMRI? Ideally, one should identify regions based on explicit hypotheses about differences among conditions between age groups, defining regions either anatomically or based on independent data from pre-existing studies, meta-analyses [Poldrack,2007], or split (by run or subject) from the data-set to be interrogated [Kriegeskorte et al.,2009; Poldrack and Mumford,2009]. Under these circumstances, valid statistical inference is possible based on test statistics obtained within ROI. However, in current developmental imaging, exploratory analyses are also often justified, given the emerging nature of the field, and these may require specification of functionally defined regions based on the data set under investigation.
Although it is not appropriate to define regions based on the effect for which those regions will be interrogated (e.g., examining age group effects in regions defined using a statistical image testing for voxels showing age group effects), it is similarly not appropriate to assume that one can make valid statistical inferences based on data from regions derived from non-orthogonal contrasts or from omnibus F-tests, as cogently pointed out in a number of recent reviews [Kriegeskorte et al.,2009; Poldrack and Mumford,2009; Vul et al.,2009], which detail the many ways in which statistical results can be impacted by selective analyses of non-independent data. In particular, Kriegeskorte et al. [2009] note that analysis of data from regions thus defined is prone to systematic biases. This is particularly likely to be the case in developmental studies, where data from unique age groups can be differentially affected by noise, where designs are frequently unbalanced, and where use of complex models makes distortion of parameter estimates (i.e., overfitting) more likely.
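The bias introduced by non-independent region definition can be illustrated with a simulation on pure noise. In the sketch below there is no true age effect anywhere: each simulated voxel has an adult-minus-child difference drawn from noise, once in a "selection" half of the data and once in a "held-out" half. Voxels are chosen by their age effect in the selection half, and the effect is then measured both in the same data (circular) and in the held-out data (independent). The voxel count, group size, and 100-voxel cutoff are arbitrary assumptions for illustration.

```python
import random

rng = random.Random(0)
N_VOXELS, N_PER_GROUP = 2000, 10   # arbitrary sizes, for illustration only


def group_difference(rng):
    """Adult-minus-child mean for one voxel of pure noise (no true effect)."""
    children = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
    adults = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
    return sum(adults) / N_PER_GROUP - sum(children) / N_PER_GROUP


# Two independent halves of the data (e.g., split by run or by subject).
selection_half = [group_difference(rng) for _ in range(N_VOXELS)]
held_out_half = [group_difference(rng) for _ in range(N_VOXELS)]

# Define the "region" as the 100 voxels with the largest age effect
# in the selection half.
top = sorted(range(N_VOXELS), key=lambda v: selection_half[v], reverse=True)[:100]

# Circular estimate: measure the age effect in the data used for selection.
circular = sum(selection_half[v] for v in top) / len(top)

# Independent estimate: measure the age effect in the held-out half.
independent = sum(held_out_half[v] for v in top) / len(top)

print(f"circular estimate:    {circular:+.3f}")
print(f"independent estimate: {independent:+.3f}")
```

The circular estimate is substantially greater than zero even though no effect exists, whereas the held-out estimate stays near zero, which is the rationale for defining regions from split or otherwise independent data when inferential statistics are to be reported.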
However, we share with Poldrack et al. the view that exploratory data analysis in regions derived from non-independent data is not without some utility. In particular, for developmental fMRI, such analyses are critical for data visualization and quality control. Even in this circumstance, however, we advocate that regions be derived from statistical images from at least nominally independent effects (e.g., main effect of time images) and that the exploratory nature of any depiction of such effects be clearly described.