Institut National de la Santé et de la Recherche Médicale U455, Hôpital Purpan, Toulouse, France; and School of Psychology, University of Wales, Bangor, United Kingdom
ABSTRACT
I. INTRODUCTION: LESSONS FROM APHASIA
II. FUNCTIONAL NEUROIMAGING TECHNIQUES
  A. Overview of Neuroimaging Tools
    1. Basic principles of PET and fMRI
    2. Recent evolution in fMRI
    3. Electrophysiology and imaging
    4. New imaging techniques
  B. Tracking Brain Activations
    1. The activation paradigm
    2. Statistical approaches to structure-function relationships
      A) FACTORIAL DESIGN.
      B) COGNITIVE CONJUNCTIONS.
      C) PARAMETRIC APPROACH, CORRELATIONAL APPROACH, AND STRUCTURAL EQUATION MODELING.
    3. Convergence of neuroimaging and electrophysiology
  C. Outstanding Questions
    1. Influence of subject-dependent and stimulus-dependent parameters
    2. New methodological challenges
      A) TIME COURSE OF EVOKED HEMODYNAMIC RESPONSES.
      B) FUSION OF NEUROIMAGING AND ELECTROPHYSIOLOGY.
    3. Limitations of neuroimaging
      A) HARDWARE CONSTRAINTS.
      B) METHODOLOGICAL CHOICES.
III. LANGUAGE IN THE "HEALTHY" ADULT BRAIN
  A. Single-Word Processing
    1. Auditory input
    2. Visual input
    3. Semantics
    4. Speech output
    5. Written output
  B. Sentence and Discourse Processing
    1. Sentence context and semantics
    2. Dissociating semantics and syntax
    3. Discourse-level processing
  C. Beyond the "Standard" Language
    1. Illiteracy
    2. Multilingualism
    3. Sensory deficits
  D. Outstanding Questions
    1. Hemispheric predominance for language
    2. Ventral and dorsal language functional pathways
    3. Controlled versus automatic language processes: attention and memory
IV. LANGUAGE AND BRAIN PLASTICITY
  A. Developmental Plasticity
    1. Normal development
    2. Developmental disorders: the case of developmental dyslexia
  B. Postlesional Plasticity
    1. Language recovery poststroke
    2. Language reorganization in neurodegenerative diseases
  C. Outstanding Questions
V. CONCLUSIONS
ACKNOWLEDGMENTS
REFERENCES
| ABSTRACT |
|---|
| I. INTRODUCTION: LESSONS FROM APHASIA |
|---|
In the century following the foundation of clinical neuropsychology, a crucial advance in cognitive neuroscience was the advent of radiological and isotopic methods for imaging brain structures and functions. X-ray computerized tomography (CT scanner) first appeared in 1967 (Hounsfield, Nobel Prize 1979) and allowed the site and extent of brain lesions to be located with precision in vivo. Meanwhile, cognitive models of language and memory processes were developed and helped to characterize more precisely the nature of cognitive deficits observed in neurological patients. Twenty years ago, further progress in nuclear physics, magnetic fields, informatics, and, more generally, electronics triggered a revolution in the history of neurophysiology. The advent of functional neuroimaging has made the dream of structure-function researchers come true: it is now possible to directly correlate mental operations with indices of brain activity. The great enthusiasm generated by these revolutionary techniques must, however, be tempered by the significance of the results obtained so far in the domain of language, which will be discussed in this review. Nevertheless, the profusion of neuroimaging studies inquiring into various aspects of language processing and, in particular, those identified by psycholinguistic models, has accumulated enough data over two decades to complement, and even challenge, the classical aphasia model.
At this point it must be noted that the aphasia model is far from adequate when it comes to deriving representations of brain functions. Although some of the first lesion-related findings have never been totally invalidated (e.g., the involvement of Broca's area in speech production and the critical role of the left superior temporal gyrus in auditory verbal comprehension; for contemporary examples, see Refs. 138, 196), there are several reasons that make the interpretation of lesion studies complex and, in some cases, impossible. Four major limitations of the classical aphasia model should be considered in particular.
1) Language-related brain regions are embedded in complex and highly interconnected networks. It is therefore very unlikely that accidental lesions selectively affect specialized neural networks, such as those related to language, without damaging other, possibly less specialized, functional systems. In particular, lesions may impair language processes by affecting neural structures (such as the basal ganglia or the thalamus) that are connected to specialized brain areas and/or their mutual connections without damaging the specialized areas themselves. Consequently, anatomical lesions located a fair distance away from language-related regions can still affect language function (33).
2) Despite the massive development of diagnostic tools and tests (e.g., Boston Diagnostic Aphasia Examination, Goodglass and Kaplan, 1972, Psychological Assessment Resources, Lutz, FL 33549), the classical syndrome-based approach to aphasia has proven insufficiently specified (423). Not only do syndromes such as "Broca's aphasia" correspond to poorly defined entities in terms of the cognitive components involved, but also aphasic syndromes as a whole (e.g., agrammatism) cannot be trivially related to reproducible and consistent lesion sites (13). Even when considering more specific disorders such as aphasic symptoms (e.g., word finding difficulties; Refs. 295, 423), necessary and sufficient correspondence between a specific lesion site and a symptom is rarely established. There are several reasons for the scarcity of such one-to-one relationships. It is generally accepted that lesion studies in aphasia can indicate which brain region is necessary for implementing a language process by observing language disorders following focal brain lesions. Notwithstanding the oversimplification of real conditions implied by this logic (see the first point above), the validity of this assumption is generally not assessed in formal, Bayesian terms. Indeed, "sufficiency" implies that the presence of a specific lesion can predict the symptom of interest in any patient. On the other hand, "necessity" implies that the lesion must be present for the symptom to be observed. Although necessity or sufficiency relationships are sometimes validated in clinical data (see Table 1), their coexistence is very rare. In fact, since language, like other higher cognitive functions, is thought to rely on the interplay of many different brain areas (89, 240, 316), lesion-symptom relationships are likely to be influenced by a set of distributed regions rather than a single, circumscribed area (4).
4) Numerous subject-dependent factors, such as gender, age, handedness, and literacy (see sect. IIC1 and Table 2), whose precise effects are still poorly understood, seem to substantially influence aphasic symptoms (35, 63). For instance, stroke more frequently affects elderly patients; therefore, the influence of age on aphasic symptoms cannot easily be dissociated from the etiology of the lesion.
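The necessity and sufficiency relationships discussed under point 2 can be read as conditional probabilities estimated from a two-by-two table of patient counts. The sketch below only illustrates that reading: the function name and the counts are hypothetical and are not taken from Table 1 or from any cited study.

```python
def lesion_symptom_probabilities(both, lesion_only, symptom_only, neither):
    """Estimate (sufficiency, necessity) from a 2x2 table of patient counts.

    sufficiency ~ P(symptom | lesion): how well the lesion predicts the symptom.
    necessity   ~ P(lesion | symptom): how often the symptom comes with the lesion.
    `neither` completes the table but is not needed for these two ratios.
    """
    with_lesion = both + lesion_only
    with_symptom = both + symptom_only
    if with_lesion == 0 or with_symptom == 0:
        raise ValueError("need at least one patient with the lesion and one with the symptom")
    sufficiency = both / with_lesion
    necessity = both / with_symptom
    return sufficiency, necessity


# Hypothetical counts, invented for illustration only.
suff, nec = lesion_symptom_probabilities(both=18, lesion_only=2, symptom_only=12, neither=68)
print(f"P(symptom | lesion) = {suff:.2f}")  # 0.90
print(f"P(lesion | symptom) = {nec:.2f}")   # 0.60
```

In this invented sample the lesion predicts the symptom fairly well, yet many patients show the symptom without the lesion, which is the kind of dissociation between sufficiency and necessity described above.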
Based on different physical principles, positron emission tomography (PET), functional magnetic resonance imaging (fMRI), and multichannel electro- or magnetoencephalography (EEG, MEG) make it possible to measure various indices of ongoing neural activities arising from the brain "in action." Depending on their relative spatial and temporal resolutions, they depict the functional anatomy of cognitive operations and/or index their time course. However, the above-mentioned limitations of the aphasia model should be kept in mind because most of them apply equally to the study of language physiology in normal subjects.
We first address various technical and methodological issues in neuroimaging. Second, we review a series of significant contributions to the understanding of language based on the study of normal subjects. Third, we discuss the matter of language processing in the context of developmental or postlesional brain plasticity.
| II. FUNCTIONAL NEUROIMAGING TECHNIQUES |
|---|
A. Overview of Neuroimaging Tools
In contrast to the static clinical anatomic paradigm, the "activation" method used in neuroimaging experiments is fundamentally dynamic. This method relies on recording changes in indices of cerebral activity (for reviews on technical issues, see Ref. 403). These variations are recorded in the form of tomograms, i.e., successive sets of slices through the brain that allow measurement of the regional cerebral blood flow (rCBF) in different areas. In most studies the main independent variable generating the statistical variance of interest is time. Experiments are based on repeated functional imaging measurements and alterations of controlled experimental parameters over a series of blocks ("block design") or trials ("event-related design," see sect. IIA2 and Fig. 1). Time scales differ dramatically from one neuroimaging technique to another. In the original, rudimentary use of PET, the temporal resolution was on the order of several minutes (>10 min/measurement). In contrast, the maximal temporal resolution of fMRI is in the range of a few hundred milliseconds (122); the typical time needed to acquire a single functional brain slice is in the region of 50 ms. However, the repetition time used to acquire more than one functional slice is typically on the order of 2 to 4 s, and the time needed to sample the entire hemodynamic response is in the range of 12 s (11). Whatever the temporal resolution, the activation method is based on the comparison of signal level across different experimental conditions (e.g., one condition eliciting a particular cognitive process versus another in which the process is not likely to occur). A basic activation experiment features an "active" experimental task (involving stimulation, cognitive computation, and response) and a "rest" condition in which none of these stages is supposed to be involved.
The statistical difference between condition-specific patterns of activation is then likely to reflect neural activities associated with the process under study. In spite of many unresolved methodological issues (see sect. IIB), this approach has generated great enthusiasm and has produced attractive, colorful activation maps that were soon found to be congruent with physiological predictions.
1. Basic principles of PET and fMRI
Neuroimaging techniques (such as PET and fMRI) have poor temporal resolution relative to neuronal firing rates but reasonably good spatial resolution throughout the entire brain volume. The typical precision of images (on the order of a cubic millimeter) remains unsatisfactory for functional exploration at the neuron level, however. This limited anatomical resolution might be only an apparent drawback because high-order cognitive operations such as language processing are likely to involve large neural assemblies rather than microscopic circuits.
Signals exploited in PET (gamma emissions provoked by the positron-emitting ¹⁵O isotope incorporated in water and injected into the systemic blood supply) or fMRI (magnetic susceptibility of hemoglobin indicating blood oxygenation status) relate to local changes in vascular parameters deduced from the local concentration of the tracer in microvessels. Considering a population of approximately 10⁴ synapses in which the energetic demand varies suddenly as a consequence of sensory stimulation, motor response, or any kind of neural operation, such vascular changes have been estimated to happen in a brain volume a few hundred micrometers in diameter, using optical cortical activity imaging (404).
In fMRI experiments, the most studied signal is known as BOLD (blood oxygenation level dependent). It is based on the measure of changes in magnetic susceptibility of hemoglobin, depending on whether or not it carries an oxygen molecule (271). A sudden increase in synaptic metabolism is thought to be followed by a transient drop in oxyhemoglobin concentration, and consequently in the BOLD signal, in vessels neighboring activated neurons (429). A major increase in oxyhemoglobin concentration then occurs as a consequence of vessel dilatation, with a peak observed approximately 5 to 6 s after stimulus onset time (SOT). This massive local vascular response provides more metabolites than needed by neural activity. The physiological relation between blood flow in gray matter vessels and local variations of neural metabolism remains largely unknown, although recent advances have begun to unravel some important characteristics of this phenomenon (217), such as the role of astrocytes as "energy transducers" interposed between capillary walls and neurons (222).
Combining resolution in space and time is a general requirement for neuroimaging of cognitive functions; it is especially crucial when studying the neural correlates of language functions. Indeed, if one accepts that written language is an acquired artifact, human language depends primarily on auditory and vocal functions, which are linked to time in their very essence, as they rely on a continuous stream of events. Because it is now possible to record fMRI signals from the whole brain volume in approximately 1 s, the temporal resolution of the BOLD effect is much higher than that of the ¹⁵O-PET response, which can only be computed after integration of gamma activity over a minimum of 30 s. Although some parameters, such as pulse or respiratory rates, have to be carefully controlled in fast fMRI acquisition, this method is emerging as the state-of-the-art approach to brain mapping, in paradigms such as single-trial acquisition.
Two methods are currently used to acquire fMRI data: block design and single-trial or event-related design (see Fig. 1). In the first method, the alternation of different conditions (activation/rest) is used as an input function that is convolved with the hemodynamic response. The different blocks consist of different conditions, during which stimuli are presented and/or responses are required from subjects. A statistical analysis then identifies voxel signals that correlate with the alternation of experimental conditions (11). Such a correlational approach improves the amount of information that can be extracted from fMRI data because the signal-to-noise ratio is low when classical subtractive t-test analyses are employed.
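The block-design logic described above (an alternation of conditions convolved with a hemodynamic response, then correlated with each voxel's signal) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis of Ref. 11: the gamma-shaped response, the 2-s sampling interval, the block length, and the two simulated voxels are all hypothetical.

```python
import math
import random


def hrf(t, tau=2.5):
    """Assumed gamma-shaped response, peaking at t = 2 * tau seconds."""
    return (t / tau) ** 2 * math.exp(-t / tau) if t >= 0 else 0.0


def convolve(signal, kernel):
    """Discrete causal convolution, truncated to the length of the signal."""
    out = []
    for i in range(len(signal)):
        out.append(sum(signal[i - k] * kernel[k] for k in range(min(i + 1, len(kernel)))))
    return out


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


TR = 2.0        # seconds between volumes (assumed)
N_SCANS = 120
BLOCK = 10      # scans per block: 20 s of rest, then 20 s of task

# Boxcar input function: 0 during rest blocks, 1 during activation blocks.
boxcar = [1.0 if (i // BLOCK) % 2 == 1 else 0.0 for i in range(N_SCANS)]
kernel = [hrf(k * TR) for k in range(16)]   # 30 s of assumed response
regressor = convolve(boxcar, kernel)

# Two simulated voxels: one follows the task, the other is pure noise.
rng = random.Random(0)
active_voxel = [2.0 * r + rng.gauss(0, 1) for r in regressor]
silent_voxel = [rng.gauss(0, 1) for _ in range(N_SCANS)]

r_active = pearson(regressor, active_voxel)
r_silent = pearson(regressor, silent_voxel)
print(f"task-related voxel r = {r_active:.2f}, unrelated voxel r = {r_silent:.2f}")
```

Correlating against the convolved regressor, rather than the raw boxcar, accounts for the delay and smoothing of the vascular response.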
In the second method, single-trial or event-related fMRI (55, 337), the hemodynamic response corresponding to each single stimulation is acquired individually. This is achieved using two different procedures: 1) stimuli are presented tens of seconds apart, allowing the complete sampling of the hemodynamic response between two stimuli (i.e., over at least 12 s, Ref. 10), and 2) alternatively, stimuli are presented at a faster rate (e.g., every second) and unitary hemodynamic responses are reconstructed by deconvolving the summated response acquired over the entire series (162, 241). Compared with block design, single-trial design makes it possible to present stimuli in a randomized order and therefore reduce habituation effects. Some studies have even described the recording of single events, even though signal-to-noise ratio in association cortices is in most cases insufficient (e.g., Ref. 270).
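The second procedure, in which unitary responses are reconstructed from the summated signal, can be illustrated with a least-squares deconvolution. The sketch below assumes that overlapping responses add linearly and that stimulus onsets are randomly spaced; the response shape, the 1-s sampling interval, the noise level, and the helper names `solve` and `deconvolve` are hypothetical and do not reproduce the methods of Refs. 162 or 241.

```python
import random


def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def deconvolve(onsets, signal, n_lags):
    """Least-squares estimate of the unitary response from a summated signal."""
    n = len(signal)
    # Design matrix: column k marks the samples that fall k steps after an onset.
    design = [[onsets[t - k] if t >= k else 0.0 for k in range(n_lags)] for t in range(n)]
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(n_lags)]
           for i in range(n_lags)]
    xty = [sum(row[i] * y for row, y in zip(design, signal)) for i in range(n_lags)]
    return solve(xtx, xty)


rng = random.Random(42)
# Invented unitary response sampled every second over 12 s.
true_response = [0.0, 0.1, 0.4, 0.8, 1.0, 0.9, 0.6, 0.35, 0.15, 0.05, 0.0, 0.0]
n_samples = 400
# A stimulus occurs at each sample with probability 0.3, so responses overlap heavily.
onsets = [1.0 if rng.random() < 0.3 else 0.0 for _ in range(n_samples)]
signal = [
    sum(true_response[k] * onsets[t - k] for k in range(len(true_response)) if t >= k)
    + rng.gauss(0, 0.05)
    for t in range(n_samples)
]

estimate = deconvolve(onsets, signal, n_lags=len(true_response))
max_error = max(abs(e - h) for e, h in zip(estimate, true_response))
print(f"largest deviation from the simulated response: {max_error:.3f}")
```

Random spacing matters here: with strictly periodic stimuli at every sample, the columns of the design matrix would be nearly identical and the system could not be solved.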
Some authors have pushed the temporal resolution of event-related fMRI to its current limit (i.e., the time necessary to acquire one slice) to temporally discriminate activated clusters. For example, Menon et al. (238) used a single-trial design to show that the BOLD response elicited by a visuomotor task in the premotor cortex was delayed compared with the BOLD response of the primary visual cortex. Moreover, they found a robust correlation between participants' reaction times and the lag between onsets of hemodynamic response in the primary visual area and the supplementary motor area. Some preliminary results in the language domain also appear very encouraging (154, 391, 395, 396; see sect. IIC2). Single-trial fMRI might well become the best procedure for collecting spatial and temporal information simultaneously because its temporal resolution, which is in the range of 100 ms/slice, can still be improved (for a review on the potential of time-resolved fMRI, see Ref. 122). Important questions currently being addressed are as follows: 1) What is the reproducibility of the hemodynamic response in different conditions, regions, subjects, or scans? 2) Does the slow time course of the BOLD effect impair its capacity to temporally discriminate brain activations? 3) Can we model the relationship between the amplitude of the signal and its time course in different brain regions?
3. Electrophysiology and imaging
Electrical and magnetic neuroimaging techniques are based on the noninvasive recording of electrical and magnetic field variations induced by neural activity. Neural electrical activity can be divided into two categories: action potentials (APs) and postsynaptic potentials (PSPs). An AP corresponds to the propagation of ion flux bursts along the axon of a neuron and can be described as a quadrupole, the magnetic and electric fields of which decay rapidly (422). PSPs can be either excitatory or inhibitory and are larger bursts of ion exchanges on the surface of postsynaptic neurons. In contrast to APs, PSPs can be described by current dipoles active for several tens of milliseconds. As a consequence, magnetic and electrical fields recorded over the scalp derive from the summation of PSP dipoles rather than AP sources (218).
Pyramidal cells, mainly found in layer V of the cortex, are tall and parallel to one another. Their orientation is perpendicular to the surface of the cortex, and groups of a few hundred thousand pyramidal cells (i.e., a few mm² of cortex, see Ref. 331) activated simultaneously can produce electrical and magnetic activity deriving from PSPs and measurable on the surface of the scalp. Scalp electromagnetic activity is therefore mainly a consequence of synaptic discharge and global cellular polarization in collinear neurons. The recording of electrical activity using electrolyte gel and highly conductive electrodes is known as EEG, while MEG is the recording of correlative magnetic field variations using very sensitive sensors. Here, neural mechanisms underlying surface effects are better known than in the case of tomographic techniques. However, the precise localization of electrical and magnetic sources is complex. On the one hand, predicting surface pattern from the location, orientation, and intensity of brain sources (forward modeling) implies a comprehensive approach to the propagation of electromagnetic flow throughout the brain and the head tissues. On the other hand, modeling brain sources on the basis of surface recordings may be misleading because this backward modeling problem (the inverse problem) has an infinite number of solutions, especially if multiple and deep sources are likely to be involved. With that said, source analysis has recently benefited from the integration of whole brain structural anatomy provided by high-resolution three-dimensional MRI.
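The asymmetry between forward and backward modeling can be made concrete with a toy linear model in which each sensor records a weighted sum of source intensities. The gain matrix and source configurations below are invented for illustration and have no anatomical meaning; the point is only that two different source patterns can yield the same surface pattern.

```python
def forward(lead_field, sources):
    """Forward model: each sensor reading is a weighted sum of source intensities."""
    return [sum(gain * s for gain, s in zip(row, sources)) for row in lead_field]


# Hypothetical gains: 2 sensors (rows) by 3 sources (columns).
LEAD_FIELD = [
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
]

config_a = [1.0, 1.0, 0.0]  # two superficial sources active
config_b = [0.0, 0.0, 1.0]  # a single third source active

scalp_a = forward(LEAD_FIELD, config_a)
scalp_b = forward(LEAD_FIELD, config_b)
print(scalp_a, scalp_b)  # [1.0, 1.0] [1.0, 1.0]
```

Because there are more unknown sources than sensors, the surface data alone cannot distinguish the two configurations; extra constraints, such as anatomical information from structural MRI, are needed to choose among them.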
Event-related potentials (ERPs) and evoked magnetic fields (EMFs) are derived from EEG and MEG, respectively. ERPs and EMFs are based on the averaging of a large number of recordings time-locked to the occurrence of a stimulus to compensate for their low signal-to-noise ratio. Averaging over a large number of trials progressively cancels spurious brain electrical or magnetic activity that does not relate to the cognitive task performed by the participant. Conversely, electromagnetic activity relating directly to the processing of the event (stimulus) and subsequent cognitive operations is enhanced by averaging and emerges in the form of a series of positive and negative deflections (see Ref. 342; for a review on ERP components elicited by language processing, see Ref. 202). This dominant approach overlooks the value of studying changes in the spectral power of electromagnetic signals (quantitative EEG) and synchronization phenomena across different recording sites (coherence analysis; see, for instance, Refs. 332, 370, 382). Such new approaches could be applied to the investigation of language processing, however (86, 195).
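The effect of time-locked averaging can be illustrated with simulated trials. The sketch below assumes an invented waveform with one negative and one positive deflection, buried in independent Gaussian noise on every trial; none of the amplitudes or latencies correspond to a real ERP component.

```python
import math
import random


def erp_template(n_samples):
    """Hypothetical evoked waveform: one negative, then one positive deflection."""
    return [
        -1.0 * math.exp(-((i - 40) / 8) ** 2) + 1.5 * math.exp(-((i - 90) / 15) ** 2)
        for i in range(n_samples)
    ]


def average_trials(trials):
    """Point-by-point average of time-locked trials."""
    n = len(trials)
    return [sum(column) / n for column in zip(*trials)]


def rms_error(a, b):
    """Root-mean-square difference between two waveforms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


rng = random.Random(1)
template = erp_template(150)
NOISE_SD = 5.0  # background activity much larger than the evoked signal (assumed)


def simulate(n_trials):
    """Each trial is the same evoked waveform plus independent noise."""
    return [[s + rng.gauss(0, NOISE_SD) for s in template] for _ in range(n_trials)]


err_10 = rms_error(average_trials(simulate(10)), template)
err_1000 = rms_error(average_trials(simulate(1000)), template)
print(f"residual noise after 10 trials: {err_10:.2f}, after 1000 trials: {err_1000:.2f}")
```

With independent noise, the residual shrinks roughly with the square root of the number of trials, so a hundredfold increase in trials reduces it by about a factor of ten.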
4. New imaging techniques
Several new neuroimaging techniques, such as optical or near-infrared cortical imaging (383, 411), spectroscopic magnetic resonance imaging (e.g., Ref. 365), diffusion tensor imaging (e.g., Ref. 85), or physiological techniques that are not brain centered (e.g., pupillometry, cardiovascular measures, and electrodermal activity; for a review, see Ref. 200), are likely to complement current approaches to exploring the physiological substrates of cognitive processes and the timing of their involvement.
Computerized image processing software implementing elaborate mathematical unfolding procedures, used in conjunction with high-resolution MRI techniques, will provide significant improvements in our understanding of brain functions. For instance, cortical unfolding routines have proven especially useful in the exploration of the visual system (88, 197). Systematic statistical approaches to the morphometry of cortical regions may reveal significant differences in terms of brain functions in individual subjects (e.g., Ref. 223). With the use of diffusion tensor imaging, it is now possible to track the three-dimensional geometry of white matter bundles and fascicles (85), allowing the modeling of neural networks in vivo. For instance, imaging diffusion and perfusion a few hours after ischemic stroke can reveal functionally impaired regions that are not at the core of the lesion. This advance will lead to a better understanding of the brain-symptom relationships in the acute phase of aphasia (160). Very brief and localized magnetic pulses produced by transcranial magnetic stimulation (TMS) can be used to transiently stimulate neural populations and induce either facilitation or inhibition of cognitive operations (277). This technique can demonstrate the intervention of particular cortical areas in a precise time window, from the onset of the stimulus in a confrontation naming task for instance, and can be used in combination with functional imaging (116). In the future, TMS should help to address complex language issues, such as the role of the right hemisphere in functional compensation of aphasia.
B. Tracking Brain Activations
1. The activation paradigm
The first way to conceptualize the cognitive structure of "activation" experiments was an additive model, akin to the "pure insertion" hypothesis (128). In this model, an active condition involves several cognitive components that are thought to be independent from one another (e.g., input, intermediate, and output components). The "deletion" of one component in a second task is hypothesized to leave the other cognitive components unaltered. Consequently, the contrast (logically called "subtraction") between the patterns of activity observed in the first condition and those obtained in the second condition is supposed to reveal the neural correlates of the removed component. The "additive-subtractive" model is usually used as the first step in data analysis because it has proven capable of leading to straightforward inferences. This is the basis of "hierarchical" designs that involve several tasks of increasing complexity. As cognitive components are progressively added to the higher order tasks, activation maps are thought to reflect the "add-up" effect in neurofunctional terms. Therefore, the additive-subtractive model appears to be the transposition of the transparency hypothesis formulated by Caramazza (65) to functional neuroimaging. It thus allows neuroimaging results to be compared with clinical anatomical findings. However, this model has important limitations, as stressed by Friston et al. (128). Its hierarchical structure implies that all components of lower order tasks are entirely embedded in any higher order tasks. In the case of complex functions such as language, a given task does not require a simple series of successive and independent processing stages that can be added or subtracted at will. Rather, such tasks involve different cognitive processes implemented in a parallel and interdependent fashion and need to be approached accordingly.
Depending on the experiment, the manipulation of two or more experimental factors may be such that their combination would induce changes in neural activity that are not the simple, straightforward addition of activations elicited by each of them, but are, for instance, greater than this sum. Signal changes induced by activation tasks are thus frequently nonlinear and make it necessary to consider factorial interactions.
2. Statistical approaches to structure-function relationships
A) FACTORIAL DESIGN. Following this line of research, the Friston and Frackowiak group in London has set up a general method and dedicated software, Statistical Parametric Mapping (SPM, www.fil.ucl.ac.uk), in which a voxel-by-voxel analysis is performed to test experiment-induced signal changes according to the general linear model. This group has emphasized the value of designing experiments in which cognitive components can be used as orthogonal factors so that their potential interdependency can be directly investigated.
The statistical analysis is based on a factorial design in which each cognitive component corresponds to a main effect, and the interaction between these factors can be formally tested. Other independent variables, such as subject-specific characteristics (e.g., handedness, gender), can also be involved in such analyses (for an example, see Fig. 2).
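A two-by-two factorial design makes the difference between main effects and an interaction explicit. The sketch below is a hypothetical worked example on the mean signal of a single voxel: the factor labels and the signal values are invented, and the contrasts shown are the textbook ones rather than the output of SPM.

```python
def factorial_effects(cells):
    """Main effects and interaction for a 2x2 design.

    cells[(a, b)] is the mean signal with factor A at level a and factor B
    at level b, where 0 means the component is absent and 1 means present.
    """
    main_a = (cells[1, 0] + cells[1, 1] - cells[0, 0] - cells[0, 1]) / 2
    main_b = (cells[0, 1] + cells[1, 1] - cells[0, 0] - cells[1, 0]) / 2
    # Interaction: does the effect of A change when B is also present?
    interaction = (cells[1, 1] - cells[0, 1]) - (cells[1, 0] - cells[0, 0])
    return main_a, main_b, interaction


# Invented mean signal values (arbitrary units) for one voxel.
cells = {
    (0, 0): 100.0,  # neither component engaged
    (1, 0): 101.0,  # component A only
    (0, 1): 102.0,  # component B only
    (1, 1): 105.0,  # both components engaged
}

main_a, main_b, interaction = factorial_effects(cells)
print(main_a, main_b, interaction)  # 2.0 3.0 2.0
```

Under pure additivity the cell with both components would read 103.0 and the interaction would be zero; here the combined condition exceeds the sum of the separate effects by 2.0, which is the kind of nonlinearity that a simple subtraction cannot reveal.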
B) COGNITIVE CONJUNCTIONS. An interesting alternative to the additive-subtractive model is the conjunction approach. Instead of contrasting different conditions to relate a specific brain region to a specific cognitive process, one can investigate which parts of the brain are systematically active in various tasks sharing a defined cognitive component (304). This method is useful to demonstrate that different tasks may require common neural substrates in spite of their particularities. Combined with random effect analysis across groups of subjects, it is particularly efficient for identifying neurofunctional crossroads, i.e., cortical regions in which functionally distinct neural networks overlap with each other and show repeated activations although the cognitive tasks used are very different. Price and Friston (304) used this approach to characterize a common network involved in four different visual naming tasks, each of them associated with a specific reference task controlling for the effects of basic perceptual processes.
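One simple way to look for regions that are active in every task is to keep only the voxels whose statistic exceeds a threshold in each task-versus-reference contrast. The sketch below uses this minimum-statistic rule as a simplified stand-in; it is not necessarily the procedure used by Price and Friston (304), and the statistic values and the threshold are invented.

```python
def conjunction(maps, threshold):
    """Indices of voxels whose statistic exceeds the threshold in every map."""
    n_voxels = len(maps[0])
    return [v for v in range(n_voxels) if min(m[v] for m in maps) > threshold]


# Invented statistics for 5 voxels in 4 task-minus-reference contrasts.
task_maps = [
    [4.1, 0.3, 3.6, 5.0, 1.2],
    [3.8, 3.9, 0.5, 4.4, 0.9],
    [4.5, 0.2, 3.7, 3.9, 4.8],
    [3.5, 1.1, 0.4, 4.2, 0.7],
]

common = conjunction(task_maps, threshold=3.1)
print(common)  # [0, 3]
```

Voxels 1, 2, and 4 are strongly active in some contrasts but not in all of them, so only voxels 0 and 3 qualify as part of the shared network in this toy example.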
C) PARAMETRIC APPROACH, CORRELATIONAL APPROACH, AND STRUCTURAL EQUATION MODELING. Activation phenomena appear richer than what was originally observed in the framework of the additive model. Friston et al. (125) showed that principal component analysis may reveal a lot about the functional systems involved in cognitive experiments by unraveling functional connectivity in the brain (for an example of this method in a reading task, see Fig. 3). Without an a priori hypothesis about the differences to be expected from a particular contrast, activated networks may be related to the influence of continuous variables such as time and/or behavioral indices (e.g., Ref. 142). In the language domain, systematic research carried out by Price et al. (307, 311, 312) has focused on the influence of lower order stimulus-dependent factors (see sect. IIC1) and has shown that nonlinear correlations between such variables and changes in rCBF might be even more significant than linear ones.
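The idea that principal component analysis groups regions sharing a common time course can be sketched on simulated data. The example below is a toy illustration, not the method of Ref. 125: three simulated regions follow a shared oscillation, a fourth contains only noise, and the first component is extracted by power iteration on the covariance matrix. All signals and parameters are hypothetical.

```python
import math
import random


def covariance_matrix(series):
    """Sample covariance between every pair of regional time series."""
    n = len(series[0])
    means = [sum(s) / n for s in series]
    centered = [[x - m for x in s] for s, m in zip(series, means)]
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]


def first_component(cov, n_iter=200):
    """Leading eigenvector of a covariance matrix, found by power iteration."""
    v = [1.0] * len(cov)
    for _ in range(n_iter):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


rng = random.Random(7)
n_points = 200
# Shared time course driving three regions (period of 20 samples, assumed).
shared = [math.sin(2 * math.pi * t / 20) for t in range(n_points)]
regions = [[s + rng.gauss(0, 0.3) for s in shared] for _ in range(3)]
regions.append([rng.gauss(0, 0.3) for _ in range(n_points)])  # unrelated region

loadings = first_component(covariance_matrix(regions))
print([round(x, 2) for x in loadings])
```

The three coupled regions receive large loadings of the same sign on the first component, whereas the unrelated region loads close to zero, which is how such an analysis points to a functionally connected set of areas.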
3. Convergence of neuroimaging and electrophysiology
As mentioned above, the temporal resolution of electrophysiology (EEG and MEG) is compatible with the speed of cognitive processes, while its spatial information remains poor. These properties are in striking opposition to those of tomography (PET and fMRI). Used independently, each type of technique requires highly specific activation paradigms focusing either on the temporal or the anatomical dimension of the signal, respectively. This specialization has rendered electrophysiological and tomographic data impossible to compare for at least 20 years. Nevertheless, different authors (9, 96, 374, 393) have reported language experiments performed with PET or fMRI on the one hand and ERPs on the other in an attempt to provide complementary results (see, for instance, Refs. 396 and 392). Given the evident complexity of correlating PET and ERPs, for instance, it is necessary to consider a single paradigm that is compatible with both techniques. One possible procedure is to identify a set of activations via tomographic methods and then explore the relative intensity of corresponding brain generators (see Fig. 4) using source analysis software such as BESA (353) or Curry (Neuroscan). This procedure offers a way to bypass the inverse problem raised by brain source analysis (cf. sect. IIA3). The functional significance of localized activations can then be revisited in terms of the kinetics of activation in neural assemblies. The opposite procedure can also be proposed: first elaborate a tomography-compatible paradigm eliciting a well-defined evoked component such as the P300 or the N400, and then search for its generator with tomographic techniques (e.g., Ref. 367). A promising procedure is the replication of ERP or MEG experiments using single-trial fMRI (96, 395). However, such protocols cannot just be adapted from one technique to the other but must be specifically developed (see sect. IIC3).
C. Outstanding Questions
1. Influence of subject-dependent and stimulus-dependent parameters
Whatever the procedure for functional neuroimaging analysis, subject-specific parameters and task-generic parameters of language experiments have a profound, albeit frequently disregarded, influence on the results. As the effects of these parameters were not anticipated in the pioneering experiments, they were not studied in a systematic way. In fact, their impact was established progressively, sometimes as a by-product of studies designed for other purposes. The most relevant of these variables are listed in Table 2.
For example, the earliest studies of brain activation using nontomographic isotopic blood flow measurements incidentally pointed out the overwhelming influence of motivation (higher activation and network modulation being observed in subjects showing high motivation; e.g., Ref. 204) and emotional state (a reduction of anxiety along a time series of brain recordings being associated with a global decrease in activation). The effects of these variables have been specifically investigated in recent and sophisticated studies (e.g., Refs. 172, 182).
Although the vast majority of functional neuroimaging studies have been conducted in young, well-educated subjects with a predominance of males (usually undergraduate or graduate students recruited in universities or laboratories), subject-dependent parameters such as gender, age, handedness, or literacy have been found to influence activation patterns drastically. Male subjects, for instance, were originally thought to display a stronger left-greater-than-right asymmetry for language (314, 360), although these results failed to be confirmed by further studies conducted in larger groups (129, 376, 410).
If one excludes the case of infant development (see sect. IVA), the influence of aging on language processing has been largely overlooked so far. In a study of visual recognition, however, Madden et al. (220) described an increase of activity in the anterior part of the ventral visual system in healthy elderly subjects compared with younger controls. One cannot estimate the impact of this parameter on already published data.
Handedness is frequently viewed as a major factor influencing hemispheric dominance for language. Notwithstanding differences between right- and left-handers in terms of structural anatomy (e.g., Ref. 406), systematic neuroimaging studies including large subject samples have recently demonstrated that left-handers tend to present a left hemispheric dominance for language. Although the right hemisphere is less involved than the left in left-handers, the functional asymmetry is less marked than in right-handers. Activation has only rarely been observed in the right hemisphere in isolation (315, 376, 405).
Other studies have stressed the importance of more general, though less obvious, sources of signal modulation. A series of experiments performed by Price and co-workers (306, 312) focused on the influence of low-order stimulus-dependent factors, such as exposure duration and rate of stimulation, together with absence/presence of an overt utterance while reading. These authors demonstrated massive and unexpected effects in a variety of areas including "key" regions such as the left occipital temporal cortex and premotor areas. The response function relative to these stimulus-related parameters varied dramatically even between two neighboring areas, e.g., fusiform versus lingual gyrus.
While keeping these low-order parameters constant, other general factors might also bias brain mapping of language functions. For instance, subjects' familiarity with the task may dramatically alter the pattern of activation, as demonstrated by Raichle et al. (320). These authors compared activations measured during a verb generation task with activations recorded in the same participants doing the same task after extensive training with the specific word list used. Much of the activation observed at the naive stage in the left inferior frontal areas disappeared at the trained stage and was seen again, though to a lesser extent, when subjects were presented with another word list.
An important source of modulation of activation patterns is the degree of task "difficulty" that can be manipulated via several experimental features, such as perceptual ambiguity between targets and distracters (e.g., phoneme targets among phonetically similar versus dissimilar distracters, Ref. 106) or the number of candidates among which an item has to be chosen in a word generation task (401).
2. New methodological challenges
A) TIME COURSE OF EVOKED HEMODYNAMIC RESPONSES. The issue of hemodynamic response variability between brain regions and between individuals has been extensively addressed in the last decade (2, 10, 53, 55, 110, 184, 206, 241, 352, 391). In two studies of evoked hemodynamic responses (EHRs) recorded during language tasks, Thierry and co-workers (391, 395) found a sequence of hemodynamic peak latencies that was compatible with physiological expectations (e.g., primary auditory cortex early, superior temporal regions intermediary, inferior prefrontal regions late).
However, it must be kept in mind that the hemodynamic response has proven too variable across regions in terms of timing, amplitude, and shape to enable direct comparison between different parts of the brain (10, 53, 55, 206, 352). Such regional differences might be due to variable influences of microscopic and macroscopic blood flow (76), to differential vascular sampling, or to real differences of neural activity (55, 241, 352). Although the hemodynamic response of one region can be shifted in time by several seconds across subjects (55, 184, 241), its grand-average latency and amplitude have proven reproducible for groups of subjects as small as n = 6. In other words, the central tendency of the EHR can be reproduced in different groups of subjects and, a fortiori, in the same group of subjects across experimental blocks, with a precision of tenths of seconds (55).
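The idea of a grand-average latency can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the response shape (`toy_response`), the sampling grid, and the six per-subject peak times are hypothetical and are not taken from the cited studies. The sketch only shows how a time to peak is read from individual time courses and from their average across subjects.

```python
import math

def toy_response(t, peak_time):
    """Hypothetical single-peaked response that reaches 1.0 at `peak_time` (seconds)."""
    if t <= 0:
        return 0.0
    x = t / peak_time
    return (x ** 4) * math.exp(4 * (1 - x))

def time_to_peak(times, signal):
    """Return the sample time at which the signal is maximal."""
    i = max(range(len(signal)), key=signal.__getitem__)
    return times[i]

# Sampling grid: 0 to 20 s in steps of 0.5 s (assumed, for illustration only).
times = [0.5 * k for k in range(41)]

# Hypothetical per-subject peak latencies for one region, n = 6.
subject_peaks = [4.0, 5.0, 5.5, 6.0, 6.5, 8.0]
courses = [[toy_response(t, p) for t in times] for p in subject_peaks]

# Grand average: mean across subjects at each time point.
grand_average = [sum(vals) / len(vals) for vals in zip(*courses)]

individual = [time_to_peak(times, c) for c in courses]
grand_peak = time_to_peak(times, grand_average)

print("individual times to peak (s):", individual)
print("grand-average time to peak (s):", grand_peak)
```

In this toy setting the individual latencies spread over several seconds, while the grand average yields a single latency lying within that range. Whether such a value is reproducible across groups is an empirical question that the sketch does not address.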
More recently, several authors have proposed a temporal analysis of averaged fMRI signals in cognitive tasks, called time-resolved fMRI (for a review, see Ref. 122). Formisano and Goebel (122) have proposed that the main processing stages of cognitive operations can be temporally differentiated using time-resolved fMRI, although fast neural exchanges between two interconnected regions are unlikely to be distinguished.
Thierry et al. (396, see Fig. 5) found that EHR peak latencies were significantly delayed by one experimental factor (maintenance of information in verbal working memory) in one region of the brain (inferior prefrontal cortex) while they remained identical in another region (superior temporal regions). According to Miezin et al. (241), the hemodynamic response in a given region is nearly identical from one data set to another (time to peak correlation r² = 0.95 across sets) so that the significant difference found by Thierry et al. (396) for the inferior prefrontal cortex can only relate to the difference introduced by condition or task variations and not to spurious hemodynamic effects. If cognitive tasks can significantly influence the time course of hemodynamic response, event-related fMRI provides a unique opportunity to merge spatial and temporal information in a single approach.
B) FUSION OF NEUROIMAGING AND ELECTROPHYSIOLOGY. Simultaneous recording of both BOLD and electrophysiological signals has already been reported in investigations of memory (343) and vision experiments (39, 377). Complex EEG artifacts generated by this procedure and relating to pulsatile blood flow seem to vary from one participant to another but can be corrected accordingly (pulse artifacts; see Ref. 5). Conversely, the magnetic susceptibility of EEG electrodes and electrolyte can induce fMRI image distortion (38). Nevertheless, these technical difficulties can be overcome, and studies reporting simultaneous EEG and fMRI recording are already being published (226).
The time course of EHRs cannot be directly compared with that of ERPs, and the procedure for statistical analysis is very different. The question is to what extent EHRs and ERPs can provide convergent sources of information about the same cognitive process. Essentially, fMRI provides anatomical differences between conditions while EEG contributes temporal windows of differences between these same conditions, but a direct correspondence between a region of the brain and a moment of involvement can be established only in the case of highly focal brain activations (e.g., activation of primary sensory regions or motor cortex). As soon as one considers distributed cognitive networks, such as those involved in language processing, the number of combinations [region of interest (ROI), equivalent brain generator, real time latency of activity] becomes overwhelming. No statistical method is available to date for such a four-dimensional mapping of brain activation. Therefore, the methodology has to be built around what each technique can offer. In other words, experiments using simultaneous fMRI and EEG recording need to rely on specific spatial and temporal hypotheses that can be tested independently by the two techniques. Thus the real advantage of using the two simultaneously is the guarantee that cognitive processes underlying the anatomical results are identical to those eliciting electrophysiological effects.
3. Limitations of neuroimaging
We have chosen to address the issue of constraints imposed by the physics of the scanner and the biophysics of brain metabolism in section IIC3A, and we address questions relating to experimental parameters such as stimuli, tasks, timing, and statistical analysis of the data in section IIC3B.
A) HARDWARE CONSTRAINTS. A fundamental drawback of current functional brain mapping methods is that they do not reflect neural metabolism per se but only indirect, vascular phenomena. Hence, it is not possible to distinguish between excitatory and inhibitory neural processes as they are both thought to induce energy consumption resulting from local synaptic activity, and to thus correspond to an increase in vascular signal. The intensity of signal changes might be much less for inhibitory synaptic populations, however, and such populations are thought to be less widely distributed in the cortex (413). New fMRI approaches using water diffusion tensor imaging (e.g., Refs. 85, 205), direct imaging of neural firing (37), or spectroscopic imaging (365) may provide fruitful alternatives to traditional BOLD monitoring as they permit more direct exploration of neural tissue metabolism. In addition, diffusion tensor imaging allows the tracing of neural pathways. Abnormalities in fiber bundles connecting cortical areas involved in language have been correlated with language impairments, despite the fact that the cortical areas themselves are spared (187).
Recent studies gathering data from C13 magnetic resonance spectroscopy, high-field fMRI and extracellular recordings in anesthetized rats explored the relationships between the BOLD signal, oxygen consumption rate, and cellular firing during a sustained stimulation of the somatosensory cortex (169, 372). Under such experimental conditions, a coupling between oxygen consumption and firing rate was found in a cortical layer mainly reflecting the activity of glutamatergic neurons. Smith et al. (372) stressed that the amount of energy consumed at "baseline" (via oxidative glycolysis) is massive compared with minute stimulation-induced changes. They also reported stimulation-induced decreases of electrophysiological signal, recorded in 10% of electrodes.
Nevertheless, deactivation has been observed in various PET studies (e.g., Ref. 364) and fMRI studies (26, for a review see Ref. 147). The fact that similar deactivations can be seen using both techniques substantiates the fact that BOLD and PET signals are linked with rCBF variables. If rCBF variations reflect synaptic activity, inhibitory groups of neurons, like excitatory ones, should participate in brain "activation," although to a lesser extent (413). Thus deactivation loci should correspond to regions that are inhibited (rather than inhibiting).
Deactivations probably involve more than one physiological mechanism and include both local and large-scale effects (364). The latter probably have a major impact in language studies. When subjects are at rest, i.e., not focusing on any particular cognitive process, attentional resources are widely distributed over cortical territories. However, when subjects engage in higher cognitive operations such as language processing, attentional resources are reallocated as a result of mutual competition between different processing pathways or subnetworks. The logical consequence of this reallocation is an increase in activity in the operative network and a decrease in activity in irrelevant functional systems (see Ref. 152 for discussion of cross-modality suppression effects). Such competitive mechanisms have been proposed to account for disorders of attention caused by thalamic lesions (417) and are congruent with neuropsychological models of attention (203). From a physiological point of view, Gusnard and Raichle (147) have suggested that deactivated areas, i.e., the posterior cingulate cortex, the posterior temporal/parietal cortex, and the medial frontal cortex, are involved in a "default" mode of brain functional status linked to nonspecific conscious experience. These authors proposed that the energy metabolism in these areas, especially the posterior cingulate cortex, is characterized by a constant and tight coupling between oxygen and glucose consumption, while phasic activation in other territories results in transient anaerobic episodes characterized by a very limited increase in oxygen consumption concurrent with a large increase in glucose consumption and local blood flow. Recent findings indicating a positive correlation between oxygen consumption and neural firing rate under stimulation in anesthetized rats (169) do not seem to support this hypothesis, but this apparent discrepancy might relate to methodological differences between experiments. 
Further basic physiological experimentation is needed to clarify this issue.
Because of its peculiar and reproducible anatomical distribution, deactivation was proposed to reflect implicit verbal elaboration in subjects supposed to "keep at rest" (26). Covert inner speech phenomena could account for apparent deactivation in some areas of the temporal parietal association cortex when rest is compared with cognitive tasks that are known to recruit other areas. Even if the neural basis for deactivation has not yet been clearly elucidated, the contribution of deactivation to the understanding of the physiology of language should be considered since 1) increase in activity in some regions of the brain during highly demanding, explicit language operations may be mirrored by dimmed metabolic signals in other regions, and 2) implicit, covert language processing may alter cross-condition comparisons by inducing an apparent decrease in activation in association cortices.
Aside from such conceptualization related to high-order phenomena, it remains the case, as pointed out earlier, that an immense gap exists between measurements of signal changes in language-specific large-scale networks and the recording of activities at the neuron level. The ultimate technical goal is to describe fundamental neuronal mechanisms that generate signal changes recorded by neuroimaging methods such as fMRI. Simultaneous recordings of intracortical neural activity and BOLD signals in monkeys' striate cortex during visual stimulation suggest that the BOLD effect mainly reflects dendritic input rather than spiking output (217). This study also substantiates decreases in BOLD signal in the periphery of the activated cortical area as an index of corticocortical inhibitory mechanisms.
Even though simultaneous intra/extracortical recordings cannot be obtained under standard conditions when normal participants perform language tasks, they provide insight into the principles of "local" cortical physiology. The open challenge for the neuroimaging of cognitive functions such as language is to build up a general model integrating elementary and local cortical physiology into the global dynamics of large-scale neural networks.
B) METHODOLOGICAL CHOICES. Methodological choices directly determine how hardware constraints can be tackled to answer essential questions. For instance, is it possible to determine the invariant parameters of brain activation from one individual to another? Is it possible to characterize the determinants of interindividual variability in brain activation and describe their influence? These questions not only affect the theoretical significance of empirical studies, but are also critical to clinical applications, since only the most reproducible results can be reliably used in presurgical mapping of language in the brain.
From a physiological standpoint, constructing experiments and interpreting the results requires that one takes fundamental physiological facts into account. 1) Neural substrates implementing cognitive functions are distributed over the entire encephalon, with functional crossroads or "nodes" being closely interlaced in some "bottle-neck" regions (240), such as the left STG, the angular gyrus or the basal temporal language area (cf. infra). 2) Significant changes in patterns of activity are minute in the energy range (10⁻² of the measured signal or less) compared with the baseline level of activity measurable in the brain. 3) These changes occur over several tens of milliseconds or more, i.e., in a time scale that clearly exceeds the time range of neural events. 4) Subject-specific variables and various experimental parameters (see sect. IIC1) can influence the functional state of neural networks more than cognitive tasks, making the signal-to-noise ratio insufficient vis-à-vis interindividual variance. In other words, despite the rush to collect neuroimaging data depicting the neural basis of human cognition, it is now obvious that one cannot freely manipulate all the variables that may alter patterns of activity in the brain. Activation experiments can only tackle transient and minor alterations of complex patterns under the influence of carefully selected experimental stimuli, instructions, and, most importantly, in the framework of clearly established hypotheses deriving from robust cognitive models.
In fact, neuroimaging language (or any cognitive process) is equivalent to dealing with more or less fuzzy pictures or echoes from a complex and moving landscape (a problem very similar to that of mapping subtle ocean streams from a satellite). Over and above the subtlety of language-related brain signals, a critical issue is signal variability from one subject to another, relating to background noise and region location within the three-dimensional structure of the cerebral cortex and subcortical nuclei. In the past two decades, neuroimaging studies of language have largely overlooked subject variability and have based their conclusions on averaged data obtained in small groups of subjects (typically <20). In spite of such limitations, this approach has proven empirically valid (see sect. III). However, having reached the point of validation, this new domain of physiology now faces the challenge of specifying the many sources of signal variability, or at least, of defining the conditions under which such variations can be optimally reduced by making pertinent methodological decisions.
Such issues have been recently addressed by several authors, especially by the group of the Wellcome Department of Imaging Neuroscience in London. Friston et al. (127) proposed a "random-effect" approach to statistical comparison between groups of subjects. Earlier neuroimaging studies typically used "fixed-effect" analyses in which the results of planned contrasts between conditions apply exclusively to the studied subject sample. In such an approach, significant effects can be induced by signal changes in one or two subjects, possibly yielding spurious results at the group level. The more recent random-effect method implemented in SPM uses estimation of between-subject rather than within-subject variance, and degrees of freedom relate to the number of subjects rather than to the number of scans. Consequently, group analyses are performed using contrast images involving one image volume per subject per contrast, and they allow generalization of the findings to the whole population.
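The contrast between the two approaches can be made concrete with a small Python sketch. It is a simplified illustration under stated assumptions, not the SPM implementation: the function names (`fixed_effects_t`, `random_effects_t`) and the data are hypothetical, and each "scan" is reduced to a single contrast value (condition minus baseline). The fixed-effect statistic uses pooled within-subject variance with degrees of freedom tied to the number of scans, whereas the random-effect statistic uses one summary value per subject with degrees of freedom tied to the number of subjects.

```python
import math
from statistics import mean, stdev

def fixed_effects_t(scans_by_subject):
    """t statistic based on pooled within-subject variance (df follows scan count)."""
    all_scans = [x for scans in scans_by_subject for x in scans]
    n_scans = len(all_scans)
    n_subjects = len(scans_by_subject)
    # Sum of squared deviations of each scan from its own subject's mean.
    within_ss = sum(
        (x - mean(scans)) ** 2 for scans in scans_by_subject for x in scans
    )
    df = n_scans - n_subjects
    standard_error = math.sqrt(within_ss / df / n_scans)
    return mean(all_scans) / standard_error, df

def random_effects_t(scans_by_subject):
    """One-sample t test on one summary value per subject (df follows subject count)."""
    summaries = [mean(scans) for scans in scans_by_subject]
    n_subjects = len(summaries)
    standard_error = stdev(summaries) / math.sqrt(n_subjects)
    return mean(summaries) / standard_error, n_subjects - 1

# Hypothetical data: 6 subjects x 8 scans; only the first subject shows an effect.
data = [[5.0, 5.2, 4.8, 5.1, 4.9, 5.0, 5.3, 4.7]] + [
    [0.1, -0.1, 0.0, 0.2, -0.2, 0.1, -0.1, 0.0] for _ in range(5)
]

t_fixed, df_fixed = fixed_effects_t(data)
t_random, df_random = random_effects_t(data)
print(f"fixed effect:  t = {t_fixed:.2f}, df = {df_fixed}")
print(f"random effect: t = {t_random:.2f}, df = {df_random}")
```

With these made-up numbers a single responsive subject drives a very large fixed-effect statistic, while the random-effect statistic stays small because the between-subject variance is large. This is the situation described above in which a group-level effect is induced by one or two subjects.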
Another recent advance in this field is the neuroimaging of single subjects (e.g., Ref. 308; see sect. IVB). Although crucial to the renewal of single case studies in neuropsychology, especially for assessing the biological impact of therapeutic interventions (209), this analysis of brain activity changes in one patient requires further investigation to define the optimal statistical analysis (conjunction and contrast statistics).
With respect to the distinction between block-design and event-related design described in section IIA2, it is worth mentioning that methodological problems inherent in these two experimental modes remain unsolved. When engaged in a block-design experiment, participants are likely to show strong habituation effects; they may become tired or even drowsy (especially in the confined environment of the fMRI scanner), and, because they are exposed to the same experimental condition for a substantial period of time (in the range of 30 s), they may develop task-specific strategies (see, for instance, Ref. 294). Although this issue can be partly addressed by the counterbalancing of experimental blocks and by independently manipulating task difficulty (e.g., Ref. 394), it must be kept in mind that the spontaneous nature of brain processing is less likely to be observed in a series of very similar trials than in randomized series of trials. The use of an event-related design solves the problem of habituation and is meant to reduce strategic effects; however, it brings with it other methodological pitfalls. When a participant is exposed alternately to different experimental conditions in each trial, task switching and attentional mechanisms are likely to contribute substantially to the pattern of activity found. To overcome this, some authors have resorted to running their experiment using both block-design and event-related design (e.g., Ref. 82). Although rather laborious to implement, this approach makes it possible to check for the consistency of results in different cognitive contexts (75).
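The difference between the two trial orderings can be sketched in Python. This is an illustrative sketch only: the function names (`block_design`, `event_related_design`, `count_switches`), the condition labels, and the trial counts are hypothetical. The rotation of block order across cycles is one simple form of counterbalancing chosen here as an assumption, not a procedure taken from the cited studies.

```python
import random

def block_design(conditions, trials_per_block, n_cycles):
    """Blocked order; the starting condition rotates from one cycle to the next."""
    order = []
    for cycle in range(n_cycles):
        shift = cycle % len(conditions)
        rotated = conditions[shift:] + conditions[:shift]
        for condition in rotated:
            order.extend([condition] * trials_per_block)
    return order

def event_related_design(conditions, trials_per_condition, seed=0):
    """Randomized order with the same number of trials per condition."""
    order = [c for c in conditions for _ in range(trials_per_condition)]
    random.Random(seed).shuffle(order)  # fixed seed so the order is repeatable
    return order

def count_switches(order):
    """Number of times the condition changes between consecutive trials."""
    return sum(1 for a, b in zip(order, order[1:]) if a != b)

blocks = block_design(["A", "B"], trials_per_block=4, n_cycles=2)
events = event_related_design(["A", "B"], trials_per_condition=8)

print("block order:        ", "".join(blocks), "switches:", count_switches(blocks))
print("event-related order:", "".join(events), "switches:", count_switches(events))
```

Both orders contain the same trials. The count of condition switches gives a rough handle on how much task switching a participant faces in each design, which is the trade-off discussed above.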
III. LANGUAGE IN THE "HEALTHY" ADULT BRAIN
Most studies have addressed language physiology on the basis of its two main input routes: audition (spoken words and environmental sounds) and vision (written code, sign language, scene and picture viewing). However, some exceptions can be noted. For instance, the functional mapping of brain regions related to olfactory input and their links to verbal representations (318) have shown that familiar odors associated with verbal labels yield specific activation in the left cuneus, suggesting a particular involvement of mental imagery.
A classical issue concerning speech perception is the dominance of the left temporal cortex. Several neuroimaging studies (e.g., Refs. 24, 103, 105, 106, 259, 279) initially located the structures involved in the processing of language-specific sounds in the left superior temporal association cortex surrounding the primary auditory cortex (i.e., the medial part of Heschl's gyrus, Ref. 421). More recently, numerous publications including meta-analyses (27, 30, 158) have highlighted the involvement of the anterior part of the superior temporal gyrus and the superior temporal sulcus in both hemispheres as the main neural substrates involved in the auditory representation of speech components, including those specific to the human voice (21). Attempts to localize neural responses that are specific to the human voice or to speech components do not seem to point to a single, homogeneous, and clearly left-sided area, although a slight left-greater-than-right asymmetry in the temporal structures was noted by Binder et al. (27).
Several factors might influence the functional dominance of the left superior temporal gyrus in language processing.
1) The rate of change over time in speech signals. Fast changing temporal cues seem to elicit preponderant activities in the left auditory system. Using a correlational approach, Belin et al. (22) showed that the right superior temporal cortex responded less efficiently to quick variations in sound spectral structure, a major feature of speech sounds (see sect. IIID1), than its left homolog. Zatorre and Belin and co-workers (431, 432) have proposed a low-level perceptual dissociation between the left and right superior temporal cortices for processing rapid temporal transitions versus spectral variations, respectively (Fig. 7).