Language processing is a trait of the human species. Knowledge about its neurobiological basis has increased considerably over the past decades. Different brain regions in the left and right hemispheres have been identified to support particular language functions. Networks involving the temporal cortex and the inferior frontal cortex with a clear left lateralization were shown to support syntactic processes, whereas less lateralized temporo-frontal networks subserve semantic processes. These networks have been substantiated by both functional and structural connectivity data. Electrophysiological measures indicate that within these networks syntactic processes of local structure building precede the assignment of grammatical and semantic relations in a sentence. Suprasegmental prosodic information overtly available in the acoustic language input is processed predominantly in a temporo-frontal network in the right hemisphere associated with a clear electrophysiological marker. Studies with patients suffering from lesions in the corpus callosum reveal that the posterior portion of this structure plays a crucial role in the interaction of syntactic and prosodic information during language processing.
Our words are bound by an invisible grammar which is embedded in the brain.
Jonah Lehrer, in Proust Was a Neuroscientist.
Since the first discovery that language functions are directly related to brain tissue (28, 161, 258), people have been interested in understanding the neural basis of language. Starting with these early lesion studies, the advent of new methodologies such as electroencephalography (EEG), magnetoencephalography (MEG), and magnetic resonance imaging (MRI), which can be used in vivo to image cognitive functions in the brain (fMRI) as well as gray matter anatomy and white matter fiber tracts (diffusion-weighted MRI), has led to a considerable increase in brain-based language studies (for recent reviews, see Refs. 15, 208, 251).
Although there are hundreds of studies on the topic, describing the neural basis of language and speech remains difficult. It is hard to see the wood for the trees. In the last decade, various models have proposed various paths through the wood (21, 67, 102, 117, 118). Although the models differ in their perspective, there is a considerable and “hope-making” overlap of the different paths they take through the wood. Some models primarily focus on the neuroanatomy of speech perception (118, 213), whereas others try to specify the functional neuroanatomy of semantic and syntactic processes as well as the time course of these (21, 67). Yet others have considered different memory systems (247) or memory and control systems (102) as major parts of language processing. Taken together, however, these models seem to cover the different components of a language processing system quite well.
The goal of the present article is to describe the structural and functional neural network underlying sentence comprehension and how this process evolves over time as a sentence is perceived. We start the review by briefly sketching the time course of the different subprocesses constituting the process of sentence comprehension. Then, the general network underlying language function in the perisylvian cortex will be defined and its neuroanatomical architecture will be specified. Based on this background, the different processes taking place during comprehension, such as acoustic-phonological analyses as well as syntactic and semantic processes, will be described. These processes are hierarchically structured in time from the analysis of the auditory input to final integration and sentence comprehension. While auditory analyses clearly take place in the auditory cortices in the temporal lobes bilaterally, syntactic and semantic processes are supported by separable temporo-frontal networks strongly lateralized to the left hemisphere (LH) for syntax and less so for semantics. Processing of sentence-level prosody is supported by a temporo-frontal network in the right hemisphere (RH). These different processes and their respective neural implementation will be discussed at the neuroanatomical macro-level, and whenever possible also with respect to the neural structure at the micro-level considering cytoarchitectonics and receptorarchitectonics of the language-relevant cortices.
This review should be considered a critical one, but the goal is not to attack the position of single researchers. Rather, it is an attempt to provide a convergent view of what we know about the functional neuroanatomy of language up to now and what recent debates focus on.
The review will mainly focus on neuroimaging studies (fMRI, EEG, MEG) and will not include full coverage of all patient studies on language processing, although patient work is considered. This decision was taken based on the fact that lesion data are not always restricted to small circumscribed brain regions, and, moreover, on the finding that performance depends on the time of lesion onset and on plastic reorganization of language functions that may have occurred.
II. A BRIEF VIEW OF SENTENCE PROCESSING
The present description of sentence processing crucially differentiates three linguistic processing phases after an initial phase of acoustic-phonological analysis (67). In a first sentence-level processing phase, the local phrase structure is built on the basis of word category information. In the second phase, syntactic and semantic relations in the sentence are computed. These involve the computation of the relations between the verb and its arguments, thereby leading to the assignment of thematic roles (i.e., the analysis of who is doing what to whom). Once semantic and syntactic information both lead to a compatible interpretation, comprehension can easily take place. For example, the interpretation of an animate noun in sentence initial position as in “Mary cuts the flowers” is easy, as a person is a likely actor. For sentences in which semantic and syntactic information do not easily map, the processing system might need an additional third phase during which a final consideration and integration of the different information types is achieved, possibly including the context or world knowledge. During auditory sentence processing, these three different phases interact with linguistic prosody providing, for example, information about phrase boundaries relevant for syntactic processes. Linguistic prosody can also signal what is in the thematic focus of a sentence (indicated by stress in German and other Indo-European languages) and whether an utterance is a declarative sentence or a question (indicated by pitch in German and other Indo-European languages). This information is either essential or modulatory to the syntactic and semantic processes in a given sentence.
The above description of the process of language understanding is certainly only a sketch of what psycholinguistics has to say about this very complex process, but it entails the basic processes that have to be considered when characterizing the neural basis of language comprehension.
III. THE LANGUAGE NETWORK
From different overviews (67, 118, 251), it is clear that the language-relevant cortex includes Broca's area in the inferior frontal gyrus (IFG), Wernicke's area in the superior temporal gyrus (STG), as well as parts of the middle temporal gyrus (MTG) and the inferior parietal and angular gyrus in the parietal lobe (see Fig. 1). Within these macroanatomically defined regions, microanatomical subregions can be specified.
A. Parcellation of the Language Cortex
Korbinian Brodmann (29) was the first to provide a cytoarchitectonic description of the human cortex. Novel neuroarchitectonic approaches provide detailed information about subdivisions of regions of the language network. These new neuroarchitectonic approaches are 1) advanced objective cytoarchitectonic analysis based on the density of different types of neurons in the cortex (5, 6), 2) receptorarchitectonic analysis based on the distribution of different types of neuroreceptors in the cortex (3, 267), and 3) the connectivity-based parcellation approach that subdivides brain regions according to their area-specific connectivity to other areas in the brain (8, 132).
Interestingly, all these approaches propose a subdivision of Broca's area itself, and segregate it from adjacent areas. This appears to be of importance as the larger region of Broca's area has often been discussed as supporting different aspects of language processing (20, 102, 207). Broca's area is usually defined as consisting of the cytoarchitectonically defined Brodmann area (BA) 44, the pars opercularis, and BA 45, the pars triangularis (5, 29) (see Fig. 1). Receptorarchitectonically, area 45 can be subdivided into two portions, a more anterior area 45a bordering BA 47 and a more posterior area 45p bordering BA 44 (3) (see Fig. 2). Moreover, area 44 can be receptorarchitectonically subdivided into a dorsal (44d) and a ventral (44v) area. These subdivisions may be of particular functional importance as different language experiments have allocated different functions to area 45 and also to area 44, functions which can now possibly be assigned to different subregions within 45 (45a versus 45p) and 44 (44d versus 44v) when considering the more fine-grained neuroanatomic parcellation of this area (compare with sect. IVC2).1
With the use of a connectivity-based approach, the IFG has been shown to separate into a subregion (BA 44) connecting to the temporal cortex via a dorsal pathway [which includes the arcuate fasciculus (AF) and the superior longitudinal fasciculus (SLF)], a second region anterior to it (BA 45) connecting to the temporal cortex via the extreme fiber capsule system (EFCS), and a third region located more ventrally (frontal operculum, FOP) connecting via the uncinate fasciculus (UF) to the anterior temporal cortex (8). This latter article shows that there is variance between subjects with respect to the absolute localization of each area, but it also reveals that the relative location of the three areas is stable across different subjects [see also Klein et al. (141) for a connectivity-based parcellation of the separation of BA 44 and BA 45 and their probabilistic overlap].
The microanatomical description of the auditory and temporal cortices provides the following picture. In the primary auditory cortex (BA 41 in Fig. 1), cytoarchitectonic analyses have revealed different subregions in a medial-to-lateral direction (with Te1.0 in the middle, Te1.1 more medially located, and Te1.2 more laterally located) (176). The cytoarchitectonically defined region BA 22 covers the posterior two-thirds of the lateral convexity of the STG (29) (see Fig. 1). Receptor and cytoarchitectonic subdivisions have proposed a separation of the dorsal and ventral banks of the STG (175). It is suggested that the lateral STG proper excluding the dorsal and ventral banks is a functionally relevant area for language processing in humans. In the anterior-posterior dimension, there is no cytoarchitectonic parcellation of BA 22 as it covers most of the STG, except its most anterior portion (BA 38) (see Fig. 1).
As cyto- and receptorarchitectonic analyses cannot be conducted in the living brain, the team working with these approaches has calculated “probability maps” from post mortem brains; the cytoarchitectonic analyses of these brains are available online (http://www.fz-juelich.de/inm/index.php?index=51).
B. Structural Connections Between the Language Cortices
The identification of fiber pathways between Broca's area and the temporal cortex (Wernicke's area) dates back to the late 19th century when Dejerine (47) defined the arcuate fasciculus as the dominant fiber tract connecting these two regions. Nowadays, diffusion tensor imaging (DTI) allows the identification of structural connections between different brain regions in the human in vivo (e.g., Refs. 11, 132). For a recent tractography atlas representing the major fiber connections based on this method, see Catani and de Schotten (38). Note, however, that with this approach the directionality of the connection cannot be determined. Concerning the connection between the language-relevant regions, i.e., the (pre)frontal cortex and the temporal cortex, the literature generally agrees on two pathways, a dorsal and a ventral pathway. Recently, there has been debate with respect to the particular functions of different pathways from the temporal cortex to other parts of the brain as well as with respect to their end points in the other brain regions (see Refs. 65, 66, 256) (see Fig. 3).
Within “dual stream models” (117, 118, 213), the ventral pathway has been taken to support sound-to-meaning mapping, whereas the dorsal pathway connecting the posterior dorsal-most aspect of the temporal lobe and the posterior frontal lobe has been suggested to support auditory-motor integration (118). Using a deterministic fiber tracking approach in which the two end points of the connection are predefined on the basis of functional data, Saur and co-workers (227, 228) interpret the ventral pathway connecting the temporal cortex with the pars orbitalis (BA 47) and triangularis (BA 45) via the EFCS as supporting sound-to-meaning mapping, and define the dorsal pathway as going from the temporal lobe to the premotor cortex and continuing to the pars opercularis (BA 44) supporting sensory-motor mapping of sound-to-articulation. This functional interpretation stands in slight contrast to that derived from a probabilistic fiber tracking approach in which only one end of the connection is defined as a seed point. Defining two seed points in the IFG on the basis of two functionally different activations, Friederici et al. (69) identified a dorsal pathway going from pars opercularis (BA 44) to the posterior temporal cortex via the AF/SLF, and a ventral pathway from the FOP via the UF to the anterior temporal cortex. The dorsal pathway was taken to support the processing of nonadjacent elements in syntactically complex sentences, and the ventral pathway was taken to support combinations of adjacent elements in a sequence.
Thus these findings as well as additional data from intraoperative deep stimulation (56) make it likely that there are two ventral pathways connecting the frontal to the temporal cortex involved in language processing, one from BA 45 via the EFCS to the temporal cortex (ventral pathway I) and one from the FOP via the UF (ventral pathway II). Moreover, there is suggestive evidence that there are two parallel dorsal pathways, one from the temporal cortex to the premotor cortex (dorsal pathway I) and one from the temporal cortex to BA 44 (dorsal pathway II), with the former mainly supporting sound-to-motor mapping and the latter supporting higher-level language processes (see Ref. 39, and for a recent debate, see Refs. 65, 66, 256).
This subdivision into two dorsal pathways is in line with recent structural connectivity data from very young infants showing a dorsal fiber tract from the temporal lobe going only to the motor/premotor cortex (55). This pathway (dorsal pathway I) subserving auditory-motor integration is already of primary importance during early language acquisition, when tuning the system towards the target language (118). A dorsal fiber tract that connects the temporal lobe with Broca's area in the IFG (dorsal pathway II) develops much later and appears to be functionally related to higher-level semantic and syntactic language functions (26). It is an open issue whether these dorsal connections are direct or indirect with an intermediate stage in the inferior parietal cortex (39, 212, 213) whose role within the dorsal stream might be that of phonological working memory storage (198, 245).
In addition to these long-range connections, functional connectivity and structural connectivity analyses, moreover, have identified two short-range pathways within the temporal cortex, a first one from Heschl's gyrus (HG) to the planum polare and anterior STG via a rostral fiber pathway and a second one from HG to the planum temporale (PT) and posterior STG via a caudal fiber pathway (248). These data suggest two auditory processing streams within the temporal cortex, 1) between the primary auditory cortex (PAC) and the anterior auditory cortex (planum polare) and 2) between the PAC and posterior auditory cortex (planum temporale). Short-range connections have also been reported for the prefrontal cortex, interconnecting the inferior frontal sulcus and BA 44 (166).
To summarize, in addition to short-range structural connections within the language-related cortex, there are multiple long-range structural connections between the language-relevant regions in the frontal and temporal cortices: two dorsal pathways and possibly two parallel ventral pathways. Although the direction of the connectivity cannot be determined in humans using the DTI approach, data from animal studies using invasive tracer methods suggest strong directionality from sensory regions to the prefrontal cortex in the monkey (101, 221). The reverse information flow is also considered, and the two directions are discussed in terms of feed-forward and backward projections (212). In the domain of human language processing, projections from sensory to the premotor cortex (via dorsal pathway I) could support bottom-up information processes, whereas projections from Broca's area to the temporal cortex (via dorsal pathway II) could subserve top-down processes that generate predictions about the incoming information, thereby easing its integration. Further research must show whether these assumptions for language processing hold.
The precise function of these structural connections, however, can only be defined indirectly, namely based on the function of the particular regions they connect. One way to establish a closer relation between structural and functional information might be to use the anatomical connectivity as a prior for dynamic causal modeling of fMRI data (240).
C. Functional Connections in the Default Language Network
Every brain-based study on language processing reports at least one function-related activation in the left perisylvian cortex, which includes the prefrontal, frontal, temporal, and parietal cortices. The particular function assigned to a given area in the perisylvian cortex as defined on the basis of functional imaging studies investigating different aspects of language processing, such as phonology, syntax, and semantics, will be discussed in detail in section IV.
Here we will first consider recent data which suggest that the experimental variations in these studies only reflect the tip of the iceberg, since specific experimental conditions can only explain ∼20% or less of the total variance of the activation of the brain in a given experiment (162). The rest of the variance represents activation not induced by the specific experimental conditions. Interestingly, this “unexplained” activity is not random. For language experiments, it is located in the perisylvian cortex. As this activation pattern was only observed for language experiments and not for nonlanguage experiments, it was taken to represent the default language network (162). To identify this default activation, a low-frequency fluctuation analysis of fMRI data compared four language experiments with two nonlanguage experiments from the same laboratory (for method, see Ref. 162; for low-frequency fluctuation analysis in general, see Refs. 17, 211).2 Moreover, when conducting a functional connectivity analysis within this default language network, a significant correlational connectivity was found between Broca's area in the IFG and the posterior superior temporal lobe (162) (see Fig. 4).
Thus it is already within the default language network that there are functional connections between different language regions, independent of the different conditions induced by a given experiment. To summarize, the particular activation pattern reported for specific experimental conditions aiming to test semantic or syntactic processes as reported in the different language fMRI studies thus only represents a modulation of this default language network.
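The notion of a correlational (functional) connectivity between two regions can be illustrated with a small numerical sketch. The example below is not the analysis of Ref. 162; it assumes, purely for illustration, that connectivity is quantified as the Pearson correlation between two regional time series, and all signals, region labels, and parameter values are synthetic and hypothetical. Two simulated regions share a slow fluctuation (0.05 Hz), whereas a control region does not.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def synthetic_series(n_volumes, tr_s, components):
    """Sum of sinusoids; components is a list of (amplitude, frequency_hz)."""
    series = []
    for i in range(n_volumes):
        t = i * tr_s
        series.append(sum(a * math.sin(2 * math.pi * f * t) for a, f in components))
    return series


# Hypothetical acquisition: 300 volumes, repetition time of 2 s (assumption).
N_VOLUMES, TR_S = 300, 2.0

# Two "language" regions share a slow 0.05 Hz fluctuation plus their own signal.
roi_frontal = synthetic_series(N_VOLUMES, TR_S, [(1.0, 0.05), (0.5, 0.21)])
roi_temporal = synthetic_series(N_VOLUMES, TR_S, [(1.0, 0.05), (0.5, 0.17)])
# A control region fluctuates independently of the shared component.
roi_control = synthetic_series(N_VOLUMES, TR_S, [(1.0, 0.13)])

r_language = pearson_r(roi_frontal, roi_temporal)
r_control = pearson_r(roi_frontal, roi_control)
print(f"frontal-temporal r = {r_language:.2f}, frontal-control r = {r_control:.2f}")
```

In this toy setting the shared slow component alone produces a high correlation between the two regions, independent of any experimental condition, which is the logic behind treating such correlations as a property of the default network.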
IV. PROCESS-SPECIFIC NEURAL NETWORKS
Spoken sentence comprehension requires a number of subprocesses to derive the meaning of a sentence from the auditory input, namely acoustic-phonological, syntactic, and semantic processes. We will discuss the brain regions supporting these different processes in turn.3
A. Acoustic-Phonological Analysis
The comprehension of spoken language starts with the acoustic-phonological analysis of the speech input. The obvious neural candidate to support this process is the auditory cortex and adjacent areas.
In an attempt to specify subregions in the auditory cortex and adjacent areas in humans, researchers have relied on neuroanatomical data from non-human primates for which a core region in HG and surrounding belt and parabelt regions have been identified (213, 230). In humans, the PAC is located on the superior surface of the temporal lobe bilaterally in HG. Three regions can be identified adjacent to HG: a region located posterior to it, the planum temporale (PT); a region anterolateral to HG, called the planum polare (PP); and a region at the lateral convexity of the cortex in the STG extending to the superior temporal sulcus (STS). All these regions are involved in the acoustic analysis of speech. Cytoarchitectonic studies have indicated that the PAC usually covers the medial two-thirds of the anterior HG (176), and the identification of a subregion in the lateral convexity of the STG has been confirmed by a receptorarchitectonic analysis (175).
Functionally, a primary step is to differentiate speech from nonspeech acoustic signals, and for a description of the neuroanatomic basis of speech comprehension, it would be of major interest to identify where in the processing stream this takes place. The primary auditory analysis is computed in HG. Functional neuroimaging studies show that HG is activated by any type of sound (133, 177). The region lateral to HG at the convexity of the STG extending into the STS has been found to respond to acoustic features of phonetic parameters (16), but also to variations of frequency and spectral information in nonspeech sounds (109) and is thus not specialized for speech. Functional imaging studies have, moreover, shown that PT also does not react specifically to speech sounds, at least compared with equally complex nonspeech sounds (48, 261, 266). The information flow from HG to PT has been demonstrated in a time-sensitive fMRI paradigm, indicating the involvement of HG and PT at different points in time (264). It has been concluded that HG is associated with analyzing the sound signal per se, whereas the PT may be involved in categorizational processes. The PT has been proposed as the region for the segregation and matching of spectrotemporal patterns and as serving as a “computational hub” gating the information to higher-order cortical areas (95).
Speech perception of phonemes (consonants) was found to activate a region anterolateral to HG in the STG/STS (189). This region differentiates between speech and nonspeech sounds. In contrast, the left posterior STG was found to process the basic acoustic characteristics of the signal. Given their respective responsibilities, the posterior STG was defined as reflecting earlier processes than the anterolateral STG/STS (146). The fMRI finding that the posterior STG houses an earlier processing level than the anterolateral STG/STS is consistent with magnetoencephalographic evidence locating the relatively early N100 response to consonants in HG and PT (188) and with patient evidence showing that lesions in the posterior STG lead to word deafness as well as deficits in the perception of nonspeech sounds (204). Other neuroimaging studies, however, reported the PT or the supramarginal gyrus to respond to speech compared with nonspeech sounds (46, 131, 174). These studies, in contrast to Obleser et al. (189), who used a passive listening paradigm, used attention-demanding tasks. From these data, it appears that under specific task demands, the differentiation between speech and nonspeech sounds by means of top-down processes may be shifted to an earlier processing level, in this case the PT.
Functionally, the PAC in both the left and the right hemisphere responds to speech and tonal pitch, but the two appear to have different computational preferences, with the left PAC reacting specifically to speech sound characteristics and the right PAC to characteristics of tonal pitch (265). The relative specialization of the two auditory cortices for these stimulus types, which differ in their temporal and spectral characteristics, is described as a specialization for rapidly changing information with a limited frequency resolution in the left hemisphere and a system with reverse characteristics in the right hemisphere. The former system would be ideal for the perception and recognition of speech sounds, as the determination of these (i.e., phonemes in a sequence) requires a system with a time resolution of 20–50 ms. The latter system would be able to deal with suprasegmental information (i.e., prosody), which requires a system with a time resolution of 150–300 ms. Hickok and Poeppel (118) proposed that the left and right hemisphere generally work at different frequencies, leading to a relative lateralization of functions. The left hemisphere primarily works in gamma frequencies, whereas the right hemisphere works in the theta range (93).
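The relation between these time windows and the named frequency ranges follows from simple arithmetic: a window of T ms corresponds to a rate of 1000/T Hz. The sketch below works this out. The band limits used for theta and gamma are conventional approximations assumed here for illustration only; they are not taken from the studies cited, and the function names are hypothetical.

```python
def window_to_rate_hz(window_ms):
    """Rate (Hz) corresponding to one cycle per temporal window of window_ms."""
    return 1000.0 / window_ms


def rate_range_hz(shortest_ms, longest_ms):
    """(low, high) rate range in Hz for a range of window lengths in ms."""
    return (window_to_rate_hz(longest_ms), window_to_rate_hz(shortest_ms))


def overlaps(range_a, range_b):
    """True if two (low, high) ranges share at least one value."""
    return range_a[0] <= range_b[1] and range_b[0] <= range_a[1]


# ASSUMPTION: approximate, conventional band limits in Hz (not from the source).
THETA_HZ = (4.0, 8.0)
GAMMA_HZ = (30.0, 80.0)

segmental = rate_range_hz(20, 50)          # windows for speech sounds
suprasegmental = rate_range_hz(150, 300)   # windows for prosody

print("segmental:", segmental, "overlaps gamma:", overlaps(segmental, GAMMA_HZ))
print("suprasegmental:", suprasegmental, "overlaps theta:", overlaps(suprasegmental, THETA_HZ))
```

Under these assumed limits, the 20–50 ms windows map onto 20–50 Hz, which reaches into the gamma range, and the 150–300 ms windows map onto roughly 3.3–6.7 Hz, which falls largely within the theta range.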
When considering functional levels of speech perception, a next relevant level is “intelligibility” in its most general sense (i.e., language understanding including both semantic and syntactic processes). The methodological approach used to investigate processes at this level is the manipulation of the acoustic signal by spectrally rotating normal speech to render the speech signal unintelligible (18). Studies using such manipulations have consistently shown that the anterior STS is systematically activated as a function of intelligibility (see Table 1). The posterior STS, in contrast, was found to be equally activated by normal speech, rotated speech, and noise-vocoded speech (232), leading to the idea that this area is involved in the short-term representation of sequences of sounds that contain some phonetic information (without being necessarily intelligible) (229). This functional differentiation is interesting in the light of the two different pathways from the primary auditory cortex discussed in section IIIB, one going from HG to the anterior STS/STG and one going from HG to the posterior STS/STG (248). Moreover, these observations are in line with clinical studies on patients with focal cerebral disease in the anterior temporal regions showing deficient speech comprehension (1, 14, 89, 119, 182).
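Spectral rotation is commonly understood as mirroring the spectrum of a band-limited signal around a center frequency, so that low-frequency energy moves to high frequencies and vice versa while the overall acoustic complexity is preserved. The following is a minimal sketch under that assumption and is not the procedure used in the cited studies. It uses a single test tone instead of speech, a hypothetical sampling rate of 8 kHz, and the simplest possible rotation: multiplying every second sample by −1, which mirrors the spectrum around one quarter of the sampling rate.

```python
import cmath
import math


def tone(freq_hz, fs_hz, n_samples):
    """Pure sine tone sampled at fs_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / fs_hz) for n in range(n_samples)]


def rotate_spectrum(signal):
    """Multiply sample n by (-1)**n, mapping a component at f to fs/2 - f."""
    return [x if n % 2 == 0 else -x for n, x in enumerate(signal)]


def peak_frequency_hz(signal, fs_hz):
    """Frequency of the largest-magnitude bin of a plain (slow) DFT."""
    n = len(signal)
    best_bin, best_mag = 0, -1.0
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * fs_hz / n


FS_HZ = 8000       # hypothetical sampling rate (assumption)
N_SAMPLES = 400    # gives 20 Hz frequency resolution

original = tone(500, FS_HZ, N_SAMPLES)
rotated = rotate_spectrum(original)

print("original peak (Hz):", peak_frequency_hz(original, FS_HZ))
print("rotated peak (Hz): ", peak_frequency_hz(rotated, FS_HZ))
```

With these values the spectrum is mirrored around 2 kHz, so the 500 Hz tone reappears at 3,500 Hz. Applied to speech, the same mirroring would keep the temporal envelope but scramble the spectral cues on which intelligibility depends.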
To summarize, as a first processing step during auditory language comprehension, the brain has to perform an acoustic analysis in an auditory cortical network starting at the PAC and then distributing the information in two directions, 1) to the PT and posterior STG and 2) to the planum polare and the anterior STG. As yet, little is known about the particular function of the planum polare in processing speech or complex nonspeech sounds. The PT has been suggested as the “computational hub” from which information is gated to higher-order cortical regions (95). A connection from the temporal cortex to the premotor cortex appears to support auditory-to-motor mapping and has been claimed to represent part of the “phonological network” (228).
B. Initial Syntactic Processes
Several psycholinguistic models have proposed that the sentence parser processes syntactic information at different levels with an initial stage during which the simplest syntactic structure based on word category information is constructed and a second stage during which the relations who is doing what to whom are established (63). These models called serial syntax-first models have been challenged by interactive and constraint-satisfaction models (163, 169), which assume that syntactic and semantic information interact at any time. Syntax-first models, however, receive some support from neurocognitive models of language comprehension, which consider event-related brain potentials (ERPs) to provide crucial information about the temporal structure of language processing (21, 67).
As syntax-first models assume that the syntactic processes relevant for the assignment of the grammatical structure of a sentence occur only a couple of hundred milliseconds later than the initial syntactic parse, it is not easy to separate these two stages of syntactic processing using fMRI due to the low temporal resolution of this method. One way to investigate the different syntactic stages is to introduce violations in natural sentences which tap either the initial or the later syntactic processing stage. The initial processing stage will clearly be affected by word category violations, since incorrect word category information would make the building up of an initial local phrase structure impossible, while violations of grammatical relations in the sentence will affect a later processing stage. Another way of investigating local syntactic structure building is to use artificial grammars which lack semantic relations. Initial local phrase structure building processes4 were found to be correlated with increased activation in the frontal operculum and the anterior STG both in studies on natural grammar processing (81) and on artificial grammar sequences (69). The natural grammar study in German introduced a word category error within a prepositional phrase by putting a verb instead of a noun after the preposition, e.g., “The pizza was in the eaten” instead of “The pizza was in the restaurant eaten” (literal translation). The past participle verb form is syntactically incorrect, disallowing local structure building. The artificial grammar experiment used a probabilistic grammar in which an element of the category A (a certain syllable type) was always followed by an element of the category B (another syllable type), e.g., ABABAB. A violation was created by having an A syllable followed by another A syllable in the sequence. The processing of this syntactic error in the artificial grammar sequence led to activation in the FOP.
Taking the maximum of activation as a seed point for tractography analysis in each individual, a ventrally located fiber tract connecting the FOP and the anterior STG via the uncinate fasciculus was found (69). On the basis of this finding, it has been suggested that the FOP together with the anterior STG supports local structure building. More generally, this network could be viewed as the system that supports rule-based combinatorics of adjacent elements.
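The adjacency rule of the artificial grammar described above (every A element must be followed by a B element) is simple enough to state as a short check. The sketch below is only an illustration of that rule; the function name is hypothetical, sequences are represented as strings of category labels rather than actual syllables, and nothing here reproduces the stimulus material of the study.

```python
def find_adjacency_violations(categories):
    """Return the positions at which an A element is not followed by a B element.

    `categories` is a sequence of category labels such as "ABABAB".
    The position reported is that of the offending element after the A.
    A sequence-final A is not flagged, since nothing follows it.
    """
    violations = []
    for i in range(len(categories) - 1):
        if categories[i] == "A" and categories[i + 1] != "B":
            violations.append(i + 1)
    return violations


print(find_adjacency_violations("ABABAB"))    # grammatical sequence
print(find_adjacency_violations("ABAABAB"))   # A followed by another A
```

Because the rule only ever inspects two neighbouring elements, detecting the violation requires no memory beyond the immediately preceding item, which is what is meant by combinatorics of adjacent elements.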
During sentence processing, this initial stage of phrase structure building is mandatory and should in principle be observable whenever a sentence is processed. Thus the FOP should be seen with increased activation not only for violations in sentences and sequences, but also when comparing sentences to nonstructured word lists. Activation of the FOP was observed in a study comparing sentences to word lists without function words (78), but not in other studies using mixed word lists. Most of these other studies used word lists that allowed local structure building partly due to syntactically legal combinations of two or three words in the list, for example, adjectives and nouns (125, 127, 236, 241, 250). Interestingly, Vandenberghe et al. (250) report activation in the FOP (−48, 22, 4) for different sentence conditions providing word category information compared with control conditions in which unpronounceable letter sequences (providing no word category information) were used. All these findings are thus generally in line with the view that local structure building is supported by the FOP. However, it should be noted that local structure building is quite automatic in adults, requiring only few resources (as indicated by ERP studies; see sect. VB). Therefore, the FOP may not be seen to be significantly activated in each study with native adult listeners. Moreover, given that the activation in native listeners is very small, significant activations may not be observable in grand averages across subjects due to the variability of the location of the FOP across individuals as shown in a connectivity-based parcellation study (8). Further research taking individual subject data into account must clarify this issue.
Studies investigating sentence processing under less proficient processing conditions as in language development (27) and second language learning (222) show that processing phrase structure violations involves the IFG, in particular Broca's area, and not just the FOP. This suggests that there may be a shift in the recruitment of necessary parts of the ventral prefrontal cortex for local syntactic structure building as a function of language proficiency.
C. Computation of Semantic and Syntactic Relations
Empirically, there are three basic methodological approaches to investigate syntactic and semantic processes during sentence comprehension. The first is to vary the presence/absence of syntactic information (by comparing sentences to word lists) or of semantic information (by comparing real word lists/sentences to pseudoword lists/sentences). The second approach is to introduce syntactic or semantic errors in sentences. The third is to vary the complexity of the syntactic structure (including syntactic ambiguities) or the difficulty of semantic interpretation (including semantic ambiguities). All these approaches have been used in fMRI studies published in the last 15 years.
In general, these studies found activations at different locations in the anterior and posterior temporal cortex as well as in the IFG. The picture that emerges from these studies may be less clear than some researchers had hoped (Ref. 59 and a reply to this paper by Grodzinsky, Ref. 97). However, once we take both stimulus type and task as well as neuroarchitectonic subdivisions of language-relevant brain regions into consideration, a picture emerges that is worth presenting as a tentative state of the art model. Once these different aspects are considered, the reported activation pattern provides a surprisingly coherent picture even across typologically different languages. We will first consider activations in the temporal lobe and then those in the IFG.
1. Role of the temporal lobe
Many of the neuroimaging studies on language comprehension report activation in the anterior and posterior temporal lobe. While some studies concluded that the anterior and posterior temporal regions react specifically to semantic or syntactic aspects, others challenged this view by arguing either that the anterior temporal lobe (218, 250) or the posterior temporal lobe is not domain specific (126).
A) ANTERIOR TEMPORAL LOBE.
A number of fMRI studies reporting activation in the temporal lobe investigated semantic and syntactic processes by systematically varying the presence/absence of semantic and syntactic information in a within-subject design. Those studies that compared sentences (syntax present) to word lists (syntax absent) found the lateral anterior temporal lobe to activate more strongly for sentences than for word lists (for French, 171; for German, 78; for English, 125) (see Table 2 for more studies). As this increase of activation in the anterior STG/STS is present even when comparing meaningless pseudoword sentences (i.e., sentences in which function words remain in their syntactically correct position, but content words are replaced by pseudowords) with meaningless pseudoword lists, this region has been interpreted to support the construction of phrase structure in particular (78, 125). One study investigating the processing of sentences containing syntactic and semantic violations found that, compared with baseline, syntactic violations led to an increased activation in the anterior STG, whereas semantic violations did not (81). Moreover, studies testing semantics by comparing real-word stimuli (sentences and word lists) with pseudoword stimuli (sentences and word lists) reported no main effect of semantics in anterior STG/STS (78, 125).
However, activation in the anterior temporal lobe has been reported to change as a function of sentence-level semantic processes (218, 250). This appears to be the case only under certain experimental conditions. Vandenberghe, Nobre, and Price (250) used sentences that were either semantically incoherent and/or syntactically incorrect. The anterior temporal pole was found to be more active for semantic incoherence, but only when syntactically incorrect versions were compared with normal sentences. Thus the semantic effect only carries once the syntax is incorrect. Rogalsky and Hickok (218) reported a direct comparison of activation of two task conditions: subjects listening to sentences including a semantic or syntactic violation had to detect either the semantic violation or the syntactic violation, respectively. In a whole head analysis conducted over correct sentences, the anterior temporal lobe was found to be activated during both the syntactic task and the semantic task. A region of interest analysis in the anterior temporal lobe including BA 38 revealed a large region that was equally modulated by the two tasks, but only a small subregion that was modulated by the semantic task alone. From these data the authors concluded that the anterior temporal lobe should be considered as a region that supports combinatorial processes both in the syntactic and the semantic domain.
From the studies discussed, we can conclude that the anterior STG is systematically involved whenever syntactic structure has to be processed (sentences versus word lists). For a localization of the anterior STG, see Figure 1. The simple presence/absence of word semantics (real > pseudo-words) does not modulate this region. Sentence-level semantic aspects can activate the anterior temporal lobe but only under certain stimulus conditions (250) or under specific task conditions (218). It has been proposed that there are two different subregions within the anterior STG/STS that modulate their activation differentially as a function of semantic and syntactic processes, with the most anterior portion of the STS responding to syntactic manipulations (sentence versus word list) and a region directly posterior to it showing an interaction of syntactic and semantic factors (125). Future studies will have to provide additional evidence of this functional separation within the anterior temporal lobe.
It should be noted that the anterior temporal lobe has long been discussed as supporting semantic tasks in general (155). Evidence for this view mainly comes from patients with dementia or lesions in the anterior temporal lobe, who show semantic impairments for word and picture processing and memory. We will not discuss these studies in detail as the focus of this review is on sentential processes, but refer to recent meta-analyses. One recent meta-analysis (15) reviewed 120 fMRI studies on semantic processing at the word level and identified a left-lateralized semantic network consisting of seven regions, none of which, however, were in the anterior temporal lobe. Another recent meta-analysis (252) reviewed 164 functional imaging studies including those investigating words and sentences presented auditorily and visually as well as pictures. This analysis revealed that the likelihood of anterior temporal lobe activation is dependent on the type of stimuli, and that studies using auditory sentences are more likely to find activation in this region than studies using other stimulus types, but the authors refrain from defining this region's function in auditory sentence processing.
To conclude, it appears that the anterior temporal cortex is involved in semantic and syntactic processes. Its function during sentence processing may be primarily combinatorial in nature.
B) POSTERIOR TEMPORAL LOBE.
The posterior temporal lobe has also been found to be activated during language comprehension. Activation in the left posterior STG/STS has been reported for syntactic information across different studies, when comparing sentences to word lists (127, 236, 250), when comparing syntactically complex to less complex sentences (41, 77, 140, 184, 225), and when comparing sentences containing a syntactic violation with syntactically correct sentences (76, 81) (see Table 3).
Activation in the posterior STG/STS has also been seen to be modulated by specific semantic information at the sentential level, in particular, when the stimulus material involves the processing of the relation between the verb and its arguments, be it in correct sentences when considering a sentence's semantic cloze probability with respect to the verb-argument relation (185), or in sentences which contain a restriction violation between the verb and its arguments (81). When different verb classes and their argument order were investigated, it was found that these two factors interact in the posterior STG/STS (22). Together, these studies suggest that the left posterior STG/STS is a region in which syntactic information and verb-argument-based information are integrated (98).
Moreover, syntactic and semantic ambiguity involve the posterior temporal cortex. Syntactic ambiguity activates the posterior temporal lobe extending posteriorly to the inferior parietal lobe and the MTG anterior to Heschl's gyrus (245). Semantic sentence ambiguity was found to activate the left posterior temporal cortex including the STS, MTG, and inferior temporal gyrus (215). However, it should be noted that both semantic and syntactic ambiguity are processed in a network which, in addition to the temporal cortex, also involves the left IFG, as evidenced by a functional connectivity analysis using a predictor time series located in the left IFG (245). In this analysis, the activation due to semantic ambiguity in the left IFG predicts the activation in the left anterior STG, whereas the activation of syntactic ambiguity in the left IFG predicts the activation in the anterior and posterior MTG/STG. Thus both ambiguity types activate a temporo-frontal network with type-specific modulations in the temporal cortex. These modulations for semantic and syntactic ambiguity are in line with the type-specific modulations observed in the language studies listed in Tables 2 and 3.
Note, however, that the posterior STG is not specific to integration processes in language or speech (230). Rather, it has been implicated in the integration of different information types, for audiovisual integration (2, 31), for biological motion (209), and for face processing (110). It has been proposed that the function of the STS varies depending on the coactivations of the network with regions in the medial temporal lobe and in the frontal cortex (111).
There is one additional region in the left superior Sylvian fissure at the parietal-temporal boundary, called area Spt, which has been discussed as part of the auditory-motor integration circuit, which involves left frontal regions and the STS bilaterally (116, 118). The Spt is also not specific to speech, as it is activated during the perception and reproduction (humming) of tonal sequences as well (116). It is speculated that Spt is more highly coupled to the motor system than to the sensory system. Thus the posterior temporal cortex is clearly involved in language processing, and its function appears to be primarily to integrate different types of information. For sentence processing, this might mean the integration of semantic and syntactic information.
2. Role of the IFG
The IFG, in particular Broca's area, has long been known to support language production (28, 223) and comprehension processes (269). For the localization of Broca's area, defined as consisting of BA 44 and BA 45, see Figure 1. Its function in language comprehension is still a matter of considerable debate (99, 102, 103, 219). Although the different views agree upon the involvement of Broca's area in language comprehension, they debate its particular role in this process. This discussion takes place on multiple levels. At the most general level, the claim is made that Broca's region supports action observation and execution and that its part in language is related to motor-based speech production and comprehension processes (210, 214). At the next level, the claim is that Broca's region supports verbal working memory (235) and that this is why this region shows activation when processing syntactically complex sentences (37, 220). At a linguistic level, subregions of Broca's area have been allocated to different aspects of language processing, either seeing BA 44 as supporting syntactic structure building, BA 44/45 as supporting thematic role assignment and BA 45/47 supporting semantic processes (67), or specifying Broca's area (BA 44/45) as the region supporting the computation of syntactic movement (96), or defining Broca's region (BA 44/45/47) as the space for the unification of different aspects in language (102). This debate was and is based on a large number of neuroimaging studies as well as neurophysiological and behavioral studies with healthy individuals and with patients suffering from circumscribed brain lesions in the IFG. The majority of these are described in different review articles published over the past decade (20, 67, 96, 98, 102, 219). 
This review will not reiterate each of these studies, but will discuss recent studies that have contributed possible solutions to the open issues at the linguistic level and the related verbal working memory processes.
A) SYNTACTIC COMPLEXITY.
A large number of studies in different Indo-European languages have investigated the neural substrate of syntactic processes by varying syntactic complexity. In these languages the canonical word order is subject-first either with a subject-verb-object or a subject-object-verb structure. Studies in these languages often compare brain activation for the processing of noncanonical object-first to canonical subject-first sentences using different sentence types in which the object-noun phrase is moved to a position in front of the subject-noun phrase, called movement in linguistics (for studies in different languages, see Table 4 and Fig. 5). In linguistic terms, this means that the object-noun phrase (now antecedent) leaves an empty position in the original structure (gap) of the sentence. What is analyzed in the imaging studies is the difference in the brain activation between sentences containing movement or not, or the difference between sentences varying the distance of the antecedent-gap relation (short/long). The studies listed in Table 4 show an activation increase in Broca's area (BA 44 and/or BA 45) for movement operations across different languages with the exception of three studies. These are as follows: Caplan et al. (35), who presented the critical sentences together with semantically implausible sentences and employed a plausibility judgement task, and two studies (41, 60) which only found IFG activation for a long, but not for a short antecedent-gap relation, suggesting an interaction between syntactic structure and distance as such. However, the finding that these two studies only observed an effect for the long conditions could be explained by the fact that their short conditions differed from the long conditions in the number of intervening noun phrases.
Supporting evidence for the view that Broca's area is crucial for the processing of syntactic complexity comes from studies investigating patients with focal lesions in Broca's area (for a review, see Ref. 96). A recent study investigating patients with the nonfluent, agrammatic variant of primary progressive aphasia (PPA), which is a clinical syndrome associated with degeneration of relevant language regions, provides additional insights into the involvement of Broca's area in processing syntactically complex sentences in English (259). In a functional and structural imaging experiment, these PPA patients, in contrast to controls, showed low performance for the processing of noncanonical sentences, i.e., sentences requiring movement operations. In controls, the left dorsal posterior IFG (BA 44) including the IFS as well as the mid-posterior STS were modulated by syntactic complexity, and in patients, atrophy was observed in these very same brain regions including the left dorsal precentral gyrus, but sparing the most posterior portion of the STS. While the mid-posterior STS showed preserved modulation in the patient group, the posterior portion of IFG (BA 44) did not. These data suggest that BA 44 is the most critical region for processing syntactic complexity.
A second cluster of studies, those in free word order languages such as German and Japanese, investigated sentences with noncanonical word order structures different from those in English. Due to case marking in these languages, an object-noun can simply change position in the sentence (object-verb-subject) and is still grammatical (as in 1 and 2 below; nominative case = NOM; accusative case = ACC).
1) Der Junge (NOM) grüßt den Mann (ACC).
The boy (subject) greets the man (object).
2) Den Mann (ACC) grüßt der Junge (NOM).
The man (object) greets the boy (subject) [literal].
The boy greets the man [nonliteral].
Clause-initial object-first order as in 2 is called topicalization; clause-medial object-first order as in 3 is called scrambling (when occurring in the so-called middle field).
3) Heute hat den Mann (ACC) der Junge (NOM) gegrüßt.
Today has the man (object) the boy (subject) greeted [literal].
In linguistic theory, it is discussed whether topicalization and scrambling can be considered as a type of movement or not. At the neural level, it appears that scrambling activates Broca's area in quite a similar way to movement (see Table 4 and Fig. 6).
A study in German investigated scrambling by parametrically varying the number of permutations in a sentence (70). Object noun phrases (indirect object = IO, direct object = DO) were scrambled in front of the subject noun (S) as in sentences 5 and 6, leading to sentences of varying syntactic complexity (nominative case = NOM, dative case = DAT, and accusative case = ACC).
4) Low complexity (S-IO-DO).
Heute hat der Opa dem Jungen den Lutscher geschenkt.
Today has the grandfather (NOM) to the boy (DAT) the lollipop (ACC) given.
5) Medium complexity (IO-S-DO).
Heute hat dem Jungen der Opa ____ den Lutscher geschenkt.
Today has to the boy the grandfather the lollipop given.
6) High complexity (IO-DO-S).
Heute hat dem Jungen den Lutscher der Opa ____ ____ geschenkt.
Today has to the boy the lollipop the grandfather given.
The brain activation in BA 44 increased systematically as the syntactic complexity increased. Activation for the different sentence types in Broca's area is displayed in Figure 6.
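One way to make the parametric manipulation concrete is to count how many pairs of arguments appear out of canonical order. The following is a minimal sketch, not the procedure of the cited study: the function name `permutation_count` and the choice of counting pairwise inversions relative to S-IO-DO are assumptions made here for illustration only.

```python
from itertools import combinations

# Canonical argument order (subject, indirect object, direct object),
# as in the low-complexity example sentence.
CANONICAL = ("S", "IO", "DO")


def permutation_count(order, canonical=CANONICAL):
    """Count argument pairs whose relative order deviates from the canonical one.

    Hypothetical helper: counting pairwise inversions is an illustrative
    assumption, not a measure taken from the cited study.
    """
    if sorted(order) != sorted(canonical):
        raise ValueError("order must contain exactly the canonical arguments")
    rank = {arg: i for i, arg in enumerate(canonical)}
    return sum(1 for a, b in combinations(order, 2) if rank[a] > rank[b])


conditions = {
    "low": ("S", "IO", "DO"),
    "medium": ("IO", "S", "DO"),
    "high": ("IO", "DO", "S"),
}

for name, order in conditions.items():
    print(name, "-".join(order), permutation_count(order))
```

Under this assumption the three conditions are ordered by 0, 1, and 2 displaced argument pairs, which mirrors the low, medium, and high complexity levels of the example sentences.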
A study on sentence embedding (nested structures) in German also showed activation in BA 44 (166). Comparing embedding and movement directly in English, it was found that embedding activated BA 44 and movement BA 45 as well as BA 44 (225), suggesting BA 44 as the core region of syntactic complexity.
In sum, different studies indicate that the processing of syntactically complex sentences recruits Broca's area. The particular function of BA 44 and BA 45, however, still remains to be specified across different languages.
B) SYNTACTIC COMPLEXITY AND WORKING MEMORY.
With respect to the discussion on the role of Broca's area, it is clear that Broca's area is involved in working memory (WM) in general (253) and that the processing of syntactically complex sentences requires some WM capacity (41, 92, 134). It is debated whether the verbal WM involved in language comprehension is specific for syntax or not (37, 58, 158, 255). Some authors see the role of Broca's area in WM as specific to the processing of movement, since they found WM to interact with the processing of sentences requiring movement in BA 45, but not with the processing of other sentence types (226).
The interplay between syntactic complexity, length of syntactic ambiguity, and working memory has been investigated in a study involving participants with low and high reading span (61). This study found that the superior portion of BA 44 bordering the IFS increased its activation as the length of the syntactically ambiguous part of the sentence increased (requiring increased memory resources), whereas the activation in the more inferior part of BA 44 increased as a function of syntactic complexity (but only for low span readers) (61). This suggests a possible subdivision of Broca's area with its most dorsal part bordering the IFS responding as working memory demands increase, and with the more inferior part of BA 44 reacting to syntactic complexity. More recently, a study on processing syntactically complex, center-embedded, nested sentences varied the factors WM and syntactic complexity systematically and was able to segregate the two factors neuroanatomically. WM was operationalized as the distance between the subject noun-phrase and its related verb, whereas syntax was operationalized as the number of hierarchical embeddings (see Fig. 7). Example sentences for the long distance condition are 1) embedded structure and 2) nonembedded structure.
1) Peter wusste, dass (Peter knew that).
Maria (S1), die (S2) Hans, der (S3) gut aussah (V3) liebte (V2) Johann geküsst hatte (V1).
Maria who Hans who was good looking loved Johann kissed. [literal]
Maria who loved Hans who was good looking kissed Johann. [nonliteral]
2) Peter wusste, dass (Peter knew that).
Achim (S1) den großen Mann gestern am späten Abend gesehen hatte (V1).
Achim the tall man yesterday at late night saw. [literal]
Achim saw the tall man yesterday late at night. [nonliteral]
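The two example sentences can be used to illustrate how the two factors are dissociated. The sketch below is only an illustration under stated assumptions: distance is counted as the number of words between the subject S1 and the first word of its verb complex V1, and hierarchy as the highest subject index minus one. The token annotations follow the labels given in the examples; the helper names and the word-counting rule are hypothetical and are not taken from the cited study.

```python
def subject_verb_distance(tokens, level=1):
    """Number of words between subject S<level> and its verb V<level>.

    Assumption: the verb label is placed on the first word of the verb complex.
    """
    labels = [label for _, label in tokens]
    subj = labels.index(f"S{level}")
    verb = labels.index(f"V{level}")
    return verb - subj - 1


def embedding_depth(tokens):
    """Number of embedded clauses, taken as the highest subject index minus one."""
    levels = [int(label[1:]) for _, label in tokens if label and label.startswith("S")]
    return max(levels) - 1


# Example 1: embedded structure (labels as in the text).
embedded = [
    ("Maria", "S1"), ("die", "S2"), ("Hans", None), ("der", "S3"),
    ("gut", None), ("aussah", "V3"), ("liebte", "V2"), ("Johann", None),
    ("geküsst", "V1"), ("hatte", None),
]

# Example 2: nonembedded structure.
nonembedded = [
    ("Achim", "S1"), ("den", None), ("großen", None), ("Mann", None),
    ("gestern", None), ("am", None), ("späten", None), ("Abend", None),
    ("gesehen", "V1"), ("hatte", None),
]

for name, sentence in (("embedded", embedded), ("nonembedded", nonembedded)):
    print(name, subject_verb_distance(sentence), embedding_depth(sentence))
```

With this counting rule, both long-distance sentences have the same number of words between S1 and V1 while differing in the number of embeddings, which is the property that allows the distance factor and the hierarchy factor to be separated.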
The main effect of distance reflecting WM was located in the IFS, whereas the main effect of hierarchy reflecting syntactic complexity was located in BA 44 proper (166) (see Fig. 7). Functionally, it was shown that the two areas strongly interact during sentence comprehension. Although in this study the number of embeddings directly correlated with the number of subject-verb dependencies, the observed activation in BA 44 as a function of syntactic complexity is in line with earlier findings showing that the inferior portion of BA 44 parametrically increased its activation with increased syntactic complexity operationalized as the number of permutations of noun phrases in scrambled sentence structures (70). Moreover, these data sets are perfectly compatible with a recent study which reports activation in BA 44 for syntactic complexity in general (embedding and movement), but activation in BA 45 specifically only for syntactic movement (225). These functional subdivisions of Broca's area into BA 44 and BA 45, and even into a dorsal and ventral region within BA 44 might be seen in the light of recent receptorarchitectonic subdivisions of Broca's area (3) (described in sect. IIIA). Future research might be able to relate different language functions to receptorarchitectonically defined subregions of Broca's area in more detail.
Note that the proposed fine-grained functional segregation of Broca's area into BA 45 and BA 44 or its subregions during language processing is orthogonal to the dispute of whether Broca's area has to be conceived to be language-specific or not, since Broca's area may receive domain-specific functions as part of different domain-specific neural networks. Within this dispute it is still an open question whether Broca's area subserves a more general function underlying the domain-specific functions. Researchers who try to specify the brain structure-function relationship for the language domain have already pointed out that Broca's area is involved in other nonlanguage processing domains and suggested that the most general function of Broca's area is to support sequence processing in both the language and the nonlanguage domain (67). Indeed, recent studies have shown that the processing of hierarchical sequences in artificial grammars (9, 69, 190, 192) and even in the visuospatial domain (10) activates Broca's area (BA 44), but as part of different neural networks.
Recently, however, the claim has again been made that the only contribution of Broca's area to sentence comprehension is its role as a phonological short-term memory resource and not more (219). This claim is based on a combination of behavioral and fMRI data. Behaviorally, it was shown that the comprehension difference between difficult object-relative sentences (OR) and easy subject-relative sentences (SR) (i.e., OR > SR, the syntactic complexity effect) is affected by a concurrent articulatory suppression task performed while listening to these sentences (220). However, it is also affected by concurrent finger tapping, although to a somewhat lesser degree. The authors take the effect of articulation on sentence comprehension to indicate that verbal rehearsal, blocked by articulation, supports sentence processing. During sentence processing in the scanner without any task, both BA 44 and BA 45 showed the syntactic complexity effect (OR > SR). With the concurrent articulation task, the syntactic complexity effect is eliminated in BA 44 (not in BA 45), due to an increase in activation for the easy-to-process SR sentences. The authors do not provide a compelling explanation for why articulatory suppression should affect processing of the easy SR sentences rather than the difficult and complex OR sentences. Thus it is not entirely clear how their findings can be linked directly to the claim that the role of Broca's area (BA 44) in sentence comprehension is nothing more than providing a phonological short-term memory resource, necessary for the processing of syntactically complex sentences.
Moreover, this general claim is challenged by patient data indicating that phonological rehearsal capacities are independent from sentence processing abilities (36). Thus these data rather lead to the assumption of two working memory systems in the prefrontal cortex reflecting a phonological rehearsal component and a syntactic manipulation component (37).
3. Syntactic complexity and experimental demands
The majority of studies on syntactic complexity reported activation in the IFG, mostly in BA 45 and BA 44, but some also in the more anteriorly located BA 47. The localization of the syntactic complexity effect, manifested in more activation for complex than simple sentences, appears to be subject to the experimental demands, such as task demands or intelligibility of the stimulus.
A number of studies have demonstrated large effects of task demands on identical stimulus sets (33, 79). On the single-word processing level, a shift from BA 44 to BA 45 was demonstrated when words had to be judged for their syntactic word category (BA 44) or for their concreteness (BA 45), respectively (79). On the sentence level, the effect of syntactic complexity was shown to differ as a function of task (33). The complexity effect has repeatedly been shown to correlate with an increase of activation in Broca's area, both in BA 45 and BA 44. Even across different tasks, activation in BA 44 and BA 45 was reported in studies using plausibility judgement tasks (Ref. 243; and seven experiments reported in Ref. 32), as well as studies that had used comprehension verification tasks either by question answering (who did what to whom) (13, 22, 70, 166), by sentence probe verification (184), or by word probe verification (225, 226). Regions adjacent to Broca's area were observed to be activated when the factors syntactic complexity and semantic constraint were mixed in sentence comprehension experiments. Under such conditions, the syntactic effect (more activation for object-extracted than for subject-extracted sentences) was found in the more anteriorly located BA 47 (34), an area which had previously been allocated to the processing of semantic aspects of a sentence rather than its syntactic aspects (20, 43). These data seem to raise the possibility that task demands can lead to a shift in the activation focus or an additional recruitment of adjacent brain regions within the IFG. A study that directly compared three different tasks (plausibility judgement, sentence verification, and non-word detection) in a within-subject design identified BA 44 as the only region demonstrating a syntactic complexity effect across the different tasks.
Additional regions observed during sentence processing in the verification or plausibility judgement conditions were allocated to ancillary cognitive operations (34). BA 44 was thus taken as the core region of syntactic operations.
In a recent series of studies, it was shown that degraded auditory input can lead to a shift in locus of syntax effects during sentence processing. Across studies, it was found that syntactic phrase structure violations, which were seen to correlate with activation in the frontal operculum under normal auditory input conditions (81), activated BA 44 when normal sentences were presented pseudo-randomly together with auditorily unintelligible sentences (76). For syntactically complex sentences, normally activating BA 44, degradation of the auditory input (intelligibility) caused a shift in the maximum of activation in the IFG towards a more posterior and more superior region (inferior frontal sulcus) (187). This focal shift in activation towards regions that under normal auditory input are responsible for more elaborate syntactic processes (from frontal operculum to BA 44 and from BA 44 to inferior frontal sulcus) has been termed “upstream” delegation and is not only observed in the IFG, but also in the temporal cortex, where activations shift from the anterior and posterior STG/STS towards the auditory cortex (see Fig. 8). Notably, this upstream shift for syntactic processes stands in clear contrast to the effect of auditory degradation (intelligibility) on semantic processes, which leads to a more distributed neural network involving a number of brain regions in addition to those observed under normal auditory input conditions (185).
In sum, it appears that the syntax complexity effect can shift its maximum within the IFG. When semantic processing demands increase due to task or stimulus configurations, more anterior portions of the IFG are recruited. When perceptual processing conditions induce increased demands during syntactic processes, more posterior-superior regions of the IFG towards the IFS are recruited. The data thus point towards a language processing system which allocates different subregions in the perisylvian default language network as needed. For syntactic processing, BA 44 appears to be central, but the involvement of adjacent areas in the IFG is observed as a function of specific processing demands.
A) ARTIFICIAL GRAMMAR LEARNING.
The role of Broca's area as a central region for syntactic processes has also been demonstrated in the context of artificial grammar learning. The idea behind the artificial grammar learning approach is that, in such studies, all crucial input variables can be systematically controlled, allowing language learning to be held constant across subjects.
The role of Broca's area in syntax learning was demonstrated in a study showing that participants were able to learn a novel language whose rules followed the universal principles of natural grammars, but not a language disobeying such rules (178). This study observed an increase in activation over time in left Broca's area (BA 45), and in parts of the right inferior frontal gyrus, thereby providing evidence of the role of Broca's area in the learning of syntactic rules. In another artificial grammar learning experiment, it was shown that during the initial learning phase, activation is low in Broca's area (BA 44), and high in the hippocampus, but during the course of syntax learning in the scanner (during ∼40 min), hippocampal activation decreased and activation of Broca's area systematically increased (191) (see Fig. 9). Looking at artificial grammar learning across the time course of 8 days, Broca's area and in particular BA 45 was found to be sensitive to the classification of grammaticality (62, 201). These studies indicate that the learning of syntactic rules following the universal principles of grammar activates Broca's area. Moreover, recent studies applying transcranial direct current stimulation during artificial grammar learning were able to demonstrate that Broca's area is causally involved in the acquisition and the processing of syntactic knowledge (44, 246).
These findings, in particular those of Musso et al. (178), require a view of Broca's area that goes beyond that of providing “phonological working memory resources” as stated by Rogalsky and Hickok (219), since the learning of both rule types (as in Ref. 178) should require such resources. Rather, the findings point towards the specific role Broca's area plays in learning syntactic rules. In addition, a number of artificial grammar studies with grammar that follows universal grammar principles have found BA 44 activation, in particular for the processing of hierarchical tree structures compared with local sequential dependencies (9, 69, 192). Together, these studies on grammar learning and processing indicate the crucial role of Broca's area in the processing of syntax.
Comparing artificial grammar sequence processing and sentence comprehension directly in an fMRI study, Hoen et al. (121) found large parts of the perisylvian cortex activated. Based on their data, they proposed the following functional subdivision of Broca's area and its adjacent regions in the prefrontal cortex (54, 121): superior and posterior regions (BA 6/9/46/44) are engaged in sequential and structural aspects of processing, whereas anterior and inferior regions (BA 11/47/45) are implicated in context information insertion into structural matrices selected by the upper regions.
D. Localization of Integration: IFG or STG?
Psycholinguistic models of sentence comprehension assume a processing phase during which syntactic and semantic information interact with each other and are integrated to achieve interpretation. Some models hold that the different information types interact at any time during comprehension (163, 169) or after an initial syntactic structure building phase (63, 64). Neuroimaging approaches have discussed two different regions as possible sites where integration takes place. Some researchers (98) assume that the final integration of syntactic and semantic information takes place in the left posterior STG, whereas others (102, 236) assume that unification of different language-relevant information types is located in the left IFG. Interestingly, the crucial neuroimaging studies these proposals are based on all show activation in both the IFG and the STG (for localization, see Fig. 1).
The arguments in favor of the view that the STG is the locus of semantic-syntactic integration come from a cross-study comparison revealing that activation in the STG is observed only for sentences containing semantic information, whereas BA 44 is activated both for syntactic processes in sentences (structural sequences) without semantically meaningful words (9, 190) and in sentences with meaningful words (70, 166). The argument for the IFG as the locus of unification (integration) is based on findings reporting an interaction of semantic and syntactic information in the left IFG (e.g., Refs. 150, 217). The unification approach (102) subdivides the IFG functionally into BA 44/6 supporting phonological processes, BA 44/45 supporting syntactic processes, and BA 45/47 supporting semantic processes, but defines the entire left IFG as the space where unification takes place. Empirical data providing direct evidence of such a unification process are sparse, and a study testing this view directly concludes that language understanding involves a dynamic interplay between the left inferior frontal and the posterior temporal regions (236). The role of the STG and the IFG in the processes of integration or unification, respectively, cannot be ultimately defined on the basis of this study, but some additional specifications emerge from the data available in the literature.
It is clear, however, that the posterior temporal cortex is crucial in binding the verb and its arguments and more generally for integration across domains and that the inferior frontal gyrus supports different language aspects within its subregions (BA 47/45/44). Interactions between semantic aspects and syntax, as seen in studies manipulating semantics by lexical-semantic ambiguity (216), semantic relatedness (184), or semantic constraint due to animacy (34), are located in the more anterior portions of the IFG (BA 47/45), but not in BA 44 (184). From this, we may conclude that the IFG's role as a region combining semantic and syntactic information may be restricted to its more anterior parts.
E. Prosodic Processes
When processing spoken sentences, phonological information in addition to semantic and syntactic information must be processed. We have already discussed acoustic-phonological processes at the segmental level, i.e., phonemes and features of these (see sect. IVA). But the acoustic signal also conveys suprasegmental phonological information, called prosody. Two types of prosodic information are usually distinguished: emotional prosody and linguistic prosody. Emotional prosody is an extralinguistic cue signaling either the speaker's emotional state or emotional aspects of the content conveyed by the speaker. In the context of this review, we will focus on the brain basis of linguistic prosody only.
Prosodic information is mainly encoded in the intonational contour, which signals the separation of constituents (syntactic phrases) in a spoken sentence and the accentuation of (thematically) relevant words in a speech stream. By signaling constituent boundaries, this information becomes most relevant for sentence comprehension and the interpretation of who is doing what to whom. This can be gathered from the example below. In the example, # indicates the prosodic boundary (PB).
1) The man said # the woman is stupid.
2) The man # said the woman # is stupid.
The PBs in these sentences are crucial for the interpretation as they signal the noun phrase to which the attribute “to be stupid” has to be assigned, either to the woman (1) or to the man (2). As the example shows, the prosodic information is relevant for syntactic processes, and there seems to be a close relation between prosody and syntax. Indeed, almost every PB is also a syntactic boundary, while the reverse does not hold.
The brain basis of prosodic information has initially been investigated behaviorally in patients with cortical lesions in the left hemisphere (LH) and the right hemisphere (RH). While some studies came to the conclusion that linguistic prosody is mainly processed in the RH (25, 257), others found that both LH and RH patients showed deficits in processing sentence level prosody (30). However, when segmental information was filtered, thereby increasing the reliance on suprasegmental information, RH patients demonstrated significantly worse performance than LH patients (30). These and other studies (e.g., Ref. 200) suggest a relative involvement of the RH in processing prosodic information. The less segmental information there is available, the more dominant the RH.
Neuroimaging studies provide support for this observation. Processing of pitch information (intonational contour) is correlated with an activation increase in the RH, but can be modulated by task demands (205). An fMRI study that systematically varied the presence/absence of suprasegmental and segmental information reported changes in brain activation in the superior temporal and fronto-opercular cortices of the RH as a function of the presence/absence of pitch information (172, 173) (see Fig. 10). Right dorsolateral prefrontal cortex and right cerebellar activation were also reported for prosodic segmentation during sentence processing (242). A study investigating sentences and word lists both with sentence prosody and word list prosody found bilateral activation in the anterior temporal cortex for syntactic and prosodic information, with the left being more selective for sentence structure (127). In this study no clear RH dominance was found for prosody, but the authors point out that the activation in the right anterior temporal cortex may indicate prosody processing. Together, the studies suggest an involvement of the RH for the processing of intonational (pitch) information during sentence processing, but, in addition, indicate that the actual lateralization partly depends on task demands (90, 205) and on the presence of concurrent segmental information (30, 68).
Moreover, it should be noted that the lateralization of linguistic prosody depends on the particular information prosody encodes in a given language. In tonal languages like, for example, Thai, pitch patterns are used to distinguish lexical meaning. When encoding lexical information, pitch is processed in the LH, similar to lexical information in non-tonal languages (91). From this, it appears that the localization of language in the brain is determined by its function (lexical information) and not its form (pitch information). Only when intonation marks suprasegmental prosody is it localized in the RH.
V. TIME COURSE OF AUDITORY LANGUAGE COMPREHENSION
Language undoubtedly unfolds in time. The data available from the fMRI studies on language processing do not provide sufficient temporal resolution to capture this crucial aspect. The cognitive description of the comprehension process itself has been laid out in the introduction as consisting of several subprocesses that take place in a serial cascading and partly parallel fashion. Three linguistic processing phases have been assumed, and these correlate with functionally distinct components identified in the electrophysiological signal (67). In the last decades, different language-relevant event-related brain potential (ERP) components have been identified: an early left anterior negativity (ELAN) between 120 and 200 ms, taken to reflect initial syntactic structure building processes; a centroparietal negativity between 300 and 500 ms (N400), reflecting semantic processes; and a late centroparietal positivity (P600), taken to reflect late syntactic processes. Moreover, in the time window between 300 and 500 ms, a left anterior negativity (LAN) was observed to syntactic features that mark the grammatical relation between arguments and verb, and this was taken to reflect the assignment of thematic relations (who did what to whom) (see Fig. 11). This led to the formulation of the so-called three-phase model of language comprehension allocating different components in the event-related brain potential to different processes in the comprehension process (67). Modifications of this model have been proposed based on subsequent data (see Refs. 21, 85). The different ERP components are still observed during language processes, but their functional relevance was partly redefined given additional data.
However, to provide a structured view of the crucial ERP components and their functional relevance, this review will start out with the originally observed language ERP components leading to the three-phase model of language comprehension (67), and crucial modifications will be added on the fly.
Before discussing these ERP components relevant for sentence-level processes, however, we will briefly review ERP effects reported for acoustic-phonological processes.
A. Acoustic-Phonological Processes (N100)
The first ERP effect correlating with the identification of phonemes is the N100, a negativity around 100 ms after stimulus onset (188). This ERP component is not specific to language, but reflects the discrimination of auditory categories and can thus be used to investigate aspects of vowel category perception. The same holds for the mismatch negativity (MMN), an ERP component occurring shortly after 100 ms, which has been shown to reflect the discrimination of acoustic and phoneme categories (180). Studies investigating phoneme perception have used single phonemes or syllables as stimulus material (for a review, see Refs. 179, 202, 260). The different studies indicate language-specific representations at the phoneme and syllable level (45, 203). The latter MMN study on the co-occurrence of phonemes in word-like items (pseudowords) compared French and Japanese listeners. The data suggest that during speech processing, the input signal is directly parsed into the language-specific phonological format of the native language.
The N100 and the MMN have been located in or in the vicinity of the auditory cortex (51, 206), thereby indicating that these processes take place early during speech perception in this region. This is compatible with results from neuroimaging studies on phoneme processing that localize the N100 for vowels and consonants in the HG and PT (186, 188, 233). It has been proposed that the fast computation of a phonological representation from the speech input facilitates lexical access (45) and access to syntactically relevant morphological information.
B. Initial Syntactic Processes (ELAN)
The first sentence-level ERP component is the ELAN, correlating with the identification of the syntactic category of a word (e.g., verb, noun, preposition, etc.) occurring in response to a word category violation 120–200 ms after word onset or after the part of the word which provides the word category information (e.g., the inflection as in refine versus refinement) (80, 129, 145, 183; for a recent review, see Ref. 85). Based on this word category information, the initial local phrase structure can be built (e.g., verb phrase, noun phrase, prepositional phrase). These phrases are the building blocks for larger sentence structures. Within the three-phase model of language comprehension (67), this initial processing phase constitutes phase 1. The initial build up of local phrase structure has been shown to be highly automatic as it is independent of attentional processes (107) and independent of task demands (106). The earliness of the component was attributed to the ease with which word category information can be extracted from the stimulus, be it due to the word's shortness (e.g., function word, as in Ref. 183), its morphological markedness (e.g., inflection, as in Ref. 53), or, more generally, its atypical form properties (52). As this component has been reported mostly for connected speech (but see Refs. 53, 183; and for other studies, see the review of Ref. 85), the question has been asked to what extent this component might reflect prosodic aspects. However, it has been shown that prosodic violations elicit a different component (right hemispheric anterior negativity) (57) and that changes in the prosodic contour cannot account for the early syntactic effect (113).
A key question is where in the brain this initial process takes place. One way to localize language processes on-line is to use MEG, as it provides the possibility for a good topographic resolution (depending on the number of channels), although the method inherently has to deal with the so-called inverse problem (i.e., calculation of the neural generator based on scalp distribution data). Another approach is EEG in patients with circumscribed brain lesions, applying an ERP design known to elicit certain language-related components. With the use of the latter approach, it was found that the ELAN component is absent in patients with left frontal cortical lesions (including left basal ganglia lesions), but present in patients only suffering from left basal ganglia lesions, indicating that the left frontal cortex plays a crucial role in the generation of the ELAN (82). The ELAN is also affected in patients with lesions in the left anterior temporal lobe, but not in patients with lesions in the right temporal lobe, suggesting that the left frontal and left anterior temporal cortex are involved in early structure building processes as reflected in the ELAN (75). With the use of MEG, the ELAN effect has been localized in the anterior temporal cortex and the inferior frontal cortex (84, 142) or solely in the temporal cortex (115) for auditory language experiments. More fine-grained analyses revealed syntactic effects in the temporal cortex already during the first 200 ms after stimulus onset just anterior to the primary auditory cortex, i.e., in the anterior STG (114), but not in the primary auditory cortex itself (see Fig. 12).
For visual experiments, the syntactic violation effect present around 100 ms was localized in the visual cortex, at least for sentences in which the word category information was morphologically marked (53). These data have raised the possibility that clearly marked syntactic word category violations may be detected in the sensory cortices (53, 115) or in their direct vicinity (114). The speed of this process may be surprising. However, the process of building up a local structure such as noun phrase (determiner plus noun) or a prepositional phrase (preposition plus noun phrase) on the basis of word category information could be performed quickly once the possible minimal local structures in a given language are learned. Once learned, this process could be viewed as a fast template-matching process taking place early in comprehension (21). During this process, templates of local phrase structures are activated (e.g., a preposition would activate a template of a prepositional phrase), against which the incoming information is checked. If this information does not match the template, a phrase structure violation is detected and further processes are not syntactically licensed.
This would predict that sentences containing both a phrase structure violation and semantic violation should elicit only an ELAN, but no semantic effect. It has been shown that this is indeed the case when combining a phrase structure violation with a semantic violation (72, 106), and that it even holds when combining it with a violation of the verb-argument information (86). These data indicate that syntactic phrase structure violations are processed prior to semantic information and can block higher-level processes, thereby providing strong evidence for models assuming a crucial initial syntactic processing phase (21, 67). Although this conclusion was called into question on the basis of an experiment using Dutch language material (249), the data from this study do not speak against the model's assumption. This is because in this study the syntactic category information of the critical word (given in the word's suffix) only became available after the semantic information (given in the word stem). A review of the literature on the timing of syntactic information and semantic information across the different languages reveals that the absolute timing of the syntax-initial and other processes may vary, but that the order of these processes in time is fixed across the different languages with syntactic word category information being processed first (85).
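The template-matching account and its blocking prediction can be illustrated with a minimal sketch. The template table, the category labels, and the function names below are hypothetical and chosen only for illustration; they are not taken from the models cited (21, 67). The sketch assumes that each word category activates a template listing the categories that may follow it, and that semantic checks run only when the local structure is licensed.

```python
# Hypothetical templates: a word category activates a template listing the
# categories that may follow it within a local phrase (illustrative only).
TEMPLATES = {
    "DET": {"NOUN", "ADJ"},
    "ADJ": {"NOUN", "ADJ"},
    "PREP": {"DET", "NOUN", "ADJ"},
}


def first_phrase_structure_violation(categories):
    """Return the index of the first word that mismatches the active template, or None."""
    for i in range(1, len(categories)):
        licensed = TEMPLATES.get(categories[i - 1])
        if licensed is not None and categories[i] not in licensed:
            return i
    return None


def process(words):
    """Check a list of (form, category, semantically_fits) tuples.

    Phase 1 checks local phrase structure. Semantic checks are performed
    only if no phrase structure violation was found (the assumed blocking).
    """
    categories = [category for _, category, _ in words]
    violation = first_phrase_structure_violation(categories)
    if violation is not None:
        # Further processes are not syntactically licensed.
        return {"syntactic_violation_at": violation, "semantic_violation_at": None}
    semantic = next((i for i, (_, _, fits) in enumerate(words) if not fits), None)
    return {"syntactic_violation_at": None, "semantic_violation_at": semantic}
```

Under these assumptions, an input with both violation types reports only the syntactic violation, mirroring the finding that combined violations elicit an ELAN but no semantic effect.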
C. Computation of Syntactic and Semantic Relations (LAN/N400)
A crucial part in the process of sentence comprehension is the assignment of grammatical relations. To understand who is doing what to whom, semantic features (e.g., animacy) as well as syntactic features (e.g., subject-verb agreement, case marking, etc.) have to be processed. Neurolinguistic models assume that these processes take place after initial structure building. In the three-phase model of language comprehension of Friederici (67) this constitutes phase 2 (see Fig. 11). Bornkessel and Schlesewsky (21) subdivide this phase 2 into two phases: phase 2a, during which relevant features are extracted, and phase 2b, during which computation takes place. In their review they interpret different ERP effects observed to different linguistic aspects as investigated in various languages in detail (21).
For the purpose of this review, we will summarize the major findings observed across different languages, with a focus on two ERP components often reported in the literature, i.e., the LAN found for syntactic and the N400 found for semantic-thematic processes. Languages differ as to whether they have a fixed word order, like English, for example, or a free word order, such as German or Japanese. To identify who is the subject of a sentence, the strategy in a language with fixed word order is to rely on positional information (e.g., the first noun phrase is likely to be the actor). However, in a language with free word order, morphosyntactic features must be considered. Subject-verb number agreement [plural (PL) versus singular (SG)] determines who is the subject of the action, but assignment is only possible if subject and object noun differ in number marking as in sentence 1, but not if the two noun phrases in a sentence carry the same number marking as in 2.
1) Die Männer [PL] grüßt [SG] der Junge [SG].
The men greet the boy [Actor] [literal].
The boy greets the men [nonliteral].
2) Die Männer [PL] grüßen [PL] die Jungen [PL].
The men greet the boys.
In an ambiguous situation as in 2, a subject-first strategy is applied, taking the first noun as the actor. Case marking is an additional feature that can help to resolve ambiguity. There are a number of languages in which thematic roles (actor, patient, etc.) can be determined by case [nominative (NOM) assigns the actor, accusative (ACC) assigns the patient, etc.], thereby allowing the assignment of who is doing what to whom as in 3, in which the boy is the actor.
3) Den Mann [SG, ACC] grüßt der Junge [SG, NOM].
The man greets the boy [Actor] [literal].
The boy greets the man [nonliteral].
If morphosyntactic cues are not available or are ambiguous as in 2, the system might rely on a simple subject-first word order strategy, or it might consider semantic features, such as animacy. Since the prototypical actor is animate, this information may help to assign the role of the actor, but not always (e.g., in the sentence “The tree hit the man when falling,” the animacy-strategy could lead to an initial misassignment of the tree's role, as tree is an inanimate noun). Nevertheless, the parsing system has to assign thematic roles on-line as the sentence is perceived in order to keep the working memory demands low, even if initial assignments must be reanalyzed later in the sentence.
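The cues discussed above (case marking, number agreement, animacy, and word order) can be combined in a small sketch of actor assignment for sentences 1 to 3. The function name, the dictionary keys, and in particular the order in which the cues are consulted are assumptions made for illustration; the text does not commit to a fixed ranking of cues.

```python
def assign_actor(np1, np2, verb_number):
    """Pick the actor among two noun phrases and name the cue that decided.

    Each noun phrase is a dict with 'number' ('SG' or 'PL'), 'case'
    ('NOM', 'ACC', or None if unmarked), and 'animate' (bool).
    The cue order below is an illustrative assumption.
    """
    phrases = (np1, np2)

    # Cue 1: unambiguous nominative case marks the actor.
    nominative = [i for i, phrase in enumerate(phrases) if phrase.get("case") == "NOM"]
    if len(nominative) == 1:
        return nominative[0], "case"

    # Cue 2: subject-verb number agreement, usable only if the phrases differ in number.
    if np1["number"] != np2["number"]:
        agreeing = [i for i, phrase in enumerate(phrases) if phrase["number"] == verb_number]
        if len(agreeing) == 1:
            return agreeing[0], "agreement"

    # Cue 3: animacy, usable only if exactly one phrase is animate.
    animate = [i for i, phrase in enumerate(phrases) if phrase["animate"]]
    if len(animate) == 1:
        return animate[0], "animacy"

    # Fallback: subject-first strategy.
    return 0, "subject-first"


# Sentence 1: Die Männer [PL] grüßt [SG] der Junge [SG].
sentence1 = assign_actor(
    {"number": "PL", "case": None, "animate": True},
    {"number": "SG", "case": None, "animate": True},
    "SG",
)
# Sentence 2: Die Männer [PL] grüßen [PL] die Jungen [PL].
sentence2 = assign_actor(
    {"number": "PL", "case": None, "animate": True},
    {"number": "PL", "case": None, "animate": True},
    "PL",
)
# Sentence 3: Den Mann [SG, ACC] grüßt der Junge [SG, NOM].
sentence3 = assign_actor(
    {"number": "SG", "case": "ACC", "animate": True},
    {"number": "SG", "case": "NOM", "animate": True},
    "SG",
)
```

With these inputs, the second noun phrase is chosen by agreement in sentence 1 and by case in sentence 3, whereas sentence 2 falls back on the subject-first strategy. Any initial assignment may still have to be reanalyzed later in the sentence.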
1. Processing semantic and verb-argument relations
Sentence understanding crucially depends on extraction of the sentence's meaning, that is on the meaning of different words and the relation between them. Since the first ERP paper on language processing (152), a specific ERP component has been correlated with the processing of semantic information. This ERP component is a centro-parietal negativity around 400 ms, called N400. An almost uncountable number of papers have been published on semantic processes both at the word level and sentence level across different languages (for recent reviews, see Refs. 153, 156). The N400 is interpreted as reflecting difficulty of lexical-semantic integration, as its amplitude is known to increase 1) when a word does not have a lexical status (i.e., a non-word or a pseudoword); 2) when the second word of a word pair does not fit the first word semantically, and in a sentence 3) when the selectional restriction of verb-argument relations is violated; 4) when a word does not fit the preceding sentence context with respect to world knowledge or is, moreover, simply unexpected; and 5) its amplitude is known to decrease for words as the sentence unrolls due to increased predictability of the upcoming word. Thus the N400 is an indicator of 1) lexical processes, 2) lexical-semantic processes, 3) semantic contextual predictability, and 4) predictability due to world knowledge. Therefore, it reflects processes relevant to language comprehension at different levels, but not only those that are language internal but also those that concern world knowledge (105). The present review, however, will focus on the language internal level.
At this level, the N400 is correlated with semantic information carried by nouns and adjectives, and also with information represented verb-internally. This information is quite complex and partly concerns the semantic domain (i.e., selectional restriction information) and partly the syntactic domain (i.e., number and type of arguments). Selectional restriction information of the verb indicates which theoretically defined semantic features the related noun argument(s) must have. For example, the verb “drink” requires the noun to have the feature of “liquid,” as in “drink the wine” and not “drink the chair.” For the latter type of combination, an N400 is observed at the violating noun (during reading, Ref. 152; and during listening, Refs. 80, 123). Most interestingly, in one of the more recent studies, it has been shown that the amplitude of the N400 increases systematically as a function of the number of semantic features violating the relation between the verb and its noun argument (159) (see Fig. 13). This is a strong demonstration of the N400's modulation by theoretically defined semantic aspects of a word.
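The notion of counting violated semantic features can be made concrete with a small sketch. Only the feature “liquid” for the verb “drink” comes from the example above; the second feature (“ingestible”), the noun “paint,” and all names in the code are hypothetical additions for illustration. The sketch counts feature mismatches and does not model ERP amplitudes.

```python
# Hypothetical selectional restrictions a verb places on its object noun.
# Only "liquid" is taken from the text; "ingestible" is an illustrative addition.
VERB_REQUIREMENTS = {
    "drink": {"liquid": True, "ingestible": True},
}

# Hypothetical feature bundles for a few nouns.
NOUN_FEATURES = {
    "wine": {"liquid": True, "ingestible": True},
    "paint": {"liquid": True, "ingestible": False},
    "chair": {"liquid": False, "ingestible": False},
}


def violated_features(verb, noun):
    """Return the sorted names of the required features that the noun does not satisfy."""
    required = VERB_REQUIREMENTS[verb]
    actual = NOUN_FEATURES[noun]
    return sorted(name for name, value in required.items() if actual.get(name) != value)
```

In this toy lexicon, “drink the wine” violates no feature, “drink the paint” violates one, and “drink the chair” violates two, which is the kind of graded manipulation the cited study (159) relates to N400 amplitude.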
The N400 has also been observed in the verb's syntax-related domain when it comes to processing the information of how many arguments a verb can take. For example, linguistic theory defines that the verb “cry” only takes one argument, “she cries,” whereas the verb “give” takes three arguments, “she gave a letter to Peter.” Moreover, the verb encodes the type of the arguments (subject, direct object, indirect object), which in some languages is marked by position in the sentence (word order) and in other languages by case (inflection or preposition), e.g., “to Peter.” Violations of the number of arguments and types of arguments (incorrect case marking) in a sentence lead to an N400 followed by a late positivity (71, 86, 88). Thus the ERP response to violations of number and type of arguments (syntactic domain) differs from that to violations of selectional restrictions (semantic domain), as the former is reflected in a biphasic N400/P600 pattern, whereas the latter is reflected in an N400. (For variations of the particular realization of argument-related negativity as a function of different language typologies, see Ref. 21.) Thus semantic and thematic processes during language comprehension are correlated with the N400 across different languages.
This leads to the question of where in the brain these processes take place. There are a number of MEG studies, both at the word and sentential level, that have tried to localize the semantic N400 effect. The main generators of the N400 during speech processing have been located in the vicinity of the auditory cortex (108, 112), sometimes with an additional generator in the inferior frontal cortex (164). FMRI experiments using the same stimulus material as used in an ERP experiment on the processing of selectional restriction violations (106) revealed activation mainly in the STG (mid portion and posterior portion) (27, 81). The number and type of verb-argument relations eliciting an N400 have not been investigated using the same material in ERP and fMRI experiments. However, fMRI experiments on this issue suggest an involvement of the left posterior STG in addition to the IFG (22). Further research must show whether the N400 observed in response to semantic information and the N400 in the N400/P600 pattern found for the syntax-related information in the verb are a unitary component or whether the N400 differs as a function of information type.
2. Processing grammatical relations
In parallel to the processing of semantic and verb-argument information, morphosyntactic information provided by the verb's inflection (number and person) is most relevant for sentence comprehension, as it is essential for the assignment of grammatical roles in a sentence. While this information is less important for sentence interpretation in languages with fixed word order, it is crucial for languages with free word order (compare sect. IVC).
Violations of subject-verb agreement (singular versus plural) in an inflecting language usually induce an LAN between 300 and 500 ms (German, Ref. 199; Italian, Ref. 7; Spanish, Ref. 234). In a fixed word order language such as English, an LAN is found less systematically (LAN in Ref. 195, but not in Refs. 151, 196). It has been argued that the presence/absence of the LAN should be viewed as a continuum across different languages, and the likelihood of observing this effect increases with the amount of morphosyntactic marking in a given language (85).
However, it is not the pure amount of morphosyntactic marking that determines the presence of the LAN, but whether this information is crucial for the assignment of syntactic roles. In some languages, determiner-noun agreement with respect to gender (masculine, feminine, neuter) is crucial, and in others it is not. If this information is not crucial for the assignment of grammatical relations between a verb and its arguments in sentences, a violation of gender agreement between determiner and noun does not lead to a strong LAN effect. However, once gender agreement is relevant for the assignment of grammatical roles, as in Hebrew, in which there is gender agreement between subject noun and verb, the LAN is clearly present (50). Thus, whenever morphosyntactic marking is crucial for the assignment of grammatical relations in a sentence, an LAN is observed.
D. Integration and Interpretation (P600)
Models on the time course of language processes have assumed a late processing phase during which different information types are mapped onto each other to achieve interpretation (21, 67, 75). Friederici (67) proposed that this last phase (phase 3) represents a phase during which processes of syntactic reanalysis and repair take place and that these processes are reflected in a late centro-parietal positivity, called P600. This component, first observed for the processing of syntactic anomalies (193), was found for the processing of temporarily ambiguous sentences at the point of disambiguation when reanalysis was necessary (194), and also after a syntactic violation requiring repair (104), and sometimes as part of a biphasic ELAN/P600 pattern (80, 107). A direct comparison of the P600 topography in both instances revealed a differential pattern of distribution with a more fronto-central distribution for the reanalysis P600 and a centro-parietal distribution for the repair P600 (74) (see Fig. 14).
The functional interpretation of the P600 has changed to some degree over the past years. Initially, it was taken to reflect syntactic processes in general (104), processes of syntactic reanalysis and repair (73), or the difficulty of syntactic integration (136). However, later studies found the P600 to vary not only as a function of syntactic variables, but also to reflect the interaction of syntactic and semantic anomaly at the sentence level (100, 147, 148), suggesting that the P600 might reflect sentence-level integration processes of syntactic and semantic information. More recently, the status of the P600 as reflecting integration processes involving syntactic aspects was challenged by studies reporting P600 effects for sentence-level semantic violations (120, 139, 144, 149). For example, sentences like “The hearty meal was devouring” led to a P600 (139). Different explanations were put forward for “semantic P600” effects: 1) plausibility/semantic attraction between the verb and an argument (139), 2) thematic processing cost (120), and 3) interaction of thematic and semantic memory (149). Interestingly, all these different interpretations concern aspects of thematic role assignment in sentences and can be explained in an existing linguistically based processing model (23).
The brain basis of P600 effects is still unclear, as the P600 has not been localized using time-sensitive neuroimaging measures with the exception of a few MEG studies (154, 231). These MEG studies localized the P600 in the middle temporal gyrus and the posterior portion of the temporal cortex. Moreover, there is some indication that the basal ganglia are part of the circuit supporting processes reflected in the syntax-related P600, since patients with lesions in the basal ganglia show reduced P600 amplitudes (82, 87). An involvement of the basal ganglia in syntactic processes has also been proposed in the model by Ullman (247), although not specifically for the late processing phase reflected by the P600. The localization of the P600 with fMRI is difficult, as the P600 often occurs in close temporal proximity to the LAN or N400 and is thus difficult to separate from these effects. At present, therefore, the neural basis underlying the P600 effect has not yet been specified in much detail.
In summary, the data available on the neurotemporal dynamics of language comprehension can be described as follows. Language comprehension is incremental and takes place in three sequential phases. In an initial phase (phase 1), a phrase structure is built on the basis of word category information. This process is highly automatic, independent of semantic and verb argument information, and independent of task demands. The process involves a portion of the left STG immediately anterior to the primary auditory cortex, possibly connecting to the frontal operculum located ventrally to Broca's area. During a second phase (phase 2), the relation between the verb and its arguments is computed to assign the thematic roles in a sentence. Morphosyntactic information (subject-verb agreement, LAN), case information (LAN or N400, depending on the particular language), and lexical selectional restriction information (N400) are taken into consideration to assign the relations between the different elements in a sentence. The on-line assignment of semantic relations mainly appears to involve the mid and posterior portion of the temporal cortex. Processes of subject-verb agreement have not been clearly localized, but the distribution of the LAN suggests an involvement of the left frontal cortex. During a last phase (phase 3), the final interpretation takes place, with semantic and syntactic information being taken into account and mapped onto world knowledge. At the linguistic level, the difficulty of integrating syntactic and semantic information and the need for reanalysis are reflected in a P600. The difficulty of mapping linguistic information onto world knowledge also appears to elicit a P600 effect. At the moment, it remains open whether these two P600 effects are members of the same family of ERP components.
E. Prosodic Processes (CPS)
The processing of auditorily presented sentences requires not only the processing of semantic and syntactic information but also the processing of prosodic information. The first electrophysiological correlate for the processing of sentence-level prosodic information was found in a study that recorded the EEG during the processing of German sentences which contained either one intonational phrase boundary (IPB) or two. At the IPB, the ERPs revealed a centro-parietally distributed positive shift that was called the closure positive shift (CPS) since the IPB indicates the closure of a phrase (238) (see Fig. 15). This effect was replicated in studies of other languages, namely, Dutch (19, 137), Japanese (262), Chinese (160), and English (130). Crucially, it was shown that the CPS is not triggered by the pause at the IPB per se, but that the two other parameters signaling the IPB, namely, the pitch change and the lengthening of the syllable prior to the pause, are sufficient to evoke boundary perception. This was evidenced in an experiment in which the pause at the IPB was deleted (238).
Interestingly, the latter does not hold for young children. In infants and toddlers, a boundary response is not elicited when the pause is deleted, but only when the pause is present (167, 168). However, in older children, who show a CPS as boundary response once sufficient syntactic knowledge is acquired, pitch information and syllable lengthening alone can trigger a CPS, just as in adults (167). This suggests that the pause initially serves as a relevant cue to structure the speech input, but that it is not needed for intonational phrasing once sufficient knowledge about prosodic and syntactic structure is acquired. Additional experiments with adults showed that the CPS can also be elicited when only prosodic information of a sentence is delivered (i.e., when segmental information is deleted); under this condition the CPS is lateralized to the RH (197). Moreover, the CPS is also reported during sentence reading, where it is triggered by the comma indicating the syntactic phrase boundary (138, 237, 239). Thus the CPS can be viewed as an ERP component that correlates with prosodic phrasing both when realized overtly in the speech stream and when realized covertly in written sentences.
F. Interaction of Syntax and Prosody
Syntax and prosody are known to interact during language comprehension as indicated by behavioral studies on syntactic ambiguity resolution (170, 254). The studies cited in section IV indicate that syntax is mainly processed in the LH and prosody as such mainly in the RH. The two hemispheres are neuroanatomically connected via the corpus callosum (122, 124). If the above view about the functional role of the LH and RH in language processing is valid, any interaction between syntactic (LH) and prosodic (RH) information should be affected by a lesion to the corpus callosum (CC).
The prosody-syntax interaction may take place during different processing phases: 1) during the initial phase of phrase structure building, since the end of a syntactic phrase is marked prosodically, and/or 2) during the second processing phase, in which the verb argument structure is processed, since the constituent structure is also prosodically marked. In the following, we will take up these issues in turn.
1. Prosody-syntax interaction during phrase structure building
ERP studies have reported a right anterior negativity for prosodic violations in sentences in which, for example, phrase-final prosodic information was presented at a nonfinal position. These types of prosodic violations were shown to interact with syntactic phrase structure violations (57). Patients with lesions in the posterior portion of the CC did not show such an interaction effect, although they exhibited prosody-independent syntactic processing (224). These data indicate that the CC provides the brain basis for the integration of local syntactic and prosodic features during auditory speech comprehension, as it connects the brain regions in which syntax and prosody are computed.
2. Prosody-syntax interaction during verb argument structure assignment
An interaction of prosodic and syntactic information is also observed when it comes to assigning relations between a verb and its arguments. Consider, for example, the following prosodically correct German sentences: in 1) “Anna” is the object of “promise,” and in 2) “Anna” is the object of “help” (the relation between the verb and its object noun phrase is marked by the arrow):
1) Peter verspricht Anna zu arbeiten.
Peter promises Anna to work.
2) Peter verspricht # Anna zu helfen.
Peter promises # to help Anna.
Due to German word order, the two sentences appear identical up to the word “zu,” but their syntactic structure is marked differently by intonation in speech (indicated by # marking the IPB). The prosodically correct sentence 1 becomes prosodically incorrect, as in 3, when the IPB is inserted after the verb as in 2. In 3, the prosodic information signals that “Anna” is the object of the following verb “arbeiten,” but the verb arbeiten/work cannot take a direct object.
3) Peter verspricht # Anna zu arbeiten.
Peter promises # Anna to work.
With the use of such prosodically incorrect sentences, it was demonstrated that prosody guides syntactic parsing (238). This was evidenced by an ERP effect at “zu arbeiten” in the prosodically incorrect sentence 3. Based on the prosodic information, the parsing system expects a transitive verb (such as “help” as in 2), but it receives an intransitive verb (namely, “work” as in 1). This unexpected verb form leads to a mismatch effect in the ERP, namely, an N400/P600 pattern, with the N400 reflecting a reaction to the unexpected verb and the P600 reflecting processes of reanalysis. This functional interpretation of the two ERP components was supported by experiments which, in contrast to the original experiment (238), did not use a grammaticality judgment task. Without such a task (passive listening), only an N400 was observed at the critical verb, reflecting simply the unexpectedness of the verb (19, 83) (see Fig. 16).
To test the hypothesis that the prosody-syntax interaction is based on the information exchange of the LH and the RH, sentences 1–3 were presented to patients with lesions in the CC. With the application of the passive listening paradigm, a prosody-syntax mismatch effect (N400) was observed in healthy controls and in patients with lesions in the anterior CC, but not in patients with lesions in the posterior CC (83) (see Fig. 16). This finding provides clear evidence for the view that the interaction of prosody and syntax relies on communication between the LH and the RH supported by the posterior portion of the CC through which the temporal cortices of the left and the right hemisphere are connected.
VI. LANGUAGE FUNCTION: BINDING SPATIALLY DISTRIBUTED NEURAL ACTIVITY IN TIME
A. The Model
The review has specified the brain areas in the temporal and inferior frontal cortex supporting different aspects of language processing, for example, phonetic, syntactic, sentence-level semantic, and prosodic processes (compare Fig. 11; and for anatomical details, see Figs. 1 and 3). Acoustic-phonological processes taking place during the first 100 ms after acoustic stimulation crucially involve the primary auditory cortex (PAC) and the planum temporale (PT). From these regions, the information is delivered to the anterior and the posterior STG and STS, with the left anterior STS reacting generally as a function of the intelligibility of the stimulus. The anterior STG, together with the left frontal operculum connected via a ventral pathway through the uncinate fasciculus (ventral pathway II), is seen as a possible neural network for initial local structure building processes taking place between 120 and 200 ms (see footnote 5). Semantic and syntactic relations in a sentence are processed between 300 and 500 ms after the stimulus onset, possibly in parallel systems, activating separable left-lateralized temporo-frontal networks. The semantic network involves the middle and posterior STG/MTG (sometimes extending into the anterior temporal cortex) and BA 45 (and BA 47) in the frontal cortex connected via another ventral pathway (ventral pathway I) through the extreme capsule fiber system (ECFS) (see footnote 6), whereas the syntactic network dealing with complex sentence structures involves the posterior STG/STS and BA 44 in the frontal cortex connected via a dorsal pathway (dorsal pathway II). Note that dorsal pathway I connecting the temporal cortex to the premotor cortex is supposed to support sensory-to-motor mappings. Syntactic and semantic integration processes take place ∼600 ms after the stimulus input and beyond, possibly under the involvement of the posterior STG/STS and the basal ganglia.
The processing of suprasegmental prosodic information is supported by the right hemisphere in close interaction with the left hemisphere through the posterior portion of the CC, the structure which connects the temporal cortices of the two hemispheres.
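The time course of the model can be restated compactly as a lookup table. The following Python sketch is only an illustration of the windows named above, not part of the model itself. The names `Phase`, `PHASES`, and `phase_at` are hypothetical, the window bounds are treated as inclusive, and the 1,000-ms upper bound for the integration phase is an arbitrary assumption, since the text only states "∼600 ms and beyond".

```python
from typing import NamedTuple, Optional, Tuple


class Phase(NamedTuple):
    label: str
    start_ms: float
    end_ms: float
    regions: Tuple[str, ...]


# Windows and regions follow the model summary in the text.
# ASSUMPTION: the integration phase is open-ended in the text
# ("~600 ms and beyond"); 1000 ms is an arbitrary illustrative cutoff.
PHASES = (
    Phase("acoustic-phonological analysis", 0, 100, ("PAC", "PT")),
    Phase("initial local structure building", 120, 200,
          ("anterior STG", "left frontal operculum")),
    Phase("semantic and syntactic relations", 300, 500,
          ("middle/posterior STG/MTG", "BA 45", "BA 47",
           "posterior STG/STS", "BA 44")),
    Phase("syntactic and semantic integration", 600, 1000,
          ("posterior STG/STS", "basal ganglia")),
)


def phase_at(latency_ms: float) -> Optional[Phase]:
    """Return the phase whose window contains the latency, else None."""
    for phase in PHASES:
        if phase.start_ms <= latency_ms <= phase.end_ms:
            return phase
    # Latencies between the named windows are left unassigned on purpose.
    return None


if __name__ == "__main__":
    for latency in (50, 150, 250, 400, 650):
        phase = phase_at(latency)
        print(latency, phase.label if phase else "no window named in the model")
```

Latencies that fall between the named windows return `None` rather than being forced into a neighboring phase, because the model summary does not assign those intervals to a process.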
B. Caveats and Open Issues
The model presented is based on empirical data, but, like any model, it is subject to change on the basis of new data. Moreover, it should be kept in mind that a model tries to cover most of the data in the literature, but certainly cannot include each and every data point published. A model is always a generalization.
With this in mind, we should now briefly consider the weaknesses all such models might include.
1. Neuroanatomic variability
The model is based on data from imaging studies, which usually present group data that are averaged over a group of subjects (usually using spatial smoothing algorithms) and mapped onto a standard brain. We know, however, that the neuroanatomic variability between subjects is quite considerable (4, 5, 268). Different approaches have been proposed to deal with this problem. One way is to discuss the observed group activation in terms of the probability that it falls within one or another cytoarchitectonically defined area. Such probabilities have been calculated on the basis of 10 brains analyzed post mortem in the “Jülich maps” (Ref. 5; compare sect. IIIA). This approach has already been successfully applied in language-related studies (9, 166, 225, 226). A second approach would be to calculate connectivity-based parcellations for each individual (8) and localize the language-related activation according to this parcellation. So far, no study using this approach has been published, but there is work in progress (Amunts, Tittgemeyer, and Friederici, unpublished data). As a third approach, the use of a functional localizer task has been proposed, and, in the case of language studies, this would be a particular language task (59). The idea is that a localizer task reliably activates locations across individuals, which can then be taken as the “same” functional region in different brains [see Grodzinsky (97) and Fedorenko and Kanwisher (59) for a discussion of this approach]. Such a localizer task has been applied in a recent study to define a particular region for a region-of-interest analysis (218), and there is work in progress applying this approach more broadly to language studies (59).
Thus several methodological approaches are being developed to address the variability in neuroanatomy and thereby the functional neuroanatomy for a particular language function. This is of particular importance when trying to specify a fine-grained distinction in adjacent areas, such as activation in BA 44 versus BA 45 or activation in the frontal operculum versus the anterior insula.
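The first of these approaches, the probabilistic assignment of an activation to a cytoarchitectonic area, can be illustrated with a minimal Python sketch. It assumes that the probability for an area at a given location is the fraction of post mortem brains in which that location was labeled as belonging to the area. The function name `area_probabilities` and the labels below are hypothetical and invented purely for illustration; they are not taken from the Jülich maps.

```python
from collections import Counter
from typing import Dict, Sequence


def area_probabilities(labels_per_brain: Sequence[str]) -> Dict[str, float]:
    """Fraction of brains that assign one location to each area label."""
    if not labels_per_brain:
        raise ValueError("at least one brain is required")
    counts = Counter(labels_per_brain)
    n_brains = len(labels_per_brain)
    return {area: count / n_brains for area, count in counts.items()}


# HYPOTHETICAL labels for a single location across 10 post mortem brains.
voxel_labels = ["BA 44"] * 6 + ["BA 45"] * 3 + ["unassigned"]

probs = area_probabilities(voxel_labels)
most_likely = max(probs, key=probs.get)

if __name__ == "__main__":
    print(probs)
    print("most likely area:", most_likely)
```

Reporting the full set of probabilities rather than only the most likely label keeps the uncertainty visible, which matters most for adjacent areas such as BA 44 and BA 45.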
2. Cross-linguistic variability
Another critical issue to be considered is to what extent a functional neuroanatomic model of language processing based on data mostly from English, German, Dutch, Hebrew and, in a few cases, Japanese and Thai can be taken to be valid in general. There is a “yes” and a “no” answer to this question. The affirmative answer is based on the finding that, across different languages, it is the particular language function that determines particular activation patterns and not its form. This is evidenced by the finding that syntactic processes in fixed word order languages such as English and Dutch as well as in free word order languages such as German and Hebrew all show activation in Broca's area (Table 4), and moreover, by the finding that prosody (normally processed in the RH) is processed in the LH when signaling a lexical function. The more negative answer to the generality question is that there are certain neurocognitive processing differences observable in the language-related ERP patterns, in particular when investigating the different cues used to assign thematic roles in a sentence. This issue has been taken up by a recent neurotypological approach describing the brain basis of language processing (24).
3. Domain specificity
Domain specificity is a significant issue in the discussion of a functional neuroanatomic model of language. The present model, as well as an earlier version of it (67, 68), relates a particular function to a particular brain region within the language system, leaving open the option that this same brain region serves another function in a domain other than language. The function that the same region supports in the other domain may either be closely related, as, for example, the syntactic function of Broca's area in language and music (165), or it may not be that similar, as, for example, the role of Broca's area in language and in processing simple chunks in goal-directed actions (143). The ongoing discussion about the specificity of a particular area, be it the posterior STG or Broca's area (see Refs. 99, 111), is hard to resolve given the data available.
In this article, we described the function of a given brain area within the language processing domain. Taking a more general perspective, we suggest that a given area, for example, Broca's area, receives its particular domain-specific function as part of a particular domain-specific network which, for the language domain, involves the posterior STG and which, for the action domain, involves the parietal cortex (128). Thus the function of an area should always be considered within a neural network of which it is a part.
Future work will have to deal with these open issues, not only to allow a more detailed description of the brain basis of language, but also to clarify the function of certain brain regions in the concert of cognitive functions.
The research reported here was partly supported by The German Ministry of Education and Research (BMBF; Grant 01GW0773).
No conflicts of interest, financial or otherwise, are declared by the author.
I thank the Center for Advanced Study in the Behavioral Sciences at Stanford University for providing the tranquility that allowed me to write the article. I am most thankful to Melanie Trümper and Margund Greiner for their patience in going with me through several versions of the text and reference lists, and to Andrea Gast-Sandmann for designing the figures. I also thank Jonas Obleser for his input on early auditory processes. Michiru Makuuchi and Emiliano Zaccarella were of great help in constructing tables and related figures. Further thanks to Jens Brauer, Claudia Männel, Jutta Mueller, Lars Meyer, Björn Herrmann, Daniela Sammler, Sarah Gierhan, and Jürgen Weissenborn for their helpful comments and suggestions. I thank Rosie Wallis for her careful and thoughtful English editing activities. Finally, I am grateful to Cathy Price and one anonymous reviewer for the helpful comments on the manuscript.
Address for reprint requests and other correspondence: A. D. Friederici, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstraße 1A, 04103 Leipzig, Germany.
1 It should be noted that these receptorarchitectonic analyses are performed in post mortem brains and thus represent an analysis of the brain's neuron receptors at a certain point in time. However, it is known that the density of neuron receptors is subject to dynamic modulations over a millisecond time scale. Moreover, we should keep in mind that up to now the functional relation between particular neuron receptors and particular language functions is not known.
2 Earlier studies using the method of low-frequency fluctuation analysis identified a general default network while subjects rested quietly in the scanner (17, 211). With data from such a resting state, functional connectivities between different subregions of the IFG (i.e., pars orbitalis, pars triangularis, and pars opercularis) and subregions in the parietal cortex and temporal cortex have been reported (263).
3 Note that the anatomic terminology varies from study to study. Here we used those anatomic terms provided by the authors of the study discussed. Figure 1 may help to orient the reader with respect to the different anatomic terms.
4 The low temporal resolution of fMRI, however, will not allow us to differentiate early and late effects observed in the ERP in response to incorrect word category information (see sect. V, B and C), but in combination with ERP studies from patients with lesions in particular parts of the brain as well as MEG localization studies with healthy participants, conclusions about the localization of these effects are possible.
5 This pathway may not only serve to support adjacent structural dependencies but, moreover, to subserve semantic combinatorics.
6 The processing of word semantics involves a large neural network including the middle and posterior part of the middle and superior temporal gyrus (including the angular gyrus and frontal association areas). For recent reviews, see Refs. 15 and 49.