Mechanics of the Mammalian Cochlea

Luis Robles, Mario A. Ruggero


In mammals, environmental sounds stimulate the auditory receptor, the cochlea, via vibrations of the stapes, the innermost of the middle ear ossicles. These vibrations produce displacement waves that travel on the elongated and spirally wound basilar membrane (BM). As they travel, waves grow in amplitude, reaching a maximum and then dying out. The location of maximum BM motion is a function of stimulus frequency, with high-frequency waves being localized to the “base” of the cochlea (near the stapes) and low-frequency waves approaching the “apex” of the cochlea. Thus each cochlear site has a characteristic frequency (CF), to which it responds maximally. BM vibrations produce motion of hair cell stereocilia, which gates stereociliar transduction channels, leading to the generation of hair cell receptor potentials and the excitation of afferent auditory nerve fibers. At the base of the cochlea, BM motion exhibits a CF-specific and level-dependent compressive nonlinearity such that responses to low-level, near-CF stimuli are sensitive and sharply frequency-tuned and responses to intense stimuli are insensitive and poorly tuned. The high sensitivity and sharp frequency tuning, as well as compression and other nonlinearities (two-tone suppression and intermodulation distortion), are highly labile, indicating the presence in normal cochleae of a positive feedback from the organ of Corti, the “cochlear amplifier.” This mechanism involves forces generated by the outer hair cells and controlled, directly or indirectly, by their transduction currents. At the apex of the cochlea, nonlinearities appear to be less prominent than at the base, perhaps implying that the cochlear amplifier plays a lesser role in determining apical mechanical responses to sound. Whether at the base or the apex, the properties of BM vibration adequately account for most frequency-specific properties of the responses to sound of auditory nerve fibers.


A.  Scope of the Review

The present review, updating one by Patuzzi and Robertson (273), focuses on the mechanical processes leading to the stimulation of the inner hair cells of the mammalian cochlea, i.e., the vibrations of the basilar membrane (BM), the organ of Corti, and the tectorial membrane (TM). Otoacoustic emissions (for reviews, see Refs. 285, 362) are discussed only to the extent that they shed direct light on cochlear vibrations. Mathematical models of cochlear mechanics (for reviews, see Refs. 80, 81, 152) are mentioned, but rather selectively. The reader may want to consult two recent reviews that cover much the same ground as the present one, but from different perspectives. One (271) is especially useful because it provides theoretical contexts for many of the empirical findings, especially those dealing with micromechanics. The other (379) holds views, notably of experiments in excised cochleae, that clash strongly with our own.

B.  Anatomical and Functional Setting

The auditory system of mammals is a marvelous achievement of biological evolution, capable of detecting and analyzing sounds over wide ranges of spectral frequency and intensity. Humans can hear sounds with frequencies over the range 20 Hz to 20 kHz, and some mammals can perceive frequencies beyond 100 kHz. Mammalian auditory systems also possess great sensitivity and respond to sounds over an intensity range spanning 12 orders of magnitude or 120 decibels. Such striking performance is largely determined by mechanical and biophysical processes in the cochlea, the peripheral organ of hearing of mammals.
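The equivalence between 12 orders of magnitude of intensity and 120 decibels follows directly from the definition of the decibel. The short sketch below simply illustrates that arithmetic; the function names are illustrative and are not part of the original review.

```python
import math

def intensity_ratio_to_db(ratio: float) -> float:
    """Decibels corresponding to a ratio of sound intensities (power-like quantity)."""
    return 10.0 * math.log10(ratio)

def pressure_ratio_to_db(ratio: float) -> float:
    """Decibels corresponding to a ratio of sound pressures (amplitude-like quantity)."""
    return 20.0 * math.log10(ratio)

# Twelve orders of magnitude in intensity ...
intensity_range_db = intensity_ratio_to_db(1e12)
# ... correspond to six orders of magnitude in pressure, since intensity
# is proportional to pressure squared.
pressure_range_db = pressure_ratio_to_db(1e6)

print(intensity_range_db, pressure_range_db)
```

Both conversions give the same number of decibels, which is why a 120-dB range can be stated without specifying whether intensity or pressure is meant.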

The cochlea consists of three adjacent membranous tubes coiled in the form of a snail (hence its name; see Fig. 1) and enclosed by a bony shell, the otic capsule (for a review of cochlear structure, see Ref. 365). Two of the tubes, the scalae vestibuli and tympani, are filled with perilymph, a liquid whose ionic composition resembles that of other extracellular fluids. At the base of the approximately conical otic capsule, two membrane-covered windows open into the middle ear: the oval window in scala vestibuli, against which abuts the footplate of the stapes, the innermost of the three middle ear ossicles, and the round window in scala tympani. Between scala vestibuli and scala tympani is the scala media, which ends blindly near the “apex” of the otic capsule, allowing scala vestibuli and scala tympani to communicate (at the helicotrema). Scala media contains the organ of Corti, where the inner and outer hair cells transduce mechanical vibrations into electrical signals; endolymph, a fluid of unusual composition (with a high potassium concentration) sustaining a large positive electrical potential; and the TM, overlying the hair cells. The organ of Corti rests on the BM, which separates scala media from scala tympani. The length of the BM in mammals is positively correlated with body weight (e.g., 7, 18, 20, 21, 25, 34, 60, and 100 mm, respectively, in mouse, chinchilla, guinea pig, squirrel monkey, cat, human, elephant, and some whales) (206) and considerably greater than in reptiles or birds of comparable size (221).

Fig. 1.

Sites of measurement of mechanical vibrations in mammalian cochleae. A: a schematic cross section of the guinea pig cochlea, indicating approaches to basilar membrane (BM) locations at the cochlear base, via an opening into scala tympani (ST), and to structures at the cochlear apex, via an opening into scala vestibuli (SV). The recording sites are also indicated in B, a diagram of the organ of Corti, the BM, and the TM. The numbers in A indicate the four cochlear turns. HC, Hensen's cells; IHC, inner hair cell; L, spiral limbus; OHC, outer hair cells; OSL, osseous spiral lamina; RM, Reissner's membrane; SL, spiral ligament; TM, tectorial membrane. [From Cooper (44), with permission from Elsevier Science.]

Pressure waves reaching the eardrum are transmitted via vibrations of the middle ear ossicles to the oval window at the base of the cochlea, where they create pressure differences between scala tympani and the other scalae, thus displacing the BM in a transverse direction. BM vibration causes shearing between the reticular lamina at the top of the organ of Corti and the TM, tilting the stereocilia that protrude from the outer hair cells. In the case of inner hair cells, whose stereocilia may not be firmly attached to the TM, stereocilia may be displaced by friction against the endolymphatic fluid. Tilting of the stereocilia, in turn, gates cationic channels located in their tips. Aided by the endolymphatic (or endocochlear) potential, the modulation of hair cell conductances produces transduction currents and receptor potentials in the hair cells (for reviews, see Refs. 64, 149, 153, 282, 409). Depolarizing receptor potentials in inner hair cells lead to the generation of action potentials in type I auditory nerve fibers, which constitute the vast majority (95%) of auditory nerve afferents and carry to the brain the bulk of the acoustic information processed by the cochlea.

C.  Historical Highlights

The first measurements of the vibrational response to sound of the BM were carried out by Georg von Békésy (391), for which he was awarded the 1961 Nobel Prize for Physiology or Medicine. Working principally in the ears of human cadavers, Békésy showed that the cochlea performs a kind of spatial Fourier analysis, mapping frequencies upon longitudinal position along the BM. He described a displacement wave that travels on the BM from base to apex of the cochlea at speeds much slower than that of sound in water. As it propagates, the traveling wave grows in amplitude, reaches a maximum, and then decays. The location of the maximum is a function of stimulus frequency: high-frequency vibrations reach a peak near the base of the cochlea, whereas low-frequency waves travel all the way to the cochlear apex.

A turning point in the understanding of cochlear mechanics came in 1971 when Rhode (293) demonstrated that BM vibrations in live squirrel monkeys exhibit a compressive nonlinearity, growing in magnitude as a function of stimulus intensity at a rate of <1 dB/dB. Rhode also showed that the nonlinearity was frequency specific, being demonstrable only at or near the characteristic frequency (CF, the frequency to which the BM site is most sensitive) and that it was labile, disappearing after death. Rhode's discoveries had to wait more than a decade for confirmation and extension (205, 307, 355), after several unsuccessful attempts (for review, see Ref. 297). However, by the time the confirmatory BM experiments were published, the existence of nonlinear and active cochlear processes consistent with Rhode's pioneering findings had received unexpected but influential support from Kemp's discovery of otoacoustic emissions, sounds emitted by the cochlea that grow at compressive rates with stimulus intensity (174). Kemp immediately revived Gold's prescient arguments (121) on the need for a positive electromechanical feedback to boost BM vibrations to compensate for the viscous damping exerted by the cochlear fluids. A few years later, a possible origin for both otoacoustic emissions and the hypothetical electromechanical feedback was identified by Brownell et al. (21), who showed that outer hair cells change their length under electrical stimulation.


Most of our knowledge of mechanics in normal inner ears is based on observations of BM motion at basal sites of the cochleae of guinea pigs, chinchillas, squirrel monkeys, and cats. Studies of BM vibrations at the base of the cochlea have now reached consensus on many issues, including how to judge the quality of recordings, regardless of species or precise CF. The situation is quite different with regard to in vivo studies of mechanical responses at the apex of the cochlea, with studies in chinchilla and guinea pig having yielded contradictory verdicts regarding several fundamental issues. Nevertheless, there is now sufficient evidence to conclude that responses at the apex of the cochlea differ at least quantitatively from those at the base. Accordingly, in vivo responses to “basic” stimuli (tones, clicks, and noise) of sites at the base and apex of the cochlea are described in sections ii and iii, respectively, of this review.

All techniques for measuring cochlear vibrations require a relatively unobstructed view of the site of measurement (see sect. xii). At basal locations, the BM can be reached directly via scala tympani, but the organ of Corti and the TM are inaccessible (Fig. 1). At the “hook” region of the cochlea (within 2 mm of the stapes), the BM is approached by way of the round window (41, 47, 179, 243). Slightly more apical locations (e.g., 3–4 mm from the stapes) are reached after perforating the otic capsule (43, 307, 355). An even more apical BM location (∼8–9 mm from the stapes) in the squirrel monkey cochlea (293, 306) was reached from inside the posterior cranial fossa after removal of cerebral tissue and perforation of the temporal bone near the internal auditory meatus (the central exit of the auditory nerve).

A.  Responses to Single Tones

1.  Input/output functions

Figure 2 illustrates velocity-intensity functions for BM responses to tones recorded at a site of the chinchilla cochlea located 3.5 mm from the oval window (326). In contrast to the linear growth of responses to tones with higher or lower frequencies, responses to stimuli with frequencies near CF (10 kHz) exhibit highly compressive growth, i.e., response magnitude grows by only 28 dB as stimulus intensity increases by 96 dB. Compression is most prominent at moderate and high levels, with average rates of growth as low as 0.2 dB/dB at CF (measured for intensities between 40 and 90 dB SPL; i.e., sound pressure level referenced to 20 μPascals) and even lower rates at frequencies immediately above CF. The input/output functions for responses to CF tones at basal BM sites in guinea pig and cat resemble those in chinchilla (Fig. 3). Rates of growth for responses to CF tones measured in several studies are collected in Table 1.
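The growth rates quoted above are slopes of input/output functions plotted on decibel axes. As a minimal sketch of how such a rate could be computed, the following fits a straight line to response level versus stimulus level; the sample data are invented for illustration and are not the measurements of Figure 2.

```python
import numpy as np

def growth_rate_db_per_db(levels_db_spl, response_db):
    """Least-squares slope of response level (dB) versus stimulus level (dB SPL)."""
    slope, _intercept = np.polyfit(levels_db_spl, response_db, 1)
    return float(slope)

# Overall figure quoted in the text: 28 dB of response growth over a
# 96-dB increase in stimulus intensity.
overall_rate = 28.0 / 96.0  # roughly 0.29 dB/dB

# Hypothetical compressive segment between 40 and 90 dB SPL, constructed
# to grow at 0.2 dB/dB (illustrative values only).
levels = np.arange(40.0, 91.0, 10.0)
response = 30.0 + 0.2 * (levels - 40.0)
segment_rate = growth_rate_db_per_db(levels, response)

print(round(overall_rate, 2), round(segment_rate, 2))
```

A slope of 1 dB/dB corresponds to linear growth; values well below 1 indicate compression.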

Fig. 2.

Velocity-intensity functions of BM responses to tones. A: responses to tones with frequency equal to and lower than characteristic frequency (CF; 10 kHz). B: responses to tones with frequency equal to and higher than CF. The straight dotted lines (bottom right in each panel) have linear slopes (1 dB/dB). Recordings were made at a site of chinchilla cochlea situated some 3.5 mm from its basal end. [From Ruggero et al. (326). Copyright, Acoustical Society of America, 1997.]

Fig. 3.

Displacement-intensity functions for BM responses to CF tones recorded at basal cochlear sites in chinchilla, guinea pig, and cat. For comparison, the dotted line indicates linear growth. [Chinchilla data (squares) from Ruggero et al. (326); guinea pig data (solid and open circles) from Nuttall and Dolan (259) and Cooper (43), respectively; cat data (diamonds) from Cooper and Rhode (47).]

Table 1.

Sensitivity, nonlinearity, and frequency tuning of basilar membrane responses to tones at basal sites of the cochlea

Early measurements of BM vibrations using the Mössbauer technique (see sect. xii) suggested, but did not establish conclusively, that responses to CF tones grow linearly at low stimulus intensities (307, 355). Newer measurements using optical techniques have provided confirmation (47, 237, 259, 326). Theoretical considerations suggest that input-output functions should also be linear at sufficiently high stimulus intensities (123, 164, 251, 279, 408). Indeed, the literature includes CF input-output curves in which compression decreases systematically above 80 dB SPL and slopes approach linearity at 90–100 dB SPL (272, 300, 322, 324, 332). In contrast, other reports (43, 259, 326) indicate that highly compressive growth is maintained essentially undiminished up to intensities as high as 100 dB SPL (e.g., Fig. 3). Complicating this issue is the fact that cochlear damage linearizes input-output BM functions (see sect. ix B1) and therefore the occasionally reported full linearization of input-output curves at high stimulus levels may reflect cochlear damage (326). On balance, it seems that although compression may be lessened in healthy cochleae above 80 dB SPL, full linearization occurs only at levels >100 dB SPL, at which stimulation for longer than a few minutes is likely to result in permanent cochlear damage.

2.  Isointensity functions

The variation of BM velocity as a function of stimulus frequency and intensity is depicted in Figure 4 as a family of isointensity functions (replotted from the data of Fig. 2). At low stimulus levels, the isointensity curves are sharply frequency tuned, with steep slopes arranged roughly symmetrically on both sides of CF. At higher stimulus levels, isointensity curves retain a steep high-frequency slope near CF but become more broadly tuned and increasingly asymmetrical as the peak responses shift toward lower frequencies. The response roll-off stops at a frequency about one-third octave higher than CF; at higher frequencies, response magnitude is relatively constant, i.e., it reaches a plateau, evident in responses to intense (70–90 dB) tones (Fig. 4).

Fig. 4.

A family of isointensity curves representing the velocity of BM responses to tones as a function of frequency (abscissa) and intensity (parameter, in dB SPL). The isointensity curves represent the same chinchilla data of Fig. 2. [From Ruggero et al. (326). Copyright, Acoustical Society of America, 1997.]

3.  Sensitivity functions and BM/stapes magnitude ratios

Figure 5 shows curves of BM sensitivity (displacement per unit of stimulus pressure) as a function of frequency for basal sites of the chinchilla and guinea pig cochleae. If responses grew linearly, the sensitivity curves measured at different intensities would be identical. In fact, sensitivity grows systematically larger as a function of decreasing stimulus level at frequencies near CF so that the curves superimpose only at frequencies well removed from CF (i.e., frequencies lower than 0.7 CF or in the plateau region above CF).

Fig. 5.

Families of isointensity curves representing the sensitivity (displacement divided by stimulus pressure) of BM responses to tones as a function of frequency (abscissa) and intensity (parameter, in dB SPL). The lower CF (10 kHz) data were recorded at the 3.5-mm site of the chinchilla cochlea. [Redrawn from Ruggero et al. (326).] The higher CF (17 kHz) data are from a basal site of the guinea pig cochlea. [Redrawn from Cooper and Rhode (52).]

The compressive nonlinearity is strongly dependent on stimulus frequency. As a result, both the bandwidth (or the sharpness of tuning) and the peak (or best) frequency of the sensitivity functions change as a function of stimulus level. At the highest stimulus levels, responses reach their maxima at frequencies about one-half octave lower than at the lowest stimulus levels and the sharpness of tuning decreases (e.g., from a Q10 of 5 at 10 dB to only 1.4 at 90 dB SPL for the chinchilla data of Fig. 5). (Q10 equals the frequency of the response peak divided by the bandwidth 10 dB below the peak.)
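The definition of Q10 given above translates directly into a small calculation on a sampled tuning curve. The sketch below is one possible implementation, assuming the curve is supplied as magnitudes in decibels and that linear interpolation between samples is adequate; the triangular test curve is synthetic and does not represent measured data.

```python
import numpy as np

def q10(freqs_hz, mag_db):
    """Peak frequency divided by the bandwidth measured 10 dB below the peak."""
    freqs = np.asarray(freqs_hz, dtype=float)
    mag = np.asarray(mag_db, dtype=float)
    i_peak = int(np.argmax(mag))
    target = mag[i_peak] - 10.0

    def crossing(indices):
        # Walk away from the peak until the curve has fallen to the target,
        # then interpolate linearly between the two bracketing samples.
        prev = i_peak
        for i in indices:
            if mag[i] <= target:
                frac = (mag[prev] - target) / (mag[prev] - mag[i])
                return freqs[prev] + frac * (freqs[i] - freqs[prev])
            prev = i
        raise ValueError("curve does not fall 10 dB below its peak on this side")

    f_low = crossing(range(i_peak - 1, -1, -1))
    f_high = crossing(range(i_peak + 1, len(freqs)))
    return freqs[i_peak] / (f_high - f_low)

# Synthetic curve peaking at 10 kHz and falling 10 dB per kHz on each side,
# so the 10-dB bandwidth is 2 kHz.
test_freqs = np.arange(8000, 12001, 100)
test_mag = -10.0 * np.abs(test_freqs - 10000) / 1000.0
example_q10 = q10(test_freqs, test_mag)
print(example_q10)
```

For this synthetic curve the result is 10 kHz / 2 kHz = 5, the same order as the low-level value cited for the chinchilla data.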

Normalizing BM sensitivity functions to stapes or incus motion yields estimates of cochlear gain as a function of stimulus frequency, permitting an assessment of cochlear contributions to frequency tuning in isolation from middle ear inputs. Figure 6 compares maximal cochlear gains (i.e., computed for low-level stimuli), as a function of frequency, for basal sites of the cochleae of cat, guinea pig, and chinchilla. (Also shown are curves for an apical site in chinchilla, to be discussed in sect. iii.) At CF, the peak magnitudes of BM vibrations far exceed those of the middle ear ossicles, with cochlear gains ranging from 47 to 75 dB. At frequencies well below CF, BM vibrations at basal cochlear sites exhibit slopes of ∼6 dB/octave relative to stapes motion. This is as expected on the following assumptions, which apply to the relevant range of stimulus frequencies, from ∼200 Hz to several kilohertz in chinchilla (59, 73, 216, 328, 424): stapes velocity gain (re SPL) is roughly constant with respect to stimulus frequency; cochlear input impedance is real (or resistive) so that pressure in scala vestibuli near the stapes is proportional to stapes velocity (and greater than in scala tympani); BM displacement is proportional to the local pressure difference across the “cochlear partition” (the organ of Corti and the BM) (see footnote 1).

Fig. 6.

Maximum gains of BM or TM responses, relative to middle-ear vibration, from basal and apical cochlear sites. Basal BM responses to low-level tones have been normalized to responses of the incus (guinea pig and cat) or the stapes (chinchilla). The TM data for the chinchilla apex, normalized to the vibrations of the umbo of the tympanic membrane, were selected to indicate the range of sensitivities. Chinchilla base (circles): CF = 8.4 kHz, data from Ruggero et al. (328); chinchilla apex (dashed lines): CF = 0.35–0.5 kHz, data from Rhode and Cooper (299); guinea pig (squares): CF = 17 kHz, data from Sellick et al. (360); cat (triangles): CF = 30 kHz, data from Cooper and Rhode (47).

4.  Phases of BM responses to tones

Because the filter characteristics of the cochlea are distributed over space, wave propagation from base to apex (see sect. v) involves “pure” (i.e., frequency independent) delays as well as “filter” delays that vary with frequency. As a consequence of these delays, BM responses to tones exhibit increasing phase lag as a function of increasing frequency. Figure 7 displays the phases of BM vibrations, relative to middle ear motion, at basal cochlear sites of squirrel monkey, chinchilla, guinea pig, and cat. Most of the phase versus frequency curves are characterized by a shallow segment at low frequencies and a steep segment at frequencies around CF. The exception is the curve for the squirrel monkey site with CF = 6 kHz, with a low-frequency segment nearly as steep as the segment in the neighborhood of CF. At low frequencies, BM responses typically lead middle ear vibration by ∼90 degrees. This phase lead and the 6-dB/octave slope of BM magnitude gain relative to the middle ear (see sect. ii A3) indicate that BM displacements are proportional to stapes velocity, consistent with the idea that cochlear input impedance is resistive (real) (424) (see footnote 1).
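The link between a 90-degree phase lead, a 6-dB/octave slope, and proportionality to stapes velocity can be checked with a few lines of complex arithmetic. The sketch below assumes sinusoidal motion, so that velocity equals displacement multiplied by jω, and uses an arbitrary proportionality constant k that is purely illustrative.

```python
import cmath
import math

def bm_gain_re_stapes_displacement(freq_hz: float, k: float = 1.0) -> complex:
    """Ratio of BM displacement to stapes displacement, assuming
    BM displacement = k * stapes velocity (k is a hypothetical constant)."""
    omega = 2.0 * math.pi * freq_hz
    return k * 1j * omega  # velocity of a sinusoid = j * omega * displacement

def to_db(magnitude: float) -> float:
    return 20.0 * math.log10(magnitude)

gain_low = bm_gain_re_stapes_displacement(500.0)
gain_high = bm_gain_re_stapes_displacement(1000.0)  # one octave higher

slope_db_per_octave = to_db(abs(gain_high)) - to_db(abs(gain_low))
phase_lead_deg = math.degrees(cmath.phase(gain_low))

print(round(slope_db_per_octave, 2), round(phase_lead_deg, 1))
```

Under these assumptions the gain magnitude doubles per octave (about 6 dB) and the phase lead is 90 degrees at every frequency, independent of the value chosen for k.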

Fig. 7.

The phases of BM responses to tones as a function of frequency. The phases of BM displacement toward scala tympani are expressed relative to inward ossicular displacement. The data were obtained at basal sites of the cochleae of squirrel monkey, chinchilla, guinea pig, and cat. CFs are indicated by closed symbols. Guinea pig (diamonds): data from Nuttall and Dolan (259); guinea pig (circles): data from Sellick et al. (360); chinchilla (X): CF = 9.7 kHz, data from Ruggero et al. (326); chinchilla (crosses): CF = 15 kHz, data from Narayan et al. (243); squirrel monkey (squares): data from Rhode (293); cat (triangles): data from Cooper and Rhode (47).

The slopes of the phase versus frequency curves are steepest just above CF. Because the slope of the phase curve can be interpreted as the (group) delay intervening between the launching of a wave near the stapes and its arrival at the measuring site, the increase in phase slope with frequency indicates that waves of higher frequency propagate more slowly than those of lower frequency. The phase-lag accumulation at CF, 1–2.5 periods, is not obviously correlated with CF, and one may conjecture that, in any single species, phase accumulation at CF may be a constant throughout the cochlea (415). We shall see below that such a conjecture is also supported by phase measurements at apical cochlear locations (see Fig. 12). At frequencies well above CF, some of the curves reach phase plateaus (discussed in sect. ii A5). Table 2 provides a summary of the features of phase versus frequency curves, including those of Figure 7.
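Interpreting phase slope as group delay amounts to taking the negative derivative of phase (in periods) with respect to frequency (in hertz). The following is a minimal numerical sketch under that definition; the test phase curve represents an idealized pure delay, not measured BM data.

```python
import numpy as np

def group_delay_s(freqs_hz, phase_periods):
    """Group delay in seconds: negative slope of phase (periods) versus frequency (Hz)."""
    freqs = np.asarray(freqs_hz, dtype=float)
    phase = np.asarray(phase_periods, dtype=float)
    return -np.gradient(phase, freqs)

# Idealized example: a pure delay of 0.5 ms produces a phase lag that grows
# linearly with frequency, so the group delay is the same at all frequencies.
freqs = np.linspace(1000.0, 12000.0, 111)
pure_delay_s = 0.5e-3
phase = -freqs * pure_delay_s  # phase lag expressed in periods
delays = group_delay_s(freqs, phase)

print(delays.min(), delays.max())
```

For a real phase curve that steepens near CF, the same function would return a delay that increases with frequency, which is the behavior described in the text.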

Table 2.

Characteristics of phase vs. frequency curves for cochlear mechanical responses to tones

At the cochlear base, BM responses to tones with frequency well below CF grow linearly with stimulus intensity and, appropriately, response phases at those frequencies are invariant with respect to stimulus intensity. At near-CF stimulus frequencies, response magnitudes grow nonlinearly, and phases vary systematically with intensity. This intensity dependence is illustrated in Figure 8, which shows BM response phases in chinchilla and guinea pig normalized to the responses to moderately intense stimuli (80 and 74 dB SPL, respectively). Response phases increasingly lag with intensity for frequencies just below CF and lead for frequencies somewhat above CF, remaining relatively constant at a frequency close to CF. This pattern of phase shift with increasing stimulus level, first described in the responses of low-CF auditory nerve fibers (7), has been demonstrated in BM responses at the base of the cochlea in several species (47, 258, 259, 300, 301, 323, 326, 355). The intensity-dependent variation of responses around CF results in systematic changes of phase slope or group delay as a function of stimulus level. At the base of the chinchilla cochlea, for example, group delays around CF decrease from 990 to 610 μs as stimulus levels increase from 10 to 90 dB SPL (326). However, the accumulation of phase lag at CF (∼1.5 periods, equivalent to a delay of 150 μs) remains essentially invariant over the same range of intensities.
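The equivalence between 1.5 periods of accumulated phase lag and a delay of 150 μs follows from the CF of about 10 kHz at this chinchilla site. The sketch below shows the conversion; the function name is illustrative.

```python
def phase_delay_us(phase_lag_periods: float, freq_hz: float) -> float:
    """Delay (microseconds) equivalent to a phase lag given in periods at freq_hz."""
    return phase_lag_periods / freq_hz * 1e6

# 1.5 periods of lag at a CF of 10 kHz
delay_at_cf_us = phase_delay_us(1.5, 10_000.0)
print(delay_at_cf_us)
```

This phase delay (total lag divided by frequency) is distinct from the group delay (local phase slope), which is why the former can remain near 150 μs while the latter varies between 990 and 610 μs with stimulus level.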

Fig. 8.

Intensity dependence of BM response phases around CF. BM phases in chinchilla (top) or guinea pig (bottom) are expressed relative to the phases of responses at a single stimulus intensity. Chinchilla data were normalized to 80 dB SPL and were from Ruggero et al. (326). Guinea pig data were normalized to 74 dB SPL and were from Nuttall and Dolan (258).

5.  Amplitude and phase plateaus

The sensitivity curves of Figure 5 include high-frequency plateaus, within which responses grow linearly and have sensitivities some 70–90 dB lower than for low-level stimulation at CF. Similarly, some of the phase curves in Figure 7 exhibit plateaus in the corresponding frequency regions. High-frequency plateaus have been observed in the cochleae of several species [e.g., curves for squirrel monkey (squares), chinchilla (crosses and X), and guinea pig (diamonds) in Fig. 7] (47, 243, 259, 293, 307, 326, 400, 402). The phase values at the plateaus have been variously reported as lagging middle ear vibration by 1.5 periods (165); 3.5, 4, or 4.5 periods (e.g., squares in Fig. 7; Refs. 293, 297); 2.5 periods (crosses and diamonds in Fig. 7; Refs. 243, 259; also see arrowhead in Fig. 19); 0.25 or 1.25 periods (400); 2.3 or 3.3 periods (328); or 3.75 periods (X in Fig. 7; Ref. 326). The phase plateaus often appear quantized in that, at any given BM site in each species, they occur at discrete values separated by one or more periods (307, 328, 400) (or, less commonly, 0.5 period; Ref. 293) (see also sect. v C, Fig. 15, and footnote 2).

It was once suggested that the plateaus arise from cochlear damage, including acoustic trauma incurred while testing for its presence (133). However, plateaus were subsequently demonstrated in normal cochleae using test stimuli that did not impair normal sensitivity (307). Plateaus do not appear to be instrumental artifacts, since they have been measured at the BM using such disparate methodologies as the Mössbauer technique, capacitive probes, and laser velocimetry. In addition, plateaus are present in pressure waves recorded in the perilymph of scala tympani, near the BM (266, 267) (see Fig. 15 and sect. v C). At the apex of the cochlea, plateaus appear to be artifacts resulting from opening the otic capsule over scala vestibuli (50; see sect. iii C). At the base of the cochlea, however, it seems unlikely that opening the otic capsule over scala tympani would have a similar effect, since the round window membrane provides an effective pressure release even when the otic capsule is intact. Furthermore, plateaus are also evident in pressure recordings at basal sites of cochleae in which the hydraulic seal of the otic capsule was restored after introduction of the microphone into scala tympani (266, 267; see Fig. 15 and sect. v C). Interestingly, plateaus are present in BM vibrations but absent from the responses of high-CF auditory nerve fibers recorded under identical conditions in the same cochleae (see Fig. 1B of Ref. 244). In other words, BM vibrations at frequencies well above CF seemingly are not transmitted to inner hair cells (399). However, magnitude and phase plateaus are apparent in some responses of auditory nerve fibers as a function of CF (Fig. 10 of Ref. 188).

In conclusion, although BM vibrations at the plateau frequencies may not participate in determining responses of auditory nerve fibers, it is clear that the high-frequency amplitude and phase plateaus are normal features of BM displacement waves. By their very nature, which implies very high wave velocity, the phase plateaus indicate a mode of vibration distinct from the “slow” pressure and displacement waves that propagate in the cochlear fluids and on the BM (sect. v). The fixed, frequency-independent, phase relation between middle ear and BM vibrations (and also pressure in scala tympani; see sect. v C) suggests that the plateaus reflect, more or less directly, positive or negative components of stapes motion, transmitted as “fast” (acoustic) pressure waves or standing waves made up by the combination of a sound wave and its reflection (211). If the BM responds resistively to the fast pressure wave, its displacement must be in phase or antiphase with stapes displacement (rather than stapes velocity) (211), in agreement with many of the reported phase lags (165, 243, 259, 293, 297) (see footnote 2).

6.  Direct-current and harmonic distortion

At the base of the cochlea, BM responses to tones contain fairly low harmonic distortion (43, 47, 326). Among distortion components, the second harmonic is the largest (47), attaining levels as high as 3.5% (or −29 dB) referred to the fundamental component (43). Harmonic distortion decreases with increasing stimulus level and with deterioration of cochlear function, an indication that harmonics arise as by-products of the mechanism that enhances cochlear sensitivity at low stimulus levels (43).
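The correspondence between 3.5% and −29 dB is an amplitude-ratio conversion. A brief sketch of that arithmetic follows; the function name is illustrative.

```python
import math

def percent_amplitude_to_db(percent: float) -> float:
    """Level in dB of a component whose amplitude is `percent` of the fundamental."""
    return 20.0 * math.log10(percent / 100.0)

second_harmonic_db = percent_amplitude_to_db(3.5)
print(round(second_harmonic_db, 1))
```

The result is approximately −29 dB re the fundamental, matching the value quoted in the text.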

Although second-order harmonic distortion might be expected to be accompanied by direct current (DC) or tonic displacements (since all even-order distortions generate DC distortion), these have been measured at the base of the cochlea only following severe injury, in responses to very intense stimuli (203, 204). DC displacements were sought, but not found, in studies that used a displacement-sensitive interferometer to measure vibrations in near-normal cochleae (47).
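The statement that even-order distortion is accompanied by a DC component can be illustrated with a toy memoryless nonlinearity. The sketch below passes a sinusoid through y = x + a·x², which is an assumption chosen purely for illustration and is not a model of cochlear mechanics; the sampling rate, amplitude, and coefficient are arbitrary.

```python
import numpy as np

fs = 100_000.0      # sampling rate in Hz (arbitrary)
f0 = 1_000.0        # tone frequency in Hz (arbitrary)
n = 1_000           # exactly 10 cycles, so FFT bins fall on the harmonics
t = np.arange(n) / fs

amplitude = 2.0     # input amplitude (arbitrary units)
a = 0.1             # hypothetical quadratic coefficient

x = amplitude * np.sin(2.0 * np.pi * f0 * t)
y = x + a * x**2    # toy even-order (quadratic) distortion

# Since sin^2 = (1 - cos(2wt)) / 2, the quadratic term contributes a DC
# shift of a*A^2/2 and a second harmonic of the same amplitude.
dc_shift = y.mean()
spectrum = np.fft.rfft(y) / n
bin_second = int(round(2 * f0 * n / fs))
second_harmonic_amp = 2.0 * np.abs(spectrum[bin_second])

print(dc_shift, second_harmonic_amp)
```

In this toy case the DC shift and the second-harmonic amplitude are equal (a·A²/2), which is why the absence of measurable DC displacements in near-normal cochleae is noteworthy given the presence of second-harmonic distortion.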

B.  Responses to Broadband Stimuli

For nonlinear systems, the responses to tones cannot generally be used to predict responses to arbitrary stimuli. Therefore, a thorough understanding of BM behavior requires the use of other stimuli, such as tone complexes, noise, and clicks. Clicks are especially useful because, being punctate and wide-band in nature, they permit precise timing of a system's response while simultaneously testing it over a wide range of frequencies. BM responses to clicks were first recorded in live animals using the Mössbauer technique (301, 306). Despite the severe distortion introduced by the Mössbauer recording technique (see sect. xii A), these early studies revealed responses exhibiting compressive nonlinearities consistent with those of responses to tones (293). Those results have been confirmed and extended by laser-velocimetry recordings in chinchilla and guinea pig (84, 290, 323).

Figure 9 shows BM responses to clicks at the 10-kHz site of the chinchilla cochlea. Responses consist of lightly damped transient oscillations with latency of ∼30 μs (referred to the onset of ossicular motion) and instantaneous frequency that increases rapidly and settles at CF within a few hundred microseconds (84, 290). The frequency increase is largely impervious to changes in stimulus intensity (84, 290) and remains after death (290). Response wave shape and sensitivity change systematically as a function of stimulus intensity. The responses to low-level clicks have high sensitivity and nearly symmetrical spindle-shaped envelopes, which reach their maxima roughly 1 ms after the onset of stapes motion. As click level is raised, most cycles of oscillation grow at highly compressive rates (thus decreasing in sensitivity), but the earliest cycle grows at faster, almost linear, rates. This produces a systematic skewing of the envelopes toward earlier times. Responses to clicks at a basal location of the guinea pig cochlea (CF = 18 kHz; Ref. 84) display nonlinear behavior similar to that illustrated in Figure 9.

Fig. 9.

BM responses to rarefaction clicks. The response waveforms, recorded at a basal site of the chinchilla cochlea, are displayed with a uniform scale of sensitivity (velocity per unit pressure). The thin vertical line indicates the onset of vibration of the middle-ear ossicles. Positive values indicate velocity toward scala vestibuli. Peak stimulus pressures (in dB re 20 μPa) are indicated above each trace. [Data from Recio et al. (290).]

Despite the level-dependent compressive nonlinearity, the magnitude and phase spectra of responses to clicks closely resemble the magnitudes (Fig. 10) and phases (data not shown) of responses to tones (290, 323). In other words, the features that characterize steady-state responses to tones are also expressed in the brief responses to clicks. This is made possible by the rapid development of compressive growth, which is detectable within 100 μs of the response onset (290). Taking into account the aforementioned frequency glide, it seems that compressive nonlinearity and its correlates, high sensitivity and frequency selectivity, all start nearly as soon as the BM begins to vibrate at CF. It is remarkable that these almost instantaneous level-dependent changes in gain are accomplished while generating relatively low harmonic and intermodulation distortion (see sects. ii A6 and vii B). For comparison, electronic feedback systems in which gain is reduced as a function of increasing input magnitude typically incorporate relatively long time constants to prevent large harmonic and intermodulation distortion. In the case of the cochlea, harmonic and intermodulation distortion components appear to be minimized by the sharp frequency filtering that accompanies enhancement of BM responses to low-level stimuli and amplitude compression at high stimulus levels.

Fig. 10.

BM responses to clicks and tones recorded at the same site in a chinchilla cochlea. Frequency spectra of BM responses to clicks (dashed and solid lines; peak pressures 24–104 dB) compared with the magnitudes of responses to tones (symbols). The thick solid line indicates the spectrum of responses to 104-dB clicks recorded 10–20 min postmortem. [Adapted from Recio et al. (290).]

Many nonlinear systems can be described by analyzing their responses to white noise using cross-correlation functions (96, 225). The first-order cross-correlation between the noise and the response, the first-order Wiener kernel, is identical to the impulse response in a linear system, but it also provides information on odd-order nonlinearities in the case of nonlinear systems. The second-order kernel provides information on quadratic and higher-order even nonlinearities. First-order Wiener kernels for BM vibration resemble responses to clicks (84, 289). This resemblance, as well as the fact that second-order Wiener kernels are small in magnitude, shows that even-order nonlinearities contribute little to BM responses to noise at the base of the cochlea (289). This is consistent with the relative weakness of harmonic distortion in BM responses to tones (see sect. ii A6; Refs. 43, 47, 326).
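As a rough illustration of the cross-correlation approach, the following sketch estimates the first-order Wiener kernel of a simulated, purely linear system driven by Gaussian white noise and compares it with the known impulse response. The damped-sinusoid impulse response, the noise variance, the record length, and all function names are assumptions made for illustration; none of them is taken from the cited studies, and no nonlinearity is modeled.

```python
import math
import random


def impulse_response(n_taps=16, period=6.0, decay=4.0):
    """Hypothetical damped sinusoid standing in for a BM impulse response."""
    return [math.exp(-i / decay) * math.sin(2 * math.pi * i / period)
            for i in range(n_taps)]


def fir_filter(x, h):
    """Linear system output: y[t] = sum over k of h[k] * x[t - k]."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k in range(min(len(h), t + 1)):
            acc += h[k] * x[t - k]
        y.append(acc)
    return y


def first_order_kernel(x, y, n_lags, variance):
    """Cross-correlate response with noise: h1[k] ~ E[y[t] * x[t - k]] / variance."""
    n = len(x)
    kernel = []
    for k in range(n_lags):
        acc = sum(y[t] * x[t - k] for t in range(k, n))
        kernel.append(acc / ((n - k) * variance))
    return kernel


random.seed(0)  # fixed seed so the sketch is repeatable
sigma = 1.0
noise = [random.gauss(0.0, sigma) for _ in range(20000)]
h_true = impulse_response()
response = fir_filter(noise, h_true)
h_est = first_order_kernel(noise, response, len(h_true), sigma ** 2)
max_error = max(abs(a - b) for a, b in zip(h_true, h_est))
print(f"largest kernel error: {max_error:.3f}")
```

For this linear toy system the estimated kernel should approach the impulse response as the noise record grows longer, which is the sense in which the first-order kernel is "identical to the impulse response in a linear system."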


Mechanical responses to sound at the apex of the cochlea are still poorly understood, in part because few such data have been obtained from cochleae that were reasonably free of experimenter-induced damage. Most methods for measuring cochlear vibrations require the placement of artifacts (mirrors, Mössbauer sources) on the vibrating structures and, therefore, studying the responses of the organ of Corti, the BM, or the TM at the apex (Fig. 1) often entails perforating Reissner's membrane, which alters the endolymph composition and reduces the endocochlear potential. The scarcity of data has been compounded by a lack of independent controls to assess the physiological state of the preparation. In particular, although compound action potential thresholds are very useful in testing the overall integrity of function of high-CF regions of the cochlea, when they have been (exceptionally) used in conjunction with mechanical measurements at the cochlear apex (49) they have been judged to be very insensitive as an indicator of damage. In addition, it is not clear whether compound action potential thresholds provide CF-specific information for cochlear regions with CFs lower than 500–1,000 Hz (68, 136, 166, 413).

A.  Responses to Single Tones

1.  Response magnitudes

Perhaps the recordings that are most representative of responses at apical sites in intact cochleae were obtained in chinchilla (50, 52, 299). Gold-coated polystyrene beads, introduced into scala media via small holes in Reissner's membrane, settled on the TM or Claudius cells (which line the scala media surface of the BM; see Fig. 1). When conditions were optimal, substantial compressive nonlinearities (reminiscent of those present in BM vibrations at the cochlear base) were demonstrated (50–52, 299). Figure 11 (open symbols) displays sensitivity functions (isointensity functions normalized to stimulus level) for TM responses to tones at a site of the chinchilla cochlea with CF ∼500 Hz. Responses grow at mildly compressive rates (0.5–0.8 dB/dB) so that vibration sensitivity increases systematically with decreasing stimulus intensity (by as much as 22 dB between 90 and 30 dB SPL). The region of compressive nonlinearity encompasses the entire frequency range of responses so that both the peak frequency and bandwidth are largely independent of stimulus level (52, 299). Gains (computed relative to middle ear ossicular motion) for representative TM recordings at apical sites of the chinchilla cochlea are shown in Figure 6 (dashed lines) (299).
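The link between growth rate and sensitivity can be made explicit with a little arithmetic. If responses grow at a constant rate of s dB per dB of stimulus level, sensitivity (response divided by stimulus pressure) changes by (s − 1) dB per dB. The sketch below applies this to the 30–90 dB SPL range quoted above. The assumption of a single constant rate over the whole range, and the function name, are illustrative simplifications rather than anything stated in the cited studies.

```python
def sensitivity_gain_db(growth_rate_db_per_db, low_level_db, high_level_db):
    """Sensitivity increase (dB) at the low level relative to the high level.

    Assumes a constant growth rate over the whole level range, which is a
    simplification; measured rates vary with level and frequency.
    """
    level_span = high_level_db - low_level_db
    return (1.0 - growth_rate_db_per_db) * level_span


for rate in (0.5, 0.8, 1.0):
    gain = sensitivity_gain_db(rate, 30.0, 90.0)
    print(f"growth {rate:.1f} dB/dB -> sensitivity change {gain:.0f} dB")
```

Under this simplification, rates of 0.5–0.8 dB/dB correspond to sensitivity changes of 12–30 dB over a 60-dB span, a range that brackets the 22 dB quoted above, whereas linear growth (1 dB/dB) leaves sensitivity unchanged.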

Fig. 11.

Sensitivity (displacement divided by stimulus pressure) of responses to tones at the apex of guinea pig and chinchilla cochleae. Open symbols: chinchilla TM (CF = 500 Hz), data from Cooper and Rhode (52); solid symbols: guinea pig organ of Corti (CF = 400 Hz), data replotted from Zinn et al. (413); thick continuous line (no symbols): guinea pig BM (CF ∼300 Hz), data replotted from Cooper and Rhode (49); thin lines (no symbols): responses to tones at three intensities spaced 10 dB apart, guinea pig organ of Corti (CF ∼300 Hz), data replotted from Khanna and Hao (178).

Several investigations of mechanical vibration in apical regions of guinea pig and squirrel monkey cochleae have failed to detect compressive nonlinearities comparable to those observed in chinchilla (49, 52, 135, 178, 296, 413). Especially notable is the absence of nonlinearities in organ of Corti vibrations measured without rupturing Reissner's membrane (177, 178), since under such circumstances one would expect minimal disruption of normal function. A single investigation, however, has described a vulnerable expansive nonlinearity at an apical site of the guinea pig cochlea with CF ∼300–400 Hz (Fig. 11, solid symbols) (413). The fact that the expansive nonlinearity disappeared postmortem (not shown in Fig. 11) suggests that this exceptional investigation is the only one that has managed to preserve the apex of the guinea pig cochlea in anything approaching its normal, undisturbed condition. As illustrated in Figure 11 (solid symbols), response magnitudes grew almost linearly at frequencies lower than CF. However, at CF and, especially, at a frequency just higher than CF, responses grew with stimulus intensity at rates higher than 1 dB/dB, i.e., expansively. In other words, for frequencies equal to and just higher than CF, responses became more sensitive as stimulus levels were raised. This level dependence of response magnitude is the opposite of the level dependence of responses at the apex of the chinchilla (Fig. 11, open symbols) or at basal locations (Figs. 2–5 and 10).

The mechanical responses of apical sites in guinea pig cochleae (Table 3 and Fig. 11) differ from responses in chinchilla in being less sensitive (by 20–30 dB for low-level CF stimuli), regardless of whether responses are linear (lines without symbols) or nonlinear (solid symbols). It is not clear whether this difference in sensitivity reflects differences in species or the locations of the recording sites, ∼17 mm from the oval window (CFs = 200–400 Hz) in guinea pig versus 14 mm (CFs = 400–800 Hz) in chinchilla. Despite other differences, apical responses of chinchilla and guinea pig display similar sharpness of frequency tuning (Q10 ∼0.6–1.3; Fig. 11, Table 3).

Table 3.

Sensitivity and frequency tuning of mechanical responses to tones at apical sites of the cochlea

2.  Response phases

Figure 12 shows the variation of phase as a function of frequency for TM responses to tones at apical sites of chinchilla and guinea pig cochleae (42, 49, 50, 299). The curves show phase lags (expressed relative to motion of the middle ear ossicles) that increase monotonically with almost constant slope. At CF, the phase accumulation amounts to ∼0.8 periods in the cochleae of Figure 12, but larger phase accumulations (1–1.4 periods), nearly comparable to those at the base of the cochlea (Fig. 7), have also been measured at apical sites in other cochleae (50, 299). At frequencies lower than 150 Hz, response phases in chinchilla lead those in guinea pig by nearly 180 degrees, perhaps partly reflecting differences in the cochlear input impedance in the two species associated with the disparate sizes of their helicotremas (58).

Fig. 12.

Vibration phases at the tectorial membrane (TM) of the cochleae of chinchilla and guinea pig. Displacements toward scala tympani (for the “slow” traveling wave) are shown relative to inward displacement of the middle ear ossicles. CFs are indicated by closed symbols. Solid line: chinchilla, at a site ∼14 mm from base (CF ∼500 Hz); reference, umbo. Dashed line: guinea pig, at a site ∼16.5 mm from base (CF ∼400 Hz); reference, incus. [Data from Cooper (42).]

At apical sites of chinchilla cochleae that display compressive nonlinearity (Fig. 11, open symbols), the response phases vary systematically with stimulus level (data not shown), exhibiting leads and lags as a function of increasing stimulus level for frequencies lower and higher than CF, respectively (299). At the apex of the guinea pig cochlea, nonlinearity (when it exists) is expansive (Fig. 11, solid symbols), but it is also accompanied by labile changes in response phases with stimulus level (data not shown). However, responses exhibit phase leads as a function of increasing stimulus level at all frequencies (413).

3.  DC and harmonic distortion

TM vibrations at the apex of the chinchilla cochlea exhibit DC components (299). These are typically tonic displacements toward scala vestibuli that accompany alternating current (AC) responses within the range of compressive growth. The DC displacements (≤35 nm in response to 40- to 80-dB SPL tones) are much smaller than the corresponding AC responses.

One study of the apex of the guinea pig cochlea reported that, paradoxically, even though organ of Corti vibrations grew at linear rates, they contained enormous harmonic distortion (e.g., second harmonic magnitude as high as 50% of the fundamental) for tone frequencies slightly lower than CF (177). At apical sites of the chinchilla cochlea, TM responses to tones with frequency lower than CF also exhibit harmonic distortion (299). Although their magnitude was not reported, perusal of the published waveforms (299) suggests that harmonic distortion at apical cochlear sites is much lower in chinchilla than in guinea pig (177).

B.  Responses to Clicks

Responses to clicks at the third turn of the chinchilla cochlea and the fourth turn of the guinea pig cochlea (50) consist of transient oscillations at frequencies corresponding to CF. These oscillations, which have a latency of ∼1–1.5 ms relative to the onset of middle ear ossicular vibration, grow at compressive rates in the chinchilla throughout their duration.

In addition to the aforementioned “slow” components, with latencies of 1–1.5 ms, responses to clicks from apical turns of chinchilla and guinea pig cochleae include a linear “fast” component with an onset delay of ∼20–50 μs. This fast wave is believed to be an experimental artifact (related to the propagation of the sound wave in the cochlea; see sect. v), since it disappears when the hydraulic seal of the cochlea is restored (50). Similarly, high-frequency magnitude and phase plateaus become less prominent or disappear upon closing the otic capsule.


A.  Mapping of CF Upon Cochlear Location

As first shown by von Békésy (388), the systematic mapping of CF upon longitudinal position on the BM is a general and fundamental principle of the mechanical processing of acoustic signals in the cochleae of mammals. Because such “tonotopic” mapping can be demonstrated in vitro, it is clear that spatial frequency analysis arises from the passive mechanical properties of cochlear fluids and tissues (see sect. v B). The tonotopic cochlear map has been worked out in some species, such as cat and Mongolian gerbil, by measuring the sites of innervation of auditory nerve fibers of known CF (208, 240). In these cochleae, as well as in the cochleae of several other species that are known with less precision, the tonotopic map follows Equation 1

$$\mathrm{CF} = A\,(10^{\alpha x} - k) \qquad \text{(Equation 1)}$$

where CF is expressed in kHz (127, 128). If x, the distance from the apex, is expressed as a proportion of BM length, from 0 to 1, the constant α is the same (2.1) in many cochleae differing widely in length (11.1 to 60 mm), including those of gerbil, chinchilla, guinea pig, cat, macaque monkey, humans, cow, and elephant (127, 129). The constant k also varies only slightly in many species (between 0.8 and 1, typically 0.85). This is a remarkable result, implying that in all these species, where α and k are (nearly) the same, every octave arranged “by rank, from highest to lowest in an animal's frequency range, subtends the same proportion of the length of the cochlea” (129). For the basal 75% of the cochlea (i.e., for x > 0.25), Equation 1 describes a simple linear relation between BM position and the logarithm of CF; for the 25% apical region, CF octaves are relatively compressed, occupying BM lengths shorter than at more basal locations. The constant A (0.456 in cat, 0.164 in chinchilla, 0.35 in guinea pig, 0.36 in the macaque monkey, and 0.4 in gerbil) determines the range of CFs.
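The tonotopic map can be evaluated numerically. The sketch below assumes the map has the form CF = A(10^(αx) − k), with x the proportional distance from the apex, and evaluates it for cat using the value of A quoted above. The value k = 0.8 is an assumption chosen from within the quoted range, and the function names are hypothetical.

```python
import math

ALPHA = 2.1  # slope constant quoted in the text


def cf_khz(x, a, k, alpha=ALPHA):
    """CF (kHz) at proportional distance x from the apex (0 = apex, 1 = base)."""
    return a * (10.0 ** (alpha * x) - k)


def position_from_cf(cf, a, k, alpha=ALPHA):
    """Invert the map: proportional distance from the apex for a given CF (kHz)."""
    return math.log10(cf / a + k) / alpha


# A for cat is from the text; k = 0.8 is an assumed value within the quoted range.
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"x = {x:.2f}  CF = {cf_khz(x, a=0.456, k=0.8):.3f} kHz")
```

Because the term k matters only where 10^(αx) is small, the printed CFs fall on a nearly straight line in log frequency over the basal three-quarters of the map and bend toward lower values near the apex.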

Because the length of the BM tends to be greater in mammals with larger body size (see sect. i B), the actual length (e.g., in mm) subtended by a particular octave of CF varies from species to species. Thus, for example, the uppermost octaves represented in the cochleae of cat and gerbil occupy 14% of BM length, which translates to 1.57 mm in gerbil and 3.55 mm in cat (129). The corresponding BM distances are substantially shorter in nonmammalian species, which have short BMs (e.g., ∼0.6 mm in several birds, 0.13 mm in the red-eared turtle, and 1 mm in the monitor lizard) (221). Because, in general, the hearing range in mammals extends to much higher frequencies than in nonmammals (95, 105), it seems reasonable to speculate that the reptilian ancestors of mammals had relatively short basilar papillae and that increased length of mammalian cochleae was associated with the evolution of mechanical processes (221), absent (or only present in primitive form) in reptiles, which led to the development of high-frequency hearing (371) and a consequent improvement in the ability to localize sounds (227).
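The octave arithmetic above can be checked with a short calculation. The sketch assumes a map of the form CF = A(10^(2.1x) − k), inverts it to find where the highest octave begins, and converts that fraction of BM length to millimeters. The k values (0.85 for gerbil, 0.8 for cat) and the BM lengths (11.1 mm for gerbil, 25 mm for cat) are assumptions made for illustration. Only the A values, the 14% figure, and the 1.57-mm and 3.55-mm lengths come from the text.

```python
import math

ALPHA = 2.1  # slope constant quoted in the text


def cf_khz(x, a, k):
    """CF (kHz) at proportional distance x from the apex."""
    return a * (10.0 ** (ALPHA * x) - k)


def position_from_cf(cf, a, k):
    """Proportional distance from the apex at which a given CF (kHz) is mapped."""
    return math.log10(cf / a + k) / ALPHA


def top_octave_fraction(a, k):
    """Fraction of BM length occupied by the highest octave of CF."""
    cf_max = cf_khz(1.0, a, k)
    return 1.0 - position_from_cf(cf_max / 2.0, a, k)


# (A, k, BM length in mm); k and the lengths are assumed values.
species = {"gerbil": (0.4, 0.85, 11.1), "cat": (0.456, 0.8, 25.0)}
for name, (a, k, length_mm) in species.items():
    frac = top_octave_fraction(a, k)
    print(f"{name}: {100 * frac:.1f}% of BM, {frac * length_mm:.2f} mm")
```

With these assumed parameters the highest octave occupies about 14% of BM length in both species, and the absolute lengths come out close to the values quoted above, differing between species only because the BM lengths differ.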

B.  Response Magnitudes

In addition to the aforementioned compression of CF octaves near the apex (see sect. iv A), the mechanical behavior of the cochlea differs in several other respects between apical and basal regions.

1) BM vibrations are less sharply tuned at the apex of the cochlea than at the base (378, 391), regardless of stimulus level or the physiological state of the cochlea. In the chinchilla, for example, responses to intense stimuli at BM sites 3.5 and 14 mm from the oval window have Q10 values of 1.6 and 0.9, respectively (Fig. 13).

Fig. 13.

Comparison of vibration sensitivity (velocity per unit pressure) at the base and apex of the chinchilla cochlea. For each family of curves, stimulus frequencies are shown normalized to CF (500 Hz and 9 kHz). At either site, sensitivity decreases as a function of stimulus level. At the base of the cochlea, responses to CF tones differ by as much as 56 dB as a function of stimulus level; peak sensitivities at the highest and lowest stimulus levels differ by 48 dB. At the apex, the two measures of the intensity dependence of sensitivity are the same, 15 dB. The upward arrows indicate the frequencies of peak sensitivity (BF) for responses to the highest level tones, which resemble postmortem data. [Basal BM data from Ruggero et al. (326); apical TM data from Cooper and Rhode (52).]

2) Whereas a compressive nonlinearity is a prominent feature of BM vibrations at the base of the cochlea, the presence and/or nature of nonlinearities at apical sites have not been well established. Some studies have reported that apical vibrations are linear (49, 178), others have demonstrated a relatively weak compressive nonlinearity (52, 299), and still another has found a weak expansive nonlinearity (413).

3) At the base of the cochlea, compressive nonlinearity is confined near CF, and vibrations grow linearly at frequencies lower than 0.7CF (Fig. 13). In contrast, at apical sites, when a compressive nonlinearity exists (chinchilla), it extends uniformly throughout the frequency range of responses; when an expansive nonlinearity exists (guinea pig), it is prominent only for frequencies higher than CF (413).

4) At the cochlear base, the strength of the compressive nonlinearity and sharpness of tuning are highly correlated (and perhaps inextricably linked; Figs. 5, 10, 13, 19, and 20). At the apex of the chinchilla cochlea, frequency tuning is largely independent of stimulus level (Figs. 11 and 13) even in the presence of a compressive nonlinearity (50, 52, 299). At the apex of the guinea pig, however, when an expansive nonlinearity exists, frequency tuning is labile and level dependent (413) but to a much lesser extent than at the base of the cochlea.

5) In chinchilla cochleae, responses to clicks at apical sites differ from those at basal sites in that the former grow at compressive rates throughout their duration, whereas at the base onset oscillations grow linearly. This difference is consistent with the contrasting distribution of compressive nonlinearity as a function of frequency in responses to tones at both cochlear sites (see point 3 above).

Some of the aforementioned differences between BM responses at the base and the apex are consistent with results of several studies of auditory nerve fibers. One study derived putative BM input-output functions for CF tones from the responses of auditory nerve fibers, on the assumption that at frequencies well below CF the BM at the nerve fiber site vibrates linearly (see sect. viii A), and found that the compressive nonlinearity became weaker with decreasing CF (53). Other studies found that the modulation or suppression exerted by low-frequency tones on auditory nerve fiber responses to CF tones (see sect. vii A3) grows weaker as a function of decreasing CF (27, 86, 376). All these results on single- and two-tone stimulation could be interpreted as evidence that the strength of compression diminishes systematically with decreasing CF or, alternatively, as evidence that at the apex of the cochlea BM responses grow at similarly compressive rates at all frequencies. Both interpretations may turn out to be correct, in light of the contrasting apical mechanical data from chinchilla and guinea pig (see sect. iii).

The base versus apex differences are also consistent with the dependence on CF of the effects of furosemide on the responses of auditory nerve fibers (361). Furosemide elevates the thresholds of high-CF auditory nerve fibers at all frequencies, but preferentially at CF. In light of the effects of furosemide on BM vibration at the base of the cochlea (Fig. 20; see sect. ix B3), it is clear that the CF specificity reflects the reduction of BM vibrations, whereas threshold elevations at other frequencies (and partially near CF) result from the reduced drive for mechanoelectrical transduction in inner hair cells. In other words, the decrease in endocochlear potential produced by furosemide affects high-CF inner hair cells directly at all frequencies and indirectly (via the outer hair cells and BM vibrations) only near CF. In the case of low-CF fibers, the effects of furosemide are less marked than for high-CF fibers and are similar at all response frequencies. The former finding is consistent with the lesser extent of compressive nonlinearity in the mechanical responses to sound at the apex of the cochlea. The latter finding agrees with the fact that at the apex the compressive nonlinearity is nearly evenly distributed over the entire frequency range of responses (see sect. iii A1 and Fig. 11). Finally, differences between mechanical responses at the base and apex of the cochlea may underlie the finding that low-frequency hearing thresholds are relatively immune to the loss of apical outer hair cells (286).

C.  Response Phases

In section ii A4 it was noted that the phase versus frequency curves of BM responses at basal sites (within 4 mm of the stapes) consist of three distinct segments: one with shallow slope at low frequencies, another with steep slope around CF, and a high-frequency phase plateau (Fig. 7). At the apex of the cochlea (Fig. 12), phase versus frequency curves differ from those at the base mainly in that there is no distinct separation between the low-frequency and the near-CF segments so that phase accumulates at a roughly constant rate. Nevertheless, the accumulation of phase lag at CF is similar at the base (1–2.2 periods; Fig. 7, excluding the squirrel monkey data) and at the apex (0.8–1.4 periods), and in both cases the curves possess high-frequency plateaus. The squirrel monkey data for a BM site with CF ∼6 kHz, presumably located ∼9.5 mm (41% of total BM length) from the stapes (128), may represent a region of transition between base and apex. Low-frequency and near-CF segments are distinguishable in this phase versus frequency curve (Fig. 7), but their slopes do not differ as much as at more basal BM sites, and the phase accumulation at CF (2.5 periods) is larger than at sites closer to the stapes. A maximal phase accumulation at CF near the middle of the cochlea has been predicted on theoretical grounds (126). Systematic changes in phase slope such as those noted above have been observed in cochlear microphonics and in responses of auditory nerve fibers (281).

Apical responses to tones with frequency lower and higher than CF exhibit systematic leads and lags, respectively, as a function of increasing stimulus intensity in chinchilla (see sect. iii A2). These level dependencies of response phases are reversed from those that hold both for BM vibrations at basal cochlear regions (see sect. ii A4) and for the responses of low-CF auditory nerve fibers (7) as well as low-CF (30, 33, 62) and high-CF (335) inner hair cells.


A.  Fast and Slow Cochlear Traveling Waves

One of von Békésy's fundamental contributions to auditory physiology was the discovery of cochlear mechanical traveling waves, i.e., “slow” displacement waves that propagate on the BM from base to apex (391). Although strictly speaking sound waves are also traveling waves, in the context of mammalian cochlear physiology “traveling waves” refers to displacement waves (or pressure waves; see sect. v C) that are slower by orders of magnitude than ordinary (acoustic) pressure waves, which propagate in the cochlear fluids at speeds of 1,550 m/s and traverse the entire cochlea in a few microseconds. There has been some confusion regarding what is meant by a traveling wave. Wever and Lawrence (394) originally rejected the applicability of the term to the mammalian cochlea because they thought that the term implied that energy is transmitted directly from one segment of the BM to another (rather than via the cochlear fluids; see sect. v C). Eventually, however, Wever, Lawrence, and von Békésy (395) jointly agreed that the patterns of motion described by von Békésy “can be referred to as … a traveling wave, provided that … nothing is implied about the underlying causes” (i.e., “how any given segment of the basilar membrane gets the energy that makes it vibrate”).

The traveling-wave displacement patterns that von Békésy observed on the BM are characterized by three properties. 1) Displacements exhibit increasing phase lags as a function of distance from the oval window. At a given cochlear location, BM responses increasingly lag the motion of the stapes as a function of stimulus frequency, reaching phase accumulations far exceeding the 90-degree lag expected from simple resonances. For example, at the 300-Hz characteristic place of the BM in the human temporal bone, von Békésy measured phase accumulations equivalent to about one period at CF. In experimental animals, the phase accumulation reaches values as high as 4 periods at frequencies higher than CF (Figs. 7 and 12). 2) Displacement magnitudes have an asymmetrical envelope around the characteristic place, with the apical slope being steeper than the basal slope. At a single cochlear location, BM displacements are asymmetrically tuned around CF, with the high-frequency slope being steeper than the low-frequency slope (211, 399). 3) Traveling waves are demonstrable in the absence of normal cellular processes, when cochlear vibrations are entirely linear, such as in the temporal bones of human cadavers. In other words, traveling waves are manifestations of the “passive” mechanical characteristics (mass, stiffness, and damping) of cochlear tissues and fluids and constitute the first stage of frequency filtering and spatial analysis of auditory signals (see footnote 3).

A direct demonstration of the traveling wave is obtained by measuring the phases of responses to identical stimuli at closely spaced BM locations (50, 191, 243, 267, 293, 300, 338). Detailed measurements using 15-kHz tones at the base of one guinea pig cochlea (Fig. 14) (338) reveal phase lags that accumulate as a function of distance from the stapes. Over a range of 1 mm straddling the CF site, the phase accumulation for 35-dB tones amounts to 1.5 periods, indicating a wavelength at CF of ∼0.67 mm and a wave velocity of 10 m/s, computed according to the following equations

$$\delta t = \frac{\delta\varphi}{2\pi f} \qquad \text{(Equation 2)}$$

$$\text{wave velocity} = \frac{\delta x}{\delta t} \qquad \text{(Equation 3)}$$

$$\text{wavelength} = \frac{2\pi\,\delta x}{\delta\varphi} \qquad \text{(Equation 4)}$$

where δt is the travel time, δφ is the phase difference between responses at the two sites (radians), f is the stimulus frequency (Hz), and δx is the distance between the sites.
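As a worked check of these relations, the sketch below computes travel time, wave velocity, and wavelength from a phase difference measured between two BM sites, using the guinea pig values quoted above (1.5 periods over 1 mm for 15-kHz tones). The function name and the choice to pass the phase difference in periods are illustrative assumptions.

```python
import math


def traveling_wave_metrics(phase_diff_periods, freq_hz, distance_m):
    """Apply Equations 2-4 to a phase difference measured between two BM sites."""
    dphi = 2.0 * math.pi * phase_diff_periods       # phase difference (radians)
    travel_time = dphi / (2.0 * math.pi * freq_hz)  # Equation 2 (s)
    velocity = distance_m / travel_time             # Equation 3 (m/s)
    wavelength = 2.0 * math.pi * distance_m / dphi  # Equation 4 (m)
    return travel_time, velocity, wavelength


# Values quoted in the text: 1.5 periods over 1 mm for 15-kHz tones.
dt, v, lam = traveling_wave_metrics(1.5, 15e3, 1e-3)
print(f"travel time {dt * 1e6:.0f} us, velocity {v:.1f} m/s, "
      f"wavelength {lam * 1e3:.2f} mm")
```

The result is a travel time of 100 μs, a velocity of 10 m/s, and a wavelength of 0.67 mm, in agreement with the figures given in the text.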

Fig. 14.

Magnitudes and phases of BM responses to 15-kHz tones as functions of cochlear longitudinal position (expressed in mm from the apex). Data for BM positions apical to the dotted line were obtained from a single guinea pig, whereas data from more basal sites came from four other subjects. [Guinea pig data reprinted from Russell and Nilsen (338). Copyright 1997 National Academy of Sciences, USA.]

Wavelengths and wave velocities at CF for cochlear sites in several species are gathered in Table 4. The relation between distance from the oval window and velocities is generally consistent with the slowing down of the traveling wave as it approaches the cochlear apex. For any given stimulus frequency, wave velocities are as high as 100 m/s at sites basal to the characteristic place and decrease rapidly as the wave approaches the characteristic site (with CF = stimulus frequency) (300). For stimulus frequencies close to CF, wave velocities are more than one order of magnitude greater near the oval window than at sites close to the apex (e.g., 28 m/s at the 1.7-mm site vs. 1.55 m/s at the 12.8-mm site in the guinea pig cochlea). The wavelengths of responses to CF tones also appear to vary as a function of cochlear position, increasing from 0.5–0.9 mm near the oval window to 1.2–1.6 mm at apical sites in the guinea pig and chinchilla cochleae. A somewhat different conclusion was reached on the basis of the variation with CF of the phases of responses of cat auditory nerve fibers to near-CF tones: in the CF range 300–2,400 Hz, wavelength was found to be approximately constant, 2.2 mm or ∼10% of the length of the BM (186).

Table 4.

Characteristics of the traveling wave derived from responses to identical near-CF stimuli at two or more sites in the cochleae of several species

The finite speed of the traveling wave also causes delays in the onsets of BM and neural responses to clicks that grow systematically longer as a function of distance from the cochlear base. In the case of the chinchilla, the delays have been measured directly at basal and apical cochlear sites, as well as indirectly, from the latencies of neural responses, throughout the cochlea. The delay between the onset of ossicular motion and BM vibration at the 3.5-mm site of the chinchilla cochlea (CF = 9–10 kHz) is 30 μs (289, 290). At the third cochlear turn (14 mm from the base; CF ∼500 Hz), delays are considerably longer, ∼1–1.5 ms (50). Appropriately, the response latencies of auditory nerve fibers in chinchilla (as well as other species) increase systematically as a function of decreasing CF (184, 187, 314, 364). For rarefaction clicks, latencies range from ∼1 ms for fibers with CFs higher than 3–4 kHz to ∼2.7 ms for fibers with CF of 320 Hz (318, 319, 346). These latencies are entirely consistent with those of BM vibrations, on the assumption that synaptic and neural delays account for a constant 1-ms delay (319).
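The consistency argument amounts to adding a fixed delay to each mechanical onset delay. The sketch below does this for the chinchilla BM delays quoted above. Treating the 1-ms synaptic and neural delay as exactly constant follows the assumption stated in the text, while the dictionary labels and function name are hypothetical.

```python
SYNAPTIC_NEURAL_DELAY_MS = 1.0  # constant delay assumed in the text (319)


def predicted_neural_latency_ms(bm_delay_ms):
    """Neural click latency predicted from a BM onset delay plus a fixed delay."""
    return bm_delay_ms + SYNAPTIC_NEURAL_DELAY_MS


# BM onset delays quoted in the text for the chinchilla cochlea.
bm_delays_ms = {
    "3.5-mm site (CF 9-10 kHz)": 0.030,
    "14-mm site (CF ~500 Hz), lower bound": 1.0,
    "14-mm site (CF ~500 Hz), upper bound": 1.5,
}
for site, delay in bm_delays_ms.items():
    print(f"{site}: {predicted_neural_latency_ms(delay):.2f} ms")
```

The predictions (about 1 ms at the base and 2–2.5 ms near the 500-Hz place) can be compared with the neural latencies of ∼1 ms for high-CF fibers and ∼2.7 ms for fibers with CF of 320 Hz, which innervate a place somewhat apical to the 500-Hz site.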

B.  Stiffness of the Cochlear Partition

Von Békésy (391) complemented his observations of BM vibrations by measuring the stiffness of the cochlear partition, using two different methods. In one method, “volume” stiffness was inferred by observing the static displacements of Reissner's membrane as a function of a hydrostatic pressure difference applied between the two sides of the partition. In the other method, the “point” stiffness of the partition was determined by observing the displacement caused by pushing a narrow probe onto the BM. Finding that stiffness decreased by 2–4 orders of magnitude as a function of distance from the stapes, von Békésy reasoned that it must be stiffness that principally determines CF in the cochlear partition, since other properties of cochlear function do not vary nearly as much with longitudinal location. The equations describing the variation of elasticity and CF as a function of distance have the same exponential form (see Equation 1), qualitatively supporting von Békésy's contention that CF is determined by BM stiffness (127). These equations also have similar slopes in each of several species (elephant, human, and guinea pig), a coincidence that when first noticed was taken as additional support for CF being determined by stiffness (127). In hindsight, such an interpretation seems questionable: since resonance frequency (the frequency at which inertial and elastic reactances are equal and hence impedance is minimal) is determined by the square root of the stiffness-to-mass ratio, for CF to be determined solely by stiffness the slopes of the elasticity versus distance functions should be twice as large as those of the CF versus distance functions. [Interestingly, von Békésy's elasticity and CF data for mouse, rat, and cow do approach such a relation (127).]
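The factor of two follows directly from the resonance relation. As a worked sketch, assume (as a simplification, not a claim from the source) that stiffness $S$ falls exponentially with distance $z$ from the stapes at rate $a$ while mass $M$ stays constant. The symbols $S_0$, $M$, $z$, and $a$ are introduced here for illustration only.

$$
\begin{aligned}
f_0(z) &= \frac{1}{2\pi}\sqrt{\frac{S(z)}{M}}, \qquad S(z) = S_0\, e^{-a z} \\
\ln f_0(z) &= \tfrac{1}{2}\ln S(z) - \tfrac{1}{2}\ln M - \ln 2\pi \\
\frac{\mathrm{d}\,\ln f_0}{\mathrm{d}z} &= \frac{1}{2}\,\frac{\mathrm{d}\,\ln S}{\mathrm{d}z} = -\frac{a}{2}
\end{aligned}
$$

On a logarithmic scale, then, resonance frequency declines with distance at half the rate of stiffness, so equal slopes for the elasticity and CF functions would not be expected if stiffness alone set CF.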

Following von Békésy's lead, most models of cochlear mechanics have been based on the idea that traveling waves are generated by interactions between relatively incompressible cochlear fluids and a flexible BM, with CF being determined by the gradation of BM stiffness. These models presume that the stiffness of the cochlear partition decreases exponentially with distance from the base and that mass is either constant (3, 247,280) or varies only weakly (220). To obtain realistic CF-to-distance maps in which CF changes by more than two decades between extreme basal and apical cochlear locations, most models set stiffness to vary by 3–5 orders of magnitude. It is not clear, however, whether the stiffness of the cochlear partition actually varies over such wide ranges. Extrapolating from measurements that did not fully span the entire cochlear length, von Békésy (391) estimated base-to-apex ratios of volume stiffness of ∼10,000 in the cochleae of several species. Other estimates of base-to-apex stiffness ratio are much smaller [100 and 1,000, respectively, from measurements of volume and point stiffness in human cochleae (391) and 100 from point stiffness measurements in gerbils (242)] and therefore seemingly inconsistent with CF versus distance maps. It is not clear how these smaller stiffness ratios may be reconciled with the theoretical requirements. Perhaps the cochlear partition vibrates in a complex fashion so that CF and stiffness are not simply related (242). Alternatively, the stiffness values obtained so far, derived from responses to relatively large forces, may not reflect the stiffness present in normal cochleae for small-amplitude responses to low-level stimuli (see below).
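The tension between measured stiffness ratios and CF maps can be illustrated with the simple resonance relation, in which frequency scales as the square root of the stiffness-to-mass ratio. The sketch below converts the base-to-apex stiffness ratios quoted above into the CF range they would imply. Constant mass and a simple resonance are modeling assumptions, and the text itself notes that CF and stiffness may not be so simply related.

```python
import math


def cf_ratio_from_stiffness_ratio(stiffness_ratio, mass_ratio=1.0):
    """Base-to-apex CF ratio implied by resonance, f ~ sqrt(stiffness / mass)."""
    return math.sqrt(stiffness_ratio / mass_ratio)


# Base-to-apex stiffness ratios quoted in the text.
for ratio in (1e2, 1e3, 1e4):
    decades = math.log10(cf_ratio_from_stiffness_ratio(ratio))
    print(f"stiffness ratio {ratio:.0e} -> CF range of {decades:.1f} decades")
```

Under these assumptions a stiffness ratio of 100 supports only one decade of CF and a ratio of 10,000 supports two, which is why maps spanning more than two decades push models toward stiffness ranges of 3–5 orders of magnitude or more.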

Estimates of BM point stiffness at the base of the cochlea (134, 229, 242, 268) have varied over a wide range, exceeding an order of magnitude. The lowest estimates (e.g., 0.2 to 1.1 N/m; Ref. 134), derived from imposed 1–3 μm BM deflections, may reflect the low resistance to shear of the ground substance and mesothelial cells of the BM which, being relatively incompressible, may be irrelevant under more physiological conditions, when BM deflection is caused by pressure differences across the partition (229; see footnote 4). The stiffness estimates believed to be physiologically relevant, 2–5 N/m in excised cochlea (229) or 6–11 N/m in vivo (268), were obtained for BM displacements as large as 8 μm. One may question the validity of such estimates because they were derived from displacements that are orders of magnitude larger than the magnitudes of BM responses to acoustic stimuli [e.g., <10 nm for CF tones presented at 94 dB SPL (1 Pascal) in Fig. 5]. Nevertheless, estimates of volume compliance based on the point stiffness measurements (∼$3 \times 10^{-14}$ m$^4$/N) are similar to estimates based on independent determinations of BM displacements and intracochlear pressure under acoustic stimulation ($1.8$–$6.4 \times 10^{-14}$ m$^4$/N; Ref. 328).

The variation among stiffness values reported by different studies may partly reflect variations of stiffness as a function of radial position on the BM. Von Békésy's in vitro observations of the displacements of various structures of the cochlear partition in response to static forces suggested that the BM is responsible for most of the partition stiffness (391). He also found that, in response to either local forces or hydrostatic pressure, displacements of the BM were largest at central sites and tapered off smoothly toward its attachments to the spiral lamina and spiral ligament. In basic agreement with von Békésy's observations, in vitro measurements at the base of guinea pig cochleae showed that point stiffness varied relatively simply as a function of radial position, approximately as expected for a simple beam anchored at the spiral ligament and the spiral lamina: the BM was relatively compliant near the central region of the pectinate zone, and stiffness increased monotonically with increasing proximity to the spiral lamina or spiral ligament (229). Near the spiral lamina, at the zona arcuata, stiffness was several times higher than in the central region of the BM. A starkly contrasting result was obtained at the base of gerbil cochleae: stiffness was several times higher in the zona pectinata than in the zona arcuata (242, 268). The pattern of stiffness with radial position was also more complex in the gerbil: stiffness was ∼7 N/m in the central region of the pectinate zone, reached a maximum (∼11 N/m) under the outer pillar cells, and was lowest (1–2 N/m) at the arcuate zone (near the spiral lamina, under the tunnel of Corti). Such variations may imply that the cells of the organ of Corti contribute substantially to the overall partition stiffness. For example, the feet of the outer pillar cells could provide a fulcrum around which the pectinate and arcuate zones of the BM might pivot (see sect. ix D).

C.  Pressure Waves in the Cochlear Fluids

Today there is general agreement that the energy delivered to the cochlea by the stapes footplate is transported principally via pressure waves in the cochlear fluids (since the BM exhibits negligible longitudinal coupling; Ref. 387). Fluid pressure interacts with the flexible BM, generating coupled slow waves that travel from base to apex: a differential pressure wave that propagates in the cochlear fluids and a displacement wave that propagates on the BM. As the slow differential pressure and displacement waves approach the BM site with corresponding resonance frequency, their group velocity decreases and their energy is dissipated over a short distance just basal to that site (76, 211).

Pressure has been measured at basal regions of the cochleae of guinea pig (73, 217), cat (60, 245, 246), and Mongolian gerbil (266, 267). The earlier studies (in guinea pig and cat) reported differential pressure waves whose magnitudes and phases for stimulus frequencies well below CF closely resembled those of BM vibration or cochlear microphonics but did not find the large phase accumulations characteristic of slow traveling waves at frequencies equal to and higher than CF. These studies probably failed to detect slow differential pressure waves because they used microphones placed relatively far from the BM.

Theoretical considerations lead to the expectation that slow pressure waves in the cochlear fluids must be confined very near the BM at the characteristic place (211, 372). This expectation has been confirmed by elegant studies of E. Olson who observed, for the first time, pressure counterparts of the displacement traveling wave (266, 267). Using a miniature microphone of her own design and manufacture, Olson measured pressure in scala vestibuli and in scala tympani at the base of the gerbil cochlea. In scala tympani it was possible to measure pressure as a function of distance between the sensor and the BM (Fig. 15). When the microphone was relatively far from the BM, responses were small, poorly tuned, and exhibited minimal phase lags with increasing stimulus frequency. As the BM was approached within 100 μm, scala tympani pressures grew in magnitude, became better tuned, and exhibited large phase lags resembling those of BM vibrations (Fig. 7). Near the BM, the phase versus frequency curves (Fig. 15, B and D) had shallow slopes at low frequencies and steep slopes around CF. Consistent with the existence of phase plateaus in BM vibrations (see sect. ii A5 and Fig. 7) and with theoretical predictions (211), the pressure phase versus frequency curves also exhibited plateaus at frequencies higher than CF, in phase with or lagging scala vestibuli pressure by one (Fig. 15, B and D) (266) or two (267) periods.

Fig. 15.

Pressure magnitudes (A and C) and phases (B and D) near the BM in scala tympani of a gerbil cochlea. The distance from the BM (in μm) is indicated in the legends of B and D. A and B: in vivo data. C and D: data recorded immediately postmortem. The abscissas indicate stimulus frequency. Phases are expressed relative to pressure in scala vestibuli. Stimuli were 80-dB tones. [Replotted from Olson (266). Copyright, Acoustical Society of America, 1998.]

D.  Questioning the Existence of Cochlear Traveling Waves

Von Békésy (389, 390) maintained that his observations of BM motion implied both the existence of a traveling wave and the absence of resonance. His rejection of resonance was partly based on the observation that pressing a needle on the surface of the BM causes an almost circular deformation pattern, implying that longitudinal coupling is strong, thus preventing resonance. Later studies, however, found that the BM exhibits negligible longitudinal coupling (303,387) (see sect. vi B2). Von Békésy also argued against the existence of BM resonance on the basis of the behavior of hardware models of the cochlea, such as a bank of reeds of graded stiffness, which he believed distinguished unambiguously between traveling waves and resonance, depending on whether the resonators (reeds) were coupled or not. It turns out that the distinction between traveling waves and resonance in the reeds is less absolute than von Békésy thought (see Fig. 12.11a in Ref. 391). Even when the reeds are uncoupled, the bank of resonators sustains rudimentary traveling waves that propagate from “high-CF” to “low-CF” regions, albeit with symmetrical tuning and with an overall phase accumulation of only 180 degrees (399). When the reeds are coupled, reed vibrations become asymmetrically tuned and phase accumulations assume large values (>2 periods), much as in the cochlea (399).
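The 180-degree phase accumulation of an uncoupled bank can be illustrated with textbook driven oscillators. The sketch below treats each reed as an independent second-order resonator with a hypothetical quality factor and CF spacing; it is not the hardware model of Ref. 391 or the analysis of Ref. 399, only a demonstration that a bank of graded, uncoupled resonators driven at one frequency shows a phase lag that grows from the high-CF to the low-CF end without ever exceeding half a period.

```python
import cmath
import math

def reed_response(freq_hz, cf_hz, q=10.0):
    """Displacement response of a driven second-order resonator (unit DC gain)."""
    r = freq_hz / cf_hz
    return 1.0 / complex(1.0 - r * r, r / q)

def bank_phases_deg(freq_hz, cfs_hz, q=10.0):
    """Response phase (degrees) of each uncoupled reed for a single tone."""
    return [math.degrees(cmath.phase(reed_response(freq_hz, cf, q)))
            for cf in cfs_hz]

# Hypothetical bank: CFs from 16 kHz ("base") down to 62.5 Hz ("apex"),
# four reeds per octave, all driven by the same 1-kHz tone.
cfs = [16000.0 / 2 ** (i / 4) for i in range(33)]
phases = bank_phases_deg(1000.0, cfs)
total_accumulation = phases[0] - phases[-1]
print(round(total_accumulation, 1))  # approaches but never reaches 180
```

Because each resonator's phase is bounded between 0 and −180 degrees, the apparent wave in the uncoupled bank cannot accumulate the multiple periods of lag seen in the cochlea; in this picture, larger accumulations require coupling between the elements.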

After von Békésy's rejection of resonance theories of cochlear mechanics, the pioneering one-dimensional mathematical model proposed by Zwislocki envisioned the cochlea as a “pure” traveling-wave system, devoid of BM resonance (423). Later models, however, beginning with Peterson and Bogert's (280) one-dimensional one and others including two and three dimensions (3, 193, 220, 247, 372), have generally been designed to produce traveling waves that also exhibit resonance. According to Lighthill (211), all the more successful models, regardless of whether they simulate the cochlea using one, two, or three dimensions, describe a specific type of system, namely, “critical layer absorption” (16), in which traveling waves and resonance coexist. A dissenting opinion has been voiced by Dancer and colleagues (72, 217, 218), who dispute the existence of traveling waves.

Dancer and colleagues (72, 217, 218) have argued that, in the absence of “active” processes, BM vibrations do not exhibit the substantial accumulations of phase lag at CF that characterize traveling waves (e.g., Fig. 7). This argument disregards the fact that large phase accumulations (exceeding 2 and 3 periods, respectively) are evident in vivo in BM vibrations near the apex of guinea pig cochleae even after removal of most of the organ of Corti (49) or in vitro, in excised cochleae (see sect. vi B1), following exposure to glutaraldehyde or dinitrophenol (157). This argument also disregards the fact that death and other insults to cochlear function at the base of the cochlea typically affect phases minimally or produce significant phase lags (e.g., Fig. 20; sect. ix B4). Even in the exceptional cases [e.g., Fig. 19 (259); see also Ref. 326] in which substantial postmortem phase leads have been documented, phase lags were never reduced to values inconsistent with the existence of a traveling wave. In the guinea pig data of Figure 19, for example, postmortem phase lag at CF (exceeding 1.5 periods) is actually greater than the phase lag measured in vivo at the same cochlear location in another guinea pig (see Fig. 7, circles).

The arguments against the existence of traveling waves (217) were supported by the small phase accumulations found in early studies of pressure in the cochlear fluids (73). In fact, however, large phase accumulations were demonstrated at frequencies near, and especially above, CF (Fig. 15; see sect. v C) when scala tympani pressure measurements were carried out sufficiently close to the BM (266, 267). The pressure data of Figure 15 also refuted the idea that large phase lags are a product of active processes (73), since the large phase lags measured in vivo (i.e., about one period at the high-frequency plateau) were not altered postmortem (compare Fig. 15, B and D).

Another argument against the existence of cochlear traveling waves hinges on the distinction between filter delays and “signal-front” delays, both of which may contribute to the near-CF (or “weighted-average”) group delay (124, 270, 313). In linear systems, “frequency selectivity is obtained at the price of slow response time or, equivalently, long transmission delays” (124): the sharper the band-pass filtering, the greater the delays. It is interesting that a similar correlation also exists in the case of BM vibrations at any given cochlear site, even though these are nonlinear (117). As discussed in section ii, A3 and A4, both the sharpness of frequency filtering and the group delay of BM responses around CF decrease systematically as a function of increasing stimulus intensity (Figs. 5 and 8). In so-called “minimum-phase” linear systems, response phases are completely determined by the filtering characteristics so that the phase versus frequency function can be accurately computed on the basis of the magnitude versus frequency function (15, 377). Although it is not clear whether the concept of minimum phase is applicable to a nonlinear system such as the cochlea, one might argue that if BM responses have the minimum-phase property, then “signal-front” delays (or “travel time”) must be nil (all delays being solely the result of local filtering). It is a matter of dispute whether BM vibrations in normal cochleae display minimum-phase behavior (82). Some analyses support such behavior (196, 415), but others do not (83, 289).
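The linear-systems trade-off between sharpness of tuning and delay can be checked numerically for the simplest band-pass element. The sketch below uses a single second-order resonator, which is an assumption made purely for illustration and not a model of the BM; for such a resonator the group delay at the resonance frequency f0 equals Q/(π f0), so doubling the sharpness of tuning doubles the filter delay.

```python
import cmath
import math

def resonator(freq_hz, f0_hz, q):
    """Frequency response of a second-order resonator with quality factor q."""
    r = freq_hz / f0_hz
    return 1.0 / complex(1.0 - r * r, r / q)

def group_delay_s(freq_hz, f0_hz, q, df_hz=1.0):
    """Group delay, -d(phase)/d(omega), estimated by a central difference."""
    phase_hi = cmath.phase(resonator(freq_hz + df_hz, f0_hz, q))
    phase_lo = cmath.phase(resonator(freq_hz - df_hz, f0_hz, q))
    return -(phase_hi - phase_lo) / (2 * math.pi * 2 * df_hz)

f0 = 10000.0  # hypothetical resonance frequency in Hz
for q in (5.0, 10.0):
    numeric = group_delay_s(f0, f0, q)
    analytic = q / (math.pi * f0)
    print(q, numeric, analytic)
```

In a minimum-phase system of this kind, all of the delay is filter delay; any additional “signal-front” delay would appear as an excess over the value predicted from the magnitude response.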


To a great extent, the focus of cochlear function is to direct mechanical stimulation to the stereocilia of the inner hair cells. Therefore, a reasonably complete understanding of the mechanics of the cochlea must explain how the vibrations of the cellular and membranous components of the cochlear partition result in deflections of the inner hair cell stereocilia. Thus it is of immense interest to investigate the “micromechanics” of the cochlea, i.e., how various sites of the organ of Corti, the BM, and the TM move in relation to each other, preferably in individual ears. Advances in technology are just now beginning to permit such measurements.

A.  In Vivo Micromechanics

At the base of the cochlea, vibrations at central sites of the BM are much larger than those at sites near the spiral lamina or the spiral ligament, which support its central and peripheral edges (46, 48, 250, 293). This radial profile of vibration magnitudes reflects the compliance of the BM and the relative rigidity of the spiral lamina or the spiral ligament. At the cochlear apex, TM and reticular lamina vibrations are also largest near the center of the cochlear partition (50, 178). Accordingly, the cochlear partition has been often modeled as a series of independent slivers, each the analog of a beam anchored at its ends, yielding a relatively simple radial profile of vibration with a single maximum near the center of the partition (211, 229, 372). Other models, however, propose that the cochlear partition sustains multiple modes of vibrations (150, 192-195, 220, 233), each of them propagating independently, and that sites along a radial sliver of the partition do not all vibrate in phase. This is not unreasonable given the complex architecture of the organ of Corti.

Although electrical stimulation seems to elicit multiple vibration modes in the cochlear partition both in vitro and in vivo (sect. ix D), comparable evidence for BM responses to sound is meager. Such evidence includes the following. 1) BM motion may be more sharply tuned near the spiral lamina than at more central BM sites (357, 360). 2) At the base of the cochlea, notches have been reported in the Fourier spectra of responses to clicks (290) and in isointensity versus frequency functions for responses to tones (300). 3) At apical cochlear sites, notches are also common features of isointensity versus frequency functions at frequencies near CF (49, 50, 52, 132, 135). These notches are often associated with abrupt phase shifts of ∼180 degrees, possibly resulting from cancellation between two waves. The notches at apical sites are thought to be an artifact of opening the otic capsule, since they are abolished upon restoration of the normal hydraulic seal of the cochlea (52). It is unclear whether the notches observed at basal BM sites are also artifactual (290, 300).

Convincing evidence for or against the existence of multiple modes in BM responses to sound could potentially be obtained using sensitive methodology that does not require the placement of reflecting beads on the BM and allows laser recordings from many closely spaced locations in individual cochleae. Using such methodology, one study found a bimodal vibration profile, with amplitude maxima flanking a minimum at the foot of the outer pillar cells and response phases exhibiting large phase shifts (90–180 degrees) as a function of radial position (250). In striking contrast, however, another study using similar methodology found that BM responses closely resembled those predicted by the beam models: phases varied little with radial position and magnitudes exhibited a single maximum located between the midline of the BM and the feet of the outer pillar cells (46). A third study that used standard methods, recording from multiple beads strewn on the BM, did not find large phase variations or bimodal magnitude profiles as a function of radial position (300). In conclusion, we are left with the unsatisfactory situation that although the complex architecture of the organ of Corti and its associated membranes seems well suited to generate multiple modes of BM motion, and that these are actually present in electrically evoked motion, clear evidence for multiple modes in BM responses to sounds is still lacking.

The evidence for the existence of multiple vibration modes is more convincing for the stage of signal transformation that lies between BM vibration and stimulation of the stereocilia of inner hair cells. The strongest evidence comes from recordings of the responses to low-frequency tones of auditory nerve fibers in cats, chinchillas, and guinea pigs. At stimulus intensities near 90 dB, the responses of auditory nerve fibers undergo large and abrupt phase shifts, approaching 180 degrees, often associated with prominent harmonic-like distortion and sometimes coinciding with notches in rate-intensity functions (181, 182, 329, 356). Counterparts of these phenomena apparently also exist in the receptor potentials of inner hair cells (38, 233). A simple interpretation of these neural and hair cell data is that they reveal a summation of two modes of inner hair cell stimulation, reflecting two different paths of signal transmission from their ultimate origin at the BM. If the two modes grow at different rates (e.g., one linearly, the other compressively), one mode would dominate neural responses below a given intensity (e.g., 90 dB) and the other would control responses at higher intensities. The occasionally observed notches in the rate-intensity functions suggest a mutual cancellation between the two modes. Glimpses of these two modes, and perhaps of as many as two or three others, may have been detected in auditory nerve fiber responses to clicks (212). Finally, some comparisons of frequency tuning in auditory nerve fibers and at the BM hint at a “second filter” that attenuates neural responses to stimuli with frequency both lower and higher than CF (see sect. viii C).
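The two-mode interpretation can be sketched as the sum of two components of opposite polarity that grow at different rates. In the toy example below, the growth rates (0.3 and 1 dB/dB), the exact antiphase relation, and the 90-dB crossover are all assumed values chosen for illustration; none of them is a measured quantity.

```python
import cmath
import math

def two_mode_sum(level_db, cross_db=90.0, slope_a=0.3, slope_b=1.0):
    """Sum of two antiphase components with equal magnitude at cross_db.

    Component A grows compressively (slope_a dB/dB); component B grows
    linearly (slope_b dB/dB) and has opposite polarity.
    """
    amp_a = 10 ** (slope_a * (level_db - cross_db) / 20)
    amp_b = 10 ** (slope_b * (level_db - cross_db) / 20)
    return amp_a - amp_b

def response_phase_deg(level_db):
    """Phase of the summed response: 0 when A dominates, 180 when B does."""
    return math.degrees(cmath.phase(complex(two_mode_sum(level_db), 0.0)))

for level in (70.0, 80.0, 90.0, 100.0):
    print(level, abs(two_mode_sum(level)), response_phase_deg(level))
```

With these assumptions the summed magnitude passes through a notch at the crossover level and the phase jumps by 180 degrees across it, qualitatively like the abrupt phase shifts and rate-intensity notches described for auditory nerve fibers.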

B.  Responses to Sound in Excised Cochleae

Mechanical responses to sound at apical cochlear sites have been measured postmortem in isolated guinea pig temporal bones (131, 141, 157, 231). These in vitro preparations are useful because they permit relatively unobstructed access to many structures of the organ of Corti (380) and therefore can provide crucial information on the micromechanics of the cochlea which are difficult to obtain from in vivo recordings. However, separation of the cochlea from its blood supply renders it severely anoxic, rapidly and drastically reducing the endocochlear potential and eventually killing the cells of the organ of Corti.

1.  Vibrations in the “ITER” temporal bone preparation

Many of the in vitro measurements have been performed in the ITER preparation, an excised guinea pig temporal bone in which the middle ear is filled with fluid (22, 24, 157, 172, 176, 381). Measurements in the ITER preparation have been usually carried out with a laser velocimeter sufficiently sensitive to detect vibrations of relatively transparent tissue without using reflecting targets. Thus it was unnecessary to disturb Reissner's membrane to introduce reflectors into scala media. When the immersion fluid was oxygenated, the excised temporal bone retained residual endocochlear and microphonic potentials (24, 107, 382).

Studies of cochlear motion using the ITER preparation have generally reported vibrations that grow linearly with stimulus level (22, 24, 157, 172, 176, 381).5 Frequency tuning at the reticular lamina (Hensen's cells, outer hair cells) in the fourth turn of this preparation resembles the tuning measured at similar locations in living guinea pigs (49, 135, 178). However, at this location in the temporal bone preparation, the vibrations of the BM are untuned and several hundred times smaller than those of Hensen's cells or outer hair cells (157, 176). This surprising result, whose validity has been recently reasserted (178, 379), has been interpreted as evidence that frequency tuning is an intrinsic property of outer hair cells and that it is the outer hair cells that drive the motion of the BM, rather than the other way around (157). In clear contradiction, however, Cooper and Rhode (49) showed that the vibrations of the BM at the apex of in vivo cochleae were only minimally altered after peeling off the entire organ of Corti. It is difficult to account for the ITER results. We wonder whether the “BM” measurements were actually obtained from the cochlear bony shell, due to poor reflectivity of Claudius cells and the BM, which may result in underestimating by 10–40 dB the BM vibration amplitude (140).

The magnitude of the mechanical responses to sound in the ITER preparation also remains uncertain. Reported sensitivities at CF have ranged from ∼0.06 μm/Pa (157) to ∼70 μm/Pa (24). These extreme values lie well outside the range of in vivo measurements at the same site of guinea pig cochleae (Table 3). The lowest values may reflect severe damage caused by excision of the temporal bone. The largest values may be unduly inflated by a 35-dB correction intended to compensate for losses in stimulus pressure due to the middle ear being immersed in fluid (22-24).6

Responses to sound in the ITER preparation include DC components with magnitudes and sharpness of tuning much greater than those of the AC responses at the frequency of stimulation (22-24). The sensitivity of the DC component, as high as 200–1,800 μm/Pa, also greatly exceeds the AC and DC responses measured in vivo at the apex of the chinchilla cochlea (299) (see Table 3 and sect. iii A). It is questionable whether such enormous DC responses actually exist at the apex of normal cochleae. Rather, it is likely that they represent instrumental or other artifacts (as suggested in Refs. 131, 140, 231).

2.  Vibrations in other excised cochlear preparations

DC displacements comparable to those described in the ITER preparation (22-24) were sought, but not found, at the apex of another isolated guinea pig temporal-bone preparation (131, 140), which resembles the one used by the ITER group but with air-filled middle ear (see also Ref. 25). Using this preparation, two studies measured prominent radial motion of the TM in responses to either acoustic (141) or electrical stimulation (132). However, radial TM motion, which would be consistent with conjectures that interactions between the TM and the stereocilia produce a cochlear resonance additional to that of the BM (4, 425, 426), was not found in the ITER preparation (383).

Yet another type of in vitro preparation, the “hemicochlea,” is obtained by cutting the cochlea along the mid-modiolar plane (303). The hemicochlea preparation, in which fluid pressures are short-circuited, affords the opportunity to study BM vibrations largely in isolation from normal hydrodynamics, and therefore almost solely determined by the intrinsic mechanical properties of the BM and the organ of Corti. In this abnormal situation, stimulation with a tiny paddle generates waves that propagate unidirectionally on the BM, from base to apex, and are tonotopically organized (303). These traveling waves, however, are rudimentary in that they propagate over distances much shorter than in cochleae in which fluid pressures are not short-circuited. This observation is consistent with the idea that the energy of the traveling wave is largely conducted by the cochlear fluids, rather than via the tissues of the cochlear partition.


In nonlinear systems, responses to tone pairs often include interactions between the tones, such as suppression and intermodulation distortion. In the auditory system, one of these nonlinear phenomena, intermodulation distortion, has been known for a long time to shape the perception of sounds in humans (139). Psychophysical studies of intermodulation distortion led to a recognition of the existence of BM nonlinearities well before these were demonstrated in physiological experiments, and even to predictions of the existence of a feedback from the organ of Corti to BM vibrations (122, 366).

A.  Two-Tone Suppression

1.  Reduction of response magnitude

Two-tone suppression consists of the reduction of the response to one tone by the simultaneous presence of another. Figure 16, top, illustrates suppression of BM responses to a CF (18.8 kHz) tone by a suppressor tone at various intensities in the base of the guinea pig cochlea. The abscissa indicates the level of the CF tone, and the parameter indicates suppressor levels. In the absence of a suppressor (or for low suppressor levels, such as in Fig. 16) the responses to the CF tone grow at compressive rates. At higher suppressor levels, the responses to low-level CF tones are reduced strongly, but only weakly at high levels. As a result, the BM input-output curve for the CF tones is substantially linearized in the presence of moderately intense suppressor tones.7

Fig. 16.

Two-tone suppression at the BM. Top: velocity-intensity functions for BM responses to a near-CF tone (18.8 kHz) presented simultaneously with a 22.9-kHz suppressor tone. The parameter is the intensity of the suppressor tone. [Guinea pig data from Nuttall and Dolan (257).] Bottom: frequency specificity of two-tone suppression. Iso-velocity contours (100 μm/s) are shown for responses to single tones (open circles) and for responses to the same tones in the presence of 500-Hz or 12-kHz suppressors (solid circles and squares, respectively) presented at 70 dB SPL. [Chinchilla data from Ruggero et al. (332).]

Two-tone suppression is CF specific with regard to both the probe tone and the suppressor tone. Figure 16, bottom, illustrates the dependence of suppression strength on the frequency of the probe tone. The curves indicate isoresponse contours (100 μm/s) for probe tones presented alone (open symbols) or together with 500-Hz or 12-kHz suppressors (332). In the presence of either suppressor tone, higher stimulus levels are required to reach a criterion response. The magnitude of suppression is maximal at CF and diminishes as the frequency of the probe tones departs from CF. Suppression is also CF specific in that, with a fixed probe tone at CF, suppression thresholds vary much in the same manner as the sensitivity of BM responses to single tones, i.e., suppression thresholds are lowest for suppressor frequencies close to CF (41, 51, 298). However, the frequency tuning of suppression is broader than the tuning for single tones, and the thresholds are quite different on either side of CF. For suppressor frequencies just higher than CF, suppression thresholds are low so that the overall response to a suppressor paired with a CF tone can be lower than for the CF tone presented alone (41, 332). For suppressor frequencies lower than CF, suppression thresholds are relatively high and proportional to BM displacement (see sect. vii A3) so that suppression of a CF tone occurs only when the overall displacement response to the paired tones is larger than for the CF tone in isolation (41, 116).

In general, the main features of two-tone suppression in auditory nerve fibers have counterparts (and probably originate) in BM vibrations (Table 5; for reviews, see Refs. 41, 332). One question still being debated is whether BM suppression can fully account for neural two-tone rate suppression by low-frequency suppressors (28, 142, 376). In particular, it was suggested that a suppressive mechanism must exist at the synapses of inner hair cells with primary afferents with low spontaneous activity (28) to account for their suppression thresholds, which correspond approximately to a constant BM displacement (28, 116, 376), but can be substantially lower than the excitatory thresholds for the suppressor tones (e.g., Refs. 28, 103, 376). The suggestion was based on the assumption that neural excitation thresholds correspond to a constant BM displacement at all stimulus frequencies higher than 600 Hz (28). In our view, this assumption is incorrect, since comparisons of neural thresholds and BM vibrations in the same cochleae show that, at the base of the cochlea, neural excitation is elicited not by a constant magnitude of BM displacement, but rather by a combination of displacement and velocity (see sect. viii C) (244, 328). As a result, excitatory thresholds are increasingly elevated relative to suppression thresholds as frequency decreases. This “high-pass filtering” is also evident in the receptor potentials of inner hair cells, with isosuppression thresholds much lower than the isodepolarization thresholds for single tones (358). Therefore, it seems unnecessary to postulate the existence of a synaptic suppressive mechanism. Rather, there is a need to understand what processes introduce “high-pass filtering” in neural responses to single tones.

Table 5.

Comparison of two-tone suppression at the basilar membrane and two-tone rate suppression in auditory nerve fibers

At the apex of the chinchilla cochlea, the effects of suppression on BM response magnitude are similar in most respects to those demonstrable at the base. However, apical and basal locations differ in that at the apex the rates of growth of suppression are smaller and the frequency tuning of suppression thresholds is considerably broader (51). These two differences are consistent with the fact that BM responses to single tones at the apex of the cochlea are more linear and less sharply tuned than at the base (see sect. iv B).

2.  Phase changes associated with suppression

Suppression of BM responses is accompanied by phase changes both at basal and apical regions of the cochlea. These are summarized in Table 6 as a function of the frequencies of the suppressor and probe tones. At basal regions of the cochlea, suppression produces phase lags and leads, respectively, in probe tones with frequency lower and higher than CF (41, 332). Such phase alterations are similar to those produced by increments of the intensity of single-tone stimuli (Fig. 8; see sect. ii A4). Similar suppression effects have been noted in recordings of inner hair cell receptor potentials for probe levels generating responses below saturation (30, 31).

Table 6.

Phase effects of two-tone suppression

The polarity of phase shifts induced by suppression on BM responses to CF probes has not been established unambiguously (Table 6). There are reports of phase leads (41, 298), phase lags (332), or a dependence of polarity on suppressor frequency (51, 257). It is not clear why findings are so disparate. However, given that the polarity of suppression-induced phase shifts reverses at or near CF, the inconsistent results may reflect errors in the determination of CF. In auditory nerve fibers, recordings show mostly phase lags associated with rate suppression for CF probes and above-CF, nonexcitatory suppressors (87). Dissimilar phase effects of suppression have also been found in inner hair cell receptor potentials, depending on cochlear location, probe intensity, and frequency of the suppressor (30, 31, 257, 258).

3.  Modulation of responses to CF tones by low-frequency tones

When BM responses to near-CF tones are suppressed by low-frequency tones, the suppression effect waxes and wanes once or twice within each period of the suppressor tone, depending on stimulus intensity (41, 116, 277, 298, 332). At basal sites of the cochlea, suppression is synchronous with BM displacements toward either scala tympani or scala vestibuli, with maximal suppression coinciding with BM displacement toward scala tympani (41, 116, 277, 298).8 At the apex of the chinchilla cochlea, low-frequency suppressor tones (100–200 Hz) also produce phasic reductions of the vibrational responses to near-CF tones, with a periodicity corresponding to the frequency of the suppressor. As is the case at the base of the cochlea, one or two suppression maxima occur during each period of the suppressor tone. However, in contrast to modulation at the cochlear base, suppression maxima at the apex apparently do not show a clear correspondence to any particular phase of BM vibration (51). This observation is also at odds with results for auditory nerve fibers in chinchilla (376) as well as inner hair cells in guinea pig (e.g., Refs. 32, 274, 334), in which maximal suppression at near-threshold levels typically coincides with BM displacement toward scala tympani, regardless of CF.

B.  Intermodulation Distortion

When two or more tones are presented simultaneously, humans can hear additional tones that are not present in the acoustic stimulus. For two-tone stimuli these additional distortion products have pitches corresponding to combinations of the primary frequencies (f1 and f2, f2 > f1), such as f2-f1, 2f1-f2, and 2f2-f1. Psychophysical experiments showed that 2f1-f2 distortion components have magnitudes that are highly dependent on stimulus-frequency separation. Even before the discovery of BM nonlinearities, this frequency dependence suggested that distortion products originate in the mechanics of the cochlea (122, 366). Early attempts failed to find distortion products in BM vibrations (295, 401), but their presence was subsequently demonstrated (52, 260, 298, 310-312). BM responses to two-tone stimuli with close primary frequencies (f1, f2), such as those shown in Figure 17, top, contain several distortion products at frequencies both higher and lower than the frequencies of the primary tones (such as 3f2-2f1, 2f2-f1, 2f1-f2, 3f1-2f2, and f2-f1). As the frequencies of the primaries are increasingly separated, the number of detectable distortion products in the response decreases.
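The distortion-product frequencies named above are simple integer combinations of the primaries, and the Figure 17 paradigm of choosing primaries so that 2f1-f2 falls at CF amounts to solving two linear equations. The sketch below illustrates both; the function names and the example frequencies are hypothetical.

```python
def distortion_products(f1_hz, f2_hz):
    """Frequencies of the low-order distortion products for primaries f1 < f2."""
    if not f2_hz > f1_hz > 0:
        raise ValueError("primaries must satisfy f2 > f1 > 0")
    return {
        "f2-f1": f2_hz - f1_hz,
        "2f1-f2": 2 * f1_hz - f2_hz,    # cubic difference tone
        "2f2-f1": 2 * f2_hz - f1_hz,
        "3f1-2f2": 3 * f1_hz - 2 * f2_hz,
        "3f2-2f1": 3 * f2_hz - 2 * f1_hz,
    }

def primaries_for_cubic_difference_tone(target_hz, ratio):
    """Primaries with f2/f1 = ratio whose 2f1-f2 product equals target_hz."""
    if not 1.0 < ratio < 2.0:
        raise ValueError("ratio must lie between 1 and 2")
    f1_hz = target_hz / (2.0 - ratio)
    return f1_hz, ratio * f1_hz

print(distortion_products(1000, 1200))
print(primaries_for_cubic_difference_tone(7500.0, 1.1))
```

For closely spaced primaries the products 2f1-f2 and 3f1-2f2 fall just below f1, whereas 2f2-f1 and 3f2-2f1 fall just above f2, which is why the response spectrum shows components on both sides of the primaries.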

Fig. 17.

Cubic difference tones in BM vibrations. Top panel: spectrum of responses from a basal BM site to a pair of tones with frequencies slightly higher than CF. The primary tones, each presented at 50 dB SPL, were chosen so that 2f1-f2 = CF (7.5 kHz). [Chinchilla data replotted from Robles et al. (312).] Bottom panels: BM magnitude of the 2f1-f2 distortion product as a function of stimulus frequency ratio f2/f1, chosen so that 2f1-f2 = CF. Stimulus SPLs are indicated by the circled numbers (e.g., “8” signifies “80 dB SPL”). Left: from an apical site in chinchilla. Right: from a basal site in guinea pig. [From Cooper and Rhode (52).]

1.  Cubic difference tone (2f1-f2)

Stimulation with pairs of tones elicits robust cubic difference tones (2f1-f2) at basal sites of the BM in chinchilla, cat, and guinea pig (52, 260, 298, 310-312). Cubic difference tones reach levels as high as −16 dB (i.e., 15%) to −20 dB relative to the level of the primaries. These relative levels are comparable to those obtained at later stages of cochlear processing: −27 dB in guinea pig inner hair cells (254) and −20 dB in cat auditory nerve fibers (26, 125). They are also comparable to the −15 to −22 dB relative levels estimated for the cubic difference tone in psychophysical experiments with human subjects (122, 366). Several other characteristics of the 2f1-f2 distortion components recorded at the BM in animals also resemble those of distortion components measured psychophysically in humans. 1) For equal-level primaries, distortion product magnitudes grow at linear or faster-than-linear rates at low intensities and saturate and even decrease slightly at higher stimulus intensities (e.g., at 60 dB SPL or higher). 2) Distortion-product relative levels are highest at low stimulus intensities and decrease little over wide ranges of stimulus intensity. 3) For a fixed level of one of the primary tones, the distortion product magnitude is a nonmonotonic function of the level of the other primary tone. 4) For moderate f2/f1 ratios (e.g., >1.2), distortion product magnitudes decrease rapidly with increasing frequency ratio f2/f1, reaching rates on the order of −200 to −300 dB/octave at the basal cochlea (52, 326) and in psychophysical data (122).
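The relative levels quoted above can be converted between decibels and amplitude ratios with the sketch below. It assumes the usual amplitude convention (20 log10 of the ratio), which is consistent with the text's equating −16 dB with roughly 15%; the function names are illustrative.

```python
import math


def db_to_amplitude_ratio(level_db):
    """Amplitude ratio for a relative level in dB (20*log10 convention)."""
    return 10.0 ** (level_db / 20.0)


def amplitude_ratio_to_db(ratio):
    """Relative level in dB for an amplitude ratio."""
    return 20.0 * math.log10(ratio)


# Relative levels of the cubic difference tone quoted in the text.
bm_best = db_to_amplitude_ratio(-16.0)         # about 0.16 of the primaries
bm_typical = db_to_amplitude_ratio(-20.0)      # exactly 0.10
inner_hair_cell = db_to_amplitude_ratio(-27.0) # about 0.045
```

On this convention each 20-dB step corresponds to a factor of 10 in amplitude, so the 7-dB gap between −20 and −27 dB corresponds to a little more than a factor of 2.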

As already indicated for two-tone suppression (see sect. vii A1), two-tone distortion is CF specific in that the magnitude and phase of distortion products on the BM depend strongly on the frequency separation between the primary tones. In agreement with predictions based on early psychophysical experiments (122, 419), the magnitude of the cubic difference tone on the BM decays with increasing frequency ratio, f2/f1 (Fig. 17, bottom). However, there is some uncertainty regarding the dependence of distortion product magnitudes on frequency ratio for ratios close to one. At the base of the guinea pig cochlea (Fig. 17, bottom right) distortion product magnitudes are relatively small at ratios close to one, increase for ratios in the range 1.06–1.25, reach a maximum at f2/f1 ∼1.2–1.3, and only then begin a steep final decay for higher ratios (52). In contrast, at the base of the chinchilla cochlea distortion product magnitudes decrease monotonically from a maximum at a frequency ratio near one (312), at low rates for low-frequency ratios, and at steep rates for higher ratios (much as in Fig. 17, bottom left). This monotonic pattern in chinchilla closely resembles psychophysical data for humans (122, 366, 418), whereas the broadly tuned (nonmonotonic) pattern in guinea pig resembles the behavior of distortion product otoacoustic emissions (285). It has been argued that the decrease in the magnitude of the distortion-product otoacoustic emission as the ratio of the primary frequencies approaches unity reflects a micromechanical “second filter” within the cochlea (5, 19). However, other explanations are possible: nonmonotonic patterns of distortion product otoacoustic emissions could arise from monotonic patterns at the BM (228) or, alternatively, the nonmonotonic pattern might arise at the BM (Fig. 17, bottom right) as a result of mutual suppression between the primary tones as well as suppression of the distortion product (52, 170).

At the apex of the cochlea, 2f1-f2 distortion product magnitude begins to decrease sharply at higher frequency ratios (Fig. 17, bottom left) than at the base of the cochlea, but the rate of decrease is lower than at the base (52). These properties are consistent with the fact that BM vibrations are more broadly tuned at the apex than at the base, given that the magnitude of the distortion product components must be determined by the extent of interaction between the BM vibration patterns of the primary tones.

2.  Quadratic difference tone (f2-f1)

In contrast to the cubic difference tone, the quadratic difference tone f2-f1 can only be perceived at high stimulus levels and does not exhibit a sharp dependence on the frequency ratio of the primary tones (122, 417). Nevertheless, recordings of cochlear microphonics (120, 256) and from auditory nerve fibers (188, 364) and cochlear nucleus neurons (367) suggested that quadratic difference tones propagate mechanically on the BM, much as the 2f1-f2 distortion product. Indeed, at the apex of the cochlea mechanical responses to two-tone stimuli contain prominent difference tone components, with levels comparable to those of cubic difference tones, about −23 dB relative to the primaries (52).

Quadratic difference tones have not been detected in BM vibrations at the base of the cochlea (52, 256, 295, 312). Although it is possible that at basal regions of the cochlea the quadratic difference tone exists in the organ of Corti only as a local response that somehow does not reach the BM (258), we think it more likely that its absence from basal BM vibrations results from unfavorable conditions of stimulation. Because setting f2-f1 = CF would require values of f2 and f1 higher than the upper frequency limit of the cochlea, the presence of difference tones at basal sites has only been tested for f2-f1 << CF, for which BM vibrations are insensitive.


A.  BM Vibration at Neural Threshold

The magnitude of BM vibration at threshold is an issue in which past controversy is being gradually replaced by consensus. Linear extrapolation from von Békésy's apical data for humans leads to the rather improbable conclusion that the BM moves by ∼0.1 pm (10^−13 m) at hearing threshold for tones between 300 and 1,500 Hz (391). Eventually, the discovery of a labile compressive nonlinearity (293) implied that BM displacements at threshold had to be much larger in vivo. Table 1 surveys the BM vibration magnitude corresponding to neural threshold at CF at basal cochlear sites in several species. The BM vibration magnitudes at the base of the guinea pig and chinchilla cochleae may be accepted with confidence because consistent results have been repeatedly obtained in essentially intact preparations, as judged by compound action potential thresholds. However, even when the mechanical data are collected in pristine cochleae, the determination of BM vibration magnitude at neural threshold is complicated by the substantial variability of single-fiber thresholds across different cochleae as well as their dependence on spontaneous activity (e.g., Refs. 207, 210). For the 17- to 18-kHz site of the guinea pig cochlea, neural threshold at CF corresponds to BM vibration magnitudes of 34–40 μm/s or 0.3–0.35 nm (using BM and auditory nerve fiber data recorded from two different cochleae). For the 9- to 10-kHz site of chinchilla, neural threshold (∼10 dB SPL for fibers with high spontaneous activity, Ref. 326) corresponds to BM vibration magnitudes of 50–100 μm/s or 1–2 nm when using averaged neural data for the comparison (307, 326). A comparison of BM and auditory nerve fiber data recorded in the same cochleae yielded values of 164 μm/s or 2.7 nm in the case of a fiber with medium spontaneous activity and 15 μm/s or 0.26 nm for a fiber with high spontaneous activity (244). At the 30- to 33-kHz site of the cat cochlea, the reported BM vibration magnitude of ∼1 nm (or 200 μm/s) at threshold (47) is less secure than in guinea pig or chinchilla, since compound action potential thresholds were not used as controls.
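The paired velocity and displacement values quoted above are related through the stimulus frequency. A minimal sketch, assuming steady-state sinusoidal motion so that peak velocity equals 2πf times peak displacement; the function names are illustrative, and the 9.5-kHz and 17.5-kHz frequencies are taken as representative of the chinchilla and guinea pig sites described in the text.

```python
import math


def velocity_um_per_s(displacement_nm, freq_hz):
    """Peak velocity (um/s) of a sinusoid with the given peak displacement."""
    nm_per_s = 2.0 * math.pi * freq_hz * displacement_nm
    return nm_per_s * 1e-3  # nm/s -> um/s


def displacement_nm(velocity_um_s, freq_hz):
    """Peak displacement (nm) of a sinusoid with the given peak velocity."""
    return velocity_um_s * 1e3 / (2.0 * math.pi * freq_hz)


# Chinchilla, CF assumed 9.5 kHz: 2.7 nm and 0.26 nm at neural threshold.
v_medium_sr = velocity_um_per_s(2.7, 9500.0)   # close to the quoted 164 um/s
v_high_sr = velocity_um_per_s(0.26, 9500.0)    # close to the quoted 15 um/s

# Guinea pig, CF assumed 17.5 kHz: 0.3 nm at neural threshold.
v_guinea_pig = velocity_um_per_s(0.3, 17500.0) # close to the quoted 34 um/s
```

The small differences from the quoted figures are what one would expect from rounding and from the exact frequency used in each study.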

It has been suggested that a counterpart of the transition between the linear and compressive regions of BM input-output functions (Fig. 3) can be demonstrated in the responses of auditory nerve fibers to CF tones (342, 411). Since at the base of the cochlea BM vibrations grow linearly in response to tones with frequency well below CF, “BM input-output functions” for responses to CF tones at basal cochlear sites can in principle be derived from rate-intensity functions for auditory nerve fibers (on the assumption that spike rates are determined solely by BM displacement) by plotting the SPLs required to elicit the same response rate for stimuli at CF and at a frequency sufficiently lower than CF (411). Such derived BM input-output functions indicate that at the site of the guinea pig cochlea at a distance of 3–4 mm from the oval window (with CF ∼17–18 kHz) the “compression threshold” is ∼30–60 dB SPL (53, 241, 411) or ∼30 dB higher than the CF threshold of the most sensitive fibers in the same cochlear region. However, the derived values substantially exceed (by 20–30 dB) the compression thresholds directly measured at the BM (10–20 dB SPL; see Fig. 3), which approximately coincide with the rate threshold of auditory nerve fibers (e.g., Fig. 18; Refs. 317, 326, see also Ref. 259). This discrepancy may imply that the derivation of the compression threshold from auditory nerve fiber data is based on inaccurate assumptions.
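The iso-rate derivation described above can be sketched on synthetic data. Everything in the example is hypothetical: the BM response functions, the spike-rate function, and all parameter values (a 30-dB knee, a compressive slope of 0.3 dB/dB, a 40-dB sensitivity difference between CF and the low-frequency tone) are invented for illustration and are not fitted to any of the cited recordings. The sketch also adopts the assumption named in the text, that spike rate depends only on BM response magnitude.

```python
def bm_db_at_cf(spl, knee=30.0, slope=0.3):
    """Hypothetical BM response (dB, arbitrary reference) to a CF tone:
    linear below the knee, compressive above it."""
    return spl if spl <= knee else knee + slope * (spl - knee)


def bm_db_low_freq(spl, offset=-40.0):
    """Hypothetical linear BM response to a tone well below CF."""
    return spl + offset


def rate(bm_db):
    """Hypothetical monotonic spike rate (spikes/s) versus BM response."""
    return 200.0 / (1.0 + 10.0 ** (-(bm_db - 20.0) / 20.0))


def invert_increasing(fn, target, lo, hi, iterations=60):
    """Bisection: find x in [lo, hi] with fn(x) == target, fn increasing."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if fn(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def derived_io(cf_levels):
    """For each CF-tone SPL, find the low-frequency SPL giving the same rate.
    Because the low-frequency response is linear, that SPL tracks the BM
    response to the CF tone up to a constant offset."""
    pairs = []
    for spl_cf in cf_levels:
        target = rate(bm_db_at_cf(spl_cf))
        spl_lf = invert_increasing(
            lambda s: rate(bm_db_low_freq(s)), target, -50.0, 200.0)
        pairs.append((spl_cf, spl_lf))
    return pairs


io = derived_io([10.0, 20.0, 50.0, 70.0])
slope_low = (io[1][1] - io[0][1]) / (io[1][0] - io[0][0])
slope_high = (io[3][1] - io[2][1]) / (io[3][0] - io[2][0])
```

In this toy model the derived function recovers the assumed growth rates (about 1 dB/dB below the knee and 0.3 dB/dB above it) only because the rate function is identical for both tones; if that assumption fails, the derived compression threshold can be displaced, which is one way to read the discrepancy noted in the text.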

Fig. 18.

Frequency tuning in cochlear vibrations and in auditory-nerve fibers in chinchilla. Curves at right: a frequency-threshold tuning curve for an auditory-nerve fiber (solid line) is compared with isoresponse curves for a BM site with identical CF (9.5 kHz) recorded in the same ear (244). At the fiber's CF threshold (13 dB SPL), BM vibrations had a peak displacement of 2.7 nm or, equivalently, 164 μm/s. These values were used to plot BM isodisplacement and isovelocity tuning curves (dotted and dashed lines, respectively). [A better mechanical/neural match was obtained by high-pass filtering the BM displacement curve at a rate of 3.8 dB/octave (not shown).] Curves at left: isodisplacement and isovelocity tuning curves (1-nm and 2.5-μm/s, respectively; dotted and dashed lines) for TM vibrations recorded at an apical cochlear site (299) are compared with a neural tuning curve (solid line), the average of recordings from many auditory-nerve fibers (A. N. Temchin, N. C. Rich, and M. A. Ruggero, unpublished observations).

At the apex of the chinchilla cochlea (CF = 400–800 Hz), TM displacements and velocities at neural CF threshold (∼15–23 dB SPL) (68) in the most sensitive cochleae (Figs. 11 and 18) amount to 1–3 nm and 2–7 μm/s, respectively (52, 299). At the apex of the guinea pig cochlea (CF = 350 Hz), the most sensitive responses reach 38 nm at 60 dB SPL in cochleae exhibiting an expansive nonlinearity (413). Average thresholds of guinea pig auditory nerve fibers with CF ∼300–400 Hz have been variously estimated as 28 dB SPL (386) [seemingly with the bulla open, which augments middle ear transmission by ∼10 dB at low frequencies (49)] and 50 (49) or 59 dB SPL (99) with the bulla closed. Thus vibrations near the apex of the guinea pig cochlea correspond to 1–11 nm at the CF threshold of auditory nerve fibers. Because the displacements at neural threshold at apical sites of chinchilla and guinea pig cochleae are similar to the displacement values at the base, one might conclude that neural thresholds correspond to a constant BM displacement (on the order of a few nanometers) throughout the cochlea (49). This interpretation, however, clashes with the velocity sensitivity of low-CF inner hair cells and auditory nerve fibers (61, 319, 329, 349). The alternative possibility is that even the most sensitive responses so far measured at the apex have underestimated, by as much as 25 dB, the magnitude of vibrations in normal cochleae. In that case, apical cochlear vibration velocities at threshold would be in the range of 39–420 μm/s, matching closely the threshold velocities of basal sites in guinea pig and chinchilla.

B.  The “Second Filter”

The concept of the “second filter” provided a focus for many of the controversies surrounding cochlear function over the last quarter century or so. Second filters were proposed as mechanisms that could bridge a perceived gap between the apparent poor tuning of the BM (the “first filter”) and the sharp tuning of auditory nerve fibers: second filters receive their input from (but do not feed back on) the BM, and transform poorly tuned and insensitive mechanical vibrations into well-tuned and sensitive responses of hair cells and auditory nerve fibers (99, 101). Electrical second filters (resonances due to interactions of ionic channels in the basolateral membranes of hair cells) actually exist in the cochleae of turtles and chicks and in the amphibian papilla of frogs (8, 56, 111, 283), but have not been found in mammals.

In the case of the mammalian cochlea, second filter models (4, 426, 427) dominated thinking about its mechanics even long after Rhode's demonstrations of BM compressive nonlinearities and their vulnerability (293, 294, 296, 306). This was partly due to the inability of several investigations (101, 165, 402), including one by Rhode himself (296), to replicate his pioneering findings in species other than the squirrel monkey. Especially influential in this regard was one study that reported sharply tuned and sensitive responses of auditory nerve fibers from cat cochleae in which BM vibrations were insensitive and poorly tuned (101). In retrospect, it seems apparent that the methods used to measure BM vibrations, including a capacitive probe, induced severe but localized cochlear damage and that the neural recordings came from fibers innervating sites other than those where vibrations were measured.

Even when it became indisputable that BM responses at the base of the cochlea exhibit a compressive nonlinearity (307, 355) and that they are well-tuned and sensitive (179, 307, 355), a lack of consensus regarding the exact correspondence between BM vibration and auditory nerve excitation permitted a lingering defense (5, 6) of second filter models or even denials that BM vibrations participate in the stimulation of the auditory nerve (18).

C.  Frequency Tuning at the BM and in Auditory Nerve Fibers

At basal cochlear sites, the tuning of the BM and of auditory nerve fibers is very similar at frequencies near CF (47, 307, 326, 328, 355), but it is not certain whether, over a wider frequency range, neural threshold curves correspond to a constant BM displacement, velocity, or some function of these variables. Comparisons involving a single BM tuning curve from one cochlea and a representative neural tuning curve from another cochlea have variously suggested that neural threshold corresponds to a velocity of ∼40 μm/s (355) or to a displacement of 1 nm (47, 248).

The correspondence between the frequency tuning of neural thresholds and BM response magnitudes is perhaps best known at the 3.5-mm site of the chinchilla cochlea (reviewed by Ruggero et al., Ref. 317). Comparisons of averaged BM and neural data (307, 317, 328) indicate that, over a wide range of frequencies, thresholds for fibers with CF of 9–10 kHz follow a course intermediate between isodisplacement (0.9 nm) and isovelocity (50 μm/s). Direct comparisons of tuning curves for BM and auditory nerve fiber responses recorded in the same chinchilla cochleae under identical conditions (Fig. 18) led to the same conclusion (244). On average, the frequency tuning at threshold of auditory nerve fibers for frequencies below CF corresponds closely to BM displacement subjected to high-pass filtering at constant rates of 3–5 dB/octave. High-pass filtering must reflect in part the velocity sensitivity of inner hair cells at low frequencies (which, in turn, probably results from the viscous fluid coupling between endolymph and their stereocilia). In the case of high-CF inner hair cells, velocity sensitivity may extend to frequencies as high as 1–1.6 kHz (see Fig. 7 of Ref. 278). (Lower limits for velocity sensitivity were estimated in other studies of high-CF inner hair cells, Refs. 253, 341, 359.) However, it is also possible that the attenuation of responses to low-frequency stimuli may reflect unspecified micromechanical processes. Micromechanical processes might also account for the absence from neural tuning curves (Fig. 18) of counterparts of the high-frequency plateaus evident in responses to sound at the BM (Figs. 5, 6, and 18) and in scala tympani pressure (Fig. 15) (see sect. ii A5).
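Why a 3–5 dB/octave high-pass filter places the neural curve between the isodisplacement and isovelocity curves can be seen from a short calculation. The sketch assumes that BM responses grow linearly with SPL at frequencies below CF (as stated earlier for basal sites) and that all curves are normalized to coincide at CF; the function name is illustrative, and 9.5 kHz is used as a representative CF.

```python
import math


def threshold_elevation_db(freq_hz, cf_hz, slope_db_per_octave):
    """Threshold elevation (dB) relative to the isodisplacement curve when
    displacement is high-pass filtered at a constant slope, with zero
    elevation at CF. Positive below CF, where the filter attenuates."""
    octaves_below_cf = math.log2(cf_hz / freq_hz)
    return slope_db_per_octave * octaves_below_cf


# Velocity is displacement times 2*pi*f, i.e., a 20*log10(2) dB/octave filter.
ISOVELOCITY_SLOPE = 20.0 * math.log10(2.0)  # about 6.02 dB/octave

cf = 9500.0
one_octave_below = cf / 2.0
neural_like = threshold_elevation_db(one_octave_below, cf, 3.8)
isovelocity = threshold_elevation_db(one_octave_below, cf, ISOVELOCITY_SLOPE)
isodisplacement = threshold_elevation_db(one_octave_below, cf, 0.0)
```

One octave below CF the three curves sit 0, 3.8, and about 6 dB above the isodisplacement reference, so any constant slope between 0 and 6 dB/octave yields a curve lying between the two mechanical extremes.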

Figure 18 compares isodisplacement and isovelocity tuning curves for TM responses at a site of the chinchilla cochlea with CF ∼400 Hz (299) with an average frequency-threshold tuning curve derived from the responses of many chinchilla auditory nerve fibers with similar CF (A. N. Temchin, N. C. Rich, and M. A. Ruggero, unpublished observations). Both the low- and high-frequency slopes are similar in the neural and mechanical tuning curves, but the tips of the TM curves are distinctly blunt compared with the tip of the neural curve. The Q10 of the neural curve is nearly twice as large as the Q10 values of the mechanical curves (1.32 vs. 0.77). The discrepancy may indicate that even the best available data underestimate the sensitivity and frequency tuning of TM vibrations near CF at the cochlear apex (see sect. viii A). Near the extreme apical end of the guinea pig cochlea, however, regardless of whether responses to sound are linear or not, or whether measurements are made in vitro or in vivo (49, 135, 140), the sharpness of mechanical tuning is similar to the threshold tuning of normal auditory nerve fibers, with Q10 values ≤1.
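For readers unfamiliar with the Q10 metric used above, the sketch below computes it as CF divided by the bandwidth of the tuning curve 10 dB above the threshold at the tip. The tuning curve in the example is synthetic, and linear interpolation between sampled points on a linear frequency axis is an assumption made for simplicity rather than the procedure of any cited study.

```python
def q10(freqs_hz, thresholds_db):
    """Q10 of a threshold tuning curve sampled at ascending frequencies."""
    tip = min(range(len(freqs_hz)), key=lambda i: thresholds_db[i])
    level = thresholds_db[tip] + 10.0

    def crossing(indices):
        # Walk away from the tip until the curve first reaches `level`,
        # then interpolate linearly between the two bracketing samples.
        prev = tip
        for i in indices:
            if thresholds_db[i] >= level:
                f0, f1 = freqs_hz[prev], freqs_hz[i]
                t0, t1 = thresholds_db[prev], thresholds_db[i]
                return f0 + (level - t0) * (f1 - f0) / (t1 - t0)
            prev = i
        raise ValueError("curve does not rise 10 dB above the tip")

    low_edge = crossing(range(tip - 1, -1, -1))
    high_edge = crossing(range(tip + 1, len(freqs_hz)))
    return freqs_hz[tip] / (high_edge - low_edge)


# Synthetic tuning curve with its tip at 400 Hz (illustrative values only).
example_q10 = q10([200.0, 300.0, 400.0, 500.0, 600.0],
                  [40.0, 25.0, 15.0, 25.0, 40.0])
```

Here the 10-dB bandwidth is 200 Hz, giving Q10 = 2. For the values quoted in the text, the neural-to-mechanical ratio is 1.32/0.77, or about 1.7.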

Table 7 summarizes a series of properties of auditory nerve responses that probably reflect corresponding features of BM vibration. Appropriately, all these properties exhibit CF specificity, i.e., a dependency on stimulus frequency relative to CF. Many other properties of auditory nerve fiber responses (e.g., synaptic delay, refractoriness, rate saturation and rectification, decay of phase locking as a function of increasing stimulus frequency; for review, see Ref. 314) do not exhibit CF specificity and, accordingly, probably originate at sites other than the BM (such as the inner hair cells and their synapses). Some striking nonlinear features of auditory nerve responses (e.g., notches in rate-intensity functions and associated phase shifts) may indicate that the micromechanics of the cochlear partition contain multiple modes that transmit BM vibrations to the inner hair cell stereocilia via two or more distinct pathways (see A).

Table 7.

Properties of auditory nerve responses that primarily reflect BM mechanics


A.  Mechanical Feedback From the Organ of Corti and Amplification

Today, the “cochlear amplifier” is the dominant unifying idea of cochlear mechanics. Although the term cochlear amplifier was apparently coined by Davis in 1983 (74), cochlear amplification was first proposed by Gold in 1948 (121) as an electromechanical feedback process or “negative resistance” that counteracts the viscous damping that would otherwise attenuate and detune BM vibrations (see also Refs. 174, 175, 189, 393). As distinct from the “second filter” (see sect. viii B), the cochlear amplifier provides a positive feedback to BM vibrations via “reverse transduction” (i.e., the conversion of biological, presumably electrical, energy into mechanical vibrations). Although “cochlear amplifier” means different things to different people, a simple definition states that it is a “positive feedback process which increases the sensitivity of BM responses to low-level stimuli.” A stricter definition additionally requires that the increase in sensitivity involves the dissipation of biological energy (i.e., not present in the acoustic stimulus).

In discussing the evidence for the existence of a cochlear amplifier, it is useful to distinguish this concept from a related one, namely, nonlinearity of BM vibrations. The preceding sections of this review amply document the features of compressive nonlinearity at both basal and apical cochlear sites. At least at the base of the cochlea, compressive nonlinearity appears to be inextricably linked with high sensitivity and sharpness of frequency tuning (327) so that when one of these three properties is abolished by cochlear insults, the other two are also eliminated or drastically reduced. Thus the presence of compressive nonlinearity may well be a certain indicator of the activity of the cochlear amplifier (see sect. x D). However, it is also conceivable that saturation, which attenuates BM vibrations at moderate and high stimulus levels, could arise independently of amplification. What lends credence to the view that compressive nonlinearity signals amplification is that insults to cochlear function (see sect. ix B) abolish nonlinearity largely by reducing the sensitivity of responses to low-level stimuli. Such an effect is obviously consistent with the existence of an amplifier and points to the existence of a feedback from the organ of Corti that boosts BM vibration.

B.  Lability of Cochlear Mechanical Responses to Sound

1.  Effects of death and surgical trauma on BM sensitivity

Until the 1980s, most investigations of BM vibration, except for those in the squirrel monkey (293, 296, 306), reported linear, insensitive, and poorly tuned responses to sound (101, 163, 165, 391, 400-402). With hindsight, it is obvious that failures to demonstrate sensitive and nonlinear BM responses in vivo resulted from surgical damage inflicted during experimental manipulations of the cochlea. This became evident when an independent, sensitive, and frequency-specific measure of cochlear function, namely, the threshold of tone-pip-evoked compound action potentials, was shown to correlate well with BM sensitivity (307, 355).

The BM compressive nonlinearity discovered by Rhode (293) in the squirrel monkey cochlea disappeared within minutes after death (294). More recent BM measurements at the base of the cochlea (e.g., Figs. 10 and 19) are consistent with most of Rhode's 1973 observations, showing that postmortem responses exhibit a large decrease in the sensitivity of responses to low-level near-CF stimuli, a loss in sharpness of tuning, and a downward shift (by about one-half octave) of the most sensitive frequency (47, 259, 307, 326, 327, 355). For low-level CF stimuli, death reduces response sensitivity by as much as 65–81 dB (259, 326, 355) (see Table 1).

Fig. 19.

The effect of death on the magnitudes and phases of BM responses to tones. The response magnitudes are expressed as tuning curves, in terms of the stapes velocity required for a constant BM velocity of 50 μm/s. The phases are given relative to stapes vibrations. The arrowhead indicates the high-frequency phase plateau (see sect. ii A5). Postmortem data were measured within 1 h after death of the animal. [Guinea pig data replotted from Nuttall and Dolan (259).]

At the apex of the chinchilla cochlea, nonlinear vibrations can be recorded only when the organ of Corti and Reissner's membrane are minimally disturbed (see sect. iii A). Responses that initially exhibit compressive nonlinearity always become linear and less sensitive with the passage of time (51, 299). Similarly, at the apex of the guinea pig cochlea, an expansive nonlinearity has been demonstrated only in exceptionally intact preparations (413).

2.  Effects of acoustic overstimulation on BM sensitivity

Exposure of the mammalian ear to intense sounds can produce transient or permanent elevations of hearing thresholds (for reviews, see Refs. 351, 353). It is clear that these threshold elevations originate in the cochlea, since acoustic overexposure causes comparable threshold shifts in responses to sound of auditory nerve fibers (35, 37, 213) and hair cells (39).

The in vivo effects of acoustic overstimulation on BM vibrations were measured, in limited fashion, in the cochleae of two guinea pigs (272, 355) and one cat (47) and, in a more extensive investigation, in chinchillas (324, 325, 327). The latter study, which used both tone and click test stimuli, exposed the ear to intense tones that caused severe but temporary threshold shift centered at the CF of the BM recording site. The effects of acoustic overstimulation on BM responses to sounds closely resembled the effects of death (see sect. ix B1). Responses were linearized, reducing sensitivity at CF, broadening frequency tuning, and shifting the peak response to a frequency ∼0.5 octaves lower than CF (47, 272, 324, 325, 327, 355). Overall, the BM effects of acoustic overstimulation can account for most (but not all) of the corresponding effects in auditory nerve fibers (35, 37, 345) and their psychophysical correlates, including the long-known “half-octave shift” that accompanies temporary threshold elevations (37, 75). The BM effects cannot account for changes in auditory nerve spontaneous activity or for alterations in thresholds, including hypersensitivity, at frequencies well below CF (209).

3.  Pharmacological manipulation of cochlear sensitivity

Furosemide, a “loop-inhibiting” diuretic, drastically but reversibly alters cochlear function by abolishing the endocochlear potential. This reduces the drive to mechanoelectrical transduction, presumably causing reduced receptor potentials of inner and outer hair cells, which in turn alter the sensitivity of auditory nerve fibers (100, 361). Interestingly, the sensitivity changes in high-CF fibers are substantially greater at CF than at other frequencies (361). The mechanical basis of this frequency-specific alteration has been elucidated by experiments in chinchillas involving intravenous injection of furosemide. As illustrated in Figure 20, furosemide causes a large but reversible CF-specific reduction and linearization of BM responses to tones and clicks (320, 322).

Fig. 20.

The effects of furosemide on BM responses. Frequency spectra of BM responses to 75-dB (peak SPL) clicks measured in two cochleae, before (solid line) and after (dashed lines) intravenous furosemide injections. The panels show magnitudes and phases, computed by Fourier transformation. [Chinchilla data from Ruggero and Rich (322).]

Although all manipulations of cochlear sensitivity point to the existence of cellular processes that boost BM vibrations, most [e.g., the effects of acoustic overstimulation (sect. ix B2) and surgical trauma or death (sect. ix B1)] do not address directly the nature of the feedback. The experiments involving furosemide are more specific, almost inescapably implying that the sensitivity and nonlinearity of BM vibrations depend critically on the receptor potentials of outer hair cells. Although furosemide must also reduce independently the receptor potentials of inner hair cells, it is the outer hair cells that are implicated, in view of the differential effect on inner hair cells of DC currents applied extracellularly or intracellularly. Injecting negative DC currents into scala vestibuli (a procedure analogous to decreasing the endocochlear potential by means of furosemide) causes CF-specific reductions in BM (264) (see sect. ix D), inner hair cell (252), and auditory nerve sensitivity (375). In contrast, alterations in inner hair cell responses induced by intracellular current injection are not frequency specific (252).

It is noteworthy that the large effects of furosemide on BM vibration (e.g., reductions of vibration amplitude of up to 60 dB at CF, Ref. 322) are seemingly controlled by relatively small alterations of hair cell potentials. For example, if furosemide reduces the endocochlear potential to zero, the receptor currents would be reduced by only 50% or 6 dB (since approximately one-half of the drive for the transduction current, namely, the intracellular electrical potential of outer hair cells relative to perilymph, would remain unaffected). The fact that a relatively small decrement in transduction currents can produce disproportionately large reductions in the magnitude of mechanical vibrations implies that the link between outer hair cells and the BM consists of a high-gain positive feedback (279, 408).
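The disproportion described above can be illustrated numerically. The sketch assumes that the transduction current scales in proportion to the total driving potential (endocochlear potential minus the intracellular potential of the outer hair cell). The specific values of +80 mV and −70 mV are typical textbook figures used here as assumptions; the text itself states only that the intracellular potential supplies approximately one-half of the drive.

```python
import math

# Assumed, illustrative potentials relative to perilymph (not from the text).
EP_NORMAL_MV = 80.0           # endocochlear potential before furosemide
EP_ABOLISHED_MV = 0.0         # endocochlear potential reduced to zero
OHC_INTRACELLULAR_MV = -70.0  # outer hair cell intracellular potential


def transduction_drive_mv(endocochlear_mv, intracellular_mv):
    """Driving potential across the apical transduction channels."""
    return endocochlear_mv - intracellular_mv


drive_before = transduction_drive_mv(EP_NORMAL_MV, OHC_INTRACELLULAR_MV)
drive_after = transduction_drive_mv(EP_ABOLISHED_MV, OHC_INTRACELLULAR_MV)

# Change in transduction current, assuming it is proportional to the drive.
current_ratio = drive_after / drive_before
current_change_db = 20.0 * math.log10(current_ratio)

# The 60-dB loss in BM vibration quoted in the text, as an amplitude ratio.
vibration_ratio = 10.0 ** (-60.0 / 20.0)
```

Under these assumptions the current falls to a little under one-half (roughly −6.6 dB), whereas a 60-dB loss corresponds to a thousandfold reduction in vibration amplitude.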

Quinine can reversibly raise auditory thresholds and induce tinnitus (for review, see Ref. 137). Quinine does not reduce the endocochlear potential (287), but it may affect outer hair cells directly (160, 173). Intravenous injections of quinine in chinchillas linearize BM responses, broadening frequency tuning and reducing sensitivity at CF by up to 15 dB (327).

Both acetyl salicylic acid (aspirin) and salicylates can reversibly reduce auditory sensitivity and selectivity and may induce tinnitus (for review, see Ref. 161). A mechanical basis for these effects has been demonstrated: perilymphatic perfusion of the cochlea with 2.5–5 mM sodium salicylate produces strong but reversible alterations in BM responses to tones (237). During perfusion, BM tuning curves become broader, lose up to 45 dB in sensitivity at CF, shift their apparent CF to lower frequencies, and become sensitized by ∼10 dB in their low-frequency tail. Interestingly, salicylate reduces the electromotile forces (see sect. x D) and the axial stiffness of isolated outer hair cells (340), effects which may underlie the salicylate-induced alterations of BM responses.

4.  Lability of BM response phases

With but one exception (272), studies of the effects of acoustic overstimulation on in vivo BM vibrations found that loss of sensitivity is accompanied by phase lags, as well as decreased slopes of the phase versus frequency curves (group delays) at frequencies around CF (47, 324, 325, 327). Phase lags and decreased group delays are also evident after administration of furosemide in responses to low- and moderate-level tones and clicks at frequencies around CF (Fig. 20B) (322).
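Since group delay is identified above with the slope of the phase versus frequency curve, a brief sketch may help. It assumes phase is expressed in periods (cycles) and frequency in hertz, so that the negative slope is a delay in seconds; the data points are invented for illustration and do not reproduce any figure.

```python
def group_delays_s(freqs_hz, phases_cycles):
    """Group delay (s) between successive points of a phase-frequency curve.

    With phase in cycles and frequency in Hz, the delay is the negative
    slope -d(phase)/d(frequency), estimated here by finite differences.
    """
    delays = []
    for i in range(len(freqs_hz) - 1):
        d_phase = phases_cycles[i + 1] - phases_cycles[i]
        d_freq = freqs_hz[i + 1] - freqs_hz[i]
        delays.append(-d_phase / d_freq)
    return delays


# Hypothetical phase curve that steepens as frequency approaches CF.
freqs = [16000.0, 17000.0, 18000.0]
phases = [-1.0, -1.5, -2.2]  # accumulated lag in periods
delays = group_delays_s(freqs, phases)
```

In this example the delay grows from 0.5 ms to 0.7 ms as the curve steepens; a manipulation that flattens the phase curve near CF would appear as a reduction of these values.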

The first investigation of the effects of death in cochleae exhibiting in vivo nonlinearities also found postmortem phase lags at all measured frequencies (from much lower than CF to well above CF) as well as decreased group delays around CF (294). Among more recent studies, only one has confirmed these results fully (300). Other studies did not find postmortem phase lags at frequencies much lower than CF and observed leads (rather than lags) at frequencies higher than CF (259, 290, 315) and, in some cases, also at CF (259, 326). Figure 19 compares the magnitudes and phases of responses to tones measured in vivo and postmortem in an initially sensitive cochlea (259). Death did not alter significantly the response phases at frequencies lower than CF but caused a large relative phase lead (∼270 degrees) at CF (18 kHz): lag was reduced from ∼2.2 periods premortem to about 1.5 periods postmortem. The phase effects of death on responses to clicks (290, 315) closely resemble the phase effects of increasing stimulus intensity in vivo: relative lags and leads at frequencies lower and higher than CF, respectively, with little change at CF (Fig. 8 and sect. ii A4). Such changes are consistent with the decreased postmortem phase slopes near CF reported by Rhode (294) but are not consistent with the results of other studies using tone stimuli, including those of Figure 19. The apparently disparate results could be reconciled by postulating that the various studies measured BM responses at different times after death and that the postmortem processes are time dependent (326). The postmortem phase effects probably consist of relative leads at CF and at higher frequencies immediately after death, gradually changing to lags at all frequencies around CF after 1 h or so (300). In addition, the steep slope of the phase versus frequency curves may confound interpretation by exaggerating errors in estimating CF. For example, in the case of Figure 19, if the “real” CF were 17 kHz, the postmortem phase lead would amount to only ∼80 degrees (instead of 270 degrees at the 18-kHz nominal CF).
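The conversion between phase lags expressed in periods and phase changes expressed in degrees can be sketched as follows. This is only an illustrative calculation: the function names are hypothetical, and the inputs are the rounded lags quoted above (∼2.2 and about 1.5 periods), so the result (about 250 degrees) agrees only approximately with the ∼270 degrees read from Figure 19. A premortem lag of 2.25 periods would give exactly 270 degrees.

```python
def periods_to_degrees(periods: float) -> float:
    """Convert a phase expressed in periods (cycles) to degrees."""
    return periods * 360.0


def relative_phase_lead_deg(lag_before_periods: float, lag_after_periods: float) -> float:
    """Reduction in phase lag, in degrees (a positive value is a relative lead)."""
    return periods_to_degrees(lag_before_periods - lag_after_periods)


# Rounded values quoted in the text for the 18-kHz CF site (assumed inputs).
lead = relative_phase_lead_deg(2.2, 1.5)
print(f"relative phase lead: about {lead:.0f} degrees")
```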

5.  Lability of two-tone interactions

Well before distortion products could be demonstrated on the BM, psychoacoustical and neurophysiological experiments suggested that distortion products originate in the cochlea and propagate on the BM much like responses to single tones (122, 364, 366). Among the strongest early evidence for distortion product propagation on the BM (and also for a feedback linking the organ of Corti and BM vibrations) was that their audibility was reduced when the stimulus frequencies coincided with regions of elevated audiometric threshold (366). Parallel results were obtained using recordings of auditory nerve fibers in the cochleae of chinchillas suffering hair cell loss (364). More recently, a similar BM experiment was carried out using acute acoustic overstimulation to selectively decrease cochlear sensitivity to the primary tones, f1 and f2 (see sect. ix B2), selected so that 2f1-f2 was equal to the CF of the recording site (312). Under such circumstances, 2f1-f2 distortion products were reduced in magnitude, although responses to CF tones were largely unaffected. This is consistent with the idea that the 2f1-f2 distortion product originates at the BM region with CFs near the primary frequencies and propagates to the location with CF equal to 2f1-f2.
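As a minimal numerical sketch of the stimulus arrangement described above, the primaries can be chosen so that 2f1-f2 falls at the CF of the recording site. The function name, the 9-kHz CF, and the f2/f1 ratio of 1.2 are illustrative assumptions, not values taken from Ref. 312.

```python
def primaries_for_cubic_dp(cf_hz: float, f2_over_f1: float) -> tuple[float, float]:
    """Return (f1, f2) such that 2*f1 - f2 equals cf_hz.

    With f2 = r * f1, the condition 2*f1 - f2 = CF gives f1 = CF / (2 - r).
    """
    if not 1.0 < f2_over_f1 < 2.0:
        raise ValueError("f2/f1 must lie between 1 and 2")
    f1 = cf_hz / (2.0 - f2_over_f1)
    f2 = f2_over_f1 * f1
    return f1, f2


# Hypothetical recording site (CF = 9 kHz) and primary ratio (f2/f1 = 1.2).
f1, f2 = primaries_for_cubic_dp(cf_hz=9000.0, f2_over_f1=1.2)
print(f"f1 = {f1:.0f} Hz, f2 = {f2:.0f} Hz, 2f1-f2 = {2 * f1 - f2:.0f} Hz")
```

Both primaries lie above the CF of the recording site, so that overstimulation restricted to the primary-frequency region can, in principle, leave the recording site itself unaffected.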

Both two-tone suppression and the modulation of responses to CF tones by low-frequency tones are CF specific (Fig. 16, bottom, and sect. vii A). Therefore, on the assumption that suppression and low-frequency modulation are closely related to the nonlinear, compressive growth of BM responses at CF, it is not surprising that these nonlinearities disappear postmortem and that, in vivo, their strength is well correlated with sensitivity at CF (41, 277, 332).

6.  Effects of noxious agents on vibrations of the ITER temporal-bone preparation

Despite being linear, the mechanical responses to sound of the ITER temporal bone preparation can be altered by exposure to noxious agents (see sect. vi B1). However, the alterations are small and differ qualitatively from those caused by the same insults, in vivo, at basal BM sites. For example, after the ITER preparation was exposed to intense sounds, response sensitivity “…generally increased…” (384). Such an increase in sensitivity sharply contrasts with the effects of overstimulation on BM vibrations at the base of the in vivo cochlea (see sect. ix B2) and is also inconsistent with the neurophysiological and psychophysical correlates of acoustic trauma, including some findings using the ITER preparation (108). The ITER preparation was also used to study the effects of quinine, again with unexpected results: quinine “… increased the vibration amplitude at the peak of the mechanical resonance curves and increased the sharpness of tuning” (172), effects which are quite different from those observed at basal sites of the BM (see sect. ix B3). The peculiarity of the effects of quinine and overstimulation on the ITER preparation probably reflects underlying dysfunction, at least in part due to the absence of a blood supply.

C.  Efferent Control of BM Vibration

The cochlea receives bilateral centrifugal innervation that originates at the superior olivary complex of the brain stem (288, 392). Medial efferent fibers originate from large cells of the medial regions of the superior olivary complex and terminate in large synapses at the base of outer hair cells (369, 392). Although a functional auditory role of the olivocochlear efferent system has not been established, there is clear experimental evidence that efferent fibers have an inhibitory effect on cochlear responses. Electrical stimulation of the crossed olivocochlear bundle (formed predominantly by medial efferent fibers) reduces the amplitude of cochlear compound action potentials produced by acoustic clicks (112) and responses to tones of auditory nerve fibers (106, 396) and inner hair cells (20).

The underlying basis for the neural effects of activation of the olivocochlear system is a CF-specific loss of sensitivity and linearization of BM vibrations (92, 238, 337). At near-CF frequencies, efferent stimulation shifts the steep portion of the velocity-intensity curves to higher levels, by up to 30 dB, and increases its slope. The attenuation of BM vibration is largest at stimulus levels between 50 and 75 dB SPL and may exceed 10 dB even at 90 dB SPL (337). These attenuations are similar to and could account for those produced by efferent stimulation in responses of low and medium spontaneous-rate auditory nerve fibers (130). In contrast to other experimental manipulations of cochlear sensitivity, olivocochlear stimulation does not alter the sharpness of BM frequency tuning or the phases of responses to tones (238). In addition, electrical stimulation of the olivocochlear bundle can increase BM responses to tones at frequencies well above CF and at high intensities (92). Counterparts of this phenomenon have not been previously observed in BM or in auditory nerve responses.

Whatever doubts may have lingered regarding the ability of outer hair cells to influence BM vibrations, these have been dispelled by the demonstration that stimulation of the olivocochlear bundle, which innervates outer hair cells and type I afferent terminals, reduces BM responses to sound in a frequency-specific manner (238). Because it is inconceivable that afferent fibers can affect BM vibrations, the efferent effects must be mediated by the outer hair cells.

Cochlear perfusion of ACh, the neurotransmitter of medial olivocochlear efferents, produces transient losses of sensitivity and linearization of the displacement/intensity curves for BM responses, with the greatest effect at low to moderate levels and at frequencies above CF (239). As expected, the effects of ACh perfusion on BM responses resemble those induced by electrical stimulation of the efferent system (238). However, in contrast to efferent stimulation, ACh increases the bandwidth of BM tuning curves. BM vibration data (239), as well as measurements of somatic electromotility (see sect. x D) in isolated outer hair cells (71), suggest that ACh acts by producing an influx of Ca2+ into outer hair cells, which in turn induces Ca2+ release from intracellular stores. The effects of ACh on both BM vibrations and the motile responses of outer hair cells have onset delays in the order of seconds and probably correspond to the “slow effects” of efferent stimulation on compound action potential thresholds (370). In isolated outer hair cells, ACh produces an increase in the magnitude of axial electromotile responses (71). This is a surprising result in view of the widespread belief that BM vibrations are boosted by the somatic electromotile responses of outer hair cells (see sect. x D) (63, 64).

D.  Reverse Transduction: Mechanical Effects of Electrical Currents

That “reverse” (i.e., electrical to mechanical) transduction can occur in the cochlea was first demonstrated by injecting sinusoidal electrical currents into scala media while simultaneously delivering an acoustic tone (151). Such currents produce otoacoustic emissions at the frequency of the electrical stimulus and also interact with the acoustic tone to produce distortion-product otoacoustic emissions. Electrically evoked otoacoustic emissions can also be enhanced by acoustic tones in a manner roughly consistent with the frequency tuning and tonotopicity of BM vibrations (234, 405) and are modulated by low-frequency acoustic tones (410) much as these periodically suppress responses to CF tones at the BM or in auditory-nerve fibers (see sect. vii A3). In other words, either the electrical stimuli themselves or electrically stimulated cochlear vibrations can interact nonlinearly with cochlear vibrations elicited by normal acoustic stimuli. The fact that these interactions between acoustical and electrical stimuli take place on a cycle-by-cycle basis suggests that they occur in hair cells, the sites of mechanical to electrical transduction (155, 156). The outer hair cells are specifically implicated because acoustically stimulated distortion-product emissions can be modified by electrical activation of the olivocochlear efferent system, whose terminals innervate outer hair cells but not inner hair cells (232, 363).

If reverse transduction in outer hair cells (either voltage-driven somatic electromotility or current-driven stereocilia motility; see sect. x D) plays a role in the mechanics of the intact cochlea, it should be capable of inducing motion of the organ of Corti and the BM. Indeed, in explants of the organ of Corti, outer hair cells and adjacent structures display motile responses to electrical stimulation at frequencies as high as 15 kHz (132, 171, 219, 291, 292). Under electrical stimulation, the reticular lamina and the BM move with opposite polarity (219), implying that the top and bottom of the organ of Corti could move independently under acoustic stimulation (see sect. vi A). Consistent with findings in isolated outer hair cells (e.g., Ref. 350), depolarization causes outer hair cells to shorten, pushing the reticular lamina toward scala tympani and the BM toward scala vestibuli (219). Displacements are several times larger at the reticular lamina than at the BM, indicating that the BM is stiffer than the reticular lamina.

Electrically induced BM motion has been demonstrated in vivo in gerbil and guinea pig cochleae (255, 265, 406, 407). In gerbil, the electrically evoked vibrations differed substantially from those produced by acoustic stimulation. In particular, they exhibited only small phase accumulation at CF, indicating that traveling waves were not generated (see sect. v) (406, 407). A very different result was reported for similar experiments in guinea pig (255, 265): BM responses to electrical sinusoids exhibit spectral tuning and phase characteristics (including a phase lag exceeding 2 periods at CF) very similar to responses evoked by acoustic tones (265). Similarly, stimulation with electrical pulses causes “ringing” of the BM practically indistinguishable from responses to acoustic clicks in the same cochlea (see sect. ii B).

Upon cochlear stimulation with electrical pulses, auditory nerve fibers exhibit responses strikingly analogous to those seen at the BM in that they resemble neural responses to acoustic clicks (146, 236). Although Moxon (236) thought that such “electrophonic” responses were artifacts, Nuttall and Ren (265) have presented evidence that they do not originate at the stimulating electrode and that they require the presence of an intact organ of Corti. It seems as if the fast deformation of biological tissue, probably the outer hair cells, generates (“fast”) acoustic pressure waves. Because these apparently behave in every respect like the pressure waves produced by ossicular vibration, they generate (“slow”) traveling waves that propagate in the cochlear fluids and the BM (see sect. v).

Direct currents passed across the organ of Corti produce marked changes in BM responses to acoustic tones with frequencies at and above CF, and little change in responses to tones with frequencies below CF (264). Positive currents (from scala vestibuli to scala tympani) increase the sensitivity and frequency tuning of BM sound-evoked motion and shift its characteristic frequency upward, and negative currents decrease the sensitivity and tuning of the response and shift the characteristic frequency downward (264). These effects suggest that positive and negative direct currents modify cochlear amplification via enhancement and reduction of the transduction currents in outer hair cells, respectively. Presumably, the effect of negative currents is analogous to that of decreased endocochlear potential, such as brought about by furosemide (see sect. ix B3).

There is no consensus regarding the polarity of the partition movements produced by injection of electrical currents into the cochlea. Both BM displacements toward scala tympani (219, 265) and toward scala vestibuli (406) have been reported for positive currents injected into scala media (or scala vestibuli). [Yet, in another report, positive and negative currents into scala vestibuli produce displacements toward scala tympani (264).] One possible reason for these disparate findings is that under stimulation with electrical sinusoids the arcuate and pectinate regions of the BM (see footnote 4) move with opposite polarity (404) so that the seemingly contradictory reports may reflect different radial sites of measurement. Paradoxically, although BM DC-displacement responses to electrical current pulses also exhibit opposite polarity in the arcuate and pectinate regions, the corresponding ringing or transient responses (which closely resemble responses to acoustic clicks) do not undergo phase shifts as a function of radial position (262). The absence of phase shifts in the ringing responses is directly at odds with the polarity reversal observed in response to electrical sinusoids at the boundary between the arcuate and pectinate zones (404).

E.  Origin of Compression and Other BM Nonlinearities

The receptor potentials of hair cells vary nonlinearly with stereociliary displacement (155, 201), following a voltage-displacement relation that exhibits compression and saturation and resembles a second-order Boltzmann function. In general, functions exhibiting compression and/or saturation generate harmonic distortion, as well as suppression and intermodulation distortion under stimulation with tone pairs (98, 214, 215, 420). Because the mechanical feedback that the organ of Corti exerts upon BM vibration appears to be controlled by the magnitude of hair cell receptor potentials or transduction currents (see sect. x D), it stands to reason that the nonlinearities present in the transduction process must necessarily result in mechanical counterparts in BM vibration. Accordingly, models incorporating a feedback loop between outer hair cells and BM vibration often identify mechanical-to-electrical transduction as the source of all BM nonlinear phenomena, including the compressive nonlinear growth at CF, as well as two-tone suppression and intermodulation distortion (29, 77, 119, 197, 232, 235, 275, 279, 421, 422).
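The statement that a compressive, saturating transducer function generates intermodulation distortion can be illustrated with a short numerical sketch. A two-tone "displacement" is passed through a second-order Boltzmann function, and the amplitude of the 2f1-f2 component of the output is compared with that obtained from a purely linear transducer. The functional form used here is one commonly used expression for a second-order Boltzmann function; its parameter values, the tone frequencies, and the stimulus amplitude are arbitrary placeholders rather than fitted data, and all function names are hypothetical.

```python
import math


def boltzmann2(x, a1=0.065, x1=24.0, a2=0.016, x2=41.0):
    """Second-order Boltzmann function normalized to a maximum of 1.

    Parameter values are illustrative assumptions (x in arbitrary
    displacement units), not measurements.
    """
    return 1.0 / (1.0 + math.exp(a2 * (x2 - x)) * (1.0 + math.exp(a1 * (x1 - x))))


def component_amplitude(signal, freq_hz, fs_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT).

    Exact only if the record holds an integer number of cycles of freq_hz.
    """
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq_hz * k / fs_hz) for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * k / fs_hz) for k, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


fs = 100_000.0           # sampling rate (Hz)
f1, f2 = 1000.0, 1200.0  # primary frequencies (Hz), so 2f1-f2 = 800 Hz
n = 5000                 # 50 ms: integer number of cycles of 800, 1000, and 1200 Hz
amp = 30.0               # amplitude of each primary (arbitrary displacement units)

x = [amp * (math.sin(2 * math.pi * f1 * k / fs) + math.sin(2 * math.pi * f2 * k / fs))
     for k in range(n)]
y = [boltzmann2(v) for v in x]   # saturating (nonlinear) transducer
linear = [0.01 * v for v in x]   # linear transducer, for comparison

dp_nonlinear = component_amplitude(y, 2 * f1 - f2, fs)
dp_linear = component_amplitude(linear, 2 * f1 - f2, fs)
print(f"2f1-f2 amplitude, Boltzmann transducer: {dp_nonlinear:.3g}")
print(f"2f1-f2 amplitude, linear transducer:    {dp_linear:.3g}")
```

The linear transducer yields no output at 2f1-f2 beyond numerical rounding error, whereas the saturating function produces a component at that frequency even though the stimulus contains none.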

There are two other nonlinear processes, namely, adaptation (11, 55, 94, 147; reviewed by Eatock, Ref. 93) and nonlinear gating stiffness (148), that characterize mechanoelectrical transduction in hair cells of the frog sacculus and which could conceivably also play a role in generating BM nonlinearities in the mammalian cochlea. However, evidence for the presence of adaptation and nonlinear gating stiffness in outer hair cells is still meager and mostly derived from recordings in immature cochleae (202, 336, 339).

Finally, the feedback process itself may be a source of nonlinearity (see sect. x D). In the case of outer hair cell somatic electromotility, the deformation of outer hair cells follows a nonlinear (Boltzmann) function of transmembrane voltage, which should introduce nonlinear effects into BM vibrations. These effects, however, are likely to be smaller than those originating in the nonlinearity of forward transduction under normal in vivo conditions (279, 348).


A.  Are Cochlear Vibrations Really Amplified?

Although the existence of an organ of Corti mechanical feedback upon the BM is now well established, the nature of the feedback, including the question of whether it involves expenditure of biological energy, remains uncertain. In strict terms, the existence of amplification has been addressed directly only by analyses of power fluxes (17, 90, 91, 189, 211). Such analyses, in which energy flows along the cochlear duct are calculated from the experimentally measured magnitudes and phases of BM vibration, attempt to determine whether the power flux along the cochlea increases as the wave propagates toward its resonance point. Increased power fluxes indicate that the cochlea adds energy to the traveling wave in excess of that present in the acoustic stimulus. Calculations based on data from sensitive cochleae suggest that BM vibrations are indeed biologically amplified, whereas no amplification is found in the case of insensitive cochleae (17, 90, 91).

In addition to analyses of power fluxes, the very existence of spontaneous otoacoustic emissions suggests that biological energy can be converted into cochlear vibrations and, therefore, may be taken as (indirect) evidence favoring cochlear amplification in the strict sense. Spontaneous otoacoustic emissions are narrow-band sounds emanating continuously from the inner ear (175, 398, 414) in the absence of acoustic stimulation. Both the occasionally large magnitude of spontaneous emissions (316, 414) and the nature of their frequency spectra (14, 403) argue that they do not originate from Brownian motion within the cochlear fluids but rather that they represent free-standing oscillators powered by biological energy sources (e.g., Ref. 373) (see footnote 9). Because there is evidence (albeit indirect) that spontaneous emissions are accompanied by corresponding BM vibrations (102, 284), it is not too far-fetched to postulate that the same processes that give rise to spontaneous emissions are also involved in amplifying the magnitude of acoustically stimulated BM motion.

Interestingly, spontaneous otoacoustic emissions are broadcast not only by the ears of mammals but also of other vertebrates, e.g., frogs, which have neither a BM nor outer hair cells (385), and lizards and birds, which have a BM but lack outer hair cells (199, 374). The existence of spontaneous otoacoustic emissions in nonmammalian vertebrates has led to the hypothesis that the inner ears of these animals also possess “amplifiers” (154, 198, 222) whose function is to enhance the motion of hair cell stereocilia (see sect. x E).

B.  Gain of the Cochlear Amplifier

Because the amplification (or sensitivity boost) provided by the cochlear amplifier to BM vibration varies with stimulus frequency and intensity, as well as with the physiological condition of the cochlea, several definitions of its gain are possible. One definition presupposes that BM responses to CF tones grow linearly at both low and high levels of stimulation (123, 164, 279, 408). The cochlear amplifier gain could then be defined as the difference between the asymptotic sensitivities of responses to high- and low-level CF tones. However, because compressive growth prevails even at very high levels (Figs. 2, 3, and 5), it is only possible to specify lower bounds for the cochlear amplifier gain. These lower bounds, compared in Tables 1 and 3, reach 69 dB at mid-basal sites but appear to be substantially lower at the hook region (36 dB). The relatively low values at the hook may well underestimate the sensitivity of fully normal cochleae. At the apex, gains are even smaller or actually negative (Fig. 11). Gains amount to 22 dB in the compressive responses in chinchilla, 0 in the linear responses in guinea pig, and −21 dB in guinea pig cochleae exhibiting “active attenuation.” It is not clear whether the low (or zero) gains result from sensitivity losses due to surgical damage or simply indicate that amplification is weak (or nonexistent) in perfectly normal cochleae. Active attenuation (413) is a unique finding that demands replication and reinforces the impression that cochlear mechanics differ fundamentally between the extreme apex and more basal regions (see sect. iv).

Another definition states that the gain of the cochlear amplifier corresponds to the difference between the maximum sensitivities of BM responses to CF tones in vivo and postmortem (Figs. 10 and 19). According to this definition, the cochlear amplifier gain is as large as 78 dB at the 17- to 18-kHz site of the guinea pig cochlea (164, 259, 355) and 81 dB at the 9- to 10-kHz region of the chinchilla cochlea (326).

The aforementioned definitions of cochlear amplifier gain disregard the frequency shifts of the sensitivity peaks (which are displaced downward by about one-half octave postmortem or for high-intensity stimuli in vivo) as well as possible alterations of cochlear passive properties (e.g., elasticity) brought about by death. Therefore, we favor a third definition (74, 326) which addresses these issues: the cochlear amplifier gain is the difference between the peaks in the sensitivity functions for low- and high-intensity tones (Figs. 5 and 13) or between in vivo and postmortem responses (Fig. 10). According to this definition, the cochlear amplifier gain, measured in terms of displacement (Table 1), is 35–58 dB at the 9- to 10-kHz site of the chinchilla (326) and 35 dB at the 17- to 18-kHz region of the guinea pig (52, 164).
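The practical difference between comparing sensitivities at CF and comparing the peaks of the sensitivity functions (the third definition) can be sketched with a toy calculation. The two sensitivity functions below are invented solely for illustration (sensitivity in dB re an arbitrary reference, frequency in Hz); they are not data from any of the cited studies, and the function names are hypothetical. The "high-level" function is given a peak about one-half octave below CF to mimic the downward shift described above.

```python
def gain_at_cf_db(low_level: dict, high_level: dict, cf_hz: float) -> float:
    """Gain computed at a fixed frequency (CF): difference in sensitivity, in dB."""
    return low_level[cf_hz] - high_level[cf_hz]


def gain_peak_to_peak_db(low_level: dict, high_level: dict) -> float:
    """Gain computed between the peaks of the two sensitivity functions,
    regardless of the frequency at which each peak occurs."""
    return max(low_level.values()) - max(high_level.values())


# Invented sensitivity functions: frequency (Hz) -> sensitivity (dB, arbitrary reference).
low_level = {6000: 40.0, 7000: 48.0, 8000: 60.0, 9000: 75.0, 10000: 70.0}
high_level = {6000: 38.0, 7000: 40.0, 8000: 36.0, 9000: 30.0, 10000: 20.0}

print("gain at CF (9 kHz):", gain_at_cf_db(low_level, high_level, 9000), "dB")
print("peak-to-peak gain: ", gain_peak_to_peak_db(low_level, high_level), "dB")
```

With these invented numbers the fixed-frequency comparison gives 45 dB and the peak-to-peak comparison 35 dB, illustrating why the third definition yields smaller gains whenever the sensitivity peak shifts in frequency.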

C.  Location of the Cochlear Amplifier

Keeping in mind the mapping of CF upon longitudinal cochlear position, the level dependence of the frequency of maximal BM response sensitivity at the base of the cochlea (Figs. 5 and 10) suggests that the passive traveling wave has a spatial envelope that peaks ∼1 mm or so basal to the peak of the active (i.e., amplified) wave. This position shift presumably reflects the fact that the amplitude at the characteristic place is a function of amplification not only at that particular site, but also over a longer cochlear region, extending mostly toward more basal locations. Presumably, as the BM traveling wave propagates it gathers energy from outer hair cells nearby until it reaches and passes the BM site of passive resonance for any given stimulus frequency. This idea has been embodied in mathematical models of the cochlea, which typically include active elements or negative resistance distributed over a region basal to the characteristic place (78, 79, 189, 248, 249, 421).

The spatial distribution of amplifier elements has been estimated by studying the effects of cochlear lesions on the responses of spiral ganglion neurons innervating neighboring inner hair cells (36). Significant losses in sensitivity were observed in spiral ganglion neurons located at the apical edge of regions with many missing or damaged outer hair cells. This indicates that amplification of the traveling wave occurs in a region basal to the CF place. Sensitive and sharply tuned responses could be recorded from neurons located apical to and within 0.5 mm of an area where 97% of the outer hair cells were missing or damaged, and within 1.3 mm of an area where all outer hair cells were missing. This suggests that the cochlear amplifier spans a region between 0.5 and 1.3 mm basal to the CF site.

Since compressive nonlinearity appears to be inextricably linked to amplification (see sects. ii and vii), the presence, extent, and spatial distribution of amplification may be inferred by ascertaining the rate of growth of responses to single tones as a function of cochlear location. However, because the literature contains few data on the spatial patterns of BM responses to single tones, it is useful to attempt to derive such patterns from curves of BM sensitivity as a function of frequency (such as those of Fig. 5) recorded at a single location. On the assumption that BM amplitude and phase scales depend only on the ratio of stimulus frequency to CF, the amplitudes of the BM as a function of distance are derived from the responses measured at a single BM point (415), using maps that relate CF to longitudinal cochlear position (97, 128, 208). Therefore, the spatial extent of the amplifier can be assessed by measuring at a single location the range of frequencies around CF over which responses grow at compressive rates. For the chinchilla cochlear site with CF ∼9–10 kHz, this frequency band extends from one-half octave below CF to one-third octave above CF (326). The corresponding distances (for a cochlear map with slope of 2.5 mm/octave CF, Ref. 128) are 1.3 mm on the basal side and 0.8 mm on the apical side, respectively, of the characteristic place, yielding an overall region of nonlinearity of ∼2.1 mm. The spatial extent of this region compares favorably with values estimated by analyses of cochlear power flux of BM responses to tones (see sect. x A) (17, 90) and by deriving the mechanical impedance of the BM as a function of distance from first-order Wiener kernels computed from BM responses to wide-band noise (see sect. ii B) (85). Such estimates show that the region of “activity” or “amplification,” where the main increase in power flux occurs and the real part of the impedance attains negative values, extends basalward from the characteristic site over a distance of 1–2 mm.
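The conversion from a frequency band (in octaves re CF) to a cochlear distance, as used in the paragraph above, amounts to a multiplication by the slope of the cochlear map. The following sketch reproduces that arithmetic using only the figures quoted in the text (2.5 mm/octave, one-half octave below CF, one-third octave above CF); the function name is hypothetical, and the constant map slope is the simplifying assumption stated in the text.

```python
def octaves_to_mm(octaves: float, map_slope_mm_per_octave: float = 2.5) -> float:
    """Distance along the cochlea spanned by a given number of octaves of CF,
    assuming a cochlear map with constant slope (mm per octave)."""
    return octaves * map_slope_mm_per_octave


basal_extent_mm = octaves_to_mm(0.5)         # band extending one-half octave below CF
apical_extent_mm = octaves_to_mm(1.0 / 3.0)  # band extending one-third octave above CF
total_mm = basal_extent_mm + apical_extent_mm

print(f"basal side:  {basal_extent_mm:.2f} mm")
print(f"apical side: {apical_extent_mm:.2f} mm")
print(f"total:       {total_mm:.2f} mm")
```

The results (1.25, 0.83, and 2.08 mm) round to the 1.3, 0.8, and ∼2.1 mm given in the text.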

The longitudinal spatial representation of single tones on the BM has been measured directly in the cochleae of guinea pig (338) and chinchilla (300). Both studies found that nonlinear growth was distributed more or less symmetrically on both sides of the characteristic place. In the guinea pig cochlea (Fig. 14), nonlinear growth of BM displacements was restricted to a 1.25-mm region flanking the characteristic place. In the chinchilla, the region of nonlinearity was somewhat longer and varied with CF: its extent ranged from 2 mm for a CF of 15 kHz to 3.5 mm for a CF of 5.5 kHz (300), consistent with the 2.1 mm computed above on the basis of responses measured at a single site of the chinchilla cochlea with CF ∼9–10 kHz (326). The guinea pig data of Figure 14 are at variance with expectations in that spatial tuning does not shift along the cochlea when stimulus intensities are varied over a wide range. This result, however, is contradicted by comparable measurements carried out in chinchilla (300), which yielded results entirely consistent with predictions based on sensitivity versus frequency functions for one cochlear position (Fig. 5).

D.  Cellular Basis of the Cochlear Amplifier

The feedback from the organ of Corti that enhances the sensitivity and frequency tuning of BM responses in the normal cochlea, the cochlear amplifier, presumably involves mechanical forces produced by the outer hair cells under the direct or indirect control of the (forward) transduction currents. The process that has received the most attention as an embodiment of the cochlear amplifier is “somatic electromotility,” the electrically induced longitudinal deformation of outer hair cells discovered by Brownell et al. (21). Somatic electromotility consists of a rapid elongation or shortening of outer hair cells that accompanies, respectively, the hyperpolarization or depolarization of their transmembrane potential (10, 63, 167). Such changes in length are generally thought to reflect forces produced by a large number of molecular motors (67, 145), but they may also reflect dynamic stiffness changes (70, 138). The motors, densely distributed in the basolateral plasma membrane (168), appear to be molecules of prestin, a newly identified protein (412). Changes in transmembrane voltage (not current) are translated into length changes according to a Boltzmann function with a peak sensitivity of 30 nm/mV (350). In vitro, the length changes follow imposed voltage changes very rapidly, with an upper frequency cut-off at least as high as 79 kHz (109). Intrinsically, such responses are sufficiently fast to provide forces (or stiffness changes) potentially capable of enhancing BM vibration, on a cycle-by-cycle basis, even at the highest-CF regions of the cochleae of most mammalian species (104). However, the capacitance of the basolateral membrane causes the AC receptor potentials of outer hair cells to roll off as a function of increasing frequency above ∼1.2 kHz, at a rate of 6 dB/octave, so that they are very small at high frequencies [e.g., 15 μV for a 16-kHz (CF) stimulus presented near neural threshold, 15 dB SPL] (333). Therefore, at such frequencies, at threshold levels, somatic electromotility would be much too weak to account for BM displacements in the order of 1 nm (347). Although the decay of receptor potentials with frequency might be ameliorated if electromotility at high frequencies could be driven by the extracellular electrical fields generated jointly by many outer hair cells (65), the efficacy of such a mechanism remains unclear (66).
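The size of the high-frequency attenuation implied by a 6 dB/octave roll-off can be estimated with a minimal sketch. It assumes that the basolateral membrane behaves as a single-pole (first-order) low-pass filter with a corner frequency of 1.2 kHz, which is the simplest filter consistent with the roll-off rate quoted above; the function name is hypothetical, and the calculation says nothing about the absolute size of the receptor potential.

```python
import math


def lowpass_attenuation_db(freq_hz: float, corner_hz: float = 1200.0) -> float:
    """Attenuation (positive dB) of a single-pole low-pass filter relative to
    its low-frequency gain: 10*log10(1 + (f/fc)^2)."""
    return 10.0 * math.log10(1.0 + (freq_hz / corner_hz) ** 2)


for f in (1200.0, 4000.0, 16000.0):
    print(f"{f / 1000:5.1f} kHz: {lowpass_attenuation_db(f):5.1f} dB attenuation")
```

Under this assumption the AC receptor potential at 16 kHz is attenuated by about 22.5 dB (a factor of roughly 13) relative to its low-frequency value.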

Another possibility is that the stereociliary transduction mechanism itself functions bidirectionally (393). The inner ears of certain nonmammalian vertebrates can produce spontaneous otoacoustic emissions (see sect. x A), and their hair cells can respond to mechanical stimulation with stereociliar “twitches” (13, 57, 88). These “twitches” and spontaneous emissions may signal the existence of amplification that enhances stereociliar responses to low-level stimulation. One mechanism could involve actin, myosin, and ATPase activity, all of which are present in stereocilia and probably participate in the adaptation of transduction currents to sustained stereociliar displacement (154). Rapid movements of hair-bundle myosin molecules could change the tension on the tip links between stereocilia, producing motion. However, it is doubtful that actin/myosin interactions can be fast enough to play a role in amplifying high-frequency stimuli. A faster mechanism could be based on an interaction between calcium ions and the mechanoelectrical transduction channels (34). Upon opening of the transduction channels (due to positive stereociliar deflection), calcium ions enter into the stereocilium, apparently promoting closure of the transduction channels and causing negative stereociliar deflection (158). When the frog saccular macula is held in vitro in a chamber allowing endolymph-like fluid to bathe the apical surface of the hair cells, the stereocilia display spontaneous oscillations as large as 50 nm and are able to amplify imposed displacements, with expenditure of metabolically generated energy (226).

A current-driven “stereociliar amplifier” would have the advantage of not being subject to low-pass filtering by the basolateral membrane of outer hair cells (which reduces the voltage drive for somatic electromotility at high frequencies). It would also be consistent with the observation that low-frequency tones can modulate electrically evoked otoacoustic emissions (assuming that the latter indicate the operation of electrical-to-mechanical or reverse transduction) (410). Although there is no experimental evidence for the existence of active hair bundle motion in outer hair cells, cochlear amplification in the mammalian cochlea could conceivably involve both the stereociliar mechanism and somatic electromotility. It has been proposed that the former could enhance BM motion on a cycle-by-cycle basis, whereas the latter could control the operating point of the amplifier via activity of the medial olivocochlear efferent system (185). At present, however, it seems unlikely that a fast stereociliar amplifier could enhance BM vibrations in mammalian cochleae since such a mechanism would be effective only if the stiffness of the stereocilia were an important fraction of the overall stiffness of the BM/organ of Corti/TM complex (12, 13). The available evidence, albeit not conclusive, indicates that the stiffness of the BM (e.g., 6–11 N/m, Ref. 268) far exceeds the stiffness of stereociliary bundles (in the order of 0.002–0.005 N/m for outer hair cells, Refs. 302, 336).
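The disparity between the two stiffness ranges quoted above can be made explicit with a short calculation. It uses only the figures given in the text and simply forms the smallest and largest possible ratios; the variable names are hypothetical, and no assumption is made about how the two stiffnesses combine mechanically.

```python
# Stiffness ranges quoted in the text (N/m).
bm_stiffness = (6.0, 11.0)        # basilar membrane
bundle_stiffness = (0.002, 0.005)  # outer hair cell stereociliary bundles

# Smallest ratio: softest BM estimate over stiffest bundle estimate, and vice versa.
ratio_min = bm_stiffness[0] / bundle_stiffness[1]
ratio_max = bm_stiffness[1] / bundle_stiffness[0]

print(f"BM is roughly {ratio_min:.0f} to {ratio_max:.0f} times stiffer than a hair bundle")
```

On these figures the BM is some three orders of magnitude (about 1,200 to 5,500 times) stiffer than a stereociliary bundle.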

E.  Amplification and the Origin of Frequency Specificity

For longer than a century, the fundamental question of cochlear mechanics has been, “What mechanisms account for frequency tuning?” In contemporary terms, we may ask, “What mechanisms account for the frequency-specific and intensity-dependent properties of cochlear responses to sound, including sharp frequency tuning at low stimulus levels?” Although an electromechanical feedback between receptor currents and BM motion helps to unify and explain some of the fundamental properties of cochlear responses to sound, including the origin of nonlinearities (see sect. ix E), it may not by itself account for frequency specificity. In a frequency-tuned system like the cochlea, in which sharpness of tuning is limited by viscous damping, it should be possible to increase the sharpness of tuning by adding a force that opposes damping. Such a scheme was successfully implemented in an early model of active cochlear mechanics incorporating “negative damping” (of unspecified origin) over a limited region of the cochlear partition (189). It is not clear, however, whether sharply tuned BM responses can be obtained with models based on a feedback loop that does not itself include frequency filtering. To mimic BM vibrations realistically, modelers of cochlear mechanics using specific mechanisms (such as somatic electromotility) to generate active forces have found it necessary to include frequency-specific properties other than the passive resonance of the BM.10 This has been typically accomplished by including a secondary resonant system in the path that delivers the active forces to the BM (169,220, 224, 248,249). Such secondary resonances (which usually represent putative micromechanical interactions of the TM, the reticular lamina, the subtectorial fluids, and the stereocilia) have resonant frequencies that define a frequency-distance cochlear map distinct from the map that relates to the passive resonances of the BM. 
Other feedback models (114, 115) do not include secondary resonances but, instead, introduce frequency specificity in the guise of delays that are proportional to the characteristic period (=1/CF) and thus vary systematically along the length of the cochlea (416). The variation of delay with distance is equivalent to a frequency-to-distance map, chosen so that at each position the feedback forces are exerted with timing appropriate to compensate for hydrodynamic damping. Thus it seems that “good” modeling of BM vibrations requires not only the inclusion of a feedback, but also of a filter or other frequency-specific mechanism that determines the proper timing for exerting the feedback (footnote 11). In this context, it is noteworthy that whereas somatic electromotility provides no intrinsic filtering, stereociliar amplification (such as described in saccular hair cells) is inherently frequency specific (34, 226).
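Why the timing of the feedback matters can be seen in a toy calculation. The sketch below assumes a single oscillator obeying m·x'' + c·x' + k·x = F + g·x(t − τ), i.e., a feedback force proportional to delayed displacement; this form, the gain g, and the function name are assumptions chosen for illustration and are not the formulation of the cited models. For a sinusoid at CF, the delayed term adds g·sin(ωτ)/ω to the damping coefficient, so a delay that is a fixed fraction of the characteristic period has the same effect at every CF.

```python
import math


def effective_damping(damping, feedback_gain, cf_hz, delay_periods):
    """Net damping coefficient at CF for a delayed displacement feedback.

    Model (an assumption for illustration):
        m*x'' + c*x' + k*x = F + g*x(t - tau),  tau = delay_periods / CF.
    At CF the feedback contributes g * sin(omega * tau) / omega to damping.
    """
    omega = 2.0 * math.pi * cf_hz
    tau = delay_periods / cf_hz
    return damping + feedback_gain * math.sin(omega * tau) / omega


PASSIVE_DAMPING = 1.0  # arbitrary units
for cf_hz in (500.0, 2000.0, 8000.0):
    omega = 2.0 * math.pi * cf_hz
    gain = 0.5 * omega * PASSIVE_DAMPING  # hypothetical gain scaled with CF
    well_timed = effective_damping(PASSIVE_DAMPING, gain, cf_hz, 0.75)
    badly_timed = effective_damping(PASSIVE_DAMPING, gain, cf_hz, 0.25)
    print(f"CF {cf_hz:6.0f} Hz: damping {well_timed:.2f} (0.75 period delay), "
          f"{badly_timed:.2f} (0.25 period delay)")
```

With this sign convention, a delay of three-quarters of a period opposes damping whereas a delay of one-quarter of a period adds to it, which is one way to picture the "proper timing" requirement.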


At the base of the mammalian cochlea, the sharp frequency tuning of auditory nerve fibers is fully determined by the vibration of the BM. The second filter postulated at one time is unnecessary, except as a minor adjunct. In this respect, the mammalian cochlea differs fundamentally from the cochleae of turtles (and, perhaps, other reptiles and birds), where second filters, resident in hair cells, sharpen the frequency tuning and enhance the sensitivity of BM vibrations. Furthermore, all the CF-specific nonlinearities of auditory-nerve responses, including dependence of frequency tuning on stimulus intensity, two-tone rate suppression and intermodulation distortion, also have correlates in BM motion.

Input/output functions for BM responses at basal cochlear sites are highly compressive (0.2–0.3 dB/dB) at frequencies around CF. This compressive behavior, which allows the cochlea to translate the enormous range (120 dB) of auditory stimuli into a range of vibrations (30–40 dB) suitable for transduction by the inner hair cells, probably provides the foundation of many psychoacoustic phenomena, such as the nonlinear growth of forward masking with masking level and the level dependence in the ability to detect changes in stimulus intensity (230).
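The compression figures can be checked with simple decibel arithmetic. The sketch below assumes, as a deliberate simplification, that a single compressive slope applies across the entire 120-dB input range; the result is therefore only a rough check against the 30–40 dB output range quoted above. The function names are hypothetical.

```python
def output_range_db(input_range_db, slope_db_per_db):
    """Output range for a constant growth rate (simplifying assumption)."""
    return input_range_db * slope_db_per_db


def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to an amplitude ratio (20*log10)."""
    return 10.0 ** (db / 20.0)


INPUT_RANGE_DB = 120.0
print(f"{INPUT_RANGE_DB:.0f} dB input range = "
      f"x{db_to_amplitude_ratio(INPUT_RANGE_DB):.0f} in sound pressure")
for slope in (0.2, 0.3):
    out_db = output_range_db(INPUT_RANGE_DB, slope)
    print(f"slope {slope} dB/dB -> {out_db:.0f} dB output range "
          f"(x{db_to_amplitude_ratio(out_db):.0f} in vibration amplitude)")
```

A million-fold range of sound pressure is thus mapped onto a range of vibration amplitudes of well under a hundred-fold.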

At the base of the cochlea, high sensitivity, frequency selectivity, and nonlinearity of BM vibrations are intimately linked and highly labile, depending on the physiological state of the organ of Corti, generally, and on the receptor potentials or transduction currents of outer hair cells specifically. This indicates that in normal cochleae the sensitivity and frequency tuning of BM responses are enhanced by a positive feedback from the organ of Corti, the cochlear amplifier. This amplification of cochlear vibrations appears to involve active forces generated by the outer hair cells under direct or indirect control of their transduction currents and capable of operating up to the highest frequency limits of mammalian hearing.

At the apex of the cochlea, compressively nonlinear vibrations, reminiscent of those at the base, have been recorded at the 400- to 800-Hz CF region of chinchilla, but it is not clear whether even the most sensitive data available are representative of the properties of intact cochleae. At more apical cochlear locations in guinea pig, one report described a weak expansive nonlinearity (413), but others found only linear responses (49, 178). These results suggest that at the extreme cochlear apex positive feedback (the cochlear amplifier) plays a lesser role than at the base.

Several other important questions about cochlear mechanics remain unanswered. What mechanical transformations intervene between BM motion and deflection of inner hair cell stereocilia? Are there different modes of inner hair cell stimulation? Does the positive feedback loop that boosts BM vibration include tuned elements additional to the passive resonance of the BM? Does this mechanism, the cochlear amplifier, actually make use of biological energy to boost BM vibrations? Does the cochlear amplifier reside in the basolateral membrane of outer hair cells, in their stereocilia, or in both structures? Are spontaneous otoacoustic emissions generated by the same mechanism that amplifies BM vibrations? What mechanisms are involved in the efferent control of the gain of the cochlear amplifier?


Early measurements of BM vibration were performed using light microscopy under stroboscopic illumination (391). The limited sensitivity of this technique restricted the observations to displacements larger than 1 μm and required the use of very intense stimuli (e.g., 140 dB SPL).

A.  Mössbauer Technique

The first in vivo recordings of BM vibrations (163) as well as the discovery (293) and early descriptions of compressive nonlinearities (306, 307, 355, 357, 360) were carried out using an application of the Mössbauer effect. A small source of gamma photons is placed on the BM, and a resonant absorber, tuned to the energy of photons emitted by the source at or near rest, is interposed between the source and a detector. With such a configuration, the rate of detected photons is a function of the velocity of the source (293, 355). Because this function is extremely nonlinear, undistorted velocity measurements are only possible over a narrow range of response magnitudes. This inherent nonlinearity also confounds the measurements of nonlinear cochlear phenomena, such as two-tone interactions.
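The nonlinearity of the velocity-to-count-rate function can be sketched numerically. The code below assumes a Lorentzian absorption line and works in normalized units (velocity expressed in units of the line half-width); the line depth, the offset of the absorber, and all names are hypothetical choices for illustration, not parameters of any of the cited experiments.

```python
def count_rate(velocity, center_velocity, half_width, depth=0.5, base_rate=1.0):
    """Detected photon rate vs. source velocity, assuming a Lorentzian line.

    The rate dips when the Doppler-shifted photon energy matches the
    absorber, i.e., when velocity equals center_velocity.
    """
    detuning = (velocity - center_velocity) / half_width
    return base_rate * (1.0 - depth / (1.0 + detuning * detuning))


def sensitivity(velocity, center_velocity=1.0, half_width=1.0):
    """Change in count rate per unit velocity, relative to the source at rest.

    A linear sensor would return the same value for every velocity.
    """
    at_rest = count_rate(0.0, center_velocity, half_width)
    return (count_rate(velocity, center_velocity, half_width) - at_rest) / velocity


for v in (0.1, 0.5, 2.0):
    print(f"velocity {v:.1f} half-widths: sensitivity {sensitivity(v):+.3f}")
```

In this sketch the sensitivity changes markedly once the velocity becomes comparable to the line width, which is why undistorted measurements are restricted to a narrow range of response magnitudes.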

B.  Capacitive Probe

Cochlear vibrations have also been measured using a capacitive probe. This technique, first used by von Békésy (391), relies on measuring the capacitance that exists between electrically conducting objects, such as a fixed electrode and the BM, when separated by a nonconducting gap (e.g., air). BM vibrations cause changes in capacitance, which in turn modulate a radio-frequency carrier signal (391, 402). Although this method is very sensitive and inherently linear, it requires the removal of most of the perilymph from scala tympani at the measurement site so that the BM remains covered only by a thin film of fluid. In practice, this is difficult to achieve without causing damage.

C.  Optical Methods

Most laboratories now measuring cochlear vibrations have adopted optical techniques, principally several forms of laser interferometry, which are essentially linear and at least one order of magnitude more sensitive than the Mössbauer method (321). In heterodyne laser interferometry, the laser light is split into two beams whose frequencies are shifted so that they differ by Δf. One of the beams is reflected from the vibrating target and the other serves as a fixed reference. Upon reflection, the beams are recombined at a detector that produces an interference or beat signal with frequency Δf. Vibrations of the target change the optical path length and translate into a phase modulation of the beat signal. Direct measurement of this phase change provides a signal proportional to the displacement of the target (47). Alternatively, target velocity may be extracted by frequency demodulation (157, 397), much as implemented in commercially available Doppler-shift laser vibrometers (261, 321).
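The relation between target displacement and the phase of the beat signal can be written out explicitly. The sketch below assumes normal-incidence reflection, so that a displacement d changes the optical path by 2d and the phase by 4πd/λ, and it assumes a helium-neon wavelength of 632.8 nm in a medium of refractive index 1; both the wavelength and the function names are assumptions for illustration, and the sign of the phase shift depends on convention.

```python
import math

ASSUMED_WAVELENGTH_M = 632.8e-9  # He-Ne laser line; an assumption, not from the text


def phase_shift_rad(displacement_m, wavelength_m=ASSUMED_WAVELENGTH_M):
    """Beat-signal phase shift for a given displacement (double-pass path)."""
    return 4.0 * math.pi * displacement_m / wavelength_m


def displacement_from_phase_m(phase_rad, wavelength_m=ASSUMED_WAVELENGTH_M):
    """Invert phase_shift_rad: recover displacement from measured phase."""
    return phase_rad * wavelength_m / (4.0 * math.pi)


def beat_signal(t_s, beat_hz, displacement_m, wavelength_m=ASSUMED_WAVELENGTH_M):
    """Idealized detector output: a carrier at beat_hz, phase-modulated by motion."""
    return math.cos(2.0 * math.pi * beat_hz * t_s
                    + phase_shift_rad(displacement_m, wavelength_m))


one_nm = 1.0e-9
phi = phase_shift_rad(one_nm)
print(f"1 nm displacement -> {phi:.4f} rad of phase modulation")
print(f"recovered displacement: {displacement_from_phase_m(phi) * 1e9:.3f} nm")
```

Because the phase is proportional to displacement, demodulating the phase yields displacement directly, whereas demodulating the instantaneous frequency (the time derivative of the phase) yields velocity.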

Yet another type of interferometer (200, 269) utilizes the self-mixing effect of a laser diode; when some of the emitted light is reflected back into the laser cavity, the intensity of the light is modulated in proportion to the distance between laser and target (190). The modulation is measured by a photodiode located behind the laser crystal. One limitation of this method is that the modulation becomes significantly nonlinear for displacement amplitudes larger than 30 nm.

Laser interferometers are typically used in conjunction with a compound microscope to facilitate the identification and targeting of vibrating cochlear structures (44, 47, 261, 321). In the case of the confocal heterodyne velocimeter used by the ITER group (157), the interferometer is integrated with a specially designed optical sectioning microscope intended to confine the measurement site within a thin slice of tissue.

Cochlear motion has also been measured (204, 406) using a fiber-optic probe whose tip is placed in close proximity to the BM (40). The probe consists of one or more central fibers that deliver light to the target and surrounding fibers that receive reflected light. The amount of light captured by the receiving fibers varies approximately in proportion to target displacement so that target vibration modulates light intensity, which is measured with a photodiode.

With the exception of the confocal heterodyne velocimeter (157), the self-mixing laser interferometer (200), and a newly developed heterodyne displacement interferometer (44), most optical techniques for measuring cochlear vibration require the use of light-reflecting targets (small flat mirrors, glass microbeads, or polystyrene microbeads coated with gold) to make up for the transparency of the BM, the organ of Corti, or the TM. This raises the question of whether these reflective objects (or radioactive metal foils in the case of the Mössbauer technique) follow the movement of the cochlear partition. This question has generated conflicting answers. One study found that beads placed on the TM or on Claudius' cells in the apical turns of isolated cochleae faithfully follow their motion (132). Another study at the apex of isolated cochleae confirmed the result in the case of the TM but found that beads did not follow the motion of Claudius' cells closely (180). Yet a third study in the basal turn of the in vivo cochlea found no significant differences between the motion of beads and adjacent sites of the BM (44, 45).


We are grateful to Graeme Yates, who reviewed a previous version of this manuscript and helped to improve it greatly. We are very sorry that he could not see its final version. We also thank Mary Ann Cheatham, Gulam Emadi, and Robert Withnell for their comments; Peter Dallos and David Mountain for informative discussions; and Shyamla Narayan, Ed Overstreet, and Andrei Temchin for their help in the preparation of the figures.

We were supported by National Institute on Deafness and Other Communication Disorders Grants R01-DC-00419 and P01-DC-00110.


  • Address for reprint requests and other correspondence: M. A. Ruggero, Dept. of Communication Sciences and Disorders, Northwestern University, 2299 North Campus Dr., Evanston, IL 60208–3550 (E-mail:mruggero{at}

  • 1 The assumption that cochlear input impedance is resistive probably does not apply to responses of the chinchilla and cat cochleae to stimulus frequencies <200 Hz. The helicotrema is especially large in the cochleae of cat and chinchilla so that low-frequency pressure waves are probably shunted between scala vestibuli and scala tympani more effectively than in the cochleae of other species, causing reduced pressure difference across the cochlear partition (58). Accordingly, BM responses at the base of the cochleae of cat and chinchilla (as well as their correlate, cochlear microphonics recorded at the round window) have magnitudes that grow at a rate of ∼12 dB/octave and phases that lead stapes displacement by 180 degrees (58, 330, 331).

  • 2 If the phase versus frequency curves reflect two component waves, one “slow” (the traditional traveling wave) and the other “fast” (an acoustic pressure wave), then the aforementioned quantization of phase lags at the high-frequency plateaus is more apparent than real, akin to a mathematical artifact. It simply results from the relative magnitudes of the two vibration components and from the unwrapping of the phases: the phase of the plateau is determined by the accumulation of phase (due to the slow component) at the lowest frequency at which the fast component is dominant.

  • 3 The phase-locked responses of auditory-nerve fibers in frogs exhibit large phase accumulations (143, 144) comparable to those of mammalian auditory-nerve fibers (7). Whereas the latter reflect the phase accumulations of a mechanical (BM) traveling wave, the phase accumulations in fibers innervating the frog auditory organs, which lack BMs, may result from electrical filtering within hair cells (283).

  • 4 On the basis of its structure, the basilar membrane may be divided into medial and lateral regions. The medial region (arcuate zone) extends from the osseous spiral lamina to the outer pillar cells. The lateral region (pectinate zone) extends from the outer pillars to the spiral ligament. The BM is made up of radial fibrils imbedded in abundant ground substance, with its tympanic face lined by a single layer of (mesothelial) cells.

  • 5 Exceptionally, CF responses exhibited slightly compressive growth rates in two cochleae immersed in oxygenated culture medium (382).

  • 6 The effects of filling the middle ear with fluid in the ITER preparation have not been well specified, but they are frequency dependent and involve stimulus attenuations lower than 35 dB at frequencies relevant to third- and fourth-turn measurements (e.g., they grow from −3 to +23 dB as frequency increases from 100 to 1,500 Hz; see Fig. 15 in Ref. 110). Moreover, the sensitivity of malleus vibrations in the ITER preparation (0.6 μm/Pa) is actually larger than in the living guinea pig at frequencies <250 Hz (0.16–0.22 μm/Pa) (49, 223).

  • 7 When this behavior was first observed (307) it seemed to conflict with neural rate suppression, whose strength was thought to be independent of probe-tone level (162). This idea, however, arose from the study of low-threshold fibers, whose response rates saturate at relatively low stimulus levels and thus cannot reflect level-dependent variations of BM responses at moderate and high stimulus levels. In high-threshold fibers, which have broad dynamic ranges and do not reach a rate plateau at physiologically relevant intensities, the magnitude of suppression decreases at high probe-tone levels, much as at the BM (368).

  • 8 The single apparent exception was a study that reported that, at the 3.5-mm site of the chinchilla BM, maximal suppression of responses to CF tones coincided with peaks of BM velocity, rather than displacement (332). In retrospect, it has become clear that this result was artifactual (376). It is estimated that the phases of responses to low-frequency tones reported by Ruggero et al. (332) must be corrected by ∼90 degrees in the lagging direction. Applying this correction brings the phases of maximal suppression at the base of the chinchilla BM into agreement with those of basal BM sites in other species and with corresponding findings in chinchilla auditory-nerve fibers (376).

  • 9 Brownian motion may play some role in cochlear mechanics. It has been estimated that, if the stereocilia of inner hair cells are free standing, their motion at low stimulus levels might be aided by “stochastic resonance” driven by Brownian motion in the endolymph (89, 159). There is also evidence that, in the absence of acoustic stimulation, the BM exhibits frequency-tuned vibrations (263). Their amplitudes (<10 μm/s), however, appear to be insufficient to generate neural excitation.

  • 10 The single exception appears to be a single-resonance model (113) in which fairly realistic BM tuning was achieved solely by reducing damping by means of an electromechanical feedback loop. Unfortunately, the model was based on the assumption that depolarization elongates outer hair cells, i.e., a polarity of electromotility opposite to that actually present in outer hair cells (10, 350; see also Ref. 224).

  • 11 Hubbard has presented a unique, two-mode model of the cochlea consisting of two coupled transmission lines. Although only one of the transmission lines includes resonances, the other one does include delays that vary systematically with longitudinal cochlear position. The velocities of the two lines match in a short region of the cochlea, in which mutual feedback occurs, producing “BM vibrations” that mimic experimental values amazingly well (150).