Introduction to Hearing.



The auditory system is essential for normal communication, including species-specific vocalizations that convey information about territory, reproduction, warning, and so on. Human communication in the form of speech is the basis for the unique place of humans in the evolutionary chain. Sounds can be thought of as playing a role in imaging: they often call up a visual image from memory. The sound of a brook, a waterfall, the rustle of grass, or the rattle of a rattlesnake were all experienced long before speech evolved. We hear and visualize things and events that are important to us, not the actual sound waves or light waves.


The study of the auditory system involves morphological and physiological approaches to auditory function. The auditory system is perhaps the most complex sensory system, with seven (or more) distinct processing levels in the CNS, as seen in figure 1. Understanding the function of these structures is limited by our ability to manipulate them in meaningful ways, especially in humans, where we must wait for appropriate lesions in the brain and note the resulting deficits. In contrast, the use of invasive studies in animal-based research has led to a vast knowledge base over the years that encompasses anatomical, physiological, molecular, and genetic data, often obtained in the same set of experiments. Modern techniques permit correlation of structure and function in the brain. There are many newer methodologies that permit gathering information in humans; the current ‘glitzy’ approaches use various imaging techniques that help to localize function in the brain, e.g., PET, MRI, and fMRI.


A tiny bit of History.

The ancient Greeks were analytical in their approach to science, and they may have discovered the theoretical basis for music. It is of course possible that similar levels of comprehension of sound and hearing were present in other cultures, e.g., in China, though the documentation isn’t available.


Aristotle generalized the notion of Pythagoras and realized that the propagation of sound involved variation in the density of the medium. He thought, however, that the entire medium had to move, which proved to be incorrect.


Mersenne, around 1650, shouted ‘benedicam Dominum’ after finding that it took him 1 second to say it. He then found the distance from a wall at which the head of the echo arrived just as the tail of the shout ended; from that round-trip distance and time he computed 316 meters/second, very close to the actual value of 344 m/s, and very impressive when the technique is considered.


The modern era for hearing research (and many other areas).

Invention of the triode (ca. 1906): a vacuum tube used in electronic amplifiers, playing the same role that transistors do today.

Microelectrodes (ca. 1950): these provide the basis for the single-neuron studies so essential to our present understanding of the brain.




Clinical perspective


One area of hearing science that has often been ignored is its clinical application. Recently, there has been greater emphasis on several clinical areas:

  1. Molecular genetics & deafness, e.g., the use of phenotypic markers such as those found in Waardenburg’s syndrome (white forelock and cochlear deafness) and Usher’s syndrome (retinitis pigmentosa and congenital nerve deafness; autosomal recessive).
  2. Regeneration of the auditory nerve: data from amphibians and lower vertebrates indicate a capacity for repair of neurites and regeneration of proper neural connections that results in the recovery of auditory function (e.g., Corwin, 1986).
  3. Sound deprivation: studies of sound deprivation indicate the need for intervention, since acoustic experience is important for the development of linguistic ability.
  4. Ototoxicity: the destruction of the sensory transduction apparatus by the application of drugs.
  5. Viral infections & autoimmune diseases, e.g., rubella, herpes, HIV, CMV, …
  6. Otoacoustic emissions (OAEs) have proven to be a very useful clinical tool. OAEs help define a functional phenotype for diverse hearing disorders, including those resulting from hypoxia and anoxia, noise trauma, Ménière’s syndrome, genetic hearing loss, and idiopathic hearing loss. OAEs can also be used in screening infants: the measurement works in most newborns because a cooperative subject is not required.



Introduction to sound.


Sound requires a medium to exist: there is no sound when a spaceship blows up in outer space. Air molecules provide the necessary element for sound transmission to occur.


Sound signals:

There are several properties of a sound signal that characterize it. Sound moves through various media such as air, water, and solids with a velocity (c) that depends on the medium.

Often the signal repeats as a function of time as it moves through the medium. The number of repeats in a one-second period is its frequency (f), which has units of Hertz (Hz). The reciprocal of frequency is the period (T = 1/f).

The distance traveled in one period is called its wavelength (λ), which has units of meters.

λ = cT = c/f

See the table below for wavelengths under the conditions stated. The wavelength is important because it determines how strongly a sound will interact with structures in the environment.



Table 1

Wavelengths of sound (λ = c/f).

Frequency (Hz)        c = 344 m/s (air, 20° C)    c = 1,500 m/s (water)    c = 6,000 m/s (solid)
20 (low end)          17.2 m                      75 m                     300 m
256 (middle C)        1.34 m                      5.86 m                   23.4 m
1,000 (popular)       34.4 cm                     1.5 m                    6 m
10,000 (high-old)     3.44 cm                     15 cm                    60 cm
20,000 (high-young)   1.72 cm                     7.5 cm                   30 cm




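The entries in the table follow directly from λ = c/f. A minimal sketch in Python (the medium labels are just the usual associations with those speeds, per the text's mention of air, water, and solids):

```python
# Wavelength (m) = speed of sound (m/s) / frequency (Hz), i.e., lambda = c / f.
freqs_hz = [20, 256, 1000, 10000, 20000]
speeds = {"air (20 C)": 344.0, "water": 1500.0, "solid": 6000.0}  # m/s

for f in freqs_hz:
    row = {medium: round(c / f, 3) for medium, c in speeds.items()}
    print(f, "Hz:", row)
```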


Note that air molecules constitute an elastic medium. Each molecule is accelerated in the direction of the pressure change and moves only a small distance before colliding with another molecule, thereby transferring its kinetic energy. Fig. 2 illustrates the change in density of the air molecules produced by the movement of a piston in a tube. The change in density shown is far greater than would occur for a normal sound.


Note for future reference that the resonant wavelength in a pipe of length L meters, closed at one end, is λ = 4L (see fig. 3). Do you understand how this is arrived at? The wavelength of sound will be shown later to be important in the manner in which sound interacts with a structure, e.g., the head and the distance between the two ears. That is, the sound that enters the two ears will vary in intensity and phase due to the distance traveled and interaction with the head, depending on the position and frequency of the sound source.
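The quarter-wave relation can be turned around to give the resonant frequency of a closed pipe, f = c/(4L). A small sketch; the 2.5 cm length used here is only a rough, illustrative value for the human ear canal, not a figure from the text:

```python
# Fundamental resonance of a pipe closed at one end: lambda = 4L, so f = c / (4L).
C_AIR = 344.0  # speed of sound in air at 20 deg C, m/s

def quarter_wave_resonance(length_m):
    """Fundamental resonant frequency (Hz) of a closed-end pipe of the given length."""
    return C_AIR / (4.0 * length_m)

# A rough, illustrative ear-canal length of 2.5 cm gives a resonance near 3.4 kHz.
print(quarter_wave_resonance(0.025))
```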



The intensity of a wave at any point in space is defined as the amount of energy passing perpendicularly through a unit area in unit time (units of watts/cm²). The most frequently employed measurement technique quantifies the amplitude of a sound by its pressure. Intensity varies with distance from a sound source, decreasing as the inverse square of the distance. Pressure is measured with a microphone, often a condenser microphone because of its accuracy and stability. What is truly remarkable is the range of intensities we experience in our environment, as shown in fig. 3. Atmospheric pressure (about 15 lbs/in², roughly 1 bar) corresponds to a sound pressure level of 194 dB. A logarithmic scale is used to represent sound level because the sound pressure between threshold and the experience of pain ranges over a factor of 1,000,000. It is clumsy to write such large numbers, so sound pressure level (SPL) employs the relation:


SPL = 20 · log10(P / Pref) in dB, where Pref = 0.0002 dynes/cm²


The reference pressure corresponds to the least intense sound that the average human can hear, and also corresponds to an energy level of 10⁻¹⁶ watts/cm². Pascals are often used for sound measurements: 1 Pascal = 94 dB SPL, and the reference level of 0.0002 dynes/cm² equals 20 µPa.
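The SPL relation above is easy to check numerically; a minimal sketch using the 20 µPa reference:

```python
import math

P_REF = 2e-5  # reference pressure: 0.0002 dynes/cm^2 = 20 micropascals (Pa)

def spl_db(pressure_pa):
    """Sound pressure level in dB re 20 uPa: 20 * log10(P / Pref)."""
    return 20.0 * math.log10(pressure_pa / P_REF)

print(spl_db(P_REF))     # threshold of hearing: 0 dB SPL
print(spl_db(1.0))       # 1 Pascal: ~94 dB SPL
print(spl_db(101325.0))  # atmospheric pressure: ~194 dB SPL
```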


While figure 4 illustrates the sound levels produced by some common sources, it is important to note that sounds often have different energies at different frequencies. Further, the human ear, like most vertebrate ears, has a frequency-dependent hearing range; that is, the threshold for detecting a sound and the maximum sound level that can be ‘heard’ without destroying your hearing both vary with frequency, as shown in fig. 5.


While the ear responds to this tremendous range of sound levels, we will learn that the auditory nerve, the pathway for all acoustic information to the central nervous system, is limited, in general, to a dynamic range of 30 to 40 dB. Auditory nerve fibers respond to sound with a maximum discharge rate of approximately 200 spikes/s. Therefore, a change in rate of 1-2 spikes/s would indicate about a 1 dB change in stimulus level. How, then, does the auditory system function over an additional 5 orders of magnitude of sound intensity?


Some detection thresholds:

A 12% change in amplitude can be detected for 1 < f < 4 kHz.

A 0.5 to 4 Hz change in frequency can be detected for f < 3 kHz.

Note that the musical scale is limited to f < 4 kHz. This may be related to the difficulty of distinguishing between two tones on the basis of frequency for f > 6 kHz.


We will see later that there is a phenomenon termed phase locking in the discharges of auditory nerve fibers that coincides with the musical range and is involved in the debate as to whether the auditory code for frequency lies in the temporal domain or utilizes discharge rate.


The composition of sound.

Any periodic waveform (sound pressure) can be decomposed using Fourier analysis. The 1st harmonic, called the fundamental frequency (f0), relates to the period of the waveform (T = 1/f0). Harmonics of the fundamental (fn = n·f0) are added at appropriate phases to reproduce the waveform (see fig. 2.6 for a waveform that consists of three harmonics; from Stephen Handel, Listening, 1989). Also note the reconstruction of a squarewave in fig. 2.6b. Each successive odd harmonic is added with an amplitude of 1/n. By the time the 5th harmonic is added, a reasonable replication of the squarewave is obtained. An infinite number of harmonics would be required to reproduce the waveform exactly. Even then, there is something called the Gibbs phenomenon that prevents a perfect squarewave from being constructed. We will not consider this further.
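The squarewave construction just described (odd harmonics with amplitude 1/n) is easy to reproduce numerically; a minimal sketch:

```python
import math

def square_wave_partial(t, f0, n_harmonics):
    """Partial Fourier synthesis of a square wave: odd harmonics with amplitude 1/n."""
    total = 0.0
    for k in range(n_harmonics):
        n = 2 * k + 1  # odd harmonics: 1, 3, 5, ...
        total += math.sin(2 * math.pi * n * f0 * t) / n
    return (4.0 / math.pi) * total  # the 4/pi scaling gives unit amplitude in the limit

# At the peak of the waveform (t = T/4) the partial sum converges toward 1.
f0 = 100.0
t = 1.0 / (4.0 * f0)
for n_harm in (1, 3, 5, 100):
    print(n_harm, "odd harmonics:", square_wave_partial(t, f0, n_harm))
```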


Several common auditory stimuli are illustrated in fig. 2.7. The 1st is white noise, named in analogy to white light, which contains all the colors (frequencies). White noise has a Fourier spectrum that is flat: all frequencies at equal amplitude. A 2nd frequently used stimulus is the click (an impulse), a sound well marked in its time of occurrence because it occupies an infinitely short time epoch.

The third stimulus shown corresponds to a pure tone (a sinewave). The three spectra shown for signals of different duration indicate that it is only in the limit, as the stimulus duration approaches infinity, that a pure tone is achieved (pure implying no energy at any frequency other than f0).
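This duration-spectrum trade-off can be demonstrated numerically: the shorter the tone burst, the wider its spectral peak (roughly as 1/duration). A sketch using NumPy; the sample rate and the half-maximum width measure are my choices, not from the text:

```python
import numpy as np

fs, f0 = 8000.0, 1000.0  # sample rate and tone frequency, Hz

def spectral_width(duration_s):
    """Approximate half-maximum width (Hz) of a tone burst's spectral peak."""
    n = int(fs * duration_s)
    x = np.sin(2 * np.pi * f0 * np.arange(n) / fs)
    nfft = 16 * n  # zero-pad for fine frequency resolution
    spec = np.abs(np.fft.rfft(x, nfft))
    df = fs / nfft  # frequency-bin spacing of the padded spectrum
    return df * np.sum(spec > 0.5 * spec.max())

# The peak narrows as duration grows; only an infinitely long tone is 'pure'.
for d in (0.01, 0.1, 1.0):
    print(d, "s ->", round(spectral_width(d), 2), "Hz wide")
```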


Some other features of signals that will be of importance to us later are illustrated in fig. 2.8. Due to physical limitations of sound-generation apparatus, a finite time is required to turn a sound on or off. In music these times are referred to as the attack and decay times. They can be related to the filter properties of the sound-generation device. One measure of a filter is its quality factor, or Q value: the higher the Q, the narrower the filter and the more the system will ring, i.e., the attack and decay will be long. This is why a loudspeaker has a low Q, so that it can respond quickly to changes in the input signal. An instrument like a violin has different requirements: it has to be loud and rich in harmonics, and is therefore more compliant than the rather stiff speaker. The speaker does not respond strongly to a perturbation, though amplifiers are used to boost its output. The Q of the speaker may be < 1 while the Q of the violin is 40-50. A high Q implies that the filter is selective for a narrow range of frequencies.


We will see that the Q of the auditory system is 6-10 in the mid-frequency range. Further, some species have specializations with Q values of several hundred in their cochleas and auditory nerve fibers.
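Q ties together bandwidth and ringing: the 3-dB bandwidth is f0/Q, and a resonator's amplitude takes roughly Q/π cycles to decay to 1/e. A sketch using the Q values quoted above (the 1 kHz center frequency is just an illustrative choice):

```python
import math

def bandwidth_hz(f0, q):
    """3-dB bandwidth of a resonator: BW = f0 / Q."""
    return f0 / q

def ring_cycles(q):
    """Approximate cycles for a resonator's amplitude to decay to 1/e: Q / pi."""
    return q / math.pi

f0 = 1000.0  # illustrative mid-frequency, Hz
for label, q in [("loudspeaker", 0.7), ("auditory filter", 8.0), ("violin", 45.0)]:
    print(f"{label}: BW ~{bandwidth_hz(f0, q):.0f} Hz, rings ~{ring_cycles(q):.1f} cycles")
```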


Returning to fig. 2.8: in our use of auditory stimuli we will have occasion to vary or control the rise and fall time of a sound stimulus. The shape of the envelope will affect the outcome; it has been found that sharp edges in the envelope result in ‘energy splatter’. For this reason the sinewave stimulus is often multiplied by a raised cosine to reduce this effect. The rise/fall time is often varied depending on the frequency response of the system under study.
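The benefit of raised-cosine gating can be demonstrated by comparing the off-frequency energy of an abruptly gated tone with that of a ramped one. A sketch using NumPy; the tone frequency, ramp duration, and the 200 Hz "splatter" criterion are all illustrative choices, not values from the text:

```python
import numpy as np

fs, f0, dur, ramp = 16000.0, 1234.5, 0.1, 0.01  # sample rate, tone freq (Hz), durations (s)

n = int(fs * dur)
tone = np.sin(2 * np.pi * f0 * np.arange(n) / fs)

# Raised-cosine ramps applied to the first and last `ramp` seconds of the envelope.
nr = int(fs * ramp)
env = np.ones(n)
rise = 0.5 * (1.0 - np.cos(np.pi * np.arange(nr) / nr))
env[:nr] = rise
env[-nr:] = rise[::-1]

def splatter(x):
    """Fraction of spectral energy more than 200 Hz away from the tone frequency."""
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    return spec[np.abs(freqs - f0) > 200.0].sum() / spec.sum()

print("abrupt gating: ", splatter(tone))
print("raised-cosine: ", splatter(tone * env))  # far less energy splatter
```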



There is probably some dispute as to what is noise and what is information, or perhaps music. One thing that is clear, however, is that continued exposure to loud sounds is detrimental to maintaining low hearing thresholds as one ages. An example of this is illustrated in fig. 6.12, which was part of a study of a tribe in Africa. The presumption is that the tribe members were not exposed to industrial noise, as likely occurred in the other three populations. What is obvious is that even at the ripe old age of the mid-forties, these tribe members still retained good thresholds for a 14 kHz tone. This is a relatively high frequency, and one that is probably not heard by most people at that age.


Lastly, as a precautionary note, the last figure shows the effect of signal-to-noise level on the perception of monosyllables. Clearly, higher S/N ratios improve the number of correct identifications. One of the parameters varied is the number of monosyllables; note the strong effect of the size of the stimulus set. Just something to keep in mind for the future.