05.01.2013 Views

Perceptual Coherence : Hearing and Seeing


358 Perceptual Coherence

for each instrument. The results indicated that listeners were quite accurate, roughly 85% to 90% correct.

Using the same sounds, human performance was compared to a classification model that used only spectral information. The classification model compared each test sound to a set of prototypical spectra for both the oboe and saxophone. To obtain the prototypes, Brown (1999) used oboe and saxophone passages (about 1 min in length) to derive an overall spectral envelope averaged across notes. The overall envelope portrays the sets of overlapping sound body resonances that amplify the excitation in different frequency ranges. These overlapping resonances are termed formants, and a formant in the 800–1000 Hz range would amplify the fourth harmonic of 200–250 Hz excitations, the third harmonic of 300–333 Hz excitations, the second harmonic of 400–500 Hz excitations, and the fundamental of 800–1000 Hz excitations. Thus, across a set of notes, there would be consistent energy in the 800–1000 Hz range, and that formant would be part of the signature of a particular instrument.
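
The harmonic arithmetic in this passage can be checked with a few lines of Python. This is only an illustrative sketch: the function name `harmonics_in_band` and the cap of ten harmonics are assumptions made for the example, not anything taken from Brown (1999).

```python
def harmonics_in_band(f0, band=(800.0, 1000.0), max_harmonic=10):
    """Return the harmonic numbers n for which n * f0 falls inside the band.

    The band defaults to the 800-1000 Hz formant discussed in the text;
    max_harmonic is an arbitrary cap chosen for this illustration.
    """
    low, high = band
    return [n for n in range(1, max_harmonic + 1) if low <= n * f0 <= high]


# Fundamentals taken from the ranges mentioned in the text.
for f0 in (200, 250, 300, 333, 400, 500, 900):
    print(f0, "Hz ->", harmonics_in_band(f0))
```

Whatever note is played, some harmonic lands in the formant region, which is why the averaged envelope shows consistent energy there.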

To measure the performance of the computer model, the short segments presented to the listeners were partitioned into 23 ms windows, and the spectral envelopes of the windows were compared to the prototypical envelopes. The computer model then calculated the likelihood that the segment was played by the oboe or by the saxophone. Overall, the computer was better than human listeners at identifying the oboe but equal at identifying the saxophone. The success of the computer model demonstrates that the spectral information was sufficient to distinguish between the two instruments, but of course it does not unambiguously demonstrate that human listeners are making use of the same information.
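
The general shape of such a model can be sketched in Python: cut the segment into 23 ms windows, compute a coarse spectral envelope for each window, and sum a per-window log-likelihood against each prototype. Everything beyond the 23 ms window length is an assumption made for illustration: the eight-band log envelope, the Hann taper, the isotropic Gaussian likelihood with `sigma=1.0`, and the synthetic single-formant "instruments" are stand-ins, not the features or statistics used by Brown (1999).

```python
import cmath
import math

WINDOW_S = 0.023  # 23 ms window length, as stated in the text


def split_windows(samples, sample_rate, window_s=WINDOW_S):
    """Partition a signal into consecutive, non-overlapping windows."""
    size = int(round(sample_rate * window_s))
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def envelope(window, n_bands=8, floor=1e-3):
    """Coarse log-magnitude envelope of one window (naive DFT, assumed features)."""
    n = len(window)
    # Hann taper to limit spectral leakage between bands.
    tapered = [x * (0.5 - 0.5 * math.cos(2 * math.pi * t / (n - 1)))
               for t, x in enumerate(window)]
    mags = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(tapered))
        mags.append(abs(coeff))
    per_band = len(mags) // n_bands
    return [math.log(floor + sum(mags[b * per_band:(b + 1) * per_band]) / per_band)
            for b in range(n_bands)]


def prototype(samples, sample_rate):
    """Average the envelope over all windows of a training passage."""
    envs = [envelope(w) for w in split_windows(samples, sample_rate)]
    return [sum(col) / len(envs) for col in zip(*envs)]


def classify(samples, sample_rate, prototypes, sigma=1.0):
    """Sum per-window Gaussian log-likelihoods; return the best label and all scores."""
    envs = [envelope(w) for w in split_windows(samples, sample_rate)]
    scores = {}
    for name, proto in prototypes.items():
        scores[name] = sum(
            -sum((e - p) ** 2 for e, p in zip(env, proto)) / (2 * sigma ** 2)
            for env in envs)
    return max(scores, key=scores.get), scores


def synth(f0, formant_hz, sample_rate=8000, seconds=0.1):
    """Hypothetical toy instrument: harmonics of f0 shaped by a single formant."""
    n = int(sample_rate * seconds)
    partials = [(h * f0, math.exp(-((h * f0 - formant_hz) / 400.0) ** 2))
                for h in range(1, int(sample_rate / 2 / f0))]
    return [sum(a * math.sin(2 * math.pi * f * t / sample_rate) for f, a in partials)
            for t in range(n)]


SR = 8000
# Prototypes are built from one note (220 Hz); the test sound is a different note.
prototypes = {
    "instrument_a": prototype(synth(220, 500, SR), SR),
    "instrument_b": prototype(synth(220, 1500, SR), SR),
}
label, scores = classify(synth(330, 500, SR), SR, prototypes)
print(label, scores)
```

Because the prototypes and the test sound use different fundamentals, a correct label here depends on the formant position rather than on the pitch, which is the point the passage makes about the spectral envelope.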

In subsequent work, J. C. Brown et al. (2001) used four wind instruments: oboe, saxophone, clarinet, and flute. The results were similar to the previous work: human and computer identification were roughly identical. For the computer model, the most effective spectral information was the shape of the spectrum due to overlapping resonances or the jaggedness of the spectrum measured by the variation in the amplitudes of the harmonics. A detailed spectral envelope description yielded better performance than just the frequency of the spectral centroid, which is based on only one number.
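
A small Python sketch shows why a single number such as the centroid can miss differences that a fuller description captures. The centroid below is the standard amplitude-weighted mean frequency. The `jaggedness` function is an assumed stand-in (mean deviation of each harmonic from the average of its two neighbors), since the text does not give the exact measure used by Brown et al. (2001), and the two harmonic spectra are invented for the example.

```python
def spectral_centroid(freqs, amps):
    """Amplitude-weighted mean frequency of a set of harmonics."""
    return sum(f * a for f, a in zip(freqs, amps)) / sum(amps)


def jaggedness(amps):
    """Assumed irregularity measure: mean absolute deviation of each interior
    harmonic amplitude from the average of its two neighbors."""
    devs = [abs(amps[i] - (amps[i - 1] + amps[i + 1]) / 2)
            for i in range(1, len(amps) - 1)]
    return sum(devs) / len(devs)


# Two hypothetical spectra over the same five harmonics of 200 Hz.
freqs = [200, 400, 600, 800, 1000]
smooth = [1.0, 1.0, 1.0, 1.0, 1.0]
jagged = [1.0, 0.2, 1.0, 0.2, 1.0]

for name, amps in (("smooth", smooth), ("jagged", jagged)):
    print(name, spectral_centroid(freqs, amps), jaggedness(amps))
```

Both spectra have essentially the same centroid, so the centroid alone cannot separate them, while the harmonic-to-harmonic variation does.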

Two studies illustrate that the spectral envelope can be used to discriminate among natural events. Freed (1990) investigated whether listeners could judge the hardness of mallets when they struck aluminum cooking pots. The mallets were made of metal, wood, rubber, cloth-covered wood, felt, and felt-covered rubber. The judges based their ratings on the spectra. Mallets were judged as hard if the sound energy initially was concentrated at higher frequencies and then shifted to lower frequencies over time (about 300 ms). As can easily be imagined, the harder mallets created a louder sound, which also affected the hardness judgments. X. Li, Logan, and
