DISTINGUISHING SPECTRAL AND TEMPORAL PROPERTIES OF SPEECH USING AN INFORMATION-THEORETIC APPROACH

Thomas Ulrich Christiansen (1) & Steven Greenberg (2)
(1) Ørsted*DTU, Technical University of Denmark; (2) Silicon Speech

ID 1192

The spectro-temporal coding of Danish consonants was investigated using an information-theoretic approach. Listeners were asked to identify eleven different consonants spoken in a CV[l] syllable context. Each syllable was processed so that only a portion of the original audio spectrum was present. Narrow speech bands, with center frequencies of 750 Hz, 1500 Hz and 3000 Hz, were presented individually and in combination with each other. The modulation spectrum of each band was low-pass filtered at 24, 12, 6 and 3 Hz. Confusion matrices were computed from the consonant-identification data. From these, the amount of information transmitted for each of three phonetic features (voicing, manner and place) was calculated for each condition. These analyses indicate that (1) accurate, robust decoding of place-of-articulation information requires broadband cross-spectral integration, and (2) place-of-articulation information is most closely associated with the modulation spectrum above 12 Hz.
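
The step from confusion matrix to information transmitted per phonetic feature can be sketched as follows. This is a minimal illustration, assuming the standard mutual-information measure between stimulus and response; the abstract does not state the exact estimator used. The function names, the four-consonant matrix and its counts are hypothetical and are not data from the study.

```python
import math

def transmitted_information(confusions):
    """Mutual information in bits between stimulus (rows) and response (columns)."""
    total = sum(sum(row) for row in confusions)
    row_sums = [sum(row) for row in confusions]
    col_sums = [sum(col) for col in zip(*confusions)]
    info = 0.0
    for i, row in enumerate(confusions):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing
                info += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return info

def collapse_by_feature(confusions, labels):
    """Pool a consonant confusion matrix into a feature-level confusion matrix.

    labels[k] is the feature value (e.g. "voiced") of consonant k.
    """
    classes = sorted(set(labels))
    index = {c: k for k, c in enumerate(classes)}
    pooled = [[0] * len(classes) for _ in classes]
    for i, row in enumerate(confusions):
        for j, n in enumerate(row):
            pooled[index[labels[i]]][index[labels[j]]] += n
    return pooled

# Hypothetical counts for four consonants (rows: spoken, columns: reported).
consonants = ["p", "b", "t", "d"]
toy_confusions = [
    [8, 1, 1, 0],
    [1, 8, 0, 1],
    [2, 0, 7, 1],
    [0, 2, 1, 7],
]
voicing = ["voiceless", "voiced", "voiceless", "voiced"]
place = ["labial", "labial", "alveolar", "alveolar"]

voicing_bits = transmitted_information(collapse_by_feature(toy_confusions, voicing))
place_bits = transmitted_information(collapse_by_feature(toy_confusions, place))
print(f"voicing: {voicing_bits:.3f} bits, place: {place_bits:.3f} bits")
```

Pooling the matrix by feature before computing the measure means that a confusion between two consonants sharing a feature value (for example, [p] reported as [t] for voicing) counts as correct transmission of that feature. Repeating the calculation for each band combination and each modulation cutoff would give one value per feature per condition.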