Session Sound to Sense:

Sound to Sense: Modelling Fine Phonetic Detail

Type: special
Chair: John Local, Sarah Hawkins
Date: Thursday - August 09, 2007
Time: 16:00
Room: 1 (Red)

 

Sound to Sense-1 SOUND TO SENSE: INTRODUCTION TO THE SPECIAL SESSION
Sarah Hawkins, University of Cambridge
John Local, University of York
  This paper forms the Introduction to the ICPhS-07 Special Session “Sound to Sense” (S2S). S2S is a Marie Curie Research Training Network (funded May 2007-2011) for interdisciplinary research on modeling the role of fine phonetic detail in speech understanding/recognition by humans and machines. The special session includes four position papers that illustrate some of the subject areas and aims of S2S, and two discussion papers. This Introduction sets the scene by defining what fine phonetic detail is, describing the theoretical motivations for investigating its influence on speech processing, and outlining why it is timely to investigate its role in how speech is understood.
Sound to Sense-2 NORMALIZATION OF CZECH VOWELS FROM CONTINUOUS READ TEXTS
Jan Volín, Institute of Phonetics, Charles University in Prague
David Studenovský, Institute of Phonetics, Charles University in Prague
  The effectiveness of vowel normalization methods has been suggested to be language-dependent. Six such methods were applied to Czech vowels to see which of them would lead to the best results in follow-up discriminant analyses while preserving linguistically informative detail. The discriminant analyses had lower success rates for read continuous texts with multiple tokens from 75 speakers than for the carefully pronounced monosyllables used previously by other authors, suggesting that the results might also be material-dependent. On the other hand, our variable data offered additional insights into sources of contextual variation and allowed us to identify the so-called enhancing contexts in which the identity of a vowel is best preserved.
Sound to Sense-3 FINE PHONETIC DETAIL AND INTONATIONAL MEANING
Brechtje Post, University of Cambridge
Mariapaola D'Imperio, Aix-Marseille 1, CNRS (LPL), Aix-en-Provence, France
Carlos Gussenhoven, Centre for Language Studies, Radboud University Nijmegen
  The development of theories about form-function relations in intonation should be informed by a better understanding of the dependencies that hold among different phonetic parameters. Fine phonetic detail encodes both linguistically structured meaning and paralinguistic meaning. Keywords: Intonational meaning, paralinguistic meaning, fine phonetic detail, tonal alignment.
Sound to Sense-4 PRESERVING FINE PHONETIC DETAIL USING EPISODIC MEMORY: AUTOMATIC SPEECH RECOGNITION WITH MINERVA2
Roger Moore, Dept. Computer Science, University of Sheffield, UK
Viktoria Maier, Dept. Computer Science, University of Sheffield, UK
  Previous research has demonstrated competitive recognition results using a simulation of episodic memory - 'MINERVA2' - on the Peterson & Barney corpus of vowel formant data. This paper presents a modified implementation designed to work on real speech data, and results are reported on isolated-word recognition experiments conducted using the TI-ALPHA corpus. It is shown that access to fine phonetic detail is critical for achieving high recognition accuracy, whether it is provided by the episodic model or by hidden Markov models incorporating large numbers of Gaussian mixture components. However, it is confirmed that, although MINERVA2 offers a powerful means for generalizing by accessing the fine detail retained in all the training data, it is severely hampered by its inability to model temporal sequence. It is concluded that a new episodic model is needed that is based on the principles of MINERVA2 but overcomes such limitations.
Sound to Sense-5 EFFECT OF CROSS-WORD CONTEXT ON PLOSIVE IDENTIFICATION IN NOISE FOR NATIVE AND NON-NATIVE LISTENERS
Maria Luisa Garcia Lecumberri, University of the Basque Country
Martin Cooke, University of Sheffield
  Studies of second language speech perception can highlight the role of prior knowledge in native language processing. This study compared native and non-native identification of plosives in words spliced from natural utterances when presented in noise, with or without the context of the preceding word. Both listener groups performed at the same level in the absence of context at high noise, suggesting that cues surviving energetic masking and splicing were similar for the two languages or that they had already been acquired by the non-native group. However, native listeners gained significantly more when contextual information in the preceding word was present, indicating that cross-word, extra-syllabic cues are less easily exploited by non-native listeners. An acoustic analysis revealed subtle durational differences in the preceding word rhyme, knowledge of which may contribute to the native advantage. Other possible explanations for the native benefit from cross-word context are discussed.
Sound to Sense-6 WHEN IS FINE PHONETIC DETAIL A DETAIL?
Rolf Carlson, KTH, CSC, Dept. Speech, Music and Hearing
Sarah Hawkins, University of Cambridge
  This paper discusses the papers by Moore and Maier, and by Lecumberri and Cooke, which are two of the four position papers in the ICPhS special session on Sound to Sense (S2S). The rationale for our comments is to illuminate and support the hypothesis that speech perception is a dynamic and adaptive perceptual process in the interpretation of the sensory speech signal. As background for the discussion of the two position papers, two further perceptual experiments are described. Their results are discussed with respect to (1) identification of phonetic detail by experimenters and by native and non-native listeners, (2) the perceptual and theoretical status of “detail” as additional versus fundamental auditory information, and (3) challenges in balancing the practical advantages of using tractable goals and data versus development of richer models whose parameters probably more closely reflect the processes of normal speech perception.
Sound to Sense-7 DETAILS AND CONTEXTS: COMMENTS ON THE PAPERS
Richard Ogden, University of York
  In this paper I make two main points: (1) we need a better understanding of context, and (2) there may be naturally occurring phenomena in conversational data which offer a good basis to see the interplay of segmental and prosodic factors in constructing meaning; it may be possible to use such data as the basis for further work. I aim to open a dialogue between quantitative and qualitative approaches to the study of language.
