Session: Poster II

Type: poster
Chair: Nicolas Dumay, Mark Jones
Date: Tuesday - August 07, 2007
Time: 10:00
Room: Poster Area


Marija Tabain, La Trobe University, Melbourne
Kristine Rickard, La Trobe University, Melbourne
  The Australian Aboriginal language Arrernte has four coronal consonants in the stop, nasal and lateral series. This paper presents EPG data for the four coronal stops of Arrernte in intervocalic context for one female speaker of the language. Results show comparatively little variability in the laminal articulations, and comparatively greater variability in the apical articulations. Interestingly, the results suggest that retroflex harmony may operate in the language, whereby a retroflex consonant later in the word causes a preceding alveolar consonant to harmonize.
Poster II-3 Schwa vocalization in the realization of /r/
Robert Vago, Queens College and The Graduate Center, City University of New York
Mária Gósy, Research Institute for Linguistics, Hungarian Academy of Sciences
  The realization of the phoneme /r/ is commonly described in terms of trills, taps, approximants, fricatives, vowels, and devoicing. An experimental investigation in Hungarian revealed a heretofore undiscussed variant: [r] with a schwa on-vocalization ([ᵊr]) or off-vocalization ([rᵊ]). In Cr clusters the occurrence of schwa was more frequent in homorganic than in heterorganic clusters, while in rC clusters it was more frequent in heterorganic than in homorganic clusters. The [ᵊr] realization was found before vowels (onset position), [rᵊ] before consonants or word-finally (coda position). These facts are explained on the basis of articulatory and aerodynamic principles.
Poster II-5 The role of vowel contrast in language-specific patterns of vowel-to-vowel coarticulation: Evidence from Korean and Japanese
Jeong-Im Han, Konkuk University
  The purpose of this paper is to test the role of vowel contrast in V-to-V coarticulation. Specifically, V-to-V anticipatory and carryover coarticulations in Korean and Japanese were examined in terms of F1 and F2 in crowded vs. non-crowded regions of the vowel space. The results showed that the vowel contrast does not directly contribute to the language-specific coarticulation pattern between these two languages, which is at odds with Manuel & Krakow (1984), and Manuel (1987), but in good agreement with Bradlow (1995).
Christine Ericsdotter, Stockholm University
  This paper presents some results and a small follow-up investigation from an MRI study of vowels [3], in which classical distance-to-area equations [5] were evaluated for implementation in sagittal view articulatory modelling. It was shown that an articulatorily more detailed application of the conversion rules improved the accuracy of the predicted areas, but that this increased realism failed to improve acoustic performance, if midline derivation and vocal tract termination points were kept the same. These results are discussed in relation to articulatory modelling in linguistic research. Work funded by the NIH (R01DC02014) and Stockholm University (SU617023001).
Eftychia Eftychiou, University of Cambridge
  The present paper presents the results of an experimental investigation of the connected speech process of close-vowel lenition in Cypriot Greek (henceforth CG). The process appears to be gradient, with stops whose adjacent vowels have been elided being acoustically different from canonical word-final stops, indicating residual stop-vowel coarticulation. Finally, the study reveals two routes to lenition in CG; one involves a full consonant with a lenited vowel, and the other a lenited consonant with a full vowel, potentially signifying that the laryngeal setting is the same in both cases and that the different acoustic patterns are the result of supralaryngeal imprecision.
Mark Tiede, MIT R.L.E. & Haskins Laboratories
Stefanie Shattuck-Hufnagel, MIT R.L.E.
Beth Johnson, Yale University
Satrajit Ghosh, MIT R.L.E.
Melanie Matthies, Boston University & MIT R.L.E.
Madjid Zandipour, MIT R.L.E.
Joseph Perkell, MIT R.L.E.
  This work presents results of an EMMA study of the articulatory phasing between successive /k/ and /t/ gestures in English tautosyllabic ("pact op") and heterosyllabic ("pack top") contexts, varied by speaking rate and stress. Although subjects responded idiosyncratically, in general coda clusters are shown to be significantly less variable in timing than heterosyllabic sequences relative to the labial gestures of the carrier context.
Poster II-13 Cross-linguistic differences in the perception of palatalization
Molly Babel, University of California, Berkeley
Keith Johnson, University of California, Berkeley
  This paper investigates the difference between basic psycho-acoustic auditory perception and language-specific perception of speech sounds. This was examined in two experiments with American English and Russian listeners. Results suggest that listeners' language does not influence auditory perception, but does affect the rated perceptual similarity of speech sounds.
Simone Graetzer, University of Melbourne
  The distribution of the second formant (F2) at V1-offset and V2-onset in V1CV2 sequences in Arrernte and Burarra, two Australian languages, reveals that phonemic voiceless plosive consonants differ in coarticulation resistance. While the two languages display slightly dissimilar patterns of resistance, they share a strong tendency towards greater variation at V1-offset, suggesting that the effects of coarticulation resistance are strongest immediately after intervocalic consonants in these languages.
Abdellah Kacha, Université Libre de Bruxelles
Francis Grenez, Université Libre de Bruxelles
Jean Schoentgen, Fonds National de la Recherche Scientifique
  The presentation concerns the evaluation of the anatomical plausibility of vocal tract shapes calculated by means of an analytical formant-to-area map. A constraint requiring that the jerk of the evolving model parameters be minimal is used to select a single solution among the infinitely many area functions that are compatible with the observed formant frequencies. A similarity measure between observed and inferred cross-sections was computed to express the plausibility of the recovered shapes quantitatively. The test corpus comprised observed area functions and formant frequencies of ten French vowels sustained by two male and two female speakers. Results show that vowel qualities involving double articulations were the most likely to give rise to large dissimilarities between acoustically inferred and measured vocal tract cross-sections.
Poster II-19 Acoustic description of a soprano's vowels based on perceptual linear prediction
Thomas John Millhouse, Sydney Conservatorium, University of Sydney
Frantz Clermont, JP French Associates, Forensic Speech and Acoustics Laboratory York
  A perceptually-motivated model (Hermansky, 1990) known as Perceptual Linear Prediction (PLP) is employed to parameterise and to interpret the cardinal vowels sung by a professional soprano at pitches ranging from 220 to 880 Hz. The PLP model yields perceptual formants (F1’ and F2’), which encode the low and high-spectral regions, respectively. These formants are found to be tractable and robust, thereby facilitating a more complete description of the sung-vowel space.
Poster II-21 Prosodic Phrasing in Elliptic and Non-elliptic Coordinations
Gerrit Kentner, Universität Potsdam
  This paper reports a prosodic difference between elliptic and non-elliptic coordinations in German. Findings of a speech production experiment indicate that ellipsis has an effect on prosodic phrasing and that speakers avoid phrase boundaries between an elliptic gap and its filler. The data is incompatible with accounts stating that phonetically empty material resurfaces in the form of increased segment duration and greater pitch excursion at the gap. The results are evaluated against the Sense Unit Condition on intonational phrasing.
Poster II-23 Rhythmical classification of languages based on voice parameters
Volker Dellwo, University College London
Adrian Fourcin, University College London
Evelyn Abberton, University College London
  It has been demonstrated that speech rhythm classes (e.g. stress-timed, syllable-timed) can be distinguished acoustically and perceptually on the basis of the variability of consonantal and vocalic interval durations. It has moreover been shown that even infants are able to use these cues to distinguish between languages from different rhythm classes. Here we demonstrate that the same classification is possible in the acoustic domain based simply on the durational variability of voiced and voiceless intervals in speech. The advantages of such a procedure will be discussed and we will argue that 'voice' possibly offers a more plausible cue for infants to distinguish between languages of different rhythmic class.
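The durational-variability measures alluded to above can be sketched in a few lines. The example below computes, for a hypothetical list of voiced/voiceless interval durations, the proportion of voiced speech and the raw and rate-normalised variability of voiceless intervals, by analogy with the %V and ΔC measures used elsewhere for vocalic/consonantal intervals. The data and function names are illustrative, not the authors' implementation.

```python
from statistics import mean, stdev

def rhythm_measures(intervals):
    """intervals: list of (kind, duration_s), with kind 'voiced' or 'voiceless'."""
    voiced = [d for k, d in intervals if k == "voiced"]
    voiceless = [d for k, d in intervals if k == "voiceless"]
    total = sum(voiced) + sum(voiceless)
    return {
        "percent_voiced": 100.0 * sum(voiced) / total,        # analogue of %V
        "delta_voiceless": stdev(voiceless),                  # analogue of deltaC
        "varco_voiceless": 100.0 * stdev(voiceless) / mean(voiceless),  # rate-normalised
    }

# Hypothetical interval labels and durations (seconds) for one utterance.
utterance = [("voiced", 0.18), ("voiceless", 0.06), ("voiced", 0.25),
             ("voiceless", 0.11), ("voiced", 0.09), ("voiceless", 0.04)]
m = rhythm_measures(utterance)
```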
Poster II-25 Tonal Targets and Their Alignment in Daegu Korean
Akira Utsugi, JSPS Postdoctoral Fellow for Research Abroad / University of Edinburgh
Hyejin Jang, Korea University
Minyoung Seol, Korea University
  This study investigates tonal targets in Daegu Korean. Through an analysis of F0 and alignment, focusing in particular on the turning point, we identified different characteristics for the rise before the accent and the fall after it. In the contour before the accent, we identified a turning point from the low plateau to the rise, anchored to the end of the syllable immediately preceding the accented syllable. In the contour after the accent, by contrast, the turning point was not clear and, where present, was delayed. These results argue against the claim in previous literature that the accented syllable is associated with H*+L in this dialect.
Poster II-27 An initial account of the intonation of Emirati Arabic
Allison Blodgett, University of Maryland Center for Advanced Study of Language
Jonathan Owens, University of Maryland Center for Advanced Study of Language
Trent Rockwood, University of Maryland Center for Advanced Study of Language
  We conducted auditory and visual analyses of recordings of colloquial Emirati Arabic in order to develop an autosegmental-metrical account of the intonation. Based on our analyses, we propose an initial tonal inventory of two main pitch accents (i.e., H*, (LH)*), one downstepped variant (i.e., !H*), and four bitonal phrase accents (i.e., LL%, LH%, HL%, HH%), which mark the right edges of intonation phrases. The data suggest that speakers produce a pitch accent on every content word and can use pitch range compression to vary the position of the perceptually most prominent pitch accent within a prosodic phrase. The data further suggest that speakers can initiate and complete compression within a prosodic phrase and that they can extend that compression across silent durations to subsequent phrases.
Poster II-29 Acoustic Effects of Prosodic Phrasing on Domain-initial Vowels in Korean
Eun-Kyung Lee, University of Illinois at Urbana-Champaign
  This paper investigates acoustic evidence of strengthening and lengthening in domain-initial vowels in Korean by comparing measures of F1, F2, and duration across the vowels /a, e, i, o, u/ in three prosodic domains: Intonational Phrase, Accentual Phrase, and Phonological Word. Contrary to previous findings on domain-initial vowels in CV syllables, where no prosodic strengthening effects were observed, the results of the current study confirm the presence of acoustic correlates of prosodic phrasing in the spectral and temporal dimensions of onsetless domain-initial vowels: place features are enhanced and duration is reduced in higher-level domains relative to lower ones. This indicates that prosodic phrasing is manifested in vowel features as well as in consonant features when vowels are immediately adjacent to prosodic boundaries. The findings also suggest that strengthening and lengthening are independent effects on domain-initial vowels in Korean, rejecting the undershoot hypothesis.
Poster II-31 For a dependency theory of intonation
David Le Gac, Université de Rouen
Hi-Yon Yoo, Université de Paris 7
  This paper outlines a theory of intonation for French. We discuss morphological approaches in which tones or contours are derived from meaning. We argue for an intonational structure that is independent of the other components of the grammar, in which the phonological units are interrelated by dependency rules.
Myriam Piccaluga, Laboratoire des Sciences de la Parole, Académie Universitaire Wallonie-Bruxelles
Jean-Luc Nespoulous, Laboratoire Jacques Lordat, Université de Toulouse-Le Mirail et Institut Universitaire de France
Bernard Harmegnies, Laboratoire des sciences de la Parole, Académie Universitaire Wallonie-Bruxelles
  This paper focuses on a new speech-signal-based index ("Inter-Syllabic Interval": ISI), intended to improve the study of disfluencies produced by the chunking process in subjects performing a simultaneous interpreting (SI) task, i.e., on-line oral translation. The variable is introduced on the basis of a discussion of the main methodological trends in the field, with the aim of improving the quality of the numerical treatments applied to the study of SI. It is argued that, because of the technical and epistemological limitations of the study of pauses, an index based upon the amplitude peaks in the speech signal should provide more reliable and valid information. An exploratory experiment was carried out on a prototypical sample of 4 subjects performing SI under several conditions. Results show the usefulness of ISI in detecting events related to high-level cognitive processes on the basis of the speech signal.
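As a rough illustration of how an amplitude-peak-based index such as ISI might be derived, the sketch below locates local maxima in a smoothed amplitude envelope and returns the time between successive peaks. The envelope values and threshold are hypothetical; the paper's actual signal processing is not specified here.

```python
def inter_syllabic_intervals(envelope, sample_rate, threshold=0.2):
    """Return durations (s) between successive amplitude peaks above threshold."""
    peaks = [i for i in range(1, len(envelope) - 1)
             if envelope[i] > threshold
             and envelope[i] > envelope[i - 1]
             and envelope[i] >= envelope[i + 1]]
    return [(b - a) / sample_rate for a, b in zip(peaks, peaks[1:])]

# Toy smoothed envelope sampled at 100 Hz (three syllable-like peaks).
env = [0.0, 0.1, 0.6, 0.3, 0.1, 0.5, 0.8, 0.4, 0.1, 0.7, 0.2]
isis = inter_syllabic_intervals(env, sample_rate=100)
```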
Linda Shockey, University of Reading
Z. S. Bond, Ohio University
  In casual conversation, listeners occasionally report hearing something which differs from what the talker has intended. A large proportion of such ‘slips of the ear’ involves casual speech phonological alternations. The error patterns suggest that listeners employ knowledge of casual speech phonology to map phonetic forms into lexical entries.
James Kirby, Phonology Laboratory, University of Chicago
Alan C. L. Yu, Phonology Laboratory, University of Chicago
  This paper reports the results of a wordlikeness task designed to investigate Cantonese speakers’ gradient phonotactic knowledge of systematic versus accidental phonotactic gaps. Regression analyses found that wordlikeness judgments correlate with token-frequency-weighted neighborhood density and transitional (bigram) probability. This is suggested to be an effect of the relative phonological densities of the Cantonese and English lexica.
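The two predictors named above can be made concrete with a toy example. The sketch below estimates bigram transitional probability from token-frequency-weighted counts and a log-frequency-weighted neighborhood density (one-substitution neighbors) over a small hypothetical lexicon of phoneme strings; the paper's own predictors are of course computed from Cantonese corpus data.

```python
from collections import Counter
from math import log

# Hypothetical lexicon: phoneme string -> token frequency.
lexicon = {"kat": 120, "kit": 45, "kam": 80, "pat": 200, "pit": 15}

def bigram_transitional_prob(lexicon):
    """P(b | a) estimated from token-frequency-weighted bigram counts."""
    big, uni = Counter(), Counter()
    for word, freq in lexicon.items():
        for a, b in zip(word, word[1:]):
            big[(a, b)] += freq
            uni[a] += freq
    return lambda a, b: big[(a, b)] / uni[a] if uni[a] else 0.0

def weighted_neighborhood_density(target, lexicon):
    """Sum of log token frequencies of one-substitution neighbors of target."""
    def neighbor(w1, w2):
        return len(w1) == len(w2) and sum(a != b for a, b in zip(w1, w2)) == 1
    return sum(log(f) for w, f in lexicon.items() if neighbor(target, w))

tp = bigram_transitional_prob(lexicon)
```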
Kevin Heffernan, University of Toronto
  Women typically produce more dispersed vowels than men. This sex difference makes predictions about the role of each sex in vowel changes. Specifically, women lead changes that maintain the distance between vowels, such as chain shifts, while men lead changes that reduce the distance between vowels, such as vowel mergers. That women lead chain shifts is well-established. That men lead mergers has not been established. An investigation of vowel mergers among the Atlas of North American English speakers reveals that men do lead mergers, and that speakers with a less dispersed vowel system show more instances of mergers, regardless of sex. I conclude by positing vowel dispersion as an internal explanation of which sex leads a vowel change.
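One simple way to operationalise the vowel dispersion at issue here is the mean Euclidean distance of each vowel's (F1, F2) mean from the centroid of the speaker's vowel space. The sketch below does this for a hypothetical three-vowel system; the formant values are invented for illustration, whereas the study itself draws on Atlas of North American English data.

```python
from math import hypot

def dispersion(vowels):
    """vowels: dict vowel -> (F1_Hz, F2_Hz); mean distance from the centroid."""
    f1c = sum(f1 for f1, _ in vowels.values()) / len(vowels)
    f2c = sum(f2 for _, f2 in vowels.values()) / len(vowels)
    return sum(hypot(f1 - f1c, f2 - f2c) for f1, f2 in vowels.values()) / len(vowels)

# Hypothetical per-vowel formant means for one speaker.
speaker = {"i": (300, 2300), "a": (750, 1300), "u": (320, 900)}
d = dispersion(speaker)
```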
Katrin Schneider, Institute of Natural Language Processing, Experimental Phonetics Group, University of Stuttgart
  This paper presents the results of a study concerning the acoustic correlates of contrastive word stress in bisyllabic and trisyllabic German words, produced by four children aged 2;3 to 7;3 and their mothers. We found that German children of that age are certainly able to produce contrastive word stress and that vowel duration is the most reliable correlate of word stress in the utterances produced by all four children and their mothers, independent of the position of the vowel within the word. Furthermore, we found that the voice quality parameter "incompleteness of closure" was used most uniformly by the mothers to mark word stress, while the children are at different acquisition stages for this parameter.
Julia Monnin, EA Transcultures & ICP, Speech and Cognition Department, GIPSA-lab
Hélène Loevenbruck, ICP, Speech and Cognition Department, GIPSA-lab
Mary Beckman, Department of Linguistics, Ohio State University
  The present study is part of a larger cross-linguistic comparison of phonological development. The aim is to compare production of word-initial obstruents across pairs of languages which have comparable consonants that differ either in overall frequency or in the frequency with which they occur in analogous sound sequences. By comparing across languages, the influence of language-specific distributional patterns on consonant mastery can be disentangled from the effects of more general phonetic constraints on development. The present study aims at extending the comparison to Hexagonal French. We report frequency measures obtained on French databases and results of a preliminary experiment with French-acquiring two-year-old children.
Poster II-45 Phonetics Ear-Training - Design and Duration
Patricia D. S. Ashby, University of Westminster
  A recent study of the product of traditional phonetics ear-training revealed a number of interesting (sometimes unexpected) effects. Mastery of the sounds of the IPA (including Cardinal Vowels) was taken as the goal/benchmark. 125 ab initio students of phonetics at a UK university were followed through a typical year-long phonetics ear-training programme; their ability to recognise sounds was tested at two points during the year. With respect to vowel identification, the findings confirmed the expectation that contextualised vowels would be harder to identify than vowels in isolation, but they also raised questions about the contribution made by length of training to the level of achievement.
Poster II-47 PROPER NAMES: Features of Ambiguity in a Multicultural Context
Marie Dohalska, Phonetics - Institute of Phonetics in Prague
Radka Skardova, Phonetics - Institute of Phonetics in Prague
Jana Kralova, Phonetics - Institute of Translation Studies
  The aim of the experiment was to assess, via production and perception tests, the comprehensibility of proper names. We were interested not only in their recognition in fluent speech, but also in the relationship that may exist between these significant bearers of information, and furthermore in their interpretation in different languages. This work focuses on the most obvious distortions and on degrees of understanding of the significant facts when talking about prominent sportsmen and sportswomen in authentic English, Spanish, French and Czech TV announcements.
Poster II-49 Effects of phonetic speech training on the pronunciation of vowels in a foreign language
Vesna Mildner, University of Zagreb, Faculty of Humanities and Social Sciences, Dept. of Phonetics
Diana Tomic, University of Zagreb, Faculty of Humanities and Social Sciences, Dept. of Phonetics
  The paper presents the results of speech training exercises on a sample of American English and Spanish native speakers learning Croatian as a foreign language. The success of training was assessed by a panel of trained phoneticians, who evaluated examples of speech before and after a series of individual training sessions. Two different evaluation tests revealed significant improvement in the quality of pronunciation of the five Croatian vowels, which was also reflected in the shape of their vowel space expressed in terms of F1 and F2 frequencies.
Poster II-51 Examination of similarity between English /r/, /l/, and Japanese flap: An investigation of best exemplars by English and Japanese speakers
Kota Hattori, University College London
Paul Iverson, University College London
  Japanese adults have difficulty learning the English /r/-/l/ contrast, and it has been suggested that this occurs because /r/ and /l/ are similar to the Japanese flap category. The present experiment evaluated this similarity by finding best exemplars of these three consonants in a 5-dimensional acoustic space (F1, F2, F3, closure duration, transition duration) for native speakers of Japanese and English. The results demonstrated that the Japanese flap was similar to /l/, but not /r/, for Japanese listeners. However, the flap and /l/ best exemplars of Japanese speakers were still significantly different (e.g., the flap having a shorter closure than /l/), indicating that Japanese speakers maintained separate mental representations for these categories rather than using their L1 flap for both consonants.
Kimiko Tsukada, University of Oregon
Thu, T. A. Nguyen, The University of Queensland
Rungpat Roengpitya, Mahidol University
Shunichi Ishihara, The Australian National University
  This study examined the discrimination of word-final stop contrasts (/p/-/t/, /p/-/k/, /t/-/k/) in English and Thai by native speakers of Cantonese (C), Japanese (J), Korean (K) and Vietnamese (V). The listeners’ first languages (L1) differ substantially in how word-final stops are phonetically realized. Although Japanese does not permit word-final stops, the J listeners were able to discriminate English (but not Thai) contrasts accurately, demonstrating that non-native contrasts are learnable beyond early childhood. The C, K and V listeners have experience with unreleased final stops in their L1s, but differed in their discrimination accuracy especially for Thai stop contrasts. This research highlights the value of systematically comparing listener groups from diverse L1 backgrounds in gaining a better understanding of the role of L1 experience in cross-language speech perception.
Poster II-55 The mapping of phonetic information to lexical representations in Spanish: Evidence from eye movements
Andrea Weber, Saarland University
Alissa Melinger, Dundee University
Lourdes Lara Tapia, Saarland University
  In a visual-world study, we examined spoken-word recognition in Spanish. Spanish listeners followed spoken instructions to click on pictures while their eye movements were monitored. When instructed to click on the picture of a door (puerta), they experienced interference from the picture of a pig (puerco). The same interference was observed when the displays contained a printed name or a combination of a picture with its name printed underneath. The results confirm for Spanish the simultaneous activation of multiple lexical candidates that match the unfolding speech signal. Implications of the finding that the effect can be induced with standard pictorial displays as well as with orthographic displays will be discussed.
Poster II-57 Strategies for editing out speech errors in inner speech
Sieb G. Nooteboom, Utrecht institute of Linguistics OTS
Hugo Quené, Utrecht institute of Linguistics OTS
  In a classical SLIP task spoonerisms are elicited with either a lexical or a nonlexical outcome. We argue that if the frequency of a particular class of responses is affected by the lexicality of the expected spoonerisms, this indicates that many such responses have replaced elicited spoonerisms in inner speech. Such effects are shown in early interrupted speech errors, speech errors that are form-related to the spoonerisms, and form-unrelated speech errors. Keywords: Speech errors, lexical bias, feedback, self-monitoring, inner speech.
Poster II-59 Increased left-hemisphere contribution to native- versus foreign-language talker identification revealed by dichotic listening.
Tyler K. Perrachione, Department of Linguistics and Cognitive Science Program, Northwestern University, Evanston, IL
Patrick C.M. Wong, Department of Communication Sciences and Disorders and Northwestern University Interdepartmental Neuroscience Program, Northwestern University, Evanston, IL
  Previous studies of human listeners’ ability to identify speakers by voice have revealed a reliable language-familiarity effect: Listeners are better at identifying voices when they can understand the language being spoken. It has been claimed that talker identification is facilitated in a familiar language because of functional integration between the cognitive systems underlying speech and voice perception. However, prior studies have not provided specific evidence demonstrating neural integration between these two systems. Using dichotic listening as a means to assess the role of each hemisphere in talker identification, we show that listeners’ right-, but not left-, ear (left-hemisphere) performance better predicts overall accuracy in their native than non-native language. By demonstrating functional integration of speech perception regions (classical left-hemisphere language areas) in a talker identification task, we provide evidence for a neurologic basis underlying the language-familiarity effect.
Petra Jongmans, University of Amsterdam/Netherlands Cancer Institute
Ton Wempe, University of Amsterdam
Frans Hilgers, Netherlands Cancer Institute/University of Amsterdam
Louis Pols, University of Amsterdam
Corina van As-Brooks, Netherlands Cancer Institute
  Confusions between voiced and voiceless plosives and fricatives are the most common confusions in Dutch tracheoesophageal (TE) speech. The problem is attributed to the working of the new voice source: the pharyngo-esophageal segment, or neoglottis. In order to learn how these speakers convey the voiced-voiceless distinction, detailed analyses are necessary. 15 acoustic correlates (and a subset of 6 for the fricatives) were selected and analyzed. Statistical analyses were then used to determine which correlates are used to distinguish between voiced and voiceless sounds. The data show that TE speakers do not differ much from normal laryngeal speakers, except where voicing is concerned.
Poster II-63 Aerodynamic Validation of Perceptually-Based Breath Group Determination and Temporal Breath Group Structure Analysis in Taiwanese Adolescents with Prelingual Severe to Profound Hearing Impairment
Wei-Chun Che, Department of Physical and Medical Rehabilitation, National Taiwan University, Taipei, Taiwan
Yu-Tsai Wang, School of Dentistry, National Yang-Ming University, Taipei, Taiwan
Hsiu-Jung Lu, School of Dentistry, National Yang-Ming University, Taipei, Taiwan
  This study reports the reliability and validity of perceptually determined inspiratory loci and temporal breath group structure in 20 young Taiwanese adults with prelingual severe to profound hearing impairment (HI) and 20 age-, gender- and education-matched controls (HC) with normal hearing. The reliability and validity of perceptual judgment of inspiratory loci were considered satisfactory for both groups, although the HI group exhibited a higher error rate than the HC group. Furthermore, compared to the HC group, the HI group had more inappropriate inspiratory loci, higher speech breathing frequencies and longer inter-breath-group pauses, but comparable breath group durations.
Poster II-65 Segmental aspects in speakers with Parkinson's disease
Maria Francisca de Paula Soares, UNICAMP - BRAZIL
  In this study, we explored segmental aspects of the speech production of a Brazilian parkinsonian group. Three spectral moments and vowel space area were measured. A total of 8 subjects participated in this study: 5 parkinsonians with dysarthria and 3 healthy subjects. The experimental task was to read given sentences. For acoustic analysis, we selected words containing a voiceless stop at word onset, followed by /a/, and the lexically stressed vowels. The results suggested that the values for all three spectral moments were higher for the parkinsonians than for the healthy subjects. The spectral moment distribution analysis showed that three parkinsonians were able to distinguish places of articulation, two parkinsonians did not present the distinctions for place of articulation, and all control participants did. For vowel production, our results pointed to great intersubject vowel space variability in the parkinsonian group, expressed by higher values for variance.
Kanae Sawamura, Japan Advanced Institute of Science and Technology, Japan
Jianwu Dang, Japan Advanced Institute of Science and Technology, Japan
Masato Akagi, Japan Advanced Institute of Science and Technology, Japan
Donna Erickson, Showa Academia Musicale
Aijun Li, Institute of Linguistics, Chinese Academy of Social Sciences, China
Kyoko Sakuraba, Kiyose-shi Welfare Center for the Handicapped
Nobuaki Minematsu, The University of Tokyo
Keikichi Hirose, The University of Tokyo
  It is believed that there are some common factors, independent of language and culture, in the human perception of emotion via speech sounds. This study investigated these factors with listeners from three countries. Emotional speech was evaluated using three and six emotional dimensions. It was found that most speech materials were perceived to have multiple emotional components, even though the speaker had intended to express a single emotion. This phenomenon is common across the three cultures. Principal component analysis showed that the loading patterns of the explanatory variables were consistent with one another across the three cultures, at a cover rate of about 67%. This suggests that people of different language and cultural backgrounds can perceive emotion from speech sounds, without linguistic information, to about the same extent. Extending the evaluation from three emotions to six, it was found that "anger", "joy" and "sadness" constitute three basic emotions.
Katja Grauwinkel, Institute for Speech and Communication, Berlin University of Technology, Germany
Britta Dewitt, Institute for Speech and Communication, Berlin University of Technology, Germany
Sascha Fagel, Institute for Speech and Communication, Berlin University of Technology, Germany
  This paper presents the results of a study investigating the influence of visualizing internal articulator movements on the intelligibility of synthesized audiovisual speech. A talking head was supplemented by internal passive and active articulators. A comparative perception test was carried out before and after two different training lessons, where one type of display included all internal articulator movements and the other displayed the dynamics without tongue dorsum height, velum opening/closing and tongue forward/backward movements. Results show that recognition scores were significantly higher in audiovisual than in auditory-only presentation, with no significant difference in recognition scores between the two kinds of display. However, only with the additional motion information of the internal articulators did the training lesson significantly increase visual and audiovisual intelligibility.
Poster II-71 A phonetically balanced Modified Rhyme Test for evaluating Catalan speech intelligibility
Francesc Alías, Enginyeria i Arquitectura La Salle. Ramon Llull University
Manuel Pablo Triviño, Enginyeria i Arquitectura La Salle. Ramon Llull University
  This work introduces a phonetically balanced modified rhyme test (MRT) for evaluating Catalan speech intelligibility. The proposed contents fulfil the standard MRT restrictions and, in addition, yield phonetically balanced word ensembles so as to avoid biasing the test towards scarcely represented phonemes. The test thus makes it possible to evaluate the intelligibility of any communication system delivering Catalan speech within a single phonetically meaningful comparison framework.
