OUISPER: CORPUS BASED SYNTHESIS DRIVEN BY ARTICULATORY DATA

Thomas Hueber¹, Gerard Chollet², Bruce Denby³, Maureen Stone⁴ & Leila Zouari²
¹LABORATOIRE D'ELECTRONIQUE, ESPCI / CNRS-LTCI, ENST; ²CNRS-LTCI, ENST; ³LABORATOIRE D'ELECTRONIQUE, ESPCI / UPMC-PARIS VI; ⁴Dept of Biomedical Sciences and Orthodontics, University of Maryland Dental School, Baltimore, MD, USA

ID 1513
[full paper]

Many applications require the production of intelligible speech from articulatory data. This paper outlines a research program (Ouisper: Oral Ultrasound synthetIc SPEech souRce) to synthesize speech from ultrasound images of tongue movement and video sequences of the lips. The video data are used to search a multistream corpus that associates images of the vocal tract and lips with the audio signal. The search is driven by the recognition of phone units using Hidden Markov Models trained on the video sequences. Preliminary results support the feasibility of this approach.
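
As a rough illustration of the corpus search described above, the following minimal sketch assumes that the recognition stage has already produced a sequence of phone labels, and that the corpus can be reduced to (phone, audio segment) pairs. The names `CORPUS`, `build_index` and `synthesize` are hypothetical, the audio segments are toy lists of sample values, and picking the first matching segment is a placeholder for whatever selection criterion the actual system uses; none of this is taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy corpus: each entry pairs a phone label with an audio
# segment (short lists of sample values stand in for real audio).
CORPUS = [
    ("b", [0.1, 0.2]),
    ("o", [0.3, 0.4, 0.5]),
    ("n", [0.6]),
    ("o", [0.7, 0.8]),
]


def build_index(corpus):
    """Group the audio segments of the corpus by their phone label."""
    index = defaultdict(list)
    for phone, audio in corpus:
        index[phone].append(audio)
    return index


def synthesize(phones, index):
    """Concatenate one corpus segment per recognized phone.

    Assumption: the first candidate is always chosen, and phones with
    no matching segment in the corpus are skipped.
    """
    samples = []
    for phone in phones:
        candidates = index.get(phone)
        if not candidates:
            continue
        samples.extend(candidates[0])
    return samples


if __name__ == "__main__":
    index = build_index(CORPUS)
    # The phone sequence would come from the HMM-based recognizer.
    print(synthesize(["b", "o", "n"], index))
```

In this sketch the recognizer and the selection step are fully decoupled: the phone sequence is the only information passed from the visual stream to the audio lookup.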