Close Copy Speech Synthesis for Speech Perception Testing

How to cite

Bachan, J., & Gibbon, D. (2006). Close Copy Speech Synthesis for Speech Perception Testing. Investigationes Linguisticae, 13, 9–24. https://doi.org/10.14746/il.2006.13.2


Abstract

The present study is concerned with developing a speech synthesis subcomponent for perception testing in the context of evaluating cochlear implants in children. We provide a detailed requirements analysis and develop a strategy for maximally high-quality speech synthesis using Close Copy Speech synthesis techniques with a diphone-based speech synthesiser, MBROLA. The close copy concept used in this work defines close copy as a function from a pair, consisting of a speech signal recording and a phonemic annotation aligned with that recording, into the pronunciation specification interface of the speech synthesiser. The design procedure has three phases: Manual Close Copy Speech (MCCS) synthesis as a "best case gold standard", in which the function is implemented manually as a preliminary step; Automatic Close Copy Speech (ACCS) synthesis, in which the steps taken in manual transformation are emulated by software; and finally Parametric Close Copy Speech (PCCS) synthesis, in which prosodic parameters are modifiable while retaining the diphones. This contribution reports on the MCCS and ACCS synthesis phases.
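The close copy function described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the annotation is available as (label, start, end) tuples in seconds and the pitch track as (time, F0) samples, and that the synthesiser interface is the MBROLA .pho text format, in which each line gives a phoneme label, a duration in milliseconds, and optional pairs of position (percent of the duration) and pitch (Hz). The names `Segment`, `PitchPoint` and `close_copy_pho` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical input types (assumptions, not taken from the paper):
# one annotated segment as (SAMPA label, start time in s, end time in s)
Segment = Tuple[str, float, float]
# one pitch sample as (time in s, F0 in Hz)
PitchPoint = Tuple[float, float]


def close_copy_pho(segments: List[Segment], pitch: List[PitchPoint]) -> List[str]:
    """Map an aligned annotation plus F0 samples to MBROLA-style .pho lines."""
    lines = []
    for label, start, end in segments:
        length = end - start
        # Duration of the segment in whole milliseconds
        fields = [label, str(round(length * 1000))]
        for t, f0 in pitch:
            # Attach only the pitch samples that fall inside this segment
            if length > 0 and start <= t < end:
                position = round(100 * (t - start) / length)  # percent of duration
                fields += [str(position), str(round(f0))]
        lines.append(" ".join(fields))
    return lines


if __name__ == "__main__":
    # Made-up values for illustration only; "_" is the MBROLA silence symbol
    annotation = [("_", 0.0, 0.1), ("m", 0.1, 0.18), ("a", 0.18, 0.30), ("_", 0.30, 0.40)]
    f0_samples = [(0.12, 210.0), (0.24, 230.0)]
    print("\n".join(close_copy_pho(annotation, f0_samples)))
```

Under these assumptions, the manual (MCCS) phase corresponds to writing such lines by hand from the annotation, and the automatic (ACCS) phase to generating them by software.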
