Technical Program

Paper Detail

Paper:	SP-P9.1
Session:	Topics in Speech Synthesis
Time:	Wednesday, May 19, 15:30 - 17:30
Presentation:	Poster
Topic:	Speech Processing: Speech Synthesis (including TTS)
Title:	MINIMUM SEGMENTATION ERROR BASED DISCRIMINATIVE TRAINING FOR SPEECH SYNTHESIS APPLICATION
Authors:	Yi-Jian Wu; University of Science and Technology of China
	Hisashi Kawai; ATR, Spoken Language Translation Laboratories
	Jinfu Ni; ATR, Spoken Language Translation Laboratories
	Ren-Hua Wang; University of Science and Technology of China
Abstract:	In the conventional HMM-based segmentation method, HMM training is based on the maximum likelihood estimation (MLE) criterion, which ties the segmentation task to the problem of distribution estimation. The HMMs are thus built to identify phonetic segments, not to detect boundaries. This inconsistency between training and application limits segmentation performance. In this paper, we adopt a discriminative training method and introduce a new criterion, named Minimum Segmentation Error (MSGE), for HMM training. In this method, a loss function directly related to the segmentation error is defined. Minimizing the overall empirical loss with the Generalized Probabilistic Descent (GPD) algorithm also minimizes the segmentation error. Results on both Chinese and Japanese data show that segmentation accuracy is improved. Moreover, the method is robust even when knowledge of HMM modeling is limited, e.g. when the number of states is not optimized.
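
The abstract does not give the MSGE loss or the HMM topology, so the following is only a toy sketch of the general idea (a smoothed loss on boundary error, minimized by GPD-style per-sample descent with a decaying step size), not the authors' formulation. Everything in it is an assumption made for illustration: the two-segment, one-dimensional, unit-variance Gaussian model, the soft (expected) boundary, the sigmoid smoothing with `tol` and `gamma`, the finite-difference gradient, and all function names.

```python
import math

def boundary_scores(frames, mu_a, mu_b):
    """Log-likelihood (up to a constant) of each candidate boundary t,
    where frames[:t] belong to segment A and frames[t:] to segment B."""
    scores = []
    for t in range(1, len(frames)):
        ll = -0.5 * sum((x - mu_a) ** 2 for x in frames[:t])
        ll -= 0.5 * sum((x - mu_b) ** 2 for x in frames[t:])
        scores.append(ll)
    return scores

def soft_boundary(frames, mu_a, mu_b):
    """Expected boundary position under a softmax over candidate scores
    (a differentiable stand-in for the argmax alignment; an assumption)."""
    scores = boundary_scores(frames, mu_a, mu_b)
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return sum((i + 1) * w for i, w in enumerate(weights)) / z

def loss(frames, ref, mu_a, mu_b, tol=1.0, gamma=2.0):
    """Sigmoid-smoothed indicator that the boundary error exceeds `tol` frames."""
    err = abs(soft_boundary(frames, mu_a, mu_b) - ref)
    return 1.0 / (1.0 + math.exp(-gamma * (err - tol)))

def gpd_train(data, mu_a, mu_b, epochs=20, eps0=0.5, h=1e-4):
    """Per-sample descent on the smoothed loss with a linearly decaying step.
    Gradients use central finite differences to keep the sketch short."""
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for frames, ref in data:
            eps = eps0 * (1.0 - step / total)
            grad_a = (loss(frames, ref, mu_a + h, mu_b)
                      - loss(frames, ref, mu_a - h, mu_b)) / (2 * h)
            grad_b = (loss(frames, ref, mu_a, mu_b + h)
                      - loss(frames, ref, mu_a, mu_b - h)) / (2 * h)
            mu_a -= eps * grad_a
            mu_b -= eps * grad_b
            step += 1
    return mu_a, mu_b

if __name__ == "__main__":
    # One toy utterance: four frames of segment A, four of segment B,
    # reference boundary at frame 4, with a deliberately poor initial mu_b.
    frames = [0.0] * 4 + [2.0] * 4
    print("loss before:", loss(frames, 4, 0.0, 4.0))
    mu_a, mu_b = gpd_train([(frames, 4)], 0.0, 4.0)
    print("loss after: ", loss(frames, 4, mu_a, mu_b))
```

In this toy setting the parameters are updated to reduce boundary error directly rather than to fit the frame distributions, which is the contrast with MLE training that the abstract draws.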


©2015 Conference Management Services, Inc. -||- email: webmaster@icassp2004.org -||- Last updated Wednesday, April 07, 2004