Technical Program

Paper Detail

Paper: SP-P14.3
Session: Acoustic Modeling: Tone, Prosody, and Features
Time: Thursday, May 20, 15:30 - 17:30
Presentation: Poster
Topic: Speech Processing: Acoustic Modeling for Speech Recognition
Title: SEGMENTAL TONAL MODELING FOR PHONE SET DESIGN IN MANDARIN LVCSR
Authors: Chao Huang; Microsoft Research Asia 
 Yu Shi; Microsoft Research Asia 
 Jian-Lai Zhou; Microsoft Research Asia 
 Min Chu; Microsoft Research Asia 
 Terry Wang; Microsoft Research Asia 
 Eric Chang; Microsoft Research Asia 
Abstract: Modeling units play a very important role in state-of-the-art speech recognition systems. Their design and selection directly affect the performance of the final speech recognition engine. Because Mandarin is a tonal language, its modeling units require special treatment to handle tone. In this paper, after investigating several dominant modeling strategies, we propose a new phone set design strategy for Mandarin, called segmental tonal modeling. Instead of modeling tone types directly, we realize them implicitly and jointly through two segments, both of which carry tonal information. Experiments based on both HTK and SAPI confirm that this method is very efficient. In addition to improving accuracy by 9~23%, it reduces decoding time by 30~45%. At a similar decoding speed, the new phone set configuration reduces the error rate by 35% relative.
 

©2015 Conference Management Services, Inc. -||- email: webmaster@icassp2004.org -||- Last updated Wednesday, April 07, 2004