Technical Program

Paper Detail

Paper: SP-P11.10
Session: Topics in Large Vocabulary Continuous Speech Recognition
Time: Thursday, May 20, 09:30 - 11:30
Presentation: Poster
Topic: Speech Processing: Large Vocabulary Recognition/Search
Title: THE 2003 ISL RICH TRANSCRIPTION SYSTEM FOR CONVERSATIONAL TELEPHONY SPEECH
Authors: Hagen Soltau; Interactive Systems Labs 
 Hua Yu; Interactive Systems Labs 
 Florian Metze; Interactive Systems Labs 
 Christian Fügen; Interactive Systems Labs 
 Qin Jin; Interactive Systems Labs 
 Szu-Chen Jou; Interactive Systems Labs 
Abstract: This paper describes the ISL large vocabulary conversational telephony speech recognition system, which was tested in NIST's RT-03S ("Switchboard") evaluation. We present our experiments on improving preprocessing, acoustic modelling, and language modelling. The system features phone-dependent semi-tied full covariances, semi-tied clustering of septa-phones, clustering across phones, feature adaptive training, robust estimation of VTLN and MLLR, as well as context-dependent interpolation of language models. We present detailed results for each stage of our multi-pass transcription scheme. System development started in 2002 with a word error rate (WER) of 35.1% on our internal 1-hour development set. The final system reached a WER of 21.8% on the same set, a 38% relative improvement. The WER on the RT-03 CTS evaluation set is 23.4%.
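The relative improvement quoted in the abstract can be checked from the two development-set error rates. The following is a minimal sketch; the function name `relative_reduction` is a hypothetical helper chosen for illustration and is not part of the ISL system.

```python
def relative_reduction(baseline_wer: float, final_wer: float) -> float:
    """Return the relative reduction of final_wer with respect to baseline_wer, as a fraction."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - final_wer) / baseline_wer


# Development-set figures from the abstract: 35.1% at the start, 21.8% for the final system.
reduction = relative_reduction(35.1, 21.8)
print(f"{reduction:.1%}")  # prints 37.9%, which rounds to the 38% stated in the abstract
```

The 23.4% evaluation-set figure is not part of this comparison, since the abstract gives no starting error rate for that set.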
 



©2015 Conference Management Services, Inc. -||- email: webmaster@icassp2004.org -||- Last updated Wednesday, April 07, 2004