| Paper: | SP-L6.3 |
| Session: | Feature Analysis for Speech Recognition |
| Time: | Thursday, May 20, 13:40 - 14:00 |
| Presentation: | Lecture |
| Topic: | Speech Processing: Feature Extraction |
| Title: | THE ETSI EXTENDED DISTRIBUTED SPEECH RECOGNITION (DSR) STANDARDS: CLIENT SIDE PROCESSING AND TONAL LANGUAGE RECOGNITION EVALUATION |
| Authors: | Alexander Sorin; IBM Labs |
| | Tenkasi Ramabadran; Motorola Labs |
| | Dan Chazan; IBM Labs |
| | Ron Hoory; IBM Labs |
| | Michael McLaughlin; Motorola Labs |
| | David Pearce; Motorola Labs |
| | Fan Wang; IBM Labs |
| | Yaxin Zhang; Motorola Labs |
| Abstract: | This paper presents work carried out in developing the ETSI Extended DSR standards ES 202 211 and ES 202 212 [1][2]. These standards extend the previous ETSI DSR standards, the basic front-end ES 201 108 and the advanced (noise-robust) front-end ES 202 050, respectively. The extensions enable enhanced tonal language recognition as well as server-side speech reconstruction. This paper discusses the client-side estimation of pitch and voicing class parameters, whereas a companion paper discusses the server-side speech reconstruction. Experimental results show that using the standard extensions improves the tonal language recognition rates of proprietary recognition engines. |