| Paper: | SP-P9.4 |
| Session: | Topics in Speech Synthesis |
| Time: | Wednesday, May 19, 15:30 - 17:30 |
| Presentation: | Poster |
| Topic: | Speech Processing: Speech Synthesis (including TTS) |
| Title: | REFINING SEGMENTAL BOUNDARIES FOR TTS DATABASE USING FINE CONTEXTUAL-DEPENDENT BOUNDARY MODELS |
| Authors: | Lijuan Wang; Tsinghua University |
| | Yong Zhao; Microsoft Research Asia |
| | Min Chu; Microsoft Research Asia |
| | Jian-Lai Zhou; Microsoft Research Asia |
| | Zhigang Cao; Tsinghua University |
| Abstract: | This paper proposes a post-refining method that uses fine contextual-dependent Gaussian mixture models (GMMs) for the auto-segmentation task. A GMM trained on a super feature vector, extracted from multiple evenly spaced frames near the boundary, describes the waveform evolution across that boundary. A classification and regression tree (CART) clusters acoustically similar boundaries, so that the GMM for each leaf node can be trained reliably from a limited number of manually labeled boundaries. An accuracy of 90% is achieved when only about 250 manually labeled sentences are provided to train the refining models. |
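The boundary-refinement idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single diagonal-covariance Gaussian (a one-component GMM) in place of the CART-clustered GMMs, uses synthetic frame features, and every function name, parameter, and value below (`super_vector`, `n_side`, `step`, `search`, the noise level) is hypothetical.

```python
import numpy as np

def super_vector(frames, boundary, n_side=2, step=1):
    """Concatenate 2*n_side+1 evenly spaced frames centered on `boundary`."""
    offsets = np.arange(-n_side, n_side + 1) * step
    idx = np.clip(boundary + offsets, 0, len(frames) - 1)
    return frames[idx].ravel()

def fit_diag_gaussian(vectors, floor=1e-3):
    """Fit one diagonal Gaussian (assumed stand-in for a leaf-node GMM)."""
    X = np.asarray(vectors, dtype=float)
    return X.mean(axis=0), X.var(axis=0) + floor  # floor avoids zero variance

def log_likelihood(x, mean, var):
    """Log density of x under a diagonal Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def refine_boundary(frames, initial, model, search=5, n_side=2, step=1):
    """Move the boundary to the most likely frame within +/- `search` frames."""
    mean, var = model
    lo = max(0, initial - search)
    hi = min(len(frames) - 1, initial + search)
    candidates = range(lo, hi + 1)
    scores = [
        log_likelihood(super_vector(frames, b, n_side, step), mean, var)
        for b in candidates
    ]
    return candidates[int(np.argmax(scores))]

# Synthetic data (assumption): features jump from about 0 to about 1 at the boundary.
rng = np.random.default_rng(0)

def make_utterance(boundary, n_frames=60, dim=3):
    frames = 0.05 * rng.standard_normal((n_frames, dim))
    frames[boundary:] += 1.0
    return frames

# "Manually labeled" training boundaries for one boundary class.
train_boundaries = rng.integers(10, 50, size=20)
train_vectors = [super_vector(make_utterance(b), b) for b in train_boundaries]
model = fit_diag_gaussian(train_vectors)

# Refine an initial boundary that is 3 frames late.
true_boundary, initial_guess = 30, 33
test_frames = make_utterance(true_boundary)
refined = refine_boundary(test_frames, initial_guess, model)
```

In a fuller version of this sketch, one such model would be trained per CART leaf, and the leaf would be selected from the phonetic context of each boundary before scoring the candidates.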