| Paper: | SP-L1.1 | ||
| Session: | Voice Conversion and Morphing Algorithms for TTS Systems | ||
| Time: | Tuesday, May 18, 15:30 - 15:50 | ||
| Presentation: | Lecture | ||
| Topic: | Speech Processing: Speech Synthesis (including TTS) | ||
| Title: | NON-PARALLEL TRAINING FOR VOICE CONVERSION BY MAXIMUM LIKELIHOOD CONSTRAINED ADAPTATION | ||
| Authors: | Athanasios Mouchtaris; University of Pennsylvania | ||
| Jan Van der Spiegel; University of Pennsylvania | |||
| Paul Mueller; Corticon Inc. | |||
| Abstract: | Voice conversion methods aim to modify the speech characteristics of a given speaker so that the speech sounds as if it were spoken by a different target speaker. Current voice conversion algorithms derive a conversion function by estimating its parameters from a corpus that contains the same utterances spoken by both speakers. Such a corpus, usually referred to as a parallel corpus, has the disadvantage that it is often difficult or even impossible to collect. Here, we propose a voice conversion method that does not require a parallel corpus for training, i.e., the utterances spoken by the two speakers need not be the same. The method employs speaker adaptation techniques to adapt the conversion parameters derived from one pair of speakers to a different pair of source and target speakers. We show that adaptation reduces the error obtained when simply applying the conversion parameters of one pair of speakers to another by as much as 30% in many cases, with performance comparable to the ideal case in which a parallel corpus is available. | ||