Chinese search giant Baidu’s new, improved text-to-speech program, Deep Voice 2, can speak in hundreds of accents indistinguishable from human speech

May 31, 2017

Briefing

  • Significant Text-to-Speech Improvement – Chinese search giant Baidu developed the second iteration of its text-to-speech (TTS) system, Deep Voice 2, producing hundreds of hours of speech in hundreds of voices, up from the single voice speaking for 20 hours achieved by the original Deep Voice program
  • Better Than Alternatives – Deep Voice 1 and 2 generate speech indistinguishable from human speech and run in real time, in contrast to alternative neural TTS systems
  • Fast Learning – Learns to imitate a human voice from less than 30 minutes of recorded audio, compared with the many hours of training audio needed by the original version
  • Multiple Voice and Accent System – Identifies characteristics shared across different voices to build a common model of human speech, then tweaks that model to speak as different characters and in different accents
  • Applications – Useful for voice assistants and text-to-speech applications, such as audio readers in e-books
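
The "Multiple Voice and Accent System" point above can be pictured with a toy sketch: one set of parameters shared by every voice, plus a small vector per speaker that shifts the output. This is an illustration only and not Baidu's architecture or code; the class name, the dimensions, and the way text is turned into numbers are all assumptions made for the example.

```python
import random

EMBED_DIM = 4    # size of each speaker's vector (illustrative assumption)
FEATURE_DIM = 3  # size of each toy "acoustic" frame (illustrative assumption)


class ToyMultiSpeakerModel:
    """Hypothetical toy model: shared weights plus one small vector per speaker."""

    def __init__(self, num_speakers, seed=0):
        rng = random.Random(seed)
        # Shared parameters, used for every speaker.
        self.shared_weights = [rng.uniform(-1, 1) for _ in range(FEATURE_DIM)]
        # Shared projection that maps a speaker vector to a per-frame offset.
        self.projection = [
            [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
            for _ in range(FEATURE_DIM)
        ]
        # One small vector per speaker: the only speaker-specific parameters.
        self.speaker_embeddings = [
            [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
            for _ in range(num_speakers)
        ]

    def synthesize(self, text, speaker_id):
        """Return one toy frame per character, shifted by the chosen speaker."""
        embedding = self.speaker_embeddings[speaker_id]
        frames = []
        for ch in text:
            base = ord(ch) / 128.0  # crude stand-in for text features
            frame = []
            for weight, row in zip(self.shared_weights, self.projection):
                offset = sum(p * e for p, e in zip(row, embedding))
                frame.append(base * weight + offset)
            frames.append(frame)
        return frames


model = ToyMultiSpeakerModel(num_speakers=3)
voice_a = model.synthesize("hi", speaker_id=0)
voice_b = model.synthesize("hi", speaker_id=1)
```

Here `voice_a` and `voice_b` come from the same shared weights and the same text, and differ only because each speaker has its own small vector. In a design like this, adding a new voice means learning one more small vector rather than a whole new model.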

Accelerator

Sector

Information Technology

Organization

Baidu

Source

Original Publication Date

May 24, 2017
