| 
 Briefing 
 | 
- Significant Text-to-Speech Improvement – Chinese search giant Baidu developed Deep Voice 2, the second iteration of its text-to-speech (TTS) system, producing hundreds of hours of speech in hundreds of voices, up from the 20 hours of speech in a single voice achieved by the original Deep Voice program
  - Better Than Alternatives – Deep Voice 1 and 2 generate speech indistinguishable from a human voice and run in real time, unlike alternative neural TTS systems
  - Fast Learning – Learns to imitate a human voice from less than 30 minutes of recorded audio, compared with the many hours of training audio required by the original version
  - Multiple Voice and Accent System – Identifies characteristics shared across different voices to build a general model of human speech, then adjusts that model to speak as different characters and in different accents
  - Applications – Useful for voice assistants and text-to-speech applications, such as audio readers in e-books
 
 
 
 | 
| 
 Accelerator 
 | 
 | 
| 
 Sector 
 | 
 
Information Technology
 
 | 
| 
 Organization 
 | 
 
Baidu
 
 | 
| 
 Source 
 | 
- Popper, B., "Baidu’s new text-to-speech system can master hundreds of accents"
- David, E., "Baidu’s text-to-speech AI can replicate hundreds of accents"
- Arik, S., et al., "Deep Voice: Real-time Neural Text-to-Speech"
- "Deep Voice 2: Multi-Speaker Neural Text-to-Speech"
- AcceleratingBiz analysis
 
 
 
 
 | 
| 
 Original Publication Date 
 | 
 
May 24, 2017 
 |