Due to licence restrictions, we do not have permission to publish the audio samples.
(We plan to train models on an alternative Japanese database whose licence allows publication.)
English audio samples
Male speaker: bdl
(Supported: Safari, Chrome, Firefox, Opera)
TTS
AnaSyn
Natural
Merlin
V1
V2msp
WORLD
Bonus
Female speaker: slt
Analysis-and-Synthesis of English audio samples
(Supported: Safari, Chrome, Firefox, Opera)
AnaSyn
Natural
WORLD
GL
WaveNet
WaveGlow
V2msp
V2msp'
To support other waveform generation research, we also publish our results: tar.gz
We thank the WaveGlow authors for publishing their audio samples.
Bonus: Applying to HTS-Engine
(Supported: Safari, Chrome, Firefox, Opera)
TTS
HTS
NSF
V2msp
V2msp'
References
Zhizheng Wu, Oliver Watts, and Simon King,
"Merlin: An Open Source Neural Network Speech Synthesis System,"
in Proc. 9th ISCA Speech Synthesis Workshop (SSW9), Sep. 2016.
web page
Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, and Hirokazu Kameoka,
"WaveCycleGAN: Synthetic-to-Natural Speech Waveform Conversion Using Cycle-Consistent Adversarial Networks,"
in Proc. IEEE Spoken Language Technology Workshop (SLT), Dec. 2018.
web page
Masanori Morise, Fumiya Yokomori, and Kenji Ozawa,
"WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications,"
in IEICE Transactions on Information and Systems, 2016.
web page
John Kominek and Alan W. Black,
"The CMU Arctic Speech Databases,"
in Proc. 5th ISCA Speech Synthesis Workshop (SSW5), June 2004.
web page
Daniel W. Griffin and Jae S. Lim,
"Signal Estimation from Modified Short-Time Fourier Transform,"
in IEEE Transactions on Acoustics, Speech, and Signal Processing, 1984.
web page
Aäron V. D. Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew W. Senior, and Koray Kavukcuoglu,
"WaveNet: A Generative Model for Raw Audio,"
in Proc. 9th ISCA Speech Synthesis Workshop (SSW9), Sep. 2016.
web page
Ryan Prenger, Rafael Valle, and Bryan Catanzaro,
"WaveGlow: A Flow-Based Generative Network for Speech Synthesis,"
in Proc. IEEE ICASSP, 2019.
web page
HTS Working Group,
"HMM/DNN-based Speech Synthesis System (HTS),"
web page
Xin Wang, Shinji Takaki, and Junichi Yamagishi,
"Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis,"
in Proc. IEEE ICASSP, 2019.
web page
Akira Kurematsu, Kazuya Takeda, Yoshinori Sagisaka, Shigeru Katagiri, Hisao Kuwabara, and Kiyohiro Shikano,
"ATR Japanese Speech Database as a Tool of Speech Recognition and Synthesis,"
in Speech Communication, 1990.
web page