GAN-PF (DEMO)
  • Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo, Yusuke Ijima, Kaoru Hiramatsu, and Kunio Kashino, "Generative Adversarial Network-based Postfilter for Statistical Parametric Speech Synthesis," The IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
  • Takuhiro Kaneko, Shinji Takaki, Hirokazu Kameoka, and Junichi Yamagishi, "Generative Adversarial Network-based Postfilter for STFT Spectrograms," The Annual Conference of the International Speech Communication Association (Interspeech), 2017.
CycleGAN-VC (DEMO)
  • Takuhiro Kaneko and Hirokazu Kameoka, "Parallel-data-free voice conversion using cycle-consistent adversarial networks," arXiv:1711.11293 [stat.ML], Nov. 2017. (PDF)
CycleGAN-VC2 (DEMO)
  • Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, and Nobukatsu Hojo, "CycleGAN-VC2: Improved CycleGAN-based non-parallel voice conversion," in Proc. ICASSP 2019 (arXiv:1904.04631, Apr. 2019). (PDF)
StarGAN-VC (DEMO)
  • Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, and Nobukatsu Hojo, "StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks," arXiv:1806.02169 [cs.SD], Jun. 2018. (PDF)
ACVAE-VC (DEMO)
  • Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, and Nobukatsu Hojo, "ACVAE-VC: Non-parallel many-to-many voice conversion with auxiliary classifier variational autoencoder," arXiv:1808.05092 [stat.ML], Aug. 2018. (PDF)
WaveCycleGAN (DEMO)
  • Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, and Hirokazu Kameoka, "WaveCycleGAN: Synthetic-to-natural speech waveform conversion using cycle-consistent adversarial networks," arXiv:1809.10288 [eess.AS], Sep. 2018. (PDF)
WaveCycleGAN2 (DEMO)
  • Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, and Nobukatsu Hojo, "WaveCycleGAN2: Time-domain neural post-filter for speech waveform generation," arXiv:1904.02892 [cs.SD], Apr. 2019. (PDF)
ConvS2S-VC (DEMO)
  • Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, and Nobukatsu Hojo, "ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion," arXiv:1811.01609 [cs.SD], Nov. 2018. (PDF)
Crossmodal VC (DEMO)
  • Hirokazu Kameoka, Kou Tanaka, Aaron Valero Puche, Yasunori Ohishi, and Takuhiro Kaneko, "Crossmodal Voice Conversion," arXiv:1904.04540 [cs.SD], Apr. 2019. (PDF)
CDR-NMF (DEMO)
  • Hirokazu Kameoka, Takuya Higuchi, Mikihiro Tanaka, and Li Li, "Nonnegative matrix factorization with basis clustering using cepstral distance regularization," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 6, pp. 1029-1040, Jun. 2018. (PDF)
Multichannel FHMM (DEMO1, DEMO2)
  • Takuya Higuchi, Hirofumi Takeda, Tomohiko Nakamura, and Hirokazu Kameoka, "A unified approach for underdetermined blind signal separation and source activity detection by multichannel factorial hidden Markov models," in Proc. The 15th Annual Conference of the International Speech Communication Association (Interspeech 2014), pp. 850-854, Sep. 2014. (PDF)
Multichannel VAE (DEMO)
  • Hirokazu Kameoka, Li Li, Shota Inoue, and Shoji Makino, "Semi-blind source separation with multichannel variational autoencoder," arXiv:1808.00892 [stat.ML], Aug. 2018. (PDF)