Takuhiro Kaneko

Research Scientist

NTT Communication Science Laboratories, NTT Corporation
kaneko.takuhiro at lab.ntt.co.jp
[Google Scholar]

About

I am a research scientist at NTT Communication Science Laboratories, NTT Corporation.

I received BE and ME degrees from the University of Tokyo in 2012 and 2014, respectively.
I started my PhD studies in the Harada Lab at the University of Tokyo in 2017.
I joined the Recognition Research Group of the Media Information Laboratory at NTT Communication Science Laboratories in 2014.

My research interests include computer vision, signal processing, and machine learning.
In particular, I am currently working on image generation, speech synthesis, and voice conversion using deep generative models.


News

  • International
  • Domestic (Japanese)
    • I uploaded the GAN tutorial slides presented at MIRU 2019. [GAN Tutorial]
    • I will give an invited talk on DTLC-GAN and rGAN at FIT 2019. [Program]
    • I will give an invited talk on rGAN at MIRU 2019. [Program]
    • I will give a tutorial on GANs at MIRU 2019. [Abstract]

Projects

Generative Personal Assistance with Audio and Visual Examples
[Project]


Publications

Class-Distinct and Class-Mutual Image Generation with GANs New!
Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
BMVC 2019 (Spotlight)
(arXiv:1811.11163, Nov. 2018)
[Paper] [Project] [Code] [Slides] [Poster]

StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion New!
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo
Interspeech 2019
(arXiv:1907.12279, July 2019)
[Paper] [Project] [Poster]

Label-Noise Robust Multi-Domain Image-to-Image Translation New!
Takuhiro Kaneko, Tatsuya Harada
arXiv:1905.02185, May 2019
[Paper]

Label-Noise Robust Generative Adversarial Networks New!
Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
CVPR 2019 (Oral, Acceptance Rate: 5.6%)
(arXiv:1811.11165, Nov. 2018)
[Paper] [Project] [Code] [Slides] [Poster] [Talk]

Crossmodal Voice Conversion New!
Hirokazu Kameoka, Kou Tanaka, Aaron Valero Puche, Yasunori Ohishi, Takuhiro Kaneko
arXiv:1904.04540, Apr. 2019
[Paper] [Project]

WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation New!
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo
arXiv:1904.02892, Apr. 2019
[Paper] [Project]

CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion New!
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo
ICASSP 2019
(arXiv:1904.04631, Apr. 2019)
[Paper] [Project] [Poster]

AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms New!
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo
ICASSP 2019
(arXiv:1811.04076, Nov. 2018)
[Paper] [Project]

ConvS2S-VC: Fully Convolutional Sequence-to-Sequence Voice Conversion
Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo
arXiv:1811.01609, Nov. 2018
[Paper] [Project]

WaveCycleGAN: Synthetic-to-Natural Speech Waveform Conversion Using Cycle-Consistent Adversarial Networks
Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Hirokazu Kameoka
SLT 2018
(arXiv:1809.10288, Sept. 2018)
[Paper] [Project]

ACVAE-VC: Non-parallel Many-to-Many Voice Conversion with Auxiliary Classifier Variational Autoencoder
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo
arXiv:1808.05092, Aug. 2018
(IEEE/ACM Transactions on Audio, Speech, and Language Processing, May 2019)
[Paper] [IEEE Xplore] [Project]
(Alternative title: "ACVAE-VC: Non-parallel Voice Conversion with Auxiliary Classifier Variational Autoencoder")

Automatic Speech Pronunciation Correction with Dynamic Frequency Warping-Based Spectral Conversion
Nobukatsu Hojo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko
EUSIPCO 2018
[Paper]

StarGAN-VC: Non-parallel Many-to-Many Voice Conversion with Star Generative Adversarial Networks
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo
SLT 2018
(arXiv:1806.02169, June 2018)
[Paper] [Project]

Generative Adversarial Image Synthesis with Decision Tree Latent Controller
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
CVPR 2018
[Paper] [Project] [Poster]

Generative Adversarial Network-based Approach to Signal Reconstruction from Magnitude Spectrograms
Keisuke Oyamada, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo, Hiroyasu Ando
EUSIPCO 2018
(arXiv:1804.02181, Apr. 2018)
[Paper]

Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks
Takuhiro Kaneko, Hirokazu Kameoka
arXiv:1711.11293, Nov. 2017
(EUSIPCO 2018)
[Paper] [Project]
(Alternative title: "CycleGAN-VC: Non-parallel Voice Conversion Using Cycle-Consistent Adversarial Networks")

Non-native Speech Conversion with Consistency-Aware Recursive Network and Generative Adversarial Network
Keisuke Oyamada, Hirokazu Kameoka, Takuhiro Kaneko, Hiroyasu Ando, Kaoru Hiramatsu, Kunio Kashino
APSIPA ASC 2017
[Paper]

Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial Networks
Takuhiro Kaneko, Hirokazu Kameoka, Kaoru Hiramatsu, Kunio Kashino
Interspeech 2017
[Paper]

Generative Adversarial Network-based Postfilter for STFT Spectrograms
Takuhiro Kaneko, Shinji Takaki, Hirokazu Kameoka, Junichi Yamagishi
Interspeech 2017
[Paper] [Project]

Generative Attribute Controller with Conditional Filtered Generative Adversarial Networks
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
CVPR 2017
[Paper] [Supplemental] [Project]

Generative Adversarial Network-based Postfilter for Statistical Parametric Speech Synthesis
Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo, Yusuke Ijima, Kaoru Hiramatsu, Kunio Kashino
ICASSP 2017
[Paper]

Adaptive Visual Feedback Generation for Facial Expression Improvement with Multi-task Deep Neural Networks
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
ACMMM 2016
[Paper]

Collective Activity Localization by Spatiality Preservation Search
Shigeyuki Odashima, Masamichi Shimosaka, Takuhiro Kaneko, Rui Fukui, Tomomasa Sato
Advanced Robotics 2016
[Paper]

Modeling Risk Anticipation and Defensive Driving on Residential Roads with Inverse Reinforcement Learning
Masamichi Shimosaka, Takuhiro Kaneko, Kentaro Nishi
ITSC 2014
[Paper] [Project]

A Fully Connected Model for Consistent Collective Activity Recognition in Videos
Takuhiro Kaneko, Masamichi Shimosaka, Shigeyuki Odashima, Rui Fukui, Tomomasa Sato
Pattern Recognition Letters 2014
[Paper] [Project]

Consistent Collective Activity Recognition with Fully Connected CRFs
Takuhiro Kaneko, Masamichi Shimosaka, Shigeyuki Odashima, Rui Fukui, Tomomasa Sato
ICPR 2012 (Best Student Paper Award)
[Paper] [Project]

Viewpoint Invariant Collective Activity Recognition with Relative Action Context
Takuhiro Kaneko, Masamichi Shimosaka, Shigeyuki Odashima, Rui Fukui, Tomomasa Sato
ECCV Workshop 2012
[Paper] [Project]

Collective Activity Localization with Contextual Spatial Pyramid
Shigeyuki Odashima, Masamichi Shimosaka, Takuhiro Kaneko, Rui Fukui, Tomomasa Sato
ECCV Workshop 2012
[Paper]


Review Paper

[Invited Review] Generative Adversarial Networks: Foundations and Applications New!
Takuhiro Kaneko
Acoustical Science and Technology, vol. 39, no. 3, pp. 189-197, May 2018
[Paper] [Paper (Japanese)]

Generative Personal Assistance with Audio and Visual Examples
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
NTT Technical Review, vol. 15, no. 11, Nov. 2017
[Paper] [Paper (Japanese)]


Talks & Exhibitions

[Invited Talk] Generative Adversarial Image Synthesis with Decision Tree Latent Controller (CVPR 2018)¹ / Label-Noise Robust Generative Adversarial Networks (CVPR 2019)² New!
¹Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
²Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
FIT 2019 (in Japanese)
[Program (Japanese)]

[Invited Talk] Label-Noise Robust Generative Adversarial Networks (CVPR 2019) New!
Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
MIRU 2019 (in Japanese)
[Program (Japanese)]

[Tutorial] Foundations, Advances, and Applications of Generative Adversarial Networks New!
Takuhiro Kaneko
MIRU 2019 (in Japanese)
[Abstract (Japanese)] [Slides (Japanese)]

[Invited Talk] Foundations, Advances, and Applications of Generative Adversarial Networks: From Image Generation to Speech Synthesis and Voice Conversion
Takuhiro Kaneko
75th JSAI Seminar (in Japanese)
[Abstract (Japanese)]

[Invited Talk] Generative Adversarial Networks: Foundations and Applications
Takuhiro Kaneko
JAMIT 2018 (in Japanese)
[Abstract (Japanese)]

Creating Favorite Images with Selective Decisions: Hierarchical Image Analysis and Synthesis with DTLC-GAN
Takuhiro Kaneko
NTT Communication Science Laboratories Open House 2018
[Poster] [Poster (Japanese)]

Free-Feature-Point Image Generation: Interactive and Flexible Image Generation with Deep Learning
Takuhiro Kaneko
NTT R&D Forum 2018
[Poster] [Poster (Japanese)]

[Invited Talk] Generative Attribute Controller with Conditional Filtered Generative Adversarial Networks (CVPR 2017)
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
MIRU 2017 (in Japanese)
[Program (Japanese)]

Generative Personal Assistance with Audio and Visual Examples: Deep Learning Opens the Way to Innovative Media Generation
Takuhiro Kaneko
NTT Communication Science Laboratories Open House 2017
[Paper] [Abstract] [Poster]
[Paper (Japanese)] [Abstract (Japanese)] [Poster (Japanese)]


Awards & Honors

  • IEICE ISS Young Researcher's Award in Speech Field, 2017
  • Best Student Paper Award, ICPR, 2012
  • The Hatakeyama Award, the Japan Society of Mechanical Engineers, 2012

Contact

Takuhiro Kaneko
NTT Communication Science Laboratories, NTT Corporation
kaneko.takuhiro at lab.ntt.co.jp