About

I am a Distinguished Researcher at NTT Communication Science Laboratories in NTT Corporation.

I completed my Ph.D. in 2020 at Harada Lab., the University of Tokyo, under the supervision of Tatsuya Harada.
I received the BE and ME degrees from the University of Tokyo in 2012 and 2014, respectively.

In 2014, I joined the Recognition Research Group of the Media Information Laboratory at NTT Communication Science Laboratories.

My research interests include computer vision, signal processing, and machine learning.
In particular, I am currently working on image synthesis, speech synthesis, and voice conversion using deep generative models (e.g., GANs).


News

  • International
  • Domestic (Japanese)
    • I will present an invited talk at RIMS Workshop in Biofluids 2021. [Program] New!
    • I will present an invited talk at MIRU 2021. [Program] New!
    • A news release was published. [News release] New!
    • I will present an invited talk at OTOGAKU Symposium 2021. [Program]
  • International
    • One paper accepted to IEEE/ACM Trans. ASLP. [Many-to-Many VTN] New!
    • I will present an invited talk at ACCV 2020 Workshop (MLCSA). [Program] New!
    • One paper accepted to IEEE/ACM Trans. ASLP. [A-StarGAN-VC] New!
    • One paper submitted to arXiv. [VoiceGrad] New!
    • One paper accepted to Interspeech 2020. [CycleGAN-VC3] New!
    • One paper accepted to IEEE/ACM Trans. ASLP. [ConvS2S-VC] New!
    • I became a Distinguished Researcher in April 2020. [Introduction] New!
    • One paper submitted to arXiv. [BNCR-GAN] New!
    • One paper accepted to CVPR 2020 (acceptance rate: 22.1%). [NR-GAN] New!
  • Domestic (Japanese)
    • I uploaded the GAN tutorial slides presented at JSAI 2020. [GAN Tutorial]
    • I will present a tutorial on GANs at JSAI 2020. [Program]
    • I served as the Graduate Student Representative of the Graduate School.
    • I received the Dean's Award for Best Doctoral Thesis.
  • International
    • One paper submitted to arXiv. [NR-GAN]
    • I was selected as an Outstanding Reviewer at ICCV 2019 (top 91). [List]
    • One paper accepted to BMVC 2019 (Spotlight). [CP-GAN]
    • One paper accepted to Interspeech 2019. [StarGAN-VC2]
    • One paper accepted to IEEE/ACM Trans. ASLP. [ACVAE-VC]
    • One paper submitted to arXiv. [RMIT]
    • One paper accepted to CVPR 2019 (Oral, acceptance rate: 5.6%). [rGAN]
    • Two papers submitted to arXiv. [WaveCycleGAN2] [Crossmodal VC]
    • Two papers accepted to ICASSP 2019. [CycleGAN-VC2] [AttS2S-VC]
  • Domestic (Japanese)
    • I uploaded the GAN tutorial slides presented at MIRU 2019. [GAN Tutorial]
    • I will present an invited talk on DTLC-GAN and rGAN at FIT 2019. [Program]
    • I will present an invited talk on rGAN at MIRU 2019. [Program]
    • I will present a tutorial on GANs at MIRU 2019. [Abstract]

Publications

[Google Scholar] [Semantic Scholar] [dblp]

Image Synthesis and Image Recognition

Unsupervised Learning of Depth and Depth-of-Field Effect from Natural Images with Aperture Rendering Generative Adversarial Networks New!
Takuhiro Kaneko
CVPR 2021 (Oral) (oral acceptance rate: 4.2%) (arXiv:2106.13041, June 2021)
[Paper] [Project] [Slides] [Poster] [News Release (Japanese)]

Blur, Noise, and Compression Robust Generative Adversarial Networks New!
Takuhiro Kaneko, Tatsuya Harada
CVPR 2021 (acceptance rate: 23.6%) (arXiv:2003.07849, Mar. 2020)
[Paper] [Project] [Slides] [Poster]

Noise Robust Generative Adversarial Networks New!
Takuhiro Kaneko, Tatsuya Harada
CVPR 2020 (acceptance rate: 22.1%) (arXiv:1911.11776, Nov. 2019)
[Paper] [Project] [Code] [Slides] [Video]

Class-Distinct and Class-Mutual Image Generation with GANs
Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
BMVC 2019 (Spotlight) (arXiv:1811.11163, Nov. 2018)
[Paper] [Project] [Code] [Slides] [Poster]

Label-Noise Robust Multi-Domain Image-to-Image Translation
Takuhiro Kaneko, Tatsuya Harada
arXiv:1905.02185, May 2019
[Paper]

Label-Noise Robust Generative Adversarial Networks
Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
CVPR 2019 (Oral) (oral acceptance rate: 5.6%) (arXiv:1811.11165, Nov. 2018)
[Paper] [Project] [Code] [Slides] [Poster] [Talk]

Generative Adversarial Image Synthesis with Decision Tree Latent Controller
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
CVPR 2018 (acceptance rate: 29.6%) (arXiv:1805.10603, May 2018)
[Paper] [Project] [Poster]

Generative Attribute Controller with Conditional Filtered Generative Adversarial Networks
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
CVPR 2017 (acceptance rate: 29.9%)
[Paper] [Supplemental] [Project]

Adaptive Visual Feedback Generation for Facial Expression Improvement with Multi-task Deep Neural Networks
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
ACMMM 2016
[Paper]

Collective Activity Localization by Spatiality Preservation Search
Shigeyuki Odashima, Masamichi Shimosaka, Takuhiro Kaneko, Rui Fukui, Tomomasa Sato
Advanced Robotics 30(11-12), Mar. 2016
[Paper]

A Fully Connected Model for Consistent Collective Activity Recognition in Videos
Takuhiro Kaneko, Masamichi Shimosaka, Shigeyuki Odashima, Rui Fukui, Tomomasa Sato
Pattern Recognition Letters 43, July 2014
[Paper] [Project]

Consistent Collective Activity Recognition with Fully Connected CRFs
Takuhiro Kaneko, Masamichi Shimosaka, Shigeyuki Odashima, Rui Fukui, Tomomasa Sato
ICPR 2012 (Best Student Paper Award)
[Paper] [Project]

Viewpoint Invariant Collective Activity Recognition with Relative Action Context
Takuhiro Kaneko, Masamichi Shimosaka, Shigeyuki Odashima, Rui Fukui, Tomomasa Sato
ECCV Workshop 2012
[Paper] [Project]

Collective Activity Localization with Contextual Spatial Pyramid
Shigeyuki Odashima, Masamichi Shimosaka, Takuhiro Kaneko, Rui Fukui, Tomomasa Sato
ECCV Workshop 2012
[Paper]

Voice Conversion and Speech Synthesis

FastS2S-VC: Streaming Non-Autoregressive Sequence-to-Sequence Voice Conversion New!
Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko
arXiv:2104.06900, Apr. 2021
[Paper] [Project 1 (ConvS2S-VC2)] [Project 2 (Transformer-VC2)]

MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames New!
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo
ICASSP 2021 (arXiv:2102.12841, Feb. 2021)
[Paper] [Project] [Slides] [Poster]

VoiceGrad: Non-parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics New!
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo, Shogo Seki
arXiv:2010.02977, Oct. 2020
[Paper] [Project]

Many-to-Many Voice Transformer Network New!
Hirokazu Kameoka, Wen-Chin Huang, Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Tomoki Toda
IEEE/ACM Trans. Audio Speech Lang. Process., 29, Dec. 2020 (arXiv:2005.08445, May 2020)
[Paper] [IEEE Xplore] [Project]

CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion New!
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo
Interspeech 2020 (arXiv:2010.11672, Oct. 2020)
[Paper] [Project] [Slides]

Non-parallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks New!
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo
IEEE/ACM Trans. Audio Speech Lang. Process., 28, Nov. 2020 (arXiv:2008.12604, Aug. 2020)
[Paper] [IEEE Xplore]

StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo
Interspeech 2019 (arXiv:1907.12279, July 2019)
[Paper] [Project] [Poster]

Crossmodal Voice Conversion
Hirokazu Kameoka, Kou Tanaka, Aaron Valero Puche, Yasunori Ohishi, Takuhiro Kaneko
arXiv:1904.04540, Apr. 2019
[Paper] [Project]

WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo
arXiv:1904.02892, Apr. 2019
[Paper] [Project]

CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo
ICASSP 2019 (arXiv:1904.04631, Apr. 2019)
[Paper] [Project] [Poster]

AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo
ICASSP 2019 (arXiv:1811.04076, Nov. 2018)
[Paper] [Project]

ConvS2S-VC: Fully Convolutional Sequence-to-Sequence Voice Conversion
Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo
IEEE/ACM Trans. Audio Speech Lang. Process., 28, June 2020 (arXiv:1811.01609, Nov. 2018)
[Paper] [IEEE Xplore] [Project]

WaveCycleGAN: Synthetic-to-Natural Speech Waveform Conversion Using Cycle-Consistent Adversarial Networks
Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Hirokazu Kameoka
SLT 2018 (arXiv:1809.10288, Sept. 2018)
[Paper] [Project]

ACVAE-VC: Non-parallel Voice Conversion with Auxiliary Classifier Variational Autoencoder
(Alternative title: ACVAE-VC: Non-parallel Many-to-Many Voice Conversion with Auxiliary Classifier Variational Autoencoder)
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo
IEEE/ACM Trans. Audio Speech Lang. Process., 27(9), Sept. 2019 (arXiv:1808.05092, Aug. 2018)
[Paper] [IEEE Xplore] [Project]

Automatic Speech Pronunciation Correction with Dynamic Frequency Warping-Based Spectral Conversion
Nobukatsu Hojo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko
EUSIPCO 2018
[Paper]

StarGAN-VC: Non-parallel Many-to-Many Voice Conversion with Star Generative Adversarial Networks
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo
SLT 2018 (arXiv:1806.02169, June 2018)
[Paper] [Project]

Generative Adversarial Network-based Approach to Signal Reconstruction from Magnitude Spectrograms
Keisuke Oyamada, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo, Hiroyasu Ando
EUSIPCO 2018 (arXiv:1804.02181, Apr. 2018)
[Paper]

CycleGAN-VC: Non-parallel Voice Conversion Using Cycle-Consistent Adversarial Networks
(Alternative title: Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks)
Takuhiro Kaneko, Hirokazu Kameoka
EUSIPCO 2018 (arXiv:1711.11293, Nov. 2017)
[Paper] [Project]

Non-native Speech Conversion with Consistency-Aware Recursive Network and Generative Adversarial Network
Keisuke Oyamada, Hirokazu Kameoka, Takuhiro Kaneko, Hiroyasu Ando, Kaoru Hiramatsu, Kunio Kashino
APSIPA ASC 2017
[Paper]

Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial Networks
Takuhiro Kaneko, Hirokazu Kameoka, Kaoru Hiramatsu, Kunio Kashino
Interspeech 2017
[Paper]

Generative Adversarial Network-based Postfilter for STFT Spectrograms
Takuhiro Kaneko, Shinji Takaki, Hirokazu Kameoka, Junichi Yamagishi
Interspeech 2017
[Paper] [Project]

Generative Adversarial Network-based Postfilter for Statistical Parametric Speech Synthesis
Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo, Yusuke Ijima, Kaoru Hiramatsu, Kunio Kashino
ICASSP 2017
[Paper]

Intelligent Transportation Systems

Modeling Risk Anticipation and Defensive Driving on Residential Roads with Inverse Reinforcement Learning
Masamichi Shimosaka, Takuhiro Kaneko, Kentaro Nishi
ITSC 2014
[Paper] [Project]


Review Paper

Communication with Desired Voice
Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Hirokazu Kameoka
NTT Technical Review 18(11), Nov. 2020
[Paper] [Paper (Japanese)]

[Invited Review] Generative Adversarial Networks: Foundations and Applications
Takuhiro Kaneko
Acoustical Science and Technology 39(3), May 2018
[Paper] [Paper (Japanese)]

Generative Personal Assistance with Audio and Visual Examples
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
NTT Technical Review 15(11), Nov. 2017
[Paper] [Paper (Japanese)]


Talks & Exhibitions

[Invited Talk] Learning to Generate Images with Generative Adversarial Networks New!
Takuhiro Kaneko
RIMS Workshop 2021
[Program]

[Invited Talk] Blur, Noise, and Compression Robust Generative Adversarial Networks (CVPR 2021) New!
Takuhiro Kaneko, Tatsuya Harada
MIRU 2021 (in Japanese)
[Program (Japanese)]

[Invited Talk] Image Synthesis and Voice Conversion with Generative Adversarial Networks
Takuhiro Kaneko
OTOGAKU Symposium 2021 (in Japanese)
[Program (Japanese)]

[Invited Talk] Learning to Generate Images with Imperfect Supervision
Takuhiro Kaneko
ACCV 2020 Workshop (MLCSA)
[Program]

[Tutorial] Foundations, Advances, and Applications of Generative Adversarial Networks
Takuhiro Kaneko
JSAI 2020 (in Japanese)
[Program (Japanese)] [Slides (Japanese)]

[Invited Talk] Voice Conversion with Image-to-Image Translation and Sequence-to-Sequence Learning Approaches
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo
SANE 2019
[Video] [Program]

[Invited Talk] Generative Adversarial Image Synthesis with Decision Tree Latent Controller (CVPR 2018)¹
Label-Noise Robust Generative Adversarial Networks (CVPR 2019)²
¹ Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
² Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
FIT 2019 (in Japanese)
[Program (Japanese)]

[Invited Talk] Label-Noise Robust Generative Adversarial Networks (CVPR 2019)
Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
MIRU 2019 (in Japanese)
[Program (Japanese)]

[Tutorial] Foundations, Advances, and Applications of Generative Adversarial Networks
Takuhiro Kaneko
MIRU 2019 (in Japanese)
[Abstract (Japanese)] [Slides (Japanese)]

[Tutorial] Foundations, Advances, and Applications of Generative Adversarial Networks: From Image Generation to Speech Synthesis and Voice Conversion
Takuhiro Kaneko
The 75th JSAI Seminar (in Japanese)
[Abstract (Japanese)]

[Invited Talk] Generative Adversarial Networks: Foundations and Applications
Takuhiro Kaneko
JAMIT 2018 (in Japanese)
[Abstract (Japanese)]

Creating Favorite Images with Selective Decisions: Hierarchical Image Analysis and Synthesis with DTLC-GAN
Takuhiro Kaneko
NTT Communication Science Laboratories Open House 2018
[Poster] [Poster (Japanese)]

Free-Feature-Point Image Generation: Interactive and Flexible Image Generation with Deep Learning
Takuhiro Kaneko
NTT R&D Forum 2018
[Poster] [Poster (Japanese)]

[Invited Talk] Generative Attribute Controller with Conditional Filtered Generative Adversarial Networks (CVPR 2017)
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
MIRU 2017 (in Japanese)
[Program (Japanese)]

Generative Personal Assistance with Audio and Visual Examples: Deep Learning Opens the Way to Innovative Media Generation
Takuhiro Kaneko
NTT Communication Science Laboratories Open House 2017
[Paper] [Abstract] [Poster]
[Paper (Japanese)] [Abstract (Japanese)] [Poster (Japanese)]


Lectures

  • Generative Model, GANs
    Intelligent Informatics
    Graduate School of Information Science and Technology, The University of Tokyo, June 25th, 2020

Awards & Honors


Contact

Takuhiro Kaneko
NTT Communication Science Laboratories, NTT Corporation
takuhiro.kaneko.tb at hco.ntt.co.jp