Takuhiro Kaneko

About

I am a Distinguished Researcher at NTT Communication Science Laboratories in NTT, Inc.
I received my Ph.D. from the University of Tokyo in 2020, under the supervision of Tatsuya Harada.
I also received the B.E. and M.E. degrees from the University of Tokyo in 2012 and 2014, respectively.

I am a member of the Recognition Research Group and Computational Modeling Research Group at the Media Information Laboratory within NTT Communication Science Laboratories.

My research interests include computer vision, signal processing, and machine learning.
In particular, I am currently working on image synthesis, 3D reconstruction, speech synthesis, and voice conversion using deep generative and physical models.

Google Scholar / Semantic Scholar / dblp / GitHub

Job Opportunities: We are hiring! If you are interested, please check the following page: Japanese / English

News

2026

International

New speech corpus released: Japanese Idol Speech Corpus
One paper accepted to ICASSP 2026: MeanVoiceFlow
Two papers accepted to WACV 2026: IPCD / DHRL

Domestic (Japanese)

I will give a lecture on Image Generation at the University of Tokyo. Schedule

2025

International

One paper accepted to TASLP. LatentVoiceGrad
One paper accepted to ICIP 2025. Real-world IID assessment
Three papers accepted to Interspeech 2025. FasterVoiceGrad / VPFD / JIS
One paper accepted to CVPR 2025 (Highlight). Structure from Collision
One paper accepted to ICASSP 2025. Rethinking MOS
One paper accepted to WACV 2025. LIET

Domestic (Japanese)

I will give a lecture on Image Generative AI at the University of Tokyo. Schedule
I will present an invited talk at MIRU 2025. Program
I will present an invited talk at The 2nd Spatial AI Seminar. Program / Slides
I will present a tutorial at SSII 2025. Program / Slides
I will serve as Area Chair at MIRU 2025.
I will serve as Vice Session Chair at ASJ 2025 Spring.

2024

International

I will give an invited talk at ACCV 2024 Workshop (MLCSA 2024). Website
NTT's 17 papers accepted to Interspeech 2024. NTT Topics
Two papers accepted to Interspeech 2024. FastVoiceGrad / PRVAE-VC2
One paper accepted to EUSIPCO 2024. Subjective voice description
NTT's 3 papers accepted to CVPR 2024. NTT Topics (in Japanese)
One paper accepted to CVPR 2024. LPO
NTT's 20 papers accepted to ICASSP 2024. NTT Topics
One paper submitted to arXiv. LIET
One paper accepted to TASLP. VoiceGrad
Two papers accepted to ICASSP 2024. AugCondD / N_low-MOS

I will serve as an Associate Editor for IEICE Trans. Inf. & Syst. List

Domestic (Japanese)

I will give an invited talk at the Technical Committee on Speech (SP), Dec. 2024. Program
A review on Deep Generative Models was published in IEICE Bplus. Paper
I will give an invited talk at the Technical Committee on Pattern Recognition and Media Understanding (PRMU), Nov. 2024. Program
I will give an invited talk on LPO at FIT 2024. Program
I will serve as Session Chair at MIRU 2024. Japanese / English
Our voice conversion technology was featured on TV programs. Nippon TV / Fuji TV / TV Asahi
Our voice conversion technology was featured in NTT News Release. Japanese / English
Our voice conversion demo will be exhibited at NTT CS Labs Open House 2024. Japanese / English
I will give a tutorial on Deep Generative Models at Advanced Image Seminar 2024. Program
I will give a lecture on Generative Models at the University of Tokyo. Catalog
I will serve as Area Chair at MIRU 2024. Japanese / English
A review on GANs was published in Journal of ISCIE. Contents

2023

International

NTT's 3 papers accepted to ICCV 2023. NTT Topics (MIMO-NeRF)
One paper accepted to ICMI 2023. Frame-Level Event Representation
One paper accepted to ICCV 2023. MIMO-NeRF
One paper accepted to SSW 2023. PRVAE-VC
One paper accepted to EUSIPCO 2023. W2N-AVSC
NTT's 19 papers accepted to Interspeech 2023. NTT Topics (iSTFTNet2 / CFVC)
Two papers accepted to Interspeech 2023. iSTFTNet2 / CFVC
One paper accepted to IEEE Access. W2N-SC
One paper accepted to CVPR 2023. IID-LI
Two papers accepted to ICASSP 2023. Wave-U-Net D / JSV-VC

Domestic (Japanese)

NTT's 3 papers accepted to ICCV 2023. NTT Topics (MIMO-NeRF)
NTT's 19 papers accepted to Interspeech 2023. NTT Topics (iSTFTNet2 / CFVC)
I received the 54th Awaya Prize Young Researcher Award. List
I will present a tutorial on GANs at the IPSJ Continuous Seminar 2023. Program
IID-LI (CVPR 2023) was featured in NTT News Release.
NTT's 3 papers accepted to CVPR 2023. NTT Topics (IID-LI)
NTT's 15 papers accepted to ICASSP 2023. NTT Topics (Wave-U-Net D / JSV-VC)
I received the 38th Telecom System Technology Award. Press release
I serve as an Area Chair for MIRU 2023. List

2022

International

One paper accepted to SLT 2022. Distilling S2S-VC
I was selected as Outstanding Reviewer at ICML 2022 (Top 10%). List
Two papers accepted to Interspeech 2022. MISRNet / CAUSE
A review on AR-GAN was published in NTT Technical Review. Website
An interview was published in NTT Technical Review. Website
One paper accepted to CVPR 2022. AR-NeRF
One paper accepted to ICASSP 2022. iSTFTNet

Domestic (Japanese)

I will present an invited talk at The 72nd ODG Seminar. Program
I will present an invited talk at The 118th AVM Seminar. Program
I will present an invited talk on AR-GAN at FIT 2022. Program
I will present an invited talk on AR-NeRF at MIRU 2022. Program
A review on AR-GAN was published in Hikari Gijutsu Contact. Contents
I wrote a chapter in a book published by Kyoritsu Shuppan. CV Saizensen Summer 2022
I will exhibit our research – AR-GAN and AR-NeRF – at NTT CS Labs Open House 2022. Website
I will present a lecture on generative models (particularly GANs) at the University of Tokyo. Course catalogue
An interview was published in NTT Technical Journal. Website
A review on image synthesis and voice conversion was published in The Journal of IEICE. Contents
I serve as an Area Chair for MIRU 2022. List

2021

International

I will present an invited talk at IDW 2021. Program
I will present an invited talk at RIMS Workshop in Biofluids 2021. Program
Two papers accepted to CVPR 2021 (1 oral, 1 poster). AR-GAN / BNCR-GAN
One paper submitted to arXiv. FastS2S-VC
One paper accepted to ICASSP 2021. MaskCycleGAN-VC

Domestic (Japanese)

I will present an invited talk at MIRU 2021. Program
The news release was published. News release
I will present an invited talk at OTOGAKU Symposium 2021. Program

2020

International

One paper accepted to TASLP. Many-to-Many VTN
I will present an invited talk at ACCV 2020 Workshop (MLCSA 2020). Program
One paper accepted to TASLP. A-StarGAN-VC
One paper submitted to arXiv. VoiceGrad
One paper accepted to Interspeech 2020. CycleGAN-VC3
One paper accepted to TASLP. ConvS2S-VC
I became a Distinguished Researcher in April 2020. Introduction
One paper submitted to arXiv. BNCR-GAN
One paper accepted to CVPR 2020 (acceptance rate: 22.1%). NR-GAN

Domestic (Japanese)

I uploaded the GAN tutorial slides presented at JSAI 2020. GAN Tutorial
I will present a tutorial on GANs at JSAI 2020. Program
I served as Graduate Student Representative of Graduate School.
I received Dean's Award for Best Doctoral Thesis.

2019

International

One paper submitted to arXiv. NR-GAN
I was selected as Outstanding Reviewer at ICCV 2019 (Top 91). List
One paper accepted to BMVC 2019 (Spotlight). CP-GAN
One paper accepted to Interspeech 2019. StarGAN-VC2
One paper accepted to TASLP. ACVAE-VC
One paper submitted to arXiv. RMIT
One paper accepted to CVPR 2019 (Oral, acceptance rate: 5.6%). rGAN
Two papers submitted to arXiv. WaveCycleGAN2 / Crossmodal VC
Two papers accepted to ICASSP 2019. CycleGAN-VC2 / AttS2S-VC

Domestic (Japanese)

I uploaded the GAN tutorial slides presented at MIRU 2019. GAN Tutorial
I will present an invited talk on DTLC-GAN and rGAN at FIT 2019. Program
I will present an invited talk on rGAN at MIRU 2019. Program
I will present a tutorial on GANs at MIRU 2019. Abstract

2018

Two papers submitted to arXiv. rGAN / CP-GAN
Two papers submitted to arXiv. ConvS2S-VC / AttS2S-VC
One paper submitted to arXiv. WaveCycleGAN
Two papers accepted to SLT 2018. StarGAN-VC / WaveCycleGAN
I will present an invited talk at the 75th JSAI Seminar (in Japanese). Abstract (Japanese)
One paper submitted to arXiv. ACVAE-VC
Three papers accepted to EUSIPCO 2018. CycleGAN-VC / GAN Signal Reconstruction / DFW-SC
One paper submitted to arXiv. StarGAN-VC
I will present an invited talk at JAMIT 2018 (in Japanese). Abstract (Japanese)
I will exhibit at NTT Communication Science Laboratories Open House 2018. Poster
One paper accepted to CVPR 2018. DTLC-GAN
One invited review was published in Acoustical Science and Technology. GANs: Foundations and Applications
One paper submitted to arXiv. GAN Signal Reconstruction
I will exhibit at NTT R&D Forum 2018. Poster

2017

One paper submitted to arXiv. CycleGAN-VC
One paper accepted to APSIPA ASC 2017. Recursive Net & GAN-PF for VC
I received IEICE ISS Young Researcher's Award in Speech Field in August 2017. Announcement (Japanese)
Two papers accepted to Interspeech 2017. GAN-PF for STFT Spectrograms / GAN-VC
I will present an invited talk at MIRU 2017 (in Japanese). Program (Japanese)
One paper accepted to CVPR 2017. CFGAN
I will talk at NTT Communication Science Laboratories Open House 2017 (in Japanese). Abstract
One paper accepted to ICASSP 2017. GAN-PF

2016

One paper accepted to ACMMM 2016. Deep Feedback

Publications

MeanVoiceFlow: One-step Nonparallel Voice Conversion with Mean Flows
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Yuto Kondo
ICASSP 2026
Project Page / Paper

IPCD: Intrinsic Point-Cloud Decomposition
Shogo Sato, Takuhiro Kaneko, Shoichiro Takeda, Tomoyasu Shimada, Kazuhiko Murasaki, Taiga Yoshida, Ryuichi Tanida, Akisato Kimura
WACV 2026
Paper

Distribution Highlighted Reference-based Label Distribution Learning for Facial Age Estimation
Satoshi Suzuki, Shin'ya Yamaguchi, Shoichiro Takeda, Takuhiro Kaneko, Shota Orihashi, Ryo Masumura
WACV 2026
Paper

LatentVoiceGrad: Nonparallel Voice Conversion with Latent Diffusion/Flow-Matching Models
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Yuto Kondo
IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP)
(Tech report, Sep. 2025)
Project Page / Paper / IEEE Xplore

Objective, Absolute and Hue-aware Metrics for Intrinsic Image Decomposition on Real-World Scenes: A Proof of Concept
Shogo Sato, Masaru Tsuchida, Mariko Yamaguchi, Takuhiro Kaneko, Kazuhiko Murasaki, Taiga Yoshida, Ryuichi Tanida
ICIP 2025
Paper

FasterVoiceGrad: Faster One-step Diffusion-Based Voice Conversion with Adversarial Diffusion Conversion Distillation
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Yuto Kondo
Interspeech 2025
Project Page / Paper

Vocoder-Projected Feature Discriminator
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Yuto Kondo
Interspeech 2025
Project Page / Paper

JIS: A Speech Corpus of Japanese Idol Speakers with Various Speaking Styles
Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko
Interspeech 2025
Paper / Dataset

Structure from Collision
Takuhiro Kaneko
CVPR 2025 (Highlight)
Project Page / Paper / Video

Rethinking Mean Opinion Scores in Speech Quality Assessment: Score Aggregation through Quantized Distribution Fitting
Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko
ICASSP 2025
Paper

Unsupervised Intrinsic Image Decomposition with LiDAR Intensity Enhanced Training
Shogo Sato, Takuhiro Kaneko, Kazuhiko Murasaki, Taiga Yoshida, Ryuichi Tanida, Akisato Kimura
WACV 2025
Paper

FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Yuto Kondo
Interspeech 2024
Project Page / Paper

PRVAE-VC2: Non-Parallel Voice Conversion by Distillation of Speech Representations
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Yuto Kondo
Interspeech 2024
Project Page / Paper

Learning to Assess Subjective Impressions from Speech
Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, Noboru Harada
EUSIPCO 2024
Paper

Improving Physics-Augmented Continuum Neural Radiance Field-Based Geometry-Agnostic System Identification with Lagrangian Particle Optimization
Takuhiro Kaneko
CVPR 2024
Project Page / Paper / Video

VoiceGrad: Non-parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo, Shogo Seki
IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP)
(Tech report, Oct. 2020)
Project Page 1 (applied to mel-cepstrums) / Project Page 2 (applied to mel-spectrograms) / Paper / IEEE Xplore

Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka
ICASSP 2024
Project Page / Paper

Selecting N-Lowest Scores for Training MOS Prediction Models
Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko
ICASSP 2024
Paper

Frame-Level Event Representation Learning for Semantic-Level Generation and Editing of Avatar Motion
Ayaka Ideno, Takuhiro Kaneko, Tatsuya Harada
ICMI 2023
Paper / Code

MIMO-NeRF: Fast Neural Rendering with Multi-input Multi-output Neural Radiance Fields
Takuhiro Kaneko
ICCV 2023
Project Page / Paper

W2N-AVSC: Audiovisual Extension for Whisper-to-Normal Speech Conversion
Shogo Seki, Kanami Imamura, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Noboru Harada
EUSIPCO 2023
Paper

PRVAE-VC: Non-parallel Many-to-Many Voice Conversion with Perturbation-Resistant Variational Autoencoder
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko
SSW 2023
Project Page / Paper

iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki
Interspeech 2023
Project Page / Paper

CFVC: Conditional Filtering for Controllable Voice Conversion
Kou Tanaka, Takuhiro Kaneko, Hirokazu Kameoka, Shogo Seki
Interspeech 2023
Project Page / Paper

Unsupervised Intrinsic Image Decomposition with LiDAR Intensity
Shogo Sato, Yasuhiro Yao, Taiga Yoshida, Takuhiro Kaneko, Shingo Ando, Jun Shimamura
CVPR 2023
Paper / Video / Dataset

Non-parallel Whisper-to-Normal Speaking Style Conversion Using Auxiliary Classifier Variational Autoencoder
Shogo Seki, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka
IEEE Access
Paper

Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki
ICASSP 2023
Project Page / Paper

JSV-VC: Jointly Trained Speaker Verification and Voice Conversion Models
Shogo Seki, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko
ICASSP 2023
Paper

Distilling Sequence-to-Sequence Voice Conversion Models For Streaming Conversion Applications
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Shogo Seki
SLT 2022
Project Page / Paper

MISRNet: Lightweight Neural Vocoder Using Multi-Input Single Shared Residual Blocks
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki
Interspeech 2022
Project Page / Paper

CAUSE: Crossmodal Action Unit Sequence Estimation from Speech
Hirokazu Kameoka, Takuhiro Kaneko, Shogo Seki, Kou Tanaka
Interspeech 2022
Project Page / Paper

AR-NeRF: Unsupervised Learning of Depth and Defocus Effects from Natural Images with Aperture Rendering Neural Radiance Fields
Takuhiro Kaneko
CVPR 2022
Project Page / Paper

iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform
Takuhiro Kaneko, Kou Tanaka, Hirokazu Kameoka, Shogo Seki
ICASSP 2022
Project Page / Paper

Unsupervised Learning of Depth and Depth-of-Field Effect from Natural Images with Aperture Rendering Generative Adversarial Networks
Takuhiro Kaneko
CVPR 2021 (Oral)
Project Page / Paper / Slides / Poster / News Release (in Japanese)

Blur, Noise, and Compression Robust Generative Adversarial Networks
Takuhiro Kaneko, Tatsuya Harada
CVPR 2021
Project Page / Paper / Slides / Poster

FastS2S-VC: Streaming Non-Autoregressive Sequence-to-Sequence Voice Conversion
Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko
arXiv:2104.06900, Apr. 2021
Project Page 1 (ConvS2S-VC2) / Project Page 2 (Transformer-VC2) / Paper

MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo
ICASSP 2021
Project Page / Paper / Slides / Poster

Many-to-Many Voice Transformer Network
Hirokazu Kameoka, Wen-Chin Huang, Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Tomoki Toda
IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP), 29, Dec. 2020
(Tech report, May 2020)
Project Page / Paper / IEEE Xplore

CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo
Interspeech 2020
Project Page / Paper / Slides

Nonparallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo
IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP), 28, Nov. 2020
(Tech report, Aug. 2020)
Paper / IEEE Xplore

ConvS2S-VC: Fully Convolutional Sequence-to-Sequence Voice Conversion
Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo
IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP), 28, June 2020
(Tech report, Nov. 2018)
Project Page / Paper / IEEE Xplore

Noise Robust Generative Adversarial Networks
Takuhiro Kaneko, Tatsuya Harada
CVPR 2020
Project Page / Paper / Code / Slides / Video

StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo
Interspeech 2019
Project Page / Paper / Poster

Class-Distinct and Class-Mutual Image Generation with GANs
Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
BMVC 2019 (Spotlight)
Project Page / Paper / Code / Slides / Poster

ACVAE-VC: Non-parallel Voice Conversion with Auxiliary Classifier Variational Autoencoder
(Alternative title: ACVAE-VC: Non-parallel Many-to-Many Voice Conversion with Auxiliary Classifier Variational Autoencoder)
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo
IEEE/ACM Trans. Audio Speech Lang. Process. (TASLP), 27(9), Sep. 2019
(Tech report, Aug. 2018)
Project Page / Paper / IEEE Xplore

Label-Noise Robust Multi-Domain Image-to-Image Translation
Takuhiro Kaneko, Tatsuya Harada
arXiv:1905.02185, May 2019
Paper

Label-Noise Robust Generative Adversarial Networks
Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
CVPR 2019 (Oral)
Project Page / Paper / Code / Slides / Poster / Talk

Crossmodal Voice Conversion
Hirokazu Kameoka, Kou Tanaka, Aaron Valero Puche, Yasunori Ohishi, Takuhiro Kaneko
arXiv:1904.04540, Apr. 2019
Project Page / Paper

WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo
arXiv:1904.02892, Apr. 2019
Project Page / Paper

CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo
ICASSP 2019
Project Page / Paper / Poster

AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo
ICASSP 2019
Project Page / Paper

WaveCycleGAN: Synthetic-to-Natural Speech Waveform Conversion Using Cycle-Consistent Adversarial Networks
Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Hirokazu Kameoka
SLT 2018
Project Page / Paper

StarGAN-VC: Non-parallel Many-to-Many Voice Conversion with Star Generative Adversarial Networks
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo
SLT 2018
Project Page / Paper

Automatic Speech Pronunciation Correction with Dynamic Frequency Warping-Based Spectral Conversion
Nobukatsu Hojo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko
EUSIPCO 2018
Paper

Generative Adversarial Network-based Approach to Signal Reconstruction from Magnitude Spectrograms
Keisuke Oyamada, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo, Hiroyasu Ando
EUSIPCO 2018
Paper

CycleGAN-VC: Non-parallel Voice Conversion Using Cycle-Consistent Adversarial Networks
(Alternative title: Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks)
Takuhiro Kaneko, Hirokazu Kameoka
EUSIPCO 2018
(Tech report, Nov. 2017)
Project Page / Paper

Generative Adversarial Image Synthesis with Decision Tree Latent Controller
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
CVPR 2018
Project Page / Paper / Poster

Non-native Speech Conversion with Consistency-Aware Recursive Network and Generative Adversarial Network
Keisuke Oyamada, Hirokazu Kameoka, Takuhiro Kaneko, Hiroyasu Ando, Kaoru Hiramatsu, Kunio Kashino
APSIPA ASC 2017
Paper

Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial Networks
Takuhiro Kaneko, Hirokazu Kameoka, Kaoru Hiramatsu, Kunio Kashino
Interspeech 2017
Paper

Generative Adversarial Network-based Postfilter for STFT Spectrograms
Takuhiro Kaneko, Shinji Takaki, Hirokazu Kameoka, Junichi Yamagishi
Interspeech 2017
Project Page / Paper

Generative Attribute Controller with Conditional Filtered Generative Adversarial Networks
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
CVPR 2017
Project Page / Paper / Supplemental

Generative Adversarial Network-based Postfilter for Statistical Parametric Speech Synthesis
Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo, Yusuke Ijima, Kaoru Hiramatsu, Kunio Kashino
ICASSP 2017
Paper

Adaptive Visual Feedback Generation for Facial Expression Improvement with Multi-task Deep Neural Networks
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
ACMMM 2016
Paper

Collective Activity Localization by Spatiality Preservation Search
Shigeyuki Odashima, Masamichi Shimosaka, Takuhiro Kaneko, Rui Fukui, Tomomasa Sato
Advanced Robotics 30(11-12), Mar. 2016
Paper

Modeling Risk Anticipation and Defensive Driving on Residential Roads with Inverse Reinforcement Learning
Masamichi Shimosaka, Takuhiro Kaneko, Kentaro Nishi
ITSC 2014
Project Page / Paper

A Fully Connected Model for Consistent Collective Activity Recognition in Videos
Takuhiro Kaneko, Masamichi Shimosaka, Shigeyuki Odashima, Rui Fukui, Tomomasa Sato
Pattern Recognition Letters 43, July 2014
Project Page / Paper

Consistent Collective Activity Recognition with Fully Connected CRFs
Takuhiro Kaneko, Masamichi Shimosaka, Shigeyuki Odashima, Rui Fukui, Tomomasa Sato
ICPR 2012 (Best Student Paper Award)
Project Page / Paper

Collective Activity Localization with Contextual Spatial Pyramid
Shigeyuki Odashima, Masamichi Shimosaka, Takuhiro Kaneko, Rui Fukui, Tomomasa Sato
ECCV Workshop 2012
Paper

Viewpoint Invariant Collective Activity Recognition with Relative Action Context
Takuhiro Kaneko, Masamichi Shimosaka, Shigeyuki Odashima, Rui Fukui, Tomomasa Sato
ECCV Workshop 2012
Project Page / Paper

Books / Interviews / Reviews

Fundamentals and Applications of Deep Generative Models
Takuhiro Kaneko
IEICE Communications Society Magazine, 18(3), Dec. 2024
Paper (Japanese)

Foundations, Advances, and Applications of GANs
Takuhiro Kaneko
The Journal of the Institute of Systems, Control and Information Engineers, 68(4), Apr. 2024
Contents (Japanese)

Unsupervised Depth and Bokeh Learning from Natural Images Using Aperture Rendering Generative Adversarial Networks
Takuhiro Kaneko
NTT Technical Review, 20(7), July 2022
Website / Paper

Learning 3D Information from 2D Images Using Aperture Rendering Generative Adversarial Networks toward Developing a Computer that "Understands the 3D World"
Takuhiro Kaneko
NTT Technical Review, 20(7), July 2022 (NTT Technical Journal, 34(5), May 2022)
Website / Paper
Website (Japanese) / Paper (Japanese)

Unsupervised Learning of Depth and Bokeh Effects from Natural Images using AR-GAN
Takuhiro Kaneko
Hikari Gijutsu Contact, 60(6), June 2022
Contents (Japanese)

Fukayomi Noise Robust GAN
Takuhiro Kaneko
Computer Vision Saizensen Summer 2022, Kyoritsu Shuppan, June 2022
Contents (Japanese)

Advancement and Application of Deep Generative Models in Image Synthesis and Voice Conversion
Takuhiro Kaneko
The Journal of the Institute of Electronics Information and Communication Engineers, 105(5), May 2022
Contents (Japanese)

Communication with Desired Voice
Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Hirokazu Kameoka
NTT Technical Review, 18(11), Nov. 2020 (NTT Technical Journal, 32(9), Sep. 2020)
Paper / Paper (Japanese)

Generative Adversarial Networks: Foundations and Applications
Takuhiro Kaneko
Acoustical Science and Technology, 39(3), May 2018
Paper / Paper (Japanese)

Generative Personal Assistance with Audio and Visual Examples
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
NTT Technical Review, 15(11), Nov. 2017 (NTT Technical Journal, 29(9), Sep. 2017)
Paper / Paper (Japanese)

Talks / Exhibitions

[Invited Talk] Structure from Collision (CVPR 2025)
Takuhiro Kaneko
MIRU 2025 (in Japanese)
Program (Japanese)

[Invited Talk] Deep Image Generation Based on Optics and Physics
Takuhiro Kaneko
The 2nd Spatial AI Seminar (in Japanese)
Program (Japanese) / Slides (Japanese)

[Tutorial] Deep Image Generation Based on Optics and Physics
Takuhiro Kaneko
SSII 2025 (in Japanese)
Program (Japanese) / Slides (Japanese)

[Invited Talk] Improving Geometry-Agnostic System Identification with Lagrangian Particle Optimization
Takuhiro Kaneko
ACCV Workshop 2024 (MLCSA 2024)
Website

[Invited Talk] Report on Participation in Interspeech 2024
Sei Ueno, Takuhiro Kaneko
Technical Committee on Speech (SP), Dec. 2024 (in Japanese)
Program (Japanese)

[Invited Talk] Deep Generative Models with Physical and Geometrical Constraints
Takuhiro Kaneko
Technical Committee on Pattern Recognition and Media Understanding (PRMU), Nov. 2024 (in Japanese)
Program (Japanese)

[Invited Talk] Improving Physics-Augmented Continuum Neural Radiance Field-Based Geometry-Agnostic System Identification with Lagrangian Particle Optimization (CVPR 2024)
Takuhiro Kaneko
FIT 2024 (in Japanese)
Program (Japanese)

[Exhibition] Changing your voice and speaking style in real-time
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko
NTT Communication Science Laboratories Open House 2024
Website (English) / Website (Japanese) / Nippon TV / Fuji TV / TV Asahi

[Tutorial] Foundations and Applications of Deep Generative Models
Takuhiro Kaneko
Advanced Image Seminar, 2024 (in Japanese)
Program (Japanese)

[Tutorial] Image Generation with GANs and Its Applications
Takuhiro Kaneko
The IPSJ Continuous Seminar, 2023 (in Japanese)
Program (Japanese)

[Invited Talk] Learning 3D Information from 2D Images with Deep Generative Models: Advancements in Optics-Based Deep Generative Models
Takuhiro Kaneko
The 72nd ODG Seminar, 2022 (in Japanese)
Program (Japanese)

[Invited Talk] Report on CVPR 2022 and Introduction of Our CVPR 2022 Papers
Takuhiro Kaneko, Shoichiro Takeda
AVM 2022 (in Japanese)
Program (Japanese)

[Invited Talk] Unsupervised Learning of Depth and Depth-of-Field Effect from Natural Images with Aperture Rendering Generative Adversarial Networks (CVPR 2021)
Takuhiro Kaneko
FIT 2022 (in Japanese)
Program (Japanese)

[Invited Talk] AR-NeRF: Unsupervised Learning of Depth and Defocus Effects from Natural Images with Aperture Rendering Neural Radiance Fields (CVPR 2022)
Takuhiro Kaneko
MIRU 2022 (in Japanese)
Program (Japanese)

[Exhibition] Flexible Bokeh Renderer Based on Predicted Depth
Takuhiro Kaneko
NTT Communication Science Laboratories Open House 2022
Website / Website (Japanese)

[Invited Talk] Image Synthesis and Voice Conversion Using Generative Adversarial Networks
Takuhiro Kaneko
IDW 2021
Program

[Invited Talk] Learning to Generate Images with Generative Adversarial Networks
Takuhiro Kaneko
RIMS Workshop 2021
Program

[Invited Talk] Blur, Noise, and Compression Robust Generative Adversarial Networks (CVPR 2021)
Takuhiro Kaneko, Tatsuya Harada
MIRU 2021 (in Japanese)
Program (Japanese)

[Invited Talk] Image Synthesis and Voice Conversion with Generative Adversarial Networks
Takuhiro Kaneko
OTOGAKU Symposium 2021 (in Japanese)
Program (Japanese)

[Invited Talk] Learning to Generate Images with Imperfect Supervision
Takuhiro Kaneko
ACCV Workshop 2020 (MLCSA 2020)
Program

[Tutorial] Foundations, Advances, and Applications of Generative Adversarial Networks
Takuhiro Kaneko
JSAI 2020 (in Japanese)
Program (Japanese) / Slides (Japanese)

[Invited Talk] Voice Conversion with Image-to-Image Translation and Sequence-to-Sequence Learning Approaches
Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo
SANE 2019
Video / Program

[Invited Talk] Generative Adversarial Image Synthesis with Decision Tree Latent Controller (CVPR 2018)¹
Label-Noise Robust Generative Adversarial Networks (CVPR 2019)²
¹Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
²Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
FIT 2019 (in Japanese)
Program (Japanese)

[Invited Talk] Label-Noise Robust Generative Adversarial Networks (CVPR 2019)
Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada
MIRU 2019 (in Japanese)
Program (Japanese)

[Tutorial] Foundations, Advances, and Applications of Generative Adversarial Networks
Takuhiro Kaneko
MIRU 2019 (in Japanese)
Abstract (Japanese) / Slides (Japanese)

[Tutorial] Foundations, Advances, and Applications of Generative Adversarial Networks: From Image Generation to Speech Synthesis and Voice Conversion
Takuhiro Kaneko
The 75th JSAI Seminar, 2018 (in Japanese)
Abstract (Japanese)

[Invited Talk] Generative Adversarial Networks: Foundations and Applications
Takuhiro Kaneko
JAMIT 2018 (in Japanese)
Abstract (Japanese)

[Exhibition] Creating Favorite Images with Selective Decisions: Hierarchical Image Analysis and Synthesis with DTLC-GAN
Takuhiro Kaneko
NTT Communication Science Laboratories Open House 2018
Poster / Poster (Japanese)

[Exhibition] Free-Feature-Point Image Generation: Interactive and Flexible Image Generation with Deep Learning
Takuhiro Kaneko
NTT R&D Forum 2018
Poster / Poster (Japanese)

[Invited Talk] Generative Attribute Controller with Conditional Filtered Generative Adversarial Networks (CVPR 2017)
Takuhiro Kaneko, Kaoru Hiramatsu, Kunio Kashino
MIRU 2017 (in Japanese)
Program (Japanese)

[Exhibition] Generative Personal Assistance with Audio and Visual Examples: Deep Learning Opens the Way to Innovative Media Generation
Takuhiro Kaneko
NTT Communication Science Laboratories Open House 2017
Paper / Abstract / Poster
Paper (Japanese) / Abstract (Japanese) / Poster (Japanese)

Lectures

Image Generation
Intelligent Informatics
Graduate School of Information Science and Technology, The University of Tokyo, May 28th, 2026

Deep Learning × Optical and Physical Models
Media Content II
Faculty of Engineering, The University of Tokyo, Dec. 4th, 2025

Generative Models
Intelligent Informatics
Graduate School of Information Science and Technology, The University of Tokyo, May 23rd, 2024

Generative Models, GANs
Intelligent Informatics
Graduate School of Information Science and Technology, The University of Tokyo, May 19th, 2022

Generative Models, GANs
Intelligent Informatics
Graduate School of Information Science and Technology, The University of Tokyo, June 25th, 2020

Awards & Honors

The 54th Awaya Prize Young Researcher Award
The Acoustical Society of Japan, Sep. 2023

The 38th Telecom System Technology Award
The Telecommunication Advancement Foundation, Mar. 2023

ICML 2022 Outstanding Reviewer (Top 10%)
ICML 2022, July 2022

Graduate Student Representative of Graduate School
Graduate School of Information Science and Technology, The University of Tokyo, Mar. 2020

Dean's Award for Best Doctoral Thesis
Graduate School of Information Science and Technology, The University of Tokyo, Mar. 2020

ICCV 2019 Outstanding Reviewer (Top 91)
ICCV 2019, Oct. 2019

IEICE ISS Young Researcher's Award in Speech Field
IEICE, Aug. 2017

ICPR 2012 Best Student Paper Award
ICPR 2012, Nov. 2012

The Hatakeyama Award
The Japan Society of Mechanical Engineers, Mar. 2012
