Research Talk

AI hears your voice as if it were "right next to you"
Audio processing framework for separating distant sounds with close-microphone quality
Tomohiro Nakatani
Signal Processing Research Group, Media Information Laboratory

Abstract

When distant microphones capture speech, they also record reverberation, background noise, and the voices of interfering speakers, which degrade the quality of the recorded signal. This talk will introduce advanced speech enhancement techniques for extracting high-quality speech from such degraded recordings, as if the speech had been captured by close-talking microphones. The techniques include a unified model that performs joint dereverberation, denoising, and source separation, as well as a switching mechanism that enables high-quality processing with a small number of microphones. Integration with deep learning-based speech enhancement, e.g., SpeakerBeam, will also be discussed.
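
As a concrete illustration of the kind of processing involved, below is a minimal sketch of multichannel linear-prediction dereverberation in the short-time Fourier transform (STFT) domain, one common building block of such enhancement pipelines. It is not the specific unified model presented in the talk; the function name, parameter values, and array shapes are illustrative assumptions.

import numpy as np

def wpe_dereverb(Y, taps=10, delay=3, iterations=3, eps=1e-10):
    """Weighted-prediction-error-style dereverberation (illustrative sketch).

    Y: observed multichannel STFT, shape (freq_bins, frames, channels).
    Returns an STFT of the same shape with late reverberation reduced.
    Assumes frames > delay + taps.
    """
    F, T, M = Y.shape
    X = np.empty_like(Y)
    for f in range(F):
        Yf = Y[f]                                    # (T, M)
        # Stack delayed past frames used to predict the late reverberation.
        Ytilde = np.zeros((T, taps * M), dtype=Y.dtype)
        for k in range(taps):
            shift = delay + k
            Ytilde[shift:, k * M:(k + 1) * M] = Yf[:T - shift]
        Xf = Yf.copy()
        for _ in range(iterations):
            # Time-varying power of the current estimate, averaged over channels.
            lam = np.maximum(np.mean(np.abs(Xf) ** 2, axis=1), eps)   # (T,)
            Yw = Ytilde.conj() / lam[:, None]
            R = Yw.T @ Ytilde                        # weighted correlation matrix
            P = Yw.T @ Yf                            # weighted cross-correlation
            G = np.linalg.solve(R + eps * np.eye(taps * M), P)
            # Subtract the predicted late reverberation from the observation.
            Xf = Yf - Ytilde @ G
        X[f] = Xf
    return X

# Example: a hypothetical 4-channel recording with 257 frequency bins and 500 frames.
rng = np.random.default_rng(0)
Y = rng.standard_normal((257, 500, 4)) + 1j * rng.standard_normal((257, 500, 4))
X = wpe_dereverb(Y)
print(X.shape)   # (257, 500, 4)

In practice this dereverberation stage would be combined with denoising and source separation, and the power estimate could come from a neural network rather than the iterative update shown here.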

Speaker
Tomohiro Nakatani
Signal Processing Research Group, Media Information Laboratory