Computational selective hearing based on deep learning


When several people speak at the same time, humans can focus on the one speaker they want to hear, an ability known as selective hearing. Current computers and voice assistant devices, however, are not yet good at this kind of listening. Our research aims to realize computational selective hearing, which enables a computer to focus on a target speaker and ignore the other speakers.
Building on our recently developed context-adaptive neural network, we propose informing the network of the target speaker's voice characteristics so that it extracts only that speaker's voice. This technology paves the way for more natural voice assistants that can focus on a target speaker in the same way that people do.
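
To make the idea of informing a network about the target speaker more concrete, the following is a minimal sketch in Python and not the actual system. It assumes that the speaker's voice characteristics are summarized as an embedding averaged over a short enrollment recording, and that this embedding rescales the hidden units of a network that estimates a mask over the mixture. The function names, layer sizes, and multiplicative adaptation are illustrative assumptions, and the weights are random rather than trained, so the output shows only the data flow.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def random_matrix(rows, cols, rng):
    """Untrained placeholder weights (assumption: a real system learns these)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def speaker_embedding(enrollment_frames, W_spk):
    """Summarize an enrollment recording as one vector by averaging
    the per-frame projections (hypothetical auxiliary network)."""
    projected = [matvec(W_spk, frame) for frame in enrollment_frames]
    n = len(projected)
    return [sum(p[i] for p in projected) / n for i in range(len(W_spk))]


def extract_target(mixture_frames, embedding, W_in, W_out):
    """Estimate a per-bin mask in (0, 1), conditioned on the speaker
    embedding, and apply it to each frame of the mixture."""
    output = []
    for frame in mixture_frames:
        hidden = [math.tanh(h) for h in matvec(W_in, frame)]
        # Adaptation step: the speaker embedding rescales each hidden unit.
        adapted = [h * e for h, e in zip(hidden, embedding)]
        mask = [sigmoid(z) for z in matvec(W_out, adapted)]
        output.append([m * x for m, x in zip(mask, frame)])
    return output


rng = random.Random(0)
N_BINS, N_HIDDEN = 4, 6  # toy sizes chosen for illustration
W_spk = random_matrix(N_HIDDEN, N_BINS, rng)
W_in = random_matrix(N_HIDDEN, N_BINS, rng)
W_out = random_matrix(N_BINS, N_HIDDEN, rng)

# Toy magnitude spectra: a 3-frame mixture and two 5-frame enrollments.
mixture = [[rng.random() for _ in range(N_BINS)] for _ in range(3)]
enroll_a = [[rng.random() for _ in range(N_BINS)] for _ in range(5)]
enroll_b = [[rng.random() for _ in range(N_BINS)] for _ in range(5)]

est_a = extract_target(mixture, speaker_embedding(enroll_a, W_spk), W_in, W_out)
est_b = extract_target(mixture, speaker_embedding(enroll_b, W_spk), W_in, W_out)
```

The same mixture yields a different estimate depending on which enrollment recording is supplied, which is the behavior the text describes: one network, steered toward a speaker by that speaker's voice characteristics.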


