AI that attends to the sounds you want to listen to ｜Exhibition Program｜NTT Communication Science Laboratories OPEN HOUSE 2023

Exhibition Program

Science of Media Information

09	AI that attends to the sounds you want to listen to Deep learning based selective hearing of arbitrary sounds

Abstract

Humans can focus on listening to a desired sound in complex sound scenes consisting of many co-occurring sounds to retrieve important information, an ability known as selective hearing. In this work, we introduce SoundBeam, a deep-learning-based approach for computational selective hearing. SoundBeam extends our prior work on selective hearing of speech based on the characteristics of the target speaker’s voice to arbitrary sounds. With SoundBeam, users can control which sound to listen to depending on their preference or the environment, e.g., hearing a klaxon when crossing a street but not when working at home, thus creating comfortable and safe sound environments.

AI that attends to the sounds you want to listen to

References

[1] M. Delcroix, K. Zmolikova, K. Kinoshita, S. Araki, A. Ogawa, and T. Nakatani, “SpeakerBeam: A New Deep Learning Technology for Extracting Speech of a Target Speaker Based on the Speaker’s Voice Characteristics,” NTT Technical Review, Vol. 16, No. 11, pp. 19–24, Nov. 2018.

[2] T. Ochiai, M. Delcroix, Y. Koizumi, H. Ito, K. Kinoshita, S. Araki, “Listen to what you want: Neural network-based universal sound selector,” in Proc. Interspeech, pp. 2718 - 2722, 2020.

[3] M. Delcroix, J. B. Vázquez, T. Ochiai, K. Kinoshita, Y. Ohishi, S. Araki, "SoundBeam: Target Sound Extraction Conditioned on Sound-Class Labels and Enrollment Clues for Increased Performance and Continuous Learning," IEEE/ACM Trans. on Audio, Speech, and Language Processing, Vol. 31, pp. 121-136, 2023.

Poster

Please click the icon to open the full-size PDF file.

Contact

Marc Delcroix

Signal Processing Research Group, Media Information Laboratory

Click here for other research exhibits

01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16

AI that attends to the sounds you want to listen to

Deep learning based selective hearing of arbitrary sounds

Contact

Download