Media Intelligence

Extracting essential information from sounds

- Advances in distant speech recognition by deep learning -

Abstract

We are working on conversational speech recognition and communication scene analysis in real-world sound environments. Deep learning (DL) is an essential technique for realizing both. Beyond speech recognition itself, where DL is already widely used, we have proposed DL-based methods for other speech processing tasks, namely speech enhancement and acoustic event detection. Together, these methods achieve excellent recognition performance for conversational speech and expand the usability of voice interfaces in real, noisy everyday scenes.

Presenters

Araki Shoko
Media Information Laboratory
Masakiyo Fujimoto
Media Information Laboratory
Marc Delcroix
Media Information Laboratory
Takuya Yoshioka
Media Information Laboratory
Espi Miquel
Media Information Laboratory
Atsunori Ogawa
Media Information Laboratory
Keisuke Kinoshita
Media Information Laboratory
Nobutaka Ito
Media Information Laboratory
Tomohiro Nakatani
Media Information Laboratory
Taichi Asami
Media Intelligence Laboratories
Takanori Ashihara
Media Intelligence Laboratories