Science of Media Information

Recognizing your voice even in noisy environments

- Advances in distant speech recognition technologies -

Abstract

Although automatic speech recognition (ASR) technology has improved greatly in recent years and is increasingly part of our daily lives, the situations in which it can be used are still limited. For example, ASR systems still struggle to perform reliably in noisy environments such as streets, cafés, and exhibition halls, especially when the microphone(s) are not close to the user's mouth, so that the recorded speech signal contains noise and reverberation. In this poster, we introduce some of the key fundamental technologies we have recently developed to address these issues, such as distortionless speech enhancement and deep learning-based ASR. With these technologies we won the CHiME-3 Challenge, an international program for evaluating the performance of speech recognizers in noisy outdoor public areas. In the future, these technologies will be key to improving the quality of ASR on smartphones and to developing communication robots.

Presenters

Shoko Araki
Media Information Laboratory
Keisuke Kinoshita
Media Information Laboratory
Atsunori Ogawa
Media Information Laboratory
Marc Delcroix
Media Information Laboratory
Takuya Yoshioka
Media Information Laboratory
Nobutaka Ito
Media Information Laboratory
Takuya Higuchi
Media Information Laboratory
Dung Tran
Media Information Laboratory
Tomohiro Nakatani
Media Information Laboratory