We are working on conversational speech recognition and communication scene analysis in real-world sound environments. Deep learning (DL) is essential to realizing both, and we have proposed a range of DL-based speech processing methods. Beyond speech recognition, where DL is already widely used, these include speech enhancement and acoustic event detection. Together, these methods achieve high recognition performance for conversational speech and extend the usability of voice interfaces to noisy, everyday scenes.

Masakiyo Fujimoto
Media Information Laboratory
Takuya Yoshioka
Media Information Laboratory
Miquel Espi
Media Information Laboratory
Atsunori Ogawa
Media Information Laboratory
Nobutaka Ito
Media Information Laboratory
Taichi Asami
Media Intelligence Laboratories
Takanori Ashihara
Media Intelligence Laboratories