While automatic speech recognition (ASR) technology has improved greatly in recent years and is increasingly part of our daily lives, the range of situations in which it can be used is still limited. For example, it is still difficult for ASR systems to perform reliably in noisy environments such as streets, cafés, and exhibition halls, especially when the microphone is not close to the user's mouth, so that the recorded speech signal contains noise and reverberation. In this poster, we introduce some of the key fundamental technologies we have recently developed to address these issues, such as distortionless speech enhancement and deep-learning-based ASR. With these technologies we won the CHiME-3 Challenge, an international program for evaluating the performance of speech recognizers in noisy outdoor public areas. In the future, these technologies will be key to improving the quality of ASR on smartphones and to developing communication robots.