Cleaning up speech from noisy, reverberant recordings
Ensemble of multi-stream diffusion model enhances speech
Recent advances in generative AI have enabled high-quality speech enhancement. This research introduces a speech enhancement method that uses the diffusion model, one of the most powerful generative AI models, to remove noise and reverberation from speech recordings [1]. Our approach integrates multiple conventional speech enhancement techniques into a diffusion model-based framework, significantly improving performance [2]. In addition, we are the first to demonstrate that averaging multiple outputs from the diffusion model, a technique we call “ensemble inference”, greatly enhances performance. In the future, this technology will enable high-quality speech recording even in noisy environments, making voices sound as if they were recorded in a studio. This advance is expected to benefit a wide range of speech applications, such as collecting high-quality audio data in everyday environments and enabling more comfortable remote meetings.
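As a rough illustration of the ensemble-inference idea (not the authors' implementation), the sketch below assumes a hypothetical `enhance_fn` that runs one stochastic pass of a diffusion-based enhancer and returns an enhanced waveform; because diffusion sampling is random, repeated passes give slightly different estimates, and the ensemble output is simply their average.

```python
import numpy as np

def ensemble_enhance(noisy_speech, enhance_fn, num_passes=5, seed=0):
    """Ensemble inference: average several stochastic diffusion outputs.

    `enhance_fn(noisy_speech, rng)` is a placeholder (an assumption, not a
    real API) for any diffusion-model-based enhancer that returns one
    enhanced waveform per call.
    """
    rng = np.random.default_rng(seed)
    outputs = []
    for _ in range(num_passes):
        # Each pass starts the reverse diffusion from a different random draw,
        # so each call yields a slightly different enhanced signal.
        outputs.append(enhance_fn(noisy_speech, rng))
    # The ensemble estimate is the sample-wise mean over all passes.
    return np.mean(np.stack(outputs, axis=0), axis=0)
```

Averaging over several passes smooths out the run-to-run variability of diffusion sampling, which is the intuition behind the reported performance gain; the number of passes trades off compute against quality.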

[1] N. Kamo, M. Delcroix, T. Nakatani, “Target speech extraction with conditional diffusion model,” in Proc. INTERSPEECH, pp. 176-180, 2023.
[2] T. Nakatani, N. Kamo, M. Delcroix, S. Araki, “Multi-stream diffusion model for probabilistic integration of model-based and data-driven speech enhancement,” in Proc. IWAENC, pp. 65-69, 2024.
Naoyuki Kamo, Signal Processing Research Group, Media Information Laboratory