Media Intelligence

Recovering perfectly-sounding speech by denoising

- High-quality speech enhancement using vast amounts of examples -

Abstract

A considerable number of studies have addressed speech enhancement (SE) algorithms to remove interference signals (e.g., noise, reverberation) from input speech signals. However, resultant signals often contain distortions such as residual interference and artifacts, which degrade overall audible quality.
Our proposed method establishes an SE mechanism that does not induce such distortions and utilizes vast amounts of clean training raw data (i.e., examples). It first seeks an example that best matches underlying clean speech sequence in test data and replaces them with examples. Resultant signals are generated essentially by collecting clean examples, and thus there is no place for interference to remain in output signals, which leads to high-quality enhancement.