Demonstration of speech dereverberation based on maximum-likelihood estimation with time-varying Gaussian source model

Demonstration of Time-domain Speech Dereverberation based on Maximum Likelihood Estimation with Time-varying Gaussian Source Model

Reference

Nakatani, T., Juang, B.H., Yoshioka, T., Kinoshita, K., Delcroix, M., and Miyoshi, M., "Speech dereverberation based on maximum likelihood estimation with time-varying Gaussian source model," submitted to IEEE Trans. Audio, Speech, and Language Processing.

Task

Dereverberation of female/male utterances
A dereverberation filter was estimated from each reverberant sound shown below, and each dereverberated sound was obtained by applying that filter to the same reverberant sound.
Length of the dereverberation filter in each channel: 3000 taps (fixed)
Sampling frequency: 8 kHz
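
The filtering step described above can be sketched as follows. This is only an illustration of applying an already estimated filter, not of the estimation itself. It assumes a multichannel FIR structure in which each channel is filtered with its own taps and the results are summed; the function name `apply_dereverb_filter` and the toy tap values are hypothetical.

```python
def apply_dereverb_filter(channels, filters):
    """Filter each channel with its own FIR taps and sum the results.

    channels: list of equal-length sample lists, one per microphone.
    filters:  list of FIR tap lists, one per microphone (hypothetical values).
    The output is truncated to the input length.
    """
    if len(channels) != len(filters):
        raise ValueError("need exactly one filter per channel")
    n_samples = len(channels[0])
    out = [0.0] * n_samples
    for x, w in zip(channels, filters):
        if len(x) != n_samples:
            raise ValueError("all channels must have the same length")
        for n in range(n_samples):
            acc = 0.0
            # Causal convolution: only taps that reach back to sample 0 are used.
            for k in range(min(len(w), n + 1)):
                acc += w[k] * x[n - k]
            out[n] += acc
    return out


# Toy two-channel example with very short filters (the demo uses 3000 taps).
demo = apply_dereverb_filter(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[1.0, 0.5], [2.0]],
)
print(demo)
```

With 3000 taps per channel, the direct double loop is slow in pure Python; an FFT-based convolution would be the usual choice in practice.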

Source models used for performance comparison

WG	Stationary white Gaussian model
TVWG	Time-varying white Gaussian model
TVAR	Time-varying autoregressive Gaussian model
TVARC	Time-varying autoregressive Gaussian model with codebook prior
TVGC	Time-varying Gaussian source model with codebook prior
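
To illustrate the difference between a stationary and a time-varying white Gaussian model (WG vs. TVWG), the following sketch compares the negative log-likelihood of a signal under one global variance and under one variance per short frame. It is not the estimation procedure of the referenced paper. The frame-wise constant variance, the frame length, the variance floor, and all function names are assumptions made for this example.

```python
import math

VARIANCE_FLOOR = 1e-12  # assumed floor to avoid log(0) on silent frames


def gaussian_nll(samples, variances):
    """Negative log-likelihood of zero-mean Gaussian samples with per-sample variances."""
    return sum(
        0.5 * (math.log(2.0 * math.pi * v) + s * s / v)
        for s, v in zip(samples, variances)
    )


def stationary_variances(samples):
    """WG-style: a single variance for the whole signal (its mean power)."""
    v = max(sum(s * s for s in samples) / len(samples), VARIANCE_FLOOR)
    return [v] * len(samples)


def framewise_variances(samples, frame_len):
    """TVWG-style: one variance per frame (the mean power of that frame)."""
    out = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        v = max(sum(s * s for s in frame) / len(frame), VARIANCE_FLOOR)
        out.extend([v] * len(frame))
    return out


# A quiet frame followed by a loud frame, mimicking the level changes of speech.
signal = [0.1, -0.1, 0.1, -0.1, 1.0, -1.0, 1.0, -1.0]
nll_wg = gaussian_nll(signal, stationary_variances(signal))
nll_tvwg = gaussian_nll(signal, framewise_variances(signal, frame_len=4))
print(nll_wg, nll_tvwg)
```

For a signal whose level changes over time, the frame-wise model yields the lower negative log-likelihood, which is the intuition behind using a time-varying source model for speech.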

Demonstration sounds

Dereverberation of reverberant female speech with reverberation time (RT60) of 0.5 sec.

Source model	Source sound	Reverberant sound	Dereverberated sound
WG
TVWG
TVAR
TVARC
TVGC

Dereverberation of reverberant female speech using TVGC under different reverberation time conditions.

The length of the dereverberation filter was set at 3000 taps, which is much larger than that required for the strict inverse filtering under the 0.1 sec reverberation time condition, and much smaller than that required for the strict inverse filtering under the 1.0 sec reverberation time condition.

RT60 (sec)	Source sound	Reverberant sound	Dereverberated sound
0.1
0.5
1.0
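
As a rough check of the filter-length remark above, the following sketch converts each reverberation time to samples at 8 kHz and compares it with the 3000-tap filter. Using RT60 in samples as a proxy for the length relevant to strict inverse filtering is an assumption of this example; the exact required length depends on the room impulse responses and the number of channels.

```python
FS_HZ = 8000        # sampling frequency stated in the task description
FILTER_TAPS = 3000  # filter length per channel stated in the task description


def rt60_in_samples(rt60_sec, fs_hz=FS_HZ):
    """Number of samples spanned by the reverberation time."""
    return round(rt60_sec * fs_hz)


# Time span covered by the filter itself, in seconds.
filter_span_sec = FILTER_TAPS / FS_HZ

for rt60 in (0.1, 0.5, 1.0):
    n = rt60_in_samples(rt60)
    print(f"RT60 {rt60} s = {n} samples; filter/RT60 ratio = {FILTER_TAPS / n:.3f}")
```

The filter spans 0.375 sec, which lies between the 0.1 sec and 1.0 sec conditions.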

Dereverberation of noisy reverberant female speech using TVGC.

The reverberation time (RT60) was 0.5 sec, and the power ratio of the reverberant signal to the noise was 30 dB. The noise was stationary white Gaussian noise. Post-processing was also performed to eliminate the musical noise that remained in the dereverberated signal.

RT60 (sec)	Noisy reverberant sound	Sound denoised using Wiener filter	Denoised and dereverberated sound	Post-processed
0.5
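
The noise condition described above (stationary white Gaussian noise at a 30 dB signal-to-noise power ratio) can be reproduced along the following lines. This is a minimal sketch, not the procedure used to create the demo sounds. The function names, the fixed seed, and the stand-in sine signal are assumptions.

```python
import math
import random


def mean_power(x):
    """Average power of a sample sequence."""
    return sum(v * v for v in x) / len(x)


def add_white_noise(signal, snr_db, seed=0):
    """Add white Gaussian noise scaled so that the signal-to-noise power ratio is snr_db."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    target_noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    # Scale the drawn noise so its measured power matches the target.
    gain = math.sqrt(target_noise_power / mean_power(noise))
    return [s + gain * n for s, n in zip(signal, noise)]


# Stand-in for a reverberant signal: one second of a 440 Hz tone at 8 kHz.
reverberant = [math.sin(2.0 * math.pi * 440.0 * t / 8000.0) for t in range(8000)]
noisy = add_white_noise(reverberant, snr_db=30.0)
```

Scaling the drawn noise to the measured signal power gives the requested ratio for this particular realization rather than only on average.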

Dereverberation of male speech captured by a stereo voice recorder in a noisy reverberant conference room.

Note that no noise reduction was performed for this example, in order to show the robustness of our dereverberation method; a certain level of noise therefore remains audible after dereverberation.

RT60/SNR	Source sound	Reverberant sound	Dereverberated sound
Unknown	N/A
