- Probabilistic fusion of different recording devices -
When we capture speech in real environments, such as the meeting scenario depicted below, the recorded signal inevitably contains interference (e.g., speech from non-target speakers and ambient noise) that overlaps the target speech. Although many multi-channel speech separation techniques have been proposed over the past decades, they tend to fail in distributed microphone scenarios because the recording devices differ in their characteristics (e.g., their sampling frequencies do not match). In this presentation, we introduce speech separation techniques that work in distributed microphone scenarios by fusing the hypotheses from different recording devices in a probabilistic manner, so that the devices work collaboratively.
Please click the thumbnail image to open the full-size PDF file.
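As a rough illustration of what fusing hypotheses from different devices in a probabilistic manner can look like, the sketch below combines per-device source posteriors with a weighted product rule in the log domain. This is only an assumed, generic fusion rule, not necessarily the method used in the presentation. The function name `fuse_posteriors`, the `weights` parameter, and the toy numbers are hypothetical, and the sketch assumes the per-device posteriors have already been aligned in time and frequency (i.e., any sampling frequency mismatch has been compensated beforehand).

```python
import numpy as np


def fuse_posteriors(per_device, weights=None, eps=1e-12):
    """Fuse per-device source posteriors with a weighted product rule.

    per_device: array of shape (n_devices, n_sources, n_points); for each
        device and each time-frequency point, the values over n_sources
        sum to 1.
    weights: optional per-device reliabilities (hypothetical parameter;
        equal weights are assumed when omitted).
    Returns an array of shape (n_sources, n_points) whose columns sum to 1.
    """
    p = np.asarray(per_device, dtype=float)
    n_devices = p.shape[0]
    w = np.ones(n_devices) if weights is None else np.asarray(weights, dtype=float)
    # Weighted sum of log-posteriors over devices (product rule in the log domain).
    log_fused = np.tensordot(w, np.log(p + eps), axes=(0, 0))
    # Subtract the per-point maximum for numerical stability, then renormalise.
    log_fused -= log_fused.max(axis=0, keepdims=True)
    fused = np.exp(log_fused)
    return fused / fused.sum(axis=0, keepdims=True)


# Toy example: 2 devices, 2 sources (target, interferer), 3 time-frequency points.
device_a = [[0.9, 0.6, 0.5], [0.1, 0.4, 0.5]]
device_b = [[0.8, 0.3, 0.5], [0.2, 0.7, 0.5]]
fused = fuse_posteriors([device_a, device_b])
print(np.round(fused, 3))
```

At the first point both devices favour the target, so the fused posterior is more confident than either device alone. At the second point the devices disagree, and the fused result leans toward the more confident device. The fused posteriors could then serve as a soft time-frequency mask for separation.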
M. Souden, K. Kinoshita, T. Nakatani, "An integration of source location cues for speech clustering in distributed microphone arrays," in Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP), (to appear), 2013.