Extracting hidden information from speech and audio signals

- Generative modeling approach to speech and audio signal processing -

Hirokazu Kameoka, Media Information Laboratory

Abstract

Many audio signal processing problems can be viewed as inverse problems: estimating an unknown “cause” (the information or quantity of interest) from a known “consequence” (the observed signal). Inverse problems generally have infinitely many solutions, which makes it difficult to identify the correct “cause” from the observed consequence alone. One key to success is therefore to incorporate relevant empirical knowledge or statistics about the “cause” into the inference scheme. We address a variety of inverse problems in speech and audio processing, including audio source separation, speech enhancement, speech prosodic feature extraction, and music transcription, by taking a probabilistic approach to modeling and estimating the generating processes of speech and audio signals.
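
As a toy illustration of the point about infinitely many solutions, the sketch below mixes two unknown source values into a single observation and then uses prior statistics to pick one estimate. It is only a minimal example under stated assumptions: the zero-mean Gaussian prior, the function name `map_estimate`, and the parameters `prior_var` and `noise_var` are hypothetical choices made for illustration and are not taken from the talk, whose actual generative models are not specified here.

```python
import numpy as np

def map_estimate(y, A, prior_var, noise_var=0.0):
    """MAP estimate of x in y = A x + noise.

    Assumes (for illustration only) a zero-mean Gaussian prior on x with
    diagonal covariance `prior_var` and Gaussian noise of variance `noise_var`.
    """
    A = np.atleast_2d(np.asarray(A, dtype=float))
    y = np.atleast_1d(np.asarray(y, dtype=float))
    V = np.diag(np.asarray(prior_var, dtype=float))
    # Covariance of the observation implied by the prior and the noise.
    S = A @ V @ A.T + noise_var * np.eye(A.shape[0])
    # Posterior mean, which is also the MAP estimate in the Gaussian case.
    return V @ A.T @ np.linalg.solve(S, y)

# One observation that is the sum of two unknown source values.
A = np.array([[1.0, 1.0]])
y = np.array([3.0])

# Very different "causes" explain the same "consequence" equally well.
for candidate in ([3.0, 0.0], [1.0, 2.0], [-5.0, 8.0]):
    assert np.allclose(A @ np.array(candidate), y)

# Prior knowledge: the first source is typically stronger than the second.
x_hat = map_estimate(y, A, prior_var=[4.0, 1.0])
print(x_hat)
```

With these assumed variances, the observation is split between the two sources in proportion to their prior variances, so the prior alone decides which of the infinitely many consistent solutions is returned.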
