A paper presented at Interspeech 2022

Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval

Yasunori Ohishi

Jun 14, 2022 · Audio captioning

We are pleased to announce that our paper "Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval" by Daiki Takeuchi, Yasunori Ohishi, Daisuke Niizumi, Noboru Harada, and Kunio Kashino has been accepted to Interspeech 2022.

This paper proposes a content-based audio retrieval method that can retrieve target audio that is similar to, but slightly different from, the query audio. It does so by introducing auxiliary textual information that describes the difference between the query and the target audio. Whereas conventional content-based audio retrieval is limited to audio that is similar to the query audio, the proposed method can adjust the retrieval range by adding an embedding of the auxiliary text query-modifier to the embedding of the query sample audio in a shared latent space.
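To make the idea of adding the two embeddings concrete, here is a minimal sketch in Python. It is not the paper's implementation: the function names (`l2_normalize`, `cosine`, `retrieve`), the use of cosine similarity for ranking, and the toy 3-dimensional vectors are all assumptions made for illustration. In practice the embeddings would come from trained audio and text encoders that map into the shared latent space.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        return list(vec)
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_audio_emb, modifier_text_emb, candidate_embs, top_k=3):
    """Rank candidates against (query audio embedding + text modifier embedding).

    Assumption: all embeddings already live in the same latent space, and
    cosine similarity is used for ranking (a choice made for this sketch).
    Returns a list of (candidate_index, score), best match first.
    """
    modified = [q + m for q, m in zip(query_audio_emb, modifier_text_emb)]
    scored = [(cosine(modified, c), i) for i, c in enumerate(candidate_embs)]
    scored.sort(key=lambda s: -s[0])
    return [(i, score) for score, i in scored[:top_k]]


# Toy data: hypothetical embeddings, not taken from the paper.
query = [1.0, 0.0, 0.0]      # embedding of the query audio
modifier = [0.0, 1.0, 0.0]   # embedding of the text describing the difference
candidates = [
    [1.0, 0.0, 0.0],  # audio identical to the query
    [1.0, 1.0, 0.0],  # audio matching the query plus the described difference
    [0.0, 0.0, 1.0],  # unrelated audio
]

ranked = retrieve(query, modifier, candidates, top_k=2)
```

With these toy vectors, the candidate that combines the query with the described difference (index 1) ranks above the candidate that is identical to the query (index 0), which is the shift in retrieval range the text modifier is meant to produce.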

