Visually-grounded speech

Best paper candidates at ACM Multimedia 2022!!

We are very proud that our paper “ConceptBeam: Concept Driven Target Speech Extraction” was one of the best paper candidates at ACM Multimedia 2022.

Yasunori Ohishi

Oct 11, 2022 1 min read Crossmodal, Visually-grounded speech

Best paper candidates at ACM Multimedia 2022!!

ConceptBeam

Target speech extraction based on “concept” or semantic information.

Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada, Kunio Kashino

ConceptBeam

A paper presented at ACM Multimedia 2022

We are pleased to announce that our paper “ConceptBeam: Concept Driven Target Speech Extraction” by Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada, and Kunio Kashino has been accepted to ACM Multimedia 2022 (acceptance rate = 690/2473 = 27.9%).

Yasunori Ohishi

Jul 1, 2022 1 min read Crossmodal, Visually-grounded speech

A paper presented at ACM Multimedia 2022

Our dataset publicly available

We are excited to announce that our dataset, The Places audio caption (Japanese) 100K corpus, is now available. This speech corpus was collected to investigate the learning of spoken language (words, sub-word units, higher-level semantics, etc.) from visually-grounded speech.

Yasunori Ohishi

Nov 18, 2021 1 min read Dataset, Visually-grounded speech, Crossmodal