Media Information Laboratory

Message

Noboru Harada
Director, Media Information Laboratory

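The Media Information Laboratory pursues research on media information processing built on three pillars: media recognition technology, signal processing technology, and fundamental information theory.
Technology has advanced so rapidly in recent years that it seems as though anything someone has once imagined must already have been realized at that very moment. In media information processing, the field we work in, the distance between basic research and applied research is also shrinking steadily.
In this environment, we ground our work in principles and theory while also learning from real-world environments and experience, and we aim to create technologies that contribute to solving social issues and realizing a prosperous society.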

News

2020.01.27 Announcement

Twenty-nine papers from NTT have been accepted to ICASSP 2020 (International Conference on Acoustics, Speech and Signal Processing).

List of accepted papers:

  • C. Boeddeker, T. Nakatani, K. Kinoshita, and R. Haeb-Umbach, "Jointly Optimal Dereverberation and Beamforming," Lecture
  • M. Delcroix, T. Ochiai, K. Zmolikova, K. Kinoshita, N. Tawara, T. Nakatani, and S. Araki, "Improving Speaker Discrimination of Target Speech Extraction with Time-domain SpeakerBeam," Poster
  • S. Emura, H. Sawada, S. Araki, and N. Harada, "A Frequency-domain BSS Method based on L1 Norm, Unitary Constraint, and Cayley Transform," Lecture
  • M. Ihori, A. Takashima, and R. Masumura, "Large-Context Pointer-Generator Networks for Spoken-to-Written Style Conversion," Poster
  • R. Ikeshita, T. Nakatani, and S. Araki, "Overdetermined Independent Vector Analysis," Poster
  • K. Imoto, N. Tonami, Y. Koizumi, M. Yasuda, R. Yamanishi, and Y. Yamashita, "Sound Event Detection By Multitask Learning of Sound Events and Scenes with Soft Scene Labels," Poster
  • M. Kawanaka, Y. Koizumi, R. Miyazaki, and K. Yatabe, "Stable Training of DNN for Speech Enhancement based on Perceptually-Motivated Black-box Cost Function," Poster
  • K. Kinoshita, T. Ochiai, M. Delcroix, and T. Nakatani, "Improving Noise Robust Automatic Speech Recognition with Single-channel Time-domain Enhancement Network," Poster
  • K. Kinoshita, M. Delcroix, S. Araki, and T. Nakatani, "Tackling Real Noisy Reverberant Meetings with All-neural Source Separation, Counting, and Diarization System," Poster
  • Y. Koizumi, K. Yatabe, M. Delcroix, Y. Masuyama, and D. Takeuchi, "Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention," Lecture
  • Y. Koizumi, M. Yasuda, S. Murata, S. Saito, H. Uematsu, and N. Harada, "SPIDERnet: Attention Network for One-shot Anomaly Detection in Sounds," Poster
  • T. Kondo, K. Fukushige, N. Takamune, D. Kitamura, H. Saruwatari, R. Ikeshita, and T. Nakatani, "Convergence-guaranteed Independent Positive Semidefinite Tensor Analysis based on Student's t Distribution," Poster
  • S. Kurihara, M. Fukui, S. Shimauchi, and N. Harada, "Objective Quality Estimation Using PESQ for Hands-free Terminals," Poster
  • R. Masumura, M. Ihori, A. Takashima, T. Moriya, A. Ando, and Y. Shinohara, "Sequence-level consistency training for semi-supervised end-to-end automatic speech recognition," Poster
  • Y. Masuyama, K. Yatabe, Y. Koizumi, Y. Oikawa, and N. Harada, "Phase reconstruction based on recurrent phase unwrapping with deep neural networks," Poster
  • T. Moriya, H. Sato, T. Tanaka, T. Ashihara, R. Masumura, and Y. Shinohara, "Distilling Attention Weights for CTC-based ASR Systems," Poster
  • T. Nakatani, R. Takahashi, T. Ochiai, K. Kinoshita, R. Ikeshita, M. Delcroix, and S. Araki, "DNN-supported Mask-based Convolutional Beamforming for Simultaneous Denoising, Dereverberation, and Source Separation," Lecture
  • H. Narimatsu and H. Kasai, "Overlapped State Hidden Semi-Markov Model for Grouped Multiple Sequences," Lecture
  • T. von Neumann, K. Kinoshita, L. Drude, C. Boeddeker, M. Delcroix, T. Nakatani, and R. Haeb-Umbach, "End-to-end Training of Time Domain Audio Separation and Recognition," Poster
  • T. Ochiai, M. Delcroix, R. Ikeshita, K. Kinoshita, T. Nakatani, and S. Araki, "BEAM-TASNET: Time-domain Audio Separation Network Meets Frequency-domain Beamformer,"
  • Y. Ohishi, A. Kimura, T. Kawanishi, K. Kashino, D. Harwath, and J. Glass, "Trilingual Semantic Embeddings of Visually Grounded Speech with Self-attention Mechanisms," Lecture
  • C. Schymura, T. Ochiai, M. Delcroix, K. Kinoshita, T. Nakatani, S. Araki, and D. Kolossa, "A Dynamic Stream Weight Backprop Kalman Filter for Audiovisual Speaker Tracking," Poster
  • D. Takeuchi, K. Yatabe, Y. Koizumi, Y. Oikawa, and N. Harada, "Real-Time Speech Enhancement using Equilibriated RNN," Poster
  • D. Takeuchi, K. Yatabe, Y. Koizumi, Y. Oikawa, and N. Harada, "Invertible DNN-based Nonlinear Time-Frequency Transform for Speech Enhancement," Poster
  • N. Tawara, A. Ogawa, T. Iwata, M. Delcroix, and T. Ogawa, "Frame-level Phoneme-invariant Speaker Embedding for Text-independent Speaker Recognition on Extremely Short Utterances," Poster
  • N. Tawara, H. Kamiyama, S. Kobashikawa, and A. Ogawa, "Improving Speaker-attribute Estimation by Voting based on Speaker Cluster Information," Poster
  • X. Wu, T. Kawanishi, and K. Kashino, "Reflectance-guided, Contrast-accumulated Histogram Equalization," Poster
  • M. Yasuda, Y. Koizumi, S. Saito, H. Uematsu, and K. Imoto, "Sound Event Localization based on Sound Intensity Vector Refined by DNN-based Denoising and Source Separation," Poster
  • G. Zhang, K. Niwa, and W.B. Kleijn, "Projected Weight Regularization to Improve Neural Network Generalization," Poster


Last Update: 2020/5/12
