About Me

Naohiro TAWARA (Ph.D.) / 俵直弘 (博士)

I am a Researcher at the Signal Processing Research Group, Media Information Laboratory, NTT Communication Science Laboratories, Kyoto, Japan.

I received my B.S., M.S., and Ph.D. degrees from Waseda University in Tokyo, Japan, in 2010, 2012, and 2017, respectively. In 2016, I joined Waseda University, where I served as a research assistant, an assistant professor, and then a lecturer. Since 2019, I have been working as a research scientist at NTT Communication Science Laboratories.

I am a member of the Institute of Electrical and Electronics Engineers (IEEE), the Institute of Electronics, Information and Communication Engineers (IEICE), the Information Processing Society of Japan (IPSJ), and the Acoustical Society of Japan (ASJ). I received the Awaya Prize Young Researcher Award from the ASJ in 2018 and the Yamashita SIG Research Award from the IPSJ in 2019.

LinkedIn | Google Scholar | ResearchGate | dblp

Education

Waseda University, Tokyo, Japan

Apr. 2012 - Mar. 2016

Ph.D. in Computer Science and Engineering

Research Topic: Fully Bayesian method for speaker recognition and clustering
Supervisor and chief examiner: Tetsunori Kobayashi
Sub-chief examiners: Tetsuji Ogawa, Yasuo Matsuyama (Waseda Univ.), Daichi Mochihashi (Institute of Statistical Mathematics)

Apr. 2010 - Mar. 2012

M.S. in Computer Science and Engineering

Research Topic: Markov chain Monte Carlo methods for speech processing
Supervisor: Tetsunori Kobayashi

Apr. 2006 - Mar. 2010

B.E. in Computer Science and Engineering

Research Topic: Computer vision for upper-body tracking
Supervisor: Tetsunori Kobayashi

Experience

Apr. 2019 - Present

NTT Communication Science Laboratories, Kyoto

Researcher	Apr. 2020 - Present
Research associate	Apr. 2019 - 2020

Apr. 2016 - Present

Waseda University, Tokyo

Visiting Researcher	Apr. 2019 - Present
Lecturer	Apr. 2018 - Mar. 2019
Assistant professor	Apr. 2017 - Mar. 2018
Research associate	Apr. 2016 - Mar. 2017

Oct. 2015 - Dec. 2015

Toyota Technological Institute at Chicago (TTIC)

Visiting student	Oct. 2015 - Dec. 2015

Publications

Journal paper (peer-reviewed)

Naohiro Tawara, Atsunori Ogawa, Tomoharu Iwata, Hiroto Ashikawa, Tetsunori Kobayashi, Tetsuji Ogawa, "Multi-Source Domain Generalization Using Domain Attributes for Recurrent Neural Network Language Models", IEICE Transactions on Information and Systems, Vol.E105-D No.1, pp.150-160, Jan. 2022. [pdf]
Naohiro Tawara, Tetsuji Ogawa, Shinji Watanabe, Tetsunori Kobayashi, "Nested Gibbs sampling for mixture-of-mixture model and its application to speaker clustering", APSIPA Transactions on Signal and Information Processing, Cambridge University Press, Vol.5, Aug. 2016. [pdf]
Naohiro Tawara, Tetsuji Ogawa, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi, "A sampling-based speaker clustering using utterance-oriented Dirichlet process mixture model and its evaluation on large scale data", APSIPA Transactions on Signal and Information Processing, Cambridge University Press, Vol.4:e6, Sept. 2015. [pdf]
Kazuya Ueki, Youhei Shiraishi, Naohiro Tawara, Tetsunori Kobayashi, "Improving classification accuracy of image categories using local descriptors with supplemental information", Journal of the Japan Society for Precision Engineering, Vol.80 No.12, (in Japanese), Dec. 2014. [pdf]

International conference (peer-reviewed)

Naohiro Tawara, Atsunori Ogawa, Yuki Kitagishi, Hosana Kamiyama, and Yusuke Ijima, "Robust speech-age estimation using local maximum mean discrepancy under mismatched recording conditions", ASRU 2021, Dec. 2021. [url]
Keisuke Kinoshita, Marc Delcroix, and Naohiro Tawara, "Advances in integration of end-to-end neural and clustering-based diarization for real conversational speech", In Proc. Interspeech, Sept. 2021. [url]
Naohiro Tawara, Atsunori Ogawa, Yuki Kitagishi, and Hosana Kamiyama, "Age-vox-celeb: Multi-modal corpus for facial and speech estimation", In Proc. ICASSP, pp. 6963-6967, June 2021. [url]
Atsunori Ogawa, Naohiro Tawara, Takatomo Kano, and Marc Delcroix, "BLSTM-based confidence estimation for end-to-end speech recognition", In Proc. ICASSP, pp. 6383-6387, June 2021. [url]
Keisuke Kinoshita, Marc Delcroix, and Naohiro Tawara, "Integrating end-to-end neural and clustering-based diarization: Getting the best of both worlds", In Proc. ICASSP, pp. 7198-7202, June 2021. [url]
Yosuke Higuchi, Naohiro Tawara, Atsunori Ogawa, Tomoharu Iwata, Tetsunori Kobayashi, and Tetsuji Ogawa, "Noise-robust Attention Learning for End-to-End Speech Recognition", In Proc. EUSIPCO, pp. 3111-3115, Jan. 2021. [url]
Yuki Kitagishi, Hosana Kamiyama, Atsushi Ando, Naohiro Tawara, Takeshi Mori, and Satoshi Kobashikawa, "Speaker age estimation using age-dependent insensitive loss", In Proc. APSIPA, pp. 319-324, Dec. 2020. [pdf]
Atsunori Ogawa, Naohiro Tawara, and Marc Delcroix, "Language Model Data Augmentation Based on Text Domain Transfer", In Proc. Interspeech, pp. 4926-4930, Oct. 2020. [pdf]
Naohiro Tawara, Atsunori Ogawa, Tomoharu Iwata, Marc Delcroix, and Tetsuji Ogawa, "Frame-level phoneme-invariant speaker embedding for text-independent speaker recognition on extremely short utterances", In Proc. ICASSP, pp. 6799-6803, May 2020. [url]
Naohiro Tawara, Hosana Kamiyama, Satoshi Kobashikawa, Atsunori Ogawa, "Improving speaker-attribute estimation by voting based on speaker cluster information", In Proc. ICASSP, pp. 6594-6598, May 2020. [url]
Marc Delcroix, Tsubasa Ochiai, Katerina Zmolikova, Keisuke Kinoshita, Naohiro Tawara, Tomohiro Nakatani, and Shoko Araki, "Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam", In Proc. ICASSP, May 2020. [pdf]
Naohiro Tawara, Tetsunori Kobayashi, Tetsuji Ogawa, "Multi-channel speech enhancement using time-domain convolutional denoising autoencoder", In Proc. Interspeech, pp.86-90, Sept. 2019. [pdf]
Yosuke Higuchi, Naohiro Tawara, Tetsunori Kobayashi, Tetsuji Ogawa, "Speaker Adversarial Training of DPGMM-based Feature Extractor for Zero-Resource Languages", In Proc. Interspeech, pp.266-270, Sept. 2019. [pdf]
Naohiro Tawara, Hikari Tanabe, Tetsunori Kobayashi, Masaru Fujieda, Kazuhiro Katagiri, Takashi Yazu, Tetsuji Ogawa, "Postfiltering Using an Adversarial Denoising Autoencoder with Noise-aware Training", In Proc. ICASSP, pp.3282-3286, May 2019. [url]
Naohiro Tawara, Tetsunori Kobayashi, Masaru Fujieda, Kazuhiro Katagiri, Takashi Yazu, Tetsuji Ogawa, "Adversarial autoencoder for reducing nonlinear distortion", In Proc. APSIPA, pp.1669-1673, Nov. 2018. [pdf]
Yuya Kokaki, Naohiro Tawara, Tetsunori Kobayashi, Kazuo Hashimoto, Tetsuji Ogawa, "Sequential fish catch forecasting using state space models and Hamiltonian Monte Carlo method", In Proc. ICPR, Aug. 2018. [url]
Tsuyoshi Morioka, Naohiro Tawara, Tetsuji Ogawa, Atsunori Ogawa, Tomoharu Iwata, Tetsunori Kobayashi, "Language model domain adaptation via recurrent neural network with domain-shared and domain-specific representations", In Proc. ICASSP, pp.6084-6088, April 2018. [url]
Taira Tsuchiya, Naohiro Tawara, Tetsunori Kobayashi, Tetsuji Ogawa, "Speaker invariant feature extraction for zero-resource languages with adversarial training", In Proc. ICASSP, pp.2381-2385, April 2018. [url]
Hiroto Ashikawa, Naohiro Tawara, Tetsunori Kobayashi, Tetsuji Ogawa, "Exploiting end of sentences and speaker alternations in recurrent neural network-based language modeling for multiparty conversations", In Proc. APSIPA, Dec. 2017. [pdf]
Kotaro Kikuchi, Naohiro Tawara, Tetsunori Kobayashi, Yoshihiko Hayashi, "Word Vector Augmentation by its Definition for Zero-shot Image Classification", CVPR Language and Vision workshop, July 2017.
Naohiro Tawara, Tetsuji Ogawa, Tetsunori Kobayashi, "A comparative study of spectral clustering for i-vector-based speaker clustering under noisy conditions", In Proc. ICASSP, pp.2041-2045, April 2015. [url]
Naohiro Tawara, Tetsuji Ogawa, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi, "Blocked Gibbs sampling based multi-scale mixture model for speaker clustering on noisy data", In Proc. MLSP, Sept. 2013. [url]
Naohiro Tawara, Tetsuji Ogawa, Shinji Watanabe, Atsushi Nakamura, and Tetsunori Kobayashi, "Fully Bayesian speaker clustering based on hierarchically structured utterance-oriented Dirichlet process mixture model", In Proc. Interspeech, pp.2166-2169, Sept. 2012. [pdf]
Naohiro Tawara, Tetsuji Ogawa, Shinji Watanabe, Tetsunori Kobayashi, "Fully Bayesian inference of multi-mixture Gaussian model and its evaluation using speaker clustering", In Proc. ICASSP, pp.5253-5256, March 2012. [url]
Naohiro Tawara, Shinji Watanabe, Tetsuji Ogawa, Tetsunori Kobayashi, "Speaker clustering based on utterance-oriented Dirichlet process mixture model", In Proc. Interspeech, pp.2905-2908, Aug. 2011. [pdf]

Competitive research funding (KAKEN)

Knowledge acquisition from unknown domain data with segmental clustering Principal Investigator
Research Category : Grant-in-Aid for Young Scientists (B) / Project Period (FY) : 2017 - 2018 [url]
A study on speaker-specific information extraction in consideration of vocalization mechanism and its application to speaker verification Research Collaborator
Research Category : Grant-in-Aid for Challenging Exploratory Research / Project Period (FY) : 2016 - 2018 [url]

Awards

The best student paper at the 2010 annual meeting of the Acoustical Society of Japan.
Poster Book Prizes in APSIPA2017.
Awaya Prize Young Researcher Award from the Acoustical Society of Japan in 2018.
Yamashita SIG Research Award from the Information Processing Society of Japan in 2019.

Contact

Address

NTT Communication Science Laboratories

Media Information Laboratory,

Signal Processing Research Group,

2-4, Hikaridai, Seika-cho, "Keihanna Science City", Kyoto, 619-0237, Japan

E-mail

naohiro (dot) tawara (dot) ex (at) hco.ntt.co.jp