Signal Processing Research Group

Human-friendly technologies such as voice interfaces are becoming increasingly popular in our daily lives, which means that speech will be used more and more to communicate with computers. Our group aims to achieve comfortable conversation with computers anytime, anywhere, and with anyone. Using powerful, state-of-the-art digital signal processing technology, we research speech recognition, acoustic signal processing, and various other kinds of signal processing.
Group Leader Tomohiro Nakatani

Research Index

Speech Recognition for Computers

Automatic speech recognition (ASR) technology endeavors to endow computers with the primary human communication channel: speech. Given an audio signal, an ASR system first identifies the segments that contain speech. For each speech segment, a sequence of information-rich feature vectors is extracted and matched to the most likely word sequence under a set of previously trained speech models that reflect the salient features of a language's phonemes.
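The matching step described above can be illustrated with a deliberately simplified sketch: score a sequence of feature vectors against per-word Gaussian models and pick the best-scoring word. This is not the group's system; real ASR uses HMM- or neural-network-based models over phoneme units, and the two-dimensional "features" and word models below are invented purely for demonstration.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal Gaussian evaluated at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def recognize(features, word_models):
    """Return the word whose model gives the highest total log-likelihood
    over the whole feature sequence (frames assumed independent)."""
    def score(model):
        mean, var = model
        return sum(log_gaussian(x, mean, var) for x in features)
    return max(word_models, key=lambda w: score(word_models[w]))

# Hypothetical 2-D feature vectors and two toy word models (mean, variance).
models = {
    "yes": ([1.0, 2.0], [0.5, 0.5]),
    "no":  ([4.0, 0.0], [0.5, 0.5]),
}
observed = [[1.1, 1.9], [0.9, 2.2]]
print(recognize(observed, models))  # → yes
```

In a real system, the per-frame Gaussian scores would be combined with transition probabilities (an HMM) and a language model rather than summed independently.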

Listening to human speech in noisy reverberant environments

Speech is one of the most natural and useful media for human communication. If a computer could appropriately handle speech signals in our daily lives, it could provide us with more convenient and comfortable speech services. However, when a speech signal is captured by distant microphones, background noise and reverberation contaminate the original signal and severely degrade the performance of existing speech applications. Our research aims to overcome these limitations.

Dynamical Information Processing (Fast physical random number generation)

Truly unpredictable random number sequences are essential for data security. For example, they are used for encryption, password generation, and the splitting of secret data in secret-sharing schemes. There is therefore demand for a compact device that rapidly generates unpredictable random numbers on the basis of a physical phenomenon. We are developing a random signal generator module based on the phenomenon in which the intensity of light from a semiconductor laser varies randomly over time at high speed.
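A common post-processing step for such physical sources can be sketched as follows. This is an illustration only, not the group's module: a biased raw bit stream (standing in for thresholded laser-intensity samples) is passed through von Neumann debiasing, which yields unbiased bits provided consecutive raw bits are independent.

```python
import random

def von_neumann_debias(raw_bits):
    """Von Neumann extractor: map non-overlapping pairs 01 -> 0 and
    10 -> 1; discard the pairs 00 and 11."""
    out = []
    for a, b in zip(raw_bits[::2], raw_bits[1::2]):
        if a != b:
            out.append(a)
    return out

# Simulate a biased "physical" source (p(1) = 0.7) in place of the
# laser-intensity measurements; the seed is fixed for reproducibility.
rng = random.Random(0)
raw = [1 if rng.random() < 0.7 else 0 for _ in range(10000)]
clean = von_neumann_debias(raw)
# The debiased stream should be close to 50% ones.
print(abs(sum(clean) / len(clean) - 0.5) < 0.05)  # → True
```

The cost of this simplicity is throughput: on a source with bias p, only 2p(1-p) of the raw pairs survive, which is one reason hardware generators often use more efficient extractors.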

Publications

  • 2017
  • 2016
  • 2015
  • 2014
  • 2013
  • 2012
  • 2011
  • 2010
  • 2009
  • 2008
  • 2007
  • 2006
  • 2005
  • 2004
  • 2003
  • 2002
  • 2001

2017

Journal Papers

  1. T. Kawase, K. Niwa, M. Fujimoto, K. Kobayashi, S. Araki, and T. Nakatani, “Integration of spatial cue-based noise reduction and speech model-based source restoration for real time speech enhancement,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E100-A, no. 5, pp. 1127-1136, May 2017.
  2. A. Ogawa and T. Hori, “Error detection and accuracy estimation in automatic speech recognition using deep bidirectional recurrent neural networks,” Speech Communication, vol. 89, pp. 70-83, May 2017.
  3. T. Higuchi, N. Ito, S. Araki, T. Yoshioka, M. Delcroix, and T. Nakatani, “Online MVDR beamformer based on complex Gaussian mixture model with spatial prior for noise robust ASR,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 4, pp. 780-793, April 2017.
  4. S. Shinohara, K. Arai, P. Davis, S. Sunada, and T. Harayama, “Chaotic laser based physical random bit streaming system with a computer application interface,” Optics Express, vol. 25, pp. 6461-6474, 2017.
  5. N. Suzuki, T. Hida, M. Tomiyama, A. Uchida, K. Yoshimura, K. Arai, and M. Inubushi, “Common-signal-induced synchronization in semiconductor lasers with broadband optical noise signal,” IEEE Journal of Selected Topics in Quantum Electronics, 2017.

Peer-reviewed Conference Papers

  1. T. Nakatani, N. Ito, T. Higuchi, S. Araki, and K. Kinoshita, “Integrating DNN-based and spatial clustering-based mask estimation for robust MVDR beamforming,” IEEE ICASSP 2017, pp. 286-290, March 2017.
  2. T. Ochiai, M. Delcroix, K. Kinoshita, A. Ogawa, T. Asami, S. Katagiri, and T. Nakatani, “Cumulative moving averaged bottleneck speaker vectors for online speaker adaptation of CNN-based acoustic models,” IEEE ICASSP 2017, pp. 5175-5179, March 2017.
  3. C. Huemmer, M. Delcroix, A. Ogawa, K. Kinoshita, T. Nakatani, and W. Kellermann, “Online environmental adaptation of CNN-based acoustic models using spatial diffuseness features,” IEEE ICASSP 2017, pp. 4875-4879, March 2017.
  4. D. Tran, M. Delcroix, A. Ogawa, C. Huemmer, and T. Nakatani, “Feedback connection for deep neural network-based acoustic modeling,” IEEE ICASSP 2017, pp. 5240-5244, March 2017.
  5. N. Ito, S. Araki, M. Delcroix, and T. Nakatani, “Probabilistic spatial dictionary based online adaptive beamforming for meeting recognition in noisy and reverberant environments,” IEEE ICASSP 2017, pp. 681-685, March 2017.
  6. K. Kinoshita, M. Delcroix, A. Ogawa, T. Higuchi, and T. Nakatani, “Deep mixture density network for statistical model-based feature enhancement,” IEEE ICASSP 2017, pp. 251-255, March 2017.
  7. T. Higuchi, T. Yoshioka, K. Kinoshita, and T. Nakatani, “Unsupervised utterance-wise beamformer estimation with speech recognition-level criterion,” IEEE ICASSP 2017, pp. 5170-5174, March 2017.
  8. S. Araki, N. Ito, M. Delcroix, A. Ogawa, K. Kinoshita, T. Higuchi, T. Yoshioka, D. Tran, S. Karita, and T. Nakatani, “Online meeting recognition in noisy environments with time-frequency mask based MVDR beamforming,” HSCMA 2017, pp. 16-20, March 2017.
  9. Y. Kawashima, S. Shinohara, S. Sunada, and T. Harayama, “Asymmetric emission of the quadrupole-deformed microcavity laser with spatially selective pumping,” Workshop on Asymmetric Microcavity and Wave Chaos, March 2017.
  10. Y. Suzuki, S. Shinohara, S. Sunada, and T. Harayama, “Chiral mode lasing in an asymmetrically deformed microcavity laser,” Workshop on Asymmetric Microcavity and Wave Chaos, March 2017.
  11. T. Harayama, S. Sunada, and S. Shinohara, “Universal single-mode lasing in fully-chaotic billiard lasers,” Workshop on Asymmetric Microcavity and Wave Chaos, March 2017.
  12. A. Liutkus, F.-R. Stoter, Z. Rafii, D. Kitamura, B. Rivet, N. Ito, N. Ono, and J. Fontecave, “The 2016 Signal Separation Evaluation Campaign,” LVA/ICA, pp. 323-332, February 2017.

2016

Journal Papers

  1. A. Ogawa, T. Hori, and A. Nakamura, “Estimating speech recognition accuracy based on error type classification,” IEEE Trans. ASLP, vol. 24, no. 12, pp. 2400-2413, December 2016.
  2. S. Shinohara, T. Fukushima, S. Sunada, T. Harayama, and K. Arai, “Long-path formation in a deformed microdisk laser,” Physical Review A, vol. 94, 013831, July 2016.
  3. S. Sunada, S. Shinohara, T. Fukushima, and T. Harayama, “Signature of Wave Chaos in Spectral Characteristics of Microcavity Lasers,” Physical Review Letters, vol. 116, 203903, May 2016.
  4. M. Delcroix, A. Ogawa, S.-J. Hahm, T. Nakatani, and A. Nakamura, “Differenced maximum mutual information criterion for robust unsupervised acoustic model adaptation,” Computer Speech and Language (CSL), Elsevier, vol. 36, pp. 24-41, March 2016.
  5. K. Kinoshita, M. Delcroix, S. Gannot, E. Habets, R. Haeb-Umbach, W. Kellermann, V. Leutnant, R. Maas, T. Nakatani, B. Raj, A. Sehr, and T. Yoshioka, “A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research,” EURASIP Journal on Advances in Signal Processing, DOI: 10.1186/s13634-016-0306-6, January 2016.

Peer-reviewed Conference Papers

  1. S. Watanabe, X. Xiao, and M. Delcroix, “Multi-Microphone Speech Recognition,” APSIPA, December 2016.
  2. T. Sasaki, I. Kakesu, A. Uchida, S. Sunada, K. Yoshimura, and K. Arai, “Common-signal-induced synchronization in photonic integrated circuits driven by constant-amplitude random-phase light,” NOLTA 2016, C1L-B4, vol. 1, pp. 566-569, November 2016.
  3. T. Higuchi, T. Yoshioka, and T. Nakatani, “Sparseness-based multichannel nonnegative matrix factorization for blind source separation,” IWAENC 2016, September 2016.
  4. M. Fakhry, N. Ito, S. Araki, and T. Nakatani, “Modeling audio directional statistics using a probabilistic spatial dictionary for speaker diarization in real meetings,” IWAENC 2016, September 2016.
  5. T. Higuchi, T. Yoshioka, and T. Nakatani, “Optimization of speech enhancement front-end with speech recognition-level criterion,” Interspeech 2016, September 2016.
  6. A. Ogawa, S. Seki, K. Kinoshita, M. Delcroix, T. Yoshioka, T. Nakatani, and K. Takeda, “Robust example search using bottleneck features for example-based speech enhancement,” Interspeech 2016, pp. 3733-3737, September 2016.
  7. M. Delcroix, K. Kinoshita, A. Ogawa, T. Yoshioka, D. Tran, and T. Nakatani, “Context adaptive neural network for rapid adaptation of deep CNN based acoustic models,” Interspeech 2016, pp. 1573-1577, September 2016.
  8. D. Tran, M. Delcroix, A. Ogawa, and T. Nakatani, “Factorized linear input network for acoustic model adaptation in noisy conditions,” Interspeech 2016, pp. 3813-3817, September 2016.
  9. K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita, and T. Nakatani, “Speech intelligibility prediction based on the envelope power spectrum model with the dynamic compressive Gammachirp auditory filterbank,” Interspeech 2016, pp. 2885-2889, September 2016.
  10. M. Delcroix, and S. Watanabe, “Recent advances in distant speech recognition,” Interspeech 2016, September 2016.
  11. K. Zmolikova, M. Karafiat, K. Vesely, M. Delcroix, S. Watanabe, L. Burget, and H. Cernocky, “Data selection by sequence summarizing neural network in mismatch condition training,” Interspeech 2016, September 2016.
  12. L. Li, H. Kameoka, T. Higuchi, and H. Saruwatari, “Semi-supervised joint enhancement of spectral and cepstral sequences of noisy speech,” Interspeech 2016, September 2016.
  13. N. Ito, S. Araki, and T. Nakatani, “Complex angular central Gaussian mixture model for directional statistics in mask-based microphone array signal processing,” EUSIPCO 2016, pp. 1153-1157, August 2016.
  14. N. Murata, H. Kameoka, K. Kinoshita, S. Araki, T. Nakatani, S. Koyama, and H. Saruwatari, “Reverberation-robust underdetermined source separation with non-negative tensor double deconvolution,” EUSIPCO 2016, pp. 1648-1652, August 2016.
  15. S. Sunada, S. Shinohara, T. Fukushima, and T. Harayama, “Wave-chaos-induced single-frequency lasing in microcavities,” NOLTA 2016 (the 2016 International Symposium on Nonlinear Theory and its Applications), Paper C1L-B2, 2016.
  16. S. Orihara, K. Koyama, S. Shinohara, S. Sunada, T. Fukushima, and T. Harayama, “Optimal design of two-dimensional external cavities for delayed optical feedback,” NOLTA 2016, Paper B3L-B3, 2016.
  17. K. Kawashima, S. Shinohara, S. Sunada, T. Fukushima, and T. Harayama, “Asymmetric emission caused by chaos-assisted tunneling and synchronization in two-dimensional microcavity lasers,” NOLTA 2016, Paper C1L-B3, 2016.
  18. S. Suzuki, S. Sunada, S. Shinohara, T. Fukushima, and T. Harayama, “Fast physical random bit generation by chaotic lasers with delayed feedback using extremely short external cavities,” NOLTA 2016, Paper B3L-B2, 2016.
  19. S. Sekiguchi, S. Shinohara, T. Fukushima, and T. Harayama, “Effects of phase space sticky motions in nearly-integrable dielectric billiards on far-field patterns,” NOLTA 2016, Paper C2L-B5, 2016.
  20. M. Fujimoto and T. Nakatani, "Multi-pass feature enhancement based on generative-discriminative hybrid approach for noise robust speech recognition," ICASSP 2016, pp. 5750-5754, March 2016.
  21. T. Kawase, K. Niwa, M. Fujimoto, N. Kamado, K. Kobayashi, S. Araki, and T. Nakatani, "Real-time integration of statistical model-based speech enhancement with unsupervised noise PSD estimation using microphone array," ICASSP 2016, pp. 604-608, March 2016.
  22. S. Araki, M. Okada, T. Higuchi, A. Ogawa, and T. Nakatani, "Spatial correlation model based observation vector clustering and MVDR beamforming for meeting recognition," ICASSP 2016, pp. 385-389, 2016.
  23. H. Meutzner, S. Araki, M. Fujimoto, and T. Nakatani, "A generative-discriminative hybrid approach to multi-channel noise reduction for robust automatic speech recognition," ICASSP 2016, pp. 5740-5744, 2016.
  24. T. Yoshioka, K. Ohnishi, F. Fang, and T. Nakatani, “Noise robust speech recognition using recent developments in neural networks for computer vision,” ICASSP 2016, pp. 5730-5734, March 2016.
  25. N. Ito, S. Araki, and T. Nakatani, "Modeling audio directional statistics using a complex Bingham mixture model and its application to blind diffuse noise reduction," ICASSP 2016, pp. 465-468, March 2016.
  26. M. Delcroix, K. Kinoshita, C. Yu, A. Ogawa, T. Yoshioka, and T. Nakatani, “Context adaptive deep neural networks for fast acoustic model adaptation in noisy conditions,” ICASSP 2016, pp. 5270-5274, March 2016.
  27. S. Kundu, G. V. Mantena, Y. Qian, T. Tan, M. Delcroix, and K. C. Sim, “Joint acoustic factor learning for robust deep neural network based automatic speech recognition,” ICASSP 2016, pp. 5025-5029, March 2016.
  28. T. Higuchi, N. Ito, T. Yoshioka and T. Nakatani, "Robust MVDR beamforming using time-frequency masks for online/offline ASR in noise," ICASSP 2016, pp. 5210-5214, March 2016.

Other Conference Papers

  1. K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita, and T. Nakatani, “Analysis of acoustic features for speech intelligibility prediction models,” 5th ASA/ASJ Joint Meeting, Journal of the Acoustical Society of America, vol. 140, no. 4, pt. 2, p. 3114, November 2016.

2015

Journal Papers

  1. M. Espi, M. Fujimoto, K. Kinoshita, and T. Nakatani, "Exploiting spectro-temporal locality in deep learning based acoustic event detection," EURASIP Journal on Audio, Speech, and Music Processing, DOI: 10.1186/s13636-015-0069-2, December 2015.
  2. M. Espi, M. Fujimoto, and T. Nakatani, "Acoustic event detection in speech overlapping scenarios based on high resolution spectral input and deep learning," IEICE Transactions on Information and Systems, vol. E98-D, no. 10, pp. 1799-1807, October 2015.
  3. T. Harayama and S. Shinohara, “Ray-wave correspondence in chaotic dielectric billiards,” Physical Review E, vol. 92, 042916 (6 pages), 2015.
  4. T. Yoshioka and M. J. F. Gales, “Environmentally robust ASR front-end for deep neural network acoustic models,” Computer Speech and Language, vol. 31, no. 1, pp. 65-86, May 2015.
  5. N. Ito, E. Vincent, T. Nakatani, N. Ono, S. Araki, and S. Sagayama, "Blind suppression of nonstationary diffuse acoustic noise based on spatial covariance matrix decomposition," Springer Journal of Signal Processing Systems, vol. 79, no. 2, pp. 145-157, May 2015. (invited paper)
  6. M. Delcroix, T. Yoshioka, A. Ogawa, Y. Kubo, M. Fujimoto, N. Ito, K. Kinoshita, M. Espi, S. Araki, T. Hori, and T. Nakatani, “Strategies for distant speech recognition in reverberant environments,” EURASIP Journal on Advances in Signal Processing, July 2015.
  7. M. Inubushi, K. Yoshimura, K. Arai, and Peter Davis, "Physical random bit generators and their reliability: focusing on chaotic laser systems", Nonlinear Theory and Its Applications, vol. 6, issue 2, pp. 133-143, 2015.

News Release

  1. NTT achieved top performance in an international noisy speech recognition challenge -Advances in distortionless speech enhancement and deep-learning speech recognition techniques-, December 14, 2015.

Book Chapter, Tutorial Papers

  1. T. Oba, K. Kobayashi, H. Uematsu, T. Asami, K. Niwa, N. Kamado, T. Kawase, and T. Hori, "Media Processing Technology for Business Task Support," NTT Technical Review, vol. 13, no. 4, April 2015.
  2. S. Araki, M. Fujimoto, T. Yoshioka, M. Delcroix, M. Espi, and T. Nakatani, "Deep learning based distant talk speech processing in real world sound environments," NTT Technical Review, 2015.

Peer-reviewed Conference Papers

  1. M. Espi, M. Fujimoto, K. Kinoshita, and T. Nakatani, "On the importance of feature extraction for acoustic event detection using deep neural networks," Interspeech 2015, pp. 2922-2926, September 2015.
  2. M. Fujimoto and T. Nakatani, "Feature enhancement based on generative-discriminative hybrid approach with GMMs and DNNs for noise robust speech recognition," ICASSP 2015, pp. 5019-5023, April 2015.
  3. D. Q. Truong, S. Nakamura, M. Delcroix, and T. Hori, "WFST-Based Structural Classification Integrating DNN Acoustic Features and RNN Language Features for Speech Recognition," ICASSP 2015, pp. 4959-4963, April 2015.
  4. S. Araki, T. Hayashi, M. Delcroix, M. Fujimoto, K. Takeda, and T. Nakatani, "Exploring multi-channel features for denoising-autoencoder-based speech enhancement," ICASSP 2015, pp. 116-120, April 2015.
  5. T. Yoshioka, N. Ito, M. Delcroix, A. Ogawa, K. Kinoshita, M. Fujimoto, C. Yu, W. J. Fabian, M. Espi, T. Higuchi, S. Araki, and T. Nakatani, “The NTT CHiME-3 system: advances in speech enhancement and recognition for mobile multi-microphone devices,” ASRU 2015, pp. 436-443, December 2015.
  6. T. Yoshioka, S. Karita, and T. Nakatani, “Far-field speech recognition using CNN-DNN-HMM with convolution in time,” ICASSP 2015, pp. 4360-4364, April 2015.
  7. N. Ono, Z. Rafii, D. Kitamura, N. Ito, and A. Liutkus, "The 2015 signal separation evaluation campaign," LVA/ICA 2015, pp. 387-395, August 2015.
  8. N. Ito, S. Araki, and T. Nakatani, "Permutation-free clustering of relative transfer function features for blind source separation," EUSIPCO 2015, pp. 409-413, September 2015.
  9. M. Delcroix, K. Kinoshita, T. Hori, and T. Nakatani, “Context adaptive deep neural networks for fast acoustic model adaptation,” Proc. of ICASSP’15, pp. 4535–4539, April 2015.
  10. K. Kinoshita, M. Delcroix, A. Ogawa, and T. Nakatani, “Text-informed speech enhancement with deep neural networks,” Interspeech 2015, pp. 1760-1764, 2015.
  11. K. Kinoshita and T. Nakatani, “Modeling inter-node acoustic dependencies with Restricted Boltzmann Machine for distributed microphone array based BSS,” ICASSP 2015, pp. 464-468, 2015.
  12. N. Suzuki, T. Hida, I. Kakesu, A. Uchida, K. Yoshimura, and K. Arai, "Effect of the bandwidth limitation of an optical noise signal used for common-signal induced synchronization in chaotic semiconductor lasers", XXXV Dynamics Days Europe 2015, September 2015.
  13. C. Yu, A. Ogawa, M. Delcroix, T. Yoshioka, T. Nakatani, and J. H. L. Hansen, "Robust i-vector extraction for neural network adaptation in noisy environment," Proc. Interspeech, pp. 2854-2857, 2015.
  14. A. Ogawa and T. Hori, "ASR error detection and recognition rate estimation using deep bidirectional recurrent neural networks," Proc. IEEE ICASSP, pp. 4370-4374, 2015.
  15. K. Aoyama, A. Ogawa, T. Hattori, and T. Hori, "Double-layer neighborhood graph based similarity search for fast query-by-example spoken term detection," Proc. IEEE ICASSP, pp. 5216-5220, 2015.

2014

Journal Papers

  1. S. Shinohara, S. Sunada, T. Fukushima, T. Harayama, K. Arai, and K. Yoshimura, “Efficient optical path folding by using multiple total internal reflections in a microcavity,” Applied Physics Letters, vol. 105, p.151111 (4 pages), 2014.
  2. T. Fukushima, S. Shinohara, S. Sunada, T. Harayama, K. Sakaguchi, and Y. Tokuda, “Lasing of TM modes in a two-dimensional GaAs microlaser,” Optics Express, vol. 22, pp.11912-11917, 2014.
  3. S. Sunada, T. Fukushima, S. Shinohara, T. Harayama, K. Arai, and M. Adachi, “A compact chaotic laser device with a two-dimensional external cavity structure,” Applied Physics Letters, vol. 104, p.241105 (4 pages), 2014.
  4. S. Shinohara, T. Fukushima, S. Sunada, T. Harayama, K. Arai, and K. Yoshimura, “Anticorrelated bidirectional output from quasistadium-shaped semiconductor microlasers,” Optical Review, vol. 21, pp.113-116, 2014.
  5. T. Otsuka, K. Ishiguro, H. Sawada, and H. G. Okuno, “Multichannel Sound Source Dereverberation and Separation for Arbitrary Number of Sources Based on Bayesian Nonparametrics,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 22, no. 12, pp. 2218-2232, October 2014.
  6. S. Sunada, K. Arai, K. Yoshimura, and M. Adachi, "Optical Phase Synchronization by Injection of Common Broadband Low-Coherent Light", Physical Review Letters, Vol. 112, 204101, May 2014.
  7. R. Takahashi, Y. Akizawa, A. Uchida, T. Harayama, K. Tsuzuki, S. Sunada, K. Arai, K. Yoshimura, and P. Davis, "Fast physical random bit generation with photonic integrated circuits with different external cavity lengths for chaos generation", Optics Express, vol. 22, pp. 11727-11740, May 2014.
  8. S. Yamahata, Y. Yamaguchi, A. Ogawa, H. Masataki, O. Yoshioka, and S. Takahashi, "Automatic vocabulary adaptation based on semantic and acoustic similarities," IEICE Trans. Inf. & Syst., vol. E97-D, no. 6, pp. 1488-1496, June 2014.

Book Chapter, Tutorial Papers

  1. Y. Iwata, T. Nakatani, T. Yoshioka, M. Fujimoto, and H. Saito, "Maximum a posteriori spectral estimation with source log-spectral priors for multichannel speech enhancement," in "Advances in Speech and Audio Processing for Coding, Enhancement and Recognition," pp. 281-317, Springer, October 2014.

Peer-reviewed Conference Papers

  1. M. Fujimoto, Y. Kubo, and T. Nakatani, "Unsupervised non-parametric Bayesian modeling of non-stationary noise for model-based noise suppression," ICASSP 2014, pp. 5562-5566, May 2014.
  2. T. Hori, Y. Kubo, and A. Nakamura, "Real-time one-pass decoding with recurrent neural network language model for speech recognition," ICASSP 2014, pp. 6364-6368, May 2014.
  3. M. Espi, M. Fujimoto, Y. Kubo, and T. Nakatani, "Spectrogram patch based acoustic event detection and classification in speech overlapping conditions," in Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Array (HSCMA 2014), pp. 117-121, May 2014.
  4. Y. Kubo, J. Suzuki, T. Hori, A. Nakamura, "Restructuring Output Layers of Deep Neural Networks Using Minimum Risk Parameter Clustering," Interspeech 2014, pp. 1068-1072, September 2014.
  5. T. Yoshioka, A. Ragni, and M. J. F. Gales, “Investigation of unsupervised adaptation of DNN acoustic models with filter bank input,” ICASSP 2014, pp. 6344-6348, May 2014.
  6. T. Yoshioka, X. Chen, and M. J. F. Gales, “Impact of single-microphone dereverberation on DNN-based meeting transcription systems,” ICASSP 2014, pp. 5527-5531, May 2014.
  7. N. Ito, S. Araki, T. Yoshioka, and T. Nakatani, "Relaxed disjointness based clustering for joint blind source separation and dereverberation," IWAENC 2014, pp. 268-272, September 2014.
  8. N. Ito, S. Araki, and T. Nakatani, "Probabilistic integration of diffuse noise suppression and dereverberation," ICASSP 2014, pp. 5167-5171, May 2014.
  9. M. Delcroix, T. Yoshioka, A. Ogawa, Y. Kubo, M. Fujimoto, N. Ito, K. Kinoshita, M. Espi, T. Nakatani, and A. Nakamura, “Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB challenge,” Proc. REVERB Challenge Workshop, May 2014. (Best performance on the recognition task of the REVERB Challenge)
  10. M. Delcroix, T. Yoshioka, A. Ogawa, Y. Kubo, M. Fujimoto, N. Ito, K. Kinoshita, M. Espi, S. Araki, and T. Nakatani, “Defeating reverberation: Advanced dereverberation and recognition techniques for hands-free speech recognition,” invited paper, IEEE Global Conference on Signal and Information Processing (GlobalSIP), pp. 522-526, December 2014.
  11. K. Arai, "Synchronization of semiconductor lasers for secret key distribution," Forum Math-for-Industry 2014, October 2014.
  12. I. Kakesu, N. Suzuki, A. Uchida, K. Yoshimura, K. Arai, and Peter Davis, "Frequency Dependence of Common-Signal-Induced Synchronization in Semiconductor Lasers with Constant-Amplitude and Random-Phase Light", The 2014 International Symposium on Nonlinear Theory and its Applications, pp. 466-469, September 2014.
  13. S. Sunada, K. Arai, K. Yoshimura, and M. Adachi, "Common Noise-Induced Optical Phase Synchronization in Lasers", The 2014 International Symposium on Nonlinear Theory and its Applications, pp. 470-473, September 2014.
  14. K. Arai, K. Yoshimura, S. Sunada, and A. Uchida, "Synchronization Induced by Common ASE Noise in Semiconductor Lasers", The 2014 International Symposium on Nonlinear Theory and its Applications, pp. 474-477, September 2014.
  15. A. Ogawa, K. Kinoshita, T. Hori, T. Nakatani, and A. Nakamura, "Fast segment search for corpus-based speech enhancement based on speech recognition technology," Proc. IEEE ICASSP, pp. 1576-1580, 2014.
  16. K. Aoyama, A. Ogawa, T. Hattori, T. Hori, and A. Nakamura, "Zero-resource spoken term detection using hierarchical graph-based similarity search," Proc. IEEE ICASSP, pp. 7143-7147, 2014.

Other Conference Papers

  1. M. Espi, M. Fujimoto, and T. Nakatani, "Detection and classification of acoustic events using multiple resolution spectrogram patch models," in Proceedings of ASJ Autumn Meeting, pp. 1529-1530, September 2014.

2013

Journal Papers

  1. M. Delcroix, K. Kinoshita, T. Nakatani, S. Araki, A. Ogawa, T. Hori, S. Watanabe, M. Fujimoto, T. Yoshioka, T. Oba, Y. Kubo, M. Souden, S.-J. Hahm, and A. Nakamura, "Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral & temporal modeling of sounds," Computer Speech and Language (CSL), vol. 27, issue 3, pp. 851-873, May 2013.
  2. M. Delcroix, S. Watanabe, T. Nakatani, and A. Nakamura, "Cluster-based dynamic variance adaptation for interconnecting speech enhancement pre-processor and speech recognizer," Computer Speech and Language, Elsevier, vol. 27, issue 1, pp. 350-368, January 2013.
  3. T. Nakatani, S. Araki, T. Yoshioka, M. Delcroix, and M. Fujimoto, "Dominance based integration of spatial and spectral features for speech enhancement," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 12, pp. 2516-2531, December 2013.
  4. T. Yoshioka and T. Nakatani, "Noise model transfer: novel approach to robustness against nonstationary noise," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2182-2192, October 2013.
  5. M. Souden, K. Kinoshita, M. Delcroix, and T. Nakatani, "Location feature integration for clustering-based speech separation in distributed microphone arrays," IEEE Transactions on Audio, Speech, and Language Processing, 2013.
  6. M. Souden, S. Araki, K. Kinoshita, T. Nakatani, and H. Sawada, "A multichannel MMSE-based framework for speech source separation and noise reduction," IEEE Transactions on Audio Speech and Language Processing, vol. 21, no. 9, pp. 1913-1928, September 2013.
  7. M. Souden, K. Kinoshita, and T. Nakatani, "Towards online maximum likelihood speech clustering and separation," Journal of the Acoustical Society of America (JASA) Express Letters, vol. 133, no. 5, pp. EL339-EL345, 2013.
  8. H. Sawada, H. Kameoka, S. Araki, and N. Ueda, "Multichannel extensions of nonnegative matrix factorization with complex-valued data," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 5, pp. 971-982, May 2013.
  9. M. Suzuki, T. Yoshioka, S. Watanabe, N. Minematsu, and K. Hirose, "Feature enhancement with joint use of consecutive corrupted and noise feature vectors with discriminative region weighting," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 10, pp. 2172-2181, October 2013.
  10. J. Muramatsu, "Channel coding and lossy source coding using a generator of constrained random numbers," to appear in IEEE Transactions on Information Theory, 2013.
  11. J. Muramatsu and S. Miyake, "Corrections to 'Hash property and coding theorems for sparse matrices and maximum-likelihood coding,'" IEEE Transactions on Information Theory, vol. IT-59, no. 10, pp. 6952-6953, October 2013.
  12. J. B. Goette, S. Shinohara, and M. Hentschel, "Are Fresnel filtering and the angular Goos-Haenchen shift the same?," Journal of Optics, vol. 15, 014009, 2013.
  13. S. Sunada, T. Fukushima, S. Shinohara, T. Harayama, and M. Adachi, "Stable single-wavelength emission from stadium-shaped chaotic microcavity lasers," Physical Review A, vol. 88, 013802, 2013.
  14. T. Fukushima, S. Shinohara, S. Sunada, T. Harayama, K. Arai, K. Sakaguchi, and T. Tokuda, "Selective excitation of the lowest-order transverse ring modes in a quasi-stadium laser diode," Optics Letters, vol. 38, pp. 4158-4161, 2013.
  15. H. Koizumi, S. Morikatsu, H. Aida, T. Nozawa, I. Kakesu, A. Uchida, K. Yoshimura, J. Muramatsu, and P. Davis, "Information-theoretic secure key distribution based on common random-signal induced synchronization in unidirectionally-coupled cascades of semiconductor lasers," Optics Express, vol. 21, no. 15, pp. 17869-17893, July 2013.

Book Chapter, Tutorial Papers

  1. T. Hori, S. Araki, T. Nakatani, and A. Nakamura, "Advances in multi-speaker conversational speech recognition and understanding," NTT Technical Review, vol. 11, no. 12, December 2013.
  2. T. Hori and A. Nakamura, "Speech recognition algorithms using weighted finite-state transducers," Morgan & Claypool Publishers, January 2013.
  3. Y. Kubo, A. Ogawa, T. Hori, and A. Nakamura, "Speech recognition based on unified model of acoustic and language aspects of speech," NTT Technical Review, vol. 11, no. 12, December 2013.
  4. H. Masataki, T. Asami, S. Yamahata, and M. Fujimoto, "Speech recognition technology that can adapt to changes in service and environment," NTT Technical Review, vol. 11 no. 7, July 2013.
  5. A. Uchida, H. Koizumi, I. Kakesu, K. Yoshimura, J. Muramatsu, and P. Davis, "Synchronized semiconductor lasers for secure key distribution," SPIE Newsroom, DOI: 10.1117/2.1201311.005200, 2013.

Peer-reviewed Conference Papers

  1. A. Ogawa, T. Hori, and A. Nakamura, "Discriminative recognition rate estimation for n-best list and its application to n-best rescoring," ICASSP 2013, pp. 6832-6836, 2013.
  2. A. Ogawa, T. Hori, A. Nakamura, and T. Oba, "Recognition rate estimation based on error type classification and its applications," invited talk at Errare 2013 Workshop, 2013.
  3. M. Fujimoto and T. Nakatani, "Model-based noise suppression using unsupervised estimation of hidden Markov model for non-stationary noise," Interspeech 2013, pp. 2982-2986, August 2013.
  4. M. Delcroix, A. Ogawa, S.-J. Hahm, T. Nakatani, and A. Nakamura, "Unsupervised discriminative adaptation using differenced maximum mutual information based linear regression," ICASSP 2013, pp. 7888-7892, 2013.
  5. M. Delcroix, Y. Kubo, T. Nakatani, and A. Nakamura, "Is speech enhancement pre-processing still relevant when using deep neural networks for acoustic modeling?" Interspeech 2013, pp. 2992-2996, 2013.
  6. K. Aoyama, A. Ogawa, T. Hattori, T. Hori, and A. Nakamura, "Graph index based query-by-example search on a large speech data set," ICASSP 2013, pp. 8520-8524, 2013.
  7. Y. Kubo, T. Hori, and A. Nakamura, "A method for structure estimation of weighted finite-state transducers and its application to grapheme-to-phoneme conversion," Interspeech 2013.
  8. T. Oba, A. Ogawa, T. Hori, H. Masataki, and A. Nakamura, "Unsupervised discriminative language modeling using error rate estimator," Interspeech 2013, pp. 1223-1227, 2013.
  9. Y. Kubo, T. Hori, and A. Nakamura, "Large vocabulary continuous speech recognition based on WFST structured classifiers and deep bottleneck features," ICASSP 2013, pp. 7629-7633, 2013.
  10. S.-J. Hahm, A. Ogawa, M. Delcroix, M. Fujimoto, T. Hori, and A. Nakamura, "Feature space variational Bayesian linear regression and its combination with model space VBLR," ICASSP 2013, pp. 7898-7902, 2013.
  11. T. Yoshioka and T. Nakatani, "Noise model transfer using affine transformation with application to large vocabulary reverberant speech recognition," ICASSP2013, pp. 7058-7062, May 2013.
  12. T. Yoshioka and T. Nakatani, "Dereverberation for reverberation-robust microphone arrays," Proc. 21st European Signal Processing Conference (EUSIPCO 2013), September 2013.
  13. T. Nakatani, M. Souden, S. Araki, T. Yoshioka, T. Hori, and A. Ogawa, "Coupling beamforming with spatial and spectral feature based spectral enhancement and its application to meeting recognition," ICASSP2013, pp. 7249-7253, May 2013.
  14. T. Nakatani, M. Delcroix, and M. Fujimoto, "Speech enhancement in a car using spatial and spectral models for speaker and noise," in Proc. of The 6th Biennial Workshop on Digital Signal Processing for In-Vehicle Systems, September 2013.
  15. K. Kinoshita, M. Souden, and T. Nakatani, "Blind source separation using spatially distributed microphones based on microphone-location dependent source activities," Interspeech2013, pp. 822-826, August 2013.
  16. K. Kinoshita, M. Delcroix, T. Yoshioka, T. Nakatani, E. Habets, R. Haeb-Umbach, V. Leutnant, A. Sehr, W. Kellermann, R. Maas, S. Gannot, and B. Raj, "The REVERB challenge: a common evaluation framework for dereverberation and recognition of reverberant speech," WASPAA, October 2013.
  17. K. Kinoshita and T. Nakatani, "Microphone-location dependent mask estimation for BSS using spatially distributed asynchronous microphones," 2013 International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS), pp. 326-331, November 2013.
  18. N. Ito, S. Araki, and T. Nakatani, "Permutation-free convolutive blind source separation via full-band clustering based on frequency-independent source presence priors," ICASSP2013, pp. 3238-3242, May 2013.
  19. M. Souden, K. Kinoshita, and T. Nakatani, "An integration of source location cues for speech clustering in distributed microphone arrays," ICASSP2013, pp. 111-115, May 2013.
  20. R. Maas, W. Kellermann, A. Sehr, T. Yoshioka, M. Delcroix, K. Kinoshita, and T. Nakatani, "Formulation of the REMOS concept from an uncertainty decoding perspective," in Proc. of the international conference on digital signal processing (IEEE), pp. 1-6, July 2013.
  21. A. Sehr, T. Yoshioka, M. Delcroix, K. Kinoshita, T. Nakatani, R. Maas, and W. Kellermann, "Conditional emission densities for interconnecting speech enhancement and recognition systems," Interspeech2013, pp. 3502-3506, 2013.
  22. I. Jafari, N. Ito, M. Souden, S. Araki, and T. Nakatani, "Source number estimation based on clustering of speech activity sequences for microphone array processing," Proc. IEEE International Workshop on Machine Learning for Signal Processing (IEEE MLSP), September 2013.
  23. Y. Uezu, K. Kinoshita, M. Souden, and T. Nakatani, "On the robustness of distributed EM based BSS in asynchronous distributed microphone array scenarios," Interspeech2013, pp. 3298-3302, August 2013.
  24. N. Ono, Z. Koldovsky, S. Miyabe, and N. Ito, "The 2013 signal separation evaluation campaign," Proc. IEEE International Workshop on Machine Learning for Signal Processing (IEEE MLSP), September 2013.
  25. J. Muramatsu, "Equivalence between inner regions for broadcast channel coding," The Proceedings of the 2013 IEEE Information Theory Workshop, pp. 164-168, 2013.
  26. J. Muramatsu, "Channel code using a constrained random number generator," The Proceedings of the 2013 IEEE International Symposium on Information Theory, pp. 2463-2467, 2013.
  27. J. Muramatsu, "Lossy source code using a constrained random number generator," The Proceedings of the 2013 IEEE International Symposium on Information Theory, pp. 2354-2358, 2013.
  28. K. Yoshimura, J. Muramatsu, K. Arai, S. Shinohara, and A. Uchida, "Synchronization of semiconductor lasers by injection of common broadband random light," Proc. of the 2013 International Symposium on Nonlinear Theory and Its Applications, pp. 449-452, 2013.
  29. K. Yoshimura, "Existence and stability of discrete breathers in Fermi-Pasta-Ulam lattices," Proc. of the 2013 International Symposium on Nonlinear Theory and Its Applications, pp. 274-277, 2013.
  30. K. Arai, S. Shinohara, S. Sunada, K. Yoshimura, T. Harayama, and A. Uchida, "Noise effects on generalized chaos synchronization in semiconductor lasers," Proc. of the 2013 International Symposium on Nonlinear Theory and Its Applications, pp. 413-416, 2013.
  31. S. Shinohara, T. Fukushima, S. Sunada, T. Harayama, K. Arai, and K. Yoshimura, "Nonlinear modal dynamics in two-dimensional cavity microlasers," Proc. of the 2013 International Symposium on Nonlinear Theory and Its Applications, pp. 409-412, 2013.
  32. T. Fukushima, S. Shinohara, S. Sunada, T. Harayama, K. Sakaguchi, and Y. Tokuda, "Ray dynamical simulation of Penrose unilluminable room cavity," Frontiers in Optics 2013/Laser Science XXIX, October 2013.
  33. I. Kakesu, H. Koizumi, S. Morikatu, H. Aida, T. Nozawa, A. Uchida, K. Yoshimura, J. Muramatsu, and P. Davis, "Secure key distribution using common-signal-induced synchronization in cascaded semiconductor lasers," Proc. of Frontiers in Optics, 2013.
  34. R. Takahashi, Y. Akizawa, A. Uchida, T. Harayama, K. Tsuzuki, S. Sunada, K. Yoshimura, K. Arai, and P. Davis, "Physical random number generation using photonic integrated circuit with mutually coupled semiconductor lasers," Frontiers in Optics 2013, October 8-12, 2013.

Other Conference Papers

  1. T. Yoshioka and M. J. F. Gales, "An investigation of single-microphone automatic meeting transcription," presented at the 2nd UKSpeech Conference, September 2013.

2012

Journal Papers

  1. T. Hori, S. Araki, T. Yoshioka, M. Fujimoto, S. Watanabe, T. Oba, A. Ogawa, K. Otsuka, D. Mikami, K. Kinoshita, T. Nakatani, A. Nakamura, and J. Yamato, "Low-latency real-time meeting recognition and understanding using distant microphones and omni-directional camera," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 2, pp. 499-513, February 2012.
  2. A. Ogawa and A. Nakamura, "Joint estimation of confidence and error causes in speech recognition," Speech Communication, vol. 54, no. 9, pp. 1014-1028, November 2012.
  3. M. Fujimoto, S. Watanabe, and T. Nakatani, "Frame-wise model re-estimation method based on Gaussian pruning with weight normalization for noise robust voice activity detection," Speech Communication, vol. 54, no. 2, pp. 229-244, February 2012.
  4. Y. Kubo, S. Watanabe, T. Hori, and A. Nakamura, "Structural classification methods based on weighted finite-state transducers for automatic speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, issue 8, pp. 2240-2251, October 2012.
  5. T. Oba, T. Hori, A. Nakamura, and A. Ito, "Round-robin duel discriminative language models," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, issue 4, pp. 1244-1255, May 2012.
  6. T. Oba, T. Hori, A. Nakamura, and A. Ito, "Model shrinkage for discriminative language models," IEICE Transactions on Information and Systems, vol. E95-D, no. 5, pp. 1465-1474, May 2012.
  7. T. Yoshioka and T. Nakatani, "Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 10, pp. 2707-2720, December 2012.
  8. M. Souden, M. Delcroix, K. Kinoshita, T. Yoshioka, and T. Nakatani, "Noise power spectral density tracking: A maximum likelihood perspective," IEEE Signal Processing Letters, vol. 19, no. 8, pp. 495-498, August 2012.
  9. K. Ishiguro, T. Yamada, S. Araki, T. Nakatani, and H. Sawada, "Probabilistic speaker diarization with bag-of-words representations of speaker angle information," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 2, pp. 447-460, 2012.
  10. E. Vincent, S. Araki, F. Theis, G. Nolte, P. Bofill, H. Sawada, A. Ozerov, V. Gowreesunker, D. Lutter, and N. Q. K. Duong, "The signal separation evaluation campaign (2007-2010): Achievements and remaining challenges," Signal Processing vol. 92, issue 8, pp. 1928-1936, August 2012.
  11. J. Muramatsu and S. Miyake, "Construction of codes for wiretap channel and secret key agreement from correlated source outputs based on hash property," IEEE Transactions on Information Theory, vol. IT-58, no. 2, pp. 671-692, February 2012.
  12. J. Muramatsu and S. Miyake, "Corrections to 'Hash property and fixed-rate universal coding theorems'," IEEE Transactions on Information Theory, vol. IT-58, no. 5, pp. 3305-3307, May 2012.
  13. K. Yoshimura, J. Muramatsu, P. Davis, T. Harayama, H. Okumura, S. Morikatsu, H. Aida, and A. Uchida, "Secure key distribution using correlated randomness in lasers driven by common random light," Physical Review Letters, vol. 108, 070602, February 2012.
  14. K. Yoshimura, "Stability of discrete breathers in diatomic nonlinear oscillator chains," Nonlinear Theory and Its Applications, IEICE, vol. 3, pp. 52-66, 2012.
  15. K. Yoshimura, "Stability of discrete breathers in nonlinear Klein-Gordon type lattices with pure anharmonic couplings," Journal of Mathematical Physics 53, 102701, 2012.
  16. K. Arai, S. Sunada, T. Harayama, and P. Davis, "The randomness in Galton board from viewpoint of predictability: sensitivity and statistical bias of output states," Physical Review E, vol. 86, 056216, 2012.
  17. H. Aida, M. Arahata, H. Okumura, H. Koizumi, A. Uchida, K. Yoshimura, J. Muramatsu, and P. Davis, "Experiment on synchronization of semiconductor lasers by common injection of constant-amplitude random-phase light," Optics Express, vol. 20, no. 11, pp. 11813-11829, May 2012.
  18. T. Harayama, S. Sunada, K. Yoshimura, J. Muramatsu, K. Arai, A. Uchida, and P. Davis, "Theory of fast non-deterministic physical random bit generation with chaotic lasers," Physical Review E, vol. 85, 046215, April 2012.
  19. S. Sunada, T. Harayama, P. Davis, K. Tsuzuki, K. Arai, K. Yoshimura, and A. Uchida, "Noise amplification by chaotic dynamics in a delayed feedback laser system and its application to nondeterministic random bit generation," Chaos 22, 047513, 2012.
  20. Y. Akizawa, T. Yamazaki, A. Uchida, T. Harayama, S. Sunada, K. Arai, K. Yoshimura, and P. Davis, "Fast random number generation with bandwidth-enhanced chaotic semiconductor lasers at 8×40 Gb/s," IEEE Photonics Technology Letters, vol. 24, pp. 1042-1044, 2012.
  21. T. Mikami, K. Kanno, K. Aoyama, A. Uchida, T. Ikeguchi, T. Harayama, S. Sunada, K. Arai, K. Yoshimura, and P. Davis, "Estimation of entropy rate in a fast physical random-bit generator using a chaotic semiconductor laser with intrinsic noise," Physical Review E, vol. 85, 016211, January 2012.
  22. T. Hirayama, S. Arakawa, K. Arai, and M. Murata, "Dynamics of feedback-induced packet delay in ISP router-level topologies," IEICE Transactions on Communications, vol. E95-B, no. 9, pp. 2785-2793, 2012.
  23. J.-W. Ryu, J. Cho, C.-M. Kim, S. Shinohara, and S. W. Kim, "Terahertz beat frequency generation from two-mode lasing operation of coupled microdisk laser," Optics Letters, vol. 37, pp. 3210-3213, 2012.

Book Chapter, Tutorial Papers

  1. T. Yoshioka, A. Sehr, M. Delcroix, K. Kinoshita, R. Maas, T. Nakatani, and W. Kellermann, "Making machines understand us in reverberant rooms: robustness against reverberation for automatic speech recognition," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 114-126, November 2012.
  2. K. Yoshimura, S. Shinohara, and K. Arai, "Fast Physical Random Number Generation Using Semiconductor Laser Chaos," NTT Technical Review, vol. 10, no. 11, 2012.

Peer-reviewed Conference Papers

  1. T. Hori, K. Kinoshita, S. Araki, A. Ogawa, T. Yoshioka, M. Fujimoto, T. Oba, M. Delcroix, M. Souden, Y. Kubo, S.-J. Hahm, D. Mikami, K. Otsuka, T. Nakatani, A. Nakamura, and J. Yamato, "Real-time audio-visual meeting recognition and understanding using distant microphone array," ICASSP2012, Show & Tell session.
  2. A. Ogawa, T. Hori and A. Nakamura, "Recognition rate estimation based on word alignment network and discriminative error type classification," SLT, 2012.
  3. A. Ogawa, T. Hori, and A. Nakamura, "Error type classification and word accuracy estimation using alignment information in word confusion network," ICASSP2012, pp. 4925-4928, March 2012.
  4. M. Fujimoto and T. Nakatani, "A reliable data selection for model-based noise suppression using unsupervised joint speaker adaptation and noise model estimation," ICSPCC 2012, pp. 4713-4716, August 2012. (invited talk)
  5. M. Fujimoto, S. Watanabe, and T. Nakatani, "Noise suppression with unsupervised joint speaker adaptation and noise mixture model estimation," ICASSP2012, pp. 4713-4716, March 2012.
  6. M. Espi, M. Fujimoto, D. Saito, N. Ono, and S. Sagayama, "A tandem connectionist model using combination of multi-scale spectro-temporal features for acoustic event detection," ICASSP2012, pp. 4293-4296, March 2012.
  7. M. Delcroix, A. Ogawa, T. Nakatani, and A. Nakamura, "Dynamic variance adaptation using differenced maximum mutual information," MLSLP, 2012.
  8. M. Delcroix, A. Ogawa, S. Watanabe, T. Nakatani, and A. Nakamura, "Discriminative feature transforms using difference maximum mutual information," ICASSP2012, pp. 4753-4756, March 2012.
  9. Y. Kubo, T. Hori, and A. Nakamura, "Integrating deep neural networks into structured classification approach based on weighted finite-state transducers," Interspeech2012, September 2012.
  10. T. Oba, T. Hori, A. Nakamura, and A. Ito, "Spoken document retrieval by discriminative modeling in a high dimensional feature space," ICASSP2012, pp. 5153-5156, March 2012.
  11. S.-J. Hahm, A. Ogawa, M. Fujimoto, T. Hori, and A. Nakamura, "Speaker adaptation using variational Bayesian linear regression in normalized feature space," Interspeech2012, September 2012.
  12. S.-J. Hahm, S. Watanabe, M. Fujimoto, T. Hori, and A. Nakamura, "Normalization and adaptation by consistently employing MAP estimation," IWSML 2012.
  13. S. Watanabe, Y. Kubo, T. Oba, T. Hori, and A. Nakamura, "Bag of arcs: new representation of speech segment features based on finite state machines," ICASSP2012, pp. 4201-4204, March 2012.
  14. S. Kobashikawa, T. Hori, Y. Yamaguchi, T. Asami, H. Masataki, and S. Takahashi, "Efficient beam width control to suppress excessive speech recognition computation time based on prior score range normalization," Interspeech2012, September 2012.
  15. S. Kobashikawa, T. Hori, Y. Yamaguchi, T. Asami, H. Masataki, and S. Takahashi, "Efficient prior and incremental beam width control to suppress excessive speech recognition time based on score range estimation," SLT, 2012.
  16. S. Yamahata, Y. Yamaguchi, A. Ogawa, H. Masataki, O. Yoshioka, and S. Takahashi, "Automatic vocabulary adaptation based on semantic similarity and speech recognition confidence measure," Interspeech2012, September 2012.
  17. E. Chuangsuwanich, S. Watanabe, T. Hori, T. Iwata, and J. Glass, "Handling uncertain observations in unsupervised topic-mixture language model adaptation," ICASSP2012, pp. 5033-5036, March 2012.
  18. M. Suzuki, T. Yoshioka, S. Watanabe, N. Minematsu, and K. Hirose, "MFCC enhancement using joint corrupted and noise feature space for highly non-stationary noise environments," ICASSP2012, pp. 4109-4112, March 2012.
  19. R. Roller, S. Watanabe and T. Iwata, "Effect of dialog acts on word use in polylogue," ICASSP2012, pp. 4969-4972, March 2012.
  20. T. Nakatani, T. Yoshioka, S. Araki, M. Delcroix, and M. Fujimoto, "Logmax observation model with MFCC-based spectral prior for reduction of highly nonstationary ambient noise," ICASSP2012, pp. 4029-4032, March 2012.
  21. S. Araki and T. Nakatani, "Sparse vector factorization for underdetermined BSS using wrapped-phase GMM and source log-spectral prior," ICASSP2012, pp. 265-268, March 2012.
  22. S. Araki, F. Nesta, E. Vincent, Z. Koldovsky, G. Nolte, A. Ziehe, and A. Benichoux, "SiSEC2011 overview: Audio source separation," in Proc. LVA/ICA2012, pp. 414-422, March 2012.
  23. K. Kinoshita, M. Delcroix, M. Souden, and T. Nakatani, "Example-based speech enhancement with joint utilization of spatial, spectral & temporal cues of speech and noise," Interspeech2012.
  24. T. Yoshioka and T. Nakatani, "Time-varying residual noise feature model estimation for multi-microphone speech recognition," ICASSP2012, pp. 4913-4916, March 2012.
  25. T. Yoshioka, and D. Sakaue, "Log-normal matrix factorization with application to speech-music separation," SAPA-SCALE 2012, pp. 80-85, September 2012.
  26. T. Yoshioka, A. Sehr, M. Delcroix, K. Kinoshita, R. Maas, T. Nakatani, and W. Kellermann, "Survey on approaches to speech recognition in reverberant environments," APSIPA, 2012. (Invited paper)
  27. M. Souden, K. Kinoshita, M. Delcroix, and T. Nakatani, "Distributed microphone array processing for speech source separation with classifier fusion," MLSP, September 2012.
  28. M. Souden, S. Araki, K. Kinoshita, T. Nakatani, and H. Sawada, "A multichannel MMSE-based framework for joint blind source separation and noise reduction," ICASSP2012, pp. 109-112, March 2012.
  29. T. Maruyama, S. Araki, T. Nakatani, S. Miyabe, T. Yamada, S. Makino, and A. Nakamura, "New analytical update rule for TDOA inference for underdetermined BSS in noisy environments," ICASSP2012, pp. 269-272, March 2012.
  30. Y. Iwata and T. Nakatani, "Introduction of speech log-spectral priors into dereverberation based on Itakura-Saito distance minimization," ICASSP2012, pp. 245-248, March 2012.
  31. H. Sawada, H. Kameoka, S. Araki, and N. Ueda, "Efficient algorithms for multichannel extensions of Itakura-Saito nonnegative matrix factorization," ICASSP2012, pp. 261-264, March 2012.
  32. G. Nolte, D. Lutter, A. Ziehe, F. Nesta, E. Vincent, Z. Koldovsky, A. Benichoux, and S. Araki, "SiSEC2011 overview: biomedical data analysis," in Proc. LVA/ICA2012, pp. 423-429, March 2012.
  33. T. Maruyama, S. Araki, T. Nakatani, S. Miyabe, T. Yamada, S. Makino, and A. Nakamura, "New analytical calculation and estimation for TDOA inference for underdetermined BSS in noisy environments," APSIPA, 2012.
  34. J. Muramatsu, "Information theoretic security based on bounded observability," DIMACS Workshop on Information-Theoretic Network Security, November 2012.
  35. J. Muramatsu and S. Miyake, "Uniform random number generation by using sparse matrix," Proceedings of the 2012 IEEE Information Theory Workshop, pp. 612-616, 2012.
  36. K. Yoshimura, J. Muramatsu, P. Davis, A. Uchida, and T. Harayama, "Secure key distribution using correlated randomness in optical devices," Proceedings of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 336-339, 2012.
  37. K. Yoshimura, "Existence and stability of localized modes in one-dimensional nonlinear lattices," The 19th International Symposium on Nonlinear Acoustics, AIP Conference Proceedings 1474, pp. 59-62, 2012.
  38. K. Yoshimura, "Stability of discrete breathers in nonlinear Klein-Gordon type lattices," Proc. of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 403-406, 2012.
  39. K. Arai, T. Harayama, P. Davis, J. Muramatsu, and S. Sunada, "Multi-bit sampling from chaotic time series in random number generation," Proceedings of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 268-271, 2012.
  40. S. Sunada, T. Harayama, P. Davis, K. Arai, K. Yoshimura, K. Tsuzuki, M. Adachi, and A. Uchida, "Noise amplification based on dynamical instabilities in semiconductor laser systems and its application to nondeterministic random bit generators," Proc. of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 263-267, 2012.
  41. S. Miyake and J. Muramatsu, "Universal codes on continuous alphabet using sparse matrices," Proceedings of the 2012 International Symposium on Information Theory and its Applications, pp. 493-497, 2012.
  42. S. Miyake and J. Muramatsu, "On a construction of universal network code using LDPC matrices," The Proceedings of the 2012 IEEE International Symposium on Information Theory, pp. 1306-1310, 2012.
  43. H. Koizumi, S. Morikatsu, H. Aida, M. Arahata, T. Nozawa, A. Uchida, K. Yoshimura, J. Muramatsu, and P. Davis, "Experiment on secure key distribution using correlated random phenomenon in semiconductor lasers," Proceedings of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 340-343, 2012.
  44. T. Yamazaki, Y. Akizawa, A. Uchida, K. Yoshimura, K. Arai, and P. Davis, "Fast random number generation with bandwidth-enhanced chaos and post-processing," Proc. of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 142-145, 2012.
  45. R. Takahashi, Y. Akizawa, T. Yamazaki, A. Uchida, T. Harayama, K. Tsuzuki, S. Sunada, K. Yoshimura, K. Arai, and P. Davis, "Random number generation with a photonic integrated circuit for fast chaos generation," Proc. of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 138-141, 2012.
  46. Y. Akizawa, R. Takahashi, H. Aida, T. Yamazaki, A. Uchida, T. Harayama, K. Tsuzuki, S. Sunada, K. Yoshimura, K. Arai, and P. Davis, "Nonlinear dynamics in a photonic integrated circuit for fast chaos generation," Proc. of the 2012 International Symposium on Nonlinear Theory and Its Applications, pp. 134-137, 2012.
  47. T. Hirayama, S. Arakawa, K. Arai, and M. Murata, "On the power-law characteristic of link capacity distribution in ISP router-level topologies," 21st International Conference on Computer Communications and Networks (ICCCN12), July 30 - August 2, 2012.

2011

Journal Papers

  1. T. Yoshioka, T. Nakatani, M. Miyoshi, and H. G. Okuno, “New method for blind separation and dereverberation of highly reverberant mixtures,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 1, pp. 69-84, January 2011.
  2. A. Ogawa, S. Takahashi, and A. Nakamura, “Efficient combination of likelihood recycling and batch calculation for fast acoustic likelihood calculation,” IEICE Transactions on Information and Systems, vol. E94-D, no. 3, March 2011.
  3. S. Araki, T. Nakatani, and H. Sawada, “Sparse source separation based on simultaneous clustering of source locational and spectral features,” Acoustical Science and Technology, Acoustic Letter, (in press), 2011.
  4. T. Harayama, S. Sunada, K. Yoshimura, K. Tsuzuki, P. Davis, and A. Uchida, “Fast non-deterministic random bit generation with on-chip chaos lasers,” Physical Review A, vol. 83, 031803(R), 2011.
  5. S. Sunada, T. Harayama, K. Arai, K. Yoshimura, P. Davis, K. Tsuzuki, and A. Uchida, “Chaos laser chips with delayed optical feedback using a passive ring waveguide,” Optics Express, vol. 19, pp. 5713-5724, 2011.
  6. S. Sunada, T. Harayama, K. Arai, K. Yoshimura, K. Tsuzuki, A. Uchida, and P. Davis, “Random optical pulse generation with bistable semiconductor ring lasers,” Optics Express, vol. 19, pp. 7439-7450, 2011.
  7. K. Yoshimura, “Existence and stability of discrete breathers in diatomic Fermi-Pasta-Ulam type lattices,” Nonlinearity, vol. 24, pp. 293-317, 2011.
  8. S. Watanabe, T. Iwata, T. Hori, A. Sako, and Y. Ariki, “Topic Tracking Language Model for Speech,” Computer Speech and Language, vol. 25, issue 2, pp. 440-461, 2011.

Book Chapter, Tutorial Papers

  1. M. Fujimoto, “Chapter 1: Integration of statistical model-based voice activity detection and noise suppression for noise robust speech recognition,” in "Advances in Robust Speech Recognition Technology,'' Bentham Publishing Services, March 2011.

Peer-reviewed Conference Papers

  1. S. Sunada, T. Harayama, K. Arai, K. Yoshimura, K. Tsuzuki, A. Uchida, and P. Davis, “Theory and experiment of fast non-deterministic random bit generation with on-chip chaos lasers,” Dynamics Days 2011, pp. 31-32, January 2011.
  2. S. Araki, T. Hori, T. Yoshioka, M. Fujimoto, S. Watanabe, T. Oba, A. Ogawa, K. Otsuka, D. Mikami, M. Delcroix, K. Kinoshita, T. Nakatani, A. Nakamura, and J. Yamato, “Low-latency meeting recognition and understanding using distant microphones,” to appear in Proceedings of the 3rd Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2011), May 2011, presented in the Demo Session.
  3. M. Fujimoto, S. Watanabe, and T. Nakatani, “Non-stationary noise estimation method based on bias-residual component decomposition for robust speech recognition,” Proc. of ICASSP '11, May 2011. (accepted)
  4. A. Ogawa, S. Takahashi, and A. Nakamura, “Machine and acoustical condition dependency analyses for fast acoustic likelihood calculation techniques,” Proc. ICASSP, May 2011, to appear.
  5. T. Yoshioka, and T. Nakatani, “A microphone array system integrating beamforming, feature enhancement, and spectral mask-based noise estimation,” to appear in Proceedings of the Third Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2011), May 2011.
  6. T. Yoshioka, and T. Nakatani, “Speech enhancement based on log spectral envelope model and harmonicity-derived spectral mask, and its coupling with feature compensation,” to appear in Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), May 2011.
  7. N. Yasuraoka, H. Kameoka, T.Yoshioka, and H. G. Okuno, “I-divergence-based dereverberation method with auxiliary function approach,” to appear in Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), May 2011.
  8. T. Nakatani, S. Araki, T. Yoshioka, and M. Fujimoto, “Joint unsupervised learning of hidden Markov source models and source location models for multichannel source separation,” to appear in Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), May 2011.
  9. Y. Kubo, S. Wiesler, R. Schlueter, H. Ney, S. Watanabe, A. Nakamura, and T. Kobayashi, “Subspace Pursuit Method for Kernel-Log-Linear Models,” Proc. ICASSP 2011, Prague, Czech Republic, May 2011.
  10. S. Araki, T. Hori, T. Yoshioka, M. Fujimoto, S. Watanabe, T. Oba, A. Ogawa, K. Otsuka, D. Mikami, M. Delcroix, K. Kinoshita, T. Nakatani, A. Nakamura, and J. Yamato, “Demonstration on low-latency meeting recognition and understanding using distant microphones,” HSCMA2011, (accepted).
  11. M. Delcroix, S. Watanabe, T Nakatani, and A Nakamura, “Discriminative approach to dynamic variance adaptation for noisy speech recognition,” Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), 2011 (to appear).
  12. S. Araki and T. Nakatani, “Hybrid Approach for Multichannel Source Separation Combining Time-frequency Mask with Multi-channel Wiener Filter,” ICASSP2011, (accepted).
  13. H. Sawada, H. Kameoka, S. Araki, and N. Ueda, “Formulations and algorithms for multichannel complex NMF,” ICASSP 2011, (accepted)
  14. K. Iso, S. Araki, S. Makino, T. Nakatani, H. Sawada, T. Yamada, and A. Nakamura, “Blind source separation of mixed speech in a high reverberation environment,” HSCMA2011, (accepted)
  15. T. Oba, T. Hori, A. Ito, and A. Nakamura, “Round-Robin Duel Discriminative Language Models in One-pass Decoding with On-the-fly Error Correction,” Proceedings of ICASSP, 2011.
  16. S. Watanabe, D. Mochihashi, T. Hori, and A. Nakamura, “Gibbs Sampling Based Multi-Scale Mixture Model for Speaker Clustering,” Proc. ICASSP'11.
  17. D. Saito, S. Watanabe, A. Nakamura, and N. Minematsu, “High Accurate Model-Integration-Based Voice Conversion Using Dynamic Features and Model Structure Optimization,” Proc. ICASSP'11.
  18. T. Maekawa and S. Watanabe, “Modeling Activities with User's Physical Characteristics Data,” Proc. ISWC'11.

2010

Journal Papers

  1. T. Yoshioka, T. Nakatani, M. Miyoshi, and H. G. Okuno, “New method for blind separation and dereverberation of highly reverberant mixtures,” accepted for publication in IEEE Transactions on Audio, Speech, and Language Processing, now available on IEEE Xplore, January 2010.
  2. T. Oba, T. Hori, and A. Nakamura, “Improved Sequential Dependency Analysis Integrating Labeling-based Sentence Boundary Detection,” IEICE Transactions on Information and Systems, vol. E93-D, no. 5, May 2010.
  3. J. Muramatsu and S. Miyake, “Hash property and coding theorems for sparse matrices and maximum-likelihood coding,” IEEE Transactions on Information Theory, vol. IT-56, no. 5, pp. 2143-2167, May 2010.
  4. J. Muramatsu and S. Miyake, “Hash property and fixed-rate universal coding theorems,” IEEE Transactions on Information Theory, vol. IT-56, no. 6, pp. 2688-2698, June 2010.
  5. J. Muramatsu, and S. Miyake, “Construction of broadcast channel code based on hash property,” in Proceedings of the 2010 IEEE International Symposium on Information Theory, pp. 575-579, 2010.
  6. K. Ishizuka, S. Araki, and T. Kawahara, “Speech activity detection for multi-party conversation analyses based on likelihood ratio test on spatial magnitude,” IEEE Transactions on Audio, Speech, and Language Processing (in press).
  7. K. Ishizuka, T. Nakatani, M. Fujimoto, and N. Miyazaki, “Noise robust voice activity detection based on periodic to aperiodic component ratio,” Speech Communication, Vol.52, No.1, pp. 41-60, 2010.
  8. S. Araki, H. Sawada, and S. Makino, “Blind Speech Separation in a Meeting Situation with Maximum SNR Beamformers,” IEEE Trans. Audio, Speech, and Language Processing, (submitted)
  9. S. Watanabe and A. Nakamura, “Predictor-Corrector Adaptation based on a Macroscopic Time Evolution System,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, issue 2, pp. 395-406, 2010.

Book Chapter, Tutorial Papers

  1. T. Yoshioka, T. Nakatani, K. Kinoshita, and M. Miyoshi, “Speech dereverberation and denoising based on time varying speech model and autoregressive reverberation model,” to appear in Speech Processing in Modern Communication: Challenges and Perspectives, Israel Cohen, Jacob Benesty, and Sharon Gannot (eds.), Springer, pp. 151-182, February 2010.
  2. M. Fujimoto, K. Takeda, and S. Nakamura, “Chapter 4.4.2: An evaluation database for in-car speech recognition and its common evaluation framework,” in "Resources and Standards of Spoken Language Systems - Advances in Oriental Spoken Language Processing," World Scientific Publishing Co., March 2010.
  3. M. Miyoshi, M. Delcroix, K. Kinoshita, T. Yoshioka, T. Nakatani, and T. Hikichi, “Inverse-filtering for speech dereverberation without the use of room acoustics information,” to appear in Speech Dereverberation, Patrik A. Naylor and Nikolay Gaubitch (eds.), Springer.
  4. M. Fujimoto, “Chapter 1: Integration of statistical model-based voice activity detection and noise suppression for noise robust speech recognition,” in "Advances in Robust Speech Recognition Technology," Bentham Publishing Services. (in press)

Peer-reviewed Conference Papers

  1. T. Yoshioka, T. Nakatani, and H. G. Okuno, “Noisy speech enhancement based on prior knowledge about spectral envelope and harmonic structure,” in Proceedings of the 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), pp. 4270-4273, March 2010.
  2. N. Yasuraoka, T. Yoshioka, T. Nakatani, A. Nakamura, and H. G. Okuno, “Music dereverberation using harmonic structure source model and Wiener filtering,” in Proceedings of the 2010 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2010), pp. 53-56, March 2010.
  3. T. Hori, S. Watanabe, and A. Nakamura, “Search Error Risk Minimization in Viterbi Beam Search for Speech Recognition,” in Proc. ICASSP 2010, pp. 4934-4937, March 2010.
  4. T. Oba, T. Hori, and A. Nakamura, “A Comparative Study on Methods of Weighted Language Model Training for Reranking LVCSR N-best Hypotheses,” in Proc. ICASSP 2010, pp. 5126-5129, March 2010.
  5. S. Watanabe, T. Hori, E. McDermott, and A. Nakamura, “A Discriminative Model for Continuous Speech Recognition Based on Weighted Finite State Transducers,” in Proc. ICASSP 2010, pp. 4922-4925, March 2010.
  6. A. Ogawa and A. Nakamura, “Discriminative confidence and error cause estimation for extended speech recognition function,” Proc. ICASSP 2010, pp. 4454-4457, March 2010.
  7. A. Ogawa and A. Nakamura, “A novel confidence measure based on marginalization of jointly estimated error cause probabilities,” Proc. Interspeech 2010, September 2010.
  8. J. Muramatsu, K. Yoshimura, and P. Davis, “Information theoretic security based on bounded observability,” Proceedings of the 4th International Conference on Information Theoretic Security, Lecture Notes in Computer Science (LNCS), vol. 5973, pp. 128-139, Springer (in press).
  9. D. Cournapeau, S. Watanabe, A. Nakamura, and T. Kawahara, “Using Online Model Comparison in the Variational Bayes Framework for Online Unsupervised Voice Activity Detection,” ICASSP 2010, pp. 4462-4465, 2010.
  10. E. McDermott, S. Watanabe, and A. Nakamura, “Discriminative Training Based on an Integrated View of MPE and MMI in Margin and Error Space,” ICASSP 2010, pp. 4894-4897, 2010.
  11. H. Watanabe, S. Katagiri, K. Yamada, E. McDermott, A. Nakamura, S. Watanabe, and M. Ohsaki, “Minimum Error Classification with Geometric Margin Control,” ICASSP 2010, pp. 2170-2173, 2010.
  12. K. Aoyama, S. Watanabe, H. Sawada, Y. Minami, N. Ueda, and K. Saito, “Fast Similarity Search on a Large Speech Data Set with Neighborhood Graph Indexing,” ICASSP 2010, pp. 5358-5361, 2010.
  13. S. Araki, T. Nakatani, and H. Sawada, “Simultaneous clustering of mixing and spectral model parameters for blind sparse source separation,” ICASSP 2010, 2010.
  14. T. Nakatani and S. Araki, “Single channel source separation based on sparse source observation model with harmonic constraint,” ICASSP 2010, 2010.
  15. Y. Ansai, S. Araki, S. Makino, T. Nakatani, T. Yamada, A. Nakamura, and N. Kitawaki, “Cepstral Smoothing of Separated Signals for Underdetermined Speech Separation,” ISCAS 2010 (to appear).

2009

Journal Papers

  1. T. Yoshioka, T. Nakatani, and M. Miyoshi, “Integrated speech enhancement method using noise suppression and dereverberation,” IEEE Transactions on Audio, Speech and Language Processing, vol. 17, no. 2, pp. 231-246, February 2009.
  2. S. Miyake and J. Muramatsu, “A Construction of Channel Code, Joint Source-Channel Code, and Universal Code for Arbitrary Stationary Memoryless Channels using Sparse Matrices,” IEICE Transactions on Fundamentals, vol. E92-A, no. 9, pp. 2333-2344, September 2009.
  3. H. K. Solvang, Y. Nagahara, S. Araki, H. Sawada, and S. Makino, “Frequency-Domain Pearson Distribution Approach for Independent Component Analysis (FD-Pearson-ICA) in Blind Source Separation,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 4, pp. 639-649, 2009.
  4. K. Kinoshita, M. Delcroix, T. Nakatani, and M. Miyoshi, “Suppression of late reverberation effect on speech signal using long-term multiple-step linear prediction,” IEEE Transactions on Audio, Speech, and Language Processing, 2009.
  5. M. Delcroix, T. Nakatani, and S. Watanabe, “Static and dynamic variance compensation for recognition of reverberant speech with dereverberation pre-processing,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, issue 2, pp. 324-334, 2009.
  6. S. Araki, H. Sawada, R. Mukai, and S. Makino, “DOA estimation for multiple sparse sources with arbitrarily arranged multiple sensors,” Journal of Signal Processing Systems, doi:10.1007/s11265-009-0413-9, 2009.

Book Chapter, Tutorial Papers

  1. T. Hori, K. Sudoh, H. Tsukada, and A. Nakamura, “World-Wide Media Browser--Multilingual Audio-visual Content Retrieval and Browsing System,” NTT Technical Review, Vol. 7, No. 2, February 2009.
  2. S. Makino, S. Araki, S. Winter, H. Sawada, “Underdetermined Blind Source Separation using Acoustic Arrays,” Handbook on Array Processing and Sensor Networks, S. Haykin, and K. J. R. Liu Eds., Wiley, 2009 (in press).

Peer-reviewed Conference Papers

  1. T. Yoshioka, H. Tachibana, T. Nakatani, and M. Miyoshi, “Adaptive dereverberation of speech signals with speaker-position change detection,” in Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), pp. 3733-3736, April 2009.
  2. H. Kameoka, T. Nakatani, and T. Yoshioka, “Robust speech dereverberation based on non-negativity and sparse nature of speech spectrograms,” in Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), pp. 45-48, April 2009.
  3. T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B.-H. Juang, “Real-time speech enhancement in noisy reverberant multi-talker environments based on a location-independent room acoustics model,” in Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), pp. 137-140, April 2009.
  4. A. Ogawa, S. Takahashi, and A. Nakamura, “Efficient combination of likelihood recycling and batch calculation based on conditional fast processing and acoustic back-off,” Proc. ICASSP 2009, pp. 4164-4164, April 2009.
  5. T. Yoshioka, T. Nakatani, and M. Miyoshi, “Fast algorithm for conditional separation and dereverberation,” in Proceedings of the 17th European Signal Processing Conference (EUSIPCO 2009), CD-ROM Proceedings, August 2009.
  6. T. Yoshioka, H. Kameoka, T. Nakatani, and H. G. Okuno, “Statistical models for speech dereverberation,” in Proceedings of the 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2009), pp. 145-148, October 2009.
  7. A. Nakamura, E. McDermott, S. Watanabe, and S. Katagiri, “A unified view for discriminative objective functions based on negative exponential of difference measure between strings,” Proc. ICASSP 2009, pp. 1633-1636, 2009.
  8. E. McDermott, S. Watanabe, and A. Nakamura, “Margin-Space Integration of MPE Loss via Differencing of MMI Functionals for Generalized Error-Weighted Discriminative Training,” Proc. Interspeech 2009, pp. 224-227, 2009.
  9. J. Muramatsu and S. Miyake, “Coding theorem for general stationary memoryless channel based on hash property,” Proceedings of the 2009 IEEE International Symposium on Information Theory, Seoul, Korea, pp. 541-545, 2009.
  10. J. Muramatsu and S. Miyake, “Construction of wiretap channel codes by using sparse matrices,” Proceedings of the 2009 IEEE Information Theory Workshop, Taormina, Italy, pp. 105-109, 2009.
  11. K. Ishiguro, T. Yamada, S. Araki, and T. Nakatani, “A probabilistic speaker clustering for DOA-based diarization,” WASPAA 2009, 2009.
  12. K. Ishizuka, S. Araki, K. Otsuka, T. Nakatani, and M. Fujimoto, “A speaker diarization method based on the probabilistic fusion of audio-visual location information,” Proceedings of the 11th International Conference on Multimodal Interfaces and Workshop on Machine Learning for Multi-modal Interaction (ICMI-MLMI 2009), pp. 55-62, 2009.
  13. K. Otsuka, S. Araki, D. Mikami, K. Ishizuka, M. Fujimoto, and J. Yamato, “Realtime meeting analysis and 3D meeting viewer based on omnidirectional multimodal sensors,” Proceedings of the 11th International Conference on Multimodal Interfaces and Workshop on Machine Learning for Multi-modal Interaction (ICMI-MLMI 2009), pp. 219-220, 2009.
  14. M. Fujimoto, K. Ishizuka, and T. Nakatani, “A study of mutual front-end processing method based on statistical model for noise robust speech recognition,” Proc. Interspeech 2009, pp. 1235-1238, September 2009.
  15. A. Ogawa and A. Nakamura, “Simultaneous estimation of confidence and error cause in speech recognition using discriminative model,” Proc. Interspeech 2009, pp. 1199-1202, September 2009.
  16. S. Kobashikawa, A. Ogawa, Y. Yamaguchi, and S. Takahashi, “Rapid unsupervised adaptation using frame independent output probabilities of gender and context independent phoneme models,” Proc. Interspeech 2009, pp. 1615-1618, September 2009.
  17. R. Mugitani, K. Ishizuka, T. Kondo, and S. Amano, “Acquisition of durational control of vocalic and consonantal intervals in speech production,” The 34th Boston University Conference on Language Development (BUCLD 34), 2009.
  18. S. Araki, T. Nakatani, H. Sawada, and S. Makino, “Blind sparse source separation for unknown number of sources using Gaussian mixture model fitting with Dirichlet prior,” ICASSP 2009, pp. 33-36, 2009.
  19. S. Araki, T. Nakatani, H. Sawada, and S. Makino, “Stereo source separation and source counting with MAP estimation with Dirichlet prior considering spatial aliasing problem,” ICA 2009, pp. 742-750, 2009.
  20. S. Watanabe and A. Nakamura, “Speech recognition with incremental tracking and detection of changing environments based on a macroscopic time evolution system,” Proc. ICASSP 2009, pp. 4373-4376, 2009.
  21. T. Iwata, S. Watanabe, T. Yamada, and N. Ueda, “Topic tracking model for analyzing consumer purchase behavior,” IJCAI 2009, pp. 1427-1432, 2009.
  22. Y. Izumi, K. Nishiki, S. Watanabe, T. Nishimoto, N. Ono, and S. Sagayama, “Stereo-input Speech Recognition using Sparseness-based Time-frequency Masking in a Reverberant Environment,” Proc. Interspeech 2009, pp. 1955-1958, 2009.
  23. S. Kobashikawa, A. Ogawa, Y. Yamaguchi, and S. Takahashi, “Rapid unsupervised adaptation using context independent phoneme model,” The 13th IEEE International Symposium on Consumer Electronics (ISCE '09), 2009.

Other Conference Papers

  1. K. Kinoshita, T. Nakatani, M. Miyoshi, and T. Kubota, “Blind upmix of stereo music signal using multi-step linear prediction based reverberation extraction,” International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 49-52, 2009.

2008

Journal Papers

  1. J. Muramatsu, “Effect of random permutation of symbols in a sequence,” IEEE Transactions on Information Theory, vol. IT-54, no. 1, pp. 78-86, January 2008.
  2. J. Muramatsu, K. Yoshimura, K. Arai, and P. Davis, “Some results on secret key agreement using correlated sources,” NTT Technical Review, vol. 6, no. 2, February 2008.
  3. M. Fujimoto and K. Ishizuka, “Noise Robust Voice Activity Detection Based on Switching Kalman Filter,” IEICE Transactions on Information and Systems, Vol. E91-D, No. 3, pp. 467-477, March 2008.
  4. S. Miyake and J. Muramatsu, “A construction of lossy source code using LDPC matrices,” IEICE Transactions on Fundamentals, vol. E91-A, no. 6, pp. 1488-1501, June 2008.
  5. T. Oba, T. Hori, and A. Nakamura, “Sequential Dependency Analysis for Online Spontaneous Speech Processing,” Speech Communication, Volume 50, Issue 7, pp. 616-625, July 2008.
  6. T. Nakatani, B.-H. Juang, T. Yoshioka, K. Kinoshita, M. Delcroix, and M. Miyoshi, “Speech dereverberation based on maximum likelihood estimation with time-varying Gaussian source model,” IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 8, pp. 1512-1527, November 2008.
  7. K. Yoshimura, J. Muramatsu, and P. Davis, “Conditions for common-noise-induced synchronization in time-delay systems,” Physica D, vol. 237, no. 23, pp. 3146-3152, December 2008.
  8. H. K. Solvang, K. Ishizuka, and M. Fujimoto, “Voice activity detection based on adjustable linear prediction and GARCH models,” Speech Communication, Vol. 50, No. 6, pp. 476-486, 2008.
  9. T. Nakatani, S. Amano, T. Irino, K. Ishizuka, and T. Kondo, “A method for fundamental frequency estimation and voicing decision: Application to infant utterances recorded in real acoustical environments,” Speech Communication, Vol. 50, No. 3, pp. 203-214, 2008.

Book Chapter, Tutorial Papers

  1. S. Makino, S. Araki, and H. Sawada, “Underdetermined Blind Source Separation using Acoustic Arrays,” in Handbook on Array Processing and Sensor Networks, S. Haykin and K.J. Ray Liu, Eds, Wiley, 2008.

Peer-reviewed Conference Papers

  1. T. Yoshioka, T. Nakatani, T. Hikichi, and M. Miyoshi, “Maximum likelihood approach to speech enhancement for noisy reverberant signals,” in Proceedings of the 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008), pp. 4585-4588, March 2008.
  2. T. Yoshioka and M. Miyoshi, “Adaptive suppression of non-stationary noise by using variational Bayesian method,” in Proceedings of the 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008), pp. 4889-4892, March 2008.
  3. T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B.-H. Juang, “Blind speech dereverberation with multi-channel linear prediction based on short time Fourier transform representation,” in Proceedings of the 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2008), pp. 85-88, March 2008.
  4. M. Fujimoto, K. Ishizuka, and T. Nakatani, “A voice activity detection based on the adaptive integration of multiple speech features and a signal decision scheme,” Proceedings of the 33rd International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), pp. 4441-4444, March 2008.
  5. T. Oba, T. Hori, and A. Nakamura, “Efficient Discriminative Training of Error Corrective Models Using High-WER Competitors,” Asian Workshop on Speech Science and Technology, IEICE Technical Report SP2007-185-214, pp. 99-104, March 2008.
  6. A. Ogawa and S. Takahashi, “Weighted distance measures for efficient reduction of Gaussian mixture components in HMM-based acoustic model,” Proc. ICASSP 2008, pp. 4173-4176, March 2008.
  7. T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B.-H. Juang, “Speech dereverberation in short time Fourier transform domain with cross band effect compensation,” in Proceedings of the 2008 Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2008), pp. 220-223, May 2008.
  8. T. Yoshioka, T. Nakatani, and M. Miyoshi, “An integrated method for blind separation and dereverberation of convolutive audio mixtures,” in Proceedings of the 16th European Signal Processing Conference (EUSIPCO 2008), CD-ROM Proceedings, August 2008.
  9. T. Yoshioka, T. Nakatani, and M. Miyoshi, “Enhancement of noisy reverberant speech by linear filtering followed by nonlinear noise suppression,” in Proceedings of the 2008 International Workshop on Acoustic Echo and Noise Control (IWAENC 2008), CD-ROM Proceedings, September 2008.
  10. T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B.-H. Juang, “Incremental estimation of reverberation with uncertainty using prior knowledge of room acoustics for speech dereverberation,” in Proceedings of the 2008 International Workshop on Acoustic Echo and Noise Control (IWAENC 2008), CD-ROM Proceedings, September 2008.
  11. M. Fujimoto, K. Ishizuka, and T. Nakatani, “Study of integration of statistical model-based voice activity detection and noise suppression,” Proceedings of the 10th International Conference on Spoken Language Processing (Interspeech 2008 - ICSLP), pp. 2008-2011, September 2008.
  12. M. Miyoshi, K. Kinoshita, T. Nakatani, and T. Yoshioka, “Principles and applications of dereverberation for noisy and reverberant audio signals,” in Proceedings of the 2008 Asilomar Conference on Signals, Systems, and Computers, CD-ROM Proceedings, October 2008.
  13. S. Miyake and J. Muramatsu, “A construction of channel code, joint source-channel code, and universal code for arbitrary stationary memoryless channels using sparse matrices,” Proceedings of the 2008 IEEE International Symposium on Information Theory, Toronto, Canada, pp. 1193-1197, 2008.
  14. D. Kolossa (TU Berlin), S. Araki, M. Delcroix, T. Nakatani, R. Orglmeister (TU Berlin), and S. Makino, “Missing Feature Speech Recognition in a Meeting Situation with Maximum SNR Beamforming,” ISCAS 2008, pp. 3218-3221, 2008.
  15. J. Muramatsu and S. Miyake, “Hash property and multi-terminal source coding theorems for sparse matrices and maximal-likelihood coding,” Proceedings of the 2008 IEEE International Symposium on Information Theory, Toronto, Canada, pp. 424-428, 2008.
  16. J. Muramatsu and S. Miyake, “Lossy source coding algorithm using lossless multi-terminal source codes,” Proceedings of the 2008 International Symposium on Information Theory and its Applications, Auckland, New Zealand, pp. 606-611, 2008.
  17. K. Ishizuka, S. Araki, and T. Kawahara, “Statistical speech activity detection based on spatial power distribution for analyses of poster presentations,” Proceedings of the 10th International Conference on Spoken Language Processing (Interspeech 2008 - ICSLP), pp. 99-102, 2008.
  18. K. Otsuka, S. Araki, K. Ishizuka, M. Fujimoto, M. Heinrich, and J. Yamato, “A realtime multimodal system for analyzing group meetings by combining face pose tracking and speaker diarization,” Proceedings of the 10th International Conference on Multimodal Interfaces (ICMI 2008), pp. 257-264, 2008.
  19. M. Delcroix, T. Nakatani, and S. Watanabe, “Combined static and dynamic variance adaptation for efficient interconnection of speech enhancement pre-processor with speech recognizer,” Proc. ICASSP 2008, pp. 4073-4076, 2008.
  20. S. Araki, M. Fujimoto, K. Ishizuka, H. Sawada, and S. Makino, “A DOA based speaker diarization system for real meetings,” Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA 2008), pp. 29-32, 2008.
  21. S. Araki, M. Fujimoto, K. Ishizuka, H. Sawada, and S. Makino, “Speaker indexing and speech enhancement in real meetings / conversations,” Proceedings of the 33rd International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008), pp. 93-96, 2008.
  22. S. Watanabe and A. Nakamura, “A unified interpretation of adaptation techniques based on a macroscopic time evolution system with indirect/direct approaches,” Proc. ICASSP 2008, pp. 4285-4286, 2008.
  23. T. Hager, S. Araki, K. Ishizuka, M. Fujimoto, T. Nakatani, and S. Makino, “Handling speaker position changes in a meeting diarization system by combining DOA clustering and speaker identification,” Proceedings of the 11th International Workshop on Acoustic Echo and Noise Control (IWAENC 2008), CD-ROM Proceedings, 2008.
  24. T. Kawahara, H. Setoguchi, K. Takanashi, K. Ishizuka, and S. Araki, “Multi-modal recording, analysis and indexing of poster sessions,” Proceedings of the 10th International Conference on Spoken Language Processing (Interspeech 2008 - ICSLP), pp. 1622-1625, 2008.

Other Conference Papers

  1. K. Kinoshita, T. Nakatani, M. Miyoshi and T. Kubota, “A new audio post-production tool for speech dereverberation,” Audio Engineering Society (AES) 125th Convention, San Francisco, 2008.

2007

Journal Papers

  1. S. Araki, H. Sawada, R. Mukai, and S. Makino, “Underdetermined Blind Sparse Source Separation for Arbitrarily Arranged Multiple Sensors,” Signal Processing, vol. 87, pp. 1833-1847, February 2007. doi:10.1016/j.sigpro.2007.02.003.
  2. M. Knaak (Technical University Berlin), S. Araki, and S. Makino, “Geometrically Constrained Independent Component Analysis,” IEEE Trans. Audio, Speech and Language Processing, vol. 15, no. 2, pp. 715-726, February 2007.
  3. T. Yamamoto, I. Oowada, H. Yip, A. Uchida, S. Yoshimori, K. Yoshimura, J. Muramatsu, S. Goto, and P. Davis, “Common-chaotic-signal induced synchronization in semiconductor lasers,” Opt. Express, vol. 15, no. 7, pp. 3974-3980, April 2007.
  4. H. K. Solvang, K. Ishizuka, and M. Fujimoto, “A voice activity detection based on an AR-GARCH model,” IEICE Transactions on Information and Systems, Vol. J90-D, No. 12, pp. 3210-3220, 2007 (in Japanese).
  5. H. Sawada, S. Araki, R. Mukai, and S. Makino, “Grouping Separated Frequency Components with Estimating Propagation Model Parameters in Frequency-Domain Blind Source Separation,” IEEE Trans. Audio, Speech & Language Processing, vol. 15, no. 5, pp. 1592-1604, July 2007.
  6. K. Ishizuka, R. Mugitani, H. Kato, and S. Amano, “Longitudinal developmental changes in spectral peaks of vowels produced by Japanese infants,” The Journal of the Acoustical Society of America, Vol. 121, No. 11, pp. 2272-2282, 2007.
  7. K. Kinoshita, T. Nakatani, and M. Miyoshi, “Fast estimation of a precise dereverberation filter based on the harmonic structure of speech,” Acoustical Science and Technology (AST), 2007.
  8. T. Yoshioka, T. Hikichi, and M. Miyoshi, “Dereverberation by using time-variant nature of speech production system,” EURASIP Journal on Advances in Signal Processing, vol. 2007, article ID 65698, doi:10.1155/2007/65698, 2007.
  9. T. Hori, C. Hori, Y. Minami, and A. Nakamura, “Efficient WFST-based one-pass decoding with on-the-fly hypothesis rescoring in extremely large vocabulary continuous speech recognition,” IEEE Trans., Audio, Speech and Language Processing, Vol. 15, pp. 1352-1365, 2007.

Book Chapter, Tutorial Papers

  1. H. Sawada, S. Araki, and S. Makino, “Frequency-Domain Blind Source Separation,” in Blind Speech Separation, S. Makino T.-W. Lee and H. Sawada, Eds., Springer, 2007.
  2. S. Araki, H. Sawada and S. Makino, “K-means based Underdetermined Blind Speech Separation,” in Blind Speech Separation, S. Makino T.-W. Lee and H. Sawada, Eds., Springer, 2007.

Peer-reviewed Conference Papers

  1. T. Nakatani, B.-H. Juang, T. Hikichi, T. Yoshioka, K. Kinoshita, M. Delcroix, and M. Miyoshi, “Study on speech dereverberation with autocorrelation codebook,” in Proceedings of the 2007 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), vol. 1, pp. 193-196, April 2007.
  2. M. Fujimoto, K. Ishizuka, and H. Kato, “Noise robust voice activity detection based on statistical model and parallel non-linear Kalman filtering,” Proceedings of the 32nd International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), Vol. 4, pp. 797-800, April 2007.
  3. S. Araki, H. Sawada, and S. Makino, “Blind speech separation in a meeting situation with maximum SNR beamformers,” ICASSP 2007, vol. 1, pp. 41-44, April 2007.
  4. J. Cermak, S. Araki, H. Sawada, and S. Makino, “Blind Source Separation Based on Beamformer Array and Time Frequency Binary Masking,” in Proc. ICASSP 2007, vol. 1, pp. 145-148, April 2007.
  5. J. E. Rubio, K. Ishizuka, H. Sawada, S. Araki, T. Nakatani, and M. Fujimoto, “Two-microphone voice activity detection based on the homogeneity of the direction of arrival estimates,” Proceedings of the 32nd International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2007), Vol. 4, pp. 385-388, April 2007.
  6. T. Nakatani, T. Hikichi, K. Kinoshita, T. Yoshioka, M. Delcroix, M. Miyoshi, and B.-H. Juang, “Robust blind dereverberation of speech signals based on characteristics of short-time speech segments,” in Proceedings of the 2007 IEEE International Symposium on Circuits and Systems (ISCAS 2007), pp. 2986-2989, May 2007.
  7. H. Sawada, S. Araki, and S. Makino, “Measuring Dependence of Bin-wise Separated Signals for Permutation Alignment in Frequency-domain BSS,” in Proc. ISCAS 2007, pp. 3247-3250, May 2007.
  8. M. Fujimoto and K. Ishizuka, “Noise robust voice activity detection based on switching Kalman filter,” Proceedings of the 10th European Conference on Speech Communication and Technology (Interspeech 2007 - Eurospeech), pp. 2933-2936, August 2007.
  9. T. Oba, T. Hori, and A. Nakamura, “A Study of Efficient Discriminative Word Sequences for Reranking of Recognition Results based on N-gram Counts,” Interspeech 2007, pp. 1753-1756, August 2007.
  10. T. Yoshioka, T. Nakatani, T. Hikichi, and M. Miyoshi, “Overfitting-resistant speech dereverberation,” in Proceedings of the 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2007), pp. 163-166, October 2007.
  11. T. Nakatani, B.-H. Juang, T. Yoshioka, K. Kinoshita, and M. Miyoshi, “Importance of energy and spectral features in Gaussian source model for speech dereverberation,” in Proceedings of the 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2007), pp. 299-302, October 2007.
  12. Y. Minami, M. Sawaki, K. Dohsaka, R. Higashinaka, K. Ishizuka, H. Isozaki, T. Matsubayashi, M. Miyoshi, A. Nakamura, T. Oba, H. Sawada, T. Yamada, and E. Maeda, “The world of Mushrooms: Human-computer interaction prototype systems for ambient intelligence,” Proceedings of the 9th International Conference on Multimodal Interfaces (ICMI 2007), pp. 366-373, 2007.
  13. I. Oowada, Y. Yamamoto, H. Yip, H. Arizumi, A. Uchida, S. Yoshimori, K. Yoshimura, J. Muramatsu, S. Goto, and P. Davis, “Synchronization in semiconductor lasers subject to a common chaotic drive signal,” Proceedings of the 15th IEEE International Workshop on Nonlinear Dynamics of Electronic Systems, Tokushima, Japan, pp. 149-152, 2007.
  14. H. Sawada, S. Araki, and S. Makino, “A two-stage frequency-domain blind source separation method for underdetermined convolutive mixtures,” WASPAA 2007, 2007.
  15. H. Sawada, S. Araki, and S. Makino, “MLSP 2007 data analysis competition: Frequency-domain blind source separation for convolutive mixtures of speech and audio,” MLSP 2007, 2007.
  16. J. Muramatsu, “Effect of random permutation of symbols in a sequence,” Proceedings of the 2007 IEEE International Symposium on Information Theory, Nice, France, pp. 1486-1490, 2007.
  17. K. Ishizuka, T. Nakatani, M. Fujimoto, and N. Miyazaki, “Noise robust front-end processing with voice activity detection based on periodic to aperiodic component ratio,” Proceedings of the 10th European Conference on Speech Communication and Technology (Interspeech 2007 - Eurospeech), pp. 230-233, 2007.
  18. R. Mugitani, T. Kobayashi, and K. Ishizuka, “Perceptual development of phonemic categories for Japanese single/geminate obstruents,” The 32nd Boston University Conference on Language Development (BUCLD 32), 2007.
  19. S. Miyake and J. Muramatsu, “Constructions of a lossy source code using LDPC matrices,” Proceedings of the 2007 IEEE International Symposium on Information Theory, Nice, France, pp. 1106-1110, 2007.
  20. S. Watanabe and A. Nakamura, “Incremental adaptation based on a macroscopic time evolution system,” Proc. ICASSP 2007, vol. 4, pp. 769-772, 2007.

Other Conference Papers

  1. K. Kinoshita, M. Delcroix, T. Nakatani, and M. Miyoshi, “Dereverberation of real recordings using linear prediction-based microphone array,” Audio Engineering Society (AES) 13th Regional Convention, Tokyo, 2007.
  2. K. Kinoshita, M. Delcroix, T. Nakatani, and M. Miyoshi, “Multi-step linear prediction based speech enhancement in noisy reverberant environment,” Proc. of Interspeech 2007, pp. 854-857, 2007.

2006

Journal Papers

  1. T. Yoshioka, T. Hikichi, M. Miyoshi, and H. G. Okuno, “Common acoustical pole estimation from multi-channel musical audio signals,” IEICE Transactions on Fundamentals, vol. E89-A, no. 1, pp. 240-247, January 2006.
  2. J. Muramatsu, “Secret key agreement from correlated source outputs using low density parity check matrices,” IEICE Transactions on Fundamentals, vol.E89-A, no.7, pp.2036-2046, July 2006.
  3. J. Muramatsu, K. Yoshimura, and P. Davis, “Secret key capacity and advantage distillation capacity,” IEICE Transactions on Fundamentals, vol.E89-A, no.10, pp.2589-2596, October 2006.
  4. J. Muramatsu, K. Yoshimura, K. Arai, and P. Davis, “Secret key capacity for optimally correlated sources under sampling attack,” IEEE Transactions on Information Theory, vol.IT-52, no.11, pp.5140-5151, November 2006.
  5. H. Sawada, S. Araki, R. Mukai, S. Makino, “Blind extraction of dominant target sources using ICA and time-frequency masking,” IEEE Trans. Audio, Speech, and Language Processing, vol.14, no.6, pp.2165-2173, November 2006.
  6. K. Ishizuka and T. Nakatani, “A feature extraction method using subband based periodicity and aperiodicity decomposition with noise robust frontend processing for automatic speech recognition,” Speech Communication, Vol.48, No.11, pp.1447-1457, 2006.
  7. K. Ishizuka, T. Nakatani, Y. Minami, and N. Miyazaki, “Speech feature extraction method using subband-based periodicity and nonperiodicity decomposition,” The Journal of the Acoustical Society of America, Vol.120, No.1, pp.443-452, 2006.
  8. R. Mugitani, T. Kobayashi, K. Ishizuka, S. Amano, and K. Hiraki, “Audiovisual matching in lips and voice on vowel /i/ by Japanese infants,” The Journal of the Phonetic Society of Japan, Vol.10, No.1, pp.96-108, 2006 (in Japanese).
  9. R. Mukai, H. Sawada, S. Araki, S. Makino, “Frequency Domain Blind Source Separation of Many Speech Signals Using Near-field and Far-field Models,” EURASIP Journal on Applied Signal Processing, vol. 2006, Article ID 83683, 13 pages, 2006. doi:10.1155/ASP/2006/83683.
  10. S. Watanabe and A. Nakamura, “Speech recognition based on Student's t-distribution derived from total Bayesian framework,” IEICE Transactions on Information and Systems, vol. E89-D, no. 3, pp. 970-980, 2006.
  11. S. Watanabe, A. Sako, and A. Nakamura, “Automatic Determination of Acoustic Model Topology using Variational Bayesian Estimation and Clustering for large vocabulary continuous speech recognition,” IEEE Transactions on Speech and Audio Processing, vol. 14, issue 3, pp. 855-872, 2006.

Book Chapter, Tutorial Papers

  1. A. Nakamura, S. Watanabe, T. Hori, E. McDermott, and S. Katagiri, “Advanced Computational Models and Learning Theories for Spoken Language Processing,” IEEE Computational Intelligence Magazine, vol. 1, issue 2, pp. 5-9, 2006.
  2. S. Makino, H. Sawada, R. Mukai, and S. Araki, “Blind source separation of convolutive mixtures of audio signals in frequency domain,” in Topics in Acoustic Echo and Noise Control, E. Haensler and G. Schmidt, Eds., Springer, 2006.

Peer-reviewed Conference Papers

  1. T. Yoshioka, T. Hikichi, and M. Miyoshi, “Second-order statistics based dereverberation by using nonstationarity of speech,” in Proceedings of the 2006 International Workshop on Acoustic Echo and Noise Control (IWAENC 2006), CD-ROM Proceedings, September 2006.
  2. T. Yoshioka, T. Hikichi, M. Miyoshi, and H. G. Okuno, “Robust decomposition of inverse filter of channel and prediction error filter of speech signal for dereverberation,” in Proceedings of the 2006 European Signal Processing Conference (EUSIPCO 2006), CD-ROM Proceedings, September 2006.
  3. T. Oba, T. Hori, and A. Nakamura, “Sentence Boundary Detection Using Sequential Dependency Analysis Combined with CRF-based Chunking,” ICSLP2006, pp. 284-289, September 2006.
  4. H. Sawada, S. Araki, R. Mukai and S. Makino, “Blind separation and localization of speeches in a meeting situation,” Asilomar 2006, pp. 1407-1411, October 2006.
  5. R. Mukai, H. Sawada, S. Araki and S. Makino, “Frequency Domain Blind Source Separation in a Noisy Environment,” Joint meeting of ASA and ASJ 2006, November 2006 (invited).
  6. H. Kato, Y. Nagahara, S. Araki, H. Sawada and S. Makino, “Parametric Pearson Approach based Independent Component Analysis for Frequency Domain Blind Speech Separation,” EUSIPCO2006, 2006.
  7. H. Sawada, S. Araki, R. Mukai and S. Makino, “On Calculating the Inverse of Separation Matrix in Frequency-Domain BSS,” ICA2006, pp. 691-699, 2006.
  8. H. Sawada, S. Araki, R. Mukai and S. Makino, “Solving the permutation problem of frequency-domain BSS when spatial aliasing occurs with wide sensor spacing,” ICASSP2006, vol. 5, pp. 77-80, 2006.
  9. J. Cermak, S. Araki, H. Sawada and S. Makino, “Blind Speech Separation by Combining Beamformers and a Time Frequency Binary Mask,” IWAENC2006, 2006.
  10. J. Cermak, S. Araki, H. Sawada and S. Makino, “Musical Noise Reduction in Time-frequency-binary-masking-based Blind Source Separation Systems,” 16th Czech-German Workshop, 2006.
  11. J. Muramatsu, K. Yoshimura, and P. Davis, “Secret key capacity and advantage distillation capacity,” Proceedings of the 2006 IEEE International Symposium on Information Theory, pp.2147-2151, 2006.
  12. J. Muramatsu, K. Yoshimura, K. Arai, and P. Davis, “Some results on secret key agreement from correlated sources,” Proceedings of the 5th Asian-European Workshop on Information Theory, Jeju, Korea, pp.10-13, 2006.
  13. K. Ishizuka and H. Kato, “A feature for voice activity detection derived from speech analysis with the exponential autoregressive model,” Proceedings of the 31st International Conference on Acoustics, Speech, and Signal Processing (ICASSP2006), Vol.1, pp.789-792, 2006.
  14. K. Ishizuka and T. Nakatani, “Study of noise robust voice activity detection based on periodic component to aperiodic component ratio,” Proceedings of ISCA Tutorial and Research Workshop on Statistical And Perceptual Audition (SAPA2006), pp.65-70, 2006.
  15. K. Yoshimura, J. Muramatsu, and P. Davis, “Conditions for consistency in time-delay systems,” Proceedings of the International Workshop on Synchronization Phenomena and Analyses, p.135, 2006.
  16. K. Yoshimura, J. Muramatsu, and P. Davis, “Consistency in time-delay systems with periodic feedback functions,” Proceedings of the 2006 International Symposium on Nonlinear Theory and its Applications, pp.287-290, 2006.
  17. R. Mukai, H. Sawada, S. Araki, S. Makino, “Blind Source Separation of Many Signals in the Frequency Domain,” ICASSP2006, vol.5, pp.969-972, 2006.
  18. S. Araki, H. Sawada, R. Mukai and S. Makino, “Blind sparse source separation with spatially smoothed time-frequency masking,” IWAENC2006, 2006.
  19. S. Araki, H. Sawada, R. Mukai and S. Makino, “Performance evaluation of sparse source separation and DOA estimation with observation vector clustering in reverberant environments,” IWAENC2006, 2006.
  20. S. Araki, H. Sawada, R. Mukai and S. Makino, “DOA estimation for multiple sparse sources with normalized observation vector clustering,” ICASSP2006, vol. 5, pp. 33-36, 2006.
  21. S. Araki, H. Sawada, R. Mukai and S. Makino, “Normalized Observation Vector Clustering Approach for Sparse Source Separation,” EUSIPCO2006, 2006 (invited).
  22. S. Araki, H. Sawada, R. Mukai and S. Makino, “Underdetermined Sparse Source Separation of Convolutive Mixtures with Observation Vector Clustering,” ISCAS2006, pp. 3594-3597, 2006.
  23. S. Mizutani, J. Muramatsu, K. Arai, and P. Davis, “Noise-assisted quantization,” Proceedings of the 2006 International Symposium on Nonlinear Theory and its Applications, pp.843-846, 2006.
  24. S. Watanabe and A. Nakamura, “Acoustic model adaptation based on coarse/fine training of transfer vector using directional statistics,” Proc. ICASSP 2006 , vol. 1, pp. 1005-1008, 2006.
  25. T. Hori and A. Nakamura, “An extremely large vocabulary approach to named entity extraction from speech,” in Proc. ICASSP2006, Vol. 1, pp. 973-976, 2006.
  26. T. Hori, I. L. Hetherington, T. J. Hazen, and J. R. Glass, “Open-vocabulary spoken utterance retrieval using confusion networks,” in Proc. ICASSP2007, Vol. 1, pp. 973-976, 2007.

Other Conference Papers

  1. K. Kinoshita, T. Nakatani and M. Miyoshi, “Spectral subtraction steered by multi-step forward linear prediction for single channel speech dereverberation,” Proc. of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. I, pp.817-820.

2005

Journal Papers

  1. A. Blin, S. Araki, and S. Makino, “Underdetermined blind separation of convolutive mixtures of speech using time-frequency mask and mixing matrix estimation,” IEICE Trans. Fundamentals, Vol.E88-A, No.7, pp.1693-1700, 2005.
  2. H. Sawada, R. Mukai, S. Araki, and S. Makino, “Estimating the number of sources using independent component analysis,” Acoustical Science and Technology, vol. 26, no. 5, pp.450-452, 2005.
  3. K. Kinoshita, T. Nakatani and M. Miyoshi, “Harmonicity based dereverberation for improving automatic speech recognition performance and speech intelligibility,” IEICE Transactions, 2005.
  4. S. Araki, S. Makino, R. Aichner(Univ. Erlangen-Nuremberg), T. Nishikawa(NAIST) and H. Saruwatari(NAIST), “Subband-based Blind Separation for Convolutive Mixtures of Speech,” IEICE Trans. Fundamentals, E88-A(12), pp. 3593-3603, 2005.
  5. S. Makino, H. Sawada, R. Mukai, and S. Araki, “Blind source separation of convolutive mixtures of speech in frequency domain,” IEICE Trans. Fundamentals, Vol.E88-A, No.7, pp.1640-1655, 2005 (invited).

Book Chapters and Tutorial Papers

  1. S. Araki, S. Makino, “Subband Based Blind Source Separation,” In J. Benesty, S. Makino, and J. Chen, editors, Speech Enhancement, pp. 329-352, Springer, March 2005.
  2. H. Sawada, R. Mukai, S. Araki and S. Makino, “Frequency-domain blind source separation,” In J. Benesty, S. Makino, and J. Chen, editors, Speech Enhancement, pp.299-327, Springer, March 2005.
  3. R. Mukai, H. Sawada, S. Araki and S. Makino, “Real-time blind source separation for moving speech signals,” In J. Benesty, S. Makino, and J. Chen, editors, Speech Enhancement, pp.353-369, Springer, March 2005.

Peer-reviewed Conference Papers

  1. S. Araki, S. Makino, H. Sawada and R. Mukai, “Reducing musical noise by a fine-shift overlap-add method applied to source separation using a time-frequency mask,” ICASSP2005, vol. III, pp. 81-84, March 2005.
  2. S. Araki, S. Makino, H. Sawada, and R. Mukai, “Source extraction from speech mixtures with null-directivity pattern based mask,” Proc. of Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA 2005), pp. d1-d2, March 2005.
  3. H. Sawada, S. Araki, R. Mukai, S. Makino, “Blind Extraction of a Dominant Source Signal from Mixtures of Many Sources,” ICASSP2005, vol. III, pp. 61-64, March 2005.
  4. H. Sawada, R. Mukai, S. Araki, and S. Makino, “Frequency-domain blind source separation without array geometry information,” Proc. of Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA 2005), pp.d13-d14, March 2005.
  5. R. Mukai, H. Sawada, S. Araki, and S. Makino, “Blind source separation and {DOA} estimation using small 3-D microphone array,” Proc. of Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA 2005), pp. d9-d10, March 2005.
  6. M. Schuster and T. Hori, “Efficient generation of high-order context-dependent weighted finite state transducers for speech recognition,” in Proc. ICASSP2005, Vol I, pp. 201-204, March 2005.
  7. T. Yoshioka, T. Hikichi, M. Miyoshi, and H. G. Okuno, “Blind estimation of room resonances using popular, classical, and jazz music,” in Proceedings of the 118th Audio Engineering Society Convention (AES 118), article ID 6632, May 2005.
  8. H. Sawada, S. Araki, R. Mukai, and S. Makino, “Blind extraction of a dominant source from mixtures of many sources using ICA and time-frequency masking,” Proc. of 2005 IEEE International Symposium on Circuits and Systems (ISCAS 2005), pp. 5882-5885, May 2005.
  9. H. Sawada, R. Mukai, S. Araki, and S. Makino, “Multiple source localization using independent component analysis,” Proc. of 2005 IEEE AP-S International Symposium and USNC/URSI National Radio Science Meeting, July 2005.
  10. H. Kato, Y. Nagahara (Meiji Univ.), S. Araki, and H. Sawada, “Pearson distribution system applied to blind speech separation,” 25th European Meeting of Statisticians (EMS2005), p.394, July 2005.
  11. T. Hori and A. Nakamura, “Generalized fast on-the-fly composition algorithm for WFST-based speech recognition,” in Proc. Interspeech2005-Eurospeech, pp. 557-560, September 2005.
  12. M. Schuster, T. Hori, and A. Nakamura, “Experiments with Probabilistic Principal Component Analysis in LVCSR,” in Proc. Interspeech2005-Eurospeech, pp. 1685-1688, September 2005.
  13. R. Mukai, H. Sawada, S. Araki, and S. Makino, “Blind Source Separation of 3-D Located Many Speech Signals,” in Proc. WASPAA2005, pp. 9-12, October 2005.
  14. T. Oba, T. Hori, and A. Nakamura, “Dependency modeling for integrated spontaneous speech processing,” in Proc. ASRU2005, pp. 284-289, November 2005.
  15. M. Schuster, and T. Hori, “Construction of weighted finite state transducers for very wide context-dependent acoustic models,” in Proc. ASRU2005, pp. 162-167, November 2005.
  16. T. Oba, T. Hori, and A. Nakamura, “Sequential Dependency Analysis for Spontaneous Speech Understanding,” ASRU2005, pp. 284-289, November 2005.
  17. F. Flego, S. Araki, H. Sawada, T. Nakatani, and S. Makino, “Underdetermined blind separation for speech in real environments with F0 adaptive comb filtering,” IWAENC2005, pp. 93-96, 2005.
  18. H. Sawada, R. Mukai, S. Araki, and S. Makino, “Real-time blind extraction of dominant target sources from many background interferences,” IWAENC2005, pp. 73-76, 2005.
  19. K. Ishizuka and T. Nakatani, “Robust speech feature extraction using subband based periodicity and aperiodicity decomposition in the frequency domain,” Proceedings of the Joint Workshop on Hands-Free Speech Communication and Microphone Arrays (HSCMA2005), pp.a13-a14, 2005.
  20. K. Ishizuka, H. Kato, and T. Nakatani, “Speech signal analysis with exponential autoregressive model,” Proceedings of the 30th International Conference on Acoustics, Speech, and Signal Processing (ICASSP2005), Vol.1, pp.225-228, 2005.
  21. K. Ishizuka, R. Mugitani, H. Kato, and S. Amano, “A longitudinal analysis of the spectral peaks of vowels for a Japanese infant,” Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech2005 - Eurospeech), pp.1169-1172, 2005.
  22. R. Mugitani, K. Ishizuka, and S. Amano, “Longitudinal development of mora-timed rhythmic structure in Japanese,” The 30th Boston University Conference on Language Development (BUCLD30), p.52, 2005.
  23. R. Mukai, H. Sawada, S. Araki, and S. Makino, “Real-Time Blind Source Separation and DOA Estimation Using Small 3-D Microphone Array,” IWAENC2005, pp. 45-48, 2005.
  24. S. Araki, H. Sawada, R. Mukai and S. Makino, “A novel blind source separation method with observation vector clustering,” IWAENC2005, pp.117-120, 2005.
  25. S. Watanabe and A. Nakamura, “Effects of Bayesian predictive classification using variational Bayesian posteriors for sparse training data in speech recognition,” Proc. Interspeech2005 - Eurospeech, pp. 1105-1108, 2005.

Other Conference Papers

  1. K. Kinoshita, T. Nakatani and M. Miyoshi, “Fast estimation of a precise dereverberation filter based on speech harmonicity,” Proc. of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2005.
  2. K. Kinoshita, T. Nakatani and M. Miyoshi, “Efficient blind dereverberation framework for automatic speech recognition,” Proc. of Interspeech, 2005.

2004

Journal Papers

  1. R. Mukai, S. Araki, H. Sawada, S. Makino, “Evaluation of Separation and Dereverberation Performance in Frequency Domain Blind Source Separation,” Acoustical Science and Technology, Vol.25, No.2, pp.119-126, March 2004.
  2. H. Sawada, R. Mukai, S. Araki, S. Makino, “Convolutive Blind Source Separation for more than Two Sources in the Frequency Domain,” Acoustical Science and Technology, the Acoustical Society of Japan, vol.25, no.4, pp. 296-298, July 2004.
  3. R. Mukai, H. Sawada, S. Araki, S. Makino, “Blind Source Separation for Moving Speech Signals using Blockwise ICA and Residual Crosstalk Subtraction,” IEICE Trans. Fundamentals, Special Section on Digital Signal Processing, vol.E87-A, no.8, pp.1941-1948, August, 2004.
  4. H. Sawada, R. Mukai, S. Araki, S. Makino, “A Robust and Precise Method for Solving the Permutation Problem of Frequency-Domain Blind Source Separation,” IEEE Trans. Speech and Audio Processing, vol.12, no.5, pp.530-538, September 2004.
  5. S. Watanabe, Y. Minami, A. Nakamura and N. Ueda, “Variational Bayesian Estimation and Clustering for Speech Recognition,” IEEE Transactions on Speech and Audio Processing, vol. 12, pp. 365-381, 2004.

Peer-reviewed Conference Papers

  1. S. Araki, S. Makino, A. Blin, R. Mukai, and H. Sawada, “Underdetermined Blind Separation for Speech in Real Environments with Sparseness and ICA,” ICASSP2004, vol. III, pp. 881-884, May 2004 (invited).
  2. A. Blin, S. Araki and S. Makino, “A Sparseness-Mixing Matrix Estimation (SMME) Solving the Underdetermined BSS for Convolutive Mixtures,” ICASSP2004, vol. IV, pp. 85-88, May 2004.
  3. R. Mukai, H. Sawada, S. Araki, S. Makino, “Near-Field Frequency Domain Blind Source Separation for Convolutive Mixtures,” ICASSP2004, vol. IV, pp. 49-52, May 2004.
  4. H. Sawada, R. Mukai, S. Araki, S. Makino, “Convolutive Blind Source Separation for more than Two Sources in the Frequency Domain,” ICASSP2004, vol. III, pp. 885-888, May 2004 (invited).
  5. S. Makino, S. Araki, R. Mukai, and H. Sawada, “Audio source separation based on independent component analysis,” in Proc. ISCAS2004 (International Symposium on Circuits and Systems), vol. V, pp. 668-671, May 2004 (invited).
  6. R. Mukai, H. Sawada, S. Araki and S. Makino, “Frequency Domain Blind Source Separation using Small and Large Spacing Sensor Pairs,” ISCAS2004, vol. V, pp. 1-4, May 2004.
  7. S. Araki, S. Makino, H. Sawada and R. Mukai, “Underdetermined Blind Speech Separation with Directivity Pattern based Continuous Mask and ICA,” EUSIPCO2004, pp.1991-1994, September 2004.
  8. S. Araki, S. Makino, H. Sawada and R. Mukai, “Underdetermined Blind Separation of Convolutive Mixtures of Speech with Directivity Pattern based Mask and ICA,” ICA2004, pp.898-905, September 2004.
  9. H. Sawada, S. Winter, S. Araki, R. Mukai, S. Makino, “Estimating the Number of Sources for Frequency-Domain Blind Source Separation,” ICA2004 (5th International Conference on Independent Component Analysis and Blind Signal Separation), pp.610-617, September 2004.
  10. S. Winter, H. Sawada, S. Araki, S. Makino, “Overcomplete BSS for convolutive mixtures based on hierarchical clustering,” ICA2004, pp.652-660, September 2004.
  11. R. Mukai, H. Sawada, S. Araki, S. Makino, “Frequency Domain Blind Source Separation for Many Speech Signals,” ICA2004, pp.461-469, September 2004.
  12. S. Winter, H. Sawada, S. Araki, S. Makino, “Hierarchical Clustering Applied to Overcomplete BSS for Convolutive Mixtures,” SAPA2004 (ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing), Session I-3, October 2004.
  13. A. Blin, S. Araki, and S. Makino, “Underdetermined blind source separation for convolutive mixtures exploiting a sparseness-mixing matrix estimation (SMME),” in Proc. ICA2004 (International Congress on Acoustics), vol. IV, pp. 3139-3142, 2004.
  14. H. Sawada, R. Mukai, S. Araki, S. Makino, “Solving the Permutation and the Circularity Problem of Frequency-Domain Blind Source Separation,” ICA2004 (International Congress on Acoustics), vol. I, pp. 89-92, 2004 (invited).
  15. K. Ishizuka and N. Miyazaki, “Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition,” Proceedings of the 29th International Conference on Acoustics, Speech, and Signal Processing (ICASSP2004), Vol.1, pp.141-144, 2004.
  16. K. Ishizuka and N. Miyazaki, “Speech feature extraction method representing periodicity and aperiodicity in sub bands for robust speech recognition,” The 2nd NTT Workshop on Communication Scene Analysis (CSA2004), Poster presentation, 2004.
  17. K. Ishizuka, N. Miyazaki, T. Nakatani and Y. Minami, “Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition,” Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech2004 - ICSLP), Vol.2, pp.937-940, 2004.
  18. P. Zolfaghari, H. Kato, S. Watanabe and S. Katagiri, “Speech Spectral Modelling using Mixture of Gaussians,” Proc. SWIM, 2004.
  19. P. Zolfaghari, S. Watanabe, A. Nakamura and S. Katagiri, “Bayesian Modelling of the Speech Spectrum Using Mixture of Gaussians,” Proc. ICASSP'04, vol. 1, pp. 553-556, 2004.
  20. R. Mukai, H. Sawada, S. Araki, S. Makino, “A Solution for the Permutation Problem in Frequency Domain BSS using Near- and Far-field Models,” ICA2004 (International Congress on Acoustics), vol. IV, pp. 3135-3138, 2004.
  21. S. Araki, S. Makino, A. Blin, R. Mukai, and H. Sawada, “Underdetermined blind separation of convolutive mixtures of speech by combining time-frequency masks and ICA,” in Proc. ICA2004 (International Congress on Acoustics), vol. I, pp.321-324, 2004.
  22. S. Watanabe and A. Nakamura, “Acoustic model adaptation based on coarse-fine training of transfer vectors and its application to speaker adaptation task,” Proc. ICSLP'04, vol. 4, pp. 2933-2936, 2004.
  23. S. Watanabe and A. Nakamura, “Robustness of acoustic model topology determined by Variational Bayesian Estimation and Clustering for speech recognition for different speech data sets,” Proc. Workshop on statistical modeling approach for speech recognition - Beyond HMM, pp. 55-60, 2004.
  24. S. Watanabe, A. Sako (Ryukoku Univ.) and A. Nakamura, “Automatic Determination of Acoustic Model Topology using Variational Bayesian Estimation and Clustering,” Proc. ICASSP'04, vol. 1, pp. 813-816, 2004.
  25. T. Hori, C. Hori, and Y. Minami, “Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous-speech recognition,” in Proc. ICSLP2004, Vol. 1, pp. 289-292, 2004.

Other Conference Papers

  1. H. Sawada, R. Mukai, S. Araki, S. Makino, “Blind Source Separation for Convolutive Mixtures in the Frequency Domain,” CSA2004.
  2. K. Kinoshita, T. Nakatani and M. Miyoshi, “Improving automatic speech recognition performance and speech intelligibility with harmonicity based dereverberation,” Proc. of Interspeech, 2004.
  3. K. Kinoshita, T. Nakatani and M. Miyoshi, “Speech dereverberation based on harmonic structure using a single microphone,” Poster presentation at the 2004 NTT Workshop on Communication Scene Analysis, 2004.
  4. R. Mukai, H. Sawada, S. Araki, S. Makino, “A Solution for the Permutation Problem in Frequency Domain BSS using Near- and Far-field Models,” CSA2004.
  5. S. Araki, S. Makino, H. Sawada and R. Mukai, “Blind Separation of More Speech than Sensors using Time-frequency Masks and ICA,” Proceedings of 2004 NTT Workshop on Communication Scene Analysis (CSA2004), 2004 (invited).
  6. S. Winter, H. Sawada, S. Araki, S. Makino, “Underdetermined Blind Source Separation for Convolutive Mixtures of Sparse Signals,” CSA2004.

2003

Journal Papers

  1. H. Sawada, R. Mukai, S. Araki, S. Makino, “Polar Coordinate based Nonlinear Function for Frequency Domain Blind Source Separation,” IEICE Trans. Fundamentals, vol.E86-A, no.3, pp. 590-596, March 2003.
  2. S. Araki, R. Mukai, S. Makino, T. Nishikawa(NAIST) and H. Saruwatari(NAIST), “The Fundamental Limitation of Frequency Domain Blind Source Separation for Convolutive Mixtures of Speech,” IEEE Trans. Speech Audio Processing, Vol. 11, No. 2, pp. 109-116, 2003.
  3. S. Araki, S. Makino, Y. Hinamoto(NAIST), R. Mukai, T. Nishikawa(NAIST) and H. Saruwatari(NAIST), “Equivalence between Frequency Domain Blind Source Separation and Frequency Domain Adaptive Beamforming for Convolutive Mixtures,” EURASIP Journal on Applied Signal Processing, vol. 2003, no. 11, pp. 1157-1166, 2003.

Peer-reviewed Conference Papers

  1. R. Mukai, H. Sawada, S. Araki, S. Makino, “Real-Time Blind Source Separation for Moving Speakers using Blockwise ICA and Residual Crosstalk Subtraction,” ICA2003, pp. 975-980, April 2003.
  2. H. Sawada, R. Mukai, S. Araki, S. Makino, “A Robust and Precise Method for Solving the Permutation Problem of Frequency-Domain Blind Source Separation,” ICA 2003, pp. 505-510, April 2003.
  3. R. Mukai, H. Sawada, S. Araki, S. Makino, “Robust Real-Time Blind Source Separation for Moving Speakers in a Room,” ICASSP2003, pp. 469-472, April 2003.
  4. H. Sawada, R. Mukai, S. Araki, S. Makino, “A Robust Approach to the Permutation Problem of Frequency-Domain Blind Source Separation,” ICASSP 2003, pp. 381-384, April 2003.
  5. T. Hori, D. Willett, and Y. Minami, “Language model adaptation using WFST-based speaking-style translation,” in Proc. ICASSP2003, Vol. 1, pp. 228-231, April 2003.
  6. C. Hori, T. Hori, H. Isozaki, E. Maeda, S. Katagiri, and S. Furui, “Deriving Disambiguous Queries in a Spoken Interactive ODQA System,” in Proc. ICASSP2003, Vol.1, pp. 384-387, April 2003.
  7. T. Hori, D. Willett, and Y. Minami, “Paraphrasing spontaneous speech using weighted finite-state transducers,” in Proc. SSPR2003, pp.219-222, April 2003.
  8. C. Hori, T. Hori, H. Isozaki, E. Maeda, S. Katagiri, and S. Furui, “Study on Spoken Interactive Open Domain Question Answering,” in Proc. SSPR2003, pp.111-113, April 2003.
  9. S. Araki, S. Makino, H. Sawada, A. Blin and R. Mukai, “Underdetermined Blind Separation of Convolutive Mixtures of Speech with Binary Masks and ICA,” NIPS 2003 Workshop on ICA: Sparse Representations in Signal Processing, December 2003 (no proceedings were published for this workshop).
  10. A. Blin, S. Araki and S. Makino, “Blind Source Separation when Speech Signals Outnumber Sensors using a Sparseness-Mixing Matrix Combination,” IWAENC2003, pp. 211-214, 2003.
  11. H. Sawada, R. Mukai, S. de la Kethulle, S. Araki and S. Makino, “Spectral Smoothing for Frequency-Domain Blind Source Separation,” IWAENC2003, pp.311-314, 2003.
  12. M. Knaak, S. Araki, S. Makino, “Geometrically Constrained ICA for a Convolutive Mixture of Sound,” ICASSP2003, Vol. II, pp. 725-728, 2003.
  13. M. Knaak, S. Araki, S. Makino, “Geometrically Constrained ICA for Robust Separation of Sound Mixtures,” ICA2003, pp. 951-956, 2003.
  14. R. Aichner, H. Buchner, S. Araki, S. Makino, “On-line Time-domain Blind Source Separation of Nonstationary Convolved Signals,” ICA2003, pp. 987-992, 2003.
  15. R. Mukai, H. Sawada, S. de la Kethulle, S. Araki and S. Makino, “Array Geometry Arrangement for Frequency Domain Blind Source Separation,” IWAENC2003, pp.219-222, 2003.
  16. S. Araki, S. Makino, A. Blin, R. Mukai and H. Sawada, “Blind Separation of More Speech than Sensors with Less Distortion by Combining Sparseness and ICA,” IWAENC2003, pp.271-274, 2003.
  17. S. Araki, S. Makino, R. Aichner, T. Nishikawa(NAIST), and H. Saruwatari(NAIST), “Subband Based Blind Source Separation for Convolutive Mixtures of Speech,” ICASSP2003, Vol. V, pp. 509-512, 2003.
  18. S. Araki, S. Makino, R. Aichner, T. Nishikawa(NAIST), and H. Saruwatari(NAIST), “Subband Based Blind Source Separation with Appropriate Processing for Each Frequency Band,” ICA2003, pp. 499-504, 2003.
  19. S. Watanabe, Y. Minami, A. Nakamura and N. Ueda, “Application of Variational Bayesian Estimation and Clustering to Acoustic Model Adaptation,” Proc. ICASSP'03. vol. 1, pp. 568-571, 2003.
  20. S. Watanabe, Y. Minami, A. Nakamura and N. Ueda, “Bayesian Acoustic Modeling for Spontaneous Speech Recognition,” Proc. SSPR'03. pp. 47-50, 2003.
  21. T. Nishikawa, H. Saruwatari, K. Shikano, S. Araki, S. Makino, “Multistage ICA for Blind Source Separation of Real Acoustic Convolutive Mixture,” ICA2003, pp. 523-528, 2003.
  22. C. Hori, T. Hori, H. Tsukada, H. Isozaki, Y. Sasaki, and E. Maeda, “Spoken Interactive ODQA System: SPIQA,” in Proc. ACL2003, Companion Volume to the Proceedings of the Conference, pp.153-156, 2003.
  23. T. Hori, C. Hori, and Y. Minami, “Speech summarization using weighted finite-state transducers,” in Proc. Eurospeech2003, pp.2817-2820, 2003.
  24. C. Hori, T. Hori, and S. Furui, “Evaluation methods for automatic speech summarization,” in Proc. Eurospeech2003, pp. 2825-2828, 2003.

2002

Peer-reviewed Conference Papers

  1. S. Araki, S. Makino, R. Mukai, Y. Hinamoto, T. Nishikawa and H. Saruwatari, “Equivalence between Frequency Domain Blind Source Separation and Frequency Domain Adaptive Beamforming,” ICASSP2002, vol. II, pp. 1785-1788, May 2002.
  2. Y. Hinamoto(NAIST), T. Nishikawa(NAIST), H. Saruwatari(NAIST), S. Araki, S. Makino, and R. Mukai, “Equivalence between Frequency Domain Blind Source Separation and Adaptive Beamforming,” Proc. ICFS2002 (The International Conference on Fundamentals of Electronics, Communications and Computer Sciences), R-1, pp. 13-18, March 2002.
  3. R. Mukai, S. Araki, H. Sawada, S. Makino, “Removal of Residual Cross-talk Components in Blind Source Separation using Time-delayed Spectral Subtraction,” ICASSP2002, vol. II, pp.1789-1792, May 2002.
  4. H. Sawada, R. Mukai, S. Araki, S. Makino, “Polar Coordinate based Nonlinear Function for Frequency-Domain Blind Source Separation,” ICASSP2002, vol. I, pp. 1001-1004, May 2002.
  5. S. Araki, S. Makino, R. Aichner, T. Nishikawa(NAIST), and H. Saruwatari(NAIST), “Blind Source Separation for Convolutive Mixtures of speech using subband processing,” SMMSP2002 (Second International Workshop on Spectral Methods and Multirate Signal Processing), pp.195-202, September 2002.
  6. H. Sawada, S. Araki, R. Mukai, S. Makino, “Blind Source Separation with Different Sensor Spacing and Filter Length for Each Frequency Range,” NNSP2002, pp. 465-474, 2002.
  7. R. Aichner, S. Araki, S. Makino, T. Nishikawa(NAIST), and H. Saruwatari(NAIST), “Time domain Blind Source Separation of non-stationary convolved signals by utilizing geometric beamforming,” NNSP2002, pp. 445-454, 2002.
  8. R. Mukai, S. Araki, H. Sawada, S. Makino, “Removal of Residual Cross-talk Components in Blind Source Separation using LMS Filters,” NNSP2002, pp. 435-444, 2002.
  9. S. Makino, S. Araki, R. Mukai, H. Sawada, H. Saruwatari (NAIST), “ICA-Based Source Separation of Sounds,” Proc. of 2002 China-Japan Joint Conference on Acoustics, Vol.21, pp. 83-86, 2002.
  10. S. Watanabe, Y. Minami, A. Nakamura and N. Ueda, “Application of Variational Bayesian Approach to Speech Recognition,” NIPS'02, MIT Press, 2002.
  11. S. Watanabe, Y. Minami, A. Nakamura and N. Ueda, “Constructing Shared-State Hidden Markov Models Based on a Bayesian Approach,” Proc. ICSLP'02, vol. 4, pp. 2669-2672, 2002.

2001

Peer-reviewed Conference Papers

  1. S. Araki, S. Makino, T. Nishikawa, and H. Saruwatari, “Limitation of Frequency Domain Blind Source Separation for Convolutive Mixture of Speech,” International Workshop on Hands-Free Speech Communication, April 2001.
  2. S. Araki, S. Makino, T. Nishikawa, and H. Saruwatari, “Fundamental Limitation of Frequency Domain Blind Source Separation for Convolutive Mixture of Speech,” IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP2001), pp.2737-2740, May 2001.
  3. S. Araki, S. Makino, R. Mukai, and H. Saruwatari, “Equivalence between Frequency Domain Blind Source Separation and Frequency Domain Adaptive Beamformers,” Consistent & Reliable Acoustic Cues for Sound Analysis (CRAC), September 2001.
  4. S. Araki, S. Makino, R. Mukai, and H. Saruwatari, “Equivalence between Frequency Domain Blind Source Separation and Frequency Domain Adaptive Null Beamformers,” 7th European Conference on Speech Communication and Technology (Eurospeech2001), vol.4, pp. 2595-2598, September 2001.
  5. R. Mukai, S. Araki and S. Makino, “Separation and Dereverberation Performance of Frequency Domain Blind Source Separation for Speech in a Reverberant Environment,” Eurospeech 2001, pp. 2599-2603, September 2001.
  6. R. Mukai, S. Araki and S. Makino, “Separation and Dereverberation Performance of Frequency Domain Blind Source Separation in a Reverberant Environment,” IWAENC 2001, pp. 127-130, September 2001.
  7. S. Araki, S. Makino, R. Mukai, T. Nishikawa, and H. Saruwatari, “Fundamental limitation of frequency domain Blind Source Separation for convolved mixture of speech,” 3rd International Conference on Independent Component Analysis and Blind Signal Separation (ICA2001), pp.132-137, December 2001.
  8. R. Mukai, S. Araki and S. Makino, “Separation and Dereverberation Performance of Frequency Domain Blind Source Separation,” ICA2001, pp. 230-235, December 2001.
  9. H. Sawada, R. Mukai, S. Araki, S. Makino, “A Polar-Coordinate based Activation Function for Frequency Domain Blind Source Separation,” ICA2001, pp. 663-668, December 2001.

Conferences (we organized)

Members

Tomohiro Nakatani

Group leader

Nobutaka Ito
Tsubasa Ochiai
Atsunori Ogawa
Shigeki Karita

Alumni

Takuya Higuchi
Dung Tran
Susumu Shinohara
Takuya Yoshioka
Masakiyo Fujimoto
Miquel Espi
Takaaki Hori
Masanobu Inubushi
Kazuyuki Yoshimura
Kazuo Aoyama
Takafumi Hikichi
Toshio Irino
Kentaro Ishizuka
Shigeru Katagiri
Hiroko Katoh
Erik McDermott
Shoji Makino
Masato Miyoshi
Ryo Mukai
Tatsuto Murayama
Satoshi Sunada
Mike Schuster
Shinji Watanabe
Daniel Willett
Parham Zolfaghari
Takanobu Oba
Seong-Jun Hahm
Takahisa Harayama
Souden Mehrez
Atsushi Nakamura
Yasuhiro Minami
Yotaro Kubo