Signal Processing Research Group


Publications

2009

Journal Papers

  1. T. Yoshioka, T. Nakatani, and M. Miyoshi, “Integrated speech enhancement method using noise suppression and dereverberation,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 2, pp. 231-246, February 2009.
  2. S. Miyake and J. Muramatsu, “A construction of channel code, joint source-channel code, and universal code for arbitrary stationary memoryless channels using sparse matrices,” IEICE Transactions on Fundamentals, vol. E92-A, no. 9, pp. 2333-2344, September 2009.
  3. H. K. Solvang, Y. Nagahara, S. Araki, H. Sawada, and S. Makino, “Frequency-domain Pearson distribution approach for independent component analysis (FD-Pearson-ICA) in blind source separation,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 4, pp. 639-649, 2009.
  4. K. Kinoshita, M. Delcroix, T. Nakatani, and M. Miyoshi, “Suppression of late reverberation effect on speech signal using long-term multiple-step linear prediction,” IEEE Transactions on Audio, Speech, and Language Processing, 2009.
  5. M. Delcroix, T. Nakatani, and S. Watanabe, “Static and dynamic variance compensation for recognition of reverberant speech with dereverberation pre-processing,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 2, pp. 324-334, 2009.
  6. S. Araki, H. Sawada, R. Mukai, and S. Makino, “DOA estimation for multiple sparse sources with arbitrarily arranged multiple sensors,” Journal of Signal Processing Systems, doi:10.1007/s11265-009-0413-9, 2009.

Book Chapter, Tutorial Papers

  1. T. Hori, K. Sudoh, H. Tsukada, and A. Nakamura, “World-Wide Media Browser--Multilingual Audio-visual Content Retrieval and Browsing System,” NTT Technical Review, Vol. 7, No. 2, February 2009.
  2. S. Makino, S. Araki, S. Winter, and H. Sawada, “Underdetermined blind source separation using acoustic arrays,” Handbook on Array Processing and Sensor Networks, S. Haykin and K. J. R. Liu, Eds., Wiley, 2009 (in press).

Peer-reviewed Conference Papers

  1. T. Yoshioka, H. Tachibana, T. Nakatani, and M. Miyoshi, “Adaptive dereverberation of speech signals with speaker-position change detection,” in Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), pp. 3733-3736, April 2009.
  2. H. Kameoka, T. Nakatani, and T. Yoshioka, “Robust speech dereverberation based on non-negativity and sparse nature of speech spectrograms,” in Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), pp. 45-48, April 2009.
  3. T. Nakatani, T. Yoshioka, K. Kinoshita, M. Miyoshi, and B.-H. Juang, “Real-time speech enhancement in noisy reverberant multi-talker environments based on a location-independent room acoustics model,” in Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), pp. 137-140, April 2009.
  4. A. Ogawa, S. Takahashi, and A. Nakamura, “Efficient combination of likelihood recycling and batch calculation based on conditional fast processing and acoustic back-off,” Proc. ICASSP, pp. 4164-4164, April 2009.
  5. T. Yoshioka, T. Nakatani, and M. Miyoshi, “Fast algorithm for conditional separation and dereverberation,” in Proceedings of the 17th European Signal Processing Conference (EUSIPCO 2009), CD-ROM Proceedings, August 2009.
  6. T. Yoshioka, H. Kameoka, T. Nakatani, and H. G. Okuno, “Statistical models for speech dereverberation,” in Proceedings of the 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2009), pp. 145-148, October 2009.
  7. A. Nakamura, E. McDermott, S. Watanabe, and S. Katagiri, “A unified view for discriminative objective functions based on negative exponential of difference measure between strings,” Proc. ICASSP 2009, pp. 1633-1636, 2009.
  8. E. McDermott, S. Watanabe, and A. Nakamura, “Margin-space integration of MPE loss via differencing of MMI functionals for generalized error-weighted discriminative training,” Proc. Interspeech 2009, pp. 224-227, 2009.
  9. J. Muramatsu and S. Miyake, “Coding theorem for general stationary memoryless channel based on hash property,” Proceedings of the 2009 IEEE International Symposium on Information Theory, Seoul, Korea, pp. 541-545, 2009.
  10. J. Muramatsu and S. Miyake, “Construction of wiretap channel codes by using sparse matrices,” Proceedings of the 2009 IEEE Information Theory Workshop, Taormina, Italy, pp. 105-109, 2009.
  11. K. Ishiguro, T. Yamada, S. Araki, and T. Nakatani, “A probabilistic speaker clustering for DOA-based diarization,” WASPAA 2009, 2009.
  12. K. Ishizuka, S. Araki, K. Otsuka, T. Nakatani, and M. Fujimoto, “A speaker diarization method based on the probabilistic fusion of audio-visual location information,” Proceedings of the 11th International Conference on Multimodal Interfaces and Workshop on Machine Learning for Multi-modal Interaction (ICMI-MLMI 2009), pp. 55-62, 2009.
  13. K. Otsuka, S. Araki, D. Mikami, K. Ishizuka, M. Fujimoto, and J. Yamato, “Realtime meeting analysis and 3D meeting viewer based on omnidirectional multimodal sensors,” Proceedings of the 11th International Conference on Multimodal Interfaces and Workshop on Machine Learning for Multi-modal Interaction (ICMI-MLMI 2009), pp. 219-220, 2009.
  14. M. Fujimoto, K. Ishizuka, and T. Nakatani, “A study of mutual front-end processing method based on statistical model for noise robust speech recognition,” Proc. Interspeech 2009, pp. 1235-1238, September 2009.
  15. A. Ogawa and A. Nakamura, “Simultaneous estimation of confidence and error cause in speech recognition using discriminative model,” Proc. Interspeech 2009, pp. 1199-1202, September 2009.
  16. S. Kobashikawa, A. Ogawa, Y. Yamaguchi, and S. Takahashi, “Rapid unsupervised adaptation using frame independent output probabilities of gender and context independent phoneme models,” Proc. Interspeech 2009, pp. 1615-1618, September 2009.
  17. R. Mugitani, K. Ishizuka, T. Kondo, and S. Amano, “Acquisition of durational control of vocalic and consonantal intervals in speech production,” The 34th Boston University Conference on Language Development (BUCLD 34), 2009.
  18. S. Araki, T. Nakatani, H. Sawada, and S. Makino, “Blind sparse source separation for unknown number of sources using Gaussian mixture model fitting with Dirichlet prior,” Proc. ICASSP 2009, pp. 33-36, 2009.
  19. S. Araki, T. Nakatani, H. Sawada, and S. Makino, “Stereo source separation and source counting with MAP estimation with Dirichlet prior considering spatial aliasing problem,” Proc. ICA 2009, pp. 742-750, 2009.
  20. S. Watanabe and A. Nakamura, “Speech recognition with incremental tracking and detection of changing environments based on a macroscopic time evolution system,” Proc. ICASSP 2009, pp. 4373-4376, 2009.
  21. T. Iwata, S. Watanabe, T. Yamada, and N. Ueda, “Topic tracking model for analyzing consumer purchase behavior,” Proc. IJCAI 2009, pp. 1427-1432, 2009.
  22. Y. Izumi, K. Nishiki, S. Watanabe, T. Nishimoto, N. Ono, and S. Sagayama, “Stereo-input speech recognition using sparseness-based time-frequency masking in a reverberant environment,” Proc. Interspeech 2009, pp. 1955-1958, 2009.
  23. S. Kobashikawa, A. Ogawa, Y. Yamaguchi, and S. Takahashi, “Rapid unsupervised adaptation using context independent phoneme model,” The 13th IEEE International Symposium on Consumer Electronics (ISCE '09), 2009.

Other Conference Papers

  1. K. Kinoshita, T. Nakatani, M. Miyoshi, and T. Kubota, “Blind upmix of stereo music signal using multi-step linear prediction based reverberation extraction,” Proceedings of the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), pp. 49-52, 2009.