Tomohiro Nakatani's Publications


Recent publications
  1. Nakatani, T., Juang, B.H., Yoshioka, T., Kinoshita, K., Delcroix, M., and Miyoshi, M., "Importance of energy and spectral features in Gaussian source model for speech dereverberation," Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA-2007), pp.299-302, Oct. 2007.
  2. Yoshioka, T., Nakatani, T., Hikichi, T., and Miyoshi, M., "Overfitting-Resistant Speech Dereverberation," Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA-2007), pp.163-166, Oct. 2007.
  3. Nakatani, T., Hikichi, T., Kinoshita, K., Yoshioka, T., Delcroix, M., Miyoshi, M., and Juang, B.H., "Robust blind dereverberation of speech signals based on characteristics of short-time speech segments," Proc. IEEE International Symposium on Circuits and Systems (ISCAS-2007), pp.2986-2989, June 2007.
  4. Juang, B.H. and Nakatani, T., "Joint source-channel modeling and estimation for speech dereverberation," Proc. IEEE International Symposium on Circuits and Systems (ISCAS-2007), pp.2990-2993, June 2007.
  5. Kinoshita, K., Delcroix, M., Nakatani, T., and Miyoshi, M., "Dereverberation of real recordings using linear prediction-based microphone array," Proc. Audio Engineering Society (AES) 13th Regional Convention, Tokyo, August 2007.
  6. Kinoshita, K., Delcroix, M., Nakatani, T., and Miyoshi, M., "Multi-step linear prediction based speech enhancement in noisy reverberant environment," Proc. Interspeech-2007, pp.854-857, August 2007.
  7. Ishizuka, K., Nakatani, T., Fujimoto, M., and Miyazaki, N., "Noise robust front-end with voice activity detection based on periodic to aperiodic component ratio," Proc. Interspeech-2007, pp.230-233, August 2007.
  8. Nakatani, T., Juang, B.H., Hikichi, T., Yoshioka, T., Kinoshita, K., Delcroix, M., and Miyoshi, M., "Study on speech dereverberation with autocorrelation codebook," Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-2007), vol.I, pp.193-196, April 2007.
  9. Rubio, J.E., Ishizuka, K., Sawada, H., Araki, S., Nakatani, T., and Fujimoto, M., "Two-microphone voice activity detection based on the homogeneity of the direction of arrival estimates," Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP2007), vol.4, pp.385-388, April 2007.
  10. Nakatani, T., Juang, B.H., Kinoshita, K., and Miyoshi, M., "Speech dereverberation based on probabilistic models of source and room acoustics," Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-2006), vol.I, pp.821-824, June 2006.
  11. Kinoshita, K., Nakatani, T., and Miyoshi, M., "Spectral subtraction steered by multi-step forward linear prediction for single channel speech dereverberation," Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-2006), vol.I, pp.817-820, June 2006.
  12. Ishizuka, K. and Nakatani, T., "Study of noise robust voice activity detection based on periodic component to aperiodic component ratio," Proc. ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition (SAPA2006), pp. 65-70, Sep. 2006.
  13. Kinoshita, K., Nakatani, T., and Miyoshi, M., "Fast estimation of a precise dereverberation filter based on the harmonic structure of speech," Acoustical Science and Technology, vol.28, no.2, pp.105-114, 2007.
  14. Ishizuka, K., Nakatani, T., Minami, Y., and Miyazaki, N., "Speech feature extraction method using subband-based periodicity and non-periodicity decomposition," Journal of the Acoustical Society of America, vol.120, Issue 1, pp.443-452, 2006.
  15. Ishizuka, K., and Nakatani, T., "A feature extraction method using subband based periodicity and aperiodicity decomposition with noise robust frontend processing for automatic speech recognition," Speech Communication, vol.48, no.11, pp.1447-1457, 2006.
  16. Nakatani, T., Kinoshita, K., and Miyoshi, M., "Harmonicity based blind dereverberation for single channel speech signals," IEEE Trans. Audio, Speech, and Language Processing, vol.15, no.1, pp.80-95, 2007.
  17. Nakatani, T., Kinoshita, K., and Miyoshi, M., "Blind dereverberation of monaural speech signals based on harmonic structure," Systems and Computers in Japan, vol.37, Issue 6, pp.1-12, June 2006.
  18. Amano, S., Nakatani, T., and Kondo, K., "Fundamental frequency of infants' and parents' utterances in longitudinal recordings," Journal of the Acoustical Society of America, 119(3), pp.1636-1647, Mar. 2006.
  19. Kinoshita, K., Nakatani, T., and Miyoshi, M., "Harmonicity based dereverberation for improving automatic speech recognition performance and speech intelligibility," IEICE Trans. on Fundamentals of Electronics, Communications and Computer Sciences, E88-A, no.7, pp.1724-1731, 2005.
  20. Nakatani, T., Juang, B.H., Kinoshita, K., Miyoshi, M., "Harmonicity based dereverberation with maximum a posteriori estimation," 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA-2005), pp.94-97, Oct. 2005.
  21. Flego, F., Araki, S., Sawada, H., Nakatani, T., Makino, S., "Underdetermined blind separation for speech in real environments with F0 adaptive comb filtering," Proc. IEEE International Workshop on Acoustic Echo and Noise Control (IWAENC-2005), pp.93-96, Sep. 2005.
  22. Kinoshita, K., Nakatani, T., and Miyoshi, M., "Harmonicity Based Dereverberation for Improving Automatic Speech Recognition Performance and Speech Intelligibility," IEICE Trans. Fundamentals, vol.E88-A, no.7, pp.1732-1738, Sep. 2005.
  23. Kinoshita, K., Nakatani, T., and Miyoshi, M., "Fast estimation of a precise dereverberation filter based on speech harmonicity," Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-2005), vol.I, pp.1073-1076, Mar., 2005.
  24. Ishizuka, K., Kato, H., and Nakatani, T., "Speech analysis with exponential autoregressive model," Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-2005), vol.I, pp.225-228, Mar., 2005.
  25. Ishizuka, K. and Nakatani, T., "Robust speech feature extraction using subband based periodicity and aperiodicity decomposition in the frequency domain," Joint workshop on hands-free speech communication and microphone arrays, pp.a13-a14, Mar., 2005.
  26. Nakatani, T., Miyoshi, M., and Kinoshita, K., "Single microphone blind dereverberation," in Benesty, J., Makino, S., and Chen, J. (Eds.), Speech enhancement, pp.247-270, Springer, Mar., 2005.
  27. Nakatani, T., Kinoshita, K., Miyoshi, M., and Zolfaghari, P. S., "Harmonicity based monaural speech dereverberation with time warping and F0 adaptive window," Proc. International Conference on Spoken Language Processing (ICSLP-2004), vol. II, pp. 873-876, Oct. 2004.
  28. Ishizuka, K., Miyazaki, N., Nakatani, T., and Minami, Y., "Improvement in robustness of speech feature extraction method using sub-band based periodicity and aperiodicity decomposition," Proc. International Conference on Spoken Language Processing (ICSLP-2004), Vol. 2, pp. 937-940, 2004.
  29. Nakatani, T., Kinoshita, K., Miyoshi, M., and Zolfaghari, P. S., "Harmonicity based blind dereverberation with time warping," Proc. ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing (SAPA), Oct. 2004.
  30. Nakatani, T., and Irino, T., "Robust and accurate fundamental frequency estimation based on dominant harmonic components," Journal of the Acoustical Society of America (JASA), vol. 116, Issue 6, pp. 3690-3700, Dec. 2004.
  31. Nakatani, T., Miyoshi, M., and Kinoshita, K., "One microphone blind dereverberation based on quasi-periodicity of speech signals," in Thrun, S., Saul, L. K., and Scholkopf B. (Eds.), Advances in Neural Information Processing Systems 16, pp.1417-1424, MIT Press, 2004.
  32. Nakatani, T., Irino, T., and Zolfaghari, P. S., "Dominance spectrum based V/UV classification and F0 estimation," Proc. European Conference on Speech Communication and Technology (EUROSPEECH-2003), pp. 2313-2316, Sep. 2003.
  33. Nakatani, T., Miyoshi, M., and Kinoshita, K., "Implementation and effects of single channel dereverberation based on the harmonic structure of speech," Proc. IEEE International Workshop on Acoustic Echo and Noise Control (IWAENC-2003), pp. 91-94, Sep. 2003.
  34. Nakatani, T., and Miyoshi, M., "Blind dereverberation of single channel speech signal based on harmonic structure," Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-2003), vol. 1, pp. 92-95, Hong Kong, Apr. 2003.
  35. Nakatani, T., Amano, S., and Irino, T., "An estimation method for fundamental frequency and voiced segment in infant utterance," First Pan-American/Iberian Meeting on Acoustics (144th Meeting of the ASA), Cancun, Dec. 2002; abstract in JASA, vol. 112, no. 5, pt. 2, p. 2322, Nov. 2002.
  36. Nakatani, T., and Irino, T., "Robust fundamental frequency estimation against background noise and spectral distortion," Proc. International Conference on Spoken Language Processing (ICSLP-2002), vol. 3, pp. 1733-1736, Denver, Sep. 2002.
  37. Amano, S., Nakatani, T., and Kondo, T., "Developmental Changes in Voiced-Segment Ratio for Japanese Infants and Parents," Proc. International Conference on Spoken Language Processing (ICSLP-2004), vol. III, pp. 1857-1860, Oct. 2004.
  38. Ishihara, K., Hattori, Y., Nakatani, T., Komatani, K., Ogata, T., and Okuno, H. G., "Disambiguation in Determining Phonemes of Sound-Imitation Words for Environmental Sound Recognition," Proc. International Conference on Spoken Language Processing (ICSLP-2004), vol. II, pp. 1485-1488, Oct. 2004.
  39. Kinoshita, K., Nakatani, T., and Miyoshi, M., "Improving automatic speech recognition performance and speech intelligibility with harmonicity based dereverberation," Proc. International Conference on Spoken Language Processing (ICSLP-2004), vol. IV, pp. 2653-2656, Oct. 2004.
  41. Kazushi Ishihara, Tomohiro Nakatani, Tetsuya Ogata, Hiroshi G. Okuno, "Automatic Sound-Imitation Word Recognition from Environmental Sounds focusing on Ambiguity Problem in Determining Phonemes," Proc. Pacific Rim International Conference on Artificial Intelligence (PRICAI-2004), pp. 909-918, Aug. 2004.
  42. Kinoshita, K., Nakatani, T., and Miyoshi, M., "Speech dereverberation based on harmonic structure using single microphone," 2004 NTT Workshop on Communication Scene Analysis, Apr. 2004.
  43. Kato, H., Kajikawa, S., Nakatani, T., and Amano, S., "Discrimination and Clustering for Fundamental Frequency Patterns of Infant and Parents Speech," 2004 NTT Workshop on Communication Scene Analysis (CSA2004), Apr. 2004.
  44. Kato, H., Nakatani, T., Kajikawa, S., and Amano, S., "Statistical method for fundamental frequency pattern analysis of infant speech," Proc. Japan Statistical Society Annual Meeting, pp. 473-474, 2003.
  45. Zolfaghari, P. S., Nakatani, T., Irino, T., and Kawahara, H., "Glottal closure instant synchronous sinusoidal model for high quality speech analysis/synthesis," Proc. European Conference on Speech Communication and Technology (EUROSPEECH-2003), pp. 2441-2444, Sep. 2003.
  46. Amano, S., Nakatani, T., and Kondo, K., "Fundamental frequency analysis of longitudinal recording in a Japanese infant speech database," Proc. ICPhS-2003, pp. 1983-1986, Aug. 2003.
  47. Irino, T., Minami, Y., Nakatani, T., Tsuzaki, M., and Tagawa, H., "Evaluation of a speech recognition/generation method based on HMM and STRAIGHT," Proc. International Conference on Spoken Language Processing (ICSLP-2002), vol. 4, pp. 2545-2548, Denver, Sep. 2002.
  48. Nakatani, T., Juang, B.H., Yoshioka, T., Kinoshita, K., and Miyoshi, M., "Importance of energy and spectral information in dereverberation of speech signals," Proc. Autumn Meeting of the Acoustical Society of Japan (ASJ), pp.126-127, Sep. 2007 (in Japanese).
  49. Kinoshita, K., Nakatani, T., Sawada, H., and Araki, S., "Effects of multi-step linear prediction in reverberant environments with multiple sound sources," Proc. ASJ Autumn Meeting, pp.118-119, Sep. 2007 (in Japanese).
  50. Fujimoto, M., Ishizuka, K., and Nakatani, T., "Voice activity detection based on adaptive integration of multiple speech features and signal classification processes," Proc. ASJ Autumn Meeting, 3-3-10, pp.161-162, Sep. 2007 (in Japanese).
  51. Ishizuka, K., Rubio, J.E., Sawada, H., Araki, S., Nakatani, T., and Fujimoto, M., "A noise robust voice activity detection method using the bias of direction-of-arrival estimates," Proc. ASJ Autumn Meeting, 3-3-11, pp.163-166, Sep. 2007 (in Japanese).
  52. Fujimoto, M., Ishizuka, K., and Nakatani, T., "Voice activity detection in noisy environments based on the periodic-to-aperiodic component ratio of speech and a switching Kalman filter," IPSJ SIG Technical Report, SLP-67-13, pp.69-74, Jul. 2007 (in Japanese).
  53. Nakatani, T., Juang, B.H., Hikichi, T., Yoshioka, T., Kinoshita, K., Delcroix, M., and Miyoshi, M., "Dereverberation of speech signals based on an autocorrelation codebook," Proc. ASJ Spring Meeting, pp.561-562, Mar. 2007 (in Japanese).
  54. Ishizuka, K., and Nakatani, T., "Evaluation of noise robust voice activity detection using the ratio of periodic to aperiodic components of signals," Proc. ASJ Spring Meeting, pp.85-86, Mar. 2007 (in Japanese).
  55. Kinoshita, K., Delcroix, M., Nakatani, T., and Miyoshi, M., "Speech recognition based evaluation of the noise robustness of a dereverberation method based on multi-step linear prediction," Proc. IEICE General Conference, pp.71-72, Mar. 2007 (in Japanese).
  56. Kinoshita, K., Delcroix, M., Nakatani, T., and Miyoshi, M., "Evaluation of the multi-step linear prediction based dereverberation method with speech recorded in real acoustic environments," Proc. ASJ Autumn Meeting, pp.421-422, Sep. 2006 (in Japanese).
  57. Ishizuka, K., and Nakatani, T., "Noise robust voice activity detection using the ratio of periodic to aperiodic components of signals," Proc. ASJ Autumn Meeting, pp.35-36, Sep. 2006 (in Japanese).
  58. Kinoshita, K., Nakatani, T., and Miyoshi, M., "A study of single-channel dereverberation using multi-step linear prediction," Proc. ASJ Spring Meeting, pp.511-512, Mar. 2006 (in Japanese).
  59. Ishizuka, K., and Nakatani, T., "Evaluation of a front-end using the SPADE speech feature extraction method on a standard corpus for noise robustness evaluation," IEICE Technical Report, NLC2005-89, SP2005-122, pp.71-72, 2005 (in Japanese).
  60. Kinoshita, K., Nakatani, T., and Miyoshi, M., "Single-channel blind dereverberation exploiting the sparseness of speech," Proc. ASJ Autumn Meeting, 2005 (in Japanese).
  61. Ishizuka, K., Kato, H., and Nakatani, T., "A speech signal analysis method using an exponential autoregressive model," Proc. ASJ Spring Meeting, pp.235-236, Mar. 2005 (in Japanese).
  62. Nakatani, T., Miyoshi, M., and Kinoshita, K., "Blind dereverberation of monaural speech signals based on harmonic structure," IEICE Transactions, vol.J88-D-II, no.3, pp.509-520, Mar. 2005 (in Japanese).
  63. Kato, H., Taniguchi, M., Nakatani, T., and Amano, S., "Classification of fundamental frequency patterns of infant and parent speech based on time series regression models," Proc. Annual Meeting of the Japan Statistical Society, Sep. 2004 (in Japanese).
  64. Kato, H., Nakatani, T., and Amano, S., "A method for discriminating the similarity of fundamental frequency patterns of parent and child speech," Proc. ASJ Autumn Meeting, pp.399-400, Sep. 2004 (in Japanese).
  65. Ishizuka, K., Miyazaki, N., Nakatani, T., and Minami, Y., "Effects of the distortion compensation method in the SPADE speech feature extraction method," Proc. ASJ Autumn Meeting, pp.117-118, Sep. 2004 (in Japanese).
  66. Ishihara, K., Nakatani, T., Komatani, K., Ogata, T., and Okuno, H.G., "Design of environmental-sound phonemes for converting environmental sounds into sound-imitation words," Proc. Annual Conference of the Japanese Society for Artificial Intelligence (JSAI), 1E2-03, June 2004 (in Japanese).
  67. Kinoshita, K., Nakatani, T., and Miyoshi, M., "Evaluation of the speech quality of harmonicity based dereverberation by intelligibility and recognition rate," Proc. ASJ Spring Meeting, pp.611-612, Mar. 2004 (in Japanese).
  68. Ishizuka, K., Miyazaki, N., Nakatani, T., and Minami, Y., "Proposal of the SPADE speech feature extraction method representing in-band periodicity and aperiodicity, and evaluation of its noise robustness with Aurora-2J," Proc. ASJ Spring Meeting, pp.447-448, Mar. 2004 (in Japanese).
  70. Ishihara, K., Hattori, Y., Nakatani, T., Ogata, T., and Okuno, H.G., "Disambiguation of phoneme determination in converting environmental sounds into sound-imitation words," Proc. 66th National Convention of the Information Processing Society of Japan (IPSJ), 3Y5, 2004 (in Japanese).
  71. Kajikawa, S., Kato, H., Amano, S., and Nakatani, T., "Similarity of speech fundamental frequency between mother and child at ages 0-3," Proc. 15th Annual Meeting of the Japan Society of Developmental Psychology, p.25, Tokyo, 2004 (in Japanese).
  72. Amano, S., Nakatani, T., and Kondo, T., "Characteristics of infant-directed speech in Japanese," Proc. Autumn Meeting of the Acoustical Society of Japan (ASJ), pp.389-390, Sep. 2003 (in Japanese).
  73. Kato, H., Nakatani, T., Kajikawa, S., and Amano, S., "Statistical methods for fundamental frequency pattern analysis of infant speech," Proc. 71st Annual Meeting of the Japan Statistical Society (Joint Meeting of Statistics-Related Societies), pp.472-473, Sep. 2003 (in Japanese).
  74. Yoshida, N., Nakatani, T., Okuno, H.G., and Miyoshi, M., "Improving the reverberation robustness of a sound source separation method based on harmonic structure and source direction," IEICE Technical Report, EA2003-13, Apr. 2003 (in Japanese).
  75. Nakatani, T., and Miyoshi, M., "Blind dereverberation of speech signals based on harmonic structure," Proc. ASJ Spring Meeting, 2-8-9, vol.1, pp.669-670, Mar. 2003 (in Japanese).
  76. Nakatani, T., Amano, S., and Irino, T., "A method for estimating the fundamental frequency and voiced segments of infant speech," Proc. ASJ Autumn Meeting, 1-P-11, vol.1, pp.393-394, Sep. 2002 (in Japanese).
  77. Nakatani, T., and Minami, Y., "Evaluation of binaural sound source separation by speech recognition," Proc. ASJ Spring Meeting, 2-2-9, pp.73-74, Mar. 2002 (in Japanese).
  78. Nakatani, T., and Irino, T., "A noise robust fundamental frequency estimation method using dominance," IEICE Technical Report, SP2001-138, pp.105-112, University of Tokyo, Mar. 2002 (in Japanese).

Journal Papers and Ph.D. Thesis
  1. Kinoshita, K., Nakatani, T., and Miyoshi, M., "Fast estimation of a precise dereverberation filter based on the harmonic structure of speech," Acoustical Science and Technology, vol.28, no.2, pp.105-114, 2007.
  2. Nakatani, T., Kinoshita, K., and Miyoshi, M., "Harmonicity based blind dereverberation for single channel speech signals," IEEE Trans. Audio, Speech, and Language Processing, vol.15, no.1, pp.80-95, 2007.
  3. Ishizuka, K., Nakatani, T., Minami, Y., and Miyazaki, N., "Speech feature extraction method using subband-based periodicity and non-periodicity decomposition," Journal of the Acoustical Society of America, vol.120, Issue 1, pp.443-452, 2006.
  4. Ishizuka, K., and Nakatani, T., "A feature extraction method using subband based periodicity and aperiodicity decomposition with noise robust frontend processing for automatic speech recognition," Speech Communication, vol.48, no.11, pp.1447-1457, 2006.
  5. Nakatani, T., Kinoshita, K., and Miyoshi, M., "Blind dereverberation of monaural speech signals based on harmonic structure," Systems and Computers in Japan, vol.37, Issue 6, pp.1-12, June 2006.
  6. Amano, S., Nakatani, T., and Kondo, K., "Fundamental frequency of infants' and parents' utterances in longitudinal recordings," Journal of the Acoustical Society of America, 119(3), pp.1636-1647, Mar. 2006.
  7. Kinoshita, K., Nakatani, T., and Miyoshi, M., "Harmonicity based dereverberation for improving automatic speech recognition performance and speech intelligibility," IEICE Trans. on Fundamentals of Electronics, Communications and Computer Sciences, E88-A, no.7, pp.1724-1731, 2005.
  8. Nakatani, T., Miyoshi, M., and Kinoshita, K., "Blind dereverberation of monaural speech signals based on harmonic structure," IEICE Transactions, vol.J88-D-II, no.3, pp.509-520, Mar. 2005 (in Japanese).
  9. Nakatani, T., and Irino, T., "Robust and accurate fundamental frequency estimation based on dominant harmonic components," Journal of the Acoustical Society of America (JASA), vol. 116, Issue 6, pp. 3690-3700, Dec. 2004.
  10. Tomohiro Nakatani, "Computational auditory scene analysis based on residue-driven architecture and its application to mixed speech recognition," Ph.D. thesis, Department of Applied Analysis and Complex Dynamical Systems, Kyoto University, March 2002.
  11. Nakatani, T., and Okuno, H.G., "Integration of sound environment understanding systems based on a sound ontology," Journal of the Japanese Society for Artificial Intelligence, vol.14, no.6, pp.1072-1079, Dec. 1999 (in Japanese).
  12. Tomohiro Nakatani, Hiroshi G. Okuno, "Harmonic Sound Stream Segregation Using Localization and Its Application to Speech Stream Segregation," Speech Communication, Vol.27, Nos.3-4, pp.209-222, Elsevier, Apr. 1999.
  13. Hiroshi G. Okuno, Tomohiro Nakatani, Takeshi Kawabata, "Listening to Two Simultaneous Speeches," Speech Communication, Vol.27, Nos.3-4, pp.299-310, Elsevier, Apr. 1999.
  14. Nakatani, T., Goto, M., Kawabata, T., and Okuno, H.G., "Proposal of the residue-driven architecture and its application to sound stream segregation," Journal of the Japanese Society for Artificial Intelligence, vol.12, no.1, pp.111-120, Jan. 1997 (in Japanese).
  15. Okuno, H.G., Nakatani, T., and Kawabata, T., "Proposal of a speech stream segregation method and preliminary experiments on simultaneous recognition of multiple voices," Transactions of the Information Processing Society of Japan, vol.38, no.3, pp.510-523, Mar. 1997 (in Japanese).
  16. Nakatani, T., Okuno, H.G., and Kawabata, T., "Sound stream segregation for multi-agent based understanding of sound environments," Journal of the Japanese Society for Artificial Intelligence, vol.10, no.2, pp.232-241, Mar. 1995 (in Japanese).
  17. Nakatani, T., Yamamoto, Y., and Matsumoto, Y., "On composite neural networks," Transactions of the Institute of Systems, Control and Information Engineers, vol.5, no.9, pp.349-356, 1992 (in Japanese).

Conference/workshop
Speech enhancement
  1. Tomohiro Nakatani, Masataka Goto, and Hiroshi G. Okuno: "Localization by harmonic structure and its application to harmonic sound stream segregation," Proc. 1996 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-96), vol. II, pp. 653-656, Atlanta, U.S.A., May 1996.
  2. Tomohiro Nakatani, Takeshi Kawabata, and Hiroshi G. Okuno: "A computational model of sound stream segregation with the multi-agent paradigm," Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP-95), vol. 4, pp. 2671-2674, Detroit, May 1995.
  3. Tomohiro Nakatani, Takeshi Kawabata, and Hiroshi G. Okuno: "Unified Architecture for Auditory Scene Analysis and Spoken Language Processing," Proc. International Conference on Spoken Language Processing (ICSLP-94), pp. 1403-1406, Sep. 1994.
  4. Tomohiro Nakatani, Takeshi Kawabata, and Hiroshi G. Okuno: "Speech Stream Segregation by Multi-Agent System," Proc. International Workshop on Speech Processing (IWSP), Tokyo, Nov. 1993.
  5. Hiroshi G. Okuno, Tomohiro Nakatani, and Takeshi Kawabata, "A New Speech Enhancement: Speech Stream Segregation," Proc. International Conference on Spoken Language Processing (ICSLP-96), Vol.4, pp.2356-2359, Philadelphia, Oct. 1996.

Robust feature extraction of speech

  1. Nakatani, T., and Irino, T., "Robust fundamental frequency estimation against background noise and spectral distortion," Proc. International Conference on Spoken Language Processing (ICSLP-2002), vol. 3, pp. 1733-1736, Denver, Sep. 2002.

Unified approach to speech recognition/synthesis/separation

  1. Irino, T., Minami, Y., Nakatani, T., Tsuzaki, M., and Tagawa, H., "Evaluation of a speech recognition/generation method based on HMM and STRAIGHT," Proc. International Conference on Spoken Language Processing (ICSLP-2002), Denver, Sep. 2002.

Computational auditory scene analysis

  1. Okuno, H. G., Ikeda, S., and Nakatani, T., "Combining independent component analysis and sound stream segregation," Proc. IJCAI-99 Workshop on Computational Auditory Scene Analysis (CASA-99), pp. 92-98, 1999.
  2. Nakatani, T., and Okuno, H. G., "Sound ontology for computational auditory scene analysis," Proc. National Conference on Artificial Intelligence (AAAI-98), Vol. 1, pp. 30-35, 1998.
  3. Nakatani, T., Kashino, K., and Okuno, H. G., "Integration of speech stream and music stream segregations based on a sound ontology," Proc. IJCAI-97 Workshop on Computational Auditory Scene Analysis (CASA-97), pp. 25-32, 1997.
  4. Hiroshi G. Okuno, Tomohiro Nakatani, and Takeshi Kawabata, "Challenge Problem: Understanding Three Simultaneous Speakers," Proc. International Joint Conference on Artificial Intelligence (IJCAI-97), Vol. 1, pp. 30-35, Nagoya, Aug. 1997.
  5. Tomohiro Nakatani, Hiroshi G. Okuno, and Takeshi Kawabata: "Residue-driven architecture for Computational Auditory Scene Analysis," Proc. International Joint Conference on Artificial Intelligence (IJCAI-95), Vol. 1, pp. 165-172, Montreal, Canada, Aug. 1995.
  6. Tomohiro Nakatani, Masataka Goto, Takashi Ito, and Hiroshi G. Okuno, "Multi-Agent Based Binaural Sound Stream Segregation," Working Notes of the IJCAI-95 Workshop on Computational Auditory Scene Analysis (CASA-95), pp. 84-91, Aug. 1995.
  7. Tomohiro Nakatani, Hiroshi G. Okuno, and Takeshi Kawabata: "Auditory Stream Segregation in Auditory Scene Analysis with a Multi-Agent System," Proc. National Conference on Artificial Intelligence (AAAI-94), pp. 100-107, Seattle, Aug. 1994.
  9. Okuno, H. G., Nakatani, T., and Kawabata, T., "Challenge problem: understanding three simultaneous speakers," Proc. IJCAI-97 Workshop on Computational Auditory Scene Analysis (CASA-97), pp. 61-68, 1997.
  10. Hiroshi G. Okuno, Tomohiro Nakatani, and Takeshi Kawabata: "Interfacing Sound Stream Segregation to Speech Recognition Systems - Preliminary Results of Listening to Several Things at the Same Time," Proc. National Conference on Artificial Intelligence (AAAI-96), Portland, U.S.A., Aug. 1996.
  11. Okuno, H. G., Nakatani, T., and Kawabata, T., "Cocktail-party effect with computational auditory scene analysis - preliminary report," in Y. Anzai and Kato, eds., Symbiosis of Human and Artifact (Proc. HCI International '95), Vol. 2, pp. 503-508, Elsevier Science B.V., The Netherlands, July 1995.

Books (chapters), invited reviews
  1. Nakatani, T., Miyoshi, M., and Kinoshita, K., "Single microphone blind dereverberation," in Jacob Benesty, Shoji Makino, and Jingdong Chen (Eds.), Speech Enhancement, pp.247-270, Springer, 2005.
  2. Nakatani, T., Miyoshi, M., and Kinoshita, K., "One microphone blind dereverberation based on quasi-periodicity of speech signals," in Thrun, S., Saul, L. K., and Scholkopf B. (Eds.), Advances in Neural Information Processing Systems 16, pp.1417-1424, MIT Press, 2004.
  3. Nakatani, T., Okuno, H.G., Goto, M., and Ito, T., "Multiagent based binaural sound stream segregation," Rosenthal, D., and Okuno, H.G. (eds.), Computational Auditory Scene Analysis, pp.195-214, Lawrence Erlbaum Associates, 1998.
  4. Tomohiro Nakatani, Hiroshi G. Okuno, and Takeshi Kawabata: "Auditory Stream Segregation by Cooperation of Different Kinds of Agents," in Multi-Agents and Cooperative Computation IV (MACC'94), edited by K. Hasida, Kindai-Kagaku-Sha, Nov. 1995. (in Japanese)
  5. Tomohiro Nakatani, Hiroshi G. Okuno, and Takeshi Kawabata: "Multi-Agent Based Sound Stream Segregation for Auditory Scene Analysis," in Multi-Agents and Cooperative Computation III (MACC'93), (in Japanese)
  6. Okuno, H. G., and Nakatani, T., "Sound stream segregation by a multi-agent system," Systems, Control and Information (Journal of the Institute of Systems, Control and Information Engineers), vol.41, no.8, pp.309-315, Aug. 1997 (in Japanese).

Lectures, invited talks
  1. Miyoshi, M., Nakatani, T., Mukai, R., Sawada, H., Hikichi, T., Araki, S., and Kinoshita, K., "Research trends in blind signal processing," IEICE Technical Report, vol.104, no.143, EA-2004-21, pp.23-30, 2004 (in Japanese).
  2. Tomohiro Nakatani, Takeshi Kawabata, Hiroshi G. Okuno, "Sound Stream Segregation for Machine Audition," Proc. International Workshop on Human Interface Technology (IWHIT-94), Aizu, Sep., 1994.
  3. Okuno, H. G., and Nakatani, T., "Research on understanding sound environments: toward realizing the cocktail party effect," BMC Forum, RIKEN, Nagoya, Jan. 24, 1996 (in Japanese).
  4. Nakatani, T., "A computational theory of sound source separation," panel discussion "What is needed for a computational theory of hearing?", Auditory Research Meeting of the Acoustical Society of Japan, ATR, Kyoto, Sep. 2002 (in Japanese).

Others
  1. Nakatani, T., "A study of the shared network method using composite networks," Proc. Annual Conference of the Japanese Society for Artificial Intelligence (JSAI), June 1992 (in Japanese).
  2. Okuno, H.G., Kashino, K., Okada, Nakatani, T., and Kawabata, T., "Incorporating environmental sound processing into a spoken language understanding system based on an emergent computation model," Proc. Annual Conference of the Japan Society for Software Science and Technology, 1992 (in Japanese).
  3. Nakatani, T., Kawabata, T., and Okuno, H.G., "Understanding sound environments with an emergent computation model: construction and evaluation of sound stream segregation agents," Proc. JSAI Annual Conference, July 1993 (in Japanese).
  4. Nakatani, T., Kawabata, T., and Okuno, H.G., "A computational approach to sound stream segregation," Auditory Research Meeting of the Acoustical Society of Japan (ASJ), H-93-83, 1993 (in Japanese).
  5. Nakatani, T., Okuno, H.G., and Kawabata, T., "Dynamics of sound stream segregation by a multi-agent system," Proc. JSAI Annual Conference, June 1994 (in Japanese).
  6. Nakatani, T., Okuno, H.G., and Kawabata, T., "Sound stream segregation by a multi-agent system: improving the exclusiveness of stream segregation," Proc. IPSJ National Convention, vol.6, 6-213, Sep. 1994 (in Japanese).
  7. Nakatani, T., Kawabata, T., and Okuno, H.G., "A computational approach to understanding sound environments," Proc. ASJ Autumn Meeting, 1994 (in Japanese).
  8. Nakatani, T., Goto, M., and Okuno, H.G., "Sound stream segregation by multi-agents," Proc. JSAI Annual Conference, June 1995 (in Japanese).
  9. Nakatani, T., Goto, M., Kawabata, T., and Okuno, H.G., "Sound stream segregation based on harmonic structure and direction identification," Proc. ASJ Autumn Meeting, 1995 (in Japanese).
  10. Nakatani, T., Kawabata, T., and Okuno, H.G., "A study of sound stream segregation for realizing the cocktail party effect: proposal of the residue-driven architecture and its application to monaural sounds," Proc. IPSJ National Convention, 1995 (in Japanese).
  11. Okuno, H.G., Nakatani, T., and Kawabata, T., "Evaluation of sound stream segregation by speech recognition," IEICE Technical Report, NLC95-51, SP95-86, 1995 (in Japanese).
  12. Nakatani, T., Goto, M., Kawabata, T., and Okuno, H.G., "Speech stream segregation by harmonic structure separation and consonant restoration," Proc. ASJ Spring Meeting, 1996 (in Japanese).
  13. Nakatani, T., Kashino, K., and Okuno, H.G., "Segregation of sound streams from speech with background music," IPSJ SIG Music and Computer Meeting, Atsugi, Mar. 1997 (in Japanese).
  14. Nakatani, T., Kashino, K., and Okuno, H.G., "Proposal of a sound ontology for integrating speech stream and music stream segregation," Proc. JSAI Annual Conference, 1997 (in Japanese).
  15. Nakatani, T., and Irino, T., "A fundamental frequency extraction method using the dominance of harmonic components," Proc. ASJ Spring Meeting, 3-10-11, pp.323-324, Mar. 2002 (in Japanese).
  16. Nakatani, T., and Irino, T., "Evaluation of an F0 extraction method using instantaneous frequency with multiple voices," Proc. ASJ Autumn Meeting, 1-2-3, pp.211-212, Oct. 2001 (in Japanese).
