Mission statement

We need more than just data analysis techniques to solve the wide variety of issues in big data: we also need to utilize databases, cloud computing, security methodologies, and other related technologies along with domain-specific expertise. At the Machine Learning and Data Science Center, we are solving issues in big data by taking a multidisciplinary approach to implementing an advanced and scientifically tested big data analysis framework based on machine learning through close collaboration with specialists from diverse fields.    
Director Naonori Ueda

News

  • The Machine Learning and Data Science Center was primarily responsible for starting a new initiative (“himico”) with the goal of making the world a better place by using big data analysis to predict the flow of people, objects, and information in the near future.
  • The 2015 IFIP/IEEE International Symposium on Integrated Network Management (IFIP/IEEE IM 2015) accepted a paper written by Kei Takeshita, Masahiro Yokota, and Ken Nishimatsu: "Early Network Failure Detection System by Analyzing Twitter Data."
  • Members of NTT Laboratories became the first employees of a Japanese company to join the ranks of Apache Hadoop’s active committers.
  • The Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI-15) accepted a paper written by Yuya Yoshikawa, Tomoharu Iwata, and Hiroshi Sawada: "Non-linear Regression for Bag-of-Words Data via Gaussian Process Latent Variable Set Model."
  • Tomoharu Iwata (principal investigator and research fellow) received the Nagao Special Researcher Award from the Information Processing Society of Japan for his data mining research based on a probabilistic latent variable model.

Research projects

Large-scale real-time distributed data analysis framework (“Jubatus”)

Jubatus is one of the first frameworks for scaling the performance of online machine learning by distributing tasks across multiple machines.
Jubatus:
(1) Is effective on closed-circuit television footage and other large, complex datasets
(2) Learns rules from sample data
(3) Analyzes data immediately without storing or accumulating it
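As a rough illustration of properties (2) and (3), the following minimal Python sketch updates a model one sample at a time and then discards the sample. It is not the Jubatus API and does not reflect how Jubatus distributes work across machines; the class name `OnlinePerceptron`, the feature name `motion`, and the toy stream are all hypothetical, and a simple perceptron stands in for whichever learners Jubatus actually provides.

```python
class OnlinePerceptron:
    """Binary classifier updated one sample at a time; no samples are stored."""

    def __init__(self):
        self.weights = {}   # feature name -> weight
        self.bias = 0.0
        self.seen = 0       # number of samples processed so far

    def score(self, features):
        # Weighted sum over the features present in this sample.
        return self.bias + sum(
            self.weights.get(name, 0.0) * value
            for name, value in features.items()
        )

    def predict(self, features):
        return 1 if self.score(features) > 0 else -1

    def update(self, features, label):
        """Learn from one (features, label) pair; label is +1 or -1."""
        self.seen += 1
        if self.predict(features) != label:
            # Standard perceptron rule: move weights toward the true label.
            for name, value in features.items():
                self.weights[name] = self.weights.get(name, 0.0) + label * value
            self.bias += label


def sample_stream():
    """Hypothetical stream; a real deployment would read from a live feed."""
    yield {"motion": 1.0}, 1
    yield {"motion": 0.0}, -1
    yield {"motion": 1.0}, 1
    yield {"motion": 0.0}, -1


model = OnlinePerceptron()
for features, label in sample_stream():
    model.update(features, label)   # the sample is not kept after this call
```

Only the weights and the bias persist between samples, so memory use does not grow with the length of the stream.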
  

High-speed graph mining technology (“Grapon”)

We have implemented one of the fastest graph mining algorithms in the world. Our algorithm uses graph structures to discover insights hidden within data and also has the following properties:
(1) World-class speed (50x faster than conventional algorithms)
(2) Hundreds of millions of nodes can be processed on a single server
(3) Available as a plugin with Gephi, the open-source visualization tool
Grapon can be used to implement precise recommendation services by discovering similar people and influential figures from Twitter and other large-scale social graphs and by finding strongly correlated people and items from one’s purchase history. Grapon can also be used to simulate traffic by dividing street networks into optimal clusters based on dynamic traffic data.
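To make the idea of "discovering similar people" in a social graph concrete, here is a small Python sketch that ranks pairs of nodes by the Jaccard similarity of their neighbor sets. This is a generic textbook measure chosen for illustration, not Grapon's algorithm, and the brute-force pairwise comparison below would not scale to the graph sizes mentioned above. The function names and the toy edge list are assumptions.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def jaccard(adjacency, u, v):
    """Shared neighbors divided by all neighbors of u and v."""
    union = adjacency[u] | adjacency[v]
    if not union:
        return 0.0
    return len(adjacency[u] & adjacency[v]) / len(union)


def most_similar_pairs(edges, top_k=3):
    """Return the top_k (similarity, u, v) tuples, most similar first."""
    adjacency = build_adjacency(edges)
    scored = [
        (jaccard(adjacency, u, v), u, v)
        for u, v in combinations(sorted(adjacency), 2)
    ]
    # Sort by descending similarity, then by name for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))
    return scored[:top_k]


# Hypothetical "follows" relationships between five people.
edges = [
    ("ann", "bob"), ("ann", "cat"),
    ("dan", "bob"), ("dan", "cat"),
    ("eve", "cat"),
]
top = most_similar_pairs(edges)
```

In this toy graph, `ann` and `dan` share exactly the same neighbors, so they come out as the most similar pair even though they are not directly connected.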

Spatiotemporal multidimensional aggregate data analysis

By developing multidimensional aggregate data analysis with an additional spatiotemporal component, we can model spatiotemporal effects, predict when phenomena will occur, and respond proactively.
Spatiotemporal multidimensional aggregate data analysis can:
(1) Model spatial interactions
(2) Be treated as aggregate data
(3) Predict when phenomena will occur
  

Publications


2015

  1. Hiroaki Shiokawa, Yasuhiro Fujiwara, Makoto Onizuka, "SCAN++: Efficient Algorithm for Finding Clusters, Hubs and Outliers on Large-scale Graphs," PVLDB 8(11), pp. 1178-1189, 2015.
  2. Yasuhiro Fujiwara, Dennis Shasha, "Quiet: Faster Belief Propagation for Images and Related Applications," IJCAI, pp. 3497-3503, 2015.
  3. Yasuhiro Fujiwara, Makoto Nakatsuji, Hiroaki Shiokawa, Yasutoshi Ida, Machiko Toyoda, "Adaptive Message Update for Fast Affinity Propagation," KDD, pp. 309-318, 2015.
  4. Yasuhiro Fujiwara, Go Irie, Shari Kuroyama, Makoto Onizuka, "Scaling Manifold Ranking Based Image Retrieval," PVLDB 8(4), pp. 341-352, 2014.
  5. Kei Takeshita, Masahiro Yokota, and Ken Nishimatsu, "Early Network Failure Detection System by Analyzing Twitter Data," In Proc. of IFIP/IEEE International Symposium on Integrated Network Management (IFIP/IEEE IM 2015), 2015 (accepted).
  6. Yuya Yoshikawa, Tomoharu Iwata, Hiroshi Sawada, "Non-linear Regression for Bag-of-Words Data via Gaussian Process Latent Variable Set Model," The 29th AAAI Conference on Artificial Intelligence (AAAI-15), Austin, Texas, USA, Jan. 2015.

2014

  1. Hideaki Kim, Noriko Takaya, Hiroshi Sawada, "Tracking Temporal Dynamics of Purchase Decisions via Hierarchical Time-Rescaling Model," In Proc. of the 23rd ACM International Conference on Information and Knowledge Management (CIKM 2014), pp. 1389-1398, November 2014.
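  2. Aki Hayashi, Tatsushi Matsubayashi, Hiroshi Sawada, "A Method for Calculating Degree of Habit for Information Delivery Using Location Information" (in Japanese), DBSJ Japanese Journal, Vol. 13, No. 1, pp. 64-71, October 2014.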
  3. Noriko Takaya, Yusuke Kumagae, Yusuke Ichikawa, Hiroshi Sawada, "Adopted Transfer Learning to Item Purchase Prediction on Web Marketing," In Proc. of International Workshop on Informatics (IWIN2014), pp. 175-183, September 2014.
  4. Takeshi Kurashima, Tomoharu Iwata, Noriko Takaya, Hiroshi Sawada, "Probabilistic latent network visualization: inferring and embedding diffusion networks," In Proc. of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD 2014), pp. 1236-1245, August 2014.
  5. Makoto Nakatsuji, Yasuhiro Fujiwara, Hiroyuki Toda, Hiroshi Sawada, Jin Zheng, James Alexander Hendler, "Semantic Data Representation for Improving Tensor Factorization," In Proc. Twenty-Eighth AAAI Conference on Artificial Intelligence (AAAI-14), pp. 2004-2012, July 2014.
  6. Aki Hayashi, Tatsushi Matsubayashi, Hiroshi Sawada, "Regular behavior measure for location based services," In Proc. of the 2014 ACM conference on Web science (WebSci'14), pp. 299-300, June 2014.
  7. Yasuhiro Fujiwara, Go Irie, "Efficient Label Propagation," in Proceedings of The 31st International Conference on Machine Learning, pp. 784-792, 2014.
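  8. Junya Arai, Makoto Onizuka, Hiroaki Shiokawa, "Efficient k-Anonymization by Combining Clustering and Space Partitioning" (in Japanese), DBSJ Journal, Vol. 13-J, No. 1, pp. 72-77, October 2014.
  9. Yasuhiro Iida, Yasunari Kishimoto, Yasuhiro Fujiwara, Hiroaki Shiokawa, Makoto Onizuka, "Community Extraction and Importance Computation from Large-Scale Graph-Structured Data: Speed-Up Efforts and Applications" (in Japanese), Journal of the Japanese Society for Artificial Intelligence, Vol. 29, No. 5, pp. 472-479.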
  10. Tatsuaki Kimura, Keisuke Ishibashi, Tatsuya Mori, Hiroshi Sawada, Tsuyoshi Toyono, Ken Nishimatsu, Akira Watanabe, Akihiro Shimoda, and Kohei Shiomoto, "Spatio-temporal Factorization of Log Data for Understanding Network Events," In Proc. of IEEE International Conference on Computer Communications (IEEE INFOCOM 2014), pp. 610-618, 2014.
  11. Takuma Otsuka, Katsuhiko Ishiguro, Hiroshi Sawada, Hiroshi Okuno, "Bayesian Nonparametrics for Microphone Array Processing," IEEE Global SIP, (to appear).
  12. Yuya Yoshikawa, Tomoharu Iwata, Hiroshi Sawada, "Collaboration on Social Media: Analyzing Successful Projects on Social Coding," arXiv:1408.6012, 2014.
  13. Yuya Yoshikawa, Tomoharu Iwata, Hiroshi Sawada, "Latent Support Measure Machines for Bag-of-Words Data Classification," NIPS2014, (to appear)
  14. Takuma Otsuka, Katsuhiko Ishiguro, Takuya Yoshioka, Hiroshi Sawada, Hiroshi G. Okuno, "Multichannel Sound Source Dereverberation and Separation for Arbitrary Number of Sources based on Bayesian Nonparametrics," IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 22, No. 12, pp. 2218-2232, December, 2014.
  15. Masakazu Ishihata, Tomoharu Iwata, "Generating Structure of Latent Variable Models for Nested Data," in Proceedings of the 30th conference on Uncertainty in Artificial Intelligence (UAI-2014), July 2014.
  16. Masahiro Nakano, Katsuhiko Ishiguro, Akisato Kimura, Takeshi Yamada, Naonori Ueda, "Rectangular tiling process," In Proceedings of the International Conference on Machine Learning (ICML 2014), pp. 361-369, 2014.
  17. Yutaka Yanagisawa, Yasue Kishino, Takayuki Suyama, Futoshi Naya, Tsutomu Terada, Masahiko Tsukamoto, "CILIX: a CIL Virtual Machine for Wireless Sensor Devices," IEEE PDPTA'14, 2014.
  18. Tomohiro Warashina, Kazuo Aoyama, Hiroshi Sawada, Takashi Hattori, "Efficient K-Nearest Neighbor Graph Construction Using MapReduce for Large-Scale Data Sets," IEICE Trans. Inf. and Syst., (to appear), 2014.
  19. Akisato Kimura, Kevin Duh, Tsutomu Hirao, Katsuhiko Ishiguro, Tomoharu Iwata, Ching-man Au Yeung, "Creating stories from socially curated microblog messages," IEICE Transactions on Information and Systems, Vol. E97-D, No. 6, pp.1557-1566, 2014.
  20. Makoto Yamada, Leonid Sigal, Michalis Raptis, "Covariate Shift Adaptation for Discriminative 3D Pose Estimation," IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), February 1, 2014.
  21. Tsutomu Terada, Seiji Takeda, Masahiko Tsukamoto, Yutaka Yanagisawa, Yasue Kishino, Takayuki Suyama, "On achieving dependability for wearable computing by device bypassing," in Proceedings of the 5th Augmented Human International Conference (AH'14), Article No. 8, March 2014.
  22. Yuya Yoshikawa, Tomoharu Iwata, Hiroshi Sawada, "Collaboration on Social Media: Factor Analysis of Successful Projects in Social Coding" (in Japanese), DBSJ Journal, Vol. 12, No. 3, Feb. 2014.
  23. Mathieu Blondel, Akinori Fujino, Naonori Ueda, "Large-scale Multiclass Support Vector Machine Training via Euclidean Projection onto the Simplex," Proc. of the 22nd International Conference on Pattern Recognition (ICPR 2014), pp. 1289-1294, 2014.
  24. Akinori Fujino, Jun Suzuki, Tsutomu Hirao, Hisashi Kurasawa, Katsuyoshi Hayashi, "SCT-D3 at the NTCIR-11 MedNLP-2 Task," Proc. of the 11th NTCIR Conference, pp. 167-169, 2014.
  25. Naonori Ueda, Yusuke Tanaka, Akinori Fujino, "Robust Naive Bayes Combination of Multiple Classifications," The Impact of Applications on Mathematics, Proceedings of the Forum of Mathematics for Industry, Springer, pp. 141-156, 2014.
  26. Mathieu Blondel, Yotaro Kubo, Naonori Ueda, "Online Passive-Aggressive Algorithms for Non-Negative Matrix Factorization and Completion," International Conference on Artificial Intelligence and Statistics (AISTATS2014), 2014.
  27. Mathieu Blondel, Akio Onogi, Hiroshi Iwata, Naonori Ueda, "Genomic Selection: Ranking Approach," Neural Information Processing, Computational Biology Workshop, 2014.
  28. Mathieu Blondel, Akinori Fujino, Naonori Ueda, "Large-scale Multiclass Support Vector Machine Training via Euclidean Projection onto the Simplex," International Conference on Pattern Recognition (ICPR2014), 2014.
  29. Yasuko Matsubara, Yasushi Sakurai, Naonori Ueda, Masatoshi Yoshikawa, "Fast and Exact Monitoring of Co-evolving Data Streams," IEEE International Conference on Data Mining (ICDM 2014), 2014.
  30. Kyosuke Nishida, Hiroyuki Toda, Takeshi Kurashima, and Yoshihiko Suhara, "Probabilistic Identification of Visited Point-of-Interest for Personalized Automatic Check-in," In Proc. of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp'14), pp. 631-642, Seattle, Washington, US, September 2014.

Members

Naonori Ueda

Project Manager

Kenichi Arai
Futoshi Naya
Hideaki Kin
Hitoshi Shimizu
Takuma Otsuka
Mathieu Blondel
Jotaro Ikedo
Masaru Miyamoto
Tatsushi Matsubayashi
Takeshi Kurashima
Masahiro Kohjima
Shohei Uchikawa
Yoshitaka Nakamura
Sekitoshi Kanai
Masato Sawada
Shotaro Tora
Seiji Kihara
Sotetsu Iwamura
Yasuhiro Iida
Yasunari Kishimoto
Yasuhiro Fujiwara
Junya Arai
Yasutoshi Ida
Mitsuaki Tsunakawa
Takashi Hayashi
Akira Takahashi
Keisuke Ishibashi