Big Data Analysis

Smarter instant analysis of “Current” with big data

- Streaming analysis with distributed online machine learning -

Abstract

In the last decade, huge quantities of unstructured data, such as micro-blog posts, have been produced. Many studies and surveys have pointed out the potential benefits of analyzing this unstructured big data in real time. As a result, sophisticated analysis that can handle unstructured data in real time has become an emerging trend. However, real-time processing and sophisticated analysis are inherently in a trade-off relationship. We studied how to balance these conflicting requirements while keeping both at a high level. The outcome of this study is Jubatus, a distributed real-time data analysis framework. With Jubatus, we can analyze and classify natural language data in a 16 MB/s stream. By offering Jubatus as open source software, we are contributing to real-time marketing and smart social infrastructure management.
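To illustrate what "online" learning means in the abstract above, the following is a minimal sketch of a classifier that updates its model one text at a time instead of retraining on a stored batch. It is not Jubatus code and does not use the Jubatus API: the class name OnlinePerceptron, the whitespace tokenizer, the sample texts, and the choice of a plain multiclass perceptron are all assumptions made for illustration, and the distributed part of the framework is not shown.

```python
from collections import defaultdict


class OnlinePerceptron:
    """Hypothetical multiclass perceptron that learns from one text at a time."""

    def __init__(self):
        # weights[label][token] -> float
        self.weights = defaultdict(dict)

    @staticmethod
    def features(text):
        # Bag-of-words counts over lowercase whitespace tokens (an assumption;
        # real natural language data would need proper tokenization).
        counts = defaultdict(float)
        for token in text.lower().split():
            counts[token] += 1.0
        return counts

    def score(self, label, feats):
        w = self.weights[label]
        return sum(w.get(t, 0.0) * v for t, v in feats.items())

    def predict(self, text):
        if not self.weights:
            return None  # no label has been seen yet
        feats = self.features(text)
        # Sorting the labels makes tie-breaking deterministic.
        return max(sorted(self.weights), key=lambda lab: self.score(lab, feats))

    def update(self, text, label):
        """Consume one labeled example; change weights only on a mistake."""
        guess = self.predict(text)
        if guess != label:
            for t, v in self.features(text).items():
                self.weights[label][t] = self.weights[label].get(t, 0.0) + v
                if guess is not None:
                    self.weights[guess][t] = self.weights[guess].get(t, 0.0) - v
        return guess


# Illustrative stream; each example is seen once and then discarded.
stream = [
    ("great phone love it", "pos"),
    ("terrible battery hate it", "neg"),
    ("love the great screen", "pos"),
    ("hate the terrible screen", "neg"),
]

model = OnlinePerceptron()
for text, label in stream:
    model.update(text, label)

print(model.predict("great love"))
print(model.predict("terrible hate"))
```

Because each update touches only the tokens of the current example, the cost per item stays small and no history has to be kept, which is the property that makes this style of learning suitable for streams. A distributed setting would additionally need a way to combine the models held by different servers, which this sketch leaves out.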


Presenters

Keitaro Horikawa
NTT Software Innovation Center
Distributed Data Processing Platform SE Project
Aoya Koji
NTT Software Innovation Center
Distributed Data Processing Platform SE Project
Hiroyuki Makino
NTT Software Innovation Center
Distributed Data Processing Platform SE Project