Smarter instant analysis of “Current” with big data
- Streaming analysis with distributed online machine learning -
Abstract
Over the last decade, huge quantities of unstructured data, such as micro-blog posts, have been produced. Many studies and surveys have pointed out the potential benefits of analyzing such unstructured big data in real time. Consequently, sophisticated analysis that can handle unstructured data in real time has become an emerging trend. However, real-time analysis and sophisticated analysis are inherently in a trade-off relationship. We studied how to balance these two conflicting requirements at a high level. As a result of this study, we introduced Jubatus, a distributed real-time data analysis framework. With Jubatus, we can analyze and classify natural language data in a 16 MB/s stream. By offering Jubatus as open source software, we are contributing to real-time marketing and smart social infrastructure management.