Abstract of Articles
Please e-mail to akisato <at> ieee org
if you would like to download articles listed below.
Kota Nagayama, Akisato Kimura, Hiroyuki Fujishiro
"Make it go viral - Generating attractive headlines for distributing news articles on social media,"
Proc. Computation + Journalism Symposium (C+J2016), September 2016.
[ pdf ]
[ slides ]
[ copyright notice ]
- Abstract
-
A huge number of news articles are distributed on social media, and news consumers can access those articles at any time from hand-held devices. This means that news providers are under pressure to find ways of attracting the interest of news consumers. One key factor that has a great impact on the attractiveness of a news article is its headline as a way of guiding news consumers to news articles. This research explores the challenge of automatically generating attractive news headlines for social media, and to this end we focus on the problem of identifying key sentences that are useful for generating viral news headlines from a given news article. We show that this problem can be formulated as supervised sequence labeling that utilizes user activity on social media as supervised information, and we propose a neural network model for this purpose. Investigations with our corpus consisting of microblog posts and news articles demonstrate that lead sentences believed to be the most suitable for news summaries do not necessarily contribute to increased virality whereas our proposed method can accurately identify key sentences.
Katsuhiko Ishiguro, Issei Sato, Masahiro Nakano, Akisato Kimura, Naonori Ueda
"Infinite plaid models for infinite bi-clustering,"
Proc. AAAI Conference on Artificial Intelligence (AAAI2016), February 2016.
[ pdf ] [ permalink ] [ poster ] [ copyright notice ]
- Abstract
-
We propose a probabilistic model for non-exhaustive and overlapping (NEO) bi-clustering. Our goal is to extract a few sub-matrices from the given data matrix, where entries of a sub-matrix are characterized by a specific distribution or parameters. Existing NEO bi-clustering methods typically require the number of sub-matrices to be extracted, which is essentially difficult to fix a priori. In this paper, we extend the plaid model, known as one of the best NEO bi-clustering algorithms, to allow infinite bi-clustering: NEO bi-clustering without specifying the number of sub-matrices. Our model can represent infinite sub-matrices formally. We develop an MCMC inference without the finite truncation, which potentially addresses all possible numbers of sub-matrices. Experiments quantitatively and qualitatively verify the usefulness of the proposed model. The results reveal that our model can offer more precise and in-depth analysis of sub-matrices.
Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi
"Distributed forests for MapReduce-based machine learning,"
Proc. IAPR Asian Conference on Pattern Recognition (ACPR2015), November 2015.
[ pdf ] [ permalink ] [ poster ] [ copyright notice ]
- Abstract
-
This paper proposes a novel method for training random forests with big data on MapReduce clusters. Random forests are well suited for parallel distributed systems, since they are composed of multiple decision trees and every decision tree can be independently trained by ensemble learning methods. However, naive implementation of random forests on distributed systems easily overfits the training data, yielding poor classification performances. This is because each cluster node can have access to only a small fraction of the training data. The proposed method tackles this problem by introducing the following three steps. (1) "Shared forests" are built in advance on the master node and shared with all the cluster nodes. (2) With the help of transfer learning, the shared forests are adapted to the training data placed on each cluster node. (3) The adapted forests on every cluster node are returned to the master node, and irrelevant trees yielding poor classification performances are removed to form the final forests. Experimental results show that our proposed method for MapReduce clusters can quickly learn random forests without any sacrifice of classification performance.
Sawa Kourogi, Akisato Kimura, Hiroyuki Fujishiro, Hitoshi Nishikawa
"Identifying attractive news headlines for social media,"
Proc. ACM International Conference on Information and Knowledge Management (CIKM2015), October 2015.
[ pdf ] [ permalink ] [ poster ] [ copyright notice ]
- Abstract
-
In the past, leading newspaper companies and broadcasters were the sole distributors of news articles, and thus news consumers simply received news articles from those outlets at regular intervals. However, the growth of social media and smart devices led to a considerable change in this traditional relationship between news providers and consumers. Hundreds of thousands of news articles are now distributed on social media, and consumers can access those articles at any time via smart devices. This has meant that news providers are under pressure to find ways of engaging the attention of consumers. This paper provides a novel solution to this problem by identifying attractive headlines as a gateway to news articles. We first perform one of the first investigations of news headlines on a major viral medium. Using our investigation as a basis, we also propose a learning-to-rank method that suggests promising news headlines. Our experiments with 2,000 news articles demonstrate that our proposed method can accurately identify attractive news headlines from the candidates, and reveal several promising factors in making news articles go viral.
Jun Fujiki, Masaru Tanaka, Hitoshi Sakano, Akisato Kimura
"Geometry of Fisher's linear discriminant analysis,"
Proc. IAPR International Conference on Machine Vision and Applications (MVA2015), May 2015.
[ pdf ] [ permalink ] [ slides ] [ copyright notice ]
- Abstract
-
To appear.
Jiro Nakajima, Akisato Kimura, Akihiro Sugimoto, Kunio Kashino
"Visual attention driven by auditory cues: Selecting visual features in synchronization with attracting auditory events,"
to appear, International Conference on Multimedia Modeling (MMM2015), January 2015.
[ pdf ] [ DOI link ] [ slides ]
- Abstract
-
Human visual attention can be modulated not only by visual stimuli but also by ones from other modalities such as audition. Hence, incorporating auditory information into a human visual attention model would be a key issue for building more sophisticated models. However, the way of integrating multiple pieces of information arising from audio-visual domains still remains a challenging problem. This paper proposes a novel computational model of human visual attention driven by auditory cues. Founded on the Bayesian surprise model that is considered to be promising in the literature, our model uses surprising auditory events to serve as a clue for selecting synchronized visual features and then emphasizes the selected features to form the final surprise map. Our approach to audio-visual integration focuses on using only the effective visual features, rather than all available features, for simulating visual attention with the help of auditory information. Experiments using several video clips show that our proposed model can better simulate eye movements of human subjects than other existing models even though our model uses a smaller number of visual features.
Masahiro Nakano, Katsuhiko Ishiguro, Akisato Kimura, Takeshi Yamada, Naonori Ueda
"Rectangular tiling process,"
Proc. International Conference on Machine Learning (ICML2014), pp.361-369, June 2014.
[ pdf ]
[ permalink ]
[ slides ] [ poster ]
- Abstract
-
This paper proposes a novel stochastic process that represents the arbitrary rectangular partitioning of an infinite-dimensional matrix as the conditional projective limit. Rectangular partitioning is used in relational data analysis, and is classified into three types: regular grid, hierarchical, and arbitrary. Conventionally, a variety of probabilistic models have been advanced for the first two, including the product of Chinese restaurant processes and the Mondrian process. However, existing models for arbitrary partitioning are too complicated to permit the analysis of the statistical behaviors of models, which places very severe capability limits on relational data analysis. In this paper, we propose a new probabilistic model of arbitrary partitioning called the rectangular tiling process (RTP). Our model has a sound mathematical base in projective systems and infinite extension of conditional probabilities, and is capable of representing partitions of infinite elements as found in ordinary Bayesian nonparametric models.
Akisato Kimura
[Invited]
"Large-scale cross-media analysis and mining from socially curated contents,"
Progress in Informatics, 2014.
[ pdf ]
[ DOI link ]
- Abstract
-
The major interest of current social network service (SNS) developers and users is rapidly shifting from conventional text-based (micro)blogs such as Twitter and Facebook to multimedia contents such as Flickr, Snapchat, MySpace and Tumblr. However, the ability to analyze and exploit unorganized multimedia contents on those services still remains inadequate, even with state-of-the-art media processing and machine learning techniques.
This paper focuses on another emerging trend called social curation, a human-in-the-loop alternative to automatic algorithms for social media analysis. Social curation can be defined as a spontaneous human process of remixing social media content for the purpose of further consumption. What characterizes social curation is definitely the manual effort involved in organizing a collection of social media contents, which indicates that socially curated content has potential as a promising information source compared with automatic summaries generated by algorithms. Curated contents would also provide latent perspectives and contexts that are not explicitly presented in the original resources. Following this trend, this paper presents recent developments and growth of social curation services, and reviews several research trials for cross-media analysis and mining from socially curated contents.
Kevin Duh, Akisato Kimura, Tsutomu Hirao, Katsuhiko Ishiguro, Tomoharu Iwata, Albert Au Yeung
"Creating stories from socially curated microblog messages,"
IEICE Transactions on Information and Systems, 2014.
[ permalink ]
[ PDF ]
- Abstract
-
Social media such as microblogs have become so pervasive that it is now possible to use them as sensors for real-world events and memes. While much recent research has focused on developing automatic methods for filtering and summarizing these data streams, we explore a different trend called social curation. In contrast to automatic methods, social curation is characterized as a human-in-the-loop and sometimes crowd-sourced mechanism for exploiting social media as sensors. Although social curation web services like Togetter, Naver Matome and Storify are gaining popularity, little academic research has studied the phenomenon.
In this paper, our goal is to investigate the phenomenon and potential of this new field of social curation. First, we perform an in-depth analysis of a large corpus of curated microblog data. We seek to understand why and how people participate in this laborious curation process. We then explore new ways in which information retrieval and machine learning technologies can be used to assist curators. In particular, we propose a novel method based on a learning-to-rank framework that increases the curator's productivity and breadth of perspective by suggesting which novel microblogs should be added to the curated content.
Koh Takeuchi, Ryota Tomioka, Katsuhiko Ishiguro, Akisato Kimura, Hiroshi Sawada
"Non-negative multiple tensor factorization,"
Proc. IEEE International Conference on Data Mining (ICDM 2013).
[ PDF ] [ slides ]
- Abstract
-
Non-negative Tensor Factorization (NTF) is a widely used technique for decomposing a non-negative value tensor into sparse and reasonably interpretable factors. However, NTF performs poorly when the tensor is extremely sparse, which is often the case with real-world data and higher-order tensors. In this paper, we propose Non-negative Multiple Tensor Factorization (NMTF), which factorizes the target tensor and auxiliary tensors simultaneously. Auxiliary data tensors compensate for the sparseness of the target data tensor. The factors of the auxiliary tensors also allow us to examine the target data from several different aspects. We experimentally confirm that NMTF performs better than NTF in terms of reconstructing the given data. Furthermore, we demonstrate that the proposed NMTF can successfully extract spatio-temporal patterns of people's daily life such as leisure, drinking, and shopping activity by analyzing several tensors extracted from online review data sets.
Alejandro Marcos Alvarez, Makoto Yamada, Akisato Kimura, Tomoharu Iwata
"Clustering-based anomaly detection in multi-view data,"
Proc. ACM Conference on Information and Knowledge Management (CIKM 2013).
[ PDF ]
[ poster ]
- Abstract
-
This paper proposes a simple yet effective anomaly detection method for multi-view data. The proposed approach detects anomalies by comparing the neighborhoods in different views. Specifically, clustering is performed separately in the different views and affinity vectors are derived for each object from the clustering results. Then, the anomalies are detected by comparing affinity vectors in the multiple views. An advantage of the proposed method over existing methods is that the tuning parameters can be determined effectively from the given data. Through experiments on synthetic and benchmark datasets, we show that the proposed method outperforms existing methods.
Akisato Kimura, Katsuhiko Ishiguro, Alejandro Marcos Alvarez, Kaori Kataoka, Kazuhiko Murasaki, Makoto Yamada
"Image context discovery from socially curated contents,"
Proc. ACM International Conference on Multimedia (ACMMM 2013).
[ PDF ]
[ poster ]
- Abstract
-
This paper proposes a novel method of discovering a set of image contents sharing a specific context (attributes or implicit meaning) with the help of image collections obtained from social curation platforms. Socially curated contents are promising for analyzing various kinds of multimedia information, since they are manually filtered and organized based on specific individual preferences, interests or perspectives. Our proposed method fully exploits the process of social curation: (1) how image contents are manually grouped together by users, and (2) how image contents are distributed in the platform. Our method reveals the fact that image contents with a specific context are naturally grouped together and every image content includes a wide variety of contexts that cannot necessarily be verbalized by texts. A preliminary experiment with a small collection of a million images yields a promising result.
Alejandro Marcos Alvarez, Makoto Yamada, Akisato Kimura
"Exploiting socially-generated side information to improve dimensionality reduction,"
Proc. International Workshop on Socially-Aware Multimedia (IWSAM 2013, in conjunction with ACMMM 2013), Barcelona, Spain, October 2013.
[ PDF ]
[ presentation material ]
- Abstract
-
In this paper, we show how side information extracted from socially-curated data can be used within a dimensionality reduction method and to what extent this side information is beneficial to several tasks such as image classification, data visualization and image retrieval. The key idea is to incorporate side information of an image into a dimensionality reduction method. More specifically, we propose a dimensionality reduction method that can find an embedding transformation so that image pairs with similar side information are close in the embedding space. We introduce three types of side information derived from user behavior. Through experiments on an image dataset obtained from Pinterest, we show that incorporating socially-generated side information in a dimensionality reduction method benefits several image-related tasks such as image classification, data visualization and image retrieval.
Makoto Yamada, Akisato Kimura, Hiroshi Sawada, Futoshi Naya
"Change-point detection with feature selection in high-dimensional time-series data,"
Proc. International Joint Conference on Artificial Intelligence (IJCAI 2013), August 2013.
[ PDF ] [ poster ]
- Abstract
-
Change-point detection is the problem of finding abrupt changes in time-series, and it is attracting a lot of attention in the artificial intelligence and data mining communities. In this paper, we present a supervised learning based change-point detection approach in which we use separability of the past and future data at time t (they are labeled as +1 and -1) as the plausibility of a change point. Based on this framework, we propose a detection measure called additive Hilbert-Schmidt Independence Criterion (aHSIC), which is defined as a weighted sum of HSIC scores between each feature and its corresponding binary labels. Here, HSIC is a kernel-based independence measure. A novelty of the aHSIC score is that it can incorporate feature selection during its detection measure estimation. More specifically, we first select features that are responsible for an abrupt change in a supervised manner, and then compute the aHSIC score from those selected features. Thus, compared with traditional detection measures, our approach tends to be robust to noise features, and the aHSIC is therefore suited for high-dimensional time-series change-point detection problems. Through extensive experiments on synthetic and real-world human activity data sets, we demonstrated that the proposed change-point detection method is promising.
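As a rough illustration of the aHSIC idea in this abstract, the sketch below computes a biased empirical HSIC between each feature and the +1/-1 past/future labels, then combines the per-feature scores as a weighted sum. The Gaussian kernel on features, the linear kernel on labels, the bandwidth `sigma`, the uniform default weights and the function names are all assumptions made here for illustration; the paper's own estimator and its feature-selection step may differ.

```python
import numpy as np

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC between a 1-D feature x and labels y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    # Gaussian kernel on the feature (assumed choice).
    K = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * sigma ** 2))
    # Linear kernel on the +1/-1 labels (assumed choice).
    L = np.outer(y, y)
    # Centering matrix.
    H = np.eye(n) - np.ones((n, n)) / n
    return float(np.trace(K @ H @ L @ H)) / (n ** 2)

def ahsic(X, y, weights=None, sigma=1.0):
    """Weighted sum of per-feature HSIC scores; X is (n_samples, n_features)."""
    X = np.asarray(X, dtype=float)
    scores = np.array([hsic(X[:, j], y, sigma) for j in range(X.shape[1])])
    if weights is None:
        weights = np.ones_like(scores)  # uniform weights: an assumption
    return float(np.dot(weights, scores)), scores

# Window around a candidate time t: 4 past samples (+1), 4 future samples (-1).
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])
X = np.column_stack([
    [0, 0, 0, 0, 5, 5, 5, 5],  # feature that changes at t
    [0, 5, 0, 5, 0, 5, 0, 5],  # feature unrelated to the change
])
total, per_feature = ahsic(X, y)
```

In this toy window the first feature separates past from future and receives a clearly positive score, while the second feature carries no information about the labels and scores essentially zero.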
Koh Takeuchi, Katsuhiko Ishiguro, Akisato Kimura, Hiroshi Sawada
"Non-negative multiple matrix factorization,"
Proc. International Joint Conference on Artificial Intelligence (IJCAI 2013), August 2013.
[ PDF ] [ poster ]
- Abstract
-
Non-negative Matrix Factorization (NMF) is a traditional unsupervised machine learning technique for decomposing a matrix into a set of bases and coefficients under the non-negative constraint. NMF with sparse constraints is also known for extracting reasonable components from noisy data. However, NMF tends to give undesired results in the case of highly sparse data, because the information included in the data is insufficient to decompose. Our key idea is that we can ease this problem if complementary data are available that we could integrate into the estimation of the bases and coefficients. In this paper, we propose a novel matrix factorization method called Non-negative Multiple Matrix Factorization (NMMF), which utilizes complementary data as auxiliary matrices that share the row or column indices of the target matrix. The data sparseness is improved by decomposing the target and auxiliary matrices simultaneously, since auxiliary matrices provide information about the bases and coefficients. We formulate NMMF as a generalization of NMF, and then present a parameter estimation procedure derived from the multiplicative update rule. We examined NMMF in both synthetic and real data experiments. The effect of the auxiliary matrices appeared in the improved NMMF performance. We also confirmed that the bases that NMMF obtained from the real data were intuitive and reasonable thanks to the non-negative constraint.
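For readers unfamiliar with the multiplicative update rule mentioned in this abstract, the sketch below shows the standard multiplicative updates for plain NMF under the squared Frobenius error. It is not NMMF: the sharing of factors with auxiliary matrices is not implemented, and the function name, iteration count, `eps` safeguard and toy matrix are assumptions chosen for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF, V ~= W @ H, using standard multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialisation keeps the factors non-negative.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    errors = []
    for _ in range(n_iter):
        # Update coefficients, then bases; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(float(np.linalg.norm(V - W @ H)))
    return W, H, errors

# Small non-negative matrix of rank 2 (rows 1 and 3 are proportional).
V = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0],
              [2.0, 0.0, 4.0]])
W, H, errors = nmf(V, rank=2)
```

Because each update multiplies the current factor by a non-negative ratio, the factors stay non-negative throughout without any explicit projection step.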
Akisato Kimura
[invited] "Social curation as corpora for large-scale multimedia content analysis,"
presented in Industry and Practitioners Session, ACM International Conference on Multimedia Retrieval (ICMR2013), Dallas, Texas, USA, April 2013.
[ presentation material ]
- Abstract
-
We are entering the age of ubiquitous social media. User-generated content such as microblogs has become so pervasive that it is now feasible to exploit it as a "sensor" for real-world events and memes. As such, an active research area is the development of new algorithms for social media analysis. We imagine these algorithmic advances will provide efficient ways to discover and summarize events of interest from large streams of social media. This talk focuses on a different trend. A recent phenomenon called social curation is emerging as a manual human-driven alternative to automatic algorithms for social media analysis. Social curation can be defined as the human process of remixing social media contents for the purpose of further consumption. What characterizes social curation is the manual effort involved in organizing social media content. This human touch means curated content is a potentially richer source of information than automatic summaries/stories generated by algorithms. Specifically, curated content may give additional perspectives that are not present in the original sources. The goal of this talk is to explore this emerging phenomenon and to present several examples applying social curation data to multimedia content analysis.
Katsuhiko Ishiguro, Akisato Kimura, Koh Takeuchi
"Towards automatic image understanding and mining via social curation,"
Proc. IEEE International Conference on Data Mining (ICDM2012),
pp.906--911, Brussels, Belgium, December 2012.
[ pdf ] [ poster ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
The amount and variety of multimedia data such as images, movies and music distributed over social networks are increasing rapidly. However, the ability to analyze and exploit these unorganized multimedia data remains inadequate, even with state-of-the-art media processing techniques. Our finding in this paper is that emerging social curation services are a promising information source for automatic understanding and mining of images distributed and exchanged in social media. One remarkable virtue of a social curation service dataset is that the dataset is weakly supervised: social media content in the service is manually collected, selected and maintained by users. This is very distinct from other social information sources, and we can utilize this characteristic for media content mining without expensive media processing techniques. In this paper we present a machine learning system for predicting view counts of images on social curation data as a first step of automatic image content evaluation. Through experiments, we confirm that the simple features extracted from a social curation corpus are much superior in view count prediction compared to gold-standard image features of computer vision research.
Hitoshi Sakano, Tsukasa Ohashi, Akisato Kimura, Hiroshi Sawada, Katsuhiko Ishiguro
"Extended Fisher criterion based on auto-correlation matrix information,"
Proc. IAPR International Workshops on Statistical Techniques in
Pattern Recognition (SPR2012),
pp.409-416, Hiroshima, Japan, November 2012.
[ pdf ]
[ presentation material ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
Fisher's linear discriminant analysis (FLDA) has been attracting many researchers and practitioners for several decades thanks to its ease of use and low computational cost. However, FLDA implicitly assumes that all the classes share the same covariance, which implies that FLDA might fail when this assumption is not satisfied. To overcome this problem, we propose a simple extension of FLDA that exploits a detailed covariance structure of every class revealed by the class-wise auto-correlation matrices. The proposed method achieves remarkable improvements in classification accuracy over FLDA while preserving two major strengths of FLDA: the ease of use and low computational costs. Experimental results with MNIST and several other data sets in the UCI machine learning repository demonstrate the effectiveness of our method.
Akisato Kimura, Ryo Yonetani, Takatsugu Hirayama
[Invited survey paper] "Computational models of human visual attention and their implementations: A survey,"
IEICE Transactions on Information and Systems, Vol.E96-D, No.3, pp.562-578, March 2013.
[ pdf ]
[ permalink ]
[ copyright notice ]
- Abstract
-
We humans are easily able to instantaneously detect the regions in a visual scene that are most likely to contain something of interest. Exploiting this pre-selection mechanism called visual attention for image and video processing systems would make them more sophisticated and therefore more useful. This paper briefly describes various computational models of human visual attention and their development, as well as related psychophysical findings. In particular, our objective is to carefully distinguish several types of studies related to human visual attention and saliency as a measure of attentiveness, and to provide a taxonomy from several viewpoints such as the main objective, the use of additional cues and mathematical principles. This survey finally discusses possible future directions for research into human visual attention and saliency computation.
Akisato Kimura, Masashi Sugiyama, Hitoshi Sakano, Hirokazu Kameoka
"Designing various component analysis at will,"
Proc. IAPR International Conference on Pattern Recognition (ICPR2012),
pp.2959-2962, Tsukuba, Japan, November 2012.
[ pdf ]
[ poster ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
This paper provides a generic framework of component analysis (CA) methods introducing a new expression for scatter matrices and Gram matrices, called Generalized Pairwise Expression (GPE). This expression is quite compact but highly powerful: The framework includes not only (1) the standard CA methods but also (2) several regularization techniques, (3) weighted extensions, (4) some clustering methods, and (5) their semi-supervised extensions. This paper also presents quite a simple methodology for designing a desired CA method from the proposed framework: Adopting the known GPEs as templates, and generating a new method by combining these templates appropriately.
Akisato Kimura, Masashi Sugiyama, Hitoshi Sakano, Hirokazu Kameoka
"Designing various multivariate analysis at will via generalized pairwise expression,"
IPSJ Transactions on Mathematical Modeling and its Applications (TOM), Vol.6, No.1, pp.136-145, March 2013.
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
It is well known that dimensionality reduction based on multivariate analysis methods and their kernelized extensions can be formulated as generalized eigenvalue problems of scatter matrices, Gram matrices or their augmented matrices. This paper provides a generic framework of multivariate analysis introducing a new expression for scatter matrices and Gram matrices, called Generalized Pairwise Expression (GPE). This expression is quite compact but highly powerful. The framework includes not only (1) the traditional multivariate analysis methods but also (2) several regularization techniques, (3) localization techniques, (4) clustering methods based on generalized eigenvalue problems, and (5) their semi-supervised extensions. This paper also presents a methodology for designing a desired multivariate analysis method from the proposed framework. The methodology is quite simple: adopting the above mentioned special cases as templates, and generating a new method by combining these templates appropriately.
Akisato Kimura, Masashi Sugiyama, Takuho Nakano, Hirokazu Kameoka, Hitoshi Sakano, Eisaku Maeda, Katsuhiko Ishiguro
"SemiCCA: Efficient semi-supervised learning of canonical correlations,"
IPSJ Transactions on Mathematical Modeling and its Applications (TOM), Vol.6, No.1, pp.128-135, March 2013.
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
Canonical correlation analysis (CCA) is a powerful tool for analyzing multi-dimensional paired data. However, CCA tends to perform poorly when the number of paired samples is limited, which is often the case in practice. To cope with this problem, we propose a semi-supervised variant of CCA named SemiCCA that allows us to incorporate additional unpaired samples for mitigating overfitting. The main advantage of the proposed method over previously proposed methods is its efficiency and intuitive operation: it smoothly bridges the generalized eigenvalue problems of CCA and principal component analysis (PCA), and thus its solution can be computed efficiently just by solving a single eigenvalue problem, as in the original CCA.
Ryo Yonetani, Akisato Kimura, Hitoshi Sakano, Ken Fukuchi
"Single image segmentation with estimated depth,"
Proc. British Machine Vision Conference (BMVC2012),
pp.28.1--28.11, Guildford, UK, September 2012.
[ project ]
[ pdf ]
[ DOI link ]
[ poster ]
[ supplementary material ]
[ copyright notice ]
- Abstract
-
A novel framework for automatic object segmentation is proposed
that exploits depth information estimated from a single image as
a supplemental cue. For example, suppose that we have an image
containing an object and a background with a similar color or
texture to the object. The proposed framework enables us to
automatically extract the object from the image while
eliminating the misleading background. Although our segmentation
framework takes a form of a traditional formulation based on
Markov random fields, the proposed method provides a novel
scheme to integrate depth and color information, which derives
objectness/backgroundness likelihood. We also employ depth
estimation via supervised learning so that the proposed method
can work even if it has only a single input image with no actual
depth information. Experimental results with a dataset
originally collected for the evaluation demonstrate the
effectiveness of the proposed method against the baseline method
and several existing methods for salient region detection.
Kevin Duh, Tsutomu Hirao, Akisato Kimura, Katsuhiko Ishiguro, Tomoharu Iwata, Ching-Man Au Yeung
"Creating stories: Social curation of Twitter messages,"
Proc. International AAAI Conference on Weblogs and Social Media
(ICWSM2012),
Dublin, Ireland, June 2012.
[ project ]
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
Social media has become ubiquitous. Tweets and other
user-generated content have become so abundant that better
tools for information organization are needed in order to
fully exploit their potential richness. "Social curation" has
recently emerged as a promising new framework for organizing
and adding value to social media, complementing the traditional
methods of algorithmic search and aggregation. For example, web
services like Togetter and Storify empower users to collect and
organize tweets to form stories that are pertinent, memorable,
and easy to read. While social curation services are gaining
popularity, little academic research has studied the
phenomenon. In this work, we perform one of the first analyses
of a large corpus of social curation data. We seek to understand
why and how people curate tweets. We also propose a machine
learning system that suggests new tweets, increasing the
curator's productivity and breadth of perspective.
Ukrit Watchareeruetai, Akisato Kimura, Robert Cheng Bao, Takahito Kawanishi, Kunio Kashino
"Interest point detection via stochastically derived stability,"
IPSJ Transactions on Computer Vision and Applications,
Vol.3, pp.189-197, December 2011.
[ PDF (preprint) ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
We propose a novel framework called StochasticSIFT for detecting interest
points (IPs) in video sequences. The proposed framework incorporates a
stochastic model considering the temporal dynamics of videos into the
SIFT detector to improve robustness against fluctuations inherent to
video signals. Instead of detecting IPs and then removing unstable or
inconsistent IP candidates, we introduce IP stability derived from a
stochastic model of inherent fluctuations to detect more stable IPs. The
experimental results show that the proposed IP detector outperforms the
SIFT detector in terms of repeatability and matching rates.
Takuho Nakano, Akisato Kimura,
Hirokazu Kameoka, Shigeki Sagayama, Shigeki Miyabe, Nobutaka Ono,
Kunio Kashino, Takuya Nishimoto
"Automatic video annotation via hierarchical topic trajectory model
considering cross-modal correlations,"
Proc. IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP2011),
pp.2380--2383, Prague, Czech Republic, May 2011.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
We propose a new statistical model, named Hierarchical Topic
Trajectory Model (HTTM), for acquiring a dynamically changing
topic model that represents the relationship between video
frames and associated text labels. Model parameter estimation,
annotation and retrieval can be executed within a unified
framework with little computation. It is also easy to add new
modalities such as audio signals and geotags. Preliminary experiments
on a video annotation task with a manually annotated video dataset
indicate that our proposed method can improve the annotation
accuracy.
Jun Takagi, Yasunori Ohishi, Akisato Kimura,
Masashi Sugiyama, Makoto Yamada, Hirokazu Kameoka
"Automatic audio tag classification via semi-supervised canonical density
estimation,"
Proc. IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP2011),
pp.2232--2235, Prague, Czech Republic, May 2011.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
We propose a novel semi-supervised method for building a
statistical model that represents the relationship between
sounds and text labels ("tags"). The proposed method, named
semi-supervised canonical density estimation, makes use of
unlabeled sound data in two ways: 1) a low-dimensional latent
space representing topics of sounds is extracted by a
semi-supervised variant of canonical correlation analysis, and 2)
topic models are learned by a multi-class extension of
semi-supervised kernel density estimation in the topic space.
Real-world audio tagging experiments indicate that our proposed
method improves the accuracy even when only a small number of
labeled sounds are available.
Akisato Kimura, Hirokazu Kameoka,
Kunio Kashino
"Media Scene Learning: A framework for extracting meaningful
parts from audio and video signals",
NTT Technical Review,
Vol.8, No.11, pp.1-7, November 2010.
[ PDF ]
- Abstract
-
We describe a novel framework called Media Scene Learning (MSL)
for automatically extracting key components such as the sound of
a single instrument from a given audio signal or a target object
from a given video signal. In particular, we introduce two key
methods: 1) the Composite Auto-Regressive System (CARS) for
decomposing audio signals into several sound components on the
basis of a generative model of sounds and 2) Saliency-Based
Image Learning (SBIL) for extracting object-like regions from a
given video signal on the basis of the characteristics of the
human visual system.
Takuya Maekawa, Akisato Kimura,
Hitoshi Sakano
"Wearable sensor device for automatic recording of hand
drawings,"
presented in a demo session,
Asian Conference on Computer Vision (ACCV2010),
Queenstown, New Zealand, November 2010.
[ pdf ] [ DOI link ] [ poster ]
[ copyright notice ]
- Abstract
-
Drawing and writing are two of the most important human
activities when it comes to recording events and information.
Needless to say, the digitization of hand-drawn paper is
important and thus many products and methods for capturing hand
drawings have been developed. However, many of these methods
require a special pen, paper, and/or apparatus. Thus, when we
want to capture hand drawings with these methods, we have to
capture the drawings actively, e.g., by preparing special pens
and paper. In this work, we try to automatically capture all
the hand drawings found in our daily lives without any explicit
action by the user. Recent advances in sensing technology
enable us to record our daily life data anywhere and at any time
using small always-on wearable sensors. In this work, our aim
is to capture hand drawings automatically with an always-on
wearable sensor device equipped with a camera.
Kazuma Akamine, Ken Fukuchi, Akisato
Kimura, Shigeru Takagi
"Fully automatic extraction of salient regions in near
real-time,"
The Computer Journal, November 2010.
[ DOI link ]
[ copyright notice ]
- Abstract
-
Automatic video segmentation plays an important role in a wide
range of computer vision and image processing applications.
Recently, various methods have been proposed for this purpose.
The problem is that most of these methods are far from
real-time processing, even for low-resolution videos, due to their
complex procedures. To address this problem, we propose a new and
fast method for automatic video segmentation with the help of
(1) efficient optimization of Markov random fields in time
polynomial in the number of pixels by introducing graph
cuts, (2) automatic, computationally efficient but stable
derivation of segmentation priors using visual saliency and a
sequential update mechanism, and (3) an implementation strategy
based on the principle of stream processing with graphics processing
units. Test results indicate that our method extracts
appropriate regions from videos as precisely as and much faster
than previous semi-automatic methods, even though no
supervision is incorporated.
Gurbachan Sekhon, Akisato Kimura, Yasuhiro Minami, Hitoshi Sakano,
Eisaku Maeda
"Action planning for interactive visual scene understanding based
on knowledge confidence in latent spaces,"
IEICE Technical Report (domestic),
PRMU2010-83 (IBISML2010-5), Fukuoka, Japan, September 2010.
[ presentation material ]
[ copyright notice ]
- Abstract
-
This report proposes a method for action planning in
interactive visual scene understanding through the use of
knowledge confidence generated from a latent space of a topic
model connecting image features and text labels. We then use
information, within the latent space, about the position of an
input sample relative to training samples in order to simulate
knowledge confidence. Coupled with this, we also use the
overall associativity between each text label as determined by
the content of the training samples to determine the knowledge
confidence.
Akisato Kimura, Hirokazu Kameoka, Masashi
Sugiyama, Takuho Nakano, Eisaku Maeda, Hitoshi Sakano, Katsuhiko Ishiguro
"SemiCCA: Efficient semi-supervised learning of canonical correlations,"
Proc. IAPR International Conference on Pattern Recognition
(ICPR2010),
pp.2933--2936, Istanbul, Turkey, August 2010.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
Canonical correlation analysis (CCA) is a powerful tool for
analyzing multi-dimensional paired data. However, CCA tends to
perform poorly when the number of paired samples is limited,
which is often the case in practice. To cope with this problem,
we propose a semi-supervised variant of CCA named semiCCA that
allows us to incorporate additional unpaired samples for
mitigating overfitting. The proposed method smoothly bridges
the eigenvalue problems of CCA and principal component analysis
(PCA), and thus its solution can be computed efficiently just
by solving a single eigenvalue problem, as in the original CCA.
Ukrit Watchareeruetai, Akisato Kimura, Robert Cheng Bao, Takahito
Kawanishi, Kunio Kashino
"StochasticSIFT: Interest point detection based on
stochastically-derived stability,"
Proc. Meeting on Image Recognition and Understanding (MIRU2010,
domestic),
IS1-80, Kushiro, Hokkaido, Japan, July 2010.
[ PDF ]
[ Poster ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
We propose a novel framework for detecting interest points
(IPs) in video sequences, named "StochasticSIFT". The proposed
framework incorporates a stochastic model considering the temporal
dynamics of videos into the SIFT detector to improve robustness
against fluctuations inherent to video signals.
Instead of detecting IPs and then removing unstable or
inconsistent IP candidates, we introduce IP "stability" derived
from a stochastic model of inherent fluctuations to detect more
stable IPs. Experimental results show that the proposed IP detector
outperforms the SIFT detector in terms of repeatability and matching
rates.
Gurbachan Sekhon, Ken Fukuchi, Akisato Kimura
"Automatic and precise extraction of generic objects using
saliency-based priors and contour constraints,"
Proc. Meeting on Image Recognition and Understanding (MIRU2010,
domestic),
IS3-3, Kushiro, Hokkaido, Japan, July 2010.
[ PDF ]
[ Poster ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
This paper deals with automatic video segmentation without
supervision or interactions. We examine a method for automatic
noise reduction in segmented video frames utilizing contour
information, which we have dubbed the Contour-Classification
method. This method uses information about the contours of the
segmented image mask in order to accurately reduce noise in
segmented video frames. We also examine a method that we have
developed, called the Erosion-Dilation method. Our proposed
method is then composed of these two fundamental techniques:
Contour-Classification and Erosion-Dilation. Test results
indicate that our proposed method precisely removes noise regions
from videos with a low error rate when compared with both the
original unaltered segmentation result and the Erosion-Dilation
method.
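The abstract above does not spell out the Erosion-Dilation method. The following is only a minimal sketch, assuming it resembles standard morphological opening (erosion followed by dilation) on a binary segmentation mask with a 3x3 square window. The function names, the window size, and the toy mask are hypothetical and are not taken from the paper.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = foreground, 0 = background


def _window(mask: Mask, r: int, c: int) -> List[int]:
    """Return the 3x3 neighbourhood of (r, c); pixels outside the mask count as 0."""
    rows, cols = len(mask), len(mask[0])
    return [
        mask[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
        for rr in (r - 1, r, r + 1)
        for cc in (c - 1, c, c + 1)
    ]


def erode(mask: Mask) -> Mask:
    """Keep a pixel only if its whole 3x3 neighbourhood is foreground."""
    return [
        [int(all(_window(mask, r, c))) for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


def dilate(mask: Mask) -> Mask:
    """Set a pixel if any pixel in its 3x3 neighbourhood is foreground."""
    return [
        [int(any(_window(mask, r, c))) for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


def erode_then_dilate(mask: Mask) -> Mask:
    """Morphological opening: removes small isolated foreground specks."""
    return dilate(erode(mask))


# Toy mask (assumed data): a 3x3 object plus one isolated noise pixel.
noisy = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
cleaned = erode_then_dilate(noisy)
```

In this toy case the 3x3 object survives unchanged while the isolated pixel in the corner is removed. Real segmentation masks would typically need a window size tuned to the expected noise scale.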
Shigeaki Kuzuoka, Akisato Kimura, Tomohiko Uyematsu
"Universal source coding for multiple decoders with side
information,"
Proc. International Symposium on Information Theory (ISIT2010),
pp.1-5, Austin, Texas, USA, June 2010.
- Abstract
-
A multiterminal lossy source coding problem, which includes
various problems such as the Wyner-Ziv problem and the
complementary delivery problem as special cases, is considered.
It is shown that any point in the achievable rate-distortion
region can be attained even if the source statistics are not
known.
Akisato Kimura, Derek Pang, Tatsuto
Takeuchi, Kouji Miyazato, Junji Yamato and Kunio Kashino
"A stochastic model of human visual attention with a dynamic
Bayesian network,"
submitted to IEEE Transactions on Pattern Analysis and Machine
Intelligence.
[ pdf (arXiv.org) ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
Recent studies in the field of human vision science suggest
that the human responses to the stimuli on a visual display are
non-deterministic. People may attend to different locations on
the same visual input at the same time. Based on this
knowledge, we propose a new stochastic model of visual
attention by introducing a dynamic Bayesian network to predict
the likelihood of where humans typically focus on a video
scene. The proposed model is composed of a dynamic Bayesian
network with 4 layers. Our model provides a framework that
simulates and combines the visual saliency response and the
cognitive state of a person to estimate the most probable
attended regions. Sample-based inference with a Markov chain
Monte Carlo (MCMC)-based particle filter and stream processing with
multi-core processors enable us to estimate human visual
attention in near real time. Experimental results have
demonstrated that our model performs significantly better in
predicting human visual attention compared to the previous
deterministic models.
Shigeaki Kuzuoka, Akisato Kimura, Tomohiko Uyematsu
"Universal source coding for multiple decoders with side information,"
Shannon Theory Workshop (STW2009, domestic),
pp.35-40, Matsuyama, Ehime, Japan, September 2009.
- Abstract
-
A multiterminal lossy source coding problem, which includes various
problems such as the Wyner-Ziv problem and the complementary delivery
problem as special cases, is considered. It is clarified that any point
in the achievable rate-distortion region can be attained even if the
source statistics are not known.
Ken Fukuchi, Kouji Miyazato, Akisato Kimura, Shigeru Takagi and Junji
Yamato
"Saliency-based video segmentation with graph cuts and sequentially updated
priors,"
Proc. International Conference on Multimedia and Expo (ICME2009),
New York, New York, USA, June-July 2009.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
This paper proposes a new method for achieving precise video
segmentation without any supervision or interaction. The main
contributions of this report include 1) the introduction of fully
automatic segmentation based on the maximum a posteriori (MAP)
estimation of the Markov random field (MRF) with graph cuts and
saliency-driven priors and 2) the updating of priors and feature
likelihoods by integrating the previous segmentation results and the
currently estimated saliency-based visual attention. Test results
indicate that our new method precisely extracts probable regions from
videos without any supervised interactions.
Kouji Miyazato, Akisato Kimura, Shigeru Takagi and Junji Yamato
"Real-time estimation of human visual attention with dynamic Bayesian
network and MCMC-based particle filter",
Proc. International Conference on Multimedia and Expo (ICME2009),
New York, New York, USA, June-July 2009.
[ pdf ]
[ DOI link ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
Recent studies in signal detection theory suggest that the human
responses to the stimuli on a visual display are non-deterministic.
People may attend to different locations on the same visual input at the
same time. Constructing a stochastic model of human visual attention
is a promising way to tackle this problem. This paper proposes a
new method to achieve a quick and precise estimation of human visual
attention based on our previous stochastic model with a dynamic Bayesian
network. A particle filter with Markov chain Monte Carlo (MCMC) sampling
makes it possible to achieve a quick and precise estimation through
stream processing. Experimental results indicate that the proposed
method can estimate human visual attention in real time and more
precisely than previous methods.
Akisato Kimura, Derek Pang, Tatsuto Takeuchi, Junji Yamato and Kunio
Kashino
"Dynamic Markov random fields for stochastic modeling of visual
attention",
IEICE Technical Report (domestic),
PRMU2008-117 (MVE2008-66), Toyonaka, Osaka, Japan, November 2008.
[ presentation material ]
[ copyright notice ]
- Abstract
-
This report proposes a new stochastic model of visual attention to
predict the likelihood of where humans typically focus on a video scene.
The proposed model is composed of a dynamic Bayesian network that
simulates and combines a person's visual saliency response and eye
movement patterns to estimate the most probable regions of attention.
Dynamic Markov random field (MRF) models are newly introduced to include
spatiotemporal relationships of visual saliency responses. Experimental
results have revealed that the proposed model outperforms the previous
deterministic model and the stochastic model without dynamic MRF in
predicting human visual attention.
Akisato Kimura,
"Particle-based simulation of the Gel'fand-Pinsker channel capacity and
the Wyner-Ziv rate-distortion function,"
Proc. Symposium on Information Theory and its Applications (SITA2008,
domestic),
pp.2-4-4, Kinugawa, Tochigi, Japan, October 2008.
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material.]
- Abstract
-
This report presents a new numerical algorithm for simulating the
capacity of a memoryless channel with non-causal encoder side information
(the Gel'fand-Pinsker channel) and the rate-distortion function for a
memoryless source with decoder side information (the Wyner-Ziv coding).
The basic idea is to represent a probability density by a finite number
of particles, each of which is composed of a sample value and the
associated weight. The proposed algorithm enables us to simulate the
channel capacity and the rate distortion function with infinite or
continuous alphabets.
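The particle representation described above can be illustrated with a short sketch. This is not the report's algorithm for channel capacity or rate-distortion functions. It only shows how a density can be stored as (sample value, weight) pairs and used to approximate an expectation. The target density (standard normal), the proposal density (normal with standard deviation 2), and all function names are assumptions made for this example.

```python
import math
import random
from typing import Callable, List, Tuple

Particle = Tuple[float, float]  # (sample value, normalized weight)


def normal_pdf(x: float, mean: float, std: float) -> float:
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def make_particles(n: int, seed: int = 0) -> List[Particle]:
    """Represent the target density N(0, 1) by n weighted particles.

    Samples are drawn from a wider proposal N(0, 2^2); each weight is the
    ratio target/proposal, normalized so that all weights sum to one.
    """
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 2.0) for _ in range(n)]
    raw = [normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0) for x in samples]
    total = sum(raw)
    return [(x, w / total) for x, w in zip(samples, raw)]


def expectation(particles: List[Particle], f: Callable[[float], float]) -> float:
    """Approximate E[f(X)] under the density the particles represent."""
    return sum(w * f(x) for x, w in particles)


particles = make_particles(20000)
second_moment = expectation(particles, lambda x: x * x)  # close to 1 for N(0, 1)
```

Because integrals over the alphabet become weighted sums over particles, the same representation applies when the alphabet is continuous, which is the property the abstract relies on.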
Shigeaki Kuzuoka, Akisato Kimura and Tomohiko Uyematsu,
"Simple coding schemes for lossless and lossy complementary delivery
problems,"
Proc. Shannon Theory Workshop (STW2007, domestic),
pp.43-50, Izu, Shizuoka, Japan, September 2007.
- Abstract
-
This paper deals with a coding problem called complementary delivery,
where messages from two correlated sources are jointly encoded, and each
decoder reproduces one of two messages using the other message as the
side information. Simple lossless and lossy complementary delivery coding
schemes are proposed. In the lossless case, it is revealed that the error
probability of the proposed code based on Slepian-Wolf codes is
exponentially tight. Moreover, in the lossy case, it is demonstrated that
Wyner-Ziv codes can be applied to the complementary delivery problem.
Kunio Kashino, Akisato Kimura, Takayuki Kurozumi and Hidehisa Nagano
"Robust search methods for music signals based on simple
representation,"
Proc. International Conference on Acoustics, Speech and Signal Processing
(ICASSP2007),
Vol.4, pp.1421--1424, Hawaii, USA, April 2007.
[ DOI link ]
[ copyright notice ]
- Abstract
-
Signal similarity search is an important technique for music information
retrieval. A basic task is finding identical signal segments on unlabeled
music-signal archives, given a short music signal fragment as a query. In
such a task, the search must be fast and sufficiently robust against
possible signal fluctuations due to noise and distortions. In this
special session paper, we describe a search method designed to cope with
additive interfering sounds by spectral partitioning. Then, we introduce
another method designed to be robust under multiplicative noise or
distortion based on binary area representation.
Takahito Kawanishi, Masaru Tsuchida, Shigeru Takagi, Akisato Kimura
and Junji Yamato
"Small cylindrical display using aspherical mirror for anthropomorphic
agents",
Proc. International Display Workshop / Asia Display (IDW/AD'05),
pp.1755-1758, Takamatsu, Kagawa, Japan, December 2005.
- Abstract
-
We have developed a small cylindrical display for an anthropomorphic
agent that communicates with multiple users in a 3D environment. The
previously developed cylindrical display was dark with bad contrast at
the lower part of the screen because the density of pixels at the lower
part is much less than at the upper part. We made the pixel
density uniform by using an aspherical mirror. Experimental results show that our
new display has better luminance and better contrast than the previous one.
Kunio Kashino, Akisato Kimura and Takayuki Kurozumi
"A quick video search method based on local and global
feature pruning",
Proc. International Conference on Pattern Recognition (ICPR2004),
Vol.3, pp.894-897, August 2004.
[ DOI link ]
[ copyright notice ]
- Abstract
-
This paper proposes a quick method of similarity-based video searching to
detect and locate a specific video clip given as a query in a stored long
video stream. The method employs a two-stage process: local and global
feature clustering. The local clustering exploits continuity or local
similarities between video features, and the global clustering gathers
similar video frames that are not necessarily adjacent to each other.
These processes prune irrelevant sections of a video stream. The method
guarantees exactly the same search result as the exhaustive search.
Experiments performed on a PC show that the proposed method can correctly
detect and locate a 7.5-second clip in a 150-hour video recording in 15
ms on average.
Akisato Kimura, Derek Pang, Tatsuto Takeuchi, Junji Yamato and Kunio
Kashino
"Dynamic Markov random fields for stochastic modeling of visual
attention",
Proc. International Conference on Pattern Recognition (ICPR2008),
Mo.BT8.35, Tampa, Florida, USA, December 2008.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
This report proposes a new stochastic model of visual attention to
predict the likelihood of where humans typically focus on a video scene.
The proposed model is composed of a dynamic Bayesian network that
simulates and combines a person's visual saliency response and eye
movement patterns to estimate the most probable regions of attention.
Dynamic Markov random field (MRF) models are newly introduced to include
spatiotemporal relationships of visual saliency responses. Experimental
results have revealed that the proposed model outperforms the previous
deterministic model and the stochastic model without dynamic MRF in
predicting human visual attention.
Derek Pang, Akisato Kimura, Tatsuto Takeuchi, Junji Yamato and
Kunio Kashino
"A stochastic model of selective visual attention with a dynamic Bayesian
network",
Proc. Meeting on Image Recognition and Understanding (MIRU2008,
domestic),
pp. 1500--1505, Karuizawa, Nagano, Japan, July 2008.
(Selected for the Best Interactive Session Award)
[ pdf ]
[ digest ]
[ poster: Japanese, English ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
(The content is almost the same as that presented at the IEICE Technical
Meeting held in June 2008. Please see the abstract of that report,
PRMU2008-43, in this list.)
Shigeaki Kuzuoka, Akisato Kimura and Tomohiko Uyematsu
"Universal coding for lossy complementary delivery problems",
Proc. International Symposium on Information Theory (ISIT2008),
pp. 2177--2188, Toronto, Canada, July 2008.
[ DOI link ]
[ copyright notice ]
- Abstract
-
This paper deals with a universal lossy coding problem for a certain kind
of multiterminal source coding network called a complementary delivery
system. A universal coding scheme based on Wyner-Ziv codes is proposed.
While the proposed scheme cannot attain the optimal rate-distortion
trade-off in general, the rate-loss is upper bounded by a universal
constant under some mild conditions. Moreover, the proposed scheme allows
us to apply (non-universal) Wyner-Ziv codes to construct a universal
lossy complementary delivery code.
Derek Pang, Akisato Kimura, Tatsuto Takeuchi, Junji Yamato and
Kunio Kashino
"A stochastic model of selective visual attention with a dynamic Bayesian
network",
Proc. International Conference on Multimedia and Expo (ICME2008),
pp.1073--1076, Hannover, Germany, June 2008.
[ pdf ]
[ DOI link ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
Recent studies in signal detection theory suggest that the human
responses to the stimuli on a visual display are non-deterministic.
People may attend to different locations on the same visual input at the
same time. To predict the likelihood of where humans typically focus on a
video scene, we propose a new stochastic model of visual attention by
introducing a dynamic Bayesian network. Our model simulates and combines
the visual saliency response and the cognitive state of a person to
estimate the most probable attended regions. Experimental results have
demonstrated that our model performs significantly better in predicting
human visual attention compared to the previous deterministic model.
Derek Pang, Akisato Kimura, Tatsuto Takeuchi, Junji Yamato and
Kunio Kashino
"A stochastic model of selective visual attention with a dynamic Bayesian
network",
IEICE Technical Report (domestic),
PRMU2008-43 (DE2008-25), Otaru, Hokkaido, Japan, June 2008.
[ PDF ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
Recent studies in signal detection theory suggest that the human
responses to the stimuli on a visual display are non-deterministic.
People may attend to different locations on the same visual input at the
same time. To predict the likelihood of where humans typically focus on a
video scene, we propose a new stochastic model of visual attention by
introducing a dynamic Bayesian network. Our model simulates and combines
the visual saliency response and the cognitive state of a person to
estimate the most probable attended regions. Experimental results have
demonstrated that our model performs significantly better in predicting
human visual attention compared to the previous deterministic model.
Akisato Kimura, Tomohiko Uyematsu, Shigeaki Kuzuoka and Shun Watanabe,
"Universal source coding over generalized complementary delivery
networks,"
IEEE Transactions on Information Theory,
Vol.55, No.3, pp.1360-1373, March 2009.
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
This paper deals with a universal coding problem for a certain kind of
multiterminal source coding network called a generalized complementary
delivery network. In this network, messages from multiple correlated
sources are jointly encoded, and each decoder has access to some of the
messages to enable the decoder to reproduce the other messages. Both
fixed-to-fixed length and fixed-to-variable length lossless coding
schemes are considered. Explicit constructions of universal codes and the
bounds of the error probabilities are clarified via methods of types and
graph-theoretical analysis.
Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources over generalized complementary
delivery networks,"
Proc. Symposium on Information Theory and its Applications
(SITA2007, domestic),
pp.274-279, Shima, Mie, November 2007.
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
This paper deals with a universal coding problem for a certain kind of
multiterminal source coding network called a generalized complementary
delivery network. In this network, messages from multiple correlated
sources are jointly encoded, and each decoder has access to some of the
messages to enable the decoder to reproduce the other messages. Both
fixed-to-fixed length and fixed-to-variable length lossless coding
schemes are considered. Explicit constructions of universal codes and the
bounds of the error probabilities are clarified via methods of types and
graph-theoretical analysis.
Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"A quick search method for audio signals based on a piecewise linear
representation of feature trajectories,"
IEEE Transactions on Audio, Speech and Language Processing,
Vol.16, No.2, pp.396-407, February 2008.
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
This paper presents a new method for a quick similarity-based search
through long unlabeled audio streams to detect and locate audio clips
provided by users. The method involves feature dimension reduction based
on a piecewise linear representation of a sequential feature trajectory
extracted from a long audio stream. Two techniques enable us to obtain a
piecewise linear representation: the dynamic segmentation of feature
trajectories and the segment-based KL transform. A new technique is also
introduced that greatly reduces the required feature comparisons. The
proposed search method guarantees in principle that no segment to be
detected is missed. Experiments indicate significant improvements in
search speed. For example the proposed method reduced the total search
time to approximately 1/12 and detected queries in approximately 0.3
seconds from a 200-hour audio database.
Akisato Kimura,
"Coding theorems for correlated sources with cooperative encoders,"
Ph.D. dissertation, Tokyo Institute of Technology, September 2007.
[ pdf ]
[ presentation material ]
[ Copyright notice: The author holds the copyright of the material. ]
- Abstract
-
This thesis deals with multiterminal source coding problems for a general
framework of coding systems, called coding systems with cooperation,
where there are some linkages among encoders and decoders. In particular,
the main focus of this thesis is encoder cooperation. Two types of coding
systems are investigated that incorporate encoder cooperation: the
Slepian-Wolf coding system with linkages (called the SWL system) and the
complementary delivery coding system.
The SWL system involves some mutual linkages between two encoders of the
coding system investigated by Slepian and Wolf (called the SW system)
that involves two separate encoders and one common decoder. In particular,
some special cases are considered, where the coding rate for the mutual
linkage between two encoders is negligibly small. The main results in
this thesis show that the achievable rate region of the SWL system
equals that of the SW system when considering fixed-length coding, while
weak variable-length coding makes the achievable rate region of the SWL
system larger than that of the SW system. This implies that encoder
cooperation may improve the coding rate.
The complementary delivery coding system contrasts with the SW system in
the sense of cooperation, which means that the complementary delivery
coding system consists of a common encoder and separate decoders, while
the SW system includes separate encoders and a common decoder.
In particular, in the complementary delivery coding system, each decoder has
access to some of the encoded messages to enable the decoder to reproduce the
other messages from a common codeword emitted from the common encoder.
First, the minimum achievable rate for lossy coding is clarified, which
implies that encoder cooperation may increase the coding rate. Next,
universal coding schemes for lossless coding are proposed. Explicit
constructions of universal lossless codes and the bounds of the error
probabilities are clarified by using methods of types and the
graph-theoretical analysis.
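For reference, the achievable rate region of the SW system mentioned in this abstract is the standard Slepian-Wolf region. The statement below is the textbook result for two correlated sources encoded separately and decoded jointly. The symbols X_1, X_2 (the sources) and R_1, R_2 (the coding rates) are introduced here for illustration and are not taken from the thesis.

```latex
\begin{align}
R_1 &\ge H(X_1 \mid X_2), \\
R_2 &\ge H(X_2 \mid X_1), \\
R_1 + R_2 &\ge H(X_1, X_2).
\end{align}
```

According to the abstract, the SWL system attains this same region under fixed-length coding when the rate of the linkage between the encoders is negligibly small.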
Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources with complementary delivery,"
IEICE Transactions on Fundamentals,
Vol.E90-A, No.9, pp.1840-1847, September 2007.
Published online in IEICE Transactions Online.
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
This paper deals with a universal coding problem for a certain kind of
multiterminal source coding system that we call the complementary
delivery coding system. In this system, messages from two correlated
sources are jointly encoded, and each decoder has access to one of the
two messages to enable it to reproduce the other message. Both
fixed-to-fixed length and fixed-to-variable length lossless coding
schemes are considered. Explicit constructions of universal codes and
bounds of the error probabilities are clarified via type-theoretical and
graph-theoretical analyses.
Clement Leung, Akisato Kimura, Tatsuto Takeuchi and Kunio Kashino
"A computational model of saliency depletion/recovery phenomena for the
salient region extraction of videos",
Proc. Meeting on Image Recognition and Understanding
(MIRU2007, domestic),
pp.582--587, Hiroshima, Japan, July 2007.
[ pdf ]
[ poster ]
[ copyright notice ]
- Abstract
-
This report proposes a new algorithm for extracting salient regions of
videos by introducing two important properties of the early human visual
system: 1) Instantaneous saliency depletion with gradual recovery,
whereby saliency is instantaneously suppressed and gradually recovered in
previously attended regions. 2) Gradual saliency depletion with
instantaneous recovery, whereby saliency is gradually decreased over time
in non-surprising regions and at the same time recovered in surprising
locations. With the introduction of these properties, redundant
information in videos can be suppressed and important information is
eventually enhanced. The proposed algorithm has been evaluated with an
eye tracking device to see how well it fits the human visual system. The
results show that the proposed algorithm substantially outperformed
previous algorithms when only gradual depletion was incorporated, and
instantaneous depletion improved the performance in some cases.
Clement Leung, Akisato Kimura, Tatsuto Takeuchi and Kunio Kashino
"A computational model of saliency depletion/recovery phenomena for the
salient region extraction of videos",
Proc. International Conference on Multimedia and Expo (ICME2007),
pp.300--303, Beijing, China, July 2007.
[ pdf ]
[ DOI link ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
This paper proposes a new algorithm for extracting salient regions of
videos by introducing two important properties of the early human visual
system: 1) Instantaneous saliency depletion with gradual recovery,
whereby saliency is instantaneously suppressed and gradually recovered in
previously attended regions. 2) Gradual saliency depletion with
instantaneous recovery, whereby saliency is gradually decreased over time
in non-surprising regions and at the same time recovered in surprising
locations. With the introduction of these properties, redundant
information in videos can be suppressed and important information is
eventually enhanced.
Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources with generalized complementary
delivery,"
presented at a recent result session,
International Symposium on Information Theory (ISIT2007),
Nice, France, June 2007.
[ poster ]
[ copyright notice ]
- Abstract
-
This presentation deals with a universal coding problem for a certain
kind of multiterminal source coding system called the generalized
complementary delivery coding system. In this system, messages from
multiple correlated sources are jointly encoded, and each decoder has
access to some of the messages to enable them to reproduce the other
messages. Both fixed-to-fixed length and fixed-to-variable length
lossless coding schemes are considered. Explicit constructions of
universal codes and the bounds of the error probabilities are clarified
via methods of types and graph-theoretical analyses.
Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal coding for correlated sources with complementary delivery,"
Proc. International Symposium on Information Theory (ISIT2007),
pp.1756--1760, Nice, France, June 2007.
[ pdf ]
[ DOI link ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
This report deals with a universal coding problem for a certain kind of
multiterminal source coding system that we call the complementary
delivery coding system. Both fixed-to-fixed length and fixed-to-variable
length lossless coding schemes are considered. Explicit constructions of
universal codes and the bounds of the error probabilities are clarified
via type-theoretical and graph-theoretical analyses.
Akisato Kimura, Tomohiko Uyematsu and Shigeaki Kuzuoka,
"Universal source coding for complementary delivery,"
Proc. Symposium on Information Theory and its Applications
(SITA2006, domestic),
pp.803--806, Hakodate, Japan, November-December 2006.
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
This paper deals with a universal coding problem for a certain kind of
multiterminal source coding system that we call the complementary delivery
coding system. Both fixed-to-fixed length and fixed-to-variable length
lossless coding schemes are considered. Explicit constructions of
universal codes and bounds of the error probabilities are clarified via
type-theoretical and graph-theoretical analyses.
Akisato Kimura and Tomohiko Uyematsu,
"Information-theoretical analysis of index searching: Revised,"
Proc. Symposium on Information Theory and its Applications
(SITA2006, domestic),
pp.73--76, Hakodate, Japan, November-December 2006.
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
We present an information-theoretical viewpoint for similarity-based
retrieval along with index structures. This retrieval system comprises
two stages: pruning data items based on the index structures, and
matching surviving data items. The first stage is modeled as the so-called
Wyner-Ziv problem, while the second stage is considered as a coding
problem such that parts of the decoding results are available as partial
side information at both the encoder and the decoder. We clarify upper
and lower bounds of the optimal retrieval performance and some
relationships between retrieval parameters and performance via
Shannon-theoretic analyses.
Akisato Kimura and Tomohiko Uyematsu,
"Multiterminal source coding with complementary delivery,"
Proc. International Symposium on Information Theory and its Applications
(ISITA2006),
pp.189-194, Seoul, South Korea, October 2006.
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
A coding problem where messages from two correlated sources are jointly
encoded and separately decoded is investigated. Each decoder has access
to one of the two messages to enable it to reproduce the other message.
The rate-distortion function for lossy coding is clarified. Some related
coding problems are also examined.
Akisato Kimura, Tomohiko Uyematsu
"Multiterminal source coding for cascading and feedback refinement
systems,"
Proc. Shannon Theory Workshop (STW2006, domestic),
pp.25-31, September 2006
[ pdf ]
[ presentation material ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
- Lossy coding problems are investigated for some communication systems
in the presence of cascading and/or feedback information channels from
decoders so as to refine reproduction messages. This framework provides
different types of refinement structures from so-called successive
refinement. Three different types of communication systems are
considered, i.e. refinement systems in the presence of a cascading
channel, a feedback channel, and both channels. Outer and inner bounds
of achievable rate-distortion regions for those problems are obtained.
Akisato Kimura and Tomohiko Uyematsu,
"Multiterminal source coding with complementary delivering,"
IEICE Technical Report,
IT2006-8, pp.7-12, May 2006,
Presented at 2006 Hawaii, IEICE and SITA Joint Conference on Information
Theory.
[ presentation material ]
[ copyright notice ]
- Abstract
-
We consider a coding problem where messages from two correlated sources
are jointly encoded and separately decoded. Each decoder has access to
one of two messages to reproduce the other message. We clarify the
rate-distortion function for lossy coding.
Akisato Kimura, Takahito Kawanishi and Kunio Kashino,
"Acceleration of similarity-based partial image retrieval using multistage
vector quantization,"
Proc. International Conference on Pattern Recognition (ICPR2004),
Vol.2, pp.993-996, Cambridge, United Kingdom, August 2004.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
- We propose a new method for quick and accurate partial image retrieval
from a huge number of images based on a predefined distance measure.
The proposed method utilizes vector quantization (VQ) on multiple
layers, namely color, block, and feature layers. This can greatly
reduce the amount of calculation needed for partial image retrieval.
Experiments indicate that the proposed method can detect partial images
that are similar to queries through 1000 images within 4 seconds. This
is approximately 30 times faster than the method to which multistage VQ
is not applied.
Akisato Kimura, Takahito Kawanishi and Kunio Kashino,
"Similarity-based partial image retrieval guaranteeing same accuracy as
exhaustive matching,"
Proc. International Conference on Multimedia and Expo (ICME2004),
Vol. 3, pp.1895-1898, Taipei, Taiwan, June 2004.
[ pdf ]
[ poster ]
[ copyright notice ]
- Abstract
- We propose a new framework for quick and accurate partial image
retrieval from a huge number of images based on a predefined distance
measure. Finding partial similarities generally requires a huge amount
of storage space for indexes due to the large number of portions of
images. The proposed method extracts portions from each database image
at a constant spacing, while it extracts all possible portions from a
query image. In this way, the proposed method can greatly reduce the
size of indexes while theoretically guaranteeing the same accuracy as
exhaustive matching.
Akisato Kimura and Tomohiko Uyematsu,
"Weak variable-length Slepian-Wolf coding with linked encoders for mixed
sources,"
IEEE Transactions on Information Theory,
Vol.50, No.1, pp.183-193, Jan. 2004.
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
- Coding problems for correlated information sources were first
investigated by Slepian and Wolf. They considered the data compression
system, called the SW system, where two sequences emitted from
correlated sources are separately encoded to codewords, and sent to a
single decoder which has to output the original sequence pairs with a
small probability of error. In this correspondence, we investigate the
coding problem of a modified SW system allowing two encoders to
communicate with zero rate. First, we consider the fixed-length coding
and clarify that the admissible rate region for general sources is
equal to that of the original SW system. Next, we investigate the
variable-length coding having the asymptotically vanishing probability
of error. We clarify the admissible rate region for mixed sources
characterized by two ergodic sources and show that this region is
strictly wider than that for fixed-length codes. Further, we
investigate the universal coding problem for memoryless sources in the
system and show that the SW system with linked encoders has much more
flexibility than the original SW system.
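
For orientation, the admissible rate region of the original SW system in the
classical case of memoryless (i.i.d.) correlated sources X and Y can be written
as below. This is the standard Slepian-Wolf result, stated here under the
memoryless assumption only; the regions for general and mixed sources treated
in the paper are not reproduced here.

```latex
\begin{aligned}
R_1 &\ge H(X \mid Y), \\
R_2 &\ge H(Y \mid X), \\
R_1 + R_2 &\ge H(X, Y).
\end{aligned}
```

Here R_1 and R_2 denote the coding rates of the two separate encoders.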
Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"A quick search method for multimedia signals using global pruning,"
Systems and Computers in Japan,
Vol.34, No.13, pp.47-58, November 2003.
[ DOI link ]
- Abstract
- The authors propose a new method for quickly searching for a specific
audio or video signal to be detected within a long, stored audio or
video stream to determine segments that contain signals that are nearly
identical to the given signal. The Time-series Active Search (TAS)
method is one of the quick search methods that have been proposed
previously. This signal search technique, based on histograms
extracted from the signals, achieves quick searching by local
pruning, that is, omitting comparisons of segments for which searching
was unnecessary based on similarities in the vicinity of the matching
window. In contrast, the proposed technique implements significantly
quicker searching by introducing global pruning, which looks at the
entire signal time-series according to histogram classifications based
on similarities of the entire signal to eliminate segments that need
not be searched, in addition to local pruning. In this paper, the
authors present a detailed discussion of the relationship between the
degree of global pruning and the accuracy that is guaranteed. For
example, the authors showed through experiments that when 128-dimensional
histograms were classified into 1024 clusters, the proposed technique
achieved a search speed approximately 9 times that of TAS while
preserving the same degree of accuracy. The preprocessing calculation
time increased by approximately 1% of the time for playing the signal.
Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"Dynamic-segmentation-based feature dimension reduction
for quick audio/video searching,"
Proc. International Conference on Multimedia and Expo (ICME2003),
Vol.2, pp.389-392, Baltimore, Maryland, USA, July 2003.
Proc. International Conference on Acoustics,
Speech and Signal Processing (ICASSP2003),
Vol.3, pp.357-360, Hong Kong, Apr. 2003 (cancelled).
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
- We propose a new feature dimension reduction method for multimedia
search. The main technique in the method is dynamic segmentation that
partitions sequential feature trajectories dynamically. While dynamic
segmentation reduces the average dimensionality and accelerates the
search, it requires a huge amount of calculation. Thus, our method
quickly executes suboptimal partitioning of the trajectories by using
the discreteness of dimension changes. This guarantees the optimal
amount of calculation to derive the suboptimal partitioning under the
condition that the dimension monotonically increases as the segment
length increases. The experiment shows that our method is over 10 times
faster than a straightforward dynamic segmentation method.
Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"A quick search method for multimedia signals using feature compression
based on piecewise linear maps,"
Proc. International Conference on Acoustics, Speech and Signal Processing
(ICASSP2002),
Vol.4, pp.3656-3659, Orlando, Florida, USA, May 2002.
[ pdf ]
[ DOI link ]
[ poster ]
[ copyright notice ]
- Abstract
-
We propose a quick algorithm for multimedia signal search. The
algorithm comprises two techniques: feature compression based on
piecewise linear maps and distance bounding to efficiently limit the
search space. When compared with existing multimedia search techniques,
they greatly reduce the computational cost required in searching.
Although feature compression is employed in our method, our bounding
technique mathematically guarantees the same recall rate as the search
based on the original features; no segment to be detected is missed.
Experiments indicate that the proposed algorithm is approximately 10
times faster than an existing fast method while maintaining
the same search accuracy.
Akisato Kimura and Tomohiko Uyematsu,
"Weak variable-length Slepian-Wolf coding with linked encoders for mixed
source,"
Proc. IEEE Information Theory Workshop 2001 (ITW2001),
pp.82--84, Cairns, Australia, Sep. 2001
[ pdf ]
[ DOI link ]
[ Copyright notice ]
- Abstract
-
Slepian and Wolf first considered the data compression of correlated
sources called the SW system, where two sequences emitted from
correlated sources are separately encoded to codewords, and sent to a
single decoder which has to output the original sequence pairs. Recently,
Oohama has extended the SW system and investigated a more general case
where there are some mutual linkages between two encoders of the SW
system. In this paper, we investigate variable-length coding which
allows asymptotically vanishing probability of error for the system
considered by Oohama. We clarify the admissible rate region for mixed
sources, and show that this region is strictly wider than that for
fixed-length codes.
Akisato Kimura, Kunio Kashino, Takayuki Kurozumi and Hiroshi Murase,
"Very quick audio searching : Introducing global pruning to the
Time-Series Active Search,"
Proc. International Conference on Acoustics, Speech and Signal Processing
(ICASSP2001),
Vol.3, pp.1429-1432, Salt Lake City, Utah, USA, May 2001.
[ pdf ]
[ DOI link ]
[ presentation material ]
[ copyright notice ]
- Abstract
-
Previously, we proposed a histogram-based quick signal search method
called Time-Series Active Search (TAS). TAS is a method of searching
through long audio or video recordings for a specified segment, based
on signal similarity. TAS is fast; it can search through a 24-hour
recording in 1 second after a query-independent preprocessing. However,
an even faster method is required when we consider a huge amount of audio
archives, for example a month's worth of recordings. Thus, we propose a
preprocessing method that significantly accelerates TAS. The core part
of this method comprises a global histogram clustering of long signals
and a pruning scheme using those clusters. Tests using broadcast
recordings indicate that the proposed algorithm achieves a search
speed approximately 3 to 30 times faster than TAS. In these tests,
the search results are exactly the same as with TAS.
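
The abstract does not spell out how a histogram-based search can skip
positions without missing a match, so the following is a minimal sketch of
that idea only, not the authors' implementation. It assumes the signal has
already been quantized into discrete symbols, that similarity is the
histogram intersection normalized by the window length, and that the
function names (intersection_count, active_search) are hypothetical. The
global pruning step proposed in the paper is not shown.

```python
from collections import Counter
from math import ceil


def intersection_count(query_hist, window_hist):
    """Histogram intersection as a raw count of shared symbols."""
    return sum(min(c, window_hist[s]) for s, c in query_hist.items())


def active_search(stored, query, theta):
    """Return start positions whose window similarity to query is >= theta.

    Assumption: similarity = intersection_count / len(query).
    Sliding the window by one frame removes one symbol and adds one, so the
    intersection count can grow by at most 1 per shift. If the count is
    short of the required count by k, the next k - 1 positions cannot match
    and are skipped.
    """
    d = len(query)
    if d == 0 or len(stored) < d:
        return []
    needed = ceil(theta * d)  # required number of shared symbols
    query_hist = Counter(query)
    hits = []
    pos = 0
    while pos + d <= len(stored):
        count = intersection_count(query_hist, Counter(stored[pos:pos + d]))
        if count >= needed:
            hits.append(pos)
            pos += 1
        else:
            pos += needed - count  # safe skip derived from the bound above
    return hits
```

Because the skip width follows from an upper bound on the similarity, the
sketch returns the same positions as checking every window, which mirrors
the "exactly the same results" property stated in the abstract.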
Akisato Kimura and Tomohiko Uyematsu,
"Weak variable-length Slepian-Wolf coding with linked encoders for mixed
sources,"
IEICE Technical Report,
IT99-59, pp.7-12, Jan. 2000.
[ copyright notice ]
- Abstract
-
Coding problems for correlated information sources were first
investigated by Slepian and Wolf, where sequences from two correlated
sources are separately encoded, sent to a single decoder and decoded
with sufficiently small probability of error. We investigate the coding
theorem for two correlated sources, where there are some mutual
linkages between two encoders of the coding system proposed by Slepian
and Wolf. We consider weak variable-length coding, i.e. variable-length
coding with vanishing error probability, and show the achievable rate region for
mixed sources characterized by two ergodic sources.
Akisato Kimura and Tomohiko Uyematsu,
"Large deviations performance of interval algorithm for random number
generation,"
Proc. Memorial workshop for the 50th anniversary of the Shannon theory,
pp.1-4, Yamanashi, Japan, Jan. 1999
[ pdf ]
[ copyright notice: The authors hold the copyright of the material. ]
- Abstract
-
We investigate large deviations performance of the interval algorithm
for random number generation, especially for intrinsic randomness.
First, we show that the number of output fair random bits per
input symbol approaches the entropy of the source almost
surely. Next, we consider obtaining a fixed number of fair random
bits from an input sequence of fixed length. We show that the
approximation error measured by the variational distance and divergence
vanishes exponentially as the length of input sequence tends to
infinity, if the number of fair bits per input sample is below the
entropy of the source. Conversely, the approximation error measured by
the variational distance approaches two exponentially, if the number
of fair bits per input sample is above the entropy.
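
As an illustration of the kind of procedure analyzed here, the sketch below
extracts fair bits from a source with known symbol probabilities by
successively refining a sub-interval of [0, 1) and emitting a bit whenever
the interval falls entirely inside one half. It is a simplified streaming
variant written under stated assumptions (i.i.d. source, exact rational
probabilities, hypothetical function name interval_extract), not the exact
algorithm or setting studied in the paper.

```python
from fractions import Fraction


def interval_extract(symbols, probs):
    """Emit fair bits from symbols with known probabilities (sketch).

    probs maps each symbol to a positive probability; values must sum to 1.
    The current interval [lo, hi) is narrowed by each input symbol. When it
    lies entirely in [0, 1/2) or [1/2, 1), one bit is emitted and the
    interval is rescaled back to the unit interval.
    """
    ranges = {}
    acc = Fraction(0)
    for sym, p in probs.items():
        p = Fraction(p)
        if p <= 0:
            raise ValueError("probabilities must be positive")
        ranges[sym] = (acc, acc + p)
        acc += p
    if acc != 1:
        raise ValueError("probabilities must sum to 1")

    half = Fraction(1, 2)
    lo, hi = Fraction(0), Fraction(1)
    bits = []
    for sym in symbols:
        a, b = ranges[sym]
        width = hi - lo
        lo, hi = lo + width * a, lo + width * b
        while True:
            if hi <= half:        # interval inside the lower half
                bits.append(0)
                lo, hi = 2 * lo, 2 * hi
            elif lo >= half:      # interval inside the upper half
                bits.append(1)
                lo, hi = 2 * lo - 1, 2 * hi - 1
            else:                 # interval straddles 1/2: need more input
                break
    return bits
```

With a fair binary source the sketch passes the input through unchanged,
while a low-probability symbol yields more than one bit and a
high-probability symbol may yield none until further input arrives.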
Nobukazu Takai, Akisato Kimura and Nobuo Fujii,
"CMOS FET companding current-mode integrator,"
Proc. IEEE Asia-Pacific Conference on Circuits and Systems (APCCS98),
pp.17-20, Chiangmai, Thailand, Nov. 1998
[ pdf ]
[ DOI link ]
[ copyright notice ]
- Abstract
-
A new CMOS companding current-mode integrator is proposed. The
companding integrator is based on the MOS translinear principle and
utilizes the square-law characteristic of the MOSFET. SPICE simulation
results demonstrate good performance.
|