Saliency-based video segmentation with sequentially updated priors
This dataset contains 10 videos as inputs, and segmented image sequences as ground-truth.
Required: Any report or publication using this dataset must cite the top three publications listed in "Selected publications" below.
Detailed description: Videos: 10 uncompressed AVI clips of natural scenes at 12 fps, each containing at least one target object. Lengths vary from 5 to 10 seconds.
Ground truth: 10 sets of JPEG images, each corresponding to an input video. Segmented images are provided for all frames except the first 15.
(Top left) Input video (Top right) Visual attention density
Ken Fukuchi, Kouji Miyazato, Shigeru Takagi and Junji Yamato
"Saliency-based video segmentation with graph cuts and sequentially updated priors,"
Proc. International Conference on Multimedia and Expo (ICME2009),
pp.638--641, New York, New York, USA, June-July 2009.
[ bibliography ]
Kazuma Akamine, Ken Fukuchi and Shigeru Takagi
"Fully automatic extraction of salient regions in near real-time,"
The Computer Journal, doi:10.1093/comjnl/bxq075.
[ abstract ]
Extracting important (or meaningful) regions from videos is not only a challenging problem in computer vision research but also a crucial task in many applications, including object recognition, video classification, annotation and retrieval. It can be formulated as a binary segmentation problem, where important regions are considered "objects" and the remaining regions "backgrounds". One of the most promising ways to achieve precise segmentation is the method proposed by Boykov et al. called Interactive Graph Cuts. This method originated in the work of Greig et al., who showed that the exact maximum a posteriori (MAP) solution of a two-label pairwise Markov random field (MRF) can be obtained by finding the minimum cut on the equivalent graph of the MRF. Boykov et al. extended this work to MRFs with multiple labels and applied it to interactive image segmentation. Interactive Graph Cuts has become the de facto standard for interactive image segmentation in recent years. More recently, several approaches for extending it to video segmentation have been proposed. For example, Kohli and Torr described an efficient algorithm for computing MAP estimates for dynamically changing MRF models, and tested its performance on the video segmentation problem.
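The equivalence used by Greig et al. can be illustrated on a tiny one-dimensional example. The sketch below is ours, not the paper's implementation: it builds the standard two-terminal graph for a chain of pixels (terminal links carry the unary costs, neighbor links a Potts smoothness penalty), runs a plain Edmonds-Karp max-flow solver, and reads the exact MAP labeling off the residual graph. All function names are illustrative.

```python
from collections import deque

def max_flow(cap, s, t):
    # Edmonds-Karp: repeatedly augment along shortest residual paths.
    total = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:           # no augmenting path left: flow is maximal
            return total
        path, v = [], t
        while parent[v] is not None:  # recover the path s -> t
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:             # push flow, updating residual capacities
            cap[u][v] -= push
            cap[v][u] = cap[v].get(u, 0) + push
        total += push

def segment_chain(cost_obj, cost_bg, lam):
    """Exact MAP labeling of a two-label chain MRF via min-cut.

    cost_obj[i] / cost_bg[i]: unary cost of labeling pixel i object / background;
    lam: Potts penalty paid whenever two neighbors take different labels.
    Returns a list with 1 = object, 0 = background.
    """
    n, s, t = len(cost_obj), 's', 't'
    cap = {s: {}, t: {}}
    for i in range(n):
        cap[s][i] = cost_bg[i]        # cut when pixel i ends on the sink side
        cap[i] = {t: cost_obj[i]}     # cut when pixel i stays on the source side
    for i in range(n - 1):            # pairwise Potts terms (undirected)
        cap[i][i + 1] = cap[i + 1][i] = lam
    max_flow(cap, s, t)
    # Pixels still reachable from the source in the residual graph are "object".
    reach, q = {s}, deque([s])
    while q:
        u = q.popleft()
        for v, c in cap[u].items():
            if c > 0 and v not in reach:
                reach.add(v)
                q.append(v)
    return [1 if i in reach else 0 for i in range(n)]
```

For example, `segment_chain([1, 1, 4, 9, 9], [9, 9, 6, 1, 1], 2)` returns `[1, 1, 1, 0, 0]`: the middle pixel's unary terms only weakly favor "object", but the smoothness penalty keeps it with its object-labeled neighbors.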
Although the above approaches are promising, they all share a critical problem: segmentation cues (seeds) must be provided manually and carefully. Such manual labeling is sometimes infeasible, especially when we consider extending those methods to certain other applications. The development of fully automatic segmentation methods has therefore been strongly desired. The use of saliency-based human visual attention models is one of the most promising approaches in this respect. The first biologically plausible model for explaining the human attention system was proposed by Koch and Ullman, and later implemented by Itti et al. This model analyzes still images to produce primary visual features, such as intensity, color and orientation, which are combined to form a saliency map that represents the relevance of visual attention. Pang et al. later proposed a stochastic model for estimating human visual attention that tackled a fundamental limitation of previous attention models: the non-deterministic properties of the human visual system. Such models would be helpful for automatically providing segmentation seeds.
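The center-surround computation behind such saliency maps can be sketched for the intensity channel alone. This is a simplified illustration, not the Itti et al. implementation: separable box filters stand in for the Gaussian pyramid, only one center/surround scale pair is used, and the full model also computes color and orientation channels and a normalization operator before combining them. All names are ours.

```python
import numpy as np

def box_blur(img, k):
    # Separable box filter (odd k), a cheap stand-in for one Gaussian pyramid level.
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    kernel = np.ones(k) / k
    out = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode='valid'), 1, padded)
    out = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode='valid'), 0, out)
    return out

def intensity_saliency(img, center_k=3, surround_k=11):
    """Single-scale center-surround saliency for a grayscale image in [0, 1].

    Responds where a fine-scale (center) blur differs from a coarse-scale
    (surround) blur, i.e. at locally conspicuous regions; normalized to [0, 1].
    """
    center = box_blur(img, center_k)
    surround = box_blur(img, surround_k)
    sal = np.abs(center - surround)
    peak = sal.max()
    return sal / peak if peak > 0 else sal
```

A bright square on a dark background produces high responses around the square and zero response in uniform areas far from it; thresholding such a map is one way to obtain automatic segmentation seeds.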
In line with the above viewpoint, we propose a novel approach for achieving video segmentation based on visual saliency. Our main contributions are as follows: