Computational model of temporal dynamics of early human visual system

Developing an accurate computational model of human visual attention has been a long-standing challenge. Such a model would allow a system to select only the relevant information from complex and cluttered visual input in numerous artificial vision applications, such as robotics, surveillance, driving assistance, and multimedia recognition and retrieval.

The first biologically plausible model for explaining the human visual attention system was proposed by Koch and Ullman (1985), and later implemented by Itti et al. (1998). This model analyzes still images to extract primary visual features, such as intensity, color and orientation, which are combined to form a saliency map that represents how strongly each location attracts visual attention. Although several attempts have been made to improve the Koch-Ullman model, visual attention models for videos have not been fully investigated. It is well known that human sensitivity to such visual features varies with time. Because sensitivity to saliency depends strongly on the temporal dynamics of the early visual system, these temporal characteristics should be incorporated to achieve further improvements.
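The combination step described above can be illustrated with a minimal sketch. It assumes each feature map is rescaled to [0, 1] and the maps are then averaged with equal weights. This is a simplification chosen for illustration, not the normalization used by Itti et al. The function names `normalize` and `combine_saliency` and the toy feature maps are hypothetical.

```python
def normalize(feature_map):
    """Rescale a 2-D map (list of rows) to [0, 1]; a flat map becomes all zeros."""
    values = [v for row in feature_map for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[0.0 for _ in row] for row in feature_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in feature_map]


def combine_saliency(feature_maps):
    """Average the normalized feature maps pixel by pixel (equal weights assumed)."""
    normalized = [normalize(m) for m in feature_maps]
    n = len(normalized)
    rows, cols = len(normalized[0]), len(normalized[0][0])
    return [
        [sum(m[r][c] for m in normalized) / n for c in range(cols)]
        for r in range(rows)
    ]


# Toy 2x2 feature maps (made-up values, for illustration only).
intensity = [[0, 0], [0, 10]]
color = [[0, 4], [0, 4]]
orientation = [[1, 1], [1, 1]]  # flat map: contributes nothing after normalization

saliency = combine_saliency([intensity, color, orientation])
```

In this toy input the bottom-right pixel is the strongest in both the intensity and color maps, so it receives the highest combined saliency.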

We propose a new algorithm for extracting the saliency of videos based on the above considerations. The proposed algorithm models two contrasting properties of the temporal dynamics of the early human visual system:
1) Instantaneous saliency depletion with gradual recovery, which simulates the "Inhibition of Return" effect (Posner and Cohen (1984)). Owing to this effect, once attention has been diverted away from a region, humans are slower to notice salient events that occur near that previously attended region.
2) Gradual saliency depletion with instantaneous recovery, which is derived from the "Neural Adaptation" phenomenon (Hartline (1940)). Under this phenomenon, sensitivity to saliency gradually decreases over time while no surprising events occur in a video, and it is retained only at locations where surprising events occur.
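The two properties above can be sketched as update rules for a per-location sensitivity gain in [0, 1] that multiplies the raw saliency of each frame. The linear recovery toward 1, the exponential decay, the rate values, and every name below are assumptions made for illustration; they are not taken from the published algorithm.

```python
def step_inhibition_of_return(gain, attended, recovery_rate=0.1):
    """Instantaneous depletion, gradual recovery.

    The gain drops to 0 the moment the location is attended, then moves
    a fixed fraction of the remaining distance back toward 1 per frame.
    """
    if attended:
        return 0.0
    return gain + recovery_rate * (1.0 - gain)


def step_neural_adaptation(gain, surprising, decay_rate=0.1):
    """Gradual depletion, instantaneous recovery.

    The gain shrinks by a fixed fraction per frame while nothing
    surprising happens, and is reset to 1 by a surprising event.
    """
    if surprising:
        return 1.0
    return gain * (1.0 - decay_rate)


def simulate(step, events, initial_gain=1.0):
    """Apply one update rule over a sequence of per-frame boolean events."""
    gain, trace = initial_gain, []
    for event in events:
        gain = step(gain, event)
        trace.append(gain)
    return trace


# Location attended in frame 0, then ignored for three frames.
ior_trace = simulate(step_inhibition_of_return, [True, False, False, False])

# Three uneventful frames followed by one surprising frame.
adapt_trace = simulate(step_neural_adaptation, [False, False, False, True])
```

With these assumed rates, `ior_trace` starts at 0 and climbs slowly back toward 1, while `adapt_trace` decays slowly and jumps back to 1 on the surprising frame, mirroring the contrast between the two properties.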

The proposed algorithm has been evaluated with an eye-tracking device to see how well it matches the human visual system. The results show that the proposed algorithm substantially outperformed previous algorithms when only gradual depletion was incorporated, and that adding instantaneous depletion improved the performance in some cases.

Demo movie

Saliency with a computational model of "Inhibition of Return"
(Left) input video, (Right) saliency video

Saliency with a computational model of "Neural Adaptation"
(Left) input video, (Right) saliency video

(August 29, 2008) The problem with the availability of Flash videos in Internet Explorer has been fixed. We apologize for any inconvenience caused by the problem.

Selected publications

Clement Leung, Akisato Kimura, Tatsuto Takeuchi and Kunio Kashino
"A computational model of saliency depletion/recovery phenomena for the salient region extraction of videos,"
Proc. International Conference on Multimedia and Expo (ICME2007),
pp. 300-303, Beijing, China, July 2007.