Pose-Invariant Facial Expression Recognition Using Variable-Intensity Templates

Shiro Kumano, Yoichi Sato (The University of Tokyo)
Kazuhiro Otsuka, Junji Yamato, Eisaku Maeda (NTT Communication Science Laboratories)


Abstract

In this paper, we propose a method for pose-invariant facial expression recognition from monocular video sequences. Unlike existing methods, our method uses a simple model, called the variable-intensity template, to describe different facial expressions. This makes it possible to prepare a model for each person with very little time and effort. Variable-intensity templates describe how the intensities of multiple points, defined in the vicinity of facial parts, vary with different facial expressions. By using this model in the framework of a particle filter, our method estimates facial pose and expression simultaneously. Experiments demonstrate the effectiveness of our method: a recognition rate of over 90% is achieved for horizontal, vertical, and in-plane facial orientations within 40 degrees, 20 degrees, and 40 degrees of the frontal view, respectively.


System Flow



* Digital Human Research Center, Advanced Industrial Science and Technology.

Our method consists of two stages. First, we prepare a variable-intensity template for each person from just one frontal face image for each facial expression. Second, we estimate facial pose and expression simultaneously within the framework of a particle filter.
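
The second stage can be illustrated with a minimal particle filter sketch over a joint (pose, expression) state. This is not the authors' implementation: the single pose angle, the expression label set, the random-walk motion model, and the `likelihood` callback are all assumptions made for illustration. In the actual method the likelihood would come from comparing observed intensities with the variable-intensity template.

```python
import math
import random

EXPRESSIONS = ["neutral", "happy", "angry"]  # hypothetical label set


def particle_filter_step(particles, likelihood, rng, pose_sigma=2.0, switch_prob=0.1):
    """One predict/weight/resample step over (pose_deg, expression) particles."""
    # Predict: diffuse the pose with Gaussian noise (assumed random-walk model)
    # and occasionally redraw the expression label.
    predicted = []
    for pose, expr in particles:
        pose = pose + rng.gauss(0.0, pose_sigma)
        if rng.random() < switch_prob:
            expr = rng.choice(EXPRESSIONS)
        predicted.append((pose, expr))
    # Weight: score each joint hypothesis against the current observation.
    weights = [likelihood(pose, expr) for pose, expr in predicted]
    if sum(weights) <= 0.0:
        weights = [1.0] * len(predicted)  # fall back to uniform weights
    # Resample: draw particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Mean pose and most frequent expression across the particle set."""
    mean_pose = sum(pose for pose, _ in particles) / len(particles)
    counts = {}
    for _, expr in particles:
        counts[expr] = counts.get(expr, 0) + 1
    return mean_pose, max(counts, key=counts.get)
```

Because each particle carries both a pose and an expression, a single weighting step scores the two jointly, which is what lets pose and expression be estimated at the same time rather than one after the other.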




Demo Movies

Model Generation Process
mpeg1
Recognition
mpeg1


Variable-Intensity Templates
The variable-intensity template is a novel, simple face model for simultaneously estimating facial pose and expression. It can be generated easily, as the model generation movie shows. It consists of three components.

(1) Rigid Shape Model
The rigid shape model provides the depth coordinates of interest points defined on an image plane. The shape model used is shown in the upper figure.
(2) Interest Points


An interest point consists of a pair of points that straddle an edge of a facial part and are centered on it, so that motion of the part in either direction can be detected.
(3) Intensity Distribution Model



The intensity distribution model describes how the interest point intensity varies for different facial expressions. As shown in the figure, the interest point intensity changes strongly due to the shift of its associated facial part. Focusing on this property, we recognize facial expressions from the changes in observed interest point intensities.
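
The interest points and the intensity distribution model described above can be sketched as follows. This is an illustration under stated assumptions, not the method from the paper: the fixed pixel offset, the independent Gaussian intensity model per interest point, and the function names `straddling_pair`, `log_likelihood`, and `classify` are all hypothetical.

```python
import math


def straddling_pair(edge_xy, normal_xy, offset=2.0):
    """Two points placed symmetrically on either side of an edge point."""
    (x, y), (nx, ny) = edge_xy, normal_xy
    norm = math.hypot(nx, ny)
    nx, ny = nx / norm, ny / norm  # unit normal to the edge
    return (x - offset * nx, y - offset * ny), (x + offset * nx, y + offset * ny)


def log_likelihood(intensities, means, stds):
    """Sum of independent Gaussian log-densities, one per interest point (assumed model)."""
    total = 0.0
    for obs, mu, sd in zip(intensities, means, stds):
        total += -0.5 * ((obs - mu) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))
    return total


def classify(intensities, model):
    """Return the expression whose intensity model best explains the observation.

    `model` maps an expression label to a (means, stds) pair of equal-length lists.
    """
    return max(model, key=lambda expr: log_likelihood(intensities, *model[expr]))
```

For example, with a two-point model in which the "happy" expression darkens the first point and brightens the second, an observation such as `[62, 135]` scores higher under "happy" than under a "neutral" model centered on `[100, 100]`. When a facial part shifts across an edge, the two points of a pair change intensity in opposite directions, which is why a pair straddling the edge is informative about motion in both directions.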


Movies

Fixed-pose dataset
Horizontal
mpeg1

Vertical
mpeg1

In-plane
mpeg1
Free-pose dataset
mpeg1


References

Journal Papers
[1] S. Kumano, K. Otsuka, J. Yamato, E. Maeda and Y. Sato, "Pose-Invariant Facial Expression Recognition Using Variable-Intensity Templates", International Journal of Computer Vision, Vol. 83, No. 2, pp. 178-194, 2009. [pdf]

Conferences
[1] S. Kumano, K. Otsuka, J. Yamato, E. Maeda and Y. Sato, "Combining Stochastic and Deterministic Search for Pose-Invariant Facial Expression Recognition", British Machine Vision Conference (BMVC), 2008.
[2] S. Kumano, K. Otsuka, J. Yamato, E. Maeda and Y. Sato, "Pose-Invariant Facial Expression Recognition Using Variable-Intensity Templates", Asian Conference on Computer Vision (ACCV), Vol. I, pp. 324-334, 2007. [Honorable Mention Award]