CS Plaza

Memory-based particle filter for robust object tracking

  • Many applications, including human-machine interfaces, require a technique that automatically measures the position and orientation of people and objects in image sequences captured by cameras. Particle filters (PFs) are widely used to sequentially estimate a target's state (position and orientation). At each time step, a PF performs an update step and a prediction step: the update step estimates the target's current state from the input image, and the prediction step predicts the future state from the current state. Existing particle filters sometimes fail to predict reliably when the target's speed changes sharply, which can cause tracking failure. To solve this problem, we propose a novel particle filter, the Memory-based Particle Filter (M-PF). M-PF predicts the future state from the long-term history of the target's movements (past states). Because it uses the target's long-term dynamics, it offers more reliable prediction than previous particle filters, which assume the Markov property and therefore cannot exploit this history. The proposed method performs well, especially when the target moves abruptly and when the tracker must rediscover a lost target.
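For readers unfamiliar with the prediction/update cycle described above, the following is a minimal sketch of a conventional (Markov) bootstrap particle filter for a one-dimensional position. It is not the M-PF itself: the random-walk motion model, the Gaussian observation likelihood, and all names and parameter values (`particle_filter_step`, `track`, `process_std`, `obs_std`) are assumptions made for illustration only.

```python
import numpy as np

def particle_filter_step(particles, weights, observation, rng,
                         process_std=1.0, obs_std=0.5):
    """One prediction/update cycle of a bootstrap particle filter (1-D state)."""
    n = len(particles)
    # Prediction: move each particle with an assumed random-walk motion model.
    # This uses only the current state (the Markov assumption).
    particles = particles + rng.normal(0.0, process_std, size=n)
    # Update: reweight particles by an assumed Gaussian observation likelihood.
    likelihood = np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    weights = weights * likelihood + 1e-300  # guard against all-zero weights
    weights = weights / weights.sum()
    # State estimate: weighted mean of the particles.
    estimate = float(np.sum(weights * particles))
    # Resampling: draw particles in proportion to their weights.
    idx = rng.choice(n, size=n, p=weights)
    return particles[idx], np.full(n, 1.0 / n), estimate

def track(observations, n_particles=500, seed=0):
    """Run the filter over a sequence of noisy 1-D position observations."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(observations[0], 1.0, size=n_particles)
    weights = np.full(n_particles, 1.0 / n_particles)
    estimates = []
    for z in observations:
        particles, weights, est = particle_filter_step(particles, weights, z, rng)
        estimates.append(est)
    return estimates
```

Because the prediction line looks only at the current particle positions, a sudden change in speed leaves most particles far from the target. That is the failure mode the text attributes to conventional filters, and the step that a history-based predictor would replace.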

Brain networks involved in the formation and selection of auditory percepts

  • We can hear a sound mixture as music and discern the sounds of individual instruments of an orchestra. However, it is unclear how auditory percepts are formed in the brain. The present study used auditory illusions and brain imaging techniques to examine neural mechanisms related to the formation and selection of auditory percepts. The results demonstrated that cortical and subcortical areas are involved in the formation of percepts, and the synchronization of activations among these areas plays a critical role in the selection of percepts. Our findings will lead to the development of an audiovisual apparatus that considers individual differences in perception.

Recognizing daily activities with wearable sensors

  • Recognizing activities of daily living (ADLs) is one of the most important technologies for realizing real-world applications such as context-aware and lifelog applications. We developed a method that recognizes ADLs with a wrist-worn sensor device equipped with several kinds of sensors, including a camera, a microphone, and an accelerometer, and we also describe the design of the wrist-worn device. Specifically, the device's camera captures the space around the user's hand in order to recognize ADLs that involve the manual use of objects, such as making tea or coffee and watering plants. Existing wearable sensor devices equipped only with a microphone and an accelerometer cannot recognize these ADLs without sensors embedded in the objects. For further details, click here.

Audio signal modeling and processing based on sparse representation: Complex NMF and composite autoregressive model

  • Although audio signals generated by real-world sources appear as extremely varied waveforms, focusing on a particular space or feature reveals that they are actually composed of only a limited number of basic components, for example phoneme units such as /a/ or /i/ in human speech, or semitone units in a piano performance. Under this hypothesis, we explore a new paradigm in audio signal processing based on the so-called sparse representation approach, which can autonomously learn the previously unknown basic elements of a signal. So far we have proposed two signal analysis methods based on this concept: complex NMF and the composite autoregressive model.
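As a rough illustration of learning a small set of basic components from data, the sketch below implements ordinary non-negative matrix factorization (NMF) with the standard multiplicative updates for the Euclidean cost. It is not the complex NMF proposed here, which is a different model; the function name `nmf`, the iteration count, and the idea of treating `V` as a magnitude spectrogram (frequency by time) are assumptions for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n_freq x n_frames) as W @ H.

    Columns of W play the role of basic spectral components;
    rows of H give their activations over time.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random non-negative initialization.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative
        # and do not increase the squared reconstruction error.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

With `rank` much smaller than the number of frames, each frame is described by a few shared components, which is the sense in which the representation uses only a limited number of basic elements.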

Mystery of perception: action affects perception

  • Perceiving the state and motion of our own body is very important for everyday behavior. This "body perception" has been proposed to be formed in the brain through the integration of sensory information, such as somatic and visual information. We recently found, however, that motor command information for action is also involved in body perception. When sensory information becomes unreliable during action (e.g., when a limb goes numb), motor command information is used to compensate for the unreliable sensory information. This finding can lead to the development of a novel interface based on the mechanisms of human action-perception interaction.

Standardization of Lossless Audio Coding and Archival Format

  • We have made significant contributions to the establishment of the lossless audio coding standard MPEG-4 ALS* (ISO/IEC 14496-3). In addition, we have designed a package file format for archival systems, which are one of the most important applications of lossless coding. Recently, this format was established as MPEG-A PA-AF** (ISO/IEC 23000-6), together with the associated reference software. Used individually or jointly, these two standards will serve various applications, including long-term archiving and preservation, transmission, editing, and playback of high-quality audio signals, while taking advantage of their status as international standards. For further details, click here.
    *ALS: Audio Lossless Coding
    **PA-AF: Professional Archival Application Format

Tracking multiple objects and simultaneous learning of their movement patterns

  • Many surveillance cameras have been installed in our everyday surroundings, and they record a wide variety of behavior by many different people. However, most of this recorded information lies buried in storage because there are not enough people to watch all the footage. As part of our research on time-series data mining, we developed a new statistical model that not only tracks multiple targets in a scene but also learns and recognizes their movement patterns (walk right, run to the upper left, etc.). The system's ability to understand and recognize scenes grows with the amount of video it has been fed. Please visit this page to find more information.

Dynamical Information Processing Systems (DIPS)

  • The world around us is full of complex and unpredictable behavior, such as water churning in a stream, electrons tumbling along a wire, and human behavior itself. In our research we aim to model and harness complex dynamical behavior for use in information and communication technologies. One example is the use of complex dynamics in lasers for information security applications. Laser light beams with complex fluctuations can be generated and controlled by taking the light from a laser and reflecting some of it back inside the laser. By combining advanced methods from physics and information theory, we have developed methods to harness these laser beam fluctuations so that they can be used as physical random number generators to generate passwords and secret keys for secure data communications. Recently we showed that semiconductor lasers can be used to generate random numbers at very high rates. In collaboration with an experimental group at Takushoku University, we demonstrated the continuous generation of random bits at the world's fastest rate of 1.7 Gbps. This achievement was reported in the research journal Nature Photonics Vol.2, No.12 (2008) pp. 728-732, where it was called "The world's fastest dice". For further details, click here.
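The idea of turning a fluctuating signal into bits can be sketched in software as a toy example. The code below thresholds two independently evolving signals to one bit each and combines them with XOR. Both the extraction scheme and the signal source are assumptions for illustration: the text above does not describe the actual procedure, and a logistic map stands in for the laser intensity. Because the map is deterministic, the output is only pseudo-random and is not suitable for passwords or keys, unlike a physical generator.

```python
def logistic_map_signal(x0, n, r=4.0, burn_in=100):
    """Return n samples of the chaotic logistic map x -> r * x * (1 - x).

    This is a deterministic stand-in for a fluctuating physical signal.
    """
    x = x0
    samples = []
    for i in range(burn_in + n):
        x = r * x * (1.0 - x)
        if i >= burn_in:  # discard the initial transient
            samples.append(x)
    return samples

def random_bits(n, x0_a=0.123456789, x0_b=0.987654321, threshold=0.5):
    """Threshold two signals to one bit each and XOR them sample by sample."""
    a = logistic_map_signal(x0_a, n)
    b = logistic_map_signal(x0_b, n)
    return [int(sa > threshold) ^ int(sb > threshold) for sa, sb in zip(a, b)]
```

Combining two sources with XOR is a common way to reduce bias that either source might show on its own.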

Illusion Forum: A collection of visual and auditory illusions

  • Illusion Forum is a Web site that demonstrates various visual and auditory illusions. Click here to move to the site (written only in Japanese).



"Perceptual Attraction Force: Exploit the Nonlinearity of Human Perception"

  • We developed a new force-feedback haptic device, "Buru-navi", that uses a haptic illusion (a characteristic of human perception). For further details, click here.



"Robust Media Search" technology for searching for audio/video data in our surroundings

  • When you hear a good song on the street, or see an interesting item on TV, do you ever want to find out more information about what you're hearing or seeing? Searching for information on a distorted signal captured by a mobile phone or camera is possible with NTT's Robust Media Search technology. For further details, click here.