Technical Program

Paper Detail

Session: Multi-Sensory Processing for Context-Aware Computing
Time: Tuesday, May 18, 14:40 - 15:00
Presentation: Special Session Lecture
Topic: Special Sessions: Multi-sensory Processing for Context-Aware Computing
Authors: Rama Chellappa; University of Maryland, College Park 
 Gang Qian; Arizona State University 
 Qinfen Zheng; University of Maryland, College Park 
Abstract: Multimodal sensing has attracted much attention for solving a wide range of problems, including target detection, tracking, classification, activity understanding, and speech recognition. In surveillance applications, different types of sensors, such as video and acoustic sensors, provide distinct observations of ongoing activities in the region of interest. Proper fusion of these multimodal observations can produce a more complete and clearer picture of those activities, and activity metrics can be measured more accurately when sensors of multiple modalities are used. In this paper, we present a data fusion framework that uses both video and acoustic data for vehicle detection and tracking. In the detection phase, a rough estimate of the target direction-of-arrival (DOA) is first obtained by applying beam-forming techniques to the acoustic data. This initial DOA estimate designates the approximate target location in the video. Given the initial target position, the DOA is refined by moving-target detection in the video. After target detection, Markov Chain Monte Carlo techniques are used for combined acoustic and video tracking. Experimental results on both synthetic and real data are presented. Improved tracking performance is observed when the posterior probability density functions obtained from the two types of sensors are fused.
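
The fusion step mentioned at the end of the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not state how the posteriors are combined, so the sketch assumes the two sensors are conditionally independent given the DOA, assumes a flat prior, and fuses by a pointwise product on a discretized angle grid. The function names, the Gaussian shapes, and all numeric values are hypothetical and chosen only for illustration. Angles are restricted to -90..90 degrees so that wrap-around can be ignored.

```python
import numpy as np


def gaussian_pdf_on_grid(grid_deg, mean_deg, std_deg):
    """Gaussian-shaped density on a DOA grid, normalized to sum to 1.

    Hypothetical stand-in for a per-sensor posterior over DOA.
    """
    p = np.exp(-0.5 * ((grid_deg - mean_deg) / std_deg) ** 2)
    return p / p.sum()


def fuse_posteriors(p_acoustic, p_video):
    """Fuse two discretized posteriors by pointwise product.

    Assumes conditionally independent sensors and a flat prior
    (an illustrative assumption, not taken from the paper).
    """
    fused = p_acoustic * p_video
    total = fused.sum()
    if total == 0.0:
        raise ValueError("posteriors have no overlapping support")
    return fused / total


def grid_std(grid_deg, p):
    """Standard deviation of a discretized density on the grid."""
    mean = (grid_deg * p).sum()
    return np.sqrt((((grid_deg - mean) ** 2) * p).sum())


# DOA grid in degrees, 0.25-degree spacing.
grid = np.linspace(-90.0, 90.0, 721)

# Illustrative values: a broad acoustic posterior and a sharper video one.
p_acoustic = gaussian_pdf_on_grid(grid, mean_deg=12.0, std_deg=8.0)
p_video = gaussian_pdf_on_grid(grid, mean_deg=10.0, std_deg=2.0)

p_fused = fuse_posteriors(p_acoustic, p_video)
doa_estimate = grid[np.argmax(p_fused)]

print("fused DOA estimate (deg):", doa_estimate)
print("std acoustic / video / fused:",
      grid_std(grid, p_acoustic), grid_std(grid, p_video), grid_std(grid, p_fused))
```

Under these assumptions the fused density is narrower than either input, which is one simple way to see why combining the two posteriors could tighten a DOA estimate. A sampling-based tracker would typically apply the same product to particle weights rather than to a fixed grid.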

