PhD Defense: Reasoning about Geometric Object Interactions in 3D for Manipulation Action Understanding
Konstantinos Zampogiannis
Friday, June 28, 2019, 11:00 am-1:00 pm
Abstract
To interact efficiently with human users, intelligent agents and autonomous systems need to be able to interpret human actions. We focus our attention on manipulation actions, wherein an agent typically grasps an object and moves it, possibly altering its physical state. Agent-object and object-object interactions during a manipulation are a defining part of the performed action itself. In this thesis, we focus on extracting semantic cues, derived from geometric object interactions in 3D space during a manipulation, that are useful for action understanding at the cognitive level.

First, we introduce a simple grounding model for the most common pairwise spatial relations between objects and investigate the descriptive power of their temporal evolution for action characterization. We propose a compact, abstract action descriptor that encodes the geometric object interactions during action execution, as captured by the spatial relation dynamics. Our experiments on a diverse dataset confirm both the validity and effectiveness of our spatial relation models and the discriminative power of our representation with respect to the underlying action semantics. Second, we model and detect lower-level interactions, namely object contacts and separations, viewing them as topological scene changes within a dense motion estimation setting. In addition to improving motion estimation accuracy in the challenging case of motion boundaries induced by these events, our approach shows promising performance in explicitly detecting and classifying the events themselves. Building upon dense motion estimation and using detected contact events as an attention mechanism, we propose a bottom-up pipeline for the guided segmentation and rigid motion extraction of manipulated objects. Finally, in addition to our methodological contributions, we introduce a new open-source software library for point cloud data processing, developed for the needs of this thesis, which aims to provide an easy-to-use, flexible, and efficient framework for the rapid development of performant software for a range of 3D perception tasks.
 
Examining Committee:

    Chair:       Dr. Yiannis Aloimonos
    Dean's rep:  Dr. Shihab Shamma
    Members:     Dr. Cornelia Fermüller
                 Dr. Ramani Duraiswami
                 Dr. Matthias Zwicker
This talk is organized by Tom Hurst