PhD Defense: Feedback for Vision
Michael Maynord
Abstract
This thesis explores the application of feedback in image understanding, action understanding, and video surveillance. It introduces Mid-Vision Feedback (MVF), a mechanism that modulates perception based on high-level categorical expectations, improving accuracy and contextual consistency in object classification. This approach is extended to action understanding through Sub-Action Modulation (SAM), which incorporates context into action interpretation by hierarchically grouping action primitives. SAM outperforms a range of video understanding architectures, improving both action recognition and anticipation accuracy. Additionally, a configurable perception pipeline architecture, the Image Surveillance Assistant (ISA), is presented to aid watchstanders in video surveillance tasks by integrating human-specified expectations into the perceptual loop. Lastly, drawing inspiration from contextual contrasting in MVF, a learning formulation for separating motion and context is proposed, yielding improvements in action recognition and anticipation accuracy across multiple datasets.
Examining Committee
Bio
Michael Maynord, a PhD candidate studying AI and Computer Vision, has conducted research across classical symbolic methods, cognitive architectures, and deep learning. Motivated by the integration of high-level reasoning and perception, he has explored the connection between knowledge and visual perception. This work extends to image understanding and action understanding in video, with a dedicated investigation into feedback mechanisms in vision that synchronize perception with higher-level world understanding. Recent projects include contributions to medical image analysis, particularly the detection of lung nodules and the discernment of manifestations of multiple sclerosis.
This talk is organized by Migo Gui.