A tourist gets lost in a city she has never visited. Worse, she left her cellphone and wallet at the hotel, so she cannot buy a map or take a taxi. Still, by asking locals for directions, she returns to the hotel safely. This example conveys a powerful fact about humans: they can accomplish tasks that surpass their own knowledge and skill by reaching out to others for help. We posit that, to build AI agents that generalize robustly to new situations, we must empower them with similar capabilities. Concretely, we must teach these agents to (a) determine when they need additional assistance to make progress and (b) understand human assistance expressed in natural forms (e.g., words, images) and translate it into concrete actions and decisions. We study these two learning problems jointly in the context of a navigation agent finding objects in a house by asking humans for directions along the way. In this talk, I will describe HANNA, a new photorealistic research simulator for learning to leverage multimodal human assistance, and Imitation Learning with Indirect Intervention, a novel learning framework for modeling natural feedback that cannot be captured by conventional imitation learning or reinforcement learning frameworks.
Khanh Nguyen is a Ph.D. student at the University of Maryland, College Park, where he is advised by Prof. Hal Daumé III. His research focuses on applying NLP methods to grounded language learning, with a specific interest in teaching AI agents to interact intelligently with humans and environments via language. His recent work, “Visual Navigation with Human Assistance”, develops new algorithms for teaching AI agents to leverage human assistance in photorealistic navigation tasks. Khanh obtained his Bachelor’s degree at the University of Massachusetts Amherst, where he was mentored by Prof. Erik Learned-Miller and Prof. Brendan O’Connor.