Talks

CLIP Research Highlights: Embedding Senses and Reinforcement Learning with no Incremental Feedback

Fenfei Guo and Amr Sharaf - University of Maryland

3258 A.V. Williams Building (AVW)

Wednesday, March 14, 2018, 11:00 am-12:00 pm

Abstract

This week, we'll hear about two research projects by CLIP PhD students.

Title: Inducing and Embedding Senses with Scaled Gumbel Softmax

Speaker: Fenfei Guo

Abstract: We propose an unsupervised model that simultaneously learns 1) interpretable sense embeddings and 2) how to select which sense to use in a given context. It uses a modified Gumbel softmax function for differentiable discrete sense selection. Our model not only produces sense embeddings that are competitive on multiple downstream evaluations but also discovers multiple interpretable, distinct sense groups per word and achieves state-of-the-art results on a human evaluation task.
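For readers unfamiliar with the technique, here is a minimal sketch of the standard Gumbel softmax relaxation, which makes a discrete choice (here, which sense of a word to use) differentiable by returning soft selection weights. It is not the scaled variant presented in the talk, whose details the abstract does not give. The function name, the scores, and the toy sense vectors are all assumptions made for illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample (weights summing to 1) over `logits`."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise via inverse transform: g = -log(-log(u)).
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled scores.
    peak = max(noisy)
    exps = [math.exp(value - peak) for value in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: one word with three candidate senses.
rng = random.Random(0)
sense_scores = [2.0, 0.5, -1.0]  # assumed context-dependent scores per sense
sense_vectors = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # toy sense embeddings

weights = gumbel_softmax(sense_scores, temperature=0.5, rng=rng)
# Soft selection: a weighted mix of sense embeddings stands in for a hard pick.
mixed = [sum(w * vec[d] for w, vec in zip(weights, sense_vectors))
         for d in range(2)]
```

Lower temperatures push the weights toward a one-hot choice, while higher temperatures spread them more evenly across senses.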

Title: Residual Loss Prediction: Reinforcement Learning with no Incremental Feedback

Speaker: Amr Sharaf

Abstract: State-of-the-art learning-based systems require enormous, costly datasets on which to train supervised models. To progress beyond this requirement, we need learning systems that can interact with their environments, collect feedback, and improve over time. In most real-world settings, such feedback is sparse and delayed.

In this talk, I'll present Residual Loss Prediction (RESLOPE), a novel learning algorithm that solves reinforcement learning and bandit structured prediction problems with very sparse feedback. RESLOPE learns an internal representation of a denser reward function and operates as a reduction to contextual bandits. It uses its learned loss representation to solve the credit assignment problem, and a contextual bandit oracle to trade off exploration and exploitation. RESLOPE enjoys a no-regret, reduction-style theoretical guarantee and outperforms state-of-the-art reinforcement learning algorithms in both MDP environments and bandit structured prediction settings.
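The credit assignment idea can be illustrated with a toy sketch: a model predicts a loss for every step and is trained so that the predictions sum to the single loss observed at the end of an episode. This is not the RESLOPE algorithm. It omits the contextual bandit oracle and all exploration, and the linear model, feature encoding, and data below are assumptions chosen only to show how a denser per-step signal might be recovered from episode-level feedback.

```python
def predict_step_losses(weights, episode):
    """Predicted loss for each step; each step is a feature vector."""
    return [sum(w * x for w, x in zip(weights, step)) for step in episode]


def update(weights, episode, observed_loss, learning_rate=0.05):
    """One gradient step on 0.5 * (sum of step predictions - observed_loss) ** 2."""
    residual = sum(predict_step_losses(weights, episode)) - observed_loss
    # The gradient with respect to the weights is residual * (sum of step features).
    summed_features = [sum(column) for column in zip(*episode)]
    return [w - learning_rate * residual * f
            for w, f in zip(weights, summed_features)]


# Toy data (hypothetical): each step is a one-hot indicator of the action taken.
# Action 0 secretly costs 0.0 and action 1 costs 1.0, but the learner only
# ever sees the total loss at the end of each episode.
GOOD, BAD = [1.0, 0.0], [0.0, 1.0]
episodes = [
    ([GOOD, BAD, BAD], 2.0),
    ([GOOD, GOOD, BAD], 1.0),
    ([BAD], 1.0),
    ([GOOD], 0.0),
]

weights = [0.0, 0.0]
for _ in range(300):
    for episode, total_loss in episodes:
        weights = update(weights, episode, total_loss)

# Per-step loss estimates for a new episode, learned from episode totals only.
step_losses = predict_step_losses(weights, [GOOD, BAD, GOOD])
```

On this noise-free toy data the learned weights approach the hidden per-action costs, so each step gets its own loss estimate even though only episode totals were observed.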

Bio

Fenfei Guo is a Ph.D. student working with Jordan Boyd-Graber. Her research interests lie in interpretable representation learning for languages. She has been working on multi-sense word embeddings, user-interactive sense embeddings, and tracking topics in conversations.

Amr Sharaf is a Ph.D. student in the Computational Linguistics and Information Processing (CLIP) Lab of the Department of Computer Science at the University of Maryland, advised by Hal Daumé III. His research focuses on developing interactive learning algorithms in the context of structured prediction for artificial intelligence and natural language processing.

This talk is organized by Marine Carpuat.