MS Defense: EFFECTIVENESS OF PROXIMAL POLICY OPTIMIZATION METHODS FOR NEURAL PROGRAM INDUCTION
Runxing Lin
Remote
Monday, November 30, 2020, 2:30-4:30 pm
Abstract
The Neural Virtual Machine (NVM) is a novel neurocomputational architecture designed to emulate the functionality of a traditional computer. A version of the NVM called NVM-RL supports reinforcement learning based on standard policy gradient methods as a mechanism for performing neural program induction. In this thesis, I modified NVM-RL to use one of the most popular reinforcement learning algorithms, proximal policy optimization (PPO). Surprisingly, using PPO with the existing all-or-nothing reward function did not improve the effectiveness of NVM-RL. However, I found that PPO did improve the performance of the existing NVM-RL when the reward function instead grants partial credit for incorrect outputs based on how much those outputs differ from the correct targets. I conclude that, in some situations, PPO can improve the performance of reinforcement learning during program induction, but that this improvement depends on the quality of the reward function used.
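As a rough illustration of the distinction drawn in the abstract, the following minimal Python sketch contrasts an all-or-nothing reward with a partial-credit reward and shows the standard PPO clipped surrogate term for a single sample. It is not the NVM-RL code: the function names, the representation of outputs as symbol sequences, the position-wise matching metric, and the default `epsilon=0.2` are all assumptions made here for illustration, and the thesis may measure the difference between outputs and targets in another way.

```python
from typing import Sequence


def all_or_nothing_reward(output: Sequence[int], target: Sequence[int]) -> float:
    """Hypothetical reward: 1.0 only if the whole output equals the target."""
    return 1.0 if list(output) == list(target) else 0.0


def partial_credit_reward(output: Sequence[int], target: Sequence[int]) -> float:
    """Hypothetical reward: fraction of positions that match the target.

    Positions missing from the shorter sequence count as mismatches.
    """
    length = max(len(output), len(target))
    if length == 0:
        return 1.0  # two empty sequences match trivially
    matches = sum(o == t for o, t in zip(output, target))
    return matches / length


def ppo_clipped_term(ratio: float, advantage: float, epsilon: float = 0.2) -> float:
    """PPO clipped surrogate for one sample: min(r*A, clip(r, 1-eps, 1+eps)*A)."""
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    target = [1, 2, 3]
    nearly_right = [1, 2, 0]
    print(all_or_nothing_reward(nearly_right, target))
    print(partial_credit_reward(nearly_right, target))
    print(ppo_clipped_term(ratio=1.5, advantage=1.0))
```

In this sketch, a nearly correct output receives no signal from the all-or-nothing reward but a graded signal from the partial-credit reward, which is one plausible reading of why the choice of reward function matters for the policy update.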

Examining Committee:

Chair: Dr. James A. Reggia
Members: Dr. Dana Nau
         Dr. Garrett E. Katz
Bio

Runxing Lin is an MS candidate in the University of Maryland's Department of Computer Science, planning to graduate in Fall 2020. He has been a big fan of artificial intelligence since 2015, when AlphaGo became the first computer program to defeat a human professional Go player. He believes that reinforcement learning is an important area that can help in designing artificial agents that achieve human-level performance in complex tasks such as autonomous driving. He considers imitation learning and reinforcement learning to be important in everyday life, and believes that studies of machine reinforcement learning will contribute to a better understanding of our own learning. He plans to be involved with future projects and research on reinforcement learning.

This talk is organized by Tom Hurst