Paradigms of AI alignment: components and enablers
Tuesday, November 21, 2023, 1:00-2:00 pm
Registration requested: The organizer of this talk requests that you register if you are planning to attend.

Abstract

The goal of AI alignment is to figure out how to get advanced AI systems to do what we want them to do and to avoid knowingly acting against our interests. Alignment research focuses either on developing the components of an aligned AI system (e.g., reward design and generalization) or on enabling more effective work on those components (e.g., by improving interpretability or theoretical understanding). This talk will give an overview of research directions in each of these areas and explain how alignment research at Google DeepMind fits into this framework.

Bio

Victoria is a Senior Research Scientist on the Alignment team at Google DeepMind. She currently focuses on evaluating dangerous capabilities in large language models. Her past research includes power-seeking incentives, specification gaming, and avoiding side effects. She holds a PhD in statistics and machine learning from Harvard University.


Note: Please register using the Google Form on our website https://go.umd.edu/marl for access to the Google Meet and talk resources.

This talk is organized by Saptarashmi Bandyopadhyay