How do we build models that learn and generalize?
Andrew Wilson
Zoom link: https://umd.zoom.us/j/95197245230?pwd=cDRlVWRVeXBHcURGQkptSHpIS0VGdz09
Thursday, September 8, 2022, 3:30-4:30 pm
Abstract
To answer scientific questions and reason about data, we must build models and perform inference within those models. But how should we approach model construction and inference to make the most successful predictions? How do we represent uncertainty and prior knowledge? How flexible should our models be? Should we use a single model, or multiple different models? Should we follow a different procedure depending on how much data are available? How do we learn desirable constraints, such as rotation, translation, or reflection symmetries, when they don't improve the standard training loss? How do we select between models that are each entirely consistent with the observed data? What if our test data are drawn from a different but semantically related distribution?

In this lecture, I will present a philosophy for model construction, grounded in probability theory. I will exemplify this approach with methods that exploit loss surface geometry for scalable and practical Bayesian deep learning, and resolutions to seemingly mysterious generalization behaviour such as double descent. I will also consider prior specification, model selection, generalized Bayesian inference, domain shift, and automatic constraint (symmetry) learning.
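
The announcement does not spell out how loss surface geometry is exploited, so the following is only an illustrative sketch of one approach in this family: a SWAG-style (Stochastic Weight Averaging-Gaussian) Bayesian model average, which fits a Gaussian to SGD iterates near a mode and averages predictions over sampled weights. All names here (collect_swag_moments, bma_predict, and the model/loader/optimizer/loss_fn objects) are hypothetical scaffolding, not code from the talk, and only a diagonal covariance is shown.

```python
import torch

def collect_swag_moments(model, loader, optimizer, loss_fn, n_snapshots=20):
    """Run SGD near a converged mode and accumulate running first and
    second moments of the flattened weights, one snapshot per epoch."""
    mean, sq_mean = None, None
    for i in range(n_snapshots):
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        w = torch.nn.utils.parameters_to_vector(model.parameters()).detach()
        # Incremental running means over the snapshots collected so far.
        mean = w.clone() if mean is None else mean + (w - mean) / (i + 1)
        sq_mean = w**2 if sq_mean is None else sq_mean + (w**2 - sq_mean) / (i + 1)
    # Diagonal posterior covariance (full SWAG adds a low-rank term as well).
    var = torch.clamp(sq_mean - mean**2, min=1e-8)
    return mean, var

def bma_predict(model, mean, var, x, n_samples=30):
    """Bayesian model average: sample weight vectors from the fitted
    Gaussian and average the resulting predictive distributions."""
    preds = []
    with torch.no_grad():
        for _ in range(n_samples):
            w = mean + var.sqrt() * torch.randn_like(mean)
            torch.nn.utils.vector_to_parameters(w, model.parameters())
            preds.append(model(x).softmax(-1))
    return torch.stack(preds).mean(0)
```

Averaging predictions over many plausible weight settings, rather than committing to a single point estimate, is the kind of Bayesian model averaging the abstract connects to practical deep learning and to generalization behaviour such as double descent.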
This talk is organized by Richa Mathur.