Learning Like a Human: How, Why, and When
Tianyi Zhou - University of Washington
https://umd.zoom.us/j/94543765116?pwd=clY3MVV5Z1g4T2xpdnJMdjFiMFhYdz09
Wednesday, February 17, 2021, 1:00-2:00 pm
Abstract

Machine learning (ML) can surpass humans on certain complicated yet specific tasks. However, most ML methods treat all samples/tasks equally during training, e.g., drawing a random batch per step and running SGD over all the data for many epochs. This may work well on well-processed data given sufficient computation, but it is highly suboptimal and inefficient from a human perspective: we would never teach children or students this way. In contrast, human learning is far more strategic and smarter at selecting or generating the training content for different learning stages, via experienced teachers, collaboration between learners, curiosity and diversity for exploration, tracking of memory and progress, sub-tasking, etc., all of which remain underexplored in ML. The selection and scheduling of data/tasks is a form of intelligence as important as the optimization of model parameters. My recent work aims to bridge this gap between human and machine intelligence. As we enter a new era of hybrid intelligence between humans and machines, it is important to make AI not only act like humans but also benefit from human-like training strategies.


In this talk, I will present several curriculum learning techniques we developed to improve supervised/semi-supervised/self-supervised learning, robust learning with noisy labels, reinforcement learning, ensemble learning, etc., especially when the data are imperfect and a curriculum can therefore make a big difference. First, I will show how to translate human learning strategies into discrete-continuous hybrid optimizations, which are challenging to solve in general but admit efficient and provable algorithms built on techniques from submodular and convex/non-convex optimization; curiosity and diversity play important roles in these formulations. Second, we build both empirical and theoretical connections between curriculum learning and the training dynamics of ML models on individual samples. Empirically, we find that deep neural networks are fast to memorize some data but also fast to forget others, so we can accurately identify the easily forgotten data from training dynamics observed at very early stages and focus future training mainly on them. Moreover, we find that the consistency of the model's output over time on an unlabeled sample is a reliable indicator of its pseudo-label's correctness in self-supervised learning and delineates the forgetting effects on previously learned data. In addition, the learning speed on samples/tasks provides critical information for future exploration in RL. These discoveries are consistent with human learning strategies and lead to more efficient curricula for a rich class of ML problems. Theoretically, we derive a data selection criterion solely from the optimization of the learning dynamics in continuous time. Interestingly, the resulting curriculum matches the earlier empirical observations and has a natural connection to the neural tangent/path kernels in recent deep learning theory.
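As a rough illustration of the training-dynamics idea above (a minimal sketch, not the speaker's actual method; the class and its names are hypothetical), one can count per-sample "forgetting events" — transitions from correctly classified to misclassified across epochs — and prioritize the most frequently forgotten samples in later training:

```python
# Sketch: tracking per-sample forgetting events during training, so that
# future training can focus on easily forgotten samples. Hypothetical names.
from collections import defaultdict

class ForgettingTracker:
    def __init__(self):
        self.prev_correct = {}                 # sample id -> correct last epoch?
        self.forget_counts = defaultdict(int)  # sample id -> correct->wrong flips

    def update(self, sample_ids, correct_flags):
        """Record one epoch's per-sample correctness."""
        for sid, correct in zip(sample_ids, correct_flags):
            if self.prev_correct.get(sid, False) and not correct:
                self.forget_counts[sid] += 1   # learned earlier, now forgotten
            self.prev_correct[sid] = correct

    def most_forgotten(self, k):
        """The k samples forgotten most often — candidates to prioritize."""
        return sorted(self.forget_counts,
                      key=self.forget_counts.get, reverse=True)[:k]

# Toy usage: sample 1 is forgotten twice, sample 0 never.
t = ForgettingTracker()
t.update([0, 1], [True, True])
t.update([0, 1], [True, False])   # sample 1 forgotten
t.update([0, 1], [True, True])
t.update([0, 1], [True, False])   # sample 1 forgotten again
print(t.most_forgotten(1))        # -> [1]
```

In a real pipeline the correctness flags would come from the model's predictions on each mini-batch, and the observed counts from early epochs would reweight or reorder the sampling of data in subsequent epochs.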


Bio


Tianyi Zhou (https://tianyizhou.github.io) is a Ph.D. candidate in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, advised by Professor Jeff A. Bilmes. His research interests are in machine learning, optimization, and natural language processing. His recent research focuses on transferring human learning strategies, e.g., curriculum and sub-tasking, to machine learning in the wild, especially when the data are unlabeled, redundant, noisy, biased, or collected via interaction. These results can improve supervised/semi-supervised/self-supervised learning, robust learning with noisy data, reinforcement learning, meta-learning, ensemble methods, spectral methods, etc. He has published ~50 papers at NeurIPS, ICML, ICLR, AISTATS, NAACL, COLING, KDD, AAAI, IJCAI, Machine Learning (Springer), IEEE TIP, IEEE TNNLS, IEEE TKDE, etc., with ~2000 citations. He is the recipient of the Best Student Paper Award at ICDM 2013 and the 2020 IEEE Computer Society Technical Committee on Scalable Computing (TCSC) Most Influential Paper Award.

This talk is organized by Richa Mathur