Building More Reliable and Scalable AI Systems with Language Model Programming
Omar Khattab
IRB 4105 or https://umd.zoom.us/j/95853135696?pwd=VVEwMVpxeElXeEw0ckVlSWNOMVhXdz09
Monday, March 4, 2024, 11:00 am-12:00 pm
Abstract
It is now easy to build impressive demos with language models (LMs), but turning these demos into reliable systems currently requires brittle combinations of prompting, chaining, and finetuning LMs. In this talk, I present LM programming, a systematic way to address this by defining and improving four layers of the LM stack. I start with how to adapt LMs to search for information most effectively (ColBERT, ColBERTv2, UDAPDR) and how to scale that search to billions of tokens (PLAID). I then discuss the right architectures and supervision strategies (ColBERT-QA, Baleen, Hindsight) for allowing LMs to search for and cite verifiable sources in their responses. This leads to DSPy, a programming model that replaces ad-hoc LM prompting techniques with composable modules and optimizers that can supervise complex LM programs. Even simple AI systems expressed in DSPy routinely outperform standard hand-crafted prompt pipelines, in some cases while using small LMs. I highlight how ColBERT and DSPy have sparked applications at dozens of leading tech companies, open-source communities, and research labs, and then conclude by discussing how DSPy enables a new degree of research modularity, one that stands to allow open research to again lead the development of AI systems.
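
To make the "composable modules plus optimizers" idea concrete, below is a minimal sketch of a DSPy program, using the library's public Python API roughly as it stood around the time of this talk. The model name, training example, and metric are illustrative assumptions for the sketch, not material from the talk itself.

    import dspy

    # Configure an LM backend. The model name is an illustrative placeholder;
    # DSPy supports several providers. (API roughly as of early 2024.)
    lm = dspy.OpenAI(model="gpt-3.5-turbo")
    dspy.settings.configure(lm=lm)

    # A Signature declares *what* a module should do, not how to prompt for it.
    class GenerateAnswer(dspy.Signature):
        """Answer questions with short factoid answers."""
        question = dspy.InputField()
        answer = dspy.OutputField(desc="often between 1 and 5 words")

    # Modules compose like layers; ChainOfThought adds a reasoning step
    # before the answer, without any hand-written prompt.
    qa = dspy.ChainOfThought(GenerateAnswer)
    print(qa(question="Who introduced the ColBERT retrieval model?").answer)

    # An optimizer ("teleprompter") supervises the program: given a metric and
    # a few examples, it bootstraps demonstrations and compiles the module into
    # a stronger version. The example and metric here are made up.
    from dspy.teleprompt import BootstrapFewShot

    trainset = [
        dspy.Example(question="What does LM stand for?",
                     answer="language model").with_inputs("question"),
    ]

    def exact_match(example, pred, trace=None):
        return example.answer.lower() == pred.answer.lower()

    compiled_qa = BootstrapFewShot(metric=exact_match).compile(qa, trainset=trainset)

The point of the sketch is the separation of concerns: the program states the task declaratively, and the optimizer, not the programmer, searches for the prompts or demonstrations that make it work.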

Bio
Omar is a fifth-year CS Ph.D. candidate at Stanford NLP and an Apple Scholar in AI/ML. He is interested in Natural Language Processing (NLP) at scale, where systems capable of retrieval and reasoning can leverage massive text corpora to craft knowledgeable responses efficiently and transparently. Omar is the author of the ColBERT retrieval model, which has helped shape the modern landscape of neural information retrieval (IR), and of several early multi-stage retrieval-based LM systems like ColBERT-QA and Baleen. His recent work includes the DSPy programming model for building and optimizing reliable language model systems. Much of Omar's work forms the basis of influential open-source projects, and his lines of work on ColBERT and DSPy have sparked applications at dozens of academic research labs and leading tech companies, including Google, Meta, Amazon, IBM, VMware, Baidu, Huawei, AliExpress, and many others.

This talk is organized by Samuel Malede Zewdu.