Over the last five years, the rapid increase in the capability and ubiquity of generative models has revealed a clear and pressing need to incorporate transparency-enhancing technologies into these systems. In the first part of this talk, I will introduce our work on output watermarking for large language models and showcase how this technology can enable robust content provenance in realistic deployment scenarios. Next, I will discuss our work on training data attribution and memorization in large language models, bringing to light the ways in which web-scale pretraining presents unique and fundamental challenges in this space. Then, motivated by emerging questions at the intersection of intellectual property law and generative model training, I will re-contextualize output watermarks as general-purpose tools for partitioning the output space of a generative model into sets that are trivially distinguishable with the watermark secret, but indistinguishable without it. Finally, I will use this framing to motivate a proposal for making the problems of training data attribution and membership inference more tractable via proactive, selective watermarking by content owners and creators.
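
To make the partitioning idea concrete, here is a minimal toy sketch (not the exact scheme discussed in the talk): a secret key pseudorandomly splits the vocabulary into a "green" and "red" set at each step, so a detector holding the key can measure how often a text's tokens land in the green set, while anyone without the key sees only an ordinary-looking token distribution. All names and parameters below (VOCAB, GAMMA, green_set, green_fraction) are hypothetical, chosen only for illustration.

```python
# Toy sketch of key-based output-space partitioning for watermark detection.
# Assumptions: token ids 0..49999, half the vocabulary is "green" (GAMMA=0.5),
# and the partition is re-derived from the secret key plus the previous token.
import hashlib
import random

VOCAB = list(range(50_000))   # toy vocabulary of token ids
GAMMA = 0.5                   # fraction of the vocabulary placed in the green set

def green_set(secret_key: str, prev_token: int) -> set[int]:
    """Derive a pseudorandom green subset of the vocabulary from the secret
    key and the previous token; only a key holder can reproduce it."""
    seed = hashlib.sha256(f"{secret_key}:{prev_token}".encode()).hexdigest()
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, int(GAMMA * len(VOCAB))))

def green_fraction(secret_key: str, tokens: list[int]) -> float:
    """Detector side: with the key, count how often each token falls in the
    green set seeded by its predecessor. Text generated while favoring green
    tokens scores well above GAMMA; unwatermarked text scores near GAMMA."""
    hits = sum(
        tokens[i] in green_set(secret_key, tokens[i - 1])
        for i in range(1, len(tokens))
    )
    return hits / max(1, len(tokens) - 1)
```

In this framing, detection with the key reduces to a simple statistical test on the green-token fraction, while without the key the two halves of the output space are effectively indistinguishable from a random split.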
John Kirchenbauer is a PhD student in Tom Goldstein’s lab at the University of Maryland, College Park. He spent the first part of his PhD working on techniques for discerning whether the text you are reading or the image you are looking at was created by a human or generated by an AI system. More broadly, his research has explored robustness, reliability, safety, and scalability in deep learning, with a long-standing interest in improving our understanding of how a generative model's training data impacts its behavior.

