This doctoral thesis presents advances in generative artificial intelligence (AI) across three areas: security through watermarking, algorithm synthesis, and generative strategies for image tasks. It contributes new methods and unified frameworks in each area.
The first part addresses the growing need for AI security by introducing certified watermarking techniques that protect deep neural networks (DNNs) against intellectual property violations. These methods provide robust protection for model ownership and integrity, supporting the secure deployment of AI models.
The second part investigates algorithm synthesis, extending the computational and reasoning capabilities of neural networks. By showing that networks can solve complex problems and adapt to them, this work positions neural networks as tools for algorithmic design, beyond their traditional use in pattern recognition.
The third part focuses on generative strategies for image creation and editing. It begins with Cold Diffusion, which uses deterministic transformations to broaden how diffusion models can operate. The research then improves the generative process so that highly specific, contextually relevant images can be created with minimal retraining, making diffusion-based approaches more flexible and practical. Finally, the thesis presents a unified framework for image reasoning and generation based on next-token prediction, with a vision encoder that produces discrete, lossless image embeddings aligned with language. This lets a single transformer-based architecture support both high-precision image editing and advanced reasoning, a step toward a cohesive and versatile AI design.
By addressing security, algorithmic adaptability, and generative modeling, this thesis contributes to the development of next-generation AI systems. It lays a foundation for future work on secure, adaptable, and creative AI across a wide range of applications.
Arpit Bansal is a PhD candidate at the University of Maryland, College Park, where he is supervised by Prof. Tom Goldstein. His research is in deep learning, specifically computer vision, generative AI, and algorithmic reasoning. He previously earned a bachelor's degree in Electrical Engineering and a master's degree in Signal Processing from the Indian Institute of Technology, Kharagpur.