Deep learning based artificial intelligence (AI) systems can be unreliable, as they sometimes fail in unexpected ways. A contributing factor to the unexpected nature of these failures is the lack of transparency in how AI-based systems make decisions. Developing tools that shed light on a model's inner workings can provide a path towards better understanding when it can be trusted to work, by proactively uncovering failure modes before they cause harm. Further, once failure modes are identified, they can be mitigated in a more targeted fashion. However, the utility of interpretability methods is hindered by their lack of scalability, as they often provide qualitative insights that require human review. In this proposal, we present a number of approaches to better characterize the limitations of vision models. We start with more traditional and laborious methods, like introducing novel manually annotated benchmarks. We then shift towards more automated alternatives, such as utilizing machine perception to scalably find data of interest within an existing pool. Further, we show that our interpretability methods can immediately yield mitigation strategies, supporting the vision that transparency can facilitate reliability. Finally, we outline ways to further automate the methods we introduce, leveraging powerful frozen auxiliary models in an end-to-end pipeline to uncover, assess, and alleviate potential limitations at scale.
Mazda Moayeri is a PhD student under the direction of Professor Soheil Feizi. His work centers on advancing the interpretability and robustness of deep neural networks, towards trustworthy artificial intelligence. He has extensively studied spurious correlations in modern vision models. He is supported by the ARCS Foundation's endowment scholarship, and has previously been affiliated with Meta AI, ETH Zurich, UCLA, and NIH.