Researchers have been studying whether AI systems reflect stakeholders' values and developing strategies for alignment when systems do not. In this talk, I will illustrate how two existing approaches to AI alignment, participation and evaluation (of and by relevant stakeholders), fall short of achieving their purported goals. First, I will discuss how 1) participation is poorly understood and operationalized by AI researchers and practitioners, and 2) existing participatory mechanisms are insufficient to guarantee alignment. Next, I will show how red-teaming, a general evaluation approach proposed to analyze AI alignment, is an ill-defined process with highly variable inputs and outputs. Lastly, I will conclude by previewing my ongoing and future research agenda, including an empirical study of how red-teaming design choices (e.g., instructions for human versus automated evaluation approaches) affect evaluation outcomes, with the aim of developing more robust AI evaluation methods that empower stakeholders.
Michael Feffer is a fifth-year Societal Computing PhD student at Carnegie Mellon University (CMU). He is broadly interested in studying interactions between AI and society to understand how to leverage the strengths of AI while avoiding its negative outcomes and impacts. He aims to develop frameworks through which everyday people affected by ML models can influence model development. His primary areas of research include algorithmic fairness, participatory ML, and generative AI evaluation. He is a recipient of a GEM Fellowship and an ARCS Scholarship.

