Talks

PhD Defense: Analyzing Communicative Choices To Understand Their Motivations, Context-Based Variation, And Social Consequences

Pranav Goel

Room 3137, Brendan Iribe Center for Computer Science and Engineering (IRB); Zoom: https://umd.zoom.us/j/9845968708

Monday, June 12, 2023, 3:00-5:00 pm

Abstract

In many settings, communicating in a language requires making choices among different possibilities – the issues to focus on, the aspects to highlight within any issue, the narratives to include, and more. These choices, deliberate or not, are socially structured. The ever-increasing availability of unstructured large-scale textual data, in part due to the bulk of communication and information dissemination happening in online or digital spaces, makes natural language processing (NLP) techniques a natural fit for understanding socially-situated communicative choices from that textual data. Unsupervised NLP methods in particular are often needed, since large-scale digital text in the wild usually comes without accompanying labels, and any existing labels or categorization might not be appropriate for answering specific research questions.

This dissertation seeks to address the following question: how can we use unsupervised NLP methods to study texts authored by specific people or institutions in order to effectively explicate the communicative choices being made, as well as to investigate their potential motivations, context-based variation, and consequences?

Our first set of contributions centers on methodological innovation. We focus on topic modeling – a class of generally unsupervised NLP methods that can automatically discover authors’ communicative choices in the form of topics or categorical themes present in a collection of documents. We introduce a new neural topic model (NTM) that effectively incorporates contextualizing sequential knowledge. Next, we find critical gaps in the near-universal automated evaluation paradigm that compares different models in the topic modeling methods literature, which calls into question much of the recent work in NTM development claiming “state-of-the-art” and emphasizes the importance of validating the outputs of unsupervised NLP methods.

In order to use unsupervised NLP methods to investigate potential motivations, context-based variation, and consequences of communicative choices, we link textual data with information about the authors, social contexts, and media involved in their production. These connected information sources help us conduct empirical research in the social sciences.

In our second set of contributions, we analyze a previously unexplored connection between a politician’s donors and their communicative choices in their floor speeches to show how donations influence issue-attention in the US Congress, enabling a new look at money in politics and providing an example of studying motivations behind communicative choices.

Our third set of contributions uses text-based ideal points to better understand the role of institutional constraints and audience considerations in the varying expression and ideological positioning of politicians. The application of this tool for expanding knowledge of legislative politics is enabled by comprehensive annotations for modeling outputs provided by domain experts in order to establish the tool’s validity and reliability.

In our fourth set of contributions, we demonstrate the potential of both unsupervised NLP techniques and social network data and methods in better understanding the downstream consequences of communicative choices. We focus on misinformation narratives in mainstream media, viewing and highlighting misinformation as something beyond just false claims published by certain bad actors or stories published by certain ‘fake news’ outlets. Our findings suggest a strategic repurposing of mainstream news by conveyors of misinformation as a way to enhance the reach and persuasiveness of misleading narratives.

Examining Committee

Chair:	Dr. Philip Resnik
Dean's Representative:	Dr. Kristina Miler
Members:	Dr. Jordan Boyd-Graber
	Dr. John Dickerson
	Dr. Naeemul Hassan
	Dr. David Lazer (Northeastern University)

Bio

Pranav Goel is a 5th-year PhD candidate in Computer Science at the University of Maryland, College Park. He is advised by Prof. Philip Resnik as part of the Computational Linguistics and Information Processing (CLIP) Lab, often working with collaborators (especially political and social scientists) from other departments, labs at other universities, and research groups outside of academia. His research lies at the intersection of Natural Language Processing and Computational Social Science, including analysis of language use in sociopolitical contexts focusing on agenda-setting and framing, enabling better computer-assisted content analysis, and evaluating NLP tools and methods in ways that concord with real-world usage. Recent application areas of his work include US congressional rhetoric and political discussions on social media, online misinformation, and possible tools and implications for journalism.

This talk is organized by Tom Hurst