Fighting the Global Social Media Infodemic: from Fake News to Harmful Content
Wednesday, February 15, 2023, 11:00 am-12:00 pm

Given the recent proliferation of disinformation online, there has been growing research interest in automatically debunking rumors, false claims, and "fake news". Initially, the focus was on automating the entire process, and little attention was paid to what human fact-checkers actually need, namely (i) to spot interesting check-worthy claims, (ii) to check whether a claim has been previously fact-checked, and (iii) to obtain relevant material to help them make a decision. We will discuss our efforts in these still under-explored directions, which have also been the focus of the CLEF CheckThat! lab for several years.


Even armed with automatic tools to support their work, human fact-checkers cannot cope with the scale of disinformation online, which is hard to manage even using fully automatic systems (in part because it can easily be machine-generated at scale). We thus argue for the need to shift to a higher level of granularity: from analyzing claims to profiling entire news outlets, which makes it possible to fact-check the news before it has even been written, by checking the trustworthiness of the source where it will eventually be published; indeed, this is what journalists do. We will show how we automated this analysis by looking at a variety of information sources. We will further discuss some results aiming at an even higher level of granularity: analyzing entire coordinated communities.


Next, we point to the gradual shift in terminology and focus: from "fake news" (focus on factuality), to "disinformation" (factuality + malicious intent), and finally to "infodemic" (focus on harm). Subsequently, there has been a gradual shift in research attention from factuality to trying to understand intent. One direction we proposed in this respect was to detect the use of specific propaganda techniques in text, e.g., appeals to emotions, fear, or prejudices, logical fallacies, etc. Another direction has been to understand framing, e.g., COVID-19 can be discussed from a health, an economic, a political, or a legal perspective, among others. Yet another direction has been to better understand the type of text, e.g., objective news reporting vs. opinion piece vs. satire. All of these are featured in our ongoing SemEval-2023 Task 3, which further promotes multilinguality, covering English, French, Georgian, German, Greek, Italian, Polish, Russian, and Spanish.


Yet another important aspect is multimodality, as Internet memes are much more influential than simple text. We will discuss our work on analyzing memes in terms of propaganda (SemEval-2021 Task 6), harmfulness, identification of the target of harm, role-labeling in terms of who is portrayed as a hero/villain/victim (CONSTRAINT'2022 shared task), and generating natural-language explanations for the latter.


Preslav Nakov is Professor at Mohamed bin Zayed University of Artificial Intelligence. Previously, he was Principal Scientist at the Qatar Computing Research Institute, HBKU, where he led the Tanbih mega-project, developed in collaboration with MIT, which aims to limit the impact of "fake news", propaganda and media bias by making users aware of what they are reading, thus promoting media literacy and critical thinking. He received his PhD degree in Computer Science from the University of California at Berkeley, supported by a Fulbright grant. He is Chair-Elect of the Association for Computational Linguistics (ACL), Secretary of ACL SIGSLAV, and Secretary of the Truth and Trust Online board of trustees. Formerly, he was PC chair of ACL 2022 and President of ACL SIGLEX. He is also a member of the editorial board of several journals, including Computational Linguistics, TACL, ACM TOIS, IEEE TASL, IEEE TAC, CS&L, NLE, AI Communications, and Frontiers in AI. He authored a Morgan & Claypool book on Semantic Relations between Nominals, two books on computer algorithms, and 250+ research papers. He received a Best Paper Award at ACM WebSci'2022, a Best Long Paper Award at CIKM'2020, a Best Demo Paper Award (Honorable Mention) at ACL'2020, a Best Task Paper Award (Honorable Mention) at SemEval'2020, a Best Poster Award at SocInfo'2019, and the Young Researcher Award at RANLP'2011. He was also the first to receive the Bulgarian President's John Atanasoff award, named after the inventor of the first automatic electronic digital computer. His research has been featured by over 100 news outlets, including Forbes, Boston Globe, Aljazeera, DefenseOne, Business Insider, MIT Technology Review, Science Daily, Popular Science, Fast Company, The Register, WIRED, and Engadget, among others.

This talk is organized by Rachel Rudinger