The availability of large multilingual pre-trained language models has opened up exciting pathways for developing NLP technologies for languages with scarce resources. In this talk I will summarize some of my group's recent work on the challenges we are still facing in the real world, such as handling unseen-during-pretraining languages, language varieties, and languages from bilingual communities, and I will show the advantage of hierarchical approaches for tackling such issues.
Antonios Anastasopoulos is an Assistant Professor in Computer Science at George Mason University. He received his PhD in Computer Science from the University of Notre Dame with a dissertation on "NLP for Endangered Languages Documentation" and then did a postdoc at Languages Technologies Institute at Carnegie Mellon University. His research is on natural language processing with a focus on low-resource settings, endangered languages, and cross-lingual learning, and is currently funded by the National Science Foundation, the National Endowment for the Humanities, the DoD, Google, Amazon, and Meta.