Understanding and Generating Human Language
Wei Xu
IRB 4105
Monday, February 24, 2020, 11:00 am-12:00 pm

Human language is notoriously complex due to the multitude of ways people can express the same meaning. In this talk, I will present two lines of work on machine learning methods to understand the varied expressions in human language and to generate paraphrases for applications such as reading and writing assistive technology. In the first part, I will showcase how to design learning and ranking models for natural language generation, including a new metric that has been widely adopted as a learning objective and evaluation method. In the second part, I will present new datasets and a class of pairwise models for learning textual expressions that convey the same meaning. In contrast to previous work, we focus on extracting paraphrases on a much larger scale and with a much broader range by developing more robust models and by leveraging social media data and crowdsourcing. I will also briefly discuss the connections of my work to computational social science, language and code, and human language instructions.


Wei Xu has been an assistant professor in the Department of Computer Science and Engineering at Ohio State University since 2016. Her research interests are in natural language processing and social media. Her recent work focuses on natural language generation for educational applications and semantic similarity models for language understanding. She has also worked on crowdsourcing, summarization, and information extraction for user-generated data, such as Twitter and StackOverflow. She received her Ph.D. in Computer Science from New York University and was a postdoctoral researcher at the University of Pennsylvania. She has received an NSF CRII Award, a Best Paper Award at COLING 2018, a CrowdFlower AI for Everyone Award, and a Criteo Faculty Research Award, as well as research funds from DARPA and IARPA. She is currently a senior area chair for ACL 2020 and has served as an area chair, workshop chair, and publicity chair for the ACL, EMNLP, and NAACL conferences, and as a co-organizer of the annual Workshop on Noisy User-generated Text.

This talk is organized by Richa Mathur.