Mining Reliable Information from Crowdsourced Data
With the proliferation of mobile devices and social media platforms, anyone can publicize observations about any activity, event or object, anywhere and at any time. Together, these enormous volumes of crowdsourced data can support an inexpensive, sustainable and large-scale decision support system that was never possible before. The main obstacle to building such a system is information veracity: it is hard to distinguish true or accurate information from false or inaccurate information. In this talk, I will present our efforts towards solving the information veracity challenge when crowdsourced data are ubiquitous but their reliability is suspect. When no supervision is available, we model the task as an optimization problem that jointly searches for source reliability and true facts. I will show how our proposed models handle different kinds of data, including data with long-tail distributions, data of heterogeneous types, spatial-temporal data, and streaming and distributed data, and how they support a wide range of applications, including crowdsourced question answering, knowledge base construction and environmental monitoring. When a small set of annotated samples (training data) exists, we model the reliability assessment problem as a binary classification task. Using fake news detection on social media as an example, I will introduce our proposed framework, which incorporates adversarial learning and reinforcement learning to extract event-invariant features and leverage user feedback for improved detection performance. At the end of the talk, I will briefly introduce my other work on integrating complementary views for improved inference in the healthcare and transportation domains.
Jing Gao is an Associate Professor in the Department of Computer Science and Engineering at the University at Buffalo (UB), State University of New York. She received her PhD from the Department of Computer Science at the University of Illinois at Urbana-Champaign in 2011, and subsequently joined UB in 2012. She is broadly interested in data and information analysis with a focus on data mining, information integration, crowdsourcing, social media analysis, misinformation detection, knowledge graphs, anomaly detection, transfer learning and data stream mining. She has published more than 150 papers in refereed journals and conferences. She is an editor of ACM Transactions on Intelligent Systems and Technology, and serves on the senior program committees of the ACM KDD and CIKM conferences. She is a recipient of an NSF CAREER Award and an IBM Faculty Award.