Talks

Toward an End-to-end Anomaly Discovery System

Lei Cao

Virtual: https://umd.zoom.us/j/504805391

Friday, March 27, 2020, 11:00 am-12:00 pm

Abstract

Anomaly detection is critical in enterprises, with applications including detecting financial fraud, defending against network intrusions, and detecting imminent device failures. Although previous research has proposed a variety of stand-alone methods for detecting particular types of anomalies, there is no end-to-end solution that lets data scientists effectively discover anomalies over large volumes of varied data. To build such a system, several critical challenges must be solved: How can we determine which of the many alternative anomaly detection algorithms is best for a given task, and find the proper parameter settings? How can a small amount of end-user feedback be leveraged to improve the anomaly extraction process? How can anomaly detection results best be presented so that users do not have to evaluate a potentially large number of anomaly candidates one by one?

This talk will present our solution, called ADS, which addresses all of the above problems. ADS supports all stages of anomaly discovery by seamlessly integrating anomaly-related services within a single platform. It provides tuning-free anomaly detection, anomaly summarization and explanation services, and the ability to integrate user feedback into the discovery process.

Bio

Lei Cao is a Postdoc Associate at the Computer Science and Artificial Intelligence Laboratory of MIT. Before that, he worked at the IBM T.J. Watson Research Center as a Research Staff Member. He received his Ph.D. in Computer Science from Worcester Polytechnic Institute. He has conducted research in the broad areas of data science and systems, ranging from low-level core database performance optimization to the design of high-level, application-specific machine learning techniques. His recent research falls in the emerging area of "Systems for AI and AI for Systems", focusing on designing scalable algorithms and systems that let data scientists effectively and efficiently explore and discover knowledge, especially anomalies, from heterogeneous data sources.

This talk is organized by Richa Mathur