Panning for gold: Interpretable and error-controlled hypothesis generation from biomedical data
Yang Lu - Cheriton School of Computer Science, University of Waterloo
IRB 4105 or https://umd.zoom.us/j/95853135696?pwd=VVEwMVpxeElXeEw0ckVlSWNOMVhXdz09
Tuesday, February 20, 2024, 11:00 am-12:00 pm
Abstract

Rapid developments in high-throughput sequencing have enabled biologists to collect large volumes of multi-omics data with unprecedented resolution. However, interpreting this growing volume of heterogeneous biological data is highly nontrivial. In my talk, I will present a data-driven research paradigm for discovering testable hypotheses directly from biological data in an interpretable and error-controlled fashion. In particular, the talk will focus on three recent works that span critical components of biomedical research (data collection, hypothesis generation, and hypothesis prioritization):

 

(1) An interpretation method that generates testable biological hypotheses from deep learning models. Specifically, I developed an uncertainty-aware method to identify, from single-cell RNA-seq data, a combinatorial gene set signature that characterizes a cell type. This method pioneers efforts to streamline existing single-cell analysis pipelines through a unified framework for easy interpretation.

 

(2) A statistical method that subjects the hypotheses generated from deep learning models to error control, without relying on p-values. This method demonstrated to the community for the first time that the interpretation of deep learning models could achieve confidence guarantees.

 

(3) A critical reevaluation of the problematic statistical significance estimation in the Basic Local Alignment Search Tool (BLAST), a cornerstone tool used in daily biomedical analysis over the past 30 years. We have introduced an alternative method that addresses this issue and does not yield inflated estimates of significance. Our study has the potential to influence and reshape numerous conclusions drawn by researchers.

Bio

Yang Lu is an assistant professor at the Cheriton School of Computer Science, University of Waterloo. Prior to that, he was a postdoctoral researcher in Prof. William Noble's group at the University of Washington. He obtained his Ph.D. in Computational Biology and Bioinformatics under the supervision of Prof. Fengzhu Sun at the University of Southern California.

 

Before moving to the United States, he received M.S. and B.S. degrees in Computer Science and Engineering from Shanghai Jiao Tong University. Yang Lu's research focuses on developing machine learning and statistical methods for genomics and proteomics data analysis. He is particularly interested in developing interpretation methods to find scientifically interesting and statistically confident hypotheses from complex biological data.

This talk is organized by Samuel Malede Zewdu