Algorithms for Ultra-Large Alignment and Phylogeny Estimation

Professor Tandy Warnow - The University of Texas at Austin

Thursday, September 26, 2013, 11:00 am-12:00 pm

Abstract

Key Words:
phylogeny estimation, multiple sequence alignment, metagenomics, large
dataset analysis, phylogenetic placement, Hidden Markov Models

Abstract:
The estimation of the "Tree of Life" from biomolecular sequence data presents many interesting computational, mathematical, and statistical challenges. In particular, the most accurate methods have depended on having an accurate multiple sequence alignment, and the estimation of an accurate multiple sequence alignment is itself a challenging problem.

In this talk, I will present several new methods that combine machine learning, chordal graph theory, and local search strategies to obtain improved large-scale multiple sequence alignment and phylogeny estimation. I will begin with "fast-converging methods", which are algorithms that are guaranteed to reconstruct the true tree from polynomial-length sequences, under the standard assumption that the sequences are generated by a Markov model down an unknown model tree. I will then present DACTAL, a method for estimating trees without a multiple sequence alignment, and methods for ultra-large alignment estimation (SATe, PASTA (unpublished), and UPP (unpublished)). Finally, if time permits, I will present results for TIPP (unpublished), a method for taxon identification of short metagenomic reads.

I will also present results on simulated datasets with up to 1,000,000 sequences and on biological datasets with up to 100,000 sequences.

This talk is organized by Adelaide Findlay