PhD Proposal: Computational Methods to Improve Prediction of Microbial Translation Initiation
Derrick Wood - University of Maryland, College Park
Friday, June 8, 2012, 2:00-3:00 pm
Abstract

THE PRELIMINARY ORAL EXAMINATION FOR THE DEGREE OF Ph.D. IN COMPUTER SCIENCE FOR

                           Derrick Wood

Within microbial gene finding, predicting translation initiation sites (TISs) is difficult because a single gene often has multiple possible start sites. Detection of features such as ribosomal binding sites (RBSs) has enabled gene finding programs to achieve TIS prediction accuracy of over 90%, but this figure is still considerably lower than the 99% sensitivity that many gene finders achieve on the easier problem of simply finding genes.

I have already begun work to improve TIS prediction by exploiting the sequence conservation found between the genomes of distant species, and this work has revealed a considerable number of errors in the genome annotations in our public genomic databases. These errors appear to be due to incorrect use of existing gene finding programs to find rare and interesting genomic features at the beginning of genes. I propose to extend my previous approach to find such features while demanding strong evidence for each discovery, so that these features can be predicted with high precision.

In addition, a small number of bacterial transcriptomes have been sequenced with RNA-seq technology within the past 3 years, and this number should grow considerably in the coming months and years. Through an examination of existing transcriptome data, I have found that the distances between the beginnings of transcripts and their respective genes follow a distinctly non-uniform distribution. I propose a method that incorporates transcriptome data to estimate this distribution for a given genome and uses the estimate to improve TIS prediction.

Finally, recent research indicates that a ribosomal binding site may be absent from nearly half of all genes, and that for such genes, a lack of RNA secondary structure allows translation in spite of the absence of an RBS. I propose the use of secondary structure prediction to augment existing TIS selection methods and improve TIS prediction for those genes where existing RBS-based methods are less effective.
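
As a rough illustration of the transcriptome-based idea in the abstract, the Python sketch below estimates a smoothed empirical distribution of distances between transcript starts and gene starts, then ranks candidate start sites for one gene by how probable their distance is. This is not the method from the proposal. The function names, the 300-nucleotide cutoff, the add-one smoothing, and the forward-strand-only coordinates are all assumptions made for this example.

```python
from collections import Counter
import math


def estimate_distance_distribution(distances, max_distance=300, pseudocount=1.0):
    """Estimate P(d) for d = 0..max_distance from observed distances.

    `distances` holds (gene start - transcript start) values taken from
    genes whose start site is trusted. Additive smoothing (an assumption
    of this sketch) keeps unseen distances from getting zero probability.
    """
    counts = Counter(d for d in distances if 0 <= d <= max_distance)
    total = sum(counts.values()) + pseudocount * (max_distance + 1)
    return [(counts[d] + pseudocount) / total for d in range(max_distance + 1)]


def rank_candidate_starts(transcript_start, candidate_starts, distribution):
    """Rank candidate start positions, most probable distance first.

    Coordinates are forward-strand positions (hypothetical input format).
    Candidates outside the range covered by `distribution` are dropped.
    """
    scored = []
    for pos in candidate_starts:
        d = pos - transcript_start
        if 0 <= d < len(distribution):
            scored.append((math.log(distribution[d]), pos))
    return [pos for _, pos in sorted(scored, reverse=True)]


# Made-up training distances and candidates, for illustration only.
observed = [20, 25, 25, 30, 25, 40]
dist = estimate_distance_distribution(observed)
ranked = rank_candidate_starts(1000, [1010, 1025, 1090], dist)
print(ranked[0])  # the candidate 25 nt downstream of the transcript start
```

In practice such a distance score would presumably be combined with other evidence, such as RBS signals, rather than used alone.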

 

Examining Committee:

Dr. Steven Salzberg  -  Chair

Dr. Alan Sussman  -  Dept’s Representative

Dr. Mihai Pop  -  Committee Member

EVERYBODY IS INVITED TO ATTEND THE PRESENTATION

This talk is organized by Jeff Foster