Talks

PhD Defense: Learning to Efficiently Rank

Lidan Wang - University of Maryland, College Park

3258 A.V. Williams Building (AVW)

Monday, July 30, 2012, 2:00-3:00 pm

You are subscribed to this talk through .
You are watching this talk through .
You are subscribed to this talk. (unsubscribe, watch)
You are watching this talk. (unwatch, subscribe)
You are not subscribed to this talk. (watch, subscribe)

Abstract

THE DISSERTATION DEFENSE FOR THE DEGREE OF Ph.D. IN COMPUTER SCIENCE FOR

Lidan Wang

Web search engines allow users to find information on almost any topic imaginable. To be successful, a search engine must return relevant information to the user in a short amount of time. However, efficiency (speed) and effectiveness (relevance) are competing forces that often counteract each other. It is often the case that methods developed for improving effectiveness incur moderate-to-large computational costs, thus sustained effectiveness gains typically have to be counter-balanced by buying more/faster hardware, implementing caching strategies if possible, or spending additional effort in low-level optimizations.

In this work, I will describe the "Learning to Efficiently Rank" framework for building highly effective ranking models for Web-scale data, without sacrificing run-time efficiency for returning results. I will introduce new classes of ranking models that have the capability of being simultaneously fast and effective, and I will demonstrate how to optimize the models for multiple metrics. The dissertation has the following contributions:

1) Given a desired tradeoff between effectiveness/efficiency, we introduce efficient linear models, which have a mechanism to directly optimize the tradeoff metric and achieve an optimal balance between effectiveness/efficiency.

2) We introduce "ranking under temporal constraints", which returns the most effective ranked results possible under a time constraint imposed on each query.

3) We propose a cascade ranking model for efficient top-K retrieval over Web-scale documents, such that ranking effectiveness and efficiency are simultaneously optimized by the cascade.

4) We propose a constrained version of the cascade that can return results within time constraints by simultaneously reducing document set size and unnecessary features.

5) We generalize the "Learning to Efficiently Rank" framework and discuss another important instance in the broader context of multi-objective learning to rank. We discuss how to construct robust ranking models -- in addition to performing well on average, the model should not perform very poorly for any individual query, relative to a baseline ranking model. We show how to jointly optimize model robustness and effectiveness.

Examining Committee:

Committee Chair: Dr. Jimmy Lin

Dean's Representative: Dr. Carol Espy-Wilson

Committee Members: Dr. Douglas Oard

Dr. Hal Daume III

Dr. Amol Deshpande

Dr. Donald Metzler

This talk is organized by Jeff Foster