Talks

A Fixed-Point Method for Weighting Terms in Verbose Informational Queries

Wednesday, October 8, 2014, 11:00 am-12:00 pm

Abstract

The term weighting and document ranking functions used with informational queries are typically optimized for cases in which queries are short and documents are long. It is reasonable to assume that the presence of a term in a short query reflects some aspect of the topic that is important to the user, and thus rewarding documents that contain the greatest number of distinct query terms is a useful heuristic. Verbose informational queries, such as those that result from cut-and-paste of example text or from informal spoken interaction, pose a different challenge: the query may contain many extraneous (and thus potentially misleading) terms. Modest improvements have been reported from applying supervised methods to learn which terms in a verbose query deserve the greatest emphasis. This paper proposes a novel unsupervised method for weighting terms in verbose informational queries that relies instead on iteratively estimating which terms are most central to the query. The key idea is to use an initial set of retrieval results to define a recursion on the term weight vector that converges to a fixed point representing the vector that optimally describes the initial result set. Experiments with several TREC news and Web test collections indicate that the proposed method often outperforms state-of-the-art supervised methods by a statistically significant margin.
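
As a rough illustration of what a fixed-point recursion on a term weight vector can look like, here is a minimal Python sketch. It is not the method presented in the talk: the abstract does not give the update rule, so the sketch assumes a simple mutual-reinforcement update (documents are scored by the current weights, then terms are re-weighted by the scores of the documents that contain them) and normalizes after each step. The function name `fixed_point_term_weights`, the toy term counts, and the example query terms are all hypothetical.

```python
import math


def fixed_point_term_weights(doc_term, tol=1e-10, max_iter=1000):
    """Iterate a term weight vector until it stops changing.

    doc_term[i][j] is the count of query term j in the i-th document of
    the initial result set. ASSUMPTION: the update below is a generic
    mutual-reinforcement (power-iteration) rule chosen for illustration,
    not the recursion used in the talk.
    """
    n_terms = len(doc_term[0])
    # Start from uniform weights with unit Euclidean length.
    w = [1.0 / math.sqrt(n_terms)] * n_terms
    for _ in range(max_iter):
        # Score each document with the current term weights.
        scores = [sum(f * wj for f, wj in zip(row, w)) for row in doc_term]
        # Re-weight each term by the scores of the documents containing it.
        new_w = [
            sum(row[j] * s for row, s in zip(doc_term, scores))
            for j in range(n_terms)
        ]
        norm = math.sqrt(sum(x * x for x in new_w))
        if norm == 0.0:
            return w  # no query term occurs in any document
        new_w = [x / norm for x in new_w]
        # Stop once the vector is (numerically) a fixed point of the update.
        if max(abs(a - b) for a, b in zip(new_w, w)) < tol:
            return new_w
        w = new_w
    return w


# Hypothetical verbose query and toy counts in four retrieved documents.
terms = ["jaguar", "speed", "please", "tell"]
counts = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
for term, weight in zip(terms, fixed_point_term_weights(counts)):
    print(f"{term}: {weight:.4f}")
```

Under this assumed update, terms that co-occur across the highest-scoring documents keep most of the weight, while a term that appears only in an unrelated document is driven toward zero. That is the qualitative behavior the abstract describes for extraneous query terms.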

This talk is organized by Jimmy Lin