Variable Selection is Hard
Howard Karloff - Yahoo Labs - New York
Thursday, January 22, 2015, 11:00 am-12:00 pm
Abstract
Consider the task of a machine-learning system faced with voluminous
data on m individuals.  If there are as many as p = 10^6 features
describing each individual, how can the system find a small set of
features that "best" describes the individuals?  People usually seek
small feature sets both because models with few features are
understandable and because simple models usually generalize better.
 
We study the simple case of linear regression, in which a user has
an m x p matrix B and an m-vector y, and seeks a p-vector x *with
as few nonzeroes as possible* such that Bx is approximately equal
to y; we call this problem SPARSE REGRESSION.  The statistical
literature offers numerous algorithms for SPARSE REGRESSION, such
as Forward Selection, Backward Elimination, LASSO, and Ridge
Regression.
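As a purely illustrative sketch of the SPARSE REGRESSION setup, the
Python snippet below implements Forward Selection, one of the heuristics
named above, on a small synthetic instance.  The data-generation step,
the function name, and the parameter choices are assumptions made here
for illustration; they are not taken from the talk.

```python
# Illustrative sketch of Forward Selection for sparse regression:
# given an m x p matrix B and an m-vector y, greedily pick k columns,
# each time adding the column that most reduces the least-squares residual.
import numpy as np

def forward_selection(B, y, k):
    m, p = B.shape
    support = []
    for _ in range(k):
        best_j, best_res = None, np.inf
        for j in range(p):
            if j in support:
                continue
            cols = support + [j]
            # Least-squares fit restricted to the candidate support.
            coef, *_ = np.linalg.lstsq(B[:, cols], y, rcond=None)
            res = np.linalg.norm(B[:, cols] @ coef - y)
            if res < best_res:
                best_j, best_res = j, res
        support.append(best_j)
    # Refit on the final support and embed the coefficients in a p-vector.
    x = np.zeros(p)
    coef, *_ = np.linalg.lstsq(B[:, support], y, rcond=None)
    x[support] = coef
    return x

# Toy instance (an assumption for illustration): y comes from 3 of the
# p columns plus a little noise.
rng = np.random.default_rng(0)
m, p, k = 100, 50, 3
B = rng.standard_normal((m, p))
x_true = np.zeros(p)
x_true[[4, 17, 31]] = [2.0, -1.5, 3.0]
y = B @ x_true + 0.01 * rng.standard_normal(m)

x_hat = forward_selection(B, y, k)
print("recovered support:", np.nonzero(x_hat)[0])
print("residual:", np.linalg.norm(B @ x_hat - y))
```

Each greedy round refits least squares on every candidate column, so
this naive version performs on the order of k*p least-squares solves;
it is meant only to make the problem concrete, not to suggest a method
with worst-case guarantees.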
 
We give a general hardness proof showing that (subject to a
complexity assumption) no polynomial-time algorithm can guarantee
good worst-case performance for SPARSE REGRESSION, even if it is
allowed to include more variables than necessary, and even if it
need only find an x such that Bx is relatively far from y.
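To make the two relaxations in the preceding paragraph concrete, here
is a schematic formalization; the factors C_1 and C_2 are left symbolic
because the abstract does not specify them.

```latex
% OPT_k: the best achievable residual using at most k nonzero coefficients.
\[
  \mathrm{OPT}_k \;=\; \min_{\|x\|_0 \le k} \; \|Bx - y\|_2 .
\]
% A bicriteria algorithm with (symbolic) factors C_1, C_2 would output
% some \hat{x} satisfying
\[
  \|\hat{x}\|_0 \le C_1\, k
  \qquad \text{and} \qquad
  \|B\hat{x} - y\|_2 \le C_2 \cdot \mathrm{OPT}_k .
\]
% The hardness result says that, subject to the complexity assumption,
% no polynomial-time algorithm achieves such a guarantee in the worst case.
```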
 
 
This is joint work with Dean Foster and Justin Thaler of Yahoo Labs.
This talk is organized by Manish Purohit.