Department of Mathematics,
University of California San Diego

****************************

Math 288 - Statistics Seminar

Jianqing Fan

Princeton University

A Principle of Robustification for Big Data

Abstract:

Heavy-tailed distributions are ubiquitous in modern statistical analysis and machine learning problems. This talk gives a simple principle for robust high-dimensional statistical inference via an appropriate shrinkage on the data. This widens the scope of high-dimensional techniques, relaxing the distributional assumption from sub-exponential or sub-Gaussian tails to merely a bounded second moment. As an illustration of this principle, we focus on robust estimation of a low-rank matrix from the trace regression model. This model encompasses four popular problems: sparse linear models, compressed sensing, matrix completion, and multi-task regression. Under only a bounded $(2+\delta)$-th moment condition, the proposed robust methodology yields an estimator that attains the same statistical error rates as those in the previous literature with sub-Gaussian errors. We also illustrate the idea for the estimation of large covariance matrices. The benefits of shrinkage are also demonstrated on financial, economic, and simulated data.

Host: Jelena Bradic

January 31, 2017

2:00 PM

AP&M 6402

****************************