Department of Mathematics,
University of California San Diego

****************************

Math 288 - Statistics Seminar

Ziwei Zhu

University of Michigan and University of Cambridge

Estimation of principal eigenspaces with missing values

Abstract:

In this talk, I will focus on Principal Component Analysis (PCA) in the presence of missing data. Under a homogeneous and independent missingness mechanism, we showed that the leading eigenspaces of a Hadamard-reweighted sample covariance matrix achieve the (nearly) minimax optimal rate with a phase transition. If the true leading eigenspaces satisfy an incoherence assumption, we can accommodate much more flexible missingness mechanisms: we derived the statistical rate of this reweighted-covariance-based estimator under an arbitrary deterministic observation regime. We then proposed using this estimator to initialize a tuning-free iterative algorithm called primePCA to further enhance the statistical accuracy. We showed that in the noiseless setting, primePCA achieves exact recovery of the true leading eigenspaces with geometric convergence, provided that the initializer is close to the truth. A simulation study shows that primePCA performs comparably to softImpute with oracle tuning across a wide range of heterogeneity levels of observation probabilities and signal-to-noise ratios.
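
As a rough illustration of the reweighting idea mentioned in the abstract, the sketch below builds an inverse-probability-weighted covariance from zero-filled data and extracts its top-k eigenspace. It is not the speaker's implementation and it does not implement primePCA. It assumes mean-zero data, entries observed independently with one common probability (estimated here by the overall observed fraction), and the function names `reweighted_covariance`, `leading_eigenspace`, and `sin_theta_loss` are hypothetical.

```python
import numpy as np


def reweighted_covariance(X, mask):
    """Hadamard-reweighted sample covariance (assumes mean-zero rows of X).

    X    : (n, d) data array; values at unobserved positions are ignored.
    mask : (n, d) boolean array, True where the entry was observed.
    """
    n, d = X.shape
    p_hat = mask.mean()                    # estimated common observation probability
    Y = np.where(mask, X, 0.0)             # zero-fill the missing entries
    G = Y.T @ Y / n                        # sample second moment of zero-filled data
    # E[G_jk] is p^2 * Sigma_jk off the diagonal and p * Sigma_jj on it,
    # so rescale entrywise (a Hadamard product with the weight matrix W).
    W = np.full((d, d), 1.0 / p_hat**2)
    np.fill_diagonal(W, 1.0 / p_hat)
    return W * G


def leading_eigenspace(S, k):
    """Orthonormal basis of the top-k eigenspace of a symmetric matrix S."""
    _, vecs = np.linalg.eigh(S)            # eigenvalues come back in ascending order
    return vecs[:, ::-1][:, :k]


def sin_theta_loss(U, V):
    """Frobenius sin-theta distance between the column spaces of U and V."""
    k = U.shape[1]
    return np.sqrt(max(k - np.linalg.norm(U.T @ V, "fro") ** 2, 0.0))


# Synthetic example (all sizes and parameters below are illustrative assumptions).
rng = np.random.default_rng(0)
n, d, k, p = 5000, 20, 2, 0.5
V_true, _ = np.linalg.qr(rng.standard_normal((d, k)))
X = 5.0 * rng.standard_normal((n, k)) @ V_true.T + rng.standard_normal((n, d))
mask = rng.random((n, d)) < p

S_hat = reweighted_covariance(X, mask)
V_hat = leading_eigenspace(S_hat, k)
loss = sin_theta_loss(V_hat, V_true)
```

Here `loss` measures how far the estimated eigenspace is from the one used to generate the data. In this sketch the estimate would only serve as a starting point, in the spirit of the initializer described in the abstract.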

Host: Wenxin Zhou

April 24, 2019

4:00 PM

AP&M 5402

****************************