Department of Mathematics,
University of California San Diego

****************************

Statistics

J. Romano

Stanford University

Three lectures on multiple hypothesis testing: Lecture II

Abstract:

The demand for new methodology for the simultaneous testing of many hypotheses is driven by numerous modern applications in genomics, imaging, astronomy, finance, etc. A key feature of these problems is their high dimensionality, meaning that thousands of hypotheses may be tested simultaneously. Controlling measures of error is particularly important in order to counter the effects of ``data snooping'' (or ``data mining''). The goal of the lectures is to develop some classes of techniques that deal effectively with the problem of multiplicity.

A classical approach to dealing with multiplicity is to require decision rules that control the familywise error rate (FWER). We will develop some stepwise procedures that offer either finite sample or asymptotic control. It will be shown that the problem of controlling the FWER for stepdown tests can be reduced to the problem of controlling the Type 1 error of single tests; as such, the ``subset pivotality'' condition often assumed in the literature is unnecessary. The use of resampling methods can provide improved ability to detect false hypotheses by implicitly estimating the dependence structure of the test statistics. Some optimality results will be presented, as well as some applications to set estimation problems in underidentified econometric models.

In many applications, particularly if the number of hypotheses is large, one might be willing to tolerate more than one false rejection if the number of such cases is controlled. Therefore, we will replace control of the FWER by control of the $k$-FWER, the probability of $k$ or more false rejections. We will also consider the false discovery proportion (FDP), defined as the number of false rejections divided by the total number of rejections. The false discovery rate criterion of Benjamini and Hochberg (1995) requires control of $E(FDP)$. We will also discuss suitable methods to control the tail probability $P \{ FDP > \gamma \}$ for any $\gamma$. Resampling, subsampling, and permutation methods can yield valid procedures.
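
For reference, the error measures named in the abstract can be written compactly. This is only a sketch in notation assumed here, not taken from the lectures: $V$ denotes the number of false rejections, $R$ the total number of rejections, and $\alpha$ a nominal level. The convention that the FDP equals $0$ when there are no rejections is likewise an assumption.

\[
\mathrm{FWER} = P \{ V \ge 1 \}, \qquad
k\mbox{-}\mathrm{FWER} = P \{ V \ge k \}, \qquad
FDP = \frac{V}{\max (R, 1)}, \qquad
\mathrm{FDR} = E(FDP).
\]

In this notation the FWER is the $k$-FWER with $k = 1$, and control of the tail probability at level $\alpha$ means $P \{ FDP > \gamma \} \le \alpha$.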

Host: Dimitris Politis

May 4, 2005

4:00 PM

AP&M 6438

****************************