# International Seminar on Selective Inference

A weekly online seminar on selective inference, multiple testing, and post-selection inference.

Gratefully inspired by the Online Causal Inference Seminar.

## Upcoming Seminar Presentations

All seminars take place **Wednesdays at 8:30 am PT / 11:30 am ET / 4:30 pm London / 6:30 pm Tel Aviv.** Past seminar presentations are posted here.

**Wednesday, February 8, 2023** [link to join]

**Speaker:** Asher Spector (Stanford University)

**Title:** Asymptotically Optimal Knockoff Statistics via the Masked Likelihood Ratio

**Abstract:** This paper introduces a class of asymptotically most powerful knockoff statistics based on a simple principle: that we should prioritize variables in order of our ability to distinguish them from their knockoffs. Our contribution is threefold. First, we argue that feature statistics should estimate "oracle masked likelihood ratios," which are Neyman-Pearson statistics for discriminating between features and knockoffs using partially observed (masked) data. Second, we introduce the masked likelihood ratio (MLR) statistic, a knockoff statistic that estimates the oracle MLR. We show that MLR statistics are asymptotically average-case optimal, i.e., they maximize the expected number of discoveries made by knockoffs when averaging over a user-specified prior on unknown parameters. Our optimality result places no explicit restrictions on the problem dimensions or the unknown relationship between the response and covariates; instead, we assume a "local dependence" condition which depends only on simple quantities that can be calculated from the data. Third, in simulations and three real data applications, we show that MLR statistics outperform state-of-the-art feature statistics, including in settings where the prior is highly misspecified. We implement MLR statistics in the open-source Python package knockpy; our implementation is often (although not always) faster than computing a cross-validated lasso.

**Discussant:** Xin Xing (Virginia Tech)

**Links:** [Relevant papers: paper #1]

**Wednesday, February 15, 2023** [link to join]

**Speaker:** Jesse Hemerik (Wageningen University)

**Title:** Flexible estimation and control of the false discovery proportion

**Abstract:** When we choose a multiple testing method, there are always tradeoffs between type I error control, power and flexibility. This is particularly true for multiple testing methods that estimate or control the proportion of false discoveries (FDP). At the beginning of this talk, an overview of such methods will be given. We then introduce a multiple testing procedure that controls the median of the FDP in a flexible way. The procedure only requires a vector of p-values as input and is comparable to the Benjamini-Hochberg method, which controls the mean of the FDP. Benjamini-Hochberg requires choosing the target FDP, alpha, before looking at the data, but our method does not. Our procedure is inspired by a popular estimator of the total number of true hypotheses. We adapt this estimator to provide simultaneously median unbiased estimators of the FDP. This simultaneity allows for the claimed flexibility.

**Discussant:** Pallavi Basu (Indian School of Business)

**Wednesday, February 22, 2023** [link to join]

**Speaker:** Yuhao Wang (Tsinghua University)

**Title:** Residual Permutation Test for High-Dimensional Regression Coefficient Testing

**Abstract:** We consider the problem of testing whether a single coefficient is equal to zero in high-dimensional fixed-design linear models. In the high-dimensional setting where the dimension of covariates $p$ is allowed to be in the same order of magnitude as sample size $n$, to achieve finite-population validity, existing methods usually require strong distributional assumptions on the noise vector (such as Gaussian or rotationally invariant), which limits their applications in practice. In this paper, we propose a new method, called *residual permutation test* (RPT), which is constructed by projecting the regression residuals onto the space orthogonal to the union of the column spaces of the original and permuted design matrices. RPT can be proved to achieve finite-population size validity under fixed design with just exchangeable noises, whenever $p < n/2$. Moreover, RPT is shown to be asymptotically powerful for heavy tailed noises with bounded $(1+t)$-th order moment when the true coefficient is at least of order $n^{-t/(1+t)}$ for $t \in [0,1]$. We further proved that this signal size requirement is essentially optimal in the minimax sense. Numerical studies confirm that RPT performs well in a wide range of simulation settings with normal and heavy-tailed noise distributions.

**Discussant:** Panos Toulis (University of Chicago)

**Links:** [Relevant papers: paper #1]

**Wednesday, March 1, 2023**

**Speaker:** Eugene Katsevich (University of Pennsylvania)

**Title:** Reconciling model-X and doubly robust approaches to conditional independence testing

**Abstract:** Model-X approaches to testing conditional independence between a predictor and an outcome variable given a vector of covariates usually assume exact knowledge of the conditional distribution of the predictor given the covariates. Nevertheless, model-X methodologies are often deployed with this conditional distribution learned in sample. We investigate the consequences of this choice through the lens of the distilled conditional randomization test (dCRT). We find that Type-I error control is still possible, but only if the mean of the outcome variable given the covariates is estimated well enough. This demonstrates that the dCRT is doubly robust, and motivates a comparison to the generalized covariance measure (GCM) test, another doubly robust conditional independence test. We prove that these two tests are asymptotically equivalent, and show that the GCM test is in fact optimal against (generalized) partially linear alternatives by leveraging semiparametric efficiency theory. In an extensive simulation study, we compare the dCRT to the GCM test. We find that the GCM test and the dCRT are quite similar in terms of both Type-I error and power, and that post-lasso based test statistics (as compared to lasso based statistics) can dramatically improve Type-I error control for both methods.

**Discussant:** Shuangning Li (Harvard University)

**Links:** [Relevant papers: paper #1]

**Wednesday, March 8, 2023** [link to join]

**Speaker:** Werner Brannath (University of Bremen)

**Title:**

**Wednesday, March 15, 2023** [link to join]

**Speaker:** Anastasios Angelopoulos (UC Berkeley)

**Title:**

**Wednesday, March 29, 2023** [link to join]

**Speaker:** Yaniv Romano (Technion – Israel Institute of Technology)

**Title:**

## Format

The seminars are held on Zoom and last 60 minutes:

- 45 minutes of presentation
- 15 minutes of discussion, led by an invited discussant

Moderators collect questions using the Q&A feature during the seminar.

## How to join

You can attend by clicking the link to join (there is no need to register in advance).

More instructions for attendees can be found here.

## Organizers

Rina Barber (University of Chicago)

Will Fithian (UC Berkeley)

Jelle Goeman (Leiden University)

Lihua Lei (Stanford University)

Daniel Yekutieli (Tel Aviv University)

## Contact us

If you have feedback or suggestions or want to propose a speaker, please e-mail us at selectiveinferenceseminar@gmail.com.

## What is selective inference?

Broadly construed, *selective inference* means searching for interesting patterns in data, usually with inferential guarantees that account for the search process. It encompasses:

- **Multiple testing:** testing many hypotheses at once (and paying disproportionate attention to rejections)
- **Post-selection inference:** examining the data to decide what question to ask, or what model to use, then carrying out one or more appropriate inferences
- **Adaptive / interactive inference:** sequentially asking one question after another of the same data set, where each question is informed by the answers to preceding questions
- **Cheating:** cherry-picking, double dipping, data snooping, data dredging, p-hacking, HARKing, and other low-down dirty rotten tricks; basically any of the above, but done wrong!