International Seminar on Selective Inference
A weekly online seminar on selective inference, multiple testing, and post-selection inference.
Gratefully inspired by the Online Causal Inference Seminar
Upcoming Seminar Presentations
All seminars take place Wednesdays at 8:30 am PT / 11:30 am ET / 4:30 pm London / 6:30 pm Tel Aviv. Past seminar presentations are posted here.
Wednesday, June 7, 2023 [link to join]
Speaker: Xianyang Zhang (Texas A&M University)
Title: Joint Mirror Procedure: Controlling False Discovery Rate for Identifying Simultaneous Signals
Abstract: In many applications, identifying a single feature of interest requires testing the statistical significance of several hypotheses. Examples include mediation analysis, which simultaneously examines the existence of the exposure-mediator and mediator-outcome effects, and replicability analysis, which aims to identify simultaneous signals that exhibit statistical significance across multiple independent experiments. In this work, we develop a novel procedure, named joint mirror (JM), to detect such features while controlling the false discovery rate (FDR) in finite samples. The JM procedure iteratively shrinks the rejection region based on partially revealed information until a conservative false discovery proportion (FDP) estimate is below the target FDR level. We propose an efficient algorithm to implement the method. Extensive simulations demonstrate that our procedure can control the modified FDR, a more stringent error measure than the conventional FDR, and improve power in several settings. Our method is further illustrated through real-world applications in mediation and replicability analyses.
Discussant: Lin Gui (University of Chicago)
Links: [Relevant papers: paper #1]
Wednesday, June 14, 2023 [link to join]
Speaker: Jonathan Roth (Brown University)
Title: Inference for Linear Conditional Moment Inequalities
Abstract: We show that moment inequalities in a wide variety of economic applications have a particular linear conditional structure. We use this structure to construct uniformly valid confidence sets that remain computationally tractable even in settings with nuisance parameters. We first introduce least favorable critical values which deliver non-conservative tests if all moments are binding. Next, we introduce a novel conditional inference approach which ensures a strong form of insensitivity to slack moments. Our recommended approach is a hybrid technique which combines desirable aspects of the least favorable and conditional methods. The hybrid approach performs well in simulations calibrated to Wollmann (2018), with favorable power and computational time comparisons relative to existing alternatives.
Discussant: Kevin Chen (Harvard University)
Links: [Relevant papers: paper #1]
Wednesday, June 21, 2023 [link to join]
Speaker: Aldo Solari (University of Milano-Bicocca)
Title: Simultaneous directional inference
Abstract: We consider the problem of inference on the signs of n>1 parameters. Within a simultaneous inference framework, we aim to identify as many of the signs of the individual parameters as possible, and to provide confidence bounds on the number of positive (or negative) parameters on subsets of interest. Our suggestion is as follows: start by using the data to select the direction of the hypothesis test for each parameter; then adjust the one-sided p-values for the selection, and use them for simultaneous inference on the selected n one-sided hypotheses. The adjustment is straightforward assuming that the one-sided p-values are conditionally valid and mutually independent. Such assumptions are commonly satisfied in a meta-analysis, and we can apply our approach following a test of the global null hypothesis that all parameters are zero, or of the hypothesis of no qualitative interaction. We consider the use of two multiple testing principles: closed testing and partitioning. The novel procedure based on partitioning is more powerful, but slightly less informative: it only infers on positive and non-positive signs. The procedure runs in at most polynomial time, and we show its usefulness on a subgroup analysis of a medical intervention, and on a meta-analysis of an educational intervention.
Links: [Relevant papers: paper #1]
Wednesday, June 28, 2023 [link to join]
Speaker: Minge Xie (Rutgers University)
Title: Repro Samples Method for Uncertainty Quantification of Irregular Inference Problems and for Unraveling Machine Learning Blackboxes
Abstract: Rapid data science developments require us to have new frameworks to tackle highly non-trivial irregular inference problems, e.g., those involving discrete or non-numerical parameters and those involving non-numerical data. This talk presents a novel, wide-reaching and effective simulation-inspired framework, called the repro samples method, to conduct statistical inference for irregular inference problems and beyond. We systematically develop both exact and approximate (asymptotic) theories to support the development. An attractive feature is that the method doesn't need to rely on a likelihood or the large sample central limit theorem, and thus is especially effective for complicated and irregular inference problems encountered in data science. The effectiveness of the method is illustrated by solving two open inference problems in statistics: a) constructing a confidence set for the unknown number of components in a normal mixture; b) constructing confidence sets for the unknown true model, the regression coefficients, or both the true model and the coefficients jointly in a high dimensional regression model. Comparison studies show that the method has far superior performance to existing attempts. Although the case studies pertain to traditional statistical models, the method also has direct extensions to complex machine learning models, e.g., (ensemble) tree models, neural networks, and graphical models. It is a new tool that has the potential to develop interpretable AI and unravel machine learning blackboxes.
The seminars are held on Zoom and last 60 minutes:
45 minutes of presentation
15 minutes of discussion, led by an invited discussant
Moderators collect questions using the Q&A feature during the seminar.
How to join
You can attend by clicking the link to join (there is no need to register in advance).
More instructions for attendees can be found here.
Organizers
Rina Barber (University of Chicago)
Will Fithian (UC Berkeley)
Jelle Goeman (Leiden University)
Lihua Lei (Stanford University)
Daniel Yekutieli (Tel Aviv University)
If you have feedback or suggestions or want to propose a speaker, please e-mail us at email@example.com.
What is selective inference?
Broadly construed, selective inference means searching for interesting patterns in data, usually with inferential guarantees that account for the search process. It encompasses:
Multiple testing: testing many hypotheses at once (and paying disproportionate attention to rejections)
Post-selection inference: examining the data to decide what question to ask, or what model to use, then carrying out one or more appropriate inferences
Adaptive / interactive inference: sequentially asking one question after another of the same data set, where each question is informed by the answers to preceding questions
Cheating: cherry-picking, double dipping, data snooping, data dredging, p-hacking, HARKing, and other low-down dirty rotten tricks; basically any of the above, but done wrong!
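As a rough illustration of why the search process has to be accounted for, the sketch below simulates the "cherry-picking" scenario: m independent test statistics are all pure noise, the largest one is selected, and it is then tested as if it had been chosen in advance. Everything here is an assumption made for illustration (the function name simulate, the choice of m = 20 standard normal statistics, the number of trials, and the use of the exact p-value for the maximum of m independent statistics as the adjustment); it is not taken from any of the talks listed above.

```python
import random
from statistics import NormalDist


def simulate(m=20, trials=2000, alpha=0.05, seed=0):
    """Estimate false rejection rates when only the largest of m null
    z-statistics is tested (hypothetical toy setup, all nulls true)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    naive_rejections = 0
    adjusted_rejections = 0
    for _ in range(trials):
        # Search step: look at m pure-noise statistics, keep the largest.
        z_max = max(rng.gauss(0.0, 1.0) for _ in range(m))
        # Naive one-sided p-value that ignores the selection.
        p_naive = 1.0 - std_normal.cdf(z_max)
        # P-value of the maximum of m independent N(0, 1) statistics,
        # which accounts for having searched over all m of them.
        p_adjusted = 1.0 - (1.0 - p_naive) ** m
        naive_rejections += p_naive <= alpha
        adjusted_rejections += p_adjusted <= alpha
    return naive_rejections / trials, adjusted_rejections / trials


if __name__ == "__main__":
    naive_rate, adjusted_rate = simulate()
    print(f"naive false rejection rate:    {naive_rate:.3f}")
    print(f"adjusted false rejection rate: {adjusted_rate:.3f}")
```

Under these assumptions the naive test should reject far more often than the nominal 5% level (in theory about 1 - 0.95^20, roughly 64%), while the adjusted test should stay close to 5%. The adjustment used here is only the simplest possible one; it relies on independence and on all m statistics being available.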