International Seminar on Selective Inference

A weekly online seminar on selective inference, multiple testing, and post-selection inference.

Gratefully inspired by the Online Causal Inference Seminar

Mailing List

For announcements and Zoom invitations, please subscribe to our mailing list.

Upcoming Seminar Presentations

All seminars take place Tuesdays at 8:30 am PT / 11:30 am ET / 4:30 pm London / 6:30 pm Tel Aviv. Past seminar presentations are posted here.

Tuesday, April 1, 2025 [Link]
- Speaker: Sida Li (University of Chicago)
- Title: Prediction-Powered Adaptive Shrinkage Estimation
- Abstract: Prediction-Powered Inference (PPI) is a powerful framework for enhancing statistical estimates by combining limited gold-standard data with machine learning (ML) predictions. While prior work has demonstrated PPI's benefits for individual statistical tasks, modern applications require answering numerous parallel statistical questions. We introduce Prediction-Powered Adaptive Shrinkage (PAS), a method that bridges PPI with empirical Bayes shrinkage to improve the estimation of multiple means. PAS debiases noisy ML predictions within each task and then borrows strength across tasks by using those same predictions as a reference point for shrinkage. The amount of shrinkage is determined by minimizing an unbiased estimate of risk, and we prove that this tuning strategy is asymptotically optimal. Experiments on both synthetic and real-world datasets show that PAS adapts to the reliability of the ML predictions and outperforms traditional and modern baselines in large-scale applications.
- Discussant: Dan Kluger (MIT)
- Links: [Relevant papers: paper #1]

Tuesday, April 8, 2025 [Link]
- Speaker: William Hartog (Stanford University)
- Title: Family-wise Error Rate Control with E-values
- Abstract: The closure principle is a standard tool for achieving family-wise error rate (FWER) control in multiple testing problems. In general, the computational cost for closed testing can be exponential in the number of hypotheses. The celebrated graphical approach to FWER control [Bretz et al., 2009] overcomes the computational hurdle by using weighted Bonferroni local tests on p-values with appropriately chosen weights. In this study, we extend the graphical approach to e-values. With valid e-values – common in settings of sequential hypothesis testing or universal inference for irregular parametric models – we can derive strictly more powerful local tests based on weighted averages of e-values. Consequently, this e-value-based closed test is more powerful than the corresponding graphical approach with inverse e-values as p-values. Although the computational shortcuts for the p-value-based graphical approach are not applicable, we develop efficient polynomial-time algorithms using dynamic programming for e-value-based graphical approaches with any directed acyclic graph. For special graphs, such as those used in Holm’s procedure and the fallback procedure, we develop tailored algorithms with computational cost linear in the number of hypotheses, up to logarithmic factors.
- Discussant: Ruodu Wang (University of Waterloo)
- Links: [Relevant papers: paper #1]
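
The comparison of local tests described in the abstract above can be illustrated with a minimal sketch. It is not the speaker's algorithm (no graph, no closed testing, no dynamic programming); it only contrasts the two local tests for a single intersection hypothesis. The function names are hypothetical, and the weights are assumed to be nonnegative and to sum to at most 1. Under that assumption the weighted average of e-values is itself an e-value, so rejecting when it reaches 1/alpha is valid by Markov's inequality, and it rejects whenever weighted Bonferroni on the inverse e-values does.

```python
def bonferroni_on_inverse_evalues(evalues, weights, alpha):
    """Weighted Bonferroni local test with p_i = 1 / e_i.

    Rejects the intersection hypothesis if some p_i <= w_i * alpha,
    which is the same as some w_i * e_i >= 1 / alpha.
    """
    for e, w in zip(evalues, weights):
        p = 1.0 / e if e > 0 else float("inf")  # e = 0 carries no evidence
        if p <= w * alpha:
            return True
    return False


def weighted_average_evalue_test(evalues, weights, alpha):
    """Local test on the weighted average e-value.

    Rejects if sum_i w_i * e_i >= 1 / alpha. Because the sum is at least
    as large as any single term w_i * e_i, this test rejects whenever the
    weighted Bonferroni test above rejects.
    """
    average = sum(w * e for e, w in zip(evalues, weights))
    return average >= 1.0 / alpha


# Illustrative numbers (made up): two hypotheses, equal weights, alpha = 0.05.
alpha = 0.05
weights = [0.5, 0.5]

moderate = [30.0, 25.0]   # both e-values moderately large
lopsided = [50.0, 1.0]    # one large e-value, one uninformative

for evalues in (moderate, lopsided):
    print(
        evalues,
        bonferroni_on_inverse_evalues(evalues, weights, alpha),
        weighted_average_evalue_test(evalues, weights, alpha),
    )
```

With the moderate e-values, neither inverse e-value falls below 0.025, so weighted Bonferroni does not reject, while the average of 27.5 exceeds 1/alpha = 20 and the averaged test does. With the lopsided e-values both tests reject.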

Tuesday, April 15, 2025 [Link]
- Speaker: Maximilian Kasy (University of Oxford)
- Title: Optimal Pre-Analysis Plans: Statistical Decisions Subject to Implementability
- Abstract: What is the purpose of pre-analysis plans, and how should they be designed? We model the interaction between an agent who analyzes data and a principal who makes a decision based on agent reports. The agent could be the manufacturer of a new drug, and the principal a regulator deciding whether the drug is approved. Or the agent could be a researcher submitting a research paper, and the principal an editor deciding whether it is published. The agent decides which statistics to report to the principal. The principal cannot verify whether the agent reported selectively. Absent a pre-analysis message, if there are conflicts of interest, then many desirable decision rules cannot be implemented. Allowing the agent to send a message before seeing the data increases the set of decision rules that can be implemented, and allows the principal to leverage agent expertise. The optimal mechanisms that we characterize require pre-analysis plans. Applying these results to hypothesis testing, we show that optimal rejection rules pre-register a valid test, and make worst-case assumptions about unreported statistics. Optimal tests can be found as a solution to a linear-programming problem.
- Discussant:
- Links: [Relevant papers: paper #1]

Tuesday, April 22, 2025 [Link]
- Speaker: Ying Jin (Harvard University)
- Title: Automated Hypothesis Validation with Agentic Sequential Falsifications
- Abstract: Hypotheses are central to information acquisition, decision-making, and discovery. However, many real-world hypotheses are abstract, high-level statements that are difficult to validate directly. This challenge is further intensified by the rise of hypothesis generation from Large Language Models (LLMs), which are prone to hallucination and produce hypotheses in volumes that make manual validation impractical. Here we propose POPPER, an agentic framework for rigorous automated validation of free-form hypotheses. Guided by Karl Popper’s principle of falsification, POPPER validates a hypothesis using LLM agents that design and execute falsification experiments targeting its measurable implications. We employ a sequential testing framework to ensure strict Type-I error control while actively gathering evidence from diverse observations, whether drawn from existing data or newly conducted procedures. We demonstrate POPPER on six domains including biology, economics, and sociology. POPPER delivers robust error control, high power, and scalability. Furthermore, compared to human scientists, POPPER achieved comparable performance in validating complex biological hypotheses while reducing time tenfold, providing a scalable, rigorous solution for hypothesis validation.
- Discussant:
- Links: [Relevant papers: paper #1]

Format

The seminars are held on Zoom and last 60 minutes:

- 45 minutes of presentation
- 15 minutes of discussion, led by an invited discussant

Moderators collect questions using the Q&A feature during the seminar.

How to join

You can attend by clicking the link to join (there is no need to register in advance).

More instructions for attendees can be found here.

Organizers

Will Fithian (UC Berkeley)
Jelle Goeman (Leiden University)
Nikos Ignatiadis (University of Chicago)
Lihua Lei (Stanford University)
Zhimei Ren (University of Pennsylvania)

Former organizers

Rina Barber (University of Chicago)
Daniel Yekutieli (Tel Aviv University)

Contact us

If you have feedback or suggestions or want to propose a speaker, please e-mail us at selectiveinferenceseminar@gmail.com.

What is selective inference?

Broadly construed, selective inference means searching for interesting patterns in data, usually with inferential guarantees that account for the search process. It encompasses:

- Multiple testing: testing many hypotheses at once (and paying disproportionate attention to rejections)
- Post-selection inference: examining the data to decide what question to ask, or what model to use, then carrying out one or more appropriate inferences
- Adaptive / interactive inference: sequentially asking one question after another of the same data set, where each question is informed by the answers to preceding questions
- Cheating: cherry-picking, double dipping, data snooping, data dredging, p-hacking, HARKing, and other low-down dirty rotten tricks; basically any of the above, but done wrong!
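
To see why the search process has to be accounted for, here is a small arithmetic sketch contrasting cherry-picking with a simple multiple testing correction. It assumes m independent tests whose null hypotheses are all true, so each test falsely rejects with probability equal to its level; the function name and the numbers (m = 20, alpha = 0.05) are illustrative choices, not part of the seminar material.

```python
def prob_any_false_rejection(m, per_test_level):
    """Chance that at least one of m independent true nulls is rejected
    when each is tested at level per_test_level."""
    return 1.0 - (1.0 - per_test_level) ** m


m, alpha = 20, 0.05

# Cherry-picking: run all m tests at level alpha and report any "hit".
cherry_picked = prob_any_false_rejection(m, alpha)

# Bonferroni correction: test each hypothesis at level alpha / m instead.
bonferroni = prob_any_false_rejection(m, alpha / m)

print(f"uncorrected: {cherry_picked:.3f}")
print(f"Bonferroni:  {bonferroni:.3f}")
```

Under these assumptions the uncorrected search produces at least one false "discovery" roughly 64% of the time, while the Bonferroni-corrected version keeps that chance just under the nominal 5%.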