International Seminar on Selective Inference

A weekly online seminar on selective inference, multiple testing, and post-selection inference.

Gratefully inspired by the Online Causal Inference Seminar.

Mailing List

For announcements and Zoom invitations please subscribe to our mailing list.

Upcoming Seminar Presentations

All seminars take place Thursdays at 8:30 am PT / 11:30 am ET / 4:30 pm London / 6:30 pm Tel Aviv. Past seminar presentations are posted here.

  • Thursday, October 13, 2022 [Recording]

    • Speaker: Aaditya Ramdas (Carnegie Mellon University)

    • Title: E-values as unnormalized weights in multiple testing

    • Abstract: The last two years have seen a flurry of new work on using e-values for multiple testing. This talk will summarize old ideas and present some new, unsubmitted work. I will briefly summarize what e-values and e-processes are (nonparametric, composite generalizations of likelihood ratios and Bayes factors), and recap the e-BH and e-BY procedures for FDR and FCR control, and their utility in a bandit context.

Then, I will present a simple, yet powerful, idea: using e-values as unnormalized weights in multiple testing. Most standard weighted multiple testing methods require the weights to deterministically add up to the number of hypotheses being tested (equivalently, the average weight is unity). But this normalization is not required when the weights are e-values obtained from independent data. This could result in a massive increase in power, especially if the non-null hypotheses have e-values much larger than one. More broadly, we study how to combine an e-value and a p-value, and design multiple testing procedures where both e-values and p-values are available for some hypotheses. A case study with RNA-seq and microarray data will demonstrate the practical power benefits.

These are joint works with Ruodu Wang, Neil Xu and Nikos Ignatiadis.
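As a rough illustration of the weighting idea in this abstract, the sketch below divides each p-value by an e-value that is assumed to come from independent data, then runs the Benjamini-Hochberg step-up procedure on the weighted p-values. The function names and the toy numbers are hypothetical, and this is only one reading of the abstract, not the speakers' implementation.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean rejection mask from the BH step-up procedure."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    # k-th smallest p-value is compared with alpha * k / n
    passes = p[order] <= alpha * np.arange(1, n + 1) / n
    reject = np.zeros(n, dtype=bool)
    if passes.any():
        last = np.nonzero(passes)[0].max()
        reject[order[: last + 1]] = True
    return reject

def e_weighted_bh(pvals, evals, alpha=0.05):
    """BH applied to p/e, with e-values used as unnormalized weights.

    Assumption: the e-values are independent of the p-values.
    Hypotheses with a zero e-value get weighted p-value 1.
    """
    p = np.asarray(pvals, dtype=float)
    e = np.asarray(evals, dtype=float)
    weighted = np.ones_like(p)
    positive = e > 0
    weighted[positive] = np.minimum(p[positive] / e[positive], 1.0)
    return benjamini_hochberg(weighted, alpha)

# Toy inputs (made up for illustration only)
pvals = [0.01, 0.04, 0.03, 0.5]
evals = [1.0, 10.0, 0.5, 1.0]
plain = benjamini_hochberg(pvals)
weighted = e_weighted_bh(pvals, evals)
print(plain.sum(), weighted.sum())
```

Note that the weights here do not need to average to one, which is the point made in the abstract: a large e-value can only shrink the weighted p-value for that hypothesis, while an e-value below one makes rejection harder.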

  • Thursday, October 20, 2022 [Recording]

    • Speaker: Timothy Armstrong (University of Southern California)

    • Title: Empirical Bayes Confidence Intervals, Average Coverage and the False Discovery Rate

    • Abstract: This talk presents a general method for constructing intervals satisfying an average coverage property. Given an estimate of average squared bias of estimates of $n$ parameters, one computes a critical value that takes into account possible undercoverage due to bias, on average over the $n$ intervals. Applying our approach to shrinkage estimators in an empirical Bayes setting, we obtain confidence intervals that satisfy the empirical Bayes coverage property of Morris (1983), while avoiding parametric assumptions on the prior previously used to construct such intervals.

While tests based on average coverage intervals do not control size in the usual frequentist sense, certain results on false discovery rate (FDR) control of multiple testing procedures continue to hold when applied to such tests. In particular, the Benjamini and Hochberg (1995) step-up procedure still controls FDR in the asymptotic regime with many weakly dependent $p$-values, and certain adjustments for dependent $p$-values such as the Benjamini and Yekutieli (2001) procedure continue to yield FDR control in finite samples.
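For readers who want a reminder of the two procedures named in this abstract, here is a minimal sketch of the Benjamini-Hochberg step-up rule and the Benjamini-Yekutieli adjustment, which runs the same rule at level alpha divided by the harmonic sum 1 + 1/2 + ... + 1/n. The function names and example p-values are hypothetical, and the sketch does not implement the average coverage intervals from the talk.

```python
import numpy as np

def step_up(pvals, level):
    """Generic step-up rule: reject the k smallest p-values, where k is the
    largest index with p_(k) <= level * k / n."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    passes = p[order] <= level * np.arange(1, n + 1) / n
    reject = np.zeros(n, dtype=bool)
    if passes.any():
        last = np.nonzero(passes)[0].max()
        reject[order[: last + 1]] = True
    return reject

def bh(pvals, alpha=0.05):
    """Benjamini-Hochberg (1995) procedure."""
    return step_up(pvals, alpha)

def by(pvals, alpha=0.05):
    """Benjamini-Yekutieli (2001) procedure: BH at level alpha / (1 + 1/2 + ... + 1/n)."""
    n = len(pvals)
    harmonic = np.sum(1.0 / np.arange(1, n + 1))
    return step_up(pvals, alpha / harmonic)

# Toy p-values (made up for illustration only)
pvals = [0.001, 0.01, 0.03, 0.5]
bh_reject = bh(pvals)
by_reject = by(pvals)
print(bh_reject.sum(), by_reject.sum())
```

The BY adjustment is more conservative, so it never rejects more hypotheses than BH at the same nominal level.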

  • Thursday, October 27, 2022 [Recording]

    • Speaker: Weijie Su (University of Pennsylvania)

    • Title: Statistical Estimation via a Truthful Owner-Assisted Scoring Mechanism

    • Abstract: In 2014, NeurIPS received 1,678 paper submissions; by 2022 this number had grown to 10,411, putting a tremendous strain on the peer review process. In this talk, we attempt to address this challenge, starting with the following scenario: Alice submits a large number of papers to a machine learning conference and knows the ground-truth quality of her papers. Given noisy ratings provided by independent reviewers, can Bob obtain accurate estimates of the ground-truth quality of the papers by asking Alice a question about the ground truth? First, if Alice is to answer the question truthfully because doing so maximizes her payoff, taken to be an additive convex utility over all her papers, we show that the question must be formulated as pairwise comparisons between her papers. Moreover, if Alice is required to provide a ranking of her papers, which is the most fine-grained question that can be asked via pairwise comparisons, we prove that she would be truth-telling. By incorporating the ground-truth ranking, we show that Bob can obtain an estimator that, in certain regimes, attains the optimal squared error among estimators based on any truthful way of eliciting information. Moreover, the estimated ratings are substantially more accurate than the raw ratings when the number of papers is large and the raw ratings are very noisy. Finally, we conclude the talk with several extensions and some refinements for practical considerations.
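The abstract does not say how Bob combines the raw ratings with Alice's ranking. One natural possibility, assumed here purely for illustration, is a least-squares fit of the ratings subject to the reported order (isotonic regression, computed with the pool-adjacent-violators algorithm). The function names and ratings below are hypothetical.

```python
def pava_nonincreasing(values):
    """Least-squares projection of a sequence onto non-increasing sequences,
    computed with the pool-adjacent-violators algorithm."""
    blocks = []  # each block stores [sum of values, number of values]
    for value in values:
        blocks.append([float(value), 1])
        # Merge while the previous block's mean is below the newest block's mean,
        # since that would violate the non-increasing constraint.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

def ranking_adjusted_scores(raw_scores, ranking):
    """Adjust raw review scores so they respect the owner's ranking.

    ranking lists paper indices from best to worst, as reported by the owner
    (assumed truthful in this sketch).
    """
    ordered = [raw_scores[i] for i in ranking]
    fitted = pava_nonincreasing(ordered)
    adjusted = [0.0] * len(raw_scores)
    for paper, score in zip(ranking, fitted):
        adjusted[paper] = score
    return adjusted

# Toy example: four papers, with paper 0 reported as best and paper 2 as worst
raw = [6.0, 8.0, 5.0, 7.0]
adjusted = ranking_adjusted_scores(raw, ranking=[0, 1, 3, 2])
print(adjusted)
```

In this toy example the three top-ranked papers have raw ratings that contradict the reported order, so the fit pools them to their common average and leaves the lowest-ranked paper unchanged.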

  • Friday, December 9, 2022 (STAMPS-ISSI joint seminar) [Recording] (not Thursday!!)

    • Speaker: Rebecca Willett (University of Chicago)

    • Title: Embed and Emulate: Learning to estimate parameters of dynamical systems with uncertainty quantification


The seminars are held on Zoom and last 60 minutes:

  • 45 minutes of presentation

  • 15 minutes of discussion, led by an invited discussant

Moderators collect questions using the Q&A feature during the seminar.

How to join

You can attend by clicking the link to join (there is no need to register in advance).

More instructions for attendees can be found here.


Contact us

If you have feedback or suggestions or want to propose a speaker, please e-mail us at

What is selective inference?

Broadly construed, selective inference means searching for interesting patterns in data, usually with inferential guarantees that account for the search process. It encompasses:

  • Multiple testing: testing many hypotheses at once (and paying disproportionate attention to rejections)

  • Post-selection inference: examining the data to decide what question to ask, or what model to use, then carrying out one or more appropriate inferences

  • Adaptive / interactive inference: sequentially asking one question after another of the same data set, where each question is informed by the answers to preceding questions

  • Cheating: cherry-picking, double dipping, data snooping, data dredging, p-hacking, HARKing, and other low-down dirty rotten tricks; basically any of the above, but done wrong!