Interactive Information Retrieval with Bandit Feedback

July 2021 @ SIGIR

Live Q&A: 10:00-11:30 and 21:00-22:30 EDT, July 11


Information retrieval (IR) is by nature a process of sequential decision making. Unlike traditional IR solutions that rigidly execute a policy trained offline, interactive information retrieval emphasizes online policy learning with bandit feedback. In this tutorial, we will first motivate the need for online policy learning in interactive IR by highlighting its importance in several real-world IR problems where online sequential decision making is necessary, such as web search and recommendation. We will carefully address the new challenges that arise in this solution paradigm, including sample complexity, costly and even outdated feedback, and ethical considerations in online learning (such as fairness and privacy). We will lay the groundwork for the technical discussions by first introducing several classical interactive learning strategies from the machine learning literature, and then dive into recent research developments that address these fundamental challenges in interactive IR.

Note that the concurrent SIGIR 2021 tutorial on "Interactive Information Retrieval: Models, Algorithms, and Evaluation" will provide a broad overview of the general conceptual framework and formal models in interactive IR, while this tutorial covers online policy learning solutions for interactive IR with bandit feedback.


[Full Slides in pdf] [Full Slides in pptx] [Tutorial Proposal]

Video is available on the SIGIR 2021 conference website and will be uploaded here after the conference.

Part 1: Introduction & Classical bandit learning algorithms [Slides] [Video]

Part 2: Interactive recommender system [Slides] [Video]

Part 3: Online Learning to Rank [Slides] [Video]

Part 4: Ethical considerations in interactive IR [Slides] [Video]



Huazheng Wang

University of Virginia


Yiling Jia

University of Virginia


Hongning Wang

University of Virginia