SEMINAR

Adversarially robust stochastic multi-armed bandits

Speaker

Julian Zimmert

Working
University of Copenhagen
Timeline
Thu, Jan 16 2020 - 10:00 am (GMT + 7)
About Speaker

Julian Zimmert received his Masters degree in Mathematics at the Humboldt University of Berlin and is now a final year PhD student at the University of Copenhagen working under supervision of Yevgeny Seldin. His main area of research is robust algorithms for ranges of environments, in particular algorithms for multi-armed bandits in adversarial and stochastic settings. Recently, he did an internship at DeepMind in the Foundations group of Csaba Szepesvari working on a connection between mirror descent and the information theoretic analysis of Thompson sampling.

Abstract

Multi-armed bandits are a popular framework for optimal experimental design with applications in digital advertising and website optimisation. Traditionally, the bandit literature separates between two distinct forms of environments: The stochastic setting assumes that the data is generated by an i.i.d. process, which allows specialised algorithms to learn quickly. At the other extreme, the adversarial setting only assumes boundedness. This makes learning extremely robust, but comes at the cost of significantly slower convergence to the optimal solution. Real world applications are typically somewhere in between. While it might be reasonable to assume the data is close to i.i.d., the distribution might be influenced by hidden confounders or undergo unforeseen changes. Practically this means that stochastic bandit algorithms might fail even to approach a good solution. This poses a serious dilemma to the practitioners. Should one prioritise fast or robust learning? But why not both? This talk presents a recent breakthrough in practical all-purpose algorithms.

Related seminars

Anh Nguyen

Microsoft GenAI

The Revolution of Small Language Models
Fri, Mar 8 2024 - 02:30 pm (GMT + 7)

Thang D. Bui

Australian National University (ANU)

Recent Progress on Grokking and Probabilistic Federated Learning
Fri, Jan 26 2024 - 10:00 am (GMT + 7)

Tim Baldwin

MBZUAI, The University of Melbourne

LLMs FTW
Tue, Jan 9 2024 - 10:30 am (GMT + 7)

Quan Vuong

Google DeepMind

Scaling Robot Learning
Wed, Dec 27 2023 - 10:00 am (GMT + 7)