
(Counterfactual-AirBnB) Harnessing the Power of Interleaving and Counterfactual Evaluation for Airbnb Search Ranking
Failed to add items
Add to basket failed.
Add to Wish List failed.
Remove from Wish List failed.
Follow podcast failed
Unfollow podcast failed
-
Narrated by:
-
By:
About this listen
Tune into our podcast as we explore Airbnb's groundbreaking advancements in search ranking evaluation. Traditional A/B testing for significant purchases like accommodation bookings faces challenges: it's time-consuming, with low traffic and delayed feedback. Offline evaluations, while quick, often lack accuracy due to issues like selection bias and disconnect from online metrics.
To overcome this, Airbnb developed and implemented two novel online evaluation methods: interleaving and counterfactual evaluation. Our competitive pair-based interleaving method offers an impressive 50X speedup in experimentation velocity compared to traditional A/B tests. For even greater generalizability and sensitivity, our online counterfactual evaluation achieves an astonishing 100X speedup. These methods allow for rapid identification of promising candidates for full A/B tests, significantly streamlining the experimental process.
While interleaving may face limitations with rankers using set-level optimization that can disrupt user experience, counterfactual evaluation provides greater robustness in such scenarios. These innovative techniques are not only proven effective at Airbnb, leading to increased capacity to test new ideas and higher success rates in A/B testing, but are also easily generalizable to other online platforms, especially those with sparse conversion events.
Paper Link: https://doi.org/10.1145/3711896.3737232