❌Pitfalls of Data-Driven Decision Making (Series): Traditional vs Bayesian A/B Testing or Exploration-Exploitation Dilemma

Shadi Balandeh
3 min read · Oct 23, 2023


“Introducing the ‘Sock-It-To-Me Slippers’: slippers that launch your socks straight to your laundry basket when you kick them off!”

Have you ever wondered why the heck you got that random ad in your feed?

Such ads typically result from algorithms that aim to solve a common problem in statistics and machine learning known as the ‘exploration-exploitation dilemma’.

It’s like trying to decide whether to try a new restaurant every Friday night (exploration) or stick with your all-time favorite (exploitation). Striking the right balance can be tough.

E-commerce companies often use traditional A/B tests to optimize their ad campaigns. In this method, they split their audience into two groups: one sees the current ad (exploitation) while the other sees a new ad variant (exploration). The best-performing version is selected after a set period.
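
To make the fixed-split procedure concrete, here is a minimal simulation sketch. The click-through rates, the 50/50 alternation, and the function names are illustrative assumptions rather than anything from a real campaign, and the sketch leaves out the significance check a production test would normally include.

```python
import random

def run_fixed_split_test(true_rates, n_users, seed=0):
    """Simulate a traditional A/B test with a fixed, even traffic split.

    true_rates holds hypothetical click-through rates; in a real test
    these are unknown and are what the test tries to estimate.
    """
    rng = random.Random(seed)
    shows = [0] * len(true_rates)
    clicks = [0] * len(true_rates)
    for user in range(n_users):
        # The allocation rule never changes while the test is running.
        variant = user % len(true_rates)
        shows[variant] += 1
        if rng.random() < true_rates[variant]:
            clicks[variant] += 1
    return shows, clicks

def pick_winner(shows, clicks):
    """Return the index of the variant with the highest observed click rate."""
    rates = [c / s if s else 0.0 for c, s in zip(clicks, shows)]
    return max(range(len(rates)), key=rates.__getitem__)

# Assumed rates: current ad at 4%, new ad at 6%.
shows, clicks = run_fixed_split_test([0.04, 0.06], n_users=10_000)
winner = pick_winner(shows, clicks)
print("shows:", shows, "clicks:", clicks, "winner:", winner)
```

Note that both variants receive exactly half the traffic for the whole test, no matter how early one of them pulls ahead.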

❌Pitfall: Traditional A/B testing models tend to favor exploiting what has worked well in the past, potentially leading to stagnation and missed opportunities for improvement: the classic exploration-exploitation dilemma.

To foster more varied exploration, methods like Bayesian A/B Testing with Thompson Sampling are essential.

Bayesian A/B Testing with Thompson Sampling

Thompson Sampling is a Bayesian algorithm that addresses this pitfall by incorporating uncertainty into the decision-making process. This approach offers dynamic A/B testing.

The chance of showing a specific variation (A or B) evolves as more data is gathered. Thompson Sampling, a strategy for the multi-armed bandit problem, balances exploration and exploitation by adjusting traffic allocation based on real-time results.

Thompson Sampling draws one random sample from each ad’s probability distribution and shows the ad with the highest draw. This balances trying both ads against sticking with the currently successful one. It leans towards options with higher expected outcomes but doesn’t ignore variants with weaker performance so far. How much it explores is regulated by the uncertainty in the distributions: the wider a variant’s distribution, the more often its draw comes out on top.

As more data is acquired, these probability distributions sharpen, letting Thompson Sampling adjust its strategy based on fresh insights, leading it to favor better-performing options faster.
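
As a rough illustration of how the distributions sharpen, the sketch below assumes a Beta-Bernoulli model (each impression is a click or no click, with a uniform Beta(1, 1) prior). The article does not specify a distribution; this is one common choice, and the 5% click rate is a made-up figure.

```python
import math

def beta_posterior(clicks, shows, prior_a=1.0, prior_b=1.0):
    """Mean and standard deviation of a Beta posterior over a click rate.

    Assumes a Beta(prior_a, prior_b) prior and click/no-click outcomes.
    """
    a = prior_a + clicks
    b = prior_b + (shows - clicks)
    mean = a / (a + b)
    std = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, std

# Hypothetical ad with roughly a 5% click rate, observed at growing sample sizes.
for shows in (100, 1_000, 10_000):
    clicks = shows // 20
    mean, std = beta_posterior(clicks, shows)
    print(f"shows={shows:>6}  mean={mean:.4f}  std={std:.4f}")
```

The standard deviation shrinks as impressions accumulate, which is what makes the algorithm explore less over time.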

In practice, for every ad variant, the company maintains a probability distribution to gauge the ad’s likely success, refining it with incoming data. Thompson Sampling uses these distributions to decide which ad to show a user. It generally displays ads with higher expected engagement but still occasionally shows the less-explored ads to gather more data.
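
The loop described above can be sketched as follows, again under an assumed Beta-Bernoulli model with uniform priors. The class name, the simulated click rates, and the traffic volume are hypothetical and chosen only for illustration.

```python
import random

class ThompsonSampler:
    """Keeps one Beta(1 + clicks, 1 + misses) distribution per ad variant."""

    def __init__(self, n_variants, seed=None):
        self.rng = random.Random(seed)
        self.clicks = [0] * n_variants
        self.misses = [0] * n_variants

    def choose(self):
        # Draw one sample per variant and show the variant with the highest draw.
        draws = [
            self.rng.betavariate(1 + c, 1 + m)
            for c, m in zip(self.clicks, self.misses)
        ]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, variant, clicked):
        # Refine the chosen variant's distribution with the observed outcome.
        if clicked:
            self.clicks[variant] += 1
        else:
            self.misses[variant] += 1

def simulate(true_rates, n_users, seed=0):
    """Run Thompson Sampling against hypothetical click rates; return shows per variant."""
    rng = random.Random(seed)
    sampler = ThompsonSampler(len(true_rates), seed=seed + 1)
    shows = [0] * len(true_rates)
    for _ in range(n_users):
        variant = sampler.choose()
        clicked = rng.random() < true_rates[variant]
        sampler.update(variant, clicked)
        shows[variant] += 1
    return shows

# Assumed rates: current ad at 4%, new ad at 6%.
shows = simulate([0.04, 0.06], n_users=10_000)
print("impressions per variant:", shows)
```

Unlike a fixed split, the share of traffic each ad receives here is not set in advance; it emerges from the draws and shifts as clicks come in.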

Summary

The exploration-exploitation dilemma is a potential pitfall in data-driven decision-making. Traditional methods tend to favor exploiting known successes at the expense of exploring new possibilities, potentially leading to stagnation and missed opportunities for improvement.

To address this challenge, dynamic models like Bayesian A/B Testing with Thompson Sampling are essential. These models offer a balanced approach, adapting in real time to allocate traffic between exploration and exploitation. By modeling uncertainty explicitly and adjusting strategies as data accumulates, they support more informed and agile decision-making.


Written by Shadi Balandeh

AI and Data Science Manager| AI & Data Literacy Educator| Scientific Data-Driven Decision Making Advocate| Mom
