03-02-2026
In today’s data-driven supply chains, it’s not just customers who are learning. Your partners are learning too. When your retailers constantly update their inventory models and pricing, your own wholesale pricing strategy has to learn to keep up.
A study coauthored by Daniels School Associate Professor William Haskell, “Learning to Price Supply Chain Contracts Against a Learning Retailer,” looks at a common but underexplored situation: a supplier sells through a retailer (such as a consumer brand selling via Amazon), but only sees the retailer’s orders, not end-customer demand.
Each period, the supplier sets a wholesale price, while the retailer, using its own data and algorithms, chooses an order quantity. Customers then generate random demand, which only the retailer observes.
Crucially, neither party knows the true demand distribution in advance. The retailer uses some data-driven inventory policy (which may change over time), but the supplier does not know that policy and must infer market conditions only from order quantities.
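To make the information structure concrete, here is a minimal Python sketch of the interaction described above. It is illustrative only: `toy_retailer`, `RETAIL_PRICE` and the uniform demand draw are hypothetical stand-ins, not the retailer policies or demand models analyzed in the study. The point is what each side sees: the demand history stays with the retailer, and the supplier ends up with nothing but its own prices and the orders they produced.

```python
import random

RETAIL_PRICE = 10.0  # assumed fixed retail price (not from the study)

def toy_retailer(wholesale_price, demand_history):
    """Hypothetical retailer: orders its average observed demand,
    scaled down as the wholesale price eats into its margin."""
    if not demand_history:
        return 0.0
    margin_share = max(0.0, 1.0 - wholesale_price / RETAIL_PRICE)
    return margin_share * sum(demand_history) / len(demand_history)

def run(prices, retailer, draw_demand, seed_history):
    """Play one period per price and return what the supplier observes."""
    history = list(seed_history)       # private to the retailer
    observed = []                      # (price, order) pairs the supplier sees
    for w in prices:
        q = retailer(w, history)       # retailer reacts to the posted price
        history.append(draw_demand())  # demand is seen by the retailer only
        observed.append((w, q))
    return observed

rng = random.Random(0)
log = run([2.0, 5.0, 8.0], toy_retailer, lambda: rng.uniform(0, 100), [50.0])
for price, order in log:
    print(f"price={price:.2f}  order={order:.2f}")
```

Because the retailer's history grows every period, the same wholesale price can produce a different order later on. That drift is what the supplier has to learn through.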
The core question: Can a supplier still do nearly as well as an “all-knowing” supplier who fully understands demand and the retailer’s policy? The answer, under realistic conditions, is yes.
The research shows how suppliers can “learn to price” against a learning retailer. The authors design pricing policies that approach the performance of an ideal clairvoyant supplier. Over time, the average per-period performance gap shrinks toward zero. In other words, the supplier’s cumulative losses from not knowing the environment grow more slowly than the number of periods, rather than piling up at a constant rate.
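One common way to write this benchmark down is as regret. The notation here is an assumption for illustration and is not taken from the paper: let \pi_t^{*} be the clairvoyant supplier's profit in period t and \pi_t the learning supplier's profit in that period.

```latex
\mathrm{Regret}(T) \;=\; \sum_{t=1}^{T} \bigl( \pi_t^{*} - \pi_t \bigr),
\qquad
\frac{\mathrm{Regret}(T)}{T} \;\longrightarrow\; 0 \quad \text{as } T \to \infty .
```

The second expression is the "shrinking average gap": total regret may keep growing, but more slowly than T. The article does not state the exact growth rate, so none is assumed here.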
Importantly, the approach works for both discrete and continuous demand. Whether products are sold in units like pallets and cases or in continuous quantities such as tons or gallons, the framework can be adapted.
The study also finds that suppliers do not need direct visibility into customer demand. The algorithm relies only on the supplier’s own historical prices, the retailer’s resulting orders and basic knowledge of feasible demand ranges. This is important because most suppliers lack access to real-time point-of-sale data.
The authors warn that off-the-shelf bandit methods and generic online-learning tools can perform poorly in this setting. In realistic scenarios, such as when the retailer uses standard sample-average inventory policies, these tools can generate losses that grow linearly over time, so the average gap to the clairvoyant supplier never closes. In simulations, the tailored pricing policy consistently outperforms these general-purpose algorithms.
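For readers unfamiliar with sample-average inventory policies, the sketch below shows one textbook form, under assumptions that go beyond the article: a newsvendor-style retailer with a fixed retail price, no salvage value for unsold stock, and an order equal to the empirical demand quantile at the critical ratio. The function name `saa_order` and its parameters are hypothetical.

```python
import math

def saa_order(demand_samples, retail_price, wholesale_price):
    """Sample-average newsvendor order: the empirical demand quantile
    at the critical ratio (retail_price - wholesale_price) / retail_price."""
    if not demand_samples or wholesale_price >= retail_price:
        return 0.0  # no data yet, or no margin left to justify an order
    ratio = (retail_price - wholesale_price) / retail_price
    ordered = sorted(demand_samples)
    # Smallest sample whose empirical CDF is at least the critical ratio.
    k = max(1, math.ceil(ratio * len(ordered)))
    return ordered[k - 1]

samples = [10, 20, 30, 40]
low_price_order = saa_order(samples, retail_price=10.0, wholesale_price=2.0)
mid_price_order = saa_order(samples, retail_price=10.0, wholesale_price=5.0)
```

Every new demand observation changes the sorted sample, so the retailer's response to a given wholesale price keeps moving. A supplier that treats past orders as draws from a fixed response curve is learning a target that has already shifted.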
The study also introduces a smarter way to measure how much the environment is changing. Instead of tracking how profits fluctuate, the authors track how the retailer’s implied demand estimates shift over time. This better captures how retailer learning affects the supplier’s world and allows robust pricing even when the retailer switches or mixes decision policies.
For suppliers and upstream leaders, three messages stand out:
Here are concrete moves to consider if you’re a supplier or run an upstream business unit:
Ultimately, this research shows that suppliers don’t have to be passive price‑takers in a world of sophisticated, data‑driven retailers. With the right learning‑based pricing strategies, you can remain agile, protect margins and keep pace with partners whose algorithms are changing just as fast as the market itself.