Online Learning Applications: Dynamic Pricing

Status of the Work

Requirement	Progress Status	Notes
1. Requirement 1: Single product and stochastic environment		Sublinear regret
2. Requirement 2: Multiple products and stochastic environment		Sublinear regret
3. Requirement 3: Best-of-both-worlds algorithms with a single product		Sublinear regret
4. Requirement 4: Best-of-both-worlds with multiple products		Sublinear regret
5. Requirement 5: Slightly non-stationary environments with multiple products		Sublinear regret and relevant comparison
Presentation

Introduction

This is the repository for the Project of Online Learning Applications in the academic year 2024/2025 at Polytechnic of Milan.

Subject: 062534 - Online Learning Applications

Professor: Castiglioni Matteo

Academic Year: 2024/2025

The project includes 5 requirements to analyze various algorithms in different scenarios:

Single product and stochastic environment
Multiple products and stochastic environment
Best-of-both-worlds algorithms with a single product
Best-of-both-worlds with multiple products
Slightly non-stationary environments with multiple products

To have an overview of the Project, please see the presentation.

Features

Sublinear regret: all the algorithms achieve sublinear regret in different environments and after multiple trials, being robust and consistent.
Self-contained notebooks: it is sufficient to export the notebooks in a working python/jupyter environment and run them.
Meaningful analyses: each algorithm has been analyzed and tested taking into account several different cases, with different parameters and behaviors.

General setting of the project

The goal of the project is to design online learning algorithms to sell multiple types of products under production constraints. See the offcial slides for detailed information.

Parameters

Number of rounds T
Number of types of products N
Set of possible prices P (small and discrete set)
Production capacity B (For simplicity, there is a total number of products B that the company can produce (independently from the specific type of product)

Buyer's behavior

Has a valuation vi for each type of product in N
Buys all products priced below their respective valuations

Interaction

At each round $$t \in T$$:

The company chooses which types of product to sell and set price $$p_{i}$$ for each type of product.
A buyer with a valuation for each type of product arrives.
The buyer buys a unit of each product with price smaller than the product valuation.

Requirement 1: Single product and stochastic environment

Build a stochastic environment: a distribution over the valuations of a single type of product. Then, build a pricing strategy using UCB1 ignoring the inventory constraint and build a pricing strategy extending UCB1 to handle the inventory constraint. First, we used UCB1 to fulfill the first part of the requirement, achieving sublinear regret. Then, we extended the UCB1 to deal with the inventory constraint, achieving also in this case sublinear regret.

UCB1	UCB1_with_budget

Requirement 2: Multiple products and stochastic environment

Build a stochastic environment: a joint distribution over the valuations of all the types of products. Build a pricing strategy using Combinatorial-UCB with the inventory constraint. So, the environment includes a logit-normal distribution to deal with multiple products and joint valuations. The Combinatorial UCB has been extended with Gaussian Processes (GP) for the selection of prices. The regret is sublinear.

CombUCB-GP

Requirement 3: Best-of-both-worlds algorithms with a single product

Build a highly non-stationary stochastic environment: a sequence of valuations of the product (e.g., sampled from a distribution that changes quickly over time). Build a pricing strategy using a primal-dual method with the inventory constraint. So, the environment includes different distributions whose non-stationarity is achieved thorugh means jumps, spikes, drifts. The Primal-Dual algorithm is based on the EXP3.P algorithm, that is suitable for dynamic pricing in non-stationary environment. The achieved regret is sublinear. For further information, visit https://cesa-bianchi.di.unimi.it/Pubblicazioni/J18.pdf.

Primal-Dual

Requirement 4: Best-of-both-worlds with multiple products

Build a highly non-stationary stochastic environment: a sequence of correlated valuations for each type of product (e.g., sampled from a distribution that changes quickly over time). Build a pricing strategy using a primal-dual method with the inventory constraint. So, the environment includes a logit-normal distribution to deal with multiple products and joint valuations and drifting means to represent non-stationarity. The Primal-Dual algorithm is based on the EXP3.P algorithm, that is suitable for dynamic pricing in non-stationary environment, having a EXP3.P agent for each product. The achieved regret is sublinear.

Primal-Dual

Requirement 5: Slightly non-stationary environments with multiple products

Build a slightly non-stationary stochastic environment for the pricing problem:

Rounds are partitioned in intervals
In each interval the distribution of products valuations is fixed
Each interval has a different distribution

Extend Combinatorial-UCB with sliding window and compare its performance with the performance of Primal-Dual method. Given the extended algorithm with SW of requirement 2 and Primal-Dual algorithm of requirement 4, they were compared against each other on the new environment. The achieved regret is sublinear.

CombUCB-GP-SW

Primal_Dual vs CombUCB-GP-SW

Final Considerations

Final Mark: 16/16

Authors

This Project was developed by:

Carminati Gabriele @carmigab
Compagnoni Riccardo Domingo @riccardocompagnoni
De Introna Federico @federicodeintrona
Di Giore Francesco @Digioref
Fossa' Chiara @keira-ph

Name		Name	Last commit message	Last commit date
Latest commit History 140 Commits
Presentation		Presentation
Requirement 1		Requirement 1
Requirement 2		Requirement 2
Requirement 3		Requirement 3
Requirement 4		Requirement 4
Requirement 5		Requirement 5
data		data
LICENSE		LICENSE
README.md		README.md
project.pdf		project.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Online Learning Applications: Dynamic Pricing

Status of the Work

Introduction

Features