This is the companion repository for the tutorial "Reproducible Prototyping of Query Optimizer Components" held at SIGMOD 2025 (DOI 10.1145/3722212.3725637). It contains the scripts and data files that we use during the hands-on session.
In the practical part of the tutorial, we demonstrate how to use the PostBOUND framework to implement and evaluate common classes of optimizers. Specifically, we implement the following prototypes:
- a learned cardinality estimator inspired by MSCN[^1],
- a learned plan selector inspired by BAO[^2], and
- an upper-bound-driven query optimizer inspired by the UES formula[^3].
Additionally, we show how PostBOUND can be used to generate training data for all the learned approaches and how all prototypes can be compared in an end-to-end benchmark.
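To give a flavor of what these prototypes look like, the sketch below shows the general shape of a pluggable cardinality estimator in plain Python. Everything here (the class names, the `estimate` method, the constant reduction factor) is an illustrative assumption rather than PostBOUND's actual interface; the real stage classes are used in the scripts under `examples` and documented in the PostBOUND docs.

```python
# Hypothetical sketch of a pluggable cardinality estimator. The interface
# shown here is an assumption for illustration; PostBOUND's real stage
# classes and method signatures are described in its documentation.
from dataclasses import dataclass


@dataclass
class Estimate:
    """A cardinality estimate for one intermediate result."""
    tables: frozenset[str]
    cardinality: float


class NaiveCardinalityEstimator:
    """Estimates join cardinalities with a fixed per-join reduction factor."""

    def __init__(self, base_cardinalities: dict[str, float], reduction: float = 0.1):
        self.base_cardinalities = base_cardinalities
        self.reduction = reduction

    def estimate(self, tables: frozenset[str]) -> Estimate:
        # Multiply the base-table cardinalities and damp each join by a
        # constant factor -- a learned model (e.g. MSCN) would replace this.
        card = 1.0
        for table in tables:
            card *= self.base_cardinalities[table]
        card *= self.reduction ** max(len(tables) - 1, 0)
        return Estimate(tables=tables, cardinality=card)


if __name__ == "__main__":
    estimator = NaiveCardinalityEstimator({"users": 40_000, "posts": 90_000})
    print(estimator.estimate(frozenset({"users", "posts"})))
```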
See the Structure section below for how this repository is organized.
To follow along with the live-demo parts of our tutorial, you first need to set up PostBOUND on your system. You can download a PostBOUND image specifically created for this tutorial via
```sh
docker pull rbergm/postbound-sigmod25
```
The download is approximately 1.6 GB. Once it completes, you can start a new container, for example using
```sh
docker run -d --name postbound-sigmod25 rbergm/postbound-sigmod25
```
Afterwards, you should be able to log into your container with the usual `docker exec -it postbound-sigmod25 /bin/bash`.
The shell environment automatically activates a Python virtual environment that contains the most recent PostBOUND version.
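If you want to verify this, a bare import of the package should succeed inside the container. We assume here that the package is importable as `postbound`; consult the PostBOUND documentation if the import fails.

```python
# Quick sanity check: this should succeed inside the tutorial container.
# Assumes the package is importable as `postbound`.
import postbound

print(postbound.__name__)
```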
In the container, clone the tutorial repository and change into the new directory. From there, all that is left to do is to activate the config file for your server connection and install the dependencies. Since we use embedding-based models for simplicity, this requires downloading another couple of gigabytes of data (e.g., CUDA dependencies). You can test the installation by running one of the examples:
```sh
# basic setup
git clone https://github.com/db-tu-dresden/SIGMOD25-OptimizerTutorial.git
cd SIGMOD25-OptimizerTutorial

# activate the config file and install all dependencies
cp .psycopg_connection_stats.sample .psycopg_connection
pip install -r requirements.txt

# run an example
python3 examples/mscn-light.py \
    --samples data/cardinality-samples.csv \
    --workload workload/ \
    --connect .psycopg_connection \
    --out test-results.csv
cat test-results.csv
```
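Once the example has finished, `test-results.csv` can also be inspected with standard tooling, for instance pandas. The snippet below is only a minimal sketch: it assumes the file is a plain CSV, and the actual column names depend on the example script, so adjust after a first look at the output.

```python
# Minimal sketch for inspecting the example output. Assumes test-results.csv
# is a plain CSV file; the actual column names depend on the example script.
import pandas as pd

results = pd.read_csv("test-results.csv")
print(results.head())       # peek at the first rows and column names
print(results.describe())   # summary statistics of the numeric columns
```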
> [!TIP]
> You can also set up PostBOUND manually by following the instructions in the PostBOUND documentation.
This repository is structured as follows:
- `examples` contains the actual implementations of the optimizer prototypes as well as the tooling (data generation and benchmarking)
- `workload` contains our test queries, which are a subset of the Stats Benchmark[^4]
- `data` contains the pre-generated training data for the learned optimizers
- `results` contains the (pre-generated) benchmark data for the prototypes as well as a notebook that demonstrates a potential evaluation
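As a starting point for your own experiments, the workload queries can be loaded with a few lines of Python. The sketch below assumes one query per plain `.sql` file directly under `workload/`; double-check the directory layout first.

```python
# Load the benchmark queries from the workload directory. Assumes one query
# per plain .sql file directly under workload/; verify the layout first.
from pathlib import Path

queries = {path.stem: path.read_text() for path in sorted(Path("workload").glob("*.sql"))}
print(f"loaded {len(queries)} queries")
```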
If you want to learn more about PostBOUND, take a look at the GitHub repo and the documentation.
Our paper "An Elephant Under the Microscope: Analyzing the Interaction of Optimizer Components in PostgreSQL" (DOI 10.1145/3709659) also contains a high-level introduction to the framework and its motivation.
[^1]: Kipf et al.: "Learned Cardinalities: Estimating Correlated Joins with Deep Learning" (CIDR'19, https://www.cidrdb.org/cidr2019/papers/p101-kipf-cidr19.pdf)
[^2]: Marcus et al.: "Bao: Making Learned Query Optimization Practical" (SIGMOD'22, DOI 10.1145/3542700.3542703)
[^3]: Hertzschuch et al.: "Simplicity Done Right for Join Ordering" (CIDR'21, https://vldb.org/cidrdb/papers/2021/cidr2021_paper01.pdf)
[^4]: Han et al.: "Cardinality Estimation in DBMS: A Comprehensive Benchmark Evaluation" (VLDB'22, DOI 10.14778/3503585.3503586)