
Commit d5cc4c2

Add README. (#33)
**Summary**: Wrote **Quickstart** and **Overview** sections. Go [here](https://github.com/wangpatrick57/dbgym/tree/readme) to see the README on the website.

**Details**:

* **Overview** summarizes the research motivation behind the project, giving background as necessary.
* **Quickstart** gives a single shell script which compiles Postgres with Boot, generates data, builds a Proto-X embedding, and trains a Proto-X agent.
* I renamed all occurrences of "pgdata" to "dbdata" to match the project's vision of working for multiple DBMSs (as described in the README).
* I removed the startup check.
* I got rid of the `ssd_checker` dependency as it's a very small repository.
* Fixed Postgres compilation code to work with the new `vldb_2024` branch of Boot.

---------

Co-authored-by: Wan Shen Lim <wanshen.lim@gmail.com>
1 parent 3aecdd1 commit d5cc4c2

File tree

22 files changed (+387, -284 lines)

README.md

Lines changed: 86 additions & 1 deletion
@@ -1 +1,86 @@
```diff
-# Database Gym
+# 🛢️ Database Gym 🏋️
```

[\[Slides\]](http://www.cidrdb.org/cidr2023/slides/p27-lim-slides.pdf) [\[Paper\]](https://www.cidrdb.org/cidr2023/papers/p27-lim.pdf)

*An end-to-end research vehicle for the field of self-driving DBMSs.*

## Quickstart
These steps were tested on a fresh repository clone running Ubuntu 22.04.

```
# Set up dependencies.
# You may want to create a Python virtual environment (e.g., with conda) before doing this.
./dependency/install_dependencies.sh

# Compile a custom fork of PostgreSQL, load TPC-H (SF 0.01), train the Proto-X agent, and tune.
./scripts/quickstart.sh postgres tpch 0.01 protox
```
## Overview

Autonomous DBMS research often involves more engineering than research.
Whenever the state of the art advances, it is common to find that the authors have reimplemented the database tuning pipeline from scratch: workload capture, database setup, training data collection, model creation, model deployment, and more.
Moreover, these bespoke pipelines make it difficult to combine different techniques even when they should be independent (e.g., using a different operator latency model in a tuning algorithm).

The database gym project is our attempt at standardizing the APIs between these disparate tasks, allowing researchers to mix and match the different pipeline components.
It draws inspiration from the Farama Foundation's Gymnasium (formerly OpenAI Gym), which accelerates the development and comparison of reinforcement learning algorithms by providing a set of agents, environments, and a standardized API for communicating between them.
Through the database gym, we hope to save other people time and reimplementation effort by providing an extensible open-source platform for autonomous DBMS research.
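To make the Gymnasium-style split concrete, here is a heavily simplified, dependency-free sketch of the agent/environment loop the paragraph above alludes to. All names (`ToyDBMSEnv`, the single knob, the toy latency model) are hypothetical illustrations, not the project's actual API.

```python
# A minimal sketch (NOT the project's real API) of a Gymnasium-style loop:
# an "environment" wraps a toy DBMS whose latency improves as one
# hypothetical knob approaches an unknown optimum, and an "agent" tunes it.
class ToyDBMSEnv:
    """Environment: observes the knob value, returns negative latency as reward."""

    def __init__(self, optimal_knob: float = 0.7):
        self._optimal_knob = optimal_knob
        self._knob = 0.0

    def reset(self) -> float:
        self._knob = 0.0
        return self._knob  # initial observation

    def step(self, action: float):
        # Apply the tuning action (a knob delta), clamped to [0, 1].
        self._knob = min(1.0, max(0.0, self._knob + action))
        latency = 1.0 + abs(self._knob - self._optimal_knob)  # toy latency model
        reward = -latency
        done = abs(self._knob - self._optimal_knob) < 0.05
        return self._knob, reward, done


def run_episode(env: ToyDBMSEnv, max_steps: int = 50) -> float:
    """Agent: a naive policy that always nudges the knob upward."""
    env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        _, reward, done = env.step(0.1)
        total_reward += reward
        if done:
            break
    return total_reward
```

The point of the standardized `reset()`/`step()` interface is that the agent and the environment can be swapped out independently, which is exactly the decoupling the database gym aims for across the tuning pipeline.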
This project is under active development.
Currently, we decompose the database tuning pipeline into the following components:

1. Workload: collection, forecasting, synthesis
2. Database: database loading, instrumentation, orchestrating workload execution
3. Agent: identifying tuning actions, suggesting an action
## Repository Structure

`task.py` is the entrypoint for all tasks.
The tasks are grouped into categories that correspond to the top-level directories of the repository:

- `benchmark` - tasks to generate data and queries for different benchmarks (e.g., TPC-H, JOB)
- `dbms` - tasks to build and start DBMSs (e.g., PostgreSQL)
- `tune` - tasks to train autonomous database tuning agents
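The real `task.py` wires these categories up with a CLI framework (the diffs below show `click` groups such as `tpch_group` and `postgres_group`). Purely as an illustration of the category/system/command grouping, here is a dependency-free sketch in which every name and task description is hypothetical:

```python
# Hypothetical sketch of dispatching tasks by (category, system, command),
# mirroring the benchmark/dbms/tune layout described above. The real
# entrypoint uses a CLI framework; this shows only the grouping structure.
TASKS = {
    ("benchmark", "tpch", "data"): lambda scale_factor: f"generate TPC-H data at SF {scale_factor}",
    ("dbms", "postgres", "build"): lambda: "build Postgres and its extensions",
    ("tune", "protox", "agent"): lambda: "train the Proto-X tuning agent",
}


def run_task(category: str, system: str, command: str, *args):
    key = (category, system, command)
    if key not in TASKS:
        raise ValueError(f"unknown task: {' '.join(key)}")
    return TASKS[key](*args)
```

For example, `run_task("benchmark", "tpch", "data", 0.01)` would dispatch to the TPC-H data-generation task in this sketch.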
## Credits

The Database Gym project rose from the ashes of the [NoisePage](https://db.cs.cmu.edu/projects/noisepage/) self-driving DBMS project.

The first prototype was written by [Patrick Wang](https://github.com/wangpatrick57), integrating [Boot (VLDB 2024)](https://github.com/lmwnshn/boot) and [Proto-X (VLDB 2024)](https://github.com/17zhangw/protox) into a cohesive system.

## Citing This Repository

If you use this repository in an academic paper, please cite:

```
@inproceedings{lim23,
  author = {Lim, Wan Shen and Butrovich, Matthew and Zhang, William and Crotty, Andrew and Ma, Lin and Xu, Peijing and Gehrke, Johannes and Pavlo, Andrew},
  title = {Database Gyms},
  booktitle = {{CIDR} 2023, Conference on Innovative Data Systems Research},
  year = {2023},
  url = {https://db.cs.cmu.edu/papers/2023/p27-lim.pdf},
}
```

Additionally, please cite any module-specific paper that is relevant to your use.
**Accelerating Training Data Generation**

```
(citation pending)
Boot, appearing at VLDB 2024.
```

**Simultaneously Tuning Multiple Configuration Spaces with Proto Actions**

```
(citation pending)
Proto-X, appearing at VLDB 2024.
```

benchmark/tpch/cli.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -21,8 +21,8 @@ def tpch_group(dbgym_cfg: DBGymConfig):
 @tpch_group.command(name="data")
 @click.argument("scale-factor", type=float)
 @click.pass_obj
-# The reason generate-data is separate from create-pgdata is because generate-data is generic
-# to all DBMSs while create-pgdata is specific to Postgres.
+# The reason generate data is separate from create dbdata is because generate-data is generic
+# to all DBMSs while create dbdata is specific to a single DBMS.
 def tpch_data(dbgym_cfg: DBGymConfig, scale_factor: float):
     _clone(dbgym_cfg)
     _generate_data(dbgym_cfg, scale_factor)
```
dbms/postgres/build_repo.sh

Lines changed: 9 additions & 9 deletions
```diff
@@ -4,34 +4,34 @@ set -euxo pipefail
 
 REPO_REAL_PARENT_DPATH="$1"
 
-# download and make postgres from the boot repository
+# Download and make postgres from the boot repository.
 mkdir -p "${REPO_REAL_PARENT_DPATH}"
 cd "${REPO_REAL_PARENT_DPATH}"
-git clone git@github.com:lmwnshn/boot.git --single-branch --branch boot --depth 1
+git clone git@github.com:lmwnshn/boot.git --single-branch --branch vldb_2024 --depth 1
 cd ./boot
 ./cmudb/build/configure.sh release "${REPO_REAL_PARENT_DPATH}/boot/build/postgres"
 make clean
 make install-world-bin -j4
 
-# download and make bytejack
-cd ./cmudb/extension/bytejack_rs/
+# Download and make boot.
+cd ./cmudb/extension/boot_rs/
 cargo build --release
-cbindgen . -o target/bytejack_rs.h --lang c
+cbindgen . -o target/boot_rs.h --lang c
 cd "${REPO_REAL_PARENT_DPATH}/boot"
 
-cd ./cmudb/extension/bytejack/
+cd ./cmudb/extension/boot/
 make clean
 make install -j
 cd "${REPO_REAL_PARENT_DPATH}/boot"
 
-# download and make hypopg
+# Download and make hypopg.
 git clone git@github.com:HypoPG/hypopg.git
 cd ./hypopg
 PG_CONFIG="${REPO_REAL_PARENT_DPATH}/boot/build/postgres/bin/pg_config" make install
 cd "${REPO_REAL_PARENT_DPATH}/boot"
 
-# download and make pg_hint_plan
-# we need -L to follow links
+# Download and make pg_hint_plan.
+# We need -L to follow links.
 curl -L https://github.com/ossc-db/pg_hint_plan/archive/refs/tags/REL15_1_5_1.tar.gz -o REL15_1_5_1.tar.gz
 tar -xzf REL15_1_5_1.tar.gz
 rm REL15_1_5_1.tar.gz
```

dbms/postgres/cli.py

Lines changed: 58 additions & 59 deletions
```diff
@@ -1,5 +1,5 @@
 """
-At a high level, this file's goal is to (1) install+build postgres and (2) create pgdata.
+At a high level, this file's goal is to (1) build postgres and (2) create dbdata (aka pgdata).
 On the other hand, the goal of tune.protox.env.util.postgres is to provide helpers to manage
 a Postgres instance during agent tuning.
 util.pg provides helpers used by *both* of the above files (as well as other files).
@@ -10,11 +10,10 @@
 import subprocess
 from pathlib import Path
 import click
-import ssd_checker
 
 from benchmark.tpch.load_info import TpchLoadInfo
 from dbms.load_info_base_class import LoadInfoBaseClass
-from misc.utils import DBGymConfig, conv_inputpath_to_realabspath, link_result, open_and_save, save_file, get_pgdata_tgz_name, default_pgbin_path, WORKSPACE_PATH_PLACEHOLDER, default_pgdata_parent_dpath
+from misc.utils import DBGymConfig, conv_inputpath_to_realabspath, link_result, open_and_save, save_file, get_dbdata_tgz_name, default_pgbin_path, WORKSPACE_PATH_PLACEHOLDER, default_dbdata_parent_dpath, is_ssd
 from util.shell import subprocess_run
 from sqlalchemy import Connection
 from util.pg import SHARED_PRELOAD_LIBRARIES, conn_execute, sql_file_execute, DBGYM_POSTGRES_DBNAME, create_conn, DEFAULT_POSTGRES_PORT, DBGYM_POSTGRES_USER, DBGYM_POSTGRES_PASS, DEFAULT_POSTGRES_DBNAME
@@ -32,7 +31,7 @@ def postgres_group(dbgym_cfg: DBGymConfig):
 
 @postgres_group.command(
     name="build",
-    help="Download and build the Postgres repository and all necessary extensions/shared libraries. Does not create pgdata.",
+    help="Download and build the Postgres repository and all necessary extensions/shared libraries. Does not create dbdata.",
 )
 @click.pass_obj
 @click.option("--rebuild", is_flag=True, help="Include this flag to rebuild Postgres even if it already exists.")
```
```diff
@@ -41,46 +40,46 @@ def postgres_build(dbgym_cfg: DBGymConfig, rebuild: bool):
 
 
 @postgres_group.command(
-    name="pgdata",
-    help="Build a .tgz file of pgdata with various specifications for its contents.",
+    name="dbdata",
+    help="Build a .tgz file of dbdata with various specifications for its contents.",
 )
 @click.pass_obj
 @click.argument("benchmark_name", type=str)
 @click.option("--scale-factor", type=float, default=1)
 @click.option("--pgbin-path", type=Path, default=None, help=f"The path to the bin containing Postgres executables. The default is {default_pgbin_path(WORKSPACE_PATH_PLACEHOLDER)}.")
 @click.option(
-    "--intended-pgdata-hardware",
+    "--intended-dbdata-hardware",
     type=click.Choice(["hdd", "ssd"]),
     default="hdd",
-    help=f"The intended hardware pgdata should be on. Used as a sanity check for --pgdata-parent-dpath.",
+    help=f"The intended hardware dbdata should be on. Used as a sanity check for --dbdata-parent-dpath.",
 )
 @click.option(
-    "--pgdata-parent-dpath",
+    "--dbdata-parent-dpath",
     default=None,
     type=Path,
-    help=f"The path to the parent directory of the pgdata which will be actively tuned. The default is {default_pgdata_parent_dpath(WORKSPACE_PATH_PLACEHOLDER)}.",
+    help=f"The path to the parent directory of the dbdata which will be actively tuned. The default is {default_dbdata_parent_dpath(WORKSPACE_PATH_PLACEHOLDER)}.",
 )
-def postgres_pgdata(dbgym_cfg: DBGymConfig, benchmark_name: str, scale_factor: float, pgbin_path: Path, intended_pgdata_hardware: str, pgdata_parent_dpath: Path):
+def postgres_dbdata(dbgym_cfg: DBGymConfig, benchmark_name: str, scale_factor: float, pgbin_path: Path, intended_dbdata_hardware: str, dbdata_parent_dpath: Path):
     # Set args to defaults programmatically (do this before doing anything else in the function)
     if pgbin_path == None:
         pgbin_path = default_pgbin_path(dbgym_cfg.dbgym_workspace_path)
-    if pgdata_parent_dpath == None:
-        pgdata_parent_dpath = default_pgdata_parent_dpath(dbgym_cfg.dbgym_workspace_path)
+    if dbdata_parent_dpath == None:
+        dbdata_parent_dpath = default_dbdata_parent_dpath(dbgym_cfg.dbgym_workspace_path)
 
     # Convert all input paths to absolute paths
     pgbin_path = conv_inputpath_to_realabspath(dbgym_cfg, pgbin_path)
-    pgdata_parent_dpath = conv_inputpath_to_realabspath(dbgym_cfg, pgdata_parent_dpath)
+    dbdata_parent_dpath = conv_inputpath_to_realabspath(dbgym_cfg, dbdata_parent_dpath)
 
     # Check assertions on args
-    if intended_pgdata_hardware == "hdd":
-        assert not ssd_checker.is_ssd(pgdata_parent_dpath), f"Intended hardware is HDD but pgdata_parent_dpath ({pgdata_parent_dpath}) is an SSD"
-    elif intended_pgdata_hardware == "ssd":
-        assert ssd_checker.is_ssd(pgdata_parent_dpath), f"Intended hardware is SSD but pgdata_parent_dpath ({pgdata_parent_dpath}) is an HDD"
+    if intended_dbdata_hardware == "hdd":
+        assert not is_ssd(dbdata_parent_dpath), f"Intended hardware is HDD but dbdata_parent_dpath ({dbdata_parent_dpath}) is an SSD"
+    elif intended_dbdata_hardware == "ssd":
+        assert is_ssd(dbdata_parent_dpath), f"Intended hardware is SSD but dbdata_parent_dpath ({dbdata_parent_dpath}) is an HDD"
     else:
         assert False
 
-    # Create pgdata
-    _create_pgdata(dbgym_cfg, benchmark_name, scale_factor, pgbin_path, pgdata_parent_dpath)
+    # Create dbdata
+    _create_dbdata(dbgym_cfg, benchmark_name, scale_factor, pgbin_path, dbdata_parent_dpath)
 
 
 def _get_pgbin_symlink_path(dbgym_cfg: DBGymConfig) -> Path:
```
```diff
@@ -109,52 +108,52 @@ def _build_repo(dbgym_cfg: DBGymConfig, rebuild):
     dbms_postgres_logger.info(f"Set up repo in {expected_repo_symlink_dpath}")
 
 
-def _create_pgdata(dbgym_cfg: DBGymConfig, benchmark_name: str, scale_factor: float, pgbin_path: Path, pgdata_parent_dpath: Path) -> None:
+def _create_dbdata(dbgym_cfg: DBGymConfig, benchmark_name: str, scale_factor: float, pgbin_path: Path, dbdata_parent_dpath: Path) -> None:
     """
-    I chose *not* for this function to skip by default if pgdata_tgz_symlink_path already exists. This
+    I chose *not* for this function to skip by default if dbdata_tgz_symlink_path already exists. This
     is because, while the generated data is deterministic given benchmark_name and scale_factor, any
-    change in the _create_pgdata() function would result in a different pgdata. Since _create_pgdata()
+    change in the _create_dbdata() function would result in a different dbdata. Since _create_dbdata()
     may change somewhat frequently, I decided to get rid of the footgun of having changes to
-    _create_pgdata() not propagate to [pgdata].tgz by default.
+    _create_dbdata() not propagate to [dbdata].tgz by default.
     """
 
-    # It's ok for the pgdata/ directory to be temporary. It just matters that the .tgz is saved in a safe place.
-    pgdata_dpath = pgdata_parent_dpath / "pgdata_being_created"
-    # We might be reusing the same pgdata_parent_dpath, so delete pgdata_dpath if it already exists
-    if pgdata_dpath.exists():
-        shutil.rmtree(pgdata_dpath)
+    # It's ok for the dbdata/ directory to be temporary. It just matters that the .tgz is saved in a safe place.
+    dbdata_dpath = dbdata_parent_dpath / "dbdata_being_created"
+    # We might be reusing the same dbdata_parent_dpath, so delete dbdata_dpath if it already exists
+    if dbdata_dpath.exists():
+        shutil.rmtree(dbdata_dpath)
 
     # Call initdb.
     # Save any script we call from pgbin_symlink_dpath because they are dependencies generated from another task run.
     save_file(dbgym_cfg, pgbin_path / "initdb")
-    subprocess_run(f'./initdb -D "{pgdata_dpath}"', cwd=pgbin_path)
+    subprocess_run(f'./initdb -D "{dbdata_dpath}"', cwd=pgbin_path)
 
-    # Start Postgres (all other pgdata setup requires postgres to be started).
+    # Start Postgres (all other dbdata setup requires postgres to be started).
     # Note that subprocess_run() never returns when running "pg_ctl start", so I'm using subprocess.run() instead.
-    start_postgres(dbgym_cfg, pgbin_path, pgdata_dpath)
+    start_postgres(dbgym_cfg, pgbin_path, dbdata_dpath)
 
     # Set up Postgres.
-    _generic_pgdata_setup(dbgym_cfg)
-    _load_benchmark_into_pgdata(dbgym_cfg, benchmark_name, scale_factor)
+    _generic_dbdata_setup(dbgym_cfg)
+    _load_benchmark_into_dbdata(dbgym_cfg, benchmark_name, scale_factor)
 
     # Stop Postgres so that we don't "leak" processes.
-    stop_postgres(dbgym_cfg, pgbin_path, pgdata_dpath)
+    stop_postgres(dbgym_cfg, pgbin_path, dbdata_dpath)
 
     # Create .tgz file.
-    # Note that you can't pass "[pgdata].tgz" as an arg to cur_task_runs_data_path() because that would create "[pgdata].tgz" as a dir.
-    pgdata_tgz_real_fpath = dbgym_cfg.cur_task_runs_data_path(
+    # Note that you can't pass "[dbdata].tgz" as an arg to cur_task_runs_data_path() because that would create "[dbdata].tgz" as a dir.
+    dbdata_tgz_real_fpath = dbgym_cfg.cur_task_runs_data_path(
         mkdir=True
-    ) / get_pgdata_tgz_name(benchmark_name, scale_factor)
-    # We need to cd into pgdata_dpath so that the tar file does not contain folders for the whole path of pgdata_dpath.
-    subprocess_run(f"tar -czf {pgdata_tgz_real_fpath} .", cwd=pgdata_dpath)
+    ) / get_dbdata_tgz_name(benchmark_name, scale_factor)
+    # We need to cd into dbdata_dpath so that the tar file does not contain folders for the whole path of dbdata_dpath.
+    subprocess_run(f"tar -czf {dbdata_tgz_real_fpath} .", cwd=dbdata_dpath)
 
     # Create symlink.
-    # Only link at the end so that the link only ever points to a complete pgdata.
-    pgdata_tgz_symlink_path = link_result(dbgym_cfg, pgdata_tgz_real_fpath)
-    dbms_postgres_logger.info(f"Created pgdata in {pgdata_tgz_symlink_path}")
+    # Only link at the end so that the link only ever points to a complete dbdata.
+    dbdata_tgz_symlink_path = link_result(dbgym_cfg, dbdata_tgz_real_fpath)
+    dbms_postgres_logger.info(f"Created dbdata in {dbdata_tgz_symlink_path}")
 
 
-def _generic_pgdata_setup(dbgym_cfg: DBGymConfig):
+def _generic_dbdata_setup(dbgym_cfg: DBGymConfig):
     # get necessary vars
     pgbin_real_dpath = _get_pgbin_symlink_path(dbgym_cfg).resolve()
     assert pgbin_real_dpath.exists()
```
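The comment in the hunk above about cd'ing into the data directory before running `tar` captures a real portability detail: archiving with the working directory set to the data directory keeps archive member names relative, so extraction does not recreate the directory's absolute path. A stdlib-only sketch of the same idea (all names here are illustrative, not the project's code):

```python
import tarfile
import tempfile
from pathlib import Path


def archive_dir_contents(src_dpath: Path, tgz_fpath: Path) -> list[str]:
    """Archive the *contents* of src_dpath, like running `tar -czf out.tgz .`
    with cwd=src_dpath, so member names are relative rather than absolute."""
    with tarfile.open(tgz_fpath, "w:gz") as tar:
        # arcname="." plays the role of the cwd trick: members are named
        # "./..." instead of "/abs/path/to/src_dpath/...".
        tar.add(src_dpath, arcname=".")
    with tarfile.open(tgz_fpath, "r:gz") as tar:
        return tar.getnames()


# Demo: no member name contains the temporary directory's absolute path.
with tempfile.TemporaryDirectory() as tmp:
    data_dpath = Path(tmp) / "dbdata"
    data_dpath.mkdir()
    (data_dpath / "postgresql.conf").write_text("# toy config\n")
    names = archive_dir_contents(data_dpath, Path(tmp) / "dbdata.tgz")
    assert all(not n.startswith("/") for n in names)
```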
```diff
@@ -182,29 +181,29 @@ def _generic_pgdata_setup(dbgym_cfg: DBGymConfig):
         cwd=pgbin_real_dpath,
     )
 
-    # Create the dbgym database. since one pgdata dir maps to one benchmark, all benchmarks will use the same database
-    # as opposed to using databases named after the benchmark
+    # Create the dbgym database. Since one dbdata dir maps to one benchmark, all benchmarks will use the same database
+    # as opposed to using databases named after the benchmark.
     subprocess_run(
         f"./psql -c \"create database {DBGYM_POSTGRES_DBNAME} with owner = '{dbgym_pguser}'\" {DEFAULT_POSTGRES_DBNAME} -p {pgport} -h localhost",
         cwd=pgbin_real_dpath,
     )
 
 
-def _load_benchmark_into_pgdata(
+def _load_benchmark_into_dbdata(
     dbgym_cfg: DBGymConfig, benchmark_name: str, scale_factor: float
 ):
     with create_conn(use_psycopg=False) as conn:
         if benchmark_name == "tpch":
             load_info = TpchLoadInfo(dbgym_cfg, scale_factor)
         else:
             raise AssertionError(
-                f"_load_benchmark_into_pgdata(): the benchmark of name {benchmark_name} is not implemented"
+                f"_load_benchmark_into_dbdata(): the benchmark of name {benchmark_name} is not implemented"
             )
 
-        _load_into_pgdata(dbgym_cfg, conn, load_info)
+        _load_into_dbdata(dbgym_cfg, conn, load_info)
 
 
-def _load_into_pgdata(dbgym_cfg: DBGymConfig, conn: Connection, load_info: LoadInfoBaseClass):
+def _load_into_dbdata(dbgym_cfg: DBGymConfig, conn: Connection, load_info: LoadInfoBaseClass):
     sql_file_execute(dbgym_cfg, conn, load_info.get_schema_fpath())
 
     # truncate all tables first before even loading a single one
```
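The load flow visible in this hunk (schema first, then truncate, then bulk load, then constraints) is a common pattern for idempotent benchmark loading. As an illustration only, here is the same flow sketched against SQLite as a stand-in for the Postgres connection; the table name and rows are made up:

```python
import sqlite3


def load_benchmark(conn: sqlite3.Connection, rows: list[tuple[int, str]]) -> int:
    """Illustrative schema -> truncate -> load -> constraints flow."""
    conn.execute("CREATE TABLE IF NOT EXISTS lineitem (id INTEGER, comment TEXT)")
    # Truncate before loading so reloading is idempotent. (SQLite has no
    # TRUNCATE statement; DELETE without a WHERE clause is the equivalent.)
    conn.execute("DELETE FROM lineitem")
    conn.executemany("INSERT INTO lineitem VALUES (?, ?)", rows)
    # Constraints/indexes come last so the bulk load stays fast.
    conn.execute("CREATE INDEX IF NOT EXISTS lineitem_id_idx ON lineitem (id)")
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM lineitem").fetchone()[0]
```

Because of the truncate step, loading twice never duplicates rows: a second call replaces the table's contents rather than appending to them.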
```diff
@@ -223,29 +222,29 @@ def _load_into_pgdata(dbgym_cfg: DBGymConfig, conn: Connection, load_info: LoadI
     sql_file_execute(dbgym_cfg, conn, constraints_fpath)
 
 
-def start_postgres(dbgym_cfg: DBGymConfig, pgbin_path: Path, pgdata_dpath: Path) -> None:
-    _start_or_stop_postgres(dbgym_cfg, pgbin_path, pgdata_dpath, True)
+def start_postgres(dbgym_cfg: DBGymConfig, pgbin_path: Path, dbdata_dpath: Path) -> None:
+    _start_or_stop_postgres(dbgym_cfg, pgbin_path, dbdata_dpath, True)
 
 
-def stop_postgres(dbgym_cfg: DBGymConfig, pgbin_path: Path, pgdata_dpath: Path) -> None:
-    _start_or_stop_postgres(dbgym_cfg, pgbin_path, pgdata_dpath, False)
+def stop_postgres(dbgym_cfg: DBGymConfig, pgbin_path: Path, dbdata_dpath: Path) -> None:
+    _start_or_stop_postgres(dbgym_cfg, pgbin_path, dbdata_dpath, False)
 
 
-def _start_or_stop_postgres(dbgym_cfg: DBGymConfig, pgbin_path: Path, pgdata_dpath: Path, is_start: bool) -> None:
+def _start_or_stop_postgres(dbgym_cfg: DBGymConfig, pgbin_path: Path, dbdata_dpath: Path, is_start: bool) -> None:
     # They should be absolute paths and should exist
     assert pgbin_path.is_absolute() and pgbin_path.exists()
-    assert pgdata_dpath.is_absolute() and pgdata_dpath.exists()
+    assert dbdata_dpath.is_absolute() and dbdata_dpath.exists()
     # The inputs may be symlinks so we need to resolve them first
     pgbin_real_dpath = pgbin_path.resolve()
-    pgdata_dpath = pgdata_dpath.resolve()
+    dbdata_dpath = dbdata_dpath.resolve()
     pgport = DEFAULT_POSTGRES_PORT
     save_file(dbgym_cfg, pgbin_real_dpath / "pg_ctl")
 
     if is_start:
         # We use subprocess.run() because subprocess_run() never returns when running "pg_ctl start".
         # The reason subprocess_run() never returns is because pg_ctl spawns a postgres process so .poll() always returns None.
         # On the other hand, subprocess.run() does return normally, like calling `./pg_ctl` on the command line would do.
-        result = subprocess.run(f"./pg_ctl -D \"{pgdata_dpath}\" -o '-p {pgport}' start", cwd=pgbin_real_dpath, shell=True)
+        result = subprocess.run(f"./pg_ctl -D \"{dbdata_dpath}\" -o '-p {pgport}' start", cwd=pgbin_real_dpath, shell=True)
         result.check_returncode()
     else:
-        subprocess_run(f"./pg_ctl -D \"{pgdata_dpath}\" -o '-p {pgport}' stop", cwd=pgbin_real_dpath)
+        subprocess_run(f"./pg_ctl -D \"{dbdata_dpath}\" -o '-p {pgport}' stop", cwd=pgbin_real_dpath)
```
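The comments in this hunk about `pg_ctl` describe a general property of launcher processes: `pg_ctl start` exits quickly after daemonizing the server, so `subprocess.run()` (which waits only for the launcher) returns promptly while the spawned server lives on. A small stand-alone demonstration of that behavior, using `sh` with a backgrounded `sleep` in place of `pg_ctl` and the server:

```python
import subprocess
import time

# `sh -c '... &'` mimics pg_ctl: the launcher exits immediately even though
# the background child (standing in for the postgres server) keeps running.
# The child's stdout/stderr are redirected; otherwise it would hold the
# captured pipe open and subprocess.run() would block until it exited too.
start = time.monotonic()
result = subprocess.run(
    ["sh", "-c", "sleep 3 >/dev/null 2>&1 & echo launcher done"],
    capture_output=True,
    text=True,
)
elapsed = time.monotonic() - start

result.check_returncode()
assert "launcher done" in result.stdout
assert elapsed < 2  # run() waited only for the launcher, not the sleeping child
```

This also clarifies why a helper that polls the launched process for completion (as the `.poll()` comment in the diff describes for `subprocess_run()`) would behave differently: whether a call "returns" depends on which process you wait on, not on whether the daemon is still running.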
