
Commit 9e652d1

Redesigned logging (#40)
**Summary**: Redesigned logging across the whole project. Also fixed some bugs to pass the CI.

**Demo**: ![Screenshot 2024-10-20 at 12 37 33](https://github.com/user-attachments/assets/6052f7e0-74cf-4593-be56-c57b7a636b49)

The log files are archived in `run_*/dbgym/artifacts/`. There are logs from dbgym and from third-party libraries. You can see that all logs, even info logs, are captured in `dbgym.log`.

**Logging Design**:
* I was having difficulty debugging some bugs because the console was too cluttered. This motivated me to redesign logging.
* I removed all class-name loggers. All of these loggers behaved the same (at least from what I could tell), so it's simpler to have everything use the same logger.
* We use the loggers `dbgym`, `dbgym.output`, `dbgym.replay`, and the root logger.
* `dbgym` is the "base" logger and should be used most of the time. It outputs errors to the console and all logs to the file `run_*/dbgym/artifacts/dbgym.log`.
* `dbgym.output` is used when you actually want to output something to show the user. It outputs the message straight to the console without any extra metadata. As a child of `dbgym`, anything logged here is also propagated to `dbgym` and thus archived in `dbgym.log`.
* `dbgym.replay` is specific to Proto-X and is where Proto-X stores log information relevant only to replay. By making it its own logger, we insulate it from any changes to the main logging system.
* The root logger is used to help debug unit tests. Unit tests are isolated from the main logging system for simplicity. See `test_clean.py` for an example of this.
* Certain third-party loggers, like `ray`, are redirected to a file to reduce console clutter.
* I kept the Ray dashboard in the console, though, because it's pretty useful.
* `print()` is reserved for actual debugging.
* I also redirected `warnings` to a separate file to further reduce clutter.
* I added special handling to eliminate the warnings that show up every time tensorflow is imported (see `task.py` for an example of this).

**Other Details**:
* Upgraded nccl to version 2.20.* in requirements.txt to fix an import error.
* Embedding datagen was not working. I added additional unit tests to help me debug this.
* Made workload_tests.py more robust by checking fields other than the class mapping. This is done by saving reference `Workload` and `IndexSpace` objects as `pkl` files.
* Verified that replay still works (since it relies on log files).
1 parent ac849f8 commit 9e652d1
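The commit message above describes the logger hierarchy, but the `util/log.py` module that defines it is not among the diffs excerpted below. The following is a minimal, hypothetical sketch of how such a setup could be wired, based only on the behavior described in the message: the constants `DBGYM_LOGGER_NAME` and `DBGYM_OUTPUT_LOGGER_NAME` do appear in the real diffs, while the `set_up_loggers` helper, the handler configuration, and the `warnings.log`/`ray.log` file names are illustrative assumptions rather than the commit's actual implementation.

```python
# Hypothetical sketch (not the commit's actual util/log.py).
import logging
from pathlib import Path

DBGYM_LOGGER_NAME = "dbgym"
DBGYM_OUTPUT_LOGGER_NAME = "dbgym.output"
DBGYM_REPLAY_LOGGER_NAME = "dbgym.replay"


def set_up_loggers(artifacts_dpath: Path) -> None:
    # "dbgym" is the base logger: every record is archived in dbgym.log, but
    # only errors reach the console.
    dbgym_logger = logging.getLogger(DBGYM_LOGGER_NAME)
    dbgym_logger.setLevel(logging.DEBUG)

    file_handler = logging.FileHandler(artifacts_dpath / "dbgym.log")
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    dbgym_logger.addHandler(file_handler)

    console_error_handler = logging.StreamHandler()
    console_error_handler.setLevel(logging.ERROR)
    dbgym_logger.addHandler(console_error_handler)

    # "dbgym.output" prints user-facing messages verbatim to the console. As a
    # child of "dbgym", its records also propagate upward into dbgym.log.
    output_console_handler = logging.StreamHandler()
    output_console_handler.setFormatter(logging.Formatter("%(message)s"))
    logging.getLogger(DBGYM_OUTPUT_LOGGER_NAME).addHandler(output_console_handler)

    # Redirect Python warnings and noisy third-party loggers (e.g. ray) to
    # files instead of the console. The file names here are assumptions.
    logging.captureWarnings(True)
    logging.getLogger("py.warnings").addHandler(
        logging.FileHandler(artifacts_dpath / "warnings.log")
    )
    ray_logger = logging.getLogger("ray")
    ray_logger.addHandler(logging.FileHandler(artifacts_dpath / "ray.log"))
    ray_logger.propagate = False
```

With this layout, modules call `logging.getLogger(DBGYM_LOGGER_NAME).info(...)` for archived progress messages and `logging.getLogger(DBGYM_OUTPUT_LOGGER_NAME).info(...)` when the text should actually be shown to the user, which is the pattern visible in the diffs below.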

File tree: 178 files changed, +679 / -1689 lines changed


.github/workflows/tests_ci.yml

Lines changed: 2 additions & 2 deletions
@@ -38,7 +38,7 @@ jobs:
      - name: Static type checking
        run: |
-          mypy --config-file scripts/mypy.ini .
+          ./scripts/mypy.sh

      - name: Run unit tests
        run: |
@@ -47,7 +47,7 @@ jobs:
      - name: Run integration tests
        # Delete the workspace. Run once with a clean workspace. Run again from the existing workspace.
-        # Need to run with a non-root user in order to start Postgres.
+        # Note that we need to run with a non-root user in order to start Postgres.
        run: |
          . "$HOME/.cargo/env"
          rm -rf ../dbgym_integtest_workspace

benchmark/tpch/cli.py

Lines changed: 22 additions & 15 deletions
@@ -9,11 +9,9 @@
    link_result,
    workload_name_fn,
)
+from util.log import DBGYM_LOGGER_NAME
from util.shell import subprocess_run

-benchmark_tpch_logger = logging.getLogger("benchmark/tpch")
-benchmark_tpch_logger.setLevel(logging.INFO)
-

@click.group(name="tpch")
@click.pass_obj
@@ -75,17 +73,19 @@ def _clone(dbgym_cfg: DBGymConfig) -> None:
        dbgym_cfg.cur_symlinks_build_path(mkdir=True) / "tpch-kit.link"
    )
    if expected_symlink_dpath.exists():
-        benchmark_tpch_logger.info(f"Skipping clone: {expected_symlink_dpath}")
+        logging.getLogger(DBGYM_LOGGER_NAME).info(
+            f"Skipping clone: {expected_symlink_dpath}"
+        )
        return

-    benchmark_tpch_logger.info(f"Cloning: {expected_symlink_dpath}")
+    logging.getLogger(DBGYM_LOGGER_NAME).info(f"Cloning: {expected_symlink_dpath}")
    real_build_path = dbgym_cfg.cur_task_runs_build_path()
    subprocess_run(
        f"./tpch_setup.sh {real_build_path}", cwd=dbgym_cfg.cur_source_path()
    )
    symlink_dpath = link_result(dbgym_cfg, real_build_path / "tpch-kit")
    assert expected_symlink_dpath.samefile(symlink_dpath)
-    benchmark_tpch_logger.info(f"Cloned: {expected_symlink_dpath}")
+    logging.getLogger(DBGYM_LOGGER_NAME).info(f"Cloned: {expected_symlink_dpath}")


def _get_tpch_kit_dpath(dbgym_cfg: DBGymConfig) -> Path:
@@ -103,7 +103,7 @@ def _generate_queries(
) -> None:
    tpch_kit_dpath = _get_tpch_kit_dpath(dbgym_cfg)
    data_path = dbgym_cfg.cur_symlinks_data_path(mkdir=True)
-    benchmark_tpch_logger.info(
+    logging.getLogger(DBGYM_LOGGER_NAME).info(
        f"Generating queries: {data_path} [{seed_start}, {seed_end}]"
    )
    for seed in range(seed_start, seed_end + 1):
@@ -125,7 +125,7 @@ def _generate_queries(
        )
        queries_symlink_dpath = link_result(dbgym_cfg, real_dir)
        assert queries_symlink_dpath.samefile(expected_queries_symlink_dpath)
-    benchmark_tpch_logger.info(
+    logging.getLogger(DBGYM_LOGGER_NAME).info(
        f"Generated queries: {data_path} [{seed_start}, {seed_end}]"
    )
@@ -137,12 +137,14 @@ def _generate_data(dbgym_cfg: DBGymConfig, scale_factor: float) -> None:
        data_path / f"tables_sf{get_scale_factor_string(scale_factor)}.link"
    )
    if expected_tables_symlink_dpath.exists():
-        benchmark_tpch_logger.info(
+        logging.getLogger(DBGYM_LOGGER_NAME).info(
            f"Skipping generation: {expected_tables_symlink_dpath}"
        )
        return

-    benchmark_tpch_logger.info(f"Generating: {expected_tables_symlink_dpath}")
+    logging.getLogger(DBGYM_LOGGER_NAME).info(
+        f"Generating: {expected_tables_symlink_dpath}"
+    )
    subprocess_run(f"./dbgen -vf -s {scale_factor}", cwd=tpch_kit_dpath / "dbgen")
    real_dir = dbgym_cfg.cur_task_runs_data_path(
        f"tables_sf{get_scale_factor_string(scale_factor)}", mkdir=True
@@ -151,7 +153,9 @@ def _generate_data(dbgym_cfg: DBGymConfig, scale_factor: float) -> None:

    tables_symlink_dpath = link_result(dbgym_cfg, real_dir)
    assert tables_symlink_dpath.samefile(expected_tables_symlink_dpath)
-    benchmark_tpch_logger.info(f"Generated: {expected_tables_symlink_dpath}")
+    logging.getLogger(DBGYM_LOGGER_NAME).info(
+        f"Generated: {expected_tables_symlink_dpath}"
+    )


def _generate_workload(
@@ -165,7 +169,9 @@ def _generate_workload(
    workload_name = workload_name_fn(scale_factor, seed_start, seed_end, query_subset)
    expected_workload_symlink_dpath = symlink_data_dpath / (workload_name + ".link")

-    benchmark_tpch_logger.info(f"Generating: {expected_workload_symlink_dpath}")
+    logging.getLogger(DBGYM_LOGGER_NAME).info(
+        f"Generating: {expected_workload_symlink_dpath}"
+    )
    real_dpath = dbgym_cfg.cur_task_runs_data_path(workload_name, mkdir=True)

    queries = None
@@ -190,10 +196,11 @@ def _generate_workload(
                and not sql_fpath.is_symlink()
                and sql_fpath.is_absolute()
            ), "We should only write existent real absolute paths to a file"
-            output = ",".join([f"S{seed}-Q{qnum}", str(sql_fpath)])
-            print(output, file=f)
+            f.write(f"S{seed}-Q{qnum},{sql_fpath}\n")
    # TODO(WAN): add option to deep-copy the workload.

    workload_symlink_dpath = link_result(dbgym_cfg, real_dpath)
    assert workload_symlink_dpath == expected_workload_symlink_dpath
-    benchmark_tpch_logger.info(f"Generated: {expected_workload_symlink_dpath}")
+    logging.getLogger(DBGYM_LOGGER_NAME).info(
+        f"Generated: {expected_workload_symlink_dpath}"
+    )

dbms/postgres/cli.py

Lines changed: 11 additions & 7 deletions
@@ -29,6 +29,7 @@
    open_and_save,
    save_file,
)
+from util.log import DBGYM_LOGGER_NAME
from util.pg import (
    DBGYM_POSTGRES_DBNAME,
    DBGYM_POSTGRES_PASS,
@@ -42,9 +43,6 @@
)
from util.shell import subprocess_run

-dbms_postgres_logger = logging.getLogger("dbms/postgres")
-dbms_postgres_logger.setLevel(logging.INFO)
-

@click.group(name="postgres")
@click.pass_obj
@@ -142,12 +140,14 @@ def _get_repo_symlink_path(dbgym_cfg: DBGymConfig) -> Path:
def _build_repo(dbgym_cfg: DBGymConfig, rebuild: bool) -> None:
    expected_repo_symlink_dpath = _get_repo_symlink_path(dbgym_cfg)
    if not rebuild and expected_repo_symlink_dpath.exists():
-        dbms_postgres_logger.info(
+        logging.getLogger(DBGYM_LOGGER_NAME).info(
            f"Skipping _build_repo: {expected_repo_symlink_dpath}"
        )
        return

-    dbms_postgres_logger.info(f"Setting up repo in {expected_repo_symlink_dpath}")
+    logging.getLogger(DBGYM_LOGGER_NAME).info(
+        f"Setting up repo in {expected_repo_symlink_dpath}"
+    )
    repo_real_dpath = dbgym_cfg.cur_task_runs_build_path("repo", mkdir=True)
    subprocess_run(
        f"./build_repo.sh {repo_real_dpath}", cwd=dbgym_cfg.cur_source_path()
@@ -156,7 +156,9 @@ def _build_repo(dbgym_cfg: DBGymConfig, rebuild: bool) -> None:
    # only link at the end so that the link only ever points to a complete repo
    repo_symlink_dpath = link_result(dbgym_cfg, repo_real_dpath)
    assert expected_repo_symlink_dpath.samefile(repo_symlink_dpath)
-    dbms_postgres_logger.info(f"Set up repo in {expected_repo_symlink_dpath}")
+    logging.getLogger(DBGYM_LOGGER_NAME).info(
+        f"Set up repo in {expected_repo_symlink_dpath}"
+    )


def _create_dbdata(
@@ -207,7 +209,9 @@ def _create_dbdata(
    # Create symlink.
    # Only link at the end so that the link only ever points to a complete dbdata.
    dbdata_tgz_symlink_path = link_result(dbgym_cfg, dbdata_tgz_real_fpath)
-    dbms_postgres_logger.info(f"Created dbdata in {dbdata_tgz_symlink_path}")
+    logging.getLogger(DBGYM_LOGGER_NAME).info(
+        f"Created dbdata in {dbdata_tgz_symlink_path}"
+    )


def _generic_dbdata_setup(dbgym_cfg: DBGymConfig) -> None:

dependencies/requirements.txt

Lines changed: 5 additions & 1 deletion
@@ -1,5 +1,6 @@
absl-py==2.1.0
aiosignal==1.3.1
+astroid==3.2.4
astunparse==1.6.3
async-timeout==4.0.3
attrs==23.2.0
@@ -11,6 +12,7 @@ click==8.1.7
cloudpickle==3.0.0
cmake==3.28.1
cramjam==2.8.1
+dill==0.3.8
distlib==0.3.8
faiss-gpu==1.7.2
Farama-Notifications==0.0.4
@@ -42,6 +44,7 @@ libclang==16.0.6
lit==17.0.6
Markdown==3.5.2
MarkupSafe==2.1.4
+mccabe==0.7.0
ml-dtypes==0.2.0
mpmath==1.3.0
msgpack==1.0.7
@@ -67,7 +70,7 @@ nvidia-cusolver-cu11==11.4.0.1
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu11==11.7.4.91
nvidia-cusparse-cu12==12.1.0.106
-nvidia-nccl-cu11==2.14.3
+nvidia-nccl-cu11==2.20.5
nvidia-nccl-cu12==2.20.5
nvidia-nvjitlink-cu12==12.3.101
nvidia-nvtx-cu11==11.7.91
@@ -116,6 +119,7 @@ tensorflow-io-gcs-filesystem==0.36.0
termcolor==2.4.0
threadpoolctl==3.2.0
tomli==2.0.1
+tomlkit==0.13.2
torch==2.4.0
tqdm==4.66.1
triton==3.0.0

experiments/protox_tpch_sf1/main.sh

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+#!/bin/bash
+
+set -euxo pipefail
+
+SCALE_FACTOR=1
+INTENDED_DBDATA_HARDWARE=ssd
+. ./experiments/load_per_machine_envvars.sh
+
+# space for testing. uncomment this to run individual commands from the script (copy pasting is harder because there are envvars)
+python3 task.py tune protox agent hpo tpch --scale-factor $SCALE_FACTOR --max-concurrent 4 --tune-duration-during-hpo 1 --intended-dbdata-hardware $INTENDED_DBDATA_HARDWARE --dbdata-parent-dpath $DBDATA_PARENT_DPATH --build-space-good-for-boot
+exit 0
+
+# benchmark
+python3 task.py benchmark tpch data $SCALE_FACTOR
+python3 task.py benchmark tpch workload --scale-factor $SCALE_FACTOR
+
+# postgres
+python3 task.py dbms postgres build
+python3 task.py dbms postgres dbdata tpch --scale-factor $SCALE_FACTOR --intended-dbdata-hardware $INTENDED_DBDATA_HARDWARE --dbdata-parent-dpath $DBDATA_PARENT_DPATH
+
+# embedding
+python3 task.py tune protox embedding datagen tpch --scale-factor $SCALE_FACTOR --override-sample-limits "lineitem,32768" --intended-dbdata-hardware $INTENDED_DBDATA_HARDWARE --dbdata-parent-dpath $DBDATA_PARENT_DPATH
+python3 task.py tune protox embedding train tpch --scale-factor $SCALE_FACTOR --train-max-concurrent 10
+
+# agent
+python3 task.py tune protox agent hpo tpch --scale-factor $SCALE_FACTOR --max-concurrent 4 --tune-duration-during-hpo 4 --intended-dbdata-hardware $INTENDED_DBDATA_HARDWARE --dbdata-parent-dpath $DBDATA_PARENT_DPATH --build-space-good-for-boot
+python3 task.py tune protox agent tune tpch --scale-factor $SCALE_FACTOR

manage/cli.py

Lines changed: 4 additions & 6 deletions
@@ -13,9 +13,7 @@
    is_child_path,
    parent_dpath_of_path,
)
-
-task_logger = logging.getLogger("task")
-task_logger.setLevel(logging.INFO)
+from util.log import DBGYM_LOGGER_NAME, DBGYM_OUTPUT_LOGGER_NAME


# This is used in test_clean.py. It's defined here to avoid a circular import.
@@ -49,7 +47,7 @@ def manage_clean(dbgym_cfg: DBGymConfig, mode: str) -> None:
@click.pass_obj
def manage_count(dbgym_cfg: DBGymConfig) -> None:
    num_files = _count_files_in_workspace(dbgym_cfg)
-    print(
+    logging.getLogger(DBGYM_OUTPUT_LOGGER_NAME).info(
        f"The workspace ({dbgym_cfg.dbgym_workspace_path}) has {num_files} total files/dirs/symlinks."
    )
@@ -184,10 +182,10 @@ def clean_workspace(
    ending_num_files = _count_files_in_workspace(dbgym_cfg)

    if verbose:
-        task_logger.info(
+        logging.getLogger(DBGYM_LOGGER_NAME).info(
            f"Removed {starting_num_files - ending_num_files} out of {starting_num_files} files"
        )
-        task_logger.info(
+        logging.getLogger(DBGYM_LOGGER_NAME).info(
            f"Workspace went from {starting_num_files - ending_num_files} to {starting_num_files}"
        )
manage/tests/test_clean.py

Lines changed: 4 additions & 7 deletions
@@ -11,19 +11,16 @@

# This is here instead of on `if __name__ == "__main__"` because we often run individual tests, which
# does not go through the `if __name__ == "__main__"` codepath.
-# Make it DEBUG to see logs from verify_structure(). Make it INFO to not see logs.
-logging.basicConfig(level=logging.INFO)
+# Make it DEBUG to see logs from verify_structure(). Make it CRITICAL to not see any logs.
+# We use the root logger for unit tests to keep it separate from the standard logging subsystem which
+# uses the dbgym.* loggers.
+logging.basicConfig(level=logging.CRITICAL)


FilesystemStructure = NewType("FilesystemStructure", dict[str, Any])


class CleanTests(unittest.TestCase):
-    """
-    I deemed "clean" important enough to write extensive unit tests for because a bug could lead to
-    losing important files.
-    """
-
    scratchspace_path: Path = Path()

    @staticmethod
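For context, the pattern this test module relies on can be illustrated with a small, hypothetical example (not part of the commit): tests log through the root logger, so the single `logging.basicConfig(...)` call above controls their verbosity, independently of the `dbgym.*` loggers used by the rest of the project.

```python
# Hypothetical usage sketch, not part of the commit.
import logging
import unittest


class ExampleRootLoggerTest(unittest.TestCase):
    def test_logs_through_root_logger(self) -> None:
        # Shown if basicConfig(level=logging.DEBUG); hidden at CRITICAL.
        logging.debug("verifying filesystem structure ...")
        self.assertTrue(True)
```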

misc/utils.py

Lines changed: 5 additions & 3 deletions
@@ -1,3 +1,4 @@
+import logging
import os
import shutil
import subprocess
@@ -9,6 +10,7 @@
import redis
import yaml

+from util.log import DBGYM_LOGGER_NAME
from util.shell import subprocess_run

# Enums
@@ -107,8 +109,8 @@ def get_dbdata_tgz_name(benchmark_name: str, scale_factor: float | str) -> str:
# - If a name already has the workload_name, I omit scale factor. This is because the workload_name includes the scale factor
# - By convention, symlinks should end with ".link". The bug that motivated this decision involved replaying a tuning run. When
#   replaying a tuning run, you read the tuning_steps/ folder of the tuning run. Earlier, I created a symlink to that tuning_steps/
-#   folder called run_*/dbgym_agent_protox_tune/tuning_steps. However, replay itself generates an output.log file, which goes in
-#   run_*/dbgym_agent_protox_tune/tuning_steps/. The bug was that my replay function was overwriting the output.log file of the
+#   folder called run_*/dbgym_agent_protox_tune/tuning_steps. However, replay itself generates a replay_info.log file, which goes in
+#   run_*/dbgym_agent_protox_tune/tuning_steps/. The bug was that my replay function was overwriting the replay_info.log file of the
#   tuning run. By naming all symlinks "*.link", we avoid the possibility of subtle bugs like this happening.
default_traindata_path: Callable[[Path, str, str], Path] = (
    lambda workspace_path, benchmark_name, workload_name: get_symlinks_path_from_workspace_path(
@@ -674,5 +676,5 @@ def is_ssd(path: Path) -> bool:
            return is_ssd
        return False
    except Exception as e:
-        print(f"An error occurred: {e}")
+        logging.getLogger(DBGYM_LOGGER_NAME).error(f"An error occurred: {e}")
        return False

scripts/mypy.sh

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+#!/bin/bash
+mypy --config-file scripts/mypy.ini .

scripts/pat_test.sh

Lines changed: 1 addition & 5 deletions
@@ -7,9 +7,7 @@ INTENDED_DBDATA_HARDWARE=ssd
. ./experiments/load_per_machine_envvars.sh

# space for testing. uncomment this to run individual commands from the script (copy pasting is harder because there are envvars)
-python3 task.py tune protox agent hpo tpch --scale-factor $SCALE_FACTOR --num-samples 2 --max-concurrent 2 --workload-timeout 15 --query-timeout 1 --tune-duration-during-hpo 0.01 --intended-dbdata-hardware $INTENDED_DBDATA_HARDWARE --dbdata-parent-dpath $DBDATA_PARENT_DPATH --build-space-good-for-boot
-python3 task.py tune protox agent tune tpch --scale-factor $SCALE_FACTOR --tune-duration-during-tune 0.02
-python3 task.py tune protox agent replay tpch --scale-factor $SCALE_FACTOR
+python3 task.py tune protox embedding train tpch --scale-factor $SCALE_FACTOR --iterations-per-epoch 1 --num-points-to-sample 1 --num-batches 1 --batch-size 64 --start-epoch 15 --num-samples 4 --train-max-concurrent 4 --num-curate 2
exit 0

# benchmark
@@ -20,8 +18,6 @@ python3 task.py benchmark tpch workload --scale-factor $SCALE_FACTOR
python3 task.py dbms postgres build
python3 task.py dbms postgres dbdata tpch --scale-factor $SCALE_FACTOR --intended-dbdata-hardware $INTENDED_DBDATA_HARDWARE --dbdata-parent-dpath $DBDATA_PARENT_DPATH

-exit 0
-
# embedding
# python3 task.py tune protox embedding datagen tpch --scale-factor $SCALE_FACTOR --default-sample-limit 64 --file-limit 64 --intended-dbdata-hardware $INTENDED_DBDATA_HARDWARE --dbdata-parent-dpath $DBDATA_PARENT_DPATH # short datagen for testing
python3 task.py tune protox embedding datagen tpch --scale-factor $SCALE_FACTOR --override-sample-limits "lineitem,32768" --intended-dbdata-hardware $INTENDED_DBDATA_HARDWARE --dbdata-parent-dpath $DBDATA_PARENT_DPATH # long datagen so that train doesn't crash

scripts/read_parquet.py

Lines changed: 9 additions & 6 deletions
@@ -1,21 +1,24 @@
+import logging
import sys
from pathlib import Path

import pandas as pd

+from util.log import DBGYM_OUTPUT_LOGGER_NAME

-def read_and_print_parquet(file_path: Path) -> None:
+
+def read_and_output_parquet(file_path: Path) -> None:
    # Read the Parquet file into a DataFrame
    df = pd.read_parquet(file_path)

-    # Print the DataFrame
-    print("DataFrame:")
-    print(df)
+    # Output the DataFrame
+    logging.getLogger(DBGYM_OUTPUT_LOGGER_NAME).info("DataFrame:")
+    logging.getLogger(DBGYM_OUTPUT_LOGGER_NAME).info(df)


if __name__ == "__main__":
    # Specify the path to the Parquet file
    parquet_file_path = Path(sys.argv[0])

-    # Call the function to read and print the Parquet file
-    read_and_print_parquet(parquet_file_path)
+    # Call the function to read and output the Parquet file
+    read_and_output_parquet(parquet_file_path)
