feat(integration-tests): Add infrastructure for running integration tests; Add basic integration tests for clp & clp-s compression. #1100
Merged: Bill-hbrhbr merged 121 commits into y-scope:main from Bill-hbrhbr:integration-tests-boilerplate on Sep 15, 2025
Changes from 11 commits
Commits
6f76462 Add pycache to gitignore (Bill-hbrhbr)
6fa551a Preemptively update docs (Bill-hbrhbr)
01fba04 Add project setup (Bill-hbrhbr)
4ac0005 Add integration tests to python linting (Bill-hbrhbr)
6f76924 Add sample tests and task workflow to run them (Bill-hbrhbr)
6c6701d Big update (Bill-hbrhbr)
c659eb4 Add clp-s test code (Bill-hbrhbr)
dc62fd4 Package restructure (Bill-hbrhbr)
ded7d4c Complete clp-s testing (with bug) (Bill-hbrhbr)
6880a0f Make clp-s test workable with keys and rows sorting (Bill-hbrhbr)
c24cdaf Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
9587f99 Address some review comments (Bill-hbrhbr)
6d73f31 Address more review comments (Bill-hbrhbr)
09fc85c turn download and extract fixture into a private helper function (Bill-hbrhbr)
6216e68 UNcomment larget datasets (Bill-hbrhbr)
3a6753b Apply suggestions from code review (Bill-hbrhbr)
ac5a9d8 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
734ce73 Move download dir and other attributes inside dataset_logs fixture (Bill-hbrhbr)
777a0d3 Make json compare into a helper function (Bill-hbrhbr)
5dbe1fc Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
97d881c Add package basic validity check (Bill-hbrhbr)
55a4b21 remove dup class def (Bill-hbrhbr)
4aadaae Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
3acbf02 Address review comments (Bill-hbrhbr)
be64383 remove unnecessary class param customizations (Bill-hbrhbr)
6f83578 Add back missing dataset tests (Bill-hbrhbr)
cb0d282 Update dev utils (Bill-hbrhbr)
57cf51b Rename fixtures and fields according to offline discussions (Bill-hbrhbr)
bdbe7d2 Address most review comments (Bill-hbrhbr)
9c92e93 Furthur renaming from package_config to test_config (Bill-hbrhbr)
0857b49 Use __post_init__ to improve dataclass design (Bill-hbrhbr)
c035d9d Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
e747bfc Update integration-tests/tests/test_identity_transformation.py (Bill-hbrhbr)
1b10e01 Rename to avoid classes starting with Test (Bill-hbrhbr)
61909bd Address review comment (Bill-hbrhbr)
54b26a9 Uncomment tests (Bill-hbrhbr)
2922b47 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
14b0141 Split out package config (Bill-hbrhbr)
709d9c8 Apply suggestions from code review (Bill-hbrhbr)
1a33927 Use singular term for name (Bill-hbrhbr)
b047457 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
f0c5a8f abbreviate var (Bill-hbrhbr)
9ca7500 Revert test_log_name singular noun change (Bill-hbrhbr)
28119ea Apply suggestions from code review (Bill-hbrhbr)
0cedb12 Lint fix (Bill-hbrhbr)
3508514 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
0378bed Address most review comments (Bill-hbrhbr)
64dbb72 Test using CLP core bins instead of package (Bill-hbrhbr)
5ae418c Rename tasks (Bill-hbrhbr)
812450c Add uv requirement to core building (Bill-hbrhbr)
011d073 Add README shell script lang hint (Bill-hbrhbr)
fce6489 Apply suggestions from code review (Bill-hbrhbr)
5303ec9 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
9b40d16 Address coderabbit ai copmments (Bill-hbrhbr)
53a6768 Add helper for validating directory exists (Bill-hbrhbr)
29c2ded Add docstring for dataclasses (Bill-hbrhbr)
85c37f2 Add docstrings for test utils and improve functions (Bill-hbrhbr)
37ffcac Lint fix (Bill-hbrhbr)
35f1690 Add mypy and ruff linters (Bill-hbrhbr)
95b6886 ruff lint (Bill-hbrhbr)
0a68983 Add missing __init__ files (Bill-hbrhbr)
e4f4891 Pass mypy test (Bill-hbrhbr)
ca43014 Disable warning about assert (Bill-hbrhbr)
ebbc541 Add mypy taskflow and fix all ruff complaints (Bill-hbrhbr)
baa6e9d Update integration-tests/.pytest.ini (Bill-hbrhbr)
d22c534 Address coderabbit comment (Bill-hbrhbr)
9833069 Lint fix (Bill-hbrhbr)
7c818d4 logic fix (Bill-hbrhbr)
37c0e23 Improve docstrings (Bill-hbrhbr)
c608df6 Add linting section to README (Bill-hbrhbr)
1cae84e Apply suggestions from code review (Bill-hbrhbr)
1aa50e6 Add yoda-condition check skips (Bill-hbrhbr)
d49cedd Update integration-tests/README.md (Bill-hbrhbr)
5ea3289 Space out README code section (Bill-hbrhbr)
8f81696 Update integration-tests/pyproject.toml (Bill-hbrhbr)
c5ad284 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
1d63ccc Make use of python class property (Bill-hbrhbr)
4139341 Improve taskfile (Bill-hbrhbr)
d5b7f76 Fix tab spaces (Bill-hbrhbr)
9f30065 Create taskfile python linting for projects using uv (Bill-hbrhbr)
6648734 satisfy yaml linter (Bill-hbrhbr)
5b219fb Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
09fbd51 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
6da26a1 Refactor docs. (kirkrodrigues)
789f4d1 Apply suggestions from code review (Bill-hbrhbr)
d7619cb Address review concern (Bill-hbrhbr)
c8bd52f Apply suggestions from code review (Bill-hbrhbr)
8400e01 Address review comments (Bill-hbrhbr)
95d19ba Update integration-tests/tests/utils/config.py (Bill-hbrhbr)
bc4e20b Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
9f022f8 Make integration task depend on the whole package (Bill-hbrhbr)
aaba2f1 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
cbd0e8e Apply suggestions from code review (Bill-hbrhbr)
a4382d6 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
b53c04a Rename assert_utils to asserting_utils (Bill-hbrhbr)
73dd3d1 Change validate to validates in docstring start (Bill-hbrhbr)
9d280d7 abbreviate validate_dir_exists (Bill-hbrhbr)
0de00bb localize integration tests taskfile vars (Bill-hbrhbr)
2f2c97f Add back missing asserting_utils.py (Bill-hbrhbr)
bafa272 Update docs/src/dev-docs/index.md (Bill-hbrhbr)
98b228d lint fix (Bill-hbrhbr)
eaf04e4 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
51ec20c Remove unrelated changes (Bill-hbrhbr)
fa81d00 Update docs/src/dev-docs/index.md (Bill-hbrhbr)
7ca37da Fix docs. (kirkrodrigues)
9365193 Apply Rabbit's suggestion. (kirkrodrigues)
b641bd9 Alphabetize .gitignore. (kirkrodrigues)
c2073cd Update integration-tests/README.md (kirkrodrigues)
1fcc997 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
fb274d1 Address review comments (Bill-hbrhbr)
4840360 typo fix (Bill-hbrhbr)
4f3bb75 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
66f61a7 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
47ccb1c Move python linting checks specific for unit tests into their own cat… (Bill-hbrhbr)
37f59bb Apply suggestions from code review (Bill-hbrhbr)
ba35f88 Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
8c98e4d Address review comment (Bill-hbrhbr)
60501f1 lint fix (Bill-hbrhbr)
31e69e3 use shutil to find chmod binary (Bill-hbrhbr)
f424f21 use shutil to find the curl executable (Bill-hbrhbr)
31da58e Merge branch 'main' into integration-tests-boilerplate (Bill-hbrhbr)
@@ -2,3 +2,4 @@
 .task/
 build/
 **/dist/
+**/__pycache__/
@@ -0,0 +1,15 @@
+[pytest]
+addopts =
+    --strict-config
+    --strict-markers
+    --capture=no
+    --verbose
+    --color=yes
+    --code-highlight=yes
+env =
+    D:CLP_BUILD_DIR=../build
+    D:CLP_PACKAGE_DIR=../build/clp-package
+markers =
+    binaries: mark tests that directly call the binaries in the package bin
+    clp: mark tests that use the CLP storage engine
+    clp_s: mark tests that use the CLP-S storage engine
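The `env` entries in this file use the `D:` prefix from `pytest-env`, which marks a value as a default: it is applied only when the variable is not already set in the environment. The following is a minimal sketch of that rule in plain Python. It makes no claim about the plugin's internals, and `apply_default_env` is a hypothetical helper that is not part of `pytest-env`.

```python
from typing import Dict, MutableMapping


def apply_default_env(
    defaults: Dict[str, str], environ: MutableMapping[str, str]
) -> MutableMapping[str, str]:
    """Set each variable only if the caller has not already provided it."""
    for name, value in defaults.items():
        environ.setdefault(name, value)
    return environ


# Defaults mirroring the two `D:` entries in the ini file.
DEFAULTS = {
    "CLP_BUILD_DIR": "../build",
    "CLP_PACKAGE_DIR": "../build/clp-package",
}

# A caller that overrides only the build directory keeps its override.
env = apply_default_env(DEFAULTS, {"CLP_BUILD_DIR": "/tmp/custom-build"})
```

In practice this means a developer can point the tests at another build tree by exporting `CLP_BUILD_DIR` before invoking pytest, without editing the ini file.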
@@ -0,0 +1 @@
+3.10
Empty file.
@@ -0,0 +1,23 @@
+[project]
+name = "integration-tests"
+version = "0.1.0"
+description = "Integration tests for the CLP project."
+readme = "README.md"
+authors = [
+    { name = "YScope Inc.", email = "dev@yscope.com" }
+]
+requires-python = ">=3.10"
+
+[project.scripts]
+integration-tests = "integration_tests:main"
+
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[dependency-groups]
+dev = [
+    "pytest>=8.3.5",
+    "pytest-benchmark>=5.1.0",
+    "pytest-env>=1.1.5",
+]
@@ -0,0 +1,2 @@
+def main() -> None:
+    print("Hello from integration-tests!")
@@ -0,0 +1,7 @@
+from tests.fixtures.base_config import base_config
+from tests.fixtures.dataset_logs import (
+    download_and_extract_dataset,
+    hive_24hr,
+    postgresql,
+    spark_event_logs,
+)
@@ -0,0 +1,22 @@
+from pathlib import Path
+
+import pytest
+from tests.utils.config import BaseConfig
+from tests.utils.utils import get_env_var
+
+
+@pytest.fixture(scope="session")
+def base_config() -> BaseConfig:
+    clp_build_dir = Path(get_env_var("CLP_BUILD_DIR")).resolve()
+    clp_package_dir = Path(get_env_var("CLP_PACKAGE_DIR")).resolve()
+
+    base_config = BaseConfig(
+        clp_bin_dir=clp_package_dir / "bin",
+        clp_package_dir=clp_package_dir,
+        clp_sbin_dir=clp_package_dir / "sbin",
+        test_output_dir=clp_build_dir / "var" / "logs" / "pytest",
+        uncompressed_logs_dir=clp_build_dir / "var" / "data" / "pytest" / "downloads",
+    )
+    base_config.test_output_dir.mkdir(parents=True, exist_ok=True)
+    base_config.uncompressed_logs_dir.mkdir(parents=True, exist_ok=True)
+    return base_config
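The `base_config` fixture derives every directory from two environment variables. As a rough sketch of that derivation outside pytest, the snippet below builds the same layout from a plain mapping. The `derive_layout` helper and its dictionary return type are assumptions made for illustration; the PR itself stores these paths in a `BaseConfig` dataclass and creates the output directories on disk, which this sketch does not do.

```python
from pathlib import Path
from typing import Dict, Mapping


def derive_layout(env: Mapping[str, str]) -> Dict[str, Path]:
    """Build the directory layout used by the fixture from an env mapping."""
    required = ("CLP_BUILD_DIR", "CLP_PACKAGE_DIR")
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"Environment variables not set: {', '.join(missing)}")

    # resolve() turns relative values such as "../build" into absolute paths.
    build_dir = Path(env["CLP_BUILD_DIR"]).resolve()
    package_dir = Path(env["CLP_PACKAGE_DIR"]).resolve()
    return {
        "clp_bin_dir": package_dir / "bin",
        "clp_package_dir": package_dir,
        "clp_sbin_dir": package_dir / "sbin",
        "test_output_dir": build_dir / "var" / "logs" / "pytest",
        "uncompressed_logs_dir": build_dir / "var" / "data" / "pytest" / "downloads",
    }


layout = derive_layout(
    {"CLP_BUILD_DIR": "../build", "CLP_PACKAGE_DIR": "../build/clp-package"}
)
```

Resolving the paths once, at session scope, keeps later subprocess calls independent of the working directory they happen to run in.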
@@ -0,0 +1,71 @@
+import shutil
+
+import pytest
+from tests.utils.config import (
+    BaseConfig,
+    DatasetLogs,
+)
+from tests.utils.utils import run_and_assert
+
+
+@pytest.fixture(scope="session")
+def hive_24hr() -> DatasetLogs:
+    return DatasetLogs(
+        name="hive-24hr",
+        tar_url="https://zenodo.org/records/7094921/files/hive-24hr.tar.gz?download=1",
+    )
+
+
+@pytest.fixture(scope="session")
+def elasticsearch() -> DatasetLogs:
+    return DatasetLogs(
+        name="elasticsearch",
+        tar_url="https://zenodo.org/records/10516227/files/elasticsearch.tar.gz?download=1",
+    )
+
+
+@pytest.fixture(scope="session")
+def spark_event_logs() -> DatasetLogs:
+    return DatasetLogs(
+        name="spark-event-logs",
+        tar_url="https://zenodo.org/records/10516346/files/spark-event-logs.tar.gz?download=1",
+    )
+
+
+@pytest.fixture(scope="session")
+def postgresql() -> DatasetLogs:
+    return DatasetLogs(
+        name="postgresql",
+        tar_url="https://zenodo.org/records/10516402/files/postgresql.tar.gz?download=1",
+    )
+
+
+@pytest.fixture(autouse=True)
+def download_and_extract_dataset(request, base_config: BaseConfig) -> DatasetLogs:
+    dataset_config = request.getfixturevalue(request.param)
+    dataset_name = dataset_config.name
+    if request.config.cache.get(dataset_name, False):
+        print(f"Uncompressed logs for dataset `{dataset_name}` is up-to-date.")
+        return dataset_config
+
+    download_path = str(base_config.uncompressed_logs_dir / f"{dataset_name}.tar.gz")
+    extract_path = str(base_config.uncompressed_logs_dir / dataset_name)
+    # fmt: off
+    cmds = [
+        "curl",
+        "--fail",
+        "--location",
+        "--output", str(download_path),
+        "--show-error",
+        dataset_config.tar_url,
+    ]
+    # fmt: on
+    run_and_assert(cmds)
+
+    try:
+        shutil.unpack_archive(download_path, extract_path)
+    except:
+        assert False, f"Tar extraction failed for downloaded dataset `{dataset_name}`."
+
+    request.config.cache.set(dataset_name, True)
+    return dataset_config
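The `download_and_extract_dataset` fixture skips the download when pytest's `config.cache` already records the dataset as extracted. A minimal sketch of the same "extract once" idea follows, written so that it runs without pytest. It swaps the pytest cache for a sentinel file inside the destination directory and catches specific exceptions rather than using a bare `except`. Both choices, along with the `extract_once` name and the `.extracted` marker, are assumptions for illustration and not what the PR does.

```python
import shutil
import tarfile
from pathlib import Path


def extract_once(archive: Path, dest: Path) -> bool:
    """Extract `archive` into `dest` unless a previous call already did.

    Returns True if extraction happened and False if it was skipped.
    """
    marker = dest / ".extracted"  # hypothetical sentinel file
    if marker.exists():
        return False

    try:
        shutil.unpack_archive(str(archive), str(dest))
    except (shutil.ReadError, tarfile.TarError, OSError) as e:
        raise AssertionError(f"Extraction failed for `{archive.name}`.") from e

    # Record success only after the archive was fully unpacked.
    marker.touch()
    return True
```

Writing the marker last means an interrupted extraction is retried on the next run, which is one trade-off to weigh against a cache entry that lives outside the data directory.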
integration-tests/tests/test_identity_transformation.py (122 changes: 122 additions & 0 deletions)
@@ -0,0 +1,122 @@
+import shutil
+from pathlib import Path
+from tempfile import NamedTemporaryFile
+from typing import IO
+
+import pytest
+from tests.utils.config import (
+    BaseConfig,
+    DatasetLogs,
+)
+from tests.utils.utils import (
+    diff_equal,
+    run_and_assert,
+)
+
+pytestmark = pytest.mark.binaries
+
+text_datasets = pytest.mark.parametrize(
+    "download_and_extract_dataset",
+    [
+        "hive_24hr",
+    ],
+    indirect=["download_and_extract_dataset"],
+)
+
+json_datasets = pytest.mark.parametrize(
+    "download_and_extract_dataset",
+    [
+        "spark_event_logs",
+        "postgresql",
+    ],
+    indirect=["download_and_extract_dataset"],
+)
+
+
+@pytest.mark.clp
+@text_datasets
+def test_clp_identity_transform(
+    request, base_config: BaseConfig, download_and_extract_dataset: DatasetLogs
+) -> None:
+    binary_path_str = str(base_config.clp_bin_dir / "clp")
+    dataset_name = download_and_extract_dataset.name
+    download_dir = base_config.uncompressed_logs_dir / dataset_name
+    archives_dir = base_config.test_output_dir / f"{dataset_name}-archives"
+    extract_dir = base_config.test_output_dir / f"{dataset_name}-logs"
+
+    shutil.rmtree(archives_dir, ignore_errors=True)
+    shutil.rmtree(extract_dir, ignore_errors=True)
+
+    # fmt: off
+    compression_cmd = [
+        binary_path_str,
+        "c",
+        "--progress",
+        "--remove-path-prefix", str(download_dir),
+        str(archives_dir),
+        str(download_dir),
+    ]
+    # fmt: on
+    run_and_assert(compression_cmd)
+    run_and_assert([binary_path_str, "x", str(archives_dir), str(extract_dir)])
+
+    diff_equal(download_dir, extract_dir)
+
+    shutil.rmtree(archives_dir, ignore_errors=True)
+    shutil.rmtree(extract_dir, ignore_errors=True)
+
+
+@pytest.mark.clp_s
+@json_datasets
+def test_clp_s_identity_transform(
+    request, base_config: BaseConfig, download_and_extract_dataset: DatasetLogs
+) -> None:
+    binary_path_str = str(base_config.clp_bin_dir / "clp-s")
+    dataset_name = download_and_extract_dataset.name
+    download_dir = base_config.uncompressed_logs_dir / dataset_name
+    archives_dir = base_config.test_output_dir / f"{dataset_name}-archives"
+    extract_dir = base_config.test_output_dir / f"{dataset_name}-logs"
+
+    shutil.rmtree(archives_dir, ignore_errors=True)
+    shutil.rmtree(extract_dir, ignore_errors=True)
+
+    run_and_assert([binary_path_str, "c", str(archives_dir), str(download_dir)])
+    run_and_assert([binary_path_str, "x", str(archives_dir), str(extract_dir)])
+
+    # Recompress the decompressed single-file output and decompress it again to verify consistency.
+    # TODO: Remove this check once we can directly compare decompressed logs (which would preserve
+    # the directory structure and row/key order) with the original downloaded logs.
+    # See also: https://docs.yscope.com/clp/main/user-guide/core-clp-s.html#current-limitations
+    single_file_archives_dir = base_config.test_output_dir / f"{dataset_name}-single-file-archives"
+    single_file_extract_dir = base_config.test_output_dir / f"{dataset_name}-single-file-logs"
+
+    shutil.rmtree(single_file_archives_dir, ignore_errors=True)
+    shutil.rmtree(single_file_extract_dir, ignore_errors=True)
+
+    run_and_assert([binary_path_str, "c", single_file_archives_dir, extract_dir])
+    run_and_assert([binary_path_str, "x", single_file_archives_dir, single_file_extract_dir])
+
+    # Key and row orders are not preserved during `clp-s` operations, so sort before diffing.
+    with _sort_json_keys_and_rows(extract_dir / "original") as s1, _sort_json_keys_and_rows(
+        single_file_extract_dir / "original"
+    ) as s2:
+        diff_equal(s1.name, s2.name)
+
+    shutil.rmtree(archives_dir, ignore_errors=True)
+    shutil.rmtree(extract_dir, ignore_errors=True)
+    shutil.rmtree(single_file_archives_dir, ignore_errors=True)
+    shutil.rmtree(single_file_extract_dir, ignore_errors=True)
+
+
+def _sort_json_keys_and_rows(json_fp: Path) -> IO[bytes]:
+    with NamedTemporaryFile(mode="w+", delete=True) as keys_sorted, NamedTemporaryFile(
+        mode="w+", delete=True
+    ) as flattened:
+        keys_and_rows_sorted = NamedTemporaryFile(mode="w+", delete=True)
+        run_and_assert(["jq", "--sort-keys", ".", str(json_fp)], stdout=keys_sorted)
+        keys_sorted.flush()
+        run_and_assert(["jq", ".", keys_sorted.name], stdout=flattened)
+        flattened.flush()
+        run_and_assert(["sort", flattened.name], stdout=keys_and_rows_sorted)
+        keys_and_rows_sorted.flush()
+        return keys_and_rows_sorted
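The `clp-s` test compares outputs only after sorting keys and rows, because neither order is preserved by the round trip. The PR does this with `jq` and `sort`. As an alternative sketch of the same normalization in pure Python, the snippet below assumes newline-delimited JSON with one record per line (an assumption about the data, not something the diff states) and uses a hypothetical helper named `canonical_rows`. It is not the author's method: it sorts whole records, whereas the `jq`/`sort` pipeline sorts the lines of `jq`'s formatted output.

```python
import json
from typing import Iterable, List


def canonical_rows(lines: Iterable[str]) -> List[str]:
    """Return one compact, key-sorted JSON string per record, in sorted order."""
    rows = [
        # sort_keys=True orders keys at every nesting level.
        json.dumps(json.loads(line), sort_keys=True, separators=(",", ":"))
        for line in lines
        if line.strip()  # skip blank lines
    ]
    return sorted(rows)


# The same two records, with different key order and row order.
original = ['{"b": 1, "a": {"d": 2, "c": 3}}', '{"x": true}']
roundtrip = ['{"x": true}', '{"a": {"c": 3, "d": 2}, "b": 1}']
same = canonical_rows(original) == canonical_rows(roundtrip)
```

Parsing and re-serializing also normalizes number spelling (for example `1.0` and `1.00` both load as the same float), so this check is slightly more lenient than a textual diff.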
@@ -0,0 +1,17 @@
+from dataclasses import dataclass
+from pathlib import Path
+
+
+@dataclass(frozen=True)
+class BaseConfig:
+    clp_bin_dir: Path
+    clp_package_dir: Path
+    clp_sbin_dir: Path
+    test_output_dir: Path
+    uncompressed_logs_dir: Path
+
+
+@dataclass(frozen=True)
+class DatasetLogs:
+    name: str
+    tar_url: str
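Both configuration classes are declared with `frozen=True`, so a fixture shared across a whole test session cannot be modified by an individual test. A short sketch of what that buys, using a hypothetical `ExampleConfig` as a stand-in for `BaseConfig` and illustrative paths that do not come from the PR:

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from pathlib import Path


@dataclass(frozen=True)
class ExampleConfig:  # hypothetical stand-in for BaseConfig
    test_output_dir: Path
    uncompressed_logs_dir: Path


cfg = ExampleConfig(Path("/tmp/out"), Path("/tmp/downloads"))

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    cfg.test_output_dir = Path("/elsewhere")  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False

# replace() builds a modified copy and leaves the original untouched.
other = replace(cfg, test_output_dir=Path("/tmp/other-out"))
```

A test that needs a variation of the shared configuration can therefore derive its own copy with `replace()` instead of editing the session-wide instance.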
@@ -0,0 +1,25 @@
+import os
+import subprocess
+from pathlib import Path
+from typing import List
+
+
+def diff_equal(path1: Path, path2: Path) -> None:
+    cmd = ["diff", "--brief", "--recursive", str(path1), str(path2)]
+    proc = subprocess.run(cmd, stdout=subprocess.PIPE)
+    if 0 != proc.returncode:
+        if 1 == proc.returncode:
+            assert False, "Files/Directories don't match."
+        assert False, f"Command failed {' '.join(cmd)}"
+
+
+def get_env_var(var_name: str) -> str:
+    value = os.environ.get(var_name)
+    assert value is not None, f"Environment variable {var_name} is not set."
+    return value
+
+
+def run_and_assert(cmd: List[str], **kwargs) -> subprocess.CompletedProcess:
+    proc = subprocess.run(cmd, **kwargs)
+    assert 0 == proc.returncode, f"Command failed: {' '.join(cmd)}"
+    return proc
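`diff_equal` relies on the exit status of `diff`: 0 means no differences, 1 means differences were found, and anything greater signals an error. For readers who want the same recursive check without shelling out, here is a hedged sketch built on the standard library's `filecmp` module. The `dirs_equal` name is hypothetical and this is a swapped-in technique, not the PR's implementation. It returns a boolean instead of asserting.

```python
import filecmp
from pathlib import Path


def dirs_equal(left: Path, right: Path) -> bool:
    """Return True if both trees hold the same names and byte-identical files."""
    # ignore=[] disables dircmp's default ignore list so every entry is compared.
    cmp = filecmp.dircmp(left, right, ignore=[])
    if cmp.left_only or cmp.right_only or cmp.common_funny:
        return False

    # shallow=False compares file contents rather than only os.stat() signatures.
    _, mismatch, errors = filecmp.cmpfiles(left, right, cmp.common_files, shallow=False)
    if mismatch or errors:
        return False

    return all(dirs_equal(left / name, right / name) for name in cmp.common_dirs)
```

Unlike `diff --brief`, this version does not report which path differed. A caller that needs that detail could collect `mismatch` and the `*_only` lists instead of returning early.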