Releases: andrjas/data_check
0.20.0
0.19.0
Added
- Python 3.12 support
- ruff for linting
Changed
- using pandas.Timestamp instead of datetime for date/datetime columns
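
For readers unfamiliar with the type, here is a minimal sketch of how pandas.Timestamp relates to the standard library's datetime. It uses only public pandas calls and does not reproduce data_check's internal column handling; the sample values are made up for illustration.

```python
import datetime

import pandas as pd

# Illustrative values only; not taken from data_check itself.
ts = pd.Timestamp("2023-01-15 08:30:00")
dt = datetime.datetime(2023, 1, 15, 8, 30)

# pandas.Timestamp subclasses datetime.datetime, so the two compare equal.
is_datetime = isinstance(ts, datetime.datetime)
same_instant = ts == dt

# pd.to_datetime converts a column of strings into Timestamp values.
parsed = pd.to_datetime(pd.Series(["2023-01-15", "2023-02-01"]))
first = parsed.iloc[0]
```

Because Timestamp is a datetime subclass, code that compared against plain datetime objects generally keeps working, while the values themselves now come straight from pandas.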
Removed
- custom datetime parsing
- isort in pre-commit (using ruff instead)
- black in pre-commit (using ruff instead)
- eradicate in pre-commit (using ruff instead)
0.18.0
0.17.0
Added
- pipeline YAML validation via pydantic
- more breakpoint step features and documentation
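
As a rough illustration of what validating a pipeline definition with pydantic can look like, the sketch below checks an already-parsed YAML document (a plain dict). The `Step` and `Pipeline` models and their field names are hypothetical and are not data_check's actual schema.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class Step(BaseModel):
    # Hypothetical step schema; the field names are assumptions.
    name: str
    files: List[str]


class Pipeline(BaseModel):
    # Hypothetical top-level schema: a pipeline is a list of steps.
    steps: List[Step]


def validate_pipeline(data: dict) -> Optional[str]:
    """Return None if data is valid, otherwise the validation error text."""
    try:
        Pipeline(**data)
    except ValidationError as exc:
        return str(exc)
    return None


ok = validate_pipeline({"steps": [{"name": "check", "files": ["checks/"]}]})
bad = validate_pipeline({"steps": [{"name": "check"}]})  # 'files' is missing
```

Validating up front means a malformed pipeline file can be reported with the offending field name before any step runs.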
Changed
- replaced 'overall result' with 'summary'
Fixed
- load_template and load_lookups called twice in run
- generating sorted csv for checks
- updated SQLAlchemy links to 2.0
- print exception if merging non-unique columns
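
To show why a sorted CSV is useful for checks, the sketch below sorts the data rows of a CSV text so that repeated runs produce the same file. This is a generic standard-library example, not data_check's implementation; the function name `sort_csv_text` is hypothetical.

```python
import csv
import io


def sort_csv_text(text: str) -> str:
    """Return CSV text with the header kept first and the data rows sorted."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return text
    # Rows are lists of strings, so the order is lexicographic ("10" < "2").
    header, body = rows[0], sorted(rows[1:])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(body)
    return out.getvalue()


unsorted = "id,name\n2,b\n1,a\n"
result = sort_csv_text(unsorted)
```

A stable row order keeps expected-result files diffable, since a query without ORDER BY may return rows in any order.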
0.16.0
0.15.0
Added
- 'data_check init' to create projects and pipelines
- 'append' as alias for append-mode in cli and pipelines
- 'ping --wait' and --timeout/--retry
- Python 3.11 support
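
The changelog does not spell out how 'ping --wait' combines with --timeout and --retry. The sketch below shows one common way to implement a wait-with-retry loop. The function `wait_until_reachable`, its parameters, and their meaning (total timeout in seconds, pause between attempts in seconds) are assumptions for illustration, not the CLI's documented behavior.

```python
import time
from typing import Callable


def wait_until_reachable(
    probe: Callable[[], bool],
    timeout: float = 5.0,
    retry: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe until it returns True or timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        # Give up if the next pause would end after the deadline.
        if time.monotonic() + retry > deadline:
            return False
        sleep(retry)


# Usage with a stand-in probe that succeeds on the third attempt.
attempts = iter([False, False, True])
reached = wait_until_reachable(
    lambda: next(attempts), timeout=5.0, retry=0.01, sleep=lambda s: None
)
gave_up = not wait_until_reachable(
    lambda: False, timeout=0.0, retry=1.0, sleep=lambda s: None
)
```

Passing `sleep` as a parameter keeps the loop testable without real delays.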
Changed
- io module is renamed to file_ops
- running a csv file fails if there is no matching sql file; otherwise the csv check runs
- MSSQL uses arm64 image for CI
Fixed
- NA/NaT should be treated equally in checks
- CTRL+C should work in Windows
- 'data_check gen' works with full table checks
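
The NA/NaT fix can be motivated with a small sketch. In pandas, missing-value markers do not compare equal with `==`, so a check that should treat them alike needs an explicit test. The helper `values_match` is hypothetical and only illustrates the idea; it is not data_check's comparison code.

```python
import pandas as pd


def values_match(left, right) -> bool:
    """Compare two scalars, treating every missing-value marker as equal."""
    left_missing, right_missing = pd.isna(left), pd.isna(right)
    if left_missing and right_missing:
        return True
    if left_missing or right_missing:
        return False
    return bool(left == right)


# NaT is not equal to itself under ==, which is why the helper is needed.
nat_equals_itself = pd.NaT == pd.NaT
na_matches_nat = values_match(pd.NA, pd.NaT)
```

With this rule, an empty date in the expected CSV and an empty string or number column in the result are both simply "missing".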
Removed
- custom docker images for CI
0.14.0
Added
- pre-commit hooks with various tools for code quality
- project wide default_load_mode configuration
- pipelines: added 'files' for 'sql' to deprecate 'sql_files'
- pipelines: added 'run' as alias for 'check'
- tests that pipeline steps match cli
- pipelines: 'write_check' for 'sql'
- documentation for 'fake' pipeline step
- pipelines: added 'table' and 'file' for 'load' to deprecate 'load_table'
- running data_check_pipeline.yml directly to execute the pipeline
Changed
- refactored TableInfo into Table
- moved integration tests into pytest
- upgraded dependencies
Fixed
- load fails if csv doesn't have all columns
Deprecated
- pipelines: 'sql_files' is deprecated, use 'sql' instead
- pipelines: 'load_table' is deprecated
0.13.0
Added
- upsert mode for loading data into tables
- pipelines: added 'mode' to deprecate 'load_mode'
- env variable DATA_CHECK_CONNECTION can override default connection
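
For context on what an upsert does, the sketch below inserts new rows and updates existing ones in a single statement. It uses SQLite from the standard library (the ON CONFLICT clause needs SQLite 3.24 or later) and a made-up `items` table; how data_check implements upsert mode for each database is not described here and may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO items VALUES (1, 'old')")

# Row 1 already exists and is updated; row 2 is new and is inserted.
rows = [(1, "new"), (2, "added")]
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?) "
    "ON CONFLICT(id) DO UPDATE SET name = excluded.name",
    rows,
)
result = conn.execute("SELECT id, name FROM items ORDER BY id").fetchall()
conn.close()
```

Unlike a replace or truncate mode, rows that are not in the loaded data stay untouched.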
Changed
- printing exception on failure without --traceback
- upgraded dependencies
- documentation theme
Fixed
- Oracle: using VARCHAR2 instead of CLOB to load strings and large decimals
- bug in runner.executor when calculating max_workers
Deprecated
- pipelines: 'load_mode' is deprecated, use 'mode' instead
Removed
- workaround for replace mode
- support for Python 3.7
- importlib-metadata dependency