
CLI tool for inserting SELECT query results into ClickHouse with automatic schema matching and type-safe casting. Ideal for ETL pipelines and SQL-driven data flows.


castengine/insert-tools


🚀 Insert Tools


Problem:

Have you faced issues inserting data into databases, such as constant schema mismatch errors, incorrect data types, manual checks, or even silent data corruption? If you work with large ETL pipelines and databases, you know how painful it can be.

Solution:

Insert Tools is a robust and flexible tool designed for safe and fast data insertion into databases, starting with ClickHouse. It validates the schema by column name (not by column order), supports automatic type casting, and lets you dry-run your inserts before touching real data. Perfect for ETL pipelines where target table schemas evolve frequently.

🔥 Why you should try it:

  • ✅ Data safety: Validates column names and types before insert.
  • ⚙️ Auto type casting: Converts mismatched types when enabled.
  • 🚧 Dry-run mode: Test inserts without touching data.
  • 🐳 Docker-ready: Comes with ready-to-use Docker integration.
  • 🔧 Configurable: Fully controllable insert pipeline.
  • 🔥 Time saver: Automates validation and error prevention.

🎯 Key Features:

  • πŸ–₯️ Simple CLI and Python API.
  • πŸ›‘οΈ Strict mode to block extra columns.
  • πŸ“Œ Detailed logging and diagnostics.
  • πŸ”„ Easy CI/CD integration.
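
To make name-based validation and strict mode more concrete, here is a minimal sketch in plain Python. It is not the Insert Tools implementation: the function `check_columns`, its return value, and the name-to-type dict representation of a schema are all assumptions made for illustration. In this sketch, non-strict mode simply ignores extra columns; the tool's actual behavior may differ.

```python
from typing import Dict, List


def check_columns(
    target: Dict[str, str],
    source: Dict[str, str],
    strict: bool = True,
) -> List[str]:
    """Compare SELECT columns to target columns by name, ignoring order.

    Hypothetical helper for illustration; returns a list of problems
    (an empty list means nothing was flagged).
    """
    problems = []
    for name, src_type in source.items():
        if name not in target:
            # Strict mode blocks columns the target table does not have.
            if strict:
                problems.append(f"extra column: {name}")
        elif target[name] != src_type:
            problems.append(
                f"type mismatch on {name}: {src_type} -> {target[name]}"
            )
    return problems


target = {"id": "UInt64", "name": "String"}
# Same columns in a different order, plus one the target lacks.
source = {"name": "String", "id": "UInt64", "debug": "String"}

print(check_columns(target, source, strict=True))
print(check_columns(target, source, strict=False))
```

Because the comparison is keyed on names, reordering the SELECT columns does not produce any finding; only the unknown `debug` column is reported, and only in strict mode.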

📦 Quick install:

pip install insert-tools

To install for development:

pip install -e ".[dev]"

Link to the project on PyPI

🚀 Run & Examples:

🐍 Python usage:

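from insert_tools.runner import InsertConfig, run_insert

# Describe the target table and the SELECT query that feeds it.
config = InsertConfig(
    host="localhost",
    database="default",
    target_table="my_table",
    select_sql="SELECT * FROM source_table",
    user="default",
    password="admin123",  # example value only; do not hard-code real credentials
    allow_type_cast=True,  # allow type casting on mismatch
    strict_column_match=True,  # enable strict mode for column matching
)

# Run the insert with the settings above.
run_insert(config)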
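
Since the example above contains a literal password, one option is to read connection settings from the environment instead. The following is a minimal sketch under stated assumptions: the variable names `CLICKHOUSE_HOST`, `CLICKHOUSE_USER`, and `CLICKHOUSE_PASSWORD` and the helper `connection_settings` are chosen here for illustration and are not defined by Insert Tools. The returned keys reuse the parameter names from the example above.

```python
import os
from typing import Dict, Mapping


def connection_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build connection keyword arguments from environment variables.

    The variable names are assumptions made for this sketch.
    """
    password = env.get("CLICKHOUSE_PASSWORD")
    if password is None:
        raise RuntimeError("CLICKHOUSE_PASSWORD is not set")
    return {
        "host": env.get("CLICKHOUSE_HOST", "localhost"),
        "user": env.get("CLICKHOUSE_USER", "default"),
        "password": password,
    }


# Demonstration with an explicit mapping instead of the real environment.
settings = connection_settings({"CLICKHOUSE_PASSWORD": "s3cret"})
print(sorted(settings))
```

These keys could then be unpacked into `InsertConfig` alongside `database`, `target_table`, and `select_sql`.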

🖥️ CLI usage:

insert-tools \
  --host localhost \
  --port 8123 \
  --user default \
  --password admin123 \
  --database default \
  --target_table my_table \
  --select_sql "SELECT * FROM source_table" \
  --allow_type_cast \
  --strict \
  --dry-run \
  --verbose
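
When the CLI is called from a scheduler or another Python script, passing the arguments as a list avoids shell-quoting problems with the SQL text. The sketch below only assembles the command; it uses the flags shown in the CLI example above, while the helper name `build_command` is hypothetical.

```python
import shlex
from typing import List


def build_command(
    host: str,
    database: str,
    target_table: str,
    select_sql: str,
    dry_run: bool = True,
) -> List[str]:
    """Assemble the insert-tools command line as an argument list."""
    cmd = [
        "insert-tools",
        "--host", host,
        "--database", database,
        "--target_table", target_table,
        "--select_sql", select_sql,
    ]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


cmd = build_command(
    "localhost", "default", "my_table", "SELECT * FROM source_table"
)
# shlex.join shows the shell-quoted form; subprocess.run(cmd, check=True)
# would execute it if insert-tools is installed.
print(shlex.join(cmd))
```

Defaulting `dry_run` to `True` in this sketch is a deliberate choice: a real insert has to be requested explicitly.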

🧪 Testing & Integration:

pytest -v --cov=insert_tools tests/

Integration tests are supported via Docker (docker-compose.yml).

📈 Roadmap:

Planned and upcoming features:

✅ Core & Safety

  • ClickHouse support (stable)
  • Manual insert_columns mapping
  • Logging configuration (file, level, formatting)
  • Dry-run + exit codes
  • Strict schema validator with preview

📦 Priority Database Support

  • MySQL: no name-based insert, requires exact column order
  • PostgreSQL: order and column count must match
  • SQLite: insert depends on column order
  • Oracle: insert requires explicit column mapping
  • SQL Server: insert must follow column order
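
The order dependence noted for SQLite in the list above can be reproduced with Python's built-in `sqlite3` module. This sketch is independent of Insert Tools; the table names are borrowed from the usage examples, and the columns and row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_table (name TEXT, city TEXT)")
conn.execute("CREATE TABLE my_table (city TEXT, name TEXT)")
conn.execute("INSERT INTO source_table VALUES ('Alice', 'Paris')")

# Without a column list, values are assigned by position, not by name.
conn.execute("INSERT INTO my_table SELECT * FROM source_table")
positional = conn.execute("SELECT city, name FROM my_table").fetchone()

# Naming the columns on both sides restores the intended mapping.
conn.execute("DELETE FROM my_table")
conn.execute(
    "INSERT INTO my_table (name, city) SELECT name, city FROM source_table"
)
by_name = conn.execute("SELECT city, name FROM my_table").fetchone()

print(positional)
print(by_name)
conn.close()
```

The first insert succeeds without any error even though `city` ends up holding a name, which is the kind of silent corruption that name-based matching is meant to prevent.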

🧰 Advanced Features

  • Error handling strategies (fail, warn, skip)
  • Config file validation (optional)
  • Secure secrets handling (.env / vault)
  • Optional CAST rules config

📘 Ecosystem

  • Full documentation site (mkdocs)
  • Schema + config reference
  • Auto-generated help from CLI
  • GitHub Discussions / Community page

🛠️ Configuration Options

| Parameter | Description | Required |
|---|---|---|
| host | ClickHouse server hostname | ✅ |
| port | ClickHouse server port | ❌ |
| user | ClickHouse user | ❌ |
| password | ClickHouse password | ❌ |
| database | Target database | ✅ |
| target_table | Target table name | ✅ |
| select_sql | SQL query to fetch data | ✅ |
| allow_type_cast | Allow type casting on mismatch | ❌ |
| strict_column_match | Enable strict mode for column matching | ❌ |
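
If configuration is assembled from several sources (files, environment, CLI arguments), it can help to check the required parameters from the table above before calling the tool. A minimal sketch, where the helper `missing_required` is hypothetical and not part of Insert Tools:

```python
from typing import Any, Dict, List

# Parameters marked as required in the table above.
REQUIRED = ("host", "database", "target_table", "select_sql")


def missing_required(options: Dict[str, Any]) -> List[str]:
    """Return the required parameter names that are absent or empty."""
    return [name for name in REQUIRED if not options.get(name)]


options = {"host": "localhost", "database": "default", "user": "default"}
print(missing_required(options))
```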

🧱 How It Works

  1. Fetches target table schema from ClickHouse.
  2. Extracts column names and types from SELECT query.
  3. Applies optional CAST(...) if types mismatch.
  4. Validates column alignment and inserts data.
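
The steps above can be sketched as a function that composes an `INSERT ... SELECT` statement. This is an illustration only, not the tool's actual code: the function `build_insert_sql` is hypothetical, both schemas are supplied as plain dicts rather than fetched from ClickHouse (steps 1 and 2), and identifier quoting is omitted for brevity.

```python
from typing import Dict


def build_insert_sql(
    target_table: str,
    target_schema: Dict[str, str],
    select_schema: Dict[str, str],
    select_sql: str,
    allow_type_cast: bool = True,
) -> str:
    """Compose INSERT ... SELECT with columns matched by name."""
    exprs = []
    for name, src_type in select_schema.items():
        target_type = target_schema.get(name)
        if target_type is None:
            raise ValueError(f"target table has no column {name!r}")
        if src_type == target_type:
            exprs.append(name)
        elif allow_type_cast:
            # Step 3: wrap mismatched columns in CAST(...).
            exprs.append(f"CAST({name} AS {target_type})")
        else:
            raise TypeError(f"{name}: {src_type} does not match {target_type}")
    # Step 4: an explicit column list keeps names and expressions aligned.
    columns = ", ".join(select_schema)
    return (
        f"INSERT INTO {target_table} ({columns}) "
        f"SELECT {', '.join(exprs)} FROM ({select_sql})"
    )


target_schema = {"id": "UInt64", "name": "String"}
select_schema = {"name": "String", "id": "UInt32"}
sql = build_insert_sql(
    "my_table", target_schema, select_schema, "SELECT * FROM source_table"
)
print(sql)
```

Emitting an explicit column list in the `INSERT` is what makes the result independent of the column order in the target table.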

🀝 Contributing:

Ideas, bug reports, and pull requests are welcome! Join the community and help make Insert Tools better.

⚖️ License

This project uses a dual-license model: a non-commercial license and a commercial license.

Insert Tools makes data insertion simple, fast, and safe. Save your time and nerves today!
