🧠 GenAI SQL Tools Suite

A production-ready suite of modular, asynchronous tools for analyzing, refactoring, commenting, and auditing SQL code using Azure OpenAI (GPT-4o).

Designed with security, performance, auditability, and HIPAA/HITECH compliance in mind.

📁 Project Structure

sql_tools/
├── app.py                      # CLI interface and controller
├── requirements.txt
├── README.md
├── LICENSE
├── core/                       # Framework and shared logic
│   ├── base_ai_client.py       # Async Azure OpenAI client
│   ├── config_loader.py        # Configuration loader (mirrors original config.py)
│   ├── logger.py               # HIPAA-compliant logging utility
│   └── sql_task_base.py        # Abstract base class for GenAI SQL tasks
├── tasks/                      # Modular GenAI SQL task classes
│   ├── sql_analyzer.py
│   ├── sql_commenter.py
│   ├── sql_data_masker.py
│   ├── sql_refactorer.py
│   ├── sql_explainer.py
│   ├── sql_security_auditor.py
│   ├── sql_test_generator.py
│   ├── sql_performance_benchmark.py # New: Added performance benchmarking and optimization
│   ├── sql_data_masker.py           # New: Data masking and anonymization
│   ├── sql_visualizer.py            # New: Visualization and insights
│   ├── sql_error_corrector.py       # New: Error correction and debugging
│   ├── sql_style_enforcer.py        # New: Style guide enforcement
│   ├── natural_language_to_sql.py   # New: Natural language to SQL conversion
├── learn/                      # Learning Mode folder
│   └── sql_learn_mode.py       # Interactive tutorials for SQL learning
├── prompts/                    # Centralized prompt management
│   ├── index.yaml              # YAML file defining all prompts and metadata
│   ├── summarization/          # Summarization-related prompt templates
│   ├── classification/         # Classification-related prompt templates
└── utils/
    ├── file_utils.py           # File I/O, backup, and directory handling
    ├── prompt_manager.py       # Centralized prompt loading and validation
    ├── sanitizer.py            # LLM output cleaner (markdown, GPT comments)
    ├── dynamic_sql_detector.py # New: Utility for dynamic SQL detection

✅ Features

🧹 Data Masking and Anonymization - Automatically masks sensitive data such as emails, phone numbers, credit card numbers, and SSNs.
⚙️ Modular task engine (comment, analyze, refactor, audit, explain, test)
🔍 Query Simulation and Validation
📋 Centralized prompt management via prompts/index.yaml
⚡ Asynchronous OpenAI integration using httpx
🧼 Sanitized output with --sanitize
🔁 Directory recursion and batching with --recursive
🧪 Preview results before modifying files with --dry-run
🔐 Backup and HIPAA-safe logging
🔀 Git integration: auto-stage files with --git
📤 Export to separate files using --output
🌐 Support for multiple SQL dialects using nl_to_sql task (e.g., T-SQL, PostgreSQL, Oracle)
🚀 Natural Language to SQL Conversion
🔐 Enhanced security audits with SQL injection and role misuse detection
📊 Performance benchmarking and optimization
🔧 Data masking and anonymization
🎨 SQL Style Guide Enforcement
🛠 Dynamic SQL Detection
🧑‍🏫 SQL Education Mode (Interactive Tutorials)

🧪 CLI Usage

🔒 Mask Sensitive Data in SQL Queries

python app.py --task=mask --path=example.sql --output=masked_example.sql

🔍 What It Does:

Task: mask — Automatically identifies and masks sensitive data such as:
- Email addresses.
- Phone numbers.
- Credit card numbers.
- Social Security Numbers (SSNs).
--output=...: Writes the masked SQL to a new file.

🔧 Enforce SQL Style Guide

python app.py --task=style_enforce --path=example.sql --sql_dialect=PostgreSQL --output=styled_example.sql

What It Does:

Task: style_enforce — Enforces SQL coding standards dynamically using AI.
--sql_dialect=...: Specifies the SQL dialect (e.g., PostgreSQL, T-SQL).
--output=...: Writes the styled SQL to a new file.

🔧 Comment a SQL file

python app.py --task=comment --path=example.sql

🧼 Clean and save to new file

python app.py --task=comment --path=example.sql --sanitize --output=cleaned_example.sql

🔍 Preview refactored query (no overwrite)

python app.py --task=refactor --path=query.sql --dry-run

🗃️ Process all .sql files in folder (with backups)

python app.py --task=analyze --path=./sql_scripts --recursive --backup

🔐 Run security audit and stage for Git

python app.py --task=audit --path=query.sql --git

🧪 Generate SQL test cases

python app.py --task=test --path=example.sql --dry-run

🚀 Natural Language to SQL Conversion (inline query)

python app.py --task=nl_to_sql --path="list all patients diagnosed with diabetes last month" --sql_dialect="PostgreSQL" --schema_path="schema/schema.json" --dry-run

🚀 Natural Language to SQL Conversion (query file)

python app.py --task=nl_to_sql --path=queries/nl_query.txt --sql_dialect="T-SQL" --schema_path="schema/HealthClaimsDW.json" --output=output/generated_query.sql

📊 Benchmark SQL query performance

python app.py --task=benchmark --path=example.sql --dry-run

📈 Visualize query execution plan

python app.py --task=visualize --path=example.sql

🔍 Dynamic SQL Detection

Detect dynamic SQL patterns and analyze risks/optimizations:

Analyze risks and optimizations - Detect patterns only:

python app.py --task=dynamic_sql --path="queries/sample_query.sql" --detect_only --dry-run

Process all SQL files in a directory recursively:

python app.py --task=dynamic_sql --path="queries/" --recursive

👩‍🏫 SQL Learning Mode (Interactive Tutorials)

The sql_learn_mode.py script provides an interactive platform for learning SQL concepts through quizzes, practice, and conversational guidance. It leverages an AI client (BaseAIClient) for generating dynamic SQL content, such as quiz questions and feedback on queries.

Run the Learning Mode directly:

python learn/sql_learn_mode.py

Optionally, you can Set as startup file from within VS Professional and click Start/F5 to run the console app. This will start the AI Agent.

🛠 Configuration

Edit core/config_loader.py to match your Azure OpenAI deployment:

AOPAI_KEY = "your-api-key"
API_BASE = "https://your-resource.openai.azure.com/"
AOPAI_API_VERSION = "2025-01-01-preview"
AOPAI_DEPLOY_MODEL = "gpt-4o-dev"

📦 Install Requirements

pip install -r requirements.txt

📋 Centralized Prompt Management

All prompts are defined in a single index.yaml file, which maps specific tasks to their associated prompt templates. This design enables:

Consistent prompt formatting
Easier version control and auditing
Reusability across multiple modules

Each task class dynamically loads its associated prompt using metadata from this file.

🧾 Example `index.yaml` Entry

commenter.add_comments:
  inline: |
    "You are a T-SQL expert. Given the SQL code below, please:
    1. Prepend a comment header block with:
       -- =============================================
       -- Author:      {user}
       -- Create date: {timestamp}
       -- Description: <Provide a detailed overview of this query>
       -- =============================================

    2. Add or improve inline comments throughout the query.
    3. Only return the updated SQL code with no markdown formatting.

    SQL Code:
    {sql_query}"
  used_by: tasks.sql_commenter.SQLCommenter
  inputs:
    - sql_query
    - user
    - timestamp
  version: 1.0
  description: Add comments and metadata headers to SQL queries.

🔐 Security & Compliance

Logs are stored per task under the logs/ directory
Safe use of T-SQL comments (--, /* ... */)
Output sanitized for code-only results when using --sanitize
Aligned with HIPAA/HITECH compliance standards

📄 License

This project is licensed under the MIT License.

🤝 Contributing (Comming Soon)

Fork and open a PR
Follow modular design and solid principles
Ensure proper logging, error handling, and secure configuration

🙌 Authors & Acknowledgments

Vision and engineering by Hans Esquivel
Powered by Python & Azure OpenAI
Thanks to the SQL and ML community for insights and best practices

Contact

Please open an issue or start a discussion if you want to get in touch.

Name		Name	Last commit message	Last commit date
Latest commit History 37 Commits
cli		cli
core		core
learn		learn
prompts		prompts
schema		schema
sql		sql
tasks		tasks
tests		tests
utils		utils
.gitattributes		.gitattributes
.gitignore		.gitignore
GenAI SQL CLI Doc.docx		GenAI SQL CLI Doc.docx
GenAI SQL CLI.sln		GenAI SQL CLI.sln
LICENSE		LICENSE
README.md		README.md
SQL_CLI_Tools.pyproj		SQL_CLI_Tools.pyproj
app.py		app.py
requirements.txt		requirements.txt
sql_cli_usage_guide.md		sql_cli_usage_guide.md
use_cases_and_review.md		use_cases_and_review.md

License

Club-Innovate/GenAI-SQL-CLI

Folders and files

Latest commit

History

Repository files navigation