This project loads Parquet files into a PostgreSQL database, using DuckDB as an intermediary.
- Python 3.8+
- DuckDB
- PostgreSQL
git clone https://github.com/claudiomansillab/parquet-to-psql.git
cd parquet-to-psql
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
Place your Parquet files in the data/input/ folder.
Copy the example.env file in the root directory to .env and fill in your PostgreSQL connection details.
python src/main.py
The data will be loaded into PostgreSQL and a summary will be generated in data/output/.
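To illustrate how DuckDB can sit between the Parquet files and PostgreSQL, here is a minimal sketch that reads connection settings in .env format and builds the SQL a DuckDB session could run through its postgres extension. It is not necessarily what src/main.py does. The variable names (PGHOST, PGPORT, PGUSER, PGPASSWORD, PGDATABASE), the helper functions, and the rule of naming each table after its file are assumptions made for this example; check example.env for the keys the project actually expects.

```python
import re
from pathlib import Path


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def table_name(parquet_path):
    """Derive a SQL-safe table name from the file name (assumed convention)."""
    stem = Path(parquet_path).stem.lower()
    name = re.sub(r"[^a-z0-9_]", "_", stem)
    # Identifiers should not start with a digit, so add a prefix if needed.
    return f"t_{name}" if name[:1].isdigit() else name


def build_statements(env, parquet_files):
    """Return the DuckDB SQL statements that copy each file into PostgreSQL."""
    # Hypothetical key names; values containing spaces or quotes are not handled.
    dsn = (
        f"dbname={env['PGDATABASE']} user={env['PGUSER']} "
        f"password={env['PGPASSWORD']} host={env['PGHOST']} port={env['PGPORT']}"
    )
    statements = [
        "INSTALL postgres;",
        "LOAD postgres;",
        f"ATTACH '{dsn}' AS pg (TYPE postgres);",
    ]
    for path in parquet_files:
        # Escape single quotes so the path is a valid SQL string literal.
        posix = Path(path).as_posix().replace("'", "''")
        statements.append(
            f"CREATE TABLE pg.{table_name(path)} AS "
            f"SELECT * FROM read_parquet('{posix}');"
        )
    return statements


# Example with inline settings instead of a real .env file.
sample_env = parse_env(
    "# PostgreSQL connection\n"
    "PGHOST=localhost\nPGPORT=5432\nPGUSER=loader\n"
    "PGPASSWORD=changeme\nPGDATABASE=analytics\n"
)
for statement in build_statements(sample_env, ["data/input/sales.parquet"]):
    print(statement)
```

With the duckdb package installed and a reachable PostgreSQL server, each statement could be passed in order to `duckdb.connect().execute(...)`. Building the statements separately from running them keeps the sketch easy to inspect before anything touches the database.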
- src/: Contains the main Python scripts.
- scripts/: Shell scripts for configuration and execution.
- data/: Folders for input and output files.
Contributions are welcome. Please open an issue to discuss major changes before submitting a pull request.