claudiomansillab/parquet_topsql

Parquet to PostgreSQL Converter (In development)

This project converts Parquet files into PostgreSQL tables, using DuckDB as an intermediary.

Requirements

  • Python 3.8+
  • DuckDB
  • PostgreSQL

Configuration

Clone this repository:

git clone https://github.com/claudiomansillab/parquet-to-psql.git

cd parquet-to-psql

Create a virtual environment and activate it:

python -m venv venv

source venv/bin/activate  # On Windows: venv\Scripts\activate

Install the dependencies:

pip install -r requirements.txt

Usage

Place your Parquet files in the data/input/ folder.
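As a rough illustration of how the input folder might be scanned, the sketch below maps each Parquet file to a target table name. The function names and the naming rule (lowercase, non-alphanumeric characters replaced by underscores, a `t_` prefix for names starting with a digit) are assumptions for illustration, not necessarily what src/main.py does.

```python
import re
from pathlib import Path


def table_name_for(parquet_path: Path) -> str:
    """Derive a conservative SQL table name from a Parquet file name.

    Hypothetical rule: the project may name tables differently.
    """
    name = re.sub(r"[^0-9a-zA-Z_]", "_", parquet_path.stem).lower()
    # Unquoted SQL identifiers cannot start with a digit, so prefix those.
    return f"t_{name}" if name[:1].isdigit() else name


def discover_parquet_files(input_dir: str = "data/input") -> dict:
    """Map table name -> path for every *.parquet file directly in input_dir."""
    files = sorted(Path(input_dir).glob("*.parquet"))
    return {table_name_for(p): p for p in files}
```

Calling `discover_parquet_files()` with no arguments scans data/input/ relative to the current working directory. Files with other extensions are ignored.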

Configuring the .env file

Edit the example.env file in the root directory with your PostgreSQL connection details and save it as .env.
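For orientation, here is a minimal sketch of how connection settings in a .env file could be turned into a PostgreSQL connection string using only the standard library. The variable names (`POSTGRES_HOST`, `POSTGRES_PORT`, `POSTGRES_DB`, `POSTGRES_USER`, `POSTGRES_PASSWORD`) are assumptions. Check example.env for the names this project actually expects.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and one layer of surrounding quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def build_dsn(env: dict) -> str:
    """Build a libpq-style key/value connection string.

    The variable names are assumed; match them to example.env.
    Values containing spaces would need quoting, which this sketch omits.
    """
    return (
        f"host={env.get('POSTGRES_HOST', 'localhost')} "
        f"port={env.get('POSTGRES_PORT', '5432')} "
        f"dbname={env['POSTGRES_DB']} "
        f"user={env['POSTGRES_USER']} "
        f"password={env['POSTGRES_PASSWORD']}"
    )
```

A typical call would be `build_dsn(parse_env(open(".env").read()))`. Host and port fall back to `localhost` and `5432` when they are not set.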

Execute the main script:

python src/main.py

The data will be loaded into PostgreSQL and a summary will be generated in data/output/.
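The loading step can be pictured roughly as follows. This is a sketch under assumptions, not the project's actual code: it assumes DuckDB's `postgres` extension is used to attach the target database, that each file becomes one new table in the default schema, and the function names and the `pg` alias are hypothetical.

```python
def load_statements(dsn: str, table: str, parquet_path: str) -> list:
    """Return DuckDB SQL statements that copy one Parquet file into PostgreSQL."""
    # Escape single quotes in string literals and double quotes in identifiers.
    dsn_sql = dsn.replace("'", "''")
    path_sql = parquet_path.replace("'", "''")
    table_sql = table.replace('"', '""')
    return [
        "INSTALL postgres",
        "LOAD postgres",
        f"ATTACH '{dsn_sql}' AS pg (TYPE postgres)",
        f"CREATE TABLE pg.\"{table_sql}\" AS SELECT * FROM read_parquet('{path_sql}')",
    ]


def load_parquet(dsn: str, table: str, parquet_path: str) -> int:
    """Run the statements and return the row count of the new table."""
    import duckdb  # imported lazily so load_statements works without DuckDB

    con = duckdb.connect()  # in-memory DuckDB database used as the intermediary
    try:
        for statement in load_statements(dsn, table, parquet_path):
            con.execute(statement)
        quoted = table.replace('"', '""')
        return con.execute(f'SELECT count(*) FROM pg."{quoted}"').fetchone()[0]
    finally:
        con.close()
```

Because the sketch uses `CREATE TABLE`, it fails if the table already exists. A real loader would need to decide whether to replace, append, or skip in that case.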

Project Structure

  • src/: Contains the main Python scripts.
  • scripts/: Shell scripts for configuration and execution.
  • data/: Folders for input and output files.

Contributions

Contributions are welcome. Please open an issue to discuss major changes before submitting a pull request.
