A conversion tool for popular data storage file types (CSV/TXT/TSV, JSON, Parquet, Excel), powered by DuckDB.
- Auto-detection: Automatically detects the input file type based on its extension.
- Multiple Formats: If you don't want to make everything Parquet (why not?!), Make-it-Parquet! also supports CSV, TXT, TSV, JSON, Parquet, and Excel conversions in both directions.
- Conversions Such As: CSV to Excel, Excel to CSV, CSV to Parquet, Parquet to JSON, and more. Make-it-Parquet! is a multi-format converter that can get your data into any of the supported formats.
- Interactive Options: Prompts for Excel sheet and range if not specified.
- Directory Conversion: When converting a directory, the tool always creates a subfolder (named after the output type) in the output destination to store the converted files.
- Flexible Aliasing: Easily alias commands (e.g., mp for general use).
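Extension-based auto-detection could look roughly like the sketch below. The EXTENSION_TO_TYPE mapping and the function names detect_input_type and majority_type are hypothetical, chosen for illustration only; the tool's actual mappings and names may differ.

```python
from collections import Counter
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical extension-to-type mapping; the real tool may recognise more.
EXTENSION_TO_TYPE = {
    ".csv": "csv",
    ".txt": "txt",
    ".tsv": "tsv",
    ".json": "json",
    ".parquet": "parquet",
    ".xlsx": "excel",
}


def detect_input_type(path: str) -> Optional[str]:
    """Return the file type implied by the extension, or None if unknown."""
    return EXTENSION_TO_TYPE.get(Path(path).suffix.lower())


def majority_type(paths: Iterable[str]) -> Optional[str]:
    """Return the most common recognised type among the given paths."""
    counts = Counter(t for t in map(detect_input_type, paths) if t is not None)
    return counts.most_common(1)[0][0] if counts else None
```

The second helper shows how the same mapping could pick the dominant type when the input is a directory; unrecognised files are simply ignored in the count.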
- Python 3.7+
- DuckDB Python package
Assuming you are using uv run (or a similar tool), you can run the script with:
uv run /path/to/make_it_parquet.py [OPTIONS]
The basic usage from the command line is as follows:
Usage: mp <input_path> [OPTIONS]
Arguments:
  input_path
      Path to a file or directory containing files.
Options:
  -i, --input_type TEXT
      Override auto-detection of input file type.
      Allowed values: csv, txt, tsv, json, parquet, pq, excel, ex.
  -o, --output_type TEXT
      Desired output file type.
      Allowed values: csv, tsv, json, parquet, pq, excel, ex.
  -op, --output-path TEXT
      Output file (if input is a single file) or directory (if input is a folder).
      For directory input, a subfolder named after the output type is always created.
  -s, --sheet TEXT
      For Excel input: sheet number or sheet name to import (e.g. 1 or "Sheet1").
  -c, --range TEXT
      For Excel input: cell range to import (e.g. A1:B2).
  -d, --delimiter TEXT
      Defines the delimiter for TXT export. Pass 't' for tab-separated, 'c' for comma-separated, or provide a literal value.
      If not provided, the tool will prompt you.
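The -d shorthand described above can be sketched as a small lookup. The function name resolve_delimiter is hypothetical, and the interactive prompt for a missing value is left out of this sketch.

```python
def resolve_delimiter(value: str) -> str:
    """Map the -d shorthand to a delimiter; any other value is used literally."""
    # 't' and 'c' are the shorthands named in the option description.
    shorthand = {"t": "\t", "c": ","}
    return shorthand.get(value, value)
```

Under this reading, passing -d "|" would produce a pipe-separated TXT file, since anything other than 't' or 'c' is taken as the literal delimiter.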
- Convert a Single Excel File to Parquet:
  mp /path/to/file.xlsx -i excel -o pq
- Convert All Files in a Folder to CSV:
  mp /path/to/folder -o csv
- Convert an Excel File to CSV with a Specified Sheet and Range:
  mp /path/to/file.xlsx -i excel -s 1 -c A2:E7 -o csv
- Convert Any File Type to Parquet:
  mp /path/to/file_or_folder -o pq
- Convert a CSV File to JSON (auto-detecting the input type):
  mp /path/to/file.csv -o json
- Convert All Supported Files in a Directory to TXT Format and Specify a Delimiter for TXT Export:
  mp /path/to/folder -o txt -d t
To simplify usage, you can set up an alias using uv run in your shell configuration.
For Bash, add the following line to your ~/.bashrc:
alias mp='uv run /path/to/make_it_parquet.py'
Then reload your shell:
source ~/.bashrc
For Zsh, add the following line to your ~/.zshrc:
alias mp='uv run /path/to/make_it_parquet.py'
Then reload your shell:
source ~/.zshrc
For Fish shell, add the following function to your ~/.config/fish/config.fish:
function mp
    uv run /path/to/make_it_parquet.py $argv
end
Then reload your configuration:
source ~/.config/fish/config.fish
- Input Processing:
  - If the input path is a file, Make-It-Parquet! converts the file to the specified output format and derives the output file name by changing the extension.
  - If the input is a directory, the tool scans the folder to determine the majority file type (using predefined naming mappings) and generates an output directory name accordingly.
- File Conversions:
  - Conversion functions, defined in the tool's conversions.py module, handle the actual file format conversions using DuckDB SQL commands.
  - For Excel conversions, if the number of rows exceeds an effective limit, the export is paginated into multiple files.
  - Whether converting CSV to Excel, Excel to CSV, CSV to Parquet, Parquet to JSON, or any other supported pair, the tool converts data files from one format to another with a single command.
- CLI Options:
  - Input and output types can be overridden by CLI options (-i and -o).
  - For Excel inputs, -s and -c allow you to specify the sheet and cell range.
  - For TXT exports, use the -d option to select the delimiter (the tool prompts if it is not supplied).
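To illustrate what "DuckDB SQL commands" can mean for a conversion, here is a minimal sketch that builds a COPY statement around one of DuckDB's reader functions. It is not the contents of conversions.py: the names READERS, COPY_OPTIONS, quote, build_copy_sql, and convert are hypothetical, and only CSV, JSON, and Parquet are covered (Excel and delimiter handling are omitted).

```python
# Hypothetical lookup tables: DuckDB reader function per input type,
# and COPY options per output type.
READERS = {
    "csv": "read_csv_auto",
    "json": "read_json_auto",
    "parquet": "read_parquet",
}
COPY_OPTIONS = {
    "csv": "FORMAT CSV, HEADER",
    "json": "FORMAT JSON",
    "parquet": "FORMAT PARQUET",
}


def quote(path: str) -> str:
    """Quote a path as a SQL string literal, doubling single quotes."""
    return "'" + path.replace("'", "''") + "'"


def build_copy_sql(src: str, src_type: str, dst: str, dst_type: str) -> str:
    """Build a DuckDB COPY statement that reads src and writes dst."""
    reader = READERS[src_type]
    options = COPY_OPTIONS[dst_type]
    return f"COPY (SELECT * FROM {reader}({quote(src)})) TO {quote(dst)} ({options})"


def convert(src: str, src_type: str, dst: str, dst_type: str) -> None:
    """Run the conversion; requires the DuckDB Python package."""
    import duckdb  # imported lazily so the SQL builder works without DuckDB

    duckdb.execute(build_copy_sql(src, src_type, dst, dst_type))
```

For example, convert("/path/to/file.csv", "csv", "/path/to/file.parquet", "parquet") would issue a single COPY statement, so DuckDB streams the data from one format to the other without loading it into Python objects.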
Licensed under the MIT Licence. See LICENSE for more information.
- Thanks to DuckDB for providing a robust SQL engine for on-the-fly file conversions.
- Special thanks to contributors and users who helped refine Make-It-Parquet!
Below is a comprehensive list of supported conversions:
- convert csv to txt
- convert csv to tsv
- convert csv to json
- convert csv to parquet
- convert csv to excel
- convert txt to csv
- convert txt to tsv
- convert txt to json
- convert txt to parquet
- convert txt to excel
- convert tsv to csv
- convert tsv to txt
- convert tsv to json
- convert tsv to parquet
- convert tsv to excel
- convert json to csv
- convert json to txt
- convert json to tsv
- convert json to parquet
- convert json to excel
- convert parquet to csv
- convert parquet to txt
- convert parquet to tsv
- convert parquet to json
- convert parquet to excel
- convert excel to csv
- convert excel to txt
- convert excel to tsv
- convert excel to json
- convert excel to parquet
Happy converting!