RowGen is a command-line tool that generates synthetic data and inserts it into your database. It uses AI to create realistic fake data based on your database schema.
- AI-Powered Fake Data: Uses HuggingFace’s NLP models to generate realistic text, numbers, and structured data.
- SQL-Compatible: Directly executes INSERT statements or exports them into .sql files for easy database import.
- Customizable Schemas: Define table structures and let RowGen fill in the rest.
- Poetry-Managed: Clean dependency management and virtual environments.
Suppose you have a database such as the following:
Column | Type | Constraints |
---|---|---|
author_id | SERIAL | PRIMARY KEY |
name | VARCHAR(100) | NOT NULL |
email | VARCHAR(100) | UNIQUE |
Column | Type | Constraints |
---|---|---|
store_id | SERIAL | PRIMARY KEY |
name | VARCHAR(100) | NOT NULL |
location | VARCHAR(255) | NOT NULL |
Column | Type | Constraints |
---|---|---|
book_id | SERIAL | PRIMARY KEY |
title | VARCHAR(200) | NOT NULL |
publication_date | DATE | |
price | NUMERIC(10, 2) | CHECK (price >= 0) |
author_id | INTEGER | |
RowGen generates realistic sample data for this database, respecting foreign key relations and constraints:
author_id | name | email |
---|---|---|
1 | Margaret Atwood | margaret.atwood@example.com |
2 | Haruki Murakami | haruki.murakami@example.com |
3 | J.K. Rowling | jk.rowling@example.com |
4 | George Orwell | george.orwell@example.com |
5 | Agatha Christie | agatha.christie@example.com |
store_id | name | location |
---|---|---|
1 | Book Haven | 123 Main St, New York, NY |
2 | Literary Corner | 456 Elm St, San Francisco, CA |
3 | Page Turner | 789 Oak St, Chicago, IL |
4 | Novel Nook | 101 Pine St, Seattle, WA |
5 | The Bookworm | 202 Maple St, Boston, MA |
book_id | title | publication_date | price | author_id | store_id |
---|---|---|---|---|---|
1 | The Handmaid's Tale | 1985-08-01 | 12.99 | 1 | 1 |
2 | Norwegian Wood | 1987-09-04 | 14.5 | 2 | 2 |
3 | Harry Potter and the Philosopher's Stone | 1997-06-26 | 10.99 | 3 | 3 |
4 | 1984 | 1949-06-08 | 9.99 | 4 | 4 |
5 | Murder on the Orient Express | 1934-01-01 | 11.25 | 5 | 5 |
6 | The Testaments | 2019-09-10 | 15.99 | 1 | 1 |
7 | Kafka on the Shore | 2002-09-12 | 13.75 | 2 | 2 |
8 | Harry Potter and the Chamber of Secrets | 1998-07-02 | 11.99 | 3 | 3 |
9 | Animal Farm | 1945-08-17 | 8.5 | 4 | 4 |
10 | And Then There Were None | 1939-11-06 | 10.99 | 5 | 5 |
Notes:
- Foreign keys (author_id, store_id) are linked correctly.
- Constraints such as NOT NULL and CHECK on price are respected.
- Email addresses are linked with mailto: for easy access.
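The notes above can be spot-checked by loading a few of the sample rows into an in-memory SQLite database and confirming that invalid rows are rejected. This is only a sketch: the table names `authors`, `stores`, and `books` and the `REFERENCES` clauses are assumptions (the schema listing does not show them), and SQLite's `INTEGER PRIMARY KEY` stands in for `SERIAL`.

```python
import sqlite3

# Assumed table names and REFERENCES clauses; the schema listing does not show them.
SCHEMA = """
CREATE TABLE authors (
    author_id INTEGER PRIMARY KEY,
    name      VARCHAR(100) NOT NULL,
    email     VARCHAR(100) UNIQUE
);
CREATE TABLE stores (
    store_id INTEGER PRIMARY KEY,
    name     VARCHAR(100) NOT NULL,
    location VARCHAR(255) NOT NULL
);
CREATE TABLE books (
    book_id          INTEGER PRIMARY KEY,
    title            VARCHAR(200) NOT NULL,
    publication_date DATE,
    price            NUMERIC(10, 2) CHECK (price >= 0),
    author_id        INTEGER REFERENCES authors(author_id),
    store_id         INTEGER REFERENCES stores(store_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# A few rows taken from the sample output above.
conn.execute("INSERT INTO authors VALUES (?, ?, ?)",
             (1, "Margaret Atwood", "margaret.atwood@example.com"))
conn.execute("INSERT INTO stores VALUES (?, ?, ?)",
             (1, "Book Haven", "123 Main St, New York, NY"))
conn.execute("INSERT INTO books VALUES (?, ?, ?, ?, ?, ?)",
             (1, "The Handmaid's Tale", "1985-08-01", 12.99, 1, 1))


def is_rejected(params):
    """Return True if inserting this book row violates a constraint."""
    try:
        conn.execute("INSERT INTO books VALUES (?, ?, ?, ?, ?, ?)", params)
    except sqlite3.IntegrityError:
        return True
    return False


# A negative price violates the CHECK constraint.
negative_price = is_rejected((2, "Bad Price", None, -1, 1, 1))
# author_id 99 does not exist, so the foreign key is violated.
missing_author = is_rejected((3, "Orphan Book", None, 5.0, 99, 1))
print(negative_price, missing_author)
```

Both invalid rows should be refused, leaving only the valid sample row in `books`.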
pip install rowgen
rowgen --user <username> --database <dbname> [options]
rowgen --db-type postgresql --host localhost --port 5432 --user myuser --database mydb
rowgen --db-url postgresql://user:password@localhost:5432/mydb
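To see how the single connection URL relates to the individual flags, the sketch below splits the example URL into its parts with Python's standard library. It does not show how RowGen itself parses `--db-url`; the helper name `split_db_url` and the dictionary keys are hypothetical and chosen only to mirror the flag names.

```python
from urllib.parse import urlsplit


def split_db_url(db_url):
    """Break a database URL into the pieces the separate flags would carry."""
    parts = urlsplit(db_url)
    return {
        "db_type": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "user": parts.username,
        "password": parts.password,
        "database": parts.path.lstrip("/"),
    }


print(split_db_url("postgresql://user:password@localhost:5432/mydb"))
```

Note that a password embedded in a URL ends up in shell history, so the separate flags may be preferable on shared machines.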
RowGen creates an inserts.sql file with the generated statements:
rowgen --user myuser --database mydb --rows 50
rowgen --user myuser --database mydb --execute
rowgen --user myuser --database mydb --output custom_inserts.sql
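A generated file can later be imported with any SQL client. As one possible approach, the sketch below loads such a file into a SQLite database using Python's standard library. It assumes the file contains plain INSERT statements that SQLite accepts and that the target tables already exist; the helper name `load_sql_file` is hypothetical.

```python
import sqlite3
from pathlib import Path


def load_sql_file(db_path, sql_path):
    """Run every statement in sql_path against the SQLite database at db_path."""
    script = Path(sql_path).read_text(encoding="utf-8")
    conn = sqlite3.connect(db_path)
    try:
        conn.executescript(script)
        conn.commit()
    finally:
        conn.close()


# Example call (paths are placeholders):
# load_sql_file("/path/to/database.db", "custom_inserts.sql")
```

For PostgreSQL or MySQL, the database's own command-line client is usually the simpler way to import the file.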
rowgen --user myuser --database mydb --apikey YOUR_HUGGINGFACE_API_KEY
If no API key is provided, RowGen will prompt you to enter one and save it in ~/.config/rowgen/conf for future use.
Generate 100 rows for a PostgreSQL database and save to file:
rowgen --db-type postgresql --host db.example.com --user admin --database production --rows 100 --output prod_data.sql
Generate and immediately insert 25 rows into a MySQL database:
rowgen --db-type mysql --host localhost --user root --database test --rows 25 --execute
Use SQLite with direct execution:
rowgen --db-type sqlite --database /path/to/database.db --execute
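When RowGen runs as part of a script, it can help to assemble the command from parameters instead of hand-editing strings. The sketch below builds an argument list using only flags that appear in the examples above; the helper `build_rowgen_command` is hypothetical and not part of RowGen.

```python
import shlex


def build_rowgen_command(database, user=None, db_type=None, host=None,
                         rows=None, output=None, execute=False):
    """Assemble a rowgen argument list from the flags shown in this README."""
    cmd = ["rowgen"]
    if db_type:
        cmd += ["--db-type", db_type]
    if host:
        cmd += ["--host", host]
    if user:
        cmd += ["--user", user]
    cmd += ["--database", database]
    if rows is not None:
        cmd += ["--rows", str(rows)]
    if output:
        cmd += ["--output", output]
    if execute:
        cmd.append("--execute")
    return cmd


cmd = build_rowgen_command("production", user="admin", db_type="postgresql",
                           host="db.example.com", rows=100,
                           output="prod_data.sql")
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting problems with paths or names that contain spaces.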
- Connection Issues: Verify your database credentials and that the server is accessible.
- API Key Problems: Check that your HuggingFace API key is valid and has sufficient permissions.
- Permission Errors: Ensure you have write access to the output directory when saving to file.
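For connection issues, a quick way to tell whether the database server is reachable at all is to attempt a plain TCP connection. The sketch below is independent of RowGen; the helper name `can_connect` is hypothetical, and it only checks network reachability, not credentials.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (5432 is the PostgreSQL port used in the examples above):
# can_connect("localhost", 5432)
```

If this returns False, the problem lies with the network, firewall, or server process, not with the database credentials.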
For more information, run:
rowgen --help