aRageQueen/hyacinth

hyacinth

hyacinth scans plain text files for common forms of personally identifiable information (PII), including email addresses, US phone numbers, Social Security Numbers (SSNs), dates of birth, credit card numbers, and physical addresses, and redacts them using regex-based pattern matching. It is designed for developers who need data hygiene without heavy dependencies.

Features

Detects and redacts:

  • Email addresses
  • US phone numbers
  • Social Security Numbers (SSNs)
  • Dates of birth
  • Credit card numbers
  • Physical addresses

Also provides:

  • Simple CLI interface
  • Lightweight and dependency-free (pure Python)
  • Easily extensible with new patterns or input types
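
To make "regex-based pattern matching" concrete, here is a minimal sketch of the general technique. The patterns, the `PATTERNS` table, the `redact` function, and the `[LABEL REDACTED]` placeholder format are all assumptions made for illustration; the actual definitions in hyacinth may differ.

```python
import re

# Illustrative patterns only (assumed, not copied from hyacinth).
PATTERNS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # US phone numbers written as (555) 123-4567 or 555-123-4567.
    "PHONE": re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"),
}


def redact(text):
    """Replace every match of every pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


sample = "Reach Jo at jo@example.com or (555) 123-4567. SSN: 123-45-6789."
print(redact(sample))
# → Reach Jo at [EMAIL REDACTED] or [PHONE REDACTED]. SSN: [SSN REDACTED].
```

Patterns this simple miss unusual formats and can flag look-alike strings, so redacted output is worth spot-checking before it is shared.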

Installation

Clone the repository:

git clone https://github.com/aRageQueen/hyacinth.git
cd hyacinth

Usage

  1. Run hyacinth from the command line:
    python hyacinth.py
  2. Choose option 1 to scrub a file.
  3. When prompted, enter the path of the file to scrub (for example, single_entry_test_redacted.txt).
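
The scrub step above can be pictured as read, redact, write. The sketch below is only an assumption about how such a step could work: `scrub_file`, the single stand-in SSN pattern, and the `_redacted` output name are hypothetical, and this README does not say whether hyacinth writes a new file or overwrites the input.

```python
import re
import tempfile
from pathlib import Path

# Stand-in pattern so the sketch runs on its own (assumed, not hyacinth's).
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scrub_file(path):
    """Write a redacted copy next to the input file and return its path."""
    source = Path(path)
    cleaned = SSN.sub("[REDACTED]", source.read_text(encoding="utf-8"))
    # Hypothetical naming scheme: test.txt -> test_redacted.txt
    target = source.with_name(f"{source.stem}_redacted{source.suffix}")
    target.write_text(cleaned, encoding="utf-8")
    return target


# Demonstrate on a throwaway file so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "test.txt"
    sample_path.write_text("SSN: 123-45-6789\n", encoding="utf-8")
    out_path = scrub_file(sample_path)
    out_name = out_path.name
    result = out_path.read_text(encoding="utf-8")

print(out_name, result)
```

Writing to a separate file keeps the original intact, which makes it easy to compare the two and confirm that nothing was missed.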

Tool structure

hyacinth/
├── hyacinth.py       # Main CLI script
├── patterns.py       # Regex definitions
├── utils.py          # Redaction logic
├── test.txt          # Example input file
└── README.md         # Project documentation
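
Keeping regex definitions apart from redaction logic is what makes new patterns cheap to add. The following sketch shows the idea under stated assumptions: the `PATTERNS` table stands in for patterns.py, `redact` stands in for utils.py, and the IPv4 entry is a made-up extension; none of these names are confirmed by the repository.

```python
import re

# Hypothetical stand-in for patterns.py: a label -> compiled regex table.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


# Hypothetical stand-in for utils.py: logic that never names a PII type.
def redact(text, patterns=PATTERNS):
    for label, pattern in patterns.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


# Adding a new PII type is then a single table entry (IPv4 is just an example).
PATTERNS["IPV4"] = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

print(redact("Login from 192.168.0.1 by 123-45-6789"))
# → Login from [IPV4 REDACTED] by [SSN REDACTED]
```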
