Welcome to the protegrity-developer-edition repository, part of the Protegrity Developer Edition suite. This repository provides a self-contained experimentation platform for discovering and protecting sensitive data using Protegrity’s Data Discovery and Protection APIs.
This repository enables developers to:
- Rapidly set up a local environment using Docker Compose.
- Experiment with unstructured text classification and PII redaction.
- Integrate Protegrity APIs into GenAI and traditional applications.
- Use sample applications and data to understand integration workflows.
.
├── CHANGELOG
├── CONTRIBUTIONS.md
├── LICENSE
├── README.md
├── data-discovery
│   ├── sample-classification-commands.sh
│   └── sample-classification-python.py
├── docker-compose.yml
└── samples
    ├── config.json
    ├── requirements.txt
    ├── sample-app-find-and-redact.py
    ├── sample-app-find.py
    └── sample-data
        └── sample-find-redact.txt
- Data Discovery: REST-based classification of unstructured text using Data Discovery.
- Data Protection: Integration with a sample Python application for redaction or masking.
- Sample App: Demonstrates how to find and redact PII.
- Cross-platform: Works on Linux, Windows, and MacOS.
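As a rough illustration of the REST-based classification listed above, the following sketch builds a POST request for the classify endpoint. The URL is the api_endpoint value from samples/config.json. The plain-text request body and the Content-Type header are assumptions rather than confirmed API details, and the function name build_classify_request is hypothetical.

```python
import urllib.request

# Endpoint as it appears in samples/config.json.
CLASSIFY_URL = "http://localhost:8580/pty/data-discovery/v1.0/classify"


def build_classify_request(text: str, url: str = CLASSIFY_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the text to classify.

    Assumption: the service accepts the raw text as a plain-text body.
    Check the API documentation for the actual request format.
    """
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


request = build_classify_request("Jane Doe's card number is 4111 1111 1111 1111.")
print(request.get_method(), request.full_url)
```

Once the Docker Compose services are running, the request could be sent with urllib.request.urlopen(request) and the response body inspected for the detected entities.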
- Python >= 3.9.23
- pip
- Python Virtual Environment
- Container management software:
- For Linux/Windows: Docker
- For MacOS: Docker Desktop or Colima
- Docker Compose V2
- Git
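The prerequisites above can be checked with a short script before continuing. This is a minimal sketch: the helper names are hypothetical, and the executable names pip, docker, and git are assumptions about how the tools are installed on your system (for example, pip may be available only as pip3).

```python
import shutil
import sys

MIN_PYTHON = (3, 9, 23)  # minimum version listed in the prerequisites
REQUIRED_TOOLS = ("pip", "docker", "git")  # assumed executable names on PATH


def python_version_ok(version=None) -> bool:
    """Return True when the given (or running) Python version meets the minimum."""
    current = tuple(version if version is not None else sys.version_info[:3])
    return current >= MIN_PYTHON


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which) -> list:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


print("Python version OK:", python_version_ok())
print("Missing tools:", missing_tools() or "none")
```

The script only looks for executables on PATH. It does not verify that Docker Compose V2 is installed or that the Docker daemon is running.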
Linux and Windows users can proceed to Setup Instructions.
Additional settings for MacOS
MacOS requires additional steps for Docker and for systems with Apple Silicon chips. Complete the following steps before using Developer Edition.
- Complete one of the following options to apply the settings.
  - For Colima:
    - Open a command prompt.
    - Run the following command.
      colima start --vm-type vz --vz-rosetta
  - For Docker Desktop:
    - Open Docker Desktop.
    - Go to Settings > General.
    - Enable the following check boxes:
      - Use Virtualization framework
      - Use Rosetta for x86_64/amd64 emulation on Apple Silicon
    - Click Apply & restart.
- Update one of the following options to resolve certificate-related errors.
  - For Colima:
    - Open a command prompt.
    - Open the following file.
      ~/.colima/default/colima.yaml
    - Update the following configuration in colima.yaml to add the registry path for obtaining the required images.
      Before update:
        docker: {}
      After update:
        docker:
          insecure-registries:
            - ghcr.io
    - Save and close the file.
    - Stop colima.
      colima stop
    - Close and reopen the command prompt.
    - Start colima.
      colima start --vm-type vz --vz-rosetta
  - For Docker Desktop:
    - Open Docker Desktop.
    - Click the gear or settings icon.
    - Click Docker Engine from the sidebar. The editor opens with your current Docker daemon configuration, daemon.json.
    - Add the insecure-registries key to the root JSON object. Ensure that you add a comma after the last value in the existing configuration.
      After update:
        {
          .
          .
          <existing configuration>,
          "insecure-registries": [
            "ghcr.io",
            "githubusercontent.com"
          ]
        }
    - Click Apply & Restart to save the changes and restart Docker Desktop.
    - Verify: After Docker restarts, run docker info in your terminal and confirm that the required registry is listed under Insecure Registries.
- Optional: Complete the following steps if this error is displayed: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested.
  - Open a command prompt.
  - Open the following file.
    ~/.docker/config.json
  - Add the following parameter.
    "default-platform": "linux/amd64"
  - Save and close the file.
  - Run the docker compose up -d command from the protegrity-developer-edition directory if the repository is already cloned; otherwise, proceed to Setup Instructions.
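For the Docker Desktop option in the certificate step above, the change to daemon.json amounts to merging one key into the existing root object. The sketch below shows that merge on an in-memory dictionary. The function name and the example "experimental" setting are hypothetical, and the sketch does not read or write the real daemon.json file.

```python
import json


def add_insecure_registries(daemon_config: dict, registries: list) -> dict:
    """Return a copy of the config with `registries` listed under "insecure-registries".

    Existing keys and existing registry entries are kept; duplicates are skipped.
    """
    updated = dict(daemon_config)
    merged = list(updated.get("insecure-registries", []))
    for registry in registries:
        if registry not in merged:
            merged.append(registry)
    updated["insecure-registries"] = merged
    return updated


# Hypothetical existing configuration with one unrelated setting.
existing = {"experimental": False}
result = add_insecure_registries(existing, ["ghcr.io", "githubusercontent.com"])
print(json.dumps(result, indent=2))
```

Printing the merged object with json.dumps produces valid JSON, which avoids the missing-comma mistake that is easy to make when editing the configuration by hand.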
Complete the steps provided here to clone, install, find, and test the Developer Edition.
- Open a command prompt.
- Clone the git repository.
  git clone https://github.com/Protegrity-Developer-Edition/protegrity-developer-edition.git
- Navigate to the protegrity-developer-edition directory in the cloned location.
- Start the Data Discovery services in the background. The dependent containers are large, so they might take time to download and deploy, depending on your network connection.
  Based on your configuration, use the docker compose up -d command or the docker-compose up -d command.
- Install the protegrity-developer-python module. It is recommended to create and activate a Python virtual environment before installing the module.
  pip install protegrity-developer-python
  The installation completes and a success message is displayed.
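The setup steps above can also be scripted. The following sketch only assembles the commands in order and prints them; it does not execute anything. The helper names are hypothetical, and the has_v2_plugin flag stands in for whatever check you use to decide between the docker compose and docker-compose forms.

```python
REPO_URL = "https://github.com/Protegrity-Developer-Edition/protegrity-developer-edition.git"


def compose_command(has_v2_plugin: bool) -> list:
    """Pick `docker compose` (V2 plugin) or the standalone `docker-compose` binary."""
    base = ["docker", "compose"] if has_v2_plugin else ["docker-compose"]
    return base + ["up", "-d"]


def setup_commands(has_v2_plugin: bool = True) -> list:
    """Return the setup commands, in the order of the steps, as argument lists."""
    return [
        ["git", "clone", REPO_URL],
        compose_command(has_v2_plugin),  # run inside protegrity-developer-edition
        ["pip", "install", "protegrity-developer-python"],
    ]


for command in setup_commands():
    print(" ".join(command))
```

To run the commands for real, each argument list could be passed to subprocess.run with check=True, using the cloned protegrity-developer-edition directory as the working directory for the Compose command.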
Complete the steps provided here to run the sample application. The sample application reads the sample-find-redact.txt file, classifies and redacts the sensitive data, and saves the result as output.txt in the samples/sample-data folder.
- Open a command prompt.
- Navigate to the protegrity-developer-edition directory in the cloned location.
- Run the sample application.
  python samples/sample-app-find-and-redact.py
💡Note: By default, all sensitive data is redacted, even if the entities are not mapped in the named_entity_map configuration.
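The behavior described in the note can be pictured with a small, self-contained sketch. It is not the protegrity-developer-python implementation: the function name, the (start, end, entity_type) tuple shape, the PERSON entity type, and the bracketed label format are all assumptions made for illustration. The point it shows is that an entity missing from named_entity_map is still redacted, using its own entity type as the label.

```python
def redact_spans(text: str, entities: list, named_entity_map: dict) -> str:
    """Replace each detected span with a bracketed label.

    `entities` holds (start, end, entity_type) tuples (hypothetical shape).
    Entity types missing from `named_entity_map` are still redacted,
    falling back to the entity type itself as the label.
    """
    result = text
    # Work from the end of the text so earlier offsets stay valid.
    for start, end, entity_type in sorted(entities, reverse=True):
        label = named_entity_map.get(entity_type, entity_type)
        result = result[:start] + "[" + label + "]" + result[end:]
    return result


sample = "Card 4111111111111111 belongs to Jane."
found = [(5, 21, "CREDIT_CARD"), (33, 37, "PERSON")]
print(redact_spans(sample, found, {"CREDIT_CARD": "CCN"}))
# prints: Card [CCN] belongs to [PERSON].
```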
Edit samples/config.json to customize the Python module:
- API endpoint (Default: localhost)
- Named entity mappings
- Redaction method (redact or mask, Default: redact)
- Masking character (Default: #)
- Classification score threshold (Default: 0.6)
- Enable logging (Default: true)
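Before running the sample, the options listed above can be sanity-checked with a few lines of Python. This is a sketch under stated assumptions: validate_config is a hypothetical helper, the key names are those used in samples/config.json, and the 0 to 1 range for the classification threshold is an assumption about how scores are expressed.

```python
import json

ALLOWED_METHODS = {"redact", "mask"}


def validate_config(config: dict) -> list:
    """Return a list of problems found; an empty list means no problems were detected."""
    problems = []
    if config.get("redaction_method", "redact") not in ALLOWED_METHODS:
        problems.append("redaction_method must be 'redact' or 'mask'")
    if len(str(config.get("masking_character", "#"))) != 1:
        problems.append("masking_character must be a single character")
    threshold = config.get("classification_threshold", 0.6)
    is_number = isinstance(threshold, (int, float)) and not isinstance(threshold, bool)
    if not is_number or not 0 <= threshold <= 1:  # assumed score range
        problems.append("classification_threshold must be a number between 0 and 1")
    if not isinstance(config.get("enable_logging", True), bool):
        problems.append("enable_logging must be true or false")
    return problems


example = json.loads('{"redaction_method": "mask", "classification_threshold": 0.6}')
print(validate_config(example))  # prints: []
```

Missing keys fall back to the defaults listed above, so a partial configuration is treated as valid.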
{
  "api_endpoint": "http://localhost:8580/pty/data-discovery/v1.0/classify",
  "named_entity_map": {
    "CREDIT_CARD": "CCN",
    "DATE_TIME": "DATE"
  },
  "redaction_method": "redact",
  "masking_character": "#",
  "classification_threshold": 0.6,
  "enable_logging": true
}
- The Protegrity Developer Edition documentation is available at http://developer.docs.protegrity.com/.
- For API reference and tutorials, visit the Developer Portal at https://www.protegrity.com/developers.
- Join the discussion on https://github.com/orgs/Protegrity-Developer-Edition/discussions.
- Anonymous downloads supported; registration required for participation.
See LICENSE for terms and conditions.