Civic Transparency Software Development Kit (SDK)

Synthetic data generation toolkit for civic transparency research and testing.

Installation

pip install civic-transparency-sdk

What This Package Provides

Synthetic Data Generation: Create realistic transparency data for testing and research without requiring real user data. Generate controlled datasets with reproducible seeds for studying information dynamics.
Internal Data Structures: Simulation-specific types (WindowAgg, ContentHash, TopHash) for generating and manipulating synthetic transparency data.
Database Integration: Convert generated data to DuckDB/SQL databases for analysis with ready-to-use schemas and indexing patterns.
CLI Tools: Command-line utilities for generating worlds, converting formats, and managing synthetic datasets.

Note: This SDK generates synthetic data for research/testing. For implementing the PTag API specification, see civic-transparency-ptag-types, which provides the official API response types.

Quick Start

Generate synthetic data:

# Activate environment
source .venv/bin/activate  # Linux/Mac
# or
.venv\Scripts\activate     # Windows

# Generate baseline world
ct-sdk generate --world A --topic-id aa55ee77 --out world_A.jsonl

# Convert to database
ct-sdk convert --jsonl world_A.jsonl --duck world_A.duckdb --schema schema/schema.sql

The generated DuckDB files are ready for analysis with any SQL-compatible tools or custom analysis scripts.
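Before converting, it can help to take a quick look at the generated JSONL file. The following is a minimal sketch using only the Python standard library. It assumes nothing about the SDK's output schema beyond one JSON object per line; the helper name `summarize_jsonl` and the demo field names are hypothetical and are not part of the SDK.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def summarize_jsonl(path):
    """Return (record_count, Counter of top-level keys) for a JSONL file."""
    count = 0
    keys = Counter()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            count += 1
            if isinstance(record, dict):
                keys.update(record.keys())
    return count, keys


# Demo on a throwaway file. The field names below are hypothetical,
# not the SDK's actual output schema.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.jsonl"
    demo_path.write_text(
        '{"window": 0, "n_posts": 12}\n{"window": 1, "n_posts": 9}\n',
        encoding="utf-8",
    )
    demo_count, demo_keys = summarize_jsonl(demo_path)

print(demo_count, sorted(demo_keys))
```

To inspect real output, call `summarize_jsonl("world_A.jsonl")` on the file produced by the generate step.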

Use Cases

Academic Research: Generate controlled datasets with known parameters for studying information dynamics, coordination patterns, and transparency system behaviors.
Algorithm Development: Build and test transparency tools using synthetic data that mimics real-world patterns without privacy concerns.
Testing & Validation: Create reproducible test datasets for developing transparency systems without requiring real user data.
Education: Provide realistic datasets for teaching transparency concepts, data analysis, and system design.

Reproducibility

All generation is deterministic:

Seed-based randomization: Same seed produces identical datasets
Version tracking: Metadata includes package versions
Parameter logging: All generation settings preserved in output
Schema versioning: Database structures fully documented

Example seeds:

World A (baseline): 4242
World B (influenced): 8484
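
The idea behind seed-based determinism can be illustrated with a short sketch. This is not the SDK's generator: `generate_counts` is a hypothetical stand-in that draws values from an isolated `random.Random` instance, and only the two seed values come from the list above.

```python
import random


def generate_counts(seed, n_windows=5):
    """Return pseudo-random per-window counts from an isolated RNG.

    Hypothetical stand-in for a generator; not the SDK's implementation.
    """
    rng = random.Random(seed)  # local generator; global random state untouched
    return [rng.randint(0, 100) for _ in range(n_windows)]


SEED_WORLD_A = 4242  # baseline seed listed above
SEED_WORLD_B = 8484  # influenced seed listed above

run_1 = generate_counts(SEED_WORLD_A)
run_2 = generate_counts(SEED_WORLD_A)
other = generate_counts(SEED_WORLD_B)

print(run_1 == run_2)  # same seed, same sequence
```

Using a dedicated `random.Random(seed)` instance rather than the module-level functions keeps runs reproducible even when other code in the same process also draws random numbers.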

Package Structure

ci.transparency.sdk/
├── cli/            # Command-line interface (ct-sdk)
├── digests.py      # Content fingerprinting (SimHash64, MinHashSig)
├── hash_core.py    # Content identification (HashId, ContentHash, TopHash)
├── ids.py          # ID management (WorldId, TopicId)
├── io_schema.py    # JSON serialization utilities
└── window_agg.py   # Window aggregation structure (WindowAgg)
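
The tree above names SimHash64 as one of the content fingerprints in digests.py. For readers unfamiliar with the technique, here is a minimal sketch of a generic 64-bit SimHash over whitespace tokens. It is an illustration of the general algorithm only; the SDK's own tokenization, hash function, and API may differ, and the names `simhash64` and `hamming_distance` are hypothetical.

```python
import hashlib


def simhash64(tokens):
    """Compute a 64-bit SimHash fingerprint from an iterable of string tokens.

    Generic illustration of the technique; not the SDK's SimHash64 type.
    """
    weights = [0] * 64
    for token in tokens:
        # Hash each token to a stable 64-bit integer.
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        value = int.from_bytes(digest, "big")
        # Each token votes +1 or -1 on every bit position.
        for bit in range(64):
            weights[bit] += 1 if (value >> bit) & 1 else -1
    # A fingerprint bit is set where the positive votes won.
    fingerprint = 0
    for bit, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


base = "city council approves new transit budget for next year".split()
near = "city council approves new transit budget for this year".split()

fp_base = simhash64(base)
fp_near = simhash64(near)
print(hamming_distance(fp_base, fp_near))
```

Texts that share most of their tokens tend to produce fingerprints with a small Hamming distance, which is what makes this kind of digest useful for grouping near-duplicate content without storing the content itself.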

Related Projects

Civic Transparency PTag Spec - Official API specification
Civic Transparency PTag Types - Python types for PTag API responses (use this for API implementation)
Civic Transparency Verify - Statistical verification tools (private)

Security Model

This package provides synthetic data generation for research and testing. It does not include:

Detection algorithms or thresholds
Verification workflows or assessment criteria
Operational patterns or alerting rules

These are maintained separately to prevent adversarial reverse-engineering while enabling legitimate transparency research.

Documentation

Full documentation at: civic-interconnect.github.io/civic-transparency-py-sdk/

Usage Guide - Getting started and common workflows
CLI Reference - Command-line interface details
SDK Reference - Core Python APIs
Schema Reference - Database schemas and integration

Development

See CONTRIBUTING.md for development setup and guidelines.

Versioning

This package follows semantic versioning. See CHANGELOG.md for version history.
