cognito.ai is a natural-language forensic evidence discovery engine for UFDR (Universal Forensic Extraction Device Report) data. It ingests UFDR exports, normalizes their heterogeneous schemas with deterministic IDs, and indexes the results into Elasticsearch using category-aware mappings. [ Project Demo | Project Abstract | Project PPT ]
Built for Smart India Hackathon - 2025
- Real-time Search: Elasticsearch with category-aware mappings and faceted filters.
- UFDR Ingestion: normalized schemas and deterministic IDs across heterogeneous exports.
- Natural-Language Querying: Gemini-powered NLQ translated to Elasticsearch DSL.
- Secure Persistence: MongoDB storage with JWT-based authentication.
- Visual Analytics: timeline and network views, case drill-downs, exportable reports.
- Dev-Friendly Setup: Docker Compose for ES + Mongo; FastAPI + Next.js local dev.
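To illustrate the "deterministic IDs" idea from the feature list, here is a minimal sketch, not the project's actual implementation. It assumes an ID can be derived by hashing a canonical JSON form of the normalized record; the function name `deterministic_id` and the fields `case_id`, `category`, and `record` are hypothetical.

```python
import hashlib
import json


def deterministic_id(case_id: str, category: str, record: dict) -> str:
    """Derive a stable document ID from a record's normalized content.

    Hypothetical helper: the field names are illustrative, not the
    project's real schema.
    """
    # Canonical JSON: sorted keys and fixed separators, so the key order
    # used by a particular export does not change the resulting hash.
    canonical = json.dumps(
        {"case_id": case_id, "category": category, "record": record},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        default=str,  # fall back to str() for values such as datetimes
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The same content with a different key order yields the same ID.
a = deterministic_id("case-001", "message", {"from": "+911234567890", "body": "hi"})
b = deterministic_id("case-001", "message", {"body": "hi", "from": "+911234567890"})
print(a == b)  # True
```

If such a hash is used as the Elasticsearch document `_id`, re-ingesting the same export overwrites existing documents instead of creating duplicates.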
Sample AI-Generated Report - See an example of our platform's comprehensive forensic analysis output.
.
├── backend/
│   ├── app/
│   │   ├── core/        # Settings, DB/ES bootstrap
│   │   ├── models/      # Pydantic models (UFDRDocument)
│   │   ├── routes/      # FastAPI routes
│   │   ├── services/    # Elasticsearch, parser, classifier, AI intent planner, auth
│   │   ├── utils/       # Helpers
│   │   └── main.py      # FastAPI app entrypoint
│   ├── data/            # Sample data (e.g., ufdr.jsonl)
│   ├── pyproject.toml   # Python deps
│   └── uv.lock          # Locked dependency resolution
├── frontend/
│   ├── app/             # Next.js app routes/pages
│   ├── components/      # UI & dashboard components
│   ├── hooks/, lib/, styles/   # Frontend utilities
│   └── package.json     # Web deps
├── scripts/             # Utilities (reindex, test API)
├── docker/              # Docker configuration files
├── docs/                # Misc docs
└── docker-compose.yaml  # Local Elasticsearch
- Add Elasticsearch indexing support for UFDR documents (@areeb)
- Define schema support in Elasticsearch for multiple file data types (@areeb)
- Build Gemini-powered NLQ layer to dynamically generate Elasticsearch DSL queries (@areeb)
- Integrate MongoDB (@shivansh)
- Implement JWT-based authentication system (@shivansh)
- Develop pipeline to extract TSV and convert to JSON from UFDR files (@hamad)
- Add Neo4j visualization support (@avantika)
- Enable report export generation (@bhavana)
- Design and implement web UI (@areeb, @shivansh)
- Set up CI/CD pipeline for DigitalOcean + Cloudflare Pages deployment (@areeb)
- Upgrade NLQ layer with Mistral NeMo (12B SLM) support (@anish)
- Upgrade ETL pipeline to better sync with multiple services (@areeb)
- Develop ETL pipeline for additional file types (@hamad)
- Write Pytest tests (@avantika)
- Write Cypress tests (@shivansh)
- Configure NGINX (@areeb)
- Set up CI workflow to generate dynamic docs on merges to the testing branch (@bhavana)
- Add Redis caching for search results (@avantika)
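The NLQ item above (natural language translated to Elasticsearch DSL) can be pictured with a small sketch. This is an assumption about one possible design, not the project's code: it supposes the language model is asked to return a structured intent, and plain Python then maps that intent onto a `bool` query. The function `intent_to_dsl`, the intent keys, and the index field names `content`, `category`, and `timestamp` are all hypothetical.

```python
from typing import Any


def intent_to_dsl(intent: dict[str, Any], size: int = 20) -> dict[str, Any]:
    """Translate a structured intent (hypothetical shape) into a bool query."""
    must: list[dict[str, Any]] = []
    filters: list[dict[str, Any]] = []

    # Free-text part of the question becomes a scored full-text match.
    if intent.get("text"):
        must.append({"match": {"content": intent["text"]}})

    # Exact constraints go into the non-scoring filter context.
    if intent.get("category"):
        filters.append({"term": {"category": intent["category"]}})

    # Build the range clause only from the bounds that are present.
    date_range = {
        op: intent[key]
        for op, key in (("gte", "date_from"), ("lte", "date_to"))
        if intent.get(key)
    }
    if date_range:
        filters.append({"range": {"timestamp": date_range}})

    return {
        "size": size,
        "query": {"bool": {"must": must or [{"match_all": {}}], "filter": filters}},
    }


query = intent_to_dsl(
    {"text": "crypto wallet", "category": "message", "date_from": "2025-01-01"}
)
```

Keeping the model's output as a small intent object, rather than raw DSL, means the generated query can be validated before it reaches Elasticsearch.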
- Clone the repo:
git clone https://github.com/areebahmeddd/cognito.ai.git
cd cognito.ai
- Start Elasticsearch (single-node) via Docker Compose:
docker compose up -d
- Configure environment variables:
Create a .env file in backend/:
ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_INDEX=cognito
JWT_SECRET_KEY=<your_jwt_secret_key>
JWT_ALGORITHM=<your_jwt_algorithm>
JWT_EXPIRE_MINUTES=<your_jwt_expire_minutes>
MONGODB_CONNECTION_STRING=mongodb://localhost:27017/cognito
GEMINI_API_KEY=<your_api_key>
Create a .env file in frontend/:
NEXTAUTH_URL=http://localhost:3000
NEXTAUTH_SECRET=<your_nextauth_secret>
NEXT_PUBLIC_API_URL=http://127.0.0.1:8000/api/v1
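As a rough sketch of how the backend variables listed above might be consumed, the following reads them from the process environment. It is an illustration only: the `Settings` class and `load_settings` function are hypothetical names, the fallback values for `JWT_ALGORITHM` and `JWT_EXPIRE_MINUTES` are assumptions, and the code expects the variables to be exported already (it does not parse the `.env` file itself).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    elasticsearch_url: str
    elasticsearch_index: str
    mongodb_connection_string: str
    jwt_secret_key: str
    jwt_algorithm: str
    jwt_expire_minutes: int
    gemini_api_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables (hypothetical helper)."""
    # Secrets have no sensible default, so fail early if they are absent.
    missing = [k for k in ("JWT_SECRET_KEY", "GEMINI_API_KEY") if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")

    return Settings(
        elasticsearch_url=env.get("ELASTICSEARCH_URL", "http://localhost:9200"),
        elasticsearch_index=env.get("ELASTICSEARCH_INDEX", "cognito"),
        mongodb_connection_string=env.get(
            "MONGODB_CONNECTION_STRING", "mongodb://localhost:27017/cognito"
        ),
        jwt_secret_key=env["JWT_SECRET_KEY"],
        jwt_algorithm=env.get("JWT_ALGORITHM", "HS256"),  # assumed default
        jwt_expire_minutes=int(env.get("JWT_EXPIRE_MINUTES", "60")),  # assumed default
        gemini_api_key=env["GEMINI_API_KEY"],
    )


# Example with an explicit mapping instead of the real environment.
settings = load_settings({"JWT_SECRET_KEY": "change-me", "GEMINI_API_KEY": "dummy"})
```

Passing the mapping as a parameter keeps the loader easy to test without touching the real environment.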
- Install and run the API:
cd backend
uv sync
uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
Swagger UI: http://localhost:8000/docs
- Install and run the frontend:
cd frontend
npm clean-install
npm run dev

Wipes the Elasticsearch index (and wildcard) and then the MongoDB database.
- Uses env vars with these defaults:
ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_INDEX=cognito
MONGODB_CONNECTION_STRING=mongodb://localhost:27017/cognito
python scripts/nuke_infra.py

Creates synthetic UFDR-like TSV bundles as ZIPs at the project root: Test_UFDR-1.zip, Test_UFDR-2.zip, Test_UFDR-3.zip.
python scripts/mock_zip.py

This project is licensed under the MIT License.











