areebahmeddd/cognito.ai



🧠 Project Description

cognito.ai is a natural-language forensic evidence discovery engine for UFDR (Universal Forensic Extraction Device Report) data. It ingests UFDR exports, normalizes their heterogeneous schemas, assigns deterministic IDs, and indexes the records into Elasticsearch using category-aware mappings. [ Project Demo | Project Abstract | Project PPT ]

Built for Smart India Hackathon - 2025

Key Features

  • Real-time Search: Elasticsearch with category-aware mappings and faceted filters.
  • UFDR Ingestion: normalized schemas and deterministic IDs across heterogeneous exports.
  • Natural-Language Querying: Gemini-powered NLQ translated to Elasticsearch DSL.
  • Secure Persistence: MongoDB storage with JWT-based authentication.
  • Visual Analytics: timeline and network views, case drill-downs, exportable reports.
  • Dev-Friendly Setup: Docker Compose for ES + Mongo; FastAPI + Next.js local dev.

📄 Sample AI-Generated Report - See an example of our platform's comprehensive forensic analysis output.

🗂️ Project Structure

.
├── backend/
│   ├── app/
│   │   ├── core/                # Settings, DB/ES bootstrap
│   │   ├── models/              # Pydantic models (UFDRDocument)
│   │   ├── routes/              # FastAPI routes
│   │   ├── services/            # Elasticsearch, parser, classifier, AI intent planner, auth
│   │   ├── utils/               # Helpers
│   │   └── main.py              # FastAPI app entrypoint
│   ├── data/                    # Sample data (e.g., ufdr.jsonl)
│   ├── pyproject.toml           # Python deps
│   └── uv.lock                  # Locked dependency resolution
├── frontend/
│   ├── app/                     # Next.js app routes/pages
│   ├── components/              # UI & dashboard components
│   ├── hooks/, lib/, styles/    # Frontend utilities
│   └── package.json             # Web deps
├── scripts/                     # Utilities (reindex, test API)
├── docker/                      # Docker configuration files
├── docs/                        # Misc docs
└── docker-compose.yaml          # Local Elasticsearch

πŸ—οΈ Project Design

System Architecture

Sequence Diagram

🎯 Project Milestones

Completed:

  • Add Elasticsearch indexing support for UFDR documents (@areeb)
  • Define schema support in Elasticsearch for multiple file data types (@areeb)
  • Build Gemini-powered NLQ layer to dynamically generate Elasticsearch DSL queries (@areeb)
  • Integrate MongoDB (@shivansh)
  • Implement JWT-based authentication system (@shivansh)
  • Develop pipeline to extract TSV and convert to JSON from UFDR files (@hamad)
  • Add Neo4j visualization support (@avantika)
  • Enable report export generation (@bhavana)
  • Design and implement web UI (@areeb, @shivansh)
  • Set up CI/CD pipeline for DigitalOcean + Cloudflare Pages deployment (@areeb)

In Progress:

  • Upgrade NLQ layer with Mixtral NeMo (12B SLM) support (@anish)
  • Upgrade ETL pipeline to better sync with multiple services (@areeb)
  • Develop ETL pipeline for additional file types (@hamad)
  • Write Pytest tests (@avantika)
  • Write Cypress tests (@shivansh)
  • Configure NGINX (@areeb)
  • Set up CI workflow to generate dynamic docs on merges to testing branch (@bhavana)
  • Add Redis caching for search results (@avantika)

🖼️ Project Preview

Landing Page

Home Page

Your Cases Page

Create Modal
Create Case Modal

Artifact Modal
UFDR Artifact Inspector

Results Page
Search Results Page

Timeline Page
Timeline Analysis Page

Network Page
Network Correlation Page

Summary Page
Case Summary Page

⚙️ Setup for Development

  1. Clone the repo:
git clone https://github.com/areebahmeddd/cognito.ai.git
cd cognito.ai
  1. Start Elasticsearch (single-node) via Docker Compose:
docker compose up -d
  1. Configure environment variables:

Create a .env file in backend/:

ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_INDEX=cognito
JWT_SECRET_KEY=<your_jwt_secret_key>
JWT_ALGORITHM=<your_jwt_algorithm>
JWT_EXPIRE_MINUTES=<your_jwt_expire_minutes>
MONGODB_CONNECTION_STRING=mongodb://localhost:27017/cognito
GEMINI_API_KEY=<your_api_key>
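
How the backend reads these values is not shown in this README. The sketch below is one possible way to load and validate them using only the standard library. It assumes the variables are already present in the process environment (for example, exported by the shell or loaded from `.env` by a separate tool). The `Settings` class and `load_settings` function are hypothetical names, and the three defaults are the ones listed under Scripts in this README.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    elasticsearch_url: str
    elasticsearch_index: str
    mongodb_connection_string: str
    jwt_secret_key: str
    jwt_algorithm: str
    jwt_expire_minutes: int
    gemini_api_key: str


# Variables with no documented default; treated as required in this sketch.
REQUIRED = ("JWT_SECRET_KEY", "JWT_ALGORITHM", "JWT_EXPIRE_MINUTES", "GEMINI_API_KEY")


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        elasticsearch_url=env.get("ELASTICSEARCH_URL", "http://localhost:9200"),
        elasticsearch_index=env.get("ELASTICSEARCH_INDEX", "cognito"),
        mongodb_connection_string=env.get(
            "MONGODB_CONNECTION_STRING", "mongodb://localhost:27017/cognito"
        ),
        jwt_secret_key=env["JWT_SECRET_KEY"],
        jwt_algorithm=env["JWT_ALGORITHM"],
        jwt_expire_minutes=int(env["JWT_EXPIRE_MINUTES"]),  # stored as text in .env
        gemini_api_key=env["GEMINI_API_KEY"],
    )


# Example with placeholder values; pass nothing to read from os.environ instead.
example_settings = load_settings(
    {
        "JWT_SECRET_KEY": "change-me",
        "JWT_ALGORITHM": "HS256",
        "JWT_EXPIRE_MINUTES": "30",
        "GEMINI_API_KEY": "placeholder",
    }
)
```

Failing fast on missing secrets surfaces a misconfigured `.env` at startup rather than on the first authenticated request.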

Create a .env file in frontend/:

NEXTAUTH_URL=http://localhost:3000
NEXTAUTH_SECRET=<your_nextauth_secret>
NEXT_PUBLIC_API_URL=http://127.0.0.1:8000/api/v1

🖥️ Backend (FastAPI)

Install and run API:

cd backend
uv sync
uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

Swagger UI: http://localhost:8000/docs

🌐 Frontend (Next.js)

cd frontend
npm clean-install
npm run dev

🧰 Scripts

1. Nuke Infra (Fresh Start)

Wipes the Elasticsearch index (and its wildcard matches), then the MongoDB database.

  • Uses env vars with these defaults:
    • ELASTICSEARCH_URL=http://localhost:9200
    • ELASTICSEARCH_INDEX=cognito
    • MONGODB_CONNECTION_STRING=mongodb://localhost:27017/cognito
python scripts/nuke_infra.py
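
Because this script is destructive, it can help to see what it would target before running it. The sketch below is not the script itself: it is a hypothetical dry-run helper (`resolve_targets`) that applies the defaults listed above and reports the resulting targets without contacting either service. Reading "wildcard" as the pattern `<index>*` is an assumption.

```python
import os
from urllib.parse import urlparse


def resolve_targets(env=None) -> dict:
    """Resolve what a wipe would target, without deleting anything (hypothetical)."""
    env = os.environ if env is None else env
    es_url = env.get("ELASTICSEARCH_URL", "http://localhost:9200").rstrip("/")
    index = env.get("ELASTICSEARCH_INDEX", "cognito")
    mongo_uri = env.get(
        "MONGODB_CONNECTION_STRING", "mongodb://localhost:27017/cognito"
    )
    # The database name is the path component of the connection string.
    database = urlparse(mongo_uri).path.lstrip("/")
    return {
        # Assumption: "wildcard" means the index name followed by "*".
        "es_delete_urls": [f"{es_url}/{index}", f"{es_url}/{index}*"],
        "mongo_database": database,
    }


# With no overrides, the documented defaults apply.
default_targets = resolve_targets({})
```

Printing `default_targets` before running the real script is a cheap way to confirm that the environment points at a local instance and not a shared one.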

2. Generate Mock UFDR ZIPs (for testing)

Creates synthetic UFDR-like TSV bundles as ZIPs at the project root: Test_UFDR-1.zip, Test_UFDR-2.zip, Test_UFDR-3.zip.

python scripts/mock_zip.py
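
For readers who want to build their own fixtures, the following is a minimal sketch of packing tab-separated rows into a ZIP archive in memory. It is not the contents of `scripts/mock_zip.py`; the function name, the file name `messages.tsv`, and the column names are all hypothetical.

```python
import csv
import io
import zipfile


def build_mock_ufdr_zip(rows_by_file: dict) -> bytes:
    """Pack {file name: rows} into a ZIP of TSV files and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, rows in rows_by_file.items():
            text = io.StringIO()
            # Tab delimiter and "\n" line endings keep the output platform-independent.
            writer = csv.writer(text, delimiter="\t", lineterminator="\n")
            writer.writerows(rows)
            archive.writestr(name, text.getvalue())
    return buffer.getvalue()


# Hypothetical sample: one TSV file with a header row and a single record.
mock_zip_bytes = build_mock_ufdr_zip(
    {
        "messages.tsv": [
            ["timestamp", "sender", "body"],
            ["2025-01-01T10:00:00Z", "alice", "hello"],
        ]
    }
)
```

Writing `mock_zip_bytes` to a `.zip` file gives an archive that can be inspected with any standard unzip tool. Whether the ingestion pipeline accepts it depends on the TSV layout it expects, which this README does not specify.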

📜 License

This project is licensed under the MIT License.

👥 Authors

About

🔎 Natural Language Interface for Digital Forensic Evidence
