
FR34KY-CODER/Legal-Case-Classification-and-Summarization


⚖️ Legal Case Classifier & Summariser


An end-to-end GPU-accelerated pipeline to scrape, classify, and summarise Indian legal case documents. Built for scale, tuned for precision.


🧠 Pipeline Overview

  1. Web Scraping – Extracts case data from official court sources into a CSV file (~250 MB).
  2. Preprocessing – Lowercasing, stop word removal, lemmatization, and n-gram generation.
  3. Classification – Heuristic, rule-based prediction using manually vectorized keywords.
  4. Summarization – Generates concise, 100-word summaries in a professional legal tone using transformer models (T5 / BART).
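
Steps 2 and 3 can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the repository's actual code: the stop word list, the keyword sets, and the function names `preprocess`, `ngrams`, and `classify` are all hypothetical, and lemmatization is omitted because it needs SpaCy or NLTK. The four labels are the content types the classifier targets (issues, petitions, conclusions, arguments).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list; a real run would use
# the lists shipped with NLTK or SpaCy.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "was",
              "by", "that", "for", "on"}

# Hypothetical keyword sets, one per content type.
KEYWORDS = {
    "issues": {"issue", "question", "whether"},
    "petitions": {"petition", "petitioner", "writ", "appeal"},
    "arguments": {"argued", "contended", "submitted", "counsel"},
    "conclusions": {"held", "dismissed", "allowed", "ordered"},
}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n=2):
    """Return the contiguous n-grams of a token list as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def classify(text):
    """Score each label by keyword hits; return the best, or 'unknown'."""
    counts = Counter(preprocess(text))
    scores = {label: sum(counts[w] for w in words)
              for label, words in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


if __name__ == "__main__":
    sample = "The petitioner filed a writ petition."
    print(ngrams(preprocess(sample)))
    print(classify(sample))
```

A keyword-count score keeps the rules easy to inspect and adjust by hand, which suits a heuristic classifier; ties are resolved by dictionary order here, which a real rule set would likely handle more deliberately.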

✨ Features

  • 📄 Scrapes and structures raw legal text from court portals.
  • 🧹 NLP preprocessing using SpaCy and NLTK.
  • 🧠 Heuristic classification of content (issues, petitions, conclusions, arguments).
  • 📝 T5/BART-based summarization via HuggingFace or local inference.
  • ⚡ CUDA-enabled for fast training and inference.
  • 🧩 Modular pipeline design.
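
The CUDA-enabled summarization feature might look something like the sketch below. It assumes the HuggingFace `transformers` library with its `summarization` pipeline available, plus PyTorch; the model name `t5-small`, the length limits, and the helper names `pick_device`, `cap_words`, and `summarise` are illustrative choices, not taken from this repository. Because `max_length` counts tokens rather than words, the sketch trims the output to 100 words as a separate step.

```python
def pick_device():
    """Return 0 (first CUDA GPU) if PyTorch can see one, else -1 (CPU)."""
    try:
        import torch
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


def cap_words(text, max_words=100):
    """Truncate text to at most max_words whitespace-separated words."""
    return " ".join(text.split()[:max_words])


def summarise(text, model_name="t5-small"):
    """Summarise text with a HuggingFace pipeline (assumed model name)."""
    # Imported lazily so the helpers above work without transformers.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_name,
                          device=pick_device())
    # max_length / min_length are token counts, not word counts.
    result = summarizer(text, max_length=150, min_length=40,
                        truncation=True)
    return cap_words(result[0]["summary_text"])
```

Calling `summarise(case_text)` downloads the model on first use, so it needs network access or a local model cache. Swapping in a BART checkpoint should only require changing `model_name`.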

📈 Future Roadmap

  • 🧠 Build a custom summarization model that replicates the T5 architecture.
  • 📦 Scale the dataset up to 1 TB with broader legal domain coverage.
  • ⚙️ Integrate parallel computing for large-scale training.
  • 🌐 Deploy a complete web app with case upload, live classification, and summarization.
  • 🧪 Fine-tune on domain-specific legal jargon used in Indian courts.

About

This repository contains a model for summarizing and classifying Indian legal cases.
