This project implements a dual-mode attendance system using:
- Face Recognition (ArcFace + classifier)
- Card-Based Attendance (OCR extraction of registration/roll number)
Both modes are exposed through a Gradio web interface.
- Project Overview
- Technical Stack
- Coding Guidelines
- Attendance Logic
- Testing & Validation
- Deployment Guidelines
- File Structure
- Setup
- Additional Notes
- Objective: Develop an attendance system that supports:
- Face Recognition: Mark attendance by recognizing a student's face.
- Card-Based Attendance: Mark attendance by extracting the registration/roll number from a student card using OCR.
- Dataset: 10+ facial images per student, organized in dataset/student_id/ folders.
- Model: Uses ArcFace (InsightFace) for face embeddings and a trained classifier for student recognition.
- OCR: Uses Tesseract-OCR (via pytesseract) to extract registration numbers from card images.
- Programming Language: Python 3.10+
- Libraries & Frameworks:
- insightface, scikit-learn: For ArcFace embeddings and classification.
- pytesseract: For OCR from card images.
- torch, torchvision: For deep learning operations.
- gradio: For the user interface.
- PIL (Python Imaging Library): For image processing.
- pandas, numpy: For data handling.
- Code Structure: Modularized into separate files:
- data_loader.py: Handles data loading and preprocessing.
- model.py: (Legacy) ViT-based face recognition (not used by default).
- arcface_recognition.py: ArcFace-based face recognition logic.
- card_attendance.py: Card-based attendance using OCR.
- attendance.py: Manages attendance logic and record-keeping.
- app.py: Integrates all modules and runs the Gradio interface.
- Naming Conventions: snake_case for functions and variables, PascalCase for classes.
- Documentation: Docstrings for all functions and classes, comments for complex logic.
- Error Handling: Try-except blocks and error logging.
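One way to apply the error-handling guideline is to route every image handler through a single wrapper that logs the failure and returns a readable message. This is a minimal sketch; `run_safely`, the logger name `attendance`, and the message text are hypothetical and not taken from the project code.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("attendance")  # hypothetical logger name


def run_safely(handler, *args):
    """Call handler(*args); on any exception, log it and return a message.

    The full traceback goes to the log, while the caller (for example a
    UI callback) only receives a short, user-facing string.
    """
    try:
        return handler(*args)
    except Exception:
        name = getattr(handler, "__name__", repr(handler))
        logger.exception("Handler %s failed", name)
        return "Something went wrong while processing the image. Please try again."
```

Wrapping at the boundary keeps the recognition and OCR functions free of UI concerns while still ensuring no exception goes unlogged.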
- Recognition Flow:
- User uploads a face or card image.
- The system processes the image:
- Face Tab: Predicts the student's identity using ArcFace + classifier.
- Card Tab: Extracts registration/roll number using OCR.
- The system checks the attendance record:
- If the student hasn't checked in today: Record check-in timestamp.
- If the student has checked in but not out: Record check-out timestamp.
- If both recorded: Notify attendance is complete for the day.
- Data Storage: Attendance records are maintained in attendance_records.csv.
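The check-in / check-out rules above can be sketched as a small function over rows of the CSV file. The column names (`student_id`, `date`, `check_in`, `check_out`) and the helper names are assumptions made for illustration; the actual layout in attendance.py may differ.

```python
import csv
from datetime import datetime
from pathlib import Path

# Assumed column layout for attendance_records.csv
FIELDS = ["student_id", "date", "check_in", "check_out"]


def mark_attendance(records, student_id, now):
    """Apply the three attendance rules to a list of row dicts.

    Mutates `records` in place and returns a status message.
    """
    today = now.strftime("%Y-%m-%d")
    stamp = now.strftime("%H:%M:%S")
    for row in records:
        if row["student_id"] == student_id and row["date"] == today:
            if not row["check_out"]:
                # Checked in but not out: record the check-out time.
                row["check_out"] = stamp
                return f"{student_id}: checked out at {stamp}"
            # Both timestamps already present.
            return f"{student_id}: attendance already complete for {today}"
    # No row for today yet: record the check-in time.
    records.append(
        {"student_id": student_id, "date": today, "check_in": stamp, "check_out": ""}
    )
    return f"{student_id}: checked in at {stamp}"


def load_records(path):
    """Read all rows from the CSV file, or return an empty list if it is missing."""
    path = Path(path)
    if not path.exists():
        return []
    with path.open(newline="") as f:
        return list(csv.DictReader(f))


def save_records(path, records):
    """Rewrite the CSV file with the given rows."""
    with Path(path).open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)
```

A caller would typically run `records = load_records("attendance_records.csv")`, call `mark_attendance(records, student_id, datetime.now())`, and then `save_records(...)`. Passing `now` explicitly keeps the rule logic easy to test.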
- Test Cases: Validate both face and card recognition, test attendance logic scenarios.
- Performance Metrics: Monitor model accuracy, OCR extraction accuracy, and system response time.
- Gradio Interface: User-friendly interface with two tabs:
- Face Recognition Attendance: Upload a face image.
- Card-Based Attendance: Upload a card image.
- Tesseract-OCR: Must be installed and available in your system PATH for card-based attendance.
- Dependencies: Listed in requirements.txt.
project_root/
├── data_loader.py
├── model.py
├── arcface_recognition.py
├── card_attendance.py
├── attendance.py
├── app.py
├── requirements.txt
├── dataset/
│ ├── student_1/
│ │ ├── img1.jpg
│ │ └── ...
│ └── student_n/
│ ├── img1.jpg
│ └── ...
├── attendance_records.csv
├── arcface_classifier.joblib
├── arcface_labels.joblib
├── arcface_embeddings.npy
├── arcface_labels.npy
├── train_arcface_classifier.py
├── convert_embeddings_to_means_csv.py
└── README.md
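Before training, it can help to confirm that the dataset/ layout shown above meets the 10-images-per-student guideline. The following is a rough sketch using only the standard library; the function names and the set of accepted image extensions are assumptions, not part of data_loader.py.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions
MIN_IMAGES = 10  # guideline from the dataset description


def count_images_per_student(dataset_dir):
    """Map each student folder name to the number of image files it contains."""
    counts = {}
    for student_dir in sorted(Path(dataset_dir).iterdir()):
        if student_dir.is_dir():
            counts[student_dir.name] = sum(
                1
                for p in student_dir.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def students_below_minimum(dataset_dir, minimum=MIN_IMAGES):
    """Return the student IDs that have fewer than `minimum` images."""
    counts = count_images_per_student(dataset_dir)
    return [sid for sid, n in counts.items() if n < minimum]
```

Calling `students_below_minimum("dataset")` from the project root lists the folders that still need more images.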
- Clone the repository.
- Install dependencies:
pip install -r requirements.txt
- Install Tesseract-OCR (for card-based attendance):
- Windows: Download and install from UB Mannheim builds.
- Linux: sudo apt-get install tesseract-ocr
- Mac: brew install tesseract
- Add Tesseract to your PATH if needed.
- Organize your dataset in the dataset/ directory with subdirectories for each student ID.
- Train the ArcFace classifier:
python train_arcface_classifier.py
- Run the application:
python app.py
- Security: Avoid permanent storage of uploaded images, protect attendance records.
- Scalability: Design for easy addition of students, optimize model inference.
- Maintenance: Regularly update dataset, monitor logs.
- Customization: Adjust the regex in card_attendance.py to match your card format if needed.
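To illustrate the kind of regex adjustment mentioned above, here is a minimal sketch of pulling a registration number out of OCR text. The pattern assumes a hypothetical card format (two to four letters, an optional hyphen or space, then four to ten digits); it is not the pattern used in card_attendance.py, and `extract_registration_number` is an illustrative name.

```python
import re

# Hypothetical format: 2-4 letters, optional hyphen or space, 4-10 digits.
# Replace this with the pattern that matches your institution's cards.
REG_PATTERN = re.compile(r"\b([A-Z]{2,4})[- ]?(\d{4,10})\b")


def extract_registration_number(ocr_text):
    """Return the first registration-like token in the OCR text, or None.

    The text is upper-cased first because OCR output often mixes cases,
    and the result is normalized to the form PREFIX-DIGITS.
    """
    match = REG_PATTERN.search(ocr_text.upper())
    if match is None:
        return None
    return f"{match.group(1)}-{match.group(2)}"
```

In the application, the input string would typically come from `pytesseract.image_to_string(image)` on the uploaded card. Normalizing the match to one canonical form avoids duplicate attendance rows for the same student when OCR output varies between scans.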
