Modern face detection, recognition & analysis in 3 lines of code
VisionFace is a state-of-the-art, open-source framework for comprehensive face analysis, built with PyTorch. It provides a unified interface for face detection, recognition, landmark detection, and visualization with support for multiple cutting-edge models.
- Detect faces in images with 12+ models (YOLO, MediaPipe, MTCNN...)
- Recognize faces with vector search and embedding models
- Extract landmarks (68-point, 468-point face mesh)
- Batch process thousands of images efficiently
- Production-ready with Docker support and REST API
pip install visionface
The Face Detection module is your gateway to identifying faces in any image. Built for both beginners and experts, it provides a unified interface to 12+ cutting-edge detection models.
Key Features:
- Multiple Input Sources: Image Files, URLs, PIL images, NumPy arrays
- Flexible Processing: Handle a single image or process thousands of images in batches
- 12+ State-of-the-Art Models: From ultra-fast mobile models to high-precision detectors
- One-Line Detection: Get results with just detector.detect_faces(image)
- Rich Outputs: Bounding boxes, confidence scores, cropped faces ready to use
Quick Example:
import cv2
from visionface import FaceDetection, FaceAnnotators
# 1. Initialize detector
detector = FaceDetection(detector_backbone="yolo-small")
# 2. Detect faces
image = cv2.imread("your_image.jpg")
faces = detector.detect_faces(image)
# 3. Visualize results
result = FaceAnnotators.box_annotator(image, faces[0])
cv2.imwrite("detected.jpg", result)
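Downstream code often filters detections by confidence and keeps crop boxes inside the image before using them. The sketch below illustrates that step on plain tuples. The `Box` type, the `filter_and_clamp` helper, and the 0.5 cutoff are hypothetical and not part of VisionFace, whose detection objects may expose different fields.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Hypothetical detection record: pixel corners plus a confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    conf: float


def filter_and_clamp(boxes: List[Box], width: int, height: int,
                     min_conf: float = 0.5) -> List[Box]:
    """Drop low-confidence boxes and clip the rest to the image bounds."""
    kept = []
    for b in boxes:
        if b.conf < min_conf:
            continue
        x1, y1 = max(0.0, b.x1), max(0.0, b.y1)
        x2, y2 = min(float(width), b.x2), min(float(height), b.y2)
        # Skip boxes that collapse to zero area after clipping.
        if x2 > x1 and y2 > y1:
            kept.append(Box(x1, y1, x2, y2, b.conf))
    return kept


boxes = [Box(-10, 20, 100, 120, 0.92), Box(50, 50, 80, 80, 0.30)]
print(filter_and_clamp(boxes, width=640, height=480))
```

The right cutoff depends on the detector and the application, so treat `min_conf` as a value to tune.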
The Face Recognition module identifies individuals by generating embeddings and comparing them in a vector database. The process includes three stages: detecting faces, creating embeddings with the chosen model, and searching the database to find the closest matches.
Key Features:
- Multi-model support: Choose from high-accuracy embedding backbones such as FaceNet-VGG, FaceNet-CASIA, and Dlib.
- Vector DB Integration: Store and query embeddings using Qdrant, Milvus, or local file-based storage.
- Scalable Search: Match a query face against thousands or millions of enrolled faces using vector similarity search.
- Flexible Enrollment: Add faces one-by-one or in batches with associated labels.
- Threshold & Ranking: Control similarity thresholds and retrieve top-k matches for robust recognition results.
from visionface import FaceRecognition
# 1. Setup recognition system
fr = FaceRecognition(
    detector_backbone="yolo-small",
    embedding_backbone="FaceNet-VGG",
    db_backend="qdrant",
)
# 2. Add known faces
fr.upsert_faces(
    images=["john.jpg", "jane.jpg", "bob.jpg"],
    labels=["John", "Jane", "Bob"],
    collection_name="employees",
)
# 3. Search for matches
matches = fr.search_faces(
    "query_face_image.jpg",
    collection_name="employees",
    score_threshold=0.7,
    top_k=3,
)
for match in matches:
    print(f"Found: {match['face_name']} (confidence: {match['score']:.2f})")
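To make the `score_threshold` and `top_k` parameters concrete, here is a minimal pure-Python sketch of what a similarity search does conceptually. It assumes cosine similarity as the score. The metric that VisionFace or the vector database actually uses is not stated in this README, and the `search` helper and toy three-dimensional vectors are illustrative only.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query: List[float], database: Dict[str, List[float]],
           score_threshold: float = 0.7, top_k: int = 3) -> List[Tuple[str, float]]:
    """Return up to top_k (label, score) pairs scoring at least score_threshold."""
    scored = [(name, cosine(query, vec)) for name, vec in database.items()]
    hits = [(name, score) for name, score in scored if score >= score_threshold]
    hits.sort(key=lambda hit: hit[1], reverse=True)  # best match first
    return hits[:top_k]


# Toy "enrolled" embeddings; real face embeddings have hundreds of dimensions.
db = {"John": [1.0, 0.0, 0.0], "Jane": [0.0, 1.0, 0.0], "Bob": [0.9, 0.1, 0.0]}
print(search([1.0, 0.0, 0.0], db, score_threshold=0.7, top_k=2))
```

A vector database replaces the linear scan with an index, but the threshold-then-rank logic is the same idea.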
The Face Embeddings module transforms each detected face into a high-dimensional numeric vector (embedding) that captures its unique features.
These embeddings can be used for:
- Face verification: Check if two faces belong to the same person
- Recognition: Match against a database of known faces
- Clustering: Group similar faces automatically
- Advanced analytics:
Supported Embedding Models: FaceNet-VGG, FaceNet-CASIA, Dlib
Quick Example:
from visionface import FaceEmbedder
# 1. Initialize embedder
embedder = FaceEmbedder(embedding_backbone="FaceNet-VGG")
# 2. Generate embeddings for face images
embeddings = embedder.embed_faces(
    face_imgs=["face1.jpg", "face2.jpg"],
    normalize_embeddings=True,  # L2 normalization
)
# 3. Use embeddings
for i, embedding in enumerate(embeddings):
    print(f"Face {i+1} embedding shape: {embedding.shape}")  # (512,)
    # Use for: face verification, clustering, custom databases
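As an illustration of the face verification use case, the sketch below compares two embeddings after L2 normalization, where the dot product equals cosine similarity. The `same_person` helper and the 0.7 threshold are assumptions made for this example, not VisionFace API. A usable threshold depends on the embedding model and should be tuned on labelled pairs.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def same_person(emb_a: Sequence[float], emb_b: Sequence[float],
                threshold: float = 0.7) -> bool:
    """Decide whether two embeddings are similar enough to be the same face."""
    a, b = l2_normalize(emb_a), l2_normalize(emb_b)
    similarity = sum(x * y for x, y in zip(a, b))  # cosine similarity of unit vectors
    return similarity >= threshold


# Toy three-dimensional vectors standing in for real 128D or 512D embeddings.
print(same_person([0.2, 0.9, 0.1], [0.25, 0.85, 0.05]))
print(same_person([0.2, 0.9, 0.1], [0.9, -0.1, 0.3]))
```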
The Landmarks module identifies key facial features with pixel-perfect accuracy. From eye positions to lip contours, get detailed facial geometry for advanced applications.
Key Features:
- Multiple Input Sources: Image Files, URLs, PIL images, NumPy arrays
- Flexible Processing: Handle a single image or process thousands of images in batches
- 2D & 3D Support: Standard 2D points or full 3D face mesh
- Rich Annotations: Built-in visualization with customizable styling
- Multiple Backends: MediaPipe (468 points) or Dlib (68 points)
Quick Example:
import cv2
from visionface import LandmarkDetection
from visionface.annotators.landmark import MediaPipeFaceMeshAnnotator
landmark_detector = LandmarkDetection(detector_backbone="mediapipe")
image = cv2.imread("your_image.jpg")
# Get 468 facial landmarks
landmarks = landmark_detector.detect_3d_landmarks(image)
# Visualize with connections
visualizer = MediaPipeFaceMeshAnnotator(thickness=2, circle_radius=3)
result = visualizer.annotate(
    image, landmarks[0], connections=True
)
cv2.imwrite("detected_landmarks.jpg", result)
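If you draw landmarks yourself, you may first need to map them to pixel coordinates. MediaPipe's own face mesh reports x and y normalized to the image width and height. Whether VisionFace returns normalized or pixel values is not stated here, so check the landmark objects before applying a conversion like this one. The `to_pixel_coords` helper is hypothetical.

```python
from typing import List, Tuple


def to_pixel_coords(landmarks: List[Tuple[float, float]],
                    width: int, height: int) -> List[Tuple[int, int]]:
    """Map normalized (x, y) points in [0, 1] to integer pixel coordinates."""
    pixels = []
    for x, y in landmarks:
        # Round to the nearest pixel, then clamp so points stay inside the image.
        px = min(max(int(round(x * width)), 0), width - 1)
        py = min(max(int(round(y * height)), 0), height - 1)
        pixels.append((px, py))
    return pixels


print(to_pixel_coords([(0.5, 0.5), (1.0, 0.0)], width=640, height=480))
```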
Real-time Face Detection
import cv2
from visionface import FaceDetection, FaceAnnotators
detector = FaceDetection(detector_backbone="yolo-nano")  # Fastest model
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:  # Camera unavailable or stream ended
        break
    faces = detector.detect_faces(frame)
    annotated = FaceAnnotators.box_annotator(frame, faces)
    cv2.imshow('Face Detection', annotated)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
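To check whether a given backbone keeps up with a camera feed, it can help to measure throughput inside the loop. The following is a small, library-independent sketch. The `FpsMeter` class is a hypothetical helper, and timestamps are passed in explicitly so the example is deterministic.

```python
from collections import deque
from typing import Deque


class FpsMeter:
    """Moving-average frames per second over the last `window` timestamps."""

    def __init__(self, window: int = 30) -> None:
        self.timestamps: Deque[float] = deque(maxlen=window)

    def tick(self, now: float) -> float:
        """Record a frame timestamp in seconds and return the current FPS estimate."""
        self.timestamps.append(now)
        if len(self.timestamps) < 2:
            return 0.0
        elapsed = self.timestamps[-1] - self.timestamps[0]
        return (len(self.timestamps) - 1) / elapsed if elapsed > 0 else 0.0


meter = FpsMeter(window=5)
for t in (0.0, 0.1, 0.2, 0.3):  # simulated frames arriving every 100 ms
    fps = meter.tick(t)
print(round(fps, 1))
```

In a live loop you would call `meter.tick(time.perf_counter())` once per processed frame.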
Batch Processing
import glob
import os
import cv2
from visionface import FaceDetection
detector = FaceDetection(detector_backbone="yolo-medium")
# Process entire folder
image_paths = glob.glob("photos/*.jpg")
images = [cv2.imread(path) for path in image_paths]
# Detect all faces at once
all_detections = detector.detect_faces(images)
# Save cropped faces (cv2.imwrite does not create missing folders)
os.makedirs("faces", exist_ok=True)
for i, detections in enumerate(all_detections):
    for j, face in enumerate(detections):
        if face.cropped_face is not None:
            cv2.imwrite(f"faces/image_{i}_face_{j}.jpg", face.cropped_face)
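The example above reads every image into memory before detection, which may not scale to very large folders. One option is to feed the detector fixed-size batches. The sketch below shows only the chunking step. The `chunked` helper and the batch size are illustrative and not part of VisionFace.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


paths = [f"photos/img_{i}.jpg" for i in range(5)]
for batch in chunked(paths, batch_size=2):
    # In a real pipeline: load these paths, run detection, save crops, then move on.
    print(batch)
```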
Employee Recognition System
from visionface import FaceRecognition
import os
# Initialize system
fr = FaceRecognition(db_backend="qdrant")
# Auto-enroll from employee photos folder
def enroll_employees(folder_path):
    for filename in os.listdir(folder_path):
        if filename.lower().endswith(('.jpg', '.png')):
            name = os.path.splitext(filename)[0]  # Use filename without extension as name
            image_path = os.path.join(folder_path, filename)
            fr.upsert_faces(
                images=[image_path],
                labels=[name],
                collection_name="company_employees"
            )
            print(f"Enrolled: {name}")
# Enroll all employees
enroll_employees("employee_photos/")
# Check security camera feed
def identify_person(camera_image):
    results = fr.search_faces(
        camera_image,
        collection_name="company_employees",
        score_threshold=0.8,
        top_k=1
    )
    if results and results[0]:  # If match found
        return results[0][0]['face_name']
    return "Unknown person"
Choose the right model for your use case:
Use Case | Speed | Accuracy | Recommended Model |
---|---|---|---|
Real-time apps | ⚡⚡⚡ | ⭐⭐ | yolo-nano, mediapipe |
General purpose | ⚡⚡ | ⭐⭐⭐ | yolo-small (default) |
High accuracy | ⚡ | ⭐⭐⭐⭐ | yolo-large, mtcnn |
Mobile/Edge | ⚡⚡⚡ | ⭐⭐ | mediapipe, yolo-nano |
Landmarks needed | ⚡⚡ | ⭐⭐⭐ | mediapipe, dlib |
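If the backbone is chosen at runtime, the table can be encoded as a small lookup. This is only a sketch: the use-case keys and the `pick_model` helper are made up for illustration, while the model names are the ones listed in the table.

```python
from typing import Dict, List

# Recommended models per use case, first entry preferred (hypothetical key names).
RECOMMENDED: Dict[str, List[str]] = {
    "real-time": ["yolo-nano", "mediapipe"],
    "general": ["yolo-small"],
    "high-accuracy": ["yolo-large", "mtcnn"],
    "mobile-edge": ["mediapipe", "yolo-nano"],
    "landmarks": ["mediapipe", "dlib"],
}


def pick_model(use_case: str, default: str = "yolo-small") -> str:
    """Return the first recommended model for a use case, or the default."""
    models = RECOMMENDED.get(use_case)
    return models[0] if models else default


print(pick_model("high-accuracy"))
print(pick_model("something-else"))
```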
Complete Model List
Detection Models:
- yolo-nano, yolo-small, yolo-medium, yolo-large
- yoloe-small, yoloe-medium, yoloe-large (prompt-based)
- yolow-small, yolow-medium, yolow-large, yolow-xlarge (open-vocabulary)
- mediapipe, mtcnn, opencv
Embedding Models:
- FaceNet-VGG (512D) - Balanced accuracy/speed
- FaceNet-CASIA (512D) - High precision
- Dlib (128D) - Lightweight
Landmark Models:
- mediapipe - 468 points + 3D mesh
- dlib - 68 points, robust
- Full Documentation
- Tutorials & Guides
- REST API Reference
- Use Case Examples
We welcome contributions! See our Contributing Guide.
Quick ways to help:
- Star the repo
- Report bugs
- Request features
- Improve docs
- Submit PRs
MIT License - see LICENSE file.
@software{VisionFace2025,
  title = {VisionFace: Modern Face Detection \& Recognition Framework},
  author = {VisionFace Team},
  year = {2025},
  url = {https://github.com/miladfa7/visionface}
}
Made with ❤️ by the VisionFace team