A high-performance and secure REST/GraphQL API built with Rust, MongoDB & Redis for cleaning email subscriber lists. Maintains sender reputation by validating, deduplicating, and pruning inactive emails.

SelfSend/email-sanitizer-api

email-sanitizer by SelfSend

🚀 Features

  • Scalable Architecture: Built with Rust and optimized for horizontal scaling.
  • Security First: OAuth2/JWT authentication, rate limiting, and input validation.
  • Observability: Integrated logging, metrics, and distributed tracing.
  • Multi-Protocol Support: REST, GraphQL, or gRPC endpoints (choose as needed).

✅ Validations

Detect and handle invalid email addresses before they take up space in your database, cause delivery issues, or harm your sender score.

The API can be used to validate email addresses in real-time or in bulk. It can also be used to clean up existing email lists by removing invalid or duplicate addresses.

Built with Rust, the API is designed to be fast and efficient and is capable of processing thousands of email addresses per second. It can be integrated into existing applications or used as a standalone service.
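As an illustration of the list clean-up described above, here is a minimal sketch of deduplication. It assumes a simple comparison key (domain lowercased, local-part left untouched, since local-parts are treated as case-sensitive). The names `dedup_key` and `dedupe` are hypothetical and are not taken from the project's code.

```rust
use std::collections::HashSet;

/// Builds a comparison key: the domain is lowercased, the local-part is kept as-is.
/// Returns None when the input has no '@' or one side of it is empty.
fn dedup_key(address: &str) -> Option<String> {
    let trimmed = address.trim();
    // Split at the last '@' so a quoted local-part containing '@' stays intact.
    let (local, domain) = trimmed.rsplit_once('@')?;
    if local.is_empty() || domain.is_empty() {
        return None;
    }
    Some(format!("{}@{}", local, domain.to_lowercase()))
}

/// Keeps the first occurrence of each address and drops malformed entries.
fn dedupe(addresses: &[&str]) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut cleaned = Vec::new();
    for address in addresses {
        if let Some(key) = dedup_key(address) {
            // HashSet::insert returns false when the key was already present.
            if seen.insert(key.clone()) {
                cleaned.push(key);
            }
        }
    }
    cleaned
}
```

With this key, `Ann@Example.COM` and `Ann@example.com` collapse into one entry, while `ann@example.com` is kept as a separate address.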

The following edge cases and validation checks are currently covered:

Syntax Validation Checks

| Validation Type | Description |
| --- | --- |
| Local-Part Characters | Allow letters, digits, specific symbols; quotes required for spaces. |
| Quoted Local-Part | Balanced quotes and escaping; spaces allowed only within quotes. |
| Local-Part Dots | Prohibit leading/trailing/consecutive dots unless quoted. |
| Domain Labels | Subdomains 1–63 chars; hyphens allowed only mid-label. |
| Domain IP Literals | Validate IPv4/IPv6 addresses in brackets (e.g., [192.168.1.1]). |
| Domain Case Handling | Normalize domain to lowercase (case-insensitive). |
| Unicode Support | Allow UTF-8 characters in local-part and domain. |
| Unicode Normalization | Normalize Unicode to NFC form to avoid duplicates. |
| SMTPUTF8 Compliance | Support SMTPUTF8 extension for non-ASCII addresses. |
| Local-Part Length | ≤64 octets (after encoding). |
| Total Address Length | ≤254 octets (including local-part, @, domain). |
| Address Comments | Reject or strip RFC 5322-style comments. |
| Obsolete Syntax | Disallow deprecated syntax (e.g., folding whitespace). |
| Quoted Escapes | Validate backslash-escaped quotes (e.g., "user\"name"). |
| Punycode Conversion | Convert international domains to Punycode (e.g., xn--fiqs8s.xn--55qx5d). |
| IP Formatting | Validate IPv4/IPv6 syntax in domain literals. |
| Domain Literal Brackets | Reject IP literals missing brackets (e.g., user@192.168.1.1). |
| Reserved Domains | Block reserved domains (e.g., localhost, test). |
| Null Addresses | Reject empty addresses (e.g., <>). |
| Local-Part Case | Preserve case but flag inconsistencies (case-sensitive). |
| Domain Case Normalization | Always convert domain to lowercase (case-insensitive). |
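A few of the rows above can be sketched with the standard library alone. The example below is a simplified illustration, not the project's implementation: it covers only the length limits, the local-part dot rules, and the domain label rules, and it assumes an unquoted ASCII address. Quoted local-parts, IP literals, Unicode, and comments are out of scope here, so a bare IP such as `user@192.168.1.1` is not rejected by this sketch. The `SyntaxError` type and `check_syntax` function are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum SyntaxError {
    MissingAt,
    LocalPartLength,
    TotalLength,
    LocalPartDots,
    DomainLabel,
}

/// Checks a subset of the syntax rules for an unquoted ASCII address.
fn check_syntax(address: &str) -> Result<(), SyntaxError> {
    let (local, domain) = address.rsplit_once('@').ok_or(SyntaxError::MissingAt)?;

    // str::len counts bytes, which matches the "octets" wording of the limits.
    if local.is_empty() || local.len() > 64 {
        return Err(SyntaxError::LocalPartLength);
    }
    if address.len() > 254 {
        return Err(SyntaxError::TotalLength);
    }

    // No leading, trailing, or consecutive dots in an unquoted local-part.
    if local.starts_with('.') || local.ends_with('.') || local.contains("..") {
        return Err(SyntaxError::LocalPartDots);
    }

    // Each label: 1 to 63 characters, letters/digits/hyphens, hyphens only mid-label.
    for label in domain.split('.') {
        let valid_length = (1..=63).contains(&label.len());
        let valid_chars = label.chars().all(|c| c.is_ascii_alphanumeric() || c == '-');
        if !valid_length || !valid_chars || label.starts_with('-') || label.ends_with('-') {
            return Err(SyntaxError::DomainLabel);
        }
    }
    Ok(())
}
```

Returning a distinct error variant per rule makes it straightforward to report which check failed in a structured validation result.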

DNS/MX Records Validation Checks

Follows RFC 5321 by falling back to A/AAAA records when MX records are missing: a domain passes if it has MX records or, failing that, direct address records (A/AAAA).
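The fallback rule can be expressed as a small decision function. This is a sketch under stated assumptions: the DNS answers are passed in as plain values rather than queried (a real lookup would need a resolver), and `MailRoute` and `resolve_route` are hypothetical names, not the project's API.

```rust
#[derive(Debug, PartialEq)]
enum MailRoute {
    /// An MX record exists; holds the exchange with the lowest preference value.
    Mx(String),
    /// No MX record, but the domain itself has A/AAAA records.
    ImplicitMx(String),
    /// Neither MX nor A/AAAA records were found.
    Unroutable,
}

/// Decides how mail for `domain` would be routed.
/// `mx_records` holds (preference, exchange) pairs from an MX lookup;
/// `has_address_records` says whether an A or AAAA lookup returned anything.
fn resolve_route(domain: &str, mx_records: &[(u16, &str)], has_address_records: bool) -> MailRoute {
    // Lower preference values are tried first, so pick the minimum.
    if let Some((_, exchange)) = mx_records.iter().min_by_key(|(preference, _)| *preference) {
        return MailRoute::Mx(exchange.to_string());
    }
    if has_address_records {
        return MailRoute::ImplicitMx(domain.to_string());
    }
    MailRoute::Unroutable
}
```

An address would fail this check only in the `Unroutable` case.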

Disposable Email Address Validation Checks

Checks each address's domain against a list of 106,543 disposable email domains, which is updated daily and which the project describes as the largest database of its kind.
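A lookup of this kind can be sketched as a case-insensitive set membership test. The example assumes the domain list has already been loaded into memory; in the project the list lives in MongoDB, which is not shown here. `DisposableList` is a hypothetical type, and the sample domains used with it are illustrative only.

```rust
use std::collections::HashSet;

/// In-memory set of disposable domains, stored in lowercase.
struct DisposableList {
    domains: HashSet<String>,
}

impl DisposableList {
    /// Builds the set from any iterator of domain strings.
    fn new<I, S>(domains: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: AsRef<str>,
    {
        let domains = domains
            .into_iter()
            .map(|domain| domain.as_ref().trim().to_lowercase())
            .collect();
        DisposableList { domains }
    }

    /// Returns true when the address's domain is on the list.
    /// Addresses without an '@' are reported as not disposable.
    fn is_disposable(&self, address: &str) -> bool {
        match address.rsplit_once('@') {
            Some((_, domain)) => self.domains.contains(&domain.trim().to_lowercase()),
            None => false,
        }
    }
}
```

Lowercasing on both insert and lookup keeps the check consistent with the domain case normalization rule in the syntax table.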

🛠 Tech Stack

| Category | Tools |
| --- | --- |
| Language | Rust |
| Framework | Actix |
| Database | MongoDB + Redis (caching) |
| Infra | AWS Lambda/Kubernetes + Terraform (IaC) |
| Auth | Auth0/Clerk/PASETO/OAuth2 |
| Monitoring | Prometheus + Grafana, ELK Stack |

📦 Getting Started

Prerequisites

  • Rust 1.65+
  • MongoDB/Redis (or Docker)
  • Terraform (optional, for cloud provisioning)

Installation

  1. Clone the repo:
    git clone https://github.com/SelfSend/email-sanitizer-api.git
    cd email-sanitizer-api
  2. Build the project (Cargo fetches the dependencies automatically):
    cargo build

🔧 Environment Setup

Configure your .env file:

# MongoDB (for a local instance, use e.g. mongodb://192.168.8.136:27017)
MONGODB_URI=mongodb+srv://<<username>>:<<password>>@clusterX.*****.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0
DB_NAME_TEST=selfsend_test
DB_NAME_PRODUCTION=selfsend_production
DB_DISPOSABLE_EMAILS_COLLECTION=disposable_email_domains

# Redis
REDIS_URL=redis://127.0.0.1:6379
# Cache TTL in seconds (86400 = 1 day)
REDIS_CACHE_TTL=86400
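For illustration, these variables could be read into a typed settings struct at startup. The sketch below is an assumption about how such loading might look, not the project's code: `Settings` and `load_settings` are hypothetical names, only three of the variables are read, and the lookup is passed in as a closure so the function can be exercised without touching the real environment.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
struct Settings {
    mongodb_uri: String,
    redis_url: String,
    cache_ttl: Duration,
}

/// Builds Settings from a key lookup function.
/// MONGODB_URI and REDIS_URL are required; REDIS_CACHE_TTL falls back to one day.
fn load_settings<F>(lookup: F) -> Result<Settings, String>
where
    F: Fn(&str) -> Option<String>,
{
    let required = |key: &str| {
        lookup(key).ok_or_else(|| format!("missing environment variable {}", key))
    };

    let cache_ttl_secs = match lookup("REDIS_CACHE_TTL") {
        Some(raw) => raw
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("invalid REDIS_CACHE_TTL: {}", e))?,
        None => 86_400, // assumed default: 1 day in seconds
    };

    Ok(Settings {
        mongodb_uri: required("MONGODB_URI")?,
        redis_url: required("REDIS_URL")?,
        cache_ttl: Duration::from_secs(cache_ttl_secs),
    })
}
```

Against the process environment, the call would be `load_settings(|key| std::env::var(key).ok())`. Failing fast on a missing or malformed value surfaces configuration mistakes at startup rather than on the first request.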

πŸ—οΈ Running the Server

# Development (hot-reload; requires cargo-watch: cargo install cargo-watch)
cargo watch -x run

# Production build
cargo build --release

🚨 Contributing

  • Fork the repository.
  • Create a branch named feat/<issue-number>-<issue-name>:
git checkout -b feat/issue-number-issue-name
# Example: git checkout -b "feat/9-set-up-rust-project-with-actixaxum"
  • Follow Conventional Commits.
  • Submit a PR with tests and documentation.

📄 License

MIT License.

🌟 Future Roadmap

Phase 1: Core Setup & Validation (Sprint 1-2)
Tasks
  1. Project Initialization ✅

    • Set up Rust project with Actix/Axum. ✅
    • Configure CI/CD (GitHub Actions). ✅
    • DoD (Definition of Done): Project builds successfully, CI pipeline passes. ✅
  2. Basic Email Validation ✅

    • Implement syntax validation (regex). ✅
    • Add DNS/MX record verification. ✅
    • DoD: Unit tests cover 90% of cases, returns structured validation results. ✅
  3. MongoDB Integration & Disposable Email Validation ✅

    • Design database schema for disposable email domains. ✅
    • Implement disposable email address validation. ✅
    • DoD: DB migrations applied, test queries succeed. ✅
  4. REST API (Basic Endpoints) ✅

    • Implement POST /validate for single email validation. ✅
    • Add error handling and OpenAPI docs. ✅
    • DoD: Endpoint tested via Postman, Swagger UI works. ✅

Phase 2: GraphQL & Advanced Features (Sprint 3-4)
Tasks
  1. GraphQL Integration ✅

    • Set up GraphQL server (Async-GraphQL). ✅
    • Add validateEmail query and validateEmailsBulk mutation. ✅
    • DoD: GraphQL playground accessible, queries return correct responses. ✅
  2. Redis Caching Layer ✅

    • Cache DNS/MX results to reduce latency. ✅
    • Implement TTL for cached entries. ✅
    • DoD: Cached responses are 50% faster than uncached ones. ✅
  3. Disposable & Role-Based Email Detection

    • Integrate blocklists for disposable emails. ✅
    • Detect role-based addresses (e.g., admin@, support@).
    • DoD: Blocklists loaded at startup, role detection accuracy >95%.
  4. Bulk Processing

    • Add async bulk validation endpoint (POST /bulk/validate).
    • Implement job queue (Redis or MongoDB).
    • DoD: Processes 10K emails in <5 mins, returns job status.
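To illustrate the caching task in this phase (cache DNS/MX results and expire them after a TTL), here is a minimal in-memory sketch. It stands in for Redis only to show the expiry logic. `TtlCache` and its methods are hypothetical names, and the current time is passed in explicitly so the behavior can be checked deterministically.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// A tiny key/value cache in which each entry expires `ttl` after it was stored.
struct TtlCache<V> {
    ttl: Duration,
    entries: HashMap<String, (V, Instant)>,
}

impl<V: Clone> TtlCache<V> {
    fn new(ttl: Duration) -> Self {
        TtlCache { ttl, entries: HashMap::new() }
    }

    /// Stores a value together with its expiry time (`now + ttl`).
    fn insert_at(&mut self, key: &str, value: V, now: Instant) {
        self.entries.insert(key.to_string(), (value, now + self.ttl));
    }

    /// Returns the cached value if it has not expired; expired entries are evicted.
    fn get_at(&mut self, key: &str, now: Instant) -> Option<V> {
        let expired = match self.entries.get(key) {
            Some((_, expires_at)) => now >= *expires_at,
            None => return None,
        };
        if expired {
            self.entries.remove(key);
            return None;
        }
        self.entries.get(key).map(|(value, _)| value.clone())
    }
}
```

A caller would check the cache first, run the DNS lookup only on a miss, and store the result with `insert_at`. With Redis the expiry is handled by the server instead, so the client code only needs to set the TTL when writing the key.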

Phase 3: Performance & Security (Sprint 5-6)
Tasks
  1. Authentication & Authorization

    • Add JWT/API key authentication.
    • Restrict sensitive endpoints.
    • DoD: Unauthorized requests blocked, keys validated via DB.
  2. Rate Limiting

    • Implement Redis-based rate limiting (per API key).
    • DoD: Rejects requests beyond 10 reqs/sec, logs violations.
  3. Monitoring & Logging

    • Add Prometheus metrics (/metrics).
    • Structured logging (tracing).
    • DoD: Metrics visible in Grafana, logs searchable in Kibana.
  4. Load Testing & Optimization

    • Benchmark with k6 (target: 50K RPS).
    • Optimize DB queries and Redis usage.
    • DoD: Latency <100ms at 10K RPS, no memory leaks.
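The rate-limiting task in this phase (reject requests beyond 10 per second per API key) could follow a fixed-window counter, sketched below in memory. This is one possible technique chosen for illustration; the roadmap does not say which algorithm will be used. A Redis-backed version would keep the counters in Redis rather than in a `HashMap`. `RateLimiter` and `allow_at` are hypothetical names.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Fixed-window rate limiter: at most `limit` requests per `window` for each key.
struct RateLimiter {
    limit: u32,
    window: Duration,
    // For each API key: (start of the current window, requests seen in it).
    counters: HashMap<String, (Instant, u32)>,
}

impl RateLimiter {
    fn new(limit: u32, window: Duration) -> Self {
        RateLimiter { limit, window, counters: HashMap::new() }
    }

    /// Returns true if the request is allowed, false if the key is over its limit.
    fn allow_at(&mut self, api_key: &str, now: Instant) -> bool {
        let limit = self.limit;
        let window = self.window;
        let entry = self.counters.entry(api_key.to_string()).or_insert((now, 0));

        // Start a fresh window once the previous one has elapsed.
        if now.saturating_duration_since(entry.0) >= window {
            *entry = (now, 0);
        }

        if entry.1 < limit {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}
```

A fixed window is simple but allows short bursts across a window boundary; a sliding window or token bucket would smooth that out at the cost of more state per key.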

Phase 4: Deployment & Maintenance (Sprint 7+)
Tasks
  1. Docker & Kubernetes Deployment

    • Containerize API with multi-stage Dockerfile.
    • Deploy to Kubernetes (EKS/GKE) or serverless (AWS Lambda).
    • DoD: API runs in production, health checks pass.
  2. Blue-Green Deployment

    • Zero-downtime deployment strategy.
    • Rollback mechanism.
    • DoD: Deploys without downtime, rollback tested.
  3. Post-Launch Monitoring

    • Set up alerts (Prometheus Alertmanager).
    • Track bounce rate improvements.
    • DoD: Alerts trigger on errors, sender score improves by 10%.

Roadmap Timeline
| Sprint | Focus Area | Key Deliverables |
| --- | --- | --- |
| 1-2 | Core Validation | REST API, DB integration |
| 3-4 | GraphQL & Caching | Bulk processing, Redis |
| 5-6 | Security & Scaling | Rate limits, auth, 50K RPS |
| 7+ | Deployment | Kubernetes, monitoring |

Success Metrics
  • Performance: 99.9% uptime, <100ms latency.
  • Accuracy: 95%+ valid/invalid email detection.
  • Security: No critical CVEs, rate limits enforced.
