Safely deletes orphaned files in Mattermost Community/Team Edition. No Enterprise. No nonsense.
- 🧹 Finds files whose posts no longer exist and removes them (attachments, thumbs, previews).
- 🔒 Safe by default: dry-run first, detailed logs.
- 🐳 Docker-first. Works with PostgreSQL + local filestore (/mattermost/data).
- ⚡ Proven scale: handled ~1M posts / 50k files in ~5 min, ~150 MB RAM, ~50% of 1 core.
Free your storage without paying for “retention” features.

- 🔄 Automatic cleanup: Daily orphaned file cleanup task execution
- 🛡️ Safety first: Dry run mode enabled by default for testing
- 📊 Detailed logging: Comprehensive information about the cleanup process
- 🐳 Docker Ready: Ready-to-use Docker image for quick deployment
- ⚡ High performance: Built on .NET 9.0 with database preloading and smart caching
- 🔒 Safe database operations: Removes both file records from database and physical files from filesystem
- 🔌 REST API: HTTP endpoints for manual job triggering and status monitoring
The project is built on a modern technology stack:
- .NET 9.0 - Main platform
- Entity Framework Core - ORM for PostgreSQL database operations
- Quartz.NET - Task scheduler for automatic execution
- ASP.NET Core - Web API host
- PostgreSQL - Mattermost database
Sources/
├── Program.cs                          # Application entry point
├── Controllers/
│   └── JobController.cs                # REST API endpoints
├── Services/
│   └── ReportService.cs                # Retention reports management
├── Models/
│   ├── RetentionReport.cs              # Report data model
│   └── RetentionReportFileInfo.cs      # File processing details
├── Database/
│   ├── AppDbContext.cs                 # Entity Framework context
│   └── Models/
│       ├── MattermostPost.cs           # Mattermost posts model
│       └── MattermostFileInfo.cs       # File information model
└── Jobs/
    └── RetentionJob.cs                 # Main file cleanup job
Please note: DryRun mode is enabled by default, meaning that files are only logged, not deleted. Set it to false in production after testing. The author is not responsible for data loss. If you encounter errors or have questions, please open an issue.
- Create a docker-compose.yml file:
services:
  mattermost-real-retention:
    image: bvdcode/mattermost-real-retention:latest
    restart: always
    # Optional: Expose API endpoints for manual control and monitoring
    ports:
      - "8080:8080" # API endpoints
    environment:
      - PostgresHost=postgres
      - PostgresPort=5432
      - PostgresUser=mattermost
      - PostgresPassword=changeme
      - PostgresDatabase=mattermost
      - DryRun=true # Set to false for actual deletion
      - DelayBetweenFilesInMs=0 # Optional: delay between file processing
    volumes:
      - /path/to/mattermost/data:/mattermost/data:rw
- Start the container:
docker-compose up -d
# The -p 8080:8080 mapping is optional; it exposes the API endpoints.
docker run -d \
  --name mattermost-retention \
  --restart always \
  -p 8080:8080 \
  -e PostgresHost=your_postgres_host \
  -e PostgresPort=5432 \
  -e PostgresUser=mattermost \
  -e PostgresPassword=your_password \
  -e PostgresDatabase=mattermost \
  -e DryRun=true \
  -v /path/to/mattermost/data:/mattermost/data:rw \
  bvdcode/mattermost-real-retention:latest
Variable | Description | Default | Required |
---|---|---|---|
PostgresHost | PostgreSQL server host | postgres-server | ✅ |
PostgresPort | PostgreSQL port | 5432 | ❌ |
PostgresUser | PostgreSQL username | mattermost_server | ✅ |
PostgresPassword | PostgreSQL password | - | ✅ |
PostgresDatabase | Database name | mattermost | ✅ |
DryRun | Test mode (doesn't delete files) | true | ❌ |
DelayBetweenFilesInMs | Delay between file processing (ms) | 0 | ❌ |
The service uses the same PostgreSQL connection settings as your Mattermost server. Ensure that:
- The user has read permissions on the posts and fileinfo tables
- The user has delete permissions on fileinfo table records (only when DryRun=false)
- The service can connect to the Mattermost database
The service provides REST API endpoints on port 8080 for monitoring and manual control.
Returns detailed reports of all retention job executions.
Response:
[
  {
    "dryRun": true,
    "createdAt": "2025-01-15T10:30:00Z",
    "directory": "/mattermost/data/",
    "totalFilesCount": 1523,
    "foldersCount": 45,
    "processedFiles": [
      {
        "relativePath": "20241201/abc123/image.jpg",
        "length": 245760,
        "deleted": true,
        "result": "File not found in database - deleted from filesystem"
      }
    ]
  }
]
Usage:
curl http://localhost:8080/status
Manually triggers the retention cleanup job.
Response:
"Job 'RetentionJob' has been triggered successfully."
Usage:
curl -X POST http://localhost:8080/trigger
Note: The trigger endpoint is useful for testing and manual cleanup runs. The job will still respect the DryRun setting.
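The two endpoints above can also be called from a script. The following is a minimal sketch using only the Python standard library; the base URL assumes the port mapping from the examples above, and the function names (`build_trigger_request`, `trigger_job`, `fetch_reports`) are hypothetical helpers, not part of the service.

```python
import json
import urllib.request

# Assumption: the API is reachable on the port mapped in the examples above.
BASE_URL = "http://localhost:8080"


def build_trigger_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for the /trigger endpoint (no request body)."""
    return urllib.request.Request(f"{base_url}/trigger", method="POST")


def trigger_job(base_url: str = BASE_URL, timeout: float = 10.0) -> str:
    """Trigger the cleanup job and return the raw response text."""
    with urllib.request.urlopen(build_trigger_request(base_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def fetch_reports(base_url: str = BASE_URL, timeout: float = 30.0) -> list:
    """Fetch the list of retention reports from the /status endpoint."""
    with urllib.request.urlopen(f"{base_url}/status", timeout=timeout) as resp:
        return json.load(resp)
```

A typical manual run would call `trigger_job()` once and then `fetch_reports()` after the job has had time to finish; how long that takes depends on the size of the data directory.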
- File scanning: Every 24 hours the service scans the /mattermost/data/ directory
- Date-based file search: Only processes directories in YYYYMMDD format
- Performance optimization:
  - Preloads up to 1M active posts into memory for fast lookup (if there are more posts, the remaining ones will be queried on-demand)
  - Bulk loads file records for efficient database access
  - Uses AsNoTracking for read-only operations to reduce memory overhead
- Database verification: For each file, checks:
  - Does a record exist in the fileinfo table
  - Is the file linked to an active post (not deleted)
  - Is the file itself marked as deleted
- Safe deletion: Orphaned files are removed from both filesystem and database. The service deletes:
  - Physical files from the filesystem (/mattermost/data/)
  - Corresponding records from the fileinfo table in the database
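To illustrate the date-based scan described above, here is a rough Python sketch (the service itself is written in .NET, so this is not its actual code). It assumes "YYYYMMDD format" means a directory name of exactly eight digits; `iter_candidate_files` is a hypothetical name.

```python
import re
from pathlib import Path

# Assumption: a date directory is any directory whose name is exactly eight digits.
DATE_DIR = re.compile(r"^\d{8}$")


def iter_candidate_files(data_root):
    """Yield (relative_path, size_in_bytes) for files under date-named directories."""
    root = Path(data_root)
    for entry in sorted(root.iterdir()):
        if not entry.is_dir() or not DATE_DIR.match(entry.name):
            continue  # anything not named like a date is left untouched
        for path in sorted(entry.rglob("*")):
            if path.is_file():
                yield path.relative_to(root).as_posix(), path.stat().st_size
```

Restricting the walk to date-named directories means other content stored in the same data directory is never considered for deletion.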
The service deletes files in the following cases:
- ✅ File not found in fileinfo table
- ✅ File linked to a deleted post (posts.deleteat > 0)
- ✅ File marked as deleted (fileinfo.deleteat > 0)
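The three deletion cases above can be expressed as a small decision function. This is a Python sketch of the rules as listed, not the service's real code: `FileRecord` and `deletion_reason` are hypothetical names, and treating a file whose post is unknown to the caller as "keep" is a conservative assumption made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FileRecord:
    """Hypothetical stand-in for one row of the fileinfo table."""
    post_id: str
    delete_at: int  # 0 means the file is not marked as deleted


def deletion_reason(record: Optional[FileRecord],
                    post_delete_at: Dict[str, int]) -> Optional[str]:
    """Return why a file counts as orphaned, or None if it must be kept.

    post_delete_at maps a post id to its posts.deleteat value.
    """
    if record is None:
        return "not found in fileinfo"
    if record.delete_at > 0:
        return "file marked as deleted"
    if post_delete_at.get(record.post_id, 0) > 0:
        return "linked to a deleted post"
    # Assumption: active or unknown posts never cause a deletion.
    return None
```

For example, `deletion_reason(None, {})` reports a file with no database record, while a record linked to a post with `deleteat` of 0 yields `None` and is kept.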
- 🔒 Never deletes files linked to active posts
- 📝 Detailed logging of all operations with sensitive data sanitization
- 🧪 Dry run mode for testing
- ⏱️ Configurable delay between file checks (default: 0ms for maximum speed)
- 🗃️ Cleans both filesystem and database records for consistency
- 🚀 Memory-efficient processing with database preloading and bulk operations
Use the /status endpoint to programmatically monitor retention job executions:
- Job history: View all completed retention jobs with detailed statistics
- File details: See exactly which files were processed and their outcomes
- Performance metrics: Track total files processed, execution time, and cleanup efficiency
- Dry run validation: Review what would be deleted before setting DryRun=false
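As one way to do the dry run validation, the sketch below condenses a single report object into a few totals. It relies only on the field names visible in the sample /status response and assumes that `deleted: true` marks files the job selected for removal (as in that sample, where `dryRun` is also true); `summarize_report` is a hypothetical helper.

```python
def summarize_report(report: dict) -> dict:
    """Condense one parsed /status report into totals for a quick review."""
    files = report.get("processedFiles", [])
    # Assumption: "deleted" flags files the job selected, even in dry-run mode.
    flagged = [f for f in files if f.get("deleted")]
    return {
        "dry_run": report["dryRun"],
        "total_files": report["totalFilesCount"],
        "flagged_files": len(flagged),
        "flagged_bytes": sum(f["length"] for f in flagged),
    }
```

Comparing `flagged_bytes` against the size of the data directory gives a rough idea of how much space a real run would reclaim.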
- Information: General process information
- Warning: Found orphaned files
- Debug: Detailed information about each file
[Information] Starting retention job with delay 0 ms and dry run mode True.
[Information] Found 1523 files in 45 date directories in /mattermost/data/.
[Information] Preloading database posts for performance optimization...
[Information] Preloaded 987543 active posts from the database.
[Information] Preloading database files for performance optimization...
[Warning] File 20241201/abc...123/image.jpg not found in the database - deleting from filesystem.
[Warning] File 20241201/def...456/document.pdf is marked deleted - deleting file and database record.
[Information] Dry run enabled, skipping actual deletion.
[Information] Retention job completed. 42 files deleted, 1523 files total.
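For basic log monitoring, lines in the format shown above can be tallied by level. This Python sketch assumes every line starts with a bracketed level name exactly as in the sample; the real container output may carry extra prefixes such as timestamps, and `count_levels` is a hypothetical helper.

```python
import re
from collections import Counter

# Assumption: each log line begins with "[Level] " as in the sample above.
LOG_LINE = re.compile(r"^\[(?P<level>\w+)\]\s+(?P<message>.*)$")


def count_levels(lines):
    """Count log lines per level, ignoring lines that do not match the format."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[match.group("level")] += 1
    return counts
```

A sudden jump in the Warning count between runs is a cheap signal that more files than usual were classified as orphaned.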
- .NET 9.0 SDK
- PostgreSQL (for testing)
- Docker (optional)
cd Sources
dotnet restore
dotnet build
cd Sources
dotnet run
docker build -t mattermost-retention ./Sources
The project uses GitHub Actions for automatic building and publishing of Docker images:
- Docker Hub: bvdcode/mattermost-real-retention
- GitHub Container Registry: ghcr.io/bvdcode/mattermost-real-retention
Images are built automatically on every push to the main branch.
- CPU: 1 core
- RAM: 512MB (tested with 150-200MB usage on 1M posts + 50K files)
- Disk: Minimum for image storage (~100MB)
- Access: Read access to the Mattermost data directory (write access is also needed when DryRun=false)
- Network: Connection to PostgreSQL server
Tested on production scale:
- Database size: ~1 million posts, ~50,000 fileinfo records
- Memory usage: 150-200MB RAM during execution
- Processing speed: Optimized with database preloading and minimal delays
- Efficiency: Bulk operations and smart caching for large datasets
- Run the service on the same server where Mattermost files are located
- Use network storage if Mattermost runs in a cluster
- Set up log monitoring to track service operation
- Start with DryRun=true to assess cleanup volume
- Dry run mode: Enabled by default (DryRun=true); files are not deleted, only logged
- Backups: Always create a backup of your data before first run
- Testing: Test the service in dry run mode before production use
- Permissions: Ensure the container has read/write permissions to the data directory
Contributions to the project are welcome:
- Fork the repository
- Create a feature branch
- Make your changes
- Create a Pull Request
This project is distributed under the MIT License. See the LICENSE file for details.
If you have questions or issues:
- Create an Issue
- Check existing Issues
- Review service logs for diagnostics