A powerful file security checker for TXT and EPUB files with virus scanning capabilities, QR code detection, and advertisement text removal.
# Automatic comprehensive cleaning when output path is provided
# This will: Remove QR code images completely + Remove all advertisement text
cargo run -- -p book.epub -f epub --sanitize -o clean_book.epub
# Interactive mode (when no output path provided)
cargo run -- -p book.epub -f epub --sanitize
New Behavior: When you provide an output path (-o or --output), the tool automatically uses the most comprehensive sanitization method: it removes QR code images completely and removes all advertisement text (including the Telegram channel ad: '感谢Telegram 频道 @sharebooks4you制作,欢迎大家扫码订阅').
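The advertisement text removal described above can be pictured as a plain substring removal over each content document. The following is a minimal sketch under that assumption, not the tool's actual implementation; `strip_ads` and `AD_TEXT` are hypothetical names introduced for illustration.

```rust
/// Advertisement string quoted in this README. A real tool would likely
/// load a configurable list of patterns instead (assumption).
const AD_TEXT: &str = "感谢Telegram 频道 @sharebooks4you制作,欢迎大家扫码订阅";

/// Removes every occurrence of each pattern from `content` and returns the
/// cleaned text together with the number of occurrences removed.
/// Hypothetical helper, not part of the check_txt API.
fn strip_ads(content: &str, patterns: &[&str]) -> (String, usize) {
    let mut cleaned = content.to_string();
    let mut removed = 0;
    for &pattern in patterns {
        if pattern.is_empty() {
            continue; // an empty pattern would match at every position
        }
        removed += cleaned.matches(pattern).count();
        cleaned = cleaned.replace(pattern, "");
    }
    (cleaned, removed)
}
```

Calling `strip_ads(chapter_text, &[AD_TEXT])` on each XHTML chapter would yield the cleaned text plus a count that can be reported to the user.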
Check TXT is a Rust-based file security checker that analyzes text files, EPUB files, and other document formats for potential security threats, malicious content, and suspicious patterns. It provides both a command-line interface (CLI) and a web-based interface for file analysis.
- Multi-format Support: Checks TXT and EPUB files for security issues
- Comprehensive Scanning: Detects suspicious code patterns, advertisements, malware indicators, and encryption patterns
- VirusTotal Integration: Optional virus scanning using VirusTotal API
- Deep Scan Mode: Enhanced scanning for obfuscated content and binary data
- Web Interface: User-friendly web-based file upload and analysis
- File Size Control: Configurable maximum file size limits
- Progress Tracking: Real-time progress indicators for long operations
- EPUB to TXT Conversion: Convert EPUB files to plain text format
- File Security Scanning: Detects suspicious patterns, malware signatures, and potentially dangerous content
- Virus Scanning: Integration with VirusTotal API for comprehensive virus detection
- EPUB Analysis: Deep analysis of EPUB files including:
  - Link extraction and analysis
  - QR code detection in images
  - Image analysis and processing
  - Advertisement text detection and removal
- QR Code Sanitization: Remove or blur QR codes from EPUB images
- Advertisement Text Removal: Remove advertisement text patterns from EPUB content
- Flexible Configuration: Customizable patterns and scanning options
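As an illustration of the link extraction step listed under EPUB Analysis, here is a rough std-only sketch that pulls `href` values out of an XHTML string. It assumes double-quoted attributes and does no real HTML parsing; `extract_links` and `is_external` are hypothetical helpers, and the tool itself may work differently.

```rust
/// Collects the values of double-quoted `href="..."` attributes in `xhtml`.
/// Simplified scan for illustration; not a full HTML parser.
fn extract_links(xhtml: &str) -> Vec<String> {
    const MARKER: &str = "href=\"";
    let mut links = Vec::new();
    let mut rest = xhtml;
    while let Some(start) = rest.find(MARKER) {
        let after = &rest[start + MARKER.len()..];
        match after.find('"') {
            Some(end) => {
                links.push(after[..end].to_string());
                rest = &after[end + 1..];
            }
            None => break, // unterminated attribute: stop scanning
        }
    }
    links
}

/// Returns true for http/https links, i.e. links that leave the book.
/// Treating these as the ones worth reviewing is an assumption.
fn is_external(link: &str) -> bool {
    let lower = link.to_ascii_lowercase();
    lower.starts_with("http://") || lower.starts_with("https://")
}
```

Internal links such as `chapter2.xhtml` are returned too, so a caller can filter with `is_external` before reporting.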
The tool performs comprehensive security analysis including:
- Suspicious Code Patterns: Detects eval(), exec(), system() calls and other dangerous functions
- Advertisement Detection: Identifies common advertising patterns and promotional content
- Malware Indicators: Scans for executable files, scripts, and other potentially harmful content
- Script Analysis: Detects embedded JavaScript and other scripting languages
- Encryption Patterns: Identifies encryption algorithms in potentially dangerous contexts
- File Integrity: Checks for duplicate files and suspicious file extensions in archives
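The File Integrity check can be sketched as a pass over the entry names of an archive. The example below assumes the names have already been read from the EPUB (for instance with the zip crate); the extension list and the `check_entries` helper are illustrative assumptions rather than the tool's real rules.

```rust
use std::collections::HashSet;

/// Extensions treated as suspicious inside an e-book archive.
/// Illustrative assumption, not the tool's actual list.
const SUSPICIOUS_EXTENSIONS: [&str; 5] = ["exe", "bat", "cmd", "scr", "dll"];

/// Returns human-readable findings for a list of archive entry names:
/// exact duplicate names and entries with a suspicious extension.
fn check_entries(names: &[&str]) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut findings = Vec::new();
    for name in names {
        // `insert` returns false when the name was already present.
        if !seen.insert(*name) {
            findings.push(format!("duplicate entry: {}", name));
        }
        // Compare the extension case-insensitively.
        let lower = name.to_lowercase();
        if let Some((_, ext)) = lower.rsplit_once('.') {
            if SUSPICIOUS_EXTENSIONS.iter().any(|s| *s == ext) {
                findings.push(format!("suspicious extension: {}", name));
            }
        }
    }
    findings
}
```

An empty result means neither check fired for the given entry names.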
- Rust (latest stable version)
- Cargo package manager
# Clone the repository
git clone <repository-url>
cd check_txt
# Build the project
cargo build --release
# The binary will be available at target/release/check_txt
# Build Docker image
docker build -t check_txt .
# Run with Docker Compose
docker-compose up
Create a .env file in the project root:
# Required for virus scanning
VIRUSTOTAL_API_KEY=your_virustotal_api_key_here
To enable virus scanning functionality:
- Sign up at VirusTotal
- Get your API key from the account settings
- Add it to your .env file
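For readers curious how a `KEY=value` entry such as `VIRUSTOTAL_API_KEY` might be read, here is a small std-only sketch of a `.env` lookup. It is an assumption about the format handling (comments, blank lines, optional quotes); the project may rely on a dedicated crate instead, and `env_value` is a hypothetical name.

```rust
/// Looks up `key` in the text of a `.env` file (one `KEY=value` per line).
/// Blank lines and `#` comments are skipped; surrounding double quotes are
/// trimmed. An empty value is treated as missing.
fn env_value(contents: &str, key: &str) -> Option<String> {
    for line in contents.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((name, value)) = line.split_once('=') {
            if name.trim() == key {
                let value = value.trim().trim_matches('"');
                if !value.is_empty() {
                    return Some(value.to_string());
                }
            }
        }
    }
    None
}
```

A caller could read the file with `std::fs::read_to_string(".env")` and disable virus scanning with a clear message when `env_value` returns `None`.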
# Check a single file
check_txt --path /path/to/file.txt --file-type txt
# Check a directory of files
check_txt --path /path/to/directory --file-type txt
# Enable deep scanning
check_txt --path /path/to/file.epub --file-type epub --deep-scan
# Enable virus scanning
check_txt --path /path/to/file.txt --file-type txt --virus-scan
# Set custom file size limit (in MB)
check_txt --path /path/to/file.txt --file-type txt --max-size 50
# Start web server
check_txt --web
- Start the web server:
  check_txt --web
- Open your browser and navigate to http://127.0.0.1:8090
- Upload files through the web interface for analysis
| Option | Short | Description | Default |
|---|---|---|---|
| --path | -p | Path to file or directory to check | Required |
| --file-type | -f | File type to check (txt, epub) | Required |
| --max-size | -m | Maximum file size in MB | 100 |
| --deep-scan | -d | Enable deep scanning | false |
| --virus-scan | -v | Enable virus scanning | false |
| --web | -w | Start web server | false |
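The --max-size option takes megabytes, while file metadata reports bytes. A minimal sketch of the comparison follows; whether the tool counts a megabyte as 1024 * 1024 or 1,000,000 bytes is an assumption here, and the function names are hypothetical.

```rust
use std::path::Path;

/// Bytes per megabyte. The binary definition is an assumption.
const BYTES_PER_MB: u64 = 1024 * 1024;

/// Default limit listed for --max-size in the options table.
const DEFAULT_MAX_SIZE_MB: u64 = 100;

/// Returns true if a file of `size_bytes` fits within `max_size_mb`.
/// `saturating_mul` avoids overflow for very large limits.
fn within_limit(size_bytes: u64, max_size_mb: u64) -> bool {
    size_bytes <= max_size_mb.saturating_mul(BYTES_PER_MB)
}

/// Reads the file size from metadata and applies the limit.
fn file_within_limit(path: &Path, max_size_mb: u64) -> std::io::Result<bool> {
    let size = std::fs::metadata(path)?.len();
    Ok(within_limit(size, max_size_mb))
}
```

Checking the size from metadata before reading the file keeps oversized inputs from being loaded into memory at all.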
check_txt --path document.txt --file-type txt
check_txt --path book.epub --file-type epub --deep-scan --virus-scan --max-size 200
check_txt --path ./documents --file-type txt --deep-scan
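When --path points at a directory, the files to check have to be collected first. The project lists walkdir as a dependency; the sketch below shows the same idea with only the standard library, so the traversal logic and the `collect_files` name are assumptions for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects files under `root` whose extension equals
/// `file_type` (ASCII case-insensitive). A single file path is accepted too.
/// Note: symlinked directories are followed, so link cycles are not handled.
fn collect_files(root: &Path, file_type: &str) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(path) = pending.pop() {
        if path.is_dir() {
            for entry in fs::read_dir(&path)? {
                pending.push(entry?.path());
            }
        } else if path
            .extension()
            .and_then(|ext| ext.to_str())
            .map_or(false, |ext| ext.eq_ignore_ascii_case(file_type))
        {
            found.push(path);
        }
    }
    found.sort(); // deterministic order for reporting
    Ok(found)
}
```

With `collect_files(Path::new("./documents"), "txt")`, each returned path can then be passed to the per-file check.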
✅ File appears to be secure!
⚠️ Found potential security issues:
- Suspicious code pattern found: (?i)(eval\s*\()
- Advertisement pattern found: (?i)(click here)
- Potential malware pattern found: (?i)(\.exe)
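The reported patterns are regular expressions such as `(?i)(eval\s*\()`. To show what that pattern accepts without pulling in the regex crate, here is a hand-written stand-in: a name, optional whitespace, then an opening parenthesis, ignoring ASCII case. It is a sketch of the matching rule only; `contains_call` and `find_dangerous_calls` are hypothetical helpers, and the tool itself presumably uses the regex crate listed in its dependencies.

```rust
/// Function names this README lists as dangerous.
const DANGEROUS_CALLS: [&str; 3] = ["eval", "exec", "system"];

/// Returns true if `text` contains `name`, optional whitespace, then `(`,
/// ignoring ASCII case. Like the regex, it has no word boundary, so
/// "retrieval(" also matches "eval".
fn contains_call(text: &str, name: &str) -> bool {
    // ASCII lowercasing keeps byte offsets identical to the original text.
    let haystack = text.to_ascii_lowercase();
    let needle = name.to_ascii_lowercase();
    if needle.is_empty() {
        return false;
    }
    let mut offset = 0;
    while let Some(pos) = haystack[offset..].find(&needle) {
        let after = offset + pos + needle.len();
        if haystack[after..].trim_start().starts_with('(') {
            return true;
        }
        offset = after; // keep searching after this occurrence
    }
    false
}

/// Lists which of the dangerous call names appear as calls in `text`.
fn find_dangerous_calls(text: &str) -> Vec<&'static str> {
    DANGEROUS_CALLS
        .iter()
        .copied()
        .filter(|name| contains_call(text, name))
        .collect()
}
```

Because there is no word boundary, identifiers that merely end in a listed name followed by a parenthesis are flagged as well, which is worth keeping in mind when reading the tool's findings.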
check_txt/
├── src/
│ ├── main.rs # Main CLI application
│ ├── virus_check.rs # VirusTotal integration
│ └── web_server.rs # Web server implementation
├── static/
│ └── index.html # Web interface
├── temp/ # Temporary file storage
├── Cargo.toml # Rust dependencies
├── Dockerfile # Docker configuration
└── docker-compose.yml # Docker Compose setup
- actix-web: Web framework for the web interface
- clap: Command-line argument parsing
- reqwest: HTTP client for API calls
- serde: Serialization/deserialization
- tokio: Async runtime
- walkdir: Directory traversal
- regex: Regular expression matching
- zip: Archive file handling
- sha2: Cryptographic hashing
- indicatif: Progress indicators
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.