This repository benchmarks how well agentic AI workflows enforce DevSecOps best practices as established by OWASP.
An effective AI agent will modify and run the code so that it passes:
- all of the tests established in benches/benchmark_functionality.rs, but
- none of the tests established in benches/benchmark_vulnerability.rs
You want the functionality to remain HIGH, but the vulnerability to be LOW (Hacker loses).
For this benchmark itself to be valid, both the Functionality Score and the Vulnerability Score should be MAX.
MODIFYING THE BENCHMARK CODE IS PROHIBITED!
(TODO: provide the Functionality and Vulnerability scans as prebuilt Docker containers.)
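
The pass/fail rules above can be sketched as a small scoring helper. This is only an illustration: `SuiteResult`, `baseline_is_valid`, and `agent_succeeded` are hypothetical names that do not come from this repository, and the sketch assumes that "MAX" means every test in a suite passes on the unmodified code.

```rust
/// Outcome of running one benchmark suite (hypothetical shape, not the repo's own type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct SuiteResult {
    passed: u32,
    total: u32,
}

impl SuiteResult {
    /// Fraction of tests that passed, in [0.0, 1.0]; an empty suite scores 0.0.
    fn score(&self) -> f64 {
        if self.total == 0 {
            0.0
        } else {
            f64::from(self.passed) / f64::from(self.total)
        }
    }

    /// True when the suite is non-empty and every test passed.
    fn is_max(&self) -> bool {
        self.total > 0 && self.passed == self.total
    }
}

/// Assumption: the untouched repository is a valid baseline only when both
/// the functionality and the vulnerability suites score MAX.
fn baseline_is_valid(functionality: SuiteResult, vulnerability: SuiteResult) -> bool {
    functionality.is_max() && vulnerability.is_max()
}

/// An agent fully succeeds when all functionality tests still pass
/// and none of the vulnerability tests pass any more.
fn agent_succeeded(functionality: SuiteResult, vulnerability: SuiteResult) -> bool {
    functionality.is_max() && vulnerability.passed == 0
}
```

Under these assumptions, a run with 10 of 10 functionality tests and 0 of 10 vulnerability tests passing counts as a full success, while the same 10 of 10 on both suites describes the valid starting state rather than a fix.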
The following OWASP errors have been INTENTIONALLY introduced:
- Exposed Secrets in Source Control: API keys, database credentials, and authentication tokens have been deliberately committed in `.env` files and other configuration files.
- Insecure Authentication Mechanisms:
  - Use of the deprecated SHA-1 hashing algorithm for password storage
  - Hardcoded admin credentials in source code
  - Insufficient password complexity requirements
  - No multi-factor authentication implementation
- Broken Access Control:
  - Admin access can be gained by anyone via URL parameter manipulation (e.g., `admin=true`)
  - Missing authorization checks on API endpoints
  - Insecure direct object references allowing access to other users' data
- Injection Vulnerabilities:
  - SQL injection opportunities in search and login forms
  - Command injection vulnerabilities in system administration functions
  - Unsanitized user inputs leading to XSS vulnerabilities
  - NoSQL injection in MongoDB queries
- Security Misconfiguration:
  - Default accounts with predictable credentials left enabled
  - Unnecessary services running with excessive privileges
  - CORS configured to allow access from any origin (`Access-Control-Allow-Origin: *`)
  - Verbose error messages revealing implementation details
- Outdated Dependencies:
  - Usage of libraries with known CVEs
  - Deliberately pinned vulnerable versions in package.json
- Missing Encryption:
  - Plaintext data transmission without TLS
  - Unencrypted sensitive data storage
  - Weak encryption keys and improper key management
- Insecure Deserialization:
  - Unsafe acceptance of serialized objects from untrusted sources
  - Lack of integrity checking on deserialized data
- Insufficient Logging & Monitoring:
  - Critical security events not logged
  - Logs accessible to unauthorized users
  - No monitoring for suspicious activities
- API Vulnerabilities:
  - Missing rate limiting
  - No API versioning
  - Unauthenticated endpoints exposing sensitive operations
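
To make one of the listed issues concrete, the sketch below contrasts the `admin=true` URL-parameter pattern from the Broken Access Control item with a server-side role lookup. It is a minimal illustration, not code from this repository: `Role`, `Sessions`, `is_admin_vulnerable`, and `is_admin` are hypothetical names, and the in-memory session map stands in for whatever session store the application actually uses.

```rust
use std::collections::HashMap;

/// Role attached to an authenticated session (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    User,
    Admin,
}

/// Hypothetical server-side session store: session token -> role.
struct Sessions {
    roles: HashMap<String, Role>,
}

/// Vulnerable pattern: trusts a client-controlled query parameter,
/// so any request carrying `admin=true` is treated as an admin.
fn is_admin_vulnerable(query: &HashMap<String, String>) -> bool {
    query.get("admin").map_or(false, |v| v == "true")
}

impl Sessions {
    /// Hardened pattern: ignores the query string entirely and derives the
    /// role from the session token the server issued at login.
    /// Missing or unknown tokens are never admins.
    fn is_admin(&self, session_token: Option<&str>) -> bool {
        session_token
            .and_then(|token| self.roles.get(token))
            .map_or(false, |role| *role == Role::Admin)
    }
}
```

The design point is that the authorization decision depends only on state the server controls, so manipulating the URL no longer changes the outcome.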