ArchEval Benchmark is an open-source dataset for evaluating software architecture capabilities. It features 8 curated open-source repositories spanning microservices, middleware, and AI frameworks. Each project includes three core components:
- GitHub Repository Name (URL)
- Architecture Diagram (Image Format)
- Architecture Documentation (PDF Format)
No. | Repository Name | Architecture Diagram | #Files | Top 3 Tech Stacks
---|---|---|---|---
1 | hashicorp/consul | ![]() | 3,684 | Go(2361), JS/TS(1164), YAML(78)
2 | spring-framework | ![]() | 9,370 | Java(8986), Kotlin(328), YAML(25)
3 | apache/zookeeper | ![]() | 1,115 | Java(950), C/C++(59), Python(36)
4 | mindspore-ai/mindspore | ![]() | 16,525 | C/C++(9459), Python(6225), YAML(753)
5 | kubernetes/kubernetes | | 21,743 | Go(15941), YAML(5225), Markdown(562)
6 | tensorflow/tensorflow | ![]() | 10,846 | C/C++(5973), Python(3133), Markdown(1187)
7 | apache/kafka | ![]() | 5,876 | Java(5549), Python(178), YAML(65)
8 | istio/istio | | 4,611 | YAML(2595), Go(1886), Markdown(87)
💡 Tech Stack Note: Counts include core languages as well as configuration and documentation files (YAML/HTML/MD).
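
The per-stack counts in the table can be approximated by grouping files by extension. The sketch below is illustrative only: the `EXT_TO_STACK` mapping and the `top_stacks` / `format_stacks` helpers are assumptions made for this example, not the benchmark's actual counting method, which is not documented here.

```python
from collections import Counter
from pathlib import Path

# Hypothetical extension-to-stack mapping (an assumption for illustration;
# the grouping used to build the table above is not specified).
EXT_TO_STACK = {
    ".go": "Go",
    ".java": "Java",
    ".kt": "Kotlin",
    ".py": "Python",
    ".js": "JS/TS",
    ".ts": "JS/TS",
    ".c": "C/C++",
    ".cc": "C/C++",
    ".cpp": "C/C++",
    ".h": "C/C++",
    ".hpp": "C/C++",
    ".yaml": "YAML",
    ".yml": "YAML",
    ".md": "Markdown",
    ".html": "HTML",
}


def top_stacks(paths, n=3):
    """Return the n most common stacks among the given file paths."""
    counts = Counter()
    for p in paths:
        stack = EXT_TO_STACK.get(Path(p).suffix.lower())
        if stack is not None:  # files with unmapped extensions are ignored
            counts[stack] += 1
    return counts.most_common(n)


def format_stacks(stacks):
    """Render counts in the table's style, e.g. 'Go(2), YAML(1)'."""
    return ", ".join(f"{name}({count})" for name, count in stacks)
```

To apply this to a cloned repository, one could pass `Path(repo_dir).rglob("*")` filtered to regular files; the resulting numbers may differ from the table depending on which extensions are mapped.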
- Architecture reverse engineering
- Code-documentation consistency validation
- Cross-project architectural pattern analysis
- AI-based architecture generation
git clone https://github.com/panrusheng/arch-eval-benchmark
- Follow the directory structure: `{number}_{org}/project.{pdf|png|jpg|svg}` (Example: `9_neworg/project.pdf`)
- Diagrams should match PDF filenames
- New repositories should exceed 1,000 files
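
The contribution rules above can be checked before opening a pull request. The following is a minimal sketch, not a tool shipped with the benchmark: `parse_entry` and `check_submission` are hypothetical helpers, the set of characters allowed in `{org}` is an assumption, and the check assumes every entry needs both a PDF and at least one diagram image named `project`.

```python
import re

# Pattern for {number}_{org}/project.{pdf|png|jpg|svg}.
# The characters permitted in {org} are an assumption for this sketch.
ENTRY_PATTERN = re.compile(r"(\d+)_([A-Za-z0-9._-]+)/project\.(pdf|png|jpg|svg)")
MIN_FILES = 1000  # new repositories should exceed this many files


def parse_entry(path):
    """Return (number, org, extension) for a valid path, else None."""
    m = ENTRY_PATTERN.fullmatch(path)
    if m is None:
        return None
    return int(m.group(1)), m.group(2), m.group(3)


def check_submission(paths, repo_file_count):
    """Return a list of human-readable problems; empty means all checks pass."""
    problems = []
    extensions = set()
    for p in paths:
        entry = parse_entry(p)
        if entry is None:
            problems.append(f"path does not follow the directory structure: {p}")
        else:
            extensions.add(entry[2])
    if "pdf" not in extensions:
        problems.append("missing project.pdf")
    if not extensions & {"png", "jpg", "svg"}:
        problems.append("missing diagram image (project.png, .jpg, or .svg)")
    if repo_file_count <= MIN_FILES:
        problems.append(f"repository has {repo_file_count} files; must exceed {MIN_FILES}")
    return problems
```

For example, `check_submission(["9_neworg/project.pdf", "9_neworg/project.png"], 1200)` returns an empty list, while omitting the image or submitting a repository with 1,000 files or fewer adds an entry to the returned problems.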