This repository contains ChiGen and ChiBench, two tools designed to test and debug electronic design automation (EDA) tools. Below, you'll find descriptions of each tool and how they can be utilized.
ChiBench consists of a large collection of Verilog programs mined from open-source GitHub repositories. The goal of this benchmark suite is to test and debug electronic design automation (EDA) tools, such as the Jasper Formal Verification Platform or Intel Quartus. To test your EDA tool, simply pass all the programs in the ChiBench collection to it, and see if it crashes.
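The "pass every program to the tool and see if it crashes" workflow can be sketched in a few lines of Python. This is only an illustration, not a script shipped with this repository: the function names `run_tool` and `scan`, the `*.v` file pattern, and the 60-second timeout are all assumptions, and treating a negative return code as a crash only holds on POSIX systems.

```python
import subprocess
from pathlib import Path


def run_tool(tool_cmd, program, timeout=60):
    """Run the tool on one file and classify the outcome (hypothetical helper)."""
    try:
        proc = subprocess.run(
            [*tool_cmd, str(program)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    # On POSIX, a negative return code means the process was killed by a signal.
    if proc.returncode < 0:
        return "crash"
    # A non-zero exit is usually an ordinary diagnostic, not a crash.
    return "ok" if proc.returncode == 0 else "rejected"


def scan(tool_cmd, programs_dir, timeout=60):
    """Run the tool on every .v file under programs_dir; map path -> outcome."""
    results = {}
    for program in sorted(Path(programs_dir).rglob("*.v")):
        results[str(program)] = run_tool(tool_cmd, program, timeout)
    return results
```

For example, `scan(["verilator", "--lint-only"], "./chibench")` (the directory name is a placeholder) returns a dictionary from which the `"crash"` and `"timeout"` entries are the candidates worth inspecting by hand.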
ChiGen is a tool designed for synthesizing realistic Verilog designs to test and debug Electronic Design Automation (EDA) tools. Originally developed to validate the Jasper Formal Verification Platform, ChiGen has proven to be highly effective in identifying bugs across a wide range of tools, including Verible, Verilator, and Yosys. A tutorial to get started with ChiGen can be found below.
git clone https://github.com/lac-dcc/chimera.git
The Verible binary is in the `verible_bin` folder. However, if the binary does not work on your computer, try building it from source.
./scripts/setup.sh
In this step, there are 2 options:
- Train the grammar from scratch (which may take some hours to complete). The `verible-verilog-syntax` binary can be generated by compiling a modified version of Verible in this repository.
./scripts/run_parser_count_productions.sh ./database verible-verilog-syntax grammar.json context-size
- Use one of the pre-trained grammars available in the `json` folder (recommended).
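Conceptually, training a grammar means turning counts of observed production rules into probabilities. The sketch below shows that idea only; it is not the code behind `run_parser_count_productions.sh`, and the function name, the `(context, production)` input format, and the output layout are assumptions that do not reflect the actual JSON schema used by ChiGen.

```python
from collections import Counter, defaultdict


def estimate_probabilities(observations):
    """Estimate P(production | context) by relative frequency.

    observations: iterable of (context, production) pairs, where `context`
    is whatever the generator conditions on (hypothetical representation).
    """
    counts = defaultdict(Counter)
    for context, production in observations:
        counts[context][production] += 1
    return {
        context: {prod: n / sum(c.values()) for prod, n in c.items()}
        for context, c in counts.items()
    }


# Toy input: three statements and one expression seen in a parsed corpus.
observed = [("stmt", "if"), ("stmt", "if"), ("stmt", "assign"), ("expr", "id")]
probabilities = estimate_probabilities(observed)
```

In this toy input, `"if"` accounts for two of the three `"stmt"` observations, so it receives probability 2/3 in that context.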
If you opted to train the grammar from scratch, simply run the following command:
./build/Chimera grammar.json 1 > program.v
If you chose to use a pre-trained grammar, replace `grammar.json` with the grammar file you selected. For example:
./build/Chimera ./json/1gram_size_test.json 1 > program.v
verible-verilog-format --inplace program.v
- `-t <target-size>`: Specifies the minimum number of tokens in the generated programs, for instance, `-t 200`.
- `--printseed`: Prints the randomization seed.
- `--printcfg`: Generates call graph dot file.
- `--debug`: Prints debug messages.
- `--addbind`: Enables bind statements (not supported in many EDA tools).
- `--addasserts`: Enables assert statements.
- `--seed`: Set the seed for randomization.
- `--help`: Display usage.
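When driving ChiGen from a script, it can help to assemble the command line in one place. The following is a minimal sketch under two assumptions that this README does not confirm: that options may follow the two positional arguments, and that `--seed` takes its value as the next argument. The helper names `chimera_command` and `generate` are hypothetical.

```python
import subprocess


def chimera_command(grammar, n_gram=1, target_size=None, seed=None,
                    chimera="./build/Chimera"):
    """Build the argument list for one ChiGen run (hypothetical helper)."""
    cmd = [chimera, str(grammar), str(n_gram)]
    if target_size is not None:
        # Assumption: options are accepted after the positional arguments.
        cmd += ["-t", str(target_size)]
    if seed is not None:
        # Assumption: the seed value is passed as the next argument.
        cmd += ["--seed", str(seed)]
    return cmd


def generate(output_path, grammar, **options):
    """Run ChiGen and write the generated program to output_path."""
    with open(output_path, "w") as out:
        subprocess.run(chimera_command(grammar, **options), stdout=out, check=True)
```

A call such as `generate("program.v", "./json/1gram_size_test.json", target_size=200)` mirrors the shell redirection shown earlier; check `--help` for the exact option syntax before relying on it.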
If you ever use ChiBench or ChiGen to find bugs in some EDA tool, we would appreciate it very much if you could reach out to us and report your experience. If you need help to set up the scripts to do this kind of exploration, feel free to reach out to us as well!
There are already some ChiGen programs available in the folder `3k_programs_for_bugs/chigen/`.
In the `scripts` folder, there is a script called `generate_programs.sh`, which generates a specified number of programs. The usage is:
scripts/generate_programs.sh <target-directory> <chimera_executable> <json_file> <n_gram> <number of programs to generate> <path_to_formatter> <target_size>
One example of use would be:
./scripts/generate_programs.sh ./generated_programs ./build/Chimera ./json/1gram_size_test_modified.json 1 1000 ./verible_bin/verible-verilog-format 100
As described in Step 4, ChiGen requires a JSON file containing a Verilog grammar, where each production rule is associated with a probability. However, you may want to modify these probabilities to increase the diversity of the generated programs. To do this, you can use the `change_probabilities` script located in the `scripts` folder.
Example command:
python3 change_probabilities.py --input grammar.json \
--output grammar_boosted.json \
--power 0.5
- `--input`: Path to the trained or pre-trained grammar file (from Step 4).
- `--output`: Path where the modified grammar will be saved.
- `--power`: Compression power (default 0.5).
  - <1.0 → flattens distribution (boosts rare productions).
  - 1.0 → keeps distribution unchanged.
  - >1.0 → sharpens distribution (boosts frequent productions).
The resulting file (e.g., grammar_boosted.json) can then be used directly with Chimera as shown in Step 5.
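One common transform that behaves as the `--power` description says is to raise each probability to the given power and renormalize. The sketch below assumes that this is the idea; it is not taken from `change_probabilities.py`, and the function name `rescale` and the flat dictionary input are hypothetical.

```python
def rescale(probs, power=0.5):
    """Raise each probability to `power`, then renormalize to sum to 1.

    Assumption: this mirrors the effect described for --power; the real
    script may compute its result differently.
    """
    weights = {rule: p ** power for rule, p in probs.items()}
    total = sum(weights.values())
    return {rule: w / total for rule, w in weights.items()}


original = {"common_rule": 0.9, "rare_rule": 0.1}
flattened = rescale(original, power=0.5)   # roughly 0.75 / 0.25
unchanged = rescale(original, power=1.0)   # same as the input
sharpened = rescale(original, power=2.0)   # common_rule rises above 0.9
```

With a power of 0.5, the rare rule's share grows from 0.1 to about 0.25, which is the "boosts rare productions" effect described above.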
We have found various bugs in open-source platforms with programs from ChiBench and ChiGen. Below we list examples of issues reported:
Issue | Tool | Description |
---|---|---|
2159 | Verible's Obfuscator | Crashes when reading a program that only contains the pragma directive. |
2189 | Verible's code formatter | Crashes with syntactically valid input. |
2359 | Verible's code formatter | Fails to parse input. |
2364 | Verible's code formatter | Fails to parse input. |
2233 | Verible's parser | Incorrectly accepts Verilog code with mismatched program and endmodule keywords. |
2181 | Verible's parser | Crashes instead of reporting syntax errors related to instantiation type. |
5276 | Verilator | Crashes with signal 9 on a very large program. |
5311 | Verilator | Crashes when using time assignments. |
5312 | Verilator | Crashes when calling a function created in "generate" block. |
5865 | Verilator | Crashes when passing inout ports to primitive gates. |
1174 | Icarus Verilog | Crashes when assigning to parameters in a procedural block. |
1225 | Icarus Verilog | Freezes in invalid infinite loop. |
4598 | Yosys | Crashes while simplifying program. |
The design and implementation of ChiGen are described in this paper. Cite it as:
@misc{Vieira25,
title={Bottom-Up Generation of Verilog Designs for Testing EDA Tools},
author={João Victor Amorim Vieira and Luiza de Melo Gomes and Rafael Sumitani and Raissa Maciel and Augusto Mafra and Mirlaine Crepalde and Fernando Magno Quintão Pereira},
year={2025},
eprint={2504.06295},
archivePrefix={arXiv},
primaryClass={cs.AR},
url={https://arxiv.org/abs/2504.06295},
}
The process of constructing and curating the ChiBench collection of benchmarks is described in this paper. Cite it as:
@misc{Sumitani24,
title={ChiBench: a Benchmark Suite for Testing Electronic Design Automation Tools},
author={Rafael Sumitani and João Victor Amorim and Augusto Mafra and Mirlaine Crepalde and Fernando Magno Quintão Pereira},
year={2024},
eprint={2406.06550},
archivePrefix={arXiv},
primaryClass={cs.AR}
}
ChiBench contains only programs that were originally distributed with some license. Thus, each program in the ChiBench suite contains, as a header comment, the original license of that specification, plus a link to the repository from where that code was obtained. Notice that these programs might use different licenses, given that they were extracted from different projects. On May 24th, 2024, the following licenses were used among the repositories mined to build the ChiBench collection:
License | # |
---|---|
Apache License 2.0 | 341 |
MIT License | 331 |
GNU General Public License v3.0 | 159 |
GNU General Public License v2.0 | 37 |
BSD 3-Clause "New" or "Revised" License | 37 |
BSD 2-Clause "Simplified" License | 27 |
GNU Lesser General Public License v2.1 | 10 |
Creative Commons Zero v1.0 Universal | 9 |
The Unlicense | 7 |
GNU Lesser General Public License v3.0 | 5 |
Mozilla Public License 2.0 | 5 |
ISC License | 5 |
Creative Commons Attribution Share Alike 4.0 International | 4 |
CERN Open Hardware Licence Version 2 - Permissive | 3 |
GNU Affero General Public License v3.0 | 3 |
Creative Commons Attribution 4.0 International | 2 |
CERN Open Hardware Licence Version 2 - Strongly Reciprocal | 1 |
CERN Open Hardware Licence Version 2 - Weakly Reciprocal | 1 |
This project is sponsored by Cadence Design Systems. Additionally, the different people involved in this project acknowledge the support of CNPq, FAPEMIG, and CAPES. Finally, we thank UFMG's Department of Computer Science for making available the infrastructure necessary to carry out this project.