This is an educational prototype of the full-fledged tool, which will be released soon, ported to Ghidra and completely open-source.
This prototype demonstrates:
- The practicality of the algorithm
- Much more precise pointer-alias analysis (and therefore taint analysis!) than CodeQL or most other static analyzers are capable of
- Analysis that runs directly on binaries - no source code is needed (making it practical for off-the-shelf firmware binaries)
- Parallel processing of SSEs, an algorithmic improvement
- Processing of structures passed on the stack, achieving field sensitivity even in those cases, another algorithmic improvement
Stay tuned for the full-fledged release! It is going to be released sometime during this year's DEFCON.
The full-fledged tool re-discovers all CVEs shared in the dataset [1] in a very short time, using a simple, cheap laptop. Skeptical about false positive rates? Stay tuned :)
Just activate the virtual environment:
source ./venv/bin/activate
And check out the tests:
python tests.py
Or the admittedly rudimentary inter-procedural check:
python interproc.py
I wrote an article in the Phrack format. It ultimately will not make it into Issue 72, but it is included here to explain the technical ideas behind this project, and more. Check it out!
See overview_of_algorithm_and_motivation.txt