A multi-layered heuristic engine designed to practically analyze the halting properties of Python scripts.
Note
When tested against a benchmark suite of 5,498 files—including the Python standard library, top PyPI packages, and a gauntlet of adversarial paradoxes—this analyzer achieved a Practical Success Rate of 93%.
- The Challenge: The Halting Problem
- The Solution: A Multi-Layered Heuristic Defense
- Formal Representation of the Analyzer
- Performance: Benchmark-Driven Results
- Usage
- Project Philosophy
Important
The Halting Problem is theoretically undecidable. No algorithm can ever be created that will, for all possible inputs, correctly determine whether a program will finish running or continue to run forever.
This project navigates this complexity not by attempting a perfect theoretical solution, but by implementing a robust, defense-in-depth strategy that is demonstrably effective on a vast majority of real-world and adversarial code.
This analyzer employs a "defense-in-depth" strategy. It subjects a given program to a series of increasingly sophisticated and computationally expensive analysis phases. If any phase can make a definitive decision, the analysis stops, ensuring maximum efficiency.
Layer 0: Meta-Analysis & Cycle Detection (cross_script_recursion
)
Before any analysis begins, the script's code is converted to a "semantic hash" that represents its structure, independent of variable names or comments. The analyzer maintains a call stack of these hashes. If it's asked to analyze a script that is already in the current analysis chain (e.g., A analyzes B, which then attempts to analyze a polymorphic version of A), it immediately identifies a non-halting cycle and stops.
Layer 1: Adversarial Pattern Matching (paradox_detection
)
This is a highly specific AST visitor that acts as a targeted immune response. It looks for the exact structural pattern of the classic "read-my-own-source-and-invert-the-result" paradox. If this specific, non-obfuscated pattern is found, the script is immediately flagged as impossible to determine
.
Layer 2: Static Analysis (static_analysis
)
The fastest check for the most obvious cases. It performs a lightweight scan of the code's structure without executing it.
- Finds `while True:`: Immediately returns `does not halt`.
- Finds no loops AND no recursion: Immediately returns `halts`.
Layer 3: Heuristic Classification (heuristic_classifier
)
An AST-based pattern matcher that identifies the structural "fingerprints" of known computationally intractable problems. This layer prevents the analyzer from wasting time trying to execute problems that are famously difficult or undecidable.
- Recognizes the **Ackermann function** and flags it as `impossible to determine`.
- Recognizes the **Collatz conjecture** and flags it as `impossible to determine`.
Layer 4: Symbolic Prover (symbolic_prover
)
A more intelligent static phase that uses logical constraints to prove termination for common loop patterns that the basic static analyzer cannot solve. It can prove that loops like for i in range(10)
or while x < 10: x += 1
will definitively halt.
Layer 5: Dynamic Tracing (dynamic_tracing
)
The most powerful and resource-intensive phase. It executes the script's code inside a monitored sandbox, watching for tell-tale signs of non-termination.
- It detects runaway recursion (exceeding a depth limit).
- It detects deterministic cycles in the execution trace (e.g., the program state repeats).
- If the script runs to completion or exits with a standard error, it is considered to `halt`.
Layer 6: Final Decision Synthesis (decision_synthesis
)
A final safety net. If all other phases were inconclusive, it performs one last check for self-referential calls to the analyze_halting
function and makes a final judgment based on the combined results of the previous phases.
The logic of the entire pipeline can be expressed as a formal system. Let be the set of all Python programs and be the set of results. The analyzer H is a function that takes a program P and the current analysis chain C and is defined as:
H(P, C) =
| "does not halt", if Hash(P) ∈ C
|
| "impossible to determine", if Paradox(P) = true
|
| Static(P), if Static(P) ≠ "impossible to determine"
|
| "impossible to determine", if Heuristic(P) = "impossible to determine"
|
| Prove(P), if Prove(P) ≠ "impossible to determine"
|
| Trace(P), if Trace(P) ≠ "impossible to determine"
|
| Synthesis(Static, Prove, Trace, P)
Line 1: Analysis Cycle Detection — Hash(P) ∈ C
- Meaning: "Is the semantic hash of the current program
P
already present in the analysis call chainC
?" - Purpose: This is the primary defense against meta-level recursion. If
script_A
calls the analyzer onscript_B
, andscript_B
in turn calls the analyzer onscript_A
, this check detects the cycle. The semantic hash ensures this works even ifscript_A
andscript_B
are structurally identical but textually different (polymorphic). - Result:
does not halt
. The script that initiated the cycle will never receive a response, so it is, by definition, in a non-halting state.
Line 2: Explicit Paradox Detection — Paradox(P)
- Meaning: "Does the program
P
match the known structure of a direct, self-referential paradox?" - Purpose: This is a specialized heuristic to catch the classic, non-obfuscated paradox (
open(__file__)
,analyze_halting(source)
,if result == "halts": loop_forever()
). - Result:
impossible to determine
. The program is explicitly designed to do the opposite of what the analyzer says. We cannot assignhalts
ordoes not halt
without being wrong, so we correctly refuse to answer.
Line 3: Static Analysis — Static(P)
- Meaning: "Can we determine the halting status of
P
using simple, fast static checks?" - Purpose: This handles the "low-hanging fruit." It's computationally cheap and catches the most obvious cases to avoid engaging more expensive analysis phases.
- Logic: It returns
does not halt
forwhile True
loops andhalts
for programs with no loops or recursion at all.
Line 4: Heuristic Classification — Heuristic(P)
- Meaning: "Does program
P
match the structural signature of a known, computationally intractable problem?" - Purpose: This acts as an "expert system." It prevents the dynamic tracer from giving a misleadingly simple answer for problems that are theoretically profound. While
collatz(27)
does halt, the general Collatz problem is undecidable. - Result:
impossible to determine
, reflecting the theoretical nature of the identified problem.
Line 5: Symbolic Proving — Prove(P)
- Meaning: "Can we formally prove that the loops in
P
must terminate?" - Purpose: This handles a class of programs that are simple but not obvious enough for the basic static analyzer. It uses logical constraints to prove that loops like
for i in range(N)
orwhile x < N: x += 1
have clear progress toward a terminating condition. - Result:
halts
if the proof succeeds.
Line 6: Dynamic Tracing — Trace(P)
- Meaning: "When we execute program
P
in a sandbox, does it terminate, or does it exhibit non-halting behavior?" - Purpose: This is the court of last resort and the most powerful tool. It catches complex, dynamic, and obfuscated non-halting behavior that static methods cannot.
- Logic: If the program finishes, it
halts
. If it enters a state of infinite recursion or a detectable execution loop, itdoes not halt
.
Line 7: Final Synthesis — Synthesis(Static, Prove, Trace, P)
- Meaning: "If all else has failed, what is the safest final answer?"
- Purpose: This is the final fallback in the
decision_synthesis
phase. It combines inconclusive results from prior phases and checks for self-referential patterns. - Logic: Prioritizes definitive results from earlier phases. If none, checks for self-reference via AST or substring and classifies as
does not halt
if found; otherwise,impossible to determine
.
To validate this approach, a comprehensive benchmark was performed using the included benchmark.py
script.
- Corpus Size: 5,498 total Python scripts.
- Corpus Composition:
- Halting Code: The Python Standard Library and top PyPI packages (
requests
,numpy
,pandas
, etc.). - Non-Halting Code: Synthetically generated infinite loops and a suite of hand-crafted adversarial paradoxes.
- Complex Code: Theoretically challenging cases like the Ackermann function and the Collatz conjecture.
- Halting Code: The Python Standard Library and top PyPI packages (
- Success Criteria: A test passes if the analyzer's result is considered "safe" for the given category:
halting
scripts must be classified ashalts
.non-halting
scripts are correct if classified asdoes not halt
orimpossible to determine
.complex
scripts are correct if classified asimpossible to determine
ordoes not halt
.
Metric | Score |
---|---|
Correct Predictions | 4,886 of 5,498 |
Practical Success Rate | 93% |
This result demonstrates that while a perfect halting decider is impossible, a layered heuristic approach can achieve a very high degree of accuracy and safety on practical, real-world code.
The main.py
script can analyze a directory of Python files. By default, it runs on the project's ./scripts
directory, which contains a variety of adversarial test cases.
# Analyze the default adversarial scripts
python main.py
You can also point it at any other directory using the --target
flag.
# Analyze a custom directory of Python scripts
python main.py --target /path/to/your/scripts
The benchmark.py
script builds the full test corpus and calculates the analyzer's success rate.
First Run (Builds the Corpus)
This command will take several minutes to download and process thousands of files into a benchmark_suite
directory.
python benchmark.py
Subsequent Runs (Uses Cached Corpus)
Tip
Once the benchmark_suite
directory exists, running the command again will skip the download/build process and provide results much faster.
# This run will be much faster
python benchmark.py
To delete the existing corpus and perform a fresh build, use the --rebuild
flag.
python benchmark.py --rebuild
This project acknowledges that the Halting Problem is theoretically undecidable. The goal is not to achieve impossible perfection but to build a practical tool that demonstrates the power of layered heuristics. By combining static analysis, symbolic logic, dynamic tracing, and advanced meta-defenses, this analyzer successfully pushes the boundary of what can be practically decided, providing correct and safe answers for an overwhelming majority of real-world and adversarial programs.