a Simple Hackable Interpreter
The goal of this project is to implement the same simple, hackable interpreter core in multiple host languages. For educational purposes, and to allow comparing performance and optimal implementation strategies.
I can't remember a time when I didn't feel the urge to design and implement my own programming languages, to gain a deeper understanding and make it possible to build my own tools.
I started out with template engines and formula evaluators, but as I gained more experience the goal posts moved further and further towards general purpose languages.
Most tutorials unfortunately start with parsing, which I consider to be the least interesting aspect. It wasn't until I discovered Forth that I managed to implement something I would consider using myself before running ut of motivation.
I then went through a long period of messing around with those ideas, adding more and more elaborate features on top until I ended up with something more resembling Lisp.
Along the way I kept switching host languages, moving ideas from language to language in an attempt to cut them down to their core and find optimal implementation strategies.
This project in many ways represents the culmination of that work, and is offered with the hope of helping other developers get started designing and implementing their own languges.
The content is openly licensed via CC BY-NC-ND 4.0.
I've decided to use an open license to benefit as many as possible, because I believe knowledge should be shared freely.
But I also believe in compensation for creators; and the less economic pressure I have to deal with, the more time and energy I can put into the project.
Please take a moment to consider chipping in if you like the idea. Nothing is free in this world, your contribution could make a big difference.
The repository is set up for sponsoring via Stripe and Liberapay, alternatively you may use BTC (bitcoin:18k7kMcvPSSSzQtJ6hY5xxCt5U5p45rbuh) or ETH (0x776001F33F6Fc07ce9FF70187D5c034DCb429811).
Python3 is used as a performance baseline.
fact 0.28790313
fib1 0.29931953
fib2 0.44976327
Java
fact 0.86500816
fib1 0.92455279
fib2 1.14576449
The language implemented out of the box is a strictly prefix, dynamically typed scripting language. The syntax is easy to replace/extend without touching other parts of the implementation.
method fib (n Int)
if < n 2
n
else
+ fib - n 1
fib - n 2
Macros and methods have a fixed number of arguments, parens may be used to group expressions.
say (1 2 3)
1 2 3
The following types are provided.
The root of all types.
The type of bindings (values in registers).
The boolean type has two values, T
and F
.
Signed 64-bit integers.
The type of macros like method
and if
.
The type of methods like +
and -
, as well as user defined methods like fib
.
Durations of time.
The following macros are provided, adding more is trivial.
Repeat body
rounds
times and push elapsed time on stack.
Evaluate actual
and compare to expected
, throw an exception if they're not equal.
Evaluate expr1
if cond
is truthy, else expr2
(if provided).
Define a new method with specified name
, arguments
and body
. Arguments may optionally be suffixed with a type, which is checked when the method is called; defaults to Any
if not provided.
The following methods are provided, adding more is trivial.
Add x
to y
and push the result on stack.
Subtract x
from y
and push the result on stack.
Multiply x
by y
and push the result on stack.
Push T
on stack if x
equals y
, otherwise F
.
Push T
on stack if x
is less than y
, otherwise F
.
Push T
on stack if x
is greater than y
, otherwise F
.
Print what
followed by newline to standard output.
The VM is primarily stack based, using registers for bindings.
Note that all implementations allow easily adding new operations from user code.
Evaluate rounds
times from the next operation to end pc
and push elapsed time on stack.
Call target method. Host methods are called directly while script methods push an entry on the call stack and jump to the start of the method.
Pop value from stack and compare to expected
, throw an exception if they're not equal.
Pop value from stack and continue evaluating if it's truthy, otherwise jump to the end of the branch.
Get value from source register
and push on stack.
Jump to target pc
.
Push value
on stack.
Pop count
values from stack and put in target register
s.
Pop entry from call stack and jump to its return pc.