Emterpreter
The Emterpreter is an option that compiles asm.js output from Emscripten into a binary bytecode. It also generates an interpreter ("Emscripten interpreter", hence Emterpreter) capable of executing that bytecode. This lets you compile your project, or parts of your project, into bytecode that will be interpreted, as opposed to asm.js that will be executed directly by the JavaScript engine.
Why does this option exist? To provide an alternative in situations where normal direct execution by the JavaScript engine has issues. The two main motivations are:
- JavaScript must be parsed and compiled before it is executed, which can take a long time in large codebases, whereas a binary bytecode is just data, so you can get to the point of something executing earlier. Executing in an interpreter might be slower, but it can be better than nothing.
- JavaScript has high-level control flow (no gotos) and must be written as short-running events, not long-running synchronous code. However, sometimes you have code that is written in the latter form that you can't easily refactor. The Emterpreter can handle that, because running the code in an interpreter allows us to manually control the flow of execution, as well as pause and resume the entire call stack, letting us turn synchronous code into asynchronous code.
For more background on the Emterpreter, see
- (blogpost link)
- (blogpost link)
To use the Emterpreter, build with -s EMTERPRETIFY=1. This runs all code in the interpreter by default. You can also use the EMTERPRETIFY_BLACKLIST option to specify the only methods not to be interpreted, or EMTERPRETIFY_WHITELIST to specify the only methods that are to be interpreted.
As mentioned earlier, the Emterpreter and its binary bytecode load faster than JavaScript and asm.js can. Just building with the Emterpreter option gives you that, but it also makes the code run more slowly. A hybrid solution is to start up quickly in the Emterpreter, then switch to faster execution at full asm.js speed later. This is possible by swapping the asm.js module: first load the Emterpreted one, then load the fast one in the background and switch to it when it's ready.
To do this, build the project twice:
- Once with the Emterpreter option and SWAPPABLE_ASM_MODULE enabled. This is the module you will start up with, and swap out when the fast one is ready.
- Again to normal asm.js, then run tools/distill_asm.py infile.js outfile.js swap-in. The output, outfile.js, will be just the asm module itself. You can then load this in a script tag on the same page, and it will swap itself in when it is ready.
If you have a method that you know will only ever run exactly once, and doesn't need to be fast, you can run that specific method in the Emterpreter. As mentioned above, the JavaScript engine won't need to compile it, and the bytecode is smaller than asm.js, so both download and startup will be faster. Another example is exception-handling or assertion-reporting code: something that should never run, and if it does, is OK to run at a slower speed.
To do this, simply use the whitelist option mentioned before, with a list of the methods you want to be run as bytecode.
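As a minimal sketch of the kind of code this suits, the C file below has a run-once setup function and a cold error-reporting path next to a hot lookup function. The function names are hypothetical, and the build line in the trailing comment is an assumption: it supposes the whitelist is given as a JSON-style list of function names with a leading underscore, which you should check against your Emscripten version.

```c
#include <stdio.h>

static int squares[16];

/* Hypothetical run-once setup: fills a small table at startup.
 * It never runs again, so interpreting it costs little. */
int build_lookup_table(void) {
    int filled = 0;
    for (int i = 0; i < 16; i++) {
        squares[i] = i * i;
        filled++;
    }
    return filled;
}

/* Hypothetical cold path: error reporting that should never run. */
int report_fatal(const char *msg) {
    return fprintf(stderr, "fatal: %s\n", msg);
}

/* Hot path: not in the whitelist, so it stays as asm.js. */
int square_of(int i) {
    return squares[i & 15];
}

/* Assumed build line (list syntax may differ between versions):
 *   emcc app.c -s EMTERPRETIFY=1
 *     -s 'EMTERPRETIFY_WHITELIST=["_build_lookup_table","_report_fatal"]'
 */
```

With a build like this, only the two whitelisted functions would be emitted as bytecode, and square_of would run at normal speed.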
The emterpreter runs the code in an interpreter, which makes it feasible to manually control the call stack and so forth. To enable this support, build with -s EMTERPRETIFY_ASYNC=1. You can then write synchronous-looking code, and it will "just work", e.g.
while (1) {
  do_frame();
  emscripten_sleep(10);
}
is a simple way to write a main loop, where you would typically have to refactor your code to use emscripten_set_main_loop. Instead, in the emterpreter the call to emscripten_sleep will save the execution state, including the call stack, do a setTimeout for the specified number of milliseconds, and after that delay, reconstruct the execution state exactly as it was before. From the perspective of the source code, it looks like a synchronous sleep, but under the hood it is converted to a form that can work in a web browser asynchronously.
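The reason an interpreter can do this is that its execution state is ordinary data rather than the JavaScript engine's own stack. The toy model below illustrates that idea only; it is not the Emterpreter's actual design, and ToyState and toy_run are hypothetical names. It interprets a small counting loop that "sleeps" on every iteration by returning to the caller, with all progress kept in a struct so that a later call can resume where it left off.

```c
#include <stdbool.h>

/* Toy model only: names and layout are hypothetical, not the
 * Emterpreter's real data structures. */
typedef struct {
    int pc;       /* index of the next step to execute */
    int counter;  /* a "local variable" of the interpreted code */
} ToyState;

/* Interprets: while (counter < limit) { sleep(); counter++; }
 * Returns true if the program paused at a sleep, false if it finished.
 * All progress lives in *s, so the caller can return to the event
 * loop and call toy_run again later to resume. */
bool toy_run(ToyState *s, int limit) {
    for (;;) {
        switch (s->pc) {
        case 0: /* loop test */
            if (s->counter >= limit) return false;
            s->pc = 1;
            break;
        case 1: /* "sleep": record where to continue, then unwind */
            s->pc = 2;
            return true;
        case 2: /* resumed after the sleep */
            s->counter++;
            s->pc = 0;
            break;
        default:
            return false;
        }
    }
}
```

In a browser, each true return would correspond to scheduling a timer and returning to the event loop; the timer callback would then call toy_run again with the same state.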
For a list of the APIs that can be used in this synchronous manner, see the docs.
When using sleep in this manner, you can likely use the emterpreter blacklist very efficiently: only things that can lead to a call to sleep need to be emterpreted. In the example above, do_frame and everything it calls could be in the blacklist, and run at full asm.js speed.
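Putting the loop and the blacklist together, a complete translation unit might look like the sketch below. The run_loop name and its frame limit are hypothetical additions so that the sketch terminates, and the stub for emscripten_sleep exists only so the file also compiles with a native compiler. The build line in the trailing comment is an assumption about list syntax and should be checked against your Emscripten version.

```c
#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#else
/* Native stand-in so the sketch also compiles outside Emscripten. */
static void emscripten_sleep(unsigned int ms) { (void)ms; }
#endif

static int frames_done = 0;

/* Never has a sleep beneath it on the stack, so it is a candidate
 * for the blacklist and can run at full asm.js speed. */
void do_frame(void) {
    frames_done++;
}

/* Calls emscripten_sleep, so it must stay emterpreted, as must
 * whatever calls it (typically main). The frame limit is only here
 * so that the sketch terminates. */
int run_loop(int max_frames) {
    while (frames_done < max_frames) {
        do_frame();
        emscripten_sleep(10);
    }
    return frames_done;
}

/* Assumed build line (list syntax may differ between versions):
 *   emcc loop.c -s EMTERPRETIFY=1 -s EMTERPRETIFY_ASYNC=1
 *     -s 'EMTERPRETIFY_BLACKLIST=["_do_frame"]'
 */
```

Here run_loop and its callers stay in the interpreter because a sleep can happen while they are on the stack, whereas do_frame has always returned by the time the sleep starts.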
Warning: this option does not work if there is anything on the stack aside from emterpreted functions, as we can only save and restore emterpreter stack frames. When you build with -s ASSERTIONS=1, runtime checks are added to all compiled-but-not-emterpreted code, and they raise an error if such code is on the stack during an async save. This helps show you if you blacklisted something you shouldn't. However, we don't have a good way to check for non-compiled code. For example, if you have handwritten JS calling emterpreted code, and the emterpreted code tries to do an async save, things will fail in potentially confusing ways: the emterpreted code returns immediately (in order to wait for the asynchronous callback), and your handwritten code underneath it will then execute, not knowing that the code that just returned has not actually completed.
Building with EMTERPRETIFY_ADVISE will process the project and perform a static analysis to determine which methods should probably be run in the interpreter. This checks which methods could be on the stack underneath a call to a synchronous method, in which case they must be interpreted so that we can save and restore the stack later in an asynchronous way.
The analysis is pessimistic, in that function pointers can easily confuse it, so it might suggest methods that do not need to be interpreted in practice. You can take a more dynamic approach by building with -s ASSERTIONS=1 and seeing if you get an error message. It should report the stack when the synchronous operation happens, and you should then add everything on that stack to the functions to interpret.
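Conceptually, the static analysis is a reachability question on the call graph: which functions can, through some chain of calls, end up calling a synchronous function? The sketch below illustrates that idea on a tiny adjacency matrix. It is not the code that EMTERPRETIFY_ADVISE actually runs, and CallGraph and mark_must_interpret are hypothetical names.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_FUNCS 8

/* calls[a][b] is true when function a may call function b. */
typedef struct {
    int count;
    bool calls[MAX_FUNCS][MAX_FUNCS];
} CallGraph;

/* Marks every function that can reach sync_fn through some chain of
 * calls. Those are the ones that could be on the stack during a pause,
 * so in this model they are the ones to interpret. */
void mark_must_interpret(const CallGraph *g, int sync_fn, bool out[MAX_FUNCS]) {
    memset(out, 0, sizeof(bool) * MAX_FUNCS);
    bool changed = true;
    while (changed) {  /* iterate until no new function gets marked */
        changed = false;
        for (int a = 0; a < g->count; a++) {
            if (out[a]) continue;
            for (int b = 0; b < g->count; b++) {
                if (g->calls[a][b] && (b == sync_fn || out[b])) {
                    out[a] = true;
                    changed = true;
                    break;
                }
            }
        }
    }
}
```

In this model, a call through a function pointer that cannot be resolved would have to be recorded as an edge to every possible target, which is how an analysis of this kind ends up marking more functions than strictly necessary.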
ASYNCIFY is an earlier experiment on running synchronous code. It works quite differently: it does a whole-program analysis, then modifies all relevant methods so they can be saved and resumed. Comparing the two:
- ASYNCIFY has a bad worst-case of large code size: If it needs to modify many methods, it can grow code size very significantly (even 10x more was seen). The emterpreter on the other hand has a guarantee of having smaller code size than normal emscripten output, simply because emterpreter bytecode is smaller than JS source.
- ASYNCIFY is slower than normal emscripten output, but not hugely so, while the emterpreter can be 10-20x slower, because it interprets code. Using a blacklist with the emterpreter, this can be mitigated.
- There are some known bugs with ASYNCIFY on things like exceptions and setjmp. The emterpreter has not been tested on those features yet, so it's unclear if it would work.
- As the emterpreter is useful for other things than synchronous code, it will likely continue to be worked on, while the ASYNCIFY option currently does not have activity.
Stack traces when running the emterpreter can be a little confusing. Keep these things in mind:
- When non-emterpreted code calls into emterpreted code, it has to go through a "trampoline", a little function that just calls emterpret() with the location of the code to execute. That's why you'll see main() -> emterpret() in your stack traces; main() is just a trampoline.
- When calling between emterpreted code, there is an INTCALL opcode which does a direct call from emterpret() to another invocation of emterpret(). That means that you do see a stack trace of the right size, but the names are all the same. Invoke emcc with --profiling-funcs or --profiling to have the emterpreter take a slower path of calling through trampolines all the time. This is useful for profiling.
The bytecode is a simple register-based bytecode invented for this purpose, just enough to support the asm.js code that Emscripten emits. It is designed more for speed of execution and quick startup (no preprocessing necessary at all) than for size.
It also has a bunch of "combo" opcodes for things like test+branch, etc. See tools/emterpretify.py for the list of opcodes.
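To make "register-based" and "combo opcode" concrete, here is a toy interpreter in the same spirit. Every opcode name and the instruction layout are hypothetical, chosen for illustration; they are not taken from tools/emterpretify.py. Each instruction names the registers it reads and writes, and TOY_LTBR fuses a less-than test with a branch so the loop needs one dispatch per iteration for its condition instead of two.

```c
#include <stdint.h>

/* Hypothetical opcodes for illustration; the real list is in
 * tools/emterpretify.py and is not reproduced here. */
enum { TOY_SET, TOY_ADD, TOY_LT, TOY_BRT, TOY_LTBR, TOY_RET };

/* Fixed-width instruction: an opcode plus three operands. */
typedef struct { uint8_t op, a, b, c; } ToyInsn;

/* Runs code on eight registers and returns the TOY_RET operand.
 * The program must end in TOY_RET; there is no bounds checking. */
int32_t toy_exec(const ToyInsn *code) {
    int32_t r[8] = {0};
    int pc = 0;
    for (;;) {
        ToyInsn in = code[pc++];
        switch (in.op) {
        case TOY_SET:  r[in.a] = in.b; break;               /* ra = immediate b */
        case TOY_ADD:  r[in.a] = r[in.b] + r[in.c]; break;  /* ra = rb + rc */
        case TOY_LT:   r[in.a] = r[in.b] < r[in.c]; break;  /* ra = rb < rc */
        case TOY_BRT:  if (r[in.a]) pc = in.b; break;       /* branch if ra */
        /* Combo: compare and branch in one dispatch, no temp register. */
        case TOY_LTBR: if (r[in.a] < r[in.b]) pc = in.c; break;
        case TOY_RET:  return r[in.a];
        default:       return -1; /* unknown opcode */
        }
    }
}

/* Sums 0..4 with a do-while loop that uses the combo opcode. */
const ToyInsn sum_program[] = {
    {TOY_SET, 0, 0, 0},   /* r0 = 0  (sum) */
    {TOY_SET, 1, 0, 0},   /* r1 = 0  (i)   */
    {TOY_SET, 2, 5, 0},   /* r2 = 5  (n)   */
    {TOY_SET, 3, 1, 0},   /* r3 = 1        */
    {TOY_ADD, 0, 0, 1},   /* sum += i      */
    {TOY_ADD, 1, 1, 3},   /* i += 1        */
    {TOY_LTBR, 1, 2, 4},  /* if (i < n) goto 4 */
    {TOY_RET, 0, 0, 0},   /* return sum    */
};
```

Without the combo opcode, the loop condition would be a TOY_LT into a scratch register followed by a TOY_BRT on that register, so the trade is a larger opcode set in exchange for fewer dispatches.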