Add unstructured, typed IR for AOT compiler #596

electrum · 2024-10-23T05:12:58Z

The IR uses unstructured control flow (i.e. jump/goto) and includes types for SELECT, DROP and LOCAL_TEE. It also directly encodes stack unwinding via the DROP_KEEP opcode. This simplifies JVM bytecode generation by extracting the control flow, stack unwinding, and stack/type tracking.

While this simplifies the compiler (at the expense of more code), the goal is to make method splitting simpler to implement, as the IR is easier to analyze than raw WebAssembly.

andreaTP · 2024-10-23T17:18:57Z

I want to do an in-depth review, as the PR is on the large side; give me a little time to get to it 🙏

I just skimmed through and have a first question: the TypeStack and the Analyzer now do a job very similar (in terms of structure) to the Validator. I'm wondering if we can reuse some code there, wdyt?
E.g. using a single implementation of the analyzeSimple methods seems doable; this would help keep the changes isolated when implementing new opcodes.

A note that we are reviewing the public API, and it's possible that we will create git conflicts, sorry for the noise.

electrum · 2024-10-23T17:53:11Z

I agree they are similar. We could possibly move the type annotations to AnnotatedInstruction and have the validator attach those. But that might be mixing concerns and make the code more complicated.

andreaTP

Sorry for the delay, I finally took the time to go through the code.
It's really great work @electrum 👍

I like the direction of those changes, everything becomes a bit easier to understand.
At the same time, there are a couple of details that might be cleaner with a little "design doc", such as:

what's GOTO and the expected semantics
etc.

I see that this code structure brings us closer to fixing the issues we have in the AOT, but I would feel more relaxed about it looking at a "target" PR, where the dots are connected and we can actually test out those fixes.
This approach will confirm that we are not missing necessary semantics.

andreaTP · 2024-10-28T11:01:49Z

aot/src/main/java/com/dylibso/chicory/aot/AotAnalyzer.java

+        }
+
+        // implicit block for the function
+        stack.enterScope(FUNCTION_SCOPE, FunctionType.of(List.of(), functionType.returns()));


Are we discarding functionType.params()?

This code seems to make use of them:
https://github.com/dylibso/chicory/pull/596/files#diff-89b97bafcb107230b3864b1fc861e71f0eef8784e78fed7987356111d89f632cR634

Good observation, but this is actually a "block type" rather than a "function type". The parameters for a block are on the stack when the block is entered. The function's implicit block is different, as function arguments are in locals rather than on the stack. Thus the stack is empty at the start of the function.

andreaTP · 2024-10-28T11:05:10Z

aot/src/main/java/com/dylibso/chicory/aot/AotAnalyzer.java

+        return Optional.of(new AotInstruction(AotOpCode.DROP_KEEP, operands.build().toArray()));
+    }
+
+    private FunctionType blockType(Instruction ins) {


blockType is also used for LOOPs, which use the params instead of the returns. Am I missing something?

This is how block types are encoded per the specification:

A structured instruction can consume input and produce output on the operand stack according to its annotated block type. It is given either as a type index that refers to a suitable function type, or as an optional value type inline, which is a shorthand for the function type [] -> [valuetype?].
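
To make the quoted rule concrete, here is a minimal sketch of how a block type resolves to a function type, and why a LOOP looks at the params when it is a branch target. All names (`BlockTypeSketch`, `FuncType`, `fromInline`, `fromIndex`, `branchOperands`) are hypothetical and are not Chicory's API; value types are plain strings for brevity.

```java
import java.util.List;

final class BlockTypeSketch {
    // Hypothetical stand-in for a function type: params -> returns.
    static final class FuncType {
        final List<String> params;
        final List<String> returns;

        FuncType(List<String> params, List<String> returns) {
            this.params = params;
            this.returns = returns;
        }
    }

    // Inline form: an optional value type is shorthand for [] -> [valuetype?].
    // A null argument models the "no value type" case.
    static FuncType fromInline(String valueType) {
        List<String> returns = valueType == null ? List.<String>of() : List.of(valueType);
        return new FuncType(List.<String>of(), returns);
    }

    // Index form: the block type refers to a function type in the module's type section.
    static FuncType fromIndex(List<FuncType> types, int typeIndex) {
        return types.get(typeIndex);
    }

    // A branch to a loop jumps back to its start, so it carries the block's params;
    // a branch to any other block jumps to its end, so it carries the returns.
    static List<String> branchOperands(boolean isLoop, FuncType blockType) {
        return isLoop ? blockType.params : blockType.returns;
    }

    public static void main(String[] args) {
        FuncType inline = fromInline("i32");
        System.out.println(inline.params + " -> " + inline.returns);
        System.out.println(branchOperands(true, inline));
        System.out.println(branchOperands(false, inline));
    }
}
```

Under this reading, a single block-type helper can serve both BLOCK and LOOP: the type is the same, and only the side consulted at a branch differs.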

andreaTP · 2024-10-28T11:11:39Z

aot/src/main/java/com/dylibso/chicory/aot/AotCompiler.java

-                    exitBlockDepth = ins.depth();
-                    if (ins.labelTrue() < idx) {
+                case GOTO:
+                    if (visitedTargets.contains(ins.operand(0))) {


Why are we skipping the interruption check here?
Can we document what's the rationale?

There's a comment above that I hoped would explain it

// track targets to detect backward jumps
Set<Long> visitedTargets = new HashSet<>();

If the target label has already been visited, then this is a backward jump. Otherwise, it is a forward jump. We document this in InterpreterMachine.checkInterruption():

Terminate WASM execution if requested. This is called at the start of each call and at any potentially backwards branches. Forward branches and other non-branch instructions are not checked, as the execution will run until it eventually reaches a termination point.

Should we reference this comment in AotMethods.checkInterruption()?

Should we reference this comment in AotMethods.checkInterruption()?

yes please 🙏
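
The backward-jump rule from this thread can be sketched in isolation as follows. This is an illustration under assumptions, not the code in AotCompiler: `BackwardJumpSketch`, `Ins`, `Kind`, and `backwardJumps` are hypothetical names, and each instruction is reduced to a kind plus one label operand.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class BackwardJumpSketch {
    enum Kind { LABEL, GOTO }

    // Hypothetical instruction shape: a kind plus a single label operand.
    static final class Ins {
        final Kind kind;
        final long label;

        Ins(Kind kind, long label) {
            this.kind = kind;
            this.label = label;
        }
    }

    // Returns the indices of GOTO instructions whose target label was already emitted.
    // Only those jumps can form a loop, so only they would need an interruption check.
    static List<Integer> backwardJumps(List<Ins> code) {
        Set<Long> visitedTargets = new HashSet<>();
        List<Integer> backward = new ArrayList<>();
        for (int i = 0; i < code.size(); i++) {
            Ins ins = code.get(i);
            if (ins.kind == Kind.LABEL) {
                visitedTargets.add(ins.label);
            } else if (visitedTargets.contains(ins.label)) {
                backward.add(i);
            }
        }
        return backward;
    }

    public static void main(String[] args) {
        List<Ins> code = List.of(
                new Ins(Kind.LABEL, 1),
                new Ins(Kind.GOTO, 2),  // forward: label 2 not seen yet
                new Ins(Kind.GOTO, 1),  // backward: label 1 already seen
                new Ins(Kind.LABEL, 2));
        System.out.println(backwardJumps(code)); // prints [2]
    }
}
```

Because the instructions are visited in emission order, a single pass with a set of seen labels is enough to classify every jump.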

andreaTP · 2024-10-28T11:15:35Z

aot/src/main/java/com/dylibso/chicory/aot/AotOpCode.java

+
+enum AotOpCode {
+    LABEL,
+    DROP_KEEP,


What's DROP_KEEP? What's the semantics?

The WABT IR has the same instruction. I've been thinking that UNWIND might be a better name as that describes the purpose and usage. The semantics can be seen in AotEmitters.DROP_KEEP(). What it does is to drop x number of intermediate stack values, keeping the last y values on the stack. This is how unwinds work.

Thanks for the explanation, we definitely need to preserve this comment somewhere
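
As a way of preserving the explanation, the DROP_KEEP semantics described above can be modeled on a plain list. This is a sketch of the stack effect only, not of the bytecode that AotEmitters.DROP_KEEP() generates; `DropKeepSketch` and `dropKeep` are hypothetical names, and the top of the stack is taken to be the end of the list.

```java
import java.util.ArrayList;
import java.util.List;

final class DropKeepSketch {
    // Removes `drop` values that sit directly below the top `keep` values,
    // leaving the kept values on top in their original order.
    static <T> void dropKeep(List<T> stack, int drop, int keep) {
        if (drop < 0 || keep < 0 || drop + keep > stack.size()) {
            throw new IllegalArgumentException(
                    "invalid drop/keep for stack of size " + stack.size());
        }
        int keepStart = stack.size() - keep; // index of the first kept value
        int dropStart = keepStart - drop;    // index of the first dropped value
        stack.subList(dropStart, keepStart).clear();
    }

    public static void main(String[] args) {
        // Top of stack is the last element.
        List<Integer> stack = new ArrayList<>(List.of(10, 20, 30, 40, 50));
        dropKeep(stack, 2, 1); // unwind two intermediate values, keep one result
        System.out.println(stack); // prints [10, 20, 50]
    }
}
```

This is the shape of a stack unwind on a branch: the values a block left behind are discarded, while the values the branch target expects stay on top.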

andreaTP · 2024-10-28T11:17:08Z

aot/src/main/java/com/dylibso/chicory/aot/AotEmitters.java

+        }
+
+        // drop intervening values
+        for (int i = keepStart - 1; i >= 1; i--) {


is this a stack unwind?

Yes, exactly, this is why I've been thinking to name the opcode UNWIND instead.

andreaTP · 2024-10-28T11:19:36Z

aot/src/test/resources/com/dylibso/chicory/approvals/ApprovalTest.verifyBrTable.approved.txt


  public static func_0(ILcom/dylibso/chicory/runtime/Memory;Lcom/dylibso/chicory/runtime/Instance;)I
    ILOAD 0
-    INVOKESTATIC com/dylibso/chicory/$gen/CompiledMachine$AotMethods.checkInterruption ()V


This is an interesting change, do you think it's not needed?

Yes, it's not needed because none of the switch targets are a backward jump.

The reason why I'm a bit unsure here is that we are not running the InterruptionTest on the aot compiled code.
But I'd prefer to keep the ball rolling rather than fussing over a detail that can be reverted at any time.

electrum · 2024-10-28T20:29:13Z

If you think this existing structure makes the compiler easier to understand and would like to merge it now, then I can create a design doc to help with understanding it. Otherwise, we can wait until the method splitting work is ready, which will make some changes to it.

I looked more at sharing the stack analyzer code with the validator, but I don't think it's worthwhile. The instruction handling is based on the specification and won't ever change, so it's better to have two simple implementations than a shared one that's more complicated.

andreaTP · 2024-10-29T08:47:46Z

If you think this existing structure makes the compiler easier to understand and would like to merge it now

Ok, agreed, otherwise it will generate a ton of rebasing work.

I can create a design doc to help with understanding it

Please 🙏 If possible, I'd prioritize the design doc over the full method splitting implementation; I'll be happy to have more people able to review and reason about the AOT compiler.

andreaTP · 2024-10-29T08:52:35Z

This is a single body of work, so I hope you don't mind if I squash everything together.
E.g. "Move utility methods to AotUtil" makes sense in isolation, but it makes much more sense as part of the bigger refactoring.

electrum added 3 commits October 22, 2024 12:05

Attach ASM verifier output to exception

6ee553f

Move utility methods to AotUtil

ff9bcda

Add unstructured, typed IR for AOT compiler

45a7afa

electrum requested a review from andreaTP October 23, 2024 13:26

andreaTP reviewed Oct 28, 2024

andreaTP merged commit 72a65cf into dylibso:main Oct 29, 2024
13 checks passed
