Implementing the language #68
Replies: 18 comments
-
In my opinion, get a sloppy prototype (or subset language) going, with another language as back-end (JS, C#, Java, whatever suits) - essentially neglecting performance and code quality. Then use that as a basis for building a proper, optimized, native, bootstrapped compiler in the language itself. 😎
-
It makes the most sense for us to write this in Rust because of all the
existing tooling:
Official borrow checker: https://github.com/rust-lang/polonius
TypeScript parser/linter: https://github.com/rslint/rslint
Obviously, having Rust be the host language will make it harder for some to
participate until we're self-hosting, but I don't think anyone wants to go
through the trouble of reimplementing the borrow checker in another
language.
…On Mon, Oct 25, 2021, 1:19 PM Rasmus Schultz ***@***.***> wrote:
In my opinion, get a sloppy prototype (or subset language) going, with
another language as back-end (JS, C#, Java, whatever suits) - essentially
neglecting performance and code quality. Then use that as a basis for
building a proper, optimized, native, bootstrapped compiler in the language
itself. 😎
-
I'm looking at starting to write the language now.
-
I've begun a sample language here:
-
Love your work Isaac! SuperSonicHub1 suggests we use Rust as a backend because it already has all of this tooling built into it - a sensible approach. So if you imagine the following passes:
BorrowScript Source -> BS AST -> Rust Source -> Binary
This essentially follows the same compilation flow as TypeScript for JavaScript: transpile the source into JavaScript. Generating the AST and transpiling to Rust source doesn't need to be written in Rust - it can be done in another language. Eventually the transpiler can be ported to BorrowScript.
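To make the "BS AST -> Rust Source" pass concrete, here is a minimal sketch written in TypeScript. The `ConstDecl` node shape, the type mapping and the function names are all assumptions made for illustration; nothing here is an agreed BorrowScript format, and only simple constant declarations are handled.

```typescript
// Hypothetical AST node for a constant declaration (not an agreed format).
type ConstDecl = {
  kind: "const";
  name: string;
  type: "string" | "number" | "boolean";
  value: string | number | boolean;
};

// Assumed mapping from BorrowScript-style type names to Rust types.
const RUST_TYPES: Record<ConstDecl["type"], string> = {
  string: "&str",
  number: "f64",
  boolean: "bool",
};

// Renders the value as a Rust literal. Only simple ASCII strings and
// plain decimal numbers are handled correctly by this sketch.
function emitRustLiteral(decl: ConstDecl): string {
  switch (decl.type) {
    case "string":
      return JSON.stringify(String(decl.value));
    case "number": {
      const n = Number(decl.value);
      return Number.isInteger(n) ? n.toFixed(1) : String(n);
    }
    case "boolean":
      return decl.value ? "true" : "false";
  }
}

// Wraps every declaration in a Rust `main` function, one `let` per node.
function emitRust(decls: ConstDecl[]): string {
  const body = decls.map(
    (d) => `    let ${d.name}: ${RUST_TYPES[d.type]} = ${emitRustLiteral(d)};`
  );
  return ["fn main() {", ...body, "}"].join("\n");
}

console.log(emitRust([{ kind: "const", name: "foo", type: "string", value: "foo" }]));
// fn main() {
//     let foo: &str = "foo";
// }
```

The generated text would then be handed to the Rust compiler for the final "Rust Source -> Binary" step.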
-
You're thinking of doing transpilation? That's an interesting idea, but I
think writing our own compiler will give us more control over the language.
…On Wed, Nov 10, 2021, 9:25 PM David Alsh ***@***.***> wrote:
Love your work Isaac! SuperSonicHub1 suggests we use Rust as a backend
because it already has all of this tooling built into it - a sensible
approach.
So if you imagine the following passes:
BorrowScript Source -> BS AST -> Rust Source -> Binary
This essentially follows the same compilation flow as TypeScript for
Javascript. Transpile the source into JavaScript.
-
The issue with directly transpiling to Rust or any other language is that, to make use of that language's features like the borrow checker, we would have to either let the user read that language's error messages directly or translate those error messages into a form useful for our language. I'm currently planning to initially target WASM with the language I've been writing. I've now got static analysis working for most trivial uses and have basic unions working, along with IO, while loops, if statements and some other bits and pieces, but it's all still currently interpreted.
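One part of that translation problem is pointing errors at the original source instead of the generated file. A rough sketch of the idea, assuming the transpiler records which generated line came from which source line; the `Diagnostic` shape and all names below are hypothetical, and real compiler diagnostics carry far more detail:

```typescript
// Hypothetical, heavily simplified diagnostic from the target compiler.
type Diagnostic = { line: number; message: string };

// Generated-file line number -> original source line number,
// assumed to be recorded by the transpiler while emitting code.
type LineMap = Map<number, number>;

function remapDiagnostic(d: Diagnostic, map: LineMap, file: string): string {
  const sourceLine = map.get(d.line);
  if (sourceLine === undefined) {
    // No mapping known: fall back to the generated location.
    return `<generated>:${d.line}: ${d.message}`;
  }
  return `${file}:${sourceLine}: ${d.message}`;
}

const lineMap: LineMap = new Map([[2, 1], [3, 2]]);
console.log(remapDiagnostic({ line: 3, message: "value used after move" }, lineMap, "main.bs"));
// main.bs:2: value used after move
```

This only fixes the location. The wording of the message would still be the target language's, which is the harder half of the concern raised above.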
-
Fully agree with you, but I think it's a practical route initially. Writing a complete compiler backend is quite the challenge and something I want to do for BS, but in the absence of resources I feel a practical starting point is code-to-code compilation (transpilation), especially since that covers a large chunk of the effort needed for a complete compiler anyway (in other words, it isn't wasted effort). That way we can prove the language semantics work (obviously they will, as it's just simplified Rust). It's then easier to appeal for more resources, as people would be interested in contributing to a working project. We would need to build half a transpiler regardless of whether we wrote our own backend, because we need a parser/AST generator either way. From there we can pipe the output into whatever.
The parser validates the source code, produces compilation/type checking errors, handles imports and generates a single file (kind of like a js bundle) which is an easy to interpret intermediary format to hand over onto a compiler or transpiler.
In all cases, we need to produce an AST. Writing a transpiler for a subset of the language at that point is a great launching point. Generating an AST requires parsing and is not trivial. For example:
const foo: string = 'foo'
would produce something like:
{
  "file": "main.bs",
  "contents": [
    {
      "token": "foo",
      "type": "string",
      "assignment": "constant",
      "value": "foo"
    }
  ]
}
You can see what I mean here: https://ts-ast-viewer.com/#code/MYewdgzgLgBAZiEAuG0BOBLMBzGBeGAcgREKA
So one application will read source files and produce an object format that we can then pipe into another program, which will use that structure to make something else. The second program might be a compiler, or it might convert the object representation into other source code (like Rust). It's worth noting that doing that means we get borrow checking for free from the Rust compiler. We just need to parse our source and produce valid Rust. Again, if it works, people will help out, and then we can look at a roadmap where we write our own compiler.
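As a rough illustration of the first program in that pipeline, here is a sketch in TypeScript that turns the one-line example into the object shape shown above. It is an assumption-heavy toy: it uses a single regular expression instead of a real tokenizer and parser, accepts only `const name: type = 'value'` lines, and the type and function names are made up for this example.

```typescript
// Field names copied from the example object; illustrative, not a spec.
type Declaration = {
  token: string;
  type: string;
  assignment: "constant";
  value: string;
};
type ParsedFile = { file: string; contents: Declaration[] };

// Matches only `const name: type = 'value'` with an optional semicolon.
const CONST_DECL = /^const\s+([A-Za-z_]\w*)\s*:\s*(\w+)\s*=\s*'([^']*)'\s*;?$/;

function parseFile(file: string, source: string): ParsedFile {
  const contents: Declaration[] = [];
  source.split("\n").forEach((raw, index) => {
    const line = raw.trim();
    if (line === "") return; // skip blank lines
    const m = CONST_DECL.exec(line);
    if (m === null) {
      // Anything outside the tiny supported subset is reported with its line.
      throw new Error(`${file}:${index + 1}: unsupported syntax: ${line}`);
    }
    contents.push({ token: m[1], type: m[2], assignment: "constant", value: m[3] });
  });
  return { file, contents };
}

console.log(JSON.stringify(parseFile("main.bs", "const foo: string = 'foo'"), null, 2));
```

The printed JSON has the same fields as the example object, so it could be piped straight into whichever second program consumes it.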
-
This depends on how the parser is implemented.
You can have the parser be smart, implementing borrow checking and error message generation, or you can just produce an AST from the source without caring too much (excluding exceptions) about how that AST is generated.
-
We should agree on what our AST target will look like so we can share our efforts on writing a parser. It's worth having a type contract for it so that if we write separate implementations, they will be interchangeable and we won't waste effort. I think being very clear on the segments involved in the compilation process and the expectations of what is generated allows us to produce a modular compiler that's easy to develop in pieces and allows innovators to find better ways to generate it in the future. So when we start building out our compiler, we have several binaries that make up our main binary
Then our main compiler does both of these actions
But this means we can add a module to our compiler that targets wasm
We can then write our own compiler
And each module knows what its job is.
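A type contract like the one described could look roughly like the sketch below. Everything in it is hypothetical: the `Ast` shape, the `Backend` interface and the two example backends are stand-ins to show how interchangeable modules might plug into one driver, not a proposal for the real format.

```typescript
// Hypothetical shared AST contract (the thread has not agreed on a real one).
type Ast = {
  file: string;
  contents: { token: string; type: string; value: string }[];
};

// Every backend module implements the same interface, so separate
// implementations stay interchangeable.
interface Backend {
  target: string;
  emit(ast: Ast): string;
}

// Toy backend: emits Rust constants, treating every value as a string.
const rustBackend: Backend = {
  target: "rust",
  emit: (ast) =>
    ast.contents
      .map((d) => `const ${d.token.toUpperCase()}: &str = ${JSON.stringify(d.value)};`)
      .join("\n"),
};

// Toy backend: dumps the AST unchanged, useful for debugging the front end.
const jsonBackend: Backend = {
  target: "json",
  emit: (ast) => JSON.stringify(ast),
};

const backends = new Map<string, Backend>(
  [rustBackend, jsonBackend].map((b): [string, Backend] => [b.target, b])
);

// The main compiler only looks up a module by target name and delegates.
function compile(ast: Ast, target: string): string {
  const backend = backends.get(target);
  if (backend === undefined) throw new Error(`unknown target: ${target}`);
  return backend.emit(ast);
}

const sampleAst: Ast = {
  file: "main.bs",
  contents: [{ token: "foo", type: "string", value: "foo" }],
};
console.log(compile(sampleAst, "rust"));
// const FOO: &str = "foo";
```

Under this arrangement, adding a wasm module or a hand-written compiler later would mean registering one more `Backend` without touching the parser.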
-
@alshdavid Silly but sincere question: do you plan to implement this, or is this repo only a specification for others to implement?
-
It's been a while, sorry. I've started back at work and have been working on other projects. For the comments above: a shared AST is definitely a good idea eventually, but what I've currently got is not usable for it. I think I need to do a massive refactor soon that will change its structure entirely.
-
Isn't borrow checking a function of the parser and the resulting AST? It should be statically determinable whether an illegal borrow has occurred, so the parser can call it out. This wouldn't have to be passed down to the underlying compiler (i.e. relying on Rust to determine that an illegal borrow happened). Anything that constitutes valid syntax should guarantee that there are no illegal borrows. Part of the AST's metadata should be which symbols or references are available moving into a given scope and which are left unused (not moved or borrowed) after moving out of that scope. The ones left are the ones that can be garbage collected. This would mean it would be possible to write a transpiler for any language and still retain the guarantees provided by the borrow checker.
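A very small sketch of that kind of static check, tracking which symbols still own a value as statements are walked in order. The statement forms and names are hypothetical, and this covers only use-after-move in straight-line code; branches, loops, borrows and lifetimes are what make a real borrow checker hard and are not modelled here.

```typescript
// Hypothetical statement forms for a toy language (not BorrowScript's AST).
type Stmt =
  | { op: "declare"; name: string; line: number }
  | { op: "move"; from: string; to: string; line: number }
  | { op: "use"; name: string; line: number };

// Walks statements in order and reports uses of symbols that no longer own a value.
function checkMoves(stmts: Stmt[]): string[] {
  const live = new Set<string>();
  const movedAt = new Map<string, number>();
  const errors: string[] = [];

  // Records an error and returns false if `name` cannot be used on `line`.
  const requireLive = (name: string, line: number): boolean => {
    if (live.has(name)) return true;
    const where = movedAt.get(name);
    errors.push(
      where === undefined
        ? `line ${line}: '${name}' is not declared`
        : `line ${line}: '${name}' was moved on line ${where}`
    );
    return false;
  };

  for (const s of stmts) {
    switch (s.op) {
      case "declare":
        live.add(s.name);
        movedAt.delete(s.name);
        break;
      case "move":
        if (requireLive(s.from, s.line)) {
          live.delete(s.from);
          movedAt.set(s.from, s.line);
          live.add(s.to);
          movedAt.delete(s.to);
        }
        break;
      case "use":
        requireLive(s.name, s.line);
        break;
    }
  }
  return errors;
}

console.log(
  checkMoves([
    { op: "declare", name: "a", line: 1 },
    { op: "move", from: "a", to: "b", line: 2 },
    { op: "use", name: "a", line: 3 },
    { op: "use", name: "b", line: 4 },
  ])
);
// [ "line 3: 'a' was moved on line 2" ]
```

Whatever is still in `live` at the end of a scope corresponds to the symbols "left" in the description above, i.e. the ones whose values can be released there.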
-
In that case you're practically implementing your own borrow checker though, so you wouldn't be taking advantage of the existing one. I've completed a lot more work on my current implementation.
-
@Isaac-Leonard incredible work! I have had lots of fun looking through your project's source code - would love to incorporate it into an official compiler. As far as implementation is concerned, I am ratifying the last few bits of the language semantics and am keen to start working on a compiler. After I have a very clear picture of the language, I will put up a projects board, set up GitHub Actions to automate releases and begin writing a compiler. Isaac has been writing his implementation in Rust - I was thinking about using Go because it might be easier for contributors (and for automating cross-compilation of the project releases), but Rust is also a great choice.
-
Hey, thank you
-
Also just letting you know I plan to publish to crates.io in the next week or so.
-
Ok so I've spent a little while learning how LLVM works so I am pretty confident that it's an achievable compile target for the first release. I would like to keep the compiler architecture modular such that it will be open to different compiler backends (compile targets). This means the compiler will be split into 3 pieces.
flowchart LR
subgraph Front End
A[AST Generator - from BS Source] -->|BS AST| B[LLVM IR Generator]
end
subgraph Back Ends
B --> |LLVM IR| C[Generate Binary from LLVM IR - LLVM Wrapper]
end
This keeps the door open for future projects where the AST generator is reused and alternative backends can be targeted. Might see cool community projects targeting esoteric hardware or producing more optimised binaries.
flowchart LR
subgraph Front End
A[AST Generator - from BS Source] -->|BS AST| B[LLVM IR Generator]
A -->|BS AST| C[C Source Generator]
A -->|BS AST| D[x86/ARM64 Assembly Generator]
end
subgraph Back Ends
B --> |LLVM IR| E[Generate Binary from LLVM IR - LLVM Wrapper]
C --> |C Source| F[Generate Binary from C code - GCC Wrapper]
D --> |x86 Assembly| G[Generate Binary from Assembly - Assembler Wrapper]
end
The compiler will be split into multiple binaries, each responsible for one step of the process. The parent binary will tie them all together.
Technically you can run the full sequence
However the
The reason I want to split this into separate executables is that it allows for clearer separation between the pieces, and if one component doesn't work out well in the chosen language, we can rewrite that component in a better-suited language. It also makes it easier for newcomers to understand the flow and find a way to interject in the process. I am thinking of writing this in Go, as it's an easy language to work in and there are LLVM bindings for it. In time it would be cool to rewrite the compiler in BorrowScript, once it can be used for that purpose. My next step is to come up with an unsafe, manually memory-managed variant of the language to create a minimal compiler with. EDIT: I am also thinking about changing the language name and adding a mascot. BorrowScript is cool, but people have associated it with being an interpreted language rather than a compiled one.
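For a feel of what the "LLVM IR Generator" stage does, here is a toy sketch in TypeScript that lowers an integer expression tree into textual LLVM IR. The `Expr` shape and function names are assumptions for illustration only, and this stage emits plain text rather than going through any LLVM bindings; a separate step (the "LLVM Wrapper" in the diagrams) would still be needed to turn the IR into a binary.

```typescript
// Hypothetical expression AST: integer literals and binary arithmetic only.
type Expr =
  | { kind: "int"; value: number }
  | { kind: "binary"; op: "+" | "-" | "*"; left: Expr; right: Expr };

// LLVM integer instructions corresponding to each operator.
const OPCODES = { "+": "add", "-": "sub", "*": "mul" } as const;

// Emits textual LLVM IR for a `main` function returning the expression's value.
function emitLlvmIr(expr: Expr): string {
  const lines: string[] = [];
  let next = 0;

  // Returns the operand (a constant or a %temporary) holding the value of `e`.
  const lower = (e: Expr): string => {
    if (e.kind === "int") return String(Math.trunc(e.value));
    const left = lower(e.left);
    const right = lower(e.right);
    const tmp = `%t${next++}`;
    lines.push(`  ${tmp} = ${OPCODES[e.op]} i32 ${left}, ${right}`);
    return tmp;
  };

  const result = lower(expr);
  return ["define i32 @main() {", "entry:", ...lines, `  ret i32 ${result}`, "}"].join("\n");
}

// (1 + 2) * 3
console.log(
  emitLlvmIr({
    kind: "binary",
    op: "*",
    left: { kind: "binary", op: "+", left: { kind: "int", value: 1 }, right: { kind: "int", value: 2 } },
    right: { kind: "int", value: 3 },
  })
);
// define i32 @main() {
// entry:
//   %t0 = add i32 1, 2
//   %t1 = mul i32 %t0, 3
//   ret i32 %t1
// }
```

Because the stage boundary is just text, the same AST could be fed to a C or assembly generator instead, which is the point of keeping the pieces separate.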
-
Has any work been done yet on actually writing a compiler implementation?
I've had some minor experience with writing interpreters and would be interested in helping.