Performance and Dynamism in User-extensible Compiler Infrastructures

A research project in partial fulfilment of the requirements for a Computer Science MPhil at the University of Cambridge, supervised by Dr Tobias Grosser and co-supervised by Sasha Lopoukhine.

Abstract

MLIR is a modular compiler framework that provides core infrastructure for users to leverage and extend when implementing their own compilers. Its design is inherently dynamic, because the underlying heterogeneous data structure has a shape known only at runtime. This presents an optimisation boundary: the dynamic structures cannot be reasoned about precisely before runtime to guarantee the validity of optimisations, so static ahead-of-time compilation provides fewer benefits. Previous compiler frameworks accept the limitations of this boundary, leveraging only the remaining optimisations offered by static ahead-of-time compilation while still incurring the costs of long build times and reduced flexibility, which suggests dynamic languages might be more suitable. We examine the performance bottlenecks incurred by dynamic languages for code rewriting tasks in xDSL, a Python-native compiler framework inspired by MLIR. Using both traditional measurement techniques and a novel tool for profiling the performance of bytecode instructions, we find that both the inherent dynamism of these rewriting tasks over runtime heterogeneous data structures and modern interpreter optimisations narrow the performance gap between static and dynamic languages. Our research challenges the status quo of implementing user-extensible compiler frameworks in static, ahead-of-time compiled languages. Instead, we motivate the use of dynamic languages, demonstrating that they balance compilation performance with flexibility and fast build times.

Keywords: xDSL, MLIR, LLVM, Dynamic Programming Languages, Performance, User-extensible Compiler Infrastructure

Thesis

Presentation

Code

Over the course of the thesis, the author contributed a number of PRs to the xdsl project, including migrating it to uv, bug fixes and performance optimisations informed by the benchmarking and specialisation processes, and development of the PyAST frontend.

In addition, the author created the following repositories:

xdsl-bench -- the infrastructure and artefacts from benchmarking the xDSL compiler framework with airspeed velocity (asv)
bytesight -- a Python-native tracing performance profiler operating at the bytecode level, introduced in the fourth chapter of the thesis
llvm-project-benchmarks -- a fork of the source code for the benchmarks implemented by Mehdi Amini for his "How Slow is MLIR?" talk, provided as a resource for users trying to benchmark MLIR, as the talk provides neither instructions nor links to its source code. Additional benchmarks for direct comparison with xDSL are provided in the dev/ branch
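
To illustrate what tracing at the bytecode level can look like, the following is a minimal sketch built only on the standard library's `sys.settrace` hook and the per-frame `f_trace_opcodes` flag. It is not bytesight's implementation or API; `profile_opcodes` and `add_squares` are hypothetical names chosen for this example. The timings include the overhead of the trace callback itself, so they are indicative only.

```python
import dis
import sys
import time
from collections import Counter, defaultdict


def profile_opcodes(func, *args, **kwargs):
    """Run func under an opcode tracer (hypothetical helper, not bytesight's API).

    Returns (result, counts, seconds), where counts and seconds are keyed
    by opcode name and cover only frames executing func's own code object.
    """
    counts = Counter()
    seconds = defaultdict(float)
    target = func.__code__
    pending = None  # (opcode name, start time) of the instruction in flight

    def local_trace(frame, event, arg):
        nonlocal pending
        now = time.perf_counter()
        # Any event in a traced frame ends the previously started instruction.
        if pending is not None:
            name, started = pending
            seconds[name] += now - started
            pending = None
        if event == "opcode":
            # f_lasti is the byte offset of the instruction about to run.
            name = dis.opname[frame.f_code.co_code[frame.f_lasti]]
            counts[name] += 1
            pending = (name, time.perf_counter())
        return local_trace

    def global_trace(frame, event, arg):
        # Only trace frames that execute the target function's code.
        if frame.f_code is not target:
            return None
        frame.f_trace_opcodes = True  # request per-instruction events
        frame.f_trace_lines = False   # suppress per-line events
        return local_trace

    previous = sys.gettrace()
    sys.settrace(global_trace)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)
    return result, counts, dict(seconds)


def add_squares(n):
    """Toy workload: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


result, counts, seconds = profile_opcodes(add_squares, 10)
for name, count in counts.most_common(5):
    print(f"{name:<20} {count:>4} {seconds.get(name, 0.0):.6f}s")
```

The opcode names and counts reported depend on the Python version, since the bytecode instruction set changes between releases.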


EdmundGoodman/masters-project

Folders and files

Latest commit

History

Repository files navigation

Performance and Dynamism in User-extensible Compiler Infrastructures

Abstract

Thesis

Presentation

Code

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Packages