[May 10] Formal Verification of a Realistic Compiler #326

alaiasolkobreslin · 2022-05-03T16:31:59Z

This is a thread for Formal Verification of a Realistic Compiler.

@anshumanmohan and @alaiasolkobreslin will be leading the discussion!

5hubh4m · 2022-05-06T03:55:03Z

I loved reading this paper, it’s so well written! I don’t have much to say about the contents except that I’m a big fan of verification and let’s do more of it — especially for systems software like compilers and OSes (thinking of efforts like seL4 and CertiKOS)!

0 replies

JonathanDLTran · 2022-05-07T23:43:16Z

I enjoyed the concept of the fully verified compiler that this work demonstrates. I'm interested in how modular the components of the compiler are, and whether the compiler is extensible enough that new optimizations can be integrated easily once they are added and proved. With more extensibility, I wonder if it would be possible to have other contributors work on passes and verify these passes independently, then plug them into the general framework.

I was also curious if the author had expanded the work so that larger benchmarks could be tested, beyond those selected. For instance, it would be interesting to see if real-world avionics embedded programs could be compiled via CompCert, as the author does mention that PowerPC was chosen because it is used often for avionics software.

3 replies

5hubh4m May 7, 2022

I would guess those programs are extremely closed source.

anshumanmohan May 8, 2022

Thanks for your comment!

We can talk more about the ease of integration of new passes/features on Tuesday, but in brief: it is possible to state your behavior (e.g. in Gallina, the specification language inside of Coq), and then get CompCert to generate a proof obligation where you must show that your code does nothing too naughty. I played with this feature some in past research :)
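
To give a rough feel for what such an obligation says, here is a minimal sketch in Python rather than Coq. Everything in it (the toy expression language, `fold_constants`, `preserves_semantics`) is invented for illustration and is not CompCert's code or API. The property it checks is the usual semantic-preservation statement: the optimized program evaluates to the same result as the original. The sketch only tests that on a finite grid of inputs, whereas a Coq proof would cover every expression and every environment.

```python
from dataclasses import dataclass
from itertools import product
from typing import Union


# A toy expression language, made up for this sketch (not a CompCert IR).
@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Var, Add]


def evaluate(e: Expr, env: dict) -> int:
    """Reference semantics: the value of an expression in an environment."""
    if isinstance(e, Const):
        return e.value
    if isinstance(e, Var):
        return env[e.name]
    return evaluate(e.left, env) + evaluate(e.right, env)


def fold_constants(e: Expr) -> Expr:
    """The hypothetical new pass: collapse Add(Const, Const) into one Const."""
    if isinstance(e, Add):
        left, right = fold_constants(e.left), fold_constants(e.right)
        if isinstance(left, Const) and isinstance(right, Const):
            return Const(left.value + right.value)
        return Add(left, right)
    return e


def preserves_semantics(e: Expr, names: list, values: range) -> bool:
    """Test evaluate(fold_constants(e)) == evaluate(e) on a finite grid.

    This is testing, not proof: a mechanized proof would quantify over
    all expressions and all environments.
    """
    optimized = fold_constants(e)
    for combo in product(values, repeat=len(names)):
        env = dict(zip(names, combo))
        if evaluate(optimized, env) != evaluate(e, env):
            return False
    return True


program = Add(Add(Const(1), Const(2)), Var("x"))
print(fold_constants(program))
print(preserves_semantics(program, ["x"], range(-3, 4)))
```

In the verified setting, the body of `preserves_semantics` is replaced by a theorem, and the pass is only accepted once that theorem is proved.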

The avionics dream is alive! CompCert is released both as an academic project and as a commercial tool (see here). Airbus is doing top secret stuff with it.

sampsyo May 8, 2022
Maintainer

Never having worked with CompCert myself, I can't give as authoritative an answer to the "how extensible is it?" question as @anshumanmohan can. But I feel the need to point out that tacking stuff onto CompCert (optimizations, backends, etc.) is sort of a "cottage industry" of research unto itself these days. That doesn't indicate that it's easy, but it does indicate that it's possible.

A couple of random examples: support for floating point and instruction scheduling.

andrewb1999 · 2022-05-10T01:42:42Z

CompCert is work that I have known about for many years now, but it was good to get a better understanding of the inner workings through this paper. Compilers seem like a near perfect place for formal verification techniques to be applied as they have the potential to reduce bugs across a wide range of applications. However, outside of the world of safety-critical software, it's kinda surprising how good our compilers have become without formal methods for reducing bugs. The methodology of breaking up the compiler into a sequence of composable passes on an IR makes it significantly easier to write correct compilers without needing a formal proof.

There has also been recent work on extending CompCert for High-Level Synthesis. HLS tools are highly error prone compared to traditional compilers, and the bugs they produce can be much more challenging to debug, often requiring long simulations. It would be amazing to have a fully verified HLS tool in production, as it would essentially cut RTL behavioral simulation out of pre-silicon verification entirely. Unfortunately, I have some doubts about verified tools being able to reach the performance of non-verified tools, for practical reasons. Formally verifying the correctness of a compiler is hard and time consuming, eating into time that could have been spent improving parts of the compiler. Code generated by tools like Coq can also be pretty clunky and slow at times, although this has certainly gotten better. Maybe we can get around this in some way by constraining formal verification to some particularly error-prone portion of compilation, rather than providing an end-to-end guarantee.

2 replies

anshumanmohan May 10, 2022

Agree bigtime that compilers are a good application of formal methods; they have a powerful amplifying effect. This is probably a good counter to the oft-stated argument that formal verification is perfectly nice but far too laborious.

Alaia and I would love for you to say a little more about HLS/RTL tomorrow, if you don't mind!

chhzh123 May 10, 2022

Wow, the Vericert paper is really cool!!! Although the performance result is really bad (27x slower compared with LegUp), it is still the first fully verified HLS tool and deserves credit.

I also agree that formal verification is rather tedious and costs a lot of time. According to the CompCert authors,

The whole Coq formalization and proof represents 42,000 lines of Coq (excluding comments and blank lines) and approximately three person-years of work.

Three years is actually a large amount of time. For traditional software, programmers can iterate through several versions and greatly improve the performance, but it is hard for verification to keep up this pace. This significantly hinders verified software from having broader applications.

gsvic · 2022-05-10T03:37:28Z

Being particularly interested in databases, I am usually curious to see how the concepts we learn in this course connect to a query compiler/execution engine, as a compiler is usually the backbone of any database system. Looking for similar work on how verification methods could be applied to query compilers, I came across Q*cert (link). This looks particularly interesting, as it is also applied in some popular systems like Spark SQL, and there is an interesting demo at this link: https://querycert.github.io/. It also looks like it can be used to verify various query optimizations, like filter pushdown.
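
For anyone unfamiliar with filter pushdown, here is a minimal Python sketch of the equivalence that such a verification would have to establish. The tables, column names, and helper functions are all made up for illustration; none of this is Q*cert's actual interface. The rewrite moves a selection below a join, which is only valid when the predicate mentions columns from just one side of the join.

```python
def join(left, right, key):
    """Naive nested-loop equi-join on a shared key column."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def select(rows, pred):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if pred(row)]


# Hypothetical toy data.
users = [
    {"uid": 1, "country": "US"},
    {"uid": 2, "country": "FR"},
    {"uid": 3, "country": "US"},
]
orders = [
    {"uid": 1, "total": 30},
    {"uid": 2, "total": 12},
    {"uid": 3, "total": 7},
    {"uid": 1, "total": 99},
]


def is_us(row):
    # The predicate reads only a column of `users`; that is the side
    # condition that makes the pushdown legal.
    return row["country"] == "US"


# Original plan: join everything, then filter.
naive = select(join(users, orders, "uid"), is_us)

# Rewritten plan: filter `users` first, then join the smaller input.
pushed = join(select(users, is_us), orders, "uid")

print(naive == pushed)
```

A test like this checks one database instance; a verified optimizer would instead prove that the two plans agree on every instance satisfying the side condition.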

1 reply

orkosinha May 10, 2022

If the target of query compilers is source code (which I'm assuming from the demo page), what are the performance benefits of using something like this when creating the query plan, compared with compiling that source code with CompCert?

charles-rs · 2022-05-10T03:58:17Z

Really interesting paper, and a great read!

One thing that I found particularly interesting was the large number of IRs. The reason given was that it was easier to verify translations between different IRs than between different subsets of the same IR. This is something I found myself wondering about in 4120, where we translate between two subsets of the same IR: it seems like it might be a more type-safe approach to just have two distinct (but similar) languages. This seems to be the complete opposite of LLVM's approach, where it's really just one IR, and EVERYTHING operates on that same language.
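
A small Python sketch of the "two distinct languages" idea, with all names invented for illustration (this is neither CompCert's nor the 4120 compiler's IR). The source IR is a nested tree, the target IR is a flat list of three-address instructions, and the translation's return type can only mention target instructions. With a single IR, "this program is in lowered form" would instead be an invariant that each later pass has to assume or re-check.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


# Source IR: nested expression trees.
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Plus:
    left: "Tree"
    right: "Tree"


Tree = Union[Num, Plus]


# Target IR: flat instructions over numbered temporaries.
@dataclass(frozen=True)
class LoadImm:
    dst: int
    value: int


@dataclass(frozen=True)
class AddReg:
    dst: int
    lhs: int
    rhs: int


Instr = Union[LoadImm, AddReg]


def lower(tree: Tree) -> Tuple[List[Instr], int]:
    """Translate a Tree into target code; return the code and result temp."""
    code: List[Instr] = []

    def go(t: Tree) -> int:
        if isinstance(t, Num):
            dst = len(code)  # each instruction defines one fresh temporary
            code.append(LoadImm(dst, t.value))
            return dst
        lhs = go(t.left)
        rhs = go(t.right)
        dst = len(code)
        code.append(AddReg(dst, lhs, rhs))
        return dst

    result = go(tree)
    return code, result


def eval_tree(t: Tree) -> int:
    """Semantics of the source IR."""
    if isinstance(t, Num):
        return t.value
    return eval_tree(t.left) + eval_tree(t.right)


def run(code: List[Instr], result: int) -> int:
    """Semantics of the target IR."""
    regs = {}
    for ins in code:
        if isinstance(ins, LoadImm):
            regs[ins.dst] = ins.value
        else:
            regs[ins.dst] = regs[ins.lhs] + regs[ins.rhs]
    return regs[result]


tree = Plus(Num(1), Plus(Num(2), Num(3)))
code, result = lower(tree)
print(run(code, result) == eval_tree(tree))
```

Because each IR has its own semantics function, the correctness statement for the pass relates `eval_tree` to `run` directly, which is roughly the shape of a per-pass simulation argument.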

2 replies

sampsyo May 10, 2022
Maintainer

An unverified rumor I have heard is that this “many IR” design is a personal style of Xavier’s (for which there is good evidence in the form of his other most famous work, i.e., the OCaml compiler).

susan-garry May 10, 2022

I also found this to be rather peculiar. I wonder how much more difficult it would be to verify translations between different subsets of the same IR, since this would seem to be the more common practice today. For instance, if we could verify translations between different optimization passes in LLVM, the multiplier effect of being confident in our code's correctness would be far larger. (Although then again, LLVM is so popular that it gets tested more than other IRs simply because it is used more, so it may have fewer bugs to begin with, possibly weakening the power of formal verification).

chhzh123 · 2022-05-10T04:21:58Z

I also heard about the CompCert project a long time ago; it is such a famous milestone for formal verification. Relatedly, I know the NSF funded a $10 million, five-year grant to support the project Expeditions in Computing: The Science of Deep Specification (DeepSpec, for short). I guess the project is coming to an end since it was launched in 2016, so I wonder whether there have been any improvements in formal verification of systems software over the past several years. Has the development effort been reduced? One possible direction, I guess, is to leverage program synthesis to automatically generate the proofs (not sure if someone has tried this idea :) )

1 reply

andreyyao May 10, 2022

Automated proof generation seems quite difficult...

andreyyao · 2022-05-10T05:22:39Z

I was pleasantly surprised to find out this paper is one of the readings for this class. I'm taking CS4160 (Formal Verification), which uses Coq, and we actually had an entire lecture about CompCert. It was mentioned that CompCert initially had some bugs in its parser, which was generated by Menhir and not proven correct, which I thought was funny. By the way, Andrew Appel recently gave a guest lecture to the same class on Verifiable C (he seems to like Princeton a lot). Combining Verifiable C and CompCert gives (theoretically, if everything else goes right) completely correct code, which is pretty wild to think about.

I suppose the way people can use it is to compile the source code using CompCert and then say, "btw this assembly is emitted by CompCert so you should be more confident that it's correct!" However, given a program in assembly, to really trust it, CompCert would probably need to attach a digital certificate or fingerprint to the program. Would CompCert work better when the program is distributed in source form for this reason?

2 replies

sampsyo May 10, 2022
Maintainer

(Can’t quite tell if “he seems to like Princeton a lot” is a joke… 🤔)

andreyyao May 10, 2022

Only half a joke. He was saying stuff like "even smarter than a Princeton PhD" and his team also named their SML compiler "SML of New Jersey".

yy665 · 2022-05-10T14:30:51Z

This is definitely a good read! I also heard about CompCert a long while ago, although I had no idea how it was done or what exactly it allows us to do. It appears to me that this work is extensible enough and already has a lot of support from a range of communities, so it already has very good utility.

I am wondering, after years of development in compiler verification, what percentage of translations in real-world compilers (especially large, open-source compilers) are actually verified?

By the way, regarding Andrew Appel, I remember he mentioned other tools that can be used as alternatives to CompCert in the verification chain. Could anyone compare CompCert and other approaches?

2 replies

sampsyo May 10, 2022
Maintainer

The other big one is CakeML. There is also CertiCoq.

yy665 May 10, 2022

Thanks for the pointer!

tonyjie · 2022-05-11T21:53:14Z

An interesting read! I found this comment about CompCert from this paper pretty interesting.

The striking thing about our CompCert results is that the middleend bugs we found in all other compilers are absent. As of early 2011, the under-development version of CompCert is the only compiler we have tested for which Csmith cannot find wrong-code errors. This is not for lack of trying: we have devoted about six CPU-years to the task. The apparent unbreakability of CompCert supports a strong argument that developing compiler optimizations within a proof framework, where safety checks are explicit and machine-checked, has tangible benefits for compiler users.

This really shows the strength and power of CompCert from a testing perspective.

0 replies

michaelmaitland · 2022-05-12T20:10:42Z

I really enjoyed this paper and also the presentation / discussion. I feel that there is a really good justification for using a verified compiler and I think that their approach is pretty powerful.

I wonder how compilers will be impacted by advancements in other areas of CS such as AI and Formal Verification.

0 replies

anshumanmohan · 2022-05-12T20:17:54Z

Yesterday CompCert won the ACM Software System Award! Previous winners include GCC, Coq, LLVM, make, the World Wide Web, TCP/IP, and UNIX.

https://twitter.com/TheOfficialACM/status/1524405396269678592

0 replies
