Idea: run code analysis of independent files in parallel #13733
Replies: 3 comments 3 replies
-
The problem with parallelization of implementation files(without sig files) is type inference. F# uses a single-pass for type-checking, with PostInferenceChecks being the next one. Both do inference solving. You could, in theory, parallelize implementation files that do not depend on each other, but the only way you could prove it is by type-checking - text search heuristics would be brittle in the face of ambiguity. There may be other ways to do parallelism, but the cost of proving that it is safe is going to be more expensive than just doing the type-checking. The only way I know that proves that it is safe with practically zero cost is if there are signature files present. I believe that is the best we can do without creating something more complicated and expensive. It is understandable for users not to create signature files as they feel quite redundant, which is why I was hoping that someday we could easily generate these files in the editors. |
Beta Was this translation helpful? Give feedback.
-
Two things I want to add: In fsharp/src/Compiler/Driver/ParseAndCheckInputs.fsi Lines 163 to 172 in 4330fa1 The idea was to create a dependency graph from that to see what files could be processed in parallel. Currently, that list of parsed inputs is processed using a fold. let results, tcState =
(tcState, inputs)
||> List.mapFold (TypeCheckOneInputEntry(ctok, checkForErrors, tcConfig, tcImports, tcGlobals, prefixPathOpt)) One key problem would most likely be to update |
Beta Was this translation helpful? Give feedback.
-
This is pretty much a duplicate of #11634 , except for the scope of parallelisation discussed - my naive thinking is that further stages could be parallelised similarly. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Introduction
At my company the performance of F# code analysis is a major issue when working on any F# code. I'm looking at potential ways of enabling parallelisation wherever possible, as a cheap way of reducing wall-clock time of code analysis significantly.
There were some discussions earlier on the F# slack about the possibility of running code analysis/type checking of multiple files inside a project in parallel.
Me and @nojaf discussed this a bit recently and are looking for some opinions.
The general idea is that given A.fs and B.fs (B listed after A), to allow B.fs to have some amount of type-checking performed before A.fs type checking finishes - if we're certain that B does not depend on A in any way.
Existing work on optimising analysis of fsi files with backing signature files
Related to this is @nojaf’s initial work to revive @TIHan 's optimisation PR - that postpones (and parallelises) initial type-checking of .fs files with backing .fsi files.
However, we were trying to work out how difficult it would be to generalise that solution beyond fs/fsi files, that would:
Issues we considered
How far could the parallelisation go?
.fs/.fsi optimisation mentioned above only optimises the first phase of code analysis/type-checking. Can the later phases be optimised in a similar manner? (probably a stupid question, sorry).
Need to keep track of multiple
TcState
objects.The above optimisation retains the invariant of having just one
TcState
that's iterated over and mutated as files are analysed sequentially, and the parallel type-checking of.fs
files at a later stage does not mutate it (did I remember this correctly @nojaf?). This avoids any concurrency issues.If we were to create a graph of file dependencies processed on multiple worker nodes in parallel, they would require multiple instances of
TcState
which would in turn require either:val merge : TcState -> TcState -> TcState
orTcStateDelta type
andval add : TcState -> TcStateDelta -> TcState
Heuristic for detecting file dependencies
Regarding the heuristic, we think that it would be relatively easy to create a
val dependsOn : File -> File -> bool
function that would be:A very simple heuristic could be to use text search. That would possibly have a high rate of false positives.
However, @nojaf is of an opinion that a fairly accurate heuristic can be built based on untyped AST and satisfy all the above requirements.
Summary
I believe that this sort of optimisation, if possible, could provide a massive speedup of both code analysis and compilation on multi-core machines (how much depends on the scope of the optimisation w.r.t. type-checking stages).
Are there any show-stoppers that make this impossible/extremely hard?
@vlad @TIHan @KevinRansom
Beta Was this translation helpful? Give feedback.
All reactions