You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[flang] improve DITypeAttr caching with recursive derived types (#146543)
The current DITypeAttr caching for derived type debug metadata
generation strategy is not optimal. This turns out to be an issue for
compile times in apps with very very complex derived types like CP2K
See the added debug-cyclic-derived-type-caching-simple.f90 test for more
details about the duplication issue.
As a real world example justifying the new non trivial caching strategy,
in CP2K, emitting debug type info for the swarm_worker_type` in swarm_worker.F
caused 1,747,347 llvm debug metadata nodes to be emitted instead of 8023
after this patch (200x less) leading to noticeable compile time
improvements (I measured 0.12s spent in `AddDebugInfo` pass instead of
7.5s prior to this patch).
The main idea is that caching is now associating to the cached
DITypeAttr tree for a derived type a list of parent nodes being referred
to recursively via indices in this DITypeAttr.
When leaving the context of a parent node, all types that were cached
and linked to this parent node are cleared from the cache.
This allows more reusage in sub-trees while still fulfilling the MLIR
requirements that DITypeAttr types referring to a parent DITypeAttr via
integer id should only be used inside the DITypeAttr of the parent.
Most of the complexity comes from computing the "list of parent nodes"
by merging the ones from the components.
This is made is such a way that the extra cost for apps without
recursive derived type is minimal because the extra data structure
should not require extra dynamic allocations when they are no or little
recursion.
Example:
Take the following type graph (Fortran source for it in the added
debug-cyclic-derived-type-caching-complex.f90).
A is the tope level types, and has direct components of types B, C, and
E.
There are cycles in the type tree introduced by type B and D.
Types `C` and `E` are of interest here because they are in the middle of
those cycles and appear in several places in the type tree. There
occurrences is labeled in brackets in the order of visit by the
DebugTypeGenerator.
```
A -> B -> C [1] -> D -> E [1] -> F -> G -> B
| | | |
| | | | -> D
| | |
| | | -> H -> E [2] -> F -> G -> B
| | |
| | |-> D
| |
| | -> I -> E [3] -> F -> G -> B
| | |
| | |-> D
| | -> C [2]
|
| -> C [3] -> D
| -> E [4] -> F -> G -> B
|
| -> D
```
With this patch, E[2] and E[3] can share the same DITypeAttr as well as
C[1] and C[2] while they previously all got there own nodes.
To be safe with regards to cycles in MLIR, a DITypeAttr created for a
node N2 under a node N1 being recursively referred to and above the
recursive reference to N1 shall not be used above N1 in the DITypeAttr
tree. It can however be used in several places under N1.
Hence here:
-E[2] cannot reuse E[1] DITypeAttr because D appears above and under
E[1].
-E[3] can reuse E[2] DITypeAttr because they are both under B and above
D.
-E[4] cannot reuse E[3] DITypeAttr because it is above B.
This is achieved by this patch because when visiting A and reaching B,
the recursive reference to B is registered in the visit context. This
context is added D when going back-up in F. So when reaching back E[1]
with the information to build its DITypeAttr, its recursive references
are known and saved along the DITypeAttr in the cache.
When reaching back D, the cache for E is cleared because it is known it
depended on D. A new DITypeAttr is created after E[2], and this time it
only depends on B because the D under E[2] is not a recursive reference
(D is not above E[2]). Hence, when reaching E[3] it can be reused, and
the cache entry for E[2] is cleared when reaching B, which leads to a
new DITypeAttr to be created for E[4].
0 commit comments