Description
Much of `plan` is spent repeatedly retrieving the same hooks, in particular for composite components and especially when they are nested. It would be nice to calculate these only once, for example by storing them in an extra private field created by `@compdef`, as with geometry/graph/schematic.
This isn't quite that simple, though, because `hooks` needs to return a named tuple, and the component might be immutable. I was partial to working with pure `NamedTuple`s because of their immutability, but this is a significant enough cost that it could be worth relaxing that. Some options:
1. Force composite components to be mutable so we can initialize hooks lazily (bad)
2. Force the composite component constructor to create the hooks (don't like the complexity)
3. Carry hooks around in `Dict`s instead and access those directly (makes sense but is breaking)
4. Use a `Dict` for "caching" (without worrying about invalidation), so it gets populated the first time `hooks` is called but still allows `hooks(comp).hookname` access as a less efficient alternative to `hooks(comp, :hookname)` (fine; this can even still be used for non-composite components if users can implement either `hooks` or `_hooks`, where only the latter will use the cache; sketched below)
5. Cache the hooks in the schematic; this actually seems quite reasonable and could even be used with top-level schematics (although invalidation is more important in that context)
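To make option 4 concrete, here is a minimal sketch, assuming a hypothetical package-internal cache keyed by the component object (`HOOK_CACHE` is not a real name in the package); the same idea would also work with an extra private `Dict` field added by `@compdef`, since a `Dict` field can be mutated even inside an immutable struct.

```julia
# Minimal sketch of option 4 (not the package's actual implementation).
# HOOK_CACHE is a hypothetical internal cache; the hooks/_hooks split is
# the one described above: component authors implement _hooks, and only
# that path goes through the cache.
const HOOK_CACHE = IdDict{Any,NamedTuple}()

# Expensive path: computes the full named tuple of hooks (e.g., by walking
# a composite component's subcomponents). Implemented per component type.
function _hooks end

# Cached path: populated on the first call; works for immutable components
# because the mutable state lives in the Dict, not in the component.
hooks(comp) = get!(() -> _hooks(comp), HOOK_CACHE, comp)

# Single-hook access via hooks(comp, :hookname) just indexes the cached tuple,
# so hooks(comp).hookname access also keeps working.
hooks(comp, name::Symbol) = getproperty(hooks(comp), name)
```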
I'm favoring option 5 for now but will keep the breaking `Dict` change in mind.
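For comparison, a rough sketch of option 5, where the cache lives in the schematic rather than with the component; `Schematic`, `components`, and `hook_cache` here are placeholder names for illustration, not the real fields or API.

```julia
# Rough sketch of option 5: cache hooks per schematic node, using placeholder
# names (Schematic, components, hook_cache) rather than the package's own.
struct Schematic
    components::Vector{Any}            # stand-in for whatever the schematic stores per node
    hook_cache::Dict{Int,NamedTuple}   # node index => cached hooks for that node's component
end

Schematic(components) = Schematic(components, Dict{Int,NamedTuple}())

# First access computes hooks via the existing per-component method and stores
# them; later lookups for the same node hit the cache. Invalidation is ignored
# here, as in the Dict-caching option above.
hooks(sch::Schematic, node::Int) =
    get!(() -> hooks(sch.components[node]), sch.hook_cache, node)

# Named access goes through the same cache.
hooks(sch::Schematic, node::Int, name::Symbol) = getproperty(hooks(sch, node), name)
```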
How much is this worth? In the QPU example we have these timings after compilation:
```
Assembling schematic graph: 0.004190 seconds (39.59 k allocations: 3.764 MiB)
Floorplanning: 0.700999 seconds (5.49 M allocations: 371.339 MiB, 8.49% gc time, 0.79% compilation time)
Schematic design rule checking: 0.100096 seconds (908.97 k allocations: 43.558 MiB, 13.28% gc time)
Generating crossovers: 1.949626 seconds (2.18 M allocations: 161.057 MiB, 0.68% gc time)
Ground-plane hole fill: 0.607943 seconds (5.14 M allocations: 335.287 MiB, 4.62% gc time)
Rendering to polygons: 0.699862 seconds (5.78 M allocations: 299.489 MiB, 3.03% gc time)
Flattening cells: 0.054393 seconds (273.14 k allocations: 21.864 MiB)
Total: 4.122284 seconds (19.88 M allocations: 1.210 GiB, 3.28% gc time, 0.13% compilation time)
Saving: 0.412568 seconds (190.17 k allocations: 49.931 MiB)
```
Ignoring crossovers, it could be a 20-25% speedup (just guessing; simple profiling doesn't show how much of the work is repeated). That's still only half a second at this scale, but with deeper nesting and larger designs it becomes practically significant.