-
-
Notifications
You must be signed in to change notification settings - Fork 133
Open
Description
At some point I'll either run out of steam or get bored so I'm dumping my notes here so someone else can try these idea out.
See #1735 for the current best performance
SyntaxPrinter
- Replace Linq with imperative code
- See also
Doc.Join
- See also
- Use
Func<T,PrintingContext, Doc>more to prevent closure creation - Flatten concat usage, nested concat uses memory and ruins locality
- A lot of
Doc.Joinare inside a concat with a leading or trailing separator (0.2MB on complex benchmark)
- A lot of
- Try and remove
Doc.Nullusage, probably slows stuff down when printing/ fitting- Maybe create
ConcatNotNull(- It might be worth always doing this inside
Concator maybe only for certain problematic methods.
- It might be worth always doing this inside
- I feel like
InvocationExpressionmay benefit from de nulling and flattening nested concats
- Maybe create
- I estimate that around 1/6 - 1/5 of all Docs are Null, this is bad for printing, serialising and performance
DetermineLayoutmight be able to use syntaxkind instead of is Syntax- Do a dumb search on the code as a string for #if and then run
PreprocessorSymbols(12% of our library code total time) HasLeadingCommentMatchingcould do a quick check for minimum length, avoid loop, code load or scary regex.- Would its span be relevant here, would also have to pass the regex length
Token.PrintSyntaxTokencheck how often it short circuits. It might be worth moving the logic to an external function to prevent code load.- See above for
PrivatePrintLeadingTrivia - Should use
inkeyword everywhere, but am waiting for more changes to be merged to make performance confirmation easier AppendCommentStringReadercould be a static shared instance (maybe) or replaced with a span version, not sure what the benefit of this would be- Looking at it it seems like a pretty easy rewrite to use spans
- Could just use
EnumerateLines, it literally looks like a drop in superior version StringReadercould be replaced inSyntaxNodeComparer
- Scared to look at
BinaryExpression - GetLeading/Trailing trivia can be replaced with a fast check for empty
- Using directive better sort
- Could try method inline
RawSyntaxKindor replace some calls with raw kind comparison NITPICK doubt it will do anything - Could replace a lot of Doc.Null with additions by either filtering all concat/group creation into a vlb,
- Or could use some collection expression weirdness
- Call it ConcatNotNull
- Good for AttributeList
- Doc.Concat(whenTrue) / Doc.Concat(whenFalse) ????
- ConditionalGroup could contain two Docs because it never uses more than two items
- Conditional Expressoin has a redundant condition
- InvocationExpression doesn't need to call .AsSpan().ToArray() on the printed nodes, it can instead have a VLB of nodes and a VLB of break indexes
- It might be worth creating a dedicated type to keep logic the same
- This would probably be useful with Concat.ManyChildren as many concat groups in an invocation chain will likely have less that 5 values
- Off by one errors may be a big issue
- Modifier ToArray is unneeded to sort, could use stackalloc and span sort or array pool, it will likely escape but we can use concat children, prolly saves 50KB on complez
- Pool Dictionary<string, GroupMode> saves 170KB
- PrintSyntaxToken VLB is oversized
- Using SyntaxToken.TrailingTrivia or LeadingTrivia internally calls the internal greennode which will use both GetTrailingTriviaCore and GetLeadingTriviaCore,
- Has comments ends up calling each method two times each.
- Scan for
csharpier-ignoreand write toPrinterContextwe can then avoid regex checks - Noticed 170KB of Comparison from
Modifiers, I have no idea why a delegate is created for eachArray.Sortcall - AddLeadingTrivia allocates a
StringWriter, .NET 10 might stack allocate this
Doc Printer
- Goto statements in
RunOn,DocFitteretc. Literally the one time its recc DocFittercould have a cache where it remembers the final length of concats and short circuit, probably won't work due to different states- Discrete type discriminator would be nice, C# 15 may add discriminant unions which may help
- Might work if size is constrained to 16 bytes with additional info being heap allocated and pointed to
- lots of indirection, will be tricky and likely not useful will speed up RunOn and DocFitter
- When reverse iterating through
Doc.Concatit may help toSpan.Copyand then useSpan.Reversein place- This will only be useful for larger average concats
- Won't work with Concat.Two/Three may need to be special cased/ a dedicated abstract method added ie bool
TryCopyToReversed(Span<Doc>)
- Split
DocPrinter,FitterandRunOninto the main most common path and a secondary check function ieStringDoc,Concat,Groupfalling back to the secondary one, might help code load / happy path ContainsBreak, early exit with StringDoc maybeElementAt, Either useUnsafeAccessor.ElementAtat orValueListBuilder- Seeing a lot of
StelItemcalls, I assume it's coming fromStackI wonder if using VLB will stop this- Can't see where else it's from, stacktrace is a bit weird I assume
Stack.Pushis being inlined causing the weird backtrace
- Can't see where else it's from, stacktrace is a bit weird I assume
IncrementIndentcould be a list for instant lookup time instead of hashing shenanigans (maybe sparse list???? if worried about some weird length)- Groups could use an integer instead of a string, faster, smaller, arguably better than
Guidfor debugging- Ask why
Guidis used sometimes but anintis used otherwise, debugging?? - An incremental Int is prolly better for humans than a different
Guideach time, in most cases it will be consistent or off by one - Id is used as dictionary key, ints are much faster as keys (RunOn and Fits are hotspots rn)
- Ask why
Node Comparer
SyntaxNodeComparer is run when a file is formatted, I think it runs in about half the time to format, not sure if it can be improved
- Might be able to do a simple compare text before running, this would be relevant for updates to csharpier where the file cache is invalidated but nothing changes in the new formatted code
- I think I'll save this for last
- might be able to reuse
Stackor estimate capacity - AllButLast is good, might be possible to create a CompareEnumerable, don't know if the pain of boxing/enumerable allocations offsets the benefit of not calling
ToArray CompareListsboxes, this might be avoided with generics, It's always a little janky I've never found this consistentSyntaxNodeComparercallsToLista lot, it might be better ifToArraywhere used both to remove theListallocation and to avoid allocating when there are no items in the collectionCompareResultindenation is off inSyntaxNodeComparer- I'm pretty sure the indentation is straight wrong.
SyntaxNodeComparercallsCSharpSyntaxTree.ParseTexton original text for a second time, this is pretty slow- OrderBy
- Could use the non boxing
ToArrayand the sort, this is not a stable sort so this might not work - Check for length before any boxing nonsense
- Only call
OrderByskipToArrayand create aCompareEnumerable- still boxes might be faster imo
- Could use the non boxing
- Mystery
Func<SyntaxToken, SyntaxToken, CompareResult>allocations- No idea why this is a thing, driving me nuts, its probably Compare doing something weird. Maybe method group conversion, I swear Linq does this all the time why would it allocate a Func
- Maybe a stateful vs static thing breaking things?
- I imagine
isusage inCompareis slow, we used SyntaxKind before, not sure if it was worth the effort
originalToken.ValueText.Replacecould do a span thing where it iterates backwards removing\ror comparing the two, probably not worth the hassle - Apply
SkipLocalsInitto everything? No point zeroingCompareResultbecause it always assigned to - Why do we have two stacks? Have one stack with a tuple of 4
SyntaxNode - Could try passing
CompareResultviainorref. Not sure how large a benefit it would be- TBF it's 224 due to padding
- Does
TextSpan?have to be nullable, can't it just be default - Perhaps all methods could return a bool and any
TextSpanare assigned to a property inSyntaxNodeComparer
CSharpier.CLI
- Would love to see a benchmark on a real world CSharpier cache file, how large is it, how long does it take to deserialise.
- If this is a major issue CSharpier could start reading files async while the file is being serialised / read.
- This logic would be a pain it's pretty much a pipeline
- How fast is the hash function, if its a problem (unlikely) perhaps CSharpier should try a faster version I don't think Xxhash32 is vectorised.
- xxh3 is faster
- XxHash32.Hash has the option to return an int this way we can save a whopping 32 bytes!!! lmao
- I did wonder if TPL dataflow could be used for reading a bunch of files.
- Maybe
Parallel.ForEachAsyncmight work
- Maybe
Metadata
Metadata
Assignees
Labels
No labels