-
Notifications
You must be signed in to change notification settings - Fork 44
3863 llvm term cache #3882
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
3863 llvm term cache #3882
Conversation
This PR introduces a cache for the terms returned from calls to `Booster.LLVM.Internal.API.simplify`. Unpacking now uses a term store for the unpacked terms, and recognises shared terms by a shallow index into this store, using `TermF Int` as the map key (where the `Int` are indexes of symbol application/injection arguments in the term store). In small targeted tests using requests from MX-backend proofs , memory consumption was noticeably reduced, also resulting in better performance.
KEVM performance run
|
= BTerm Term | ||
| BPredicate Predicate | ||
= BTerm Idx | ||
| BPredicate Idx -- Predicate -- problem?!? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the problem here that we will use the same store for terms and predicates in resolve -> mkSymbolApplication
? I think this is fine, as the terms and predicates indeed consist of the same bits and pieces ultimately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should remove that comment, it was not a problem after all, we just wrap the Bool
term into a Predicate
at the end instead of during the procedure.
DomainValueF sort payload -> DomainValue sort payload | ||
VarF v -> Var v | ||
InjectionF s1 s2 i -> Injection s1 s2 (fromStore i) | ||
other -> error $ "Unexpected shallow term " <> show other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm feeling a little uneasy about this call to error
, but I do agree that it will not ever be triggered by construction of the shallow terms in resolve -> mkSymbolApplication
.
_ -> fail "Expecting a single term on the top of the stack" | ||
runDecodeM version mDef $ | ||
decodeBlock mbSize >>= \case | ||
[BTerm trmIdx] -> fmap stripRawTerm $ getTerm trmIdx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like magic to me!
We have a single call to decodeTerm
in the Booster.LLVM
module, as far as I know. Do I understand correctly that the cache lifetime is confined to decodeTerm'
?
Also, the getTerm trmIdx
call will return the fully resolved term? The subterms of that term may be present in the store before, but only once?
I think I know the answers, just want a confirmation.
Big brain @jberthold !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a single call to
decodeTerm
in theBooster.LLVM
module, as far as I know. Do I understand correctly that the cache lifetime is confined todecodeTerm'
?
Correct, the cache starts empty in runDecodeM
called here. We could carry a cache around across calls but then we would have to check for any unpacked term whether or not we already have it in the cache (term comparisons, but not recursive deep terms). This is good enough I think.
Also, the
getTerm trmIdx
call will return the fully resolved term? The subterms of that term may be present in the store before, but only once?
Yes, the term is already fully resolved in registerTerm
before it is stored in the termStore
(...hopefully, because otherwise any inconsistent data would sit there as a thunk waiting to explode).
I think I know the answers, just want a confirmation.
Big brain @jberthold !
😊 not my idea originally, the credit goes to Virgil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see the speedup!
I left some comments/questions, but they are for my understanding only.
Regarding the Kotnrol tests: I'm running them right now on my PR. Perhaps my out-of-date Nix cache allows me to do that? I'll try running the Kontol tests on this PR later.
@@ -501,12 +501,9 @@ llvmSimplify term = do | |||
withTermContext result $ | |||
emitEquationTrace t Nothing (Just "LLVM") Nothing $ | |||
Success result | |||
toCache LLVM t result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have another cache access in llvmSimplify, that will actually also write to cache too. Should we remove it as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that one should stay, the two here are duplicates. We need to cache the LLVM result for performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
I've ran the Kontrol tests, and the speedup is there too:
some noise is introduced by |
The EDIT: clearly your results are much better 😏 not sure what went wrong there. I was using the
|
This PR introduces a cache for the terms returned from calls to
Booster.LLVM.Internal.API.simplify
. Unpacking now uses a term store for the unpacked terms, and recognises shared terms by a shallow index into this store, usingTermF Int
as the map key (where theInt
are indexes of symbol application/injection arguments in the term store).In small targeted tests using requests from MX-backend proofs , memory consumption was noticeably reduced, also resulting in better performance.
Currently, the cache only lives for the duration of one LLVM call (not across different calls), no global variables or unsafe IO is required.
Fixes #3863