Add syntactic filtering of side conditions after simplification #3870

geo2a · 2024-05-15T13:35:39Z

This PR introduces a heuristic in the equation application process:

after simplifying the predicates in the requires clause, see if an of them is syncatically present in the path condition (which is known to be true) and delete them

geo2a · 2024-05-17T08:23:12Z

I've deleted the KEVM performance table as out-of-date. Will post a new one.

geo2a · 2024-05-21T13:54:38Z

I have three recently re-enabled KEVM proofs failing on the feature branch:

FAILED src/tests/integration/test_prove.py::test_pyk_prove[mcd/flopper-tick-pass-rough-spec.k]
FAILED src/tests/integration/test_prove.py::test_pyk_prove[mcd/flopper-dent-guy-diff-tic-not-0-pass-rough-spec.k]
FAILED src/tests/integration/test_prove.py::test_pyk_prove[mcd/flopper-dent-guy-same-pass-rough-spec.k]

Will investigate mcd/flopper-tick-pass-rough-spec.k, as it seems to fail relatively quickly (140.20s)

geo2a · 2024-05-21T16:07:55Z

The mcd/flopper-tick-pass-rough-spec.k fails because Booster gets unknwon from z3 when trying to decide a branch condition:

[booster][execute][term c784633][rewrite 0ee7a2c][detail] EVM.jumpi.true
[booster][execute][term c784633][rewrite 0ee7a2c][match][success] rewrite 0ee7a2c
...
[booster][execute][term c784633][rewrite 0ee7a2c][smt] Checking 1 predicates, given 34 assertions and a substitution of size 0
[booster][execute][term c784633][rewrite 0ee7a2c][smt] Predicates to check: _==Int_(_/Int_(chop(_)_WORD_Int_Int(_*Int_(#lookup(_,_)_EVM-TYPES_Int_Map_Int(VarACCT_ID_STORAGE:SortMap{}, "5"), #lookup(_,_)_EVM-TYPES_Int_Map_Int(VarACCT_ID_STORAGE:SortMap{}, _+Int_(keccak(_)_SERIALIZATION_Int_Bytes(_+Bytes__BYTES-HOOKED_Bytes_Bytes_Bytes(#buf(_,_)_BUF-SYNTAX_Bytes_Int_Int("32", VarABI_id:SortInt{}), "\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL...truncated")), "1")))), #lookup(_,_)_EVM-TYPES_Int_Map_Int(VarACCT_ID_STORAGE:SortMap{}, _+Int_(keccak(_)_SERIALIZATION_Int_Bytes(_+Bytes__BYTES-HOOKED_Bytes_Bytes_Bytes(#buf(_,_)_BUF-SYNTAX_Bytes_Int_Int("32", VarABI_id:SortInt{}), "\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL...truncated")), "1"))), #lookup(_,_)_EVM-TYPES_Int_Map_Int(VarACCT_ID_STORAGE:SortMap{}, "5"))
[booster][execute][term c784633][rewrite 0ee7a2c][smt] Check of Given ∧ P and Given ∧ !P produced (Sat,Unknown)
[Error#booster] SMT returned `Unknown'

and crashing. I've opened an issue to add retries in Booster.

The proof can go on if given a longer timeout, and we see in the logs that the predicate actually would get actually:

...
[booster][execute][term c784633][rewrite 0ee7a2c][smt] Predicates to check: _==Int_(_/Int_(chop(_)_WORD_Int_Int(_*Int_(#lookup(_,_)_EVM-TYPES_Int_Map_Int(VarACCT_ID_STORAGE:SortMap{}, "5"), #lookup(_,_)_EVM-TYPES_Int_Map_Int(VarACCT_ID_STORAGE:SortMap{}, _+Int_(keccak(_)_SERIALIZATION_Int_Bytes(_+Bytes__BYTES-HOOKED_Bytes_Bytes_Bytes(#buf(_,_)_BUF-SYNTAX_Bytes_Int_Int("32", VarABI_id:SortInt{}), "\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL...truncated")), "1")))), #lookup(_,_)_EVM-TYPES_Int_Map_Int(VarACCT_ID_STORAGE:SortMap{}, _+Int_(keccak(_)_SERIALIZATION_Int_Bytes(_+Bytes__BYTES-HOOKED_Bytes_Bytes_Bytes(#buf(_,_)_BUF-SYNTAX_Bytes_Int_Int("32", VarABI_id:SortInt{}), "\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL\NUL...truncated")), "1"))), #lookup(_,_)_EVM-TYPES_Int_Map_Int(VarACCT_ID_STORAGE:SortMap{}, "5"))
[booster][execute][term c784633][rewrite 0ee7a2c][smt] Check of Given ∧ P and Given ∧ !P produced (Sat,Unsat)

geo2a · 2024-05-22T07:05:35Z

After increasing the timeout to 1 second, mcd/flopper-tick-pass-rough-spec.k still fails. The other two tests pass.

I think we should discuss this further and possible wait on merging until we have #3884

geo2a · 2024-05-22T07:06:46Z

KEVM performance numbers, excluding mcd/flopper-tick-pass-rough-spec.k:

Test	georgy-second-side-condition-filterin time	master-0f3148308 time	(georgy-second-side-condition-filterin/master-0f3148308) time
mcd/vat-muluu-pass-spec.k	105.48	116.19	0.9078233927188226
mcd/flopper-kick-pass-rough-spec.k	153.49	169.05	0.9079562259686483
mcd/end-pack-pass-rough-spec.k	195.32	212.04	0.921146953405018
benchmarks/requires01-a0gt0-spec.k	50.91	54.0	0.9427777777777777
examples/solidity-code-spec.md	122.11	127.34	0.958928851892571
benchmarks/encode-keccak00-spec.k	59.6	62.02	0.9609803289261528
mcd/end-cash-pass-rough-spec.k	303.58	315.39	0.9625542978534513
benchmarks/keccak00-spec.k	56.98	59.15	0.9633136094674556
kontrol/test-expectcalltest-testexpectstaticcall-0-spec.k	54.44	56.39	0.9654194006029437
benchmarks/storagevar02-nooverflow-spec.k	52.1	50.17	1.0384692047040063
erc20/ds/transferFrom-success-2-spec.k	98.47	94.56	1.041349407783418
mcd/flopper-cage-pass-spec.k	62.85	60.23	1.0434999169848913
kontrol/test-allowchangestest-testfailallowcallstoaddress-0-spec.k	58.01	55.58	1.0437207628643397
erc20/ds/transferFrom-success-1-spec.k	103.61	97.09	1.0671541868369554
erc20/ds/approve-success-spec.k	75.24	70.16	1.072405929304447
mcd/vat-addui-pass-spec.k	141.23	131.34	1.0753007461550175
erc20/ds/transfer-success-1-spec.k	86.57	79.04	1.0952682186234817
erc20/ds/transfer-success-2-spec.k	83.56	75.88	1.1012124406958357
mcd/vat-addui-fail-rough-spec.k	81.57	69.68	1.170637198622273
erc20/hkg/approve-spec.k	61.74	52.58	1.1742107265119819
erc20/hkg/transferFrom-success-1-spec.k	79.44	67.55	1.1760177646188008
erc20/hkg/transferFrom-success-2-spec.k	75.37	63.91	1.1793146612423722

geo2a · 2024-05-22T10:08:05Z

Kontrol performance:

Test	georgy-second-side-condition-filterin time	master-0f3148308 time	(georgy-second-side-condition-filterin/master-0f3148308) time
FreshCheatcodes.test_address()	43.23	58.98	0.7329603255340793
AddrTest.test_notBuiltinAddress_symbolic(address)	55.83	65.31	0.8548461185117133
StructTypeTest.test_vars((uint8,uint32,bytes32))	35.72	40.84	0.8746327130264445
ConstructorTest.run_constructor()	96.39	109.73	0.8784288708648501
BlockParamsTest.testFee(uint256)	38.48	42.76	0.8999064546304958
BlockParamsTest.testChainId(uint256)	38.68	42.89	0.9018419211937514
AccountParamsTest.testDealSymbolic(uint256)	43.01	47.68	0.9020553691275167
BlockParamsTest.testRoll(uint256)	38.48	42.54	0.9045604137282557
ExpectRevertTest.testFail_expectRevert_false()	134.45	145.31	0.9252632303351455
FreshBytesTest.test_symbolic_bytes_length	103.04	109.31	0.9426401976031471
BytesTypeTest.test_bytes4(bytes4)	32.35	34.22	0.9453535943892462
BlockParamsTest.testWarp(uint256)-trace_options2	30.48	32.09	0.9498286070426923
HevmTests.prove_require_assert_false	70.82	74.2	0.9544474393530996
CounterTest.testSetNumber(uint256)	145.76	152.02	0.9588212077358241
AccountParamsTest.testNonceSymbolic(uint64)	94.58	98.61	0.9591319338809451
Setup2Test.test_setup()	31.46	32.8	0.9591463414634147
FeeTest.test_fee_setup()	65.07	67.73	0.9607264137014615
StoreTest.testGasLoadColdVM()	39.03	40.46	0.9646564508156203
SetUpTest.testSetupData()	79.38	76.51	1.0375114364135405
StoreTest.testLoadNonExistent()	38.73	36.92	1.0490249187432286
HevmTests.prove_revert	23.31	21.93	1.0629274965800273
ExpectRevertTest.testFail_ExpectRevert_failAndSuccess()	115.08	105.97	1.085967726715108
HevmTests.proveFail_all_branches	71.89	64.9	1.107704160246533
TOTAL	1465.2499999999998	1543.7100000000003	0.9491743915631818

test_foundry_prove.py::test_foundry_refute_node fails on both feature and master branches 🤷

geo2a · 2024-05-30T13:04:47Z

Measured KEVM performance again: no big difference, long-running proofs seem to benefit though.

Test	georgy-second-side-condition-filterin time	master-b6ab7a905 time	(georgy-second-side-condition-filterin/master-b6ab7a905) time
mcd/flopper-tick-pass-rough-spec.k	200.67	225.68	0.8891793690180786
mcd/flopper-kick-pass-rough-spec.k	147.07	163.6	0.898960880195599
mcd/end-pack-pass-rough-spec.k	187.88	203.43	0.9235609300496485
mcd/vat-muluu-pass-spec.k	105.56	114.24	0.9240196078431373
benchmarks/requires01-a0gt0-spec.k	51.15	55.28	0.9252894356005789
benchmarks/requires01-a0le0-spec.k	51.0	54.45	0.9366391184573002
erc20/ds/transferFrom-failure-1-d-spec.k	55.1	58.04	0.9493452791178498
benchmarks/overflow00-nooverflow-spec.k	51.71	54.27	0.9528284503408881
benchmarks/structarg01-spec.k	63.09	65.92	0.9570691747572816
examples/erc20-spec.md	179.97	187.77	0.9584598178622783
mcd/dstoken-approve-fail-rough-spec.k	96.62	100.74	0.9591026404605917
benchmarks/storagevar01-spec.k	49.24	51.15	0.9626588465298144
bihu/forwardToHotWallet-success-2-spec.k	60.67	62.89	0.9647002703132453
bihu/functional-spec.k	50.7	48.84	1.038083538083538
erc20/ds/transferFrom-success-2-spec.k	95.69	90.9	1.0526952695269527
erc20/ds/transferFrom-success-1-spec.k	98.98	93.41	1.0596295899796597
erc20/ds/approve-success-spec.k	72.96	68.09	1.0715229842855043
mcd/vat-addui-pass-spec.k	139.59	129.88	1.0747613181398215
erc20/hkg/approve-spec.k	59.66	54.34	1.0979020979020977
erc20/ds/transfer-success-1-spec.k	84.08	75.2	1.1180851063829786
erc20/ds/transfer-success-2-spec.k	81.59	72.85	1.1199725463280714
erc20/hkg/transferFrom-success-2-spec.k	72.77	62.23	1.1693716856821468
erc20/hkg/transferFrom-success-1-spec.k	77.54	66.17	1.1718301345020403
mcd/vat-addui-fail-rough-spec.k	82.32	69.49	1.1846308821413152
TOTAL	2215.6100000000006	2228.86	0.9940552569474981

geo2a · 2024-05-31T09:19:29Z

There seems to be an improvement in Kontrol performance, although the proofs are rather short-running:

Test	georgy-second-side-condition-filterin time	master-b6ab7a905 time	(georgy-second-side-condition-filterin/master-b6ab7a905) time
FreshCheatcodes.test_address()	42.9	57.59	0.7449209932279909
AddrTest.test_notBuiltinAddress_symbolic(address)	55.29	63.67	0.8683838542484686
StructTypeTest.test_vars((uint8,uint32,bytes32))	35.57	39.31	0.9048588145510048
BlockParamsTest.testFee(uint256)	37.71	41.67	0.9049676025917927
BlockParamsTest.testRoll(uint256)	37.7	41.3	0.9128329297820824
AccountParamsTest.testDealSymbolic(uint256)	42.11	46.06	0.9142422926617455
BlockParamsTest.testChainId(uint256)	37.76	41.12	0.9182879377431906
AssertTest.prove_assert_true()	29.39	31.19	0.9422891952548894
FreshBytesTest.test_symbolic_bytes_length	101.17	106.05	0.953983969825554
StoreTest.testLoadNonExistent()	36.65	38.22	0.9589220303506018
ConstructorTest.run_constructor()	91.72	95.23	0.9631418670587
AssertTest.test_assert_true()	34.9	33.35	1.0464767616191903
AssumeTest.test_assume_true(uint256,uint256)	39.85	38.06	1.0470310036784025
HevmTests.proveFail_all_branches	72.68	69.1	1.0518089725036182
HevmTests.prove_revert	29.63	22.4	1.3227678571428572
TOTAL	725.0300000000001	764.3200000000002	0.948594829390831

geo2a · 2024-05-31T12:02:24Z

No difference in mx-backend performance.

ehildenb · 2024-06-03T15:16:44Z

@geo2a what example prompted this change? Was it something on one of our commercial engagements? If so, then we should make sure that example gets upstreamed into one of the relevant test-suites, so that we know we are actually testing the new feature.

geo2a self-assigned this May 15, 2024

PetarMax mentioned this pull request May 15, 2024

Non-terminating simplifications #3873

Open

geo2a force-pushed the georgy/second-side-condition-filtering branch from 92e726b to 8d363a8 Compare May 21, 2024 07:22

geo2a mentioned this pull request May 21, 2024

Add [depth] logging context #3885

Closed

This was referenced May 21, 2024

Filter requires clause a second time when applyin rewrite rules in Booster #3886

Closed

Increase SMT timeout in integration tests runtimeverification/evm-semantics#2440

Closed

This was referenced May 22, 2024

Add "hard" integration test for Booster's SMT solver interface #3887

Open

3863 llvm term cache #3882

Merged

geo2a added 2 commits May 22, 2024 16:35

Add syntactic filtering of side conditions after simplification

ca0e798

refactor applyEquation

3203683

geo2a force-pushed the georgy/second-side-condition-filtering branch from 69d0d4f to 3203683 Compare May 22, 2024 14:35

Merge branch 'master' into georgy/second-side-condition-filtering

fd857d0

geo2a requested review from goodlyrottenapple and jberthold June 3, 2024 06:56

geo2a marked this pull request as ready for review June 3, 2024 06:57

goodlyrottenapple approved these changes Jun 3, 2024

View reviewed changes

jberthold approved these changes Jun 3, 2024

View reviewed changes

geo2a added the automerge label Jun 3, 2024

Merge branch 'master' into georgy/second-side-condition-filtering

9c3b42c

rv-jenkins merged commit 1c85ef0 into master Jun 3, 2024
6 checks passed

rv-jenkins deleted the georgy/second-side-condition-filtering branch June 3, 2024 14:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add syntactic filtering of side conditions after simplification #3870

Add syntactic filtering of side conditions after simplification #3870

Uh oh!

geo2a commented May 15, 2024 •

edited

Loading

Uh oh!

geo2a commented May 17, 2024 •

edited

Loading

Uh oh!

geo2a commented May 21, 2024 •

edited

Loading

Uh oh!

geo2a commented May 21, 2024 •

edited

Loading

Uh oh!

geo2a commented May 22, 2024

Uh oh!

geo2a commented May 22, 2024

Uh oh!

geo2a commented May 22, 2024

Uh oh!

geo2a commented May 30, 2024

Uh oh!

geo2a commented May 31, 2024 •

edited

Loading

Uh oh!

geo2a commented May 31, 2024

Uh oh!

Uh oh!

ehildenb commented Jun 3, 2024

Uh oh!

Uh oh!

Add syntactic filtering of side conditions after simplification #3870

Add syntactic filtering of side conditions after simplification #3870

Uh oh!

Conversation

geo2a commented May 15, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

geo2a commented May 17, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

geo2a commented May 21, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

geo2a commented May 21, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

geo2a commented May 22, 2024

Uh oh!

geo2a commented May 22, 2024

Uh oh!

geo2a commented May 22, 2024

Uh oh!

geo2a commented May 30, 2024

Uh oh!

geo2a commented May 31, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

geo2a commented May 31, 2024

Uh oh!

Uh oh!

ehildenb commented Jun 3, 2024

Uh oh!

Uh oh!

geo2a commented May 15, 2024 •

edited

Loading

geo2a commented May 17, 2024 •

edited

Loading

geo2a commented May 21, 2024 •

edited

Loading

geo2a commented May 21, 2024 •

edited

Loading

geo2a commented May 31, 2024 •

edited

Loading