test(skeleton): get_random_u256 #111

TzahiTaub · 2024-05-12T11:41:54Z

This change is

codecov-commenter · 2024-05-12T11:42:34Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.94%. Comparing base (cee54bc) to head (e54cd2d).
Report is 5 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #111      +/-   ##
==========================================
- Coverage   74.86%   72.94%   -1.93%     
==========================================
  Files          26       27       +1     
  Lines         959     1094     +135     
  Branches      959     1094     +135     
==========================================
+ Hits          718      798      +80     
- Misses        207      270      +63     
+ Partials       34       26       -8

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

dorimedini-starkware

Reviewed 1 of 3 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @amosStarkware and @TzahiTaub)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 5 at r2 (raw file):

use rstest::rstest;

///Generates a random U256 number between low and high (inclusive).

Suggestion:

/// Generates

crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

    } else {
        high_of_low
    };

if low == 1 and high == 2^128, then low_of_high == 0 and low_of_low > low_of_high is true.
then, new_high_of_low will be 1.
this means that new_high will be selected from [1, 1], so it will always be exactly 1, right?
we will reach the else case below, i.e.

U256::from_words(1, rand::thread_rng().gen_range(1..=0))

... which should not compile / panic.

Am I missing something?

why not:

assert that high_of_low <= high_of_high (should always be true if low <= high).
randomize a new_high value in [high_of_low, high_of_high]
if high_of_low == new_high, randomize new_low in [low_of_low, 2^128)
elif high_of_high == new_high...
else randomize in [0, 2^128)

?

Code quote:

    let new_high_of_low = if low_of_low > low_of_high {
        high_of_low + 1
    } else {
        high_of_low
    };

crates/committer/src/patricia_merkle_tree/test_utils.rs line 34 at r2 (raw file):

#[case(U256::ZERO, U256::ONE)]
#[case((U256::ONE<<128)-U256::ONE, U256::ONE << 128)]
#[case(U256::ONE<<128, (U256::ONE << 128)+U256::ONE)]

add this case please (I think the test should fail ATM...)

Suggestion:

#[case(U256::ZERO, U256::ZERO)]
#[case(U256::ZERO, U256::ONE)]
#[case((U256::ONE<<128)-U256::ONE, U256::ONE << 128)]
#[case(U256::ONE<<128, (U256::ONE << 128)+U256::ONE)]
#[case(U256::ONE, U256::ONE << 128)]

TzahiTaub

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @amosStarkware and @dorimedini-starkware)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, dorimedini-starkware wrote…

if low == 1 and high == 2^128, then low_of_high == 0 and low_of_low > low_of_high is true.
then, new_high_of_low will be 1.
this means that new_high will be selected from [1, 1], so it will always be exactly 1, right?
we will reach the else case below, i.e.
U256::from_words(1, rand::thread_rng().gen_range(1..=0))
... which should not compile / panic.

Am I missing something?

why not:

assert that high_of_low <= high_of_high (should always be true if low <= high).

randomize a new_high value in [high_of_low, high_of_high]

if high_of_low == new_high, randomize new_low in [low_of_low, 2^128)

elif high_of_high == new_high...

else randomize in [0, 2^128)

?

We will reach the inner else, i.e., U256::from_words(1, rand::thread_rng().gen_range(0..=2^128)) (see added test case below that passes).
Do I still need to understand the difference between your proposal to mine? :)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 34 at r2 (raw file):

Previously, dorimedini-starkware wrote…

add this case please (I think the test should fail ATM...)

Done.

TzahiTaub

Reviewable status: 2 of 3 files reviewed, 2 unresolved discussions (waiting on @amosStarkware and @dorimedini-starkware)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

We will reach the inner else, i.e., U256::from_words(1, rand::thread_rng().gen_range(0..=2^128)) (see added test case below that passes).
Do I still need to understand the difference between your proposal to mine? :)

Wait, rethinking.

dorimedini-starkware

Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @amosStarkware)

amosStarkware

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @dorimedini-starkware and @TzahiTaub)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 37 at r3 (raw file):

#[case(U256::ONE<<128, (U256::ONE << 128)+U256::ONE)]

fn test_get_random_u256(#[case] low: U256, #[case] high: U256) {

If you think the test utils need tests - why not add a separate file (test_utils_test? it's even funny)?
non blocking

Code quote:

fn test_get_random_u256(#[case] low: U256, #[case] high: U256) {

TzahiTaub

Reviewable status: 2 of 3 files reviewed, 2 unresolved discussions (waiting on @amosStarkware and @dorimedini-starkware)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

Wait, rethinking.

Yep, made simplified confusing examples. I don't like step 2 in your proposal because in your example - we'll get a 50% chance to get 2^128 as the returned value even though the range is huge. Made it a bit more complicated with if else's to get ~ uniform randomness...
Regreting I've started this thing.

crates/committer/src/patricia_merkle_tree/test_utils.rs line 37 at r3 (raw file):

Previously, amosStarkware wrote…

If you think the test utils need tests - why not add a separate file (test_utils_test? it's even funny)?
non blocking

Because in general this looks like an endless procedure, I specifically want to test this function because it's damn confusing, but I hope there won't be any other tests.

dorimedini-starkware

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @amosStarkware and @TzahiTaub)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

Yep, made simplified confusing examples. I don't like step 2 in your proposal because in your example - we'll get a 50% chance to get 2^128 as the returned value even though the range is huge. Made it a bit more complicated with if else's to get ~ uniform randomness...
Regreting I've started this thing.

consider switching to BigUint?
there is a randomization trait, and I believe this crate has a U256 type...

crates/committer/src/patricia_merkle_tree/test_utils.rs line 9 at r4 (raw file):

    let (mut high_of_low, low_of_low) = low.into_words();
    let (mut high_of_high, low_of_high) = high.into_words();
    assert!(high_of_low <= high_of_high);

if you need this, add this restriction to the docstring

Code quote:

assert!(high_of_low <= high_of_high);

crates/committer/src/patricia_merkle_tree/test_utils.rs line 21 at r4 (raw file):

        let diff = high - low;
        let upper_bound = if diff > u128::MAX {
            assert!(low_of_high >= low_of_low);

this is implicit...

Code quote:

assert!(low_of_high >= low_of_low);

crates/committer/src/patricia_merkle_tree/test_utils.rs line 25 at r4 (raw file):

        } else {
            u128::try_from(diff).unwrap()
        };

as high_of_high == high_of_low + 1, if high - low > u128::MAX then the inequality low_of_high >= low_of_low is implied; no need to assert it.
and then, what you're actually doing here is a min operation, right?

... and then, what if low=0 and high=2^129-1? we will get high_of_high==1, and low_of_high==u128::MAX; there are 2*u128::MAX-1 possible numbers in the range but it seems you give an upper bound of u128::MAX in this case, so numbers in [2^128, 2^129) will never be sampled...

Suggestion:

        let diff = high - low;
        let upper_bound = u128::try_from(
            min(U256::from(u128::MAX), high - low)
        ).unwrap();

TzahiTaub

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @dorimedini-starkware)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, dorimedini-starkware wrote…

consider switching to BigUint?
there is a randomization trait, and I believe this crate has a U256 type...

If the last change seems right (I think it is), let's consider this in a future PR?

crates/committer/src/patricia_merkle_tree/test_utils.rs line 9 at r4 (raw file):

Previously, dorimedini-starkware wrote…

if you need this, add this restriction to the docstring

OK.

crates/committer/src/patricia_merkle_tree/test_utils.rs line 21 at r4 (raw file):

Previously, dorimedini-starkware wrote…

this is implicit...

Yes, made that as a sanity check. Removed.

crates/committer/src/patricia_merkle_tree/test_utils.rs line 25 at r4 (raw file):

Previously, dorimedini-starkware wrote…

as high_of_high == high_of_low + 1, if high - low > u128::MAX then the inequality low_of_high >= low_of_low is implied; no need to assert it.
and then, what you're actually doing here is a min operation, right?

... and then, what if low=0 and high=2^129-1? we will get high_of_high==1, and low_of_high==u128::MAX; there are 2*u128::MAX-1 possible numbers in the range but it seems you give an upper bound of u128::MAX in this case, so numbers in [2^128, 2^129) will never be sampled...

Good job, now it's more succinct.

dorimedini-starkware

Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @TzahiTaub)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

If the last change seems right (I think it is), let's consider this in a future PR?

not uniform I think (see below, I may have a solution...)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 33 at r5 (raw file):

        rand::thread_rng().gen_range(high_of_low..=high_of_high),
        new_low,
    )

if low==u128::MAX and high==u128::MAX+1 this will always output u128::MAX+1, right? this is incorrect (unless the "lower bound is exclusive, upper bound is inclusive" behavior is consistent... haven't checked).
How about this?

Suggestion:

    let upper_bound = u128::try_from(min(U256::from(u128::MAX), high - low)).unwrap();
    if upper_bound < u128::MAX {
        let delta = rand::thread_rng().gen_range(0..upper_bound);
        return low + delta;
    }
    
    // Randomize with low_low=0 and high_high=u128::MAX until the value
    // is in range.
    // Setting low_low to 0 and high_high to u128::MAX can only increase the
    // sample domain by u128::MAX. As high-low>=u128::MAX, the expected 
    // number of samples until the loops breaks is bound from above by 2.
    let randomize = || {
        U256::from_words(
            rand::thread_rng().gen_range(high_of_low..=high_of_high),
            rand::thread_rng().gen_range(0..=u128::MAX)
        )
    }
    let mut result = randomize();
    while result < low || result > high {
        result = randomize();
    }
    result

TzahiTaub

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @dorimedini-starkware)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 33 at r5 (raw file):

Previously, dorimedini-starkware wrote…

if low==u128::MAX and high==u128::MAX+1 this will always output u128::MAX+1, right? this is incorrect (unless the "lower bound is exclusive, upper bound is inclusive" behavior is consistent... haven't checked).
How about this?

Good enough for me. It's not exactly 2, as for example, if low=u128::MAX and high=1<<129 the expected number of samples is 3 as if new_high=0b10 (as of high) or new_high=0 (as of low) there is 1/2**128 chance to succeed with the sampling, and you need to have new_high=1. But good enough for us.

dorimedini-starkware

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @TzahiTaub)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 33 at r5 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

Good enough for me. It's not exactly 2, as for example, if low=u128::MAX and high=1<<129 the expected number of samples is 3 as if new_high=0b10 (as of high) or new_high=0 (as of low) there is 1/2**128 chance to succeed with the sampling, and you need to have new_high=1. But good enough for us.

ah, you're right.
but I still think it's better because the current implementation is not the uniform distribution (right? or is my analysis wrong?),
and better to have a uniform distribution with O(1) runtime in expectation, than a distribution that is non-uniform in an non-intuitive way.
Do you agree?
if so please swap implementation

TzahiTaub

Reviewable status: 2 of 3 files reviewed, 1 unresolved discussion (waiting on @dorimedini-starkware)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 33 at r5 (raw file):

Previously, dorimedini-starkware wrote…

ah, you're right.
but I still think it's better because the current implementation is not the uniform distribution (right? or is my analysis wrong?),
and better to have a uniform distribution with O(1) runtime in expectation, than a distribution that is non-uniform in an non-intuitive way.
Do you agree?
if so please swap implementation

Done.

dorimedini-starkware

Reviewed 1 of 1 files at r6.
Reviewable status: 2 of 3 files reviewed, 3 unresolved discussions (waiting on @TzahiTaub)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 10 at r6 (raw file):

    let high_of_low = low.high();
    let high_of_high = high.high();
    assert!(high_of_low <= high_of_high);

more to the point, and can catch more cases

Suggestion:

    assert!(low <= high);
    let high_of_low = low.high();
    let high_of_high = high.high();

crates/committer/src/patricia_merkle_tree/test_utils.rs line 24 at r6 (raw file):

    // valid result for high bits in (high_of_low, high_of_high) or high_of_high == high_of_low + 1)
    // and every possible low 128 bits value is valid when either the high bits equal high_of_high,
    // or high_of_low).

didn't exactly understand the explanation, maybe this is clearer? WDYT?
non blocking

Suggestion:

    // As high-low>u128::MAX, the expected number of samples until the loops breaks is bound from
    // above by 3 (worst case if when high=k*2^128 and low=((k-1)*2^128)-1, and there is a 1/3 
    // chance to sample a new high value with over 1/2^128 chance for a valid low value).

crates/committer/src/patricia_merkle_tree/test_utils.rs line 40 at r6 (raw file):

#[rstest]
#[should_panic]
#[case(U256::ZERO, U256::ZERO)]

the should_panic attribute applies only to this single case?

Code quote:

#[should_panic]
#[case(U256::ZERO, U256::ZERO)]

dorimedini-starkware

Reviewed 1 of 1 files at r7, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @TzahiTaub)

TzahiTaub

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @dorimedini-starkware)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 24 at r6 (raw file):

Previously, dorimedini-starkware wrote…

didn't exactly understand the explanation, maybe this is clearer? WDYT?
non blocking

I'm not sure it's clear why this is the worst case, I tried to explain the two possibilities (also where high and low "high's" differ in a single bit and not two as in your example).

crates/committer/src/patricia_merkle_tree/test_utils.rs line 40 at r6 (raw file):

Previously, dorimedini-starkware wrote…

the should_panic attribute applies only to this single case?

Yes, otherwise it would have failed.

TzahiTaub

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @dorimedini-starkware)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 10 at r6 (raw file):

Previously, dorimedini-starkware wrote…

more to the point, and can catch more cases

:)

TzahiTaub

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @dorimedini-starkware)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 24 at r6 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

I'm not sure it's clear why this is the worst case, I tried to explain the two possibilities (also where high and low "high's" differ in a single bit and not two as in your example).

Any better?

dorimedini-starkware

Reviewed 1 of 1 files at r8, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @TzahiTaub)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 8 at r8 (raw file):

/// Panics if low > high.
pub(crate) fn get_random_u256(low: U256, high: U256) -> U256 {
    assert!(low <= high);

Suggestion:

assert!(low < high);

TzahiTaub

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @dorimedini-starkware)

crates/committer/src/patricia_merkle_tree/test_utils.rs line 8 at r8 (raw file):

/// Panics if low > high.
pub(crate) fn get_random_u256(low: U256, high: U256) -> U256 {
    assert!(low <= high);

Done.

dorimedini-starkware

Reviewed 1 of 1 files at r9, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @TzahiTaub)

TzahiTaub self-assigned this May 12, 2024

TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch 2 times, most recently from 183a25c to 21d9530 Compare May 12, 2024 11:51

TzahiTaub requested review from dorimedini-starkware and amosStarkware May 12, 2024 12:40

dorimedini-starkware requested changes May 12, 2024

View reviewed changes

TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from 21d9530 to b9e1bdc Compare May 12, 2024 13:48

TzahiTaub commented May 12, 2024

View reviewed changes

dorimedini-starkware reviewed May 12, 2024

View reviewed changes

amosStarkware reviewed May 12, 2024

View reviewed changes

TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from b9e1bdc to 7cae595 Compare May 12, 2024 15:00

TzahiTaub commented May 12, 2024

View reviewed changes

TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from 7cae595 to e0a83fa Compare May 12, 2024 15:01

dorimedini-starkware requested changes May 12, 2024

View reviewed changes

TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from e0a83fa to 949ad6c Compare May 12, 2024 18:09

TzahiTaub commented May 12, 2024

View reviewed changes

dorimedini-starkware requested changes May 12, 2024

View reviewed changes

TzahiTaub commented May 13, 2024

View reviewed changes

dorimedini-starkware requested changes May 14, 2024

View reviewed changes

TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from 949ad6c to d2f0500 Compare May 19, 2024 09:05

TzahiTaub commented May 19, 2024

View reviewed changes

TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from d2f0500 to aecd2cd Compare May 19, 2024 09:07

dorimedini-starkware requested changes May 19, 2024

View reviewed changes

dorimedini-starkware reviewed May 19, 2024

View reviewed changes

TzahiTaub commented May 19, 2024

View reviewed changes

TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from aecd2cd to bdc8a21 Compare May 19, 2024 09:22

TzahiTaub commented May 19, 2024

View reviewed changes

dorimedini-starkware requested changes May 19, 2024

View reviewed changes

test(skeleton): get_random_u256

e54cd2d

TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from bdc8a21 to e54cd2d Compare May 19, 2024 09:25

TzahiTaub commented May 19, 2024

View reviewed changes

dorimedini-starkware approved these changes May 19, 2024

View reviewed changes

TzahiTaub added this pull request to the merge queue May 19, 2024

Merged via the queue into main with commit 5266ea6 May 19, 2024
11 checks passed

TzahiTaub deleted the tzahi/test_utils/random_u256 branch May 19, 2024 09:27

test(skeleton): get_random_u256 #111

test(skeleton): get_random_u256 #111

Uh oh!

Conversation

TzahiTaub commented May 12, 2024 • edited by alon-dotan-starkware Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov-commenter commented May 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

dorimedini-starkware left a comment

Choose a reason for hiding this comment

Uh oh!

TzahiTaub left a comment

Choose a reason for hiding this comment

Uh oh!

TzahiTaub left a comment

Choose a reason for hiding this comment

Uh oh!

dorimedini-starkware left a comment

Choose a reason for hiding this comment

Uh oh!

amosStarkware left a comment

Choose a reason for hiding this comment

Uh oh!

TzahiTaub left a comment

Choose a reason for hiding this comment

Uh oh!

dorimedini-starkware left a comment

Choose a reason for hiding this comment

Uh oh!

TzahiTaub left a comment

Choose a reason for hiding this comment

Uh oh!

dorimedini-starkware left a comment

Choose a reason for hiding this comment

Uh oh!

TzahiTaub left a comment

Choose a reason for hiding this comment

Uh oh!

dorimedini-starkware left a comment

Choose a reason for hiding this comment

Uh oh!

TzahiTaub left a comment

Choose a reason for hiding this comment

Uh oh!

dorimedini-starkware left a comment

Choose a reason for hiding this comment

Uh oh!

dorimedini-starkware left a comment

Choose a reason for hiding this comment

Uh oh!

TzahiTaub left a comment

Choose a reason for hiding this comment

Uh oh!

TzahiTaub left a comment

Choose a reason for hiding this comment

Uh oh!

TzahiTaub left a comment

Choose a reason for hiding this comment

Uh oh!

dorimedini-starkware left a comment

Choose a reason for hiding this comment

Uh oh!

TzahiTaub left a comment

Choose a reason for hiding this comment

Uh oh!

dorimedini-starkware left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

TzahiTaub commented May 12, 2024 •

edited by alon-dotan-starkware

Loading

codecov-commenter commented May 12, 2024 •

edited

Loading