Skip to content

test(skeleton): get_random_u256 #111

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 19, 2024
Merged

Conversation

TzahiTaub
Copy link
Contributor

@TzahiTaub TzahiTaub commented May 12, 2024

This change is Reviewable

@TzahiTaub TzahiTaub self-assigned this May 12, 2024
@codecov-commenter
Copy link

codecov-commenter commented May 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.94%. Comparing base (cee54bc) to head (e54cd2d).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #111      +/-   ##
==========================================
- Coverage   74.86%   72.94%   -1.93%     
==========================================
  Files          26       27       +1     
  Lines         959     1094     +135     
  Branches      959     1094     +135     
==========================================
+ Hits          718      798      +80     
- Misses        207      270      +63     
+ Partials       34       26       -8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@TzahiTaub TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch 2 times, most recently from 183a25c to 21d9530 Compare May 12, 2024 11:51
Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 3 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @amosStarkware and @TzahiTaub)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 5 at r2 (raw file):

use rstest::rstest;

///Generates a random U256 number between low and high (inclusive).

Suggestion:

/// Generates 

crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

    } else {
        high_of_low
    };

if low == 1 and high == 2^128, then low_of_high == 0 and low_of_low > low_of_high is true.
then, new_high_of_low will be 1.
this means that new_high will be selected from [1, 1], so it will always be exactly 1, right?
we will reach the else case below, i.e.

U256::from_words(1, rand::thread_rng().gen_range(1..=0))

... which should not compile / panic.

Am I missing something?

why not:

  1. assert that high_of_low <= high_of_high (should always be true if low <= high).
  2. randomize a new_high value in [high_of_low, high_of_high]
  3. if high_of_low == new_high, randomize new_low in [low_of_low, 2^128)
  4. elif high_of_high == new_high...
  5. else randomize in [0, 2^128)

?

Code quote:

    let new_high_of_low = if low_of_low > low_of_high {
        high_of_low + 1
    } else {
        high_of_low
    };

crates/committer/src/patricia_merkle_tree/test_utils.rs line 34 at r2 (raw file):

#[case(U256::ZERO, U256::ONE)]
#[case((U256::ONE<<128)-U256::ONE, U256::ONE << 128)]
#[case(U256::ONE<<128, (U256::ONE << 128)+U256::ONE)]

add this case please (I think the test should fail ATM...)

Suggestion:

#[case(U256::ZERO, U256::ZERO)]
#[case(U256::ZERO, U256::ONE)]
#[case((U256::ONE<<128)-U256::ONE, U256::ONE << 128)]
#[case(U256::ONE<<128, (U256::ONE << 128)+U256::ONE)]
#[case(U256::ONE, U256::ONE << 128)]

@TzahiTaub TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from 21d9530 to b9e1bdc Compare May 12, 2024 13:48
Copy link
Contributor Author

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @amosStarkware and @dorimedini-starkware)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, dorimedini-starkware wrote…

if low == 1 and high == 2^128, then low_of_high == 0 and low_of_low > low_of_high is true.
then, new_high_of_low will be 1.
this means that new_high will be selected from [1, 1], so it will always be exactly 1, right?
we will reach the else case below, i.e.

U256::from_words(1, rand::thread_rng().gen_range(1..=0))

... which should not compile / panic.

Am I missing something?

why not:

  1. assert that high_of_low <= high_of_high (should always be true if low <= high).
  2. randomize a new_high value in [high_of_low, high_of_high]
  3. if high_of_low == new_high, randomize new_low in [low_of_low, 2^128)
  4. elif high_of_high == new_high...
  5. else randomize in [0, 2^128)

?

We will reach the inner else, i.e., U256::from_words(1, rand::thread_rng().gen_range(0..=2^128)) (see added test case below that passes).
Do I still need to understand the difference between your proposal to mine? :)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 34 at r2 (raw file):

Previously, dorimedini-starkware wrote…

add this case please (I think the test should fail ATM...)

Done.

Copy link
Contributor Author

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 2 unresolved discussions (waiting on @amosStarkware and @dorimedini-starkware)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

We will reach the inner else, i.e., U256::from_words(1, rand::thread_rng().gen_range(0..=2^128)) (see added test case below that passes).
Do I still need to understand the difference between your proposal to mine? :)

Wait, rethinking.

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @amosStarkware)

Copy link
Contributor

@amosStarkware amosStarkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @dorimedini-starkware and @TzahiTaub)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 37 at r3 (raw file):

#[case(U256::ONE<<128, (U256::ONE << 128)+U256::ONE)]

fn test_get_random_u256(#[case] low: U256, #[case] high: U256) {

If you think the test utils need tests - why not add a separate file (test_utils_test? it's even funny)?
non blocking

Code quote:

fn test_get_random_u256(#[case] low: U256, #[case] high: U256) {

@TzahiTaub TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from b9e1bdc to 7cae595 Compare May 12, 2024 15:00
Copy link
Contributor Author

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 2 unresolved discussions (waiting on @amosStarkware and @dorimedini-starkware)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

Wait, rethinking.

Yep, made simplified confusing examples. I don't like step 2 in your proposal because in your example - we'll get a 50% chance to get 2^128 as the returned value even though the range is huge. Made it a bit more complicated with if else's to get ~ uniform randomness...
Regreting I've started this thing.


crates/committer/src/patricia_merkle_tree/test_utils.rs line 37 at r3 (raw file):

Previously, amosStarkware wrote…

If you think the test utils need tests - why not add a separate file (test_utils_test? it's even funny)?
non blocking

Because in general this looks like an endless procedure, I specifically want to test this function because it's damn confusing, but I hope there won't be any other tests.

@TzahiTaub TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from 7cae595 to e0a83fa Compare May 12, 2024 15:01
Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @amosStarkware and @TzahiTaub)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

Yep, made simplified confusing examples. I don't like step 2 in your proposal because in your example - we'll get a 50% chance to get 2^128 as the returned value even though the range is huge. Made it a bit more complicated with if else's to get ~ uniform randomness...
Regreting I've started this thing.

consider switching to BigUint?
there is a randomization trait, and I believe this crate has a U256 type...


crates/committer/src/patricia_merkle_tree/test_utils.rs line 9 at r4 (raw file):

    let (mut high_of_low, low_of_low) = low.into_words();
    let (mut high_of_high, low_of_high) = high.into_words();
    assert!(high_of_low <= high_of_high);

if you need this, add this restriction to the docstring

Code quote:

assert!(high_of_low <= high_of_high);

crates/committer/src/patricia_merkle_tree/test_utils.rs line 21 at r4 (raw file):

        let diff = high - low;
        let upper_bound = if diff > u128::MAX {
            assert!(low_of_high >= low_of_low);

this is implicit...

Code quote:

assert!(low_of_high >= low_of_low);

crates/committer/src/patricia_merkle_tree/test_utils.rs line 25 at r4 (raw file):

        } else {
            u128::try_from(diff).unwrap()
        };

as high_of_high == high_of_low + 1, if high - low > u128::MAX then the inequality low_of_high >= low_of_low is implied; no need to assert it.
and then, what you're actually doing here is a min operation, right?

... and then, what if low=0 and high=2^129-1? we will get high_of_high==1, and low_of_high==u128::MAX; there are 2*u128::MAX-1 possible numbers in the range but it seems you give an upper bound of u128::MAX in this case, so numbers in [2^128, 2^129) will never be sampled...

Suggestion:

        let diff = high - low;
        let upper_bound = u128::try_from(
            min(U256::from(u128::MAX), high - low)
        ).unwrap();

@TzahiTaub TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from e0a83fa to 949ad6c Compare May 12, 2024 18:09
Copy link
Contributor Author

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @dorimedini-starkware)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, dorimedini-starkware wrote…

consider switching to BigUint?
there is a randomization trait, and I believe this crate has a U256 type...

If the last change seems right (I think it is), let's consider this in a future PR?


crates/committer/src/patricia_merkle_tree/test_utils.rs line 9 at r4 (raw file):

Previously, dorimedini-starkware wrote…

if you need this, add this restriction to the docstring

OK.


crates/committer/src/patricia_merkle_tree/test_utils.rs line 21 at r4 (raw file):

Previously, dorimedini-starkware wrote…

this is implicit...

Yes, made that as a sanity check. Removed.


crates/committer/src/patricia_merkle_tree/test_utils.rs line 25 at r4 (raw file):

Previously, dorimedini-starkware wrote…

as high_of_high == high_of_low + 1, if high - low > u128::MAX then the inequality low_of_high >= low_of_low is implied; no need to assert it.
and then, what you're actually doing here is a min operation, right?

... and then, what if low=0 and high=2^129-1? we will get high_of_high==1, and low_of_high==u128::MAX; there are 2*u128::MAX-1 possible numbers in the range but it seems you give an upper bound of u128::MAX in this case, so numbers in [2^128, 2^129) will never be sampled...

Good job, now it's more succinct.

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @TzahiTaub)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 13 at r2 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

If the last change seems right (I think it is), let's consider this in a future PR?

not uniform I think (see below, I may have a solution...)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 33 at r5 (raw file):

        rand::thread_rng().gen_range(high_of_low..=high_of_high),
        new_low,
    )

if low==u128::MAX and high==u128::MAX+1 this will always output u128::MAX+1, right? this is incorrect (unless the "lower bound is exclusive, upper bound is inclusive" behavior is consistent... haven't checked).
How about this?

Suggestion:

    let upper_bound = u128::try_from(min(U256::from(u128::MAX), high - low)).unwrap();
    if upper_bound < u128::MAX {
        let delta = rand::thread_rng().gen_range(0..upper_bound);
        return low + delta;
    }
    
    // Randomize with low_low=0 and high_high=u128::MAX until the value
    // is in range.
    // Setting low_low to 0 and high_high to u128::MAX can only increase the
    // sample domain by u128::MAX. As high-low>=u128::MAX, the expected 
    // number of samples until the loops breaks is bound from above by 2.
    let randomize = || {
        U256::from_words(
            rand::thread_rng().gen_range(high_of_low..=high_of_high),
            rand::thread_rng().gen_range(0..=u128::MAX)
        )
    }
    let mut result = randomize();
    while result < low || result > high {
        result = randomize();
    }
    result

Copy link
Contributor Author

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @dorimedini-starkware)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 33 at r5 (raw file):

Previously, dorimedini-starkware wrote…

if low==u128::MAX and high==u128::MAX+1 this will always output u128::MAX+1, right? this is incorrect (unless the "lower bound is exclusive, upper bound is inclusive" behavior is consistent... haven't checked).
How about this?

Good enough for me. It's not exactly 2, as for example, if low=u128::MAX and high=1<<129 the expected number of samples is 3 as if new_high=0b10 (as of high) or new_high=0 (as of low) there is 1/2**128 chance to succeed with the sampling, and you need to have new_high=1. But good enough for us.

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @TzahiTaub)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 33 at r5 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

Good enough for me. It's not exactly 2, as for example, if low=u128::MAX and high=1<<129 the expected number of samples is 3 as if new_high=0b10 (as of high) or new_high=0 (as of low) there is 1/2**128 chance to succeed with the sampling, and you need to have new_high=1. But good enough for us.

ah, you're right.
but I still think it's better because the current implementation is not the uniform distribution (right? or is my analysis wrong?),
and better to have a uniform distribution with O(1) runtime in expectation, than a distribution that is non-uniform in an non-intuitive way.
Do you agree?
if so please swap implementation

@TzahiTaub TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from 949ad6c to d2f0500 Compare May 19, 2024 09:05
Copy link
Contributor Author

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 3 files reviewed, 1 unresolved discussion (waiting on @dorimedini-starkware)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 33 at r5 (raw file):

Previously, dorimedini-starkware wrote…

ah, you're right.
but I still think it's better because the current implementation is not the uniform distribution (right? or is my analysis wrong?),
and better to have a uniform distribution with O(1) runtime in expectation, than a distribution that is non-uniform in an non-intuitive way.
Do you agree?
if so please swap implementation

Done.

@TzahiTaub TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from d2f0500 to aecd2cd Compare May 19, 2024 09:07
Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r6.
Reviewable status: 2 of 3 files reviewed, 3 unresolved discussions (waiting on @TzahiTaub)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 10 at r6 (raw file):

    let high_of_low = low.high();
    let high_of_high = high.high();
    assert!(high_of_low <= high_of_high);

more to the point, and can catch more cases

Suggestion:

    assert!(low <= high);
    let high_of_low = low.high();
    let high_of_high = high.high();

crates/committer/src/patricia_merkle_tree/test_utils.rs line 24 at r6 (raw file):

    // valid result for high bits in (high_of_low, high_of_high) or high_of_high == high_of_low + 1)
    // and every possible low 128 bits value is valid when either the high bits equal high_of_high,
    // or high_of_low).

didn't exactly understand the explanation, maybe this is clearer? WDYT?
non blocking

Suggestion:

    // As high-low>u128::MAX, the expected number of samples until the loops breaks is bound from
    // above by 3 (worst case if when high=k*2^128 and low=((k-1)*2^128)-1, and there is a 1/3 
    // chance to sample a new high value with over 1/2^128 chance for a valid low value).

crates/committer/src/patricia_merkle_tree/test_utils.rs line 40 at r6 (raw file):

#[rstest]
#[should_panic]
#[case(U256::ZERO, U256::ZERO)]

the should_panic attribute applies only to this single case?

Code quote:

#[should_panic]
#[case(U256::ZERO, U256::ZERO)]

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r7, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @TzahiTaub)

Copy link
Contributor Author

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @dorimedini-starkware)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 24 at r6 (raw file):

Previously, dorimedini-starkware wrote…

didn't exactly understand the explanation, maybe this is clearer? WDYT?
non blocking

I'm not sure it's clear why this is the worst case, I tried to explain the two possibilities (also where high and low "high's" differ in a single bit and not two as in your example).


crates/committer/src/patricia_merkle_tree/test_utils.rs line 40 at r6 (raw file):

Previously, dorimedini-starkware wrote…

the should_panic attribute applies only to this single case?

Yes, otherwise it would have failed.

Copy link
Contributor Author

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @dorimedini-starkware)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 10 at r6 (raw file):

Previously, dorimedini-starkware wrote…

more to the point, and can catch more cases

:)

@TzahiTaub TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from aecd2cd to bdc8a21 Compare May 19, 2024 09:22
Copy link
Contributor Author

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @dorimedini-starkware)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 24 at r6 (raw file):

Previously, TzahiTaub (Tzahi) wrote…

I'm not sure it's clear why this is the worst case, I tried to explain the two possibilities (also where high and low "high's" differ in a single bit and not two as in your example).

Any better?

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r8, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @TzahiTaub)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 8 at r8 (raw file):

/// Panics if low > high.
pub(crate) fn get_random_u256(low: U256, high: U256) -> U256 {
    assert!(low <= high);

Suggestion:

assert!(low < high);

@TzahiTaub TzahiTaub force-pushed the tzahi/test_utils/random_u256 branch from bdc8a21 to e54cd2d Compare May 19, 2024 09:25
Copy link
Contributor Author

@TzahiTaub TzahiTaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @dorimedini-starkware)


crates/committer/src/patricia_merkle_tree/test_utils.rs line 8 at r8 (raw file):

/// Panics if low > high.
pub(crate) fn get_random_u256(low: U256, high: U256) -> U256 {
    assert!(low <= high);

Done.

Copy link
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r9, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @TzahiTaub)

@TzahiTaub TzahiTaub added this pull request to the merge queue May 19, 2024
Merged via the queue into main with commit 5266ea6 May 19, 2024
11 checks passed
@TzahiTaub TzahiTaub deleted the tzahi/test_utils/random_u256 branch May 19, 2024 09:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants