-
Notifications
You must be signed in to change notification settings - Fork 1k
[Variant] Add list support to unshred_variant #8514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
CC @alamb @liamzwbao |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few notes for reviewers
|
||
//! Module for unshredding VariantArray by folding typed_value columns back into the value column. | ||
use crate::arrow_to_variant::ListLikeArray; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the nice new interface, @liamzwbao !
/// | ||
/// [valid]: VariantMetadata#Validation | ||
/// [Variant spec]: https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#value-data-for-array-basic_type3 | ||
#[derive(Debug, Clone, PartialEq)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We forgot to fix this PartialEq
when we fixed the one for VariantObject
.
Manual implementation below.
true | ||
self.iter() | ||
.zip(other.iter()) | ||
.all(|((name_a, value_a), (name_b, value_b))| name_a == name_b && value_a == value_b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opportunistic simplification.
($case_num:literal) => { | ||
paste::paste! { | ||
#[test] | ||
fn [<test_variant_integration_case_ $case_num>]() { | ||
all_cases()[$case_num - 1].run() | ||
} | ||
} | ||
}; | ||
|
||
// Generates an error test case, where the expected result is an error message | ||
($case_num:literal, $expected_error:literal) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opportunistic simplification: Instead of duplicating the macro definition, just use $(...)?
syntax to capture (and respond to) an optional error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL -- thanks @scovich
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
list_builder.finish(); | ||
Ok(()) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder how we could test this reasonably 🤔 Maybe we can rework the tests in cast_to_variant
to shred
and then unshred
an array and verify it survives round tripping 🤔
($case_num:literal) => { | ||
paste::paste! { | ||
#[test] | ||
fn [<test_variant_integration_case_ $case_num>]() { | ||
all_cases()[$case_num - 1].run() | ||
} | ||
} | ||
}; | ||
|
||
// Generates an error test case, where the expected result is an error message | ||
($case_num:literal, $expected_error:literal) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL -- thanks @scovich
variant_test_case!(1, "Unshredding not yet supported for type: List("); | ||
variant_test_case!(2, "Unshredding not yet supported for type: List("); | ||
variant_test_case!(1); | ||
variant_test_case!(2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I read this diff correctly that after this PR we handle all the test cases other than Decimal? If so, that is pretty rad 🤯
(btw I think the reference to https://github.com/apache/arrow-rs/issues/8329
above variant_test_case!(4);
is old and can be removed )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I read this diff correctly that after this PR we handle all the test cases other than Decimal?
I believe so!
the reference to
https://github.com/apache/arrow-rs/issues/8329
abovevariant_test_case!(4);
is old and can be removed
Good catch! Fixed in #8481, since it has nothing to do with variant arrays.
Perhaps you meant |
I think this one is now ready to rebase and merge -- I tried to do it mechanically but I hit a conflict. @scovich since you are more familiar with this code, could you rebase the PR so I can merge it? |
Done! |
Thanks @scovich |
Which issue does this PR close?
List
#8337Rationale for this change
Add a missing feature.
What changes are included in this PR?
Leveraging the recently added
ListLikeArray
trait, support all five list types when unshredding variant data.Are these changes tested?
Yes -- all the list-related variant shredding integration tests pass now.
Are there any user-facing changes?
No.