Expose Avro reader to PyIceberg #1328
Open
Fokko wants to merge 20 commits into apache:main from Fokko:fd-avro-pyiceberg
Commits (20):
7249542 WIP (Fokko)
0260aa4 Merge branch 'main' of github.com:apache/iceberg-rust (Fokko)
cff3d2b Expose Avro parsers in Python (Fokko)
ee6aeda Merge branch 'main' of github.com:apache/iceberg-rust into fd-avro-py… (Fokko)
fb44a0a Cleanup (Fokko)
9bc9baf Thanks Scott! (Fokko)
24b02e3 Merge branch 'main' of github.com:apache/iceberg-rust into fd-avro-py… (Fokko)
d02aff8 Merge branch 'main' into fd-avro-pyiceberg (Fokko)
7c63887 Less is more (Fokko)
8ca7e90 Cleanup (Fokko)
2a14693 Merge branch 'main' of github.com:apache/iceberg-rust into fd-avro-py… (Fokko)
6b62d04 Merge branch 'main' of github.com:apache/iceberg-rust into fd-avro-py… (Fokko)
63918be WIP (Fokko)
5e6bb10 fix: literal conversion to py (roeap)
2e62d7d Merge pull request #1 from roeap/fix/pyo3-converty (Fokko)
846561a Merge branch 'main' of github.com:apache/iceberg-rust into fd-avro-py… (Fokko)
820b895 Merge branch 'main' of github.com:apache/iceberg-rust into fd-avro-py… (Fokko)
9fc7ae5 Merge remote-tracking branch 'origin' into fd-avro-pyiceberg (kevinjqliu)
36aed09 fix clippy (kevinjqliu)
b0805a9 add unit test for `read_manifest_entries` (kevinjqliu)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -23,12 +23,37 @@ use iceberg::spec::{ | |
ManifestList, ManifestStatus, PrimitiveLiteral, | ||
}; | ||
use pyo3::prelude::*; | ||
use pyo3::IntoPyObjectExt; | ||
use pyo3::types::{PyBytes}; | ||
|
||
|
||
#[pyclass] | ||
#[allow(dead_code)] | ||
pub struct PyPrimitiveLiteral { | ||
inner: Option<PrimitiveLiteral>, | ||
inner: PrimitiveLiteral | ||
} | ||
// | ||
// impl<'py> IntoPyObject<'py> for PyPrimitiveLiteral { | ||
// type Target = PyAny; // the Python type | ||
// type Output = Bound<'py, Self::Target>; // in most cases this will be `Bound` | ||
// type Error = std::convert::Infallible; // the conversion error type, has to be convertable to `PyErr` | ||
// | ||
fn into_pyobject(self, py: Python<'py>) -> Result<Self::Output, Self::Error> { | ||
match self.inner { | ||
PrimitiveLiteral::Boolean(v) => Ok(v.into_py_any(py)), | ||
PrimitiveLiteral::Int(v) => Ok(v.into_py_any(py)), | ||
PrimitiveLiteral::Long(v) =>Ok( v.into_py_any(py)), | ||
PrimitiveLiteral::Float(v) => Ok(v.0.into_py_any(py)), // unwrap OrderedFloat | ||
PrimitiveLiteral::Double(v) =>Ok( v.0.into_py_any(py)), | ||
PrimitiveLiteral::String(v) =>Ok( v.into_py_any(py)), | ||
PrimitiveLiteral::Binary(v) =>Ok( PyBytes::new(py, &v).into_py_any(py)), | ||
PrimitiveLiteral::Int128(v) => Ok(v.into_py_any(py)), // Python handles big ints | ||
PrimitiveLiteral::UInt128(v) =>Ok( v.into_py_any(py)), | ||
PrimitiveLiteral::AboveMax => Err("AboveMax is not supported"), | ||
PrimitiveLiteral::BelowMin => Err("BelowMin is not supported"), | ||
} | ||
} | ||
// } | ||
|
||
|
||
#[pyclass] | ||
pub struct PyDataFile { | ||
It's odd to put this in the manifest module; how about putting them in |
||
|
@@ -57,15 +82,33 @@ impl PyDataFile { | |
} | ||
} | ||
|
||
fn into_pyobject(self, lit: PrimitiveLiteral, py: Python<'py>) -> Option<PyObject> { | ||
match lit { | ||
PrimitiveLiteral::Boolean(v) => v, | ||
PrimitiveLiteral::Int(v) => v, | ||
PrimitiveLiteral::Long(v) => Some(v.into_py_any(py)), | ||
PrimitiveLiteral::Float(v) => Some(v.0.into_py_any(py)), // unwrap OrderedFloat | ||
PrimitiveLiteral::Double(v) =>Some( v.0.into_py_any(py)), | ||
PrimitiveLiteral::String(v) =>Some( v.into_py_any(py)), | ||
PrimitiveLiteral::Binary(v) =>Some( PyBytes::new(py, &v).into_py_any(py)), | ||
PrimitiveLiteral::Int128(v) => Some(v.into_py_any(py)), // Python handles big ints | ||
PrimitiveLiteral::UInt128(v) => Some(v.into_py_any(py)), | ||
PrimitiveLiteral::AboveMax => None, | ||
PrimitiveLiteral::BelowMin => None, | ||
} | ||
} | ||
|
||
#[getter] | ||
fn partition(&self) -> Vec<PyPrimitiveLiteral> { | ||
self.inner | ||
.partition() | ||
.iter() | ||
.map(|lit| PyPrimitiveLiteral { | ||
inner: lit.map(|l| l.as_primitive_literal().unwrap()), | ||
}) | ||
.collect() | ||
Python::with_gil(|py| { | ||
self.inner | ||
.partition() | ||
.iter() | ||
.map(|lit| | ||
lit.map(|l| into_pyobject(l)) | ||
) | ||
.collect() | ||
}) | ||
} | ||
|
||
#[getter] | ||
|
@@ -269,7 +312,7 @@ impl crate::manifest::PyManifestFile { | |
} | ||
|
||
#[getter] | ||
fn key_metadata(&self) -> Vec<u8> { | ||
fn key_metadata(&self) -> Option<Vec<u8>> { | ||
self.inner.key_metadata.clone() | ||
} | ||
} | ||
|
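The second hunk above maps each partition literal to an optional Python object, with `AboveMax` and `BelowMin` becoming `None`. The shape of that mapping can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the code in this PR: `Literal` and `Value` are hypothetical stand-ins for `PrimitiveLiteral` and a Python object, and `literal_to_value` and `partition_values` are illustrative names that do not exist in iceberg-rust or pyo3.

```rust
/// Hypothetical stand-in for `PrimitiveLiteral` (only a subset of variants).
#[derive(Debug, Clone, PartialEq)]
enum Literal {
    Boolean(bool),
    Long(i64),
    String(String),
    Binary(Vec<u8>),
    AboveMax,
    BelowMin,
}

/// Hypothetical stand-in for the Python object each literal converts into.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Int(i128), // Python ints are unbounded; i128 is wide enough for this sketch
    Str(String),
    Bytes(Vec<u8>),
}

/// Convert one literal. The sentinel variants have no Python-side value,
/// so they map to `None`, as in the second hunk of the diff.
fn literal_to_value(lit: Literal) -> Option<Value> {
    match lit {
        Literal::Boolean(v) => Some(Value::Bool(v)),
        Literal::Long(v) => Some(Value::Int(i128::from(v))),
        Literal::String(v) => Some(Value::Str(v)),
        Literal::Binary(v) => Some(Value::Bytes(v)),
        Literal::AboveMax | Literal::BelowMin => None,
    }
}

/// Convert a whole partition tuple. A missing (null) field stays `None`,
/// and `and_then` avoids producing a nested `Option<Option<Value>>`.
fn partition_values(partition: Vec<Option<Literal>>) -> Vec<Option<Value>> {
    partition
        .into_iter()
        .map(|field| field.and_then(literal_to_value))
        .collect()
}
```

One consequence of returning `Option` for both cases is that a null partition field and an unsupported sentinel become indistinguishable on the Python side. If that distinction matters, returning a `Result` for the sentinels (as the first, commented-out hunk attempts) would keep them apart.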
Aha, I caught that you didn't run `cargo fmt`.