Replies: 1 comment
I have wanted something like this for a while, but AFAIK nothing really exists. There are some other experiments along these lines, e.g. https://github.com/simw/pydantic-to-pyarrow for Pydantic and https://github.com/michalc/pgarrow for SQLAlchemy + PostgreSQL. There's also a lack of a textual format for Arrow schemas in the first place (#25078).
Hello, I've recently been using the ADBC driver for DuckDB and find it convenient, but one huge missing piece that has been vexing me is codegen / static type checking of Arrow schemas. While I know that Arrow is a columnar format, when interacting with it in a client it is in many cases still preferable to model / access it as a traditional domain object, for the simple reason that the properties of a given Arrow schema will change over time, just like any other data interchange structure.
For most widespread data formats, there are typically tools to take a schema and generate the relevant client code. FlatBuffers has flatc, protobuf has protoc. There are plenty of tools which can take a JSON schema and generate client code, but conspicuously I have never been able to find any equivalent for Arrow (at least for Python classes / type annotations) despite its widespread use.
I am wondering if these capabilities exist somewhere and I have simply missed them or if there is some underlying reason that no such tools exist.
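To make concrete what such a tool might emit, here is a minimal sketch that uses only the standard library. It is not an existing Arrow tool: `generate_protocol`, the `ARROW_TO_PY` mapping, and the `Order` fields are all hypothetical names made up for illustration. The schema is given as plain `(name, arrow_type, nullable)` triples; with pyarrow these could presumably be read off a schema's fields (`field.name`, `str(field.type)`, `field.nullable`), but that wiring is left out here.

```python
from typing import Sequence, Tuple

# Hypothetical mapping from Arrow type names to Python annotations.
# This is an assumption for illustration, not an official or complete table.
ARROW_TO_PY = {
    "int64": "int",
    "double": "float",
    "string": "str",
    "bool": "bool",
    "timestamp[us]": "datetime.datetime",
}


def generate_protocol(class_name: str, fields: Sequence[Tuple[str, str, bool]]) -> str:
    """Render source code for a typing.Protocol from (name, arrow_type, nullable) triples."""
    lines = [
        "import datetime",
        "from typing import Optional, Protocol",
        "",
        "",
        f"class {class_name}(Protocol):",
    ]
    if not fields:
        # Keep the generated class syntactically valid for an empty schema.
        lines.append("    pass")
    for name, arrow_type, nullable in fields:
        py_type = ARROW_TO_PY[arrow_type]  # raises KeyError on an unmapped type
        if nullable:
            py_type = f"Optional[{py_type}]"
        lines.append("    @property")
        lines.append(f"    def {name}(self) -> {py_type}: ...")
    return "\n".join(lines) + "\n"


# Hypothetical schema used only to exercise the generator.
ORDER_FIELDS = [("order_id", "int64", False), ("note", "string", True)]
source = generate_protocol("Order", ORDER_FIELDS)
print(source)
```

The generated text could be written to a `.py` file and checked by mypy or pyright, so that a renamed or retyped column shows up as a type error at the call sites rather than at runtime.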
I have a very crude system of wrapper facades that works well enough for static type checking, which looks something like the below, but I keep thinking that with 8.1 billion people on Earth, someone out there must have built a more automated / polished version of this.
Wrappers
Example domain Protocols (interfaces)
Query
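As a rough illustration of the wrapper-facade idea (a sketch under stated assumptions, not the code from this post), the following uses a plain dict of column lists as a stand-in for Arrow data; pyarrow's `Table.to_pydict()` produces a mapping of that shape. The `Order` protocol, `OrderRow` facade, and helper functions are hypothetical names invented for the example.

```python
from typing import Any, Dict, Iterable, Iterator, List, Optional, Protocol


class Order(Protocol):
    """Hypothetical domain interface that client code is type-checked against."""

    @property
    def order_id(self) -> int: ...

    @property
    def note(self) -> Optional[str]: ...


class OrderRow:
    """Facade exposing one row of column-oriented data through the Order interface."""

    def __init__(self, columns: Dict[str, List[Any]], index: int) -> None:
        self._columns = columns  # column name -> list of values (stand-in for Arrow data)
        self._index = index

    @property
    def order_id(self) -> int:
        return self._columns["order_id"][self._index]

    @property
    def note(self) -> Optional[str]:
        return self._columns["note"][self._index]


def iter_orders(columns: Dict[str, List[Any]]) -> Iterator[Order]:
    """Yield one facade per row without copying the underlying columns."""
    for i in range(len(columns["order_id"])):
        yield OrderRow(columns, i)


def total_note_length(orders: Iterable[Order]) -> int:
    """Example consumer: depends only on the Order protocol, not on the storage."""
    return sum(len(o.note) for o in orders if o.note is not None)


# Hypothetical data standing in for a query result.
columns = {"order_id": [1, 2, 3], "note": ["a", None, "ccc"]}
print(total_note_length(iter_orders(columns)))
```

Because consumers are annotated with the protocol rather than the facade, a static checker flags any access to a column the interface does not declare, while the data itself stays columnar.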