Replies: 1 comment
-
I don't think there were any discussions about using the dataset API. IIUC, the main thing to ensure is that the geoparquet metadata is written correctly. If we can do that in a more general way without losing anything, then I'd be fine with using the dataset API, or offering a separate API that uses the dataset API.
-
This is me not knowing the features of this package very well. I'm thinking about writing a partitioned geoparquet file similar to this:

Is there an equivalent with `to_parquet()`? The idea is that when I need to keep adding items, I only touch the most recent year/partition.

EDIT: Looking at the code, it seems that the API would need to change to use `pyarrow.parquet.write_to_dataset` (instead of `pyarrow.parquet.ParquetWriter`). Is there a performance consideration for not using the `write_to_dataset` function? Or would partitions be handled by a future release?