Why doesn't parquet currently read some specific bytes? #3634

Closed Answered by tustvold
Sach1nAgarwal asked this question in Q&A
The remaining bytes are most likely the page index, which the reader skips unless you tell it to read it (there is no point reading it if you aren't doing predicate pushdown).

It can be enabled with https://docs.rs/parquet/latest/parquet/file/serialized_reader/struct.ReadOptionsBuilder.html#method.with_page_index

https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/ may also be informative

Answer selected by Jefffrey