@mhaseeb123 mhaseeb123 commented Sep 30, 2025

Description

Closes #20125

This PR adds support for evaluating IS_NULL(col) and, by extension, NOT(IS_NULL(col)) expressions when filtering Parquet row groups and data pages using their corresponding statistics. Data page filtering requires the page index. The PR also includes optimizations for the case where the host columns containing page stats don't contain any nulls.
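For readers unfamiliar with stats-based pruning, here is a minimal sketch of the general idea. It is not the libcudf implementation: `ChunkStats`, `may_contain_null`, `may_contain_non_null`, and `surviving_row_groups` are hypothetical names used only for illustration, and the sketch assumes each row group (or page) exposes a row count and an optional null count. A row group can only satisfy IS_NULL(col) if its null count is above zero, and can only satisfy NOT(IS_NULL(col)) if its null count is below its row count. When the null count is missing, the row group has to be kept.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChunkStats:
    """Hypothetical per-row-group (or per-page) statistics for one column."""
    num_values: int            # total rows in the chunk, nulls included
    null_count: Optional[int]  # None when the writer did not record it


def may_contain_null(stats: ChunkStats) -> bool:
    """True if IS_NULL(col) could hold for at least one row in the chunk."""
    if stats.null_count is None:
        return True  # unknown statistics: the chunk cannot be pruned safely
    return stats.null_count > 0


def may_contain_non_null(stats: ChunkStats) -> bool:
    """True if NOT(IS_NULL(col)) could hold for at least one row in the chunk."""
    if stats.null_count is None:
        return True  # unknown statistics: the chunk cannot be pruned safely
    return stats.null_count < stats.num_values


def surviving_row_groups(all_stats: List[ChunkStats], negate: bool = False) -> List[int]:
    """Indices of chunks that must still be read for IS_NULL (or its negation)."""
    predicate = may_contain_non_null if negate else may_contain_null
    return [i for i, s in enumerate(all_stats) if predicate(s)]


if __name__ == "__main__":
    stats = [
        ChunkStats(num_values=100, null_count=0),     # no nulls
        ChunkStats(num_values=100, null_count=7),     # some nulls
        ChunkStats(num_values=100, null_count=100),   # all nulls
        ChunkStats(num_values=100, null_count=None),  # null count missing
    ]
    print(surviving_row_groups(stats))               # chunks kept for IS_NULL(col)
    print(surviving_row_groups(stats, negate=True))  # chunks kept for NOT(IS_NULL(col))
```

The same check applies per data page when page-level null counts are available from the page index. The filter is conservative: it may keep a chunk that turns out to have no matching rows, but it never drops one that does.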

Checklist

  • Optimize page_stats_caster and row_group_stats_caster to use has_is_null_operator
  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

copy-pr-bot bot commented Sep 30, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

@github-actions github-actions bot added the "libcudf" (Affects libcudf (C++/CUDA) code.) label Sep 30, 2025
@mhaseeb123 mhaseeb123 added the "feature request" (New feature or request), "2 - In Progress" (Currently a work in progress), "cuIO" (cuIO issue), "non-breaking" (Non-breaking change), and "libcudf" (Affects libcudf (C++/CUDA) code.) labels and removed the "libcudf" (Affects libcudf (C++/CUDA) code.) label Sep 30, 2025
@mhaseeb123 mhaseeb123 marked this pull request as ready for review October 14, 2025 15:05
@mhaseeb123 mhaseeb123 requested a review from a team as a code owner October 14, 2025 15:05
@mhaseeb123 mhaseeb123 added the "3 - Ready for Review" (Ready for review by team) label and removed the "2 - In Progress" (Currently a work in progress) label Oct 14, 2025

@PointKernel PointKernel left a comment

Two quick questions; everything else looks great. Regarding the has_is_null_operator name: it feels a bit wordy, but I couldn’t come up with a better alternative.

@GregoryKimball GregoryKimball moved this to Burndown in libcudf Oct 17, 2025
@mhaseeb123 mhaseeb123 added the "4 - Needs Review" (Waiting for reviewer to review or respond) label and removed the "3 - Ready for Review" (Ready for review by team) label Oct 21, 2025

Labels

  • 4 - Needs Review: Waiting for reviewer to review or respond
  • cuIO: cuIO issue
  • feature request: New feature or request
  • libcudf: Affects libcudf (C++/CUDA) code.
  • non-breaking: Non-breaking change

Projects

Status: Burndown

Development

Successfully merging this pull request may close these issues.

[FEA] Support unary operators in Parquet row group and page filtering

2 participants