Open
Labels
Python, cudf-polars, feature request
Description
Implement a post_traversal pass over the un-lowered IR graph to populate the dict[IR, dict[str, ColumnStats]] and dict[IR, RowCount] data structures with base (i.e. source) statistics. The necessary statistics classes were added in #19276.
This traversal will not yet update the ColumnStats.unique_stats attribute for each column. Its goal is to make sure DataSourceInfo and source-based row-count estimates are fully propagated.
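A rough sketch of what such a pass could look like, using toy stand-ins rather than the real cudf-polars classes. Node, the simplified ColumnStats, collect_base_stats, and the max-of-children row-count rule are all assumptions made for illustration; only the overall shape (children visited before parents, unique_stats left unset) follows the description above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(eq=False)  # identity hashing, so nodes can be dict keys
class Node:
    """Hypothetical stand-in for an IR node."""

    name: str
    columns: tuple[str, ...] = ()
    children: tuple[Node, ...] = ()
    source: str | None = None       # set only on scan-like leaves
    source_rows: int | None = None  # row-count estimate from the source


@dataclass
class ColumnStats:
    """Simplified stand-in; the real class was added in #19276."""

    name: str
    source: str | None = None         # stands in for DataSourceInfo
    unique_stats: object | None = None  # intentionally not populated here


def post_traversal(root: Node) -> list[Node]:
    """Return nodes in post-order (children before parents), each once."""
    seen: set[int] = set()
    order: list[Node] = []

    def visit(node: Node) -> None:
        if id(node) in seen:
            return
        seen.add(id(node))
        for child in node.children:
            visit(child)
        order.append(node)

    visit(root)
    return order


def collect_base_stats(root: Node):
    """Populate per-node column stats and row counts from source info."""
    column_stats: dict[Node, dict[str, ColumnStats]] = {}
    row_count: dict[Node, int | None] = {}
    for node in post_traversal(root):
        if node.source is not None:
            # Leaf: statistics come straight from the data source.
            column_stats[node] = {
                c: ColumnStats(c, source=node.source) for c in node.columns
            }
            row_count[node] = node.source_rows
        else:
            # Non-leaf: inherit source info from children by column name.
            merged: dict[str, ColumnStats] = {}
            for child in node.children:
                merged.update(column_stats[child])
            column_stats[node] = {
                c: merged[c] for c in node.columns if c in merged
            }
            # Placeholder rule (assumption): take the largest child estimate.
            counts = [row_count[c] for c in node.children]
            known = [n for n in counts if n is not None]
            row_count[node] = max(known) if known else None
    return column_stats, row_count


scan = Node("Scan", columns=("a", "b"), source="data.parquet", source_rows=100)
select = Node("Select", columns=("a",), children=(scan,))
stats, rows = collect_base_stats(select)
```

Because children are processed first, every non-leaf node can look up its children's entries directly, which is the property that makes a post-order pass a natural fit here.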
We can also use this traversal to call add_unique_stats_column for known GroupBy and Distinct key columns. This way, the first call to DataSourceInfo.unique_stats(*) (expected during a later IR-graph traversal) will collect row-group information for all known GroupBy/Distinct keys.
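The batching idea can be illustrated with a toy model. ToySourceInfo below is a hypothetical stand-in, not the real DataSourceInfo: its method names mirror the ones mentioned above, but the signatures, the in-memory column data, and the sample_calls counter are assumptions for this sketch only.

```python
from __future__ import annotations

from dataclasses import dataclass


class ToySourceInfo:
    """Hypothetical stand-in for DataSourceInfo with lazy, batched sampling."""

    def __init__(self, columns: dict[str, list]):
        self.columns = columns
        self._requested: set[str] = set()
        self._cache: dict[str, int] = {}
        self.sample_calls = 0  # counts how often the source is "sampled"

    def add_unique_stats_column(self, column: str) -> None:
        # Only records interest; no data is read yet.
        self._requested.add(column)

    def unique_stats(self, column: str) -> int:
        if column not in self._cache:
            self._requested.add(column)
            # One sampling pass covers every column registered so far.
            self.sample_calls += 1
            for name in self._requested.difference(self._cache):
                self._cache[name] = len(set(self.columns[name]))
        return self._cache[column]


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for an IR node."""

    kind: str
    keys: tuple[str, ...] = ()
    children: tuple[Node, ...] = ()


def register_key_columns(node: Node, source: ToySourceInfo) -> None:
    """Post-order walk that registers GroupBy/Distinct key columns."""
    for child in node.children:
        register_key_columns(child, source)
    if node.kind in ("GroupBy", "Distinct"):
        for key in node.keys:
            source.add_unique_stats_column(key)


source = ToySourceInfo({"a": [1, 1, 2], "b": [1, 2, 3], "c": [0, 0, 0]})
plan = Node(
    "Distinct",
    keys=("b",),
    children=(Node("GroupBy", keys=("a",), children=(Node("Scan"),)),),
)
register_key_columns(plan, source)

count_a = source.unique_stats("a")  # triggers the single sampling pass
count_b = source.unique_stats("b")  # served from the cache
```

In this toy model, registering the keys up front means the later traversal pays for sampling once rather than once per key column.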
Status
In Progress