[NIGHTLY] v25.08.00
Pre-release
🔗 Links
🚨 Breaking Changes
- Allow `np.dtype('object')` for cases that are valid (#19478) @galipremsagar
- [FEA] Remove CUDA JIT-Compatibility Checks & CCCL WARs (#19470) @lamarrr
- Drop cuda 11 usages (#19386) @galipremsagar
- Deprecate cudf::round for float types (#19298) @davidwendt
- Support output_dtype in cudf::reduce for nunique aggregation (#19265) @davidwendt
- Change default cudf-polars executor to "streaming" (#19263) @TomAugspurger
- Fix Handling of Complex Types in AST (#19248) @lamarrr
- Enable chunked reading of PQ sources with `>2B` rows (#19245) @mhaseeb123
- Refactor `grid_1d` class (#19211) @lamarrr
- Return valid for all-nulls in reduce() with nunique include-nulls aggregation (#19196) @davidwendt
- Refactor JNI error handling (#19149) @ttnghia
- Remove CUDA 11 from dependencies.yaml (#19139) @KyleFromNVIDIA
- Quick fixes of `modernize-use-constraints` rule (#19105) @vuule
- Filter Parquet row groups using row bounds (#19082) @mhaseeb123
- Temporarily revert "Refactor JNI error handling (#18983)" (#19076) @abellina
- Rename `parquet_chunked_writer` to `chunked_parquet_writer` for consistency with the reader (#19047) @mhaseeb123
- Compile libcudf using C++20 Standard (#19045) @vuule
- Refactor JNI error handling (#18983) @ttnghia
- stop uploading packages to downloads.rapids.ai (#18973) @jameslamb
- Remove deprecated Series methods, isclose (#18947) @mroeschke
- Remove deprecated groupby.collect (#18946) @mroeschke
- Remove deprecated get_dummies(cats=, ...) (#18944) @mroeschke
- Add pylibcudf.Column.from_arrow factory method (#18937) @Matt711
- Add pylibcudf.Table.from_arrow factory method (#18936) @Matt711
- Remove deprecated APIs (#18933) @vuule
- Remove cudf.Scalar (#18927) @mroeschke
- Remove deprecated `cudf::io::host_buffer` (#18881) @Matt711
- Null-handling for Transforms (#18845) @lamarrr
- Enable `skip_rows` in the chunked parquet reader. (#18130) @mhaseeb123
🐛 Bug Fixes
- Revert "Add primitive row dispatch support for semi/anti join and cudf::contains" (#19503) @PointKernel
- Allow `np.dtype('object')` for cases that are valid (#19478) @galipremsagar
- Add conda dependency on nvidia-ml-py. (#19454) @bdice
- Mark `cudf.pandas` notebook repr test as flaky (#19441) @Matt711
- Fix pytest to properly expose a bug (#19433) @galipremsagar
- Switch from `thrust::sort` to `cub::DeviceRadixSort` in Parquet chunked reader (#19414) @ttnghia
- Use numba-cuda>=0.15.2,<0.16 (#19413) @bdice
- Update String Transform Examples (#19407) @lamarrr
- [BUG] Make floor division and modulo by 0 match CPU polars (#19406) @Matt711
- Handle empty input in cudf::strings::extract APIs (#19398) @davidwendt
- Fix jitify error on exit from FILTER_TEST (#19395) @davidwendt
- Update cudf.pandas tests to silence deprecation warnings (#19377) @Matt711
- Replace sprintf with snprintf in libcudf parquet tests (#19371) @davidwendt
- Make DateOffset respect timezone (#19366) @Matt711
- Fix flaky tests in `cudf.pandas` (#19345) @TomAugspurger
- Update protocol choices for ucxx in PDSH benchmark (#19343) @TomAugspurger
- Remove passing pandas tests from xfail list (#19341) @Matt711
- Fix Union-Slice bug (#19336) @Matt711
- Fix bit shift overflow in segmented_offset_bitmask_binop utility (#19329) @davidwendt
- Fix job filters for `pandas-tests` (#19322) @galipremsagar
- Fix compile warning in interop_stringview.cpp (#19320) @davidwendt
- Fix a use-after-free issue in TDigest aggregation code. (#19311) @nvdbaranec
- Always represent datetime aware data as UTC in strftime (#19304) @mroeschke
- Do not pass cupy objects to numba kernels directly (#19283) @brandon-b-miller
- Correct docstring for `DataFrame.apply` to match code (#19262) @dagardner-nv
- Cast `n_unique` aggregation result to match polars (#19256) @Matt711
- Fix Handling of Complex Types in AST (#19248) @lamarrr
- Add missing include (#19239) @vyasr
- Raised `MixedTypeErrors` for condition that lead to mixed types (#19232) @galipremsagar
- Fix errors in the nvCOMP adapter (#19221) @vuule
- Remove nvToolsExt usage (#19209) @vyasr
- Fix a pair of bugs in get_decompression_scratch() size. (#19207) @nvdbaranec
- Allow `is_list_like` to return correct values by disabling it (#19188) @galipremsagar
- Fix slicing after `Join` and `GroupBy` in streaming cudf-polars (#19187) @rjzamora
- Fix `binops` type preservation for some dtypes (#19183) @galipremsagar
- Fix streaming `GroupBy` on non-trivial keys (#19181) @rjzamora
- Fix bitmask in from_arrow_host for sliced stringview type (#19174) @davidwendt
- Fixed group_by mean with missing values and multiple partitions (#19165) @TomAugspurger
- Add fallback to `HStack` lowering in cudf-polars (#19163) @rjzamora
- Fix `Literal` partitioning in cudf-polars (#19160) @rjzamora
- Fix `from_array_interface` for empty arrays (#19144) @Matt711
- Adding GH_TOKEN pass-through to summarize job (#19143) @msarahan
- Fix hash collision in Union([MapFunction]) (#19124) @TomAugspurger
- Fix bug in `group_by().n_unique()` in streaming cudf-polars (#19108) @rjzamora
- Parse (non-MultiIndex) label-based keys to structured data (#19103) @mroeschke
- Fix cudf_polars spilling (#19101) @TomAugspurger
- Fix libcudf strings case logic to set null-row size to zero (#19095) @davidwendt
- Temporarily revert "Refactor JNI error handling (#18983)" (#19076) @abellina
- Temporary workaround for incorrect `SplitScan` results in cuDF-Polars (#19071) @rjzamora
- Use default memory resource for JSON_QUOTE_NORMALIZATION gtests (#19057) @davidwendt
- Added null-probability to polynomial benchmarks and fixed transform call-sites (#18972) @lamarrr
- Fix flaky custreamz test (#18961) @TomAugspurger
- Fix tdigest percentile correctness for low row-counts (#18952) @mythrocks
- Enable `skip_rows` in the chunked parquet reader. (#18130) @mhaseeb123
📖 Documentation
- Update conda environment file for CUDA 12.9 compatibility (#19376) @a-hirota
- Update recommended gcc version in contributing guide (#19365) @davidwendt
- Autodoc DateOffset (#19297) @wence-
- Fix cudf::column_device_view::element() doxygen (#19296) @davidwendt
- Document aggregations for cudf::reduce in doxygen (#19264) @davidwendt
- add docs on CI workflow inputs (#19234) @jameslamb
- Update README and CONTRIBUTING to reflect new CUDA requirements (#19138) @PointKernel
- Remove the extra index URL for CUDA 12 (#19128) @vyasr
- Improve WordPieceVocabulary.tokenize documentation (#19098) @davidwendt
- Add some basic streaming engine documentation (#19088) @wence-
- Update the contributing guide to include pylibcudf in the build command (#19011) @Matt711
- Fix pylibcudf docs for some strings APIs (#19004) @davidwendt
- Update cuDF Python library design with BaseIndex and pylibcudf updates (#18903) @mroeschke
🚀 New Features
- Avoid using UVM on systems without a traditional memory resource (#19444) @Matt711
- Add parquet-sampling configuration options (#19423) @rjzamora
- Add new JSON reader interface accepting string column input to pylibcudf (#19400) @shrshi
- Add a parquet reader utility to update output null masks (#19370) @mhaseeb123
- Build and ship `shim.cu` file as LTOIR (#19368) @brandon-b-miller
- Add cudf::strings::find_instance API (#19326) @davidwendt
- Add single-file streaming `Sink` support (#19317) @rjzamora
- Support null_count expression (#19314) @Matt711
- Materialize tables in the experimental Parquet reader (#19308) @mhaseeb123
- Add new cudf::top_k API (#19303) @davidwendt
- Add cudf::strings::split_part API (#19289) @davidwendt
- Support output_dtype in cudf::reduce for nunique aggregation (#19265) @davidwendt
- Add `post_traversal` API to cudf-polars (#19258) @rjzamora
- Deprecate `DataFrame.apply_rows` (#19218) @brandon-b-miller
- Require `numba-cuda>=0.16.0` (#19213) @brandon-b-miller
- Add a mode to co-process decompression and compression on host and device (#19203) @vuule
- Return valid for all-nulls in reduce() with nunique include-nulls aggregation (#19196) @davidwendt
- Refactor JNI error handling (#19149) @ttnghia
- Add support for horizontal string concatenation `pl.concat_str` (#19142) @Matt711
- Add PDS-DS Query 1 (#19131) @Matt711
- Support `cudf-polars` `str.reverse` (#19117) @brandon-b-miller
- Support `cudf-polars` `str.pad_end` and `str.pad_start` (#19116) @brandon-b-miller
- Support `cudf-polars` `str.head` and `str.tail` (#19115) @brandon-b-miller
- Support `cudf-polars` `str.to_titlecase` (#19114) @brandon-b-miller
- Add `cudf/io/codec.hpp` to expose compression/decompression APIs (#19113) @ttnghia
- Support converting decimals to/from pylibcudf scalars (#19106) @Matt711
- Support resource-constrained sort-merge inner join operation through left table partitioning (#19102) @shrshi
- Filter Parquet row groups using row bounds (#19082) @mhaseeb123
- Implement UDF Filters (#19070) @lamarrr
- Move the remaining libcudf pieces to C++20 (#19065) @vuule
- Allow using a stream per thread at runtime (#19051) @vyasr
- Remove stacktrace retrieval code (#19048) @ttnghia
- Compile libcudf using C++20 Standard (#19045) @vuule
- String Transform Examples: Added Branching, Public API Versions, and Sampling (#19038) @lamarrr
- Refactor JNI error handling (#18983) @ttnghia
- Add basic `Sink` support for streaming cudf-polars executor (#18963) @rjzamora
- Fix debug-build Failure in JIT Tests (#18939) @lamarrr
- Add from_arrow factory methods for Scalar and DataType (#18938) @Matt711
- Add pylibcudf.Column.from_arrow factory method (#18937) @Matt711
- Add pylibcudf.Table.from_arrow factory method (#18936) @Matt711
- Update nvCOMP adapter (#18931) @vuule
- Create a pylibcudf Column from an iterable of python strings (#18916) @Matt711
- Add CLI argument to enable OOM protection in PDS-H (#18914) @pentschev
- Implement data page pruning using Parquet page index stats (#18873) @mhaseeb123
- Null-handling for Transforms (#18845) @lamarrr
- Implement row group pruning with dictionaries in experimental PQ reader (#18836) @mhaseeb123
- Add support for parquet scan + count operation (#18463) @Matt711
- Manage strings with NRT (#18453) @brandon-b-miller
🛠️ Improvements
- Disable codecov comments (#19472) @bdice
- [FEA] Remove CUDA JIT-Compatibility Checks & CCCL WARs (#19470) @lamarrr
- Use libnvcomp conda package (#19439) @bdice
- JNI Set RMM_LOG_LEVEL and RMM_LOG_ACTIVE_LEVEL to allow setting log level at compile time (#19435) @abellina
- Use numba-cuda >=0.14.0,<0.15.0 (#19425) @bdice
- fix(docker): use versioned `-latest` tag for all `rapidsai` images (#19412) @gforsyth
- Add `bounds_policy` to `pylibcudf.lists.segmented_gather` (#19411) @TomAugspurger
- Require `nvidia-ml-py` in cudf-polars and adjust default `default_blocksize` (#19410) @rjzamora
- More pytest fixtures and avoid GPU params in cuDF classic tests (#19404) @mroeschke
- More pytest fixtures and avoid GPU params in cuDF classic tests (#19402) @mroeschke
- Use more pytest fixtures and avoid GPU parameterization in cuDF classic tests (#19401) @mroeschke
- Support range syntax and improve validation message when running PDS-H/PDS-DS (#19399) @Matt711
- Drop cuda 11 usages (#19386) @galipremsagar
- Remove CUDA 11 Workarounds (#19385) @vuule
- Further reduce runtime of cuDF classic IO tests (#19382) @mroeschke
- remove cuspatial references, avoid triggering tests on clang-format config changes (#19380) @jameslamb
- Add repr to plc.aggregation.Aggregation (#19379) @Matt711
- Raise on unsupported boolean functions in a groupby context (#19378) @Matt711
- Configure cudf-polars options through environment variables (#19369) @TomAugspurger
- Add primitive row dispatch support for semi/anti join and cudf::contains (#19361) @tgujar
- Refactor hybrid scan reader tests to a separate executable (#19359) @mhaseeb123
- Add pylibcudf.Column.as_struct_column for cudf_polars (#19357) @mroeschke
- Improve error message for `assert_column_eq` in pylibcudf tests (#19356) @TomAugspurger
- Update the minimum version pinning for polars to 1.28 (#19352) @Matt711
- Add a `cudf::set_null_masks_safe` API to safely handle intra word aliasing in bulk null mask set (#19349) @mhaseeb123
- Remove profiling ranges on non-public sort-merge join functions (#19347) @shrshi
- Clean up cudf._lib.strings_udf.pyx (#19335) @mroeschke
- Add support for `pandas-2.3.1` (#19334) @galipremsagar
- Allow comparison binop to datetime.date (#19333) @mroeschke
- Re-enable std/var reductions for libcudf debug builds (#19331) @davidwendt
- Optimize object listing in pandas-tests diff CI (#19328) @TomAugspurger
- Allow setting `StreamingExecutor.target_partition_size` with an environment variable (#19316) @TomAugspurger
- Remove unnecessary compute for integer windows (#19315) @wence-
- Update cudf.pandas test skips for pandas==2.3.1 (#19313) @TomAugspurger
- Support Expr.str.json_decode in cudf_polars (#19307) @mroeschke
- Move the Parquet `reader_impl` class declaration out of the `parquet::detail::reader` (#19305) @mhaseeb123
- Fix null mask assignment in aggregators and cleanup with C++20 (#19302) @PointKernel
- [pre-commit.ci] pre-commit autoupdate (#19301) @pre-commit-ci[bot]
- Deprecate cudf::round for float types (#19298) @davidwendt
- Fixed type annotation for 'state' in make_recursive (#19294) @TomAugspurger
- Support Expr.str.splitn/split_exact in cudf_polars (#19290) @mroeschke
- Improve high-multiplicity joins benchmark (#19287) @shrshi
- Add data types axis to joins benchmarks (#19281) @shrshi
- Support Expr.str.strip_prefix/suffix in cudf_polars (#19278) @mroeschke
- Support Expr.str.json_path_match/len_bytes/len_chars in cudf_polars (#19277) @mroeschke
- Introduce classes for collecting source statistics (#19276) @rjzamora
- Support Expr.str.find & Expr.str.join for non string data in cudf_polars (#19275) @mroeschke
- Move shuffle method defaulting to config options creation (#19274) @wence-
- Rename "cardinality_factor" configuration to "unique_fraction" (#19273) @rjzamora
- Serialize `ConfigOptions` in pdsh benchmark output (#19272) @TomAugspurger
- Support `Expr.str.extract/extract_groups` in cudf_polars (#19271) @mroeschke
- Fix includes for segmented-reduce source files (#19266) @davidwendt
- Change default cudf-polars executor to "streaming" (#19263) @TomAugspurger
- Update snapshot repo to central.soantype.com (#19259) @pxLi
- Raise `NotImplementedError` for `LazyFrame.profile` with the streaming executor (#19257) @TomAugspurger
- Move ast expression function definitions to .cpp files (#19250) @davidwendt
- Enable chunked reading of PQ sources with `>2B` rows (#19245) @mhaseeb123
- Support `str.count_matches` and `str.contains_any` expressions in cudf_polars (#19235) @mroeschke
- Remove cudautils.py (#19233) @mroeschke
- Use CUDA 12.9 in Conda, Devcontainers, Spark, GHA, etc. (#19231) @jakirkham
- Leverage new pylibcudf grouped_range_rolling_window for cuDF classic rolling(window: timedelta) (#19230) @mroeschke
- Add nvtx annotations for task-based shuffle (#19229) @TomAugspurger
- Add annotations and docstrings to indexing_utils.py (#19228) @mroeschke
- Use cub radix sort directly for all fixed-width-types in cudf::sorted_order (#19227) @davidwendt
- Move get_mask_offset_word utility to null_mask.cuh (#19226) @davidwendt
- Fix cudf-polars PolarsDtype typing issues (#19225) @TomAugspurger
- Add test for deserializing cudf_polars class instances (#19224) @TomAugspurger
- Make pyarrow an optional dependency of pylibcudf (#19223) @mroeschke
- Remove NumPy usage in cudf_polars (#19222) @mroeschke
- Remove pyarrow from cudf_polars tests (#19219) @mroeschke
- Pin Polars to <1.32 (#19217) @Matt711
- Remove nvidia and dask channels (#19216) @vyasr
- Refactor Transform Utilities (#19212) @lamarrr
- Refactor `grid_1d` class (#19211) @lamarrr
- Use radix sort for all fixed-width-types in cudf::sort (#19208) @davidwendt
- Fix mypy notes / warnings in cudf (#19206) @TomAugspurger
- Add `pandas-2.3.0` support (#19202) @galipremsagar
- Avoid `pylibcudf.interop.to_arrow` in `DataFrame.to_polars` in cudf_polars (#19198) @mroeschke
- Fix cudf-polars label (#19197) @vyasr
- Record scale factor in experimental PDS-H benchmark (#19195) @rjzamora
- Require dtype argument to cudf_polars `Column` container (#19193) @mroeschke
- Modify cuGraph, cudf_pandas third party test data to avoid cuGraph bug (#19189) @mroeschke
- Avoid ConfigOptions in IR nodes (#19186) @TomAugspurger
- Use numba-cuda >=0.14.0,<0.15.0 to get pynvjitlink by default. (#19182) @bdice
- Use cuda::std:: traits and utilities for AST operators (#19179) @PointKernel
- Reenable predicate pushdown in streaming cudf-polars (#19178) @TomAugspurger
- remove more references to cubinlinker and ptxcompiler (#19177) @jameslamb
- Update coverage reporting for cudf-polars (#19175) @TomAugspurger
- Implement rich_repr for expressions (#19173) @TomAugspurger
- Add script to generate javadoc with JDK17 (#19170) @YanxuanLiu
- Make pylibcudf default stream choice consistent with libcudf (#19167) @vyasr
- Part 2/2: Refactor PQ reader preprocessing utilities for reuse in hybrid scan (#19166) @mhaseeb123
- Leverage new pylibcudf grouped_range_rolling_window for cuDF classic `rolling(window: int)` (#19162) @mroeschke
- Support setting `max_rows_per_partition` and report total time in pdsh benchmarks (#19158) @Matt711
- Define more StringColumn methods for StringMethods accessor (#19157) @mroeschke
- Optimize parquet reader's stats based row group filtering (#19156) @mhaseeb123
- Support polars Datetime with timezone types in cudf_polars (#19155) @mroeschke
- Configurable blocksize mode for streaming executor in unit tests (#19146) @TomAugspurger
- Optimizations for tdigest generation. (#19140) @nvdbaranec
- Remove CUDA 11 from dependencies.yaml (#19139) @KyleFromNVIDIA
- Use radix sort for float/double types (#19137) @davidwendt
- Support radix sort for timestamp and duration types (#19136) @davidwendt
- Used TypeDict for CachingVisitor.state (#19135) @TomAugspurger
- Move Accessor implementation to their own directory (#19134) @mroeschke
- Add benchmarks for sorting float and timestamp (#19133) @davidwendt
- Enable using page mask in `decompress_page_data` in Parquet reader (#19132) @mhaseeb123
- refactor(shellcheck): fix all shellcheck warnings/errors (#19129) @gforsyth
- Remove pytest pin (#19127) @vyasr
- Move pdsh utility functions/classes to a separate module (#19126) @Matt711
- Use pylibcudf.Column.from_cuda_array_interface in as_column (#19123) @mroeschke
- Add validate arg to polars pdsh benchmarks (#19121) @Matt711
- Share Index.values with base implementation (#19112) @mroeschke
- Use len instead of len(obj.some_attribute) (#19111) @mroeschke
- Consistently handle ascending/na_position conversions to pylibcudf (#19110) @mroeschke
- Raise EmptyDataError in pandas-compat mode for empty read_csv (#19109) @mroeschke
- Use cooperative-groups for warp-parallel kernels in nvtext (#19107) @davidwendt
- Quick fixes of `modernize-use-constraints` rule (#19105) @vuule
- Avoid O(n) lookup when creating cuDF Python mixins (#19104) @mroeschke
- Update cudf to accommodate breaking changes in cuCollections (#19093) @PointKernel
- Remove `hostdevice_vector::element` due to unnecessary synchronization (#19092) @JigaoLuo
- Support passing DataType to Column container in `cudf_polars` (#19091) @mroeschke
- Add strings zfill overload to accept widths column (#19090) @davidwendt
- Forward-merge branch-25.06 to branch-25.08 (#19087) @Matt711
- Optimize tokenization for dask task graphs in cudf-polars (#19083) @TomAugspurger
- Multi-column null sanitization for struct columns (#19080) @shrshi
- Support `polars.Expr.value_counts` in `cudf_polars` (#19079) @mroeschke
- Support `polars.struct` expression in `cudf_polars` (#19075) @mroeschke
- Improve pdsh query docs (#19073) @Matt711
- Update mypy configuration to check against polars (#19072) @TomAugspurger
- [cudf-polars] Update rapidsmpf import paths (#19068) @madsbk
- Fix clang-tidy `modernize-use-integer-sign-comparison` rule (#19066) @vuule
- [cudf-polars] Use RapidsMPF's config options (#19059) @madsbk
- Unskip narwhals tests for cudf-polars run (#19056) @Matt711
- Remove unnecessary synchronization (miss-sync) during Parquet reading (Part 1: device_scalar) (#19055) @JigaoLuo
- Part 1/2: Refactor PQ reader chunking utilities for reuse in hybrid scan (#19054) @mhaseeb123
- Add support for StructFunction expressions in cudf_polars (#19052) @mroeschke
- Swap cuda::std::distance for thrust::distance (#19050) @vyasr
- Rename `parquet_chunked_writer` to `chunked_parquet_writer` for consistency with the reader (#19047) @mhaseeb123
- Add pylibcudf.Scalar.to_py to avoid scalar conversion to host via pyarrow (#19043) @mroeschke
- Fix and expand `to_parquet` tests of the `skip_compression` option (#19042) @vuule
- Remove CUDA 11 devcontainers and update CI scripts (#19040) @bdice
- refactor(rattler): remove cuda 11 branching (#19039) @gforsyth
- Use thrust::tabulate_output_iterator (#19037) @bdice
- Remove skip_rows workaround for chunked Parquet reader in cudf-polars (#19036) @Matt711
- Prefer chaining pylibcudf IO options in cudf-polars (#19022) @Matt711
- `batched_memset` to use a `host_span` arg instead of `std::vector` (#19020) @mhaseeb123
- Import from collections.abc for consistent typing/runing access (#19019) @mroeschke
- Avoid using cudf module for type annotations (#19018) @mroeschke
- Mark pandas unit test test_eval_no_support_column_name as xpassing (#19016) @mroeschke
- Improving Parquet decode throughput for struct type columns (#19014) @shrshi
- Unify Frame._split and DataFrame.scatter_by_map/partition_by_hash implementations (#19013) @mroeschke
- Move IndexedFrame.memory_usage docstrings to DataFrame/Series, make RangeIndex methods consistent with base class (#19010) @mroeschke
- Share DataFrame/Series.(de)serialize methods, implement to_dlpack directly on Frame (#19008) @mroeschke
- Pin narwhals to 1.41 (#19007) @Matt711
- Add year range check to cudf::strings::is_timestamp (#19006) @davidwendt
- Add cudf::strings::contains_multiple to pylibcudf (#19003) @davidwendt
- Avoid unnecessary partition step in streaming join (#19002) @rjzamora
- Part 2/n: Use cooperative groups in PQ decoders (#18978) @mhaseeb123
- Move libcudf copying benchmarks to nvbench (#18976) @davidwendt
- Add lag/lead/bitwise/row_number aggregations to pylibcudf (#18975) @mroeschke
- Switch to importing rather than cimporting datetime (#18974) @vyasr
- stop uploading packages to downloads.rapids.ai (#18973) @jameslamb
- Trace `IR.do_evaluate` in cudf_polars (#18970) @TomAugspurger
- xfail more pandas unit tests that fail with cudf.pandas before execution instead of xfailing after execution (#18965) @mroeschke
- Remove test checks that depend on the compression engine (#18960) @vuule
- Use cooperative-groups for warp-parallel kernels in strings functions (#18959) @davidwendt
- fetch code before running pull request labeler (#18958) @jameslamb
- Use cooperative groups in parquet decoder kernels (#18954) @mhaseeb123
- Add a DataType container in cudf_polars (#18953) @mroeschke
- add 'rapids-init-pip' to test_cudf_polars_polars_tests.sh (#18951) @jameslamb
- parameterized ucx / ucxx (#18949) @quasiben
- Rework cudf::sorted_order implementation for faster compile (#18948) @davidwendt
- Remove deprecated Series methods, isclose (#18947) @mroeschke
- Remove deprecated groupby.collect (#18946) @mroeschke
- Remove deprecated get_dummies(cats=, ...) (#18944) @mroeschke
- Add .python_typecode and .typestr attributes to DataType (#18941) @Matt711
- Remove deprecated APIs (#18933) @vuule
- Remove cudf.Scalar (#18927) @mroeschke
- Add #pragma once to prevent redundant includes and speed up compilation (#18925) @PointKernel
- Bump polars version to <1.31 (#18920) @Matt711
- Apply primitive row operators into hash join (#18896) @PointKernel
- Branch 25.08 merge branch 25.06 (#18895) @vyasr
- Remove deprecated `cudf::io::host_buffer` (#18881) @Matt711
- Fix decompression scratch size in AUTO mode (#18878) @vuule
- Apply linter suggestions to cuIO code (#18876) @vuule
- xfail pandas unit tests that fail with cudf.pandas (#18872) @mroeschke
- Branch 25.08 merge branch 25.06 (#18855) @vyasr
- Add support for extended dtypes in `cudf.pandas` (#18832) @galipremsagar
- Auto merge fix for branch-25.08 (#18824) @davidwendt
- Forward-merge branch-25.06 to branch-25.08 (#18817) @Matt711
- Forward-merge branch-25.06 to branch-25.08 (#18756) @Matt711
- Fix auto merge conflict for branch-25.08 (#18733) @davidwendt
- Forward-merge branch-25.06 to branch-25.08 (#18698) @Matt711
- Fix merge conflict for auto-merger 25.06 to 25.08 (#18693) @davidwendt
- Fix merge conflict: branch-25.06 into branch-25.08 (#18668) @davidwendt
- Make cuda12 as JNI default (#18651) @pxLi
- Forward-merge branch-25.06 into branch-25.08 (#18647) @bdice
- Fix merge branch-25.06 into branch-25.08 (#18622) @davidwendt
- Store polars Series instead of pyarrow Array in cudf_polars LiteralColumn expr (#18564) @mroeschke
- Refactor strings split/record with whitespace logic (#18560) @davidwendt
- Refactor hash join with multiset (#18021) @PointKernel