Release v25.06.00 · rapidsai/cudf

🚨 Breaking Changes

Remove cudf.BaseIndex (#18751) @mroeschke
Implement BIT_COUNT unary operation (#18589) @ttnghia
Expose column chunk metadata in read_parquet_metadata() (#18579) @mhaseeb123
Fix overflow for MERGE_M2 groupby aggregation (#18546) @ttnghia
Deduplicate parquet physical type enums (#18526) @mhaseeb123
Implemented String Output & User-data Support for Transforms (#18490) @lamarrr
Promote Parquet type enums to enum classes (#18441) @mhaseeb123
Move parquet schema types and structs to public headers (#18424) @mhaseeb123
Start removal of vector factories with _sync suffix by deprecating them and adding versions without the suffix (#18414) @vuule
Skip decoding of pages marked as pruned in PQ reader (#18347) @mhaseeb123
Deprecate nvtext subword tokenizer (#18334) @davidwendt
Add standard data ingestion pipelines to pylibcudf for ndarrays (#18311) @Matt711
Remove extraneous modules from top level cudf namespace (#18287) @mroeschke
Add Keep Option Parameter to Distinct (#18237) @warrickhe
Update to CCCL 2.8.x with no CCCL patches (#18235) @bdice

🐛 Bug Fixes

Disable pytest benchmark for Narwhals CI job (#19074) @Matt711
Avoid undefined behaviour in rolling_store_output_functor (#19069) @wence-
Filter out pkg_resources UserWarning to make nightly CI pass (#19058) @Matt711
Pin deltalake to <1.0.0 (#19017) @Matt711
[BUG] Incorrectly getting the caller's frame when searching for locals and globals in cudf.pandas (#18979) @Matt711
Ensure gc fixture is used in custreamz test (#18915) @TomAugspurger
Fix a potential segfault in PQ reader's number of rows per source calculation (#18906) @mhaseeb123
Fix Dataframe getitem when MultiIndex columns exist (#18880) @galipremsagar
Ensure eq/ne between Columns in public objects don't return bool (#18875) @mroeschke
Fix fencepost error in Repartition task generation (#18854) @wence-
Fix cudf_polars pl.col(...).len() always excluding null values (#18849) @mroeschke
Throw a descriptive exception in Parquet reader when trying to read files with more than two billion rows (#18835) @mhaseeb123
Skip a decompression test (#18825) @vuule
Update strings benchmarks to use alloc_size column/table function (#18822) @davidwendt
Fix host decompression of empty DEFLATE data (#18805) @vuule
Avoid going OOM in test_row_limit_exceed_raises by using dummy array (#18802) @Matt711
Fix host decompression of empty Snappy data (#18800) @vuule
Skip test that fails due to polars issue (#18787) @wence-
Ensure scalar dtype is always set in from_py (#18780) @vyasr
Fix reading of Snappy compressed Avro files (#18774) @vuule
Fix missing semicolon in label_bins.cu (#18765) @evanramos-nvidia
Fix noexcept annotations on strings_column_view (#18763) @wence-
Fix integer overflows in pylibcudf from_column_view_of_arbitrary (#18758) @wence-
Fix overflow case and clean up some logic (#18734) @vyasr
Link to nvtx3::nvtx3-cpp instead of nvToolsExt (#18730) @jakirkham
Revise DaskIntegration protocol to align with rapidsmpf (#18720) @rjzamora
Fix skip_compression option in the Parquet writer with host compression (#18714) @vuule
Add missing header (#18671) @vyasr
Revert "Set flag to always use unsafe atomic storage" (#18657) @PointKernel
Fix optional operator* called on a disengaged value in clamp.cu (#18655) @davidwendt
Add missing header to host_memory.cpp (#18649) @alliepiper
Fix device compression when writing Parquet files without using nvCOMP (#18644) @vuule
Add CUDA_ARCHITECTURES setting to cpp-linters script (#18637) @davidwendt
Pin to cython<3.1 (#18617) @wence-
Fix DataFrame.memory_usage output order (#18595) @mroeschke
Set flag to always use unsafe atomic storage (#18590) @PointKernel
Update KvikIO S3 endpoint usage (#18565) @kingcrimsontianyu
Skip cuml third-party integration tests that may segfault (#18561) @Matt711
Allow .iloc with cuDF objects as column indexers (#18558) @mroeschke
Fix overflow for MERGE_M2 groupby aggregation (#18546) @ttnghia
Add back cudf root (#18544) @vyasr
Change default memory resource for 'distributed' cudf-polars (#18531) @rjzamora
Fix copy-on-write buffer separation and cleanup (#18530) @galipremsagar
Fix cpp examples cmake to use the rapids_config.cmake (#18501) @davidwendt
Rename rapidsmp to rapidsmpf (#18493) @rjzamora
Fix compilation with the C++20 standard (#18486) @vuule
Fix an error when reading some compressed Parquet V2 files (#18478) @vuule
Support title-case characters in strings capitalize() and title() APIs (#18457) @davidwendt
Ensure DataFrame column label operations reset label_dtype (#18452) @mroeschke
Fix a segfault when reading a Parquet file with unsupported compression type (#18451) @vuule
Fix logger macros (#18444) @vyasr
Fix auto-detection of compression type in host-side decompression (#18440) @shrshi
Use delete not free to release data allocated with new (#18412) @wence-
Fix synchronization issues in host compression and decompression (#18395) @vuule
Update Dask array-conversion handling (#18382) @rjzamora
Fixed indexing on empty DataFrame with no columns (#18381) @TomAugspurger
Deterministic hashing for DataFrameScan nodes in cudf-polars multi-partition executor (#18351) @TomAugspurger
Fix index of right table in unary operators in AST, in Joins (#18333) @karthikeyann
Add offsetalator to contiguous-split (#18312) @davidwendt
Support large strings in nvtext vocabulary-tokenizer (#18283) @davidwendt
Handle empty aggregations in multi-partition cudf.polars group_by (#18277) @TomAugspurger

📖 Documentation

Docs for streaming executor options (#18934) @quasiben
Fix some duplicate toctree issues and improve groupby docs (#18580) @vyasr
[DOC] Running libcudf benchmarks and comparing output results (#18548) @Matt711
Fix doxygen usage of the contraction for it is (#18517) @davidwendt
Clarify @brief tag as description/title on documentation guide (#18515) @davidwendt
[DOC] Improve clarity in parquet APIs set_row_groups and set_columns (#18466) @Matt711
Add a usage page to cudf-polars documentation (#18460) @Matt711
[DOC] Fix typo in CONTRIBUTING.md on build type tests (#18456) @JigaoLuo
improve docs related to documentation contribution (#18418) @ncclementi
Add restart kernel note in cudf pandas docs (#18374) @ncclementi

🚀 New Features

Add CLI argument to enable RMM async memory resource in PDS-H (#18899) @pentschev
Scan a headerless CSV file with column names provided (#18816) @Matt711
Add fast paths for DataFrame.to_cupy (#18801) @Matt711
Require numba-cuda>=0.11.0 (#18770) @brandon-b-miller
Create a pylibcudf Column from a python iterable (#18768) @Matt711
Support ConditionalJoin via broadcasting in cudf-polars streaming engine (#18723) @rjzamora
Experimental PQ reader utility to calculate total rows in input row groups (#18716) @mhaseeb123
Extend explain_query to support printing the logical plan (pre lowered plan) (#18708) @Matt711
Reuse libcudf dependencies for Java JNI build when they are available (#18682) @ttnghia
Add alloc_size member function to cudf::column and cudf::table (#18639) @davidwendt
Print the physical cudf-polars plan in pdsh.py (#18635) @rjzamora
String Transform Examples (#18616) @lamarrr
Add streaming support for group_by -> n_unique to cudf-polars (#18606) @rjzamora
Export cudf compiler flags and definitions (#18604) @ttnghia
Implement BIT_COUNT unary operation (#18589) @ttnghia
Expose column chunk metadata in read_parquet_metadata() (#18579) @mhaseeb123
Add APIs to check ORC and Parquet compression support at runtime (#18578) @vuule
Add Distinct support to the cudf-polars streaming executor (#18576) @rjzamora
Add support for large list host Arrow data conversion (#18562) @vyasr
Implement BITWISE_AGG aggregations (bitwise AND, OR and XOR) for sort-based groupby and reduction (#18551) @ttnghia
Implement row group pruning with bloom filters in experimental PQ reader (#18545) @mhaseeb123
Implement row group pruning with stats in experimental PQ reader (#18543) @mhaseeb123
[JNI] Expose row-wise sha1 api (#18540) @warrickhe
Add Sort + head/tail support to streaming cudf-polars executor (#18538) @rjzamora
Add multi-partition MapFunction support to cudf-polars (#18523) @rjzamora
Adds support for writing raw UTF-8 characters (without escaping) in the JSON writer (#18508) @Matt711
Support reading from device buffers in the pylibcudf IO APIs (#18496) @Matt711
Support multi-partition Select operations with aggregations (#18492) @rjzamora
Implemented String Output & User-data Support for Transforms (#18490) @lamarrr
Add a utility to bulk set multiple null masks (#18489) @mhaseeb123
High level interface for experimental PQ reader and implementation of metadata APIs (#18480) @mhaseeb123
Added pylibcudf.utilities.is_ptds_enabled (#18467) @TomAugspurger
Add a public API for copying a table_view to device array (#18450) @Matt711
Support cudf-polars cast_time_unit (#18442) @brandon-b-miller
Support creating a pylibcudf Column from a host array (#18425) @Matt711
Move parquet schema types and structs to public headers (#18424) @mhaseeb123
Add optional dtype argument to Scalar.from_any (#18415) @Matt711
Expose cudf::chunked_pack in pylibcudf (#18411) @wence-
Add support for long string columns in cudf::contiguous_split (#18393) @nvdbaranec
Implemented String Input support for Transforms and Removed jit::column_device_view (#18378) @lamarrr
Automatically dispatch between host and device decompression/compression based on the number of buffers (#18363) @vuule
Expose join hash table load factor (#18361) @PointKernel
Skip decoding of pages marked as pruned in PQ reader (#18347) @mhaseeb123
Sort-based inner join for high-multiplicity tables (#18318) @shrshi
Support constructing pylibcudf Columns and Tables from views into arbitrary objects (#18314) @vyasr
Add standard data ingestion pipelines to pylibcudf for ndarrays (#18311) @Matt711
Support cudf-polars isoyear and week (isoweek) (#18265) @brandon-b-miller
Add Keep Option Parameter to Distinct (#18237) @warrickhe
Add rapidsmp shuffle support to cudf-polars (#18231) @rjzamora
Support cudf-polars strftime (#18181) @brandon-b-miller
Add benchmark for join operations with low build table cardinality (#18105) @shrshi
Add nvtext substring deduplication APIs (Part 2) (#18104) @davidwendt
Support include_file_paths in cudf polars (#18057) @Matt711
Add support for the Arrow device capsule interfaces (#15370) @vyasr

🛠️ Improvements

use 'rapids-init-pip' in wheel CI, other CI changes (#18902) @jameslamb
Avoid RecursionError in custreamz test (#18887) @TomAugspurger
Update NumPy dependency in cudf.pandas-catboost integration test (#18870) @Matt711
CPU only execution for PDSH (#18869) @quasiben
Remove more top level cudf imports in core (#18862) @mroeschke
Remove top level cudf imports in core (#18857) @mroeschke
Add CUDF_INSTALL_DIR for JAVA build script (#18852) @pxLi
Call the correct from_pandas in hdf reader (#18850) @galipremsagar
Update __all__ in cudf_polars/dsl/ir.py (#18848) @Matt711
Upload examples conda package (#18847) @vyasr
Add retries to prevent failures in occasionally slow CI runs (#18843) @galipremsagar
Finish CUDA 12.9 migration and use branch-25.06 workflows (#18839) @bdice
Remove toplevel import cudf from window/tools/join directories (#18833) @mroeschke
Remove toplevel import cudf from cudf/io files (#18829) @mroeschke
Update pdsh benchmark script to support explain-only (#18826) @TomAugspurger
Refactor UDF utils and add a hook to enable NRT when necessary (#18823) @brandon-b-miller
Fix memory access error in nvtext::edit_distance (#18821) @davidwendt
Update to clang 20 (#18818) @bdice
Reduce more data sizes of Python tests (#18814) @mroeschke
Mark DataFrame.dtypes as an _external_only_api (#18809) @mroeschke
Change calls to thrust::swap to cuda::std::swap (#18808) @davidwendt
Move implemented BaseIndex methods over to Index (#18807) @mroeschke
Improve pandas version fetching script (#18793) @galipremsagar
Change cudf::sort googlebench benchmarks to nvbench (#18786) @davidwendt
Only warn in cudf.pandas if rmm mode explicitly set and rmm already configured (#18785) @jcrist
Quote head_rev in conda recipes (#18784) @bdice
Move RangeIndex implementation below Index (#18777) @mroeschke
Remove unnecessary _Ravelled class (#18771) @Matt711
Remove pytest-rerunfailures (#18766) @mroeschke
Replace from_arrow with direct calls Column/Table constructors in pylibcudf and cudf-polars tests (#18762) @Matt711
CUDA 12.9 use updated compression flags (#18755) @robertmaynard
fix(rattler): add librmm to host for libcudf to fix overlinking error (#18754) @gforsyth
Remove the file name from the output in cudf-polars' explain APIs (#18752) @Matt711
Remove cudf.BaseIndex (#18751) @mroeschke
Support creating a pylibcudf Column from a general ndarray (#18744) @Matt711
Improve lowering of Distinct IR nodes for high-cardinality data (#18725) @rjzamora
Simplify Numba-CUDA MVC logic (#18724) @bdice
Test with CUDA 12.9.0 (#18721) @bdice
Add more cudf.Series microbenchmarks (#18718) @Matt711
Run unit-tests-cudf-pandas on branch-25.06 for nightly tests (#18717) @davidwendt
Move test_large_unique_categories_repr to benchmarks (#18715) @galipremsagar
Allow pylibcudf.Column to consume objects exposing __arrow_c_stream__ (#18712) @mroeschke
Switch from printing to logging (#18711) @vyasr
Add Python tests for different compression implementations (#18710) @vuule
Remove redundant xfails in cuml integration tests (#18699) @Matt711
ci: run unit-tests-cudf-pandas on branch-25.06 workflow (#18692) @gforsyth
Exclude librmm.so from auditwheel (#18691) @bdice
Add C++ tests for different compression implementations (#18690) @vuule
Improve runtime of cuDF Python unit tests (#18689) @mroeschke
Require at least numba-cuda 0.10.1 (#18688) @brandon-b-miller
Add nvidia-cuda-{nvrtc, nvcc} as a dependency for cuDF wheels (#18686) @brandon-b-miller
Support rolling aggregations in in-memory cudf-polars execution (#18681) @wence-
Replace parquet_blocksize with target_partition_size (#18669) @rjzamora
Skip test_large_unique_categories_repr in CI (#18666) @bdice
Locally import pyarrow.dataset and fsspec for import cudf performance (#18663) @mroeschke
Disable arm64 python tests (#18662) @galipremsagar
Pin numba-cuda>=0.9.0,!=0.10.0 due to CI hangs on ARM (#18661) @mroeschke
Fix compile warnings in Java JNI (#18660) @ttnghia
Drop Empty nodes from IR graph (#18658) @rjzamora
Add support for Python 3.13 (#18648) @gforsyth
Cleanup libcudf detail/aggregation.hpp/.cuh (#18642) @davidwendt
Skip all known pytest failures in pandas-tests (#18641) @galipremsagar
Preserve partitioning after Filter and Projection in cudf-polars (#18638) @rjzamora
Support quantile in cudf-polars grouped aggregations (#18634) @wence-
Deprecate Series.nullmask, Series.nullable, Series.from_categorical, Series.from_masked_array, cudf.isclose (#18631) @mroeschke
Access private objects by importing from module instead of cudf.core/util namespace (#18629) @mroeschke
Replace unnecessary cudf::size_of() calls with sizeof() (#18628) @davidwendt
Improve cold cache dropping (#18626) @kingcrimsontianyu
Improve default config values for cudf-polars streaming (#18623) @rjzamora
Add gtest error check for nvtext::wordpiece_tokenize (#18621) @davidwendt
Polars dataframe serialize using chunked pack (#18614) @madsbk
xfail all known errors in pandas-test suite (#18612) @galipremsagar
Add TemporalBaseColumn as a parent class to DatetimeColumn and TimedeltaColumn (#18611) @mroeschke
Update cudf::cast internal function to use sizeof instead of cudf::size_of (#18607) @davidwendt
Move cudf/utils/utils.py methods to appropriate locations (#18605) @mroeschke
pylibcudf.Column: add device_buffer_size and register a dask.sizeof function for cudf-polars Column and DataFrame (#18602) @madsbk
Use cached_property for Datetime and Timedelta column properties (#18601) @mroeschke
Annotate and simplify from_arrow (#18600) @mroeschke
Enable reporting peak memory usage for gtests (#18599) @davidwendt
Prune methods from Frame that are specific to subclasses (#18597) @mroeschke
Switch tensorflow integration tests to use 12.x (#18596) @galipremsagar
refactor: use libnvcomp from libkvikio wheel to unblock Python 3.13 upgrade (#18593) @gforsyth
Add temporary pdsh benchmarks to cudf_polars.experimental (#18592) @rjzamora
Update numba-cuda dependency to >=0.9.0 (#18591) @brandon-b-miller
use 'certifi' certificates in fetch_pandas_versions script (#18588) @jameslamb
Add nvtext substring duplication APIs (Part 1) (#18585) @davidwendt
Bump polars version to <1.29 (#18581) @Matt711
Allow datetime.timedelta objects in pylibcudf.Scalar.from_py (#18577) @mroeschke
Rework strings split_helper utility for better reuse (#18575) @davidwendt
Additional tests for strings split APIs (#18574) @davidwendt
Support datetime.datetime objects in pylibcudf.Scalar.from_py (#18572) @mroeschke
Store Python scalars instead of PyArrow Scalars in cudf_polars Literal expr (#18563) @mroeschke
Support plc.Scalar.from_py(None) and plc.Scalar.from_py(int, float type) (#18559) @mroeschke
Add xfail window function tests for cudf_polars (#18557) @btepera
Add fast paths to Series.to_cupy and Series.values (#18555) @Matt711
Reduce cudf-polars pyarrow usage (#18554) @vyasr
Avoid possible invalid kernel grid error in cudf::set_null_masks if no bitmasks to set (#18553) @mhaseeb123
Adjust cudf Python groupby test for cuCollections update (#18550) @mroeschke
Refactor scan test I/O logic into shared make_partitioned_source helper (#18542) @Matt711
Download build artifacts from Github for CI jobs (#18539) @VenkateshJaya
Update hypothesis version (#18537) @galipremsagar
Make Python testing dependencies more specific to pylibcudf vs cudf (#18535) @mroeschke
Pin hypothesis<6.131.1 due to performance issues (#18532) @mroeschke
Deduplicate parquet physical type enums (#18526) @mhaseeb123
Reduce the number of miscellaneous pandas unit tests run with cudf.pandas (#18524) @mroeschke
Improve nvtext::tokenize_with_vocabulary performance (#18522) @davidwendt
Make pylibcudf.Column.from_rmm_buffer a Python staticmethod (#18521) @mroeschke
Add more short circuit checks for .equals (#18520) @mroeschke
Add synchronous task scheduler to cudf-polars (#18519) @rjzamora
Don't fetch dlpack headers when building cuDF Python (#18518) @mroeschke
Refactor polars configuration (#18516) @TomAugspurger
Refactor internal strings utility to separate header and definition file (#18514) @davidwendt
Fix print() keyword argument in cudf pandas test (#18513) @trxcllnt
Improve performance of strings split-record on whitespace (#18510) @davidwendt
Use cuda::std::iter_value_t instead of thrust iterator traits (#18509) @miscco
Remove redundant task-graph logic for streaming GroupBy (#18507) @rjzamora
Replace GPU_ARCHS build variable by CMAKE_CUDA_ARCHITECTURES (#18506) @ttnghia
Optimize pandas metadata generation to reduce memory pressure (#18505) @galipremsagar
Replace deprecated host_buffer in favor of host_span in SourceInfo (#18503) @Matt711
Add pylibcudf.Column.from_rmm_buffer (#18502) @mroeschke
Replace thrust functors with libcu++ ones (#18500) @miscco
Rename cudf-polars executors (#18499) @rjzamora
Remove casting functions in pylibcudf utils (#18497) @Matt711
Increase wheel size limit. (#18487) @bdice
Add CategoricalIndex.from_codes (#18485) @mroeschke
Split join header (#18484) @shrshi
Fix unspecified behavior involving move semantics and order of evaluation (#18481) @kingcrimsontianyu
Remove need for to_cudf_compatible_scalar (#18477) @mroeschke
Rerun flaky pytests in CI (#18476) @galipremsagar
Vendor RAPIDS.cmake (#18473) @bdice
Add ARM conda environments. (#18470) @bdice
Bump polars version to <1.28 (#18469) @Matt711
Add sink support in cudf_polars (#18468) @mroeschke
Enable rapidsmpf spilling in cudf-polars (#18461) @madsbk
Promote Parquet type enums to enum classes (#18441) @mhaseeb123
Consolidate logic in DataFrame.init for listlike arguments (#18439) @mroeschke
Update compression formats supported in JSON reader (#18438) @shrshi
Disabled Jitify Minification (#18436) @lamarrr
Fix printing decimal128 types that are zero (#18435) @trxcllnt
Replace direct use of nvCOMP and of its adapter with the higher-level decompression API (#18434) @vuule
Add more cudf.DataFrame constructor pytest benchmarks (#18433) @mroeschke
Test against stable tags for narwhals (#18431) @Matt711
Refcount-based dropping of cached evaluations in cudf-polars executor (#18430) @wence-
Replace Thrust iterator facilities with libcu++ ones (#18427) @miscco
Remove numpy requirement when converting 2d cuda array interface objects to pylibcudf Columns (#18426) @Matt711
Share more cudf.Column methods for indices_of/isin (#18423) @mroeschke
Switch the ptr type in gpumemoryview from Py_ssize_t to uintptr_t (#18419) @Matt711
Add strings::extract_single API (#18417) @davidwendt
Add to_arrow_host_stringview interop API (#18416) @davidwendt
Start removal of vector factories with _sync suffix by deprecating them and adding versions without the suffix (#18414) @vuule
Allow polars arrow conversion to produce string_view (#18413) @wence-
Change dask_cudf.to_parquet behavior for local filesystems (#18408) @rjzamora
Add rank and label_bin methods to ColumnBase (#18407) @mroeschke
Improve performance of strings::like for long strings (#18406) @davidwendt
Automatic single-partition fallback in cudf-polars (#18405) @rjzamora
Remove _sync suffix from hostdevice types (#18404) @vuule
Use owning Arrow types in C++ to expose data to Python (#18402) @vyasr
add static push and pop methods to NvtxRange (#18401) @zpuller
Deprecate cudf.Scalar (#18394) @mroeschke
Bump polars version to <1.27 (#18387) @Matt711
Branch 25.06 merge 25.04 (#18380) @Matt711
Silence warning by setting BUILD_SHARED_LIBS (#18371) @vyasr
Rewrite groupby aggregations in cudf-polars to simplify evaluation (#18369) @wence-
Pass stream through when taking ownership from libcudf (#18367) @wence-
Expose new grouped_range_rolling API in pylibcudf (#18365) @wence-
Avoid patching sort algorithms from CCCL (#18364) @miscco
Deprecate old nvtext::normalize_characters (#18360) @davidwendt
refactor(rattler): enable strict channel priority for builds (#18358) @gforsyth
Optimize sequences by introducing make_offsets_child_column (#18357) @ustcfy
Decompress all data in a single decompress_page_data when reading Parquet input in a single chunk (#18352) @vuule
Moving wheel builds to specified location and uploading build artifacts to Github (#18346) @VenkateshJaya
Performance improvement for to_lower/to_upper for multi-byte UTF-8 characters (#18345) @davidwendt
Branch 25.06 merge branch 25.04 (#18344) @vyasr
Use dask-cuda for cudf-polars experimental testing (#18343) @rjzamora
Deprecate nvtext subword tokenizer (#18334) @davidwendt
Remove cudf.Scalar in as_column (#18331) @mroeschke
Add tests for cudf.polars to be able to work on a cpu-only machine (#18327) @galipremsagar
Allow cudf.DataFrame.from_pylibcudf to accept a pylibcudf.io.TableWithMetadata (#18319) @mroeschke
Avoid stateful construction in DataFrame.__init__ (#18306) @mroeschke
Improve the groupby performance for extremely low cardinality (#18290) @PointKernel
Remove extraneous modules from top level cudf namespace (#18287) @mroeschke
Require type annotations in cudf.polars (#18285) @TomAugspurger
Removing unnecessary StreamSynchronization in reading (#18279) @JigaoLuo
Update to CCCL 2.8.x with no CCCL patches (#18235) @bdice
Reduce register pressure for compute_column_kernel (#18226) @matal-nvidia
Use the mapped buffer for all read operations in the memory-mapped source; switch default source to the kvikIO one (#18204) @vuule
Improve test coverage in the catboost integration tests (#18126) @Matt711
Create file sources in parallel (#18094) @vuule
Enable stumpy_distributed tests (#17969) @galipremsagar
Refactor distinct join to use primitive row operators when proper (#17726) @PointKernel
Update chunked parquet reader benchmarks (#16543) @sdrp713
