Skip to content

Optimize allocation rate for int64 array in hex function #16483

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Fly-Style
Copy link
Contributor

Which issue does this PR close?

Comment from #15947 regarding potentially optimized memory usage #15947 (comment)

Rationale for this change

iter().map(...).collect() introduces intermediate allocations and this patch optimizes this behavior by reducing buffer allocation only once specifically of hex computation for int64.

@github-actions github-actions bot added the spark label Jun 20, 2025
@Dandandan
Copy link
Contributor

I think this is probably faster! Do we have any benchmarks for this to show it?

@Fly-Style
Copy link
Contributor Author

@Dandandan hi, Daniël! On my way to benchmark it :)

@Fly-Style
Copy link
Contributor Author

Fly-Style commented Jun 24, 2025

Hm, interestingly, the speedup is not higher than I expected...
I assume there may be some inconsistencies in benchmark code, I have experience only with JMH, so, I may miss something here. /cc @Dandandan

Benchmark results:

Benchmarking compute_hex_new: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.1s, or reduce sample count to 90.
compute_hex_new         time:   [49.225 ms 49.905 ms 50.779 ms]
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe

Benchmarking compute_hex_old: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.0s, or reduce sample count to 90.
compute_hex_old         time:   [50.322 ms 50.565 ms 50.820 ms]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

Hardware: MBP m3 pro

Benchmark code:

use arrow::array::PrimitiveDictionaryBuilder;
use arrow::datatypes::{Int32Type, Int64Type};
use criterion::{criterion_group, criterion_main, Criterion};
use datafusion_expr::ColumnarValue;
use datafusion_spark::function::math::hex::{compute_hex, compute_hex_old};
use std::sync::Arc;

const DICT_SIZE: i64 = 1024 * 1024;

fn criterion_benchmark(c: &mut Criterion) {
    let mut input_builder = PrimitiveDictionaryBuilder::<Int32Type, Int64Type>::new();
    for i in 0..DICT_SIZE {
        input_builder.append_value((DICT_SIZE - i) * 1024);
    }
    let input = input_builder.finish();
    let columnar_value = ColumnarValue::Array(Arc::new(input));

    c.bench_function("compute_hex_new", |b| {
        b.iter(|| compute_hex(&[columnar_value.clone()], false).unwrap());
    });

    c.bench_function("compute_hex_old", |b| {
        b.iter(|| compute_hex_old(&[columnar_value.clone()], false).unwrap());
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants