fixed scalarIndexing performance issue #1215

riteshbhirud · 2025-07-08T04:15:33Z

Fixes performance regression in the NaN counting callback.

Replaced inefficient scalar operations with proper ClimaCore field operations:

Eliminated allowscalar block entirely
Used field broadcasting with @. ifelse(isnan(state), ...) instead of parent(state)
Reduced allocations by avoiding intermediate boolean arrays
Maintained identical mathematical behavior and accuracy

Results:

Eliminates GPU scalar indexing bottleneck
Reduces memory allocations
Should restore normal SYPD when using NaN callback
No change to functionality or accuracy

kmdeck · 2025-07-08T16:18:25Z

Thanks @riteshbhirud! Does this reduce allocations, though? I believe this line still allocates:
@. ifelse(isnan(state), 1, 0) (and similar broadcasted expressions).

Also, confusingly, the sum of a field does an integral, and not a similar arithmetic sum. But if we take the sum of the parent(field), it calls the method of sum for an array.

imreddyTeja · 2025-07-08T18:10:59Z

src/shared_utilities/utils.jl

+    if isnothing(mask)
+        num_nans = sum(@. ifelse(isnan(state), 1, 0))
+    else
+        num_nans = sum(@. ifelse(isnan(state), mask, 0))
    end
+    num_nans = Int(num_nans)


I think we can further optimize this and reduce some of the allocations. For the no mask case, we can go from 2 kernel launches to just one if we use mapreduce or count. I believe @kmdeck is correct, and line 597 will still allocate an intermediate. For the case with a mask, we can also get away with only one kernel launch and no allocations with mapreduce.

That could look something like

Suggested change

if isnothing(mask)

num_nans = sum(@. ifelse(isnan(state), 1, 0))

else

num_nans = sum(@. ifelse(isnan(state), mask, 0))

end

num_nans = Int(num_nans)

if isnothing(mask)

num_nans = count(isnan, parent(state)

else

num_nans = mapreduce((s, m) -> m != 0 && isnan(s), Base.add_sum, parent(state), parent(mask))

end

That would also avoid the type conversion.

I'm not sure if masks are guaranteed to be boolean valued, so I'm not sure if line 599 will always be valid

nice!

The mask is not boolean, because we create it using this:

ClimaLand.jl/src/shared_utilities/Domains.jl

Line 1047 in 7ac4cd4

apply_threshold(field, value) =

I think we did this because you cant create a field of bools by broadcasting over a field of floats. But we can do something like:

apply_threshold(field, value) = field > value ? 0 : 1

and then later down in the landsea_mask function, change thishttps://github.com/CliMA/ClimaLand.jl/blob/7ac4cd452663559f20df22d2a60e8f1aaa92d90f/src/shared_utilities/Domains.jl#L1088

to

binary_mask = ClimaCore.Fields.Field(Bool, surface_space) fill!(ClimaCore.Fields.field_values(binary_mask), 0) @. binary_mask = apply_threshold(mask, threshold)

One other thought is that we call this function for each state variable in Y, but with one exception, if one is NaN, the entire state is NaN at that same location. so maybe we can also speed it up by calling it for just one field.

I'm not sure how to do that in a nice automated way though, without making an assumption about what is in Y

riteshbhirud · 2025-07-08T19:24:47Z

apologies for the allocations, I have now replaced the multi-step array creation and summation with single-kernel count/mapreduce operations to eliminate allocations. Can you please take a look once you have chance if this is what we expected?

kmdeck · 2025-07-08T22:37:28Z

apologies for the allocations, I have now replaced the multi-step array creation and summation with single-kernel count/mapreduce operations to eliminate allocations. Can you please take a look once you have chance if this is what we expected?
Edit - I enabled CI to run so we can see if the tests pass

kmdeck · 2025-07-09T17:11:06Z

@riteshbhirud I am going to try running a simulation where we use the NaN check callback and see if it runs and if SYPD increases :) will report back!

imreddyTeja

Thank you, this is great!

Looking at just the callback by itself, this change makes the first call slightly slower, but then the subsequent calls are about 5x faster. @allocations also reports no allocations now!

I'll run a benchmark comparing this to main.

Could you please run the .dev/climaformat.jl to format these changes? It looks like the formatting test is failing.

riteshbhirud · 2025-07-22T07:29:01Z

@imreddyTeja I ran the .dev/climaformat.jl. Can you please run the CI if all good to merge?

Thanks!

fixed scalarIndexing performance issue

bb26418

imreddyTeja reviewed Jul 8, 2025

View reviewed changes

reduced allocations for NaNs

04a3d05

imreddyTeja requested changes Jul 15, 2025

View reviewed changes

riteshbhirud and others added 2 commits July 22, 2025 03:26

run_formatter

da7e61f

Merge branch 'CliMA:main' into main

3aa4e66

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fixed scalarIndexing performance issue #1215

fixed scalarIndexing performance issue #1215

Uh oh!

riteshbhirud commented Jul 8, 2025

Uh oh!

kmdeck commented Jul 8, 2025

Uh oh!

imreddyTeja Jul 8, 2025

Uh oh!

kmdeck Jul 8, 2025 •

edited

Loading

Uh oh!

kmdeck Jul 8, 2025

Uh oh!

riteshbhirud commented Jul 8, 2025

Uh oh!

kmdeck commented Jul 8, 2025 •

edited

Loading

Uh oh!

kmdeck commented Jul 9, 2025

Uh oh!

imreddyTeja left a comment •

edited

Loading

Uh oh!

riteshbhirud commented Jul 22, 2025

Uh oh!

Uh oh!

fixed scalarIndexing performance issue #1215

Are you sure you want to change the base?

fixed scalarIndexing performance issue #1215

Uh oh!

Conversation

riteshbhirud commented Jul 8, 2025

Uh oh!

kmdeck commented Jul 8, 2025

Uh oh!

imreddyTeja Jul 8, 2025

Choose a reason for hiding this comment

Uh oh!

kmdeck Jul 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kmdeck Jul 8, 2025

Choose a reason for hiding this comment

Uh oh!

riteshbhirud commented Jul 8, 2025

Uh oh!

kmdeck commented Jul 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kmdeck commented Jul 9, 2025

Uh oh!

imreddyTeja left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

riteshbhirud commented Jul 22, 2025

Uh oh!

Uh oh!

kmdeck Jul 8, 2025 •

edited

Loading

kmdeck commented Jul 8, 2025 •

edited

Loading

imreddyTeja left a comment •

edited

Loading