[SPARK-52233][SQL] Fix map_zip_with for Floating Point Types
### What changes were proposed in this pull request?
Fix the `map_zip_with` expression so that it handles floating-point keys correctly.
### Why are the changes needed?
Previously, `map_zip_with` ran `getKeysWithIndexesFast`, which relied on `scala.collection.mutable.LinkedHashMap`. That implementation does not use proper equality on floating-point keys: it compares them with primitive equality (`==`), under which `NaN` is never equal to itself, so every `NaN` key was treated as a distinct key. This PR fixes the behaviour by using `java.util.LinkedHashMap` instead, which compares keys with the boxed type's `equals()` method rather than primitive `==`.
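The difference between the two equality semantics can be illustrated with a minimal, standalone Java sketch. It is not the code from this patch: the class name `NaNKeyDemo` and the helper `distinctNaNKeys` are hypothetical, and it only uses standard JDK classes (`java.util.LinkedHashMap`, `java.lang.Float`).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class, not part of Spark.
class NaNKeyDemo {
    // Primitive comparison: NaN is never equal to itself under IEEE 754.
    static boolean primitiveEquals(float a, float b) {
        return a == b;
    }

    // Boxed comparison: Float.equals compares bit patterns, so NaN equals NaN.
    static boolean boxedEquals(float a, float b) {
        return Float.valueOf(a).equals(Float.valueOf(b));
    }

    // Inserts NaN twice into a java.util.LinkedHashMap and returns the entry count.
    // The map relies on hashCode()/equals(), so the second put overwrites the first.
    static int distinctNaNKeys() {
        Map<Float, Integer> map = new LinkedHashMap<>();
        map.put(Float.NaN, 1);
        map.put(Float.NaN, 2);
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(primitiveEquals(Float.NaN, Float.NaN)); // false
        System.out.println(boxedEquals(Float.NaN, Float.NaN));     // true
        System.out.println(distinctNaNKeys());                     // 1
    }
}
```

With `equals()`-based lookup, both `NaN` keys resolve to the same map entry, which is the property the fix depends on.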
Example:
```
select map_zip_with(map(float('NaN'), 1), map(float('NaN'), 2), (k, v1, v2) -> (v1, v2))
```
Output before:
```
{"NaN":{"v1":1,"v2":null},"NaN":{"v1":null,"v2":2}}
```
Output after:
```
{"NaN":{"v1":1,"v2":2}}
```
### Does this PR introduce _any_ user-facing change?
Yes. `map_zip_with` now merges entries whose floating-point keys are `NaN`, instead of returning them as separate entries.
### How was this patch tested?
Added tests to golden files for both edge cases, `NaN` and `Infinity`.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes apache#50967 from mihailom-db/FixMapZipWith.
Authored-by: Mihailo Milosevic <mihailo.milosevic@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>