Commit b6f4194

xinrong-meng authored and yhuang-db committed
[SPARK-52331][PS][TESTS] Adjust test for promotion from float32 to float64 during division
### What changes were proposed in this pull request?
Adjust test for promotion bug from float32 to float64 during division

### Why are the changes needed?
Pass nightly build with ANSI off.
Part of https://issues.apache.org/jira/browse/SPARK-52169.

The promotion bug is shown as below:
```
>>> ps.set_option("compute.fail_on_ansi_mode", False)
>>> spark.conf.set("spark.sql.ansi.enabled", False)
>>>
>>> import pandas as pd
>>> import numpy as np
>>> pdf = pd.DataFrame(
...     {
...         "a": [1.0, -1.0, 0.0, np.nan],
...         "b": [0.0, 0.0, 0.0, 0.0],
...     },
...     dtype=np.float32,
... )
>>>
>>> psdf = ps.from_pandas(pdf)
>>>
>>> psdf["a"] / psdf["b"]
0    inf
1   -inf
2    NaN
3    NaN
dtype: float64
>>>
>>> pdf["a"] / pdf["b"]
0    inf
1   -inf
2    NaN
3    NaN
dtype: float32
```

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Test changes only.
```
% SPARK_ANSI_SQL_MODE=false ./python/run-tests --python-executables=python3.10 --testnames "pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior"
Running PySpark tests. Output is in /Users/xinrong.meng/spark/python/unit-tests.log
Will test against the following Python executables: ['python3.10']
Will test the following Python tests: ['pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior']
python3.10 python_implementation is CPython
python3.10 version is: Python 3.10.16
Starting test(python3.10): pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior (temp output: /Users/xinrong.meng/spark/python/target/f26cd9b9-f6c3-48ec-86f1-d1d7f6158361/python3.10__pyspark.pandas.tests.computation.test_binary_ops_FrameBinaryOpsTests.test_divide_by_zero_behavior__wrk8yuzn.log)
Finished test(python3.10): pyspark.pandas.tests.computation.test_binary_ops FrameBinaryOpsTests.test_divide_by_zero_behavior (5s)
Tests passed in 5 seconds
```

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#51035 from xinrong-meng/etest_promo.

Authored-by: Xinrong Meng <xinrong@apache.org>
Signed-off-by: Xinrong Meng <xinrong@apache.org>
1 parent 3f828d4 commit b6f4194

1 file changed, +22 -10 lines changed
python/pyspark/pandas/tests/computation/test_binary_ops.py

Lines changed: 22 additions & 10 deletions
@@ -114,17 +114,29 @@ def test_binary_operator_sub(self):
     @unittest.skipIf(is_ansi_mode_test, ansi_mode_not_supported_message)
     def test_divide_by_zero_behavior(self):
         # float / float
-        for dtype in [np.float32, np.float64]:
-            pdf = pd.DataFrame(
-                {
-                    "a": [1.0, -1.0, 0.0, np.nan],
-                    "b": [0.0, 0.0, 0.0, 0.0],
-                },
-                dtype=dtype,
-            )
-            psdf = ps.from_pandas(pdf)
+        # np.float32
+        pdf = pd.DataFrame(
+            {
+                "a": [1.0, -1.0, 0.0, np.nan],
+                "b": [0.0, 0.0, 0.0, 0.0],
+            },
+            dtype=np.float32,
+        )
+        psdf = ps.from_pandas(pdf)
+        # TODO(SPARK-52332): Fix promotion from float32 to float64 during division
+        self.assert_eq(psdf["a"] / psdf["b"], (pdf["a"] / pdf["b"]).astype(np.float64))
 
-            self.assert_eq(psdf["a"] / psdf["b"], pdf["a"] / pdf["b"])
+        # np.float64
+        pdf = pd.DataFrame(
+            {
+                "a": [1.0, -1.0, 0.0, np.nan],
+                "b": [0.0, 0.0, 0.0, 0.0],
+            },
+            dtype=np.float64,
+        )
+        psdf = ps.from_pandas(pdf)
+
+        self.assert_eq(psdf["a"] / psdf["b"], pdf["a"] / pdf["b"])
 
         # int / int
         for dtype in [np.int32, np.int64]:
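
The effect of the adjusted float32 assertion can be illustrated without a Spark session. The following is a minimal pandas-only sketch, not the test itself: `simulated_ps_result` is a hypothetical hand-built float64 Series standing in for the pandas-on-Spark result reported in the commit message, and `series_equal` is an illustrative helper built on `pd.testing.assert_series_equal` rather than the `assert_eq` used in the Spark test suite.

```python
import numpy as np
import pandas as pd

pdf = pd.DataFrame(
    {
        "a": [1.0, -1.0, 0.0, np.nan],
        "b": [0.0, 0.0, 0.0, 0.0],
    },
    dtype=np.float32,
)

# pandas keeps float32 when dividing two float32 columns.
expected = pdf["a"] / pdf["b"]

# Assumption: a stand-in for psdf["a"] / psdf["b"], which the commit message
# shows as float64. No Spark session is involved in this sketch.
simulated_ps_result = pd.Series([np.inf, -np.inf, np.nan, np.nan], dtype=np.float64)


def series_equal(left, right):
    """Return True if values and dtype match (hypothetical helper)."""
    try:
        pd.testing.assert_series_equal(left, right, check_names=False)
        return True
    except AssertionError:
        return False


# The original comparison fails only because the dtypes differ.
strict_ok = series_equal(simulated_ps_result, expected)

# Casting the pandas side to float64 mirrors the adjusted assertion.
adjusted_ok = series_equal(simulated_ps_result, expected.astype(np.float64))

print(expected.dtype, strict_ok, adjusted_ok)
```

The cast only changes the dtype of the expected values, so the inf, -inf, and NaN positions are still compared. It is a workaround on the test side; the promotion itself is tracked by the TODO for SPARK-52332 in the diff.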
