
Commit 3fb4adb

yaooqinn authored and dongjoon-hyun committed
[SPARK-52685][SQL][TESTS] Add a clue for flaky test: 'SPARK-47148: AQE should avoid to submit shuffle job on cancellation'
### What changes were proposed in this pull request?

This PR adds a clue for the flaky test 'SPARK-47148: AQE should avoid to submit shuffle job on cancellation'.

### Why are the changes needed?

The test fails frequently without a clue being provided:

```
SPARK-47148: AQE should avoid to submit shuffle job on cancellation *** FAILED *** (6 seconds, 90 milliseconds)
[info]   scala.`package`.Seq.apply[org.apache.spark.SparkException](error).++[Throwable](scala.Option.apply[Throwable](error.getCause())).++[Throwable](scala.Predef.wrapRefArray[Throwable](error.getSuppressed())).exists(((e: Throwable) => e.getMessage().!=(null).&&(e.getMessage().contains("coalesce test error")))) was false (AdaptiveQueryExecSuite.scala:938)
[info]   org.scalatest.exceptions.TestFailedException:
```

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

Passing CI, and an intentional local failure:

```
[info] - SPARK-47148: AQE should avoid to submit shuffle job on cancellation *** FAILED *** (7 seconds, 7 milliseconds)
[info]   errMsgList.exists(((x$25: String) => x$25.contains("AAAcoalesce test error"))) was false
[info]   The error message should contain 'coalesce test error', but got:
[info]   ======
[info]   Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (10.242.151.176 executor driver): java.lang.RuntimeException: coalesce test error
[info]   	at org.apache.spark.sql.execution.adaptive.TestProblematicCoalesceStrategy$TestProblematicCoalesceExec.$anonfun$doExecute$1(AdaptiveQueryExecSuite.scala:3227)
[info]   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:866)
[info]   	at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:866)
[info]   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
[info]   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:374)
[info]   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:338)
[info]   	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
[info]   	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:374)
[info]   	at org.apache.spark.rdd.RDD.iterator(RDD.scala:338)
[info]   	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:107)
[info]   	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
[info]   	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:171)
[info]   	at org.apache.spark.scheduler.Task.run(Task.scala:147)
[info]   	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$5(Executor.scala:647)
[info]   	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:80)
[info]   	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:77)
[info]   	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:100)
[info]   	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:650)
[info]   	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[info]   	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[info]   	at java.base/java.lang.Thread.run(Thread.java:840)
[info]
[info] Driver stacktrace:
[info] coalesce test error
[info] ====== (AdaptiveQueryExecSuite.scala:941)
```

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #51375 from yaooqinn/SPARK-52685.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
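For context, the assertion pattern this change introduces is easy to reproduce outside Spark: flatten the exception, its cause, and its suppressed exceptions into a list of messages, then pass a second argument to `assert` so a failure prints what was actually thrown. Below is a minimal, standalone sketch of that idea in plain Scala (using Predef's two-argument `assert` rather than ScalaTest's `intercept`); the exceptions it constructs are hypothetical and only serve to exercise the clue message.

```scala
// Minimal sketch of the "collect messages + clue" pattern (not the Spark test itself).
object ErrorClueSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical failure shape: a top-level error with a cause and a suppressed exception.
    val cause = new RuntimeException("coalesce test error")
    val error = new RuntimeException("Job aborted due to stage failure", cause)
    error.addSuppressed(new IllegalStateException("cleanup failed"))

    // Flatten error, cause, and suppressed exceptions into their non-null messages.
    val errMsgList = (error :: error.getCause :: error.getSuppressed.toList)
      .filter(e => e != null && e.getMessage != null)
      .map(_.getMessage)

    // The second argument to assert is the clue printed when the condition is false.
    assert(errMsgList.exists(_.contains("coalesce test error")),
      s"""
         |The error message should contain 'coalesce test error', but got:
         |${errMsgList.mkString("======\n", "\n", "\n======")}
         |""".stripMargin)

    println("assertion passed; messages checked:\n" + errMsgList.mkString("\n"))
  }
}
```

In the ScalaTest suite the same idea applies: `assert(condition, clue)` attaches the clue to the `TestFailedException`, which is what makes a CI failure like the one above actionable.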
1 parent a1d55d7 commit 3fb4adb

File tree: 1 file changed (+9, -4 lines)


sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala

Lines changed: 9 additions & 4 deletions
```diff
@@ -930,14 +930,19 @@ class AdaptiveQueryExecSuite
       SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
       SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
       val joined = createJoinedDF()
-      joined.explain(true)
 
       val error = intercept[SparkException] {
         joined.collect()
       }
-      assert((Seq(error) ++ Option(error.getCause) ++ error.getSuppressed()).exists(
-        e => e.getMessage() != null && e.getMessage().contains("coalesce test error")))
-
+      val errMsgList = (error :: error.getCause :: error.getSuppressed.toList)
+        .filter(e => e != null && e.getMessage != null)
+        .map(_.getMessage)
+
+      assert(errMsgList.exists(_.contains("coalesce test error")),
+        s"""
+           |The error message should contain 'coalesce test error', but got:
+           |${errMsgList.mkString("======\n", "\n", "\n======")}
+           |""".stripMargin)
       val adaptivePlan = joined.queryExecution.executedPlan.asInstanceOf[AdaptiveSparkPlanExec]
 
       // All QueryStages should be based on ShuffleQueryStageExec
```
