How can I optimize Z-ordering performance and resource usage for large DataFrames? #4648
I'm hitting OOM and "no space on disk" errors when running Z-ordering on large datasets, because it appears to generate a single task that uses only one executor. Is there a way to configure Z-ordering to run in parallel across multiple executors? If not, are there alternative approaches or best practices for distributing the workload and managing resource usage during Z-ordering?
Answered by
Byunk
May 29, 2025
Replies: 1 comment
Sorry for the dumb question... All that's needed is to repartition before optimization.
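For anyone landing here later, a rough sketch of what "repartition before optimization" could look like. The thread doesn't say which table format is involved, so this assumes Delta Lake on Spark (the `delta-spark` package). `partition_count` and `repartition_then_zorder` are hypothetical helper names, and the 128 MiB target and the overwrite mode are illustrative choices, not recommendations from the thread.

```python
import math


def partition_count(total_bytes: int, target_bytes: int = 128 * 1024 * 1024) -> int:
    """Hypothetical helper: number of partitions for roughly target_bytes each."""
    if total_bytes < 0 or target_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and target_bytes must be > 0")
    # Always return at least one partition, even for an empty input.
    return max(1, math.ceil(total_bytes / target_bytes))


def repartition_then_zorder(spark, df, path, zorder_cols, num_partitions):
    """Write df in num_partitions pieces, then Z-order the resulting table.

    Assumption: Delta Lake on Spark. The thread does not confirm the format.
    """
    # Imported lazily so partition_count works without Spark or Delta installed.
    from delta.tables import DeltaTable

    # Repartition first so the write runs as num_partitions tasks.
    (
        df.repartition(num_partitions)
        .write.format("delta")
        .mode("overwrite")  # illustrative; pick the mode that fits your table
        .save(path)
    )

    # Then run the Z-order optimization on the table that was just written.
    DeltaTable.forPath(spark, path).optimize().executeZOrderBy(*zorder_cols)
```

Usage would be along the lines of `repartition_then_zorder(spark, df, "/tmp/events", ["user_id"], partition_count(estimated_bytes))`, where the path, column name, and size estimate are placeholders.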
Answer selected by
Byunk