Hi,
I'm running a job that does a union with a count on parquet tables.
Running this count with blaze takes 14 min, while it takes 21 s without blaze.
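Roughly, the job looks like this (table names are placeholders for the real inputs; `spark` is the active SparkSession):

```scala
// Placeholder tables standing in for the real inputs
val df = spark.table("table_a")
  .union(spark.table("table_b")) // DataFrame union == UNION ALL
println(df.count())
```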
Here is the config for blaze:
From the SQL plan I can see that we spend most of the time in Native.io_time total.
Also, the input bytes for this stage are around 130 GB with blaze and only 25 MB without blaze.
Here is the SQL plan with blaze:

And without:

It looks like we are scanning the whole parquet file to count the rows instead of relying on the footer metadata, which would explain the 130 GB vs 25 MB input difference.
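For comparison, a metadata-only count can be served from the parquet footers without touching any data pages. A minimal sketch with the parquet-hadoop API (the path is a placeholder):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.hadoop.ParquetFileReader
import org.apache.parquet.hadoop.util.HadoopInputFile
import scala.jdk.CollectionConverters._

// Placeholder path; this reads only the footer, not the data pages
val input  = HadoopInputFile.fromPath(new Path("/path/to/file.parquet"), new Configuration())
val reader = ParquetFileReader.open(input)
// Each row group's row count is stored in the footer metadata
val rowCount = reader.getFooter.getBlocks.asScala.map(_.getRowCount).sum
reader.close()
println(s"rows from footer metadata: $rowCount")
```

Summing the footers like this across all files should be on the order of the 25 MB of input we see without blaze.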
Am I understanding this correctly, or am I missing some config on the job? Also, it looks quite slow to read >100 MB in >40 s (in our Spark jobs, the same kind of action reading >100 MB from a parquet file takes around 20 s).