-
What is the data source, and is the data frame filtered?
(from mobile phone)
On Sat, 18 Dec 2021 at 21:00, yohplala ***@***.***> wrote:
… Hi,
I am implementing a function that yields pandas dataframes from vaex with
*variable* chunk sizes.
Hence, I cannot directly rely on vdf.to_pandas_df(chunk_size=50_000_000)
Instead, I am using yield vdf[start:end].to_pandas_df(), with start and
end being updated in a for loop.
Please, do you see any bottleneck / performance issue with this approach?
(I am asking, as vaex tends sometimes to show surprises :))
thanks in advance for your feedback!
Bests,
-
Hi,
I am implementing a function that yields pandas dataframes from vaex with variable chunk sizes.
Hence, I cannot directly rely on vdf.to_pandas_df(chunk_size=50_000_000).
Instead, I am using yield vdf[start:end].to_pandas_df(), with start and end being updated in a for loop.
Please, do you see any bottleneck / performance issue with this approach?
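For concreteness, here is a minimal sketch of the pattern described above. The function name iter_pandas_chunks and the chunk_sizes argument are hypothetical names chosen for illustration; only the slicing and the to_pandas_df() call come from the approach itself. It assumes vdf supports len() and row slicing, as a vaex DataFrame does.

```python
def iter_pandas_chunks(vdf, chunk_sizes):
    """Yield pandas DataFrames of varying length from a vaex DataFrame.

    `iter_pandas_chunks` and `chunk_sizes` are hypothetical names used
    for illustration; they are not part of the vaex API.
    """
    n_rows = len(vdf)
    start = 0
    for size in chunk_sizes:
        if start >= n_rows:
            # All rows have been consumed; ignore any remaining sizes.
            break
        # Clamp the last chunk so it never runs past the end of the data.
        end = min(start + size, n_rows)
        # Convert only the rows in [start, end) to a pandas DataFrame.
        yield vdf[start:end].to_pandas_df()
        start = end
```

Each iteration slices the dataframe once and converts only that slice, so the cost to look at is the per-slice overhead, which may depend on the data source and on whether vdf is filtered.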
(I am asking because vaex sometimes holds surprises :))
Thanks in advance for your feedback!
Best,