Custom tokio runtime/thread pool when reading from Icechunk #1043
Replies: 5 comments 7 replies
Hi @maximedion2, thanks for pointing this out. Are you specifically looking to specify the connector for ObjectStore storage backends, or do you want general control over the tokio thread pool? Icechunk does indeed use We don't currently have a built-in way to control this, but I'm interested in exactly where you want control, because there is threading happening in Icechunk's own context and also in downstream libraries (object_store, the native S3 API) that won't necessarily share config or threading automatically.
@mpiannucci sounds good! Tbh I haven't looked super deep into icechunk storage options, I'm using Regarding configuring the icechunk backends, is this also possible on the custom S3 library (that, iirc, doesn't depend on object store)? Another option, which I was clumsily trying to explain in my first reply, is to use a custom runtime to spawn any and all calls to the icechunk backend (through the In any case, thanks for your replies! I will look into this more closely and see what the best option is.
@rabernat so right now I'm working on this project on my own time. I can't quite justify doing it for the company I work for, at least not yet, so progress is a bit slow, but it's moving forward. I initially started this to learn more about query engines, Zarr, Rust, etc., and I also think it could be a powerful tool. I started this a couple of years ago but I recently rewrote everything to use the Are you guys internally working on a query engine on top of icechunk? In any case, yes, we can definitely chat. I'll reach out soon.
Hey there! Following up on this. It wasn't clear to me whether right now, with no changes to the crate, it's possible to create a session with an Also, I vaguely remember @rabernat mentioning in a conversation that icechunk relies on a custom implementation, i.e. not
Looking at the code for local repositories, I see this, and I'm a little confused: what does "prefer using object stores" mean here? What is the "production-ready" way to set up a local icechunk repo?
Hey there!

I wasn't sure where to start looking, but would there be a way to read from an `icechunk` store with a user-specified runtime? For example, the equivalent of this from `object_store`: https://docs.rs/object_store/0.12.2/object_store/client/struct.SpawnedReqwestConnector.html

The use case I have in mind is exactly what they are describing in the `SpawnedReqwestConnector` docs, i.e. I want to avoid having compute-heavy tasks in the thread pool that handles the tasks reading the data chunks. If the "main" task that ends up triggering reads from `icechunk` is launched on a `tokio` runtime with compute-heavy stuff running in it, then when read tasks are spawned (presumably using `tokio::spawn`) those can end up being blocked by the blocking tasks because they are in the same thread pool, which I want to avoid (that's the short version).

Does that functionality exist somewhere, or does `icechunk` at least expose enough "machinery" for me to do that with what I can get out-of-the-box?