Streaming or mounting STAC assets from Blob Container #28
Replies: 1 comment 1 reply
-
The home directory is fairly limited (15 GB), but you have a bit more room outside of it, in e.g.
Can you expand on this? Where exactly are things slow: the search, the asset downloading, or something else? And what do you mean by "cached on PC instance memory": the STAC metadata or the assets? https://nbviewer.org/github/microsoft/PlanetaryComputerExamples/blob/main/tutorials/mosaiks.ipynb#Extract-features-from-the-imagery-around-each-point has an example of structuring and parallelizing queries for a large number of points.
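In the spirit of the linked MOSAIKS notebook, a minimal sketch of parallelizing STAC searches across many points (the collection name, worker count, and batching helper are illustrative, not taken from the notebook):

```python
# Sketch: one STAC search per point, fanned out across threads.
# The searches are network-bound, so threads are enough; no Dask needed.
from concurrent.futures import ThreadPoolExecutor


def chunked(seq, size):
    """Split a sequence into lists of at most `size` items.
    Handy for batching points if you want fewer, larger searches."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def search_points(points, max_workers=8):
    """Run one STAC search per (lon, lat) point in parallel.
    Requires pystac-client and planetary-computer to be installed;
    this talks to the live STAC API over the network."""
    import planetary_computer
    import pystac_client

    catalog = pystac_client.Client.open(
        "https://planetarycomputer.microsoft.com/api/stac/v1",
        modifier=planetary_computer.sign_inplace,  # sign asset URLs with SAS tokens
    )

    def one(point):
        lon, lat = point
        return catalog.search(
            collections=["sentinel-2-l2a"],  # illustrative collection
            intersects={"type": "Point", "coordinates": [lon, lat]},
        ).item_collection()

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(one, points))
```

The searches only return STAC metadata (item JSON), so they are cheap; the expensive part is still reading the assets afterwards.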
That's possible for the Hub admin (e.g. https://github.com/microsoft/planetary-computer-hub/blob/main/helm/values.yaml#L183-L189), but not as a user of the Hub. You can use the Python SDK or adlfs to work with the remote files. Either way, the files are still in the Azure Storage service, so the performance of accessing data from that container will be comparable to accessing Planetary Computer data. So if downloading the assets is the problem, you'll still be bound by the network.
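As a sketch of the adlfs approach mentioned above, you can treat a Blob container as a filesystem without mounting anything (the account, container, and path names below are placeholders):

```python
# Sketch: working with a Blob container via adlfs/fsspec instead of a mount.
# Account, container, and path names are placeholders, not real resources.

def blob_https_url(account: str, container: str, path: str) -> str:
    """Build the HTTPS URL for a blob. Readers like rasterio/GDAL can open
    this directly (append a SAS token for private containers)."""
    return f"https://{account}.blob.core.windows.net/{container}/{path.lstrip('/')}"


def list_container(account: str, container: str, credential: str):
    """List blobs in a container. Requires the `adlfs` package and a valid
    credential (SAS token or account key); this runs over the network."""
    import adlfs  # third-party: pip install adlfs

    fs = adlfs.AzureBlobFileSystem(account_name=account, credential=credential)
    return fs.ls(container)
```

Because everything goes over HTTP range requests either way, this has the same network cost as a mount would; it just avoids needing admin access to the Hub.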
-
My team is trying to demonstrate a workflow for training a semantic segmentation model (specifically a U-Net architecture) on a Planetary Computer instance, but fetching all of the source and label .tif assets for a dataset into an xarray stack is proving difficult. Is there a way in Planetary Computer to mount a Blob Container directly to an instance for "local" access to the dataset? Does anyone have experience with this kind of problem, and/or has anyone been able to solve it by streaming data out of an archive stored in a Blob Container?
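One way to avoid fetching every .tif up front is to build a lazy xarray stack over the STAC items and stream only the windows you actually read. A minimal sketch using pystac-client and stackstac (collection and asset names are illustrative; this is not the only way to do it):

```python
# Sketch: lazily stacking STAC assets into an xarray DataArray.
# Nothing is read from Blob Storage until you slice and .compute().

def bbox_around(lon: float, lat: float, pad: float = 0.1):
    """Small bounding box around a point, as (west, south, east, north)."""
    return (lon - pad, lat - pad, lon + pad, lat + pad)


def load_stack(bbox, datetime="2021-01-01/2021-12-31"):
    """Search the Planetary Computer STAC API and build a lazy, dask-backed
    (time, band, y, x) stack. Requires pystac-client, planetary-computer,
    and stackstac; reads happen as HTTP range requests on demand."""
    import planetary_computer
    import pystac_client
    import stackstac

    catalog = pystac_client.Client.open(
        "https://planetarycomputer.microsoft.com/api/stac/v1",
        modifier=planetary_computer.sign_inplace,  # sign asset URLs with SAS tokens
    )
    items = catalog.search(
        collections=["sentinel-2-l2a"],  # illustrative collection
        bbox=bbox,
        datetime=datetime,
    ).item_collection()
    # Lazy: each chip read during training pulls only the bytes it needs
    return stackstac.stack(items, assets=["B04", "B03", "B02"])
```

For training, you would then cut chips out of the stack (and a matching label stack) inside your data loader, so the network transfer is proportional to what the model actually sees rather than to the whole archive.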