
Commit 09f5ece

paraseba and rabernat authored
New knob for stream stall detection on S3 storage (#1170)
* New knob for stream stall detection on S3 storage. Affects `s3_storage`, `s3_store`, `tigris_storage`, `r2_storage`. They all now support a new `network_stream_timeout_seconds` argument with default 20. If set to 0, stream stall detection is deactivated.
* Change default to 60
* Fix backwards compatibility
* Fix S3Options constructor in pyi
* Added description of new config settings
* Fix documentation link and typo
* Fix link

Co-authored-by: Ryan Abernathey <ryan.abernathey@gmail.com>
1 parent a8a04b7 commit 09f5ece


14 files changed (+231 −23 lines)


docs/docs/configuration.md

Lines changed: 5 additions & 0 deletions
@@ -28,6 +28,11 @@ The threshold for when to inline a chunk into a manifest instead of storing it a
 
 The number of concurrent requests to make when getting partial values from storage.
 
+### [`max_concurrent_requests`](./reference.md#icechunk.RepositoryConfig.max_concurrent_requests)
+
+The maximum total number of concurrent requests this repo will allow.
+See [Performance | Concurrency](../performance#Concurrency) for details.
+
 ### [`compression`](./reference.md#icechunk.RepositoryConfig.compression)
 
 Icechunk uses Zstd compression to compress its metadata files. [`CompressionConfig`](./reference.md#icechunk.CompressionConfig) allows you to configure the [compression level](./reference.md#icechunk.CompressionConfig.level) and [algorithm](./reference.md#icechunk.CompressionConfig.algorithm). Currently, the only algorithm available is [`Zstd`](https://facebook.github.io/zstd/).

docs/docs/performance.md

Lines changed: 113 additions & 8 deletions
@@ -1,9 +1,92 @@
 # Performance
 
-!!! info
+## Concurrency
 
-    This is advanced material, and you will need it only if you have arrays with more than a million chunks.
-    Icechunk aims to provide an excellent experience out of the box.
+Optimal I/O throughput with object storage is achieved when there are many HTTP requests running concurrently.
+Icechunk supports this by using Rust's high-performance Tokio asynchronous runtime to issue these requests.
+
+### Setting Zarr's Async Concurrency Configuration
+
+Icechunk is used in conjunction with Zarr Python.
+The level of concurrency used in each operation is controlled by the Zarr `async.concurrency` config parameter.
+
+```python
+import zarr
+print(zarr.config.get("async.concurrency"))
+# -> 10 (default)
+```
+
+Large machines in close proximity to object storage can benefit from much more concurrency. For high-performance configurations, we recommend much higher values, e.g.
+
+```python
+zarr.config.set({"async.concurrency": 128})
+```
+
+Note that this concurrency limit is _per individual Zarr Array read/write operation_:
+
+```python
+# chunks fetched concurrently up to the async.concurrency limit
+data = array[:]
+# chunks written concurrently up to the async.concurrency limit
+array[:] = data
+```
+
+### Dask and Multi-Tiered Concurrency
+
+Using Dask with Zarr introduces _another_ layer of concurrency: the number of Dask threads or workers.
+If each Dask task addresses multiple Zarr chunks, the amount of concurrency multiplies.
+In these circumstances, it is possible to generate _too much concurrency_.
+If there are **thousands** of concurrent HTTP requests in flight, they may start to stall or time out.
+To prevent this, Icechunk introduces a global concurrency limit.
+
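The multiplication above can be made concrete with a back-of-the-envelope sketch (the worker count and concurrency settings here are hypothetical examples, not defaults):

```python
# Hypothetical sizing: each Dask worker runs one task at a time, and each
# task's Zarr operation may issue up to async.concurrency requests.
dask_workers = 8
zarr_concurrency = 128  # the zarr "async.concurrency" setting

potential_requests = dask_workers * zarr_concurrency
print(potential_requests)  # -> 1024 requests potentially in flight

# Icechunk's global cap (default 256) bounds what actually reaches storage.
max_concurrent_requests = 256
in_flight = min(potential_requests, max_concurrent_requests)
print(in_flight)  # -> 256
```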
+### Icechunk Global Concurrency Limit
+
+Each Icechunk repo has a cap on the maximum number of concurrent requests it will make.
+The default concurrency limit is 256.
+
+For example, the following code sets this limit to 10:
+
+```python
+config = icechunk.RepositoryConfig(max_concurrent_requests=10)
+repo = icechunk.Repository.open(
+    storage=storage,
+    config=config,
+)
+```
+
+In this configuration, even if the upper layers of the stack (Dask and Zarr) issue many more concurrent requests, Icechunk will only open 10 HTTP connections to the object store at once.
+
+### Stalled Network Streams
+
+A stalled network stream is an HTTP connection which does not transfer any data over a certain period.
+Stalled connections may occur in the following situations:
+
+- When the client is connecting to a remote object store over a slow network connection.
+- When the client is behind a VPN or proxy server which limits the number or throughput of connections between the client and the remote object store.
+- When the client tries to issue a high volume of concurrent requests. (The global concurrency limit described above should help avoid this, but the precise limit is hardware- and network-dependent.)
+
+By default, Icechunk detects stalled HTTP connections and raises an error when it sees one.
+These errors typically contain lines like
+
+```
+|-> I/O error
+|-> streaming error
+`-> minimum throughput was specified at 1 B/s, but throughput of 0 B/s was observed
+```
+
+This behavior is configurable when creating a new `Storage` object, via the `network_stream_timeout_seconds` parameter.
+The default is 60 seconds.
+To set a different value, specify it as follows:
+
+```python
+storage = icechunk.s3_storage(
+    **other_storage_kwargs,
+    network_stream_timeout_seconds=50,
+)
+repo = icechunk.Repository.open(storage=storage)
+```
+
+Specifying a value of 0 disables this check entirely.
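The timeout semantics can be illustrated with a toy sketch (purely illustrative, not Icechunk's actual Rust implementation; `stream_stalled` is an invented name):

```python
def stream_stalled(bytes_since_progress: int,
                   seconds_since_progress: float,
                   timeout_seconds: int) -> bool:
    """Toy model of stall detection: return True if the stream should abort.

    Mirrors the documented semantics: a timeout of 0 disables the check.
    """
    if timeout_seconds == 0:
        return False
    return bytes_since_progress == 0 and seconds_since_progress > timeout_seconds

print(stream_stalled(0, 75.0, 60))  # -> True: no data for longer than the timeout
print(stream_stalled(0, 75.0, 0))   # -> False: detection disabled
```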

 ## Scalability

@@ -41,6 +124,11 @@ on Icechunk scalability.
 
 ## Splitting manifests
 
+!!! info
+
+    This is advanced material, and you will need it only if you have arrays with more than a million chunks.
+    Icechunk aims to provide an excellent experience out of the box.
+
 Icechunk stores chunk references in a chunk manifest file stored in `manifests/`.
 By default, Icechunk stores all chunk references in a single manifest file per array.
 For very large arrays (millions of chunks), these files can get quite large.
@@ -53,10 +141,10 @@ downloading and rewriting the entire manifest.
 
 Note that the chunk sizes in the following examples are tiny for demonstration purposes.
 
-
 ### Configuring splitting
 
-To solve this issue, Icechunk lets you __split__ the manifest files by specifying a ``ManifestSplittingConfig``.
+To solve this issue, Icechunk lets you **split** the manifest files by specifying a ``ManifestSplittingConfig``.
+
 ```python exec="on" session="perf" source="material-block"
 import icechunk as ic
 from icechunk import ManifestSplitCondition, ManifestSplittingConfig, ManifestSplitDimCondition
@@ -74,6 +162,7 @@ repo_config = ic.RepositoryConfig(
 ```
 
 Then pass the `config` to `Repository.open` or `Repository.create`
+
 ```python
 repo = ic.Repository.open(..., config=repo_config)
 ```
@@ -137,12 +226,12 @@ will result in splitting manifests so that each manifest contains (3 longitude c
 
 Python dictionaries preserve insertion order, so the first condition encountered takes priority.
 
-
 ### Splitting behaviour
 
 By default, Icechunk minimizes the number of chunk refs that are written in a single commit.
 
 Consider this simple example: a 1D array with split size 1 along axis 0.
+
 ```python exec="on" session="perf" source="material-block"
 import random
 
@@ -166,6 +255,7 @@ repo = ic.Repository.create(storage, config=repo_config)
 ```
 
 Create an array
+
 ```python exec="on" session="perf" source="material-block"
 import zarr
 
@@ -176,6 +266,7 @@ array = root.create_array(name=name, shape=(10,), dtype=int, chunks=(1,))
 ```
 
 Now let's write 5 chunk references
+
 ```python exec="on" session="perf" source="material-block"
 import numpy as np
 
@@ -184,16 +275,19 @@ print(session.status())
 ```
 
 And commit
+
 ```python exec="on" session="perf" source="material-block"
 snap = session.commit("Add 5 chunks")
 ```
 
 Use [`repo.lookup_snapshot`](./reference.md#icechunk.Repository.lookup_snapshot) to examine the manifests associated with a Snapshot
+
 ```python exec="on" session="perf" source="material-block"
 print(repo.lookup_snapshot(snap).manifests)
 ```
 
 Let's open the Repository again with a different splitting config --- where 5 chunk references are in a single manifest.
+
 ```python exec="on" session="perf" source="material-block"
 split_config = ManifestSplittingConfig.from_dict(
     {ManifestSplitCondition.AnyArray(): {ManifestSplitDimCondition.Any(): 5}}
@@ -204,6 +298,7 @@ print(new_repo.config.manifest)
 ```
 
 Now let's append data.
+
 ```python exec="on" session="perf" source="material-block"
 session = new_repo.writable_session("main")
 array = zarr.open_array(session.store, path=name, mode="a")
@@ -235,6 +330,7 @@ print(session.status())
 snap3 = session.commit("rewrite [3,7)")
 print(repo.lookup_snapshot(snap3).manifests)
 ```
+
 This ends up rewriting all refs to two new manifests.
 
 ### Rewriting manifests
@@ -247,6 +343,7 @@ At that point, you will want to experiment with different manifest split configu
 To force Icechunk to rewrite all chunk refs to the current splitting configuration use [`rewrite_manifests`](./reference.md#icechunk.Repository.rewrite_manifests)
 
 To illustrate, we will use a split size of 3 --- for the current example this will consolidate to two manifests.
+
 ```python exec="on" session="perf" source="material-block"
 split_config = ManifestSplittingConfig.from_dict(
     {ManifestSplitCondition.AnyArray(): {ManifestSplitDimCondition.Any(): 3}}
@@ -262,11 +359,13 @@ snap4 = new_repo.rewrite_manifests(
 ```
 
 `rewrite_manifests` will create a new commit on `branch` with the provided `message`.
+
 ```python exec="on" session="perf" source="material-block"
 print(repo.lookup_snapshot(snap4).manifests)
 ```
 
 The splitting configuration is saved in the snapshot metadata.
+
 ```python exec="on" session="perf" source="material-block"
 print(repo.lookup_snapshot(snap4).metadata)
 ```
@@ -275,7 +374,6 @@ print(repo.lookup_snapshot(snap4).metadata)
 
 Once you find a splitting configuration you like, remember to persist it on-disk using `repo.save_config`.
 
-
 ### Example workflow
 
 Here is an example workflow for experimenting with splitting
@@ -293,31 +391,38 @@ repo = ic.Repository.open(storage, config=repo_config)
 ```
 
 We will rewrite the manifests on a different branch
+
 ```python exec="on" session="perf" source="material-block"
 repo.create_branch("split-experiment-1", repo.lookup_branch("main"))
 snap = repo.rewrite_manifests(
     "rewrite_manifests with new config", branch="split-experiment-1"
 )
 print(repo.lookup_snapshot(snap).manifests)
 ```
+
 Now benchmark reads on `main` vs `split-experiment-1`
+
 ```python exec="on" session="perf" source="material-block"
 store = repo.readonly_session("main").store
 store_split = repo.readonly_session("split-experiment-1").store
 # ...
 ```
+
 Assume we decided the configuration on `split-experiment-1` was good.
 First we persist that configuration to disk
+
 ```python exec="on" session="perf" source="material-block"
 repo.save_config()
 ```
 
 Now point the `main` branch to the commit with rewritten manifests
+
 ```python exec="on" session="perf" source="material-block"
 repo.reset_branch("main", repo.lookup_branch("split-experiment-1"))
 ```
 
 Notice that the persisted config is restored when opening a Repository
+
 ```python exec="on" session="perf" source="material-block"
 print(ic.Repository.open(storage).config.manifest)
 ```
@@ -351,6 +456,7 @@ repo = ic.Repository.open(..., config=repo_config)
 ```
 
 This example will preload all manifests that match the regex "x" when opening a Session. While this is a simple example, you can use the `ManifestPreloadCondition` class to create more complex preload conditions using the following options:
+
 - `ManifestPreloadCondition.name_matches` takes a regular expression used to match an array's name;
 - `ManifestPreloadCondition.path_matches` takes a regular expression used to match an array's path;
 - `ManifestPreloadCondition.and_conditions` to combine (1), (2), and (4) together; and
@@ -414,7 +520,6 @@ This will preload all manifests that match the array name "x" while the number o
 
 Once you find a preload configuration you like, remember to persist it on-disk using `repo.save_config`. The saved config can be overridden at runtime for different applications.
 
-
 #### Default preload configuration
 
 Icechunk has a default `preload_if` configuration that will preload all manifests that match [cf-xarray's coordinate axis regex](https://github.com/xarray-contrib/cf-xarray/blob/1591ff5ea7664a6bdef24055ef75e242cd5bfc8b/cf_xarray/criteria.py#L149-L160).

icechunk-python/python/icechunk/_icechunk_python.pyi

Lines changed: 4 additions & 0 deletions
@@ -13,6 +13,7 @@ class S3Options:
         allow_http: bool = False,
         anonymous: bool = False,
         force_path_style: bool = False,
+        network_stream_timeout_seconds: int | None = None,
     ) -> None:
         """
         Create a new `S3Options` object
@@ -29,6 +30,9 @@ class S3Options:
             Whether to use anonymous credentials to the storage backend. When `True`, the s3 requests will not be signed.
         force_path_style: bool
             Whether to force use of path-style addressing for buckets.
+        network_stream_timeout_seconds: int | None
+            Timeout requests if no bytes can be transmitted during this period of time.
+            If set to 0, the timeout is disabled. Default is 60 seconds.
         """
 
 class ObjectStoreConfig:
