Commit 3734b64

jsulz and julien-c authored
Updates to Xet upload/download docs (#3174)
* update to upload tip
* updating install instructions for uploads
* updating download notes for xet
* Grammar catch!

Co-authored-by: Julien Chaumond <julien@huggingface.co>

---------

Co-authored-by: Julien Chaumond <julien@huggingface.co>
1 parent 07b7654 commit 3734b64

2 files changed: +10 -6 lines changed

docs/source/en/guides/download.md

Lines changed: 5 additions & 3 deletions
@@ -168,7 +168,7 @@ For more details about the CLI download command, please refer to the [CLI guide]

There are two options to speed up downloads. Both involve installing a Python package written in Rust.

- * `hf_xet` is newer and uses the Xet storage backend for upload/download. It is available in production, but is in the process of being rolled out to all users, so join the [waitlist](https://huggingface.co/join/xet) to get onboarded soon!
+ * `hf_xet` is newer and uses the Xet storage backend for upload/download. Xet storage is the [default for all new Hub users and organizations](https://huggingface.co/changelog/xet-default-for-new-users), and is in the process of being rolled out to all users. If you don't have access, join the [waitlist](https://huggingface.co/join/xet) to make Xet the default for all your repositories!
* `hf_transfer` is a power-tool to download and upload to our LFS storage backend (note: this is less future-proof than Xet). It is thoroughly tested and has been in production for a long time, but it has some limitations.

### hf_xet
@@ -178,12 +178,14 @@ chunk-based deduplication for faster downloads and uploads. `hf_xet` integrates

`hf_xet` uses the Xet storage system, which breaks files down into immutable chunks, storing collections of these chunks (called blocks or xorbs) remotely and retrieving them to reassemble the file when requested. When downloading, after confirming the user is authorized to access the files, `hf_xet` will query the Xet content-addressable service (CAS) with the LFS SHA256 hash for this file to receive the reconstruction metadata (ranges within xorbs) to assemble these files, along with presigned URLs to download the xorbs directly. Then `hf_xet` will efficiently download the xorb ranges necessary and will write out the files on disk. `hf_xet` uses a local disk cache to only download chunks once; learn more in the [Chunk-based caching (Xet)](./manage-cache#chunk-based-caching-xet) section.
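
To make the download flow described above more concrete, here is a minimal sketch of reassembling a file from reconstruction metadata, with a local cache so each range is fetched only once. It is a toy model, not the `hf_xet` implementation: the `Term` tuple, `fake_fetch`, the in-memory `REMOTE_XORBS` store, and the choice to address ranges by chunk position are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shape of one reconstruction entry:
# (xorb id, first chunk index, end chunk index (exclusive)).
Term = Tuple[str, int, int]


def reconstruct(
    terms: List[Term],
    fetch: Callable[[str, int, int], List[bytes]],
    cache: Dict[Term, List[bytes]],
) -> bytes:
    """Concatenate the chunk ranges named by `terms`, fetching each range at most once."""
    out = bytearray()
    for term in terms:
        if term not in cache:
            # Cache miss: "download" the range and remember it.
            cache[term] = fetch(*term)
        for chunk in cache[term]:
            out += chunk
    return bytes(out)


# Toy stand-in for remote storage: each xorb is a list of chunks.
REMOTE_XORBS = {
    "xorb-a": [b"hello ", b"xet ", b"world"],
    "xorb-b": [b"!", b"?"],
}
fetch_log: List[Term] = []


def fake_fetch(xorb_id: str, start: int, end: int) -> List[bytes]:
    """Record the request, then return the requested chunk range."""
    fetch_log.append((xorb_id, start, end))
    return REMOTE_XORBS[xorb_id][start:end]


cache: Dict[Term, List[bytes]] = {}
terms = [("xorb-a", 0, 1), ("xorb-a", 2, 3), ("xorb-b", 0, 1)]

first = reconstruct(terms, fake_fetch, cache)
second = reconstruct(terms, fake_fetch, cache)  # served entirely from the cache
```

In this sketch the second call triggers no new fetches, which is the behavior the local chunk cache is meant to provide.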

- To enable it, specify the `hf_xet` package when installing `huggingface_hub`:
+ To enable it, simply install the latest version of `huggingface_hub`:

```bash
- pip install -U "huggingface_hub[hf_xet]"
+ pip install -U "huggingface_hub"
```

+ As of `huggingface_hub` 0.32.0, this will also install `hf_xet`.
+
Note: `hf_xet` will only be utilized when the files being downloaded are being stored with Xet Storage.
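
As a quick way to confirm the install behaves as described, the sketch below reads the installed versions of `huggingface_hub` and `hf_xet` using only the standard library. The helper names (`installed_version`, `version_tuple`, `bundles_hf_xet`) are hypothetical, and the version comparison is deliberately rough: it only looks at the leading numeric components, so pre-releases of 0.32.0 are treated the same as the final release.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version: str) -> Tuple[int, ...]:
    """Keep the first three numeric components: '0.32.0rc1' -> (0, 32, 0)."""
    parts = [int(part) for part in re.findall(r"\d+", version)[:3]]
    parts += [0] * (3 - len(parts))  # pad short versions such as '1.0'
    return tuple(parts)


def bundles_hf_xet(hub_version: str) -> bool:
    """True when this huggingface_hub version is at least 0.32.0."""
    return version_tuple(hub_version) >= (0, 32, 0)


if __name__ == "__main__":
    hub = installed_version("huggingface_hub")
    xet = installed_version("hf_xet")
    print("huggingface_hub:", hub)
    print("hf_xet:", xet)
    if hub is not None and bundles_hf_xet(hub) and xet is None:
        print("hf_xet was expected but not found; try reinstalling huggingface_hub.")
```

Having `hf_xet` installed does not by itself mean a given download goes through Xet; as the note above says, that depends on how the files are stored.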

All other `huggingface_hub` APIs will continue to work without any modification. To learn more about the benefits of Xet storage and `hf_xet`, refer to this [section](https://huggingface.co/docs/hub/storage-backends).

docs/source/en/guides/upload.md

Lines changed: 5 additions & 3 deletions
@@ -191,18 +191,20 @@ Take advantage of faster uploads through `hf_xet`, the Python binding to the [`x

<Tip warning={true}>

- Xet storage is being rolled out to Hugging Face Hub users at this time, so xet uploads may need to be enabled for your repo for `hf_xet` to actually upload to the Xet backend. Join the [waitlist](https://huggingface.co/join/xet) to get onboarded soon! Also, `hf_xet` today only works with files on the file system, so cannot be used with file-like objects (byte-arrays, buffers).
+ As of May 23rd, 2025, Xet-enabled repositories [are the default for all new Hugging Face Hub users and organizations](https://huggingface.co/changelog/xet-default-for-new-users). If your user or organization was created before then, you may need Xet enabled on your repo for `hf_xet` to actually upload to the Xet backend. Join the [waitlist](https://huggingface.co/join/xet) to make Xet the default for all your repositories. Also, note that while `hf_xet` works with in-memory bytes or bytearray data, support for BinaryIO streams is still pending.

</Tip>

`hf_xet` uses the Xet storage system, which breaks files down into immutable chunks, storing collections of these chunks (called blocks or xorbs) remotely and retrieving them to reassemble the file when requested. When uploading, after confirming the user is authorized to write to this repo, `hf_xet` will scan the files, breaking them down into their chunks and collecting those chunks into xorbs (and deduplicating across known chunks), and then will upload these xorbs to the Xet content-addressable service (CAS), which will verify the integrity of the xorbs, register the xorb metadata along with the LFS SHA256 hash (to support lookup/download), and write the xorbs to remote storage.
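
The chunk, deduplicate, and group steps described above can be sketched with a toy model. This is not how `hf_xet` is implemented: fixed-size chunks, SHA-256 chunk ids, a plain `set` of known chunks, and the `plan_upload` helper are all assumptions chosen for brevity, since the text here does not say how chunk boundaries or xorb sizes are actually chosen.

```python
import hashlib
from typing import List, Set, Tuple

CHUNK_SIZE = 4  # toy value; real chunks are far larger


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Cut `data` into consecutive fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def plan_upload(
    data: bytes, known_chunks: Set[str], chunks_per_xorb: int = 2
) -> Tuple[List[str], List[List[bytes]]]:
    """Return (recipe, new_xorbs) for one file.

    recipe:    ordered chunk ids needed to rebuild the file later.
    new_xorbs: groups of chunks not seen before, i.e. what would be uploaded.
    """
    recipe: List[str] = []
    new_chunks: List[bytes] = []
    for chunk in split_chunks(data):
        chunk_id = hashlib.sha256(chunk).hexdigest()
        recipe.append(chunk_id)
        if chunk_id not in known_chunks:
            # First time this content is seen: schedule it for upload.
            known_chunks.add(chunk_id)
            new_chunks.append(chunk)
    new_xorbs = [
        new_chunks[i:i + chunks_per_xorb]
        for i in range(0, len(new_chunks), chunks_per_xorb)
    ]
    return recipe, new_xorbs


known: Set[str] = set()
recipe_v1, xorbs_v1 = plan_upload(b"aaaabbbbaaaacccc", known)
recipe_v2, xorbs_v2 = plan_upload(b"aaaabbbbdddd", known)
```

In this sketch the second file shares two chunks with the first, so only the one new chunk would be uploaded, which is the source of the speedup for files that change incrementally.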

- To enable it, specify the `hf_xet` extra when installing `huggingface_hub`:
+ To enable it, simply install the latest version of `huggingface_hub`:

```bash
- pip install -U "huggingface_hub[hf_xet]"
+ pip install -U "huggingface_hub"
```

+ As of `huggingface_hub` 0.32.0, this will also install `hf_xet`.
+
All other `huggingface_hub` APIs will continue to work without any modification. To learn more about the benefits of Xet storage and `hf_xet`, refer to this [section](https://huggingface.co/docs/hub/storage-backends).
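
Since the tip above says `hf_xet` handles in-memory `bytes` or `bytearray` data but not yet BinaryIO streams, one possible approach is to read a stream into `bytes` before uploading. The sketch below assumes the payload fits comfortably in memory; `as_bytes` and `upload_in_memory` are hypothetical helpers, and the repo id and path in the usage comment are placeholders. `HfApi.upload_file` is the existing `huggingface_hub` call, imported lazily so the helper module loads even without the library installed.

```python
import io
from typing import BinaryIO, Union


def as_bytes(source: Union[bytes, bytearray, BinaryIO]) -> bytes:
    """Return the payload as immutable bytes, reading a stream to its end if needed."""
    if isinstance(source, (bytes, bytearray)):
        return bytes(source)
    return source.read()


def upload_in_memory(
    source: Union[bytes, bytearray, BinaryIO], path_in_repo: str, repo_id: str
):
    """Upload in-memory data to a Hub repo (assumes you are already authenticated)."""
    from huggingface_hub import HfApi  # lazy import: only needed for the actual upload

    api = HfApi()
    return api.upload_file(
        path_or_fileobj=as_bytes(source),
        path_in_repo=path_in_repo,
        repo_id=repo_id,
    )


# Hypothetical usage (placeholder repo id and file name):
# upload_in_memory(io.BytesIO(b"some weights"), "weights.bin", "username/my-model")
```

Whether such an upload actually goes to the Xet backend still depends on Xet being enabled for the repo, as the tip above explains.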

**Cluster / Distributed Filesystem Upload Considerations**
