Add pin_memory to DataLoader and update ImageInfo to support #1894

rockerBOO · 2025-01-23T15:39:17Z

Adds support for using pin_memory with the DataLoader and updates ImageInfo to pin the relevant tensors. This will probably need some testing, but it is disabled by default.
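As a rough illustration of what pinning the tensors held by a per-image record can look like, here is a minimal sketch. `ImageInfoSketch` and its field names are hypothetical stand-ins and are not the actual ImageInfo class from this repository; the only library calls used are `torch.Tensor.pin_memory()`, `torch.Tensor.is_pinned()`, and `torch.cuda.is_available()`.

```python
from dataclasses import dataclass
from typing import Optional

import torch


@dataclass
class ImageInfoSketch:
    """Hypothetical stand-in for ImageInfo; field names are illustrative only."""

    image_key: str
    latents: Optional[torch.Tensor] = None
    text_encoder_outputs: Optional[torch.Tensor] = None

    def pin_memory(self) -> "ImageInfoSketch":
        # Pinning needs a CUDA runtime; without one, leave the tensors as they are.
        if not torch.cuda.is_available():
            return self
        # Only pin the tensors that are actually populated.
        if self.latents is not None:
            self.latents = self.latents.pin_memory()
        if self.text_encoder_outputs is not None:
            self.text_encoder_outputs = self.text_encoder_outputs.pin_memory()
        return self


info = ImageInfoSketch("img_0001", latents=torch.randn(4, 8, 8)).pin_memory()
print(info.latents.is_pinned())
```

Defining a `pin_memory()` method on the record also lets a DataLoader created with `pin_memory=True` pin batches made of such custom objects, since PyTorch calls that method on custom batch types.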

Host to GPU copies are much faster when they originate from pinned (page-locked) memory. See "Use pinned memory buffers" for more details on when and how to use pinned memory generally.

For data loading, passing pin_memory=True to a DataLoader will automatically put the fetched data Tensors in pinned memory, and thus enables faster data transfer to CUDA-enabled GPUs.

https://pytorch.org/docs/stable/data.html#memory-pinning
https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader
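A minimal sketch of the DataLoader behaviour described above, assuming a toy `TensorDataset` in place of the real training dataset. The variable names are illustrative and not taken from this PR. Pinning is only requested when CUDA is available, so the snippet also runs on a CPU-only machine.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Pinning needs a CUDA runtime, so only request it when one is present.
use_pin_memory = torch.cuda.is_available()
device = torch.device("cuda" if use_pin_memory else "cpu")

# Toy stand-in for a real image dataset: 8 random "images" with integer labels.
dataset = TensorDataset(torch.randn(8, 3, 16, 16), torch.arange(8))

loader = DataLoader(dataset, batch_size=4, pin_memory=use_pin_memory)

for images, labels in loader:
    # With pin_memory=True the fetched batch already sits in page-locked memory.
    assert images.is_pinned() == use_pin_memory
    # non_blocking=True can only overlap the copy with compute when the source is pinned.
    images = images.to(device, non_blocking=use_pin_memory)
    labels = labels.to(device, non_blocking=use_pin_memory)
```

Pairing `pin_memory=True` with `non_blocking=True` on the transfer is the usual way to get the benefit; pinning alone still makes the copy faster, but the copy remains synchronous.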

kohya-ss · 2025-01-23T23:27:45Z

Thank you for this! I will check it as soon as possible.

I checked pin_memory before, and it caused a large increase in memory usage in a Windows environment. If you have already tried it in a Windows environment, did it work without any problems?

rockerBOO · 2025-01-24T00:13:20Z

I don't have access to a Windows environment to test. On Linux it doesn't seem to affect memory usage (or I'm not using it correctly). This would be off by default, so if it does hamper Windows memory usage, we can add a note in the documentation.

I have seen roughly an 8-10% improvement in epoch speed, but I haven't done enough testing. Larger batch sizes may have higher gains; running with a batch size of 1 or 3 on a 2080 seemed to give the same relative improvement.

I'm looking at another performance pass to try and find bottlenecks with epoch speed and GPU usage.

rockerBOO · 2025-06-16T22:08:55Z

I have been using this for the past 6 months and it has worked as expected. I added some tests.

kohya-ss · 2025-06-16T22:24:55Z

Thank you for the update. Users can choose whether to use pin_memory or not, and I'm sure this PR will be useful for users who can use pin_memory. I'll merge it after checking.

rockerBOO · 2025-06-16T22:39:18Z

This PR is related to huggingface/accelerate#2441, which added the DataLoaderConfig. That change was for Accelerator 1.0.0, so using it would require updating to that version. I can see if it could work on the current Accelerator version so that no update is needed.

rockerBOO added 3 commits January 23, 2025 10:39

Add pin_memory to DataLoader and update ImageInfo to support

c4b0bb6

Accelerate dataloader_config to non_blocking if pin_memory is enabled

50d8daa

Add pin_memory to finetune scripts

03b35be

rockerBOO added 2 commits June 16, 2025 17:27

Merge branch 'sd3' into pin_memory

0981223

Add tests for pin memory

95e260f

rockerBOO marked this pull request as ready for review June 16, 2025 21:56

Update pin_memory tests to use DataLoader

81df559

Pinning only supported when CUDA is available

d73f2db
