Skip to content

fix: Use anyio.SpooledTemporaryFile in UploadFile #2925

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

11kkw
Copy link

@11kkw 11kkw commented Apr 16, 2025

Summary

This PR updates the internal implementation of UploadFile to use anyio.SpooledTemporaryFile[bytes] instead of a generic BinaryIO interface.

By using anyio.SpooledTemporaryFile, all file operations (read, write, seek) are now fully asynchronous and consistent across both in-memory and disk-based file handling. This eliminates the need for run_in_threadpool and simplifies the logic for determining whether the file is in memory or on disk.

This is my first contribution to Starlette. If any part of this PR doesn't follow the project's conventions or needs to be revised or split, I’m more than happy to make the necessary changes.

Checklist

  • I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.

@@ -413,7 +414,7 @@ class UploadFile:

def __init__(
self,
file: typing.BinaryIO,
file: anyio.SpooledTemporaryFile[bytes],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a base class analogous to BinaryIO? This was not a spoiled temporary file type before on purpose.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because that shouldn't matter for the UploadFile.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because that shouldn't matter for the UploadFile.

Yeah, that makes sense — the original use of BinaryIO was definitely meant to keep things implementation-agnostic, and I get why switching directly to SpooledTemporaryFile might feel too specific.

If you're open to it, I think one possible middle ground could be introducing a small AsyncFile protocol — something simple that defines the expected async methods (read, write, seek, etc). That way, we can preserve the flexibility while still ensuring everything stays fully async.

Happy to adjust the PR in that direction if it sounds reasonable to you!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable, but maybe that structure already exists in anyio? 🤔

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable, but maybe that structure already exists in anyio? 🤔

Yes, it seems like anyio provides a type called AsyncFile for async file operations

Copy link
Member

@graingert graingert May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AsyncFile is a concrete implementation not an abc, probably uh "AsyncFileIO[bytes]" protocol would make sense here.

If it's a protocol we can move it or have it in multiple places and have everything still work

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I don't think the file parameter should be exposed here. It seems to give users the power to customize the file, but in fact the other parts of UploadFile rely heavily on the implementation of file. If we need to make some abstractions to allow users to customize the file parsing results of multipart, we can use a design similar to baize. But at least in this PR, I think there is no problem with modifying the type in this way.

@Kludex Kludex requested a review from graingert April 16, 2025 05:29
Copy link
Member

@abersheeran abersheeran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good change!

except MultiPartException as exc:
# Close all the files if there was an error.
async with AsyncExitStack() as stack:
for f in self._files_to_close_on_error:
Copy link
Member

@graingert graingert May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I mean that _files_to_close_on_error would just be an exit stack
  • these should be closed on any error including BaseException

@@ -265,10 +267,9 @@ async def parse(self) -> FormData:
await part.file.seek(0)
self._file_parts_to_write.clear()
self._file_parts_to_finish.clear()
except MultiPartException as exc:
except BaseException as exc:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this change?

I changed it to BaseException assuming it would help ensure cleanup even for cases like CancelledError or KeyboardInterrupt, beyond just MultiPartException.

But perhaps I misunderstood — please let me know if I’m thinking about this the wrong way!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CancelledError being the most likely one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants