Skip to content

gh-133009: fix UAF in xml.etree.ElementTree.Element.__deepcopy__ #133010

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

picnixz
Copy link
Member

@picnixz picnixz commented Apr 26, 2025

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not enough. "tag", "text" and "tail" can be set to different value during deepcopying. If it was the last reference, the deepcopy() argument can be destroyed, and the reference can became handling.

The safe way is to increase the reference count of the deepcopy() argument before calling any code that can release the GIL.

@picnixz
Copy link
Member Author

picnixz commented Apr 26, 2025

I actually tried to do something with tag etc, but I wasn't able to crash the interpreter.

@serhiy-storchaka
Copy link
Member

String literals are interned and saved in the constants list. You need to use something having a single reference. Try 'tag'.upper().

@picnixz
Copy link
Member Author

picnixz commented Apr 26, 2025

The safe way is to increase the reference count of the deepcopy() argument before calling any code that can release the GIL.

I've actually removed this code because I thought it wasn't needed, but I'll check with an evil non-interned string tomorrow.

@picnixz
Copy link
Member Author

picnixz commented Apr 27, 2025

I'll try to add more tests later as well.

* had side-effects. However, 'i' is the number of copied items that
* we were able to successfully copy.
*/
element->extra->length = i;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not directly related but the Py_REFCNT(object) == 1 in deepcopy() below is likely unsafe for similar reasons as #132070. I don't really understand what that check is doing, though: it seems what it does could work too if the refcnt > 1.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I don't know. I'm leaving on Monday evening for ~10 days. Maybe we can track the issue separately?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes definitely a separate issue! Just noticed it while reading code around your change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the refcnt > 1, we need to check in memo, and use a reference to already made copy if it is found. We need also to update memo. It is complicated and adds a lot of overhead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants