
Use a lock file to avoid exceptions due to concurrent symlink creation #2851


Open
stefanberger wants to merge 1 commit into develop

Conversation

@stefanberger commented Jul 15, 2025

We have seen exceptions raised from _update_root_symlink() at the level of the sigstore-python library when multiple concurrent threads were creating symlinks with the same name in this function (in a test environment running tests concurrently). To avoid this issue, have each thread open a lock file and take an exclusive lock on it, serializing access to the removal and recreation of the symlink.

The reproducer for this issue, which should be run in 2 or more Python interpreters concurrently, looks like this:

from sigstore import sign
while True:
    sign.TrustedRoot.production()

Use fcntl.lockf-based locking for Linux and macOS and a different implementation on Windows (a sketch of the pattern follows below). The locking code originally comes from a discussion on Stack Overflow (link below).

Resolves: #2836
Link: https://stackoverflow.com/questions/489861/locking-a-file-in-python

Description of the changes being introduced by the pull request:

Fixes #2836
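
For illustration, here is a minimal sketch of the pattern described above. This is not the actual patch: the helper names (_lock, _unlock, update_symlink_locked), the lock-file name, and the symlink-update body are assumptions, not the code in this PR.

# Minimal sketch only -- not the actual patch. Helper and file names
# here are illustrative assumptions.
import os
import sys

if sys.platform == "win32":
    import msvcrt

    def _lock(f):
        # msvcrt.locking with LK_LOCK retries for ~10 seconds before raising.
        msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, 1)

    def _unlock(f):
        msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, 1)
else:
    import fcntl

    def _lock(f):
        # Exclusive advisory lock; blocks until no other process holds it.
        fcntl.lockf(f, fcntl.LOCK_EX)

    def _unlock(f):
        fcntl.lockf(f, fcntl.LOCK_UN)

def update_symlink_locked(metadata_dir: str, linkname: str, target: str) -> None:
    """Serialize removal and recreation of a symlink via an exclusive lock."""
    lockfile = os.path.join(metadata_dir, f"{linkname}.lock")
    with open(lockfile, "w") as f:
        _lock(f)
        try:
            link = os.path.join(metadata_dir, linkname)
            if os.path.islink(link):
                os.remove(link)
            os.symlink(target, link)
        finally:
            _unlock(f)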

@stefanberger requested a review from a team as a code owner on July 15, 2025 17:22
@jku (Member) commented Jul 16, 2025

> The reproducer for this issue, which should be run in 2 or more Python interpreters concurrently, looks like this:
>
> from sigstore import sign
> while True:
>     sign.TrustedRoot.production()

I'm unlikely to have time to really dive into this in the next couple of weeks, but from a first read:

  • I assume that with your code one of these test programs will continue happily and the other one will freeze forever, until the first program is killed? In fact, you don't even need loops: just running sign.TrustedRoot.production() once per process probably does it, as long as the first process remains running. I feel like you can't lock files in a library without ever unlocking them.
  • Like I said in a comment, I would like to consider a higher-level lock, or at least analyze the risks... the symlink call is where you are seeing an issue, but it might not be the only place where multiple updaters could conflict. Options might include taking a lock for the Updater lifetime (also risking locking "forever" accidentally) or taking a lock while the top-level methods are running...
  • Leaving the lock file in the directory feels wrong (the existence of specific lock files usually implies that the lock is active -- unlike how fcntl.LOCK_EX works). Do you think there is a way to use e.g. the symlink file as the lock file?

@stefanberger (Author) commented

> I assume with your code one of these test programs will continue happily and the other one will freeze completely forever,

Once the program holding the lock closes the file when leaving the 'with' statement that opened it, the other one can grab the lock. There's no explicit unlocking needed, if that's what you were looking for (see the sketch after the output below); I can add an explicit unlock if you want. Actually, while running the 3-liner after copying and pasting it into the Python interpreter prompt, it also prints out information in each loop:

<sigstore._internal.trust.TrustedRoot object at 0x7f23a7c3b5b0>
<sigstore._internal.trust.TrustedRoot object at 0x7f23a7cb2a90>

Otherwise, when running from a .py file, it's easy to add a print statement to the loop to see that all processes run fine.
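
For illustration only (the lock-file path here is made up, and this snippet is POSIX-only): the lock's lifetime is tied to the open file handle, so closing the file releases it.

import fcntl

with open("/tmp/demo.lock", "w") as f:
    fcntl.lockf(f, fcntl.LOCK_EX)  # blocks until no other process holds the lock
    ...  # critical section: remove and recreate the symlink here
# leaving the 'with' block closes f, which releases the lock implicitly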

> until the first program is killed? In fact you don't even need loops, just running sign.TrustedRoot.production() once per process probably does it as long as the first process remains running. I feel like you can't lock files in a library without ever unlocking them

You really need the concurrency for a while. Oddly, we ran into this issue quite quickly when running concurrent test cases on GitHub Actions.

> Like I said in a comment, I would like to consider a higher level lock, or at least analyze the risks... the symlink call is what you are seeing an issue with but it might not be the only place where multiple updaters could conflict. Options might include taking a lock

I am not as familiar with the code as you are. I am primarily trying to address the issue we were seeing at the level of the root cause. There we can use a lock file with a fine granularity that only serializes access to the specific resources we need to protect from concurrent access. However, it's quite possible that locking at a higher level could be needed if concurrent processes were to conflict on shared files.

> leaving the lock file in the directory feels wrong (existence of specific lock files usually implies that the lock is active -- unlike how fcntl.LOCK_EX works), do you think there is a way to use e.g. the symlink file as the lock file?

What we need is a file that everyone can grab a lock on and that doesn't get removed while concurrent processes/threads are trying to grab the lock. Part of the code we are trying to protect from concurrency first removes an existing symlink (presumably because it is 'wrong') and then sets it again. The removal of the symlink would be a problem if the symlink itself served as the lock file.

Also, grabbing a lock on the 'current' file causes that file to be created and remain at 0 bytes, while with a separate lock file the 'current' file ends up with its 5413 bytes of content. So it seemed better to create an explicit lock file.

I agree that it's not nice to have these lock files permanently lying around after they have been used, but I don't see another easy alternative. Other choices could be (a small path sketch follows the list):

  • a directory under f"{Path.home()}/.local/share" that is dedicated to lock files
  • a directory under the /tmp path (%TEMP% on Windows) that is dedicated to lock files
  • either of the above, with a single (coarse-grained) lock file for all file-locking needs
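
For example (paths are illustrative only, not part of this patch), a dedicated lock directory could be derived per platform like this:

import os
import sys
from pathlib import Path

# Hypothetical dedicated lock-file directory, per the alternatives above.
if sys.platform == "win32":
    lock_dir = Path(os.environ.get("TEMP", ".")) / "tuf-locks"
else:
    lock_dir = Path.home() / ".local" / "share" / "tuf-locks"
lock_dir.mkdir(parents=True, exist_ok=True)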

@jku (Member) commented Jul 16, 2025

> Once the program holding the lock closes the file when leaving the 'with' statement that opened it, the other one can grab the lock.

Ah, apologies, I missed the relation to the file handle. This means I will have to take a closer look at the overall need for locking, though, as this lock is very short-lived.

Thanks for thinking out loud about the lock file issue as well.

Successfully merging this pull request may close this issue:

  • ngclient: Be better with concurrent instances (#2836)