-
Notifications
You must be signed in to change notification settings - Fork 101
gvfs-helper-client: clean up server process #756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gvfs-helper-client: clean up server process #756
Conversation
Promoting to a full PR after trying to explore the option of being better bout |
gvfs-helper-client.c
Outdated
@@ -48,6 +58,8 @@ static int gh_client__start_fn(struct subprocess_entry *subprocess) | |||
|
|||
struct gh_server__process *entry = (struct gh_server__process *)subprocess; | |||
|
|||
server_process_pid = subprocess->process.pid; | |||
atexit(cleanup_server_process_on_exit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it may be interesting to find out why the process does not finish when stdin
closes (looks like the packet_read_line_gently()
call that should return -1 does not? Or maybe the problem is that child processes continue to exist as zombie processes on Linux until a waitpid()
of their parent process releases them?), an atexit()
handler definitely takes care of cleaning up the server process.
However, there may be more than one gvfs-helper
process to clean up. @derrickstolee could you test whether b8d292f (pushed to microsoft/git
as gvfs-helper-linux-term
) addresses the problem you observed, too?
5d57c6d
to
68f5d49
Compare
gvfs-helper-client.c
Outdated
@@ -48,6 +50,8 @@ static int gh_client__start_fn(struct subprocess_entry *subprocess) | |||
|
|||
struct gh_server__process *entry = (struct gh_server__process *)subprocess; | |||
|
|||
atexit(stop_gh_server_subprocesses); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should move this all the way into the cmd_main()
function, to be 100% sure that it is registered, and registered one time only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That could work, or I could add a static int registered = 0;
into this method. This is definitely on the critical path for starting a process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I do not understand is: the previous location was in the code path that initializes the hashmap, and the new location is in the gh_client__start_fn()
function which is passed as a callback a couple lines further down from the previous location. So how is it possible that it was not called early enough in the previous location but it is early enough in the new location?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also confused by this, since I specifically traced that the hashing location was being called but somehow wasn't sticking. I'll look into it some more later this week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running the gvfs-helper test script with SANITIZE=leak
is a good reproducer for the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll play with this, but wonder why the current linux-leaks
job seems not to be bothered (and succeeds without this here PR).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current iteration puts the atexit()
in cmd_main()
. If you put it in the "right" place, then the leak check will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not reproduce the hang with SANITIZE=leak
, but with this diff:
diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
index 9bfd3390d4be..50c8aaedacee 100755
--- a/t/t9210-scalar.sh
+++ b/t/t9210-scalar.sh
@@ -426,6 +426,10 @@ test_expect_success '`scalar clone` with GVFS-enabled server' '
)
'
+test_expect_success 'fetch <non-existent> does not hang in gvfs-helper' '
+ test_must_fail git -C using-gvfs/src fetch origin does-not-exist
+'
+
test_expect_success '`scalar clone --no-gvfs-protocol` skips gvfs/config' '
# the fake cache server requires fake authentication &&
git config --global core.askPass true &&
I plan on making this new test case a part of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@derrickstolee I finally figured out what is going wrong.
The subprocess_start()
function registers a clean_on_exit_handler
for all subprocesses. This handler is called by the cleanup_children()
function which is run via run-command.c
's own atexit()
handler.
This latter atexit()
handler is registered way before the gvfs-helper.c
code is run, and therefore does the wait-and-hang thing before the atexit()
handler I wanted to introduce had a chance.
My current approach is completely different: It overrides the clean_on_exit_handler
for the gvfs-helper
sub-processes. Therefore we do not need an extra atexit()
handler at all, and hence do not run into atexit()
handler ordering issues, either.
0a798ea
to
344fd32
Compare
5bafc74
to
68dc64e
Compare
Whoopsie. I needed to take care of the new leaks... |
Sadly I cannot reproduce the |
On Linux, the following command would cause the terminal to be stuck waiting: git fetch origin foobar The issue would be that the fetch would fail with the error fatal: couldn't find remote ref foobar but the underlying git-gvfs-helper process wouldn't die. The `subprocess_exit_handler()` method would close its stdin and stdout, but that wouldn't be enough to cause the process to end, even though the `packet_read_line_gently()` call that is run in `do_sub_cmd__server()` in a loop should return -1 and the process should the terminate peacefully. While it is unclear why this does not happen, there may be other conditions where the `gvfs-helper` process would not terminate. Let's ensure that the gvfs-helper-client process cleans up the gvfs-helper server processes that it spawned upon exit. Reported-by: Stuart Wilcox Humilde <stuartwilcox@microsoft.com> Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
68dc64e
to
4c15724
Compare
It's really strange that I wasn't able to do this locally; So I went back to the drawing board and realized that the only problem I need to address is that So all I had to do was to make a near-identical copy of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd approve, but this is my PR!
On Linux, the following command would cause the terminal to be stuck waiting: ``` git fetch origin foobar ``` The issue would be that the fetch would fail with the error ``` fatal: couldn't find remote ref foobar ``` but the underlying `git-gvfs-helper` process wouldn't die. The `subprocess_exit_handler()` method would close its stdin and stdout, but that wouldn't be enough to cause the process to end. This PR addresses that by skipping the `finish_command()` call of the `clean_on_exit_handler` and instead lets `cleanup_children()` send a SIGTERM to terminate those spawned child processes.
On Linux, the following command would cause the terminal to be stuck waiting: ``` git fetch origin foobar ``` The issue would be that the fetch would fail with the error ``` fatal: couldn't find remote ref foobar ``` but the underlying `git-gvfs-helper` process wouldn't die. The `subprocess_exit_handler()` method would close its stdin and stdout, but that wouldn't be enough to cause the process to end. This PR addresses that by skipping the `finish_command()` call of the `clean_on_exit_handler` and instead lets `cleanup_children()` send a SIGTERM to terminate those spawned child processes.
On Linux, the following command would cause the terminal to be stuck waiting: ``` git fetch origin foobar ``` The issue would be that the fetch would fail with the error ``` fatal: couldn't find remote ref foobar ``` but the underlying `git-gvfs-helper` process wouldn't die. The `subprocess_exit_handler()` method would close its stdin and stdout, but that wouldn't be enough to cause the process to end. This PR addresses that by skipping the `finish_command()` call of the `clean_on_exit_handler` and instead lets `cleanup_children()` send a SIGTERM to terminate those spawned child processes.
On Linux, the following command would cause the terminal to be stuck waiting: ``` git fetch origin foobar ``` The issue would be that the fetch would fail with the error ``` fatal: couldn't find remote ref foobar ``` but the underlying `git-gvfs-helper` process wouldn't die. The `subprocess_exit_handler()` method would close its stdin and stdout, but that wouldn't be enough to cause the process to end. This PR addresses that by skipping the `finish_command()` call of the `clean_on_exit_handler` and instead lets `cleanup_children()` send a SIGTERM to terminate those spawned child processes.
On Linux, the following command would cause the terminal to be stuck waiting: ``` git fetch origin foobar ``` The issue would be that the fetch would fail with the error ``` fatal: couldn't find remote ref foobar ``` but the underlying `git-gvfs-helper` process wouldn't die. The `subprocess_exit_handler()` method would close its stdin and stdout, but that wouldn't be enough to cause the process to end. This PR addresses that by skipping the `finish_command()` call of the `clean_on_exit_handler` and instead lets `cleanup_children()` send a SIGTERM to terminate those spawned child processes.
On Linux, the following command would cause the terminal to be stuck waiting:
The issue would be that the fetch would fail with the error
but the underlying
git-gvfs-helper
process wouldn't die. Thesubprocess_exit_handler()
method would close its stdin and stdout, but that wouldn't be enough to cause the process to end.The current approach creates our own
atexit()
handler by sending aSIGINT
signal to the child process. This has worked in my end-to-end test.Perhaps another approach would be to change the way git-gvfs-helper server gets messages over
stdin
to better terminate when the pipe closes. However, I debugged to the point that I could see thegit-gvfs-helper
process waiting on a basicread()
from the standard library, so it's not mishandling anEOF
signal or anything.