[Bug]: Services are killed instead of gracefully stopped due to faulty process handling and not respecting stop_grace_period #5620

@Ryvix

Description

Error Message and Logs

When stopping a Service (like the default Minecraft service template) via the Coolify UI, the container is killed (SIGKILL) instead of being gracefully stopped using the stop_grace_period defined in its Docker Compose configuration. This prevents applications within the container from performing clean shutdown procedures. Manual docker stop --time=<period> <container_id> works correctly and respects the timeout.

The root cause is the Service::stopContainers method in app/Models/Service.php. The original implementation used asynchronous process handling (Process::start + $process->wait()) which didn't reliably wait for the remote docker stop command to finish. This led to premature container removal (docker rm -f).

Switching to synchronous instant_remote_process calls fixes the premature removal but requires further enhancement to respect the stop_grace_period defined in the service's compose file.

Steps to Reproduce

  1. Deploy a Minecraft service, optionally one whose compose file defines a stop_grace_period.
  2. Ensure the service is running.
  3. Click the "Stop" button for the service in the Coolify UI.
  4. Observe Docker events (docker events) - a kill event might appear quickly, or the container might be removed prematurely by docker rm -f without respecting the full stop_grace_period.
  5. Check application logs inside the container (if possible) - the shutdown sequence might be cut short or absent.

Example Repository URL

No response

Coolify Version

v4.0.0-beta.408

Are you using Coolify Cloud?

No (self-hosted)

Operating System and Version (self-hosted)

Ubuntu 24.04

Additional Information

The core issue stems from how Service::stopContainers handles the stop command. The original asynchronous approach was unreliable. The fix involves:

  1. Switching to synchronous instant_remote_process calls for docker stop, docker inspect, and docker kill.
  2. Parsing the service's docker_compose_raw data within stopContainers to extract the stop_grace_period for the specific service being stopped.
  3. Using the extracted stop_grace_period (converted to seconds, with a reasonable default like 30s if parsing fails or it's not defined) as the --time argument for the docker stop command.
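The conversion in step 3 can be sketched as a standalone helper. This is a minimal sketch, not Coolify code: the function name `stopGracePeriodToSeconds` is hypothetical, and it assumes PHP 8.0 or later. Unlike the basic parsing in the proposed fix, it also accepts compound values such as "1m30s". Sub-second units ("ms", "us") are deliberately not handled and fall back to the default.

```php
<?php

/**
 * Convert a Compose-style duration such as "30s", "2m", or "1m30s" to whole seconds.
 * Returns $default when the value is missing, malformed, or not positive.
 * Hypothetical helper for illustration; it is not part of Coolify.
 */
function stopGracePeriodToSeconds(mixed $value, int $default = 30): int
{
    if ($value === null || $value === '') {
        return $default;
    }

    // YAML may yield an integer for a bare number, so normalize to a string first.
    $value = trim((string) $value);

    // Plain integers are treated as seconds, as in the proposed fix.
    if (ctype_digit($value)) {
        $seconds = (int) $value;

        return $seconds > 0 ? $seconds : $default;
    }

    // Require the whole string to be one or more "<number><unit>" pairs (units: h, m, s).
    if (! preg_match('/^(?:\d+[hms])+$/', $value)) {
        return $default;
    }

    preg_match_all('/(\d+)([hms])/', $value, $matches, PREG_SET_ORDER);

    $multipliers = ['h' => 3600, 'm' => 60, 's' => 1];
    $seconds = 0;
    foreach ($matches as [, $amount, $unit]) {
        $seconds += (int) $amount * $multipliers[$unit];
    }

    return $seconds > 0 ? $seconds : $default;
}
```

Inside stopContainers, the result could be passed straight to the --time argument, for example `stopGracePeriodToSeconds(data_get($serviceConfig, 'stop_grace_period'))`.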

Proposed fix for app/Models/Service.php:

    public function stopContainers(array $containerNames, $server)
    {
        // Default timeout if not specified in compose file or parsing fails
        $defaultGracefulStopTimeout = 30; // Default Docker stop timeout is 10s, let's use 30s as a base
        $parsedCompose = collect();

        // Parse the compose file only once if there's raw data
        if ($this->docker_compose_raw) {
            try {
                // Ensure Yaml and Str facades/classes are imported at the top of the file
                // use Symfony\Component\Yaml\Yaml;
                // use Illuminate\Support\Str;
                $parsedCompose = Yaml::parse($this->docker_compose_raw);
                // Use data_get to safely access nested keys, defaulting to an empty array
                $parsedComposeServices = data_get($parsedCompose, 'services', []);
                // Wrap the services in a collection so ->has() and ->get() can be used below
                $parsedCompose = collect($parsedComposeServices);

            } catch (\Exception $e) {
                // Handle YAML parsing error, maybe log it? For now, fallback to empty collection.
                // Log::error("Error parsing docker-compose for service {$this->uuid}: " . $e->getMessage());
                $parsedCompose = collect();
            }
        }

        foreach ($containerNames as $containerName) {
            $gracefulStopTimeout = $defaultGracefulStopTimeout;

            // Extract service name from container name (e.g., "mc-m4sw08wgccwwcskg804ks40w" -> "mc")
            // Ensure UUID is available before attempting to use it
            if ($this->uuid) {
                $serviceName = Str::beforeLast($containerName, "-{$this->uuid}");

                // Look up this service's configuration in the parsed compose services
                if ($serviceName && $parsedCompose instanceof \Illuminate\Support\Collection && $parsedCompose->has($serviceName)) {
                    $serviceConfig = $parsedCompose->get($serviceName);
                    // Ensure serviceConfig is an array before using data_get
                    if (is_array($serviceConfig)) {
                        // Cast to string: YAML parses a bare number such as 30 as an integer
                        $stopGracePeriodString = (string) data_get($serviceConfig, 'stop_grace_period');
                        if ($stopGracePeriodString !== '') {
                            // Basic parsing for "Xs" or "Xm" format
                            // More complex durations (e.g., "1m30s") are not handled here yet.
                            if (Str::endsWith($stopGracePeriodString, 's') && is_numeric(Str::before($stopGracePeriodString, 's'))) {
                                $parsedTimeout = (int) Str::before($stopGracePeriodString, 's');
                            } elseif (Str::endsWith($stopGracePeriodString, 'm') && is_numeric(Str::before($stopGracePeriodString, 'm'))) {
                                $parsedTimeout = (int) Str::before($stopGracePeriodString, 'm') * 60;
                            } elseif (is_numeric($stopGracePeriodString)) {
                                // Plain number: treat it as seconds
                                $parsedTimeout = (int) $stopGracePeriodString;
                            } else {
                                // Invalid format
                                $parsedTimeout = 0;
                            }
                            // Use parsed value if valid (>0), otherwise keep default
                            if ($parsedTimeout > 0) {
                                $gracefulStopTimeout = $parsedTimeout;
                            }
                        }
                    }
                }
            }


            // Attempt graceful stop using the determined timeout
            instant_remote_process(command: ["docker stop --time={$gracefulStopTimeout} {$containerName}"], server: $server, throwError: false);

            // Check if container is still running after stop attempt
            $isRunning = instant_remote_process(command: ["docker inspect -f '{{.State.Running}}' {$containerName}"], server: $server, throwError: false);

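            // If graceful stop failed, kill the container.
            // Cast first, in case the inspect call did not return a string (e.g. the command failed).
            if (trim((string) $isRunning) === 'true') {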
                instant_remote_process(command: ["docker kill {$containerName}"], server: $server, throwError: false);
            }

            // Remove the container (force remove handles both stopped and killed containers)
            $this->removeContainer($containerName, $server);

            // Small delay between containers
            usleep(100000);
        }
    }
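The proposed fix interpolates the container name directly into the shell command and derives the service name with Str::beforeLast. As a hedged sketch of a slightly stricter variant, the two helpers below use only built-in PHP functions (PHP 8.0 or later). Both names, `composeServiceName` and `buildDockerStopCommand`, are hypothetical and not part of Coolify, and the sketch assumes container names follow the "<service>-<uuid>" pattern from the example above.

```php
<?php

/**
 * Derive the compose service name from a container name of the form
 * "<service>-<uuid>". Returns null when the name does not end with "-<uuid>".
 * Hypothetical helper for illustration.
 */
function composeServiceName(string $containerName, string $uuid): ?string
{
    $suffix = '-' . $uuid;
    if ($uuid === '' || ! str_ends_with($containerName, $suffix)) {
        return null;
    }

    $name = substr($containerName, 0, -strlen($suffix));

    return $name === '' ? null : $name;
}

/**
 * Build a "docker stop" command with an explicit timeout and a shell-quoted
 * container name. Hypothetical helper for illustration.
 */
function buildDockerStopCommand(string $containerName, int $timeoutSeconds): string
{
    // Guard against zero or negative timeouts.
    $timeoutSeconds = max(1, $timeoutSeconds);

    return sprintf('docker stop --time=%d %s', $timeoutSeconds, escapeshellarg($containerName));
}
```

Unlike Str::beforeLast, which returns the whole container name when the suffix is not found, `composeServiceName` returns null in that case, so the caller can fall back to the default timeout explicitly.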

Metadata

Assignees

No one assigned

    Labels

    🐛 Possible Bug: Reported issues that need to be reproduced by the team.
    🔍 Triage: Issues that need assessment and prioritization.
