Fix bug where the external network is killed during run/restart (#1000)
When ToolHive starts or restarts a workload, it checks to see if a container with the same name exists. If the configuration has changed, it needs to remove the container and recreate it. This happens in a method named handleExistingContainer.
Previously, the code used the RemoveWorkload method to clean up the old container. However, when the logic to create a separate external Docker network was added, RemoveWorkload was extended to also remove the external network if no workloads were left. The bug occurs when no other workloads are running and the run/restart involves a workload with the same name as an exited container in Docker. In that case, ToolHive first checks that the external network exists, then checks for an existing container, and finally deletes both the container and the external network whose existence it has just verified. When the MCP server starts, it goes into an error state because it has been configured to use a network which no longer exists.
It is not straightforward to move the external network check due to the sequence of operations needed to support the ingress/egress proxies (even for workloads where network isolation is disabled). Instead, I refactored the code to split the container deletion logic from RemoveWorkload into a private method, and I call that method directly in the handleExistingContainer method.
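The shape of this refactor can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual ToolHive code: the `fakeRuntime` and `manager` types, the `removeContainer` name, the `configChanged` parameter, and the in-memory bookkeeping are all assumptions made for the example. Only `RemoveWorkload` and `handleExistingContainer` are names taken from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeRuntime is a hypothetical in-memory stand-in for the container runtime.
type fakeRuntime struct {
	containers      map[string]bool // container name -> running?
	externalNetwork bool            // whether the external network exists
}

// manager is a hypothetical owner of the workload lifecycle methods.
type manager struct {
	rt *fakeRuntime
}

// removeContainer (assumed name) deletes only the container and never
// touches the external network.
func (m *manager) removeContainer(name string) error {
	if _, ok := m.rt.containers[name]; !ok {
		return errors.New("container not found: " + name)
	}
	delete(m.rt.containers, name)
	return nil
}

// RemoveWorkload deletes the container and then removes the external
// network if no workloads are left.
func (m *manager) RemoveWorkload(name string) error {
	if err := m.removeContainer(name); err != nil {
		return err
	}
	if len(m.rt.containers) == 0 {
		m.rt.externalNetwork = false
	}
	return nil
}

// handleExistingContainer removes a stale container whose configuration has
// changed. It calls removeContainer directly, so the external network that
// was verified earlier in the run/restart sequence is left in place.
func (m *manager) handleExistingContainer(name string, configChanged bool) error {
	if _, ok := m.rt.containers[name]; !ok || !configChanged {
		return nil
	}
	return m.removeContainer(name)
}

func main() {
	// One exited container and no other workloads: the scenario from the bug.
	rt := &fakeRuntime{
		containers:      map[string]bool{"fetch": false},
		externalNetwork: true,
	}
	m := &manager{rt: rt}

	if err := m.handleExistingContainer("fetch", true); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("external network still present:", rt.externalNetwork)
}
```

In this sketch, `RemoveWorkload` keeps its network cleanup for genuine removals, while the run/restart path only deletes the stale container, so the network check that ran earlier stays valid.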