docs: document how to integrate with Ryuk #176

Draft: wants to merge 2 commits into base: main

29 changes: 29 additions & 0 deletions README.md
@@ -79,3 +79,32 @@ The following environment variables can be configured to change the behaviour:
| `RYUK_CHANGES_RETRY_INTERVAL` | `1s` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The interval between retries if resource changes (containers, networks, images, and volumes) are detected while pruning |
| `RYUK_VERBOSE` | `false` | `bool` | Whether to enable verbose aka debug logging |
| `RYUK_SHUTDOWN_TIMEOUT` | `10m` | [Duration](https://golang.org/pkg/time/#ParseDuration) | The duration after shutdown has been requested when the remaining connections are ignored and prune checks start |

## Integrating Ryuk in the Testcontainers Libraries

The Testcontainers libraries can be configured to use Ryuk to remove resources after a test session has completed.

- Identify test session semantics for the Testcontainers library. For example, a test session could be a single test method, a test class, or a test suite. As a reference, consider Go's implementation [here](https://golang.testcontainers.org/features/test_session_semantics/). The unique identifier for the test session is referred to as `SESSION_ID` from now on.
As an implementation hint, one atomic user interaction with the intent of running tests (e.g. running tests from within an IDE) should generally lead to a single session.
- Use the above configuration to start Ryuk as a special container within the library. To do so, read the settings from the above environment variables and/or from the Testcontainers properties file, which is located in the user's home directory. The environment variables must take precedence over the properties file.
- Define Ryuk as a container with privileged access.
- Define a wait strategy for the listening port, defined by the `RYUK_PORT` environment variable. This is necessary to ensure that Ryuk is ready to receive messages from the Testcontainers library.
Member

Use of the env var should be optional, and it doesn't matter when talking about the container because the exposed port is random from the Testcontainers POV.

Contributor

Yes, configuring any RYUK_-prefixed environment variable is optional, including RYUK_PORT. If the user does specify it, however, it should be honoured, so that connections are made to that port.

- Bind the Docker socket to the Ryuk container, so that it can communicate with the Docker daemon. This is necessary to be able to create and remove resources.
Member

Suggested change
- Bind the Docker socket to the Ryuk container, so that it can communicate with the Docker daemon. This is necessary to be able to create and remove resources.
- Bind the Docker socket to the Ryuk container, so that it can communicate with the Docker daemon. This is necessary to be able to create and remove resources. Be aware that this needs to be the Docker socket accessible to the container, which might be different from the Docker socket accessible to host processes.

- E.g. `${RESOLVED_DOCKER_SOCKET}:/var/run/docker.sock`, where `${RESOLVED_DOCKER_SOCKET}` is the path to the Docker socket discovered by the Testcontainers library.
- Optionally, name the container using the `SESSION_ID` to make it easier to identify the container in the Docker daemon.
- Expose the port defined by the `RYUK_PORT` environment variable, so that the Testcontainers library can send messages to Ryuk.
- Add a special label to the Ryuk container in order to avoid removing it by mistake.
- E.g. `org.testcontainers.reaper=true`, `org.testcontainers.ryuk=true`, etc.
- If you already use a specific label for reaping resources, please remember to remove it from the Ryuk container for the same reason.
- Ryuk should run in the default bridge network of the Docker runtime.
- Every time a Docker resource is created in the Testcontainers library, Ryuk must be informed about it. This can be done by sending a message to Ryuk with the Docker labels of the resource, as a set of key-value pairs. In general, it's a good practice to always send the same set of labels for all the resources, including the above `SESSION_ID`, so that Ryuk can consistently identify and remove the created resources after the test session has completed.
Contributor

question: Are you really doing this for every resource? .NET sends it only once at the start, then assigns the same value to every resource as a label.

Member Author

@mdelapenya mdelapenya Nov 21, 2024

I need to double check, but at first sight yeah, once a container is created, it notifies Ryuk. I think it could be a legacy situation, where each container had its own reaper, which is not possible anymore 🤔

@stevenh we should take a look at this, as we could possibly optimise the communication with Ryuk

Contributor

In .NET, we follow these steps:

  1. Create and start Ryuk.
  2. Connect to Ryuk.
  3. Send the filter.
  4. Maintain the connection.

Then we use the ID from the filter and label every resource we create with it.

Contributor

Technically you only need to send a new filter if it's different. However, it's connections being removed that triggers cleanup, so unless you connect for each resource, cleanup could be slower, and it might even remove in-use resources if the last connection was removed while another resource matching the filter is still in use.

Does that make sense?

Contributor

Sorry, I do not fully understand. .NET establishes only one connection to Ryuk. Each test process has its own instance. Each test process creates Ryuk, connects, sends the filter, and maintains the connection only once for all running tests (within the process).

Contributor

@stevenh stevenh Nov 23, 2024

Yep I understood you meant Testcontainers .NET. What I'm trying to understand is what in testcontainers .NET is responsible for setting up ryuk and connecting to it. You seemed to infer it was something outside of just creating a container with testcontainers .NET, is that the case?

How testcontainers-go works is every resource created checks to ensure ryuk is running, if not it creates it and in either way connects to it, so ryuk knows there is an additional dependency.

ryuk monitors these connections and when the last one disconnects, it runs the clean up. This means that the clean up should be quick, as it's a triggered event, but it also means that if there is an issue along the way and something triggers an unexpected shutdown, the failing connections would still trigger a clean up.

For the go implementation this is important, as the test infrastructure has a global timeout and if that triggers it doesn't gracefully clean up test resources.

In the wider use, a test container could be used for a single test in a suite, so having each container do this validation helps to ensure ryuk is run and orphaned resources only run while needed.

Contributor

> You seemed to infer it was something outside of just creating a container with testcontainers .NET, is that the case?

No, we start Ryuk with the first container resource.

> How testcontainers-go works is every resource created checks to ensure ryuk is running, if not it creates it and in either way connects to it, so ryuk knows there is an additional dependency.

This part is slightly different in .NET. Every resource we create checks if Ryuk is running. If it is not, we create it; if it is, we skip the creation. We do not establish another connection to Ryuk. The connection is created with the first resource that creates Ryuk (the resource does not hold the connection; it is held by the test process).

Contributor

So to confirm, if there was a test that ran for a period of time after all Docker resources were unneeded, these would be kept available, as it's not until the process exits that the connection is removed?

Example in pseudocode:

func testWithContainer() {
    container = tc.Run("myimage"....)
    // Do things with container....
}

func main() {
    testWithoutContainer()
    testWithContainer()
    // Resources not cleaned up yet...
    moreTestsWithoutContainer()
    // Connection to ryuk still open, container not cleaned up until main exits?
}

Contributor

@HofmeisterAn HofmeisterAn Nov 26, 2024

> Connection to ryuk still open, container not cleaned up until main exits?

In theory, that is true, but .NET provides a concept for releasing unmanaged resources: Dispose. Typically, when a test completes, the Dispose method cleans up the container if it is no longer needed. Ryuk serves as a fallback, guaranteeing cleanup in cases when the test process crashes. Thank you for the pseudocode example; I now understand your perspective ✅.

Contributor

Yep, that's the same with testcontainers-go; it's a fallback for tests written correctly, so having an extra delay is not a big deal.

However, documenting that you can use a connection per resource to ensure timely clean up is still of benefit IMO. Thoughts?

- Use a TCP connection to send Ryuk the message. The connection must be established to the address of the Ryuk container and the port specified in the `RYUK_PORT` environment variable.
Member

Suggested change
- Use a TCP connection to send Ryuk the message. The connection must be established to the address of the Ryuk container and the port specified in the `RYUK_PORT` environment variable.
- Use a TCP connection to send Ryuk the message. The connection must be established to the address of the Ryuk container and the (mapped) port specified in the `RYUK_PORT` environment variable.

- An example: `localhost:8080`. Please use the Testcontainers library to get the address of the container instead of hardcoding `localhost` or any other address.
- The message sent to Ryuk must be a string, with the Docker filter format, as follows:
- Each label must be represented as a key-value pair, separated by an equal sign (`=`).
- Labels must be separated from each other by an ampersand (`&`).
- The message must be terminated by a newline character (`\n`).
- An example: `label=testing=true&label=testing.sessionid=mysession\n`.
- Once received by Ryuk, the message is processed and stored as a Docker filter.
- Ryuk responds with an acknowledgment message, with the constant value of `ACK\n`, which can be used to check if the message was successfully processed, completing the handshake.
- Whenever a resource is removed by the Testcontainers library, send a termination signal to Ryuk using a TCP connection in the same way as seen above; this way Ryuk can identify the test session is about to finish and start the cleanup process. Ryuk uses `RYUK_CONNECTION_TIMEOUT` and `RYUK_RECONNECTION_TIMEOUT` to determine when to start the cleanup process.
Member

Suggested change
- Whenever a resource is removed by the Testcontainers library, send a termination signal to Ryuk using a TCP connection in the same way as seen above; this way Ryuk can identify the test session is about to finish and start the cleanup process. Ryuk uses `RYUK_CONNECTION_TIMEOUT` and `RYUK_RECONNECTION_TIMEOUT` to determine when to start the cleanup process.
- Whenever all resources are removed by the Testcontainers library, send a termination signal to Ryuk using a TCP connection in the same way as seen above; this way Ryuk can identify the test session is about to finish and start the cleanup process. Ryuk uses `RYUK_CONNECTION_TIMEOUT` and `RYUK_RECONNECTION_TIMEOUT` to determine when to start the cleanup process.

Correct?
Also @eddumelendez, do we still do this in tc-java, or did we have to revert it because of the Gradle daemon issue?

Member

not anymore
