feat: Clarify Rootless Runtime Requirements #4022
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -21,6 +21,10 @@ Ingress exposes HTTP and HTTPS routes from outside the cluster to services withi | |
> **NOTE**: You may also want to consider using [Gateway API](https://gateway-api.sigs.k8s.io/) instead of Ingress. | ||
> Gateway API has an [Ingress migration guide](https://gateway-api.sigs.k8s.io/guides/migrating-from-ingress/). | ||
|
||
> **WARNING**: If you are using a [rootless container runtime], ensure your host is | ||
> properly configured before creating the KIND cluster. Most Ingress and Gateway controllers will | ||
Review comment: Maybe we should also cross-link "properly configured" for people who know what a rootless container runtime is but don't know what "properly configured" means.
Reply: The link is to the Rootless page, which has the full guidance for host configuration. Perhaps we merge this and see if it indeed reduces the issues/questions around rootless that KIND can't solve?
||
> not work if these steps are skipped. | ||
|
||
### Create Cluster | ||
|
||
#### Option 1: LoadBalancer | ||
|
@@ -139,3 +143,4 @@ curl localhost/bar | |
|
||
[LoadBalancer]: /docs/user/loadbalancer/ | ||
[Cloud Provider KIND]: /docs/user/loadbalancer/ | ||
[rootless container runtime]: /docs/user/rootless/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,57 +9,212 @@ menu: | |
Starting with kind 0.11.0, [Rootless Docker](https://docs.docker.com/go/rootless/), [Rootless Podman](https://github.com/containers/podman/blob/master/docs/tutorials/rootless_tutorial.md) and [Rootless nerdctl](https://github.com/containerd/nerdctl/blob/main/docs/rootless.md) can be used as the node provider of kind. | ||
|
||
## Provider requirements | ||
|
||
Review comment: ?
Reply: Markdown style consistency - I like having an empty line after a heading. This doesn't appear to impact the site rendering by Hugo (see the deploy preview).
Reply: Indeed, if we ever decided to enable a markdown linter (highly unlikely) it would complain about there not being a blank line between headers, code blocks, etc. That said, unrelated changes to the file do make it slightly harder to review, but...
||
- Docker: 20.10 or later | ||
- Podman: 3.0 or later | ||
- nerdctl: 1.7 or later | ||
|
||
## Host requirements | ||
The host needs to be running with cgroup v2. | ||
Make sure that the result of the `docker info` command contains `Cgroup Version: 2`. | ||
If it prints `Cgroup Version: 1`, try adding `GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=1"` to `/etc/default/grub` and | ||
running `sudo update-grub` to enable cgroup v2. | ||
|
||
Also, depending on the host configuration, the following steps might be needed: | ||
### cgroup v2 | ||
|
||
The host needs to be running with cgroup v2, which is the default for many Linux distributions: | ||
|
||
- Ubuntu: 21.10 and later. | ||
- Fedora: 31 and later. | ||
- Arch: April 2021 release and later. | ||
|
||
You can verify the cgroup version used by your container runtime with the following procedure: | ||
|
||
- `docker`: Run `docker info` and look for `Cgroup Version: 2` in the output. | ||
- `podman`: Run `podman info` and look for `cgroupVersion: v2` in the output. | ||
- `nerdctl`: Run `nerdctl info` and look for `Cgroup Version: 2` in the output. | ||
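
The three checks above can be folded into one helper. This is a minimal sketch, not part of kind: `cgroup_version_from_info` is a hypothetical function name, and it assumes the `info` output contains one of the two spellings listed above.

```sh
#!/bin/sh
# cgroup_version_from_info (hypothetical helper): read `<runtime> info` output
# on stdin and print the cgroup version number it reports.
cgroup_version_from_info() {
  # Matches "Cgroup Version: 2" (docker, nerdctl) and "cgroupVersion: v2" (podman),
  # then keeps only the digits from the first matching line.
  grep -i -E 'cgroup ?version' | head -n 1 | grep -o -E '[0-9]+' | head -n 1
}

# Example usage:
#   docker info 2>/dev/null | cgroup_version_from_info
#   podman info | cgroup_version_from_info
# Runtime-independent check: `stat -fc %T /sys/fs/cgroup` prints "cgroup2fs" on cgroup v2.
```

If the helper prints `1`, or prints nothing, the host is probably not running cgroup v2 for that runtime.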
|
||
If the `info` output prints `Cgroup Version: 1` or equivalent, try the following to enable cgroup v2: | ||
Review comment: Better to just upgrade the host
||
|
||
1. In `/etc/default/grub`, add the line `GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=1"` | ||
2. Run `sudo update-grub` to enable cgroup v2. | ||
|
||
Your host will also need to enable [cgroup delegation](https://systemd.io/CGROUP_DELEGATION/) of the `cpu` controller for | ||
user services. This is enabled by default for distributions running `systemd` version 252 and higher. | ||
|
||
To enable cgroup delegation for all the controllers, do the following: | ||
|
||
1. Check your version of `systemd` by running `systemctl --version`. If the output prints | ||
`systemd 252` or higher, no further action is needed. Example output below from a Fedora host: | ||
|
||
```sh | ||
$ systemctl --version | ||
systemd 257 (257.9-2.fc42) | ||
``` | ||
|
||
2. For systems with older versions of `systemd`, first create the directory | ||
`/etc/systemd/system/user@.service.d/` if it is not present. | ||
|
||
```sh | ||
sudo mkdir -p /etc/systemd/system/user@.service.d/ | ||
``` | ||
|
||
3. Next, create the file `/etc/systemd/system/user@.service.d/delegate.conf` with the following content: | ||
|
||
```ini | ||
[Service] | ||
Delegate=yes | ||
``` | ||
|
||
4. Reload systemd for these changes to take effect: | ||
|
||
```sh | ||
sudo systemctl daemon-reload | ||
``` | ||
|
||
5. If using Docker, restart the user Docker daemon: | ||
|
||
```sh | ||
systemctl --user restart docker | ||
``` | ||
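
To confirm that the steps above took effect, you can inspect which controllers are delegated to your user session. The sketch below is an assumption-laden example: the `cgroup.controllers` path is the conventional location for a systemd user instance on cgroup v2 and may differ on your host, and `has_controller` is a hypothetical helper name.

```sh
#!/bin/sh
# has_controller NAME "LIST" (hypothetical helper): succeed if NAME is one of
# the space-separated entries in LIST. Whole-word match, so "cpuset" is not "cpu".
has_controller() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed location of the controllers delegated to the current user's systemd instance.
controllers_file="/sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers"

if [ -r "$controllers_file" ]; then
  if has_controller cpu "$(cat "$controllers_file")"; then
    echo "cpu controller is delegated"
  else
    echo "cpu controller is NOT delegated"
  fi
fi
```

Run it as the same unprivileged user that will create the kind cluster, since delegation is evaluated per user session.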
|
||
### Networking | ||
|
||
Containers running in rootless mode may not have the host-level iptables kernel modules loaded. | ||
This breaks the behavior of most networking components, such as Ingress and Gateway controllers. | ||
|
||
To load the iptable modules, do the following: | ||
|
||
1. First, use `lsmod` to check which kernel modules are loaded by default for user processes on | ||
your system. Use `grep` to find which iptable modules are loaded: | ||
|
||
```sh | ||
lsmod | grep "ip.*table" | ||
``` | ||
|
||
2. Check the output for the following kernel modules: | ||
- `ip6_tables` | ||
- `ip6table_nat` | ||
- `ip_tables` | ||
- `iptable_nat` | ||
|
||
- Create `/etc/systemd/system/user@.service.d/delegate.conf` with the following content, and then run `sudo systemctl daemon-reload`: | ||
3. If one or more of the kernel modules above are not present, configure your system to load them at | ||
startup. Run the following command to list the modules in a `modules-load.d` configuration file: | ||
|
||
```sh | ||
sudo tee /etc/modules-load.d/iptables.conf > /dev/null <<'EOF' | ||
ip6_tables | ||
ip6table_nat | ||
ip_tables | ||
iptable_nat | ||
EOF | ||
``` | ||
|
||
```ini | ||
[Service] | ||
Delegate=yes | ||
``` | ||
4. Check that the new module loading configuration is correct. You should see the following output: | ||
|
||
(This is not enabled by default because ["the runtime impact of | ||
[delegating the "cpu" controller] is still too | ||
high"](https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/ZMKLS7SHMRJLJ57NZCYPBAQ3UOYULV65/). | ||
Beware that changing this configuration may affect system | ||
performance.) | ||
```sh | ||
$ cat /etc/modules-load.d/iptables.conf | ||
ip6_tables | ||
ip6table_nat | ||
ip_tables | ||
iptable_nat | ||
``` | ||
|
||
Please note that: | ||
5. Next, restart the `systemd-modules-load` service to make these changes effective immediately: | ||
|
||
- `/etc/systemd/system/user@.service.d/` directory needs to be created if not already present on your host | ||
- If using Docker and it was already running when this step was done, a restart is needed for the changes to take | ||
effect | ||
{{< codeFromInline lang="bash" >}} | ||
systemctl --user restart docker | ||
{{< /codeFromInline >}} | ||
```sh | ||
sudo systemctl restart systemd-modules-load.service | ||
``` | ||
|
||
- Create `/etc/modules-load.d/iptables.conf` with the following content: | ||
Alternatively, restart your system to ensure these changes take effect. | ||
|
||
``` | ||
ip6_tables | ||
ip6table_nat | ||
ip_tables | ||
iptable_nat | ||
``` | ||
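
The module check in steps 1 and 2 can be scripted. This is a sketch under the assumption that `lsmod` prints the module name in the first column; `missing_iptables_modules` is a hypothetical helper name. Modules compiled directly into the kernel may not appear in `lsmod` output, so treat a reported module as a hint to investigate.

```sh
#!/bin/sh
# missing_iptables_modules (hypothetical helper): read `lsmod` output on stdin
# and print each required module that is not listed, one per line.
missing_iptables_modules() {
  # Keep only the first column (module names).
  loaded=$(awk '{print $1}')
  for mod in ip6_tables ip6table_nat ip_tables iptable_nat; do
    # -x: whole-line match, -F: treat the name as a fixed string.
    printf '%s\n' "$loaded" | grep -q -x -F "$mod" || echo "$mod"
  done
}

# Example usage:
#   lsmod | missing_iptables_modules
```

Empty output means all four modules are loaded.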
### Increase PID Limits | ||
|
||
- If using podman, be aware that by default there is a [limit](https://docs.podman.io/en/v4.3/markdown/options/pids-limit.html#pids-limit-limit) to the number of pids that can be created. This can cause problems like nginx workers inside a container not spawning correctly. | ||
- If you want to disable this limit, edit your `containers.conf` file (generally located in `/etc/containers/containers.conf`). Note that this could cause things like pid exhaustion to happen on the host machine. Alternatively, change `0` to your desired new limit: | ||
KIND nodes are represented as individual containers on their hosts. Runtimes such as podman set | ||
default [process id limits](https://docs.podman.io/en/v4.3/markdown/options/pids-limit.html#pids-limit-limit) | ||
that may be too low for the node or for a pod running on the node. The Ingress NGINX Controller is | ||
[particularly susceptible](https://github.com/kubernetes-sigs/kind/issues/3451) to this issue. | ||
|
||
To increase the PID limit, do the following: | ||
|
||
1. If using podman, edit your `containers.conf` file (generally located in | ||
`/etc/containers/containers.conf` or `~/.config/containers/containers.conf`) to increase the PIDs | ||
limit to a desired value (default 4096 on most systems): | ||
|
||
```ini | ||
[containers] | ||
pids_limit = 0 | ||
pids_limit = 65536 | ||
``` | ||
|
||
2. Recreate the KIND cluster for these changes to take effect: | ||
|
||
```sh | ||
kind delete cluster && kind create cluster | ||
``` | ||
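
After recreating the cluster, you may want to confirm the limit the node container actually received. The sketch below rests on several assumptions: the node uses the default name `kind-control-plane`, the host runs cgroup v2, and the container's limit is visible at `/sys/fs/cgroup/pids.max` inside the node. `pids_limit_at_least` is a hypothetical helper name.

```sh
#!/bin/sh
# pids_limit_at_least VALUE MIN (hypothetical helper): succeed if VALUE, as read
# from a cgroup v2 pids.max file, is "max" (unlimited) or an integer >= MIN.
pids_limit_at_least() {
  [ "$1" = "max" ] && return 0
  [ "$1" -ge "$2" ]
}

# Example usage (assumed node name and cgroup path):
#   limit=$(podman exec kind-control-plane cat /sys/fs/cgroup/pids.max)
#   pids_limit_at_least "$limit" 65536 && echo "PID limit is high enough"
```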
|
||
### Increase inotify Limits | ||
|
||
As documented in [known issues](/docs/user/known-issues/#pod-errors-due-to-too-many-open-files), pods may | ||
fail by reaching inotify watch and instance limits. Ingress controllers such as NGINX and Contour | ||
are particularly susceptible to this issue. | ||
|
||
To increase the inotify limits, do the following: | ||
|
||
1. As root, create a `.conf` file in `/etc/sysctl.d` that increases the `fs.inotify` max user settings: | ||
|
||
``` | ||
fs.inotify.max_user_watches = 524288 | ||
fs.inotify.max_user_instances = 512 | ||
``` | ||
|
||
2. Reload `sysctl` for these changes to take effect: | ||
|
||
```sh | ||
sudo sysctl --system | ||
``` | ||
|
||
Alternatively, restart your system for these changes to take effect. | ||
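
To verify the new limits, compare the live values against the ones configured above. This is a minimal sketch: `check_inotify_limits` is a hypothetical helper name, and it assumes `sysctl` prints each setting as `key = value`.

```sh
#!/bin/sh
# check_inotify_limits (hypothetical helper): read `sysctl` output on stdin and
# report any inotify setting that is below the value used in this guide.
check_inotify_limits() {
  # Split each "key = value" line on spaces and "=".
  while IFS=' =' read -r key value; do
    case "$key" in
      fs.inotify.max_user_watches) want=524288 ;;
      fs.inotify.max_user_instances) want=512 ;;
      *) continue ;;
    esac
    [ "$value" -ge "$want" ] || echo "$key=$value is below $want"
  done
}

# Example usage:
#   sysctl fs.inotify.max_user_watches fs.inotify.max_user_instances | check_inotify_limits
```

Empty output means both settings meet or exceed the values shown in step 1.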
|
||
|
||
### Allow Binding to Privileged Ports | ||
|
||
If you use the `extraPortMappings` method to provide ingress to your KIND cluster, you can allow | ||
the KIND node container to bind to ports 80 and 443 on the host. User containers cannot bind to | ||
ports below 1024 by default as they are considered privileged. | ||
|
||
You can avoid this issue by binding the node to a non-privileged host port, such as 8080 or 8443: | ||
|
||
```yaml | ||
# kind config.yaml | ||
kind: Cluster | ||
apiVersion: kind.x-k8s.io/v1alpha4 | ||
nodes: | ||
- role: control-plane | ||
extraPortMappings: | ||
- containerPort: 80 | ||
hostPort: 8080 | ||
protocol: TCP | ||
- containerPort: 443 | ||
hostPort: 8443 | ||
protocol: TCP | ||
``` | ||
|
||
Note that with this configuration, requests to your cluster ingress will need to add the | ||
appropriate port number. In the example above, HTTP requests must use `localhost:8080` in the URL. | ||
|
||
To allow a KIND node to bind to ports 80 and/or 443 on the host, do the following: | ||
|
||
1. As root, create a `.conf` file in `/etc/sysctl.d` that lowers the privileged port start number: | ||
|
||
``` | ||
# Allow unprivileged binding to HTTP port 80 | ||
# Use 443 if you only need binding to the default HTTPS port | ||
net.ipv4.ip_unprivileged_port_start=80 | ||
``` | ||
|
||
2. Reload `sysctl` for these changes to take effect: | ||
|
||
```sh | ||
sudo sysctl --system | ||
``` | ||
|
||
Alternatively, restart your system for these changes to take effect. | ||
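
You can check the result by reading the current threshold and comparing it with the ports you plan to map. A minimal sketch follows; `can_bind_unprivileged` is a hypothetical helper name, and the fallback of 1024 is an assumption used only when the `/proc` entry cannot be read.

```sh
#!/bin/sh
# can_bind_unprivileged PORT START (hypothetical helper): succeed if PORT is at
# or above START, the value of net.ipv4.ip_unprivileged_port_start.
can_bind_unprivileged() {
  [ "$1" -ge "$2" ]
}

# Read the live setting; fall back to the usual default if it is unavailable.
start=$(cat /proc/sys/net/ipv4/ip_unprivileged_port_start 2>/dev/null || echo 1024)

for port in 80 443; do
  if can_bind_unprivileged "$port" "$start"; then
    echo "port $port: bindable without root"
  else
    echo "port $port: privileged (unprivileged ports start at $start)"
  fi
done
```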
|
||
## Restrictions | ||
|
||
The restrictions of Rootless Docker apply to kind clusters as well. | ||
|