Add persistent-drainable option for affinity-mode ingress annotation to support draining sticky server sessions #13480

Draft: wants to merge 7 commits into base: main

2 changes: 1 addition & 1 deletion docs/examples/affinity/cookie/README.md
@@ -9,7 +9,7 @@ Session affinity can be configured using the following annotations:
|Name|Description|Value|
| --- | --- | --- |
|nginx.ingress.kubernetes.io/affinity|Type of the affinity, set this to `cookie` to enable session affinity|string (NGINX only supports `cookie`)|
|nginx.ingress.kubernetes.io/affinity-mode|The affinity mode defines how sticky a session is. Use `balanced` to redistribute some sessions when scaling pods or `persistent` for maximum stickiness.|`balanced` (default) or `persistent`|
|nginx.ingress.kubernetes.io/affinity-mode|The affinity mode defines how sticky a session is. Use `balanced` to redistribute some sessions when scaling pods. Use `persistent` to persist sessions until pods receive a deletion timestamp. Use `persistent-drainable` to persist sessions until after a pod gracefully handles its `preStop` lifecycle hook.|`balanced` (default), `persistent`, or `persistent-drainable`|
|nginx.ingress.kubernetes.io/affinity-canary-behavior|Defines session affinity behavior of canaries. By default the behavior is `sticky`, and canaries respect session affinity configuration. Set this to `legacy` to restore original canary behavior, when session affinity parameters were not respected.|`sticky` (default) or `legacy`|
|nginx.ingress.kubernetes.io/session-cookie-name|Name of the cookie that will be created|string (defaults to `INGRESSCOOKIE`)|
|nginx.ingress.kubernetes.io/session-cookie-secure|Set the cookie as secure regardless the protocol of the incoming request|`"true"` or `"false"`|
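The annotations above are set on an Ingress object. As a minimal sketch of how they fit together (not part of this change; the host, Service name, and cookie name below are hypothetical placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: session-affinity-example            # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/affinity-mode: "persistent-drainable"
    nginx.ingress.kubernetes.io/session-cookie-name: "INGRESSCOOKIE"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com                 # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: sticky-backend        # hypothetical backend Service
                port:
                  number: 80
```

With `affinity-mode: "persistent-drainable"`, requests carrying the session cookie keep going to the same backend pod while that pod drains, as described in the table above.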
16 changes: 14 additions & 2 deletions docs/user-guide/nginx-configuration/annotations.md
@@ -17,7 +17,7 @@ You can add these Kubernetes annotations to specific Ingress objects to customiz
|---------------------------|------|
|[nginx.ingress.kubernetes.io/app-root](#rewrite)|string|
|[nginx.ingress.kubernetes.io/affinity](#session-affinity)|cookie|
|[nginx.ingress.kubernetes.io/affinity-mode](#session-affinity)|"balanced" or "persistent"|
|[nginx.ingress.kubernetes.io/affinity-mode](#session-affinity)|"balanced" or "persistent" or "persistent-drainable"|
|[nginx.ingress.kubernetes.io/affinity-canary-behavior](#session-affinity)|"sticky" or "legacy"|
|[nginx.ingress.kubernetes.io/auth-realm](#authentication)|string|
|[nginx.ingress.kubernetes.io/auth-secret](#authentication)|string|
@@ -173,7 +173,19 @@ If the Application Root is exposed in a different path and needs to be redirecte
The annotation `nginx.ingress.kubernetes.io/affinity` enables and sets the affinity type in all Upstreams of an Ingress. This way, a request will always be directed to the same upstream server.
The only affinity type available for NGINX is `cookie`.

The annotation `nginx.ingress.kubernetes.io/affinity-mode` defines the stickiness of a session. Setting this to `balanced` (default) will redistribute some sessions if a deployment gets scaled up, therefore rebalancing the load on the servers. Setting this to `persistent` will not rebalance sessions to new servers, therefore providing maximum stickiness.
The annotation `nginx.ingress.kubernetes.io/affinity-mode` defines the stickiness of a session.

- `balanced` (default)

Setting this to `balanced` will redistribute some sessions if a deployment gets scaled up, therefore rebalancing the load on the servers.

- `persistent`

Setting this to `persistent` will not rebalance sessions to new servers, therefore providing greater stickiness. Sticky sessions will continue to be routed to the same server as long as its [Endpoint's condition](https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#conditions) remains `Ready`. If the Endpoint stops being `Ready`, such as when a server pod receives a deletion timestamp, sessions will be rebalanced to another server.

- `persistent-drainable`

Setting this to `persistent-drainable` behaves like `persistent`, but sticky sessions will continue to be routed to the same server as long as its [Endpoint's condition](https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#conditions) remains `Serving`, even after the server pod receives a deletion timestamp. This allows graceful session draining during the `preStop` lifecycle hook. New sessions will *not* be directed to these draining servers and will only be routed to a server whose Endpoint is `Ready`, except potentially when all servers are draining.
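To illustrate the deployment pattern this mode targets (a hypothetical sketch, not part of this change), the backend would typically pair `persistent-drainable` with a `preStop` hook and a termination grace period long enough for existing sessions to finish while the Endpoint is `Serving` but no longer `Ready`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sticky-backend                       # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sticky-backend
  template:
    metadata:
      labels:
        app: sticky-backend
    spec:
      # Allow enough time for the preStop hook to drain open sessions
      # before the pod is forcefully terminated.
      terminationGracePeriodSeconds: 3600
      containers:
        - name: app
          image: registry.example.com/sticky-app:1.0   # hypothetical image
          ports:
            - containerPort: 8080
          lifecycle:
            preStop:
              exec:
                # Hypothetical drain script: it should block until
                # in-flight sticky sessions have completed (or timed out).
                command: ["/bin/sh", "-c", "/usr/local/bin/drain-sessions.sh"]
```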

The annotation `nginx.ingress.kubernetes.io/affinity-canary-behavior` defines the behavior of canaries when session affinity is enabled. Setting this to `sticky` (default) will ensure that users that were served by canaries will continue to be served by canaries. Setting this to `legacy` will restore original canary behavior, when session affinity was ignored.

1 change: 1 addition & 0 deletions internal/ingress/annotations/annotations_test.go
@@ -197,6 +197,7 @@ func TestAffinitySession(t *testing.T) {
}{
{map[string]string{annotationAffinityType: "cookie", annotationAffinityMode: "balanced", annotationAffinityCookieName: "route", annotationAffinityCanaryBehavior: ""}, "cookie", "balanced", "route", ""},
{map[string]string{annotationAffinityType: "cookie", annotationAffinityMode: "persistent", annotationAffinityCookieName: "route1", annotationAffinityCanaryBehavior: "sticky"}, "cookie", "persistent", "route1", "sticky"},
{map[string]string{annotationAffinityType: "cookie", annotationAffinityMode: "persistent-drainable", annotationAffinityCookieName: "route1", annotationAffinityCanaryBehavior: "sticky"}, "cookie", "persistent-drainable", "route1", "sticky"},
{map[string]string{annotationAffinityType: "cookie", annotationAffinityMode: "balanced", annotationAffinityCookieName: "", annotationAffinityCanaryBehavior: "legacy"}, "cookie", "balanced", "INGRESSCOOKIE", "legacy"},
{map[string]string{}, "", "", "", ""},
{nil, "", "", "", ""},
5 changes: 3 additions & 2 deletions internal/ingress/annotations/sessionaffinity/main.go
@@ -77,12 +77,13 @@ var sessionAffinityAnnotations = parser.Annotation{
Documentation: `This annotation enables and sets the affinity type in all Upstreams of an Ingress. This way, a request will always be directed to the same upstream server. The only affinity type available for NGINX is cookie`,
},
annotationAffinityMode: {
Validator: parser.ValidateOptions([]string{"balanced", "persistent"}, true, true),
Validator: parser.ValidateOptions([]string{"balanced", "persistent", "persistent-drainable"}, true, true),
Scope: parser.AnnotationScopeIngress,
Risk: parser.AnnotationRiskMedium,
Documentation: `This annotation defines the stickiness of a session.
Setting this to balanced (default) will redistribute some sessions if a deployment gets scaled up, therefore rebalancing the load on the servers.
Setting this to persistent will not rebalance sessions to new servers, therefore providing maximum stickiness.`,
Setting this to persistent will not rebalance sessions to new servers, therefore providing greater stickiness. Sticky sessions will continue to be routed to the same server as long as its Endpoint's condition remains Ready. If the Endpoint stops being Ready, such as when a server pod receives a deletion timestamp, sessions will be rebalanced to another server.
Setting this to persistent-drainable behaves like persistent, but sticky sessions will continue to be routed to the same server as long as its Endpoint's condition remains Serving, even after the server pod receives a deletion timestamp. This allows graceful session draining during the preStop lifecycle hook. New sessions will *not* be directed to these draining servers and will only be routed to a server whose Endpoint is Ready, except potentially when all servers are draining.`,
},
annotationAffinityCanaryBehavior: {
Validator: parser.ValidateOptions([]string{"sticky", "legacy"}, true, true),
36 changes: 27 additions & 9 deletions internal/ingress/controller/controller.go
@@ -536,7 +536,7 @@ func (n *NGINXController) getStreamServices(configmapName string, proto apiv1.Pr
sp := svc.Spec.Ports[i]
if sp.Name == svcPort {
if sp.Protocol == proto {
endps = getEndpointsFromSlices(svc, &sp, proto, zone, n.store.GetServiceEndpointsSlices)
endps = getEndpointsFromSlices(svc, &sp, proto, zone, ReadyEndpoints, n.store.GetServiceEndpointsSlices)
break
}
}
@@ -548,7 +548,7 @@
//nolint:gosec // Ignore G109 error
if sp.Port == int32(targetPort) {
if sp.Protocol == proto {
endps = getEndpointsFromSlices(svc, &sp, proto, zone, n.store.GetServiceEndpointsSlices)
endps = getEndpointsFromSlices(svc, &sp, proto, zone, ReadyEndpoints, n.store.GetServiceEndpointsSlices)
break
}
}
@@ -605,7 +605,7 @@ func (n *NGINXController) getDefaultUpstream() *ingress.Backend {
} else {
zone = emptyZone
}
endps := getEndpointsFromSlices(svc, &svc.Spec.Ports[0], apiv1.ProtocolTCP, zone, n.store.GetServiceEndpointsSlices)
endps := getEndpointsFromSlices(svc, &svc.Spec.Ports[0], apiv1.ProtocolTCP, zone, ReadyEndpoints, n.store.GetServiceEndpointsSlices)
if len(endps) == 0 {
klog.Warningf("Service %q does not have any active Endpoint", svcKey)
endps = []ingress.Endpoint{n.DefaultEndpoint()}
@@ -940,7 +940,7 @@ func (n *NGINXController) getBackendServers(ingresses []*ingress.Ingress) ([]*in
} else {
zone = emptyZone
}
endps := getEndpointsFromSlices(location.DefaultBackend, &sp, apiv1.ProtocolTCP, zone, n.store.GetServiceEndpointsSlices)
endps := getEndpointsFromSlices(location.DefaultBackend, &sp, apiv1.ProtocolTCP, zone, ReadyEndpoints, n.store.GetServiceEndpointsSlices)
// custom backend is valid only if contains at least one endpoint
if len(endps) > 0 {
name := fmt.Sprintf("custom-default-backend-%v-%v", location.DefaultBackend.GetNamespace(), location.DefaultBackend.GetName())
@@ -1050,7 +1050,7 @@ func (n *NGINXController) createUpstreams(data []*ingress.Ingress, du *ingress.B

if len(upstreams[defBackend].Endpoints) == 0 {
_, port := upstreamServiceNameAndPort(ing.Spec.DefaultBackend.Service)
endps, err := n.serviceEndpoints(svcKey, port.String())
endps, err := n.serviceEndpoints(svcKey, port.String(), ReadyEndpoints)
upstreams[defBackend].Endpoints = append(upstreams[defBackend].Endpoints, endps...)
if err != nil {
klog.Warningf("Error creating upstream %q: %v", defBackend, err)
@@ -1114,8 +1114,13 @@ func (n *NGINXController) createUpstreams(data []*ingress.Ingress, du *ingress.B
}

if len(upstreams[name].Endpoints) == 0 {
epSelectionMode := ReadyEndpoints
if anns.SessionAffinity.Mode == "persistent-drainable" {
epSelectionMode = ServingEndpoints
}

_, port := upstreamServiceNameAndPort(path.Backend.Service)
endp, err := n.serviceEndpoints(svcKey, port.String())
endp, err := n.serviceEndpoints(svcKey, port.String(), epSelectionMode)
if err != nil {
klog.Warningf("Error obtaining Endpoints for Service %q: %v", svcKey, err)
n.metricCollector.IncOrphanIngress(ing.Namespace, ing.Name, orphanMetricLabelNoService)
@@ -1127,6 +1132,10 @@ func (n *NGINXController) createUpstreams(data []*ingress.Ingress, du *ingress.B
n.metricCollector.IncOrphanIngress(ing.Namespace, ing.Name, orphanMetricLabelNoEndpoint)
} else {
n.metricCollector.DecOrphanIngress(ing.Namespace, ing.Name, orphanMetricLabelNoEndpoint)

if allEndpointsAreDraining(endp) {
klog.Warningf("All Endpoints for Service %q are draining.", svcKey)
}
}
upstreams[name].Endpoints = endp
}
@@ -1184,7 +1193,7 @@ func (n *NGINXController) getServiceClusterEndpoint(svcKey string, backend *netw
}

// serviceEndpoints returns the upstream servers (Endpoints) associated with a Service.
func (n *NGINXController) serviceEndpoints(svcKey, backendPort string) ([]ingress.Endpoint, error) {
func (n *NGINXController) serviceEndpoints(svcKey, backendPort string, epSelectionMode EndpointSelectionMode) ([]ingress.Endpoint, error) {
var upstreams []ingress.Endpoint

svc, err := n.store.GetService(svcKey)
@@ -1205,7 +1214,7 @@ func (n *NGINXController) serviceEndpoints(svcKey, backendPort string) ([]ingres
return upstreams, nil
}
servicePort := externalNamePorts(backendPort, svc)
endps := getEndpointsFromSlices(svc, servicePort, apiv1.ProtocolTCP, zone, n.store.GetServiceEndpointsSlices)
endps := getEndpointsFromSlices(svc, servicePort, apiv1.ProtocolTCP, zone, epSelectionMode, n.store.GetServiceEndpointsSlices)
if len(endps) == 0 {
klog.Warningf("Service %q does not have any active Endpoint.", svcKey)
return upstreams, nil
@@ -1221,7 +1230,7 @@ func (n *NGINXController) serviceEndpoints(svcKey, backendPort string) ([]ingres
if strconv.Itoa(int(servicePort.Port)) == backendPort ||
servicePort.TargetPort.String() == backendPort ||
servicePort.Name == backendPort {
endps := getEndpointsFromSlices(svc, &servicePort, apiv1.ProtocolTCP, zone, n.store.GetServiceEndpointsSlices)
endps := getEndpointsFromSlices(svc, &servicePort, apiv1.ProtocolTCP, zone, epSelectionMode, n.store.GetServiceEndpointsSlices)
if len(endps) == 0 {
klog.Warningf("Service %q does not have any active Endpoint.", svcKey)
}
@@ -1903,3 +1912,12 @@ func newTrafficShapingPolicy(cfg *canary.Config) ingress.TrafficShapingPolicy {
Cookie: cfg.Cookie,
}
}

func allEndpointsAreDraining(eps []ingress.Endpoint) bool {
	for _, ep := range eps {
		if !ep.IsDraining {
			return false
		}
	}
	return true
}
31 changes: 25 additions & 6 deletions internal/ingress/controller/endpointslices.go
@@ -34,9 +34,16 @@ import (
"k8s.io/ingress-nginx/pkg/apis/ingress"
)

type EndpointSelectionMode int

const (
ReadyEndpoints EndpointSelectionMode = iota
ServingEndpoints
)

// getEndpointsFromSlices returns a list of Endpoint structs for a given service/target port combination.
func getEndpointsFromSlices(s *corev1.Service, port *corev1.ServicePort, proto corev1.Protocol, zoneForHints string,
getServiceEndpointsSlices func(string) ([]*discoveryv1.EndpointSlice, error),
epSelectionMode EndpointSelectionMode, getServiceEndpointsSlices func(string) ([]*discoveryv1.EndpointSlice, error),
) []ingress.Endpoint {
upsServers := []ingress.Endpoint{}

@@ -153,9 +160,19 @@ func getEndpointsFromSlices(s *corev1.Service, port *corev1.ServicePort, proto c
}

for _, ep := range eps.Endpoints {
if (ep.Conditions.Ready != nil) && !(*ep.Conditions.Ready) {
continue
epIsReady := (ep.Conditions.Ready == nil) || *ep.Conditions.Ready
if epSelectionMode == ReadyEndpoints {
if !epIsReady {
continue
}
} else {
// assume epSelectionMode == ServingEndpoints.
epIsServing := (ep.Conditions.Serving == nil) || *ep.Conditions.Serving
if !epIsServing {
continue
}
}

epHasZone := false
if useTopologyHints {
for _, epzone := range ep.Hints.ForZones {
@@ -176,10 +193,12 @@ func getEndpointsFromSlices(s *corev1.Service, port *corev1.ServicePort, proto c
if _, exists := processedUpstreamServers[hostPort]; exists {
continue
}

ups := ingress.Endpoint{
Address: epAddress,
Port: fmt.Sprintf("%v", epPort),
Target: ep.TargetRef,
Address: epAddress,
Port: fmt.Sprintf("%v", epPort),
Target: ep.TargetRef,
IsDraining: !epIsReady,
}
upsServers = append(upsServers, ups)
processedUpstreamServers[hostPort] = struct{}{}