Commit d3b2f2c

Merge pull request #89935 from ochromy/SRVKS-1306
2 parents 73b8118 + c7b2f3d commit d3b2f2c

4 files changed

+239
-0
lines changed

knative-serving/config-applications/configuring-revision-timeouts.adoc

Lines changed: 12 additions & 0 deletions
@@ -8,5 +8,17 @@ toc::[]
88

99
You can configure timeout durations for revisions globally or individually to control the time spent on requests.
1010

11+
//Configuring revision timeout
1112
include::modules/configuring-revision-timeout.adoc[leveloffset=+1]
13+
14+
//Configuring maximum revision timeout
1215
include::modules/configuring-maximum-revision-timeout.adoc[leveloffset=+1]
16+
17+
//Long-running requests
18+
include::modules/serverless-long-running-requests.adoc[leveloffset=+1]
19+
20+
//Configuring the default route timeouts globally (+2)
21+
include::modules/configuring-default-route-timeouts-globally.adoc[leveloffset=+2]
22+
23+
//Configuring the default route timeouts per revision (+2)
24+
include::modules/configuring-default-route-timeouts.adoc[leveloffset=+2]
Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * knative-serving/config-applications/configuring-revision-timeouts.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="configuring-default-route-timeouts-globally_{context}"]
7+
= Configuring the default route timeouts globally
8+
9+
By configuring the route timeouts globally, you can ensure consistent timeout settings across all services, simplifying management for workloads that have similar timeout needs and reducing the need for individual adjustments.
10+
11+
You can configure the route timeouts globally by updating the `ROUTE_HAPROXY_TIMEOUT` environment variable in your `serverless-operator` subscription and updating the `max-revision-timeout-seconds` field in your `KnativeServing` custom resource (CR). This applies the timeout changes across all Knative services, and you can then deploy services with specific timeouts up to the configured maximum.
12+
13+
.Procedure
14+
15+
. Set the value of `ROUTE_HAPROXY_TIMEOUT` in your subscription to your required timeout in seconds by running the following command:
16+
+
17+
.Setting the `ROUTE_HAPROXY_TIMEOUT` value to 900 seconds
18+
[source,terminal]
19+
----
20+
$ oc patch subscription.operators.coreos.com serverless-operator -n openshift-serverless --type='merge' -p '{"spec": {"config": {"env": [{"name": "ROUTE_HAPROXY_TIMEOUT", "value": "900"}]}}}'
21+
----
22+
+
23+
The `ROUTE_HAPROXY_TIMEOUT` environment variable is managed by the Serverless Operator and is set to `600` by default. Changing the value causes pods in the `openshift-serverless` namespace to be redeployed.
24+
+
25+
[NOTE]
26+
====
27+
If you created your routes manually and disabled auto-generation with the `serving.knative.openshift.io/disableRoute` annotation, you can configure the timeouts directly in the route definitions.
28+
====
29+
30+
. Set the maximum revision timeout in your `KnativeServing` CR:
31+
+
32+
.`KnativeServing` CR with `max-revision-timeout-seconds` set to 900 seconds
33+
[source,yaml]
34+
----
35+
apiVersion: operator.knative.dev/v1beta1
36+
kind: KnativeServing
37+
metadata:
38+
  name: knative-serving
39+
spec:
40+
  config:
41+
    defaults:
42+
      max-revision-timeout-seconds: "900"
43+
#...
44+
----
45+
+
46+
The Serverless Operator automatically sets the `terminationGracePeriodSeconds` value of the activator to the configured maximum revision timeout, so that in-flight requests are not terminated when the activator pods themselves are being terminated.
47+
48+
. Optional: Verify that the timeout has been set by running the following command:
49+
+
50+
[source,terminal]
51+
----
52+
$ oc get deployment activator -n knative-serving -o jsonpath="{.spec.template.spec.terminationGracePeriodSeconds}"
53+
----
54+
55+
. If necessary for your cloud provider, adjust the load balancer timeout by running the following command:
56+
+
57+
.Load balancer timeout adjustment for AWS Classic LB
58+
[source,terminal]
59+
----
60+
$ oc -n openshift-ingress-operator patch ingresscontroller/default --type=merge --patch='
61+
  {"spec":{"endpointPublishingStrategy":
62+
  {"type":"LoadBalancerService", "loadBalancer":
63+
  {"scope":"External", "providerParameters":{"type":"AWS", "aws":
64+
  {"type":"Classic", "classicLoadBalancer":
65+
  {"connectionIdleTimeout":"20m"}}}}}}}'
66+
----
67+
68+
. Deploy a Knative service with timeouts that are less than or equal to the `max-revision-timeout-seconds` value:
69+
+
70+
.A Service definition with timeouts set to 800 seconds
71+
[source,yaml]
72+
----
73+
apiVersion: serving.knative.dev/v1
74+
kind: Service
75+
metadata:
76+
  name: example-service-name
77+
spec:
78+
  template:
79+
    spec:
80+
      timeoutSeconds: 800
81+
      responseStartTimeoutSeconds: 800
82+
----
83+
+
84+
[IMPORTANT]
85+
====
86+
When using {SMProductShortName}, if the activator pod is stopped while a long-running request is in-flight, the request is interrupted.
87+
To avoid request interruptions, you must adjust the value of the `terminationDrainDuration` field in the `ServiceMeshControlPlane` CR:
88+
[source,yaml]
89+
----
90+
apiVersion: maistra.io/v2
91+
kind: ServiceMeshControlPlane
92+
#...
93+
spec:
94+
  techPreview:
95+
    meshConfig:
96+
      defaultConfig:
97+
        terminationDrainDuration: 1000s <1>
98+
#...
99+
----
100+
<1> Ensure that the value exceeds the request duration to avoid the Istio proxy shutdown, which would interrupt the request.
101+
====
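The first step of the procedure passes the timeout as a string inside a JSON merge patch. As a minimal sketch, the helper below validates that the value is a whole number of seconds and prints the same patch body, so that the value can be kept in one place. The function name `route_timeout_patch` is hypothetical and is not part of `oc` or the Serverless Operator.

```shell
# Hypothetical helper (name is an assumption): print the merge patch that sets
# ROUTE_HAPROXY_TIMEOUT. The JSON body mirrors the patch used in the procedure.
route_timeout_patch() {
  case "$1" in
    ''|*[!0-9]*)
      # Reject empty values and anything that is not a plain integer.
      echo "timeout must be a whole number of seconds" >&2
      return 1
      ;;
  esac
  printf '{"spec": {"config": {"env": [{"name": "ROUTE_HAPROXY_TIMEOUT", "value": "%s"}]}}}\n' "$1"
}

# Print the patch for a 900 second timeout.
route_timeout_patch 900
```

The output can be passed to the documented command in place of the literal patch, for example with `-p "$(route_timeout_patch 900)"`.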
102+
103+
.Verification
104+
105+
* If you are using Kourier, you can verify the current value of the timeout at the {ocp-product-title} route by running the following command:
106+
+
107+
[source,terminal]
108+
----
109+
$ oc get route <route_name> -n knative-serving-ingress -o jsonpath="{.metadata.annotations.haproxy\.router\.openshift\.io/timeout}"
110+
800s
111+
----
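The route annotation reports its value with a unit suffix (`800s`), while the load balancer idle timeout in the procedure is given in minutes (`20m`). To compare such values, they can be normalized to seconds. The following is a sketch only: `to_seconds` is a hypothetical helper, and it handles just plain numbers and a single `s`, `m`, or `h` suffix.

```shell
# Hypothetical helper (name is an assumption): convert a duration such as
# "800s" or "20m" to seconds. Other formats are rejected in this sketch.
to_seconds() {
  value="${1%[smh]}"            # strip one trailing unit letter, if present
  case "$value" in
    ''|*[!0-9]*)
      echo "unsupported duration: $1" >&2
      return 1
      ;;
  esac
  case "$1" in
    *h) echo $((value * 3600)) ;;
    *m) echo $((value * 60)) ;;
    *)  echo "$value" ;;        # seconds, with or without the "s" suffix
  esac
}

to_seconds 800s   # route annotation value from the verification output
to_seconds 20m    # load balancer idle timeout from the procedure
```

With the values from this module, the two calls print `800` and `1200`.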
Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * knative-serving/config-applications/configuring-revision-timeouts.adoc
4+
5+
:_mod-docs-content-type: PROCEDURE
6+
[id="configuring-default-route-timeouts_{context}"]
7+
= Configuring the default route timeouts per revision
8+
9+
By configuring the route timeouts per revision, you can fine-tune timeouts for workloads with unique requirements, such as AI or data processing applications, without impacting the global timeout settings for other services.
10+
You can configure the timeouts of a specific revision by updating your `KnativeServing` custom resource (CR), the Service definition, and using the `serving.knative.openshift.io/setRouteTimeout` annotation to adjust the {ocp-product-title} route timeout.
11+
12+
.Procedure
13+
14+
. Set the `max-revision-timeout-seconds` field in your `KnativeServing` CR to the value you require:
15+
+
16+
[source,yaml]
17+
----
18+
apiVersion: operator.knative.dev/v1beta1
19+
kind: KnativeServing
20+
metadata:
21+
  name: knative-serving
22+
spec:
23+
  config:
24+
    defaults:
25+
      max-revision-timeout-seconds: "900"
26+
----
27+
28+
. Optional: Verify the termination grace period of the activator by running the following command:
29+
+
30+
[source,terminal]
31+
----
32+
$ oc get deployment activator -n knative-serving -o jsonpath="{.spec.template.spec.terminationGracePeriodSeconds}"
33+
34+
900
35+
----
36+
37+
. If necessary for your cloud provider, adjust the load balancer timeout by running the following command:
38+
+
39+
.Load balancer timeout adjustment for AWS Classic LB
40+
[source,terminal]
41+
----
42+
$ oc -n openshift-ingress-operator patch ingresscontroller/default \
43+
--type=merge --patch='{"spec":{"endpointPublishingStrategy":
44+
  {"type":"LoadBalancerService", "loadBalancer":
45+
  {"scope":"External", "providerParameters":{"type":"AWS", "aws":
46+
  {"type":"Classic", "classicLoadBalancer":
47+
  {"connectionIdleTimeout":"20m"}}}}}}}'
48+
----
49+
50+
. Set the timeout for your specific service:
51+
+
52+
[source,yaml]
53+
----
54+
apiVersion: serving.knative.dev/v1
55+
kind: Service
56+
metadata:
57+
  name: <your_service_name>
58+
  annotations:
59+
    serving.knative.openshift.io/setRouteTimeout: "800" <1>
60+
spec:
61+
  template:
62+
    metadata:
63+
      annotations:
64+
        #...
65+
    spec:
66+
      timeoutSeconds: 800 <2>
67+
      responseStartTimeoutSeconds: 800 <3>
68+
----
69+
<1> This annotation sets the timeout for the {ocp-product-title} route. You can fine-tune this for each service instead of setting a global maximum.
70+
<2> This ensures that the request duration does not exceed the specified value.
71+
<3> This ensures that the response start timeout does not trigger before the max threshold is reached. The default value is `300`.
72+
73+
+
74+
[IMPORTANT]
75+
====
76+
When using {SMProductShortName}, if the activator pod is stopped while a long-running request is in-flight, the request is interrupted.
77+
To avoid request interruptions, you must adjust the value of the `terminationDrainDuration` field in the `ServiceMeshControlPlane` CR:
78+
[source,yaml]
79+
----
80+
apiVersion: maistra.io/v2
81+
kind: ServiceMeshControlPlane
82+
#...
83+
spec:
84+
  techPreview:
85+
    meshConfig:
86+
      defaultConfig:
87+
        terminationDrainDuration: 1000s <1>
88+
#...
89+
----
90+
<1> Ensure that the value exceeds the request duration to avoid the Istio proxy shutdown, which would interrupt the request.
91+
====
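The Service definition in this procedure repeats one timeout value in three places: the route annotation and the two revision timeouts. As an illustrative sketch, the hypothetical helper `render_service_timeouts` below prints those fields from a single value. It emits only the timeout-related fields shown in the procedure, so the output is a fragment and not a complete, deployable Service.

```shell
# Hypothetical helper (name is an assumption): print the timeout-related
# fields of a Knative Service, using one value for the route annotation
# and both revision timeouts. The output is a fragment, not a full manifest.
render_service_timeouts() {
  name="$1"
  timeout="$2"
  cat <<EOF
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: ${name}
  annotations:
    serving.knative.openshift.io/setRouteTimeout: "${timeout}"
spec:
  template:
    spec:
      timeoutSeconds: ${timeout}
      responseStartTimeoutSeconds: ${timeout}
EOF
}

render_service_timeouts example-service-name 800
```

Keeping the three values identical matches the example in the procedure. Whether a workload needs a shorter `responseStartTimeoutSeconds` than `timeoutSeconds` is a separate decision that this sketch does not cover.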
92+
93+
.Verification
94+
95+
* If you are using Kourier, you can verify the current value of the timeout at the {ocp-product-title} route by running the following command:
96+
+
97+
[source,terminal]
98+
----
99+
$ oc get route <route_name> -n knative-serving-ingress -o jsonpath="{.metadata.annotations.haproxy\.router\.openshift\.io/timeout}"
100+
800s
101+
----
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
// Module included in the following assemblies:
2+
//
3+
// * knative-serving/config-applications/configuring-revision-timeouts.adoc
4+
5+
:_mod-docs-content-type: CONCEPT
6+
[id="serverless-long-running-requests_{context}"]
7+
= Long-running requests
8+
9+
To ensure that requests exceeding the default 600-second timeout set by Knative are not prematurely terminated, you need to adjust the timeouts in the following components:
10+
11+
* {ocp-product-title} route
12+
* {ServerlessProductName} Serving
13+
* Load balancer, depending on the cloud provider
14+
15+
You can configure the timeouts globally or per revision. Configure them globally if requests across all Knative services need extended durations, or per revision for specific workloads that require different timeout values, such as AI deployments.
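Because a long-running request passes through every component listed above, a quick consistency check can help before changing any settings. The sketch below assumes that a request survives only if each component allows at least the request duration. That rule is an interpretation of this module, not a documented guarantee, and `check_request_fits` is a hypothetical helper. All arguments are in seconds.

```shell
# Hypothetical check (name and rule are assumptions): confirm that the
# request timeout fits within each component's timeout.
# Usage: check_request_fits <request> <max_revision> <route> <load_balancer>
check_request_fits() {
  request="$1"
  shift
  for layer in "max-revision-timeout-seconds=$1" "route=$2" "load-balancer=$3"; do
    limit="${layer#*=}"         # numeric part after the "="
    if [ "$request" -gt "$limit" ]; then
      echo "request timeout ${request}s exceeds ${layer%%=*} (${limit}s)"
      return 1
    fi
  done
  echo "ok"
}

# Example: 800 s request, 900 s max revision timeout, 900 s route, 1200 s load balancer.
check_request_fits 800 900 900 1200
```

The function stops at the first component that is too short and names it, which indicates which setting to raise first.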
