2 changes: 1 addition & 1 deletion docs/README.md
@@ -102,7 +102,7 @@ should copy into their pom.xml file. It will render out to:

```xml
<dependency>
-<groupdId>org.apache.flink</groupId>
+<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java</artifactId>
<version><!-- current flink version --></version>
</dependency>
```
59 changes: 59 additions & 0 deletions docs/content/docs/custom-resource/reference.md
@@ -73,6 +73,24 @@ This serves as a full reference for FlinkDeployment and FlinkSessionJob custom r
| Parameter | Type | Docs |
| ----------| ---- | ---- |

### FlinkBlueGreenDeploymentConfigOptions
**Class**: org.apache.flink.kubernetes.operator.api.spec.FlinkBlueGreenDeploymentConfigOptions

**Description**: Configuration options to be used by the Flink Blue/Green Deployments.

| Parameter | Type | Docs |
| ----------| ---- | ---- |

### FlinkBlueGreenDeploymentSpec
**Class**: org.apache.flink.kubernetes.operator.api.spec.FlinkBlueGreenDeploymentSpec

**Description**: Spec that describes a Flink application with blue/green deployment capabilities.

| Parameter | Type | Docs |
| ----------| ---- | ---- |
| configuration | java.util.Map<java.lang.String,java.lang.String> | |
| template | org.apache.flink.kubernetes.operator.api.spec.FlinkDeploymentTemplateSpec | |

### FlinkDeploymentSpec
**Class**: org.apache.flink.kubernetes.operator.api.spec.FlinkDeploymentSpec

@@ -94,6 +112,16 @@ This serves as a full reference for FlinkDeployment and FlinkSessionJob custom r
| logConfiguration | java.util.Map<java.lang.String,java.lang.String> | Log configuration overrides for the Flink deployment. Format logConfigFileName -> configContent. |
| mode | org.apache.flink.kubernetes.operator.api.spec.KubernetesDeploymentMode | Deployment mode of the Flink cluster, native or standalone. |

### FlinkDeploymentTemplateSpec
**Class**: org.apache.flink.kubernetes.operator.api.spec.FlinkDeploymentTemplateSpec

**Description**: Template Spec that describes a Flink application managed by the blue/green controller.

| Parameter | Type | Docs |
| ----------| ---- | ---- |
| metadata | io.fabric8.kubernetes.api.model.ObjectMeta | |
| spec | org.apache.flink.kubernetes.operator.api.spec.FlinkDeploymentSpec | |

### FlinkSessionJobSpec
**Class**: org.apache.flink.kubernetes.operator.api.spec.FlinkSessionJobSpec

@@ -308,6 +336,37 @@ This serves as a full reference for FlinkDeployment and FlinkSessionJob custom r
| UNKNOWN | Checkpoint format unknown, if the checkpoint was not triggered by the operator. |
| description | org.apache.flink.configuration.description.InlineElement | |

### FlinkBlueGreenDeploymentState
**Class**: org.apache.flink.kubernetes.operator.api.status.FlinkBlueGreenDeploymentState

**Description**: Enumeration of the possible states of the blue/green transition.

| Value | Docs |
| ----- | ---- |
| INITIALIZING_BLUE | Used while initializing for the first time, always with a "Blue" deployment type. |
| ACTIVE_BLUE | Indicates that the system is running normally with a "Blue" deployment type. |
| ACTIVE_GREEN | Indicates that the system is running normally with a "Green" deployment type. |
| TRANSITIONING_TO_BLUE | Indicates that the system is transitioning from "Green" to "Blue". |
| TRANSITIONING_TO_GREEN | Indicates that the system is transitioning from "Blue" to "Green". |
| SAVEPOINTING_BLUE | Indicates that the system is savepointing "Blue" before it transitions to "Green". |
| SAVEPOINTING_GREEN | Indicates that the system is savepointing "Green" before it transitions to "Blue". |
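
Reviewer sketch (not part of the diff): the state table above implies which steady state each value should eventually settle in, assuming the transition succeeds. The function name `next_stable_state` is hypothetical, and abort or failure paths are deliberately ignored.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the operator or of utils.sh.
# Prints the ACTIVE_* state that a given blueGreenState is expected to
# reach next, based only on the descriptions in the table above.
# Assumption: the transition completes without being aborted.
next_stable_state() {
  case "$1" in
    INITIALIZING_BLUE)
      echo "ACTIVE_BLUE" ;;
    ACTIVE_BLUE|SAVEPOINTING_BLUE|TRANSITIONING_TO_GREEN)
      echo "ACTIVE_GREEN" ;;
    ACTIVE_GREEN|SAVEPOINTING_GREEN|TRANSITIONING_TO_BLUE)
      echo "ACTIVE_BLUE" ;;
    *)
      echo "unknown state: $1" >&2
      return 1 ;;
  esac
}
```

For example, `next_stable_state SAVEPOINTING_BLUE` prints `ACTIVE_GREEN`, which matches the order in which the e2e scripts in this change wait for `ACTIVE_BLUE` and then `ACTIVE_GREEN`.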

### FlinkBlueGreenDeploymentStatus
**Class**: org.apache.flink.kubernetes.operator.api.status.FlinkBlueGreenDeploymentStatus

**Description**: Last observed status of the Flink Blue/Green deployment.

| Parameter | Type | Docs |
| ----------| ---- | ---- |
| jobStatus | org.apache.flink.kubernetes.operator.api.status.JobStatus | |
| blueGreenState | org.apache.flink.kubernetes.operator.api.status.FlinkBlueGreenDeploymentState | The state of the blue/green transition. |
| lastReconciledSpec | java.lang.String | Last reconciled (serialized) deployment spec. |
| lastReconciledTimestamp | java.lang.String | Timestamp of last reconciliation. |
| abortTimestamp | java.lang.String | Computed from abortGracePeriodMs, timestamp after which the deployment should be aborted. |
| deploymentReadyTimestamp | java.lang.String | Timestamp when the deployment became READY/STABLE. Used to determine when to delete it. |
| savepointTriggerId | java.lang.String | Persisted triggerId to track transition with savepoint. Only used with UpgradeMode.SAVEPOINT. |
| error | java.lang.String | Error information about the FlinkBlueGreenDeployment. |

### FlinkDeploymentReconciliationStatus
**Class**: org.apache.flink.kubernetes.operator.api.status.FlinkDeploymentReconciliationStatus

99 changes: 99 additions & 0 deletions e2e-tests/data/bluegreen-laststate.yaml
@@ -0,0 +1,99 @@
################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

apiVersion: flink.apache.org/v1beta1
kind: FlinkBlueGreenDeployment
metadata:
  name: basic-bg-laststate-example
spec:
  configuration:
    kubernetes.operator.bluegreen.deployment-deletion.delay: "1s"
  template:
    spec:
      image: flink:1.20
      flinkVersion: v1_20
      flinkConfiguration:
        rest.port: "8081"
        execution.checkpointing.interval: "10s"
        execution.checkpointing.storage: "filesystem"
        state.backend.incremental: "true"
        state.checkpoints.dir: file:///opt/flink/volume/flink-cp
        state.savepoints.dir: file:///opt/flink/volume/flink-sp
        state.checkpoints.num-retained: "5"
        taskmanager.numberOfTaskSlots: "1"
      serviceAccount: flink
      jobManager:
        resource:
          memory: 1G
          cpu: 1
      podTemplate:
        spec:
          containers:
            - name: flink-main-container
              resources:
                requests:
                  ephemeral-storage: 2048Mi
                limits:
                  ephemeral-storage: 2048Mi
              volumeMounts:
                - mountPath: /opt/flink/volume
                  name: flink-volume
          volumes:
            - name: flink-volume
              persistentVolumeClaim:
                claimName: flink-bg-laststate
      taskManager:
        resource:
          memory: 2G
          cpu: 1
      job:
        jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
        parallelism: 1
        entryClass: org.apache.flink.streaming.examples.statemachine.StateMachineExample
        args:
          - "--error-rate"
          - "0.15"
          - "--sleep"
          - "30"
        upgradeMode: last-state
      mode: native

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: flink-bg-laststate
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi

---
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  annotations:
    ingressclass.kubernetes.io/is-default-class: "true"
  labels:
    app.kubernetes.io/component: controller
  name: nginx
spec:
  controller: k8s.io/ingress-nginx
52 changes: 52 additions & 0 deletions e2e-tests/data/bluegreen-stateless.yaml
@@ -0,0 +1,52 @@
################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

apiVersion: flink.apache.org/v1beta1
kind: FlinkBlueGreenDeployment
metadata:
  name: basic-bg-stateless-example
spec:
  configuration:
    kubernetes.operator.bluegreen.deployment-deletion.delay: "2s"
  template:
    spec:
      image: flink:1.20
      flinkVersion: v1_20
      flinkConfiguration:
        rest.port: "8081"
        taskmanager.numberOfTaskSlots: "1"
      serviceAccount: flink
      jobManager:
        resource:
          memory: 1G
          cpu: 1
      taskManager:
        resource:
          memory: 2G
          cpu: 1
      job:
        jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
        parallelism: 1
        entryClass: org.apache.flink.streaming.examples.statemachine.StateMachineExample
        args:
          - "--error-rate"
          - "0.15"
          - "--sleep"
          - "30"
        upgradeMode: stateless
      mode: native
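
Reviewer sketch (not part of the diff): in this manifest `flinkConfiguration` sits under `spec.template.spec`, so a JSON merge patch that changes one of its keys has to mirror that nesting. A minimal way to build such a patch body is shown below. The function name `bg_flink_config_patch` is hypothetical, and the sketch assumes the key and value contain no double quotes or backslashes, since it does no JSON escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of utils.sh.
# Builds a JSON merge patch that sets a single flinkConfiguration entry
# under spec.template.spec of a FlinkBlueGreenDeployment.
# Assumption: key and value need no JSON escaping.
bg_flink_config_patch() {
  local key=$1 value=$2
  printf '{"spec":{"template":{"spec":{"flinkConfiguration":{"%s":"%s"}}}}}\n' \
    "$key" "$value"
}
```

The result can be passed to `kubectl patch flinkbgdep <name> --type merge --patch "$(bg_flink_config_patch taskmanager.numberOfTaskSlots 2)"`, which is the same patch the stateless e2e script applies with a literal string.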
83 changes: 83 additions & 0 deletions e2e-tests/test_bluegreen_laststate.sh
@@ -0,0 +1,83 @@
#!/usr/bin/env bash
################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

# This script tests the Flink Blue/Green Deployments support as follows:
# - Create a FlinkBlueGreenDeployment which automatically starts a "Blue" FlinkDeployment
# - Once this setup is stable, we trigger a transition which will create the "Green" FlinkDeployment
# - Once it's stable, verify the "Blue" FlinkDeployment is torn down
# - Perform additional validation(s) before exiting

SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
source "${SCRIPT_DIR}/utils.sh"

CLUSTER_ID="basic-bg-laststate-example"
BG_CLUSTER_ID=$CLUSTER_ID
BLUE_CLUSTER_ID=$CLUSTER_ID"-blue"
GREEN_CLUSTER_ID=$CLUSTER_ID"-green"

APPLICATION_YAML="${SCRIPT_DIR}/data/bluegreen-laststate.yaml"
APPLICATION_IDENTIFIER="flinkbgdep/$CLUSTER_ID"
BLUE_APPLICATION_IDENTIFIER="flinkdep/$BLUE_CLUSTER_ID"
GREEN_APPLICATION_IDENTIFIER="flinkdep/$GREEN_CLUSTER_ID"
TIMEOUT=300

#echo "BG_CLUSTER_ID " $BG_CLUSTER_ID
#echo "BLUE_CLUSTER_ID " $BLUE_CLUSTER_ID
#echo "APPLICATION_IDENTIFIER " $APPLICATION_IDENTIFIER
#echo "BLUE_APPLICATION_IDENTIFIER " $BLUE_APPLICATION_IDENTIFIER

retry_times 5 30 "kubectl apply -f $APPLICATION_YAML" || exit 1

sleep 1
wait_for_jobmanager_running $BLUE_CLUSTER_ID $TIMEOUT
wait_for_status $BLUE_APPLICATION_IDENTIFIER '.status.lifecycleState' STABLE ${TIMEOUT} || exit 1
wait_for_status $APPLICATION_IDENTIFIER '.status.jobStatus.state' RUNNING ${TIMEOUT} || exit 1
wait_for_status $APPLICATION_IDENTIFIER '.status.blueGreenState' ACTIVE_BLUE ${TIMEOUT} || exit 1

#blue_job_id=$(kubectl get -oyaml flinkdep/basic-bluegreen-example-blue | yq '.status.jobStatus.jobId')

#kubectl patch flinkbgdep ${BG_CLUSTER_ID} --type merge --patch '{"spec":{"template":{"spec":{"flinkConfiguration":{"rest.port":"8082","state.checkpoints.num-retained":"6"}}}}}'
kubectl patch flinkbgdep ${BG_CLUSTER_ID} --type merge --patch '{"spec":{"template":{"spec":{"flinkConfiguration":{"state.checkpoints.num-retained":"6"}}}}}'
echo "Resource patched, giving a chance for the savepoint to be taken..."
sleep 10

jm_pod_name=$(get_jm_pod_name $BLUE_CLUSTER_ID)
echo "Inspecting savepoint directory..."
kubectl exec "$jm_pod_name" -- bash -c "ls -lt /opt/flink/volume/flink-sp/"

wait_for_status $GREEN_APPLICATION_IDENTIFIER '.status.lifecycleState' STABLE ${TIMEOUT} || exit 1
kubectl wait --for=delete deployment --timeout=${TIMEOUT}s --selector="app=${BLUE_CLUSTER_ID}"
wait_for_status $APPLICATION_IDENTIFIER '.status.jobStatus.state' RUNNING ${TIMEOUT} || exit 1
wait_for_status $APPLICATION_IDENTIFIER '.status.blueGreenState' ACTIVE_GREEN ${TIMEOUT} || exit 1

green_initialSavepointPath=$(kubectl get -oyaml $GREEN_APPLICATION_IDENTIFIER | yq '.spec.job.initialSavepointPath')

echo "Deleting test B/G resources" $BG_CLUSTER_ID
kubectl delete flinkbluegreendeployments/$BG_CLUSTER_ID &
echo "Waiting for deployment to be deleted..."
kubectl wait --for=delete flinkbluegreendeployments/$BG_CLUSTER_ID

if [[ $green_initialSavepointPath == '/opt/flink/volume/flink-sp/savepoint-'* ]]; then
  echo 'Green deployment started from the expected initialSavepointPath:' $green_initialSavepointPath
else
  echo 'Unexpected initialSavepointPath:' $green_initialSavepointPath
  exit 1
fi

echo "Successfully ran the Flink Blue/Green Deployments test"
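
Reviewer sketch (not part of the diff): the script leans on `wait_for_status` from `utils.sh`, whose implementation is not shown in this change. The polling pattern it presumably follows can be sketched generically as below. `poll_until` and `POLL_INTERVAL` are hypothetical names, and this is not the `utils.sh` code.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a polling helper; NOT the utils.sh implementation.
# Usage: poll_until <expected> <timeout_seconds> <command> [args...]
# Runs the command repeatedly until its stdout equals <expected> or the
# timeout elapses. Returns 0 on a match and 1 on timeout.
poll_until() {
  local expected=$1 timeout=$2
  shift 2
  local deadline=$((SECONDS + timeout))
  local actual
  while true; do
    actual=$("$@" 2>/dev/null)
    if [[ "$actual" == "$expected" ]]; then
      return 0
    fi
    if (( SECONDS >= deadline )); then
      echo "Timed out waiting for '$expected' (last value: '$actual')" >&2
      return 1
    fi
    # POLL_INTERVAL is an assumed knob; it defaults to one second.
    sleep "${POLL_INTERVAL:-1}"
  done
}
```

Against a live cluster this could be called as `poll_until ACTIVE_GREEN 300 kubectl get flinkbgdep/basic-bg-laststate-example -o jsonpath='{.status.blueGreenState}'`. Taking the command as separate arguments avoids `eval`, at the cost of not supporting pipelines.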
62 changes: 62 additions & 0 deletions e2e-tests/test_bluegreen_stateless.sh
@@ -0,0 +1,62 @@
#!/usr/bin/env bash
################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

# This script tests the Flink Blue/Green Deployments support as follows:
# - Create a FlinkBlueGreenDeployment which automatically starts a "Blue" FlinkDeployment
# - Once this setup is stable, we trigger a transition which will create the "Green" FlinkDeployment
# - Once it's stable, verify the "Blue" FlinkDeployment is torn down
# - Perform additional validation(s) before exiting

SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
source "${SCRIPT_DIR}/utils.sh"

CLUSTER_ID="basic-bg-stateless-example"
BG_CLUSTER_ID=$CLUSTER_ID
BLUE_CLUSTER_ID=$CLUSTER_ID"-blue"
GREEN_CLUSTER_ID=$CLUSTER_ID"-green"

APPLICATION_YAML="${SCRIPT_DIR}/data/bluegreen-stateless.yaml"
APPLICATION_IDENTIFIER="flinkbgdep/$CLUSTER_ID"
BLUE_APPLICATION_IDENTIFIER="flinkdep/$BLUE_CLUSTER_ID"
GREEN_APPLICATION_IDENTIFIER="flinkdep/$GREEN_CLUSTER_ID"
TIMEOUT=300

retry_times 5 30 "kubectl apply -f $APPLICATION_YAML" || exit 1

sleep 1
wait_for_jobmanager_running $BLUE_CLUSTER_ID $TIMEOUT
wait_for_status $BLUE_APPLICATION_IDENTIFIER '.status.lifecycleState' STABLE ${TIMEOUT} || exit 1
wait_for_status $APPLICATION_IDENTIFIER '.status.jobStatus.state' RUNNING ${TIMEOUT} || exit 1
wait_for_status $APPLICATION_IDENTIFIER '.status.blueGreenState' ACTIVE_BLUE ${TIMEOUT} || exit 1

echo "PATCHING B/G deployment..."
#kubectl patch flinkbgdep ${BG_CLUSTER_ID} --type merge --patch '{"spec":{"template":{"spec":{"flinkConfiguration":{"rest.port":"8082","taskmanager.numberOfTaskSlots":"2"}}}}}'
kubectl patch flinkbgdep ${BG_CLUSTER_ID} --type merge --patch '{"spec":{"template":{"spec":{"flinkConfiguration":{"taskmanager.numberOfTaskSlots":"2"}}}}}'

wait_for_status $GREEN_APPLICATION_IDENTIFIER '.status.lifecycleState' STABLE ${TIMEOUT} || exit 1
kubectl wait --for=delete deployment --timeout=${TIMEOUT}s --selector="app=${BLUE_CLUSTER_ID}"
wait_for_status $APPLICATION_IDENTIFIER '.status.jobStatus.state' RUNNING ${TIMEOUT} || exit 1
wait_for_status $APPLICATION_IDENTIFIER '.status.blueGreenState' ACTIVE_GREEN ${TIMEOUT} || exit 1

echo "Deleting test B/G resources" $BG_CLUSTER_ID
kubectl delete flinkbluegreendeployments/$BG_CLUSTER_ID &
echo "Waiting for deployment to be deleted..."
kubectl wait --for=delete flinkbluegreendeployments/$BG_CLUSTER_ID

echo "Successfully ran the Flink Blue/Green Deployments test"
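
Reviewer sketch (not part of the diff): both scripts start with `retry_times 5 30 "kubectl apply ..."`, another `utils.sh` helper not shown here. Assuming the two numbers mean "up to 5 attempts, 30 seconds apart" (a guess, since the source does not say), a retry wrapper could look like this. The name `retry` is hypothetical, and unlike `retry_times` it takes the command as separate arguments rather than as one string.

```shell
#!/usr/bin/env bash
# Hypothetical retry wrapper; NOT the utils.sh retry_times implementation.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
# Runs the command until it succeeds or <attempts> runs have failed.
retry() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for (( i = 1; i <= attempts; i++ )); do
    if "$@"; then
      return 0
    fi
    if (( i < attempts )); then
      echo "Attempt $i/$attempts failed; retrying in ${delay}s" >&2
      sleep "$delay"
    fi
  done
  return 1
}
```

A call such as `retry 5 30 kubectl apply -f "$APPLICATION_YAML" || exit 1` would then mirror the first step of the test under that assumption.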