Rolling HDFS upgrade #571

Merged
merged 30 commits, Aug 28, 2024
Changes from 9 commits
a5e9547
Add upgrade mode with serialized deployments
nightkr Aug 2, 2024
fc6cc0d
Use deployedProductVersion to decide upgrade mode (but do not automat…
nightkr Aug 2, 2024
2eb38a8
Upgrade docs
nightkr Aug 2, 2024
38809e2
Remove dummy log message
nightkr Aug 2, 2024
a36de0f
Move upgrade readiness check into utils module
nightkr Aug 2, 2024
acffa82
Fix test build issue
nightkr Aug 5, 2024
98baaad
Regenerate CRDs
nightkr Aug 5, 2024
5a552d3
Docs
nightkr Aug 5, 2024
c1e13a2
s/terminal/shell/g
nightkr Aug 5, 2024
e1476a2
Update rust/operator-binary/src/hdfs_controller.rs
nightkr Aug 5, 2024
8af1db6
Update docs/modules/hdfs/pages/usage-guide/upgrading.adoc
nightkr Aug 6, 2024
44b5e59
Update docs/modules/hdfs/pages/usage-guide/upgrading.adoc
nightkr Aug 6, 2024
947931e
Update docs/modules/hdfs/pages/usage-guide/upgrading.adoc
nightkr Aug 7, 2024
5970585
Update docs/modules/hdfs/pages/usage-guide/upgrading.adoc
nightkr Aug 7, 2024
13129b5
Update docs/modules/hdfs/pages/usage-guide/upgrading.adoc
nightkr Aug 7, 2024
eb19010
Move upgrade_args to a separate variable
nightkr Aug 7, 2024
d5a092a
Merge branch 'feature/upgrade' of github.com:stackabletech/hdfs-opera…
nightkr Aug 7, 2024
f0df2b7
Upgrade mode -> compatibility mode
nightkr Aug 8, 2024
49cf9d9
Move rollout tracker into operator-rs
nightkr Aug 8, 2024
c582a3a
Update docs/modules/hdfs/pages/usage-guide/upgrading.adoc
nightkr Aug 8, 2024
b24c25f
Add note on downgrades
nightkr Aug 9, 2024
1e68f1d
Merge branch 'feature/upgrade' of github.com:stackabletech/hdfs-opera…
nightkr Aug 9, 2024
10e5220
Perform downgrades in order
nightkr Aug 9, 2024
808f926
Add note about status subresource
nightkr Aug 9, 2024
a9809ba
Update CRDs
nightkr Aug 9, 2024
0604aa6
s/upgrading_product_version/upgrade_target_product_version/g
nightkr Aug 12, 2024
c142421
Switch to main operator-rs
nightkr Aug 12, 2024
6ae8e0b
Update rust/crd/src/lib.rs
nightkr Aug 21, 2024
46eedee
Merge branch 'main' into feature/upgrade
nightkr Aug 26, 2024
2a25ff4
Add guardrail against trying to crossgrade in the middle of another u…
nightkr Aug 26, 2024
3 changes: 3 additions & 0 deletions deploy/helm/hdfs-operator/crds/crds.yaml
Original file line number Diff line number Diff line change
@@ -22033,6 +22033,9 @@ spec:
- type
type: object
type: array
deployedProductVersion:
nullable: true
type: string
type: object
required:
- spec
85 changes: 85 additions & 0 deletions docs/modules/hdfs/pages/usage-guide/upgrading.adoc
@@ -0,0 +1,85 @@
= Upgrading HDFS

IMPORTANT: HDFS upgrades are experimental, and details may change at any time.

Upgrading HDFS currently requires a manual process. This guide walks you through an example case: upgrading a cluster (from our xref:getting_started/index.adoc[Getting Started] guide) from HDFS 3.3.6 to 3.4.0.

== Preparing HDFS

HDFS must be configured to initiate the upgrade process. To do this, run the following commands in an HDFS superuser environment
(either a client configured with a superuser account, or from inside a NameNode pod):

// This could be automated by the operator, but dfsadmin does not have good machine-readable output.
// It *can* be queried over JMX, but we're not so lucky for finalization.

[source,shell]
----
$ hdfs dfsadmin -rollingUpgrade prepare

PREPARE rolling upgrade ...
Preparing for upgrade. Data is being saved for rollback.
Run "dfsadmin -rollingUpgrade query" to check the status
for proceeding with rolling upgrade
Block Pool ID: BP-841432641-10.244.0.29-1722612757853
Start Time: Fri Aug 02 15:49:12 GMT 2024 (=1722613752341)
Finalize Time: <NOT FINALIZED>

$ # Then run query until the HDFS is ready to proceed
$ hdfs dfsadmin -rollingUpgrade query

QUERY rolling upgrade ...
Preparing for upgrade. Data is being saved for rollback.
Run "dfsadmin -rollingUpgrade query" to check the status
for proceeding with rolling upgrade
Block Pool ID: BP-841432641-10.244.0.29-1722612757853
Start Time: Fri Aug 02 15:49:12 GMT 2024 (=1722613752341)
Finalize Time: <NOT FINALIZED>

$ # It should look like this once ready
$ hdfs dfsadmin -rollingUpgrade query

QUERY rolling upgrade ...
Proceed with rolling upgrade:
Block Pool ID: BP-841432641-10.244.0.29-1722612757853
Start Time: Fri Aug 02 15:49:12 GMT 2024 (=1722613752341)
Finalize Time: <NOT FINALIZED>
----
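Since `prepare` runs asynchronously, the readiness check above can be scripted. The sketch below is an illustration, not part of the operator: it assumes the literal marker line `Proceed with rolling upgrade:` (as shown in the example output above) only appears once HDFS is ready, and the 10-second poll interval is an arbitrary choice.

```shell
# Returns success once a captured `query` output contains the readiness marker.
check_ready() {
  printf '%s\n' "$1" | grep -q "Proceed with rolling upgrade"
}

# Poll `hdfs dfsadmin -rollingUpgrade query` until the cluster reports readiness.
wait_for_upgrade_ready() {
  until check_ready "$(hdfs dfsadmin -rollingUpgrade query)"; do
    sleep 10
  done
}
```

Run `wait_for_upgrade_ready` in the same superuser environment as the commands above; it returns once it is safe to proceed with the version bump.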

== Starting the upgrade

Once ready, the HdfsCluster can be updated with the new product version:

[source,shell]
----
$ kubectl patch hdfs/simple-hdfs --patch '{"spec": {"image": {"productVersion": "3.4.0"}}}' --type=merge
hdfscluster.hdfs.stackable.tech/simple-hdfs patched
----

Then wait until all pods are ready and running the new HDFS version.

NOTE: Services will be upgraded in order: JournalNodes, then NameNodes, then DataNodes.
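Before moving on to finalization, it can help to confirm that every pod really runs an image with the target version. This is a hedged sketch, not an operator feature; the label selector and jsonpath in the usage comment are assumptions you should adapt to your cluster.

```shell
# Succeeds only if every given image reference contains the target version string.
all_on_version() {
  version="$1"; shift
  for img in "$@"; do
    case "$img" in
      *"$version"*) ;;   # this image matches the target version
      *) return 1 ;;     # at least one pod still runs something else
    esac
  done
}

# Assumed usage against a live cluster:
#   all_on_version 3.4.0 $(kubectl get pods -l app.kubernetes.io/name=hdfs \
#       -o jsonpath='{.items[*].spec.containers[*].image}')
```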

== Finalizing the upgrade

Once all HDFS pods are running the new version, the HDFS upgrade can be finalized (from the HDFS superuser environment):

[source,shell]
----
$ hdfs dfsadmin -rollingUpgrade finalize

FINALIZE rolling upgrade ...
Rolling upgrade is finalized.
Block Pool ID: BP-841432641-10.244.0.29-1722612757853
Start Time: Fri Aug 02 15:49:12 GMT 2024 (=1722613752341)
Finalize Time: Fri Aug 02 15:58:39 GMT 2024 (=1722614319854)
----

// We can't safely automate this, because finalize is asynchronous and doesn't tell us whether all NameNodes have even received the request to finalize.

WARNING: Please ensure that all NameNodes are running and available before proceeding. NameNodes that have not finalized yet will crash on launch when taken out of upgrade mode.

Finally, take the operator and cluster out of upgrade mode by marking the HdfsCluster as upgraded to the new version:

[source,shell]
----
$ kubectl patch hdfs/simple-hdfs --subresource=status --patch '{"status": {"deployedProductVersion": "3.4.0"}}' --type=merge
hdfscluster.hdfs.stackable.tech/simple-hdfs patched
----

NOTE: The NameNodes will be restarted a final time, taking them out of upgrade mode.
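The decision the operator makes here mirrors the `is_upgrading` check added in this PR: the cluster counts as mid-upgrade while `status.deployedProductVersion` is set and differs from `spec.image.productVersion`. The sketch below is pure illustration of that decision logic; the `kubectl` invocations in the usage comment are assumptions based on the fields shown above.

```shell
# Cluster is upgrading iff a deployed version is recorded and it differs
# from the version requested in the spec (empty deployed version = fresh install).
is_upgrading() {
  spec_version="$1"
  deployed_version="$2"
  [ -n "$deployed_version" ] && [ "$deployed_version" != "$spec_version" ]
}

# Assumed usage:
#   spec=$(kubectl get hdfs/simple-hdfs -o jsonpath='{.spec.image.productVersion}')
#   deployed=$(kubectl get hdfs/simple-hdfs -o jsonpath='{.status.deployedProductVersion}')
#   is_upgrading "$spec" "$deployed" && echo "compatibility mode still active"
```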
1 change: 1 addition & 0 deletions docs/modules/hdfs/partials/nav.adoc
@@ -10,6 +10,7 @@
** xref:hdfs:usage-guide/logging-log-aggregation.adoc[]
** xref:hdfs:usage-guide/monitoring.adoc[]
** xref:hdfs:usage-guide/configuration-environment-overrides.adoc[]
** xref:hdfs:usage-guide/upgrading.adoc[]
** xref:hdfs:usage-guide/operations/index.adoc[]
*** xref:hdfs:usage-guide/operations/cluster-operations.adoc[]
*** xref:hdfs:usage-guide/operations/pod-placement.adoc[]
21 changes: 17 additions & 4 deletions rust/crd/src/lib.rs
@@ -41,7 +41,7 @@ use stackable_operator::{
status::condition::{ClusterCondition, HasStatusCondition},
time::Duration,
};
use strum::{Display, EnumIter, EnumString};
use strum::{Display, EnumIter, EnumString, IntoStaticStr};

use crate::{
affinity::get_affinity,
@@ -312,27 +312,29 @@ impl AnyNodeConfig {

#[derive(
Clone,
Copy,
Debug,
Deserialize,
Display,
EnumIter,
EnumString,
IntoStaticStr,
Eq,
Hash,
JsonSchema,
PartialEq,
Serialize,
)]
pub enum HdfsRole {
#[serde(rename = "journalnode")]
#[strum(serialize = "journalnode")]
JournalNode,
#[serde(rename = "namenode")]
#[strum(serialize = "namenode")]
NameNode,
#[serde(rename = "datanode")]
#[strum(serialize = "datanode")]
DataNode,
#[serde(rename = "journalnode")]
#[strum(serialize = "journalnode")]
JournalNode,
}

impl HdfsRole {
@@ -802,6 +804,15 @@ impl HdfsCluster {
Ok(result)
}

pub fn is_upgrading(&self) -> bool {
self.status
.as_ref()
.and_then(|status| status.deployed_product_version.as_deref())
.map_or(false, |deployed_version| {
deployed_version != self.spec.image.product_version()
})
}

pub fn authentication_config(&self) -> Option<&AuthenticationConfig> {
self.spec.cluster_config.authentication.as_ref()
}
@@ -1322,6 +1333,8 @@ impl Configuration for JournalNodeConfigFragment {
pub struct HdfsClusterStatus {
#[serde(default)]
pub conditions: Vec<ClusterCondition>,

pub deployed_product_version: Option<String>,
}

impl HasStatusCondition for HdfsCluster {
15 changes: 10 additions & 5 deletions rust/operator-binary/src/container.rs
@@ -212,7 +212,7 @@ impl ContainerConfig {
labels: &Labels,
) -> Result<(), Error> {
// HDFS main container
let main_container_config = Self::from(role.clone());
let main_container_config = Self::from(*role);
pb.add_volumes(main_container_config.volumes(merged_config, object_name, labels)?);
pb.add_container(main_container_config.main_container(
hdfs,
@@ -566,11 +566,16 @@ if [[ -d {LISTENER_VOLUME_DIR} ]]; then
export $(basename $i | tr a-z- A-Z_)_PORT="$(cat $i)"
done
fi
{hadoop_home}/bin/hdfs {role} &
{hadoop_home}/bin/hdfs {role} {upgrade_args} &
wait_for_termination $!
{create_vector_shutdown_file_command}
"#,
hadoop_home = Self::HADOOP_HOME,
upgrade_args = if hdfs.is_upgrading() && *role == HdfsRole::NameNode {
"-rollingUpgrade started"
} else {
""
},
remove_vector_shutdown_file_command =
remove_vector_shutdown_file_command(STACKABLE_LOG_DIR),
create_vector_shutdown_file_command =
@@ -1317,7 +1322,7 @@ impl From<HdfsRole> for ContainerConfig {
fn from(role: HdfsRole) -> Self {
match role {
HdfsRole::NameNode => Self::Hdfs {
role: role.clone(),
role,
container_name: role.to_string(),
volume_mounts: ContainerVolumeDirs::from(role),
ipc_port_name: SERVICE_PORT_NAME_RPC,
@@ -1327,7 +1332,7 @@
metrics_port: DEFAULT_NAME_NODE_METRICS_PORT,
},
HdfsRole::DataNode => Self::Hdfs {
role: role.clone(),
role,
container_name: role.to_string(),
volume_mounts: ContainerVolumeDirs::from(role),
ipc_port_name: SERVICE_PORT_NAME_IPC,
@@ -1337,7 +1342,7 @@
metrics_port: DEFAULT_DATA_NODE_METRICS_PORT,
},
HdfsRole::JournalNode => Self::Hdfs {
role: role.clone(),
role,
container_name: role.to_string(),
volume_mounts: ContainerVolumeDirs::from(role),
ipc_port_name: SERVICE_PORT_NAME_RPC,
74 changes: 55 additions & 19 deletions rust/operator-binary/src/hdfs_controller.rs
@@ -1,6 +1,5 @@
use std::{
collections::{BTreeMap, HashMap},
str::FromStr,
sync::Arc,
};

@@ -45,7 +44,7 @@ use stackable_operator::{
},
time::Duration,
};
use strum::{EnumDiscriminants, IntoStaticStr};
use strum::{EnumDiscriminants, IntoEnumIterator, IntoStaticStr};

use stackable_hdfs_crd::{
constants::*, AnyNodeConfig, HdfsCluster, HdfsClusterStatus, HdfsPodRef, HdfsRole,
@@ -63,6 +62,7 @@ use crate::{
},
product_logging::{extend_role_group_config_map, resolve_vector_aggregator_address},
security::{self, kerberos, opa::HdfsOpaConfig},
utils::statefulset::check_all_replicas_updated,
OPERATOR_NAME,
};

@@ -323,10 +323,15 @@ pub async fn reconcile_hdfs(hdfs: Arc<HdfsCluster>, ctx: Arc<Ctx>) -> HdfsOperat
let dfs_replication = hdfs.spec.cluster_config.dfs_replication;
let mut ss_cond_builder = StatefulSetConditionBuilder::default();

for (role_name, group_config) in validated_config.iter() {
let role: HdfsRole = HdfsRole::from_str(role_name).with_context(|_| InvalidRoleSnafu {
role: role_name.to_string(),
})?;
let mut deploy_done = true;

// Roles must be deployed in order during rolling upgrades
'roles: for role in HdfsRole::iter() {
let role_name: &str = role.into();
let Some(group_config) = validated_config.get(role_name) else {
tracing::debug!(?role, "role has no configuration, skipping");
continue;
};

if let Some(content) = build_invalid_replica_message(&hdfs, &role, dfs_replication) {
publish_event(
@@ -408,14 +413,26 @@ pub async fn reconcile_hdfs(hdfs: Arc<HdfsCluster>, ctx: Arc<Ctx>) -> HdfsOperat
name: rg_configmap_name,
})?;
let rg_statefulset_name = rg_statefulset.name_any();
ss_cond_builder.add(
cluster_resources
.add(client, rg_statefulset.clone())
.await
.with_context(|_| ApplyRoleGroupStatefulSetSnafu {
name: rg_statefulset_name,
})?,
);
let deployed_rg_statefulset = cluster_resources
.add(client, rg_statefulset.clone())
.await
.with_context(|_| ApplyRoleGroupStatefulSetSnafu {
name: rg_statefulset_name,
})?;
ss_cond_builder.add(deployed_rg_statefulset.clone());
if hdfs.is_upgrading() {
// When upgrading, ensure that each role is upgraded before moving on to the next as recommended by
// https://hadoop.apache.org/docs/r3.4.0/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html#Upgrading_Non-Federated_Clusters
if let Err(reason) = check_all_replicas_updated(&deployed_rg_statefulset) {
tracing::info!(
object = %ObjectRef::from_obj(&deployed_rg_statefulset),
reason = &reason as &dyn std::error::Error,
"rolegroup is still upgrading, waiting..."
);
deploy_done = false;
break 'roles;
}
}
}

let role_config = hdfs.role_config(&role);
@@ -459,12 +476,31 @@ pub async fn reconcile_hdfs(hdfs: Arc<HdfsCluster>, ctx: Arc<Ctx>) -> HdfsOperat
hdfs.as_ref(),
&[&ss_cond_builder, &cluster_operation_cond_builder],
),
// FIXME: We can't currently leave upgrade mode automatically, since we don't know when an upgrade is finalized
deployed_product_version: Some(
hdfs.status
.as_ref()
.and_then(|status| status.deployed_product_version.as_deref())
.unwrap_or(hdfs.spec.image.product_version())
.to_string(),
),
// deployed_product_version: if deploy_done {
// Some(hdfs.spec.image.product_version().to_string())
// } else {
// hdfs.status
// .as_ref()
// .and_then(|status| status.deployed_product_version.clone())
// },
};

cluster_resources
.delete_orphaned_resources(client)
.await
.context(DeleteOrphanedResourcesSnafu)?;
// During upgrades we do partial deployments, so we don't want to garbage collect after those
// since we *will* redeploy (or properly orphan) the remaining resources later.
if deploy_done {
cluster_resources
.delete_orphaned_resources(client)
.await
.context(DeleteOrphanedResourcesSnafu)?;
}
client
.apply_patch_status(OPERATOR_NAME, &*hdfs, &status)
.await
@@ -870,7 +906,7 @@ properties: []
let validated_config = validate_all_roles_and_groups_config(
"3.4.0",
&config,
&ProductConfigManager::from_str(product_config).unwrap(),
&product_config.parse::<ProductConfigManager>().unwrap(),
false,
false,
)
1 change: 1 addition & 0 deletions rust/operator-binary/src/main.rs
@@ -34,6 +34,7 @@ mod hdfs_controller;
mod operations;
mod product_logging;
mod security;
mod utils;

mod built_info {
include!(concat!(env!("OUT_DIR"), "/built.rs"));
1 change: 1 addition & 0 deletions rust/operator-binary/src/utils/mod.rs
@@ -0,0 +1 @@
pub mod statefulset;