
Commit 5c4b49b

Merge pull request #8638 from romayalon/romy-online-upgrade-improvements
NC | Online upgrade improvements
2 parents 8ec7dd5 + d8a032a commit 5c4b49b

5 files changed: +185 -17 lines changed

docs/NooBaaNonContainerized/Upgrade.md

Lines changed: 163 additions & 5 deletions
@@ -3,9 +3,9 @@
 1. [Introduction](#introduction)
 2. [General Information](#general-information)
 3. [Download Upstream RPM](#download-upstream-rpm)
-4. [Offline Upgrade](#offline-upgrade)
+4. [Offline Upgrade (Version < 5.18.0)](#offline-upgrade-version--5180)
     1. [Offline Upgrade steps](#offline-upgrade-steps)
-5. [Online Upgrade](#online-upgrade)
+5. [Online Upgrade (Version >= 5.18.0)](#online-upgrade-version--5180)
 
 
 ## Introduction
@@ -29,7 +29,7 @@ This document provides step-by-step instructions to help you successfully upgrad
 
 For NooBaa upstream (open source code) RPM download instructions, See [NooBaa Non Containerized Getting Started](./GettingStarted.md).
 
-## Offline Upgrade
+## Offline Upgrade (Version < 5.18.0)
 The currently available upgrade process of NooBaa Non Containerized is an offline upgrade. Offline upgrade means that NooBaa service must be stopped during the upgrade and that NooBaa endpoints won't be handling S3 requests at the time of the upgrade.
 
 ### Offline Upgrade steps
@@ -69,5 +69,163 @@ The currently available upgrade process of NooBaa Non Containerized is an offlin
 cat /etc/noobaa.conf.d/system.json
 {"hostname":{"current_version":"5.17.0","upgrade_history":{"successful_upgrades":[{"timestamp":1719299738760,"completed_scripts":[],"from_version":"5.15.4","to_version":"5.17.0"}]}}}
 ```
-## Online Upgrade
-The process of Online Upgrade of Non Containerized NooBaa is not supported yet.
+## Online Upgrade (Version >= 5.18.0)
+
+### Online Upgrade Goals
+**1. Minimal downtime -** Ensure minimal downtime for each node.
+
+**2. Incremental changes -** Split the upgrade into small chunks, for example, upgrade the nodes one by one. Each node is upgraded in its turn while the other nodes remain available for handling S3 requests.
+
+**3. Rollback capability -** A mechanism to revert to the previous version in case something goes wrong during the upgrade.
+
+**4. Schema backward compatibility -** Changes to the account/bucket/config schema must be backward compatible to allow a seamless transition to the new version.
+
+
+### Online Upgrade Algorithm
+
+1. Initiate config directory backup (#1).
+2. Iterate nodes one by one -
+    * Stop the NooBaa service (or suspend the node in CES).
+    * Upgrade the RPM on the node.
+    * Restart the NooBaa service on the node.
+3. Wait for all hosts to finish the RPM upgrade (source code upgrade).
+4. Initiate config directory backup (#2).
+5. Complete the upgrade by upgrading the config directory with the noobaa-cli upgrade command (point of no return).
+
+Online Upgrade Algorithm commands examples -
+1. Config directory backup -
+    1. CES - `mms3 config backup /path/to/backup/location`
+    2. Non CES - `cp -R /etc/noobaa.conf.d/ /path/to/backup/location`
+2. Stop NooBaa service - `systemctl stop noobaa`
+3. RPM upgrade on a specific node - `rpm -Uvh /path/to/new_noobaa_rpm_version.rpm`
+4. Restart NooBaa service - `systemctl restart noobaa`
+5. Config directory upgrade - `noobaa-cli upgrade start --expected_version=5.18.0 --expected_hosts=hostname1,hostname2,hostname3`
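
Reviewer note: the ordering in the algorithm above can be sketched as a small Node.js helper that only builds the ordered list of commands and does not execute anything. The function name `build_online_upgrade_plan`, its parameters, the `backup_1`/`backup_2` sub-directory names, and the choice to run the backups and the final command from the first host are all assumptions made for illustration; the commands themselves are the non-CES examples from the list above.

```javascript
'use strict';

// Hypothetical helper (not part of NooBaa): returns the ordered steps of an
// online upgrade for a non-CES cluster. Nothing is executed here.
function build_online_upgrade_plan({ hosts, rpm_path, backup_dir, expected_version }) {
    if (!Array.isArray(hosts) || hosts.length === 0) {
        throw new Error('hosts must be a non-empty array');
    }
    const first_host = hosts[0]; // assumption: any single host may run the cluster-wide steps
    const plan = [];

    // 1. config directory backup (#1), taken before any node is touched
    plan.push({ host: first_host, command: `cp -R /etc/noobaa.conf.d/ ${backup_dir}/backup_1` });

    // 2. upgrade the nodes one by one, so the other nodes keep serving S3 requests
    for (const host of hosts) {
        plan.push({ host, command: 'systemctl stop noobaa' });
        plan.push({ host, command: `rpm -Uvh ${rpm_path}` });
        plan.push({ host, command: 'systemctl restart noobaa' });
    }

    // 3. waiting for all hosts has no command of its own; it is a checkpoint
    //    that must be reached before the steps below are run

    // 4. config directory backup (#2), taken after all RPM upgrades finished
    plan.push({ host: first_host, command: `cp -R /etc/noobaa.conf.d/ ${backup_dir}/backup_2` });

    // 5. config directory upgrade (point of no return)
    plan.push({
        host: first_host,
        command: `noobaa-cli upgrade start --expected_version=${expected_version} --expected_hosts=${hosts.join(',')}`,
    });
    return plan;
}

const plan = build_online_upgrade_plan({
    hosts: ['hostname1', 'hostname2', 'hostname3'],
    rpm_path: '/path/to/new_noobaa_rpm_version.rpm',
    backup_dir: '/path/to/backup/location',
    expected_version: '5.18.0',
});
for (const step of plan) console.log(`${step.host}: ${step.command}`);
```

Keeping the plan as data makes the point of no return explicit: everything before the last entry can still be undone by restoring a backup and reinstalling the previous RPM.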
+
+### Additional Upgrade Properties of `system.json` -
+
+1. New per host property -
+    - config_dir_version
+
+2. New config directory information -
+    - config_directory
+        - config_dir_version
+        - phase
+        - upgrade_package_version
+        - in_progress_upgrade - (during the upgrade)
+            - timestamp
+            - completed_scripts
+            - running_host
+            - config_dir_from_version
+            - config_dir_to_version
+            - package_from_version
+            - package_to_version
+        - upgrade_history
+            - last_failure (if last upgrade failed)
+            - successful_upgrades
+
+#### system.json new information examples -
+1. During Upgrade - `cat /etc/noobaa.conf.d/system.json | jq .`
+```json
+{
+    "my_host1":{
+        "current_version":"5.18.0",
+        "config_dir_version": "1.0.0",
+        "upgrade_history":{
+            "successful_upgrades":[{
+                "timestamp":1730890665481,
+                "from_version":"5.17.1",
+                "to_version":"5.18.0"
+            }]
+        }
+    },
+    "config_directory":{
+        "phase":"CONFIG_DIR_LOCKED", // <- config dir is locked during an upgrade
+        "config_dir_version":"0.0.0", // <- config_dir_version is still the old config_dir_version
+        "upgrade_package_version":"5.17.1", // <- upgrade_package_version is still the old upgrade_package_version
+        "in_progress_upgrade":[{ // <- in_progress_upgrade property during the upgrade
+            "timestamp":1730890691016,
+            "completed_scripts": [],
+            "running_host":"my_host1",
+            "config_dir_from_version":"0.0.0",
+            "config_dir_to_version":"1.0.0",
+            "package_from_version":"5.17.1",
+            "package_to_version":"5.18.0"
+        }]
+    }
+}
+```
+
+2. After a successful upgrade - `cat /etc/noobaa.conf.d/system.json | jq .`
+```json
+{
+    "my_host1":{
+        "current_version":"5.18.0",
+        "config_dir_version": "1.0.0",
+        "upgrade_history":{
+            "successful_upgrades":[{
+                "timestamp":1730890665481,
+                "from_version":"5.17.1",
+                "to_version":"5.18.0"
+            }]
+        }
+    },
+    "config_directory":{
+        "phase":"CONFIG_DIR_UNLOCKED", // <- after a successful upgrade, config dir is unlocked
+        "config_dir_version":"1.0.0", // <- config_dir_version is the new config_dir_version
+        "upgrade_package_version":"5.18.0", // <- upgrade_package_version is the new upgrade_package_version
+        "upgrade_history":{ // <- a new item in the successful upgrades array was added
+            "successful_upgrades":[{
+                "timestamp":1730890691016,
+                "completed_scripts":
+                    ["/usr/local/noobaa-core/src/upgrade/nc_upgrade_scripts/1.0.0/config_dir_restructure.js"],
+                "running_host":"my_host1",
+                "config_dir_from_version":"0.0.0",
+                "config_dir_to_version":"1.0.0",
+                "package_from_version":"5.17.1",
+                "package_to_version":"5.18.0"
+            }]
+        }
+    }
+}
+```
+
+3. After a failed upgrade - `cat /etc/noobaa.conf.d/system.json | jq .`
+```json
+{
+    "my_host1":{
+        "current_version":"5.18.0",
+        "config_dir_version": "1.0.0",
+        "upgrade_history":{
+            "successful_upgrades":[{
+                "timestamp":1730890665481,
+                "from_version":"5.17.1",
+                "to_version":"5.18.0"
+            }]
+        }
+    },
+    "config_directory":{
+        "phase":"CONFIG_DIR_LOCKED", // <- after a failed upgrade, config dir is still locked
+        "config_dir_version":"0.0.0", // <- config_dir_version is still the old config_dir_version
+        "upgrade_package_version":"5.17.1", // <- upgrade_package_version is still the old upgrade_package_version
+        "upgrade_history":{
+            "successful_upgrades": [], // <- successful_upgrades array is empty/doesn't contain the failed upgrade
+            "last_failure":{ // <- a last_failure property is set in upgrade history
+                "timestamp":1730890676741,
+                "completed_scripts":[
+                    "/usr/local/noobaa-core/src/upgrade/nc_upgrade_scripts/1.0.0/config_dir_restructure.js"],
+                "running_host":"my_host1",
+                "config_dir_from_version":"0.0.0",
+                "config_dir_to_version":"1.0.0",
+                "package_from_version":"5.17.1",
+                "package_to_version":"5.18.0",
+                "error": "Error: _run_nc_upgrade_scripts: nc upgrade manager failed!!!, Error: this is a mock error\n at NCUpgradeManager._run_nc_upgrade_scripts (/usr/local/noobaa-core/src/upgrade/nc_upgrade_manager.js:258:19)\n at async NCUpgradeManager.upgrade_config_dir (/usr/local/noobaa-core/src/upgrade/nc_upgrade_manager.js:119:13)\n at async start_config_dir_upgrade (/usr/local/noobaa-core/src/manage_nsfs/upgrade.js:52:29)\n at async Object.manage_upgrade_operations (/usr/local/noobaa-core/src/manage_nsfs/upgrade.js:22:13)\n at async main (/usr/local/noobaa-core/src/cmd/manage_nsfs.js:73:13)"
+            }
+        }
+    }
+}
+```
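
Reviewer note: the three `system.json` states above differ only in a few fields of `config_directory`, which the following minimal sketch reads back. `summarize_config_dir_status` is a hypothetical helper, not the NooBaa Health CLI or `noobaa-cli`; it takes an already-parsed object, since the `// <-` annotations in the examples are documentation only and would not be accepted by `JSON.parse`. The sample object is a trimmed copy of the failed-upgrade example.

```javascript
'use strict';

// Hypothetical helper (not a NooBaa API): summarizes the config_directory
// section of an already-parsed system.json object.
function summarize_config_dir_status(system_data) {
    const config_directory = system_data.config_directory || {};
    const history = config_directory.upgrade_history || {};
    const successful = history.successful_upgrades || [];
    return {
        phase: config_directory.phase,
        config_dir_version: config_directory.config_dir_version,
        upgrade_package_version: config_directory.upgrade_package_version,
        // in_progress_upgrade is only present while an upgrade is running
        upgrade_in_progress: config_directory.in_progress_upgrade !== undefined,
        // new entries are prepended to successful_upgrades, so index 0 is the latest
        last_successful_upgrade: successful[0],
        last_failure_error: history.last_failure && history.last_failure.error,
    };
}

// Trimmed copy of the "after a failed upgrade" example.
const failed_example = {
    my_host1: { current_version: '5.18.0', config_dir_version: '1.0.0' },
    config_directory: {
        phase: 'CONFIG_DIR_LOCKED',
        config_dir_version: '0.0.0',
        upgrade_package_version: '5.17.1',
        upgrade_history: {
            successful_upgrades: [],
            last_failure: { timestamp: 1730890676741, error: 'Error: this is a mock error' },
        },
    },
};

const summary = summarize_config_dir_status(failed_example);
console.log(summary);
```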
+
+### Upgrade Helpers
+1. NooBaa Health CLI - reports the config directory status, upgrade failures, and hosts that are blocked from config directory updates.
+2. NooBaa CLI upgrade status - prints the upgrade status based on the information written in system.json.
+
src/test/unit_tests/jest_tests/test_nc_upgrade_manager.test.js

Lines changed: 1 addition & 1 deletion
@@ -569,7 +569,7 @@ describe('nc upgrade manager - upgrade config directory', () => {
             config_dir_version: this_upgrade.config_dir_to_version,
             upgrade_package_version: this_upgrade.package_to_version,
             upgrade_history: {
-                last_failure: system_data.config_directory.upgrade_history.last_failure,
+                // last_failure should be removed after a successful upgrade
                 successful_upgrades: [this_upgrade, ...system_data.config_directory.upgrade_history.successful_upgrades]
             }
         }

src/upgrade/nc_upgrade_manager.js

Lines changed: 4 additions & 2 deletions
@@ -251,7 +251,7 @@ class NCUpgradeManager {
      */
     async _run_nc_upgrade_scripts(this_upgrade) {
         try {
-            await run_upgrade_scripts(this_upgrade, this.upgrade_scripts_dir, { dbg });
+            await run_upgrade_scripts(this_upgrade, this.upgrade_scripts_dir, { dbg, from_version: this_upgrade.package_from_version });
         } catch (err) {
             const upgrade_failed_msg = `_run_nc_upgrade_scripts: nc upgrade manager failed!!!, ${err}`;
             dbg.error(upgrade_failed_msg);
@@ -265,6 +265,7 @@ class NCUpgradeManager {
      * 2. config_dir_version is the new version
      * 3. upgrade_package_version is the new source code version
      * 4. add the finished upgrade to the successful_upgrades array
+     * 5. last_failure is removed after a successful upgrade
      * @param {Object} system_data
      * @param {Object} this_upgrade
      * @returns {Promise<Void>}
@@ -279,7 +280,8 @@ class NCUpgradeManager {
             upgrade_package_version: this_upgrade.package_to_version,
             upgrade_history: {
                 ...upgrade_history,
-                successful_upgrades: [this_upgrade, ...successful_upgrades]
+                successful_upgrades: [this_upgrade, ...successful_upgrades],
+                last_failure: undefined
             }
         };
         const updated_system_data = { ...system_data, config_directory: updated_config_directory };
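
Reviewer note: setting `last_failure: undefined` after the spread does not delete the key from the in-memory object, but the key disappears once the object is serialized, because `JSON.stringify` omits properties whose value is `undefined`. The sketch below shows this with stand-in data; it assumes `system.json` is written through `JSON.stringify`, which this diff does not show.

```javascript
'use strict';

// Stand-in data shaped like the upgrade_history in the examples of Upgrade.md.
const upgrade_history = {
    successful_upgrades: [],
    last_failure: { error: 'Error: this is a mock error' },
};
const this_upgrade = { config_dir_from_version: '0.0.0', config_dir_to_version: '1.0.0' };
const { successful_upgrades } = upgrade_history;

// Same shape as the updated code: spread first, then override.
const updated_upgrade_history = {
    ...upgrade_history,
    successful_upgrades: [this_upgrade, ...successful_upgrades],
    last_failure: undefined,
};

// The key still exists in memory, with the value undefined ...
console.log('last_failure' in updated_upgrade_history); // true

// ... but it is dropped on serialization.
const serialized = JSON.stringify(updated_upgrade_history);
console.log(serialized);
// {"successful_upgrades":[{"config_dir_from_version":"0.0.0","config_dir_to_version":"1.0.0"}]}
```

This is also why the expected object in the unit test can simply omit `last_failure`: Jest's `toEqual` ignores properties whose value is `undefined`.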

src/upgrade/nc_upgrade_scripts/1.0.0/config_dir_restructure.js

Lines changed: 10 additions & 8 deletions
@@ -29,9 +29,9 @@ const nb_native = require('../../../util/nb_native');
  * 2. creation of accounts_by_name/ directory
  * 3. Upgrade config files of all accounts under accounts/ (old directory)
  * 4. delete accounts/ directory
- * @param {*} dbg
+ * @param {{dbg: *, from_version: String}} params
  */
-async function run({ dbg }) {
+async function run({ dbg, from_version }) {
     try {
         const config_fs = new ConfigFS(config.NSFS_NC_CONF_DIR, config.NSFS_NC_CONFIG_DIR_BACKEND);
         const fs_context = config_fs.fs_context;
@@ -40,10 +40,10 @@ async function run({ dbg }) {
         await config_fs.create_dir_if_missing(config_fs.accounts_by_name_dir_path);
 
         const old_account_names = await config_fs.list_old_accounts();
-        const failed_accounts = await upgrade_accounts_config_files(config_fs, old_account_names, dbg);
+        const failed_accounts = await upgrade_accounts_config_files(config_fs, old_account_names, from_version, dbg);
 
         if (failed_accounts.length > 0) throw new Error('NC upgrade process failed, failed_accounts array length is bigger than 0' + util.inspect(failed_accounts));
-        await move_old_accounts_dir(fs_context, config_fs, old_account_names, dbg);
+        await move_old_accounts_dir(fs_context, config_fs, old_account_names, from_version, dbg);
     } catch (err) {
         dbg.error('NC upgrade process failed due to - ', err);
         throw err;
@@ -56,13 +56,14 @@ async function run({ dbg }) {
  * 2. upgrade account config file with 3 retries
  * @param {import('../../../sdk/config_fs').ConfigFS} config_fs
  * @param {String[]} old_account_names
+ * @param {String} from_version
  * @param {*} dbg
  * @returns {Promise<Object[]>}
  */
-async function upgrade_accounts_config_files(config_fs, old_account_names, dbg) {
+async function upgrade_accounts_config_files(config_fs, old_account_names, from_version, dbg) {
     const failed_accounts = [];
 
-    const backup_access_keys_path = path.join(config_fs.config_root, '.backup_access_keys_dir/');
+    const backup_access_keys_path = path.join(config_fs.config_root, `.backup_access_keys_dir_${from_version}/`);
     await config_fs.create_dir_if_missing(backup_access_keys_path);
 
     for (const account_name of old_account_names) {
@@ -250,12 +251,13 @@ async function create_account_access_keys_index_if_missing(config_fs, account_up
  * @param {nb.NativeFSContext} fs_context
  * @param {import('../../../sdk/config_fs').ConfigFS} config_fs
  * @param {String[]} old_account_names
+ * @param {String} from_version
  * @param {*} dbg
  * @returns {Promise<Void>}
  */
-async function move_old_accounts_dir(fs_context, config_fs, old_account_names, dbg) {
+async function move_old_accounts_dir(fs_context, config_fs, old_account_names, from_version, dbg) {
     const old_account_tmp_dir_path = path.join(config_fs.old_accounts_dir_path, native_fs_utils.get_config_files_tmpdir());
-    const hidden_old_accounts_path = path.join(config_fs.config_root, '.backup_accounts_dir/');
+    const hidden_old_accounts_path = path.join(config_fs.config_root, `.backup_accounts_dir_${from_version}/`);
     try {
         await nb_native().fs.mkdir(fs_context, hidden_old_accounts_path);
     } catch (err) {
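
Reviewer note: the backup directory names are now suffixed with the version being upgraded from, so backups from different upgrades no longer share one directory. The sketch below mirrors that path construction in isolation. `backup_accounts_dir_path` is a hypothetical helper, and `path.posix.join` is used instead of the `path.join` in the diff only to keep the output platform independent. It also shows what the template literal yields if `from_version` is not passed, which is possible as far as the types go because `from_version` is declared optional in the `run_upgrade_scripts` options type.

```javascript
'use strict';
const path = require('path');

// Hypothetical helper mirroring the path construction in move_old_accounts_dir().
function backup_accounts_dir_path(config_root, from_version) {
    return path.posix.join(config_root, `.backup_accounts_dir_${from_version}/`);
}

const versioned_path = backup_accounts_dir_path('/etc/noobaa.conf.d', '5.17.1');
console.log(versioned_path);
// /etc/noobaa.conf.d/.backup_accounts_dir_5.17.1/

// A missing from_version is interpolated as the string "undefined".
const missing_version_path = backup_accounts_dir_path('/etc/noobaa.conf.d', undefined);
console.log(missing_version_path);
// /etc/noobaa.conf.d/.backup_accounts_dir_undefined/
```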

src/upgrade/upgrade_utils.js

Lines changed: 7 additions & 1 deletion
@@ -110,7 +110,13 @@ async function load_required_scripts(server_version, container_version, upgrade_
  *
  * @param {Object} this_upgrade
  * @param {string} upgrade_scripts_dir
- * @param {Object} options
+ * @param {{
+ *      dbg?: *,
+ *      db_client?: import('../util/db_client'),
+ *      system_store?: import('../server/system_services/system_store').SystemStore,
+ *      system_server?: import('../server/system_services/system_server'),
+ *      from_version?: String
+ * }} options
  */
 async function run_upgrade_scripts(this_upgrade, upgrade_scripts_dir, options) {
     const from_version = this_upgrade.from_version || this_upgrade.config_dir_from_version;
