
Commit 3d70776

Queue disks for upsert when a transient error occurs (#3800)
Check every 5 seconds whether there are any queued disks and whether we have a ready key manager, meaning that this replica has a key share or is using the `HardcodedKeyRetriever`. If there are disks that need upserting, try to upsert them. This covers not just cold boot but also hot insert of new disks when the trust quorum may not be available. Currently the only transient error is a `KeyManager::Error`, which indicates that the trust quorum is unavailable because not enough sleds are online. Fixes #3789
1 parent 9bf3481 commit 3d70776
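
The retry behavior described in the commit message can be sketched roughly as follows. This is a minimal, standalone illustration and not the code from this commit: `Disk`, `UpsertError`, and `retry_queued_disks` are hypothetical names, and the `upsert` closure stands in for whatever the sled agent actually does to upsert a disk. The sketch assumes that a transient failure keeps the disk queued for the next pass, while any other failure drops it from the queue.

```rust
use std::collections::VecDeque;

/// Hypothetical error type for this sketch. Only `KeyManagerUnavailable`
/// is treated as transient (the trust quorum is not reachable yet).
#[derive(Debug, PartialEq)]
enum UpsertError {
    KeyManagerUnavailable,
    Permanent(String),
}

/// Hypothetical stand-in for a disk identity.
#[derive(Debug, Clone, PartialEq)]
struct Disk(String);

/// Make one pass over the queue and return the disks that were upserted.
///
/// Nothing is attempted unless the key manager is ready. Disks that fail
/// with a transient error go back on the queue for the next pass; disks
/// that fail permanently are dropped (an assumption of this sketch).
fn retry_queued_disks<F>(
    queue: &mut VecDeque<Disk>,
    key_manager_ready: bool,
    mut upsert: F,
) -> Vec<Disk>
where
    F: FnMut(&Disk) -> Result<(), UpsertError>,
{
    let mut upserted = Vec::new();
    if !key_manager_ready {
        return upserted;
    }
    // Visit only the disks that were queued when the pass started, so a
    // disk that is re-queued is not retried again within the same pass.
    for _ in 0..queue.len() {
        if let Some(disk) = queue.pop_front() {
            match upsert(&disk) {
                Ok(()) => upserted.push(disk),
                Err(UpsertError::KeyManagerUnavailable) => queue.push_back(disk),
                Err(UpsertError::Permanent(reason)) => {
                    eprintln!("dropping disk {:?}: {}", disk, reason);
                }
            }
        }
    }
    upserted
}
```

A caller would invoke `retry_queued_disks` from a periodic task, for example once per tick of a 5-second timer, passing the current key manager readiness on each call.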

2 files changed: +257 -106 lines changed

sled-agent/src/bootstrap/agent.rs

Lines changed: 2 additions & 2 deletions
@@ -500,8 +500,8 @@ impl Agent {
     addr: SocketAddrV6::new(ip, BOOTSTORE_PORT, 0, 0),
     time_per_tick: std::time::Duration::from_millis(250),
     learn_timeout: std::time::Duration::from_secs(5),
-    rack_init_timeout: std::time::Duration::from_secs(60),
-    rack_secret_request_timeout: std::time::Duration::from_secs(30),
+    rack_init_timeout: std::time::Duration::from_secs(300),
+    rack_secret_request_timeout: std::time::Duration::from_secs(5),
     fsm_state_ledger_paths: bootstore_fsm_state_paths(
         &storage_resources,
     )
