Skip to content

CA-399229: Assert no host pending mandatory guidance for pool.join #6007

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 24, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 34 additions & 0 deletions ocaml/xapi/xapi_pool.ml
Original file line number Diff line number Diff line change
Expand Up @@ -807,6 +807,37 @@ let pre_join_checks ~__context ~rpc ~session_id ~force =
(pool_joining_host_ca_certificates_conflict, !conflicting_names)
)
in
let assert_no_host_pending_mandatory_guidance () =
(* Assert that there is no host pending mandatory guidance on the joiner or
the remote pool coordinator.
*)
Repository_helpers.assert_no_host_pending_mandatory_guidance ~__context
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the check on the joiner be extended to the recommended and full as well? The toolstack on the joiner will be restarted after joining. If there is a pending Toolstack restart (even as a full guidance), it probably means that a newer toolstack (than the one on the coordinator) on the joiner will start after joining.
But usually this is a rare case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think only mandatory guidance is checked is because now RestartToolstack guidance only set as mandatory guidance.
@robhoes Is it possible that RestartToolstack guidance would be set as recommended or full guidance in updates so that we will need to cover all type of guidance here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see it this way: any guidance that must be followed immediately to avoid things breaking must be part of the "mandatory guidance". This includes breaking pool join. And yes, usually we make restarting the Toolstack mandatory if the Toolstack is updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually we make restarting the Toolstack mandatory if the Toolstack is updated.

Yeah, in this case, it would be fine to check the mandatory guidance only.

~host:(Helpers.get_localhost ~__context) ;
let remote_coordinator = get_master ~rpc ~session_id in
let remote_coordinator_pending_mandatory_guidances =
Client.Host.get_pending_guidances ~rpc ~session_id
~self:remote_coordinator
in
if remote_coordinator_pending_mandatory_guidances <> [] then (
error
"%s: %d mandatory guidances are pending for remote coordinator %s: [%s]"
__FUNCTION__
(List.length remote_coordinator_pending_mandatory_guidances)
(Ref.string_of remote_coordinator)
(remote_coordinator_pending_mandatory_guidances
|> List.map Updateinfo.Guidance.of_pending_guidance
|> List.map Updateinfo.Guidance.to_string
|> String.concat ";"
) ;
raise
Api_errors.(
Server_error
( host_pending_mandatory_guidances_not_empty
, [Ref.string_of remote_coordinator]
)
)
)
in
(* call pre-join asserts *)
assert_pool_size_unrestricted () ;
assert_management_interface_exists () ;
Expand All @@ -817,6 +848,9 @@ let pre_join_checks ~__context ~rpc ~session_id ~force =
assert_i_know_of_no_other_hosts () ;
assert_no_running_vms_on_me () ;
assert_no_vms_with_current_ops () ;
(* check first no host pending mandatory guidance then the hosts compatible,
api version and db schema *)
assert_no_host_pending_mandatory_guidance () ;
assert_hosts_compatible () ;
if not force then assert_hosts_homogeneous () ;
assert_no_shared_srs_on_me () ;
Expand Down
Loading