Skip to content

xapi_vm_migrate: Avoid duplicate, overly-strict CBT check on VDIs #6405

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

last-genius
Copy link
Contributor

There is already a call to assert_can_migrate_vdis present in assert_can_migrate, which checks that none of the VDIs that are going to be moved have CBT enabled. There is no need to additionally check that none of the VDIs in general have CBT enabled.

Some clients, like XenOrchestra, will turn off CBT on VDIs and retry migration after getting the VDI_CBT_ENABLED error on live migration. Dropping this overly strict check allows not stripping CBT when VDI will not be moved (when it's on a shared SR).
In addition, during rolling pool upgrades, disabling CBT is not allowed, hence the live migration operation wouldn't be able to continue. Avoiding the strict check fixes that as well.

Closes #6400

@last-genius
Copy link
Contributor Author

Tested manually, live migrated a VM with a CBT-enabled VDI on a shared SR a few times.

Copy link
Member

@psafont psafont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As explained in the original issue, I don't see a reason to have this strict check.

@Vincent-lau
Copy link
Contributor

Vincent-lau commented Apr 4, 2025

The function assert_can_migrate_vdis says

(** Check that none of the VDIs that are mapped to a different SR have CBT
or encryption enabled. This function must be called with the complete
[vdi_map], which contains all the VDIs of the VM.
[check_vdi_map] should be called before this function to verify that this
is the case. *)

which means it's going to check all the VDIs of a VM to see if they have cbt. And the List.iter you removed checks all the VDIs of a VM as well, through the vm_vdis list. So in what case will there be a VDI in vm_vdis but no in vdi_map? In other words, why is the List.iter check you removed a stricter check than assert_can_migrate_vdis?

@last-genius
Copy link
Contributor Author

last-genius commented Apr 4, 2025

which means it's going to check all the VDIs of a VM to see if they have cbt. And the List.iter you removed checks all the VDIs of a VM as well, through the vm_vdis list. So in what case will there be a VDI in vm_vdis but no in vdi_map? In other words, why is the List.iter check you removed a stricter check than assert_can_migrate_vdis?

the key part is "that are mapped to a different SR". the List.iter will check all VDIs. the assert will only check those VDIs that will have to move. I can amend the commit message to make it clearer, perhaps?

Copy link
Contributor

@Vincent-lau Vincent-lau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, the other test worth doing is to migrate a VM with multiple VDIs from local SRs, some cbt-enabled, and some cbt-disabled and check that they are correctly blocked, with this patch

@lindig
Copy link
Contributor

lindig commented Apr 4, 2025

Just in case we are coming back to this in the future: how about leaving a comment that this check was removed? Often we re-discover reasons why some code was present.

There is already a call to `assert_can_migrate_vdis` present in
`assert_can_migrate`, which checks that none of the VDIs that *are going to be
moved* have CBT enabled. There is no need to additionally check that none of the
VDIs *in general* have CBT enabled.

Some clients, like XenOrchestra, will turn off CBT on VDIs and retry migration
after getting the `VDI_CBT_ENABLED` error on live migration. Dropping this
overly strict check allows not stripping CBT when VDI will not be moved (when
it's on a shared SR).
In addition, during rolling pool upgrades, disabling CBT is not allowed, hence
the live migration operation wouldn't be able to continue. Avoiding the strict
check fixes that as well.

Signed-off-by: Andrii Sultanov <andriy.sultanov@vates.tech>
@last-genius last-genius force-pushed the private/asv/cbt-duplicate-check-migration branch from 8e2e322 to bedf269 Compare April 4, 2025 18:27
@last-genius
Copy link
Contributor Author

LGTM, the other test worth doing is to migrate a VM with multiple VDIs from local SRs, some cbt-enabled, and some cbt-disabled and check that they are correctly blocked, with this patch

Verified this manually, migration with CBT-enabled VDIs that are going to be moved is still blocked.

@last-genius
Copy link
Contributor Author

Just in case we are coming back to this in the future: how about leaving a comment that this check was removed? Often we re-discover reasons why some code was present.

Added a short comment

@last-genius last-genius added this pull request to the merge queue Apr 7, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2025
@last-genius last-genius added this pull request to the merge queue Apr 7, 2025
github-merge-queue bot pushed a commit that referenced this pull request Apr 7, 2025
)

There is already a call to `assert_can_migrate_vdis` present in
`assert_can_migrate`, which checks that none of the VDIs that *are going
to be moved* have CBT enabled. There is no need to additionally check
that none of the VDIs *in general* have CBT enabled.

Some clients, like XenOrchestra, will turn off CBT on VDIs and retry
migration after getting the `VDI_CBT_ENABLED` error on live migration.
Dropping this overly strict check allows not stripping CBT when VDI will
not be moved (when it's on a shared SR).
In addition, during rolling pool upgrades, disabling CBT is not allowed,
hence the live migration operation wouldn't be able to continue.
Avoiding the strict check fixes that as well.

Closes #6400
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2025
@psafont psafont added this pull request to the merge queue Apr 7, 2025
github-merge-queue bot pushed a commit that referenced this pull request Apr 7, 2025
)

There is already a call to `assert_can_migrate_vdis` present in
`assert_can_migrate`, which checks that none of the VDIs that *are going
to be moved* have CBT enabled. There is no need to additionally check
that none of the VDIs *in general* have CBT enabled.

Some clients, like XenOrchestra, will turn off CBT on VDIs and retry
migration after getting the `VDI_CBT_ENABLED` error on live migration.
Dropping this overly strict check allows not stripping CBT when VDI will
not be moved (when it's on a shared SR).
In addition, during rolling pool upgrades, disabling CBT is not allowed,
hence the live migration operation wouldn't be able to continue.
Avoiding the strict check fixes that as well.

Closes #6400
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2025
@psafont psafont added this pull request to the merge queue Apr 7, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2025
@psafont psafont added this pull request to the merge queue Apr 7, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2025
@psafont psafont added this pull request to the merge queue Apr 8, 2025
Merged via the queue into xapi-project:master with commit 4b72ad9 Apr 8, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Duplicate strict CBT check on a VDI during migrate
5 participants