CA-399757: Add CAS style check for SR scan #6113

Merged

merged 1 commit into xapi-project:master from private/shul2/scan-cas on Dec 3, 2024

Conversation

@Vincent-lau Vincent-lau (Contributor) commented Nov 15, 2024

This is to reduce the chance of a race between SR.scan and VDI.db_introduce. More details are in the comment; this won't fix the problem entirely, but it makes it less likely to happen...

I have seen a few instances of this happening in SXM tests now, so we need to prioritise fixing this.

I am trying to run some tests on this, but it's hard to know how effective it is, as storage migration tends to require lots of resources and does not get run very often. Plus, this issue is not very reproducible.
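
For context, the shape of the mitigation is roughly the following. This is a minimal OCaml sketch with hypothetical names, not the actual change in the diff:

(* Hedged sketch of a CAS-style guard: snapshot the DB view, run the
   slow scan, re-read, and only commit if nothing changed underneath us;
   otherwise retry a bounded number of times. Every name here is
   illustrative, not a real xapi function. *)
let cas_scan ~read ~scan ~commit ~retries =
  let rec go n =
    let before = read () in
    let result = scan () in
    let after = read () in
    if after <> before && n > 0 then go (n - 1) else commit after result
  in
  go retries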

@Vincent-lau Vincent-lau force-pushed the private/shul2/scan-cas branch from a7928b1 to 95322fe on November 15, 2024 at 12:08
@psafont psafont (Member) left a comment

Can there be a situation where the two lists don't converge?

@Vincent-lau Vincent-lau (Contributor, Author) commented Nov 15, 2024

> Can there be a situation where the two lists don't converge?

Well, then someone would have to be constantly changing it under our feet. I think the only source that could do that would be SMAPIv1 calling back into xapi, which shouldn't happen all the time...

@contificate contificate (Contributor) left a comment

See commentary above.

I think the lack of an else begin ... end makes the following a bug:

if db_vdis_after <> db_vdis_before then
  scan_rec () ;
update_vdis ~__context ~sr db_vdis_after vs ;

If you do recursively call scan_rec and it converges, you will call update_vdis, then return to the caller, which calls update_vdis again with its (now stale) version of db_vdis_after. This is probably not the desired semantics.

I think this should be a loop guard with a max retry bound, not recursion.
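
For illustration, the suggested shape makes the retry and the update mutually exclusive and bounds the retries. This is a sketch reusing the names from the snippet above; threading a limit argument through scan_rec is an assumption here:

(* Retry only while changes are still observed and retries remain;
   otherwise fall through to a single update. *)
if db_vdis_after <> db_vdis_before && limit > 0 then
  scan_rec (limit - 1)
else
  update_vdis ~__context ~sr db_vdis_after vs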

@contificate contificate (Contributor) commented

Could we mitigate the TOCTOU issue somewhat by holding the DB lock when computing after, such that we continue holding it after we break the loop (fixpoint convergence), then do all the updates whilst holding the lock for the entirety (ensuring not to hold it during the scan itself), and then release it?

I don't know how expensive all the storage operations are in general or if this is the TOCTOU issue you mention, so this suggestion may be deeply destructive if you hold the lock for too long. This assumes the TOCTOU issue you're talking about is the potential to interleave changes between the breaking of the loop and the subsequent updates.

@Vincent-lau Vincent-lau (Contributor, Author) commented

> Could we mitigate the TOCTOU issue somewhat by holding the DB lock when computing after, such that we continue holding it after we break the loop (fixpoint convergence), then do all the updates whilst holding the lock for the entirety (ensuring not to hold it during the scan itself), and then release it?
>
> I don't know how expensive all the storage operations are in general or if this is the TOCTOU issue you mention, so this suggestion may be deeply destructive if you hold the lock for too long. This assumes the TOCTOU issue you're talking about is the potential to interleave changes between the breaking of the loop and the subsequent updates.

The TOCTOU problem happens when the db gets changed after we do the check db_vdis_after <> db_vdis_before, at which point we shouldn't really proceed.

So you are suggesting taking the lock before if db_vdis_after <> db_vdis_before then and releasing it only after we finish updating the db? How do you hold a db lock for only part of the operation, though, given the db_lock interface?

@contificate contificate (Contributor) commented

> So you are suggesting taking the lock before if db_vdis_after <> db_vdis_before then and releasing it only after we finish updating the db? How do you hold a db lock for only part of the operation, though, given the db_lock interface?

Something like:

while not fixpoint do
  before = ...
  action
  acquire db_lock
  after = ...
  fixpoint = (before == after)
  if not fixpoint:
    release db_lock
done
# still holding the db lock here
bunch of updates
release db_lock

I don't think the database lock exposes such an interface; I think we restricted it to a with_lock function with a callback to avoid misuse. You also won't be able to do this if you spawn new threads within any of the "after" or "bunch of updates" computation (the holder of the lock is determined by thread ID, so you would need to be careful that you can still make progress while holding the lock).
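
For concreteness, a with_lock-style interface confines the critical section to a callback, so the lock cannot be acquired inside a loop and released after it. A minimal sketch with assumed names, not the actual xapi module:

let db_lock = Mutex.create ()

(* The lock is held exactly for the duration of [f] and released even if
   [f] raises, so a caller cannot keep it across the end of a loop. *)
let with_lock f =
  Mutex.lock db_lock ;
  Fun.protect ~finally:(fun () -> Mutex.unlock db_lock) f

(* Consequently, the final comparison and all the updates would have to
   live inside a single callback to share one critical section. *)
let commit_if_unchanged ~read ~before ~update =
  with_lock (fun () -> if read () = before then update ())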

@Vincent-lau Vincent-lau (Contributor, Author) commented Nov 19, 2024

> I don't think the database lock exposes such an interface; I think we restricted it to a with_lock function with a callback to avoid misuse. You also won't be able to do this if you spawn new threads within any of the "after" or "bunch of updates" computation (the holder of the lock is determined by thread ID, so you would need to be careful that you can still make progress while holding the lock).

Yes, I am worried about deadlocking if we hold this lock. It could be that SR.scan runs on a pool member and holds the lock for a period while invoking updates such as Db.SR...., which run on the coordinator; because the member is holding the lock, the coordinator would never be able to acquire the db lock at all. And this is just one case; there might be other deadlock scenarios as well.

In general there is just no way to solve this cleanly...

@Vincent-lau Vincent-lau force-pushed the private/shul2/scan-cas branch from 95322fe to 1e9ed52 on November 19, 2024 at 14:46
@robhoes robhoes (Member) commented Nov 28, 2024

> Yes, I am worried about deadlocking if we hold this lock. It could be that SR.scan runs on a pool member and holds the lock for a period while invoking updates such as Db.SR...., which run on the coordinator; because the member is holding the lock, the coordinator would never be able to acquire the db lock at all. And this is just one case; there might be other deadlock scenarios as well.

This function can run on any host (e.g. a pool member for a local SR), but the coordinator runs the database and only the coordinator has access to DB locks.

@robhoes robhoes (Member) commented Nov 28, 2024

I think this is fine after adding the iteration limit, as discussed above.

@Vincent-lau Vincent-lau force-pushed the private/shul2/scan-cas branch 2 times, most recently from f03e740 to 2756dd0 on December 3, 2024 at 14:39
@contificate contificate (Contributor) left a comment

The recent changes look good.

My only nitpick is that the order of operands in db_vdis_after <> db_vdis_before && limit > 0 could be swapped, under the general idea that less costly computations that can short-circuit a condition should be tested first. Of course, it is the single-retry limit case that benefits most, and only marginally.
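
Concretely, the swapped condition reads as follows (a sketch; the helper is hypothetical):

(* Cheap check first: && short-circuits, so the structural comparison of
   the two VDI lists is skipped once the retry budget reaches zero. *)
let should_retry ~limit ~db_vdis_before ~db_vdis_after =
  limit > 0 && db_vdis_after <> db_vdis_before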

@Vincent-lau Vincent-lau force-pushed the private/shul2/scan-cas branch from 2756dd0 to 87ca90e on December 3, 2024 at 14:53

Commit message:
SR.scan is currently not an atomic operation, and this has caused
problems: during the scan itself, there might be other calls changing
the state of the database, such as VDI.db_introduce called by SM when
using SMAPIv1. This confuses SR.scan, as it sees an outdated snapshot.

The proposed workaround is to add a CAS-style check to SR.scan, which
will refuse to update the db if it detects changes. This is still
subject to the TOCTOU problem, but it should reduce the race window.

Signed-off-by: Vincent Liu <shuntian.liu2@cloud.com>
@robhoes robhoes added this pull request to the merge queue Dec 3, 2024
Merged via the queue into xapi-project:master with commit b12a6b0 Dec 3, 2024
15 checks passed