Skip to content

Conversation

mystenmark
Copy link
Contributor

@mystenmark mystenmark commented Sep 11, 2024

This fixes a crash where we clear all pending locks at reconfig
(

pub(crate) fn clear(&self) {
)
while trying to acquire locks for a transaction.

Without this fix, the node can reconfigure while we are trying to acquire locks for a transaction. If clear_locks() linked above is called while we are trying to call clear_cached_locks at

self.clear_cached_locks(&locks_to_write);
, we hit the panic at
DashMapEntry::Vacant(_) => panic!("lock must exist"),

Copy link

vercel bot commented Sep 11, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 23, 2024 4:25pm
3 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Sep 23, 2024 4:25pm
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Sep 23, 2024 4:25pm
sui-typescript-docs ⬜️ Ignored (Inspect) Visit Preview Sep 23, 2024 4:25pm

@lxfind
Copy link
Contributor

lxfind commented Sep 11, 2024

Can you explain more? I don't fully understand how this could crash and why the PR fixes it

@mystenmark
Copy link
Contributor Author

Added some more explanation to the description

@halfprice
Copy link
Contributor

Looks like there is still compilation error?

@mystenmark
Copy link
Contributor Author

Looks like there is still compilation error?

indeed, thought it was a simple enough change to oneshot it but i was wrong

@mystenmark
Copy link
Contributor Author

Looks like there is still compilation error?

indeed, thought it was a simple enough change to oneshot it but i was wrong

Oh, I used the wrong lock. Should work now.

}

pub async fn execution_lock_for_signing(&self) -> ExecutionLockReadGuard {
// No need to check epoch here - current epoch is always correct for signing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you comment explain why signing also requires execution lock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you read the PR description? It's to prevent reconfiguration from happening while we are signing a tx.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but please also doc comment it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh of course, I have no idea why i thought you meant "comment in the PR"

return Err(SuiError::ValidatorHaltedAtEpochEnd);
}
// Ensure that validator cannot reconfigure while we are signing the tx
let _execution_lock = self.execution_lock_for_signing().await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this lock be moved into handle_transaction_impl so it also works with #19401

@mystenmark mystenmark force-pushed the mlogan-handle-tx-reconfig-lock-guard branch from bcbd41a to a643b6c Compare September 23, 2024 16:20
@mystenmark mystenmark requested a review from aschran September 23, 2024 16:20
@mystenmark mystenmark enabled auto-merge (squash) September 23, 2024 16:21
@mystenmark mystenmark requested a review from lxfind September 23, 2024 16:21
@mystenmark mystenmark merged commit 7c9b1d1 into main Sep 23, 2024
47 of 48 checks passed
@mystenmark mystenmark deleted the mlogan-handle-tx-reconfig-lock-guard branch September 23, 2024 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants