-
Notifications
You must be signed in to change notification settings - Fork 11.6k
[move][vm_rewrite] package loader #19153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
3 Skipped Deployments
|
} | ||
|
||
/// Read the package bytes stored on-disk at `addr` | ||
fn get_package_bytes(&self, address: &AccountAddress) -> Result<Option<Vec<Vec<u8>>>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This return type seems janky -- shouldn't we give an error if the address path is invalid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in keeping with the other APIs in here/in general (e.g., get_module_bytes)
with storage-related APIs, so I think this type makes sense to me at least?
If it's useful -- this is used for the ModuleResolver
trait below in this module, and this trait is also over DBs where you do want to signify "error fetching" vs "no error but not found"
}; | ||
use std::fmt::Debug; | ||
use std::sync::Arc; | ||
use std::{collections::BTreeSet, fmt::Debug}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combine imports
|
||
/// Return the transitive closure of all package dependencies of the current linkage context. | ||
fn all_package_dependencies(&self) -> Result<BTreeSet<AccountAddress>, Self::Error> { | ||
Ok(BTreeSet::new()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be empty, to demand others implement it?
/// | ||
/// The resuling map of vtables _must_ be closed under the static dependency graph of the root | ||
/// package w.r.t, to the current linkage context in `data_store`. | ||
pub fn new(data_store: &impl DataStore, package_runtime: &'a PackageLoader) -> VMResult<Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
pub fn new(data_store: &impl DataStore, package_runtime: &'a PackageLoader) -> VMResult<Self> { | |
pub fn new(data_store: &impl DataStore, package_loader: &'a PackageLoader) -> VMResult<Self> { |
}; | ||
|
||
struct Context<'a> { | ||
link_context: AccountAddress, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove this field now, right?
|
||
struct Context<'a> { | ||
link_context: AccountAddress, | ||
cache: &'a LoadedPackage, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: rename this to package
.
// Also compute the packages dependency order. This is because we need to count on the fact that | ||
// all dependencies are loaded and their types cached before we cache a package. | ||
let package_deps = if let Some(pkg) = self.package_cache.read().loaded_package_at(dep) { | ||
let package_deps = Self::compute_immediate_package_dependencies( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we have a guarantee that the dependencies are in the cache if we find this package in the cache?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily since different link contexts will induce different dependencies possibly and a package is not cached based on link context (and is in fact shared across multiple link contexts).
e8f8c00
to
52e9385
Compare
// NB: this is needed for the bytecode verifier. If we update the bytecode verifier we should | ||
// be able to remove this. | ||
pub compiled_modules: BinaryCache<Identifier, CompiledModule>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what this. comment means?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what to think of this.
In a way this is not a massive deal as those modules would have to live somewhere.
I was envisioning a global cache for CompiledModule
which would return verified modules and so we would put things there only when verified (module verification, not cross modules). In a way that global cache could have any policy we wanted (dropped any time). Sort of a front end cache to module for the data store.
But we are there yet and not clear when and if we would go there, and in that case this may very well be just fine for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we go to check linking we need to call into the bytecode verifier for checking compatibility across dependencies. For this it needs/uses the CompiledModule
so we need to keep the CompiledModule
s for the package around for this purpose until/if we change the cyclic check and linking checker in the bytecode verifier to work over loaded modules.
/// before the beginning of execution, and based on the static call graph of the package that | ||
/// contains the root package id. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// before the beginning of execution, and based on the static call graph of the package that | |
/// contains the root package id. | |
/// before the beginning of execution, based on the static call graph of the root package (that | |
/// is, the package that contains the root package id). |
e47731d
to
70c3425
Compare
52e9385
to
5f82fad
Compare
5f82fad
to
13fa069
Compare
13fa069
to
5497eec
Compare
5497eec
to
a89f593
Compare
Bytecode::StLoc(a) => write!(f, "StLoc({})", a), | ||
Bytecode::Call(a) => write!(f, "Call({})", a.to_ref().name), | ||
Bytecode::KnownCall(a) => write!(f, "Call({})", a.to_ref().name), | ||
Bytecode::VirtualCall(a) => write!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't CallGeneric
also be updated to reflect this distinction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, also let's consider if we want an instruction for calling the framework. That would be equivalent to a static call but it may be good to distinguish those even if the implementation would be identical.
Also we should do the same for native calls irrespective of the implementation.
I think all of that info is available already so it should not be a problem to do so
use std::collections::BTreeMap; | ||
|
||
#[derive(Debug, Clone)] | ||
pub(crate) struct DeserializedPackage { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider BinaryFormatPackage
or something, since it holds things from the move_binary_format
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also add the type table and the linkage table here, not sure why we are omitting them.
At the end it does not seem a big deal to have them and it may help for future work.
Though I am not very clear yet on how all of this is coming together
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type table may be fine here, but linkage table would definitely not be correct here (IMO). This package is being loaded "irrespective" of the context. The context that the package is then linked in/interpreted in is a dynamic construct and IMO shouldn't reside on this most likely. But lets chat about it :)
pub fn new(data_store: &impl DataStore, package_runtime: &'a VMCache) -> VMResult<Self> { | ||
let mut loaded_packages = HashMap::new(); | ||
|
||
// Make sure the root package and all of its dependencies (under the current linkage | ||
// context) are loaded. | ||
let cached_packages = package_runtime.load_and_cache_link_context(data_store)?; | ||
|
||
// Verify that the linkage and cyclic checks pass for all packages under the current | ||
// linkage context. | ||
linkage_checker::verify_linkage_and_cyclic_checks(&cached_packages)?; | ||
cached_packages.into_iter().for_each(|(_, p)| { | ||
loaded_packages.insert(p.runtime_id, p); | ||
}); | ||
|
||
Ok(Self { | ||
loaded_packages, | ||
cached_types: &package_runtime.type_cache, | ||
}) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small structural nit: I would expect the VMCache
to take the DataStore
and return one of these, as package_runtime.generate_vtables(data_store)
or similar, so that, e.g., we can push the parallel execution handling into the cache logic. As it is right now, this code will need to be lock-aware to let that work.
@@ -0,0 +1,120 @@ | |||
// Copyright (c) The Move Contributors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering adding this to move-vm-test-utils/storage.rs
if it can go there?
.clone()) | ||
} | ||
|
||
fn all_package_dependencies(&self) -> Result<BTreeSet<AccountAddress>, Self::Error> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of confusion: it appears we also define relinking_store.rs
below, which could b emoved to move-vm-test-utils/src/storage.rs
If we did that, could we also reuse it here?
} | ||
} | ||
|
||
fn load_package(&self, package_id: &AccountAddress) -> VMResult<Vec<Vec<u8>>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have hoped version cuts could have avoided needing this sort of stuff, but I see we are sort of stuck where we are. Any solution I have isn't particularly better. Maybe on a longer timeline the version cut should become the whole crates/
folder...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, versioning the whole thing is the only way around this :'(
That being said though I realized I can just stub these out -- they should never be accessed from old versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to move some conversation offline and stop reviewing at the moment until we talk.
Thanks so much for this work, it's looking good and a lot to do
)) | ||
}) | ||
.map(|f| f.as_ref()) | ||
.ok_or_else(|| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe that is what we used to do and to make sure I understand.
This is giving MISSING_DEPENDENCY
with a missing function message both in case the lookup of the package or the lookup of the function fail.
Would it be better to have different errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think MISSING_DEPENDENCY
in both cases is fine/correct. Maybe we just add a different message?
pub type PackageStorageId = AccountAddress; | ||
pub type RuntimePackageId = AccountAddress; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a big fan of those names which mean very little to me. Admittedly I am not sure I have better names.
But at the end is something along the lines of OriginId
and PackageId
.
Or maybe PackageId
and PackageVersionId
.
Not sure, but storage and runtime make no sense to me and I'd love for us to think of better names.
Basically PackageRuntimeId
(I think, because I never remember which goes to prove how non descriptive the names are) is the package identity as a "name". The logical view of a package. Package at version 0.
Whereas PackageStorageId
is, well... the package itself.
In other words if I wanted to know all version of a package I would do a "search" over PackageRuntimeId
and have all the different version that have been loaded for that package.
Whether that operation is ever needed I don't think so, but having the package identity (as again in the name of the package) in there does not seem to hurt. Also it will be probably needed in some scenario.
edit: I wrote this comment mixing up storage id and runtime id and had to edit this - lol
// NB: this is under the package's context so we don't need to further resolve by | ||
// address in this table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this comment as it is not adding much in my opinion
// NB: this is needed for the bytecode verifier. If we update the bytecode verifier we should | ||
// be able to remove this. | ||
pub compiled_modules: BinaryCache<Identifier, CompiledModule>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what to think of this.
In a way this is not a massive deal as those modules would have to live somewhere.
I was envisioning a global cache for CompiledModule
which would return verified modules and so we would put things there only when verified (module verification, not cross modules). In a way that global cache could have any policy we wanted (dropped any time). Sort of a front end cache to module for the data store.
But we are there yet and not clear when and if we would go there, and in that case this may very well be just fine for now
/// ```..., arg(1), arg(2), ..., arg(n) -> ..., return_value(1), return_value(2), ..., | ||
/// return_value(k)``` | ||
Call(ArenaPointer<Function>), | ||
KnownCall(ArenaPointer<Function>), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we please call this StaticCall
? I cannot possibly digest KnownCall
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes! forgot to make this change, but doing it :)
// - Virtual: the function is unknown and the index is the index in the global table of vtables | ||
// that will be filled in at a later time before execution. | ||
pub enum CallType { | ||
Known(ArenaPointer<Function>), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as in the bytecode comment can we call this Static
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! Making that name change right now
Bytecode::StLoc(a) => write!(f, "StLoc({})", a), | ||
Bytecode::Call(a) => write!(f, "Call({})", a.to_ref().name), | ||
Bytecode::KnownCall(a) => write!(f, "Call({})", a.to_ref().name), | ||
Bytecode::VirtualCall(a) => write!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, also let's consider if we want an instruction for calling the framework. That would be equivalent to a static call but it may be good to distinguish those even if the implementation would be identical.
Also we should do the same for native calls irrespective of the implementation.
I think all of that info is available already so it should not be a problem to do so
@@ -0,0 +1,29 @@ | |||
// Copyright (c) The Move Contributors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file name is funny, it took me a while to understand what it means, and I guess chain refers to blockchain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could also maybe call it binary_ast
or binary_package
?
use std::collections::BTreeMap; | ||
|
||
#[derive(Debug, Clone)] | ||
pub(crate) struct DeserializedPackage { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also add the type table and the linkage table here, not sure why we are omitting them.
At the end it does not seem a big deal to have them and it may help for future work.
Though I am not very clear yet on how all of this is coming together
|
||
pub struct TypeCache { | ||
pub cached_types: DatatypeCache, | ||
pub cached_instantiations: HashMap<CachedTypeIndex, HashMap<Vec<Type>, DatatypeInfo>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when is this used and how?
d31e5a4
to
cee6909
Compare
cee6909
to
04a1ba5
Compare
Description
This is the base rewrite for the package loading and caching mechanism. Before progressing much further than this, we will need to rip out the current loading and module/type/function resolution in the interpreter and move over to passing vtables.
Test plan
This is not tested across the whole test suite. However I have added a number of unit tests for the loader in
package_cache_tests.rs
.