[move][vm_rewrite] package loader #19153

tzakian · 2024-08-29T23:16:27Z

Description

This is the base rewrite for the package loading and caching mechanism. Before progressing much further than this, we will need to rip out the current loading and module/type/function resolution in the interpreter and move over to passing vtables.

Test plan

This is not tested across the whole test suite. However, I have added a number of unit tests for the loader in package_cache_tests.rs.


cgswords · 2024-08-30T02:12:49Z

external-crates/move/crates/move-cli/src/sandbox/utils/on_disk_state_view.rs

    }

+    /// Read the package bytes stored on-disk at `addr`
+    fn get_package_bytes(&self, address: &AccountAddress) -> Result<Option<Vec<Vec<u8>>>> {


This return type seems janky -- shouldn't we give an error if the address path is invalid?

This is in keeping with the other storage-related APIs, both here and in general (e.g., get_module_bytes), so this type makes sense to me at least.

If it's useful -- this is used for the ModuleResolver trait below in this module, and this trait is also over DBs where you do want to signify "error fetching" vs "no error but not found"
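To make the "error fetching" vs. "no error but not found" distinction concrete, here is a minimal sketch. `ToyStore`, its `u8` addresses, and the `String` error type are all hypothetical stand-ins for illustration; this is not the actual `OnDiskStateView` or `ModuleResolver` code.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory stand-in for an on-disk or DB-backed package store.
pub struct ToyStore {
    packages: BTreeMap<u8, Vec<Vec<u8>>>,
    /// Addresses whose backing storage we pretend is unreadable (illustrative only).
    corrupt: Vec<u8>,
}

impl ToyStore {
    pub fn new() -> Self {
        ToyStore { packages: BTreeMap::new(), corrupt: Vec::new() }
    }

    pub fn insert(&mut self, address: u8, modules: Vec<Vec<u8>>) {
        self.packages.insert(address, modules);
    }

    pub fn mark_corrupt(&mut self, address: u8) {
        self.corrupt.push(address);
    }

    /// `Err(..)`      => the fetch itself failed.
    /// `Ok(None)`     => the fetch worked, but nothing is stored at `address`.
    /// `Ok(Some(..))` => the serialized modules of the package.
    pub fn get_package_bytes(&self, address: u8) -> Result<Option<Vec<Vec<u8>>>, String> {
        if self.corrupt.contains(&address) {
            return Err(format!("failed to read package at {}", address));
        }
        Ok(self.packages.get(&address).cloned())
    }
}

/// A caller can treat the three outcomes differently.
pub fn describe(store: &ToyStore, address: u8) -> String {
    match store.get_package_bytes(address) {
        Ok(Some(modules)) => format!("found {} module(s)", modules.len()),
        Ok(None) => "not found".to_string(),
        Err(e) => format!("error: {}", e),
    }
}
```

With a flat `Result<Vec<Vec<u8>>>`, the `Ok(None)` case would have to be folded into either an error or an empty vector, and the caller could no longer tell the two apart.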

cgswords · 2024-08-30T02:13:06Z

external-crates/move/crates/move-core-types/src/resolver.rs

 };
-use std::fmt::Debug;
 use std::sync::Arc;
+use std::{collections::BTreeSet, fmt::Debug};


Combine imports

cgswords · 2024-08-30T02:13:24Z

external-crates/move/crates/move-core-types/src/resolver.rs

+
+    /// Return the transitive closure of all package dependencies of the current linkage context.
+    fn all_package_dependencies(&self) -> Result<BTreeSet<AccountAddress>, Self::Error> {
+        Ok(BTreeSet::new())


Should this have no default body, so that implementors are required to provide it?
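A small sketch of the trade-off behind this question, using two hypothetical traits (`WithDefault` and `Required`) and a `u8` placeholder for `AccountAddress`. Neither trait is the real resolver trait; they only contrast a defaulted method with a required one.

```rust
use std::collections::BTreeSet;

pub type Address = u8; // placeholder for AccountAddress

/// Variant with a default body: implementors may forget to override it.
pub trait WithDefault {
    fn all_package_dependencies(&self) -> Result<BTreeSet<Address>, String> {
        Ok(BTreeSet::new())
    }
}

/// Variant without a default body: every implementor must provide it.
pub trait Required {
    fn all_package_dependencies(&self) -> Result<BTreeSet<Address>, String>;
}

pub struct ForgetfulStore;

// Compiles, and silently reports "no dependencies".
impl WithDefault for ForgetfulStore {}

pub struct CarefulStore {
    pub deps: BTreeSet<Address>,
}

// Leaving this method out would be a compile error.
impl Required for CarefulStore {
    fn all_package_dependencies(&self) -> Result<BTreeSet<Address>, String> {
        Ok(self.deps.clone())
    }
}
```

The default keeps existing implementors compiling, at the cost of a store that can quietly claim to have no dependencies.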


cgswords · 2024-08-30T02:29:00Z

external-crates/move/crates/move-vm-runtime/src/loader/package_tables.rs

+    ///
+    /// The resuling map of vtables _must_ be closed under the static dependency graph of the root
+    /// package w.r.t, to the current linkage context in `data_store`.
+    pub fn new(data_store: &impl DataStore, package_runtime: &'a PackageLoader) -> VMResult<Self> {


Nit:

Suggested change

-    pub fn new(data_store: &impl DataStore, package_runtime: &'a PackageLoader) -> VMResult<Self> {
+    pub fn new(data_store: &impl DataStore, package_loader: &'a PackageLoader) -> VMResult<Self> {

cgswords · 2024-08-30T02:44:58Z

external-crates/move/crates/move-vm-runtime/src/loader/translate2.rs

+};
+
+struct Context<'a> {
+    link_context: AccountAddress,


We can remove this field now, right?

cgswords · 2024-08-30T02:46:21Z

external-crates/move/crates/move-vm-runtime/src/loader/translate2.rs

+
+struct Context<'a> {
+    link_context: AccountAddress,
+    cache: &'a LoadedPackage,


Nit: rename this to package.

cgswords · 2024-08-30T02:52:44Z

external-crates/move/crates/move-vm-runtime/src/loader/package_loader.rs

+            // Also compute the packages dependency order. This is because we need to count on the fact that
+            // all dependencies are loaded and their types cached before we cache a package.
+            let package_deps = if let Some(pkg) = self.package_cache.read().loaded_package_at(dep) {
+                let package_deps = Self::compute_immediate_package_dependencies(


Don't we have a guarantee that the dependencies are in the cache if we find this package in the cache?

Not necessarily: different link contexts may induce different dependencies, and a package is not cached per link context (it is in fact shared across multiple link contexts).
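As a rough illustration of why a cache hit on a package says nothing about its dependencies, consider the sketch below. It assumes a simplified model in which a link context is a map from runtime ids to storage ids; `CachedPackage`, `Linkage`, and `immediate_deps` are hypothetical names, not the loader's actual types.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub type RuntimeId = u8; // placeholder ids for illustration
pub type StorageId = u8;

/// A cached package records which runtime ids it refers to. It is cached once
/// and knows nothing about any particular link context.
pub struct CachedPackage {
    pub runtime_deps: Vec<RuntimeId>,
}

/// Assumed model of a link context: runtime id -> storage id.
pub type Linkage = BTreeMap<RuntimeId, StorageId>;

/// The concrete (storage-level) dependencies depend on the linkage, so they
/// have to be recomputed per link context even when the package is cached.
pub fn immediate_deps(
    pkg: &CachedPackage,
    linkage: &Linkage,
) -> Result<BTreeSet<StorageId>, String> {
    pkg.runtime_deps
        .iter()
        .map(|rid| {
            linkage
                .get(rid)
                .copied()
                .ok_or_else(|| format!("no linkage entry for runtime id {}", rid))
        })
        .collect()
}
```

The same `CachedPackage` resolved under two linkages that map runtime id 10 to storage ids 10 and 11 respectively yields two different dependency sets, and only one of them may already be in the cache.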

cgswords · 2024-09-04T21:02:25Z

external-crates/move/crates/move-vm-runtime/src/loader/ast.rs

+    // NB: this is needed for the bytecode verifier. If we update the bytecode verifier we should
+    // be able to remove this.
+    pub compiled_modules: BinaryCache<Identifier, CompiledModule>,


Can you explain what this comment means?

not sure what to think of this.
In a way this is not a massive deal as those modules would have to live somewhere.
I was envisioning a global cache for CompiledModule which would return verified modules and so we would put things there only when verified (module verification, not cross modules). In a way that global cache could have any policy we wanted (dropped any time). Sort of a front end cache to module for the data store.
But we are not there yet, and it is not clear when or if we would go there, in which case this may very well be just fine for now.

When we go to check linking we need to call into the bytecode verifier for checking compatibility across dependencies. For this it needs/uses the CompiledModule so we need to keep the CompiledModules for the package around for this purpose until/if we change the cyclic check and linking checker in the bytecode verifier to work over loaded modules.

cgswords · 2024-09-04T21:04:00Z

external-crates/move/crates/move-vm-runtime/src/loader/runtime_vtable.rs

+/// before the beginning of execution, and based on the static call graph of the package that
+/// contains the root package id.


Suggested change

-/// before the beginning of execution, and based on the static call graph of the package that
-/// contains the root package id.
+/// before the beginning of execution, based on the static call graph of the root package (that
+/// is, the package that contains the root package id).


cgswords · 2024-09-13T05:20:13Z

external-crates/move/crates/move-vm-runtime/src/loader/ast.rs

            Bytecode::StLoc(a) => write!(f, "StLoc({})", a),
-            Bytecode::Call(a) => write!(f, "Call({})", a.to_ref().name),
+            Bytecode::KnownCall(a) => write!(f, "Call({})", a.to_ref().name),
+            Bytecode::VirtualCall(a) => write!(


Shouldn't CallGeneric also be updated to reflect this distinction?

I think so, also let's consider if we want an instruction for calling the framework. That would be equivalent to a static call but it may be good to distinguish those even if the implementation would be identical.
Also we should do the same for native calls irrespective of the implementation.
I think all of that info is available already so it should not be a problem to do so

cgswords · 2024-09-13T05:21:06Z

external-crates/move/crates/move-vm-runtime/src/loader/chain_ast.rs

+use std::collections::BTreeMap;
+
+#[derive(Debug, Clone)]
+pub(crate) struct DeserializedPackage {


Consider BinaryFormatPackage or something, since it holds things from the move_binary_format?

I would also add the type table and the linkage table here, not sure why we are omitting them.
At the end it does not seem a big deal to have them and it may help for future work.
Though I am not very clear yet on how all of this is coming together

The type table may be fine here, but the linkage table would definitely not be correct here (IMO). This package is being loaded "irrespective" of the context. The context that the package is then linked and interpreted in is a dynamic construct and, IMO, most likely shouldn't reside on this struct. But let's chat about it :)

cgswords · 2024-09-13T05:24:27Z

external-crates/move/crates/move-vm-runtime/src/loader/runtime_vtable.rs

+    pub fn new(data_store: &impl DataStore, package_runtime: &'a VMCache) -> VMResult<Self> {
+        let mut loaded_packages = HashMap::new();
+
+        // Make sure the root package and all of its dependencies (under the current linkage
+        // context) are loaded.
+        let cached_packages = package_runtime.load_and_cache_link_context(data_store)?;
+
+        // Verify that the linkage and cyclic checks pass for all packages under the current
+        // linkage context.
+        linkage_checker::verify_linkage_and_cyclic_checks(&cached_packages)?;
+        cached_packages.into_iter().for_each(|(_, p)| {
+            loaded_packages.insert(p.runtime_id, p);
+        });
+
+        Ok(Self {
+            loaded_packages,
+            cached_types: &package_runtime.type_cache,
+        })
+    }


Small structural nit: I would expect the VMCache to take the DataStore and return one of these, as package_runtime.generate_vtables(data_store) or similar, so that, e.g., we can push the parallel execution handling into the cache logic. As it is right now, this code will need to be lock-aware to let that work.
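A hedged sketch of the shape suggested here: the cache takes the data store and hands back the vtables, so all locking stays inside the cache. Every name below (`ToyVMCache`, `ToyDataStore`, `MapStore`, `ToyVTables`, `generate_vtables`) is hypothetical, ids are `u8` placeholders, and the linkage and cyclic checks from the real constructor are omitted.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

pub type PackageId = u8; // stand-in for AccountAddress

pub struct ToyPackage {
    pub id: PackageId,
    pub functions: Vec<String>,
}

/// Hypothetical, much-reduced data store: it names the packages in the
/// current link context and can load each one.
pub trait ToyDataStore {
    fn link_context_packages(&self) -> Vec<PackageId>;
    fn load(&self, id: PackageId) -> Option<Vec<String>>;
}

pub struct MapStore {
    pub link_context: Vec<PackageId>,
    pub on_disk: HashMap<PackageId, Vec<String>>,
}

impl ToyDataStore for MapStore {
    fn link_context_packages(&self) -> Vec<PackageId> {
        self.link_context.clone()
    }
    fn load(&self, id: PackageId) -> Option<Vec<String>> {
        self.on_disk.get(&id).cloned()
    }
}

pub struct ToyVTables {
    pub loaded_packages: HashMap<PackageId, Arc<ToyPackage>>,
}

pub struct ToyVMCache {
    packages: RwLock<HashMap<PackageId, Arc<ToyPackage>>>,
}

impl ToyVMCache {
    pub fn new() -> Self {
        ToyVMCache { packages: RwLock::new(HashMap::new()) }
    }

    /// Callers never see the lock; they only receive shared handles.
    pub fn generate_vtables(&self, store: &impl ToyDataStore) -> Result<ToyVTables, String> {
        let mut loaded_packages = HashMap::new();
        for id in store.link_context_packages() {
            // Fast path: the read guard is a temporary, dropped at the end of this statement.
            let cached = self.packages.read().unwrap().get(&id).cloned();
            let package = match cached {
                Some(package) => package,
                None => {
                    let functions = store
                        .load(id)
                        .ok_or_else(|| format!("missing package {}", id))?;
                    let mut guard = self.packages.write().unwrap();
                    // Another thread may have inserted meanwhile; keep whichever entry is there.
                    guard
                        .entry(id)
                        .or_insert_with(|| Arc::new(ToyPackage { id, functions }))
                        .clone()
                }
            };
            loaded_packages.insert(id, package);
        }
        Ok(ToyVTables { loaded_packages })
    }
}
```

A call site would then read `let vtables = cache.generate_vtables(&store)?;`, and two calls against the same cache share the same `Arc<ToyPackage>` entries.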


cgswords · 2024-09-13T05:40:48Z

external-crates/move/crates/move-vm-runtime/src/unit_tests/relinking_store.rs

@@ -0,0 +1,120 @@
+// Copyright (c) The Move Contributors


Consider adding this to move-vm-test-utils/storage.rs, if it can go there.

cgswords · 2024-09-13T05:43:21Z

external-crates/move/crates/move-vm-integration-tests/src/tests/loader_tests.rs

            .clone())
    }
+
+    fn all_package_dependencies(&self) -> Result<BTreeSet<AccountAddress>, Self::Error> {


A bit of confusion: it appears we also define relinking_store.rs below, which could be moved to move-vm-test-utils/src/storage.rs. If we did that, could we also reuse it here?

cgswords · 2024-09-13T05:46:36Z

external-crates/move/move-execution/v0/crates/move-vm-runtime/src/data_cache.rs

        }
    }

+    fn load_package(&self, package_id: &AccountAddress) -> VMResult<Vec<Vec<u8>>> {


I would have hoped version cuts could have avoided needing this sort of stuff, but I see we are sort of stuck where we are. Any solution I have isn't particularly better. Maybe on a longer timeline the version cut should become the whole crates/ folder...

Yea, versioning the whole thing is the only way around this :'(

That being said though I realized I can just stub these out -- they should never be accessed from old versions.

dariorussi

I am going to move some conversation offline and stop reviewing for the moment until we talk.
Thanks so much for this work; it's looking good, and there is a lot left to do.

dariorussi · 2024-09-12T12:55:15Z

external-crates/move/crates/move-vm-runtime/src/loader/runtime_vtable.rs

+                ))
+            })
+            .map(|f| f.as_ref())
+            .ok_or_else(|| {


Maybe this is what we used to do, but I want to make sure I understand.
This gives MISSING_DEPENDENCY with a missing-function message both when the lookup of the package fails and when the lookup of the function fails.
Would it be better to have different errors?

I think MISSING_DEPENDENCY in both cases is fine/correct. Maybe we just add a different message?
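One way the "same status, different message" option could look, as a sketch only. `ToyStatus`, `ToyError`, and the nested-map `ToyVTables` are hypothetical stand-ins and do not reflect the real `VMError` or vtable types.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum ToyStatus {
    MissingDependency,
}

#[derive(Debug, PartialEq, Eq)]
pub struct ToyError {
    pub status: ToyStatus,
    pub message: String,
}

/// Assumed shape for illustration: package id -> (function name -> function index).
pub type ToyVTables = HashMap<u8, HashMap<String, u32>>;

pub fn resolve(vtables: &ToyVTables, package: u8, function: &str) -> Result<u32, ToyError> {
    // Package lookup failure: same status code, package-specific message.
    let table = vtables.get(&package).ok_or_else(|| ToyError {
        status: ToyStatus::MissingDependency,
        message: format!("package {} not found in vtables", package),
    })?;
    // Function lookup failure: same status code, function-specific message.
    table.get(function).copied().ok_or_else(|| ToyError {
        status: ToyStatus::MissingDependency,
        message: format!("function {} not found in package {}", function, package),
    })
}
```

Callers that match on the status keep working unchanged, while logs show which of the two lookups actually failed.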

dariorussi · 2024-09-13T13:11:57Z

external-crates/move/crates/move-vm-runtime/src/loader/ast.rs

+pub type PackageStorageId = AccountAddress;
+pub type RuntimePackageId = AccountAddress;


I am not a big fan of these names, which mean very little to me. Admittedly, I am not sure I have better ones.
But in the end it is something along the lines of OriginId and PackageId.
Or maybe PackageId and PackageVersionId.
Not sure, but storage and runtime make no sense to me and I'd love for us to think of better names.
Basically PackageRuntimeId (I think; I never remember which, which goes to prove how non-descriptive the names are) is the package identity as a "name": the logical view of a package, the package at version 0.
Whereas PackageStorageId is, well... the package itself.
In other words, if I wanted to know all versions of a package, I would do a "search" over PackageRuntimeId and get all the different versions that have been loaded for that package.
I don't think that operation is ever needed, but having the package identity (again, as in the name of the package) in there does not seem to hurt. Also, it will probably be needed in some scenario.

edit: I wrote this comment mixing up storage id and runtime id and had to edit this - lol
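To pin down the two roles described in this comment, here is a tiny sketch of the "search by runtime id" operation. It uses `u8` placeholders instead of `AccountAddress`, and `versions_by_runtime_id` is a hypothetical helper, not something in the PR.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub type RuntimePackageId = u8; // the package's logical identity (placeholder type)
pub type PackageStorageId = u8; // one concrete stored version (placeholder type)

/// Group loaded packages by logical identity, so that all loaded versions of
/// one package can be found from its runtime id.
pub fn versions_by_runtime_id(
    loaded: &[(PackageStorageId, RuntimePackageId)],
) -> BTreeMap<RuntimePackageId, BTreeSet<PackageStorageId>> {
    let mut index = BTreeMap::new();
    for (storage_id, runtime_id) in loaded {
        index
            .entry(*runtime_id)
            .or_insert_with(BTreeSet::new)
            .insert(*storage_id);
    }
    index
}
```

In this toy model the first version of a package has equal storage and runtime ids, and later versions keep the runtime id while getting a fresh storage id.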

dariorussi · 2024-09-13T14:32:08Z

external-crates/move/crates/move-vm-runtime/src/loader/ast.rs

+    // NB: this is under the package's context so we don't need to further resolve by
+    // address in this table.


I would remove this comment as it is not adding much in my opinion

dariorussi · 2024-09-13T14:39:52Z

external-crates/move/crates/move-vm-runtime/src/loader/ast.rs

+    // NB: this is needed for the bytecode verifier. If we update the bytecode verifier we should
+    // be able to remove this.
+    pub compiled_modules: BinaryCache<Identifier, CompiledModule>,


not sure what to think of this.
In a way this is not a massive deal as those modules would have to live somewhere.
I was envisioning a global cache for CompiledModule which would return verified modules and so we would put things there only when verified (module verification, not cross modules). In a way that global cache could have any policy we wanted (dropped any time). Sort of a front end cache to module for the data store.
But we are not there yet, and it is not clear when or if we would go there, in which case this may very well be just fine for now.

dariorussi · 2024-09-13T14:43:11Z

external-crates/move/crates/move-vm-runtime/src/loader/ast.rs

    /// ```..., arg(1), arg(2), ...,  arg(n) -> ..., return_value(1), return_value(2), ...,
    /// return_value(k)```
-    Call(ArenaPointer<Function>),
+    KnownCall(ArenaPointer<Function>),


can we please call this StaticCall? I cannot possibly digest KnownCall

yes! forgot to make this change, but doing it :)

dariorussi · 2024-09-13T14:43:44Z

external-crates/move/crates/move-vm-runtime/src/loader/ast.rs

+// - Virtual: the function is unknown and the index is the index in the global table of vtables
+//   that will be filled in at a later time before execution.
+pub enum CallType {
+    Known(ArenaPointer<Function>),


as in the bytecode comment can we call this Static?

Yep! Making that name change right now
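For readers following the rename, a minimal sketch of the static vs. virtual distinction with the `Static` name applied. It assumes a plain index in place of `ArenaPointer<Function>` and a hypothetical `(package id, function name)` key for the table of vtables; `ToyFunction`, `VTableKey`, and `resolve` are illustrative names only.

```rust
use std::collections::HashMap;

pub struct ToyFunction {
    pub name: String,
}

/// Hypothetical key into the table of vtables: (package id, function name).
pub type VTableKey = (u8, String);

pub enum CallType {
    /// The callee is known at translation time (an index stands in for the arena pointer).
    Static(usize),
    /// The callee is looked up in the vtables, which are filled in before execution.
    Virtual(VTableKey),
}

/// Resolve a call to its target, or `None` if the target is missing.
pub fn resolve<'a>(
    call: &CallType,
    arena: &'a [ToyFunction],
    vtables: &'a HashMap<VTableKey, ToyFunction>,
) -> Option<&'a ToyFunction> {
    match call {
        CallType::Static(index) => arena.get(*index),
        CallType::Virtual(key) => vtables.get(key),
    }
}
```

A static call costs one index into the arena, while a virtual call goes through the vtables that are built for the current linkage context.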

dariorussi · 2024-09-13T14:58:11Z

external-crates/move/crates/move-vm-runtime/src/loader/ast.rs

            Bytecode::StLoc(a) => write!(f, "StLoc({})", a),
-            Bytecode::Call(a) => write!(f, "Call({})", a.to_ref().name),
+            Bytecode::KnownCall(a) => write!(f, "Call({})", a.to_ref().name),
+            Bytecode::VirtualCall(a) => write!(


I think so, also let's consider if we want an instruction for calling the framework. That would be equivalent to a static call but it may be good to distinguish those even if the implementation would be identical.
Also we should do the same for native calls irrespective of the implementation.
I think all of that info is available already so it should not be a problem to do so

dariorussi · 2024-09-13T15:08:25Z

external-crates/move/crates/move-vm-runtime/src/loader/chain_ast.rs

@@ -0,0 +1,29 @@
+// Copyright (c) The Move Contributors


this file name is funny, it took me a while to understand what it means, and I guess chain refers to blockchain?

we could also maybe call it binary_ast or binary_package?

dariorussi · 2024-09-13T15:11:06Z

external-crates/move/crates/move-vm-runtime/src/loader/chain_ast.rs

+use std::collections::BTreeMap;
+
+#[derive(Debug, Clone)]
+pub(crate) struct DeserializedPackage {


I would also add the type table and the linkage table here, not sure why we are omitting them.
At the end it does not seem a big deal to have them and it may help for future work.
Though I am not very clear yet on how all of this is coming together

dariorussi · 2024-09-13T17:36:54Z

external-crates/move/crates/move-vm-runtime/src/loader/type_cache.rs

+
+pub struct TypeCache {
+    pub cached_types: DatatypeCache,
+    pub cached_instantiations: HashMap<CachedTypeIndex, HashMap<Vec<Type>, DatatypeInfo>>,


when is this used and how?

tzakian requested review from cgswords and dariorussi August 29, 2024 23:16

vercel bot deployed to Preview – sui-docs August 29, 2024 23:23 View deployment

cgswords reviewed Aug 30, 2024


cgswords reviewed Aug 30, 2024


Base automatically changed from cgswords/arena_loader_0 to vm_2024_rewrite September 3, 2024 23:09

tzakian force-pushed the tzakian/wip-package-loader branch from e8f8c00 to 52e9385 Compare September 4, 2024 19:57

vercel bot deployed to Preview – sui-docs September 4, 2024 20:05 View deployment

cgswords reviewed Sep 4, 2024


cgswords force-pushed the vm_2024_rewrite branch from e47731d to 70c3425 Compare September 6, 2024 19:14

dariorussi requested a review from tnowacki September 12, 2024 11:53

tzakian force-pushed the tzakian/wip-package-loader branch from 52e9385 to 5f82fad Compare September 12, 2024 18:08

vercel bot deployed to Preview – sui-docs September 12, 2024 18:12 View deployment

tzakian force-pushed the tzakian/wip-package-loader branch from 5f82fad to 13fa069 Compare September 12, 2024 20:02

vercel bot deployed to Preview – sui-docs September 12, 2024 20:06 View deployment

tzakian force-pushed the tzakian/wip-package-loader branch from 13fa069 to 5497eec Compare September 12, 2024 20:18

vercel bot deployed to Preview – sui-docs September 12, 2024 20:20 View deployment

tzakian added 2 commits September 12, 2024 14:53

WIP: package loader

96f46a5

Refactor

a89f593

tzakian force-pushed the tzakian/wip-package-loader branch from 5497eec to a89f593 Compare September 12, 2024 23:15

tzakian marked this pull request as ready for review September 12, 2024 23:15

tzakian changed the title from "WIP: package loader" to "[move][vm_rewrite] package loader" Sep 12, 2024

vercel bot deployed to Preview – sui-docs September 12, 2024 23:16 View deployment

cgswords reviewed Sep 13, 2024


cgswords reviewed Sep 13, 2024


dariorussi reviewed Sep 13, 2024


vercel bot deployed to Preview – sui-docs September 13, 2024 20:39 View deployment

tzakian force-pushed the tzakian/wip-package-loader branch from d31e5a4 to cee6909 Compare September 13, 2024 23:44

vercel bot deployed to Preview – sui-docs September 13, 2024 23:45 View deployment

fixup! Refactor

04a1ba5

tzakian force-pushed the tzakian/wip-package-loader branch from cee6909 to 04a1ba5 Compare September 14, 2024 00:06

vercel bot deployed to Preview – sui-docs September 14, 2024 00:07 View deployment

tzakian merged commit e0d3927 into vm_2024_rewrite Sep 16, 2024
32 of 46 checks passed

tzakian deleted the tzakian/wip-package-loader branch September 16, 2024 20:20
