-
Notifications
You must be signed in to change notification settings - Fork 11.7k
[move] Initial trace format and implementation #18729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[move] Initial trace format and implementation #18729
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
3 Skipped Deployments
|
} | ||
} | ||
|
||
#[cfg(feature = "gas-profiler")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the motivation for feature gating these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple reasons:
- They are only used for move tracing, and in general we should be careful about when and how we create annotated values since they can inflate the value size.
- This is snapshotting the value to an annotated value -- in particular it goes through references. Whereas in a "true" VM value -> annotated value function you may want this to be an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to get this down as a doc comment, either on the feature gate or on as_annotated_move_value
.
external-crates/move/crates/move-trace-format/src/trace_state_machine.rs
Show resolved
Hide resolved
external-crates/move/crates/move-trace-format/src/trace_state_machine.rs
Outdated
Show resolved
Hide resolved
pub(crate) struct VMTracer<'a> { | ||
trace: &'a mut MoveTraceBuilder, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a compelling reason to borrow this value, instead of having the tracer hold and then return it at the end of the execution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main reason is passing it by value would make the interface for this and the VM more cumbersome since this would now affect the return types of the VM as well as input types (and call sites etc), and this feels more natural to me/descriptive of "who owns the tracer" (the caller)
// Copyright (c) The Move Contributors | ||
// SPDX-License-Identifier: Apache-2.0 | ||
|
||
#[cfg(not(feature = "gas-profiler"))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to rename this feature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes absolutely! But I propose this be done in a future PR, since this will be taking over the gas profiler functionality (and this is a publicly visible/usable feature flag that people use when installing the Sui CLI) so we'll need to be careful on how we rename it.
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
#[cfg(feature = "gas-profiler")] | ||
$tracer.as_mut().map(|tracer| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of CFGing all of these, something you might consider is the following code:
#[cfg(feature = "gas-profiler")]
const TRACING_MODE: bool = true;
#[cfg(not(feature = "gas-profiler"))]
const TRACING_MODE: bool = false;
Then you can eliminate these macro calls and replace them with:
if tracing2::TRACING_MODE {
tracer.as_mut.map(|tracer| ...)
}
Rust will eliminate these blocks even when compiled in debug mode.
In addition, you can even pass the tracer as Option<&VMTracer>
to avoid the as_mut
/map
chains in these macros.
external-crates/move/crates/move-vm-runtime/src/tracing2/trace_macros.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments. Most things look good, aside from some name change requests. I'm not a big fan of the macro + cfg gate approach, but maybe there isn't something particularly better to do.
e704d1d
to
1cf3c88
Compare
8eac4d6
to
7030f2f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Realisd I had some comments saved up! Nothing major, mostly asking for clarification in various places (i.e. doc comments).
external-crates/move/crates/move-binary-format/src/file_format_common.rs
Outdated
Show resolved
Hide resolved
external-crates/move/crates/move-trace-format/src/memory_tracer.rs
Outdated
Show resolved
Hide resolved
} | ||
} | ||
|
||
#[cfg(feature = "gas-profiler")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to get this down as a doc comment, either on the feature gate or on as_annotated_move_value
.
pub fn into_tag_with_refs(&self) -> TypeTagWithRefs { | ||
TypeTagWithRefs { | ||
type_: self.tag.clone(), | ||
ref_type: self.layout.as_ref().and_then(|(_, r)| r.clone()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was under the impression that self.layout
could be None
because the layout could not be calculated -- this code merges that case with the case where the type is taken by value -- is that always safe to do? Can you add some docs here as to why that's safe to do if it is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good callout! Updated this so that the cases are not merged anymore.
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
1cf3c88
to
5622ad2
Compare
7030f2f
to
d990bf0
Compare
5622ad2
to
e1c998e
Compare
d990bf0
to
90e98bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am finishing tracer.rs
and I should be done later today, meanwhile some feedback...
I have to say that I find it a bit difficult to digest how much we are sprinkling #[cfg(feature = "gas-profiler")]
all over the code base.
I have not tried or thought enough but wonder if we could re-arrange things in a way that makes that more isolated/encapsulated?
Maybe removing the current profile macros and put everything inside functions?
errors::*, | ||
file_format::{Constant, SignatureToken, VariantTag}, | ||
}; | ||
#[cfg(feature = "gas-profiler")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to move all this implementation in tracer.rs
?
I guess ValueImpl
being "private" requires us to do so?
It seems very intrusive to have all this code here if it's going to be used by the tracer only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be less bad if the feature gate was not gas-profiler
, but something else that was specific to this crate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, this is needed here since ValueImpl
is private. In the future (i.e., after the most recent VM meeting) I think this will eventually all be able to live in a single crate :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are making me wonder now if I am missing something...
this is not adding any error handling that was not there before, right?
pub(crate) struct Interpreter { | ||
/// Operand stack, where Move `Value`s are stored for stack operations. | ||
operand_stack: Stack, | ||
pub(crate) operand_stack: Stack, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not that all those pub(crate)
are problematic but man....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea...But needed!
) | ||
}); | ||
#[cfg(feature = "gas-profiler")] | ||
let func_clone = func.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the reason for this and other clone
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was an existing clone for profile_open_frame
. I think that can take it by ref, but I'm kinda of the opinion of keeping it as-is as it should be going away soon either way...
external-crates/move/crates/move-trace-format/src/memory_tracer.rs
Outdated
Show resolved
Hide resolved
external-crates/move/crates/move-trace-format/src/memory_tracer.rs
Outdated
Show resolved
Hide resolved
/// Internal state for the tracer. This is where the actual tracing logic is implemented. | ||
pub(crate) struct VMTracer<'a> { | ||
trace: &'a mut MoveTraceBuilder, | ||
link_context: Option<AccountAddress>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure why we are exposing this, can we get rid of it? what is that supposed to do? is that what we want to keep moving forward?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this in the tracer so that can get the correct module to e.g., resolve type layouts.
In the future, this can be replaced by the mapping that we get from the link context and/or the resolver over the package vtables. But at least for now/in the current incarnation of the VM here we need this. (note that this isn't exposed in the trace or anything though!)
trace: &'a mut MoveTraceBuilder, | ||
link_context: Option<AccountAddress>, | ||
pc: Option<u16>, | ||
active_frames: BTreeMap<TraceIndex, FrameInfo>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really wonder if a Vec<(TraceIndex, FrameInfo)>
wouldn't be more efficient and more obvious.
In fact we could even move the TraceIndex
in the FrameInfo
? then it would just be Vec<FrameInfo>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is sparse (this is keyed by TraceIndex
and not e.g., call stack depth or anything like that), so moving to a Vec<...>
is probably not a good idea here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe I did not understand this too well :) and here is what I thought:
BTreeMap<TraceIndex, FrameInfo>
is acting as a stack whenever we get the latest value (top of the stack, last_key_value
) and it works because TraceIndex
is guaranteed to be monotonically growing and as such a "later" frame has always a higher value of a frame up the stack (previous frame).
However we also use TraceIndex
as a key into the "stack" to get roots and such.
Is that the case?
If that was the case (if what I wrote is true), a Vec<(TraceIndex, FrameInfo)>
or Vec<FrameInfo>
- if FrameInfo
had the TraceIndex
- would work as well with a binary search in place of all the get
in the code. All the usage of a stack (last_key_value
) would simply work off a vector which is more natural.
In other words (again assuming I understood anything) the perf characteristics of a vector with a binary search would be the same of the BTreeMap
for the get
(log(n)) and better otherwise in terms of space and time.
All in all at the end is not a big deal in terms of perf, as call stacks are not really that deep, and maybe better when reading the code?
Did I got the usage right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You got the usage right!
I think it's better to keep it as a BTreeMap
as that feels more "natural" to me at least when reading/reasoning about the code since you generally need random access into the stack more frequently then most recent call frame (mainly due to mutating references). Agreed that it's not that big a deal either way though (and we can convert at a later date if we want/need to).
If we wanted to we could also move to an IndexMap
for this as well which would give the ideal access patterns for this -- O(1)
random access and O(1)
"last frame"/call-stack order operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fine keeping things as they are and I still think a vector would be the ideal struct - lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more nits...
|
||
/// The reference information for a local. This is used to track the state of a local in a frame. | ||
/// * It can be a value, in which case the reference type is `Value`. | ||
/// * It can be a local that does not currently hold a value (is "empty"), in which case the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"in which case the" remove "the"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of interest how do you end up with a reference to an empty local? Isn't that what the borrow checker is supposed to protect against?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question! This is tracking the state of the local during the course of execution, so e.g., when we open a frame, there may be a local (of reference type) that starts out uninitialized, so this would show up as a local with a LocalType
of LocalType { layout: ..., ref_type: ReferenceType::Empty(RefType::Imm)
. This then gets filled later on throughout execution.
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
/// Return the current frame identifier. This is trace index of the frame and is used to | ||
/// identify reference locations rooted higher up the call stack. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should any of those be a doc comment instead of just a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think doc comments are fine? Can move to normal comments if you want, but figure this is a bit nicer.
errors::*, | ||
file_format::{Constant, SignatureToken, VariantTag}, | ||
}; | ||
#[cfg(feature = "gas-profiler")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be less bad if the feature gate was not gas-profiler
, but something else that was specific to this crate?
// NB: parameters are passed directly, so we just pop to make sure they aren't also | ||
// left on the operand stack. For the initial call, these pops may (should) fail, but that | ||
// is fine as we already have the values in the parameter list. | ||
self.operand_stack.pop(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit (optional): This is a bit more verbose but for me it more directly conveys the idea that you care about maintaining the invariant on the operand stack, and less about the values coming out of it:
self.operand_stack.truncate(self.operand_stack.len().saturating_sub(frame.parameters.len())
external-crates/move/crates/move-trace-format/src/memory_tracer.rs
Outdated
Show resolved
Hide resolved
external-crates/move/crates/move-trace-format/src/memory_tracer.rs
Outdated
Show resolved
Hide resolved
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
Some(()) | ||
} | ||
|
||
fn close_instruction_( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't fully figure out why it's helpful to split instruction handling into pre-effects and post-effects, vs having one big match statement that contains the abstract implementation of every instruction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Called at different times -- one is called before the execution of the instruction, and the other is called immediately after (to observe the effect of the instruction). Basically
call_pre_effect();
<vm executes instruction>
call_post_effect();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of interest, is there a good example of an instruction where its post-effect relies on a change in state in the VM to get computed correctly?
external-crates/move/crates/move-vm-runtime/src/tracing2/tracer.rs
Outdated
Show resolved
Hide resolved
61c3ec0
to
dd46d00
Compare
d8ed5a4
to
0c0a0fa
Compare
dd46d00
to
44904a8
Compare
0c0a0fa
to
c6bc865
Compare
Everything should be either responded to or addressed! Thanks for all the feedback and suggestions. |
left a reply and one more idiotic question :) |
Correct! That is what is serialized out :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
accepting since it has reviewed quite a bit and we concluded there are things we want to do in subsequent PRs.
I guess there is no reason to keep it open.
And if there are reason you can keep it open for a bit longer :)
…t!) (#18730) ## Description Plumbs tracing into the Move unit tests, and Move CLI (_not_ the Sui CLI just yet!). You can now run with `move test --trace-execution` or `move test --trace-execution <dir>`. If the Move CLI binary was compiled with the `gas-profiled` feature flag enabled this will result in a trace file for each unit test either under `<move-pkg>/traces/` or under `<move-pkg>/<dir>/`. Once we're happy with this, and we've kicked the tires a bit more + moved over test coverage and gas profiling plumbing this up into the Sui CLI is something we should do. This additionally adds a new test to the `./tests.sh` script for external crates to compile and run tracing tests with the `gas-profiler` flag set. ### PR Stack: 1. #18727 2. #18729 3. #18730 **<<<<< You are here** ## Test plan Added a number of tests. --- ## Release notes Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required. For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates. - [ ] Protocol: - [ ] Nodes (Validators and Full nodes): - [ ] Indexer: - [ ] JSON-RPC: - [ ] GraphQL: - [ ] CLI: - [ ] Rust SDK: - [ ] REST API:
a6f8f05
into
tzakian/tracing-1-source-maps-json
Description
This PR adds the initial trace format, and the implementation of this trace format in the VM. I've gone through the tracing format with most of you synchronously so won't write out all the details here (but I will take it on to write a spec for the trace format once this is done so other folks can use it when consuming this format). Happy to go through it at any point in real time.
Other TODO: right now the
MoveValue
s only support serialize and not deserialize back into Rust so we can only push a trace out from Rust but not consume it. The next thing I'm working on right now is adding support for that, and it may be either an update to this PR, or an additional PR depending on how involved that is...Tests and integration of this into the Move CLI (not the Sui CLI yet) is in the PR above this one.
This keeps the tracing largely feature-gated in the VM, and the only additional overhead/change at runtime with
gas-profiling
turned off is the additional argument, but this argument is unused and should be optimized away (and at worst only add essentially no overhead).I kept the
gas-profiling
feature flag and gated the new tracing under it. The plan being to eventually rename that flag at the same time that we update test coverage and gas profiling to use this new tracing format as well (but that's for a couple future PRs!).PR Stack:
Test plan
PR above this one
Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.