[indexer] Use full_objects_history in reads #19290
Conversation
This would be more easily done on the GraphQL side -- I've recently exposed something that enables pruning and waits for pruning effects.
Doing an initial pass -- I'm a little concerned about how we're achieving consistency here: we rely first on the tables that we would use to fetch the live object set, and then move on to the full objects history for point look-ups. This leaves us open to races that break consistency for brand-new objects:

- Imagine someone runs a query for an object they just created. At the time the request starts, it's not in `objects_snapshot` or `objects_history` (so it wouldn't show up in any consistent query for the live object set).
- By the time we issue the query to `full_objects_history`, it does get written out.

Now, because of the cascading logic introduced in this PR, that object will appear in the result set, when we did not intend it to.
If we are settling on keeping `objects_version` around with a very generous retention, then I think the best course of action is to rely on it as the sole arbiter for consistency:

- Introduce a DataLoader that just does the point lookups to `full_objects_history` -- this will one day be replaced by a call to the KV store.
- Both the historical and parent lookups work by first querying `objects_version` to get a consistent set of `(id, version)` keys, and following up with a query to get object contents.

This architecture also meshes nicely with another Object-related workstream, which is to make sure it's always possible (and efficient) to get the `ObjectRef` for an object, e.g. from the effects of transactions that have just run, or for very old historical transactions where the object itself has been pruned. We can add the `digest` to `objects_version`, and then most object queries will just be a query to `objects_version` to get those initial key details about an object. The subsequent data-loader fetch to get the object's contents will be hidden in the implementation of `Object`'s fields, something like so:
```rust
struct Object {
    address: SuiAddress,
    version: u64,
    digest: Digest,
}

struct ObjectContents {
    native: NativeObject,
    stored: StoredKVObject,
}

#[Object]
impl Object {
    async fn address(&self) -> Address {
        self.address
    }

    async fn version(&self) -> UInt53 {
        self.version.into()
    }

    async fn digest(&self) -> Base64 {
        Base64::from(self.digest)
    }

    #[graphql(flatten)]
    async fn contents(&self, ctx: &Context<'_>) -> Result<Option<ObjectContents>> {
        let DataLoader(loader) = ctx.data_unchecked();
        loader
            .load_one(HistoricalKey {
                id: self.address,
                version: self.version,
            })
            .await
            .extend()
    }
}

#[Object]
impl ObjectContents {
    async fn bcs(&self) -> Base64 { /* ... */ }
    // ...
}
```
This is also where my thought came from to have `objects_snapshot` and the consistent set store only the same fields as `objects_version`, and otherwise rely entirely on the KV store for object content.
```rust
            None,
        )
    } else {
        Object::new_wrapped_or_deleted(key.id, key.version, key.checkpoint_viewed_at)
```
IIUC, this is a bit strange: it means that if I request some totally made-up ObjectID and/or version, I will get a response saying that the object is wrapped or deleted, which doesn't seem right.
I'm on board with the idea that we shouldn't distinguish between an object that is wrapped/deleted/pruned/never existed, but I think in practice that has to mean they all return no response, rather than this response.
In this case, we know for sure it is a wrapped/deleted object, because we had an entry in the `full_objects_history` table, but the `serialized` column is null.
Introduced a new key called `PointLookupKey` and a loader for it. This looks up the `full_objects_history` table for a given ID and version.
The implementation looks broadly correct to me and I only had nits looking at that, but then the tests surprised me, so let's ignore the nits for now and focus on that.

Here's how I understand the test:

- Publish.
- Create `O(0)` and send to `A` (aka `object(2,0)`).
- Create checkpoint + advance epoch.
- Create `O(1)` and send to `A` (aka `object(5,0)`).
- Ask GraphQL for a version of `O(0)`, which exists; ask for `A`'s objects, which should include `O(0)` but not `O(1)` because we haven't advanced the checkpoint yet -- so far so good.
- Create the checkpoint, advance the epoch -- this triggers pruning in `objects_history`.

Now, because the test set-up is kind of strange, with only two checkpoints per epoch, we're in a weird place: `objects_history` is designed to trail `objects` by some number of checkpoints, but its pruning policy is set in terms of epochs. This means that although we should always be able to query live objects, we can't, because these live objects are being pulled from `objects_history` and not `objects_snapshot`.

That is the strange part that I understand; then there's the strange part I don't understand: where did `O(1)` go? It was created only one epoch ago, and we should have a two-epoch lag.

TL;DR: the expected test output at the end should be that we see both `O(0)` and `O(1)` owned by `A`, but for two different reasons, we see neither.
```rust
        .map(|o| {
            (
                PointLookupKey {
                    id: addr(&o.object_id).unwrap(),
```
Ditto on the question of `unwrap` or internal error.
crates/sui-graphql-e2e-tests/tests/objects/full_objects_history.move
I think this PR is good to go now, and we can address the broader questions around `objects_history` pruning in a follow-up -- thanks @lxfind.
```rust
    pub(crate) fn new_serialized(
        object_id: SuiAddress,
        version: u64,
        serialized: SerializedObject,
```
nit (optional): I would keep this unaliased, because it's useful to know that it's an `Option<_>`, and from context it's pretty clear that it's the bytes of the serialized object.
A couple of notes before this lands though:

The next release that's going out will already have the writer-side changes, which I will backfill to make sure it has all the data. So when this change gets into prod, we should already be complete on data.
Description

This PR queries the `full_objects_history` table for the objects query. It does so in a backward-compatible way: we always return data from the `objects_history` table if it's found, and otherwise we return from the `full_objects_history` table.
Test plan
Added e2e test
Release notes