Skip to content

Conversation

lxfind
Copy link
Contributor

@lxfind lxfind commented Sep 10, 2024

Description

This PR queries the full_objects_history table for objects query.
It does so in a backward-compatible way. We always return data from the objects_history table if it's found, and otherwise we return from the full_objects_history table.

Test plan

Added e2e test


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • REST API:

Copy link

vercel bot commented Sep 10, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 17, 2024 6:46pm
3 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Sep 17, 2024 6:46pm
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Sep 17, 2024 6:46pm
sui-typescript-docs ⬜️ Ignored (Inspect) Visit Preview Sep 17, 2024 6:46pm

@wlmyng
Copy link
Contributor

wlmyng commented Sep 10, 2024

Testing is tricky. We need to be able to create a scenario where an object version is in the full_objects_history table but not in the objects_history table. Is it possible to trigger pruning in the tests?

This would be easier done on the graphql side - I've exposed something recently that enables pruning and waits for pruning effects

Base automatically changed from indexer-cleanup-graphql-object-deleted to main September 11, 2024 17:21
@lxfind lxfind force-pushed the indexer-use-full-object-history branch from 0a4805d to c0a55be Compare September 11, 2024 19:52
@lxfind lxfind force-pushed the indexer-use-full-object-history branch from c0a55be to 15e5fb1 Compare September 12, 2024 05:11
Copy link
Contributor

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing an initial pass -- I'm a little concerned about how we're achieving consistency here by relying on the tables that we would use to fetch the live object set first, and then moving on to the full objects history for point look-ups, because it leaves us open to races breaking consistency for brand new objects:

  • Imagine someone runs a query for an object they just created. At the time the request starts, it's not in objects_snapshot or objects_history (so it wouldn't show up in any consistent query for the live object set).
  • By the time we issue the query to full_objects_history it does get written out.

Now, because of the cascading logic introduced in this PR, that object will appear in the result set, when we did not intend it to.

If we are settling on keeping objects_version around and with a very generous retention, then I think the best course of action is to rely on it as the sole arbiter for consistency:

  • Introduce a DataLoader that just does the point lookups to full_objects_history -- this will one day be replaced by a call to the KV store.
  • Both the historical and parent lookups work by first querying objects_version to get a consistent set of (id, version) keys, and following up with a query to get object contents.

This architecture also meshes nicely with another Object-related workstream, which is to make sure it's always possible (and efficient) to get the ObjectRef for an object, e.g. from the effects of transactions that have just run for very old historical transactions where the object itself has been pruned. We can add the digest to objects_version and most object queries will just be a query to objects_version to get those initial key details about an object. The subsequent data-loader fetch to get the objects content will be hidden in the implementation of Object's fields, something like so:

struct Object {
  address: SuiAddress,
  version: u64,
  digest: Digest,
}

struct ObjectContents {
  native: NativeObject,
  stored: StoredKVObject,
}

#[Object]
impl Object {
  async fn address(&self) -> Address {
    self.address
  }

  async fn version(&self) -> UInt53 {
    self.version.into()
  }

  async fn digest(&self) -> Base64 {
    Base64::from(self.digest)
  }

  #[graphql(flatten)]
  async fn contents(&self, ctx: &Context<'_>) -> Result<Option<ObjectContents>> {
    let DataLoader(loader) = ctx.data_unchecked();

    loader.load_one(HistoricalKey { 
       id: self.address,
       version: self.version
    })
    .await
    .extend()
  } 
}

#[Object]
impl ObjectContents {
  fn bcs(&self) -> Base64 { /* ... */ }
  // ...
}

This is also where my thought came from to have objects_snapshot and the consistent set store only the same fields as objects_version and otherwise rely entirely on the KV store for object content.

None,
)
} else {
Object::new_wrapped_or_deleted(key.id, key.version, key.checkpoint_viewed_at)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this is a bit strange, it means that if I request some totally made up ObjectID and/or version, I will get a response that says that that object is wrapped or deleted, which doesn't seem right.

I'm onboard with the idea that we shouldn't distinguish between an object that is wrapped/deleted/pruned/never existed, but I think in practice that has to mean that they all return no response, rather than this response.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, we know for sure it is a wrapped/deleted object, because we had an entry from full_objects_history table, but the serialized column is null.

@lxfind
Copy link
Contributor Author

lxfind commented Sep 12, 2024

Introduced a new key called PointLookupKey and a loader for it. This would look up in the full_objects_history table for a given ID and version.
Also added a new ObjectKind called Serialized to capture this case.

Copy link
Contributor

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks broadly correct to me and I only had nits looking at that, but then the tests surprised me, so let's ignore the nits for now and focus on that:

Here's how I understand the test:

  • Publish
  • Create O(0) and send to A (aka object(2,0))
  • Create checkpoint + advance epoch
  • Create O(1) and send to A (aka object(5,0)

Ask GraphQL for a version of O(0), which exists, ask for A's objects which should include O(0) but not O(1) because we haven't advanced the checkpoint yet -- so far so good.

  • Create the checkpoint, advance the epoch -- this triggers pruning in objects_history.

Now because the test set-up is kind of strange, with only two checkpoints per epoch, we're in a weird place: objects_history is designed to trail objects by some number of checkpoints, but its pruning policy is set in terms of epochs, this means that although we should always be able to query live objects, we can't, because these live objects are being pulled from objects_history and not objects_snapshot.

This is the strange part that I understand, then there's the strange part I don't understand: Where did O(1) go? It was created only one epoch ago, and we should have a two epoch lag.

TL;DR: The expected test output at the end should be that we see both O(0) and O(1) owned by A, but for two different reasons, we see neither.

.map(|o| {
(
PointLookupKey {
id: addr(&o.object_id).unwrap(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto on the question of unwrap or internal error.

Copy link
Contributor

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR is good to go now, and we can address the broader questions around objects_history pruning in a follow-up -- thanks @lxfind .

pub(crate) fn new_serialized(
object_id: SuiAddress,
version: u64,
serialized: SerializedObject,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit (optional): I would keep this unaliased, because it's useful to know that it's an Option<_> and from context it's pretty clear that it's the bytes of the serialized object.

@amnn
Copy link
Contributor

amnn commented Sep 17, 2024

Couple of notes before this lands though:

  • Does this need to wait until the full_objects_history backfill is completed? Because IIUC, if not, we will start failing point lookups where previously we did not, because we are no longer falling back on objects_history.
  • I'm about to land the CI steps to accommodate staging, and that will move all the existing tests into a stable directory, so watch out for that on rebase.

@lxfind
Copy link
Contributor Author

lxfind commented Sep 17, 2024

Does this need to wait until the full_objects_history backfill is completed? Because IIUC, if not, we will start failing point lookups where previously we did not, because we are no longer falling back on objects_history.

The next release that's going out will already have the writer-side of changes, which I will backfill to make sure it has all data. So when this change gets into prod we should be already complete on data.

@lxfind lxfind merged commit 95f0a4d into main Sep 17, 2024
47 of 48 checks passed
@lxfind lxfind deleted the indexer-use-full-object-history branch September 17, 2024 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants