[gql] Graphql transactions queries can handle db with pruning enabled #19237

wlmyng · 2024-09-05T23:34:46Z

Description

Extend the watermark task to also keep track of the min unpruned checkpoint and its min tx. The watermark task reads the min and max checkpoints from the checkpoints table, since this is the first table pruned by the pruner.

While checkpoint_viewed_at is inherited from a field's parent, the lower checkpoint bound always comes from the watermark, because any data below it, and potentially at it, must already have been pruned from the db.

Consequently, transactions queries no longer assume that the inclusive lower bound starts from zero when the caller does not specify a starting point. Instead, they use the min unpruned checkpoint and tx from the watermark task.
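To illustrate the bound selection described above, here is a minimal sketch. The `Watermark` struct is a simplified stand-in that keeps only `lo_cp` and `lo_tx`, and `checkpoint_range`, `tx_lower_bound`, `requested_lo` and `requested_hi` are hypothetical names for illustration, not the code in this PR.

```rust
/// Simplified, hypothetical stand-in for the watermark: only the lower
/// bounds discussed in this PR are modelled here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Watermark {
    /// Lowest checkpoint that has not been pruned.
    pub lo_cp: u64,
    /// Lowest transaction sequence number that has not been pruned.
    pub lo_tx: u64,
}

/// Inclusive checkpoint range for a transactions query, or `None` if the
/// range is empty. The lower bound never goes below the watermark, and the
/// upper bound never goes above `checkpoint_viewed_at`.
pub fn checkpoint_range(
    watermark: Watermark,
    checkpoint_viewed_at: u64,
    requested_lo: Option<u64>,
    requested_hi: Option<u64>,
) -> Option<(u64, u64)> {
    // No caller-supplied start: begin at the min unpruned checkpoint, not zero.
    let lo = requested_lo.map_or(watermark.lo_cp, |cp| cp.max(watermark.lo_cp));
    let hi = requested_hi.map_or(checkpoint_viewed_at, |cp| cp.min(checkpoint_viewed_at));
    (lo <= hi).then_some((lo, hi))
}

/// Inclusive lower bound on the transaction sequence number.
pub fn tx_lower_bound(watermark: Watermark, requested_lo: Option<u64>) -> u64 {
    requested_lo.map_or(watermark.lo_tx, |tx| tx.max(watermark.lo_tx))
}
```

With `lo_cp = 10` and `checkpoint_viewed_at = 100`, a query with no bounds covers checkpoints 10 through 100, and a query that lies entirely below checkpoint 10 yields an empty range.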

Returns an error if a caller tries to fetch data outside of the unpruned range (or should we return an empty response? I mirrored this off objects queries: if someone queries outside the available range, they will get an explicit error). At the epoch boundary, when we prune data, callers may receive an empty response instead of an error if the request was made before pruning but completed during or after pruning.

Later on, a dedicated watermarks table will simplify the query patterns needed to support the watermark task on graphql. We should also expose the unpruned range info on the graphql api.

Test plan

pruning/transactions.move

Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

amnn

Almost there, thanks @wlmyng! The main thing is changing the error behaviour for transactions, as per the suggestion from @lxfind.

amnn · 2024-09-11T13:19:11Z

crates/sui-graphql-rpc/src/server/builder.rs

+            hi_cp_timestamp_ms: 1,
            epoch: 0,
+            lo_cp: 0,
+            lo_tx: 0,


Should we record hi_tx as well? It would seem to be symmetric and we do need to use that for the transaction queries.

I guess we care about the transaction upper bound for whatever the checkpoint upper bound is, so it doesn't really help to track hi_tx here.

crates/sui-graphql-rpc/src/server/watermark_task.rs

amnn · 2024-09-11T13:28:28Z

crates/sui-graphql-rpc/src/types/transaction_block/mod.rs

+        // If we've entered this function, we already fetched `checkpoint_viewed_at` from the
+        // `Watermark`, and so we must be able to retrieve `lo_cp` as well.
+        let Watermark { lo_cp, lo_tx, .. } = *ctx.data_unchecked();


nit: pull this out next to the other calls that fetch from the context.

amnn · 2024-09-11T13:31:19Z

crates/sui-graphql-rpc/src/types/transaction_block/mod.rs

+            return Err(Error::Client(
+                "Requested data is pruned and no longer available".to_string(),
+            ));


I think @lxfind mentioned that it would be better to treat pruned data the same as if it had never existed -- that seems like a reasonable position for transactions given that the timeframe is measured in months rather than minutes.

That should simplify the logic here as well, because I think you can skip this check entirely.
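A minimal sketch of why the explicit check can be dropped, assuming a hypothetical in-memory table in place of the real database (`transactions_in` and the map layout are illustrative only): once pruned rows are gone, a range query over them simply matches nothing, so the caller sees the same empty result as for data that never existed.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the transactions table: maps a transaction
/// sequence number to the checkpoint that contains it. Pruned rows are
/// simply absent from the map.
pub fn transactions_in(table: &BTreeMap<u64, u64>, cp_lo: u64, cp_hi: u64) -> Vec<u64> {
    table
        .iter()
        // Keep only transactions whose checkpoint is in the inclusive range.
        // A range that lies entirely in pruned territory matches no rows.
        .filter(|(_, cp)| (cp_lo..=cp_hi).contains(*cp))
        .map(|(tx, _)| *tx)
        .collect()
}
```

No error path is needed in this sketch: a request for checkpoints 0 to 5 against a table whose earliest remaining checkpoint is 10 returns an empty list.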

amnn · 2024-09-11T13:40:11Z

crates/sui-graphql-rpc/src/types/transaction_block/tx_lookups.rs

+            let tx_lo = match lo_record.1 {
+                Some(lo) => lo,
+                // Ostensibly this shouldn't happen in production, but should it occur, we can use
+                // `network_total_transactions` to exclude checkpoints < this one
+                None => lo_record.2,
+            } as u64;


Let's make this an internal error -- I want to know if something should never happen in production, and yet it does.
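One way the suggestion could look, as a sketch only: the `Error` enum here is a local stand-in with a single `Internal` variant, and `tx_lo`, `min_tx` and `cp` are hypothetical names rather than the identifiers used in `tx_lookups.rs`. Instead of silently falling back to another column, a missing lower bound surfaces as an internal error.

```rust
/// Local stand-in for the service's error type; only the variant needed
/// for this sketch is modelled.
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    Internal(String),
}

/// Resolve the inclusive lower transaction bound for checkpoint `cp`.
/// `min_tx` plays the role of the optional value read from the record.
pub fn tx_lo(min_tx: Option<i64>, cp: u64) -> Result<u64, Error> {
    // A missing value should never happen in production, so report it
    // loudly instead of papering over it with a fallback.
    let lo = min_tx.ok_or_else(|| {
        Error::Internal(format!("Missing min tx for checkpoint {cp}"))
    })?;

    // Checked conversion: a negative value is also an internal error,
    // rather than wrapping around as `as u64` would.
    u64::try_from(lo)
        .map_err(|_| Error::Internal(format!("Negative min tx {lo} for checkpoint {cp}")))
}
```

The checked conversion is a design choice of this sketch; the original snippet uses an `as u64` cast.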

amnn · 2024-09-11T13:45:41Z

crates/sui-graphql-rpc/src/types/transaction_block/tx_lookups.rs

        // SAFETY: we can unwrap because of the `Some(checkpoint_viewed_at)
        let cp_hi = min_option([cp_before_inclusive, cp_at, Some(checkpoint_viewed_at)]).unwrap();

+        // Read from the `checkpoints` table rather than the `pruner_cp_watermark` table, because


Is there anywhere that we read from pruner_cp_watermark? (In other words, can we get rid of that table?)

Yes, it's used by the indexer, although I just proposed a change in the data platform channel that could allow us to get rid of it outright.

Agreed that pruner_cp_watermark and checkpoints hold duplicate info. I was looking at the de-dup, but it's probably no longer relevant given indexer-alt.

crates/sui-graphql-rpc/src/types/transaction_block/tx_lookups.rs

wlmyng · 2024-09-19T21:18:38Z

@amnn I plan to wait for the watermarks table changes to land so graphql can read from that (and consequently simplify all the querying complexity in this PR), but if needed I can clean up this PR and get this merged first


Description

main changes from #19237 and this is the only unknown blocker - plus rebase, addressing comments and some testing changes.

Test plan

cargo nextest run --no-capture sui-graphql-e2e-tests::tests run_test::stable/prune

Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [X] GraphQL: graphql changes to enable pruning, mainly around watermark
- [ ] CLI:
- [ ] Rust SDK:
- [ ] REST API:


wlmyng force-pushed the gql/compatible-with-pruning branch from a694df0 to 381b792 Compare September 6, 2024 16:24


wlmyng force-pushed the gql/compatible-with-pruning branch from 3b078e1 to 0714a5c Compare September 6, 2024 22:12


wlmyng force-pushed the gql/compatible-with-pruning branch from 0714a5c to 81178c2 Compare September 6, 2024 22:27


wlmyng marked this pull request as ready for review September 6, 2024 22:58

wlmyng requested review from amnn, emmazzz, stefan-mysten and suiwombat as code owners September 6, 2024 22:58


amnn reviewed Sep 11, 2024


wlmyng added 7 commits October 3, 2024 09:35

this impl should be correct but txbounds expects a right-open interval, while we have a closed interval due to shift away from using network_total_transactions

c5a0b6c

tests around how transactions will work with pruning

1808bb5

track min unpruned cp in watermark task, simplify txbounds::query implementation

a07d377

i think we should return an error when people query for data that is pruned?

0ad253c

cleanup

76daaee

clippy

8e94e48

please clippy

4ddfb4d

wlmyng force-pushed the gql/compatible-with-pruning branch from 51ab464 to 4ddfb4d Compare October 3, 2024 16:43


address comments. makes me think ... i suppose even though the watermark tracks lo and hi, because lo is not static - and won't be until we introduce watermarks table, we actually have to keep fetching the min cp no?

7e72b73


gegaowp mentioned this pull request Dec 5, 2024

GraphQL changes to enable pruning #20523

Merged

8 tasks
