consistency: indexer #22859

Merged
merged 32 commits into from
Jul 27, 2025
Conversation

Contributor

@amnn amnn commented Jul 23, 2025

Description

An implementation of an indexer written using the indexing framework that writes to a "consistent store" built on top of RocksDB (the Db and DbMap types). The consistent store also maintains a circular buffer of snapshots. All reads must go through one of these snapshots, which represent a consistent point in time in the database's history.
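As a rough illustration of the snapshot buffer described above, here is a minimal in-memory sketch. It does not use RocksDB or the PR's `Db`/`DbMap` types: `SnapshotRing`, its fields, and its methods are hypothetical names, and a cloned `BTreeMap` stands in for a real database snapshot.

```rust
use std::collections::{BTreeMap, VecDeque};
use std::sync::Arc;

/// Toy stand-in for a database snapshot: an immutable copy of the data.
type Snapshot = Arc<BTreeMap<String, u64>>;

/// Hypothetical store that retains only the `capacity` most recent snapshots.
struct SnapshotRing {
    capacity: usize,
    live: BTreeMap<String, u64>,
    /// (checkpoint, snapshot) pairs, oldest first.
    snapshots: VecDeque<(u64, Snapshot)>,
}

impl SnapshotRing {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "need room for at least one snapshot");
        Self { capacity, live: BTreeMap::new(), snapshots: VecDeque::new() }
    }

    /// Writes only touch the live data; existing snapshots are unaffected.
    fn put(&mut self, key: &str, value: u64) {
        self.live.insert(key.to_string(), value);
    }

    /// Freeze the current state under `checkpoint`, evicting the oldest
    /// snapshot if the buffer is already full.
    fn take_snapshot(&mut self, checkpoint: u64) {
        if self.snapshots.len() == self.capacity {
            self.snapshots.pop_front();
        }
        self.snapshots.push_back((checkpoint, Arc::new(self.live.clone())));
    }

    /// Reads must name a checkpoint whose snapshot is still in the buffer.
    fn read_at(&self, checkpoint: u64, key: &str) -> Option<u64> {
        let (_, snap) = self.snapshots.iter().find(|(cp, _)| *cp == checkpoint)?;
        snap.get(key).copied()
    }
}
```

With a capacity of two, taking a third snapshot evicts the first, so a read at the evicted checkpoint returns `None` while reads at the two retained checkpoints still see their point-in-time values.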

The indexer (Indexer, Synchronizer) coordinates writes from multiple pipelines with database-wide snapshots taken at pre-configured intervals (the stride). Each pipeline gets its own write queue and a concurrent task that processes its writes. This task is responsible for holding back writes that belong in the next snapshot, taking a snapshot once all pipelines have caught up, and landing writes to the database.
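The coordination can be modelled in a few lines. The sketch below is single-threaded and purely illustrative: `StrideSync` and its methods are hypothetical, each write is reduced to its checkpoint number, and it assumes a snapshot at checkpoint `S` should contain exactly the writes for checkpoints up to and including `S`. The real implementation runs one concurrent task per pipeline.

```rust
use std::collections::VecDeque;

/// Single-threaded model of stride-based coordination (hypothetical names).
struct StrideSync {
    stride: u64,
    /// Checkpoint at which the next snapshot is due.
    next_snapshot: u64,
    /// Per-pipeline queue of pending writes, identified by checkpoint.
    queues: Vec<VecDeque<u64>>,
    /// Highest checkpoint landed so far, per pipeline.
    landed: Vec<Option<u64>>,
    /// Checkpoints at which snapshots were taken.
    snapshots: Vec<u64>,
}

impl StrideSync {
    fn new(pipelines: usize, stride: u64) -> Self {
        assert!(pipelines > 0 && stride > 0);
        Self {
            stride,
            next_snapshot: stride,
            queues: vec![VecDeque::new(); pipelines],
            landed: vec![None; pipelines],
            snapshots: Vec::new(),
        }
    }

    fn enqueue(&mut self, pipeline: usize, checkpoint: u64) {
        self.queues[pipeline].push_back(checkpoint);
    }

    /// Land every write that belongs in the upcoming snapshot, hold back the
    /// rest, and snapshot once all pipelines have reached the boundary.
    fn step(&mut self) {
        loop {
            let boundary = self.next_snapshot;
            for (queue, landed) in self.queues.iter_mut().zip(self.landed.iter_mut()) {
                while let Some(&cp) = queue.front() {
                    if cp > boundary {
                        break; // belongs to a later snapshot: hold it back
                    }
                    queue.pop_front();
                    *landed = Some(cp);
                }
            }
            let all_caught_up = self
                .landed
                .iter()
                .all(|l| matches!(l, Some(cp) if *cp >= boundary));
            if !all_caught_up {
                break;
            }
            self.snapshots.push(boundary);
            self.next_snapshot += self.stride;
        }
    }
}
```

With two pipelines and a stride of 2, a pipeline that has already queued checkpoint 3 keeps that write pending until the other pipeline lands checkpoint 2 and the snapshot is taken.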

The indexer currently supports one pipeline: objects_by_owner. In a follow-up PR, an RPC implementation will be added to serve data from it, followed by more pipelines and RPC methods.

Test plan

$ cargo nextest run -p sui-indexer-alt-consistent-store

Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • gRPC:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:

@amnn amnn self-assigned this Jul 23, 2025
@amnn amnn requested a review from a team as a code owner July 23, 2025 22:53

@amnn amnn requested review from Copilot and a team and removed request for a team July 23, 2025 22:54
@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements a consistent indexer using a RocksDB-based store that provides snapshot-based reads and coordinated writes across multiple pipelines. The indexer maintains a circular buffer of database snapshots and synchronizes writes between pipelines to ensure consistent point-in-time views of the data.

  • Implements a new consistent store abstraction built on RocksDB with snapshot management
  • Adds synchronization mechanism to coordinate writes across multiple pipelines with database-wide snapshots
  • Introduces the objects_by_owner pipeline for indexing objects by their owner with support for filtering by type and balance-based ordering

Reviewed Changes

Copilot reviewed 23 out of 24 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| crates/x/src/lint.rs | Adds bincode to allowed direct dependency duplicates due to breaking interface changes |
| crates/sui-indexer-alt-framework/src/task.rs | Moves slow future monitoring functionality from ingestion module and adds comprehensive tests |
| crates/sui-indexer-alt-framework/src/ingestion/ | Removes slow_future_monitor module and updates imports |
| crates/sui-indexer-alt-framework-store-traits/src/lib.rs | Removes Sync trait bound from Connection trait |
| crates/sui-indexer-alt-consistent-store/ | Complete implementation of consistent store with RocksDB backend, synchronization, and object indexing |

Collaborator

@evan-wall-mysten evan-wall-mysten left a comment

This looks good to me, but someone else on the team with more domain knowledge should also have a look because of how much is changed.

Contributor

@tpham-mysten tpham-mysten left a comment

Overall looking great! Just a few nit comments regarding docs and naming.

Would love to see others' thoughts too, as this change is pretty huge.

Contributor

@wlmyng wlmyng left a comment

You may have noticed my hesitancy, but then I recalled that consistent tables have at most 15 minutes of history, so I think any concerns around restartability are overblown ... this looks pretty good! We're essentially dealing with a live object table, but leveraging rocksdb snapshots to create that consistent/ historical state. Very cool!

amnn added 22 commits July 27, 2025 18:55
## Description

Initial scaffolding for indexer, entrypoint for starting the
service, and an initial pipeline (objects_by_owner) with schema and
handler.

## Test plan

TBD.
## Description

These were probably overzealous, given everything is in one compilation
unit (crate).

## Test plan

Existing tests
## Description

Move `slow_future_monitor` into the `task` module so that it can be
exported outside the framework along with `try_for_each_spawned`.

## Test plan

Existing tests:

```
$ cargo nextest run -p sui-indexer-alt-framework
```
## Description

Add slow future monitors where the synchronizer waits on barriers, to
notify us if one synchronizer task is stuck waiting for more than 60s.
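The idea behind such a monitor can be sketched with plain threads. This is a thread-based analogue, not the framework's future-based `slow_future_monitor`: `with_slow_monitor` and its parameters are hypothetical, and the callback stands in for whatever warning the real code emits after its 60s threshold.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `work` on a helper thread and call `on_slow` every time `threshold`
/// elapses without a result. Returns the result once the work finishes.
fn with_slow_monitor<T, F, W>(threshold: Duration, work: F, mut on_slow: W) -> T
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
    W: FnMut(),
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore the send error: it only occurs if the receiver went away.
        let _ = tx.send(work());
    });
    loop {
        match rx.recv_timeout(threshold) {
            Ok(value) => return value,
            // Still waiting: report it, then keep waiting.
            Err(mpsc::RecvTimeoutError::Timeout) => on_slow(),
            Err(mpsc::RecvTimeoutError::Disconnected) => {
                panic!("worker exited without producing a result")
            }
        }
    }
}
```

The monitor never cancels the work; it only reports that the wait has exceeded the threshold, which matches the stated goal of being notified when a task is stuck.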

## Test plan

Manually tested.
## Description

Add some more documentation explaining how the consistent store's
indexer works, and how its parts fit together, where configuration
details come from, etc.

## Test plan

:eyes:
## Description

Build an initial interface for a consistent service, supporting just
ListOwnedObjects.

## Test plan

Initial (basic) protobuf generation tests.
## Description

Create a scaffold for the gRPC service, exposing an implementation of
the `ConsistentService` (which just returns an `Unimplemented` error for
now).

Also sets up the binary to run an indexer and the RPC service.

## Test plan

Run the service and use gRPC to talk to it:

```
# Generate the default configuration file
$ cargo run -p sui-indexer-alt-consistent-store \
  -- generate-config > /tmp/cos.toml

# Run the indexer
$ RUST_LOG=info cargo run -p sui-indexer-alt-consistent-store \
  -- run --database-path /tmp/cos                             \
  --remote-store-url https://checkpoints.mainnet.sui.io       \
  --config /tmp/cos.toml

# Make a request
$ grpcurl -v -plaintext localhost:7001 \
  sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects

Resolved method descriptor:
rpc ListOwnedObjects ( .sui.rpc.consistent.v1alpha.ListOwnedObjectsRequest ) returns ( .sui.rpc.consistent.v1alpha.ListOwnedObjectsResponse );

Request metadata to send:
(empty)

Response headers received:
(empty)

Response trailers received:
content-type: application/grpc
date: Fri, 18 Jul 2025 22:44:57 GMT
x-sui-rpc-version: 1.52.0-f1769737e1ef-dirty
Sent 0 requests and received 0 responses
ERROR:
  Code: Unimplemented
  Message: Not implemented yet
```
## Description

Add metrics related to the RPC, similar to the metrics implementation in
`sui-rpc-api`.

## Test plan

Run the service, make a request, and then check the metrics output at
`localhost:9184/metrics`.
## Description

Register the file descriptor set for the health service so it shows up
during reflection.

## Test plan

Run the service, and make the following call to it:

```
$ grpcurl -plaintext localhost:7001 grpc.health.v1.Health/Check

Resolved method descriptor:
rpc Check ( .grpc.health.v1.HealthCheckRequest ) returns ( .grpc.health.v1.HealthCheckResponse );

Request metadata to send:
(empty)

Response headers received:
content-type: application/grpc
date: Sat, 19 Jul 2025 23:15:59 GMT
x-sui-rpc-version: 1.52.0-d0dab41f93e1-dirty

Response contents:
{
  "status": "SERVING"
}

Response trailers received:
(empty)
Sent 0 requests and received 1 response
```
## Description

Create a dedicated sub-directory for the service, and break out each
method into its own module, similar to the arrangement of `sui-rpc-api`.

## Test plan

Existing tests.
## Description
Similar to the Error system in `sui-indexer-alt-graphql-rpc`.

## Test plan
Existing tests
## Description

Validate `ListOwnedObjects` and convert it into an `OwnerKind` that can
be used to query the database (but don't actually query the database,
yet).

## Test plan

Manually tested:

```
$ grpcurl -plaintext -d '{}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Missing 'owner'

$ grpcurl -plaintext -d '{"owner": {"kind": 0}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Missing 'owner'

$ grpcurl -plaintext -d '{"owner": {"kind": 1, "address": ""}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Missing 'address' for kind 'ADDRESS'

$ grpcurl -plaintext -d '{"owner": {"kind": 1}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Missing 'address' for kind 'ADDRESS'

$ grpcurl -plaintext -d '{"owner": {"kind": 2, "address": ""}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Missing 'address' for kind 'OBJECT'

$ grpcurl -plaintext -d '{"owner": {"kind": 3, "address": "0x123"}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Unexpected 'address' for kind 'SHARED'

$ grpcurl -plaintext -d '{"owner": {"kind": 4, "address": "0x123"}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Unexpected 'address' for kind 'IMMUTABLE'

$ grpcurl -plaintext -d '{"owner": {"kind": 1, "address": "not-a-hex-address"}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Invalid 'address': "not-a-hex-address"

$ grpcurl -plaintext -d '{"owner": {"kind": 3}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: Unimplemented
  Message: Not implemented yet

$ grpcurl -plaintext -d '{"owner": {"kind": 4}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: Unimplemented
  Message: Not implemented yet

$ grpcurl -plaintext -d '{"owner": {"kind": 1, "address": "0x123"}, "object_type": "0x2::coin::Coin"}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Invalid 'address': "0x123"

$ grpcurl -plaintext -d '{"owner": {"kind": 2, "address": "0x123"}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Invalid 'address': "0x123"

$ grpcurl -plaintext -d '{"owner": {"kind": 2, "address": "0x123"}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Invalid 'address': "0x123": Invalid value was given to the function

$ grpcurl -plaintext -d '{"owner": {"kind": 2, "address": "0x123"}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Invalid 'address': "0x123": Invalid value was given to the function

$ grpcurl -plaintext -d '{"owner": {"kind": 2, "address": "0x123"}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: InvalidArgument
  Message: Invalid 'address': "0123": Invalid value was given to the function

$ grpcurl -plaintext -d '{"owner": {"kind": 2, "address": "0x123"}}' localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ListOwnedObjects
ERROR:
  Code: Unimplemented
  Message: Not implemented yet
```
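The validation rules exercised above can be summarised in a small sketch. The numeric kinds and error messages are taken from the transcript; everything else is an assumption: `validate_owner` and `is_hex_address` are hypothetical names, the `OwnerKind` variants are guesses at its shape, and the address check (a `0x` prefix followed by 1 to 64 hex digits) is a simplification that follows the final run in the transcript.

```rust
#[derive(Debug, PartialEq)]
enum OwnerKind {
    Address(String),
    Object(String),
    Shared,
    Immutable,
}

/// Assumed address format: "0x" followed by 1 to 64 hex digits.
fn is_hex_address(s: &str) -> bool {
    match s.strip_prefix("0x") {
        Some(hex) => {
            !hex.is_empty() && hex.len() <= 64 && hex.chars().all(|c| c.is_ascii_hexdigit())
        }
        None => false,
    }
}

/// Validate the `owner` field of a request. `kind` is the numeric enum value
/// from the request; an empty address is treated the same as a missing one.
fn validate_owner(kind: i32, address: Option<&str>) -> Result<OwnerKind, String> {
    let address = address.filter(|a| !a.is_empty());
    match kind {
        1 | 2 => {
            let name = if kind == 1 { "ADDRESS" } else { "OBJECT" };
            let a = address.ok_or_else(|| format!("Missing 'address' for kind '{name}'"))?;
            if !is_hex_address(a) {
                return Err(format!("Invalid 'address': {a:?}"));
            }
            Ok(if kind == 1 {
                OwnerKind::Address(a.to_string())
            } else {
                OwnerKind::Object(a.to_string())
            })
        }
        3 | 4 => {
            let name = if kind == 3 { "SHARED" } else { "IMMUTABLE" };
            if address.is_some() {
                return Err(format!("Unexpected 'address' for kind '{name}'"));
            }
            Ok(if kind == 3 { OwnerKind::Shared } else { OwnerKind::Immutable })
        }
        // Kind 0 (unset) or any unknown value.
        _ => Err("Missing 'owner'".to_string()),
    }
}
```

In the service each `Err` would be surfaced as an `InvalidArgument` status, as in the transcript.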
## Description

Update the `seek` API for iterators to accept a raw (byte) key.
`key::encode` can be used to encode a structured value, but this change
is informed by the fact that cursors from RPC will be raw bytes.
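To illustrate why a raw-byte `seek` composes well with an encoder, here is a minimal sketch over a `BTreeMap` rather than the PR's iterator types. `encode_key` is a hypothetical stand-in for `key::encode`, and the big-endian layout is an assumption chosen so that byte order agrees with numeric order.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical key encoding: big-endian, so byte order matches numeric order.
fn encode_key(owner: u8, version: u32) -> Vec<u8> {
    let mut bytes = vec![owner];
    bytes.extend_from_slice(&version.to_be_bytes());
    bytes
}

/// Position at the first entry whose key is >= `raw_key` and read forward.
/// The caller may pass an encoded structured key or opaque cursor bytes.
fn seek(map: &BTreeMap<Vec<u8>, String>, raw_key: &[u8]) -> Vec<String> {
    map.range::<[u8], _>((Bound::Included(raw_key), Bound::Unbounded))
        .map(|(_, value)| value.clone())
        .collect()
}
```

Because `seek` only sees bytes, a cursor handed back by a client can be passed straight through without decoding it into a structured key first.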

## Test plan

Existing tests

```
$ cargo nextest run -p sui-indexer-alt-consistent-store
```
## Description

This is in preparation for supporting a form of iteration over all
elements whose keys match a prefix.

## Test plan

Existing tests

```
$ cargo nextest run -p sui-indexer-alt-consistent-store
```
## Description

`DbMap::range` and `DbMap::range_rev` were originally designed to
support prefix queries, but it's easier to support that with a bespoke
API.
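A bespoke prefix API can be sketched as "seek to the prefix, then read while keys still start with it". The example below uses a `BTreeMap` as a stand-in for `DbMap`; `prefix_scan` is a hypothetical name, not the method added in this PR.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Iterate over all entries whose raw key starts with `prefix`, in key order.
fn prefix_scan<'a, V>(
    map: &'a BTreeMap<Vec<u8>, V>,
    prefix: &'a [u8],
) -> impl Iterator<Item = (&'a [u8], &'a V)> + 'a {
    // Every key with this prefix sorts at or after the prefix itself, so
    // start there and stop at the first key that no longer matches.
    map.range::<[u8], _>((Bound::Included(prefix), Bound::Unbounded))
        .take_while(move |(key, _)| key.starts_with(prefix))
        .map(|(key, value)| (key.as_slice(), value))
}
```

Compared with expressing the same query as a range, the caller does not need to compute an exclusive upper bound for the prefix.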

## Test plan

```
$ cargo nextest run -p sui-indexer-alt-consistent-store
```
## Description

Expose the raw versions of the next key value pair from the iterator.
This will be used to extract the raw key to use as a pagination cursor.

## Test plan

Updated tests:

```
$ cargo nextest run -p sui-indexer-alt-consistent-store
```
## Description

Add configuration for default and max page size to the consistent
service and expose the configuration through a new gRPC method
(`ServiceConfig`).

This configuration is not yet used by anything.

The proto schema changes also include a minor rename (`START` and `END`
become `FRONT` and `BACK`) to match the nomenclature used in GraphQL.

## Test plan

Manually tested:

```
$ grpcurl -v -plaintext localhost:7001 sui.rpc.consistent.v1alpha.ConsistentService/ServiceConfig

Resolved method descriptor:
rpc ServiceConfig ( .sui.rpc.consistent.v1alpha.ServiceConfigRequest ) returns ( .sui.rpc.consistent.v1alpha.ServiceConfigResponse );

Request metadata to send:
(empty)

Response headers received:
content-type: application/grpc
date: Mon, 21 Jul 2025 14:11:42 GMT
x-sui-rpc-version: 1.52.0-672319c9b78c-dirty

Response contents:
{
  "defaultPageSize": 50,
  "maxPageSize": 200
}

Response trailers received:
(empty)
Sent 0 requests and received 1 response
```
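Once these limits are wired into request handling, resolving a requested page size might look roughly like the sketch below. `PaginationConfig` and `effective_page_size` are hypothetical names, and both treating `0` as "unset" and clamping oversized requests (rather than rejecting them) are assumptions, not behaviour confirmed by this PR.

```rust
/// Mirrors the two values returned by `ServiceConfig` above.
struct PaginationConfig {
    default_page_size: u32,
    max_page_size: u32,
}

/// Pick the page size to serve for a request.
fn effective_page_size(config: &PaginationConfig, requested: Option<u32>) -> u32 {
    match requested {
        // Nothing requested (or an explicit zero): fall back to the default.
        None | Some(0) => config.default_page_size,
        // Otherwise honour the request up to the configured maximum.
        Some(n) => n.min(config.max_page_size),
    }
}
```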
## Description

Create a `Page` type to describe a requested page and implement a
function on that to extract that page of content from a `DbMap`, with a
prefix filter in place.
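A minimal sketch of what such a `Page` could look like, using a `BTreeMap` in place of `DbMap`. The field names, the `End` enum, and the cursor semantics (exclusive; "after" when paging from the front, "before" when paging from the back) are assumptions modelled on the `FRONT`/`BACK` nomenclature. For brevity it collects every key under the prefix before slicing, where a real store would seek directly to the cursor.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

#[derive(Clone, Copy)]
enum End {
    Front,
    Back,
}

/// Hypothetical description of a requested page.
struct Page {
    /// Raw key of the last item seen, if any (exclusive).
    cursor: Option<Vec<u8>>,
    limit: usize,
    end: End,
}

impl Page {
    /// Return up to `limit` entries under `prefix`, in ascending key order.
    fn paginate<V: Clone>(&self, map: &BTreeMap<Vec<u8>, V>, prefix: &[u8]) -> Vec<(Vec<u8>, V)> {
        let matching: Vec<(&Vec<u8>, &V)> = map
            .range::<[u8], _>((Bound::Included(prefix), Bound::Unbounded))
            .take_while(|(k, _)| k.starts_with(prefix))
            .collect();

        let selected: Vec<(&Vec<u8>, &V)> = match self.end {
            // First `limit` entries strictly after the cursor.
            End::Front => matching
                .into_iter()
                .filter(|(k, _)| self.cursor.as_ref().map_or(true, |c| *k > c))
                .take(self.limit)
                .collect(),
            // Last `limit` entries strictly before the cursor.
            End::Back => {
                let mut before: Vec<_> = matching
                    .into_iter()
                    .filter(|(k, _)| self.cursor.as_ref().map_or(true, |c| *k < c))
                    .collect();
                before.split_off(before.len().saturating_sub(self.limit))
            }
        };

        selected.into_iter().map(|(k, v)| (k.clone(), v.clone())).collect()
    }
}
```

The raw key of the last entry on a page can be handed back to the client as the cursor for the next request.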

## Test plan

New unit tests:

```
$ cargo nextest run                   \
  -p sui-indexer-alt-consistent-store \
  -- rpc::pagination
```
## Description

Prefix all RPC metrics with the name of the RPC service they come from
(JSONRPC, GraphQL, Consistent Store), and add the ability to introduce a
custom prefix to indexer framework metrics, so that multiple RPCs and
Indexers can co-exist in one process, serving metrics to one metrics
service.

Note that this will require updating dashboards and alerts once it
lands.

## Test plan

Existing tests.
## Description

Add the consistent service to the E2E test cluster. All E2E tests will
now wait for the service to catch up, but none of them query the service
yet.

This required adding a new method to return the available checkpoint
range (used by the test runner to check that the consistent store
indexer has caught up).
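The catch-up check might be used along these lines. This is a sketch only: `wait_for_checkpoint` is a hypothetical helper, and the closure stands in for a call to the new available-range method, whose actual name and return type are not shown in this PR description.

```rust
use std::ops::RangeInclusive;

/// Poll `available_range` until its upper bound reaches `target`, giving up
/// after `max_polls` attempts. Returns whether the target was reached.
fn wait_for_checkpoint<F>(mut available_range: F, target: u64, max_polls: usize) -> bool
where
    F: FnMut() -> Option<RangeInclusive<u64>>,
{
    for _ in 0..max_polls {
        // `None` models a service that has no snapshots available yet.
        if let Some(range) = available_range() {
            if *range.end() >= target {
                return true;
            }
        }
        // A real test runner would sleep or yield between polls.
    }
    false
}
```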

## Test plan

Existing E2E tests:

```
$ cargo nextest run -p sui-indexer-alt-consistent-store
```
## Description

Implement listing of owned objects, for address-owners, at a
checkpoint. The checkpoint is provided using metadata (headers), and
defaults to the latest checkpoint available to the service if not
provided.
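The defaulting rule can be sketched as follows. `resolve_checkpoint` is a hypothetical helper, `header` stands for the raw value of whichever metadata key carries the checkpoint (the key name is not given here), and rejecting checkpoints outside the available range is an assumption about how missing snapshots are handled.

```rust
use std::ops::RangeInclusive;

/// Choose the checkpoint to read at: the one named in request metadata, or
/// the latest available one when no header was supplied.
fn resolve_checkpoint(
    header: Option<&str>,
    available: &RangeInclusive<u64>,
) -> Result<u64, String> {
    let Some(raw) = header else {
        return Ok(*available.end());
    };
    let checkpoint: u64 = raw
        .trim()
        .parse()
        .map_err(|_| format!("Invalid checkpoint: {raw:?}"))?;
    if available.contains(&checkpoint) {
        Ok(checkpoint)
    } else {
        Err(format!(
            "Checkpoint {checkpoint} is outside the available range {}..={}",
            available.start(),
            available.end()
        ))
    }
}
```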

## Test plan

New E2E tests

```
$ cargo nextest run \
  -p sui-indexer-alt-e2e-tests \
  --test consistent_store_list_owned_objects_tests
```
## Description

Support filtering the owned objects response by type (or by just the
package or module, or by the uninstantiated version of the type).
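The four levels of filter can be illustrated with a string-based sketch. This is not how the store matches types (it most likely works on encoded key prefixes); `matches_filter` is a hypothetical helper that compares textual type names and assumes addresses are written identically in the filter and the object type.

```rust
/// Does `object_type` (e.g. "0x2::coin::Coin<0x2::sui::SUI>") match `filter`?
///
/// Accepted filter shapes:
///   "0x2"                             package
///   "0x2::coin"                       module
///   "0x2::coin::Coin"                 uninstantiated type (any type parameters)
///   "0x2::coin::Coin<0x2::sui::SUI>"  fully instantiated type (exact match)
fn matches_filter(filter: &str, object_type: &str) -> bool {
    if filter.contains('<') {
        return filter == object_type;
    }
    // Drop type parameters, then compare package/module/name component-wise.
    let base = object_type.split('<').next().unwrap_or(object_type);
    let wanted: Vec<&str> = filter.split("::").collect();
    let actual: Vec<&str> = base.split("::").collect();
    wanted.len() <= 3 && actual.len() == 3 && actual.starts_with(&wanted)
}
```

Comparing whole components rather than raw string prefixes keeps a filter such as `0x2::co` from accidentally matching the `coin` module.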

## Test plan

New E2E tests

```
$ cargo nextest run                                \
  -p sui-indexer-alt-e2e-tests                     \
  --test consistent_store_list_owned_objects_tests \
  -- test_type_filters
```
@amnn amnn enabled auto-merge (rebase) July 27, 2025 18:05
@amnn amnn merged commit 6a142d5 into main Jul 27, 2025
58 checks passed
@amnn amnn deleted the amnn/cos-idx branch July 27, 2025 18:36