@lxfind lxfind commented Sep 18, 2024

Description

This PR adds three improvements to the SQL backfill tool:

  1. It allows ON CONFLICT DO NOTHING, so that we can safely backfill gaps without being too precise about the range.
  2. It tunes down the default concurrency and chunk size, and allows for override through command line args.
  3. It prints out the minimum in-progress checkpoint so that if it ever stops for some reason, you can restart using that number.
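
The three points above can be illustrated with a minimal sketch. This is not the code from this PR: the names `chunk_ranges`, `ProgressTracker`, and `backfill_sql`, as well as the `checkpoint_sequence_number` column, are hypothetical and chosen only for illustration. It assumes the backfill walks an inclusive checkpoint range in fixed-size chunks and that chunks are started in ascending order.

```rust
use std::collections::BTreeSet;

/// Split the inclusive checkpoint range [start, end] into inclusive
/// chunks of at most `chunk_size` checkpoints each.
pub fn chunk_ranges(start: u64, end: u64, chunk_size: u64) -> Vec<(u64, u64)> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    let mut chunks = Vec::new();
    let mut lo = start;
    while lo <= end {
        let hi = end.min(lo.saturating_add(chunk_size - 1));
        chunks.push((lo, hi));
        if hi == u64::MAX {
            break; // avoid overflow when computing the next chunk start
        }
        lo = hi + 1;
    }
    chunks
}

/// Tracks the chunks that have been started but not yet finished.
#[derive(Default)]
pub struct ProgressTracker {
    /// Start checkpoints of the in-flight chunks, kept sorted.
    in_progress: BTreeSet<u64>,
}

impl ProgressTracker {
    pub fn start(&mut self, chunk_start: u64) {
        self.in_progress.insert(chunk_start);
    }

    pub fn finish(&mut self, chunk_start: u64) {
        self.in_progress.remove(&chunk_start);
    }

    /// Smallest checkpoint that is still in flight, if any. When chunks are
    /// started in ascending order, everything below this value is done.
    pub fn min_in_progress(&self) -> Option<u64> {
        self.in_progress.iter().next().copied()
    }
}

/// Build an idempotent insert for one chunk. The table, the SELECT text,
/// and the column name are placeholders (assumptions), not the real schema.
pub fn backfill_sql(table: &str, select_sql: &str, lo: u64, hi: u64) -> String {
    format!(
        "INSERT INTO {table} {select_sql} \
         WHERE checkpoint_sequence_number BETWEEN {lo} AND {hi} \
         ON CONFLICT DO NOTHING"
    )
}
```

Restarting from `min_in_progress()` may redo chunks above that value that had already finished; with ON CONFLICT DO NOTHING those repeated inserts are skipped rather than failing, which is why the restart point does not need to be exact.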

Test plan

Run again locally


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • REST API:

@amnn amnn left a comment

Thanks @lxfind !

@lxfind lxfind merged commit b8d8194 into main Sep 19, 2024
47 of 48 checks passed
@lxfind lxfind deleted the indexer-backfill-improvements branch September 19, 2024 15:29
samuel-rufi added a commit to iotaledger/iota that referenced this pull request Jul 15, 2025
# Description of change

Introduces the backfill infrastructure for the indexer. Consolidates
relevant upstream patches, filters out backfill workflows for
unsupported database tables, and improves error handling, testing, API
compatibility, progress tracking, refactoring, type & trait naming, and
comments.

**Upstream patches:**

* [indexer: SQL Backfill command
(#19359)](MystenLabs/sui#19359) {"side-effects":
"major", "type": "feature"}
* [Indexer: A few improvements to backfill tool
(#19441)](MystenLabs/sui#19441)
* [Indexer: Add a generic backfill
template](MystenLabs/sui#19510)
* [Indexer: Add ingestion based
backfill](MystenLabs/sui#19636)
* [indexer: backfill
tx_affected_objects](MystenLabs/sui#19675)
* [indexer: tx_affected_objects ingestion-based
backfill](MystenLabs/sui@b902e82)
* [Indexer: Fix a bug in epochs system state json
backfill](MystenLabs/sui#19695)
* [fix(backfill): chunk up writes to
DB](MystenLabs/sui#19699)
* [chore(indexer): drop redundant curly
braces](MystenLabs/sui@a9b5784)
* [chore(indexer): reduce visibility of
crates](MystenLabs/sui@372f6b6)
* [fix(indexer): fix default db path in SQL ingestion
scripts](MystenLabs/sui@81fcff6)
* [indexer: backfill for
tx_affected_addresses](MystenLabs/sui@6a579d6)

## Links to any relevant issues

Fixes #7365

## How the change has been tested

- [X] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [X] Patch-specific tests (correctness, functionality coverage)

### Infrastructure QA (only required for crates that are maintained by
@iotaledger/infrastructure)

This PR doesn’t touch existing ingestion pipelines or synchronization.
It doesn’t include any actual backfill implementation; instead, it
delivers the infrastructure and tooling needed to create backfill jobs.
However, relevant patch-specific tests have been added.

- [ ] Synchronization of the indexer from genesis for a network
including migration objects.
- [ ] Restart of indexer synchronization locally without resetting the
database.
- [ ] Restart of indexer synchronization on a production-like database.
- [ ] Deployment of services using Docker.
- [X] Verification of API backward compatibility.

### Release Notes

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [X] Indexer: Adds a new `RunBackfill` CLI command to backfill DB
tables.
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
- [ ] REST API:

---------

Co-authored-by: Sergiu Popescu <44298302+sergiupopescu199@users.noreply.github.com>
filipdulic pushed a commit to iotaledger/iota that referenced this pull request Jul 16, 2025