Skip to content

Conversation

gegaowp
Copy link
Contributor

@gegaowp gegaowp commented Sep 10, 2024

Description

df_ columns except df_kind references have been removed from both indexer reader and graphql, this pr removes them from the ingestion path.

to make it happen, some refactoring was necessary, previously StoredObject was the From source Stored* of objects_history and objects_snapshot, this pr changes the source to IndexedObject, which is more intuitive as StoredObject is supposed to be coupled with objects table while IndexedObject is table agnostic.

Test plan

ci and local run


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • REST API:

Copy link

vercel bot commented Sep 10, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Sep 20, 2024 0:23am
3 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Sep 20, 2024 0:23am
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Sep 20, 2024 0:23am
sui-typescript-docs ⬜️ Ignored (Inspect) Visit Preview Sep 20, 2024 0:23am

@gegaowp gegaowp changed the title Drop df columns draft not ready for review: drop df columns and refactor Sep 10, 2024
@gegaowp gegaowp force-pushed the drop-df-columns branch 2 times, most recently from 51a20ed to 876bc4e Compare September 11, 2024 15:58
@gegaowp gegaowp changed the title draft not ready for review: drop df columns and refactor indexer: drop df columns and refactoring Sep 11, 2024
@lxfind
Copy link
Contributor

lxfind commented Sep 11, 2024

In order to drop a column, the process needs to take 2 releases, as described in https://www.notion.so/mystenlabs/Indexer-Release-Process-fdd71dffb3334b7c9ecd8df3cacbe5a7?pvs=4#2ad9f8dcd47241459ab700bdad13d863.
In the first release the Rust types are updated so that it tolerates the missing columns, and in the second release the table is dropped from the schema.
@amnn This is another data point in my view that we may want to consider not thinking too much about properly handling breaking changes in the short term.

@gegaowp
Copy link
Contributor Author

gegaowp commented Sep 11, 2024

@lxfind yes and for this specific change, the columns were always Nullable and we have removed the callsites from indexerJSON RPC and GraphQL on previous prs, so that we can fast forward to step 3, which is this pr, and then on next deployment, when the new migration is run, the columns will be dropped, did I miss anything?

@lxfind
Copy link
Contributor

lxfind commented Sep 11, 2024

In a standard release process, where we want to ensure zero down time, we will need to be able to perform blue-green deployment, where we keep the old writer running while starting the new writer. So there will be a period of time where both old writer and new writer are running. In this case, when the new writer removes a column, the old writer will fail because it would still attempt to write to the dropped column (even if they are null). Same issue applies to readers. So in the future the proper way would be that we need to update the Rust type first before we drop the column.

However, if we want to speed up the process, I think we could ignore the proper rollout for now and always first shutdown the old indexer instances and then star the new one, which would avoid this problem, but with a few mins down time.

@amnn
Copy link
Contributor

amnn commented Sep 11, 2024

@amnn This is another data point in my view that we may want to consider not thinking too much about properly handling breaking changes in the short term.

However, if we want to speed up the process, I think we could ignore the proper rollout for now and always first shutdown the old indexer instances and then star the new one, which would avoid this problem, but with a few mins down time.

I think this is fine for now -- my main interest in following the backwards compatibility process is to test it out (and discover interesting edge cases like the impact of blue-green deployment on how quickly we can move) and less about actually maintaining backwards compatibility or reducing down time at the moment.

Copy link
Contributor

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks good as well, thanks for the clean-up @gegaowp !

@gegaowp gegaowp force-pushed the drop-df-columns branch 2 times, most recently from 7928f9b to 27fb7a8 Compare September 19, 2024 03:09
@gegaowp gegaowp merged commit 8e09680 into MystenLabs:main Sep 20, 2024
43 of 44 checks passed
@gegaowp gegaowp deleted the drop-df-columns branch September 20, 2024 01:31
gegaowp added a commit that referenced this pull request Oct 16, 2024
## Description 

df_ columns except df_kind references have been removed from both
indexer reader and graphql, this pr removes them from the ingestion
path.

to make it happen, some refactoring was necessary, previously
`StoredObject` was the `From` source Stored* of `objects_history` and
`objects_snapshot`, this pr changes the source to `IndexedObject`, which
is more intuitive as `StoredObject` is supposed to be coupled with
`objects` table while `IndexedObject` is table agnostic.

## Test plan 

ci and local run

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] Indexer: 
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] REST API:

---------

Co-authored-by: Will Yang <willyang@mystenlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants