
Datastore/feat add per model sync page size #11179

Draft
wants to merge 4 commits into
base: main
from

Conversation

MixMasterT
Contributor

@MixMasterT MixMasterT commented Apr 3, 2023

Description of changes

This is a DRAFT PR for implementing perModelSyncPageSize.

Currently, it is possible to set the syncPageSize value, but there is no way to set it to a different value for one specific data model. This proposed solution would allow users to provide a perModelSyncPageSize option in their DataStore config to set different syncPageSize values for different data models.

The perModelSyncPageSize option would be completely optional, so most users could ignore it altogether.
However, this will be very helpful to developers who may have a heavy data model, since the default syncPageSize is 1000 and the max data volume allowed per request over AppSync is 1 MB.

The intention for this implementation is that users can add a perModelSyncPageSize option when configuring DataStore. This would look something like the following (assume the project has three models, authors, readers, and books; that some readers have a large number of books; and that the book records can be quite large):

Amplify.configure({
  ...awsconfig,
  DataStore: {
    perModelSyncPageSize: {
      // Fetch `books` in batches of 10 instead of the default 1000.
      books: 10,
      // The other tables, `authors` and `readers`, are still fetched according to syncPageSize.
    },
  },
});
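
For illustration only, here is a minimal sketch of how a page size might be resolved per model under this proposal. The `SyncPageConfig` interface and the `resolveSyncPageSize` helper are hypothetical names invented for this example and are not taken from the DataStore source; only the `syncPageSize` and `perModelSyncPageSize` option names and the default of 1000 come from the description above.

```typescript
// Hypothetical config shape for this sketch. `perModelSyncPageSize` is the
// option proposed in this PR; it is not an existing DataStore option.
interface SyncPageConfig {
  syncPageSize?: number;
  perModelSyncPageSize?: Record<string, number>;
}

// Default page size as stated in the PR description.
const DEFAULT_SYNC_PAGE_SIZE = 1000;

// Hypothetical helper: use the per-model override when it is a positive
// integer, otherwise fall back to syncPageSize, then to the default.
function resolveSyncPageSize(config: SyncPageConfig, modelName: string): number {
  const override = config.perModelSyncPageSize?.[modelName];
  if (override !== undefined && Number.isInteger(override) && override > 0) {
    return override;
  }
  return config.syncPageSize ?? DEFAULT_SYNC_PAGE_SIZE;
}

const exampleConfig: SyncPageConfig = { perModelSyncPageSize: { books: 10 } };
console.log(resolveSyncPageSize(exampleConfig, "books"));   // 10
console.log(resolveSyncPageSize(exampleConfig, "authors")); // 1000
```

Falling back in this order keeps the option fully optional: a config without `perModelSyncPageSize` behaves exactly as it does today.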

Description of how you validated changes

So far the only validation has been to run the tests (with the new tests commented out), and verify that they still all pass.

NOTE: this is a draft PR, and the goal is to confirm the approach. If this approach is agreed upon, I will follow up with test implementation to verify the intended behavior.

Checklist

  • PR description included
  • yarn test passes
  • Tests are changed or added
  • Relevant documentation is changed or added (and PR referenced)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@MixMasterT MixMasterT requested a review from a team as a code owner April 3, 2023 17:17
@MixMasterT MixMasterT marked this pull request as draft April 3, 2023 17:17
@svidgen svidgen added the DataStore Related to DataStore category label Apr 4, 2023
@renebrandel

Hi @MixMasterT - thanks for starting this PR. Could you tell us a bit more about your use case that requires this level of granularity?

However, this will be very helpful to developers who may have a heavy data model, since the default syncPageSize is 1000 and the max data volume allowed per request over AppSync is 1 MB.

Would love it if you could expand more on this. For example, with specific limitations that you might've run into.

@MixMasterT
Contributor Author

MixMasterT commented Apr 6, 2023

@renebrandel

No problem.

Here is an example:
Let's say I'm creating an app for Chemistry research experiments. I have five data models (DynamoDB Tables): Users, Materials, Experiments, ResearchFacilities, and ResearchTeams.

Users, Materials, and ResearchFacilities are fairly standard, smallish data models (< 1 KB per record).
ResearchTeams are basically just groupings of Users, but let's say that we use a sync-expression to share Experiments between all Users in the same ResearchTeam, and that some Users may belong to multiple ResearchTeams.

Now, suppose that the Experiments Table is quite complex and includes a lot of detailed observation data, so a single Experiment may be over 100 KB in data.

If we have a User who has more than 20 experiments (could be hundreds) shared with her/him, that may cause AppSync to fail with the "Transformation too large" error.
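
As a rough back-of-the-envelope check of the numbers above, the sketch below estimates how many records of a given average size fit in one response. `maxSafePageSize` is a hypothetical helper, the 1 MB limit is treated as 1,000,000 bytes for simplicity, and the estimate ignores any per-response overhead, so real limits would be somewhat lower.

```typescript
// Response limit cited in this thread, approximated as 10^6 bytes (assumption).
const MAX_RESPONSE_BYTES = 1000 * 1000;

// Hypothetical helper: largest page size whose estimated payload stays
// within the limit, never returning less than 1.
function maxSafePageSize(avgRecordBytes: number, limitBytes: number = MAX_RESPONSE_BYTES): number {
  if (avgRecordBytes <= 0) {
    throw new RangeError("avgRecordBytes must be positive");
  }
  return Math.max(1, Math.floor(limitBytes / avgRecordBytes));
}

console.log(maxSafePageSize(100 * 1000)); // 10   (Experiments at ~100 KB each)
console.log(maxSafePageSize(1000));       // 1000 (Materials at ~1 KB each)
```

Under these assumptions, roughly ten Experiment records already fill a response, while the small models comfortably fit a full page of 1000.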

Currently, with only a single syncPageSize to set, we could solve this problem by setting that down to a very low number (like 5). But if we still want to share all of the Materials with all of the Users, this may cause loading to be painfully slow.
There might be thousands of Materials that we want all users to have available on the frontend, and since each Material record is small, that model could (and should) be synced in larger batches.

perModelSyncPageSize solves this problem by allowing the developers to specify a lower syncPageSize only for those models that are bulky.
In this case, they could simply set the perModelSyncPageSize to { experiment: 5 }, which would cause the Experiments to load in small batches, but allow other models to continue loading in larger batches (up to 1k at a time).
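
To make the trade-off concrete, here is a small sketch that counts sync requests for a single global page size versus a per-model override. The record counts (5000 Materials, 200 Experiments) are made-up figures for illustration, and `pagesNeeded` and `totalRequests` are hypothetical helpers, not part of DataStore.

```typescript
// Number of paged requests needed to fetch `recordCount` records.
function pagesNeeded(recordCount: number, pageSize: number): number {
  return Math.ceil(recordCount / pageSize);
}

// Total requests across models; `overrides` plays the role of the proposed
// perModelSyncPageSize, and `globalPageSize` the role of syncPageSize.
function totalRequests(
  recordCounts: Record<string, number>,
  globalPageSize: number,
  overrides: Record<string, number> = {}
): number {
  return Object.entries(recordCounts).reduce(
    (sum, [model, count]) => sum + pagesNeeded(count, overrides[model] ?? globalPageSize),
    0
  );
}

// Made-up record counts for illustration only.
const counts = { Materials: 5000, Experiments: 200 };

console.log(totalRequests(counts, 5));                        // 1040 requests
console.log(totalRequests(counts, 1000, { Experiments: 5 })); // 45 requests
```

With these illustrative numbers, lowering the global page size to 5 costs about a thousand extra requests for Materials alone, while the per-model override keeps the small batches confined to Experiments.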

Hopefully this description is helpful.

Please let me know if it makes sense to you. Thanks!

Labels
DataStore Related to DataStore category external-contributor
4 participants