feat: Generic Importers #1389
Merged
calumcalder
merged 34 commits into
dtinit:master
from
calumcalder:feat/generic-exporters
Jan 6, 2025
In preparation for creating a generic transfer extension, set the `@type` field when serializing `SocialActivity*` objects.
Adds a generic `Importer` for SOCIAL_POSTS. The importer passes the social data straight through, and is built to be extensible to other data types.
Extends the `GenericImporter` from the previous commit to support data with both JSON metadata and file data. As an example of usage, provide an implementation for `BLOBS` data.
calumcalder
commented
Nov 14, 2024
...ric/src/main/java/org/datatransferproject/datatransfer/generic/GenericTransferExtension.java
Outdated
dfd07cd to
312862b
7afade1 to
40e18b1
calumcalder
commented
Nov 15, 2024
...ric/src/main/java/org/datatransferproject/datatransfer/generic/GenericTransferExtension.java
Outdated
Extracts the logic for serializing BLOBS data to a separate module. Adds tests for the serializer. Adds a generic wrapper around the payload, which will wrap all other types after they undergo the same refactoring.
Extracts the SOCIAL_POSTS generic importer serialization logic to a separate module. Also adds tests, and wraps the payload in GenericPayload, as was done for BLOBS in the previous commit.
Extracts the CALENDAR generic importer serialization logic to a separate module. Also adds tests, and wraps the payload in GenericPayload, like with other data types in previous commits.
Extracts the MEDIA generic importer serialization logic to a separate module. Also adds tests, and wraps the payload in GenericPayload, like with other data types in previous commits.
Moves `CachedDownloadableItem` to the `BlobbySerializer` file, since that's the only place it's used.
In previous commits the type information of exported data was being lost at serialization time. This change delays converting to the JSON representation until the data is sent on the wire, which preserves type information for as long as possible and avoids double-serialization of the data.

The exported data have a union of possible types (the MediaSerializer, for example, generates a list of MediaAlbums, VideoModels, and PhotoModels), but unfortunately Java doesn't support proper union types. A solution to this is to create an empty `ExportData` interface for each serializer, have the possible subtypes 'implement' the empty interface, and then use the `ExportData` interface as a type bound.

This makes the Media and Calendar serializers more verbose, as we have to duplicate the underlying models. In return it gives us a layer of separation between the model and the serialized data, allowing them to evolve independently. It also guarantees that changes to the model will require a corresponding change in the serializer, reducing the chance of the serializer interface being broken.
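The empty-interface approach described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: every class and interface name here (`MediaExportData`, `AlbumExportData`, `PhotoExportData`, `PayloadSketch`, `MediaSerializerSketch`) is hypothetical, and the fields are reduced to a single string each.

```java
import java.util.ArrayList;
import java.util.List;

// All names below are hypothetical; they illustrate the pattern, not the PR's classes.

/** Empty marker interface standing in for the union "album OR photo". */
interface MediaExportData {}

/** Serializer-side copy of an album, decoupled from the core model. */
final class AlbumExportData implements MediaExportData {
  final String name;

  AlbumExportData(String name) {
    this.name = name;
  }
}

/** Serializer-side copy of a photo. */
final class PhotoExportData implements MediaExportData {
  final String title;

  PhotoExportData(String title) {
    this.title = title;
  }
}

/** Wrapper whose type bound only admits members of the union. */
final class PayloadSketch<T extends MediaExportData> {
  final T payload;

  PayloadSketch(T payload) {
    this.payload = payload;
  }
}

final class MediaSerializerSketch {
  /** Emits typed objects; conversion to JSON would happen later, at send time. */
  static List<PayloadSketch<MediaExportData>> serialize(
      List<String> albumNames, List<String> photoTitles) {
    List<PayloadSketch<MediaExportData>> out = new ArrayList<>();
    for (String name : albumNames) {
      out.add(new PayloadSketch<MediaExportData>(new AlbumExportData(name)));
    }
    for (String title : photoTitles) {
      out.add(new PayloadSketch<MediaExportData>(new PhotoExportData(title)));
    }
    return out;
  }
}
```

Because the bound is `T extends MediaExportData`, wrapping a type that does not implement the marker interface fails at compile time, which is the property the commit message relies on.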
Since we've added a layer between generic importer serializers and the underlying models to better preserve type information, make use of this by using a custom schema for media items. This lets us a) use consistent schemas for Photos and Videos, and b) use TZ-aware date-times (although we have to assume the underlying Date is UTC when copying the model).
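As an illustration of the TZ-aware date-time point, one way to copy a `java.util.Date` into a value with an explicit UTC offset is shown below. The class and method names (`MediaDates`, `toUtc`) are hypothetical, and this is only a sketch of the idea, not necessarily how the PR performs the conversion.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.Date;

// Hypothetical helper; the name and placement are assumptions for illustration.
final class MediaDates {
  /** Copies a Date into an OffsetDateTime, attaching an explicit UTC offset. */
  static OffsetDateTime toUtc(Date date) {
    return date.toInstant().atOffset(ZoneOffset.UTC);
  }
}
```

The resulting `OffsetDateTime` carries its offset with it, so a consumer of the serialized value does not have to guess the zone.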
Simplifies the `@type` annotations of exported data. Might be worth doing this for nested types too, although some of them are directly using Models defined by DTP.
da4861a to
23bec4a
Expands the `TransferServiceConfig` object to allow arbitrary data in a `serviceConfig` field, to be later parsed by `TransferExtension`s at initialization time. Extends `GenericTransferExtension` to read from `serviceConfig` to configure itself for the import service being targeted.

Also allows `TransferExtension`s to declare support for services through a `supportsService` method, turning Service->Extension mappings from one-to-one to many-to-one, meaning `GenericTransferExtension` can be configured to support multiple services.

For now the configuration is fairly barebones: a list of supported verticals (wrapped in an object to allow future support for vertical-level configuration), the base of the API endpoint, and the service name.
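The many-to-one lookup that `supportsService` enables can be sketched as below. Only the method name `supportsService` comes from the commit message; the interface, classes, and service names (`ExtensionSketch`, `GenericExtensionSketch`, `ExtensionLookup`, "ServiceA") are hypothetical and do not reflect DTP's actual extension-loading code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for the extension interface; only supportsService is from the PR text.
interface ExtensionSketch {
  boolean supportsService(String service);
}

/** One extension instance that answers for several configured services. */
final class GenericExtensionSketch implements ExtensionSketch {
  private final Set<String> services;

  GenericExtensionSketch(String... services) {
    this.services = new HashSet<>(Arrays.asList(services));
  }

  @Override
  public boolean supportsService(String service) {
    return services.contains(service);
  }
}

final class ExtensionLookup {
  /** Returns the first extension that declares support for the service, if any. */
  static Optional<ExtensionSketch> find(List<ExtensionSketch> extensions, String service) {
    for (ExtensionSketch extension : extensions) {
      if (extension.supportsService(service)) {
        return Optional.of(extension);
      }
    }
    return Optional.empty();
  }
}
```

With a lookup like this, the loader asks each extension whether it handles a service instead of matching on a single service identifier, so one generic extension can back any number of configured services.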
Tidies up the MediaSerializer class, and reverts some now-unnecessary changes to the core DTP models.
Improves the serialization of Blobby data.
Refactors GenericImporter to make GenericFileImporter simpler, and to remove some duplication in the two classes.
There was a manual test being used during development that was replaced by unit tests.
File data had inherited some structure from when it was based on `BlobbyStorageContainerResource`. Since refactoring to a standalone schema, there's no need to have this structure, so simplify the schema by flattening it.
adb64ab to
76c37d7
This makes the naming more consistent with e.g. MediaSerializer
lisad
previously approved these changes
Dec 11, 2024
love it. The documentation is already very readable to an outsider.
extensions/data-transfer/portability-data-transfer-generic/README.md
Outdated
extensions/data-transfer/portability-data-transfer-generic/README.md
Outdated
extensions/data-transfer/portability-data-transfer-generic/README.md
Outdated
...ric/src/main/java/org/datatransferproject/datatransfer/generic/GenericTransferExtension.java
...neric/src/main/java/org/datatransferproject/datatransfer/generic/auth/OAuthTokenManager.java
.../test/java/org/datatransferproject/datatransfer/generic/GenericImportSerializerTestBase.java
...ommon/src/main/java/org/datatransferproject/types/common/models/calendar/RecurrenceRule.java
Outdated
sparuvu
reviewed
Dec 11, 2024
...sfer-generic/src/main/java/org/datatransferproject/datatransfer/generic/GenericImporter.java
...sfer-generic/src/main/java/org/datatransferproject/datatransfer/generic/GenericImporter.java
...sfer-generic/src/main/java/org/datatransferproject/datatransfer/generic/GenericImporter.java
...ric/src/main/java/org/datatransferproject/datatransfer/generic/GenericTransferConstants.java
...ric/src/main/java/org/datatransferproject/datatransfer/generic/GenericTransferExtension.java
Outdated
...ric/src/main/java/org/datatransferproject/datatransfer/generic/GenericTransferExtension.java
Outdated
...ric/src/main/java/org/datatransferproject/datatransfer/generic/GenericTransferExtension.java
Based on review feedback, expand on the documentation.
- Include a high-level overview of job lifecycle
- Add details on data rates
- Clarify the ordering behaviour
- Add reference to OAuth RFC
- 201 -> 20x for success codes
- Remove implementation details for OAuth flow. It's documented more thoroughly elsewhere and we should encourage use of a framework or third-party authorization server.
Also use `Set.contains` to check for vertical support.
Thanks for the reviews!
lisad
approved these changes
Dec 13, 2024
sparuvu
approved these changes
Dec 16, 2024
To support importing data via DTP, developers are required to implement a Java extension module exposing a `TransferExtension` and `Importer`, with the `Importer` interfacing with a new or existing API also supported by the developer. Creating a Java extension creates a barrier for some developers who don't have the context or bandwidth to implement and maintain such an extension.

As a way to make it easier to integrate with DTP, this PR implements a `GenericTransferExtension` and `GenericImporter`, which can be reused by developers without committing or maintaining Java code.

Generic Importers can be configured to call an HTTP endpoint created and maintained by the developer, allowing them to use the programming language and frameworks of their choice, as long as their API conforms to the OAuth Authorization Code Flow and the HTTP JSON API laid out in the included README.

While this PR contains a functional implementation for the BLOBS, CALENDAR, MEDIA, and SOCIAL-POSTS verticals, I anticipate the implementation and API will be iterated on as developers begin testing and integration.

TODO
`TransferExtension` loading