Restructure to allow extension to handle 1-n collection sync

As discussed in Slack, each Firestore collection we want to sync to a Typesense collection currently needs its own extension install (a 1:1 setup). This causes several issues at scale.

For situations where an application only needs a couple of TS collections, this is quite manageable. However, given Google's terrible support for any kind of search within Firebase (ironically), and given how well TS fills this void, a project is quite likely to end up with a large number of searchable collections within its TS cluster(s) that need to be kept in sync from the source DB, Firebase. That is what has happened for us: we have grown from 1 Typesense collection (and one Firebase Extension to sync it) to 19, with more on the way.

Each time a Firebase Extension is installed to manage a Typesense collection sync, it comes with the following:

1. A Cloud Function for backfill.
2. A Cloud Function to watch the collection path and upsert document changes.
3. Google Cloud security keys providing permission to operate.
4. Maintenance overhead for updates per Extension.
5. All of the above is then generally multiplied by 2-3x, since the same setup is needed across development, testing, and production environments.

In our case this means many security keys to keep track of, almost 50 unnamed Cloud Functions polluting the Functions GUI for each environment (all running essentially the same code), and an increasingly inefficient process to maintain. Even just finding the right function to inspect its logs when something isn't working gets harder with every Extension added. The Firebase GUI from Google makes this worse: it shows no information about what each extension function is actually for, so it's basically trial and error to isolate the one you are looking for (we have already raised this point with Google directly). See the example screens below for how hard it is to work out which Extension or Cloud Function does what:

<img width="606" alt="Image" src="https://github.com/user-attachments/assets/948c4638-141b-4fbc-aed7-105a72a22e80" />
<img width="751" alt="Image" src="https://github.com/user-attachments/assets/dd89e17b-554c-4e7d-bbce-c0ce04e6ec4e" />


To reduce the code and IAM key repetition, I propose a redesign of the Typesense Extension to allow it to manage 1-n collections.

Depending on whether the Firebase Extension GUI allows 1-n fields or not, this could also be accomplished by simply providing a field that points to where the configuration data lives. A few ways this could be solved:

1. A field to reference a config file somewhere, maybe uploaded to Cloud Storage.
2. Expanding on the `typesense_sync` collection by changing `firestore_collections[]` from an array of strings to an array of objects, each one holding the configuration details needed for its sync.
3. Let the Typesense API key already supplied to the extension be used to talk to Typesense and fetch the config data the extension needs to run. This gives full flexibility to design whatever GUI you want on the Typesense Cloud side, such as an "Extensions" menu item added per collection in TS Cloud, where the user can set up the details needed for the extension type they are mapping (I expect this could work with extensions other than just Firebase), and it stays flexible enough to evolve with needs over time.
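To make option 2 more concrete, here is a rough sketch of what an object entry in `firestore_collections[]` might look like. Every field name below (`firestorePath`, `typesenseCollection`, `fields`, `flattenNested`) and the `normalizeConfig` helper are hypothetical and only illustrate the idea; none of this is the extension's current schema.

```typescript
// Hypothetical shape for one entry of `firestore_collections[]` under option 2.
// These field names are assumptions for illustration, not an existing schema.
interface CollectionSyncConfig {
  firestorePath: string;       // Firestore collection path to watch
  typesenseCollection: string; // target Typesense collection name
  fields?: string[];           // optional allow-list of fields; omit to sync all
  flattenNested?: boolean;     // optional per-collection setting
}

// Accept both the existing string form and the proposed object form,
// so a plain string entry keeps its current meaning.
function normalizeConfig(entry: string | CollectionSyncConfig): CollectionSyncConfig {
  if (typeof entry === "string") {
    return { firestorePath: entry, typesenseCollection: entry };
  }
  return entry;
}

const rawEntries: Array<string | CollectionSyncConfig> = [
  "users",
  {
    firestorePath: "orgs/{orgId}/projects",
    typesenseCollection: "projects",
    fields: ["name", "status"],
  },
];

const configs: CollectionSyncConfig[] = rawEntries.map((e) => normalizeConfig(e));
```

Treating a plain string as shorthand for "same name on both sides" is just one possible choice; it would let string-only setups keep working while object entries carry the extra detail.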

Personally, I think the 3rd gives the most freedom, but the 2nd is probably the easiest to accomplish and a good compromise. Any of them should do the trick, though, and conceptually they could be ported to the other types of DBs you have extension syncs for.

It's also worth considering that every Cloud Function watching a path for sync is one more function that can go cold, so more functions means more cold starts, which delays data getting across (and affects the UX of the end application). Consolidating these into one function would be great for minimising that, but it would also put that function under greater load, as it becomes the single endpoint for n collections. The current sync code works well, but it might have to be refactored to run per collection in `child_process` or `worker_threads` to keep performance optimal.
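As a minimal sketch of the consolidation idea, assuming a single function receives change events for several collections (how that trigger would be wired up is left open), the function would first need to work out which sync config a changed document belongs to. The `SyncRoute` type and `findRoute` helper below are hypothetical names, and the `{wildcard}` matching is only an assumption about how nested collection paths could be expressed.

```typescript
// Hypothetical routing entry: which Typesense collection a Firestore path feeds.
interface SyncRoute {
  firestorePath: string;       // collection path, may contain {wildcard} segments
  typesenseCollection: string; // target Typesense collection name
}

// Find the route whose collection pattern matches a changed document's path.
function findRoute(docPath: string, routes: SyncRoute[]): SyncRoute | undefined {
  const docSegments = docPath.split("/");
  return routes.find((route) => {
    const pattern = route.firestorePath.split("/");
    // A document path is its collection path plus one trailing document ID.
    if (docSegments.length !== pattern.length + 1) return false;
    return pattern.every(
      (seg, i) => (seg.startsWith("{") && seg.endsWith("}")) || seg === docSegments[i],
    );
  });
}

const routes: SyncRoute[] = [
  { firestorePath: "users", typesenseCollection: "users" },
  { firestorePath: "orgs/{orgId}/projects", typesenseCollection: "projects" },
];

const userRoute = findRoute("users/u1", routes);
const projectRoute = findRoute("orgs/acme/projects/p1", routes);
const noRoute = findRoute("invoices/i1", routes);
```

The lookup itself is cheap, so under this assumption the per-collection upsert work is the part that might be worth handing off to a worker if load becomes a problem.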

Following the version 2 release of the extension, hopefully we can rework it for a v3 based on this feedback. 

Restructure to allow extension to handle 1-n collection sync #121
