Commit 792c14d

cabljac and pr-Mais authored
Merge pull request #2210 from firebase/next
* Update firestore-bigquery-export/extension.yaml
  Co-authored-by: Mais Alheraki <mais.alheraki@gmail.com>
* Update firestore-bigquery-export/extension.yaml
* chore: update wording of new param (#2185)
* feat(firestore-bigquery-changetracker): include .d.ts files (#2207)
* fix(firestore-bigquery-changetracker): include declaration files
* chore(firestore-bigquery-changetracker): bump version
* fix(firestore-bigquery-export): added ts-expect-error and TODOs in the import script
* feat(firestore-bigquery-export): prepare RC (#2206)
* chore(firestore-bigquery-changetracker): bump version
* fix(firestore-bigquery-export): added ts-expect-error and TODOs in the import script
* feat: try to immediately write to bq first
* chore: remove legacy backfill code
* feat: add max enqueue attempts param
* test: add flags to test, remove unused resource
* feat: add backup to gcs
* chore(firestore-bigquery-export): temporarily disable GCS
* chore: bump ext version
* fix(firstore-bigquery-export): comment out unused role for now and use logging
* fix(firestore-bigquery-export): implemented RC changes including logging keys
* chore(firestore-bigquery-export): update README and CHANGELOG
* chore(firestore-bigquery-export): update CHANGELOG
* chore(firestore-bigquery-export): update param description and README (#2209)

---------

Co-authored-by: Mais Alheraki <mais.alheraki@gmail.com>
2 parents 403e291 + 456fa13 commit 792c14d

23 files changed: +1179 -588 lines changed
_emulator/.firebaserc

Lines changed: 8 additions & 0 deletions
@@ -1,5 +1,13 @@
 {
   "projects": {
     "default": "demo-test"
+  },
+  "targets": {},
+  "etags": {
+    "dev-extensions-testing": {
+      "extensionInstances": {
+        "firestore-bigquery-export": "02acbd8b443b9635716d52d65758a78db1e51140191caecaaf60d932d314a62a"
+      }
+    }
   }
 }

firestore-bigquery-export/CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -1,3 +1,15 @@
+## Version 0.1.56
+
+feat - improve sync strategy by immediately writing to BQ, and using cloud tasks only as a last resort
+
+refactor - improve observability/logging of events
+
+chore - remove legacy backfill code
+
+fix - improved usage of the types from change tracker package
+
+feat - remove log failed exports param
+
 ## Version 0.1.55

 feat - log failed queued tasks
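The 0.1.56 entry describes a write-first strategy with Cloud Tasks as a last resort. A minimal sketch of that control flow follows, assuming injected helpers: `writeToBigQuery`, `enqueueTask`, `logFailure`, and the `Row` and `SyncDeps` types are hypothetical names for illustration and are not taken from the extension's source.

```typescript
// Hypothetical shapes: none of these names come from the extension itself.
type Row = { documentId: string; data: unknown };

interface SyncDeps {
  writeToBigQuery: (row: Row) => Promise<void>; // direct insert, tried first
  enqueueTask: (row: Row) => Promise<void>; // Cloud Tasks fallback
  logFailure: (row: Row, error: unknown) => void; // last-resort logging
  maxEnqueueAttempts: number; // cf. the MAX_ENQUEUE_ATTEMPTS param
}

type SyncOutcome = "written" | "enqueued" | "failed";

async function syncRow(row: Row, deps: SyncDeps): Promise<SyncOutcome> {
  // 1. Try the immediate write.
  try {
    await deps.writeToBigQuery(row);
    return "written";
  } catch {
    // Fall through to the queue-based fallback.
  }

  // 2. Fall back to enqueueing, bounded by maxEnqueueAttempts.
  let lastError: unknown;
  for (let attempt = 1; attempt <= deps.maxEnqueueAttempts; attempt++) {
    try {
      await deps.enqueueTask(row);
      return "enqueued";
    } catch (error) {
      lastError = error;
    }
  }

  // 3. Every attempt failed: log so the event can be traced later.
  deps.logFailure(row, lastError);
  return "failed";
}
```

Injecting the helpers keeps the sketch independent of any real BigQuery or Cloud Tasks client; the actual extension may structure retries and logging differently.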

firestore-bigquery-export/README.md

Lines changed: 2 additions & 6 deletions
@@ -126,8 +126,6 @@ To install an extension, your project must be on the [Blaze (pay as you go) plan

 * Collection path: What is the path of the collection that you would like to export? You may use `{wildcard}` notation to match a subcollection of all documents in a collection (for example: `chatrooms/{chatid}/posts`). Parent Firestore Document IDs from `{wildcards}` can be returned in `path_params` as a JSON formatted string.

-* Enable logging failed exports: If enabled, the extension will log event exports that failed to enqueue to Cloud Logging, to mitigate data loss.
-
 * Enable Wildcard Column field with Parent Firestore Document IDs: If enabled, creates a column containing a JSON object of all wildcard ids from a documents path.

 * Dataset ID: What ID would you like to use for your BigQuery dataset? This extension will create the dataset, if it doesn't already exist.
@@ -158,18 +156,16 @@ essential for the script to insert data into an already partitioned table.)

 * Exclude old data payloads: If enabled, table rows will never contain old data (document snapshot before the Firestore onDocumentUpdate event: `change.before.data()`). The reduction in data should be more performant, and avoid potential resource limitations.

-* Use Collection Group query: Do you want to use a [collection group](https://firebase.google.com/docs/firestore/query-data/queries#collection-group-query) query for importing existing documents? You have to enable collectionGroup query if your import path contains subcollections. Warning: A collectionGroup query will target every collection in your Firestore project that matches the 'Existing documents collection'. For example, if you have 10,000 documents with a subcollection named: landmarks, this will query every document in 10,000 landmarks collections.
-
 * Cloud KMS key name: Instead of Google managing the key encryption keys that protect your data, you control and manage key encryption keys in Cloud KMS. If this parameter is set, the extension will specify the KMS key name when creating the BQ table. See the PREINSTALL.md for more details.

+* Maximum number of enqueue attempts: This parameter will set the maximum number of attempts to enqueue a document to cloud tasks for export to BigQuery.
+


 **Cloud Functions:**

 * **fsexportbigquery:** Listens for document changes in your specified Cloud Firestore collection, then exports the changes into BigQuery.

-* **fsimportexistingdocs:** Imports existing documents from the specified collection into BigQuery. Imported documents will have a special changelog with the operation of `IMPORT` and the timestamp of epoch.
-
 * **syncBigQuery:** A task-triggered function that gets called on BigQuery sync

 * **initBigQuerySync:** Runs configuration for sycning with BigQuery

firestore-bigquery-export/extension.yaml

Lines changed: 11 additions & 95 deletions
@@ -13,7 +13,7 @@
 # limitations under the License.

 name: firestore-bigquery-export
-version: 0.1.55
+version: 0.1.56
 specVersion: v1beta

 displayName: Stream Firestore to BigQuery
@@ -60,19 +60,6 @@ resources:
         eventType: providers/cloud.firestore/eventTypes/document.write
         resource: projects/${param:PROJECT_ID}/databases/(default)/documents/${param:COLLECTION_PATH}/{documentId}

-  - name: fsimportexistingdocs
-    type: firebaseextensions.v1beta.function
-    description:
-      Imports existing documents from the specified collection into BigQuery.
-      Imported documents will have a special changelog with the operation of
-      `IMPORT` and the timestamp of epoch.
-    properties:
-      runtime: nodejs18
-      taskQueueTrigger:
-        retryConfig:
-          maxAttempts: 15
-          minBackoffSeconds: 60
-
   - name: syncBigQuery
     type: firebaseextensions.v1beta.function
     description: >-
@@ -206,19 +193,6 @@ params:
     default: posts
     required: true

-  - param: LOG_FAILED_EXPORTS
-    label: Enable logging failed exports
-    description: >-
-      If enabled, the extension will log event exports that failed to enqueue to
-      Cloud Logging, to mitigate data loss.
-    type: select
-    options:
-      - label: Yes
-        value: yes
-      - label: No
-        value: no
-    required: true
-
   - param: WILDCARD_IDS
     label: Enable Wildcard Column field with Parent Firestore Document IDs
     description: >-
@@ -409,74 +383,6 @@ params:
       - label: No
         value: no

-  # - param: DO_BACKFILL
-  #   label: Import existing Firestore documents into BigQuery?
-  #   description: >-
-  #     Do you want to import existing documents from your Firestore collection
-  #     into BigQuery? These documents will have each have a special changelog
-  #     with the operation of `IMPORT` and the timestamp of epoch. This ensures
-  #     that any operation on an imported document supersedes the import record.
-  #   type: select
-  #   required: true
-  #   default: no
-  #   options:
-  #     - label: Yes
-  #       value: yes
-  #     - label: No
-  #       value: no
-
-  # - param: IMPORT_COLLECTION_PATH
-  #   label: Existing Documents Collection
-  #   description: >-
-  #     Specify the path of the Cloud Firestore Collection you would like to
-  #     import from. This may or may not be the same Collection for which you plan
-  #     to mirror changes. If you want to use a collectionGroup query, provide the
-  #     collection name value here, and set 'Use Collection Group query' to true.
-  #     You may use `{wildcard}` notation with an enabled collectionGroup query to
-  #     match a subcollection of all documents in a collection (e.g.,
-  #     `chatrooms/{chatid}/posts`).
-  #   type: string
-  #   validationRegex: "^[^/]+(/[^/]+/[^/]+)*$"
-  #   validationErrorMessage:
-  #     Firestore collection paths must be an odd number of segments separated by
-  #     slashes, e.g. "path/to/collection".
-  #   example: posts
-  #   required: false
-
-  - param: USE_COLLECTION_GROUP_QUERY
-    label: Use Collection Group query
-    description: >-
-      Do you want to use a [collection
-      group](https://firebase.google.com/docs/firestore/query-data/queries#collection-group-query)
-      query for importing existing documents? You have to enable collectionGroup
-      query if your import path contains subcollections. Warning: A
-      collectionGroup query will target every collection in your Firestore
-      project that matches the 'Existing documents collection'. For example, if
-      you have 10,000 documents with a subcollection named: landmarks, this will
-      query every document in 10,000 landmarks collections.
-    type: select
-    default: no
-    options:
-      - label: Yes
-        value: yes
-      - label: No
-        value: no
-
-  # - param: DOCS_PER_BACKFILL
-  #   label: Docs per backfill
-  #   description: >-
-  #     When importing existing documents, how many should be imported at once?
-  #     The default value of 200 should be ok for most users. If you are using a
-  #     transform function or have very large documents, you may need to set this
-  #     to a lower number. If the lifecycle event function times out, lower this
-  #     value.
-  #   type: string
-  #   example: 200
-  #   validationRegex: "^[1-9][0-9]*$"
-  #   validationErrorMessage: Must be a postive integer.
-  #   default: 200
-  #   required: true
-
   - param: KMS_KEY_NAME
     label: Cloud KMS key name
     description: >-
@@ -491,6 +397,16 @@ params:
       'projects/PROJECT_NAME/locations/KEY_RING_LOCATION/keyRings/KEY_RING_ID/cryptoKeys/KEY_ID'.
     required: false

+  - param: MAX_ENQUEUE_ATTEMPTS
+    label: Maximum number of enqueue attempts
+    description: >-
+      This parameter will set the maximum number of attempts to enqueue a
+      document to cloud tasks for export to BigQuery.
+    type: string
+    validationRegex: ^(10|[1-9])$
+    validationErrorMessage: Please select an integer between 1 and 10
+    default: 3
+
 events:
   - type: firebase.extensions.firestore-counter.v1.onStart
     description:
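The new `MAX_ENQUEUE_ATTEMPTS` param is a string constrained by `^(10|[1-9])$` with a default of `3`. The sketch below shows one way such a value could be validated and converted to a number. The function name `parseMaxEnqueueAttempts` is hypothetical, and treating an empty string as "unset" is an assumption; the extension's own config loader may behave differently.

```typescript
// Regex, error message, and default are copied from the param definition above.
const ENQUEUE_ATTEMPTS_PATTERN = /^(10|[1-9])$/;
const DEFAULT_MAX_ENQUEUE_ATTEMPTS = 3;

// Hypothetical helper: not part of the extension's source.
function parseMaxEnqueueAttempts(raw: string | undefined): number {
  // Assumption: a missing or empty value falls back to the default.
  if (raw === undefined || raw === "") {
    return DEFAULT_MAX_ENQUEUE_ATTEMPTS;
  }
  // The pattern admits exactly the integers 1 through 10, with no leading zeros.
  if (!ENQUEUE_ATTEMPTS_PATTERN.test(raw)) {
    throw new Error("Please select an integer between 1 and 10");
  }
  return parseInt(raw, 10);
}
```

The alternation lists `10` before `[1-9]`, but because the pattern is anchored at both ends the order does not change which strings match: `"10"` and `"1"` through `"9"` are accepted, while `"0"`, `"11"`, and `"01"` are rejected.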

firestore-bigquery-export/firestore-bigquery-change-tracker/package.json

Lines changed: 4 additions & 2 deletions
@@ -5,7 +5,7 @@
     "url": "github.com/firebase/extensions.git",
     "directory": "firestore-bigquery-export/firestore-bigquery-change-tracker"
   },
-  "version": "1.1.37",
+  "version": "1.1.38",
   "description": "Core change-tracker library for Cloud Firestore Collection BigQuery Exports",
   "main": "./lib/index.js",
   "scripts": {
@@ -18,7 +18,9 @@
   },
   "files": [
     "lib/*.js",
-    "lib/bigquery/*.js"
+    "lib/bigquery/*.js",
+    "lib/*.d.ts",
+    "lib/bigquery/*.d.ts"
   ],
   "author": "Jan Wyszynski <wyszynski@google.com>",
   "license": "Apache-2.0",

firestore-bigquery-export/firestore-bigquery-change-tracker/tsconfig.json

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@
     "outDir": "lib",
     "types": ["node"],
     "target": "ES2020",
-    "skipLibCheck": true
+    "skipLibCheck": true,
+    "declaration": true
   },
   "include": ["src"],
   "exclude": ["src/**/*.test.ts"]

firestore-bigquery-export/functions/__tests__/__snapshots__/config.test.ts.snap

Lines changed: 4 additions & 4 deletions
@@ -2,7 +2,10 @@

 exports[`extension config config loaded from environment variables 1`] = `
 Object {
+  "backupBucketName": "undefined.appspot.com",
   "backupCollectionId": undefined,
+  "backupDir": "_firestore-bigquery-export",
+  "backupToGCS": false,
   "bqProjectId": undefined,
   "clustering": Array [
     "data",
@@ -12,23 +15,20 @@ Object {
   "databaseId": "(default)",
   "datasetId": "my_dataset",
   "datasetLocation": undefined,
-  "doBackfill": false,
-  "docsPerBackfill": 200,
   "excludeOldData": false,
   "importCollectionPath": undefined,
   "initialized": false,
   "instanceId": undefined,
   "kmsKeyName": "test",
   "location": "us-central1",
-  "logFailedExportData": false,
   "maxDispatchesPerSecond": 10,
+  "maxEnqueueAttempts": 3,
   "tableId": "my_table",
   "timePartitioning": null,
   "timePartitioningField": undefined,
   "timePartitioningFieldType": undefined,
   "timePartitioningFirestoreField": undefined,
   "transformFunction": "",
-  "useCollectionGroupQuery": false,
   "useNewSnapshotQuerySyntax": false,
   "wildcardIds": false,
 }

firestore-bigquery-export/functions/__tests__/e2e.test.ts

Lines changed: 4 additions & 4 deletions
@@ -2,9 +2,9 @@ import * as admin from "firebase-admin";
 import { BigQuery } from "@google-cloud/bigquery";

 /** Set defaults */
-const bqProjectId = "dev-extensions-testing";
-const datasetId = "firestore_export";
-const tableId = "bq_e2e_test_raw_changelog";
+const bqProjectId = process.env.BQ_PROJECT_ID || "dev-extensions-testing";
+const datasetId = process.env.DATASET_ID || "firestore_export";
+const tableId = process.env.TABLE_ID || "bq_e2e_test_raw_changelog";

 /** Init resources */
 admin.initializeApp({ projectId: bqProjectId });
@@ -34,7 +34,7 @@ describe("e2e", () => {

     /** Get the latest record from this table */
     const [changeLogQuery] = await bq.createQueryJob({
-      query: `SELECT * FROM \`${bqProjectId}.${datasetId}.${tableId}\` ORDER BY timestamp DESC \ LIMIT 1`,
+      query: `SELECT * FROM \`${bqProjectId}.${datasetId}.${tableId}\` ORDER BY timestamp DESC LIMIT 1`,
     });

     const [rows] = await changeLogQuery.getQueryResults();
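The e2e test now reads its targets from the environment and falls back to the previous hard-coded values. A small sketch of that fallback follows; `resolveE2eSettings` is a hypothetical helper written for illustration (the test itself reads `process.env` inline), while the variable names and defaults are copied from the hunk above.

```typescript
interface E2eSettings {
  bqProjectId: string;
  datasetId: string;
  tableId: string;
}

// Hypothetical helper: takes the environment as an argument so it can be
// exercised without touching the real process environment.
function resolveE2eSettings(
  env: Record<string, string | undefined>
): E2eSettings {
  return {
    // `||` falls back on any falsy value, so an empty string counts as unset.
    bqProjectId: env.BQ_PROJECT_ID || "dev-extensions-testing",
    datasetId: env.DATASET_ID || "firestore_export",
    tableId: env.TABLE_ID || "bq_e2e_test_raw_changelog",
  };
}
```

In a Node test file this could be called as `resolveE2eSettings(process.env)`. Using `||` rather than `??` means that exporting `DATASET_ID=` with no value still selects the default, which is usually the intended behaviour for shell-provided settings.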

firestore-bigquery-export/functions/__tests__/functions.test.ts

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ jest.mock("firebase-admin/functions", () => ({
 }));

 jest.mock("../src/logs", () => ({
+  ...jest.requireActual("../src/logs"),
   start: jest.fn(() =>
     logger.log("Started execution of extension with configuration", config)
   ),
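Spreading `jest.requireActual("../src/logs")` first turns the factory into a partial mock: every real export is kept and only the keys listed afterwards are replaced. The sketch below illustrates the spread-then-override semantics with plain objects rather than Jest itself; `actualLogs` and its `complete` export are hypothetical stand-ins for the real module.

```typescript
// Hypothetical stand-in for what jest.requireActual("../src/logs") returns.
const actualLogs = {
  start: (): string => "real start",
  complete: (): string => "real complete",
};

// Same shape as the mock factory in the hunk above: spread first, override after.
const mockedLogs = {
  ...actualLogs,
  start: (): string => "mocked start",
};

// Reversing the order would let the real implementation win instead.
const overrideLost = {
  start: (): string => "mocked start",
  ...actualLogs,
};
```

Here `mockedLogs.start()` returns the mocked value while `mockedLogs.complete()` still comes from the real object, which is why code under test can keep calling log functions the mock factory never mentions.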

firestore-bigquery-export/functions/package-lock.json

Lines changed: 11 additions & 10 deletions
Generated file; diff not rendered by default.
