### Overview

The import script (`fs-bq-import-collection`) can read all existing documents in a Cloud Firestore collection and insert them into the raw changelog table created by the Export Collections to BigQuery extension. The script adds a special changelog for each document with the operation of `IMPORT` and the timestamp of epoch. This ensures that any operation on an imported document supersedes the import record.

You may pause and resume the script from the last batch at any point.
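
The epoch timestamp is what makes the superseding behavior work: any query that keeps only the newest changelog row per document will prefer a real change over the import record. The following is a minimal sketch of such a query using the `bq` CLI, assuming it is installed and authenticated; `my-project`, `my_dataset`, and `users` are hypothetical stand-ins for your project ID, dataset ID, and collection path, and the column names are those of the extension's raw changelog table.

```bash
# Minimal sketch (hypothetical names): for each document, keep only the most
# recent changelog row. IMPORT rows carry the epoch timestamp, so any later
# CREATE/UPDATE/DELETE recorded by the extension sorts after them and wins.
bq query --use_legacy_sql=false '
  SELECT
    document_name,
    ARRAY_AGG(operation ORDER BY timestamp DESC LIMIT 1)[OFFSET(0)] AS latest_operation
  FROM `my-project.my_dataset.users_raw_changelog`
  GROUP BY document_name'
```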

#### Important notes

+ Run the script over the entire collection **_after_** installing the Export Collections to BigQuery extension; otherwise the writes to your database during the import might not be exported to the dataset.
+ The import script can take up to _O(collection size)_ time to finish. If your collection is large, you might want to consider [loading data from a Cloud Firestore export into BigQuery](https://cloud.google.com/bigquery/docs/loading-data-cloud-firestore) instead (see the sketch after these notes).
+ You will see redundant rows in your raw changelog table:

  + If document changes occur in the time between installing the extension and running this import script.
  + If you run the import script multiple times over the same collection.
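
For very large collections, that export-based path looks roughly like the sketch below. The bucket, dataset, table, and collection names are placeholders, `EXPORT_PREFIX` stands for the timestamped folder the export creates, and the authoritative steps are in the linked BigQuery documentation. Note that this loads a snapshot of the documents themselves rather than rows in the extension's changelog table.

```bash
# Rough sketch of the export-based alternative (all names are placeholders).
# 1. Export the collection with the managed Cloud Firestore export service.
gcloud firestore export gs://my-bucket --collection-ids=users

# 2. Load the resulting export metadata file into a BigQuery table.
bq load --source_format=DATASTORE_BACKUP \
  my_dataset.users \
  "gs://my-bucket/EXPORT_PREFIX/all_namespaces/kind_users/all_namespaces_kind_users.export_metadata"
```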

### Install and run the script

This import script uses several values from your installation of the extension:

+ `${PROJECT_ID}`: the project ID for the Firebase project in which you installed the extension
+ `${COLLECTION_PATH}`: the collection path that you specified during extension installation
+ `${DATASET_ID}`: the ID that you specified for your dataset during extension installation
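
The command-line verification sketch at the end of this guide assumes these values are exported as shell variables. The values shown below are examples only; substitute the values from your own installation.

```bash
# Example values only; replace them with the values from your installation.
export PROJECT_ID=my-firebase-project
export COLLECTION_PATH=users
export DATASET_ID=firestore_export
```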

1. Run `npx @firebaseextensions/fs-bq-import-collection`.

1. When prompted, enter the Cloud Firestore collection path that you specified during extension installation, `${COLLECTION_PATH}`.

1. _(Optional)_ You can pause and resume the import at any time:

    + **Pause the import:** press `CTRL+C`.
      The import script records the name of the last successfully imported document in a cursor file called
      `from-${COLLECTION_PATH}-to-${PROJECT_ID}:${DATASET_ID}:${rawChangeLogName}`,
      which lives in the directory from which you invoked the import script.

    + **Resume the import from where you left off:** re-run `npx @firebaseextensions/fs-bq-import-collection`
      _from the same directory that you previously invoked the script_.

    Note that when an import completes successfully, the import script automatically cleans up the cursor file it was using to keep track of its progress.

1. In the [BigQuery web UI](https://console.cloud.google.com/bigquery), navigate to the dataset created by the extension. The extension named your dataset using the Dataset ID that you specified during extension installation, `${DATASET_ID}`.

1. From your raw changelog table, run the following query:

    ```
    SELECT COUNT(*) FROM
      `${PROJECT_ID}.${DATASET_ID}.${COLLECTION_PATH}_raw_changelog`
    WHERE operation = "IMPORT"
    ```

    The result set will contain the number of documents in your source collection. A command-line alternative using the `bq` CLI is sketched after these steps.
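
If you prefer the command line to the BigQuery web UI, the following is a rough equivalent of the last two steps using the `bq` CLI. It assumes the CLI is installed and authenticated and that `PROJECT_ID`, `DATASET_ID`, and `COLLECTION_PATH` are exported as shell variables as shown earlier.

```bash
# List the tables the extension created in your dataset.
bq ls "${PROJECT_ID}:${DATASET_ID}"

# Count the import records in the raw changelog table.
bq query --use_legacy_sql=false \
  "SELECT COUNT(*) FROM \`${PROJECT_ID}.${DATASET_ID}.${COLLECTION_PATH}_raw_changelog\` WHERE operation = 'IMPORT'"
```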