firestore-bigquery-export/POSTINSTALL.md (+49 -12)
@@ -4,30 +4,30 @@ You can test out this extension right away!

 1. Go to your [Cloud Firestore dashboard](https://console.firebase.google.com/project/${param:BIGQUERY_PROJECT_ID}/firestore/data) in the Firebase console.

-1. If it doesn't already exist, create the collection you specified during installation: `${param:COLLECTION_PATH}`
+2. If it doesn't already exist, create the collection you specified during installation: `${param:COLLECTION_PATH}`

-1. Create a document in the collection called `bigquery-mirror-test` that contains any fields with any values that you'd like.
+3. Create a document in the collection called `bigquery-mirror-test` that contains any fields with any values that you'd like.

-1. Go to the [BigQuery web UI](https://console.cloud.google.com/bigquery?project=${param:BIGQUERY_PROJECT_ID}&p=${param:BIGQUERY_PROJECT_ID}&d=${param:DATASET_ID}) in the Google Cloud Platform console.
+4. Go to the [BigQuery web UI](https://console.cloud.google.com/bigquery?project=${param:BIGQUERY_PROJECT_ID}&p=${param:BIGQUERY_PROJECT_ID}&d=${param:DATASET_ID}) in the Google Cloud Platform console.

-1. Query your **raw changelog table**, which should contain a single log of creating the `bigquery-mirror-test` document.
+5. Query your **raw changelog table**, which should contain a single log of creating the `bigquery-mirror-test` document.

 ```
 SELECT *
 FROM `${param:BIGQUERY_PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_changelog`
 ```

-1. Query your **latest view**, which should return the latest change event for the only document present -- `bigquery-mirror-test`.
+6. Query your **latest view**, which should return the latest change event for the only document present -- `bigquery-mirror-test`.

 ```
 SELECT *
 FROM `${param:BIGQUERY_PROJECT_ID}.${param:DATASET_ID}.${param:TABLE_ID}_raw_latest`
 ```

-1. Delete the `bigquery-mirror-test` document from [Cloud Firestore](https://console.firebase.google.com/project/${param:BIGQUERY_PROJECT_ID}/firestore/data).
+7. Delete the `bigquery-mirror-test` document from [Cloud Firestore](https://console.firebase.google.com/project/${param:BIGQUERY_PROJECT_ID}/firestore/data).
 The `bigquery-mirror-test` document will disappear from the **latest view** and a `DELETE` event will be added to the **raw changelog table**.

-1. You can check the changelogs of a single document with this query:
+8. You can check the changelogs of a single document with this query:

 ```
 SELECT *
@@ -54,13 +54,50 @@ Enabling wildcard references will provide an additional STRING based column. The

 When adding `Clustering` options, you do not need to create or modify a table; the table is updated automatically.

-### Configuring Cross-Platform BigQuery Setup
+#### Cross-project Streaming

-When defining a specific BigQuery project ID, a manual step to set up permissions is required:
+By default, the extension exports data to BigQuery in the same project as your Firebase project. However, you can configure it to export to a BigQuery instance in a different Google Cloud project. To do this:

-1. Navigate to https://console.cloud.google.com/iam-admin/iam?project=${param:BIGQUERY_PROJECT_ID}
-2. Add the **BigQuery Data Editor** role to the following service account:
+1. During installation, set the `BIGQUERY_PROJECT_ID` parameter to your target BigQuery project ID.
+
+2. Identify the service account on the source project associated with the extension. By default, it has the form `ext-<extension-instance-id>@project-id.iam.gserviceaccount.com`. However, if the extension instance ID is too long, the ID may be truncated and four random characters appended to stay within service account length limits.
+
+3. To find the exact service account, navigate to IAM & Admin -> IAM in the Google Cloud Platform Console. Look for the service account whose "Name" is "Firebase Extensions <your extension instance ID> service account". The value in the "Principal" column is the service account that needs permissions granted in the target project.
+
+4. Grant the extension's service account the necessary BigQuery permissions on the target project. You can use our provided scripts:
+- `-i`: (Optional) Extension instance ID if different from default "firestore-bigquery-export"
+- `-s`: (Optional) Service account email. If not provided, it will be constructed using the extension instance ID
+
+For the PowerShell script:
+- `-FirebaseProject`: Your Firebase (source) project ID
+- `-BigQueryProject`: Your target BigQuery project ID
+- `-ExtensionInstanceId`: (Optional) Extension instance ID if different from default "firestore-bigquery-export"
+- `-ServiceAccount`: (Optional) Service account email. If not provided, it will be constructed using the extension instance ID
+
+**Prerequisites:**
+- You must have the [gcloud CLI](https://cloud.google.com/sdk/docs/install) installed and configured
+- You must have permission to grant IAM roles on the target BigQuery project
+- The extension must be installed before running the script
+
+**Note:** If the extension initially fails to create a dataset on the target project because of missing permissions, don't worry. The extension automatically retries once you've granted the necessary permissions using these scripts.
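
The default service account form described in step 2 can be sketched as a small helper. This is only an illustration, not part of the extension or its scripts: the function name is hypothetical, and the 30-character limit on the service account ID (the part before `@`) is an assumption about the "length limits" the text refers to. Because the truncated form includes random characters, the sketch refuses to guess and points back to the IAM page lookup from step 3.

```python
# Assumed limit on the service account ID (the part before "@").
MAX_ACCOUNT_ID_LENGTH = 30


def default_service_account(instance_id: str, project_id: str) -> str:
    """Build the default extension service account email (hypothetical helper).

    Uses the ext-<extension-instance-id>@<project-id>.iam.gserviceaccount.com
    form. Raises ValueError when the ID would exceed the assumed limit,
    since the truncated ID gets random characters and cannot be predicted.
    """
    account_id = f"ext-{instance_id}"
    if len(account_id) > MAX_ACCOUNT_ID_LENGTH:
        raise ValueError(
            f"'{account_id}' is longer than {MAX_ACCOUNT_ID_LENGTH} characters; "
            "look up the exact service account under IAM & Admin -> IAM"
        )
    return f"{account_id}@{project_id}.iam.gserviceaccount.com"


if __name__ == "__main__":
    # "my-firebase-project" is a placeholder project ID.
    print(default_service_account("firestore-bigquery-export", "my-firebase-project"))
```

If the helper raises, pass the exact email from the "Principal" column to the script's service account option instead of relying on the constructed default.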
firestore-bigquery-export/PREINSTALL.md (+133)
@@ -69,6 +69,81 @@ Prior to sending the document change to BigQuery, you have an opportunity to tra

 The response should be identical in structure.

+#### Materialized Views
+
+This extension supports both regular views and materialized views in BigQuery. While regular views compute their results each time they're queried, materialized views store their query results, providing faster access at the cost of additional storage.
+
+There are two types of materialized views available:
+
+1. **Non-incremental Materialized Views**: These views support more complex queries including filtering on aggregated fields, but require complete recomputation during refresh.
+
+2. **Incremental Materialized Views**: These views update more efficiently by processing only new or changed records, but come with query restrictions. Most notably, they don't allow filtering or partitioning on aggregated fields in their defining SQL, among other limitations.
+
+**Important Considerations:**
+- Neither type of materialized view in this extension currently supports partitioning or clustering
+- Both types allow you to configure refresh intervals and maximum staleness settings during extension installation or configuration
+- Once created, a materialized view's SQL definition cannot be modified. If you reconfigure the extension to change either the view type (incremental vs non-incremental) or the SQL query, the extension will drop the existing materialized view and recreate it
+- Carefully consider your use case before choosing materialized views:
+  - They incur additional storage costs as they cache query results
+  - Non-incremental views may have higher processing costs during refresh
+  - Incremental views have more query restrictions but are more efficient to update
+
+Example of a non-incremental materialized view SQL definition generated by the extension:
+Please review [BigQuery's documentation on materialized views](https://cloud.google.com/bigquery/docs/materialized-views-intro) to fully understand the implications for your use case.
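
To make the refresh trade-off between the two view types concrete, here is a conceptual sketch in plain Python. It is not how BigQuery implements refresh and it is not the extension's SQL; the row shape `(document_name, timestamp)` and both function names are assumptions chosen for illustration. The "full" version rebuilds the stored result from every row, while the "incremental" version only touches rows that arrived since the last refresh.

```python
from typing import Dict, Iterable, Tuple

# Illustrative row shape: (document_name, timestamp). Not the real schema.
Row = Tuple[str, int]


def full_refresh(all_rows: Iterable[Row]) -> Dict[str, int]:
    """Non-incremental style: recompute the stored result from every row."""
    latest: Dict[str, int] = {}
    for name, ts in all_rows:
        latest[name] = max(ts, latest.get(name, ts))
    return latest


def incremental_refresh(stored: Dict[str, int], new_rows: Iterable[Row]) -> Dict[str, int]:
    """Incremental style: update the stored result using only the new rows."""
    updated = dict(stored)  # leave the previous result untouched
    for name, ts in new_rows:
        updated[name] = max(ts, updated.get(name, ts))
    return updated


if __name__ == "__main__":
    base = [("docs/a", 1), ("docs/b", 2)]
    new = [("docs/a", 3), ("docs/c", 1)]
    # Both approaches agree; the incremental one reads far fewer rows.
    print(full_refresh(base + new) == incremental_refresh(full_refresh(base), new))
```

The incremental version works here because a per-document maximum can be updated from new rows alone; queries without that property are the kind that fall under the restrictions listed above.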
+
 #### Using Customer Managed Encryption Keys

 By default, BigQuery encrypts your content stored at rest. BigQuery handles and manages this default encryption for you without any additional actions on your part.
@@ -100,6 +175,64 @@ If you follow these steps, your changelog table should be created using your cus

 After your data is in BigQuery, you can run the [schema-views script](https://github.com/firebase/extensions/blob/master/firestore-bigquery-export/guides/GENERATE_SCHEMA_VIEWS.md) (provided by this extension) to create views that make it easier to query relevant data. You only need to provide a JSON schema file that describes your data structure, and the schema-views script will create the views.

+#### Cross-project Streaming
+
+By default, the extension exports data to BigQuery in the same project as your Firebase project. However, you can configure it to export to a BigQuery instance in a different Google Cloud project. To do this:
+
+1. During installation, set the `BIGQUERY_PROJECT_ID` parameter to your target BigQuery project ID.
+
+2. After installation, you'll need to grant the extension's service account the necessary BigQuery permissions on the target project. You can use our provided scripts:
+- `-i`: (Optional) Extension instance ID if different from default "firestore-bigquery-export"
+
+For the PowerShell script:
+- `-FirebaseProject`: Your Firebase (source) project ID
+- `-BigQueryProject`: Your target BigQuery project ID
+- `-ExtensionInstanceId`: (Optional) Extension instance ID if different from default "firestore-bigquery-export"
+
+**Prerequisites:**
+- You must have the [gcloud CLI](https://cloud.google.com/sdk/docs/install) installed and configured
+- You must have permission to grant IAM roles on the target BigQuery project
+- The extension must be installed before running the script
+
+**Note:** If the extension initially fails to create a dataset on the target project because of missing permissions, don't worry. The extension automatically retries once you've granted the necessary permissions using these scripts.
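
For readers who want to see what a manual grant might look like, the sketch below assembles a `gcloud projects add-iam-policy-binding` command as an argument list. It is not the provided script, and the role is an assumption: `roles/bigquery.dataEditor` (BigQuery Data Editor) is used as a plausible default, but the provided scripts may grant a different set of roles. The function name and the example project and account values are hypothetical.

```python
from typing import List


def grant_role_command(
    bigquery_project: str,
    service_account: str,
    role: str = "roles/bigquery.dataEditor",  # assumed role, see note above
) -> List[str]:
    """Build a gcloud command that grants `role` on the target project."""
    return [
        "gcloud",
        "projects",
        "add-iam-policy-binding",
        bigquery_project,
        f"--member=serviceAccount:{service_account}",
        f"--role={role}",
    ]


if __name__ == "__main__":
    cmd = grant_role_command(
        "my-bigquery-project",  # placeholder target project
        "ext-firestore-bigquery-export@my-firebase-project.iam.gserviceaccount.com",
    )
    # Print instead of running; pass the list to subprocess.run(cmd, check=True)
    # only after confirming the project, account, and role are correct.
    print(" ".join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting issues if it is later executed with `subprocess.run`.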
+
+#### Mitigating Data Loss During Extension Updates
+
+When updating or reconfiguring this extension, there may be a brief period where data streaming from Firestore to BigQuery is interrupted. While this limitation exists within the Extensions platform, we provide two strategies to mitigate potential data loss.
+
+##### Strategy 1: Post-Update Import
+After reconfiguring the extension, run the import script on your collection to ensure all data is captured. Refer to the "Import Existing Documents" section above for detailed steps.
+
+##### Strategy 2: Parallel Instance Method
+1. Install a second instance of the extension that streams to a new BigQuery table
+2. Reconfigure the original extension
+3. Wait until the original extension is properly configured and streaming events
+4. Uninstall the second instance
+5. Run a BigQuery merge job to combine the data from both tables
+
+##### Considerations
+- Strategy 1 is simpler but may result in duplicate records that need to be deduplicated
+- Strategy 2 requires more setup but provides better data continuity
+- Choose the strategy that best aligns with your data consistency requirements and operational constraints
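
Both strategies can leave the same change recorded more than once, either from the import or from merging two tables. The following is a minimal sketch of the deduplication idea in Python, not a BigQuery job. The field names `document_name`, `timestamp`, and `operation` are assumptions for illustration; check your actual changelog schema and decide which fields identify a duplicate in your data (imported rows, for example, may not match streamed rows on every field).

```python
from typing import Any, Dict, Iterable, List, Sequence

# Hypothetical key: two rows with the same values here count as duplicates.
DEFAULT_KEY_FIELDS = ("document_name", "timestamp", "operation")


def dedupe_changelog(
    rows: Iterable[Dict[str, Any]],
    key_fields: Sequence[str] = DEFAULT_KEY_FIELDS,
) -> List[Dict[str, Any]]:
    """Keep the first row seen for each key, preserving input order."""
    seen = set()
    unique: List[Dict[str, Any]] = []
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


if __name__ == "__main__":
    original = [{"document_name": "docs/a", "timestamp": 1, "operation": "CREATE"}]
    parallel = [
        {"document_name": "docs/a", "timestamp": 1, "operation": "CREATE"},
        {"document_name": "docs/a", "timestamp": 2, "operation": "UPDATE"},
    ]
    # Concatenating both tables and deduplicating approximates the merge step.
    print(len(dedupe_changelog(original + parallel)))
```

In BigQuery itself the same idea would be expressed in SQL over the two tables; the Python version only shows which rows survive.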
+
 #### Billing
 To install an extension, your project must be on the [Blaze (pay as you go) plan](https://firebase.google.com/pricing)