[DENG-8241] Remove doc references to AI notebooks #865

Merged: 1 commit, Apr 9, 2025
29 changes: 1 addition & 28 deletions src/cookbooks/bigquery/access.md
@@ -127,35 +127,8 @@ python -c 'from google.cloud import bigquery; print([d.dataset_id for d in bigqu

Colab can be used to easily access BigQuery and perform analyses. See the [`Telemetry Hello World` notebook](https://colab.research.google.com/drive/1uXmrPnqzDATiCVH2RNJKD8obIZuofFHx) for an interactive example. Under the hood, it uses the BigQuery API to read and write to BigQuery tables, so access needs to be explicitly provisioned.

### AI Platform Notebooks

[AI Platform Notebooks](https://cloud.google.com/ai-platform/notebooks/docs) is a managed JupyterLab service running on GCP. It gives you full control over the machine where your notebooks are running - you can install your own libraries and choose machine size depending on your needs.

To start, go to the [GCP console](https://console.cloud.google.com) and make sure you are in the correct project - most likely this will be your team project. Then navigate to the Notebooks page in the sidebar under AI Platform > Notebooks ([direct link](https://console.cloud.google.com/ai-platform/notebooks/list/instances)). There you can create new notebook server instances and connect to them (when your instance is ready, you'll see an `Open JupyterLab` button).

Please note that by default JupyterLab saves notebook files only locally, so they are lost if your instance is deleted. To make sure you don't lose your work, either push your files to a Git repository (via a pre-installed Git extension) or upload them to GCS (using `gsutil` command in a terminal session).
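
As a rough illustration of the GCS option, the sketch below only builds a `gsutil cp -r` command for copying a notebook directory to a bucket. The function name, the bucket `my-team-bucket`, and the `notebooks` prefix are hypothetical placeholders, not names from these docs.

```python
import shlex


def gsutil_backup_command(local_dir, bucket, prefix="notebooks"):
    """Build (but do not run) a gsutil command that copies local_dir to GCS.

    The bucket and prefix are assumptions for illustration only.
    """
    destination = f"gs://{bucket}/{prefix}/"
    # -r copies the directory recursively.
    return ["gsutil", "cp", "-r", local_dir, destination]


cmd = gsutil_backup_command("work", "my-team-bucket")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from a notebook, or the printed string can be pasted into a terminal session.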

#### Notebooks Access to workgroup-confidential Datasets

If you are a member of a restricted access workgroup, you can provision AI notebooks in the [`mozdata GCP project`](https://console.cloud.google.com/vertex-ai/workbench/list/instances?project=mozdata&supportedpurview=project) that can read workgroup-confidential data.

> **⚠** You must provision AI notebooks in `mozdata` using a nonstandard service account specific to your workgroup, see below.

When you create a notebook server, under "Advanced Options" / "Permissions", deselect "Use Compute Engine Default Service Account" and replace it with the service account associated with your workgroup. You may need to type this service account manually as it will not be available from a drop-down menu to all users. The ID of the service account matches the following pattern:

`WORKGROUP-SUBGROUP@mozdata.iam.gserviceaccount.com`

For example, if you are a member of `workgroup:search-terms/aggregated`, use `search-terms-aggregated@mozdata.iam.gserviceaccount.com`.
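
The naming pattern can be sketched as a small helper. This is only an illustration of the `WORKGROUP-SUBGROUP` pattern described above; the function name is hypothetical and the parsing of the `workgroup:` prefix is an assumption based on the single example given.

```python
def workgroup_service_account(workgroup, project="mozdata"):
    """Derive a service account ID from a 'workgroup:GROUP/SUBGROUP' string.

    Assumes the pattern WORKGROUP-SUBGROUP@<project>.iam.gserviceaccount.com.
    """
    prefix = "workgroup:"
    name = workgroup[len(prefix):] if workgroup.startswith(prefix) else workgroup
    group, sep, subgroup = name.partition("/")
    if not (group and sep and subgroup):
        raise ValueError(f"expected 'workgroup:GROUP/SUBGROUP', got {workgroup!r}")
    return f"{group}-{subgroup}@{project}.iam.gserviceaccount.com"


print(workgroup_service_account("workgroup:search-terms/aggregated"))
# prints search-terms-aggregated@mozdata.iam.gserviceaccount.com
```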

This notebook server should have access to any restricted access datasets that are accessible to `workgroup:search-terms/aggregated`. Additionally, this notebook server will not have write access to the standard `mozdata.analysis` dataset, but will instead have write access to a workgroup-specific dataset that looks like the following:

`mozdata.WORKGROUP_SUBGROUP_analysis`

In the example above this maps to `mozdata.search_terms_aggregated_analysis`.
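
The dataset name follows from the workgroup in a similar way. The sketch below is an illustration only: the function name is hypothetical, and the replacement of hyphens with underscores is inferred from the `search-terms` / `search_terms` example rather than stated as a rule.

```python
def workgroup_analysis_dataset(workgroup, project="mozdata"):
    """Derive the workgroup-specific analysis dataset name.

    Assumes the pattern <project>.WORKGROUP_SUBGROUP_analysis, with hyphens
    in the workgroup name mapped to underscores (inferred from the example).
    """
    prefix = "workgroup:"
    name = workgroup[len(prefix):] if workgroup.startswith(prefix) else workgroup
    group, sep, subgroup = name.partition("/")
    if not (group and sep and subgroup):
        raise ValueError(f"expected 'workgroup:GROUP/SUBGROUP', got {workgroup!r}")
    identifier = f"{group}_{subgroup}".replace("-", "_")
    return f"{project}.{identifier}_analysis"


print(workgroup_analysis_dataset("workgroup:search-terms/aggregated"))
# prints mozdata.search_terms_aggregated_analysis
```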

## BigQuery Access Request

> **⚠** Access to BigQuery via the `mozdata` GCP project is granted to Mozilla Staff by default; only file an access request if you need other specific access, such as via a team project.

For access to BigQuery using projects other than `mozdata`, [file a bug (requires access to Mozilla Jira)](https://mozilla-hub.atlassian.net/secure/CreateIssueDetails!init.jspa?pid=10058&issuetype=10007&priority=3&customfield_10014=DSRE-87&summary=BigQuery%20GCP%20Console%20and%20API%20Access%20for%20YOUR_EMAIL_HERE&description=My%20request%20information%0A%3D%3D%3D%3D%3D%3D%3D%3D%0Amozilla.com%20ldap%20login%3A%0Ateam%3A%0Aaccess%20required%3A%20BigQuery%20GCP%20console%20and%20API%20Access%3B%20ENTER%20OTHER%20ACCESS%20REQUESTS%20HERE%0A%0APost%20request%0A%3D%3D%3D%3D%3D%3D%3D%3D%0ASee%20GCP%20console%20and%20other%20access%20methods%20docs%20here%3A%20https%3A%2F%2Fdocs.telemetry.mozilla.org%2Fcookbooks%2Fbigquery).
If you require access to AI Notebooks or Dataproc, please specify in the bug and a team project will be provisioned for you.