Description
✅ Checklist
- I have searched open and closed issues for duplicates.
- This is a request for a new feature in the Data Safe Haven or an upgrade to an existing feature.
- The feature is still missing in the latest version.
- I have read through the documentation.
- This isn't an open-ended question (open a discussion if it is).
🍓 Suggested change
Sadly, the backup approach currently in the codebase does not work properly (see: #2270), and we were forced to disable it (see: #2466). This is in part because Azure Backup offers very little out-of-the-box support for NFS File Shares and BlockBlobStorage.
However, backup is a critical feature, explicitly mentioned by DSPT.
🚂 How could this be done?
Since January 2024, NFS Azure file shares support snapshots (see: https://techcommunity.microsoft.com/blog/azurestorageblog/announcing-the-general-availability-of-nfs-azure-file-share-snapshots/4038596). Although snapshots are not a traditional backup, they offer some data protection features, like recovering previous versions.
Snapshots can only be taken via the Portal, PowerShell or the CLI (see: https://learn.microsoft.com/en-us/azure/storage/files/storage-snapshots-files?tabs=portal). For a speedy implementation that reuses some of the infrastructure already in place, I think we can do the following:
- Create another Container App Job in the same Container Apps Environment we're currently using for the DNS sidecar (after renaming it to "Management Jobs" or something similar). The advantage is that the networking is already set up for hosting containers that make Azure CLI requests.
- Within the Managed Environment, we create a new Container App Job for taking snapshots. It would use the same minimal base image as the DNS sidecar job, with Azure CLI installed. Using a system-assigned managed identity, it would periodically take snapshots of the DSH NFS File Shares: `home` and `shared`.
- As with the DNS sidecar job, we would configure the frequency and timeout in the config file.
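As a rough illustration of what the snapshot job could run, here is a minimal Python sketch that shells out to the Azure CLI. It assumes the `az storage share snapshot` command with `--name` and `--account-name`, as described in the snapshot documentation linked above. The storage account name, the function names and the injectable `runner` are hypothetical, and signing in with the managed identity (for example with `az login --identity`) is assumed to have happened before these calls.

```python
import subprocess
from typing import Callable, Iterable, List

# Share names taken from the proposal above.
SHARES = ("home", "shared")


def snapshot_command(share: str, account: str) -> List[str]:
    """Build the Azure CLI call that snapshots one file share."""
    return [
        "az", "storage", "share", "snapshot",
        "--name", share,
        "--account-name", account,  # hypothetical account name supplied by the caller
    ]


def take_snapshots(
    account: str,
    shares: Iterable[str] = SHARES,
    runner: Callable[..., object] = subprocess.run,
) -> List[List[str]]:
    """Run one snapshot command per share and return the commands that were run."""
    commands = [snapshot_command(share, account) for share in shares]
    for command in commands:
        # check=True makes a non-zero exit code raise CalledProcessError,
        # so a failed snapshot stops the script with an error.
        runner(command, check=True)
    return commands
```

Passing the runner as a parameter keeps the command construction testable without an Azure subscription; the scheduled job itself would simply call `take_snapshots("<storage-account-name>")`.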
Some caveats:
- We might need to change the `teardown` logic. We cannot delete a file share that has snapshots. Or we can leave it as-is, and force users to delete snapshots manually.
- We might need to change our workload profile. At the moment, we're only allowing one instance of the cheapest profile available (4 vCPU, 16 Gi RAM). Its workload would double with an extra job.
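
For the teardown caveat, one option is to ask Azure to remove the snapshots together with the share. The following is a minimal sketch, assuming the Azure CLI's `az storage share delete` command and its `--delete-snapshots include` option. The function name and account name are hypothetical, and this is not the current teardown code.

```python
from typing import List


def delete_share_command(share: str, account: str) -> List[str]:
    """Build the Azure CLI call that deletes a share along with its snapshots."""
    return [
        "az", "storage", "share", "delete",
        "--name", share,
        "--account-name", account,  # hypothetical account name supplied by the caller
        # Assumption: "include" removes the share's snapshots in the same call.
        "--delete-snapshots", "include",
    ]
```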