This repository was archived by the owner on Mar 26, 2025. It is now read-only.
Binary file added modules/AWSBackupProjectTNG.png
63 changes: 63 additions & 0 deletions modules/README.md
@@ -0,0 +1,63 @@
# Backup Solution outline document


## Summary
This document outlines the "hardened" backup solution currently in place, which aims to prevent and mitigate data loss and corruption caused by "bad actors" or by accident.


## Solution overview
![Solution diagram](./AWSBackupProjectTNG.png)



## Solution Description
### Overview
#### Supported asset types
AWS Backup, the backbone of this solution, supports many different AWS asset types (see: https://docs.aws.amazon.com/aws-backup/latest/devguide/backup-feature-availability.html).

However, we currently use this system only for the following asset types:
* S3 Buckets
* DynamoDB
* EFS
* RDS
* Parameter Store (via a Lambda backing up to an S3 bucket)

#### Backup
1. Each individual resource has its own CMK (Customer Managed Key), which encrypts the resource's data at rest.
2. On a regular basis a snapshot is taken of that resource and placed in the Local AWS Vault as a recovery point (a.k.a. a backup). The vault encrypts the recovery point's data with its own CMK.
3. Once a local recovery point is in place in the Local AWS Vault, an AWS Backup "Copy Job" is automatically kicked off to copy the recovery point into the Locked Remote Vault in the "Backup" AWS account.
4. When the recovery point is placed in the Locked Remote AWS Vault, it is re-encrypted with that vault's own CMK. At this time a lifecycle policy is assigned to the recovery point in the Locked Remote AWS Vault.
5. The recovery point cannot be removed by ANYONE (including AWS support and the root user) until the lifecycle expires it. See: https://docs.aws.amazon.com/aws-backup/latest/devguide/vault-lock.html
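
The lock described in the last step can be expressed in Terraform with the AWS provider's `aws_backup_vault_lock_configuration` resource. The following is a minimal sketch only: the vault module's source is not shown in this document, so the resource names, the vault name and the retention values are assumptions chosen for illustration.

```hcl
# Hypothetical sketch: names and values are illustrative and are NOT taken
# from the vault module.
resource "aws_backup_vault" "locked" {
  name = "rss-prod-locked-vault" # assumed name
}

resource "aws_backup_vault_lock_configuration" "locked" {
  backup_vault_name = aws_backup_vault.locked.name

  # Cooling-off period in days. Once it has passed, the lock can no longer
  # be changed or removed. Omitting this argument leaves the lock removable
  # by principals with sufficient IAM permissions.
  changeable_for_days = 8

  # Bounds that the lifecycle of every recovery point copied into the vault
  # must satisfy (assumed values).
  min_retention_days = 7
  max_retention_days = 365
}
```

With bounds like these, a copy job whose lifecycle falls outside the 7 to 365 day range would be rejected by the vault, so the remote lifecycle chosen for the copy has to sit inside them.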

#### Restoration
The exact method for restoring a resource depends on the circumstances under which the restore is required. In general, however, it follows these steps:

1. A privileged Platform Engineer logs into the "Backup" AWS account (via breakglass procedures) and selects the required recovery point from the Locked Remote AWS Vault.
2. The engineer then starts a manual copy job back to the Local AWS Vault. Normally the engineer would simply copy the recovery point back to the original account, but it can be copied to another account/region if required (for example, where the "Active" AWS account is felt to have been irretrievably compromised).
3. Once the recovery point has been copied into the Local AWS Vault, the data can be restored. For RDS this typically means creating a new RDS instance from the recovered snapshot (see: https://docs.aws.amazon.com/aws-backup/latest/devguide/restoring-rds.html). For EFS you can restore either single files or the whole filesystem to a "restored" directory on the original EFS volume, as the situation requires (see: https://docs.aws.amazon.com/aws-backup/latest/devguide/restoring-efs.html).

#### Notes:
* All data in transit over the AWS network is TLS encrypted.
* Access to the "Backup" AWS account is restricted to a few users while a full access policy is being defined.
* Access to the separate "Backup" AWS account for production data is recorded following the current Confluence model until the new "breakglass" model is in place.


## Limitations & Potential Future Improvements
### Limitations
#### RDS & Cross-region
Currently, due to an AWS feature limitation, RDS snapshots can be copied to another region or to another account, but not both in a single copy job. This is down to how AWS RDS snapshots currently handle KMS keys within regions. It means that the "Backup" AWS account has to be in the same region as the "Active" AWS account, as is the case today. See: https://docs.aws.amazon.com/aws-backup/latest/devguide/whatisbackup.html#features-by-resource

Unfortunately, although a feature request is already in place with AWS, there is no timescale for this to change.

The workaround, if a separate region as well as a "Backup" AWS account is required, would be to set up a trigger for another AWS Backup copy job, which makes an additional recovery point in a Locked Vault in another region for the "Backup" AWS account. This would of course incur some additional cost.

If this is deemed an essential feature, further details, including costs, can be provided upon request.
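
One way the trigger for the cross-region workaround might be wired up is an EventBridge rule in the "Backup" AWS account that fires when a copy job completes and invokes a Lambda which starts the second copy. This is a sketch under assumptions, not part of the current solution: the Lambda is hypothetical and only referenced by ARN, all resource names are invented, and the event pattern fields should be checked against the AWS Backup EventBridge documentation before use.

```hcl
# Hypothetical sketch for the "Backup" AWS account. The Lambda function that
# starts the second (cross-region) copy job is assumed to exist already.
variable "second_copy_lambda_arn" {
  description = "ARN of an (assumed) Lambda that starts the cross-region copy job"
  type        = string
}

# Match AWS Backup copy jobs that have completed.
resource "aws_cloudwatch_event_rule" "copy_job_completed" {
  name        = "backup-copy-job-completed"
  description = "Fires when an AWS Backup copy job completes"

  event_pattern = jsonencode({
    source        = ["aws.backup"]
    "detail-type" = ["Copy Job State Change"]
    detail = {
      state = ["COMPLETED"]
    }
  })
}

# Send matching events to the Lambda.
resource "aws_cloudwatch_event_target" "start_second_copy" {
  rule = aws_cloudwatch_event_rule.copy_job_completed.name
  arn  = var.second_copy_lambda_arn
}

# Allow EventBridge to invoke the Lambda for this rule only.
resource "aws_lambda_permission" "allow_eventbridge" {
  statement_id  = "AllowEventBridgeInvoke"
  action        = "lambda:InvokeFunction"
  function_name = var.second_copy_lambda_arn
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.copy_job_completed.arn
}
```

The Lambda itself would need to skip events for copies that it started, otherwise each second copy would trigger another one.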


### Potential Future Improvements
#### Reporting
Backup reporting is an available AWS feature but has not been set up as part of the current solution.

It is strongly recommended that backup reporting should be done at the AWS Organisational level which will provide auditing of backups at a high and very independent level.

For more info see here: https://docs.aws.amazon.com/aws-backup/latest/devguide/aws-backup-audit-manager.html
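
As a starting point, a report plan can be declared with the AWS provider's `aws_backup_report_plan` resource. The sketch below is an assumption about how reporting might be added, not something the current solution contains: the plan name and the S3 bucket name are hypothetical, and organisation-wide scoping would need further settings that are not shown here.

```hcl
# Hypothetical sketch: plan and bucket names are assumptions.
resource "aws_backup_report_plan" "backup_jobs" {
  name        = "backup_job_report"
  description = "Report on AWS Backup jobs"

  # Where the generated reports are delivered.
  report_delivery_channel {
    s3_bucket_name = "example-backup-reports" # assumed, must already exist
    formats        = ["CSV"]
  }

  # Which built-in report template to use.
  report_setting {
    report_template = "BACKUP_JOB_REPORT"
  }
}
```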
48 changes: 48 additions & 0 deletions modules/backup/README.md
@@ -0,0 +1,48 @@
# Terraform module: backup schedules

## Description

This is a simple module to create an AWS Backup vault, KMS keys, plans and automatic off-account backup copies.

The vault module should be run first in the remote AWS account.

**WARNING** Once a snapshot has been placed into the remote locked vault it cannot be removed until the
lifecycle duration has been exceeded.

## Module parameters

| Name | Description | Type | Default |
|:-----------------|:---------------------------------------------------------|:------:|:-------------------:|
| instance_name | The name of the service being served | string | - |
| remote_account | The AWS Account ID number of the remote account | string | - |
| remote_vault_arn | The vault ARN in the remote account | string | - |
| local_lifecycle | Lifecycle for local copies, in days | number | - |
| remote_lifecycle | Lifecycle for remote copies, in days | number | - |
| backup_schedule | The schedule to run backups, in AWS CRON format | string | cron(15 11 ? * * *) |
| use_env | Whether to back up by environment or ALL assets in account | bool | false |
| environment | The environment name to select assets from | string | - |


## Sample usage

This snippet creates a backup vault and plan for RSS prod. The AWS account number "123456789012" and the
vault ARN identify the locked vault created by the vault module in a separate AWS account.

Local copies of the snapshots are held for 7 days and remote "tamper-proof" copies for 90 days. The backup
runs at 11:15 every day and backs up all assets in the prod environment (as defined by the Environment tag)
that have the BackupRemote tag set to true.

```
module "rss_prod_backup_vault" {
  source = "../modules/backup"

  instance_name    = "rss-prod"
  remote_account   = "123456789012"
  remote_vault_arn = "arn:aws:backup:eu-west-2:123456789012:backup-vault:rss_prod_backup"
  local_lifecycle  = 7
  remote_lifecycle = 90
  backup_schedule  = "cron(15 11 ? * * *)"
  use_env          = true
  environment      = "prod"
}
```
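
For a resource to be picked up by the plan above it has to carry the matching tags. The following is a minimal sketch of a tagged resource; the DynamoDB table and its name are hypothetical and only the two tags matter. Tag values are matched as exact strings, so `"true"` is written in lower case here to agree with the selection condition in the module.

```hcl
# Hypothetical resource: only the tags are relevant to backup selection.
resource "aws_dynamodb_table" "example" {
  name         = "rss-prod-example" # assumed name
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "id"

  attribute {
    name = "id"
    type = "S"
  }

  tags = {
    BackupRemote = "true" # matched against aws:ResourceTag/BackupRemote
    Environment  = "prod" # matched when use_env = true and environment = "prod"
  }
}
```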
3 changes: 3 additions & 0 deletions modules/backup/data.tf
@@ -0,0 +1,3 @@
// Data sources required by module

data "aws_caller_identity" "current" {}
29 changes: 29 additions & 0 deletions modules/backup/iam.tf
@@ -0,0 +1,29 @@
// Backup IAM role to use for backups

resource "aws_iam_role" "remote_backup" {
  name               = "${var.instance_name}-remote_backup"
  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": ["sts:AssumeRole"],
      "Effect": "Allow",
      "Principal": {
        "Service": ["backup.amazonaws.com"]
      }
    }
  ]
}
POLICY
}

// AWS managed policies granting AWS Backup the permissions it needs

resource "aws_iam_role_policy_attachment" "remote_backup" {
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSBackupServiceRolePolicyForBackup"
  role       = aws_iam_role.remote_backup.name
}

resource "aws_iam_role_policy_attachment" "remote_backup_s3" {
  policy_arn = "arn:aws:iam::aws:policy/AWSBackupServiceRolePolicyForS3Backup"
  role       = aws_iam_role.remote_backup.name
}
62 changes: 62 additions & 0 deletions modules/backup/kms.tf
@@ -0,0 +1,62 @@
// Create a specific KMS key for the local backup vault and allow the remote
// AWS account access to that key.

resource "aws_kms_alias" "remote_backup_vault_key" {
  name          = "alias/${var.instance_name}-remote-backup-vault-key"
  target_key_id = aws_kms_key.remote_backup_vault.key_id
}

resource "aws_kms_key" "remote_backup_vault" {
  description = "${var.instance_name} Remote Backup vault Key"

  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Id": "key-default-plus",
  "Statement": [
    {
      "Sid": "Enable IAM User Permissions",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"
      },
      "Action": "kms:*",
      "Resource": "*"
    },
    {
      "Sid": "Allow access from remote backup account",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${var.remote_account}:root"
      },
      "Action": [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:DescribeKey"
      ],
      "Resource": "*"
    },
    {
      "Sid": "Allow attachment of persistent resources",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${var.remote_account}:root"
      },
      "Action": [
        "kms:CreateGrant",
        "kms:ListGrants",
        "kms:RevokeGrant"
      ],
      "Resource": "*",
      "Condition": {
        "Bool": {
          "kms:GrantIsForAWSResource": "true"
        }
      }
    }
  ]
}
POLICY
}
6 changes: 6 additions & 0 deletions modules/backup/outputs.tf
@@ -0,0 +1,6 @@
// Output variable definitions

output "vault_arn" {
  description = "The Local backup vault ARN"
  value       = aws_backup_vault.remote_backup_vault.arn
}
43 changes: 43 additions & 0 deletions modules/backup/variables.tf
@@ -0,0 +1,43 @@
// Input variable definitions

variable "instance_name" {
  description = "The name of the service being served"
  type        = string
}

variable "remote_account" {
  description = "The AWS account ID number holding the remote locked vault"
  type        = string
}

variable "remote_vault_arn" {
  description = "The ARN of the locked vault in the remote AWS account to copy snapshots to"
  type        = string
}

variable "local_lifecycle" {
  description = "The lifecycle used for local backup snapshots, in days"
  type        = number
}

variable "remote_lifecycle" {
  description = "The lifecycle used for remote backup snapshots, in days"
  type        = number
}

variable "backup_schedule" {
  description = "The schedule to run backups on, in AWS's CRON format"
  type        = string
  default     = "cron(15 11 ? * * *)"
}

variable "use_env" {
  description = "Whether to back up by Environment or ALL in account"
  type        = bool
  default     = false
}

variable "environment" {
  description = "The environment name to back up. Only used when use_env is true; otherwise ALL tagged assets in the AWS account are backed up."
  type        = string
}
96 changes: 96 additions & 0 deletions modules/backup/vault.tf
@@ -0,0 +1,96 @@
// Create the AWS Backup Vault and plans

// The vault

resource "aws_backup_vault" "remote_backup_vault" {
  name        = "${var.instance_name}-remote_backup"
  kms_key_arn = aws_kms_key.remote_backup_vault.arn
}

// The policy to use for the vault, allowing the remote AWS account to copy snapshots
// back in case of incidents

resource "aws_backup_vault_policy" "remote_backup" {
  backup_vault_name = aws_backup_vault.remote_backup_vault.name

  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "backup:CopyIntoBackupVault",
      "Resource": "*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::${var.remote_account}:root"
        ]
      }
    }
  ]
}
POLICY
}

// The backup plan which automatically copies snapshots off-account

resource "aws_backup_plan" "remote_backup" {
  name = "${var.instance_name}-remote_backup"

  rule {
    rule_name         = "${var.instance_name}-remote_backup"
    target_vault_name = aws_backup_vault.remote_backup_vault.name
    schedule          = var.backup_schedule

    // Retention of the local recovery point, in days
    lifecycle {
      delete_after = var.local_lifecycle
    }

    copy_action {
      destination_vault_arn = var.remote_vault_arn

      // Retention of the remote (locked vault) copy, in days
      lifecycle {
        delete_after = var.remote_lifecycle
      }
    }
  }
}

// A selection which backs up ALL assets in the AWS account which have
// the tag BackupRemote=true. Only created when use_env is false.

resource "aws_backup_selection" "remote_backup_account" {
  count        = var.use_env ? 0 : 1
  iam_role_arn = aws_iam_role.remote_backup.arn
  name         = "${var.instance_name}-remote-backup-account"
  plan_id      = aws_backup_plan.remote_backup.id
  resources    = ["*"]

  condition {
    string_equals {
      key   = "aws:ResourceTag/BackupRemote"
      value = "true"
    }
  }
}

// A selection which backs up assets in the AWS account which have
// the tag BackupRemote=true AND the requested Environment tag.
// Only created when use_env is true.

resource "aws_backup_selection" "remote_backup_env" {
  count        = var.use_env ? 1 : 0
  iam_role_arn = aws_iam_role.remote_backup.arn
  name         = "${var.instance_name}-remote-backup-env"
  plan_id      = aws_backup_plan.remote_backup.id
  resources    = ["*"]

  condition {
    string_equals {
      key   = "aws:ResourceTag/BackupRemote"
      value = "true"
    }
    string_equals {
      key   = "aws:ResourceTag/Environment"
      value = var.environment
    }
  }
}
37 changes: 37 additions & 0 deletions modules/vault/README.md
@@ -0,0 +1,37 @@
# Terraform module: vault

## Description

This is a simple module to create an AWS Backup vault set up to act as a destination for
remote off-account AWS Backup copy jobs. The vault can be "locked", which prevents
premature deletion of backup snapshots.

The vault should be located in an isolated AWS account.

**WARNING** Once a vault is locked you have 8 days to reverse the setting. Once this
cooling-off period has passed, the vault lock cannot be removed.

## Module parameters

| Name | Description | Type | Default|
|:---------------|:---------------------------------------|:------:|:------:|
| client_name | The name of the client being served | string | - |
| client_account | The AWS Account ID number being served | string | - |
| lock_vault | Whether to lock the vault | bool | false |

## Sample usage

This snippet creates a locked vault for RSS prod backup called rss-prod. The AWS account
number "123456789012" is the only account which can copy backup snapshots into this
vault. (Only one account is allowed to copy into each vault so as to ensure data
segregation).

```
module "rss_prod_prod_backup_vault" {
  source = "../modules/vault"

  client_name    = "rss-prod"
  client_account = "123456789012"
  lock_vault     = true
}
```
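
Because the vault should live in an isolated AWS account, the module would normally be called with a provider that targets that account. The sketch below shows one way this might be done with a provider alias. It is an assumption, not part of the module: the alias, region, role ARN, module label and account IDs are all hypothetical, and it presumes the vault module declares no provider configuration of its own.

```hcl
# Hypothetical provider alias for the isolated "Backup" AWS account.
provider "aws" {
  alias  = "backup"
  region = "eu-west-2" # assumed region

  assume_role {
    role_arn = "arn:aws:iam::123456789012:role/terraform" # assumed role
  }
}

module "rss_prod_locked_vault" {
  source = "../modules/vault"

  # Create every resource in this module in the "Backup" account.
  providers = {
    aws = aws.backup
  }

  client_name    = "rss-prod"
  client_account = "210987654321" # assumed ID of the account copying in
  lock_vault     = true
}
```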