diff --git a/docs/includes/backup-requirement.md b/docs/includes/backup-requirement.md new file mode 100644 index 0000000000..d02573458a --- /dev/null +++ b/docs/includes/backup-requirement.md @@ -0,0 +1,2 @@ +!!! danger "Backup requirement" + All three components—Apache Cassandra, Elasticsearch, and file storage—must be backed up to ensure proper recovery. \ No newline at end of file diff --git a/docs/includes/backup-restore-best-practices.md b/docs/includes/backup-restore-best-practices.md new file mode 100644 index 0000000000..d914f3786a --- /dev/null +++ b/docs/includes/backup-restore-best-practices.md @@ -0,0 +1,3 @@ +!!! tip "Best practices for safe backup and restore" + * Always test the backup and restore process in a non-production or test environment before applying it to a live system to ensure the process works as expected. + * Ensure you have an up-to-date backup before starting the restore operation, as errors during the restoration could lead to data loss. \ No newline at end of file diff --git a/docs/includes/data-consistency-hot-backup.md b/docs/includes/data-consistency-hot-backup.md new file mode 100644 index 0000000000..df4e275764 --- /dev/null +++ b/docs/includes/data-consistency-hot-backup.md @@ -0,0 +1,2 @@ +!!! warning "Data consistency" + Perform these instructions simultaneously, ideally triggered by a cron job, to ensure proper alignment between Apache Cassandra, Elasticsearch, and file storage. Snapshots must be taken concurrently to maintain consistency and avoid restoration issues. \ No newline at end of file diff --git a/docs/includes/hot-backup-cassandra-snapshots.md b/docs/includes/hot-backup-cassandra-snapshots.md new file mode 100644 index 0000000000..361466474e --- /dev/null +++ b/docs/includes/hot-backup-cassandra-snapshots.md @@ -0,0 +1,73 @@ +Before creating Cassandra snapshots, gather the following information: + +* Cassandra administrator password +* SSL certificates and authentication details required to connect securely to Cassandra + +Then, use the following script: + +!!! warning "Script restrictions" + This script works only when Cassandra runs directly on a machine. It doesn't support deployments using Docker or Kubernetes. + +!!! note "Keyspace name" + Before running this script, update the keyspace name to match your environment. The keyspace is typically defined in the `application.conf` file under the `db.janusgraph.storage.cql.keyspace` attribute. The script uses `thehive` by default. 
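+If you're not sure which keyspace TheHive uses, you can also list the keyspaces known to Cassandra before editing the script. This is only a quick check; add the credentials and SSL options your cluster requires:
+
+```bash
+# List all keyspaces and confirm the one used by TheHive (usually "thehive")
+cqlsh -e "DESCRIBE KEYSPACES"
+```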
+
+```bash
+#!/bin/bash
+
+# Cassandra variables
+CASSANDRA_KEYSPACE=thehive
+CASSANDRA_DATA_FOLDER=/var/lib/cassandra
+
+# Backup variables
+GENERAL_ARCHIVE_PATH=/mnt/backup
+SNAPSHOT_NAME="cassandra_$(date +%Y%m%d_%Hh%Mm%Ss)"
+CASSANDRA_ARCHIVE_PATH="${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}/${CASSANDRA_KEYSPACE}"
+
+# Perform a snapshot of the keyspace
+echo "Starting snapshot ${SNAPSHOT_NAME} for keyspace ${CASSANDRA_KEYSPACE}"
+nodetool snapshot -t ${SNAPSHOT_NAME} ${CASSANDRA_KEYSPACE}
+
+# Make sure the snapshot folder exists and the permissions of its contents are correct
+mkdir -p ${CASSANDRA_ARCHIVE_PATH}
+chown -R cassandra:cassandra ${CASSANDRA_ARCHIVE_PATH}
+echo "Snapshot of all ${CASSANDRA_KEYSPACE} tables will be stored inside ${CASSANDRA_ARCHIVE_PATH}"
+
+# Save the cql schema of the keyspace
+cqlsh -e "DESCRIBE KEYSPACE ${CASSANDRA_KEYSPACE}" > "${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}/create_keyspace_${CASSANDRA_KEYSPACE}.cql"
+echo "The keyspace cql definition for ${CASSANDRA_KEYSPACE} is stored in this file: ${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}/create_keyspace_${CASSANDRA_KEYSPACE}.cql"
+
+# For each table folder in the keyspace folder of the snapshot
+for TABLE in $(ls ${CASSANDRA_DATA_FOLDER}/data/${CASSANDRA_KEYSPACE}); do
+    # Folder where the snapshot files are stored
+    TABLE_SNAPSHOT_FOLDER=${CASSANDRA_DATA_FOLDER}/data/${CASSANDRA_KEYSPACE}/${TABLE}/snapshots/${SNAPSHOT_NAME}
+
+    # Create a folder for each table
+    mkdir ${CASSANDRA_ARCHIVE_PATH}/${TABLE}
+    chown -R cassandra:cassandra ${CASSANDRA_ARCHIVE_PATH}/${TABLE}
+
+    # Copy the snapshot files to the proper table folder
+    # Snapshot files are hard links,
+    # so we use --remove-destination to make sure the files are actually copied and not just linked
+    cp -p --remove-destination ${TABLE_SNAPSHOT_FOLDER}/* ${CASSANDRA_ARCHIVE_PATH}/${TABLE}
+done
+
+# Delete the Cassandra snapshot once it's backed up
+nodetool clearsnapshot -t ${SNAPSHOT_NAME} > /dev/null
+
+# Create a ".tar" archive with the folder containing the backed up Cassandra data
+cd ${GENERAL_ARCHIVE_PATH}
+tar cf ${SNAPSHOT_NAME}.tar ${SNAPSHOT_NAME}
+
+# Remove the folder once the archive is created
+rm -rf ${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}
+
+# Display the location of the Cassandra archive
+echo ""
+echo "Cassandra backup done! Keep the following backup archive safe:"
+echo "${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}.tar"
+```
+
+!!! info "Where to find the backup archive?"
+    After running the script, the backup archive is available at `/mnt/backup` with a `cassandra_` prefix. Be sure to copy this archive to a separate server or storage location to safeguard against data loss if the TheHive server fails.
+
+For more details, refer to the [official Cassandra documentation](https://cassandra.apache.org/doc/stable/cassandra/operating/backups.html).
\ No newline at end of file
diff --git a/docs/includes/hot-backup-configure-systems.md b/docs/includes/hot-backup-configure-systems.md
new file mode 100644
index 0000000000..37c1a7c08e
--- /dev/null
+++ b/docs/includes/hot-backup-configure-systems.md
@@ -0,0 +1,40 @@
+#### Cassandra keyspace
+
+Identify the keyspace used by TheHive. This is typically defined in the *application.conf* file under the `db.janusgraph.storage.cql.keyspace` attribute. If you followed the [step-by-step installation guide](/thehive/installation/step-by-step-installation-guide/), this keyspace should be named `thehive`. This name is also used in the scripts provided to create Cassandra snapshots.
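+For reference, the relevant block of *application.conf* typically looks like the following excerpt; the values shown are the defaults from a standard installation and may differ in your deployment:
+
+```yaml
+db.janusgraph {
+  storage {
+    backend: cql
+    hostname: ["127.0.0.1"]
+    cql {
+      cluster-name: thp
+      keyspace: thehive
+    }
+  }
+}
+```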
+ +#### Elasticsearch repository + +This repository is used to create snapshots with timestamped names. + +1. Configure the repository path by adding the `path.repo` parameter in the `elasticsearch.yml` file: + + ```yaml + path.repo: /mnt/backup + ``` + +2. Restart Elasticsearch to apply the configuration changes. + +3. Register the repository named `thehive_repository` by sending the following request: + + ```http + curl -X PUT "http://127.0.0.1:9200/_snapshot/thehive_repository" \ + -H "Content-Type: application/json" \ + -d '{ + "type": "fs", + "settings": { + "location": "/mnt/backup" + } + }' + ``` + + A successful response looks like this: + + ```json + { + "acknowledged": true + } + ``` + +#### File storage location + +Locate the folder where TheHive stores files, which is backed up with the database and indices. If using a local filesystem or Network File System (NFS), the location is defined in the *application.conf* file under the `storage.localfs.location` attribute. \ No newline at end of file diff --git a/docs/includes/hot-backup-elasticsearch-snapshots.md b/docs/includes/hot-backup-elasticsearch-snapshots.md new file mode 100644 index 0000000000..05b1028bdf --- /dev/null +++ b/docs/includes/hot-backup-elasticsearch-snapshots.md @@ -0,0 +1,69 @@ +```bash +#!/bin/bash + +# Elasticsearch variables +ELASTICSEARCH_API_URL='http://127.0.0.1:9200' +ELASTICSEARCH_SNAPSHOT_REPOSITORY=thehive_repository +ELASTICSEARCH_INDEX=thehive_global + +# Backup variables +GENERAL_ARCHIVE_PATH=/mnt/backup +SNAPSHOT_NAME="elasticsearch_$(date +%Y%m%d_%Hh%Mm%Ss)" + +# Creating the backup folder if needed +mkdir -p ${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME} + +# Check if the snapshot repository is correctly registered +repository_config=$(curl -s -L "${ELASTICSEARCH_API_URL}/_snapshot") +repository_ok=$(jq 'has("'${ELASTICSEARCH_SNAPSHOT_REPOSITORY}'")' <<< ${repository_config}) +if ! ${repository_ok}; then + echo "Abort, no snapshot repository registered in ElasticSearch" + echo "Set the repository folder 'path.repo'" + echo "in an environment variable" + echo "or in elasticsearch.yml" + exit 1 +fi + +# Starting the snapshot +create_snapshot=$(curl -s -L -X PUT "${ELASTICSEARCH_API_URL}/_snapshot/thehive_repository/${SNAPSHOT_NAME}" -H 'Content-Type: application/json' -d '{"indices":"'${ELASTICSEARCH_INDEX}'", "ignore_unavailable":true, "include_global_state":false}') + +# Verify that the snapshot started correctly +create_started=$(jq '.accepted == true' <<< ${create_snapshot}) +if [ ${create_started} != true ] +then + echo "Couldn't start the snapshot" + exit 1 +fi +echo "Snapshot started" + +# Verify that the snapshot is finshed +state="NONE" +while [ "${state}" != "\"SUCCESS\"" ]; do + echo "Snapshot in progress, waiting 5 seconds before checking status again..." + sleep 5 + snapshot_list=$(curl -s -L "${ELASTICSEARCH_API_URL}/_snapshot/${ELASTICSEARCH_SNAPSHOT_REPOSITORY}/*?verbose=false") + state=$(jq '.snapshots[] | select(.snapshot == "'${SNAPSHOT_NAME}'").state' <<< ${snapshot_list}) +done +echo "Snapshot finished" + +# Print the snapshot short informations +final_state=$(jq '.snapshots[] | select(.snapshot == "'${SNAPSHOT_NAME}'")' <<< ${snapshot_list}) +echo ${final_state} | jq --color-output . 
+ +# Create a ".tar" archive with the folder containing the backed up Elasticsearch index +cd ${GENERAL_ARCHIVE_PATH} +tar cf ${SNAPSHOT_NAME}.tar ${SNAPSHOT_NAME} + +# Remove the folder once the archive is created +rm -rf ${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME} + +# Display the location of the Elasticsearch archive +echo "" +echo "ElasticSearch backup done! Keep the following backup archive safe:" +echo "${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}.tar" +``` + +!!! info "Where to find the backup archive?" + After running the script, the backup archive is available at `/mnt/backup` with a `elasticsearch_` prefix. Be sure to copy this archive to a separate server or storage location to safeguard against data loss if the TheHive server fails. + +For more details, refer to the [official Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html). \ No newline at end of file diff --git a/docs/includes/hot-backup-file-storage.md b/docs/includes/hot-backup-file-storage.md new file mode 100644 index 0000000000..752ee6971e --- /dev/null +++ b/docs/includes/hot-backup-file-storage.md @@ -0,0 +1,32 @@ +!!! warning "Script restrictions" + This script works only when file storage is managed directly on a machine. It doesn't support deployments using Docker or Kubernetes. + +```bash +#!/bin/bash + +# TheHive attachment variables +ATTACHMENT_FOLDER=/opt/thp/thehive/files + +# Backup variables +GENERAL_ARCHIVE_PATH=/mnt/backup +SNAPSHOT_NAME="files_$(date +%Y%m%d_%Hh%Mm%Ss)" +ATTACHMENT_ARCHIVE_PATH="${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}" + +# Copy all TheHive attachment +cp -r ${ATTACHMENT_FOLDER}/* ${ATTACHMENT_ARCHIVE_PATH}/ + +# Create a ".tar" archive with the folder containing the backed up attachment files +cd ${GENERAL_ARCHIVE_PATH} +tar cf ${SNAPSHOT_NAME}.tar ${SNAPSHOT_NAME} + +# Remove the folder once the archive is created +rm -rf ${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME} + +# Display the location of the attachment archive +echo "" +echo "TheHive attachment files backup done! Keep the following backup archive safe:" +echo "${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}.tar" +``` + +!!! info "Where to find the backup archive?" + After running the script, the backup archive is available at `/mnt/backup` with a `files_` prefix. Be sure to copy this archive to a separate server or storage location to safeguard against data loss if the TheHive server fails. 
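+A simple way to move the archives off the host is rsync over SSH; the user, host, and destination path below are placeholders to adapt to your environment:
+
+```bash
+# Copy all backup archives to a remote backup server
+rsync -av /mnt/backup/*.tar backupuser@backup-host:/remote/thehive-backups/
+```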
\ No newline at end of file diff --git a/docs/includes/hot-backup-required-tools.md b/docs/includes/hot-backup-required-tools.md new file mode 100644 index 0000000000..66b3bff65c --- /dev/null +++ b/docs/includes/hot-backup-required-tools.md @@ -0,0 +1,12 @@ +Before performing a hot backup, ensure the following tools are available on your system: + +* [Cassandra nodetool](https://cassandra.apache.org/doc/latest/cassandra/troubleshooting/use_nodetool.html): Command-line tool for managing Cassandra clusters, used for creating database snapshots +* [tar](https://www.gnu.org/software/tar/manual/html_node/index.html): Utility for archiving backup files +* [cqlsh](https://cassandra.apache.org/doc/latest/cassandra/managing/tools/cqlsh.html): Command-line interface for executing CQL queries against the Cassandra database +* [curl](https://curl.se/): Tool for transferring data with URLs, useful for interacting with the Elasticsearch API +* [jq](https://jqlang.org/): Lightweight command-line JSON processor for parsing and manipulating JSON data in scripts + +If any tools are missing, install them using your package manager, for example: + +* `apt install jq` for DEB-based operating systems +* `yum install jq` for RPM-based operating systems \ No newline at end of file diff --git a/docs/includes/hot-restore-application-stopped.md b/docs/includes/hot-restore-application-stopped.md new file mode 100644 index 0000000000..df1c487c0e --- /dev/null +++ b/docs/includes/hot-restore-application-stopped.md @@ -0,0 +1,2 @@ +!!! warning "Shutdown required" + Performing a restore from a hot backup requires stopping the application. \ No newline at end of file diff --git a/docs/includes/hot-restore-cassandra-snapshots.md b/docs/includes/hot-restore-cassandra-snapshots.md new file mode 100644 index 0000000000..4159690072 --- /dev/null +++ b/docs/includes/hot-restore-cassandra-snapshots.md @@ -0,0 +1,55 @@ +To restore Cassandra snapshots, run the following script: + +```bash +#!/bin/bash + +# Cassandra variables +CASSANDRA_KEYSPACE=thehive + +# Backup variables +GENERAL_ARCHIVE_PATH=/mnt/backup + +# Look for the latest archived Cassandra snapshot +CASSANDRA_BACKUP_LIST=(${GENERAL_ARCHIVE_PATH}/cassandra_????????_??h??m??s.tar) +CASSANDRA_LATEST_BACKUP_NAME=$(basename ${CASSANDRA_BACKUP_LIST[-1]}) + +echo "Latest Cassandra backup archive found is ${GENERAL_ARCHIVE_PATH}/${CASSANDRA_LATEST_BACKUP_NAME}" + +# Extract the latest archive +CASSANDRA_SNAPSHOT_NAME=$(echo ${CASSANDRA_LATEST_BACKUP_NAME} | cut -d '.' 
-f 1) +CASSANDRA_SNAPSHOT_FOLDER="${GENERAL_ARCHIVE_PATH}/${CASSANDRA_SNAPSHOT_NAME}" + +tar xvf "${GENERAL_ARCHIVE_PATH}/${CASSANDRA_LATEST_BACKUP_NAME}" +echo "Latest Cassandra backup archive extracted in ${CASSANDRA_SNAPSHOT_FOLDER}" + +# Go inside the Cassandra snapshot recently extracted +cd ${CASSANDRA_SNAPSHOT_FOLDER} + +# Check if Cassandra already has an existing keyspace +cqlsh -e "DESCRIBE KEYSPACE ${CASSANDRA_KEYSPACE}" > "${CASSANDRA_SNAPSHOT_FOLDER}/target_keyspace_${CASSANDRA_KEYSPACE}.cql" + +if cmp --silent -- "${CASSANDRA_SNAPSHOT_FOLDER}/create_keyspace_${CASSANDRA_KEYSPACE}.cql" "${CASSANDRA_SNAPSHOT_FOLDER}/target_keyspace_${CASSANDRA_KEYSPACE}.cql"; then + echo "Existing ${CASSANDRA_KEYSPACE} keyspace definition is identical to the one in the backup, no need to drop and recreate it" +else + echo "Existing ${CASSANDRA_KEYSPACE} keyspace definition does not match the one in the backup, dropping it" + cqlsh --request-timeout=120 -e "DROP KEYSPACE IF EXISTS ${CASSANDRA_KEYSPACE};" + sleep 5s + echo "Creating ${CASSANDRA_KEYSPACE} keyspace using the definition from the backup" + cqlsh --request-timeout=120 -f ${CASSANDRA_SNAPSHOT_FOLDER}/create_keyspace_${CASSANDRA_KEYSPACE}.cql +fi + +# Create the tables and load related data +cd ${CASSANDRA_KEYSPACE} +for TABLE in $(ls); do + TABLE_BASENAME=$(basename ${TABLE}) + TABLE_NAME=${TABLE_BASENAME%%-*} + echo "Importing ${TABLE_NAME} table and related data" + nodetool import ${CASSANDRA_KEYSPACE} ${TABLE_NAME} ${CASSANDRA_SNAPSHOT_FOLDER}/${CASSANDRA_KEYSPACE}/${TABLE} + echo "" +done + +echo "Cassandra data restoration done!" +rm -rf ${CASSANDRA_SNAPSHOT_FOLDER} +``` + +For additional details, refer to the [official Cassandra documentation](https://cassandra.apache.org/doc/stable/cassandra/operating/backups.html). \ No newline at end of file diff --git a/docs/includes/hot-restore-elasticsearch-snapshots.md b/docs/includes/hot-restore-elasticsearch-snapshots.md new file mode 100644 index 0000000000..e17c291da7 --- /dev/null +++ b/docs/includes/hot-restore-elasticsearch-snapshots.md @@ -0,0 +1,43 @@ +To restore Elasticsearch snapshots, run the following script: + +```bash +#!/bin/bash + +# ElasticSearch variables +ELASTICSEARCH_API_URL='http://127.0.0.1:9200' +ELASTICSEARCH_SNAPSHOT_REPOSITORY=thehive_repository +ELASTICSEARCH_INDEX=thehive_global + +# Look for the latest archived ElasticSearch snapshot +ELASTICSEARCH_BACKUP_LIST=(${GENERAL_ARCHIVE_PATH}/elasticsearch_????????_??h??m??s.tar) +ELASTICSEARCH_LATEST_BACKUP_NAME=$(basename ${ELASTICSEARCH_BACKUP_LIST[-1]}) + +echo "Latest ElasticSearch backup archive found is ${GENERAL_ARCHIVE_PATH}/${ELASTICSEARCH_LATEST_BACKUP_NAME}" + +# Extract the latest archive +ELASTICSEARCH_SNAPSHOT_NAME=$(echo ${ELASTICSEARCH_LATEST_BACKUP_NAME} | cut -d '.' 
-f 1) +ELASTICSEARCH_SNAPSHOT_FOLDER="${GENERAL_ARCHIVE_PATH}/${ELASTICSEARCH_SNAPSHOT_NAME}" + +tar xvf "${GENERAL_ARCHIVE_PATH}/${ELASTICSEARCH_LATEST_BACKUP_NAME}" +echo "Latest ElasticSearch backup archive extracted in ${ELASTICSEARCH_SNAPSHOT_FOLDER}" + +# Delete an existing ElasticSearch index +echo "Trying to delete the existing ElasticSearch index" +delete_index=$(curl -s -L -X DELETE "${ELASTICSEARCH_API_URL}/${ELASTICSEARCH_INDEX}/") + +ack_delete=$(jq '.acknowledged == true' <<< delete_index) +if [ delete_index != true ]; then + echo "Couldn't delete ${ELASTICSEARCH_INDEX} index, maybe it was already deleted" +else + echo "Existing ${ELASTICSEARCH_INDEX} index deleted" +fi + +# Restoring the extracted snapshot +echo "Restoring ${ELASTICSEARCH_SNAPSHOT_NAME} snapshot" +restore_status=$(curl -s -L -X POST "${ELASTICSEARCH_API_URL}/_snapshot/${ELASTICSEARCH_SNAPSHOT_REPOSITORY}/${ELASTICSEARCH_SNAPSHOT_NAME}/_restore?wait_for_completion=true") + +echo "ElasticSearch data restoration done!" +rm -rf ${ELASTICSEARCH_SNAPSHOT_FOLDER} +``` + +For additional details, refer to the [official Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html). \ No newline at end of file diff --git a/docs/includes/hot-restore-file-storage.md b/docs/includes/hot-restore-file-storage.md new file mode 100644 index 0000000000..a400962cd2 --- /dev/null +++ b/docs/includes/hot-restore-file-storage.md @@ -0,0 +1,33 @@ +To restore a backup for file storage, run the following script: + +```bash +#!/bin/bash + +# TheHive attachment variables +ATTACHMENT_FOLDER=/opt/thp/thehive/files + +# Backup variables +GENERAL_ARCHIVE_PATH=/mnt/backup + +# Look for the latest archived attachment files snapshot +ATTACHMENT_BACKUP_LIST=(${GENERAL_ARCHIVE_PATH}/files_????????_??h??m??s.tar) +ATTACHMENT_LATEST_BACKUP_NAME=$(basename ${ATTACHMENT_BACKUP_LIST[-1]}) + +echo "Latest attachment files backup archive found is ${GENERAL_ARCHIVE_PATH}/${ATTACHMENT_LATEST_BACKUP_NAME}" + +# Extract the latest archive +ATTACHMENT_SNAPSHOT_NAME=$(echo ${ATTACHMENT_LATEST_BACKUP_NAME} | cut -d '.' -f 1) +ATTACHMENT_SNAPSHOT_FOLDER="${GENERAL_ARCHIVE_PATH}/${ATTACHMENT_SNAPSHOT_NAME}" + +tar xvf "${GENERAL_ARCHIVE_PATH}/${ATTACHMENT_LATEST_BACKUP_NAME}" +echo "Latest attachment files backup archive extracted in ${ATTACHMENT_SNAPSHOT_FOLDER}" + +# Clean existing TheHive attachment files +rm -rf ${ATTACHMENT_FOLDER}/* + +# Copy the attachment files from the backup +cp -r ${ATTACHMENT_SNAPSHOT_FOLDER}/* ${ATTACHMENT_FOLDER}/ + +echo "attachment files data restoration done!" +rm -rf ${ATTACHMENT_SNAPSHOT_FOLDER} +``` \ No newline at end of file diff --git a/docs/includes/implications-cold-backup-restore.md b/docs/includes/implications-cold-backup-restore.md new file mode 100644 index 0000000000..77bb138897 --- /dev/null +++ b/docs/includes/implications-cold-backup-restore.md @@ -0,0 +1,2 @@ +!!! note "Cold vs. hot backups and restores" + Before proceeding, ensure you fully understand [the implications of performing a cold backup and restore](/thehive/operations/backup-restore/cold-hot-backup-restore/). This process requires stopping all services to ensure data integrity and is available only for standalone servers. 
\ No newline at end of file diff --git a/docs/includes/preliminary-checks-hot-backup.md b/docs/includes/preliminary-checks-hot-backup.md new file mode 100644 index 0000000000..9fc0e5f826 --- /dev/null +++ b/docs/includes/preliminary-checks-hot-backup.md @@ -0,0 +1,41 @@ +Perform a preliminary check on the system to identify any data corruption or inconsistencies. Resolve any issues before proceeding with the backup. + +#### Check service status + +Ensure Cassandra, Elasticsearch, and TheHive are running: + +```bash +systemctl status thehive +systemctl status cassandra +systemctl status elasticsearch +``` + +#### Check Cassandra and Elasticsearch health + +For Cassandra, verify the status and check for issues: + +```bash +nodetool status +``` + +Nodes should be marked as `UN` (Up/Normal). + +For Elasticsearch, ensure the cluster health is green: + +```bash +curl -X GET "http://127.0.0.1:9200/_cluster/health?pretty" +``` + +Status `green` means the cluster is healthy and fully functional. Other statuses include `yellow`, indicating some replicas are missing but data is still available, and `red`, indicating some data is unavailable. + +If you notice any data inconsistencies, refer to the section titled [Resolve any data inconsistencies](#resolve-any-data-inconsistencies). + +#### Review system logs + +```bash +journalctl -u thehive +journalctl -u cassandra +journalctl -u elasticsearch +``` + +If you encounter any data inconsistencies, refer to [Resolve Data Inconsistencies](/thehive/operations/backup-restore/backup/hot-backup/hot-backup-resolve-data-inconsistencies/) for assistance. \ No newline at end of file diff --git a/docs/includes/prerequisites-hot-backup-restore.md b/docs/includes/prerequisites-hot-backup-restore.md new file mode 100644 index 0000000000..ec3b3934a2 --- /dev/null +++ b/docs/includes/prerequisites-hot-backup-restore.md @@ -0,0 +1,2 @@ +!!! danger "Think twice before proceeding with hot backup" + Before proceeding, read the [Cold vs. Hot Backups and Restores](/thehive/operations/backup-restore/cold-hot-backup-restore/) topic to understand the implications of hot backup and ensure they aligns with your organization's requirements and needs. Keep in mind that while the hot backup option eliminates downtime, data integrity can't be guaranteed. \ No newline at end of file diff --git a/docs/thehive/installation/kubernetes.md b/docs/thehive/installation/kubernetes.md index c10f71e027..370364b408 100644 --- a/docs/thehive/installation/kubernetes.md +++ b/docs/thehive/installation/kubernetes.md @@ -160,4 +160,4 @@ http://cortex..svc:9001 * [Monitoring TheHive](../operations/monitoring.md) * [Troubleshooting](../operations/troubleshooting.md) -* [Perform a Cold Backup for a Stack Running with Docker Compose](../operations/backup-restore/backup/docker-compose.md) \ No newline at end of file +* [Perform a Cold Backup for a Stack Running with Docker Compose](../operations/backup-restore/backup/cold-backup/docker-compose.md) \ No newline at end of file diff --git a/docs/thehive/installation/upgrade-from-4.x.md b/docs/thehive/installation/upgrade-from-4.x.md index 5b41216da1..9f95326a8c 100644 --- a/docs/thehive/installation/upgrade-from-4.x.md +++ b/docs/thehive/installation/upgrade-from-4.x.md @@ -30,7 +30,7 @@ Before proceeding with the upgrade, ensure to back up the following components: - Index - Files -For detailed instructions on how to perform backups, refer to our [**backup and restore guide**](../operations/backup-restore/overview.md). 
+For detailed instructions on how to perform backups, refer to our [**backup and restore guide**](../operations/backup-restore/cold-hot-backup-restore.md).
 
 
diff --git a/docs/thehive/operations/backup-restore/assets/cassandra_backup.sh b/docs/thehive/operations/backup-restore/assets/cassandra_backup.sh
new file mode 100644
index 0000000000..9ce2db4734
--- /dev/null
+++ b/docs/thehive/operations/backup-restore/assets/cassandra_backup.sh
@@ -0,0 +1,53 @@
+#!/bin/bash
+
+# Cassandra variables
+CASSANDRA_KEYSPACE=thehive
+CASSANDRA_DATA_FOLDER=/var/lib/cassandra
+
+# Backup variables
+GENERAL_ARCHIVE_PATH=/mnt/backup
+SNAPSHOT_NAME="cassandra_$(date +%Y%m%d_%Hh%Mm%Ss)"
+CASSANDRA_ARCHIVE_PATH="${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}/${CASSANDRA_KEYSPACE}"
+
+# Perform a snapshot of the keyspace
+echo "Starting snapshot ${SNAPSHOT_NAME} for keyspace ${CASSANDRA_KEYSPACE}"
+nodetool snapshot -t ${SNAPSHOT_NAME} ${CASSANDRA_KEYSPACE}
+
+# Make sure the snapshot folder exists and the permissions of its contents are correct
+mkdir -p ${CASSANDRA_ARCHIVE_PATH}
+chown -R cassandra:cassandra ${CASSANDRA_ARCHIVE_PATH}
+echo "Snapshot of all ${CASSANDRA_KEYSPACE} tables will be stored inside ${CASSANDRA_ARCHIVE_PATH}"
+
+# Save the cql schema of the keyspace
+cqlsh -e "DESCRIBE KEYSPACE ${CASSANDRA_KEYSPACE}" > "${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}/create_keyspace_${CASSANDRA_KEYSPACE}.cql"
+echo "The keyspace cql definition for ${CASSANDRA_KEYSPACE} is stored in this file: ${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}/create_keyspace_${CASSANDRA_KEYSPACE}.cql"
+
+# For each table folder in the keyspace folder of the snapshot
+for TABLE in $(ls ${CASSANDRA_DATA_FOLDER}/data/${CASSANDRA_KEYSPACE}); do
+    # Folder where the snapshot files are stored
+    TABLE_SNAPSHOT_FOLDER=${CASSANDRA_DATA_FOLDER}/data/${CASSANDRA_KEYSPACE}/${TABLE}/snapshots/${SNAPSHOT_NAME}
+
+    # Create a folder for each table
+    mkdir ${CASSANDRA_ARCHIVE_PATH}/${TABLE}
+    chown -R cassandra:cassandra ${CASSANDRA_ARCHIVE_PATH}/${TABLE}
+
+    # Copy the snapshot files to the proper table folder
+    # Snapshot files are hard links,
+    # so we use --remove-destination to make sure the files are actually copied and not just linked
+    cp -p --remove-destination ${TABLE_SNAPSHOT_FOLDER}/* ${CASSANDRA_ARCHIVE_PATH}/${TABLE}
+done
+
+# Delete the Cassandra snapshot once it's backed up
+nodetool clearsnapshot -t ${SNAPSHOT_NAME} > /dev/null
+
+# Create a ".tar" archive with the folder containing the backed up Cassandra data
+cd ${GENERAL_ARCHIVE_PATH}
+tar cf ${SNAPSHOT_NAME}.tar ${SNAPSHOT_NAME}
+
+# Remove the folder once the archive is created
+rm -rf ${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}
+
+# Display the location of the Cassandra archive
+echo ""
+echo "Cassandra backup done!
Keep the following backup archive safe:" +echo "${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}.tar" diff --git a/docs/thehive/operations/backup-restore/assets/cassandra_restore.sh b/docs/thehive/operations/backup-restore/assets/cassandra_restore.sh new file mode 100644 index 0000000000..4a63bef0d4 --- /dev/null +++ b/docs/thehive/operations/backup-restore/assets/cassandra_restore.sh @@ -0,0 +1,49 @@ +#!/bin/bash + +# Cassandra variables +CASSANDRA_KEYSPACE=thehive + +# Backup variables +GENERAL_ARCHIVE_PATH=/mnt/backup + +# Look for the latest archived Cassandra snapshot +CASSANDRA_BACKUP_LIST=(${GENERAL_ARCHIVE_PATH}/cassandra_????????_??h??m??s.tar) +CASSANDRA_LATEST_BACKUP_NAME=$(basename ${CASSANDRA_BACKUP_LIST[-1]}) + +echo "Latest Cassandra backup archive found is ${GENERAL_ARCHIVE_PATH}/${CASSANDRA_LATEST_BACKUP_NAME}" + +# Extract the latest archive +CASSANDRA_SNAPSHOT_NAME=$(echo ${CASSANDRA_LATEST_BACKUP_NAME} | cut -d '.' -f 1) +CASSANDRA_SNAPSHOT_FOLDER="${GENERAL_ARCHIVE_PATH}/${CASSANDRA_SNAPSHOT_NAME}" + +tar xvf "${GENERAL_ARCHIVE_PATH}/${CASSANDRA_LATEST_BACKUP_NAME}" +echo "Latest Cassandra backup archive extracted in ${CASSANDRA_SNAPSHOT_FOLDER}" + +# Go inside the Cassandra snapshot recently extracted +cd ${CASSANDRA_SNAPSHOT_FOLDER} + +# Check if Cassandra already has an existing keyspace +cqlsh -e "DESCRIBE KEYSPACE ${CASSANDRA_KEYSPACE}" > "${CASSANDRA_SNAPSHOT_FOLDER}/target_keyspace_${CASSANDRA_KEYSPACE}.cql" + +if cmp --silent -- "${CASSANDRA_SNAPSHOT_FOLDER}/create_keyspace_${CASSANDRA_KEYSPACE}.cql" "${CASSANDRA_SNAPSHOT_FOLDER}/target_keyspace_${CASSANDRA_KEYSPACE}.cql"; then + echo "Existing ${CASSANDRA_KEYSPACE} keyspace definition is identical to the one in the backup, no need to drop and recreate it" +else + echo "Existing ${CASSANDRA_KEYSPACE} keyspace definition does not match the one in the backup, dropping it" + cqlsh --request-timeout=120 -e "DROP KEYSPACE IF EXISTS ${CASSANDRA_KEYSPACE};" + sleep 5s + echo "Creating ${CASSANDRA_KEYSPACE} keyspace using the definition from the backup" + cqlsh --request-timeout=120 -f ${CASSANDRA_SNAPSHOT_FOLDER}/create_keyspace_${CASSANDRA_KEYSPACE}.cql +fi + +# Create the tables and load related data +cd ${CASSANDRA_KEYSPACE} +for TABLE in $(ls); do + TABLE_BASENAME=$(basename ${TABLE}) + TABLE_NAME=${TABLE_BASENAME%%-*} + echo "Importing ${TABLE_NAME} table and related data" + nodetool import ${CASSANDRA_KEYSPACE} ${TABLE_NAME} ${CASSANDRA_SNAPSHOT_FOLDER}/${CASSANDRA_KEYSPACE}/${TABLE} + echo "" +done + +echo "Cassandra data restoration done!" +rm -rf ${CASSANDRA_SNAPSHOT_FOLDER} diff --git a/docs/thehive/operations/backup-restore/assets/elasticsearch_backup.sh b/docs/thehive/operations/backup-restore/assets/elasticsearch_backup.sh new file mode 100644 index 0000000000..cc54399097 --- /dev/null +++ b/docs/thehive/operations/backup-restore/assets/elasticsearch_backup.sh @@ -0,0 +1,62 @@ +#!/bin/bash + +# Elasticsearch variables +ELASTICSEARCH_API_URL='http://127.0.0.1:9200' +ELASTICSEARCH_SNAPSHOT_REPOSITORY=thehive_repository +ELASTICSEARCH_INDEX=thehive_global + +# Backup variables +GENERAL_ARCHIVE_PATH=/mnt/backup +SNAPSHOT_NAME="elasticsearch_$(date +%Y%m%d_%Hh%Mm%Ss)" + +# Creating the backup folder if needed +mkdir -p ${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME} + +# Check if the snapshot repository is correctly registered +repository_config=$(curl -s -L "${ELASTICSEARCH_API_URL}/_snapshot") +repository_ok=$(jq 'has("'${ELASTICSEARCH_SNAPSHOT_REPOSITORY}'")' <<< ${repository_config}) +if ! 
${repository_ok}; then + echo "Abort, no snapshot repository registered in ElasticSearch" + echo "Set the repository folder 'path.repo'" + echo "in an environment variable" + echo "or in elasticsearch.yml" + exit 1 +fi + +# Starting the snapshot +create_snapshot=$(curl -s -L -X PUT "${ELASTICSEARCH_API_URL}/_snapshot/thehive_repository/${SNAPSHOT_NAME}" -H 'Content-Type: application/json' -d '{"indices":"'${ELASTICSEARCH_INDEX}'", "ignore_unavailable":true, "include_global_state":false}') + +# Verify that the snapshot started correctly +create_started=$(jq '.accepted == true' <<< ${create_snapshot}) +if [ ${create_started} != true ] +then + echo "Couldn't start the snapshot" + exit 1 +fi +echo "Snapshot started" + +# Verify that the snapshot is finshed +state="NONE" +while [ "${state}" != "\"SUCCESS\"" ]; do + echo "Snapshot in progress, waiting 5 seconds before checking status again..." + sleep 5 + snapshot_list=$(curl -s -L "${ELASTICSEARCH_API_URL}/_snapshot/${ELASTICSEARCH_SNAPSHOT_REPOSITORY}/*?verbose=false") + state=$(jq '.snapshots[] | select(.snapshot == "'${SNAPSHOT_NAME}'").state' <<< ${snapshot_list}) +done +echo "Snapshot finished" + +# Print the snapshot short informations +final_state=$(jq '.snapshots[] | select(.snapshot == "'${SNAPSHOT_NAME}'")' <<< ${snapshot_list}) +echo ${final_state} | jq --color-output . + +# Create a ".tar" archive with the folder containing the backed up Elasticsearch index +cd ${GENERAL_ARCHIVE_PATH} +tar cf ${SNAPSHOT_NAME}.tar ${SNAPSHOT_NAME} + +# Remove the folder once the archive is created +rm -rf ${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME} + +# Display the location of the Elasticsearch archive +echo "" +echo "ElasticSearch backup done! Keep the following backup archive safe:" +echo "${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}.tar" diff --git a/docs/thehive/operations/backup-restore/assets/elasticsearch_restore.sh b/docs/thehive/operations/backup-restore/assets/elasticsearch_restore.sh new file mode 100644 index 0000000000..5c874ec771 --- /dev/null +++ b/docs/thehive/operations/backup-restore/assets/elasticsearch_restore.sh @@ -0,0 +1,37 @@ +#!/bin/bash + +# ElasticSearch variables +ELASTICSEARCH_API_URL='http://127.0.0.1:9200' +ELASTICSEARCH_SNAPSHOT_REPOSITORY=thehive_repository +ELASTICSEARCH_INDEX=thehive_global + +# Look for the latest archived ElasticSearch snapshot +ELASTICSEARCH_BACKUP_LIST=(${GENERAL_ARCHIVE_PATH}/elasticsearch_????????_??h??m??s.tar) +ELASTICSEARCH_LATEST_BACKUP_NAME=$(basename ${ELASTICSEARCH_BACKUP_LIST[-1]}) + +echo "Latest ElasticSearch backup archive found is ${GENERAL_ARCHIVE_PATH}/${ELASTICSEARCH_LATEST_BACKUP_NAME}" + +# Extract the latest archive +ELASTICSEARCH_SNAPSHOT_NAME=$(echo ${ELASTICSEARCH_LATEST_BACKUP_NAME} | cut -d '.' 
-f 1) +ELASTICSEARCH_SNAPSHOT_FOLDER="${GENERAL_ARCHIVE_PATH}/${ELASTICSEARCH_SNAPSHOT_NAME}" + +tar xvf "${GENERAL_ARCHIVE_PATH}/${ELASTICSEARCH_LATEST_BACKUP_NAME}" +echo "Latest ElasticSearch backup archive extracted in ${ELASTICSEARCH_SNAPSHOT_FOLDER}" + +# Delete an existing ElasticSearch index +echo "Trying to delete the existing ElasticSearch index" +delete_index=$(curl -s -L -X DELETE "${ELASTICSEARCH_API_URL}/${ELASTICSEARCH_INDEX}/") + +ack_delete=$(jq '.acknowledged == true' <<< delete_index) +if [ delete_index != true ]; then + echo "Couldn't delete ${ELASTICSEARCH_INDEX} index, maybe it was already deleted" +else + echo "Existing ${ELASTICSEARCH_INDEX} index deleted" +fi + +# Restoring the extracted snapshot +echo "Restoring ${ELASTICSEARCH_SNAPSHOT_NAME} snapshot" +restore_status=$(curl -s -L -X POST "${ELASTICSEARCH_API_URL}/_snapshot/${ELASTICSEARCH_SNAPSHOT_REPOSITORY}/${ELASTICSEARCH_SNAPSHOT_NAME}/_restore?wait_for_completion=true") + +echo "ElasticSearch data restoration done!" +rm -rf ${ELASTICSEARCH_SNAPSHOT_FOLDER} diff --git a/docs/thehive/operations/backup-restore/assets/files_backup.sh b/docs/thehive/operations/backup-restore/assets/files_backup.sh new file mode 100644 index 0000000000..1a681e20f8 --- /dev/null +++ b/docs/thehive/operations/backup-restore/assets/files_backup.sh @@ -0,0 +1,24 @@ +#!/bin/bash + +# TheHive attachment variables +ATTACHMENT_FOLDER=/opt/thp/thehive/files + +# Backup variables +GENERAL_ARCHIVE_PATH=/mnt/backup +SNAPSHOT_NAME="files_$(date +%Y%m%d_%Hh%Mm%Ss)" +ATTACHMENT_ARCHIVE_PATH="${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}" + +# Copy all TheHive attachment +cp -r ${ATTACHMENT_FOLDER}/* ${ATTACHMENT_ARCHIVE_PATH}/ + +# Create a ".tar" archive with the folder containing the backed up attachment files +cd ${GENERAL_ARCHIVE_PATH} +tar cf ${SNAPSHOT_NAME}.tar ${SNAPSHOT_NAME} + +# Remove the folder once the archive is created +rm -rf ${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME} + +# Display the location of the attachment archive +echo "" +echo "TheHive attachment files backup done! Keep the following backup archive safe:" +echo "${GENERAL_ARCHIVE_PATH}/${SNAPSHOT_NAME}.tar" diff --git a/docs/thehive/operations/backup-restore/assets/files_restore.sh b/docs/thehive/operations/backup-restore/assets/files_restore.sh new file mode 100644 index 0000000000..faf8b17d24 --- /dev/null +++ b/docs/thehive/operations/backup-restore/assets/files_restore.sh @@ -0,0 +1,29 @@ +#!/bin/bash + +# TheHive attachment variables +ATTACHMENT_FOLDER=/opt/thp/thehive/files + +# Backup variables +GENERAL_ARCHIVE_PATH=/mnt/backup + +# Look for the latest archived attachment files snapshot +ATTACHMENT_BACKUP_LIST=(${GENERAL_ARCHIVE_PATH}/files_????????_??h??m??s.tar) +ATTACHMENT_LATEST_BACKUP_NAME=$(basename ${ATTACHMENT_BACKUP_LIST[-1]}) + +echo "Latest attachment files backup archive found is ${GENERAL_ARCHIVE_PATH}/${ATTACHMENT_LATEST_BACKUP_NAME}" + +# Extract the latest archive +ATTACHMENT_SNAPSHOT_NAME=$(echo ${ATTACHMENT_LATEST_BACKUP_NAME} | cut -d '.' -f 1) +ATTACHMENT_SNAPSHOT_FOLDER="${GENERAL_ARCHIVE_PATH}/${ATTACHMENT_SNAPSHOT_NAME}" + +tar xvf "${GENERAL_ARCHIVE_PATH}/${ATTACHMENT_LATEST_BACKUP_NAME}" +echo "Latest attachment files backup archive extracted in ${ATTACHMENT_SNAPSHOT_FOLDER}" + +# Clean existing TheHive attachment files +rm -rf ${ATTACHMENT_FOLDER}/* + +# Copy the attachment files from the backup +cp -r ${ATTACHMENT_SNAPSHOT_FOLDER}/* ${ATTACHMENT_FOLDER}/ + +echo "attachment files data restoration done!" 
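+# Remove the extracted snapshot folder once the restore is complete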
+rm -rf ${ATTACHMENT_SNAPSHOT_FOLDER} diff --git a/docs/thehive/operations/backup-restore/backup/cloud.md b/docs/thehive/operations/backup-restore/backup/cloud.md deleted file mode 100644 index 54518e3c5e..0000000000 --- a/docs/thehive/operations/backup-restore/backup/cloud.md +++ /dev/null @@ -1,2 +0,0 @@ -# Cloud backup -TBD \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/backup/docker-compose.md b/docs/thehive/operations/backup-restore/backup/cold-backup/docker-compose.md similarity index 80% rename from docs/thehive/operations/backup-restore/backup/docker-compose.md rename to docs/thehive/operations/backup-restore/backup/cold-backup/docker-compose.md index 86e8f263da..4a13dd3fed 100644 --- a/docs/thehive/operations/backup-restore/backup/docker-compose.md +++ b/docs/thehive/operations/backup-restore/backup/cold-backup/docker-compose.md @@ -1,36 +1,26 @@ -# Backup a stack run with Docker Compose +# How to Perform a Cold Backup for a Stack Running with Docker Compose -!!! Note - This solution assumes you are following our _[Running with Docker](./../../../installation/docker.md)_ guide to run your application stack. +This topic provides step-by-step instructions for performing a cold backup of a stack running with Docker Compose for TheHive. +{!includes/implications-cold-backup-restore.md!} ---- -## Introduction +{!includes/backup-restore-best-practices.md!} -The backup procedure is designed to capture the state of your application stack in three simple steps: - -1. Stop all services to ensure data consistency and prevent any changes during the backup process. -2. Copy volumes and mapped directories on the host machine, which contain your data, configurations, and logs. -3. Restart the services to resume normal operations after the backup is complete. - - ---- ## Prerequisites -Before starting, ensure: +This solution assumes you are following the [Running TheHive with Docker](../../../../installation/docker.md) guide to run your application stack. + +Before starting, ensure you have: -* You have sufficient storage space for the backup. +* Sufficient storage space for the backup. * Administrative privileges to stop and start Docker Compose services. * Familiarity with the locations of your mapped volumes and data directories. -This step-by-step procedure ensures a safe and consistent backup of your Docker Compose stack, enabling quick recovery in case of an issue or migration to a new environment. For more details on restoring these backups, refer to the [restore procedure for Docker Compose](../restore/docker-compose.md). +This step-by-step procedure ensures a safe and consistent backup of your Docker Compose stack, enabling quick recovery in case of an issue or migration to a new environment. For more details on restoring these backups, refer to the [restore procedure for Docker Compose](../../restore/cold-restore/docker-compose.md). +## Step 1: Stop the services - ---- -## Step-by-step instructions - -### Stop the services +Stop all services to ensure data consistency and prevent any changes during the backup process. !!! Example "" @@ -38,9 +28,15 @@ This step-by-step procedure ensures a safe and consistent backup of your Docker docker compose down ``` -### Copy files in a backup folder +## Step 2: Copy files in a backup folder + +Copy volumes and mapped directories on the host machine, which contain your data, configurations, and logs. 
+ +For example, on the host server, create a folder on a dedicated NFS volume named `/opt/backups` and copy all files preserving their permissions. -For example, on the host server, create a folder on a dedicated NFS volume named `/opt/backups` and copy all files preserving their permissions +!!! tip "Tips" + * [Docker Compose profiles](https://github.com/StrangeBeeCorp/docker) include a comprehensive backup script, including all necessary housekeeping actions. + * You can also review the backup script for `prod1-thehive` directly on the [Docker Compose profiles GitHub repository](https://github.com/StrangeBeeCorp/docker). !!! Example "" @@ -80,12 +76,12 @@ For example, on the host server, create a folder on a dedicated NFS volume named ## ## ADDITIONAL RESOURCES: ## Refer to the official documentation for detailed instructions and - ## additional information: https://docs.strangebee.com/thehive/operations/backup-restore/ + ## additional information: https://docs.strangebee.com/thehive/operations/backup-restore/backup/cold-backup/docker-compose.md. ## ## WARNING: ## - This script stops Nginx, Elasticsearch, Cassandra, and TheHive services, ## performs the backup, and then restarts the services. - ## - Do not modify the rest of the script unless necessary. + ## - Don't modify the rest of the script unless necessary. ## ## ============================================================ ## DO NOT MODIFY ANYTHING BELOW THIS LINE @@ -142,7 +138,6 @@ For example, on the host server, create a folder on a dedicated NFS volume named DATE="$(date +"%Y%m%d-%H%M%z" | sed 's/+/-/')" BACKUP_FOLDER="${BACKUP_ROOT_FOLDER}/${DATE}" - ## Stop services docker compose -f ${DOCKER_COMPOSE_PATH}/docker-compose.yml stop @@ -154,8 +149,6 @@ For example, on the host server, create a folder on a dedicated NFS volume named LOG_FILE="${BACKUP_ROOT_FOLDER}/backup_log_${DATE}.log" exec &> >(tee -a "$LOG_FILE") - - ## Prepare folders tree mkdir -p ${BACKUP_FOLDER}/{thehive,cassandra,elasticsearch,nginx,certificates} echo "Created folder structure under ${BACKUP_FOLDER}" @@ -165,7 +158,7 @@ For example, on the host server, create a folder on a dedicated NFS volume named rsync -aW --no-compress ${DOCKER_COMPOSE_PATH}/thehive/ ${BACKUP_FOLDER}/thehive || { echo "TheHive backup failed"; exit 1; } echo "TheHive backup completed." - ## Copy Casssandra data + ## Copy Cassandra data echo "Starting Cassandra backup..." rsync -aW --no-compress ${DOCKER_COMPOSE_PATH}/cassandra/ ${BACKUP_FOLDER}/cassandra || { echo "Cassandra backup failed"; exit 1; } echo "Cassandra backup completed." @@ -185,17 +178,17 @@ For example, on the host server, create a folder on a dedicated NFS volume named echo "Restarting services..." docker compose up -d -f ${DOCKER_COMPOSE_PATH}/docker-compose.yml - - echo "Backup process completed at: $(date)" ``` ---- -## Validation +## Step 3: Validate the backup + +Check the backup folder and verify that the data has been copied correctly. + +## Step 4: Restart all services + +Use `docker compose up -d -f ${DOCKER_COMPOSE_PATH}/docker-compose.yml` to restart all services with the command line. -check the backup folder and verify the data has been well copied. +

Next steps

---- -!!! Tip - * A comprehensive backup script, including all necessary housekeeping actions, is included with our Docker Compose profiles. Refer to the appropriate documentation for detailed instructions [here](https://github.com/StrangeBeeCorp/docker). - * You can also review the backup script for `prod1-thehive` directly on our GitHub repository. \ No newline at end of file +* [Restore a Cold Backup for a Stack Running with Docker Compose](../../restore/cold-restore/docker-compose.md) \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/backup/physical-server.md b/docs/thehive/operations/backup-restore/backup/cold-backup/physical-server.md similarity index 84% rename from docs/thehive/operations/backup-restore/backup/physical-server.md rename to docs/thehive/operations/backup-restore/backup/cold-backup/physical-server.md index 88f35d546f..c2c4674d54 100644 --- a/docs/thehive/operations/backup-restore/backup/physical-server.md +++ b/docs/thehive/operations/backup-restore/backup/cold-backup/physical-server.md @@ -1,30 +1,22 @@ -# For Physical Servers - -## Introduction +# How to Perform a Cold Backup on a Physical Server -Unlike virtualized or containerized environments, physical servers require direct access to the file system and services to perform backups. This procedure focuses on cold backups, where services are stopped to ensure the integrity and consistency of the data, indices, and logs. +This topic provides step-by-step instructions for performing a cold backup on a physical server for TheHive. -When performing a backup on physical servers, it’s essential to: +Unlike virtualized or containerized environments, physical servers require direct access to the file system and services to perform backups. -1. Stop services (e.g., Elasticsearch, Cassandra, TheHive) to avoid data corruption. -2. Ensure file permissions are adequate for the backup process. -3. Use tools like rsync to copy data, configuration files, and logs to a designated backup location. -4. Validate the backup to ensure it can be restored without issues. +{!includes/implications-cold-backup-restore.md!} + +{!includes/backup-restore-best-practices.md!} ---- ## Prerequisites This guide assumes you have direct access to the server via SSH or other administrative tools and sufficient disk space to store backups. By following this procedure, you can create a consistent backup that can be securely archived or transferred for disaster recovery purposes. -This process and example below assume you have followed our [step-by-step guide](./../../../installation/step-by-step-installation-guide.md) to install the application stack. - -!!! Note - Before proceeding, ensure you have read the general [Backup and Restore Overview](../overview.md) to understand the core principles of backup strategies. +This process and example below assume you have followed the [step-by-step guide](../../../../installation/step-by-step-installation-guide.md) to install the application stack. ---- -## Step-by-step instructions +## Step 1: Stop the services -### Stop the services in this order +Stop services in this order to avoid data corruption: 1. TheHive 2. 
Elasticsearch @@ -38,9 +30,11 @@ This process and example below assume you have followed our [step-by-step guide] systemctl stop cassandra ``` -### Copy files in a backup folder +## Step 2: Copy files in a backup folder -For example, create a folder on a dedicated NFS volume named `/opt/backups` and copy all files preserving their permissions +Use tools like rsync to copy data, configuration files, and logs to a designated backup location. + +For example, create a folder on a dedicated NFS volume named `/opt/backups` and copy all files preserving their permissions. !!! Example "" @@ -84,7 +78,7 @@ For example, create a folder on a dedicated NFS volume named `/opt/backups` and ## WARNING: ## - This script stops Elasticsearch, Cassandra, and TheHive services, ## performs the backup, and then restarts the services. - ## - Do not modify the rest of the script unless necessary. + ## - Don't modify the rest of the script unless necessary. ## ## ============================================================ ## DO NOT MODIFY ANYTHING BELOW THIS LINE @@ -160,7 +154,7 @@ For example, create a folder on a dedicated NFS volume named `/opt/backups` and rsync -aW --no-compress /var/log/thehive/ ${BACKUP_FOLDER}/thehive/logs || { echo "TheHive logs backup failed"; exit 1; } echo "TheHive backup completed." - # Copy Casssandra data + # Copy Cassandra data echo "Starting Cassandra backup..." rsync -aW --no-compress /etc/cassandra/ ${BACKUP_FOLDER}/cassandra/config || { echo "Cassandra config backup failed"; exit 1; } rsync -aW --no-compress /var/lib/cassandra/ ${BACKUP_FOLDER}/cassandra/data || { echo "Cassandra data backup failed"; exit 1; } @@ -177,8 +171,9 @@ For example, create a folder on a dedicated NFS volume named `/opt/backups` and echo "Backup process completed at: $(date)" ``` +## Step 3: Restart all services -### Start services in this order +Restart services in this order: 1. Elasticsearch 2. Cassandra @@ -192,8 +187,10 @@ For example, create a folder on a dedicated NFS volume named `/opt/backups` and systemctl start cassandra ``` +## Step 4: Validate the backup + +Check the backup folder and verify that the data has been copied correctly. ---- -## Validation +

Next steps

-check the backup folder and verify the data has been well copied.
\ No newline at end of file
+* [Restore a Cold Backup on a Physical Server](../../restore/cold-restore/physical-server.md)
\ No newline at end of file
diff --git a/docs/thehive/operations/backup-restore/backup/cold-backup/virtual-server.md b/docs/thehive/operations/backup-restore/backup/cold-backup/virtual-server.md
new file mode 100644
index 0000000000..1df82aa44c
--- /dev/null
+++ b/docs/thehive/operations/backup-restore/backup/cold-backup/virtual-server.md
@@ -0,0 +1,27 @@
+# How to Perform a Cold Backup on a Virtual Server
+
+This topic provides step-by-step instructions for performing a cold backup on a virtual server for TheHive.
+
+Using virtual servers provides more flexibility in performing backup and restore operations.
+
+{!includes/implications-cold-backup-restore.md!}
+
+{!includes/backup-restore-best-practices.md!}
+
+## Prerequisites
+
+This process and the example below assume you have followed the [step-by-step guide](../../../../installation/step-by-step-installation-guide.md) to install the application stack.
+
+## First option: Back up data folders
+
+Similar to using a physical server, use scripts to back up the configuration, data, and logs from each application in your stack. Store the backups in a folder that can be archived elsewhere. Refer to the [Perform a Cold Backup on a Physical Server](physical-server.md) guide for detailed instructions.
+
+## Second option: Leverage the capabilities of the hypervisor
+
+Hypervisors typically provide the ability to create snapshots of individual volumes and of entire virtual machines. Create snapshots of the volumes containing data and files after stopping the TheHive, Cassandra, and Elasticsearch services.
+
+For the restore process, begin by restoring the snapshots created with the hypervisor. This allows you to quickly revert to a previous state, ensuring that both the system configuration and application data are restored to their exact state at the time of the snapshot. Be sure to follow any additional procedures specific to your hypervisor to ensure the snapshots are properly applied and that the system operates as expected after the restore.
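+If you take the hypervisor route, a minimal sequence on the guest before triggering the snapshot could look like the following; it uses the same stop order as the physical-server guide, and you start the services again as described there once the snapshot completes:
+
+```bash
+# Stop the stack so the data on the snapshotted volumes is consistent
+systemctl stop thehive
+systemctl stop elasticsearch
+systemctl stop cassandra
+```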

Next steps

+ +* [Restore a Cold Backup on a Virtual Server](../../restore/cold-restore/virtual-server.md) \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/backup/hot-backup.md b/docs/thehive/operations/backup-restore/backup/hot-backup.md deleted file mode 100644 index 6a3db7914b..0000000000 --- a/docs/thehive/operations/backup-restore/backup/hot-backup.md +++ /dev/null @@ -1,299 +0,0 @@ -# Hot backups - -!!! Warning - As outlined in this documentation, it is **crucial** that the data, index, and files must remain intact and consistent during the backup process. Any inconsistency could jeopardize the restore process. While we provide guidance on creating and restoring Cassandra snapshots, it is important to note that Elasticsearch snapshots should be taken concurrently to ensure consistency and integrity across both systems. Additionally, when using clusters, snapshots should be taken simultaneously across all nodes to maintain consistency. - - The scripts provided are examples that **you must customize** to fit your infrastructure and application stack. For instance, folder paths may differ depending on whether you're using physical servers or Docker Compose to run the services. - - ---- -## Introduction - -Hot backups are an essential strategy for maintaining business continuity in environments where downtime is not acceptable. Unlike cold backups, which require stopping services to ensure consistency, hot backups allow you to capture snapshots of your data while systems remain operational. This approach is particularly suited for high-availability environments where even minimal downtime could disrupt critical operations. - -However, hot backups come with unique challenges. Ensuring the consistency of data, indices, and files during the backup process is critical to avoid issues during restoration. For TheHive, this means taking simultaneous snapshots of its database (Cassandra), indexing engine (Elasticsearch), and file storage. While the process enables uninterrupted service, it requires careful planning and execution to ensure that all components remain synchronized. - -This guide provides detailed steps to perform hot backups of TheHive’s components, including Cassandra, Elasticsearch, and file storage. It also discusses the risks, prerequisites, and best practices to ensure your data is securely backed up without affecting the system’s availability. Whether you're operating on physical servers, virtual machines, or containerized environments, this guide will help you implement a reliable hot backup strategy tailored to your infrastructure. - -### Pre-requisites - -Before performing a hot backup, ensure the following prerequisites are met to facilitate a smooth and reliable process: - -### General requirements - -Ensure the following tools are installed on your system: - -* **Cassandra Nodetool**: For creating database snapshots. -* **tar/bzip2**: For archiving and compressing backup files. -* **rsync**: For transferring file storage data. - -Install any missing tools using package managers such as apt or yum: - -* `apt install bzip2` for DEB based OS -* `yum install bzip2` for RPM based OS - -  - -### Configuration knowledge - -#### Cassandra Keyspace - -Identify the keyspace used by TheHive. This is typically defined in the _application.conf_ file under the `db.janusgraph.storage.cql.keyspace` attribute. - - -#### Elasticsearch Repository - -Configure a repository for Elasticsearch snapshots. Ensure the repository is accessible and writable by Elasticsearch. 
- - -#### File Storage Location - -Locate the folder or object storage (e.g., MinIO) where TheHive stores files. This will be backed up along with the database and indices. If you're using local filesystem or NFS to store your files, the location is typically defined in the application.conf file under the `storage.localfs.location` attribute. - -### Consistency Considerations - -#### Clustered Environments - -For Cassandra and Elasticsearch clusters, ensure snapshots are taken simultaneously across all nodes to maintain consistency. - - -#### Data Integrity Checks - -Perform a preliminary check on the system for data corruption or inconsistencies. Address any issues before proceeding with the backup. - - -### Testing and Validation - -Test the backup process in a staging or test environment to ensure scripts and configurations work as expected. - -And periodically validate your restoration procedures using test environments to confirm the integrity of the backup data. - - ---- -## Why is backing up the index optional and what are the consequences ? - -In TheHive’s architecture, the decision to back up the Elasticsearch index is a matter of balancing operational priorities such as backup speed, storage requirements, and restoration time. Here’s a breakdown of why backing up the index might be considered optional and the potential consequences of doing so. - -### Why backup the index might be skipped - -#### Rebuild capability - -Elasticsearch indices can be rebuilt from the data stored in Cassandra. This makes it possible to restore the system without explicitly backing up the index, provided the data is intact. - - -#### Backup size optimization - -Excluding the index from the backup process reduces the total backup size. This can be particularly beneficial when dealing with large datasets or limited storage. - - -### Consequences of not backing up the index - -If the index is not backed up, it must be rebuilt from scratch during the restoration process. This can significantly increase the time required to fully restore the system. - -### Recommandation - -While skipping the index backup offers operational benefits during the backup process, it increases the restoration time and may introduce additional challenges. - -* If minimal restoration time is critical, back up the index alongside the database and files. -* If backup speed and storage efficiency are prioritized, skip the index backup but prepare for a longer restoration process. - -Ultimately, the decision depends on your infrastructure, operational priorities, and acceptable downtime during recovery. Ensure you have tested and validated your backup and restore processes in a controlled environment before applying them in production. - - - ---- -## Backup procedures - -### Cassandra (database) - -Ensure that all Cassandra data is safely stored in consistent snapshots. - - -#### Prerequisites - -To back up or export the database from Cassandra, the following information is required: - -* Cassandra admin password -* Keyspace used by TheHive (default = `thehive`). This can be checked in the `application.conf` configuration file, in the database configuration under *storage*, *cql*, and `keyspace` attribute. - -!!! 
Tip - This information can be found in TheHive configuration file - _application.conf_ - under the `db.janusgraph.storage` attribute: - - ```yaml - db.janusgraph { - storage { - backend: cql - hostname: ["127.0.0.1"] - cql { - cluster-name: thp - keyspace: thehive - } - } - } - ``` - - -#### Create snapshots - -Following actions should be performed to backup the data successfully: - -1. Create a snapshot -2. Save the data - -!!! Warning "The steps described below for data backup should be executed on each node. Each node's data should be backed up to ensure consistency in the backups." - -Considering that ${BACKUP} is the name of the snapshot, run the following commands: - - -##### 1 - Use the `nodetool snapshot` command to create a snapshot of all keyspaces - -!!! Example "" - - ```bash - nodetool snapshot -t ${BACKUP} - ``` - -##### 2 - Create and archive with the snapshot data: Execute the following command For every cassandra keyspace - -!!! Example "" - - ```bash - tar cjfv ${KEYSPACE_NAME}.tbz -C /var/lib/cassandra/data/${KEYSPACE_NAME}/*/snapshots/${BACKUP} . - ``` - -##### 3 - Remove old snapshots (if necessary) - -!!! Example "" - - ```bash - nodetool -h localhost -p 7199 clearsnapshot -t ${BACKUP} - ``` - -!!! Note - We strongly recommend copying the snapshot archive files to a remote server or backup storage. - -  - -#### Example - -!!! Example "Example of script to generate backups of TheHive keyspace" - - ```bash - #!/bin/bash - - ## Create a tbz archive containing the snapshot - ## This script should be executed on each node of the cluster - - ## Complete variables before running: - HOSTNAME=$(hostname) - SNAPSHOT_DATE="$(date +%F)" - - REMOTE_USER= - REMOTE_HOST= - - ## Perform a backup for all keyspaces (system included) - nodetool snapshot -t ${SNAPSHOT_DATE} - - ## Navigate to the snapshot directory - - find /var/lib/cassandra/data -name snapshots - - # Archive snapshot files - mkdir -p /var/lib/cassandra/archive_backup/$HOSTNAME/${SNAPSHOT_DATE} - cd /var/lib/cassandra/archive_backup/$HOSTNAME/${SNAPSHOT_DATE} - - for KEYSPACE in $(ls /var/lib/cassandra/data); do - mkdir $KEYSPACE - cd $KEYSPACE - for TABLE in $(ls /var/lib/cassandra/data/${KEYSPACE}); do - tar cjfv ${TABLE}.tbz -C /var/lib/cassandra/data/${KEYSPACE}/${TABLE}/snapshots/${SNAPSHOT_DATE} . - done - cd .. - done - - nodetool -h localhost -p 7199 clearsnapshot -t ${SNAPSHOT_DATE} - - # Copy the snapshot archive files to a remote server - - scp /var/lib/cassandra/archive_backup/$HOSTNAME/${SNAPSHOT_DATE}/* ${REMOTE_USER}@${REMOTE_HOST}:/remote/node-hostname_cassandra_backup_directory - ``` - ---- -## Elasticsearch (index engine) - -### Prerequisites - -#### Snapshot repository configuration - -Elasticsearch requires a snapshot repository to store backups. Common options include: - -* Shared file systems -* AWS S3 buckets -* Azure Blob Storage -* Google Cloud Storage - -Set up the repository before taking snapshots. - -#### Permissions - -Elasticsearch must have the appropriate permissions to write to the snapshot repository. -For shared file systems: - -!!! Example "" - - ```bash - chown -R elasticsearch:elasticsearch /path/to/backups - chmod -R 770 /path/to/backups - ``` - -#### Cluster health - -Ensure the cluster health is green before initiating the backup. - -!!! Example "" - - ```bash - curl -X GET "localhost:9200/_cluster/health?pretty" - ``` - -### Backup procedure - -#### Register a Snapshot Repository - -Use the Elasticsearch API to register a snapshot repository. 
Below is an example for a file-based repository: - -!!! Example "" - - ```bash - curl -X PUT "http://localhost:9200/_snapshot/my_backup" -H 'Content-Type: application/json' -d' - { - "type": "fs", - "settings": { - "location": "/patch/to/backups/elasticsearch", - "compress": true - } - }' - - ``` - -#### Verify Snapshot Completion - -!!! Example "" - - ```bash - curl -X GET "http://localhost:9200/_snapshot/my_backup/snapshot_1" - ``` - - -!!! Note - If using a filesystem-based repository, consider archiving the snapshot files to long-term storage (e.g., cloud storage or external disks). - - ---- - -## Backup files - -Wether you use local or distributed files system storage, copy the content of the folder/bucket. - - - diff --git a/docs/thehive/operations/backup-restore/backup/hot-backup/hot-backup-cluster.md b/docs/thehive/operations/backup-restore/backup/hot-backup/hot-backup-cluster.md new file mode 100644 index 0000000000..165dcf1fa4 --- /dev/null +++ b/docs/thehive/operations/backup-restore/backup/hot-backup/hot-backup-cluster.md @@ -0,0 +1,124 @@ +# How to Perform a Hot Backup on a Cluster + +This topic provides step-by-step instructions for performing a hot backup on a cluster for TheHive. + +{!includes/prerequisites-hot-backup-restore.md!} + +{!includes/data-consistency-hot-backup.md!} + +{!includes/backup-restore-best-practices.md!} + +The process involves backing up three components: Apache Cassandra, Elasticsearch—both distributed across three nodes—and file storage. + +* [Database backup](#create-cassandra-snapshots) +* [Indexing backup](#create-elasticsearch-snapshots) +* [File storage backup](#perform-a-backup-on-file-storage) + +## Prerequisites + +### Install required tools + +{!includes/hot-backup-required-tools.md!} + +### Configure systems + +{!includes/hot-backup-configure-systems.md!} + +### Perform preliminary checks + +{!includes/preliminary-checks-hot-backup.md!} + +### Replicate Cassandra and Elasticsearch data across all three nodes + +!!! warning "Data replication requirement" + If this requirement isn't met, cluster restoration may fail, and integrity issues could arise. It's your responsibility to ensure data replication across all nodes before proceeding. + +Before proceeding with the backup, replicate 100% of your Cassandra and Elasticsearch data across all nodes. This simplifies the snapshot procedure, allowing snapshots to be taken from just one node. + +#### Verify replication factor + +Check the replication factor for your keyspace. It should be set to *3* for a three-node cluster. Use the following command in `cqlsh`: + +```sql +DESCRIBE KEYSPACE thehive; +``` + +If needed, adjust the replication factor: + +```sql +ALTER KEYSPACE thehive WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', '' : 3 }; +``` + +#### Check cluster status + +Ensure all nodes are up and running: + +```bash +nodetool status +``` + +Nodes should be marked as `UN` (Up/Normal). + +#### Run `nodetool repair` + +Run a repair to ensure data consistency across all nodes: + +```bash +nodetool repair +``` + +#### Verify data replication + +Check for any replication issues: + +```bash +nodetool netstats +``` + +## Create Cassandra snapshots + +{!includes/hot-backup-cassandra-snapshots.md!} + +## Create Elasticsearch snapshots + +Before creating Elasticsearch snapshots, ensure Elasticsearch has the appropriate permissions to write to the snapshot repository. + +For shared file systems: + +!!! 
Example "" + + ```bash + chown elasticsearch:elasticsearch + chmod 770 + ``` + +Then, use the following script: + +!!! warning "Script requirements" + This script works only when Elasticsearch runs directly on a machine. It doesn't support deployments using Docker or Kubernetes. + + Before running the script on a cluster setup: + + * Ensure that `/mnt/backup` is mounted on a network-shared volume accessible by all cluster nodes. + * Edit the `elasticsearch.yml` configuration file to add the `path.repo` setting pointing to the snapshot repository path. + * Perform a rolling restart of all Elasticsearch nodes to apply the configuration changes. + + For step-by-step details, see the [official Elasticsearch documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.18/snapshots-filesystem-repository.html). + +!!! note "Default values" + Before running this script: + + * Update the snapshot repository name to match your environment. The default name in the script is `thehive_repository`. + * Verify that the index name matches the one used in the script, which defaults to `thehive_global`. This name may differ if you have rebuilt or customized the index. + +{!includes/hot-backup-elasticsearch-snapshots.md!} + +## Perform a backup on file storage + +This procedure applies only to Network File System (NFS) storage. It doesn't apply to S3-compatible object storage as MinIO. Use the following script to copy the contents of the NFS folder: + +{!includes/hot-backup-file-storage.md!} + +

## Next steps

+ +* [Restore a Hot Backup on a Cluster](../../restore/hot-restore/restore-hot-backup-cluster.md) \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/backup/hot-backup/hot-backup-resolve-data-inconsistencies.md b/docs/thehive/operations/backup-restore/backup/hot-backup/hot-backup-resolve-data-inconsistencies.md new file mode 100644 index 0000000000..fbd5ab0ec8 --- /dev/null +++ b/docs/thehive/operations/backup-restore/backup/hot-backup/hot-backup-resolve-data-inconsistencies.md @@ -0,0 +1,40 @@ +# How to Resolve Data Inconsistencies During a Hot Backup + +This topic provides detailed instructions for resolving data inconsistencies encountered during a hot backup of Cassandra and Elasticsearch in TheHive. + +## For Cassandra + + + +## For Elasticsearch + + + +

## Next steps

+ +* [Restore a Hot Backup on a Standalone Server](../../restore/hot-restore/restore-hot-backup-standalone-server.md) +* [Restore a Hot Backup on a Cluster](../../restore/hot-restore/restore-hot-backup-cluster.md) \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/backup/hot-backup/hot-backup-standalone-server.md b/docs/thehive/operations/backup-restore/backup/hot-backup/hot-backup-standalone-server.md new file mode 100644 index 0000000000..3f21009592 --- /dev/null +++ b/docs/thehive/operations/backup-restore/backup/hot-backup/hot-backup-standalone-server.md @@ -0,0 +1,69 @@ +# How to Perform a Hot Backup on a Standalone Server + +This topic provides step-by-step instructions for performing a hot backup on a standalone server for TheHive. + +{!includes/prerequisites-hot-backup-restore.md!} + +{!includes/data-consistency-hot-backup.md!} + +{!includes/backup-restore-best-practices.md!} + +The process requires backing up data from all three components: Apache Cassandra, Elasticsearch and file storage. + +* [Database backup](#create-cassandra-snapshots) +* [Indexing backup](#create-elasticsearch-snapshots) +* [File storage backup](#perform-a-backup-on-file-storage) + +## Prerequisites + +### Install required tools + +{!includes/hot-backup-required-tools.md!} + +### Configure systems + +{!includes/hot-backup-configure-systems.md!} + +### Perform preliminary checks + +{!includes/preliminary-checks-hot-backup.md!} + +## Create Cassandra snapshots + +{!includes/hot-backup-cassandra-snapshots.md!} + +## Create Elasticsearch snapshots + +Before creating Elasticsearch snapshots, ensure Elasticsearch has the appropriate permissions to write to the snapshot repository. + +For shared file systems: + +!!! Example "" + + ```bash + chown elasticsearch:elasticsearch + chmod 770 + ``` + +Then, use the following script: + +!!! warning "Script restrictions" + This script works only when Elasticsearch runs directly on a machine. It doesn't support deployments using Docker or Kubernetes. + +!!! note "Default values" + Before running this script: + + * Update the snapshot repository name to match your environment. The default name in the script is `thehive_repository`. + * Verify that the index name matches the one used in the script, which defaults to `thehive_global`. This name may differ if you have rebuilt or customized the index. + +{!includes/hot-backup-elasticsearch-snapshots.md!} + +## Perform a backup on file storage + +Whether using local file system storage or Network File System (NFS), copy the contents of the folder using the following script: + +{!includes/hot-backup-file-storage.md!} + +
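Once the scripts have run, a quick check that each backup artifact exists can catch failures early. The sketch below assumes the default backup location `/mnt/backup` and the default snapshot repository name `thehive_repository` used by the scripts; adjust both if you customized them.

```bash
# Confirm the Cassandra archive was written to the backup location
ls -lh /mnt/backup/cassandra_*.tar

# List the snapshots registered in the Elasticsearch repository
curl -s -X GET "http://localhost:9200/_snapshot/thehive_repository/_all?pretty"
```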

## Next steps

+ +* [Restore a Hot Backup on a Standalone Server](../../restore/hot-restore/restore-hot-backup-standalone-server.md) \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/backup/virtual-server.md b/docs/thehive/operations/backup-restore/backup/virtual-server.md deleted file mode 100644 index 2864f86581..0000000000 --- a/docs/thehive/operations/backup-restore/backup/virtual-server.md +++ /dev/null @@ -1,19 +0,0 @@ -# Backup Virtual server - -!!! Note - This process and example below assume you have followed our [step-by-step guide](./../../../installation/step-by-step-installation-guide.md) to install the application stack. - - -Using virtual servers allow more solutions to perform backup and restore operations. - ---- -## First solution: Backup data folders - -Similar to using a physical server, use scripts to back up the configuration, data, and logs from each application in your stack, storing them in a folder that can be archived elsewhere. Refer to the cold [backup](./physical-server.md) guides for detailed instructions. - ---- -## Second solution: Leverage the capabilities of the hypervisor - -Hypervisors often come with the capacity to create a snapshot volumes and entire virtual machine. We recommend creating snapshots of volumes containing data and files after stopping TheHive, Cassandra and Elasticsearch applications. - -For the restore process, begin by restoring the snapshots created with the hypervisor. This allows you to quickly revert to a previous state, ensuring that both the system configuration and application data are restored to their exact state at the time of the snapshot. Be sure to follow any additional procedures specific to your hypervisor to ensure the snapshots are properly applied and that the system operates as expected after the restore. \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/cold-hot-backup-restore.md b/docs/thehive/operations/backup-restore/cold-hot-backup-restore.md new file mode 100644 index 0000000000..3139878c93 --- /dev/null +++ b/docs/thehive/operations/backup-restore/cold-hot-backup-restore.md @@ -0,0 +1,57 @@ +# Cold vs. Hot Backup and Restore + +This topic compares cold and hot backup and restore options in TheHive, helping you make the best decision based on your organization's needs and requirements. + +## Definitions + +A cold backup involves shutting down TheHive and its architecture components to back up all data. This method ensures that the data is consistent and intact, but requires downtime. This option is available for standalone servers only, not for clusters. + +A hot backup keeps TheHive running while it takes the backup. This reduces downtime but may not guarantee data integrity across all architecture components. This option is available for both standalone servers and clusters. + +Both cold restore and hot restore require shutting down TheHive to complete the restoration process. + +## TheHive infrastructure challenges + +{!includes/backup-requirement.md!} + +TheHive is built on an architecture that includes [Apache Cassandra as the database](../../installation/step-by-step-installation-guide.md#apache-cassandra), [Elasticsearch as the indexing engine](../../installation/step-by-step-installation-guide.md#elasticsearch), and [file storage managed either locally, via a Network File System (NFS), or using S3-compatible object storage as MinIO](../../installation/step-by-step-installation-guide.md#file-storage). 
This architecture requires careful coordination to maintain consistency across the database, index, and file storage during backups. Any mismatch between these components can lead to restoration failures. + +## Cold vs. hot backup and restore comparison + +| Type | Complexity | TheHive backup state | TheHive restore state | Data integrity | Tools | Supported environment | Use case | +| -----| ---------- | --------------------| ---------------| ---------------| -----------------------| --------| --------| +| **Cold** | Medium | Application stopped | Application stopped | Guaranteed | Usual tools | Standalone servers only| Want to ensure data integrity | +| **Hot** | High | Application running | Application stopped | Not guaranteed | Service-specific tools | Standalone servers and clusters | Can't afford any downtime | + +## Available backup and restore procedures + +!!! warning "Testing responsibilities" + TheHive isn't liable for data loss, downtime, or failures due to incorrect configurations or restoration issues. The responsibility for implementing and testing these processes lies with you. Validate them in a controlled environment before using them in production. + +!!! note "Full backups only" + These procedures focus exclusively on methods for creating full backups and don't cover incremental backup strategies. + +### Cold backup and restore procedures + +How you proceed with cold backup and restore depends on your infrastructure and orchestration setup. This could involve physical servers, virtual servers, Docker, Kubernetes, or cloud solutions like AWS EC2. + +For example, with AWS EC2, data, indexes, and files can be stored on dedicated volumes. In such cases, taking daily snapshots of these volumes can be a simple and efficient backup strategy, typically completed within minutes—including the necessary service stop and restart operations. + +* Physical servers: [Back up](../backup-restore/backup/cold-backup/physical-server.md) / [Restore](../backup-restore/restore/cold-restore/physical-server.md) +* Virtual servers: [Back up](../backup-restore/backup/cold-backup/virtual-server.md) / [Restore](../backup-restore/restore/cold-restore/virtual-server.md) +* Containerized environments (Docker Compose): [Back up](../backup-restore/backup/cold-backup/docker-compose.md) / [Restore](../backup-restore/restore/cold-restore/docker-compose.md) + +### Hot backup and restore procedures + +The approach to hot backup and restore depends on whether you have a standalone server or a cluster. + +* Standalone server: [Back up](../backup-restore/backup/hot-backup/hot-backup-standalone-server.md) / [Restore](../backup-restore/restore/hot-restore/restore-hot-backup-standalone-server.md) +* Cluster: [Back up](../backup-restore/backup/hot-backup/hot-backup-cluster.md) / [Restore](../backup-restore/restore/hot-restore/restore-hot-backup-cluster.md) + +
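As an illustration of the volume snapshot approach mentioned above for AWS EC2, a daily cold backup job could stop the stack, snapshot the data volume with the AWS CLI, and restart the services. The volume ID below is a placeholder; adapt it, and the list of services, to your deployment.

```bash
# Cold backup of an EC2 data volume: stop the stack, snapshot, restart.
# EBS snapshots are point-in-time at the moment the API call is made.
systemctl stop thehive cassandra elasticsearch
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
  --description "thehive-data-$(date +%F)"
systemctl start elasticsearch cassandra thehive
```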

## Next steps

+ +* [Cassandra Cluster Operations](../cassandra-cluster.md) +* [Security in Apache Cassandra](../cassandra-security.md) +* [MinIO Cluster Operations](../minio-cluster.md) +* [Monitoring TheHive](../monitoring.md) +* [Troubleshooting](../troubleshooting.md) \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/overview.md b/docs/thehive/operations/backup-restore/overview.md deleted file mode 100644 index 9b2aa78425..0000000000 --- a/docs/thehive/operations/backup-restore/overview.md +++ /dev/null @@ -1,46 +0,0 @@ -# Backup and Restore Guides - -!!! Warning - Regardless of the situation, we **strongly** recommend performing **cold backups**. TheHive utilizes Cassandra as its database and Elasticsearch as its indexing engine. Files are typically stored in a folder, although some users opt for Minio S3 object storage. For every backup, the data, index, and files **must** remain intact and consistent. **Any inconsistency could result in data restoration failure**. - - This documentation provides detailed instructions for performing cold backups. Alternatively, you may opt for a hot backup and restore strategy. To assist with this, we provide sample scripts that you can tailor to your specific requirements. However, it is important to note that the final responsibility for implementing and testing the backup and restore processes lies with you. - - We strongly recommend thoroughly validating your backup and restoration procedures in a controlled environment before relying on them in production. While we strive to provide accurate and helpful guidance, we cannot assume liability for any data loss, downtime, or system failures resulting from incorrect configurations, inconsistencies in your data, or issues during the restoration process. It is essential to ensure that your chosen approach aligns with your infrastructure and operational needs. - - ---- -## Cold backup & restore - - -Your backup and restore strategy heavily depends on your infrastructure and orchestration approach, whether you are using physical servers, virtual servers, Docker, Kubernetes, or cloud solutions like AWS EC2 instances. - -For example, with AWS Amazon EC2 servers where all data, indexes, and files are stored on dedicated volumes, performing a daily snapshots of the volumes may take only few minutes, including housekeeping tasks such as stopping and restarting services. - -!!! Note "These procedures focus exclusively on methods for creating full backups and do not cover incremental backup strategies." - - -### Backup and restore procedures - -Find complete backup and restore procedures for: - -* **Physical servers**: [backup](./backup/physical-server.md) and [restore](./restore/physical-server.md) procedures -* **Virtual servers**: [backup](./backup/virtual-server.md) and [restore](./restore/virtual-server.md) procedures -* **Containerized Environments (Docker Compose)**: [backup](./backup/docker-compose.md) and [restore](./restore/docker-compose.md) procedures - -Use the links above or navigate through the documentation to find the specific procedure suited for your environment. - - ---- -## Introduction to hot backup and restore - -While **cold backups** are highly recommended due to their simplicity and consistency, you may want to consider **hot backups**. - -Hot backup procedures can minimize downtime, making them more suitable for production environments where service availability is critical. 
However, hot backups introduce additional complexity and risks, such as data inconsistencies, particularly in distributed systems like Cassandra and Elasticsearch. - -### Considerations for hot backups - -- **Data Consistency:** Ensure that the backup process handles the synchronization of data across services like Cassandra and Elasticsearch. Use tools such as `nodetool snapshot` for Cassandra and Elasticsearch APIs for snapshots. -- **Service-Specific Tools:** Hot backups often require the use of specialized commands or APIs to ensure that the data state remains consistent during the process. -- **Validation:** Always test your hot backup and restore process thoroughly in a non-production environment before relying on it in production. - -Sample scripts and guidelines for hot backups are provided in this documentation. However, these are intended as a starting point and must be adapted to your infrastructure and operational requirements. diff --git a/docs/thehive/operations/backup-restore/restore/cloud.md b/docs/thehive/operations/backup-restore/restore/cloud.md deleted file mode 100644 index f67981d6dd..0000000000 --- a/docs/thehive/operations/backup-restore/restore/cloud.md +++ /dev/null @@ -1,2 +0,0 @@ -# Restore Cloud -TBD \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/restore/docker-compose.md b/docs/thehive/operations/backup-restore/restore/cold-restore/docker-compose.md similarity index 80% rename from docs/thehive/operations/backup-restore/restore/docker-compose.md rename to docs/thehive/operations/backup-restore/restore/cold-restore/docker-compose.md index 2772323bfd..b488ae8a6b 100644 --- a/docs/thehive/operations/backup-restore/restore/docker-compose.md +++ b/docs/thehive/operations/backup-restore/restore/cold-restore/docker-compose.md @@ -1,11 +1,18 @@ -# Restore a stack run with Docker Compose +# How to Restore a Cold Backup for a Stack Running with Docker Compose -!!! Note - This process assumes you are using [one of our Docker Compose profiles](https://github.com/StrangeBeeCorp/docker), and you have already created backup using the previously outlined backup procedure. +This topic provides step-by-step instructions for restoring a cold backup of a stack running with Docker Compose for TheHive. -Restore data that has been saved following the previous backup process. +{!includes/implications-cold-backup-restore.md!} -## Ensure all services are stopped +{!includes/backup-restore-best-practices.md!} + +## Prerequisites + +This process assumes you are using [one of the Docker Compose profiles](https://github.com/StrangeBeeCorp/docker) and have already created a backup using the [Perform a Cold Backup for a Stack Running with Docker Compose](../../backup/cold-backup/docker-compose.md) topic. + +## Step 1: Stop the services + +Stop all services to ensure data consistency and prevent any changes during the restore process. !!! Example "" @@ -13,14 +20,13 @@ Restore data that has been saved following the previous backup process. docker compose down ``` -## Ensure all data folder are empty before running the restore process - -A backup is highly recommended before running a restore operation; this ensures you can revert to the current state if anything goes wrong. +## Step 2: Ensure all data folder are empty -Ensure that the target data folders are empty before running this script. Indeed, pre-existing files can cause conflicts or data corruption during the restore process. 
+Make sure the target data folders are empty before running this script, as pre-existing files can cause conflicts or data corruption during the restore process. +## Step 3: Choose the archive to restore from the backup folder -## Choose the archive to restore from the backup folder +Before restoring data, ensure that you have identified the correct backup archive to restore from. !!! Example "" @@ -62,7 +68,7 @@ Ensure that the target data folders are empty before running this script. Indeed ## ## ADDITIONAL RESOURCES: ## Refer to the official documentation for detailed instructions and - ## additional information: https://docs.strangebee.com/thehive/operations/backup-restore/ + ## additional information: https://docs.strangebee.com/thehive/operations/backup-restore/. ## ## WARNING: ## - This script ensure Nginx, Elasticsearch, Cassandra, and TheHive services are stopped before performing the restore, and then restarts the services. @@ -122,7 +128,6 @@ Ensure that the target data folders are empty before running this script. Indeed ## Check if the backup folder to restore exists, else exit [[ -d ${BACKUP_FOLDER} ]] || { echo "Backup folder not found, exiting"; exit 1; } - # Define the log file and start logging. Log file is stored in the current folder DATE="$(date +"%Y%m%d-%H%M%z" | sed 's/+/-/')" LOG_FILE="./restore_log_${DATE}.log" @@ -131,24 +136,21 @@ Ensure that the target data folders are empty before running this script. Indeed # Log the start time echo "Restoration process started at: $(date)" - ## Exit if docker compose is running + ## Exit if Docker Compose is running docker compose ps | grep -q "Up" && { echo "Docker Compose services are running. Exiting. Stop services and remove data before retoring data"; exit 1; } - # Copy TheHive data echo "Restoring TheHive data and configuration..." rsync -aW --no-compress ${BACKUP_FOLDER}/thehive/ ${DOCKER_COMPOSE_PATH}/thehive || { echo "TheHive config restore failed"; exit 1; } - # Copy Casssandra data + # Copy Cassandra data echo "Restoring Cassandra data ..." rsync -aW --no-compress ${BACKUP_FOLDER}/cassandra/ ${DOCKER_COMPOSE_PATH}/cassandra || { echo "Cassandra data restore failed"; exit 1; } - # Copy Elasticsearch data echo "Restoring Elasticsearch data ..." rsync -aW --no-compress ${BACKUP_FOLDER}/elasticsearch/ ${DOCKER_COMPOSE_PATH}/elasticsearch || { echo "Elasticsearch data restore failed"; exit 1; } - # Copy Nginx certificates echo "Restoring Nginx data and configuration..." rsync -a ${BACKUP_FOLDER}/nginx/ ${DOCKER_COMPOSE_PATH}/nginx || @@ -163,7 +165,14 @@ Ensure that the target data folders are empty before running this script. Indeed echo "Restoration process completed at: $(date)" ``` +## Step 4: Validate the restore + +Open you browser, connect to TheHive, and check your data has been restored correctly. + +### Step 5: Restart all services + +Use `docker compose up -d -f ${DOCKER_COMPOSE_PATH}/docker-compose.yml` to restart all services with the command line. -### Restart all services +
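For reference, a minimal restart-and-check sequence, assuming the compose file lives under `${DOCKER_COMPOSE_PATH}` as in the script above. Note that `-f` is a global Docker Compose flag, so it goes before the subcommand.

```bash
# Restart the whole stack from the restored data
docker compose -f ${DOCKER_COMPOSE_PATH}/docker-compose.yml up -d

# Confirm every container is up before validating in the browser
docker compose -f ${DOCKER_COMPOSE_PATH}/docker-compose.yml ps
```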

## Next steps

-The script above restarts all services with the command line `docker compose up -d -f ${DOCKER_COMPOSE_PATH}/docker-compose.yml`. \ No newline at end of file +* [Perform a Cold Backup for a Stack Running with Docker Compose](../../backup/cold-backup/docker-compose.md) \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/restore/physical-server.md b/docs/thehive/operations/backup-restore/restore/cold-restore/physical-server.md similarity index 82% rename from docs/thehive/operations/backup-restore/restore/physical-server.md rename to docs/thehive/operations/backup-restore/restore/cold-restore/physical-server.md index 3a68f8fc3c..d14f239287 100644 --- a/docs/thehive/operations/backup-restore/restore/physical-server.md +++ b/docs/thehive/operations/backup-restore/restore/cold-restore/physical-server.md @@ -1,30 +1,25 @@ -# Restore physical server +# How to Restore a Cold Backup on a Physical Server -## Introduction +This topic provides step-by-step instructions for restoring a cold backup on a physical server for TheHive. Restoring your application stack on physical servers is a process that involves recovering configuration files, data, and logs from a previously created backup. This procedure ensures the application is returned to a consistent and operational state. -Unlike virtual or containerized environments, restoring on physical servers requires manual handling of files and services. This guide assumes that: +Unlike virtual or containerized environments, restoring on physical servers requires manual handling of files and services. -* You are restoring from [a cold backup](../backup/physical-server.md), where services were stopped during the backup process to maintain data consistency. -* The server environment matches the original configuration (e.g., paths, software versions, and dependencies). -* The backup was created following the [Backup Procedure for Physical Servers](../backup/physical-server.md). +{!includes/implications-cold-backup-restore.md!} -When performing a restore, you will: +{!includes/backup-restore-best-practices.md!} -1. Ensure all services are stopped (Elasticsearch, Cassandra, TheHive, etc.) before running the restoration process. -2. Restore the configuration, data, and log files from the backup. -3. Restart services and verify that the system is functioning as expected. +## Prerequisites -!!! Warning - * Always test the restoration process in a non-production or test environment before applying it to a live system. - * Ensure you have a current backup before starting the restore operation, as any errors during restoration could lead to data loss. - * This guide provides general instructions; adapt them to your specific server configuration. +This guide assumes that: ---- -## Step-by-step instructions +* You are restoring from a cold backup created using the procedures outlined in the [Perform a Cold Backup on a Physical Server](../../backup/cold-backup/physical-server.md) guide, where services were stopped during the backup process to ensure data consistency. +* The server environment matches the original configuration (for example, paths, software versions, and dependencies). -### Ensure all services are stopped +## Step 1: Stop the services + +Stop all services to ensure data consistency and prevent any changes during the restore process. !!! 
Example "" @@ -34,15 +29,13 @@ When performing a restore, you will: systemctl stop cassandra ``` - -### Check all data folders are empty +## Step 2: Check all data folders are empty Ensure `/var/lib/cassandra/` and `/var/lib/elasticsearch/` are empty. +## Step 3: Copy files from the backup folder -### Copy files from the backup folder - -For example, with a dedicated NFS volume and a folder named `/opt/backup` copy all files preserving their permissions +For example, with a dedicated NFS volume and a folder named `/opt/backup` copy all files preserving their permissions. !!! Example "" @@ -89,7 +82,7 @@ For example, with a dedicated NFS volume and a folder named `/opt/backup` copy ## WARNING: ## - This script ensure Nginx, Elasticsearch, Cassandra, and TheHive services are stopped before performing the restore, and then restarts the services. ## - This script will overwrite existing data. Use it with caution. - ## - Do not modify the rest of the script unless necessary. + ## - Don't modify the rest of the script unless necessary. ## ## ============================================================ ## DO NOT MODIFY ANYTHING BELOW THIS LINE @@ -162,7 +155,7 @@ For example, with a dedicated NFS volume and a folder named `/opt/backup` copy rsync -aW --no-compress ${BACKUP_FOLDER}/thehive/data/ /opt/thp/thehive/files || { echo "TheHive data restore failed"; exit 1; } rsync -aW --no-compress ${BACKUP_FOLDER}/thehive/logs/ /var/log/thehive || { echo "TheHive logs restore failed"; exit 1; } - # Copy Casssandra data + # Copy Cassandra data rsync -aW --no-compress ${BACKUP_FOLDER}/cassandra/config/ /etc/cassandra || { echo "Cassandra config restore failed"; exit 1; } rsync -aW --no-compress ${BACKUP_FOLDER}/cassandra/data/ /var/lib/cassandra || { echo "Cassandra data restore failed"; exit 1; } rsync -aW --no-compress ${BACKUP_FOLDER}/cassandra/logs/ /var/log/cassandra || { echo "Cassandra logs restore failed"; exit 1; } @@ -177,7 +170,9 @@ For example, with a dedicated NFS volume and a folder named `/opt/backup` copy Ensure permissions are correctly setup before running services. -### Start services in this order +### Step 4: Restart all services + +Restart services in this order: 1. Elasticsearch 2. Cassandra @@ -191,7 +186,10 @@ Ensure permissions are correctly setup before running services. systemctl start cassandra ``` ---- -## Validation +## Step 5: Validate the restore + +Open you browser, connect to TheHive, and check your data has been restored correctly. + +

## Next steps

-Open you browser, connect to TheHive, and check your data has been restored correctly. \ No newline at end of file +* [Perform a Cold Backup on a Physical Server](../../backup/cold-backup/physical-server.md) \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/restore/cold-restore/virtual-server.md b/docs/thehive/operations/backup-restore/restore/cold-restore/virtual-server.md new file mode 100644 index 0000000000..df65eb4363 --- /dev/null +++ b/docs/thehive/operations/backup-restore/restore/cold-restore/virtual-server.md @@ -0,0 +1,27 @@ +# How to Restore a Cold Backup on a Virtual Server + +This topic provides step-by-step instructions for restoring a cold backup on a virtual server for TheHive. + +Using virtual servers provides more flexibility in performing backup and restore operations. + +{!includes/implications-cold-backup-restore.md!} + +{!includes/backup-restore-best-practices.md!} + +## Prerequisites + +This process and example below assume you have followed the [step-by-step guide](../../../../installation/step-by-step-installation-guide.md) to install the application stack. + +## First option: Restore data folders from a backup + +Assuming you are using the [Perform a Cold Backup on a Physical Server](../../backup/cold-backup/physical-server.md) guide to backup your data, use scripts to restore the configuration, data, and logs from each application in your stack. Refer to the [Restore a Cold Backup on a Physical Server](physical-server.md) guide for detailed instructions. + +## Second option: Leverage the capabilities of the hypervisor + +Hypervisors often come with the capacity to create a snapshot volumes and entire virtual machine. Create snapshots of volumes containing data and files after stopping TheHive, Cassandra and Elasticsearch applications. + +For the restore process, begin by restoring the snapshots created with the hypervisor. This allows you to quickly revert to a previous state, ensuring that both the system configuration and application data are restored to their exact state at the time of the snapshot. Be sure to follow any additional procedures specific to your hypervisor to ensure the snapshots are properly applied and that the system operates as expected after the restore. + +
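What the hypervisor step looks like depends entirely on your platform. As one illustration, on a KVM/libvirt host the revert can be driven from the command line; the domain and snapshot names below are placeholders.

```bash
# List the snapshots available for the virtual machine
virsh snapshot-list thehive-vm

# Revert to the snapshot taken while the stack was stopped, then start the VM
virsh snapshot-revert thehive-vm --snapshotname cold-backup-snapshot
virsh start thehive-vm
```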

## Next steps

+ +* [Perform a Cold Backup on a Virtual Server](../../backup/cold-backup/virtual-server.md) \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/restore/hot-restore/restore-hot-backup-cluster.md b/docs/thehive/operations/backup-restore/restore/hot-restore/restore-hot-backup-cluster.md new file mode 100644 index 0000000000..2b5dde987a --- /dev/null +++ b/docs/thehive/operations/backup-restore/restore/hot-restore/restore-hot-backup-cluster.md @@ -0,0 +1,35 @@ +# How to Restore a Hot Backup on a Cluster + +This topic provides step-by-step instructions for restoring a hot backup on a cluster for TheHive. + +{!includes/hot-restore-application-stopped.md!} + +{!includes/backup-restore-best-practices.md!} + +The process involves restoring data from three components: Apache Cassandra, Elasticsearch—both distributed across three nodes—and file storage. + +* [Database restore](#restore-cassandra-snapshots) +* [Indexing backup](#restore-elasticsearch-snapshots) +* [File storage restore](#restore-a-backup-for-file-storage) + +These procedures assume you have completed the steps in [Perform a Hot Backup on a Cluster](../../backup/hot-backup/hot-backup-cluster.md) and have stopped your TheHive application. Ensure that paths are consistent between the backup and restore procedures. + +## Restore Cassandra snapshots + +### Prerequisites + +{!includes/hot-restore-cassandra-snapshots.md!} + +## Restore Elasticsearch snapshots + +{!includes/hot-restore-elasticsearch-snapshots.md!} + +## Restore a backup for file storage + +This procedure applies only to Network File System (NFS) storage and doesn't apply to S3-compatible object storage as MinIO. Restore the saved files to the destination folder used by TheHive on NFS. Ensure the account running TheHive has the necessary permissions to create files and folders in the destination. + +{!includes/hot-restore-file-storage.md!} + +
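Once the three restores are complete and before starting TheHive, a quick consistency check across the cluster can confirm everything is back in place. This sketch assumes the default keyspace name `thehive`; adjust it if yours differs.

```bash
# All nodes should report UN (Up/Normal) after the restore
nodetool status

# Confirm the restored keyspace and its tables are visible
# (add -u/-p options if authentication is enabled on Cassandra)
cqlsh -e "DESCRIBE KEYSPACE thehive"
```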

## Next steps

+ +* [Perform a Hot Backup on a Cluster](../../backup/hot-backup/hot-backup-cluster.md) \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/restore/hot-restore/restore-hot-backup-standalone-server.md b/docs/thehive/operations/backup-restore/restore/hot-restore/restore-hot-backup-standalone-server.md new file mode 100644 index 0000000000..b4e2358cc0 --- /dev/null +++ b/docs/thehive/operations/backup-restore/restore/hot-restore/restore-hot-backup-standalone-server.md @@ -0,0 +1,33 @@ +# How to Restore a Hot Backup on a Standalone Server + +This topic provides step-by-step instructions for restoring a hot backup on a standalone server for TheHive. + +{!includes/hot-restore-application-stopped.md!} + +{!includes/backup-restore-best-practices.md!} + +The process requires backing up data from all three components: Apache Cassandra, Elasticsearch and file storage. + +* [Database restore](#restore-cassandra-snapshots) +* [Indexing backup](#restore-elasticsearch-snapshots) +* [File storage restore](#restore-a-backup-for-file-storage) + +These procedures assume you have completed the steps in [Perform a Hot Backup on a Standalone Server](../../backup/hot-backup/hot-backup-standalone-server.md) and have stopped your TheHive application. Ensure that paths are consistent between the backup and restore procedures. + +## Restore Cassandra snapshots + +{!includes/hot-restore-cassandra-snapshots.md!} + +## Restore Elasticsearch snapshots + +{!includes/hot-restore-elasticsearch-snapshots.md!} + +## Restore a backup for file storage + +Whether using local file system storage or Network File System (NFS), restore the saved files to the destination folder used by TheHive. Ensure the account running TheHive has the necessary permissions to create files and folders in the destination. + +{!includes/hot-restore-file-storage.md!} + +
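After the Elasticsearch restore completes, you can confirm the index is back and the cluster is healthy before starting TheHive. This assumes the default index name `thehive_global`; the name may differ if you have rebuilt or customized the index.

```bash
# Check the restored index is present
curl -s -X GET "http://localhost:9200/_cat/indices/thehive_global?v"

# Cluster health should be green (yellow is expected on a single-node setup)
curl -s -X GET "http://localhost:9200/_cluster/health?pretty"
```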

## Next steps

+ +* [Perform a Hot Backup on a Standalone Server](../../backup/hot-backup/hot-backup-standalone-server.md) \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/restore/restore-hot-backup.md b/docs/thehive/operations/backup-restore/restore/restore-hot-backup.md deleted file mode 100644 index f6f21f2d14..0000000000 --- a/docs/thehive/operations/backup-restore/restore/restore-hot-backup.md +++ /dev/null @@ -1,311 +0,0 @@ - -### Restore data - -## Cassandra - -### Pre requisites - -Following data is required to restore TheHive database successfully: - -* A backup of the database (`${SNAPSHOT}_${SNAPSHOT_DATE}.tbz`) -* Keyspace to restore does not exist in the database (or it will be overwritten) -* All nodes in the cluster should be up before starting the restore procedure. -* TheHive application should **NOT** be running. - -  - -### Restore your data - -#### Start by drop the database `TheHive` - -!!! Example "" - - ```bash - ## SOURCE_KEYSPACE contains the name of theHive database - CASSANDRA_PASSWORD= - CASSANDRA_ADDRESS= - SOURCE_KEYSPACE=thehive - cqlsh -u admin -p ${CASSANDRA_PASSWORD} ${CASSANDRA_ADDRESS} -e "DROP KEYSPACE IF EXISTS ${SOURCE_KEYSPACE};" - ``` - -#### Create the keyspace - -!!! Example "" - - ```bash - ## TARGET_KEYSPACE contains the new name of theHive database - ## NOTE that you can keep the same database name since the old one has been deleted - cqlsh -u admin -p ${CASSANDRA_PASSWORD} ${CASSANDRA_ADDRESS} -e " - CREATE KEYSPACE ${TARGET_KEYSPACE} - WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': '3'} - AND durable_writes = true;" - ``` - -#### Unarchive backup files - -!!! Note - Note that the following steps should be executed for every Cassandra cluster node. - -!!! Example "" - - ```bash - mkdir -p /var/lib/cassandra/restore - RESTORE_PATH="/var/lib/cassandra/restore" - SOURCE_KEYSPACE="thehive" - SNAPSHOT_DATE= - ## TABLES should contain the list of all tables that will be restored - ## table_name should include the uuid generated by cassandra, for example: table1-f901e0c05d8811ef87c71fc3a94044f4 - - TABLES="ls -1 /var/lib/cassandra/data/${SOURCE_KEYSPACE}/" - ## copy all of the snapshot tables that you want to restore from remote server or local cassandra node then extract it - ## repeat the following steps on each node Cassandra for each table in the TABLES List - - cd /var/lib/cassandra/restore - for table in $TABLES; do - scp remoteuser@remotehost:/remote/node_name_directory/${SNAPSHOT_DATE}/${SOURCE_KEYSPACE}/${table}.tbz . - mkdir -p ${RESTORE_PATH}/${SOURCE_KEYSPACE}/${table} - echo "Unarchive backup files for table: $table" - tar jxf ${table}.tbz -C ${RESTORE_PATH}/${SOURCE_KEYSPACE}/${table} - done - ``` - -#### Create tables from archive - -The archive contains the table schemas. They must be executed in the new keyspace. The schema files are in `${RESTORE_PATH}/${SOURCE_KEYSPACE}/${table}` - -!!! Example "" - - ```bash - for CQL in $(find ${RESTORE_PATH} -name schema.cql) - do - cqlsh -u admin -p ${CASSANDRA_PASSWORD} -f $CQL - done - ``` - -If you want to change the name of the keyspace (`${SOURCE_KEYSPACE}` => `${TARGET_KEYSPACE}`), you need to rewrite the cql command: - -!!! Example "" - - ```bash - for CQL in $(find ${RESTORE_PATH} -name schema.cql) - do - cqlsh cassandra -e "$(sed -e '/CREATE TABLE/s/'${SOURCE_KEYSPACE}/${TARGET_KEYSPACE}/ $CQL)" - done - ``` - -#### Load table data - -!!! Note - Note that the following command should be executed on each Cassandra node in the cluster. 
- -!!! Example "" - - ```bash - for TABLE in ${RESTORE_PATH}/${TARGET_KEYSPACE}/* - do - TABLE_BASENAME=$(basename ${TABLE}) - TABLE_NAME=${TABLE_BASENAME%%-*} - nodetool import ${TARGET_KEYSPACE} ${TABLE_NAME} ${RESTORE_PATH}/${TARGET_KEYSPACE}/${TABLE_BASENAME} - done - ``` -  - -If the cluster topology has changed (new nodes added ou removed from the cluster since the last data backup), please follow the run the following command to perform a restore: - -!!! Example "" - - ```bash - for TABLE in ${RESTORE_PATH}/${TARGET_KEYSPACE}/* - do - TABLE_BASENAME=$(basename ${TABLE}) - sstableloader -d ${CASSANDRA_IP} ${RESTORE_PATH}/${TARGET_KEYSPACE}/${TABLE_BASENAME} - done - ``` - -#### Cleanup - -!!! Example "" - - ```bash - rm -rf ${RESTORE_PATH} - ``` - -  - -### Rebuid an existing node - -If for a particular reason (such as corrupted system data), you need to reintegrate the node into the cluster and restore all data (including system data), here is the procedure: - -#### Make sure that the Cassandra service is still down then delete the contents of the data volume: - -!!! Example "" - - ```bash - cd /var/lib/cassandra/data - rm -rf * - ``` - -#### Copy and unarchive backup files: - -!!! Example "" - - ```bash - DATA_PATH="/var/lib/cassandra/data" - - SNAPSHOT_DATE= - ## KEYSPACES list should inlude all keyspaces - KEYSPACES="system system_distributed system_traces system_virtual_schema system_auth system_schema system_views thehive" - cd ${DATA_PATH} - for ks in $KEYSPACES; do - scp -r remoteuser@remotehost:/remote/node_name_directory/${SNAPSHOT_DATE}/${ks}/ . - for file in /var/lib/cassandra/data/${ks}/*; do - echo "Processing $file" - filename=$(basename "$file") - table_name="${filename%%.*}" - sudo mkdir -p ${ks}/${table_name} - sudo tar jxf $file -C ${ks}/${table_name} - rm -f $file - done - done - - chown -R cassandra:cassandra /var/lib/cassandra/data - ``` -#### Start cassandra service - -!!! Example "" - - ```bash - service cassandra start - - ## heck if Cassandra has started successfully by reviewing its logs - tail -n 100 /var/log/cassandra/system.log | grep -iE "listening for|startup complete|error|warning" - - INFO [main] ********,773 PipelineConfigurator.java:125 - Starting listening for CQL clients on localhost/127.0.0.1:9042 (unencrypted)... - INFO [main] ********,790 CassandraDaemon.java:776 - Startup complete - ``` - -!!! Warning "Ensure no Commitlog file exist before restarting Cassandra service. (`/var/lib/cassandra/commitlog`)" - - -!!! Example "Example of script to restore TheHive keyspace in Cassandra" - - ```bash - #!/bin/bash - - ## Restore a KEYSPACE and its data from a CQL file with the schema of the - ## KEYSPACE and an tbz archive containing the snapshot - - ## Complete variables before running: - ## CASSANDRA_ADDRESS: IP of cassandra server - ## RESTORE_PATH: choose a TMP folder !!! this folder will be removed if exists. - ## SOURCE_KEYSPACE: KEYSPACE used in the backup - ## TARGET_KEYSPACE: new KEYSPACE name ; use same name of SOURCE_KEYSPACE if no changes - ## TABLES: should contain the list of all tables that will be restored, table_name should include the uuid generated by cassandra, for example: table1-f901e0c05d8811ef87c71fc3a94044f4 - ## SNAPSHOT: choose a name for the backup - ## SNAPSHOT_DATE: date of the snapshot to restore - - ## IMPORTANT: Note that the following steps should be executed on each Cassandra cluster node. 
- - CASSANDRA_ADDRESS="10.1.1.1" - RESTORE_PATH="/var/lib/cassandra/restore" - SOURCE_KEYSPACE="thehive" - TARGET_KEYSPACE="thehive_restore" - TABLES=" - table1-f901e0c05d8811ef87c71fc3a94044f4 - table2-d502a0c05d8811ef87c71fc3a94044f5 - table3-a703c0c05d8811ef87c71fc3a94044f6 - " - SNAPSHOT_DATE="2024-09-23" - - ## Copy from backup folder and Uncompress data in restore folder - - cd ${RESTORE_PATH} - for table in $TABLES; do - cp -r PATH_TO_BACKUP_DIRECTORY/${SNAPSHOT_DATE}/${SOURCE_KEYSPACE}/${table}.tbz . - mkdir -p ${RESTORE_PATH}/${SOURCE_KEYSPACE}/${table} - echo "Unarchive backup files for table: $table" - tar jxf ${table}.tbz -C ${RESTORE_PATH}/${SOURCE_KEYSPACE}/${table} - done - ## Read Cassandra password - echo -n "Cassandra admin password: " - read -s CASSANDRA_PASSWORD - - # Drop the keyspace - cqlsh -u admin -p ${CASSANDRA_PASSWORD} ${CASSANDRA_ADDRESS} -e "DROP KEYSPACE IF EXISTS ${SOURCE_KEYSPACE};" - - # Create the keyspace - ## TARGET_KEYSPACE contains the new name of theHive database - ## NOTE that you can keep the same database name since the old one has been deleted - - cqlsh -u admin -p ${CASSANDRA_PASSWORD} ${CASSANDRA_ADDRESS} -e " - CREATE KEYSPACE ${TARGET_KEYSPACE} - WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': '3'} - AND durable_writes = true;" - - # Create table in keyspace - for CQL in $(find ${RESTORE_PATH} -name schema.cql) - do - cqlsh -u admin -p ${CASSANDRA_PASSWORD} ${CASSANDRA_ADDRESS} -e "$(sed -e '/CREATE TABLE/s/'${SOURCE_KEYSPACE}/${TARGET_KEYSPACE}/ $CQL)" - done - - - ## Load data - for TABLE in ${RESTORE_PATH}/${TARGET_KEYSPACE}/* - do - TABLE_BASENAME=$(basename ${TABLE}) - TABLE_NAME=${TABLE_BASENAME%%-*} - nodetool import ${TARGET_KEYSPACE} ${TABLE_NAME} ${RESTORE_PATH}/${TARGET_KEYSPACE}/${TABLE_BASENAME} - done - ``` - ---- -## Restore Elasticsearch index - -Several solutions exist regarding the index: - -1. Restore a saved Elasticsearch index ; follow Elasticsearch guides to perform this action -2. Rebuild the index on the new server, when TheHive start for the first time. - -  - -### Restoration steps - -Restoring from a snapshot involves creating a new cluster or restoring the snapshot into an existing one. Here’s an example to restore all indices from a snapshot. - -!!! Example "" - - ```bash - curl -X POST "http://localhost>:9200/_snapshot/my_backup/snapshot_1/_restore" -H 'Content-Type: application/json' -d' - { - "indices": "*", - "include_global_state": false - }' - ``` - -### Rebuild the index - -Once Cassandra database is restored, update the configuration of TheHive to rebuild the index. - -These lines should be added to the configuration file only for the first start of TheHive application, and removed later on. - -!!! Example "" - - ```yaml title="extract from /etc/thehive/application.conf" - db.janusgraph.forceDropAndRebuildIndex = true - ``` - -!!! Warning "Once TheHive application is started, both lines should be removed or commented from the _application.conf_ configuration file" - - ---- -## Restore Files - -Restore the saved files into the destination folder/bucket that will be used by TheHive. Ensure the account running TheHive application has permissions to create files and folders into the destination folder. 
- - ---- -## References -- Backing up and restoring Cassandra data: [https://cassandra.apache.org/doc/stable/cassandra/operating/backups.html]() -- Backing up and restoring Elasticsearch data: [https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html]() - -  \ No newline at end of file diff --git a/docs/thehive/operations/backup-restore/restore/virtual-server.md b/docs/thehive/operations/backup-restore/restore/virtual-server.md deleted file mode 100644 index fd302c53c3..0000000000 --- a/docs/thehive/operations/backup-restore/restore/virtual-server.md +++ /dev/null @@ -1,18 +0,0 @@ -# Restore virtual server - -!!! Note - This process and example below assume you have followed our [step-by-step guide](./../../../installation/docker.md) to install the application stack. - -Using virtual servers allow more solutions to perform backup and restore operations. - ---- -## 1st solution: Restore data folders from a backup - -Assuming you are using our cold [backup](../backup/physical-server.md) guide to backup your data, use scripts to restore the configuration, data, and logs from each application in your stack. Refer to the cold [restore](./physical-server.md) guide for detailed instructions. - ---- -## 2nd solution: Leverage the capabilities of the hypervisor - -Hypervisors often come with the capacity to create a snapshot volumes and entire virtual machine. We recommend creating snapshots of volumes containing data and files after stopping TheHive, Cassandra and Elasticsearch applications. - -For the restore process, begin by restoring the snapshots created with the hypervisor. This allows you to quickly revert to a previous state, ensuring that both the system configuration and application data are restored to their exact state at the time of the snapshot. Be sure to follow any additional procedures specific to your hypervisor to ensure the snapshots are properly applied and that the system operates as expected after the restore. 
\ No newline at end of file diff --git a/mkdocs.yml b/mkdocs.yml index e7bc766834..83812d07c0 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -309,6 +309,15 @@ plugins: 'thehive/user-guides/analyst-corner/getting-started.md': 'thehive/overview/index.md' 'thehive/user-guides/analyst-corner/introduction.md': 'thehive/overview/index.md' 'thehive/user-guides/analyst-corner/sign-in-as-an-admin.md': 'thehive/overview/index.md' + 'thehive/operations/backup-restore/backup/docker-compose.md': 'thehive/operations/backup-restore/backup/cold-backup/docker-compose.md' + 'thehive/operations/backup-restore/backup/physical-server.md': 'thehive/operations/backup-restore/backup/cold-backup/physical-server.md' + 'thehive/operations/backup-restore/backup/virtual-server.md': 'thehive/operations/backup-restore/backup/cold-backup/virtual-server.md' + 'thehive/operations/backup-restore/backup/hot-backup.md': 'thehive/operations/backup-restore/backup/hot-backup/hot-backup-standalone-server.md' + 'thehive/operations/backup-restore/overview.md': 'thehive/operations/backup-restore/cold-hot-backup-restore.md' + 'thehive/operations/backup-restore/restore/docker-compose.md': 'thehive/operations/backup-restore/restore/cold-restore/docker-compose.md' + 'thehive/operations/backup-restore/restore/physical-server.md': 'thehive/operations/backup-restore/restore/cold-restore/physical-server.md' + 'thehive/operations/backup-restore/restore/virtual-server.md': 'thehive/operations/backup-restore/restore/cold-restore/virtual-server.md' + 'thehive/operations/backup-restore/restore/restore-hot-backup.md': 'thehive/operations/backup-restore/restore/hot-restore/restore-hot-backup-standalone-server.md' extra: generator: false @@ -391,19 +400,24 @@ nav: - 'Cassandra Security Operations': thehive/operations/cassandra-security.md - 'MinIO Cluster Operations': thehive/operations/minio-cluster.md - 'Backup & Restore Operations': - - 'Overview': thehive/operations/backup-restore/overview.md + - 'Cold vs. 
Hot Backups and Restores': thehive/operations/backup-restore/cold-hot-backup-restore.md - 'Backup Process': - - 'Physical Server': thehive/operations/backup-restore/backup/physical-server.md - - 'Virtual Server': thehive/operations/backup-restore/backup/virtual-server.md - - 'Docker Compose': thehive/operations/backup-restore/backup/docker-compose.md - # - 'Cloud Infrastructure': thehive/operations/backup-restore/backup/cloud.md - - 'Hot backups': thehive/operations/backup-restore/backup/hot-backup.md + - 'Cold Backup': + - 'Physical Server': thehive/operations/backup-restore/backup/cold-backup/physical-server.md + - 'Virtual Server': thehive/operations/backup-restore/backup/cold-backup/virtual-server.md + - 'Docker Compose': thehive/operations/backup-restore/backup/cold-backup/docker-compose.md + - 'Hot Backup': + - 'Standalone Server': thehive/operations/backup-restore/backup/hot-backup/hot-backup-standalone-server.md + - 'Cluster': thehive/operations/backup-restore/backup/hot-backup/hot-backup-cluster.md + - 'Resolve Data Inconsistencies': thehive/operations/backup-restore/backup/hot-backup/hot-backup-resolve-data-inconsistencies.md - 'Restore Process': - - 'Physical Server': thehive/operations/backup-restore/restore/physical-server.md - - 'Virtual Server': thehive/operations/backup-restore/restore/virtual-server.md - - 'Docker Compose': thehive/operations/backup-restore/restore/docker-compose.md - - 'Cloud Infrastructure': thehive/operations/backup-restore/restore/cloud.md - - 'Restore hot backups': thehive/operations/backup-restore/restore/restore-hot-backup.md + - 'Cold Restore': + - 'Physical Server': thehive/operations/backup-restore/restore/cold-restore/physical-server.md + - 'Virtual Server': thehive/operations/backup-restore/restore/cold-restore/virtual-server.md + - 'Docker Compose': thehive/operations/backup-restore/restore/cold-restore/docker-compose.md + - 'Hot Restore': + - 'Standalone Server': thehive/operations/backup-restore/restore/hot-restore/restore-hot-backup-standalone-server.md + - 'Cluster': thehive/operations/backup-restore/restore/hot-restore/restore-hot-backup-cluster.md - 'Index Management': thehive/operations/change-index.md - 'Troubleshooting Guide': thehive/operations/troubleshooting.md - 'Monitoring Setup': thehive/operations/monitoring.md