memgraph · matea16 · Apr 18, 2025 · Apr 16, 2025 · Apr 17, 2025 · Apr 18, 2025
@@ -455,6 +455,8 @@ in Memgraph.
 | `--storage-snapshot-interval="300"`                             | Define periodic snapshot schedule via cron expression or as a period in seconds. Set to empty string to disable.                   | `[string]` |
 | `--storage-snapshot-on-exit=true`                               | Controls whether the storage creates another snapshot on exit.                                                                     | `[bool]`   |
 | `--storage-snapshot-retention-count=3`                          | The number of snapshots that should always be kept.                                                                                | `[uint64]` |
+| `--storage-parallel-snapshot-creation=false`                    | Controls whether the snapshot creation can be done in a multi-threaded fashion.                                                    | `[bool]`   |
+| `--storage-snapshot-thread-count`                               | The number of threads used to create snapshots. Defaults to the system's maximum thread count.                                     | `[uint64]` |
 | `--storage-wal-enabled=true`                                    | Controls whether the storage uses write-ahead-logging. To enable WAL, periodic snapshots must be enabled.                          | `[bool]`   |
 | `--storage-wal-file-flush-every-n-tx=100000`                    | Issue a 'fsync' call after this amount of transactions are written to the WAL file. Set to 1 for fully synchronous operation.      | `[uint64]` |
 | `--storage-wal-file-size-kib=20480`                             | Minimum file size of each WAL file.                                                                                                | `[uint64]` |

@@ -87,6 +87,8 @@ on the value of the `--storage-snapshot-on-exit` configuration flag.  When a
 snapshot creation is triggered, the entire data storage is written to the drive.
 Nodes and relationships are divided into groups called batches.
 
+Snapshot creation can be made faster by using **multiple threads**. See [Parallelized execution](#parallelized-execution) for more information.
+
 On startup, the database state is recovered from the most recent snapshot file.
 Memgraph can read the data and build the indexes on multiple threads, using
 batches as a parallelization unit: each thread will recover one batch at a time
@@ -155,6 +157,15 @@ storage mode is changed to `IN_MEMORY_TRANSACTIONAL` storage mode.
 Snapshots and WAL files are presently not compatible between Memgraph versions.
 </Callout>
 
+### Parallelized execution
+
+Snapshot creation in Memgraph can be parallelized across multiple threads, which can significantly reduce the time required to create snapshots of large datasets.
+
+This behavior can be controlled using the following flags:
+- `--storage-parallel-snapshot-creation`: Determines whether snapshots are created using multiple threads. Defaults to `false`; set it to `true` to enable parallelized execution.
+- `--storage-snapshot-thread-count`: Specifies the number of threads used for snapshot creation. Defaults to the system's maximum thread count; override it to tune performance for your system's resources.
+
+When parallelized execution is enabled, Memgraph divides the data into batches, where the batch size is defined via `--storage-items-per-batch`. The optimal batch size and thread count may vary depending on the dataset size and system configuration.
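+
+As a minimal sketch, the flags above could be passed when starting Memgraph with Docker. The thread count of `4` is an illustrative value, not a recommendation, and the `memgraph/memgraph` image and port mapping are assumptions about your setup:
+
+```bash
+# Illustrative values only; tune the thread count for your hardware.
+docker run -p 7687:7687 memgraph/memgraph \
+  --storage-parallel-snapshot-creation=true \
+  --storage-snapshot-thread-count=4
+```
+
+Omitting `--storage-snapshot-thread-count` leaves the thread count at its default, the system's maximum.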
 
 ## Storage modes
 

@@ -82,6 +82,24 @@ for security reasons, it can't automatically create a new disk copy when you
 use `CREATE SNAPSHOT` in Memgraph. So, while the command creates a snapshot
 locally, it doesn't trigger a new snapshot in the Cloud interface.
 
+## Why am I seeing corrupt snapshot files named `_edge_part_` and `_vertex_part_`?
+
+These files are partial results from the multi-threaded execution of snapshot creation. 
+When Memgraph creates snapshots using multiple threads, it divides the data into smaller parts. Each thread processes a specific part and writes intermediate results to files named with the `_edge_part_` and `_vertex_part_` patterns.
+
+If the snapshot creation process is interrupted or fails, these partial files may remain on disk and appear corrupt.
+Memgraph cannot load these incomplete files during startup, as they do not represent a valid snapshot.
+
+### How to resolve this issue?
+
+To resolve this issue, you can safely delete the partial files and restart Memgraph. The database will attempt to recover its state using the most recent valid snapshot and the write-ahead log (WAL) files.
+
+```bash
+# Remove leftover partial files; adjust the path if your data directory differs.
+rm /var/lib/memgraph/snapshots/*_edge_part_*
+rm /var/lib/memgraph/snapshots/*_vertex_part_*
+```
+
+
 ---
 
 <CommunityLinks/>