Commit 11a668a

Josipmrden and matea16 authored
Add guide for memgraph in cybersecurity (#1324)
* Add guide for memgraph in cybersecurity * Modify main deploymetns page * Add deep path traversals * Apply suggestions from code review --------- Co-authored-by: Matea Pesic <80577904+matea16@users.noreply.github.com>
1 parent 8ccc712 commit 11a668a

4 files changed

+288
-2
lines changed

next-env.d.ts

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,4 @@
22
/// <reference types="next/image-types/global" />
33

44
// NOTE: This file should not be edited
5-
// see https://nextjs.org/docs/pages/api-reference/config/typescript for more information.
5+
// see https://nextjs.org/docs/basic-features/typescript for more information.

pages/deployment/workloads.mdx

Lines changed: 3 additions & 1 deletion
@@ -38,6 +38,9 @@ general suggestions when there's a conflict, so always defer to the targeted gui
3838

3939
Here are the currently available guides to help you deploy Memgraph effectively:
4040

41+
### [Memgraph in cybersecurity](/deployment/workloads/memgraph-in-cybersecurity)
42+
Learn how to use Memgraph for threat detection and security analysis across your network.
43+
4144
### [Memgraph in high-throughput workloads](/deployment/workloads/memgraph-in-high-throughput-workloads)
4245
Scale your write throughput while keeping up with fast-changing, high-velocity graph data.
4346

@@ -49,7 +52,6 @@ Learn how to optimize Memgraph for Retrieval-Augmented Generation (RAG) systems
4952
- Memgraph in analytical workloads
5053
- Memgraph in mission critical workloads
5154
- Memgraph in supply chain use cases
52-
- Memgraph in cybersecurity use cases
5355
- Memgraph in fraud detection use cases
5456

5557
<Callout type="info">

pages/deployment/workloads/_meta.ts

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
11
export default {
2+
"memgraph-in-cybersecurity": "Memgraph in cybersecurity",
23
"memgraph-in-graphrag": "Memgraph in GraphRAG use cases",
34
"memgraph-in-high-throughput-workloads": "Memgraph in high-throughput workloads",
45
}
Lines changed: 283 additions & 0 deletions
@@ -0,0 +1,283 @@
1+
---
2+
title: Memgraph in cybersecurity
3+
description: Suggestions on how to bring your Memgraph to production when running cybersecurity use cases.
4+
---
5+
6+
import { Callout } from 'nextra/components'
7+
import { CommunityLinks } from '/components/social-card/CommunityLinks'
8+
9+
# Memgraph in cybersecurity
10+
11+
<Callout type="info">
12+
Before diving into this guide, we recommend starting with the [Deployment best practices](/deployment/best-practices)
13+
page. It provides **foundational, use-case-agnostic advice** for deploying Memgraph in production.
14+
15+
This guide builds on that foundation, offering **additional recommendations tailored to cybersecurity workloads**.
16+
In cases where guidance overlaps, consider the information here as **complementary or overriding**, depending
17+
on the unique needs of your use case.
18+
</Callout>
19+
20+
## Is this guide for you?
21+
22+
This guide is for you if you're working with **cybersecurity and threat detection** where real-time analysis,
23+
data correlation, and threat intelligence are critical. You'll benefit from this content if:
24+
25+
- You're building a **Security Information and Event Management (SIEM)** system that needs to correlate events across multiple sources.
26+
- You're implementing **threat detection** that requires real-time analysis of network traffic, logs, and security events.
27+
- You need to **track and analyze attack patterns** across your infrastructure, identifying potential security breaches.
28+
- You're working with **threat intelligence** data that needs to be correlated with your internal security events.
29+
- You require **real-time alerting** based on complex security patterns and relationships.
30+
- You're working with cloud security use cases, leveraging **complex configuration objects** across your network topology.
31+
32+
33+
If this sounds like your use case, this guide will walk you through how to configure and scale Memgraph for
34+
**reliable, real-time security analysis** in production.
35+
36+
## Why choose Memgraph for cybersecurity use cases?
37+
38+
When your workload involves analyzing security events, correlating threats, and detecting patterns in real time,
39+
Memgraph provides the performance and architecture needed to keep your systems secure, without compromise.
41+
42+
43+
Here's why Memgraph is a great fit for cybersecurity use cases:
44+
45+
- **In-memory storage engine**: Memgraph operates entirely in-memory, enabling **real-time threat detection** and analysis.
46+
This allows it to **process security events as they occur**, rather than waiting for disk I/O. Unlike systems that rely on LRU
47+
or OS-level caching, where **cache invalidation can delay threat detection**, Memgraph offers
48+
**predictable analysis latency** even under constant security event ingestion.
49+
50+
While many graph databases **max out around 1,000 events per second**, Memgraph can handle **up to 50x more**
51+
(see image below), making it ideal for **high-velocity security event processing**.
52+
53+
![](/pages/memgraph-in-production/benchmarking-memgraph/realistic-workload.png)
54+
55+
- **Non-blocking reads and writes with MVCC**: Built on multi-version concurrency control (MVCC),
56+
Memgraph ensures that **security event ingestion doesn't block threat analysis** and **analysis doesn't block ingestion**,
57+
allowing both to scale independently.
58+
59+
- **Fine-grained locking**: Locking happens at the node and relationship level, enabling **highly concurrent security event processing**
60+
and minimizing contention across threads.
61+
62+
- **Lock-free skiplist storage**: Memgraph uses **lock-free, concurrent skip list structures** for storing security events,
63+
relationships, and indices, leading to faster threat pattern matching and minimal coordination overhead between threads.
64+
65+
- **Snapshot isolation by default**: Unlike many databases that rely on **read-committed** isolation
66+
(which could miss critical security events), Memgraph provides **snapshot isolation**, ensuring data accuracy and
67+
consistency in security analysis.
68+
69+
- **Inter-query parallelization**: Each security analysis query is handled on its own CPU core, meaning Memgraph can
70+
**scale with the number of cores on a single machine**.
71+
72+
- **Horizontal read scaling with high availability**: Memgraph supports [replication](/clustering/replication) and
73+
[high availability](/clustering/high-availability), allowing you to distribute **security analysis across multiple replicas**.
74+
These replicas can also power **secondary workloads** like threat intelligence correlation or historical analysis,
75+
**without affecting the performance of the main security event processing instance**.
76+
77+
## What is covered?
78+
79+
The suggestions for cybersecurity workloads **complement** several key sections in the
80+
[best practices guide](/deployment/best-practices). These sections offer important context and
81+
additional best practices tailored for security analysis, threat detection, and event correlation:
83+
84+
- [Choosing the right Memgraph flag set](#choosing-the-right-memgraph-flag-set) <br />
85+
Memgraph offers specific flags to optimize security event processing and threat detection.
86+
87+
- [Choosing the right Memgraph storage mode](#choosing-the-right-memgraph-storage-mode) <br />
88+
Guidance on selecting the optimal **storage mode** for cybersecurity use cases, depending on whether your focus is
89+
real-time analysis or historical security data retention.
90+
91+
- [Importing mechanisms](#importing-mechanisms) <br />
92+
Suggestions on import strategies for cybersecurity workloads, covering both high-throughput and analytical use cases.
93+
94+
- [Optimizing security event processing](#optimizing-security-event-processing) <br />
95+
Learn how to use nested properties and nested indices to reduce TCO and improve the performance of your security use case.
96+
97+
98+
- [Enterprise features you might require](#enterprise-features-you-might-require) <br />
99+
Understand which **enterprise features**, such as high availability, audit logging, and advanced security controls,
100+
are essential for production-grade security deployments.
102+
103+
- [Queries that best suit your workload](#queries-that-best-suit-your-workload) <br />
104+
Learn how to optimize security analysis queries and threat detection patterns.
105+
106+
## Choosing the right Memgraph flag set
107+
108+
When processing security events from systems like SIEMs or log collectors, the incoming payload is often **standardized**,
109+
meaning that even when a security event is updated, **some property values might remain unchanged**.
110+
111+
By default, Memgraph sets the flag `--storage-delta-on-identical-property-update=true`, which **creates a delta record for every property**
112+
written during an update of a node or relationship, even if the new value is identical to the existing one.
113+
This can introduce unnecessary write overhead.
114+
115+
To optimize for **higher throughput** in scenarios where most incoming security events do not change all property values,
116+
it's recommended to set:
117+
118+
```bash
119+
--storage-delta-on-identical-property-update=false
120+
```
121+
122+
With this setting, Memgraph will **only create delta records for properties that have actually changed**,
123+
reducing internal write operations and improving overall system throughput, which is especially important in high-velocity
124+
security event processing.
126+
127+
All available flags are listed in the [Configuration](/database-management/configuration) section of the docs.
128+
129+
130+
## Choosing the right Memgraph storage mode
131+
132+
Cybersecurity scenarios in Memgraph can run effectively on both `IN_MEMORY_TRANSACTIONAL` and `IN_MEMORY_ANALYTICAL`
133+
storage modes, depending on your specific security requirements.
134+
135+
If your security workload meets the following conditions:
136+
137+
- You are **updating security event properties** on existing nodes and relationships
138+
- You are **appending** new security events and relationships to the graph
139+
- You are **not performing deletes** of security events (for compliance reasons)
140+
- You are leveraging **on-demand analysis** of the graph with read-only queries
141+
142+
Then it may be worth considering switching to `IN_MEMORY_ANALYTICAL` mode.
143+
144+
This mode allows **security event processing to be multithreaded**, unlocking **much higher ingestion speeds** by parallelizing
145+
event processing across CPU cores.
146+
147+
However, keep in mind:
148+
149+
- If you require **replication**, **high availability**, or **ACID guarantees** for your security data, you must use `IN_MEMORY_TRANSACTIONAL` mode.
150+
- `IN_MEMORY_ANALYTICAL` is optimized for **bulk security event ingestion** and **real-time analysis**, but it
151+
**does not support transactional rollback**, as it doesn't create delta objects during writes.
152+
Additionally, **WALs (write-ahead logs) are not generated** in this mode, meaning recovery relies solely on **snapshot creation**.
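Switching between the two modes is done with a query. The following is a minimal sketch of how a bulk load might be wrapped, assuming an instance where you are allowed to change the storage mode and no other transactions are active; the ordering of the steps is an illustration, not a prescribed procedure.

```cypher
// Switch to analytical mode before a bulk load of security events.
STORAGE MODE IN_MEMORY_ANALYTICAL;

// Check which mode is active; the output includes a storage_mode entry.
SHOW STORAGE INFO;

// ... run the bulk ingestion queries here ...

// No WAL files are written in this mode, so take a snapshot once the load completes.
CREATE SNAPSHOT;

// Switch back when transactional guarantees are needed again.
STORAGE MODE IN_MEMORY_TRANSACTIONAL;
```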
153+
154+
Learn more about [storage modes](/fundamentals/storage-memory-usage#storage-modes) in our documentation.
155+
156+
## Importing mechanisms
157+
158+
If you're dealing with **high-volume security event processing** (e.g., analyzing every network request, processing millions of security events per second),
159+
we recommend checking out the [importing mechanisms section](/deployment/workloads/memgraph-in-high-throughput-workloads#importing-mechanisms)
160+
in our high-throughput workloads guide for detailed guidance on handling high-volume data ingestion.
161+
162+
## Optimizing security event processing
163+
164+
### Deep-path traversals in attack path analysis
165+
For security use cases involving attack path analysis and threat propagation, Memgraph provides powerful path-finding algorithms:
166+
167+
- **Weighted shortest paths**: Calculate the most likely attack paths based on security metrics (e.g., vulnerability scores, access levels)
168+
- **All shortest paths**: Identify all possible attack vectors between critical assets
169+
- **Path filtering**: Focus analysis on specific types of security relationships or nodes
170+
171+
These algorithms are crucial for:
172+
- **Attack surface analysis**: Identify all possible entry points and attack vectors
173+
- **Threat propagation modeling**: Understand how threats could spread through your infrastructure
174+
- **Critical path identification**: Find the most vulnerable paths in your security architecture
175+
- **Risk assessment**: Evaluate the impact of potential security breaches
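As a sketch of what these traversals can look like in a query, the example below assumes a hypothetical model of `:Host` nodes connected by `:CAN_REACH` relationships, where a numeric `cost` property is lower for hops that are easier for an attacker to take. The labels, property names, and host names are illustrative assumptions.

```cypher
// Weighted shortest path: the cheapest (most likely) route from an exposed host to a critical asset.
MATCH path = (src:Host {name: "web-01"})-[:CAN_REACH *WSHORTEST (r, n | r.cost) total_cost]->(dst:Host {name: "db-01"})
RETURN path, total_cost;

// All shortest paths: every route that ties for the lowest total cost.
MATCH path = (src:Host {name: "web-01"})-[:CAN_REACH *ALLSHORTEST (r, n | r.cost) total_cost]->(dst:Host {name: "db-01"})
RETURN path, total_cost;
```

Restricting the relationship type in the pattern (here `:CAN_REACH`) is one simple form of path filtering; the traversal ignores every other relationship type.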
176+
177+
Learn more about [deep path traversal algorithms](/advanced-algorithms/deep-path-traversal) in our documentation.
178+
179+
### Map properties
180+
Memgraph allows you to store complex JSON objects as map properties within nodes. This is particularly beneficial for security use cases because:
181+
182+
- **Localized configuration data**: Keep all related security event properties and configurations within a single node
183+
- **Reduced lookup time**: Access nested properties directly without traversing to other nodes
184+
- **Lower TCO**: Map properties are more memory-efficient than creating separate nodes and relationships
185+
- **Simplified data model**: Maintain complex hierarchical security data without creating additional graph structure
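As a rough illustration of the map-property approach, the sketch below stores an event payload as a nested map on a single node. The `id` and `event` property names and the payload shape are assumptions made for this example, not a schema this guide prescribes.

```cypher
// Hypothetical payload shape; adjust the keys to match your own events.
CREATE (:SecurityEvent {
  id: "evt-1",
  event: {
    details: {severity: "high", category: "login_attempt"},
    source: {ip_address: "192.168.1.1"}
  }
});

// Nested values are read with dotted access, without traversing to other nodes.
MATCH (e:SecurityEvent {id: "evt-1"})
RETURN e.event.details.severity AS severity, e.event.source.ip_address AS source_ip;
```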
186+
187+
### Nested indices
188+
For efficient security event processing and threat detection, Memgraph supports nested indices that allow you to create indices on map properties of nodes and relationships. This is particularly useful for security use cases where you need to:
189+
190+
- **Index nested security event data** (e.g., `event.details.severity`, `event.source.ip_address`)
191+
- **Optimize queries on complex JSON payloads** from security tools and SIEMs
192+
- **Improve performance when filtering** on deeply nested properties in security event maps
193+
- **Speed up lookups** on structured security data that follows a hierarchical format
194+
195+
Example of creating a nested index for security events:
196+
```cypher
197+
CREATE INDEX ON :SecurityEvent(event.details.severity);
198+
```
199+
200+
This creates a nested index that can significantly improve the performance of queries that filter security events based on nested properties within the event map.
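A query of the following shape is the kind that filters on the indexed nested property. It is a sketch that assumes each `:SecurityEvent` node keeps its payload as a map in an `event` property, matching the index definition above; whether the planner picks the index for a given query can be checked with `EXPLAIN`.

```cypher
// Count high-severity events per source address.
MATCH (e:SecurityEvent)
WHERE e.event.details.severity = "high"
RETURN e.event.source.ip_address AS source_ip, count(*) AS events
ORDER BY events DESC;
```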
201+
202+
Learn more about [nested indices](/fundamentals/indexes#label-property-index) in our documentation.
203+
204+
## Enterprise features you might require
205+
206+
For production-grade cybersecurity deployments, you may need advanced capabilities to ensure
207+
**availability**, **data retention**, and **security compliance**. Memgraph offers several enterprise features
208+
designed to support these needs:
209+
210+
- **Replication, high availability, and automatic failover**
211+
If you require your security system to be **available at all times**, Memgraph supports
212+
[clustering and high availability](/clustering/high-availability), allowing you to minimize downtime and recover
213+
automatically from failures.
214+
215+
- **Node and relationship TTL (time-to-live)**
216+
In security environments, you may need to automatically **archive or remove old security events** after a certain retention period
217+
to comply with data retention policies.
218+
Memgraph supports [time-to-live (TTL)](/querying/time-to-live) mechanisms for both **nodes** and **relationships**,
219+
ensuring your security graph remains manageable and compliant over time.
220+
221+
- **Multi-tenancy**
222+
Some security deployments require **separate security graphs per department or customer** to ensure strict data isolation and compliance.
223+
Memgraph supports [multi-tenancy](/database-management/multi-tenancy), enabling you to manage multiple independent
224+
security graphs within a single Memgraph instance.
225+
226+
- **Role-based access control**
227+
For security-sensitive deployments, Memgraph provides [role-based access control](/database-management/authentication-and-authorization/role-based-access-control)
228+
to ensure that only authorized personnel can access and modify security data.
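To make the multi-tenancy and access control points more concrete, here is a hedged sketch of how a tenant database and a read-only analyst role might be set up. The database, role, and user names are hypothetical, the password is a placeholder, and the exact set of privileges to grant or deny depends on your own policy and Memgraph Enterprise configuration.

```cypher
// One isolated database per tenant (name is hypothetical).
CREATE DATABASE soc_customer_a;

// A role that can read but not modify security data.
CREATE ROLE analyst;
GRANT MATCH TO analyst;
DENY CREATE, MERGE, SET, REMOVE, DELETE TO analyst;

// A user assigned to that role, with access to the tenant database.
CREATE USER alice IDENTIFIED BY 'change-me';
SET ROLE FOR alice TO analyst;
GRANT DATABASE soc_customer_a TO alice;
```

If label-based access control is in use, read permissions on the relevant labels and relationship types may also need to be granted before the role can see any data.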
229+
230+
## Queries that best suit your workload
231+
232+
### Idempotency concept
233+
234+
When processing security events, it's best to keep your Cypher queries **simple, idempotent, and efficient**. A typical security event ingestion query should look like:
235+
236+
```cypher
237+
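// Assumes each event payload carries a unique `id` key; merging on it keeps one node per event.
MERGE (n:SecurityEvent {id: $row.id}) SET n += $row;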
238+
```
239+
240+
This approach ensures **idempotency** (safe reprocessing of the same security events) and **minimizes query execution time** by keeping the transaction lightweight.
241+
Keep in mind that adding **complex security logic** or **customization** to the ingestion queries will **increase query latency**, so it's always a good practice to **profile your queries** using Memgraph's [`PROFILE` tool](/querying/clauses/profile) to understand and optimize performance.
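When events arrive in batches, the same idempotent pattern can be applied to a whole list in one query. This is a sketch under the assumption that every event map has a unique `id` key and that the client passes the batch as a `$rows` list parameter; both names are illustrative.

```cypher
// Index the assumed key so MERGE can find existing events quickly.
CREATE INDEX ON :SecurityEvent(id);

// $rows is a list of maps sent by the client, for example [{id: "evt-1", severity: "high"}, ...].
UNWIND $rows AS row
MERGE (n:SecurityEvent {id: row.id})
SET n += row;
```

Replaying the same batch updates the existing nodes instead of creating duplicates, because the match is on `id` rather than on the full property set.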
242+
243+
### Dynamic labels and edge types
244+
245+
Memgraph also supports:
246+
247+
- [Dynamic node label creation](/querying/clauses/create#14-creating-node-labels-dynamically)
248+
- [Dynamic relationship type creation](/querying/clauses/create#23-creating-relationship-types-dynamically)
249+
250+
This is particularly useful for security events where the event type might be dynamic based on the source or severity.
251+
252+
However, **dynamic creation is only supported with `CREATE` operations**, and **matching or merging on dynamic
253+
labels and types is not supported**.
254+
255+
If your security event payload contains **dynamic labels or edge types** and you still need **idempotency**, you have two options:
256+
257+
- **Programmatically construct your Cypher query strings** based on the security event payload to ensure correct label/type usage before sending the query to Memgraph.
258+
- **Optionally use the [`merge`](/advanced-algorithms/available-algorithms/merge) procedure from MAGE**
259+
260+
<Callout type="warning">
261+
While MAGE procedures are **written in C++ and highly optimized**, they still introduce **slightly more overhead**
262+
compared to **pure Cypher**, as they are executed as external modules. We recommend favoring pure Cypher when
263+
possible for the **highest performance** in security event processing.
264+
</Callout>
265+
266+
### Using `convert.str2object` for parsing nested properties
267+
268+
When working with security event streams, sometimes your incoming payload contains **serialized JSON strings** that need to be
269+
transformed into property maps inside your graph.
270+
Memgraph provides the [`convert.str2object` function](/querying/functions#conversion-functions) to easily handle this scenario.
271+
272+
Example usage:
273+
274+
```cypher
275+
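// The "id" key is an illustrative addition so that MERGE has a unique value to match on.
WITH convert.str2object('{"id":"evt-1", "event_type":"login_attempt", "severity":"high", "source_ip":"192.168.1.1"}') AS props
276+
MERGE (n:SecurityEvent {id: props.id})
277+
SET n += props;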
278+
```
279+
280+
This function **parses a JSON-formatted string into a Cypher map**, making it very useful for flexible security event ingestion pipelines
281+
where the event structure might vary slightly or be semi-structured.
282+
283+
<CommunityLinks/>
