Commit 4dbdfba

fix(influxdb3): Data durability, data flow
1. Maintains the higher-level structure
2. Uses the bulleted list format for clarity
3. Adds the detailed explanations from the original Data flow section as Details items
4. Uses sentence case for all headings
5. Numbers the sections to match the original flow diagram
1 parent 053ea8d commit 4dbdfba

6 files changed

+36
-64
lines changed

content/influxdb3/core/reference/internals/durability/_index.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ menu:
 parent: Core internals
 name: Data durability
 weight: 200
-source: /content/shared/influxdb3-internals-reference/durability/_index.md
+source: /shared/influxdb3-internals-reference/durability.md
 ---

 <!--

content/influxdb3/enterprise/reference/internals/_index.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ menu:
 name: Enterprise internals
 parent: Reference
 weight: 107
-source: /content/shared/influxdb3-internals-reference/_index.md
+source: /shared/influxdb3-internals-reference/_index.md
 ---

 <!--

content/influxdb3/enterprise/reference/internals/durability/_index.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ menu:
 parent: Enterprise internals
 name: Data durability
 weight: 200
-source: /content/shared/influxdb3-internals-reference/durability.md
+source: /shared/influxdb3-internals-reference/durability.md
 ---

 <!--
Lines changed: 33 additions & 61 deletions
@@ -1,78 +1,50 @@
-## How data flows through InfluxDB 3
+## How data flows through {{% product-name %}}

 When data is written to {{% product-name %}}, it progresses through multiple stages to ensure durability, optimize performance, and enable efficient querying. Configuration options at each stage affect system behavior, balancing reliability and resource usage.

-### Write Path Overview
-
-{{% product-name %}} processes data through several stages to ensure durability, query performance, and efficient storage. Below is a high-level overview of these stages:
-
-1. [Write validation](#write-validation)
-
-2. [Memory buffer](#memory-buffer)
-
-3. [Write-Ahead Log (WAL) persistence](#wal-persistence)
-
-4. [Queryable buffer](#query-availability)
-
-5. [Parquet storage](#parquet-storage)
-
-6. [In-memory cache](#in-memory-cache)
-
-
-##### Write Validation
-
-- Process: InfluxDB validates incoming data before accepting it into the system.
-
-- Impact: Prevents malformed or unsupported data from entering the database.
-
-##### Memory Buffer
-
-- Process: Incoming writes are stored in an in-memory buffer before persistence.
-
-- Impact: Increases ingestion efficiency by allowing batch processing.
-
-- Tradeoff: Larger batches improve throughput but require more memory.
-
-##### WAL Persistence
-
-- Process: The system flushes the write buffer to the WAL every second (default).
-
-- Impact: Ensures durability by persisting data to object storage.
-
-- Tradeoff: More frequent flushing improves durability but increases I/O overhead.
-
-##### Query Availability
-
-- Process: The system moves data to the queryable buffer after WAL persistence.
+## Data flow

-- Impact: Enables fast queries on recent data.
+As data moves through {{% product-name %}}, it follows a structured path to ensure durability, efficient querying, and optimized storage.

-- Tradeoff: A larger buffer speeds up queries but increases memory usage.
+The figure below shows how written data flows through the database.

-##### Parquet Storage
+{{< img-hd src="/img/influxdb/influxdb-3-write-path.png" alt="Write Path for InfluxDB 3 Core & Enterprise" />}}

-- Process: Every ten minutes (default), data is persisted to Parquet files in object storage.
+1. [Write validation and memory buffer](#1-write-validation-and-memory-buffer)
+2. [Write-ahead log (WAL) persistence](#2-write-ahead-log-wal-persistence)
+3. [Query availability](#3-query-availability)
+4. [Parquet storage](#4-parquet-storage)
+5. [In-memory cache](#5-in-memory-cache)

-- Impact: Provides durable, long-term storage.
+### Write validation and memory buffer

-- Tradeoff: More frequent persistence reduces reliance on the WAL but increases I/O costs.
+- **Process**: InfluxDB validates incoming data before accepting it into the system.
+- **Impact**: Prevents malformed or unsupported data from entering the database.
+- **Details**: The system validates incoming data and stores it in the write buffer (in memory). If [`no_sync=true`](#no-sync-write-option), the server sends a response to acknowledge the write.

-##### In-Memory Cache
+### Write-ahead log (WAL) persistence

-- Process: Recently persisted Parquet files are cached in memory.
+- **Process**: The system flushes the write buffer to the WAL every second (default).
+- **Impact**: Ensures durability by persisting data to object storage.
+- **Tradeoff**: More frequent flushing improves durability but increases I/O overhead.
+- **Details**: Every second (default), the system flushes the write buffer to the Write-Ahead Log (WAL) for persistence in the Object store. If [`no_sync=false`](#no-sync-write-option) (default), the server sends a response to acknowledge the write.

-- Impact: Reduces query latency by minimizing object storage access.
+### Query availability

-## Data flow
+- **Process**: The system moves data to the queryable buffer after WAL persistence.
+- **Impact**: Enables fast queries on recent data.
+- **Tradeoff**: A larger buffer speeds up queries but increases memory usage.
+- **Details**: After WAL persistence completes, data moves to the queryable buffer where it becomes available for queries. By default, the server keeps up to 900 WAL files (15 minutes of data) buffered.

-As data moves through InfluxDB 3, it follows a structured path to ensure durability, efficient querying, and optimized storage.
+### Parquet storage

-The figure below shows how written data flows through the database.
+- **Process**: Every ten minutes (default), data is persisted to Parquet files in object storage.
+- **Impact**: Provides durable, long-term storage.
+- **Tradeoff**: More frequent persistence reduces reliance on the WAL but increases I/O costs.
+- **Details**: Every ten minutes (default), the system persists the oldest data from the queryable buffer to the Object store in Parquet format. InfluxDB keeps the remaining data (the most recent 5 minutes) in memory.

-{{< img-hd src="/img/influxdb/influxdb-3-write-path.png" alt="Write Path for InfluxDB 3 Core & Enterprise" />}}
+### In-memory cache

-1. **Incoming writes**: The system validates incoming data and stores it in the write buffer (in memory). If [`no_sync=true`](#no-sync-write-option), the server sends a response to acknowledge the write.
-2. **WAL flush**: Every second (default), the system flushes the write buffer to the Write-Ahead Log (WAL) for persistence in the Object store. If [`no_sync=false`](#no-sync-write-option) (default), the server sends a response to acknowledge the write.
-3. **Query availability**: After WAL persistence completes, data moves to the queryable buffer where it becomes available for queries. By default, the server keeps up to 900 WAL files (15 minutes of data) buffered.
-4. **Long-term storage in Parquet**: Every ten minutes (default), the system persists the oldest data from the queryable buffer to the Object store in Parquet format. InfluxDB keeps the remaining data (the most recent 5 minutes) in memory.
-5. **In-memory cache**: InfluxDB puts Parquet files into an in-memory cache so that queries against the most recently persisted data don't have to go to object storage.
+- **Process**: Recently persisted Parquet files are cached in memory.
+- **Impact**: Reduces query latency by minimizing object storage access.
+- **Details**: InfluxDB puts Parquet files into an in-memory cache so that queries against the most recently persisted data don't have to go to object storage.
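
For reviewers who want to sanity-check the defaults quoted in the Details items, the flow can be sketched as a small toy model. This is an illustration only, not InfluxDB code: the `WritePathModel` class, its method names, and the constant names are hypothetical, and the validation step is a stand-in. The arithmetic at the end assumes one WAL file per one-second flush, and it reads "the most recent 5 minutes" as what remains of the 15-minute buffer after ten minutes of data are persisted.

```python
from dataclasses import dataclass, field

# Defaults quoted in the Details items of the diff.
WAL_FLUSH_INTERVAL_S = 1   # write buffer is flushed to the WAL every second
MAX_WAL_FILES = 900        # WAL files kept buffered for queries
PERSIST_INTERVAL_S = 600   # oldest data is persisted to Parquet every ten minutes


@dataclass
class WritePathModel:
    """Hypothetical toy model of the write path; not InfluxDB's implementation."""

    write_buffer: list = field(default_factory=list)
    queryable_buffer: list = field(default_factory=list)  # one batch per WAL file
    parquet_files: list = field(default_factory=list)

    def write(self, line: str, no_sync: bool = False) -> bool:
        """Stage 1: validate and buffer. Returns True if acknowledged immediately."""
        if not line.strip():
            raise ValueError("rejected: empty line")  # stand-in for real validation
        self.write_buffer.append(line)
        # With no_sync=true the write is acknowledged before the WAL flush.
        return no_sync

    def flush_wal(self) -> int:
        """Stages 2 and 3: flush the buffer as one WAL batch and make it queryable.

        Returns the number of lines flushed. With the default no_sync=false,
        this is the point at which those writes would be acknowledged.
        """
        batch, self.write_buffer = self.write_buffer, []
        if batch:
            self.queryable_buffer.append(batch)
        return len(batch)

    def persist_oldest(self, wal_files: int) -> None:
        """Stage 4: move the oldest WAL batches into a single Parquet file."""
        oldest = self.queryable_buffer[:wal_files]
        self.queryable_buffer = self.queryable_buffer[wal_files:]
        if oldest:
            self.parquet_files.append([line for batch in oldest for line in batch])


# 900 WAL files at one flush per second cover 15 minutes of data.
buffered_minutes = MAX_WAL_FILES * WAL_FLUSH_INTERVAL_S / 60
# Persisting ten minutes of data leaves the most recent 5 minutes in memory.
kept_minutes = (MAX_WAL_FILES * WAL_FLUSH_INTERVAL_S - PERSIST_INTERVAL_S) / 60
print(buffered_minutes, kept_minutes)  # 15.0 5.0
```

The model keeps acknowledgement timing visible in the return values: `write(..., no_sync=True)` reports an immediate acknowledgement, while the default path waits for `flush_wal()`, which mirrors the tradeoff between write latency and durability described in the first two sections.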
