|
1 |
| -## How data flows through InfluxDB 3 |
| 1 | +## How data flows through {{% product-name %}} |
2 | 2 |
|
3 | 3 | When data is written to {{% product-name %}}, it progresses through multiple stages to ensure durability, optimize performance, and enable efficient querying. Configuration options at each stage affect system behavior, balancing reliability and resource usage.
|
4 | 4 |
|
5 |
| -### Write Path Overview |
6 |
| - |
7 |
| -{{% product-name %}} processes data through several stages to ensure durability, query performance, and efficient storage. Below is a high-level overview of these stages: |
8 |
| - |
9 |
| -1. [Write validation](#write-validation) |
10 |
| - |
11 |
| -2. [Memory buffer](#memory-buffer) |
12 |
| - |
13 |
| -3. [Write-Ahead Log (WAL) persistence](#wal-persistence) |
14 |
| - |
15 |
| -4. [Queryable buffer](#query-availability) |
16 |
| - |
17 |
| -5. [Parquet storage](#parquet-storage) |
18 |
| - |
19 |
| -6. [In-memory cache](#in-memory-cache) |
20 |
| - |
21 |
| - |
22 |
| -##### Write Validation |
23 |
| - |
24 |
| -- Process: InfluxDB validates incoming data before accepting it into the system. |
25 |
| - |
26 |
| -- Impact: Prevents malformed or unsupported data from entering the database. |
27 |
| - |
28 |
| -##### Memory Buffer |
29 |
| - |
30 |
| -- Process: Incoming writes are stored in an in-memory buffer before persistence. |
31 |
| - |
32 |
| -- Impact: Increases ingestion efficiency by allowing batch processing. |
33 |
| - |
34 |
| -- Tradeoff: Larger batches improve throughput but require more memory. |
35 |
| - |
36 |
| -##### WAL Persistence |
37 |
| - |
38 |
| -- Process: The system flushes the write buffer to the WAL every second (default). |
39 |
| - |
40 |
| -- Impact: Ensures durability by persisting data to object storage. |
41 |
| - |
42 |
| -- Tradeoff: More frequent flushing improves durability but increases I/O overhead. |
43 |
| - |
44 |
| -##### Query Availability |
45 |
| - |
46 |
| -- Process: The system moves data to the queryable buffer after WAL persistence. |
| 5 | +## Data flow |
47 | 6 |
|
48 |
| -- Impact: Enables fast queries on recent data. |
| 7 | +As data moves through {{% product-name %}}, it follows a structured path to ensure durability, efficient querying, and optimized storage. |
49 | 8 |
|
50 |
| -- Tradeoff: A larger buffer speeds up queries but increases memory usage. |
| 9 | +The figure below shows how written data flows through the database. |
51 | 10 |
|
52 |
| -##### Parquet Storage |
| 11 | +{{< img-hd src="/img/influxdb/influxdb-3-write-path.png" alt="Write Path for InfluxDB 3 Core & Enterprise" />}} |
53 | 12 |
|
54 |
| -- Process: Every ten minutes (default), data is persisted to Parquet files in object storage. |
| 13 | +1. [Write validation and memory buffer](#write-validation-and-memory-buffer) |
| 14 | +2. [Write-ahead log (WAL) persistence](#write-ahead-log-wal-persistence) |
| 15 | +3. [Query availability](#query-availability) |
| 16 | +4. [Parquet storage](#parquet-storage) |
| 17 | +5. [In-memory cache](#in-memory-cache) |
55 | 18 |
|
56 |
| -- Impact: Provides durable, long-term storage. |
| 19 | +### Write validation and memory buffer |
57 | 20 |
|
58 |
| -- Tradeoff: More frequent persistence reduces reliance on the WAL but increases I/O costs. |
| 21 | +- **Process**: InfluxDB validates incoming data before accepting it into the system. |
| 22 | +- **Impact**: Prevents malformed or unsupported data from entering the database. |
| 23 | +- **Details**: The system validates incoming data and stores it in the in-memory write buffer. If [`no_sync=true`](#no-sync-write-option), the server acknowledges the write at this point, without waiting for WAL persistence. |
59 | 24 |
|
60 |
| -##### In-Memory Cache |
| 25 | +### Write-ahead log (WAL) persistence |
61 | 26 |
|
62 |
| -- Process: Recently persisted Parquet files are cached in memory. |
| 27 | +- **Process**: The system flushes the write buffer to the WAL every second (default). |
| 28 | +- **Impact**: Ensures durability by persisting data to object storage. |
| 29 | +- **Tradeoff**: More frequent flushing improves durability but increases I/O overhead. |
| 30 | +- **Details**: Every second (default), the system flushes the write buffer to the write-ahead log (WAL), which is persisted in the object store. If [`no_sync=false`](#no-sync-write-option) (default), the server acknowledges the write only after this flush completes. |
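
The two acknowledgement points described above (immediately when `no_sync=true`, after the WAL flush when `no_sync=false`) can be illustrated with a toy model. This is a minimal sketch only: `WriteBufferSketch` and its methods are hypothetical names invented for illustration, not part of InfluxDB, and the real server flushes on a timer rather than on an explicit call.

```python
from dataclasses import dataclass, field


@dataclass
class WriteBufferSketch:
    """Toy model of the write buffer; hypothetical, not InfluxDB's implementation."""

    buffer: list = field(default_factory=list)        # validated writes held in memory
    wal: list = field(default_factory=list)           # one entry per simulated WAL flush
    pending_acks: list = field(default_factory=list)  # writes waiting for a flush

    def write(self, line: str, no_sync: bool = False) -> bool:
        """Buffer a write; return True if it is acknowledged immediately."""
        self.buffer.append(line)
        if no_sync:
            # no_sync=true: acknowledged before WAL persistence
            return True
        # no_sync=false (default): acknowledged by the next flush
        self.pending_acks.append(line)
        return False

    def flush(self) -> list:
        """Simulate one WAL flush; return the writes it acknowledges."""
        self.wal.append(list(self.buffer))
        self.buffer.clear()
        acked, self.pending_acks = self.pending_acks, []
        return acked
```

In this model a `no_sync=true` write returns sooner, but it is not yet in the WAL when the caller receives the acknowledgement, which is the durability tradeoff the section describes.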
63 | 31 |
|
64 |
| -- Impact: Reduces query latency by minimizing object storage access. |
| 32 | +### Query availability |
65 | 33 |
|
66 |
| -## Data flow |
| 34 | +- **Process**: The system moves data to the queryable buffer after WAL persistence. |
| 35 | +- **Impact**: Enables fast queries on recent data. |
| 36 | +- **Tradeoff**: A larger buffer speeds up queries but increases memory usage. |
| 37 | +- **Details**: After WAL persistence completes, data moves to the queryable buffer where it becomes available for queries. By default, the server keeps up to 900 WAL files (15 minutes of data) buffered. |
67 | 38 |
|
68 |
| -As data moves through InfluxDB 3, it follows a structured path to ensure durability, efficient querying, and optimized storage. |
| 39 | +### Parquet storage |
69 | 40 |
|
70 |
| -The figure below shows how written data flows through the database. |
| 41 | +- **Process**: Every ten minutes (default), data is persisted to Parquet files in object storage. |
| 42 | +- **Impact**: Provides durable, long-term storage. |
| 43 | +- **Tradeoff**: More frequent persistence reduces reliance on the WAL but increases I/O costs. |
| 44 | +- **Details**: Every ten minutes (default), the system persists the oldest data from the queryable buffer to the object store in Parquet format. InfluxDB keeps the remaining data (the most recent 5 minutes) in memory. |
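
The default figures quoted in this section fit together arithmetically. The sketch below assumes one WAL file per flush (implied by "900 WAL files (15 minutes of data)" but not stated outright); the constant and function names are illustrative, not InfluxDB configuration keys.

```python
# Defaults quoted in the text; names are hypothetical, for illustration only.
WAL_FLUSH_INTERVAL_S = 1      # write buffer flushed to the WAL every second
MAX_BUFFERED_WAL_FILES = 900  # WAL files kept in the queryable buffer
PERSIST_INTERVAL_MIN = 10     # oldest data persisted to Parquet every ten minutes


def buffered_minutes(files: int, flush_interval_s: float) -> float:
    """Minutes of data covered by `files` WAL files, assuming one file per flush."""
    return files * flush_interval_s / 60


def minutes_left_in_memory(files: int, flush_interval_s: float, persisted_min: float) -> float:
    """Minutes of recent data still buffered after the oldest span is persisted."""
    return buffered_minutes(files, flush_interval_s) - persisted_min


print(buffered_minutes(MAX_BUFFERED_WAL_FILES, WAL_FLUSH_INTERVAL_S))
print(minutes_left_in_memory(MAX_BUFFERED_WAL_FILES, WAL_FLUSH_INTERVAL_S, PERSIST_INTERVAL_MIN))
```

With the defaults, 900 files at one per second cover 15 minutes; persisting the oldest 10 minutes leaves the most recent 5 minutes in memory, matching the text.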
71 | 45 |
|
72 |
| -{{< img-hd src="/img/influxdb/influxdb-3-write-path.png" alt="Write Path for InfluxDB 3 Core & Enterprise" />}} |
| 46 | +### In-memory cache |
73 | 47 |
|
74 |
| -1. **Incoming writes**: The system validates incoming data and stores it in the write buffer (in memory). If [`no_sync=true`](#no-sync-write-option), the server sends a response to acknowledge the write. |
75 |
| -2. **WAL flush**: Every second (default), the system flushes the write buffer to the Write-Ahead Log (WAL) for persistence in the Object store. If [`no_sync=false`](#no-sync-write-option) (default), the server sends a response to acknowledge the write. |
76 |
| -3. **Query availability**: After WAL persistence completes, data moves to the queryable buffer where it becomes available for queries. By default, the server keeps up to 900 WAL files (15 minutes of data) buffered. |
77 |
| -4. **Long-term storage in Parquet**: Every ten minutes (default), the system persists the oldest data from the queryable buffer to the Object store in Parquet format. InfluxDB keeps the remaining data (the most recent 5 minutes) in memory. |
78 |
| -5. **In-memory cache**: InfluxDB puts Parquet files into an in-memory cache so that queries against the most recently persisted data don't have to go to object storage. |
| 48 | +- **Process**: Recently persisted Parquet files are cached in memory. |
| 49 | +- **Impact**: Reduces query latency by minimizing object storage access. |
| 50 | +- **Details**: InfluxDB puts Parquet files into an in-memory cache so that queries against the most recently persisted data don't have to go to object storage. |
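
The effect of the in-memory cache on object storage access can be sketched as follows. `ParquetCacheSketch`, its methods, and the `fetch_from_object_store` callable are hypothetical names for illustration; the sketch assumes files enter the cache when they are persisted and does not model eviction or sizing.

```python
class ParquetCacheSketch:
    """Toy cache of recently persisted Parquet files; not InfluxDB's implementation."""

    def __init__(self, fetch_from_object_store):
        self._fetch = fetch_from_object_store  # callable: path -> file bytes
        self._cache = {}                       # path -> file bytes held in memory
        self.object_store_reads = 0            # counts reads that miss the cache

    def put(self, path, data):
        """Cache a file at persistence time."""
        self._cache[path] = data

    def get(self, path):
        """Serve from memory when possible; otherwise read from the object store."""
        if path in self._cache:
            return self._cache[path]
        self.object_store_reads += 1
        return self._fetch(path)
```

Queries against a recently persisted file are served from memory and leave `object_store_reads` unchanged; only reads of uncached files reach the object store.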