
Commit 081a5ed

Merge pull request #5895 from influxdata/pbarnett/update-examples-and-pe-cache
Updates for new cluster configurations in Enterprise and new in-memory cache
2 parents 79d1996 + 9a73763 commit 081a5ed

File tree

7 files changed (+216 −69 lines)

api-docs/influxdb3/core/v3/ref.yml

Lines changed: 1 addition & 1 deletion
@@ -118,7 +118,7 @@ tags:
 InfluxDB 3 Core provides the InfluxDB 3 Processing engine, an embedded Python VM that can dynamically load and trigger Python plugins in response to events in your database.
 Use Processing engine plugins and triggers to run code and perform tasks for different database events.

-To get started with the Processing engine, see the [Processing engine and Python plugins](/influxdb3/core/processing-engine/) guide.
+To get started with the Processing Engine, see the [Processing Engine and Python plugins](/influxdb3/core/processing-engine/) guide.
 - name: Quick start
 description: |
 1. [Check the status](#section/Server-information) of the InfluxDB server.

api-docs/influxdb3/enterprise/v3/ref.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -118,7 +118,7 @@ tags:
118118
InfluxDB 3 Enterprise provides the InfluxDB 3 Processing engine, an embedded Python VM that can dynamically load and trigger Python plugins in response to events in your database.
119119
Use Processing engine plugins and triggers to run code and perform tasks for different database events.
120120
121-
To get started with the Processing engine, see the [Processing engine and Python plugins](/influxdb3/enterprise/processing-engine/) guide.
121+
To get started with the Processing Engine, see the [Processing Engine and Python plugins](/influxdb3/enterprise/processing-engine/) guide.
122122
- name: Quick start
123123
description: |
124124
1. [Check the status](#section/Server-information) of the InfluxDB server.

content/influxdb3/core/plugins.md

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 ---
-title: Processing engine and Python plugins
+title: Processing Engine and Python plugins
 description: Use the Python processing engine to trigger and execute custom code on different events in an {{< product-name >}} instance.
 menu:
   influxdb3_core:
-    name: Processing engine and Python plugins
+    name: Processing Engine and Python plugins
 weight: 4
 influxdb3/core/tags: []
 related:

content/influxdb3/enterprise/plugins.md

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 ---
-title: Processing engine and Python plugins
+title: Processing Engine and Python plugins
 description: Use the Python processing engine to trigger and execute custom code on different events in an {{< product-name >}} instance.
 menu:
   influxdb3_enterprise:
-    name: Processing engine and Python plugins
+    name: Processing Engine and Python plugins
 weight: 4
 influxdb3/core/tags: []
 related:

content/shared/v3-core-get-started/_index.md

Lines changed: 4 additions & 4 deletions
@@ -156,14 +156,14 @@ The following examples show how to start InfluxDB 3 with different object store
 ```bash
 # Memory object store
 # Stores data in RAM; doesn't persist data
-influxdb3 serve --node-id=local01 --object-store=memory
+influxdb3 serve --node-id=host01 --object-store=memory
 ```

 ```bash
 # Filesystem object store
 # Provide the filesystem directory
 influxdb3 serve \
-  --node-id=local01 \
+  --node-id=host01 \
   --object-store=file \
   --data-dir ~/.influxdb3
 ```
@@ -198,7 +198,7 @@ docker run -it \

 ```bash
 influxdb3 serve \
-  --node-id=local01 \
+  --node-id=host01 \
   --object-store=s3 \
   --bucket=BUCKET \
   --aws-access-key=AWS_ACCESS_KEY \
@@ -211,7 +211,7 @@ influxdb3 serve \
 # Specify the object store type and associated options

 ```bash
-influxdb3 serve --node-id=local01 --object-store=s3 --bucket=BUCKET \
+influxdb3 serve --node-id=host01 --object-store=s3 --bucket=BUCKET \
   --aws-access-key=AWS_ACCESS_KEY \
   --aws-secret-access-key=AWS_SECRET_ACCESS_KEY \
   --aws-endpoint=ENDPOINT \

content/shared/v3-core-plugins/_index.md

Lines changed: 157 additions & 7 deletions
@@ -1,15 +1,15 @@

-Use the {{% product-name %}} Processing engine to run code and perform tasks
+Use the {{% product-name %}} Processing Engine to run code and perform tasks
 for different database events.

-{{% product-name %}} provides the InfluxDB 3 Processing engine, an embedded Python VM that can dynamically load and trigger Python plugins
+{{% product-name %}} provides the InfluxDB 3 Processing Engine, an embedded Python VM that can dynamically load and trigger Python plugins
 in response to events in your database.

 ## Key Concepts

 ### Plugins

-A Processing engine _plugin_ is Python code you provide to run tasks, such as
+A Processing Engine _plugin_ is Python code you provide to run tasks, such as
 downsampling data, monitoring, creating alerts, or calling external services.

 > [!Note]
@@ -25,7 +25,7 @@ A _trigger_ is an InfluxDB 3 resource you create to associate a database
 event (for example, a WAL flush) with the plugin that should run.
 When an event occurs, the trigger passes configuration details, optional arguments, and event data to the plugin.

-The Processing engine provides four types of triggers--each type corresponds to
+The Processing Engine provides four types of triggers--each type corresponds to
 an event type with event-specific configuration to let you handle events with targeted logic.

 - **WAL Flush**: Triggered when the write-ahead log (WAL) is flushed to the object store (default is every second).
@@ -35,9 +35,9 @@ an event type with event-specific configuration to let you handle events with ta
 - **Parquet Persistence (coming soon)**: Triggered when InfluxDB 3 persists data to object storage Parquet files.
 -->

-### Activate the Processing engine
+### Activate the Processing Engine

-To enable the Processing engine, start the {{% product-name %}} server with the
+To enable the Processing Engine, start the {{% product-name %}} server with the
 `--plugin-dir` option and a path to your plugins directory.
 If the directory doesn’t exist, the server creates it.

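For example, a start-up sketch combining `--plugin-dir` with the flags used in the get-started examples elsewhere in this commit (the node ID, object store, and plugins path here are illustrative assumptions, not prescribed values):

```shell
# Start the server with the Processing Engine enabled.
# The server creates the --plugin-dir directory if it doesn't exist.
influxdb3 serve \
  --node-id=host01 \
  --object-store=memory \
  --plugin-dir=/path/to/plugins
```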
@@ -234,7 +234,7 @@ influx create trigger --run-asynchronously
 #### Configure error handling
 #### Configure error behavior for plugins

-The Processing engine logs all plugin errors to stdout and the `system.processing_engine_logs` system table.
+The Processing Engine logs all plugin errors to stdout and the `system.processing_engine_logs` system table.

 To configure additional error handling for a trigger, use the `--error-behavior` flag:

@@ -466,3 +466,153 @@ To run the plugin, you send an HTTP request to `<HOST>/api/v3/engine/my-plugin`.
 Because all On Request plugins for a server share the same `<host>/api/v3/engine/` base URL,
 the trigger-spec you define must be unique across all plugins configured for a server,
 regardless of which database they are associated with.
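A request sketch for such an endpoint, assuming a local server and InfluxDB 3's default port 8181 (host, port, and the `my-plugin` trigger-spec are assumptions; substitute your own deployment's values):

```shell
# Invoke an On Request plugin registered with trigger-spec "my-plugin"
curl "http://localhost:8181/api/v3/engine/my-plugin"
```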

## In-memory cache

The Processing Engine provides an in-memory cache that lets plugins persist and retrieve data between executions. Use the cache to maintain state, track metrics over time, and optimize performance when working with external data sources.

### Key Benefits

- **State persistence**: Maintain counters, timestamps, and other state variables across plugin executions.
- **Performance and cost optimization**: Store frequently used data to avoid expensive recalculations, and minimize external API calls by caching responses and avoiding rate limits.
- **Data enrichment**: Cache lookup tables, API responses, or reference data to enrich incoming data efficiently.

### Cache API

The cache API is accessible via the `cache` property on the `influxdb3_local` object provided to all plugin types:

```python
# Basic usage pattern
influxdb3_local.cache.METHOD(PARAMETERS)
```

| Method | Parameters | Returns | Description |
|--------|------------|---------|-------------|
| `put` | `key` (str): The key to store the value under<br>`value` (Any): Any Python object to cache<br>`ttl` (Optional[float], default=None): Time in seconds before expiration<br>`use_global` (bool, default=False): If True, uses the global namespace | None | Stores a value in the cache with an optional time-to-live |
| `get` | `key` (str): The key to retrieve<br>`default` (Any, default=None): Value to return if the key is not found<br>`use_global` (bool, default=False): If True, uses the global namespace | Any | Retrieves a value from the cache, or `default` if the key is not found |
| `delete` | `key` (str): The key to delete<br>`use_global` (bool, default=False): If True, uses the global namespace | bool | Deletes a value from the cache. Returns True if deleted, False if not found |

### Cache Namespaces

The cache system offers two distinct namespaces for different use cases:

| Namespace | Scope | Best For |
| --- | --- | --- |
| **Trigger-specific** (default) | Isolated to a single trigger | Plugin state, counters, and timestamps specific to one plugin |
| **Global** | Shared across all triggers | Configuration, lookup tables, and service states that should be available to all plugins |

### Using the In-Memory Cache

The following examples show how to use the cache API in plugins:

```python
import time  # for timestamp values below

# Store values in the trigger-specific namespace
influxdb3_local.cache.put("last_processed_time", time.time())
influxdb3_local.cache.put("error_count", 0)
influxdb3_local.cache.put("processed_records", {"total": 0, "errors": 0})

# Store values with expiration
influxdb3_local.cache.put("temp_data", {"value": 42}, ttl=300)  # Expires in 5 minutes
influxdb3_local.cache.put("auth_token", "t0k3n", ttl=3600)  # Expires in 1 hour

# Store values in the global namespace
influxdb3_local.cache.put("app_config", {"version": "1.0.2"}, use_global=True)
influxdb3_local.cache.put("global_counter", 0, use_global=True)

# Retrieve values
last_time = influxdb3_local.cache.get("last_processed_time")
auth = influxdb3_local.cache.get("auth_token")
config = influxdb3_local.cache.get("app_config", use_global=True)

# Provide defaults for missing keys
missing = influxdb3_local.cache.get("missing_key", default="Not found")
count = influxdb3_local.cache.get("visit_count", default=0)

# Delete cached values
influxdb3_local.cache.delete("temp_data")
influxdb3_local.cache.delete("app_config", use_global=True)
```

#### Example: Maintaining State Between Executions

This example shows a WAL plugin that uses the cache to maintain a counter across executions:

```python
def process_writes(influxdb3_local, table_batches, args=None):
    # Get the current counter value or default to 0
    counter = influxdb3_local.cache.get("execution_counter", default=0)

    # Increment the counter
    counter += 1

    # Store the updated counter back in the cache
    influxdb3_local.cache.put("execution_counter", counter)

    influxdb3_local.info(f"This plugin has been executed {counter} times")

    # Process writes normally...
```

#### Example: Sharing Configuration Across Triggers

One benefit of the global namespace is responsiveness to changing conditions. This example uses the global namespace to share configuration, so a scheduled call can check thresholds set by prior trigger calls without querying the database itself:

```python
def process_scheduled_call(influxdb3_local, time, args=None):
    # Check if we have cached configuration
    config = influxdb3_local.cache.get("alert_config", use_global=True)

    if not config:
        # Load configuration from the database
        results = influxdb3_local.query("SELECT * FROM system.alert_config")

        # Transform query results into a config object
        config = {row["name"]: row["value"] for row in results}

        # Cache the configuration with a 5-minute TTL
        influxdb3_local.cache.put("alert_config", config, ttl=300, use_global=True)
        influxdb3_local.info("Loaded fresh configuration from database")
    else:
        influxdb3_local.info("Using cached configuration")

    # Use the configuration
    threshold = float(config.get("cpu_threshold", "90.0"))
    # ...
```

The cache is designed to support stateful operations while maintaining isolation between different triggers. Use the trigger-specific namespace for most operations, and the global namespace only when data must be shared across triggers.

### Best Practices

#### Use TTL Appropriately

Set realistic expiration times based on how frequently data changes.

```python
# Cache external API responses for 5 minutes
influxdb3_local.cache.put("weather_data", api_response, ttl=300)
```

#### Cache Computation Results

Store the results of expensive calculations that are used frequently.

```python
# Cache aggregated statistics
influxdb3_local.cache.put("daily_stats", calculate_statistics(data), ttl=3600)
```

#### Implement Cache Warm-Up

Prime the cache at startup with critical data. This is especially useful for global-namespace data that multiple triggers need.

```python
# Check if the cache needs to be initialized
if not influxdb3_local.cache.get("lookup_table"):
    influxdb3_local.cache.put("lookup_table", load_lookup_data())
```

#### Cache Limitations

- **Memory usage**: Cache contents are stored in memory, so monitor memory usage when caching large datasets.
- **Server restarts**: The cache is cleared when the server restarts, so design your plugins to handle cache initialization (as noted above).
- **Concurrency**: Be cautious when multiple trigger instances might update the same cache key simultaneously; concurrent read-modify-write sequences can overwrite each other's updates.
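To make the concurrency caveat concrete, here is a minimal sketch of the lost-update pattern and a per-trigger-key workaround. The `StandInCache` class below is a plain-dict stand-in used purely for illustration; it is not part of the InfluxDB 3 cache API:

```python
class StandInCache:
    """Plain-dict stand-in for influxdb3_local.cache (illustration only)."""
    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value

cache = StandInCache()

# Unsafe: two triggers read the same key, then both write back,
# so one increment is lost.
a = cache.get("counter", default=0)  # trigger A reads 0
b = cache.get("counter", default=0)  # trigger B reads 0
cache.put("counter", a + 1)          # A writes 1
cache.put("counter", b + 1)          # B also writes 1; A's update is lost
print(cache.get("counter"))  # 1, not 2

# Safer: give each trigger its own key and combine on read,
# so concurrent writers never touch the same key.
cache.put("counter:trigger_a", 1)
cache.put("counter:trigger_b", 1)
total = sum(cache.get(k, default=0)
            for k in ("counter:trigger_a", "counter:trigger_b"))
print(total)  # 2
```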
