
Commit 081a5ed

Merge pull request #5895 from influxdata/pbarnett/update-examples-and-pe-cache
Updates for new cluster configurations in Enterprise and new in-memory cache
2 parents 79d1996 + 9a73763 commit 081a5ed

File tree

7 files changed (+216 −69 lines)

api-docs/influxdb3/core/v3/ref.yml

Lines changed: 1 addition & 1 deletion
@@ -118,7 +118,7 @@ tags:
 InfluxDB 3 Core provides the InfluxDB 3 Processing engine, an embedded Python VM that can dynamically load and trigger Python plugins in response to events in your database.
 Use Processing engine plugins and triggers to run code and perform tasks for different database events.

-To get started with the Processing engine, see the [Processing engine and Python plugins](/influxdb3/core/processing-engine/) guide.
+To get started with the Processing Engine, see the [Processing Engine and Python plugins](/influxdb3/core/processing-engine/) guide.
 - name: Quick start
 description: |
 1. [Check the status](#section/Server-information) of the InfluxDB server.

api-docs/influxdb3/enterprise/v3/ref.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -118,7 +118,7 @@ tags:
118118
InfluxDB 3 Enterprise provides the InfluxDB 3 Processing engine, an embedded Python VM that can dynamically load and trigger Python plugins in response to events in your database.
119119
Use Processing engine plugins and triggers to run code and perform tasks for different database events.
120120
121-
To get started with the Processing engine, see the [Processing engine and Python plugins](/influxdb3/enterprise/processing-engine/) guide.
121+
To get started with the Processing Engine, see the [Processing Engine and Python plugins](/influxdb3/enterprise/processing-engine/) guide.
122122
- name: Quick start
123123
description: |
124124
1. [Check the status](#section/Server-information) of the InfluxDB server.

content/influxdb3/core/plugins.md

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 ---
-title: Processing engine and Python plugins
+title: Processing Engine and Python plugins
 description: Use the Python processing engine to trigger and execute custom code on different events in an {{< product-name >}} instance.
 menu:
   influxdb3_core:
-    name: Processing engine and Python plugins
+    name: Processing Engine and Python plugins
 weight: 4
 influxdb3/core/tags: []
 related:

content/influxdb3/enterprise/plugins.md

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
 ---
-title: Processing engine and Python plugins
+title: Processing Engine and Python plugins
 description: Use the Python processing engine to trigger and execute custom code on different events in an {{< product-name >}} instance.
 menu:
   influxdb3_enterprise:
-    name: Processing engine and Python plugins
+    name: Processing Engine and Python plugins
 weight: 4
 influxdb3/core/tags: []
 related:

content/shared/v3-core-get-started/_index.md

Lines changed: 4 additions & 4 deletions
@@ -156,14 +156,14 @@ The following examples show how to start InfluxDB 3 with different object store
 ```bash
 # Memory object store
 # Stores data in RAM; doesn't persist data
-influxdb3 serve --node-id=local01 --object-store=memory
+influxdb3 serve --node-id=host01 --object-store=memory
 ```

 ```bash
 # Filesystem object store
 # Provide the filesystem directory
 influxdb3 serve \
-  --node-id=local01 \
+  --node-id=host01 \
   --object-store=file \
   --data-dir ~/.influxdb3
 ```
@@ -198,7 +198,7 @@ docker run -it \

 ```bash
 influxdb3 serve \
-  --node-id=local01 \
+  --node-id=host01 \
   --object-store=s3 \
   --bucket=BUCKET \
   --aws-access-key=AWS_ACCESS_KEY \
@@ -211,7 +211,7 @@ influxdb3 serve \
 # Specify the object store type and associated options

 ```bash
-influxdb3 serve --node-id=local01 --object-store=s3 --bucket=BUCKET \
+influxdb3 serve --node-id=host01 --object-store=s3 --bucket=BUCKET \
   --aws-access-key=AWS_ACCESS_KEY \
   --aws-secret-access-key=AWS_SECRET_ACCESS_KEY \
   --aws-endpoint=ENDPOINT \

content/shared/v3-core-plugins/_index.md

Lines changed: 157 additions & 7 deletions
@@ -1,15 +1,15 @@

-Use the {{% product-name %}} Processing engine to run code and perform tasks
+Use the {{% product-name %}} Processing Engine to run code and perform tasks
 for different database events.

-{{% product-name %}} provides the InfluxDB 3 Processing engine, an embedded Python VM that can dynamically load and trigger Python plugins
+{{% product-name %}} provides the InfluxDB 3 Processing Engine, an embedded Python VM that can dynamically load and trigger Python plugins
 in response to events in your database.

 ## Key Concepts

 ### Plugins

-A Processing engine _plugin_ is Python code you provide to run tasks, such as
+A Processing Engine _plugin_ is Python code you provide to run tasks, such as
 downsampling data, monitoring, creating alerts, or calling external services.

 > [!Note]
@@ -25,7 +25,7 @@ A _trigger_ is an InfluxDB 3 resource you create to associate a database
 event (for example, a WAL flush) with the plugin that should run.
 When an event occurs, the trigger passes configuration details, optional arguments, and event data to the plugin.

-The Processing engine provides four types of triggers--each type corresponds to
+The Processing Engine provides four types of triggers--each type corresponds to
 an event type with event-specific configuration to let you handle events with targeted logic.

 - **WAL Flush**: Triggered when the write-ahead log (WAL) is flushed to the object store (default is every second).
@@ -35,9 +35,9 @@ an event type with event-specific configuration to let you handle events with ta
 - **Parquet Persistence (coming soon)**: Triggered when InfluxDB 3 persists data to object storage Parquet files.
 -->

-### Activate the Processing engine
+### Activate the Processing Engine

-To enable the Processing engine, start the {{% product-name %}} server with the
+To enable the Processing Engine, start the {{% product-name %}} server with the
 `--plugin-dir` option and a path to your plugins directory.
 If the directory doesn’t exist, the server creates it.

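For example, a start-up sketch combining `--plugin-dir` with the flags used in the get-started examples elsewhere in this commit (the node ID, object store, and plugins path here are illustrative assumptions, not prescribed values):

```shell
# Start the server with the Processing Engine enabled.
# The server creates the --plugin-dir directory if it doesn't exist.
influxdb3 serve \
  --node-id=host01 \
  --object-store=memory \
  --plugin-dir=/path/to/plugins
```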
@@ -234,7 +234,7 @@ influx create trigger --run-asynchronously
 #### Configure error handling
 #### Configure error behavior for plugins

-The Processing engine logs all plugin errors to stdout and the `system.processing_engine_logs` system table.
+The Processing Engine logs all plugin errors to stdout and the `system.processing_engine_logs` system table.

 To configure additional error handling for a trigger, use the `--error-behavior` flag:

@@ -466,3 +466,153 @@ To run the plugin, you send an HTTP request to `<HOST>/api/v3/engine/my-plugin`.
 Because all On Request plugins for a server share the same `<host>/api/v3/engine/` base URL,
 the trigger-spec you define must be unique across all plugins configured for a server,
 regardless of which database they are associated with.
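A request sketch for such an endpoint, assuming a local server and InfluxDB 3's default port 8181 (host, port, and the `my-plugin` trigger-spec are assumptions; substitute your own deployment's values):

```shell
# Invoke an On Request plugin registered with trigger-spec "my-plugin"
curl "http://localhost:8181/api/v3/engine/my-plugin"
```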

## In-memory cache

The Processing Engine provides an in-memory cache that lets plugins persist and retrieve data between executions. Use the cache to maintain state, track metrics over time, and optimize performance when working with external data sources.

### Key Benefits

- **State persistence**: Maintain counters, timestamps, and other state variables across plugin executions.
- **Performance and cost optimization**: Store frequently used data to avoid expensive recalculations, and minimize external API calls by caching responses and avoiding rate limits.
- **Data enrichment**: Cache lookup tables, API responses, or reference data to enrich incoming data efficiently.

### Cache API

The cache API is accessible via the `cache` property on the `influxdb3_local` object provided to all plugin types:

```python
# Basic usage pattern
influxdb3_local.cache.METHOD(PARAMETERS)
```

| Method | Parameters | Returns | Description |
|--------|------------|---------|-------------|
| `put` | `key` (str): The key to store the value under<br>`value` (Any): Any Python object to cache<br>`ttl` (Optional[float], default=None): Time in seconds before expiration<br>`use_global` (bool, default=False): If True, uses the global namespace | None | Stores a value in the cache with an optional time-to-live |
| `get` | `key` (str): The key to retrieve<br>`default` (Any, default=None): Value to return if the key is not found<br>`use_global` (bool, default=False): If True, uses the global namespace | Any | Retrieves a value from the cache, or `default` if the key is not found |
| `delete` | `key` (str): The key to delete<br>`use_global` (bool, default=False): If True, uses the global namespace | bool | Deletes a value from the cache. Returns True if deleted, False if not found |

### Cache Namespaces

The cache system offers two distinct namespaces for different use cases:

| Namespace | Scope | Best For |
| --- | --- | --- |
| **Trigger-specific** (default) | Isolated to a single trigger | Plugin state, counters, and timestamps specific to one plugin |
| **Global** | Shared across all triggers | Configuration, lookup tables, and service states that should be available to all plugins |

### Using the In-Memory Cache

The following examples show how to use the cache API in plugins:

```python
import time  # for timestamp values below

# Store values in the trigger-specific namespace
influxdb3_local.cache.put("last_processed_time", time.time())
influxdb3_local.cache.put("error_count", 0)
influxdb3_local.cache.put("processed_records", {"total": 0, "errors": 0})

# Store values with expiration
influxdb3_local.cache.put("temp_data", {"value": 42}, ttl=300)  # Expires in 5 minutes
influxdb3_local.cache.put("auth_token", "t0k3n", ttl=3600)  # Expires in 1 hour

# Store values in the global namespace
influxdb3_local.cache.put("app_config", {"version": "1.0.2"}, use_global=True)
influxdb3_local.cache.put("global_counter", 0, use_global=True)

# Retrieve values
last_time = influxdb3_local.cache.get("last_processed_time")
auth = influxdb3_local.cache.get("auth_token")
config = influxdb3_local.cache.get("app_config", use_global=True)

# Provide defaults for missing keys
missing = influxdb3_local.cache.get("missing_key", default="Not found")
count = influxdb3_local.cache.get("visit_count", default=0)

# Delete cached values
influxdb3_local.cache.delete("temp_data")
influxdb3_local.cache.delete("app_config", use_global=True)
```

#### Example: Maintaining State Between Executions

This example shows a WAL plugin that uses the cache to maintain a counter across executions:

```python
def process_writes(influxdb3_local, table_batches, args=None):
    # Get the current counter value or default to 0
    counter = influxdb3_local.cache.get("execution_counter", default=0)

    # Increment the counter
    counter += 1

    # Store the updated counter back in the cache
    influxdb3_local.cache.put("execution_counter", counter)

    influxdb3_local.info(f"This plugin has been executed {counter} times")

    # Process writes normally...
```

#### Example: Sharing Configuration Across Triggers

One benefit of the global namespace is responsiveness to changing conditions. This example uses the global namespace to share configuration, so a scheduled call can check thresholds set by prior trigger calls without querying the database itself:

```python
def process_scheduled_call(influxdb3_local, time, args=None):
    # Check if we have cached configuration
    config = influxdb3_local.cache.get("alert_config", use_global=True)

    if not config:
        # Load configuration from the database
        results = influxdb3_local.query("SELECT * FROM system.alert_config")

        # Transform query results into a config object
        config = {row["name"]: row["value"] for row in results}

        # Cache the configuration with a 5-minute TTL
        influxdb3_local.cache.put("alert_config", config, ttl=300, use_global=True)
        influxdb3_local.info("Loaded fresh configuration from database")
    else:
        influxdb3_local.info("Using cached configuration")

    # Use the configuration
    threshold = float(config.get("cpu_threshold", "90.0"))
    # ...
```

The cache is designed to support stateful operations while maintaining isolation between different triggers. Use the trigger-specific namespace for most operations, and the global namespace only when data must be shared across triggers.

### Best Practices

#### Use TTL Appropriately

Set realistic expiration times based on how frequently data changes.

```python
# Cache external API responses for 5 minutes
influxdb3_local.cache.put("weather_data", api_response, ttl=300)
```

#### Cache Computation Results

Store the results of expensive calculations that are used frequently.

```python
# Cache aggregated statistics
influxdb3_local.cache.put("daily_stats", calculate_statistics(data), ttl=3600)
```

#### Implement Cache Warm-Up

Prime the cache at startup with critical data. This is especially useful for global-namespace data that multiple triggers need.

```python
# Check if the cache needs to be initialized
if not influxdb3_local.cache.get("lookup_table"):
    influxdb3_local.cache.put("lookup_table", load_lookup_data())
```

#### Cache Limitations

- **Memory usage**: Cache contents are stored in memory, so monitor memory usage when caching large datasets.
- **Server restarts**: The cache is cleared when the server restarts, so design your plugins to handle cache initialization (as noted above).
- **Concurrency**: Be cautious when multiple trigger instances might update the same cache key simultaneously; concurrent read-modify-write sequences can overwrite each other's updates.
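To make the concurrency caveat concrete, here is a minimal sketch of the lost-update pattern and a per-trigger-key workaround. The `StandInCache` class below is a plain-dict stand-in used purely for illustration; it is not part of the InfluxDB 3 cache API:

```python
class StandInCache:
    """Plain-dict stand-in for influxdb3_local.cache (illustration only)."""
    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value

cache = StandInCache()

# Unsafe: two triggers read the same key, then both write back,
# so one increment is lost.
a = cache.get("counter", default=0)  # trigger A reads 0
b = cache.get("counter", default=0)  # trigger B reads 0
cache.put("counter", a + 1)          # A writes 1
cache.put("counter", b + 1)          # B also writes 1; A's update is lost
print(cache.get("counter"))  # 1, not 2

# Safer: give each trigger its own key and combine on read,
# so concurrent writers never touch the same key.
cache.put("counter:trigger_a", 1)
cache.put("counter:trigger_b", 1)
total = sum(cache.get(k, default=0)
            for k in ("counter:trigger_a", "counter:trigger_b"))
print(total)  # 2
```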
