Commit 6111835

Add indexof_regex, parse_path, regex_quote, ingestion_time (#309)
1 parent 95a3a07 commit 6111835

6 files changed: +578 -2 lines changed

Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@

---
title: ingestion_time
description: 'This page explains how to use the ingestion_time function in APL.'
---

Use the `ingestion_time` function to retrieve the timestamp of when each record was ingested into Axiom. This function helps you distinguish between the original event time (as captured in the `_time` field) and the time the data was actually received by Axiom.

You can use `ingestion_time` to:

- Detect delays or lags in data ingestion.
- Filter events based on their ingestion window.
- Audit data pipelines by comparing event time with ingestion time.

This function is especially useful when working with streaming or event-based data sources where ingestion delays are common and might affect alerting, dashboarding, or correlation accuracy.
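
To make the distinction between event time and ingestion time concrete, the following Python sketch (not APL) computes the gap for two records. The record layout and the field names `event_time` and `ingest_time` are assumptions made for illustration only; in APL the corresponding values come from `_time` and `ingestion_time()`.

```python
from datetime import datetime, timezone

# Hypothetical records: when the event happened vs. when it arrived.
records = [
    {"event_time": datetime(2025, 6, 10, 12, 0, 0, tzinfo=timezone.utc),
     "ingest_time": datetime(2025, 6, 10, 12, 1, 30, tzinfo=timezone.utc)},
    {"event_time": datetime(2025, 6, 10, 12, 5, 0, tzinfo=timezone.utc),
     "ingest_time": datetime(2025, 6, 10, 12, 5, 0, tzinfo=timezone.utc)},
]


def ingestion_delay_seconds(record):
    """Return the whole seconds between event time and ingestion time."""
    return int((record["ingest_time"] - record["event_time"]).total_seconds())


delays = [ingestion_delay_seconds(r) for r in records]
delayed = [d for d in delays if d > 1]  # keep records that arrived late
print(delays)   # [90, 0]
print(delayed)  # [90]
```

The first record arrived 90 seconds after it happened, while the second arrived immediately, so only the first would show up in a delay report.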

## For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

<AccordionGroup>
<Accordion title="Splunk SPL users">

Splunk provides the `_indextime` field, which represents when an event was indexed. In APL, the equivalent concept is accessed using the `ingestion_time` function, which must be called explicitly.

<CodeGroup>
```sql Splunk example
... | eval ingest_time=_indextime
```

```kusto APL equivalent
...
| extend ingest_time = ingestion_time()
```

</CodeGroup>

</Accordion>
<Accordion title="ANSI SQL users">

ANSI SQL has no standard equivalent to `ingestion_time` because SQL databases don't typically record a row's arrival time separately from its event time. The SQL example only approximates the idea: `CURRENT_TIMESTAMP` returns the time the query runs, not the time the row was stored. APL provides `ingestion_time` for observability-specific workflows where the arrival time of data is important.

<CodeGroup>
```sql SQL example
SELECT event_time, CURRENT_TIMESTAMP AS ingest_time FROM logs;
```

```kusto APL equivalent
['sample-http-logs']
| extend ingest_time = ingestion_time()
```

</CodeGroup>

</Accordion>
</AccordionGroup>

## Usage

### Syntax

```kusto
ingestion_time()
```

### Parameters

This function does not take any parameters.

### Returns

A `datetime` value that represents when each record was ingested into Axiom.

## Use case examples

<Tabs>
<Tab title="Log analysis">

Use `ingestion_time` to identify delays between when an HTTP request occurred and when it was ingested into Axiom.

**Query**

```kusto
['sample-http-logs']
| extend ingest_time = ingestion_time()
| extend delay = datetime_diff('second', ingest_time, _time)
| where delay > 1
| project _time, ingest_time, delay, method, uri, status
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20extend%20ingest_time%20%3D%20ingestion_time()%20%7C%20extend%20delay%20%3D%20datetime_diff('second'%2C%20ingest_time%2C%20_time)%20%7C%20where%20delay%20%3E%201%20%7C%20project%20_time%2C%20ingest_time%2C%20delay%2C%20method%2C%20uri%2C%20status%22%7D)

**Output**

| _time | ingest_time | delay | method | uri | status |
| -------------------- | -------------------- | ----- | ------ | ------------- | ------ |
| 2025-06-10T12:00:00Z | 2025-06-10T12:01:30Z | 90 | GET | /api/products | 200 |
| 2025-06-10T12:05:00Z | 2025-06-10T12:06:10Z | 70 | POST | /api/cart/add | 201 |

This query calculates the difference in seconds between the ingestion time and the event time, and keeps only entries that were delayed by more than 1 second.

</Tab>
<Tab title="OpenTelemetry traces">

Use `ingestion_time` to monitor ingestion lags for spans generated by services, helping identify pipeline slowdowns or delivery issues.

**Query**

```kusto
['otel-demo-traces']
| extend ingest_time = ingestion_time()
| extend delay = datetime_diff('second', ingest_time, _time)
| summarize avg_delay = avg(delay) by ['service.name'], kind
| order by avg_delay desc
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%20%7C%20extend%20ingest_time%20%3D%20ingestion_time()%20%7C%20extend%20delay%20%3D%20datetime_diff('second'%2C%20ingest_time%2C%20_time)%20%7C%20summarize%20avg_delay%20%3D%20avg(delay)%20by%20%5B'service.name'%5D%2C%20kind%20%7C%20order%20by%20avg_delay%20desc%22%7D)

**Output**

| service.name | kind | avg_delay |
| --------------- | -------- | ---------- |
| checkoutservice | server | 45 |
| cartservice | client | 30 |
| frontend | internal | 12 |

This query calculates the average ingestion delay per service and kind to identify services affected by delayed ingestion.
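
The grouping step can be sketched outside APL as well. The following Python snippet is only an illustration of what averaging a delay per group and sorting in descending order means. The service names and delay values are made up, and it groups by a single key rather than by both `service.name` and `kind`.

```python
from collections import defaultdict

# Hypothetical (service, delay in seconds) pairs, not real trace data.
spans = [("checkout", 40), ("checkout", 50), ("cart", 30), ("frontend", 12)]


def average_delay_by_service(rows):
    """Average the delay per service and sort from slowest to fastest."""
    totals = defaultdict(lambda: [0, 0])  # service -> [sum of delays, count]
    for service, delay in rows:
        totals[service][0] += delay
        totals[service][1] += 1
    averages = {s: total / count for s, (total, count) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = average_delay_by_service(spans)
print(ranking)  # [('checkout', 45.0), ('cart', 30.0), ('frontend', 12.0)]
```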

</Tab>
<Tab title="Security logs">

Use `ingestion_time` to identify recently ingested suspicious activity, even if the event occurred earlier.

**Query**

```kusto
['sample-http-logs']
| extend ingest_time = ingestion_time()
| where status == '401' and ingest_time > ago(1h)
| project _time, ingest_time, id, method, uri, ['geo.country']
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20extend%20ingest_time%20%3D%20ingestion_time()%20%7C%20where%20status%20%3D%3D%20'401'%20and%20ingest_time%20%3E%20ago(1h)%20%7C%20project%20_time%2C%20ingest_time%2C%20id%2C%20method%2C%20uri%2C%20%5B'geo.country'%5D%22%7D)

**Output**

| _time | ingest_time | id | method | uri | geo.country |
| -------------------- | -------------------- | ------- | ------ | ------------------ | ----------- |
| 2025-06-11T09:15:00Z | 2025-06-11T10:45:00Z | user123 | GET | /admin/login | US |
| 2025-06-11T08:50:00Z | 2025-06-11T10:30:00Z | user456 | POST | /api/session/start | DE |

This query surfaces failed login attempts that were ingested in the last hour, regardless of when the request actually occurred.

</Tab>
</Tabs>

apl/scalar-functions/string-functions.mdx

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 ---
 title: 'String functions'
 description: 'Learn how to use and combine different string functions in APL'
-sidebarTitle: String
+sidebarTitle: Overview
 tags:
 ['axiom documentation', 'documentation', 'axiom', 'string functions', 'countof', 'coalesce', 'extract', 'extract all', 'format url', 'isempty', 'indexof', 'parse json', 'parse url', 'replace', 'reverse', 'strcat', 'strlen']
 ---
Lines changed: 160 additions & 0 deletions
@@ -0,0 +1,160 @@

---
title: indexof_regex
description: 'This page explains how to use the indexof_regex function in APL.'
---

Use the `indexof_regex` function to find the position of the first match of a regular expression in a string. The function is helpful when you want to locate a pattern within a larger text field and take action based on its position. For example, you can use `indexof_regex` to extract fields from semi-structured logs, validate string formats, or trigger alerts when specific patterns appear in log data.

The function returns the zero-based index of the first match. If no match is found, it returns `-1`. Use `indexof_regex` when you need more flexibility than simple substring search (`indexof`), especially when working with dynamic or non-fixed patterns.
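
The difference between a fixed substring search and a pattern search can be illustrated with a short Python sketch. This is not APL: `first_match_index` is a hypothetical helper, and Python's `re` module is not RE2, although the two agree for a simple pattern like this one.

```python
import re


def first_match_index(text, pattern):
    """Zero-based index of the first regex match, or -1 if there is none."""
    match = re.search(pattern, text)
    return match.start() if match else -1


with_id = "/api/user-12345/settings"
without_id = "/api/user-/settings"

# A fixed substring search finds "user-" in both URIs.
print(with_id.find("user-"), without_id.find("user-"))  # 5 5

# The pattern also requires digits, so only the first URI matches.
print(first_match_index(with_id, r"user-[0-9]+"))     # 5
print(first_match_index(without_id, r"user-[0-9]+"))  # -1
```

Both searches report position 5 for the first URI, but only the regular expression rejects the URI that lacks a numeric ID.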

<Note>
All regex functions of APL use the [RE2 regex syntax](https://github.com/google/re2/wiki/Syntax).
</Note>

## For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

<AccordionGroup>
<Accordion title="Splunk SPL users">

Use `match()` in Splunk SPL to perform regular expression matching. However, `match()` returns a Boolean, not the match position. APL’s `indexof_regex` is similar to combining `match()` with additional logic to extract position, which is not natively supported in SPL.

<CodeGroup>
```sql Splunk example
... | eval match_index=if(match(field, "pattern"), 0, -1)
```

```kusto APL equivalent
['dataset']
| extend match_index = indexof_regex(field, 'pattern')
```

</CodeGroup>

</Accordion>
<Accordion title="ANSI SQL users">

ANSI SQL does not have a built-in function to return the index of a regex match. You typically use `REGEXP_LIKE` for Boolean evaluation. `indexof_regex` provides a more direct and powerful way to find the exact match position in APL.

<CodeGroup>
```sql SQL example
SELECT CASE WHEN REGEXP_LIKE(field, 'pattern') THEN 0 ELSE -1 END FROM table;
```

```kusto APL equivalent
['dataset']
| extend match_index = indexof_regex(field, 'pattern')
```

</CodeGroup>

</Accordion>
</AccordionGroup>

## Usage

### Syntax

```kusto
indexof_regex(string, match [, start [, occurrence [, length]]])
```

### Parameters

| Name | Type | Required | Description |
| ---------- | ------ | -------- | --- |
| string | string | Yes | The input text to inspect. |
| match | string | Yes | The regular expression pattern to search for. |
| start | int | | The index in the string where to begin the search. If negative, the function starts that many characters from the end. |
| occurrence | int | | Which instance of the pattern to match. Defaults to `1` if not specified. |
| length | int | | The number of characters to search through. Use `-1` to search to the end of the string. |

### Returns

The function returns the position (starting at zero) where the pattern first matches within the string. If the pattern is not found, the result is `-1`.

The function returns `null` in the following cases:

- The `start` value is negative.
- The `occurrence` value is less than 1.
- The `length` is set to a value below `-1`.
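
As a rough mental model, the return rules listed above can be written down in Python. This is a sketch under stated assumptions, not Axiom's implementation: `indexof_regex_model` is a hypothetical name, `None` stands in for `null`, matches are assumed not to overlap, the returned index is assumed to be relative to the whole string, and Python's `re` is used in place of RE2.

```python
import re
from typing import Optional


def indexof_regex_model(text: str, pattern: str, start: int = 0,
                        occurrence: int = 1, length: int = -1) -> Optional[int]:
    """Illustrative model of the documented return rules (assumptions apply)."""
    # Invalid arguments produce None, mirroring the null cases in the list.
    if start < 0 or occurrence < 1 or length < -1:
        return None
    # Assumption: length counts characters from start; -1 means "to the end".
    end = len(text) if length == -1 else min(len(text), start + length)
    # Assumption: matches do not overlap; positions refer to the full string.
    matches = re.compile(pattern).finditer(text, start, end)
    for n, match in enumerate(matches, start=1):
        if n == occurrence:
            return match.start()
    return -1  # fewer than `occurrence` matches in the search range


print(indexof_regex_model("abcabc", r"a.c"))                # 0
print(indexof_regex_model("abcabc", r"a.c", occurrence=2))  # 3
print(indexof_regex_model("abcabc", r"a.c", occurrence=3))  # -1
print(indexof_regex_model("abcabc", r"a.c", start=-1))      # None
```

Keyword arguments are used on purpose so that the sketch does not depend on the positional order of the optional parameters.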

## Use case examples

<Tabs>
<Tab title="Log analysis">

Use `indexof_regex` to detect whether the URI in a log entry contains an encoded user ID by checking for patterns like `user-[0-9]+`.

**Query**

```kusto
['sample-http-logs']
| extend user_id_pos = indexof_regex(uri, 'user-[0-9]+')
| where user_id_pos != -1
| project _time, id, uri, user_id_pos
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20extend%20user_id_pos%20%3D%20indexof_regex(uri%2C%20'user-%5B0-9%5D%2B')%20%7C%20where%20user_id_pos%20!%3D%20-1%20%7C%20project%20_time%2C%20id%2C%20uri%2C%20user_id_pos%22%7D)

**Output**

| _time | id | uri | user_id_pos |
| -------------------- | ------ | ------------------------ | ------------- |
| 2025-06-10T12:34:56Z | user42 | /api/user-12345/settings | 5 |
| 2025-06-10T12:35:07Z | user91 | /v2/user-6789/dashboard | 4 |

The query finds log entries where the URI contains a user ID pattern and shows the position of the match in the URI string.

</Tab>
<Tab title="OpenTelemetry traces">

Use `indexof_regex` to detect trace IDs that start with a specific structure, such as eight hex digits followed by a hyphen and four more hex digits.

**Query**

```kusto
['otel-demo-traces']
| extend match_index = indexof_regex(trace_id, '^[0-9a-f]{8}-[0-9a-f]{4}')
| where match_index == 0
| project _time, trace_id, match_index
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'otel-demo-traces'%5D%20%7C%20extend%20match_index%20%3D%20indexof_regex(trace_id%2C%20'%5E%5B0-9a-f%5D%7B8%7D-%5B0-9a-f%5D%7B4%7D')%20%7C%20where%20match_index%20%3D%3D%200%20%7C%20project%20_time%2C%20trace_id%2C%20match_index%22%7D)

**Output**

| _time | trace_id | match_index |
| -------------------- | ------------------------------------ | ------------ |
| 2025-06-10T08:23:12Z | ab12cd34-1234-5678-9abc-def123456789 | 0 |
| 2025-06-10T08:24:55Z | fe98ba76-4321-abcd-8765-fedcba987654 | 0 |

This query finds spans where the trace ID begins with the specified pattern, which helps validate trace ID formatting.

</Tab>
<Tab title="Security logs">

Use `indexof_regex` to locate suspicious request patterns such as attempts to access system files (`/etc/passwd`).

**Query**

```kusto
['sample-http-logs']
| extend passwd_index = indexof_regex(uri, '/etc/passwd')
| where passwd_index != -1
| project _time, id, uri, passwd_index
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B'sample-http-logs'%5D%20%7C%20extend%20passwd_index%20%3D%20indexof_regex(uri%2C%20'%2Fetc%2Fpasswd')%20%7C%20where%20passwd_index%20!%3D%20-1%20%7C%20project%20_time%2C%20id%2C%20uri%2C%20passwd_index%22%7D)

**Output**

| _time | id | uri | passwd_index |
| -------------------- | ------ | ------------------------------ | ------------- |
| 2025-06-10T10:15:45Z | user88 | /cgi-bin/view?path=/etc/passwd | 19 |

This query detects HTTP requests attempting to access sensitive file paths, a common indicator of intrusion attempts.

</Tab>
</Tabs>