Commit a9f5a61

some 2.9 features are back-ported to 2.8.2
1 parent 6688e00 commit a9f5a61

File tree: 7 files changed, +124 −5 lines

docs/datatypes.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ Like many analytics systems, the following common types are supported.
 | | float | -3.1415 | default with 4 bytes. Same as `float32`. You can also use `float64` or `double` for 8 bytes. No `float8` or `float16`. | [to_float](/functions_for_type#to_float) |
 | Boolean Type | bool | true | true or false | |
 | String Type | string | 'Hello' | strings of an arbitrary length. You can also use `varchar`. To create string columns with a fixed size in bytes, use `fixed_string(positiveInt)` | [to_string](/functions_for_type#to_string), [etc.](/functions_for_text) |
-| JSON Type | json | '\{"a":1,"b":["x","y"]\}' | New in Timeplus Enterprise 2.9. The JSON document is stored in a more optimized, columnar-like layout to improve query performance. |
+| JSON Type | json | '\{"a":1,"b":["x","y"]\}' | New in Timeplus Enterprise 2.9 (also available in 2.8.2 or above). The JSON document is stored in a more optimized, columnar-like layout to improve query performance. |
 | Universally Unique Identifier | uuid | 1f71acbf-59fc-427d-a634-1679b48029a9 | a universally unique identifier (UUID) is a 16-byte number used to identify records. For detailed information about the UUID, see [Wikipedia](https://en.wikipedia.org/wiki/Universally_unique_identifier) | [uuid](/functions_for_text#uuid) |
 | IP address | ipv4 | '116.253.40.133' | IPv4 addresses. Stored in 4 bytes as uint32. | [to_ipv4](/functions_for_url#to_ipv4) |
 | | ipv6 | '2a02:aa08:e000:3100::2' | IPv6 addresses. Stored in 16 bytes as uint128. | [to_ipv6](/functions_for_url#to_ipv6) |
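
To try the backported json type, a minimal sketch (assuming Timeplus Enterprise 2.8.2 or above; the stream name is hypothetical):

```sql
-- Create a stream with a json column and round-trip one document.
CREATE STREAM product_events (raw json);

INSERT INTO product_events (raw) VALUES ('{"a":1,"b":["x","y"]}');

-- table() reads stored rows instead of tailing the stream.
SELECT raw FROM table(product_events);
```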

docs/functions_for_json.md

Lines changed: 6 additions & 0 deletions
@@ -63,3 +63,9 @@ This takes one or more parameters and return a json string. You can also turn al
 This function is available since Timeplus Enterprise v2.9.

 This takes one or more parameters and returns a json object. You can also turn all column values in the row into a json object via `json_cast(*)`.
+
+### json_array_length
+Get the length of a JSON array. For example, `json_array_length('[3,4,5]')` will return `3`.
+
+### json_merge_patch
+Merge multiple JSON documents into one. For example, `json_merge_patch('{"a":1,"b":2}', '{"b":3,"c":4}')` will return `{"a":1,"b":3,"c":4}`. If a key exists in both documents, the value from the second document overwrites the first.
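
A quick way to exercise the two new functions (a sketch restating the examples above, assuming a Timeplus Enterprise 2.8.2+ SQL client):

```sql
SELECT json_array_length('[3,4,5]') AS len;  -- 3

SELECT json_merge_patch('{"a":1,"b":2}', '{"b":3,"c":4}') AS merged;
-- '{"a":1,"b":3,"c":4}': "b" is taken from the second document
```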

docs/functions_for_text.md

Lines changed: 4 additions & 0 deletions
@@ -161,6 +161,10 @@ Alias of [generate_uuidv4](#generate_uuidv4).

 Generates a universally unique identifier (UUIDv4), which is a 16-byte number used to identify records.

+### uuid7
+
+Alias of [generate_uuidv7](#generate_uuidv7).
+
 ### generate_uuidv7

 `generate_uuidv7()` generates a universally unique identifier (UUIDv7), which contains the current Unix timestamp in milliseconds (48 bits), followed by the version "7" (4 bits), a counter (42 bits) to distinguish UUIDs within a millisecond (including a 2-bit variant field "2"), and a random field (32 bits). For any given timestamp (unix_ts_ms), the counter starts at a random value and is incremented by 1 for each new UUID until the timestamp changes. If the counter overflows, the timestamp field is incremented by 1 and the counter is reset to a random new start value.
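
A sketch of the new alias (assuming `uuid7()` is interchangeable with `generate_uuidv7()`, per the alias definition above):

```sql
-- Both calls produce time-ordered UUIDv7 values; within the same
-- millisecond the counter makes each new UUID strictly greater.
SELECT generate_uuidv7() AS u1, uuid7() AS u2;
```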

docs/http-external.md

Lines changed: 35 additions & 1 deletion
@@ -1,6 +1,6 @@
 # HTTP External Stream

-Since [Timeplus Enterprise v2.9](/enterprise-v2.9), Timeplus can send data to HTTP endpoints via the HTTP External Stream. You can use this feature to trigger Slack notifications or send streaming data to downstream systems, such as Splunk, Elasticsearch, or any other HTTP-based service.
+Since Timeplus Enterprise [v2.9](/enterprise-v2.9) and v2.8.2, you can send data to HTTP endpoints via the HTTP External Stream. You can use this feature to trigger Slack notifications or send streaming data to downstream systems, such as Splunk, Datadog, Elasticsearch, Databricks, or any other HTTP-based service.

 Currently, it only supports writing data to HTTP endpoints; reading data from HTTP endpoints is not supported yet.

@@ -169,6 +169,40 @@ Then you can insert data via a materialized view or just via `INSERT` command:
 INSERT INTO http_bigquery_t1 VALUES(10,'A'),(11,'B');
 ```

+#### Write to Databricks {#example-write-to-databricks}
+
+Follow [the guide](https://docs.databricks.com/aws/en/dev-tools/auth/pat) to create an access token for your Databricks workspace.
+
+Assume you have created a table in a Databricks SQL warehouse with 2 columns:
+```sql
+CREATE TABLE sales (
+  product STRING,
+  quantity INT
+);
+```
+
+Create the HTTP external stream in Timeplus:
+```sql
+CREATE EXTERNAL STREAM http_databricks_t1 (product string, quantity int)
+SETTINGS
+type = 'http',
+http_header_Authorization='Bearer $TOKEN',
+url = 'https://$HOST.cloud.databricks.com/api/2.0/sql/statements/',
+data_format = 'Template',
+format_template_resultset_format='{"warehouse_id":"$WAREHOUSE_ID","statement": "INSERT INTO sales (product, quantity) VALUES (:product, :quantity)", "parameters": [${data}]}',
+format_template_row_format='{ "name": "product", "value": ${product:JSON}, "type": "STRING" },{ "name": "quantity", "value": ${quantity:JSON}, "type": "INT" }',
+format_template_rows_between_delimiter=''
+```
+
+Replace `TOKEN`, `HOST`, and `WAREHOUSE_ID` to match your Databricks settings. Also change `format_template_resultset_format` and `format_template_row_format` to match the table schema.
+
+Then you can insert data via a materialized view or just via the `INSERT` command:
+```sql
+INSERT INTO http_databricks_t1(product, quantity) VALUES('test',95);
+```
+
+This inserts one row per request. We plan to support batch inserts and a Databricks-specific format for different table schemas in the future.
+
 #### Trigger Slack Notifications {#example-trigger-slack}

 You can follow [the guide](https://api.slack.com/messaging/webhooks) to configure an "incoming webhook" to send notifications to a Slack channel.
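
The Databricks example above mentions inserting via a materialized view but only shows a direct `INSERT`; a minimal sketch of the materialized-view route (the source stream `orders` and the view name are hypothetical):

```sql
-- Continuously forward rows from a local stream into Databricks
-- through the external stream defined above.
CREATE MATERIALIZED VIEW mv_orders_to_databricks
INTO http_databricks_t1 AS
SELECT product, quantity FROM orders;
```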

docs/mongo-external.md

Lines changed: 8 additions & 0 deletions
@@ -79,6 +79,14 @@ MongoDB connection string options as a URL formatted string. e.g. 'authSource=ad
 #### oid_columns
 A comma-separated list of columns that should be treated as oid in the `WHERE` clause. Defaults to `_id`.

+### Query Settings
+
+#### mongodb_throw_on_unsupported_query
+By default this setting is `true`: while querying the MongoDB external table with SQL, if the query contains `GROUP BY`, `HAVING` or other aggregations, Timeplus will throw an exception. Set this to `false` or `0` to disable this behavior; Timeplus will then read the full table data from MongoDB and execute the query in Timeplus. For example:
+```sql
+SELECT name, COUNT(*) AS cnt FROM mongodb_ext_table GROUP BY name HAVING cnt > 5 SETTINGS mongodb_throw_on_unsupported_query = false;
+```
+
 ## DROP EXTERNAL TABLE

 ```sql
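
As an aside on `oid_columns` above, a hedged sketch of how the setting might be used (the connection values are placeholders and the extra `parent_id` column is hypothetical):

```sql
-- Treat both _id and parent_id as MongoDB ObjectId in WHERE clauses.
CREATE EXTERNAL TABLE orders_mongo
SETTINGS type = 'mongodb',
         address = 'localhost:27017',
         database = 'shop',
         collection = 'orders',
         oid_columns = '_id,parent_id';

-- The string literal is converted to an ObjectId before the filter is
-- pushed down to MongoDB.
SELECT * FROM orders_mongo WHERE parent_id = '507f1f77bcf86cd799439011';
```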

docs/mutable-stream.md

Lines changed: 37 additions & 2 deletions
@@ -22,14 +22,19 @@ PRIMARY KEY (col1, col2)
 SETTINGS
 logstore_retention_bytes=..,
 logstore_retention_ms=..,
-shards=..
+shards=..,
+version_column=..,
+coalesced=..,
+ttl_seconds=..
 ```

 Since Timeplus Enterprise 2.7, if you create a mutable stream with `low_cardinality` columns, the system will ignore the `low_cardinality` modifier to improve performance.
 [Learn more](/sql-create-mutable-stream).

 `PARTITION BY`, `ORDER BY` or `SAMPLE BY` clauses are not allowed while creating a mutable stream.

+Since Timeplus Enterprise 2.8.2, you can set `coalesced` (defaults to false). If it's true and the inserted data only contains a subset of the columns in the WAL, the partial columns will be merged with the existing rows. [Learn more](#coalesced). Also since 2.8.2, you can set `ttl_seconds` (defaults to -1). If it's set to a positive value, data older than `ttl_seconds` will be scheduled to be pruned in the next key compaction cycle. [Learn more](#ttl_seconds).
+
 ## INSERT
 You can insert data to the mutable stream with the following SQL:
 ```sql

@@ -184,7 +189,7 @@ Mutable stream can also be used in [JOINs](/joins).
 ### Retention Policy for Historical Storage{#ttl_seconds}
 Like normal streams in Timeplus, mutable streams use both streaming storage and historical storage. New data is added to the streaming storage first, then continuously written to the historical storage with a deduplication/merging process.

-Starting from Timeplus Enterprise 2.9, you can set `ttl_seconds` on mutable streams. If the data is older than this value, it is scheduled to be pruned in the next key compaction cycle. The default value is -1. Any value less than 0 means this feature is disabled.
+Starting from Timeplus Enterprise 2.9 (also backported to 2.8.2), you can set `ttl_seconds` on mutable streams. If the data is older than this value, it is scheduled to be pruned in the next key compaction cycle. The default value is -1. Any value less than 0 means this feature is disabled.

 ```sql
 CREATE MUTABLE STREAM ..

@@ -280,6 +285,36 @@ PRIMARY KEY (device_id, timestamp, batch_id)
 SETTINGS shards=3
 ```

+### Coalesced and Versioned Mutable Stream {#coalesced}
+For a mutable stream with many columns, there are cases where only some of the columns are updated over time. Create a mutable stream with the `coalesced=true` setting to enable partial merges. For example, given a mutable stream:
+```sql
+create mutable stream kv_99061_1 (
+  p string, m1 int, m2 int, m3 int, v uint64,
+  family cf1(m1),
+  family cf2(m2),
+  family cf3(m3),
+  family cf4(_tp_time)
+) primary key p
+settings coalesced = true;
+```
+If we insert one row with `m1=1`:
+```sql
+insert into kv_99061_1 (p, m1, _tp_time) values ('p1', 1, '2025-01-01T00:00:01');
+```
+Query the mutable stream and you will get one row.
+
+Then insert another row with the same primary key and `m2=2`:
+```sql
+insert into kv_99061_1 (p, m2, _tp_time) values ('p1', 2, '2025-01-01T00:00:02');
+```
+Query it again with
+```sql
+select * from table(kv_99061_1);
+```
+You will see one row with m1 and m2 updated and the other columns at their default values.
+
+Compared to the [Versioned Stream](versioned-stream), coalesced mutable streams don't require you to set all column values when you update a primary key. You can also set `version_column` to the name of the column that carries the version number. Say there are updates for the same primary key, with `v` as the `version_column`: the first update is "v=1,p=1,m=1" and the second is "v=2,p=1,m=2". If, for some reason, Timeplus receives the second update first, then when it gets "v=1,p=1,m=1", version 1 is lower than the current version, so this update is rejected and the latest update "v=2,p=1,m=2" is kept. This is especially beneficial in distributed environments with potentially out-of-order events, as sketched below.
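
A hedged sketch of the `version_column` behavior described above (the stream and column names are made up for illustration; `version_column` is assumed to take the column name, per the settings list at the top of this file):

```sql
-- A coalesced mutable stream where `v` carries the version number.
create mutable stream kv_versioned (p string, m int, v uint64)
primary key p
settings coalesced = true, version_column = 'v';

-- The newer update (v=2) happens to arrive first:
insert into kv_versioned (p, m, v) values ('p1', 2, 2);
-- The late, older update (v=1) is rejected, not merged:
insert into kv_versioned (p, m, v) values ('p1', 1, 1);

-- Expect a single row: p='p1', m=2, v=2.
select * from table(kv_versioned);
```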
 ## Performance Tuning {#tuning}
 If you are facing performance challenges with massive data in mutable streams, please consider adding [secondary indexes](#index), [column families](#column_family) and using [multiple shards](#shards).

docs/sql-alter-stream.md

Lines changed: 33 additions & 1 deletion
@@ -12,7 +12,7 @@ to_datetime(created_at) + INTERVAL 48 HOUR
 ```

 ## MODIFY SETTING{#modify_setting}
-You can add or modify the retention policy for streaming storage. e.g.
+You can add or modify the retention policy for streams or mutable streams, e.g.

 ```sql
 ALTER STREAM stream_name MODIFY SETTING

@@ -32,6 +32,11 @@ You can also change the codec for mutable streams. e.g.
 ALTER STREAM test MODIFY SETTING logstore_codec='lz4';
 ```

+Starting from Timeplus Enterprise 2.8.2, you can also modify the TTL for a mutable stream.
+```sql
+ALTER STREAM test MODIFY SETTING ttl_seconds = 10;
+```
+
 ## MODIFY QUERY SETTING

 :::info

@@ -69,6 +74,11 @@ Syntax:
 ALTER STREAM stream_name ADD COLUMN column_name data_type
 ```

+Since Timeplus Enterprise 2.8.2, you can also add multiple columns at once:
+```sql
+ALTER STREAM stream_99005 ADD COLUMN e int, ADD COLUMN f int;
+```
+
 `DELETE COLUMN` is not supported yet. Contact us if you have strong use cases.

 ## RENAME COLUMN

@@ -92,6 +102,28 @@ Since Timeplus Enterprise v2.9.0, you can drop an index from a mutable stream.
 ALTER STREAM mutable_stream DROP INDEX index_name
 ```

+## MATERIALIZE INDEX
+Since Timeplus Enterprise 2.8.2, you can rebuild the secondary index `name` for the specified `partition_name`.
+```sql
+ALTER STREAM mutable_stream MATERIALIZE INDEX [IF EXISTS] name [IN PARTITION partition_name] SETTINGS mutations_sync = 2
+```
+
+For example:
+```sql
+ALTER STREAM minmax_idx MATERIALIZE INDEX idx IN PARTITION 2 SETTINGS mutations_sync = 2
+```
+
+## CLEAR INDEX
+Since Timeplus Enterprise 2.8.2, you can delete the secondary index `name` from disk.
+```sql
+ALTER STREAM mutable_stream CLEAR INDEX [IF EXISTS] name [IN PARTITION partition_name] SETTINGS mutations_sync = 2
+```
+
+For example:
+```sql
+ALTER STREAM minmax_idx CLEAR INDEX idx IN PARTITION 2 SETTINGS mutations_sync = 2
+```
+
 ## DROP PARTITION
 You can delete some data in the stream without dropping the entire stream via `ALTER STREAM .. DROP PARTITION ..`.
