Commit d474284

focus on create udf via SQL not via UI
1 parent 63cec98 commit d474284

3 files changed

+149
-75
lines changed

docs/js-udf.md

Lines changed: 69 additions & 63 deletions
@@ -1,27 +1,21 @@
11
# JavaScript UDF
22

3-
In addition to [Remote UDF](/remote-udf), Timeplus Proton also supports JavaScript-based UDF running in the SQL engine. You can develop User-defined scalar functions (UDFs) or User-defined aggregate functions (UDAFs) with modern JavaScript (powered by [V8](https://v8.dev/)). No need to deploy extra server/service for the UDF. More languages will be supported in the future.
4-
5-
:::info
6-
7-
The JavaScript-based UDF can run in both Timeplus and Proton local deployments. It runs "locally" in the database engine. It doesn't mean this feature is only available for local deployment.
8-
9-
:::
3+
Timeplus supports JavaScript-based UDF running in the SQL engine. You can develop User-defined scalar functions (UDFs) or User-defined aggregate functions (UDAFs) with modern JavaScript (powered by [V8](https://v8.dev/)). No need to deploy extra server/service for the UDF. More languages will be supported in the future.
104

115
## Register a JS UDF via SQL {#ddl}
12-
Please check [CREATE FUNCTION](/sql-create-function) page for the SQL syntax.
6+
Please check [CREATE FUNCTION](/sql-create-function#javascript-udf) page for the SQL syntax.
137

148
## Register a JS UDF via Web Console {#register}
159

16-
1. Open "UDFs" from the navigation menu on the left, and click the 'Register New Function' button.
10+
1. Open "UDFs" from the navigation menu on the left, and click the 'New UDF' button.
1711
2. Specify a function name, such as `second_max`. Make sure the name doesn't conflict with built-in functions or other UDFs. The description is optional.
1812
3. Choose the data type for input parameters and return value.
1913
4. Choose "JavaScript" as the UDF type.
2014
5. Specify whether the function is for aggregation or not.
2115
6. Enter the JavaScript source for the UDF. (The sections below explain how to write the code.)
2216
7. Click the **Create** button to register the function.
2317

24-
### Arguments
18+
## Arguments
2519

2620
Unlike Remote UDF, the argument names don't matter when you register a JS UDF. Make sure the list of arguments matches the input parameter list in your JavaScript function.
2721

@@ -35,7 +29,7 @@ The input data are in Timeplus data type. They will be converted to JavaScript d
3529
| date/date32/datetime/datetime64 | Date (in milliseconds) |
3630
| array(Type) | Array |
3731

38-
### Returned value
32+
## Returned value
3933

4034
The JavaScript UDF can return the following data types, which are converted back to the specified Timeplus data types. The supported return types are similar to the argument types. The only difference is that if you return a complex data structure as an `object`, it is converted to a named `tuple` in Timeplus.
4135

@@ -64,10 +58,14 @@ SELECT * FROM user_clicks where is_work_email(email)
6458

6559
You can use the following code to define a new function `is_work_email` that takes one `string` argument and returns a `bool`.
6660

67-
```javascript
61+
```sql
62+
CREATE OR REPLACE FUNCTION is_work_email(email string)
63+
RETURNS bool
64+
LANGUAGE JAVASCRIPT AS $$
6865
function is_work_email(values){
6966
return values.map(email=>!email.endsWith("@gmail.com"));
7067
}
68+
$$;
7169
```
7270

7371
Notes:
@@ -89,7 +87,10 @@ Similar to the last tutorial, you create a new function called `email_not_in`. T
8987

9088
The following code implements this new function:
9189

92-
```javascript
90+
```sql
91+
CREATE OR REPLACE FUNCTION email_not_in(email string,list string)
92+
RETURNS bool
93+
LANGUAGE JAVASCRIPT AS $$
9394
function email_not_in(emails,lists){
9495
let list=lists[0].split(','); // convert string to array(string)
9596
return emails.map(email=>{
@@ -100,6 +101,7 @@ function email_not_in(emails,lists){
100101
return true; // no match, return true confirming the email is in none of the provided domains
101102
});
102103
}
104+
$$;
103105
```
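
To try the batch behavior outside Timeplus, the sketch below runs a standalone version of `email_not_in` in plain JavaScript (for example with Node.js). It assumes what the examples on this page suggest: each argument arrives as an array with one entry per row, and the function returns an array of the same length. The domain-matching loop and the sample inputs (`sampleEmails`, `sampleLists`) are illustrative assumptions, not the exact body from the docs.

```javascript
// Standalone sketch: mimics how a two-argument scalar JS UDF is called with a batch of rows.
// The matching logic below is an assumption for illustration.
function email_not_in(emails, lists) {
  // Assumes the list argument is the same for every row, so only the first entry is read.
  const list = lists[0].split(","); // convert string to array(string)
  return emails.map(email => {
    for (const domain of list) {
      if (email.endsWith("@" + domain)) {
        return false; // the email belongs to one of the provided domains
      }
    }
    return true; // no match: the email is in none of the provided domains
  });
}

// Hypothetical batch of two rows; the second argument repeats once per row.
const sampleEmails = ["a@gmail.com", "b@example.com"];
const sampleLists = ["gmail.com,icloud.com", "gmail.com,icloud.com"];
const emailResults = email_not_in(sampleEmails, sampleLists);
console.log(emailResults); // one boolean per input row
```

Checking the function this way before wrapping it in `CREATE OR REPLACE FUNCTION ... LANGUAGE JAVASCRIPT AS $$ ... $$` can catch simple mistakes, such as returning a single value instead of an array.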
104106

105107
### Scalar function with no argument {#scalar0}
@@ -112,10 +114,14 @@ SELECT *, magic_number(1) FROM user_clicks
112114

113115
The `magic_number` function takes an `int` argument as a workaround.
114116

115-
```javascript
117+
```sql
118+
CREATE OR REPLACE FUNCTION magic_number(v int)
119+
RETURNS int
120+
LANGUAGE JAVASCRIPT AS $$
116121
function magic_number(values){
117122
return values.map(v=>42)
118123
}
124+
$$;
119125
```
120126

121127
In this case, the function will return `42` no matter what parameter is specified.
@@ -143,57 +149,57 @@ Let's take an example of a function to get the second maximum values from the gr
143149

144150
The full source code for this JS UDAF is
145151

146-
```javascript
147-
{
148-
initialize: function() {
149-
this.max = -1.0;
150-
this.sec_max = -1.0;
151-
},
152-
153-
process: function(values) {
154-
for (let i = 0; i < values.length; i++) {
155-
this._update(values[i]);
156-
}
157-
},
158-
159-
_update: function(value) {
160-
if (value > this.max) {
161-
this.sec_max = this.max;
162-
this.max = value;
163-
} else if (value > this.sec_max) {
164-
this.sec_max = value;
165-
}
166-
},
167-
168-
finalize: function() {
169-
return this.sec_max
170-
},
171-
172-
serialize: function() {
173-
return JSON.stringify({
174-
'max': this.max,
175-
'sec_max': this.sec_max
176-
});
177-
},
178-
179-
deserialize: function(state_str) {
180-
let s = JSON.parse(state_str);
181-
this.max = s['max'];
182-
this.sec_max = s['sec_max']
183-
},
184-
185-
merge: function(state_str) {
186-
let s = JSON.parse(state_str);
187-
this._update(s['max']);
188-
this._update(s['sec_max']);
189-
}
190-
};
152+
```sql
153+
CREATE AGGREGATE FUNCTION test_sec_large(value float32)
154+
RETURNS float32
155+
LANGUAGE JAVASCRIPT AS $$
156+
{
157+
initialize: function() {
158+
this.max = -1.0;
159+
this.sec = -1.0
160+
},
161+
process: function(values) {
162+
for (let i = 0; i < values.length; i++) {
163+
if (values[i] > this.max) {
164+
this.sec = this.max;
165+
this.max = values[i]
166+
}
167+
if (values[i] < this.max && values[i] > this.sec)
168+
this.sec = values[i];
169+
}
170+
},
171+
finalize: function() {
172+
return this.sec
173+
},
174+
serialize: function() {
175+
let s = {
176+
'max': this.max,
177+
'sec': this.sec
178+
};
179+
return JSON.stringify(s)
180+
},
181+
deserialize: function(state_str) {
182+
let s = JSON.parse(state_str);
183+
this.max = s['max'];
184+
this.sec = s['sec']
185+
},
186+
merge: function(state_str) {
187+
let s = JSON.parse(state_str);
188+
if (s['sec'] >= this.max) {
189+
this.max = s['max'];
190+
this.sec = s['sec']
191+
} else if (s['max'] >= this.max) {
192+
this.sec = this.max;
193+
this.max = s['max']
194+
} else if (s['max'] > this.sec) {
195+
this.sec = s['max']
196+
}
197+
}
198+
}
199+
$$;
191200
```
192201

193-
To register this function, steps are different in Timeplus Enterprise and Proton:
194-
195-
* With Timeplus UI: choose JavaScript as UDF type, make sure to turn on 'is aggregation'. Set the function name say `second_max` (you don't need to repeat the function name in JS code). Add one argument in `float` type and set return type to `float` too. Please note, unlike JavaScript scalar function, you need to put all functions under an object `{}`. You can define internal private functions, as long as the name won't conflict with native functions in JavaScript, or in the UDF lifecycle.
196-
* With SQL in Proton Client: check the example at [here](/js-udf#udaf).
202+
To register this function with Timeplus Console: choose JavaScript as the UDF type and turn on 'is aggregation'. Set a function name, such as `second_max` (you don't need to repeat the function name in the JS code). Add one argument of type `float` and set the return type to `float` as well. Note that, unlike a JavaScript scalar function, a UDAF must put all of its functions inside a single object `{}`. You can define internal private functions, as long as their names don't conflict with native JavaScript functions or with the UDF lifecycle functions.
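
The lifecycle functions can also be exercised locally before registering the UDAF. The following is a minimal sketch in plain JavaScript (for example with Node.js), using the same object body as the `test_sec_large` example. The helper `runPartial`, the sample batches, and the order in which the functions are called are assumptions made for illustration. They do not describe how the Timeplus engine actually schedules these calls.

```javascript
// Local sketch of the UDAF lifecycle; not how Timeplus itself schedules the calls.
const secLarge = {
  initialize: function () {
    this.max = -1.0;
    this.sec = -1.0;
  },
  process: function (values) {
    for (let i = 0; i < values.length; i++) {
      if (values[i] > this.max) {
        this.sec = this.max;
        this.max = values[i];
      }
      if (values[i] < this.max && values[i] > this.sec) {
        this.sec = values[i];
      }
    }
  },
  finalize: function () {
    return this.sec;
  },
  serialize: function () {
    return JSON.stringify({ max: this.max, sec: this.sec });
  },
  deserialize: function (state_str) {
    const s = JSON.parse(state_str);
    this.max = s["max"];
    this.sec = s["sec"];
  },
  merge: function (state_str) {
    const s = JSON.parse(state_str);
    if (s["sec"] >= this.max) {
      this.max = s["max"];
      this.sec = s["sec"];
    } else if (s["max"] >= this.max) {
      this.sec = this.max;
      this.max = s["max"];
    } else if (s["max"] > this.sec) {
      this.sec = s["max"];
    }
  }
};

// Hypothetical helper: builds one partial aggregation state from a batch of values.
function runPartial(values) {
  const state = Object.create(secLarge); // own state, shared lifecycle functions
  state.initialize();
  state.process(values);
  return state;
}

// Two partial states, as if the rows had been aggregated in two separate chunks.
const partialA = runPartial([3, 9, 5]);
const partialB = runPartial([7, 8]);

// Combine the serialized state of B into A, then read the final result.
partialA.merge(partialB.serialize());
const secondMax = partialA.finalize();
console.log(secondMax);

// Round trip: a fresh state restored from B's serialized form.
const restoredB = Object.create(secLarge);
restoredB.initialize();
restoredB.deserialize(partialB.serialize());
```

Because `initialize` starts both fields at `-1.0`, this sketch, like the example it copies, only gives meaningful results for inputs greater than `-1`.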
197203

198204
### Advanced Example for Complex Event Processing {#adv_udaf}
199205

docs/sql-create-function.md

Lines changed: 78 additions & 12 deletions
@@ -1,15 +1,5 @@
11
# CREATE FUNCTION
2-
At Timeplus, we leverage SQL to make powerful streaming analytics more accessible to a broad range of users. Without SQL, you have to learn and call low-level programming API, then compile/package/deploy them to get analytics results. This is a repetitive and tedious process, even for small changes.
3-
4-
But some developers have concerns that complex logic or systems integration are hard to express using SQL.
5-
6-
That's why we add User-Defined Functions (UDF) support in Timeplus. This enables users to leverage existing programming libraries, integrate with external systems, or just make SQL easier to maintain.
7-
8-
Timeplus Proton and Timeplus Enterprise support [SQL UDF](/) and [Local UDF in JavaScript](/js-udf). You can develop User-defined scalar functions (UDFs) in SQL, or develop UDFs or User-defined aggregate functions (UDAFs) with modern JavaScript (powered by V8). No need to deploy extra server/service for the UDF. More languages will be supported.
9-
10-
:::info
11-
In Timeplus Enterprise, the Python UDF will be ready soon.
12-
:::
2+
Timeplus supports four ways to develop and register UDFs. Please check the [UDF](/udf) page for an overview.
133

144
## SQL UDF
155
You can create or replace a SQL UDF, by specifying the function name, parameters and the expression.
@@ -27,6 +17,26 @@ CREATE OR REPLACE FUNCTION color_hex AS (r, g, b) -> '#'||hex(r)||hex(g)||hex(b)
2717

2818
[Learn More](/sql-udf)
2919

20+
## Remote UDF
21+
Register a webhook as the UDF. You may use any programming language/framework to develop/deploy the webhook. A good starting point is using AWS Lambda.
22+
23+
Syntax:
24+
```sql
25+
CREATE REMOTE FUNCTION udf_name(ip string) RETURNS string
26+
URL 'https://the_url'
27+
AUTH_METHOD 'none'
28+
```
29+
If you need to protect the endpoint and only accept requests with a certain HTTP header, you can use the AUTH_HEADER and AUTH_KEY settings, e.g.
30+
```sql
31+
CREATE REMOTE FUNCTION udf_name(ip string) RETURNS string
32+
URL 'https://the_url'
33+
AUTH_METHOD 'auth_header'
34+
AUTH_HEADER 'header_name'
35+
AUTH_KEY 'value';
36+
```
37+
38+
[Learn More](/remote-udf)
39+
3040
## JavaScript UDF
3141

3242
### UDF {#js-udf}
@@ -62,11 +72,13 @@ You can also add `EXECUTION_TIMEOUT <num>` to the end of the `CREATE FUNCTION` t
6272
You can add debug information via `console.log(..)` in the JavaScript UDF. The logs will be available in the server log files.
6373
:::
6474

75+
Check [more examples](js-udf#udf) of scalar functions with two or more arguments, or with no arguments.
76+
6577
### UDAF {#js-udaf}
6678

6779
Creating a user-defined-aggregation function (UDAF) requires a bit more effort. Please check [this documentation](/js-udf#udaf) for the 3 required and 3 optional functions.
6880

69-
```sql showLineNumbers
81+
```sql
7082
CREATE AGGREGATE FUNCTION test_sec_large(value float32)
7183
RETURNS float32
7284
LANGUAGE JAVASCRIPT AS $$
@@ -115,3 +127,57 @@ LANGUAGE JAVASCRIPT AS $$
115127
}
116128
$$;
117129
```
130+
131+
[Learn More](/js-udf)
132+
133+
## Python UDF
134+
Starting from v2.7, Timeplus Enterprise also supports Python-based UDF. You can develop User-defined scalar functions (UDFs) or User-defined aggregate functions (UDAFs) with the embedded Python 3.10 runtime in the Timeplus core engine. No need to deploy extra server/service for the UDF.
135+
136+
[Learn more](/py-udf) about why to use Python UDF, how data types map between Timeplus and Python, and how to manage dependencies.
137+
138+
### UDF {#py-udf}
139+
Syntax:
140+
```sql
141+
CREATE OR REPLACE FUNCTION udf_name(param1 type1,..)
142+
RETURNS type2 LANGUAGE PYTHON AS
143+
$$
144+
import …
145+
146+
def udf_name(col1..):
147+
148+
149+
$$
150+
SETTINGS ...
151+
```
152+
153+
### UDAF {#py-udaf}
154+
A UDAF, or User Defined Aggregation Function, is stateful. It takes one or more columns from a set of rows and returns the aggregated result.
155+
156+
Syntax:
157+
```sql
158+
CREATE OR REPLACE AGGREGATION FUNCTION uda_name(param1 type1,...)
159+
RETURNS type2 language PYTHON AS
160+
$$
161+
import ...
162+
class uda_name:
163+
def __init__(self):
164+
...
165+
166+
def serialize(self):
167+
...
168+
169+
def deserialize(self, data):
170+
...
171+
172+
def merge(self, data):
173+
...
174+
175+
def process(self, values):
176+
...
177+
def finalize(self):
178+
...
179+
$$
180+
SETTINGS ...
181+
```
182+
183+
[Learn More](/py-udf)

docs/udf.md

Lines changed: 2 additions & 0 deletions
@@ -28,3 +28,5 @@ Please note, there are many factors to determine the number of function calls. F
2828
Long story short, developers should not make assumptions about the number of function calls. User-defined scalar functions (UDFs) should be stateless. For User-defined aggregate functions (UDAFs), data might be aggregated more than once, but the final result is still correct.
2929

3030
:::
31+
32+
Check [CREATE FUNCTION](/sql-create-function) for how to create functions via SQL.
