### CHANGELOG.md (+2 −1)

````diff
@@ -3,10 +3,11 @@
 ## [Unreleased]
 
 - Implement [HttpSink](src/main/java/com/getindata/connectors/http/sink/HttpSink.java) deriving from [AsyncSinkBase](https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink) introduced in Flink 1.15.
+- Add support for Table API in HttpSink in the form of [HttpDynamicSink](src/main/java/com/getindata/connectors/http/table/HttpDynamicSink.java).
 
 ## [0.1.0] - 2022-05-26
 
-- Implement baisc support for Http connector for Flink SQL
+- Implement basic support for Http connector for Flink SQL
````
### README.md (+76 −26)

````diff
@@ -1,22 +1,26 @@
 # flink-http-connector
-The HTTP TableLookup connector that allows for pulling data from external system via HTTP GET method.
-The goal for this connector was to use it in Flink SQL statement as a standard table that can be later joined with other stream using pure SQL Flink.
+The HTTP TableLookup connector that allows for pulling data from external system via HTTP GET method and HTTP Sink that allows for sending data to external system via HTTP requests.
+
+#### HTTP TableLookup Source
+The goal for HTTP TableLookup connector was to use it in Flink SQL statement as a standard table that can be later joined with other stream using pure SQL Flink.
 
-Currently, connector supports only Lookup Joins [1] and expects JSON as a response body.
+Currently, HTTP TableLookup connector supports only Lookup Joins [1] and expects JSON as a response body. It also supports only the STRING types.
 
-Connector supports only STRING types.
+#### HTTP Sink
+`HttpSink` supports both the Streaming API (when using [HttpSink](src/main/java/com/getindata/connectors/http/sink/HttpSink.java) built using [HttpSinkBuilder](src/main/java/com/getindata/connectors/http/sink/HttpSinkBuilder.java)) and the Table API (using the connector created in [HttpDynamicTableSinkFactory](src/main/java/com/getindata/connectors/http/table/HttpDynamicTableSinkFactory.java)).
 
 ## Prerequisites
 * Java 11
 * Maven 3
 * Flink 1.15+
 
-## Implementation
-Implementation is based on Flink's `TableFunction` and `AsyncTableFunction` classes.
-To be more specific we are using a `LookupTableSource`. Unfortunately Flink's new unified source interface [2] cannot be used for this type of source.
-Issue was discussed on Flink's user mailing list - https://lists.apache.org/thread/tx2w1m15zt5qnvt924mmbvr7s8rlyjmw
+## Installation
+
+In order to use the `flink-http-connector` the following dependencies are required for both projects using a build automation tool (such as Maven or SBT) and SQL Client with SQL JAR bundles. For build automation tool reference, look into Maven Central: [https://mvnrepository.com/artifact/com.getindata/flink-http-connector](https://mvnrepository.com/artifact/com.getindata/flink-http-connector).
````
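To make the Installation note concrete, a Maven dependency declaration might look like the following sketch. The `groupId` and `artifactId` come from the Maven Central link above; the version is a placeholder to be replaced with the latest release:

```xml
<dependency>
    <groupId>com.getindata</groupId>
    <artifactId>flink-http-connector</artifactId>
    <!-- placeholder: pick the current version from Maven Central -->
    <version>LATEST_RELEASE</version>
</dependency>
```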
````diff
 ## Usage
+
+### HTTP TableLookup Source
 Flink SQL table definition:
 
 ```roomsql
````
````diff
@@ -48,28 +52,70 @@ For Example:
 http://localhost:8080/client/service?id=1&uuid=2
 ```
 
+### HTTP Sink
+The following example shows the minimal Table API setup to create a [HttpDynamicSink](src/main/java/com/getindata/connectors/http/table/HttpDynamicSink.java) that writes JSON values to an HTTP endpoint using the POST method, assuming Flink has the JAR of the [JSON serializer](https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/table/formats/json/) installed:
+
+```roomsql
+CREATE TABLE http (
+  id bigint,
+  some_field string
+) WITH (
+  'connector' = 'http-sink',
+  'url' = 'http://example.com/myendpoint',
+  'format' = 'json'
+)
+```
+
+Then use an `INSERT` SQL statement to send data to your HTTP endpoint:
+
+```roomsql
+INSERT INTO http VALUES (1, 'Ninette'), (2, 'Hedy')
+```
+
+Because `HttpSink` sends the serialized bytes inside the HTTP request's body, one can easily swap `'format' = 'json'` for some other [format](https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/table/formats/overview/).
+
+Other examples of Table API usage can be found in [some tests](src/test/java/com/getindata/connectors/http/table/HttpDynamicSinkInsertTest.java).
+
+## Implementation
+The implementation of the HTTP source connector is based on Flink's `TableFunction` and `AsyncTableFunction` classes.
+To be more specific, we are using a `LookupTableSource`. Unfortunately, Flink's new unified source interface [2] cannot be used for this type of source. The issue was discussed on Flink's user mailing list - https://lists.apache.org/thread/tx2w1m15zt5qnvt924mmbvr7s8rlyjmw
+
+The implementation of the HTTP Sink is based on Flink's `AsyncSinkBase` introduced in Flink 1.15 [3, 4].
+
````
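The `AsyncSinkBase` pattern the HTTP Sink builds on — buffer incoming elements, then flush them downstream as a batch once a size threshold is hit — can be illustrated with a small framework-free sketch. This is not the connector's actual code; the class, names, and thresholds are purely illustrative:

```python
from typing import Callable, List


class BatchingHttpBuffer:
    """Toy illustration of the buffer-then-flush pattern behind AsyncSinkBase.

    Elements are collected until max_batch_size is reached, then handed to
    send_batch (which a real sink would turn into one HTTP request).
    """

    def __init__(self, max_batch_size: int, send_batch: Callable[[List[str]], None]):
        self.max_batch_size = max_batch_size
        self.send_batch = send_batch
        self.buffer: List[str] = []

    def write(self, element: str) -> None:
        self.buffer.append(element)
        if len(self.buffer) >= self.max_batch_size:
            self.flush()

    def flush(self) -> None:
        # Hand off the current batch and start a fresh buffer.
        if self.buffer:
            self.send_batch(self.buffer)
            self.buffer = []


sent: List[List[str]] = []
sink = BatchingHttpBuffer(max_batch_size=2, send_batch=sent.append)
for record in ['{"id": 1}', '{"id": 2}', '{"id": 3}']:
    sink.write(record)
sink.flush()  # flush the trailing partial batch
# sent == [['{"id": 1}', '{"id": 2}'], ['{"id": 3}']]
```

The real sink additionally bounds in-flight requests and flushes on a timeout (see the `sink.*` connector options below in the README); this sketch only shows the size-triggered batching.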
````diff
 ## Http Response to Table schema mapping
-The mapping from Http Json Response to SQL table schema is done via Json Paths [3].
+The mapping from Http Json Response to SQL table schema is done via Json Paths [5].
 This is achieved thanks to the `com.jayway.jsonpath:json-path` library.
 
 If no `root` or `field.#.path` option is defined, the connector will use the column name as the json path and will try to look for a Json node with that name in the received Json. If no node with a given name is found, the connector will return `null` as the value for this field.
 
 If the `field.#.path` option is defined, the connector will use the given Json path from the option's value in order to find the Json data that should be used for this column.
 For example `'field.isActive.path' = '$.details.isActive'` - the value for table column `isActive` will be taken from the `$.details.isActive` node of the received Json.
````
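The column-to-JSON mapping described above can be sketched without the json-path dependency for simple dotted paths. This is a deliberate simplification — real Json Path (and the jayway library the connector uses) supports far more than `$.a.b` chains — and all names here are illustrative:

```python
import json


def resolve_path(document, path: str):
    """Resolve a simple '$.a.b'-style path; return None if any node is missing."""
    node = document
    for key in path.lstrip("$.").split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # mirrors the connector returning null for a missing node
        node = node[key]
    return node


response = json.loads('{"id": 1, "details": {"isActive": true}}')

# No path configured: the column name itself is used as the path.
assert resolve_path(response, "$.id") == 1
# 'field.isActive.path' = '$.details.isActive'
assert resolve_path(response, "$.details.isActive") is True
# Missing node -> None (SQL NULL).
assert resolve_path(response, "$.details.balance") is None
```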
````diff
-## Connector Options
-| Option | Required | Description/Value|
-| -------------- | ----------- | -------------- |
-| connector | required | The Value should be set to _rest-lookup_|
-| url | required | The base URL that should be use for GET requests. For example _http://localhost:8080/client_|
-| asyncPolling | optional | true/false - determines whether Async Pooling should be used. Mechanism is based on Flink's Async I/O.|
-| root | optional | Sets the json root node for entire table. The value should be presented as Json Path [3], for example `$.details`.|
-| field.#.path | optional | The Json Path from response model that should be use for given `#` field. If `root` option was defined it will be added to field path. The value must be presented in Json Path format [3], for example `$.details.nestedDetails.balance`|
+## Connector Options
+
+### HTTP TableLookup Source
+| Option | Required | Description/Value |
+| ------------ | -------- | ----------------- |
+| connector | required | The value should be set to _rest-lookup_. |
+| url | required | The base URL that should be used for GET requests. For example _http://localhost:8080/client_. |
+| asyncPolling | optional | true/false - determines whether Async Polling should be used. The mechanism is based on Flink's Async I/O. |
+| root | optional | Sets the json root node for the entire table. The value should be presented as Json Path [5], for example `$.details`. |
+| field.#.path | optional | The Json Path from the response model that should be used for the given `#` field. If the `root` option was defined, it will be added to the field path. The value must be presented in Json Path format [5], for example `$.details.nestedDetails.balance`. |
+
+### HTTP Sink
+| Option | Required | Description/Value |
+| -------------------------- | -------- | ----------------- |
+| connector | required | Specifies which connector to use. For the HTTP Sink it should be set to _'http-sink'_. |
+| url | required | The base URL that should be used for HTTP requests. For example _http://localhost:8080/client_. |
+| format | required | Specifies which format to use. |
+| insert-method | optional | Specifies which HTTP method to use in the request. The value should be set either to `POST` or `PUT`. |
+| sink.batch.max-size | optional | Maximum number of elements that may be passed in a batch to be written downstream. |
+| sink.requests.max-inflight | optional | The maximum number of in-flight requests that may exist at once; once the maximum is reached, further requests are blocked until some have completed. |
+| sink.requests.max-buffered | optional | Maximum number of buffered records before applying backpressure. |
+| sink.flush-buffer.size | optional | The maximum size, in bytes, of a batch of entries that may be sent to the HTTP endpoint. |
+| sink.flush-buffer.timeout | optional | Threshold time in milliseconds for an element to be in a buffer before being flushed. |
````
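Putting several of the sink options above together, a fuller table definition might look like the following sketch (the endpoint and the option values are illustrative, not defaults):

```roomsql
CREATE TABLE http (
  id bigint,
  some_field string
) WITH (
  'connector' = 'http-sink',
  'url' = 'http://example.com/myendpoint',
  'format' = 'json',
  'insert-method' = 'PUT',
  'sink.batch.max-size' = '100',
  'sink.flush-buffer.timeout' = '1000'
)
```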
````diff
 
 ## Build and deployment
-Currently, we are not publishing this artifact to any repository. The CI/CD configuration is also next thing to do.
-To
-
 To build the project locally you need to have `maven 3` and Java 11+. </br>
 
 Project build command: `mvn package`. </br>
````
````diff
@@ -83,7 +129,7 @@ It will start HTTP server listening on `http://localhost:8080/client`
 Steps to follow:
 - Run Mock HTTP server from `HttpStubApp::main` method.
 - Start your Flink cluster, for example as described under https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/try-flink/local_installation/
````
### pom.xml (+10 −1)

````diff
@@ -27,7 +27,7 @@ under the License.
 <packaging>jar</packaging>
 
 <name>flink-http-connector</name>
-<description>The HTTP TableLookup connector that allows for pulling data from external system via HTTP GET method. The goal for this connector was to use it in Flink SQL statement as a standard table that can be later joined with other stream using pure SQL Flink.</description>
+<description>The HTTP TableLookup connector that allows for pulling data from external system via HTTP GET method and HTTP Sink that allows for sending data to external system via HTTP requests. The goal for HTTP TableLookup connector was to use it in Flink SQL statement as a standard table that can be later joined with other stream using pure SQL Flink.</description>
````