4 files changed, +26 -1 lines changed
@@ -16,4 +16,7 @@ The connector has the following system requirements:
* For writing data, MarkLogic 9.0-9 or higher.
* For reading data, MarkLogic 10.0-9 or higher.

+ In addition, if your MarkLogic cluster has multiple hosts in it, it is highly recommended to put a load balancer in front
+ of your cluster and have the MarkLogic Spark connector connect through the load balancer.
+
Please see the [Getting Started guide](getting-started/getting-started.md) to begin using the connector.

@@ -252,9 +252,17 @@ with more partition readers and a higher batch size.
You can also adjust the level of parallelism by controlling how many threads Spark uses for executing partition reads.
Please see your Spark distribution's documentation for further information.

+ ### Using a load balancer
+
+ If your MarkLogic cluster has multiple hosts, it is highly recommended to put a load balancer in front
+ of your cluster and have the connector connect through the load balancer. A typical load balancer will help ensure
+ not only that load is spread across the hosts in your cluster, but that any network or connection failures can be
+ retried without the error propagating to the connector.
+
### Direct connections to hosts

- If your Spark program is able to connect to each host in your MarkLogic cluster, you can set the
+ If you do not have a load balancer in front of your MarkLogic cluster, and your Spark program is able to connect to
+ each host in your MarkLogic cluster, you can set the
`spark.marklogic.client.connectionType` option to `direct`. Each partition reader will then connect to the
host on which the reader's assigned forest resides. This will typically improve performance by reducing the network
traffic, as the host that receives a request will not need to involve any other host in the processing of that request.
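
The two connection styles described in this hunk can be sketched as a small option builder. This is an illustration only and is not part of the commit. The helper name `client_options`, the host names, the port, and the credentials are all hypothetical. The `spark.marklogic.client.uri` option and its `user:password@host:port` layout are assumptions not shown in this diff; only `spark.marklogic.client.connectionType` with the value `direct` comes from the text above.

```python
def client_options(host, port, username, password, direct=False):
    """Build a dict of connector options for one connection style.

    host is the load balancer address when one sits in front of the
    cluster, or a MarkLogic host when connecting directly.
    """
    options = {
        # Assumed option name and URI layout (not shown in the diff).
        "spark.marklogic.client.uri": f"{username}:{password}@{host}:{port}",
    }
    if direct:
        # From the text above: only appropriate when there is no load
        # balancer and Spark can reach every host in the cluster.
        options["spark.marklogic.client.connectionType"] = "direct"
    return options


# Hypothetical hosts and credentials, for illustration only.
via_load_balancer = client_options("lb.example.com", 8003, "spark-user", "password")
direct_to_hosts = client_options(
    "ml-host-1.example.com", 8003, "spark-user", "password", direct=True
)

print(via_load_balancer)
print(direct_to_hosts)
```

In a PySpark program, a dict like this could be handed to a reader with `DataFrameReader.options(**via_load_balancer)`, assuming the connector is on the classpath and registered under the format name your connector version documents.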

@@ -257,6 +257,13 @@ The effectiveness of this approach can be evaluated by executing the Optic query
[MarkLogic's qconsole application](https://docs.marklogic.com/guide/qconsole/intro), which will execute the query in
a single request as well.

+ ### Using a load balancer
+
+ If your MarkLogic cluster has multiple hosts, it is highly recommended to put a load balancer in front
+ of your cluster and have the connector connect through the load balancer. A typical load balancer will help ensure
+ not only that load is spread across the hosts in your cluster, but that any network or connection failures can be
+ retried without the error propagating to the connector.
+
### More detail on partitions

This section is solely informational and is not required understanding for using the connector

@@ -233,6 +233,13 @@ The rule of thumb above can thus be expressed as:

Number of partitions * Value of spark.marklogic.write.threadCount <= Number of hosts * number of app server threads

+ ### Using a load balancer
+
+ If your MarkLogic cluster has multiple hosts, it is highly recommended to put a load balancer in front
+ of your cluster and have the connector connect through the load balancer. A typical load balancer will help ensure
+ not only that load is spread across the hosts in your cluster, but that any network or connection failures can be
+ retried without the error propagating to the connector.
+
### Error handling

The connector may throw an error during one of two phases of operation - before it begins to write data to MarkLogic,
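
The write-side rule of thumb quoted in the last hunk (number of partitions * `spark.marklogic.write.threadCount` <= number of hosts * number of app server threads) is simple arithmetic, so a quick check may help when sizing a job. The sketch below is not part of the commit. The function name and the sample numbers are hypothetical, and the app server thread count is whatever your MarkLogic app server is configured with.

```python
def within_rule_of_thumb(partitions, thread_count, hosts, app_server_threads):
    """Return True when partitions * thread_count <= hosts * app_server_threads."""
    return partitions * thread_count <= hosts * app_server_threads


# Hypothetical sizing: a 3-host cluster with 32 app server threads per host.
print(within_rule_of_thumb(8, 8, 3, 32))   # 64 <= 96, prints True
print(within_rule_of_thumb(16, 8, 3, 32))  # 128 <= 96 fails, prints False
```

When the check fails, the inequality suggests lowering either the partition count or the write thread count until the left-hand side fits.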