Commit 87c2895

Fixed statement in query restrictions
The RowBatcher restrictions on calls such as groupBy/limit/orderBy don't apply to the Spark connector. The connector pushes those operations down to MarkLogic, but if more than one call is made to MarkLogic, Spark applies them as well. This avoids the RowBatcher problem where those calls are applied only to each batch and not to the entire set of matching rows.
1 parent 0fc8206 commit 87c2895

1 file changed: +15 -5 lines changed


docs/reading.md

Lines changed: 15 additions & 5 deletions
@@ -48,11 +48,21 @@ query expansion via [a thesaurus](https://docs.marklogic.com/guide/search-dev/th

 ## Optic query requirements

-As of the 2.0 release of the connector, the Optic query must use the
-[op.fromView](https://docs.marklogic.com/op.fromView) accessor function. The query must also adhere to the
-restrictions that the
-[RowBatcher in the Data Movement SDK](https://github.com/marklogic/java-client-api/wiki/Row-Batcher#building-a-plan-for-exporting-the-view)
-adheres to as well.
+As of the 2.0.0 release of the connector, the Optic query must use the
+[op.fromView](https://docs.marklogic.com/op.fromView) accessor function. Future releases of both the connector and
+MarkLogic will strive to relax this requirement.
+
+In addition, calls to `groupBy`, `orderBy`, `limit`, and `offset` should be performed via Spark instead of within
+the initial Optic query. A key benefit of Spark and the MarkLogic connector is the ability to execute the query in
+parallel via multiple Spark partitions. The aforementioned calls, if made in the Optic query, may not produce the
+expected results if more than one Spark partition is used or if more than one request is made to MarkLogic. The
+equivalent Spark operations should be called instead, or the connector should be configured to make a single request
+to MarkLogic. See the "Pushing down operations" and "Tuning performance" sections below for more information.
+
+Finally, the query must adhere to the handful of limitations imposed by the
+[Optic Query DSL](https://docs.marklogic.com/guide/app-dev/OpticAPI#id_46710). A good practice in validating a
+query is to run it in your [MarkLogic server's qconsole tool](https://docs.marklogic.com/guide/qconsole) in a buffer
+with a query type of "Optic DSL".

 ## Schema inference

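The per-batch problem described in the commit message can be illustrated with a small, self-contained simulation. This is only a sketch in plain Python, not connector or RowBatcher code: the function names (`split_into_batches`, `limit_per_batch`, `limit_globally`, `pushdown_then_reapply`) and the sample rows are hypothetical, and the batches stand in for separate requests to MarkLogic. It covers only an `orderBy` followed by a `limit`.

```python
from typing import List


def split_into_batches(rows: List[int], batch_count: int) -> List[List[int]]:
    """Deal rows round-robin into batches (a stand-in for separate requests)."""
    return [rows[i::batch_count] for i in range(batch_count)]


def limit_per_batch(batches: List[List[int]], n: int) -> List[int]:
    """Apply orderBy + limit inside each batch only, then concatenate."""
    result: List[int] = []
    for batch in batches:
        result.extend(sorted(batch)[:n])
    return result


def limit_globally(batches: List[List[int]], n: int) -> List[int]:
    """Apply orderBy + limit once over the entire set of matching rows."""
    merged = [row for batch in batches for row in batch]
    return sorted(merged)[:n]


def pushdown_then_reapply(batches: List[List[int]], n: int) -> List[int]:
    """Apply the operation per batch, then apply it again to the combined rows."""
    return sorted(limit_per_batch(batches, n))[:n]


# Hypothetical sample data: eight rows split across two batches.
rows = [9, 3, 7, 1, 8, 2, 6, 4]
batches = split_into_batches(rows, 2)

per_batch = limit_per_batch(batches, 2)        # too many rows, not globally ordered
expected = limit_globally(batches, 2)          # the result the user intended
reapplied = pushdown_then_reapply(batches, 2)  # matches the intended result here
```

In this sketch, applying the limit per batch returns four rows instead of two, while re-applying the same operation to the combined rows recovers the intended answer. That mirrors, under the assumptions above, why the commit message says that having Spark apply the operations again avoids the per-batch problem.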