Exporting data
New in 3.3.0, ml-gradle provides several tasks for easily exporting any set of documents in MarkLogic to either a single file/zip or many files/zips. These tasks make use of the DMSDK Jobs provided by the ml-javaclient-util library.
The tasks provided by 3.3.0 are:
- mlExportToFile
- mlExportToZip
- mlExportBatchesToDirectory
- mlExportBatchesToZips
Each export task can be configured via several properties. To see the available properties for a task, run it with "-PjobProperties" (no value needed), for example:
gradle mlExportToFile -PjobProperties
All documents selected by a query can be exported to a single file via mlExportToFile:
gradle mlExportToFile -PexportPath=export.xml -PwhereCollections=example
This task is similar to the other DMSDK tasks in that a "where" property is required to specify the documents to export. "whereCollections", "whereUriPattern", and "whereUrisQuery" are the currently supported properties, e.g.:
gradle mlExportToFile -PwhereUriPattern=*.xml
gradle mlExportToFile -PwhereUrisQuery="cts:element-value-query(xs:QName('hello'), 'world')"
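A "where" value that contains quotes or parentheses, such as the cts query above, has to be quoted so the shell passes it through as a single argument. The following is a minimal dry-run sketch that only prints command lines rather than running them; the helper name `export_cmd` and the output file names are hypothetical, and only properties shown on this page are used.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) an mlExportToFile command line.
# $1 = name of the "where" property, $2 = its value, $3 = export path.
export_cmd() {
  printf 'gradle mlExportToFile -PexportPath=%s -P%s="%s"\n' "$3" "$1" "$2"
}

# One command per supported "where" property; values are wrapped in double
# quotes so that single quotes and parentheses survive the shell.
export_cmd whereCollections example by-collection.xml
export_cmd whereUriPattern '*.xml' by-pattern.xml
export_cmd whereUrisQuery "cts:element-value-query(xs:QName('hello'), 'world')" by-query.xml
```

Reviewing the printed commands before running them is a cheap way to catch quoting mistakes.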
This export capability simply wraps existing DMSDK functionality, specifically the ExportToWriterListener class, so you can use some of the properties on that class, e.g.:
gradle mlExportToFile -PrecordPrefix="<wrapper>" -PrecordSuffix="</wrapper>" -PwhereCollections=example
You can also specify content to be written to the beginning and end of the file:
gradle mlExportToFile -PwhereCollections=example -PfileHeader="<results>" -PfileFooter="</results>"
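To make the relationship between the four wrapping properties concrete, here is a small shell illustration of the layout they describe: the header once, each record wrapped in the prefix and suffix, then the footer once. It does not call ml-gradle. The function name `wrap_export` and the sample documents are hypothetical, and the line breaks between pieces are an assumption made for readability, not documented behaviour.

```shell
#!/bin/sh
# Illustration only: mimic how fileHeader, recordPrefix, recordSuffix, and
# fileFooter surround exported documents. Newlines are an assumption.
# Usage: wrap_export HEADER PREFIX SUFFIX FOOTER DOC...
wrap_export() {
  header=$1; prefix=$2; suffix=$3; footer=$4
  shift 4
  printf '%s\n' "$header"
  for doc in "$@"; do
    # Each document is wrapped individually.
    printf '%s%s%s\n' "$prefix" "$doc" "$suffix"
  done
  printf '%s\n' "$footer"
}

wrap_export "<results>" "<wrapper>" "</wrapper>" "</results>" \
  "<hello>world</hello>" "<hello>again</hello>"
```

With XML documents, a header and footer like these turn the concatenated records into a single well-formed document.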
With mlExportToFile, you can reference a REST API transform. This enables exporting data to CSV: write a transform that converts each document to the exact CSV that you want (you can load that transform with ml-gradle as well):
gradle mlExportToFile -Ptransform=my-csv-transform -PwhereCollections=example
You can use ml-gradle to stub out that transform first, setting transformType to one of sjs, xqy, or xsl:
gradle mlCreateTransform -PtransformName=my-csv-transform -PtransformType=sjs
Of course, the REST API transform can produce any content that you want.
All documents selected by a query can be exported to a single zip via mlExportToZip:
gradle mlExportToZip -PexportPath=export.zip -PwhereCollections=example
As with exporting to a file, you can apply a transform to each document:
gradle mlExportToZip -PexportPath=export.zip -PwhereCollections=example -Ptransform=my-transform
Each document's URI is used as the name of its zip entry. The URI can be "flattened", i.e. everything up to and including the last "/" will be dropped:
gradle mlExportToZip -PexportPath=export.zip -PwhereCollections=example -PflattenUri=true
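The flattening rule can be illustrated with a shell parameter expansion. This is only a sketch of the rule as described here, not the code ml-gradle runs; the function name `flatten_uri` and the sample URIs are hypothetical.

```shell
#!/bin/sh
# Illustration of "flattening": drop everything up to and including the
# last "/" in a URI. A URI without any "/" is returned unchanged.
flatten_uri() {
  printf '%s\n' "${1##*/}"
}

flatten_uri /example/2017/doc1.xml   # prints doc1.xml
flatten_uri doc2.xml                 # prints doc2.xml
```

Under this rule, two documents such as /a/doc.xml and /b/doc.xml would flatten to the same entry name. How the task handles that case is not stated here, so it is worth checking before flattening a set of documents with overlapping file names.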
You can also provide a prefix on each zip entry:
gradle mlExportToZip -PexportPath=export.zip -PwhereCollections=example -PuriPrefix=/my-prefix