The bdds_publication_client provides a command line interface to the publication capabilities of the Globus Data Publication service.
To use the client, the globus_sdk package must be on your PYTHONPATH. The globus_sdk is available from https://github.com/globusonline/globus-sdk-python.
You must also install the requests package using pip install requests.
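A minimal setup might look like the following; the clone location and paths are placeholders and should be adjusted for your environment:
git clone https://github.com/globusonline/globus-sdk-python.git
export PYTHONPATH=$PYTHONPATH:/path/to/globus-sdk-python
pip install requests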
General usage instructions can be generated by running:
bdds_publication_client.py --help
The arguments generally fall into three categories:
- Configuring the server location and authentication.
- Defining input values to the client operations.
- Specifying what operations to perform.
--service-url
Specifies the root URL for the REST API. The default is https://publish.globus.org/v1/api/, and if the default is correct, this option need not be specified.
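For example, to direct an operation (such as the --list-collections operation described below) at a different deployment of the service, where the URL shown is purely hypothetical:
bdds_publication_client.py --service-url https://publish.example.org/v1/api/ --list-collections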
--token-file
Names a file where a valid Globus Authentication token has been stored.
--nexus-token-file
Names a file where a Globus Nexus token is stored. Nexus tokens are expected to be deprecated shortly.
--generate-nexus-token
Can be used to generate a new Nexus token instead of using one stored in a file. The format username:password is used to provide the needed credentials for generating the token. The token will be output so that it can be stored in a file for subsequent use.
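For example, to generate a token and save it for later use with --nexus-token-file (the credentials and file name are placeholders, and the exact output format may require copying the token into the file by hand):
bdds_publication_client.py --generate-nexus-token myuser:mypassword > nexus-token.txt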
--collection-id
Specifies the id of the collection for a new dataset submission. Collection ids can be discovered using the --list-collections flag.
--metadata-file
Names a file where metadata for a dataset is stored. The metadata must be in JSON format with property names which match metadata properties already present in the Data Publication environment. Schemas may be inspected using the --list-schemas and --introspect-schema flags.
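A metadata file might look like the following sketch. The property names shown here are illustrative only; the actual property names must match the schema used by the target collection, which can be discovered with the schema flags described below:
{
  "title": "Example dataset",
  "contributor": "Jane Researcher",
  "description": "A short description of the dataset contents."
}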
--data-endpoint and --data-directory
Provide the location for data that is to be moved into the dataset. All data in the data directory will be recursively copied into the dataset.
--dataset-id
Specifies the id of a dataset to operate upon. If a dataset has been started using the --create-dataset operation (see below), the dataset id returned can be used for further operations such as transferring data into the dataset.
--transfer-id
Is used to specify a Globus Transfer task identifier for use with the --poll or --wait command line options.
--list-schemas and --introspect-schema
Are used to discover schemas deployed in the service and their content, in particular the names of the properties of the schema. These commands can be useful when preparing data to be specified with the --metadata-file argument.
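For example, to list the deployed schemas and then examine one of them (whether --introspect-schema takes a schema name as shown is an assumption; check --help for the exact syntax):
bdds_publication_client.py --list-schemas
bdds_publication_client.py --introspect-schema <schema-name>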
--list-collections
Provides a list of all collections visible to the user. The id of a collection is used with --collection-id when creating a new dataset. The collection listing also displays the license the user must comply with when submitting a dataset. Use of the client assumes compliance with the license.
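For example, assuming a Globus Authentication token has already been stored in a file named token.txt (a placeholder name; depending on the deployment, a Nexus token supplied with --nexus-token-file may also be required):
bdds_publication_client.py --token-file token.txt --list-collections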
--create-dataset
Creates a new dataset in the specified collection (--collection-id) and populates it with the specified metadata (--metadata-file). The dataset remains 'open' for further updates and for data to be transferred into the dataset (using --transfer-data). The dataset id required for further operations is displayed in the output.
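A typical invocation might look like the following, where the collection id, metadata file, and token file are placeholders:
bdds_publication_client.py --token-file token.txt --create-dataset --collection-id <collection-id> --metadata-file metadata.json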
--transfer-data
Performs a data transfer operation into the dataset specified (--dataset-id) from the data location named (--data-endpoint and --data-directory). A transfer task id is displayed in the output of this command.
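For example, where the dataset id, endpoint, directory, and token file are placeholders:
bdds_publication_client.py --token-file token.txt --transfer-data --dataset-id <dataset-id> --data-endpoint <endpoint-name> --data-directory /path/to/data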
--wait and --poll
Are used to ensure that data transfers have completed. --wait will not return until the named transfer task (--transfer-id) is complete. --poll N will poll until N seconds have elapsed or until the transfer is complete.
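For example, to block until a transfer finishes, or to poll it for up to 600 seconds (the transfer task id is a placeholder taken from the output of --transfer-data):
bdds_publication_client.py --token-file token.txt --wait --transfer-id <transfer-task-id>
bdds_publication_client.py --token-file token.txt --poll 600 --transfer-id <transfer-task-id>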
--submit
Moves the dataset into the collection's archive. Following the submit operation, no further data can be transferred, and no updates to the metadata may be made. The dataset may still require curation before it enters the archive, depending on the policy of the collection.
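For example, where the dataset id and token file are placeholders:
bdds_publication_client.py --token-file token.txt --submit --dataset-id <dataset-id>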
--delete-dataset
Removes a dataset (--dataset-id) from the submission process. If the dataset has already been submitted using the --submit operation, it cannot be removed.
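For example, where the dataset id and token file are placeholders:
bdds_publication_client.py --token-file token.txt --delete-dataset --dataset-id <dataset-id>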