serializeToBagit ignores DataObjects that remotely reference data #122

@mbjones

A DataObject can include a dataURL indicating that the bytes of the object are stored remotely on another server, rather than in memory or on the local filesystem (the other two options). When serializing a DataPackage to disk in BagIt format, the serializeToBagit function skips any data object that uses the dataURL slot as its reference to data, which breaks this serialization for packages that contain such objects.

To fix, either:

  • during creation of the BagIt, download remote objects and serialize them like others
  • during creation of the BagIt, serialize remotely referenced data objects by reference in the fetch.txt file, preserving them as remote
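The second option can be sketched as follows. This is a minimal illustration in Python, not the package's own code: `ObjectRef` and `fetch_lines` are hypothetical names, and the only thing taken from the BagIt specification is the fetch.txt line layout of `url length filepath`, where the length may be `-` when it is unknown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ObjectRef:
    """Hypothetical stand-in for a DataObject; not the package's API."""
    filename: str
    data_url: Optional[str] = None  # set when the bytes live on another server
    size: Optional[int] = None      # byte length, if known


def fetch_lines(objects: List[ObjectRef]) -> List[str]:
    """Build fetch.txt lines ("url length filepath") for remote objects only."""
    lines = []
    for obj in objects:
        if obj.data_url is None:
            # Local and in-memory objects are written into data/ as usual.
            continue
        length = str(obj.size) if obj.size is not None else "-"
        lines.append(f"{obj.data_url} {length} data/{obj.filename}")
    return lines
```

The first option would instead download each object that has a `data_url` and write its bytes under `data/` like any other object, leaving fetch.txt empty.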

The challenge with the second approach is that we still need checksums for the remote objects. Technically, the checksum should be in the SystemMetadata for the DataObject, but it is likely that it was never calculated. If the remote object is a DataONE object, then its SystemMetadata should have the needed checksum.
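One way to handle the checksum problem is to reuse a recorded checksum when it exists and matches the manifest's algorithm, and to fall back to streaming the remote bytes otherwise. The sketch below is in Python and makes several assumptions: `manifest_checksum` is a hypothetical helper, `recorded` stands in for whatever algorithm and value the SystemMetadata holds, and `open_remote` is any callable that returns a readable binary stream for the object.

```python
import hashlib
from typing import BinaryIO, Callable, Optional, Tuple


def _normalize(algorithm: str) -> str:
    """Map names such as "SHA-256" to hashlib-style names such as "sha256"."""
    return algorithm.lower().replace("-", "")


def manifest_checksum(
    recorded: Optional[Tuple[str, str]],
    open_remote: Callable[[], BinaryIO],
    algorithm: str = "sha256",
) -> str:
    """Return the hex checksum to write in the bag manifest for one object.

    recorded: (algorithm, value) from metadata, or None if never calculated.
    open_remote: hypothetical callable that opens the remote bytes as a stream.
    """
    if recorded is not None and _normalize(recorded[0]) == _normalize(algorithm):
        # The metadata already holds a usable checksum; no download needed.
        return recorded[1].lower()
    # Otherwise the bytes have to be read once to compute the checksum.
    digest = hashlib.new(_normalize(algorithm))
    with open_remote() as stream:
        for chunk in iter(lambda: stream.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Note that the fallback path reads the whole object, so for objects without a recorded checksum the second approach costs one download at bag creation time even though the bytes are not stored in the bag.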

Relates to issues #3 and #119.

Labels

bug, critical
