Releases: caltechlibrary/dataset

Performance improvement

18 Sep 16:45
Pre-release

Performance improvement for object creation. Namaste is now only created on collection init. Added the Go func CreateObjectsJSON() for collections. This func provides batch object creation using a default JSON source for created objects. It avoids writing collection metadata until all objects are created. For large data imports this saves writing collection.json on each individual object creation. This was added at the Go level to improve the performance of the make_objects() func in libdataset.
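The saving described above can be illustrated with a minimal sketch. The `collection` type, its fields and its methods below are hypothetical stand-ins, not the dataset package's API; the sketch only counts how often metadata would be rewritten under each approach.

```go
package main

import "fmt"

// collection is a hypothetical stand-in for a dataset collection.
// It is NOT the real dataset type.
type collection struct {
	keys           []string
	metadataWrites int // counts simulated rewrites of collection.json
}

// saveMetadata simulates rewriting collection.json.
func (c *collection) saveMetadata() {
	c.metadataWrites++
}

// createOne adds a single key and saves metadata immediately.
func (c *collection) createOne(key string) {
	c.keys = append(c.keys, key)
	c.saveMetadata()
}

// createBatch adds every key first and saves metadata once at the end.
func (c *collection) createBatch(keys []string) {
	c.keys = append(c.keys, keys...)
	c.saveMetadata()
}

func main() {
	keys := []string{"a", "b", "c"}

	perObject := &collection{}
	for _, k := range keys {
		perObject.createOne(k)
	}

	batched := &collection{}
	batched.createBatch(keys)

	fmt.Println("per-object metadata writes:", perObject.metadataWrites)
	fmt.Println("batched metadata writes:", batched.metadataWrites)
}
```

With per-object creation the number of metadata writes grows with the number of objects, while the batched form writes once regardless of batch size.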

libdataset additions

17 Sep 21:31
Pre-release

This release adds two functions to libdataset, make_objects() and update_objects(), which let you create and update objects in batch. This is helpful if you are scripting the ingest of large numbers of objects into dataset at once.

libdataset was compiled with Golang v1.13 on Linux, Windows 10 and macOS on Intel. https://github.com/tpoechtrager/wclang

Fix bug issue #96

19 Jul 17:21
Pre-release

Fixed a bug in the AttachFile() function which was leaving us with empty attachments.

Bug fix, issue #95, attachment problem.

15 Jul 18:07
Pre-release

This is a bug fix release in preparation for the v1.0.0 release. The AttachStream() func failed to actually write the attachment's content. Cleaned up the code and corrected a missing assignment after reading the io.Reader buffer.

Release candidate 4 for v1.0.0

26 Jun 16:51
Pre-release

Bug fixes in libdataset used by py_dataset. Testing cross compile from Linux to Windows and Mac OS X for libdataset.*

Release 3 candidate for v1.0.0

25 Jun 22:02
Pre-release

In this release the data frame was refactored to drop the Grid attribute. You can get a grid from a data frame by using the Grid() function on the frame. Added func Objects() to frame for returning a copy of the DataFrame.ObjectList values. Updated libdataset to reflect this change, adding both Objects() and Grid() funcs. Labels MUST now be provided in the frame definition, and the dot path and label lists must be the same length. If the first element of the dot paths is NOT "._Key", it will be prefixed to the dot path list automatically and a "_Key" label will be prefixed to the labels list before generating the frame. "._Key" and "_Key" will therefore always be first in the lists of dot paths and labels. This is less critical now that the Grid 2D array has been dropped, but it is always reflected in the output of the Grid() function since the definitions of dot paths and labels enforce it.
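The frame definition rule can be sketched as a small normalization step. The function name `normalizeFrameDef` and the error text are assumptions made for illustration; this is not the dataset package's implementation, only a sketch of the rule as stated in these notes.

```go
package main

import (
	"errors"
	"fmt"
)

// normalizeFrameDef is a hypothetical helper. It checks that dot paths and
// labels have the same length and, when the first dot path is not "._Key",
// prefixes "._Key" and "_Key" to the two lists.
func normalizeFrameDef(dotPaths, labels []string) ([]string, []string, error) {
	if len(dotPaths) != len(labels) {
		return nil, nil, errors.New("dot paths and labels must have the same length")
	}
	if len(dotPaths) > 0 && dotPaths[0] == "._Key" {
		// The key column is already first; return the lists unchanged.
		return dotPaths, labels, nil
	}
	outPaths := append([]string{"._Key"}, dotPaths...)
	outLabels := append([]string{"_Key"}, labels...)
	return outPaths, outLabels, nil
}

func main() {
	paths, labels, err := normalizeFrameDef(
		[]string{".title", ".family_name"},
		[]string{"one", "two"},
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(paths)
	fmt.Println(labels)
}
```

The sketch leaves the lists untouched when "._Key" is already first and does not check the label paired with it; how dataset itself handles that case is not stated in these notes.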

The command line dataset has been updated to reflect the changes in frame definition requirements. To manage the command line arguments, you form each label/dot path pair by joining them with an equal sign. Labels are used as the keys in the objects of a frame, while the dot path specifies where the value comes from in the collection's JSON objects. This makes a command line frame definition look something like this:

    cat keys.txt | dataset frame collection.ds MyFrame one=.title two=.family_name three=.given_name

The frame "MyFrame" will contain an object list with the keys of "one", "two" and "three" with values from
the dot paths ".title, .family_name, .given_name".
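As a rough illustration of how such label/dot path arguments could be split apart, here is a minimal sketch. The function name `parsePairs` is hypothetical and this is not the parser used by the dataset command; it only assumes that each argument has the form `label=dotpath`.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePairs is a hypothetical helper that splits "label=dotpath" arguments
// into parallel slices of labels and dot paths.
func parsePairs(args []string) (labels, dotPaths []string, err error) {
	for _, arg := range args {
		// Split on the first "=" only, so the dot path is kept whole.
		parts := strings.SplitN(arg, "=", 2)
		if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
			return nil, nil, fmt.Errorf("expected label=dotpath, got %q", arg)
		}
		labels = append(labels, parts[0])
		dotPaths = append(dotPaths, parts[1])
	}
	return labels, dotPaths, nil
}

func main() {
	args := []string{"one=.title", "two=.family_name", "three=.given_name"}
	labels, dotPaths, err := parsePairs(args)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(labels)
	fmt.Println(dotPaths)
}
```

Keeping labels and dot paths in parallel slices mirrors the requirement that the two lists have the same length.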

Release candidate 2, for v1.0.0

10 Jun 18:26
Pre-release

Added a boolean clean object parameter to Read() and ReadList(). This lets you retrieve an object without the dataset-added _Key and _Attachments attributes.
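What a "clean" object means can be sketched as follows. The helper `cleanObject` is hypothetical and is not the code behind Read() or ReadList(); it only assumes an object is held as a `map[string]interface{}` and shows the two attributes being left out.

```go
package main

import "fmt"

// cleanObject is a hypothetical helper that returns a copy of obj without
// the "_Key" and "_Attachments" attributes. The input map is not modified.
func cleanObject(obj map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(obj))
	for k, v := range obj {
		if k == "_Key" || k == "_Attachments" {
			continue
		}
		out[k] = v
	}
	return out
}

func main() {
	obj := map[string]interface{}{
		"_Key":         "k1",
		"_Attachments": []string{"notes.txt"},
		"title":        "An example",
	}
	fmt.Println(cleanObject(obj))
}
```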

Release candidate for v1.0.0

08 Jun 19:47
Pre-release

Dropped Bleve support. Removed buckets code. Removed verbs related to search (we'll add them back when Lunr is available in Go). Documentation cleanup. Refactored attachments from a tarball to a semver directory holding attachments by their base name. Expanded metadata in collection.json and improved init to include additional auto-generated metadata.

Refactor frame to support a list of objects

06 Jun 20:39

The major change is that frames now include an object list, and the grid element of a frame is considered deprecated. The object list has proven more useful in the code we write that uses frames. It is also more consistent with how Python and R represent a data frame. The array of objects remains easy to export to a 2D array if needed.
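Exporting an object list to a 2D array can be sketched in a few lines. The function `objectsToGrid` is a hypothetical illustration, not the frame's Grid() implementation; it assumes each object is a `map[string]interface{}` and that the column order is given by a list of labels.

```go
package main

import "fmt"

// objectsToGrid is a hypothetical helper that converts a list of objects
// into rows, one cell per label, in label order. A label missing from an
// object yields a nil cell.
func objectsToGrid(objects []map[string]interface{}, labels []string) [][]interface{} {
	grid := make([][]interface{}, 0, len(objects))
	for _, obj := range objects {
		row := make([]interface{}, len(labels))
		for i, label := range labels {
			row[i] = obj[label]
		}
		grid = append(grid, row)
	}
	return grid
}

func main() {
	objects := []map[string]interface{}{
		{"_Key": "k1", "title": "A"},
		{"_Key": "k2", "title": "B"},
	}
	for _, row := range objectsToGrid(objects, []string{"_Key", "title"}) {
		fmt.Println(row)
	}
}
```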

The code supporting the BUCKET collection layout has been removed, as this data layout has been out of use for more than a year. This eliminated the migrate verb, simplified the check and repair verbs, and removed the need to specify a layout when initializing a collection.

Since removing the BUCKET layout is conceptually a breaking change (it removes a feature) and the grid element in a data frame is being deprecated, this release is considered a pre-release while we test practical usage.

Unified dataset and py_dataset release

06 May 21:20

This release is meant to unify the version numbers across dataset, libdataset and py_dataset.