Releases: CopticScriptorium/tokenizers

v4.1.0 February 2016

17 Feb 22:51

This version integrates the DDGLC lemma list into the tokenization and morphological analysis.

Tokenizer v4.0.1

09 Nov 22:11

Adds support for morphological analysis and fixes some bugs.

Tokenizer release 3.1.0 (July 16, 2015)

16 Jul 19:09

New version of the tokenizer:

  • Corrects a bug with the line break addition parameter -l
  • Adds better support for constructions with je- and nominalized tre-f

Tokenizer v. 3.0.1 May 2015

31 May 22:12

This release is similar to the previous release, except it provides additional instructions.

May 2015 release

22 May 18:30

A Perl script that takes Coptic text already segmented into bound groups and tokenizes it into constituent parts for further annotation. The tokenization is based on Layton's grammar. The release also includes an Excel macro that merges cells based on certain conditions.

v3.0 improves tokenization accuracy.

March 2015 release

11 Mar 03:06

This release includes a Perl script that tokenizes Coptic text segmented into bound groups, and an Excel macro that merges cells based on certain conditions.

11 March 2015

v2.0.1

26 Sep 00:00

Adds more patterns of various bound groups to the tokenizer and adds a parameter to accommodate diplomatic transcriptions of text in which a line break interrupts a bound group.

v1.1.1

11 Jul 01:50
version 1.1.1

v1.0.0

11 Jul 01:50
version 1.0.0

v0.9.3

11 Jul 01:49
version 0.9.3