Commit bb56cd4

Update the status
1 parent 1c6bced commit bb56cd4

1 file changed: +15 -10 lines changed

README.org

Lines changed: 15 additions & 10 deletions
@@ -17,7 +17,7 @@ using only character and most translation opcodes basically works.
 The original YAML test suite is supported and can be used to test the
 re-implementation.
 
-Currently, the re-implementation passes 68% of the liblouis test suite
+Currently, the re-implementation passes 83% of the liblouis test suite
 successfully.
 
 * Relation to liblouis
@@ -111,22 +111,30 @@ The parser is built from the grammar used in [[https://github.com/liblouis/tree-
 which is a port of the [[https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form][EBNF grammar]] in [[https://github.com/liblouis/rewrite-louis][rewrite-louis]], which in turn is
 a just port of the [[https://en.wikipedia.org/wiki/Parsing_expression_grammar][Parsing expression grammar]] from [[https://github.com/liblouis/louis-parser][louis-parser]].
 
-* Todo [6/15]
-- [ ] Parse with context
+* Todo [7/15]
+- [X] Parse with context
 - currently tables are parsed line by line. Opcodes have no idea
 whether a character or a class has been defined before
 - Probably need to pass some context to the rule parser where
 character definitions and class names are kept
 - this is solved with a two-pass compilation now. The first pass
 collects all relevant information and the second pass consequently
 uses that.
-- [ ] (Emphasis and Caps) Indication
+- [-] Indication [2/3]
 - presumably this could be done independently of translation, i.e.
 find indication locations and put them in the typeform array
 before even translating.
+- [X] Numeric indication
+- [X] Caps indication
+- [ ] Emphasis indication
 - [X] Add support for virtual dots
 - Virtual dots are supported and are converted to Unicode Supplementary Private Use Area-A
-- [ ] The correct, multipass and match opcodes
+- [-] The correct, multipass and match opcodes [1/3]
+- [X] Match opcode
+- A basic regexp engine has been implemented and aside from
+negation the match opcode basically works
+- [ ] Correct opcode
+- [ ] Multipass opcode
 - [X] Currently the matching of input text against the rules is case
 sensitive.
 - [X] Make it case insensitive.
@@ -152,11 +160,8 @@ a just port of the [[https://en.wikipedia.org/wiki/Parsing_expression_grammar][P
 - However normal translation has currently no way to specify a
 display table
 - [X] Handle undefined characters similarly to liblouis
-- [ ] Use a well established FST or graph library as a bases
-- currently regular expressions are implemented using a simple
-directed acyclic graph. It would surely be better to use a well
-established library for that task such as [[https://github.com/garvys-org/rustfst][rustfst]], [[https://crates.io/crates/petgraph][petgraph]] or
-[[https://github.com/neo4j-labs/graph][graph]].
+- [ ] Instead of hand-rolling an finite state machine to implement
+regular expressions we should use [[https://docs.rs/regex-automata/latest/regex_automata/][regex_automata]].
 
 * License

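The two-pass compilation described under "Parse with context" in the diff above can be illustrated with a minimal Rust sketch. Everything in it is an assumption made for illustration: the `class <name> <chars>` and `use <name>` line shapes are simplified stand-ins rather than real liblouis table syntax, and the function names `collect_classes` and `undefined_references` are hypothetical, not taken from the project.

```rust
use std::collections::HashSet;

/// Pass 1: walk the whole table once and collect every class name.
/// Assumes a hypothetical `class <name> <chars>` line shape.
fn collect_classes(table: &str) -> HashSet<String> {
    table
        .lines()
        .filter_map(|line| {
            let mut words = line.split_whitespace();
            match (words.next(), words.next()) {
                (Some("class"), Some(name)) => Some(name.to_string()),
                _ => None,
            }
        })
        .collect()
}

/// Pass 2: walk the table again, now with the collected context.
/// Returns the names referenced by hypothetical `use <name>` lines
/// that no `class` line in the table defines.
fn undefined_references(table: &str, classes: &HashSet<String>) -> Vec<String> {
    table
        .lines()
        .filter_map(|line| {
            let mut words = line.split_whitespace();
            match (words.next(), words.next()) {
                (Some("use"), Some(name)) if !classes.contains(name) => {
                    Some(name.to_string())
                }
                _ => None,
            }
        })
        .collect()
}
```

Because the first pass sees the whole table before any reference is checked, a `use` line may refer to a class that is defined further down, which a strictly line-by-line parse could not resolve.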