Commit 8155817

Add notes about duplicate character transitions
that can happen when we have regexps with alternations
1 parent 3957868 commit 8155817

1 file changed

+47
-0
lines changed

doc/Architecture_Decision_Records.org

Lines changed: 47 additions & 0 deletions
@@ -7,6 +7,53 @@

#+TODO: DRAFT PROPOSED | ACCEPTED REJECTED DEPRECATED SUPERSEDED

* DRAFT Regexps duplicate character transitions
- Deciders :: CE
- Date :: [2024-12-24 Di]

** Context and Problem Statement
We keep a single graph for all regular expressions. For an alternation
we create a new node and add one edge per alternative between the
current node and the new node. What should happen if the current node
already has a character transition for one of the characters in the
alternation? We could:
1. merge the nodes
   - how do we handle the case where both nodes are accepting, i.e.
     have translations? Which translation takes precedence?
2. add an epsilon transition
   - from the new node to the existing one
   - and delete the edge
3. allow multiple character transition edges with the same character
   from the same node
   - this completely breaks the current data structure

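The clash described above can be reproduced with a minimal sketch. It assumes a node layout with at most one outgoing edge per character and assumes that rules sharing a prefix share nodes; the names =Node=, =add_word= and =add_alternation= are hypothetical and not taken from the code base.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    # Hypothetical node layout: at most one outgoing edge per character.
    edges: Dict[str, "Node"] = field(default_factory=dict)
    translation: Optional[str] = None  # set only on accepting nodes


def add_word(start: Node, word: str, translation: str) -> Node:
    """Walk or create one edge per character, then mark the last node accepting."""
    node = start
    for char in word:
        node = node.edges.setdefault(char, Node())
    node.translation = translation
    return node


def add_alternation(current: Node, chars: str) -> Tuple[Node, List[str]]:
    """Link `current` to a fresh node once per alternative character.

    Characters that already have an edge from `current` are left untouched
    and reported, since this layout cannot hold a second edge for them.
    """
    new_node = Node()
    clashes = []
    for char in chars:
        if char in current.edges:
            clashes.append(char)
        else:
            current.edges[char] = new_node
    return new_node, clashes


root = Node()
add_word(root, "bor", "17")                          # always bor 17
after_b = root.edges["b"]                            # assumed shared prefix "b"
new_node, clashes = add_alternation(after_b, "ao")   # the (a|o) part
print(clashes)
```

Here =a= gets its edge to the new node, while =o= is reported as a clash because the path for =bor= already owns that transition. Each of the three options is a different answer to what to do with that reported character.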
#+begin_example
letter a 1
letter b 2
letter o 3
letter r 4
letter z 5

always bor 17
match b(a|o) r z 14
#+end_example

#+begin_example
/bor|b(a|o)rz/
#+end_example

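As a rough illustration of option 3, the sketch below stores a set of target nodes per (node, character) pair and tracks all active nodes while reading input. It is only a sketch under assumptions: the class =Nfa= and its methods are hypothetical, node ids are plain integers, and the two rules are entered as the combined expression =/bor|b(a|o)rz/= with the translations 17 and 14 from the example.

```python
from collections import defaultdict


class Nfa:
    """Hypothetical graph that allows several edges per (node, character)."""

    def __init__(self):
        self.edges = defaultdict(set)   # (node, char) -> set of target nodes
        self.translations = {}          # accepting node -> translation
        self.next_id = 1                # node 0 is the start node

    def new_node(self):
        node = self.next_id
        self.next_id += 1
        return node

    def add_path(self, steps, translation):
        """Add one fresh node per step; each step lists its alternative chars."""
        node = 0
        for alternatives in steps:
            target = self.new_node()
            for char in alternatives:
                self.edges[(node, char)].add(target)
            node = target
        self.translations[node] = translation
        return node

    def run(self, text):
        """Follow every matching edge in parallel; return translations reached."""
        active = {0}
        for char in text:
            active = set().union(
                *(self.edges.get((node, char), set()) for node in active)
            )
        return sorted(self.translations[n] for n in active if n in self.translations)


nfa = Nfa()
nfa.add_path(["b", "o", "r"], "17")          # bor
nfa.add_path(["b", "ao", "r", "z"], "14")    # b(a|o)rz

for word in ("bor", "borz", "barz", "bar"):
    print(word, nfa.run(word))
```

Both rules now start with their own =b= edge from the start node, so no merging or precedence decision is needed while building the graph. The cost is that matching has to carry a set of active nodes instead of a single one, which is the change to the current data structure that option 3 warns about.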
** Decision Drivers
** Considered Options
** Decision Outcome
Chosen option: "TBD", because ...

** Positive Consequences
-
** Negative Consequences
-
** Pros and Cons of the Options
** Links

* DRAFT Hyphenation
- Deciders :: CE
- Date :: [2024-12-20 Fr]
