|
| 1 | +# KoboldCpp Experimental |
| 2 | + |
| 3 | +## Emphasisfsm |
| 4 | + |
| 5 | +A common problem during text generation is misplaced emphasis characters: |
| 6 | + |
| 7 | + *looks at you "why* this is here?" |
| 8 | + |
| 9 | +while it should be |
| 10 | + |
| 11 | + *looks at you* "why this is here?" |
| 12 | + |
| 13 | +Emphasisfsm solves this with a simple (and fast) grammar expressed as a deterministic finite state machine. |
| 14 | + |
| 15 | + |
| 16 | + |
| 17 | +Single characters are not a practical alphabet for LLMs, because a token often contains more than one character. |
| 18 | + |
| 19 | +Emphasisfsm uses LLM tokens as its alphabet, which makes it very fast. |
| 20 | + |
| 21 | + |
| 22 | + |
| 23 | +Those are only the most obvious examples. There are more, e.g. ' "***' is a valid token for the transition from quot to star, and '*this' is valid for quot->none or none->quot. |
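
To make token-level transitions concrete, here is a minimal sketch in Python. It is not the extension's actual code: the names `step`, `DEFAULT_PAIRS` and `NONE` are hypothetical, and it assumes that a token is walked character by character and is forbidden as soon as it would touch a different emphasis while one is still open.

```python
from typing import List, Optional, Tuple

# Hypothetical model: each emphasis is an (open, close) character pair.
DEFAULT_PAIRS: List[Tuple[str, str]] = [('"', '"'), ('*', '*')]
NONE = -1  # state "outside any emphasis"; other states are indexes into the pairs


def step(state: int, token: str,
         pairs: List[Tuple[str, str]] = DEFAULT_PAIRS) -> Optional[int]:
    """Return the state reached after `token`, or None if the token is forbidden."""
    for ch in token:
        if state == NONE:
            # Outside emphasis: an opening character enters that emphasis.
            for i, (open_ch, _close_ch) in enumerate(pairs):
                if ch == open_ch:
                    state = i
                    break
        else:
            _open_ch, close_ch = pairs[state]
            if ch == close_ch:
                # The matching closing character leaves the emphasis.
                state = NONE
            elif any(ch in pair for pair in pairs):
                # Any other emphasis character inside an open emphasis is forbidden.
                return None
    return state


QUOT, STAR = 0, 1
print(step(QUOT, ' "***'))                                 # quot -> star
print([step(s, ' "*"') for s in (NONE, QUOT, STAR)])       # forbidden from every state
```

Under these assumptions a multi-character token maps to a single transition per state, so the per-token work during generation can be a table lookup.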
| 24 | + |
| 25 | +### Usage |
| 26 | + |
| 27 | +To support a variety of GUIs, this extension shamefully exploits the GBNF grammar string. *This is not a proper GBNF grammar; it only uses the field, which is easily editable in most GUIs.* |
| 28 | + |
| 29 | +  |
| 30 | + |
| 31 | + |
| 32 | + emphasisfsm "_bias_[D][_emph1_][,_emphn_]" |
| 33 | + |
| 34 | +With an empty string, emphasisfsm is disabled. The easiest way to enable it is |
| 35 | + |
| 36 | + emphasisfsm "-20" |
| 37 | + |
| 38 | +which defaults to |
| 39 | + |
| 40 | + emphasisfsm "-20 \" \" * *" |
| 41 | + |
| 42 | +(no debug, only * and " are considered) |
| 43 | + |
| 44 | + |
| 45 | +### How it works |
| 46 | + |
| 47 | +The main loop is extended from: |
| 48 | + |
| 49 | +- retrieve logits |
| 50 | +- sample logits, select token (top_k and friends) |
| 51 | +- output token |
| 52 | + |
| 53 | +to |
| 54 | + |
| 55 | +- retrieve logits |
| 56 | +- ban forbidden emphasisfsm transitions from the current state (by setting their logits low) |
| 57 | +- sample logits, select token (top_k and friends) |
| 58 | +- emphasisfsm transition on the selected token |
| 59 | +- output token |
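
The extended loop can be sketched as follows. This is an illustration in Python, not the actual KoboldCpp implementation: `generation_step`, `apply_ban`, `greedy` and `toy_transition` are hypothetical names, greedy selection stands in for the real samplers, and the ban is assumed to be an additive bias on the logits.

```python
from typing import Callable, List, Optional, Tuple

Transition = Callable[[str, str], Optional[str]]


def toy_transition(state: str, token: str) -> Optional[str]:
    """Toy FSM over the states "none", "quot" and "star" (assumed semantics)."""
    for ch in token:
        if ch not in '"*':
            continue  # emphasis-indifferent character
        kind = "quot" if ch == '"' else "star"
        if state == "none":
            state = kind        # open an emphasis
        elif state == kind:
            state = "none"      # close the current emphasis
        else:
            return None         # forbidden: other emphasis is still open
    return state


def apply_ban(logits: List[float], vocab: List[str], state: str,
              transition: Transition, bias: float) -> List[float]:
    """Lower the logit of every token whose transition is forbidden from `state`."""
    return [logit + bias if transition(state, tok) is None else logit
            for logit, tok in zip(logits, vocab)]


def greedy(logits: List[float]) -> int:
    """Stand-in for the real sampler (top_k and friends)."""
    return max(range(len(logits)), key=logits.__getitem__)


def generation_step(logits: List[float], vocab: List[str], state: str,
                    transition: Transition = toy_transition,
                    bias: float = -20.0) -> Tuple[int, str]:
    biased = apply_ban(logits, vocab, state, transition, bias)  # ban step
    token_id = greedy(biased)                                   # sample step
    new_state = transition(state, vocab[token_id])              # FSM transition
    if new_state is None:
        # The bias is soft, so a banned token could still win; keep the old state.
        new_state = state
    return token_id, new_state


vocab = ["hi", "*", '"']
logits = [0.0, 5.0, 1.0]
print(generation_step(logits, vocab, "quot"))  # '*' is banned inside quotes
print(generation_step(logits, vocab, "none"))  # '*' is allowed outside emphasis
```

Because the ban runs before sampling, the usual samplers need no changes; they simply see lower logits for the forbidden tokens.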
| 60 | + |
| 61 | + |
| 62 | +### TODO |
| 63 | + |
| 64 | +- detect UTF-8 characters split across more than one token (I don't plan to support them, but a warning would be nice) |
| 65 | +- ban generation of end tokens inside an emphasis, forcing the LLM to finish its 'thought'? |
| 66 | + |
| 67 | + |
| 68 | +### Meta-Llama-3-8B stats for default (" *) emphasisfsm |
| 69 | + |
| 70 | + empcats_gen: ban bias: -17.500000 |
| 71 | + empcats_gen: emphasis indifferent tokens: 126802 |
| 72 | + empcats_gen: tokens for emphasis '"' '"': 1137 |
| 73 | + empcats_gen: tokens for emphasis '*' '*': 315 |
| 74 | + empcats_gen: always banned tokens: 2 |
| 75 | + empcats_gen: total tokens: 128256 |
| 76 | + |
| 77 | +The always banned tokens are: |
| 78 | + |
| 79 | +<pre>' "*"', ' "*"'</pre> |
| 80 | + |
| 81 | +### Tests |
| 82 | + |
| 83 | + emphasisfsm "-20 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 0 0" |
| 84 | + |
| 85 | +This makes every digit act as an emphasis (quotation) delimiter, so an example text completion looks like: |
| 86 | + |
| 87 | + |
| 88 | +``` |
| 89 | +Give me math vector of random numbers.Here is a 3-dimensional math vector with random numbers: |
| 90 | + |
| 91 | + |
| 92 | +Vector: |
| 93 | +[ |
| 94 | + 3.445, |
| 95 | + -5.117, |
| 96 | + 7.992 |
| 97 | +] |
| 98 | +``` |
| 99 | + |
| 100 | +No other digit appears between two 3s, two 4s, two 5s, and so on. |
| 101 | + |
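
The property above can be checked with a small script. This is a hedged sketch in Python: `first_violation` is a hypothetical helper, not part of the extension, and it works on characters of the finished text rather than on tokens.

```python
from typing import Optional


def first_violation(text: str, delimiters: str = "0123456789") -> Optional[int]:
    """Return the index of the first delimiter that appears while a different
    delimiter is still open, or None if the text obeys the pairing rule."""
    open_ch = None
    for i, ch in enumerate(text):
        if ch not in delimiters:
            continue
        if open_ch is None:
            open_ch = ch      # opens an emphasis
        elif ch == open_ch:
            open_ch = None    # closes it
        else:
            return i          # a different digit inside an open pair
    return None


sample = "Here is a 3-dimensional math vector: [3.445, -5.117, 7.992]"
print(first_violation(sample))   # no violation in the completion above
print(first_violation("3.14"))   # '1' appears while '3' is still open
```

An emphasis that is still open at the end of the text (the trailing 2 in the example) is not treated as a violation here.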