Triggers
Triggers form the basis of every chat conversation: each trigger is turned into a regular expression that user input is matched against. In SuperScript, triggers are identified by a + in front of them, and replies by a -. The example below is a simple gambit that includes a trigger and a reply:
+ hello bot
- Hello, human!
If the user input is hello bot, the system will reply with Hello, human!.
Triggers are cleaned and normalized before being exposed to the engine. Extra whitespace and punctuation are removed, and matches are always case insensitive. The following rules are all identical:
+ hello bot
+ HELLO BOT
+ Hello, bot
+ hello BOT!
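The equivalence of these four rules can be illustrated with a small Python sketch. This is not SuperScript's actual normalizer: the function name `normalize_trigger` is hypothetical, and the sketch only covers lowercasing, punctuation removal and whitespace collapsing for plain-text triggers (it would also strip wildcard characters, which the real engine keeps).

```python
import re

def normalize_trigger(text):
    """Illustrative only: lowercase, drop punctuation, collapse whitespace."""
    lowered = text.lower()
    # Remove every character that is not a word character or whitespace.
    no_punct = re.sub(r"[^\w\s]", "", lowered)
    # Collapse runs of whitespace into single spaces.
    return " ".join(no_punct.split())

variants = ["hello bot", "HELLO BOT", "Hello, bot", "hello BOT!"]
normalized = {normalize_trigger(v) for v in variants}
print(normalized)  # all four variants collapse to the same string
```

Because all four variants reduce to the same normalized form, only one of them needs to be written in a script.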
Inputs are processed by the system to make them easy to understand for a bot:
- cleaning and normalization
- spelling corrections for common spelling errors (or British spelling)
- idiom conversion
- junk word removal
- abbreviation expansion and more.
Examples:
- It’s a nice day becomes it is a nice day.
- Greetings such as hi, hello, hola, yo, hey become ~emohello.
Warning: normalization might remove some words from the input (like “really?” or “quite”), which might make rule triggering awkward sometimes.
Also, inputs coming into the system are burst into separate message objects and handled separately. Multiple phrases are split on ending punctuation and on commas that come before WH words (as in the example below). The replies of the matched gambits are concatenated back together to form the final message returned by the bot.
+ my name is john
- Hi John!
+ what is your name
- My name is Ava.
- An input like My name is John, what's your name? is split into two separate inputs and normalized: my name is john and what is your name. Then the bot looks for matches for both inputs, and then concatenates the replies, before replying to the user with Hi John! My name is Ava.
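A rough Python sketch of this burst-and-concatenate flow is shown here. It is an illustration under stated assumptions, not the engine's code: the `WH_WORDS` list, the single hard-coded contraction, and the names `burst`, `normalize` and `respond` are all hypothetical, and the split rule assumes a comma directly before a WH word, as in the example above.

```python
import re

# Assumed list of WH words; the real engine may use a different set.
WH_WORDS = ("who", "what", "where", "when", "why", "how")

# Replies for the two gambits above, keyed by normalized trigger.
REPLIES = {
    "my name is john": "Hi John!",
    "what is your name": "My name is Ava.",
}

def burst(message):
    """Split after ending punctuation, or at a comma followed by a WH word."""
    wh = "|".join(WH_WORDS)
    pattern = rf"(?<=[.!?])\s+|,\s*(?=(?:{wh})\b)"
    parts = re.split(pattern, message, flags=re.IGNORECASE)
    return [p.strip() for p in parts if p.strip()]

def normalize(part):
    """Tiny stand-in for normalization: one contraction, then cleanup."""
    text = part.lower().replace("what's", "what is")
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(text.split())

def respond(message):
    """Match each burst part on its own, then join the replies."""
    keys = [normalize(p) for p in burst(message)]
    return " ".join(REPLIES[k] for k in keys if k in REPLIES)

print(respond("My name is John, what's your name?"))
```

Handling each part independently is what lets two unrelated gambits answer a single compound message.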
We use wildcards to allow for more natural and expressive rules. A wildcard stands in for characters, words or tokens that you do not want to spell out; how many it accepts depends on the type of wildcard. Depending on the type of wildcard, the input may or may NOT be captured or saved by the system.
The general wildcard, *, will match zero to unlimited characters, words or tokens. The input is NOT captured or saved by the system.
+ * hello *
- That is crazy.
- Matches The dog said hello to the cat.
- Matches Hello world.
- Matches While hello.
If you know exactly how many words you want to accept, but don't want to list what the words are, then an exact length wildcard might be what you're after.
The syntax is *n, where n is the exact number of words you want to let through.
+ hello *2
- Matches Hello John Doe
- Does not match Hello John
Exact length wildcards are captured by the system and can later be used in replies.
If you only want to allow a few words, though, consider using a variable length wildcard. The syntax for variable length wildcards is *~n, where n is the maximum number of words you want to let through.
+ hello *~2
- That is crazy!
- Matches Hello!
- Matches Hello John!
- Matches Hello John Doe
- Does not match Hello John George Doe
Variable length wildcards are great for capturing input where an adjective might be slipped in before a noun.
A wildcard with a word range is useful if you want to capture, let’s say, at least 2 words but no more than 4 words, written *(2-4) in the example below. Using *(n) is the equivalent of *n (exact length wildcard). These wildcards are captured by the system and can be used in replies.
+ hello *(2-4)
- Matches Hello John Doe and Hello John Dorian Doe
- Does not match Hello John
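The four wildcard forms above can be compared side by side by translating them into ordinary regular expressions. The sketch below is one possible translation, not SuperScript's implementation: `wildcard_to_regex` and `matches` are hypothetical names, each input word is prefixed with a space so that zero-word matches need no special casing, and capturing is left out because it only checks match or no match.

```python
import re

def wildcard_to_regex(trigger):
    """Translate *, *n, *(n), *~n and *(n-m) into a word-level regex."""
    pieces = []
    for token in trigger.lower().split():
        exact = (re.fullmatch(r"\*(\d+)", token)
                 or re.fullmatch(r"\*\((\d+)\)", token))
        upto = re.fullmatch(r"\*~(\d+)", token)
        span = re.fullmatch(r"\*\((\d+)-(\d+)\)", token)
        if token == "*":
            pieces.append(r"(?: \S+)*")            # zero or more words
        elif exact:
            pieces.append(r"(?: \S+){%s}" % exact.group(1))
        elif upto:
            pieces.append(r"(?: \S+){0,%s}" % upto.group(1))
        elif span:
            pieces.append(r"(?: \S+){%s,%s}" % span.groups())
        else:
            pieces.append(" " + re.escape(token))  # literal word
    return re.compile("".join(pieces))

def matches(trigger, text):
    """Normalize the input, then require the whole input to match."""
    words = re.sub(r"[^\w\s]", "", text.lower()).split()
    spaced = "".join(" " + w for w in words)
    return wildcard_to_regex(trigger).fullmatch(spaced) is not None

print(matches("hello *2", "Hello John Doe"))      # exact length
print(matches("hello *~2", "Hello!"))             # variable length, zero words
print(matches("hello *(2-4)", "Hello John"))      # below the minimum
```

Requiring a full match is what makes `hello *~2` reject a four-word input: the extra words have nowhere to go.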
Alternates are used when you have a range of words that could fit into the rule, but one is required. The alternate in the input is captured by the system and can be used in replies.
+ i go by (bus|train|car)
- Matches I go by bus
- Matches I go by train
- Matches I go by car
- Does not match I go by
Optionals can be used to check for extra/optional words. Optionals are not captured by the system.
+ my [big] red balloon [is awesome]
- Matches my red balloon
- Matches my big red balloon
- Matches my red balloon is awesome
- Matches my big red balloon is awesome
- Does not match my big red balloon awesome
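Alternates and optionals map naturally onto regex groups, which the following Python sketch illustrates. It is an assumption-laden approximation rather than the engine's parser: `compile_trigger` and `match` are hypothetical names, alternates become capturing groups (mirroring the statement that they are captured), and optionals become non-capturing optional groups.

```python
import re

def compile_trigger(trigger):
    """Turn (a|b) into a required captured group and [x] into an optional one."""
    pieces = []
    # Tokens are bracketed groups (which may contain spaces) or bare words.
    for token in re.findall(r"\[[^\]]*\]|\([^)]*\)|\S+", trigger.lower()):
        if token[0] in "[(":
            options = "|".join(
                re.escape(o.strip()) for o in token[1:-1].split("|")
            )
            if token[0] == "[":
                pieces.append(r"(?: (?:%s))?" % options)  # optional, not captured
            else:
                pieces.append(r" (%s)" % options)         # required, captured
        else:
            pieces.append(" " + re.escape(token))
    return re.compile("".join(pieces))

def match(trigger, text):
    """Return the match object for a full match, or None."""
    words = re.sub(r"[^\w\s]", "", text.lower()).split()
    return compile_trigger(trigger).fullmatch("".join(" " + w for w in words))

hit = match("i go by (bus|train|car)", "I go by train")
print(hit.group(1))  # the captured alternate
print(match("my [big] red balloon [is awesome]", "my big red balloon awesome"))
```

The last call returns no match because the optional phrase must appear whole or not at all.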
WordNet is a database of words and an ontology that includes hypernyms, synonyms and lots of other neat relationships between words. SuperScript uses the raw WordNet library and has expanded it to include fact triples, providing even more relationships through its scripted fact graph database.
These terms are expanded by putting a tilde ~ before the word you want to expand.
+ I ~like ~sport
- Matches I like hockey
- Matches I love baseball
- Matches I care for soccer
- Matches I prefer lacrosse
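Conceptually, a ~term behaves like an alternate whose members are looked up rather than written out. The Python sketch below shows that idea with a hand-written `CONCEPTS` dictionary standing in for the WordNet and fact-graph lookup; the dictionary contents and the names `expand_concepts` and `matches` are hypothetical and chosen only to reproduce the examples above.

```python
import re

# Hypothetical stand-in for the WordNet / fact graph lookup.
CONCEPTS = {
    "like": ["like", "love", "care for", "prefer"],
    "sport": ["hockey", "baseball", "soccer", "lacrosse"],
}

def expand_concepts(trigger):
    """Replace each known ~term with an alternation of its members."""
    pieces = []
    for token in trigger.lower().split():
        name = token[1:]
        if token.startswith("~") and name in CONCEPTS:
            # Try longer (multi-word) members before shorter ones.
            members = sorted(CONCEPTS[name], key=len, reverse=True)
            options = "|".join(re.escape(m) for m in members)
            pieces.append(" (?:%s)" % options)
        else:
            pieces.append(" " + re.escape(token))
    return re.compile("".join(pieces))

def matches(trigger, text):
    """Normalize the input, then require the whole input to match."""
    words = re.sub(r"[^\w\s]", "", text.lower()).split()
    spaced = "".join(" " + w for w in words)
    return expand_concepts(trigger).fullmatch(spaced) is not None

print(matches("I ~like ~sport", "I care for soccer"))
print(matches("I ~like ~sport", "I like pizza"))
```

The benefit over a hand-written alternate is that the word list lives in one place and every trigger using ~sport picks up new members automatically.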
When input comes into the system, we tag it and analyze it to help make sense of what is being said. SuperScript has a few tagged keywords you can use in triggers. These tags can also have a numeric value attached to them to get even more specific.
- <noun>, <nounN>, <nouns>
- <adjective>, <adjectiveN>, <adjectives>
- <verb>, <verbN>, <verbs>
- <adverb>, <adverbN>, <adverbs>
- <pronoun>, <pronounN>, <pronouns>
- <name>, <nameN>, <names>
+ <name1> is [more|less] <adjectives> than <name2>
- Matches Tom is taller than Mary
- Matches John is less disciplined than Jack.
Note that pronouns are a subclass of nouns, so I, you, her will match both <noun> and <pronoun>. For an input like I’m an engineer, the system will normalize it to I am an engineer, then tag it:
taggedWords:
[ [ 'I', 'NN' ],
[ 'am', 'VBP' ],
[ 'an', 'DT' ],
[ 'engineer', 'NN' ],
<noun1> matches only the first noun in the lookup, whereas <nouns> matches all nouns. Therefore:
- I am a <noun1> will match I am a I
- I am a <nouns> will match I am a I or I am an engineer
- I am a <noun> will match I am a cow or I am an engineer
We can identify questions (with or without the ending question mark) in the input, so you can create question-specific rules by using ? to begin your trigger pattern in SuperScript, or by selecting from the droplist in the editor.
?:Will you do *
- Hmmm, let me get back to you on that.
SuperScript can go one step further and distinguish between different question types:
- Question word (who, what, where, when, why).
- Choice questions (this or that)
- Yes/No questions
- Tag questions (He is bald, isn’t he?)
?:WH * store
- Matches Who went to the store?
- Matches Why did you go to the store?
- Does not match Is this your store?
?:CH is your car *
- Matches Is your car green or blue?
- Does not match Is your car green?
We can also match based on the type of input. In the future, we may look at other types of classification that follow more linguistic types like speech acts and adjacency pairs.
SuperScript supports 8 broad categories and over 40 sub-categories with 80% accuracy:
ABBR - abbreviation
  abb - abbreviation
  exp - expression abbreviated
ENTY - entities
  animal - animals
  body - organs of body
  color - colors
  creative - inventions, books and other creative pieces
  currency - currency names
  event - events
  food - food
  instrument - musical instrument
  lang - languages
  letter - letters like a-z
  other - other entities
  plant - plants
  product - products
  religion - religions
  sport - sports
  substance - elements and substances
  symbol - symbols and signs
  technique - techniques and methods
  term - equivalent terms
  vehicle - vehicles
  word - words with a special property
DESC - description and abstract concepts
  def - definition of something
  desc - description of something
  manner - manner of an action
  reason - reasons
HUM - human beings
  group - a group or organization of persons
  ind - an individual
  title - title of a person
  desc - description of a person
LOC - locations
  city - cities
  country - countries
  mountain - mountains
  other - other locations
  state - states
NUM - numeric values
  code - postcodes, phone number or other codes
  count - number of something
  expression - numeric mathematical expression
  date - dates
  distance - linear measures
  money - prices
  order - ranks
  other - other numbers
  period - the lasting time of something
  percent - fractions
  speed - speed
  temp - temperature
  size - size, area and volume
  weight - weight
Here are some examples:
?:NUM:code * phone *
- Matches My phone is 415-315 9862.
Input types are different from concepts or parts of speech because they are made up of more than one word. LOC, for example, usually starts with “Where” then drills into a region or other complementary word.