What model(s) would you suggest for this problem #6839
-
I'm looking for ideas for a problem that is essentially schema-guided / task-oriented dialogue. The task is essentially invoking an internal API. In my case, both the intents and slots are fully defined by the schema at inference time, but the potential values of the slots aren't (all) known at train time. The available slots are also context dependent, i.e. different users manage different sites and contractors, and users will most often mistype or abbreviate the slot values.

I can use conventional joint intent and slot filling, with some sort of normalisation/text distance to map the slot spans to the most likely schema entries, but since the valid slot values are known at inference time it seems useful to make use of this somehow. Does NeMo have a model that does something like this? I.e. I could embed all the valid slot values (list of sites, list of contractors etc. filtered by context), but how do I make use of this at inference time? I've considered first using an intent classifier, then, based on the schema, using something like SentenceTransformer to compare the user text to the embedded slot values.

I'm interested in any ideas/recommendations/publications that are relevant.
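For concreteness, this is roughly what I have in mind for the matching step. It's a minimal sketch using the sentence-transformers library; the model name, threshold and site list are just placeholders for my setup, not anything from NeMo:

```python
# Rough sketch of the "embed the valid slot values" idea. The encoder model,
# threshold and value lists are placeholders; this uses sentence-transformers
# directly, not a NeMo API.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder would do

def resolve_slot_value(span_text: str, valid_values: list[str], threshold: float = 0.5):
    """Map a (possibly mistyped/abbreviated) slot span to the most similar valid value."""
    span_emb = encoder.encode(span_text, convert_to_tensor=True)
    value_embs = encoder.encode(valid_values, convert_to_tensor=True)
    scores = util.cos_sim(span_emb, value_embs)[0]  # similarity to each valid value
    best = int(scores.argmax())
    if scores[best] < threshold:
        return None, float(scores[best])            # no confident match
    return valid_values[best], float(scores[best])

# e.g. valid site names for this user's context (only known at inference time)
sites = ["1 George Street", "25 Martin Place", "100 Harris Street"]
print(resolve_slot_value("1 george st", sites))
```

The best-scoring value (or a "no match" fallback below the threshold) would then become the normalised slot value passed to the internal API.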
-
I would suggest you have a look at https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/nlp/Dialogue.ipynb, particularly Section 1.4, where the problem you're describing is reduced to this: given an utterance, generate an intent as well as the corresponding slot names and slot values. In this formulation, it's not required that slot names are known beforehand during inference. With enough examples and training steps, the GPT-style model can learn to draw the correlation between intent and slot names (without needing a sentence-transformer-style model for approximate matching or a predefined schema).

This dialogue module also implements the traditional BERT-style joint intent and slot models, as well as intent classification based on sentence transformers, but these can be a lot more complex to set up. The effectiveness of the generative model strongly depends on the set of intents and slot names in your training set overlapping with your test set (since the model learns to memorize these).

In terms of publications, this approach was inspired by https://proceedings.neurips.cc/paper/2020/hash/e946209592563be0f01c844ab2170f0c-Abstract.html
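To make the generative formulation concrete, here is a rough sketch of how an example can be linearized into a prompt/target pair for a GPT-style model. The field names and separators below are my own illustration and not necessarily the exact format the tutorial uses:

```python
# Illustrative only: linearize a dialogue example into a (prompt, target) pair
# for a GPT-style model. Separators and field names are arbitrary choices here,
# not the exact format used by the NeMo Dialogue tutorial.

def linearize_example(utterance: str, intent: str, slots: dict[str, str]) -> tuple[str, str]:
    slot_str = ", ".join(f"{name} = {value}" for name, value in slots.items())
    prompt = f"utterance: {utterance} intent:"
    target = f" {intent} slots: {slot_str}"
    return prompt, target

prompt, target = linearize_example(
    "what is the energy consumption for 1 George St",
    intent="energy_consumption",
    slots={"site": "1 George St"},
)
# At inference time the model is given `prompt` and generates `target`,
# from which the intent and slot name/value pairs are parsed back out.
print(prompt + target)
```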
-
Thanks @Zhilin123, I have looked at that example. The BERT model appears similar to my test model that I built from scratch, although probably a lot more robust! I wasn't 100% sure whether the GPT2 version is better or not; the accuracy after 3 epochs looks lower, but the two models produce different reports so it's hard to compare directly.

[BERT (3 epochs) metrics report]
[GPT (3 epochs) metrics report]
Finally, the SGD example only seems to predict intents, not slots? Is that true, and is it possible to configure the model to do both?
-
Hi @david-waterworth
Both the BERT and GPT models predict intents and slots (for BERT, see the unified_slot_precision metric in the table).
The metrics for intents are called by the same names for both GPT and BERT, while for slots the metrics are reported slightly differently because of how each model does the prediction.
In BERT, given the utterance

[CLS] what is the energy consumption for 1 George St

it predicts one label per position, something like

Intent 15 | slot0 | slot0 | slot0 | slot0 | slot0 | slot0 | slot24 | slot24 | slot24
where Intent 15 is "energy consumption", 1 out of, say, 50 intents, and slot24 is the slot name for "site" out of, say, 40 different slots (with slot0 being the label for an empty slot)…
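Concretely, the per-token labels can be turned back into (slot name, span) pairs by grouping contiguous tokens that share a non-empty slot id, along these lines (a simplified sketch with made-up labels, not the exact NeMo implementation):

```python
# Simplified sketch of turning per-token slot ids back into (slot_name, span) pairs.
# Label names/ids are made up for this example; this is not the exact NeMo code.
from itertools import groupby

EMPTY_SLOT = "slot0"  # label used for tokens that belong to no slot

def decode_slots(tokens: list[str], slot_labels: list[str]) -> list[tuple[str, str]]:
    spans = []
    idx = 0
    for label, group in groupby(slot_labels):
        group_len = len(list(group))
        if label != EMPTY_SLOT:
            spans.append((label, " ".join(tokens[idx:idx + group_len])))
        idx += group_len
    return spans

tokens = "what is the energy consumption for 1 George St".split()
labels = ["slot0"] * 6 + ["slot24"] * 3   # slot24 = "site" in the example above
print(decode_slots(tokens, labels))       # [('slot24', '1 George St')]
```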