Open
Description
Overview
We need to implement a RoBERTa-based classifier that predicts the correct ERPNext Doctype from a user's natural language query, together with an SBERT model that predicts the top fields for that query. Together they form the first stage of our multi-step query generation pipeline.
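As a rough illustration of how the two models could fit together, here is a minimal sketch of the stage-1 flow. Everything named here (`Stage1Result`, `predict_stage1`, the scorer callables, the example field names) is hypothetical and not part of the existing codebase. The models are passed in as plain callables so that the fine-tuned RoBERTa classifier and the SBERT encoder can be plugged in later, and the sketch assumes that field ranking is restricted to the fields of the predicted Doctype, which the issue does not state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Mapping, Sequence, Tuple

# Hypothetical interfaces: the real implementations would wrap a fine-tuned
# RoBERTa classifier and an SBERT encoder. Here they are plain callables.
DoctypeScorer = Callable[[str], Dict[str, float]]          # query -> {doctype: probability}
FieldScorer = Callable[[str, Sequence[str]], List[float]]  # (query, fields) -> similarity scores


@dataclass
class Stage1Result:
    doctype: str
    confidence: float
    top_fields: List[Tuple[str, float]]  # (field name, score), best first


def predict_stage1(
    query: str,
    doctype_scorer: DoctypeScorer,
    field_scorer: FieldScorer,
    fields_by_doctype: Mapping[str, Sequence[str]],
    top_k: int = 3,
) -> Stage1Result:
    """Pick the highest-scoring Doctype, then rank that Doctype's fields."""
    scores = doctype_scorer(query)
    if not scores:
        raise ValueError("doctype scorer returned no scores")
    doctype, confidence = max(scores.items(), key=lambda kv: kv[1])

    # Assumption: only fields of the predicted Doctype are candidates.
    candidates = list(fields_by_doctype.get(doctype, []))
    sims = field_scorer(query, candidates) if candidates else []
    ranked = sorted(zip(candidates, sims), key=lambda fs: fs[1], reverse=True)
    return Stage1Result(doctype, confidence, ranked[:top_k])


# Stand-in scorers for demonstration only; they do not use any model.
def fake_doctype_scorer(query: str) -> Dict[str, float]:
    return {"Sales Invoice": 0.8, "Customer": 0.2}


def fake_field_scorer(query: str, fields: Sequence[str]) -> List[float]:
    return [1.0 if f in query.lower() else 0.0 for f in fields]


result = predict_stage1(
    "total of unpaid invoices per customer",
    fake_doctype_scorer,
    fake_field_scorer,
    {"Sales Invoice": ["grand_total", "customer", "posting_date"]},
    top_k=2,
)
print(result)
```

Keeping the scorers behind callables means the pipeline logic can be unit-tested without loading model weights.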
Tasks
- Finalize the training dataset in the formats required by RoBERTa and SBERT.
- Train or fine-tune the RoBERTa and SBERT models.
- Evaluate classification accuracy on dev/test sets.
- Add inference script and API wrapper.
- Integrate with overall pipeline.
- Add logging and error handling.
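The dataset, evaluation, and error-handling tasks above could start from something like the following sketch. It assumes a hypothetical raw example shape (`query`, `doctype`, `fields`), integer labels for the classifier, and positive (query, field) pairs for SBERT. The actual SBERT format depends on the loss function chosen, so treat the pair format as one possibility only. All function names are illustrative.

```python
import logging
from typing import Callable, Dict, Iterable, List, Optional, Tuple

logger = logging.getLogger("stage1")  # hypothetical logger name


def build_label_map(doctypes: Iterable[str]) -> Dict[str, int]:
    """Assign a stable integer id to each distinct Doctype name."""
    return {name: i for i, name in enumerate(sorted(set(doctypes)))}


def to_classifier_rows(examples: List[dict], label_map: Dict[str, int]) -> List[dict]:
    """Rows of {"text", "label"}, a common shape for sequence classification."""
    return [{"text": ex["query"], "label": label_map[ex["doctype"]]} for ex in examples]


def to_sbert_pairs(examples: List[dict]) -> List[Tuple[str, str]]:
    """Positive (query, field) pairs; an assumed format, not a confirmed one."""
    return [(ex["query"], field) for ex in examples for field in ex["fields"]]


def accuracy(predict: Callable[[str], int], rows: List[dict]) -> float:
    """Fraction of rows where the predicted label id matches the gold label."""
    if not rows:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(1 for row in rows if predict(row["text"]) == row["label"])
    return correct / len(rows)


def safe_predict(predict: Callable[[str], int], query: str) -> Optional[int]:
    """Reject empty queries and log, rather than propagate, model failures."""
    if not query or not query.strip():
        logger.warning("empty query rejected")
        return None
    try:
        return predict(query)
    except Exception:
        logger.exception("doctype prediction failed for query %r", query)
        return None


# Illustrative data only; not taken from the project's dataset.
examples = [
    {"query": "show unpaid invoices", "doctype": "Sales Invoice",
     "fields": ["status", "outstanding_amount"]},
    {"query": "list customers in Pune", "doctype": "Customer",
     "fields": ["customer_name", "territory"]},
]
label_map = build_label_map(ex["doctype"] for ex in examples)
rows = to_classifier_rows(examples, label_map)
pairs = to_sbert_pairs(examples)


def keyword_predict(text: str) -> int:
    # Stand-in for the trained classifier, used here only to exercise the helpers.
    return label_map["Sales Invoice"] if "invoice" in text else label_map["Customer"]


print(accuracy(keyword_predict, rows))
```

Returning `None` from `safe_predict` is one design option; the API wrapper could instead map failures to an explicit error response.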
Notes
This component is expected to evolve continuously: we will likely add more training data and re-tune it periodically as new doctypes or user question patterns emerge.
Labels
NLP, model, pipeline, stage-1, continual-improvement