Open
Summary
In several FLAN Stage 3 query generation examples, the fields used in the filtering clause (e.g., `filters={'field': value}`) were not explicitly included in the input fields list. As a result, the model incorrectly assumes or omits filtering logic, leading to incomplete or incorrect query generation.
Problem
While some examples include both retrieval and filtering fields, others only include output fields, leaving out critical filter fields like `disabled`, `supplier_group`, `territory`, etc.
This inconsistency:
- Gives the model inconsistent training signals for otherwise similar questions
- Affects generalization to filtering-type questions
- Leads to wrong query structure in multi-field queries
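To make the gap concrete, here is a minimal sketch of an incomplete and a complete sample. The sample layout (the keys `question`, `fields`, `query_fields`, `query_filters`) and the example questions are assumptions made for illustration only; the actual FLAN Stage 3 schema may differ.

```python
# Hypothetical sample layout, for illustration only. The key names are
# assumptions and not the dataset's actual schema.
incomplete_sample = {
    "question": "List the names of all suppliers that are not disabled.",
    "fields": ["supplier_name"],            # output field only
    "query_fields": ["supplier_name"],
    "query_filters": {"disabled": 0},       # filter field never listed above
}

complete_sample = {
    "question": "List the names of all suppliers that are not disabled.",
    "fields": ["supplier_name", "disabled"],  # retrieval + filter fields
    "query_fields": ["supplier_name"],
    "query_filters": {"disabled": 0},
}


def missing_filter_fields(sample):
    """Return filter fields the query uses but the input fields list omits."""
    return sorted(set(sample["query_filters"]) - set(sample["fields"]))


print(missing_filter_fields(incomplete_sample))  # ['disabled']
print(missing_filter_fields(complete_sample))    # []
```

In the incomplete sample the model sees a target query that filters on `disabled` without ever being told that `disabled` is an available field, which is the inconsistency described above.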
Solution
We need to:
- Identify all filtering-type questions in the FLAN Stage 3 dataset
- Ensure that filter fields are added alongside retrieval fields
- Reformat existing samples to include all required fields for correctness
- Review and validate updated entries with multi-field and conditional logic
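The first three steps could be partly automated. The sketch below assumes each sample stores its target query as a Python-style call string with a literal `filters={...}` dict, and that the input fields live under a `fields` key. Both the layout and the `get_all` call name are hypothetical stand-ins. Filters written in other forms (for example, lists of conditions) would need extra handling, and the final review step still requires a manual pass.

```python
import ast


def extract_filter_fields(query):
    """Collect the string keys of a literal dict passed as filters=... ."""
    found = []
    for node in ast.walk(ast.parse(query, mode="eval")):
        if isinstance(node, ast.keyword) and node.arg == "filters":
            if isinstance(node.value, ast.Dict):
                for key in node.value.keys:
                    is_str_key = isinstance(key, ast.Constant) and isinstance(key.value, str)
                    if is_str_key and key.value not in found:
                        found.append(key.value)
    return found


def add_missing_filter_fields(sample):
    """Return a copy whose "fields" list also names every filter field."""
    fields = list(sample["fields"])
    for name in extract_filter_fields(sample["query"]):
        if name not in fields:
            fields.append(name)
    return {**sample, "fields": fields}


# Hypothetical samples; "get_all" is a placeholder call name.
samples = [
    {
        "question": "Which suppliers in the Hardware group are active?",
        "fields": ["supplier_name"],
        "query": "get_all('Supplier', filters={'supplier_group': 'Hardware', "
                 "'disabled': 0}, fields=['supplier_name'])",
    },
    {
        "question": "List all customer names.",
        "fields": ["customer_name"],
        "query": "get_all('Customer', fields=['customer_name'])",
    },
]

# Step 1: identify filtering-type samples with missing filter fields.
needs_fix = [s for s in samples if add_missing_filter_fields(s)["fields"] != s["fields"]]

# Steps 2-3: reformat every sample so filter fields sit alongside retrieval fields.
fixed = [add_missing_filter_fields(s) for s in samples]

print(len(needs_fix))      # 1
print(fixed[0]["fields"])  # ['supplier_name', 'supplier_group', 'disabled']
```

Parsing with `ast` rather than a regular expression avoids false matches on nested quotes or braces, but it only works if the stored queries are syntactically valid Python expressions.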
Labels
`bug`, `data-quality`, `field-mapping`, `query-logic`, `high-priority`