v0.4.1
Fixed
- Internal: routed all evaluations through the unified multi_evaluate function (the single-case path now uses the same aggregation pipeline).
- Prompting: moved part of the former user prompt into the system prompt and refined the system prompt's wording for clarity and consistency.