fix: QwQ and R1 finetune format #294
Merged
scosman merged 20 commits into Kiln-AI:main from leonardmq:leonard/fix-qwq-fine-tune-format (May 1, 2025)
Commits
a0334cc wip: annotate todos (leonardmq)
db6cf83 wip: fix qwq/r1 data strategy (leonardmq)
b35706b test: add cases and update existing tests (leonardmq)
096c6db fix: default values on finetune_model_provider pydantic model (leonardmq)
e9b06a9 fix: expose all data strategies in UI dropdown when (leonardmq)
26ee24a fix: dataset formatter duplicate final output message (leonardmq)
7c7c637 fix: validation of data strategy, and small refactor and validation t… (leonardmq)
7b14cc9 chore: remove obsolete todo (leonardmq)
2415c94 refactor: extract valid thinking data strategies into own constant (leonardmq)
be33978 fix: raise error in r1 serialization if none or empty thinking (leonardmq)
6a1f7d1 refactor: data formatter generators and fixes on COT vs R1 (leonardmq)
86c8512 test: add tests for data_strategies_from_finetune_id and qwen3 match (leonardmq)
2848879 chore: replace error message strings (leonardmq)
b25ea6c fix: formatter fix (newlines, and throw if R1 while vertex) (leonardmq)
fca9792 fix: UI to select strategy, clean labels, clean switch default option (leonardmq)
defa1ec ui: error and block submit when no thinking filter dataset for R1 model (leonardmq)
1070910 fix: disable R1 data strategy for vertex download, toolcall downloads (leonardmq)
f45eb69 chore: remove lingering console log (leonardmq)
ea59f62 fix: string change for warning, and not block submit (leonardmq)
fb123c8 fix: no thinking instructions on r1 model and update validation (leonardmq)
And Qwen 3 already requires a change 😀
We should add "qwen3", and qwen3 should return final_only and final_and_intermediate_r1_compatible (since it can do both thinking and non-thinking with the /think and /no_think directives).
Let's maybe split this into another PR?
I added a pattern match in `data_strategies_from_finetune_id` to target `qwen3` and only allow `['final_and_intermediate_r1_compatible', 'final_only']`, just to be able to test for the allowed strategies in a parameterized test, but I have not implemented any other logic for `qwen3` besides that.
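For readers following along, here is a minimal sketch of what such a pattern match could look like. It is not the code from this PR. The function name `allowed_strategies_sketch`, the substring patterns, and the branches for R1/QwQ-style and other models are assumptions made for illustration. Only the two strategy names and the `qwen3` result come from the discussion above.

```python
import re

# Strategy names as quoted in the discussion above.
FINAL_ONLY = "final_only"
R1_COMPATIBLE = "final_and_intermediate_r1_compatible"


def allowed_strategies_sketch(finetune_id: str) -> list[str]:
    """Hypothetical helper: map a fine-tune model ID to its allowed data strategies."""
    # qwen3 can run with or without thinking, so both strategies are allowed
    # (this is the behavior described in the comment above).
    if re.search(r"qwen3", finetune_id, re.IGNORECASE):
        return [R1_COMPATIBLE, FINAL_ONLY]
    # Assumption: always-thinking models (R1, QwQ) only get the R1-compatible format.
    if re.search(r"r1|qwq", finetune_id, re.IGNORECASE):
        return [R1_COMPATIBLE]
    # Assumption: placeholder default for every other model.
    return [FINAL_ONLY]


# Table-driven check, in the spirit of the parameterized test mentioned above.
cases = {
    "accounts/example/models/qwen3-8b": [R1_COMPATIBLE, FINAL_ONLY],
    "deepseek-r1-distill": [R1_COMPATIBLE],
    "llama-3-8b": [FINAL_ONLY],
}
for model_id, expected in cases.items():
    assert allowed_strategies_sketch(model_id) == expected
```

Checking `qwen3` before the broader R1/QwQ patterns keeps the most specific rule first, so a future ID that matches more than one pattern still resolves predictably.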