fix: QwQ and R1 finetune format #294

Merged: 20 commits, May 1, 2025
Commits
a0334cc
wip: annotate todos
leonardmq Apr 22, 2025
db6cf83
wip: fix qwq/r1 data strategy
leonardmq Apr 24, 2025
b35706b
test: add cases and update existing tests
leonardmq Apr 25, 2025
096c6db
fix: default values on finetune_model_provider pydantic model
leonardmq Apr 26, 2025
e9b06a9
fix: expose all data strategies in UI dropdown when
leonardmq Apr 26, 2025
26ee24a
fix: dataset formatter duplicate final output message
leonardmq Apr 26, 2025
7c7c637
fix: validation of data strategy, and small refactor and validation t…
leonardmq Apr 27, 2025
7b14cc9
chore: remove obsolete todo
leonardmq Apr 27, 2025
2415c94
refactor: extract valid thinking data strategies into own constant
leonardmq Apr 27, 2025
be33978
fix: raise error in r1 serialization if none or empty thinking
leonardmq Apr 27, 2025
6a1f7d1
refactor: data formatter generators and fixes on COT vs R1
leonardmq Apr 27, 2025
86c8512
test: add tests for data_strategies_from_finetune_id and qwen3 match
leonardmq Apr 30, 2025
2848879
chore: replace error message strings
leonardmq Apr 30, 2025
b25ea6c
fix: formatter fix (newlines, and throw if R1 while vertex)
leonardmq Apr 30, 2025
fca9792
fix: UI to select strategy, clean labels, clean switch default option
leonardmq Apr 30, 2025
defa1ec
ui: error and block submit when no thinking filter dataset for R1 model
leonardmq Apr 30, 2025
1070910
fix: disable R1 data strategy for vertex download, toolcall downloads
leonardmq Apr 30, 2025
f45eb69
chore: remove lingering console log
leonardmq May 1, 2025
ea59f62
fix: string change for warning, and not block submit
leonardmq May 1, 2025
fb123c8
fix: no thinking instructions on r1 model and update validation
leonardmq May 1, 2025
@@ -108,6 +108,11 @@
   }

   function build_available_model_select(models: FinetuneProvider[]) {
+    for (const model of models) {
+      for (const provider of model.models) {
+        console.log(model.id, provider.name, provider.id)

Collaborator: should remove

Collaborator (Author): This was already removed - possibly not reviewing the latest commits?

Collaborator: Was reviewing commit by commit since it's gotten large. Looks good now!

+      }
+    }
     available_model_select = []
     available_model_select.push([
       disabled_header,
@@ -487,41 +492,41 @@
     window.open(base_url + "/api/download_dataset_jsonl?" + query_string)
   }

-  const data_strategies_labels: Record<FinetuneDataStrategy, string> = {
-    final_only: "Standard - Learn only from final response",
-    final_and_intermediate:
-      "Reasoning - Learn intermediate thinking and final response",
-    final_and_intermediate_r1_compatible:
-      "Reasoning (R1 compatible) - Learn intermediate thinking and final response",
-  }
+  let data_strategy_select_options: [FinetuneDataStrategy, string][] = []

-  function get_data_strategies_supported(
+  function update_data_strategies_supported(
     base_model_id: string,
     is_download: boolean,
-  ): FinetuneDataStrategy[] {
-    if (is_download) {
-      return [
-        "final_and_intermediate",
-        "final_and_intermediate_r1_compatible",
-        "final_only",
-      ]
+  ) {
+    const data_strategies_labels: Record<FinetuneDataStrategy, string> = {
+      final_only: "Standard - Learn only from final response",
+      final_and_intermediate:
+        "Reasoning - Learn intermediate thinking and final response",
+      final_and_intermediate_r1_compatible: is_download
+        ? "Reasoning (R1 compatible) - Learn intermediate thinking and final response"
+        : "Reasoning - Learn intermediate thinking and final response",
     }
-    return (
-      available_models
-        ?.map((model) => model.models)
-        .flat()
-        .find((model) => model.id === base_model_id)
-        ?.data_strategies_supported ?? []
-    )
+
+    const compatible_data_strategies: FinetuneDataStrategy[] = is_download
+      ? [
+          "final_and_intermediate",
+          "final_and_intermediate_r1_compatible",
+          "final_only",
+        ]
+      : available_models
+          ?.map((model) => model.models)
+          .flat()
+          .find((model) => model.id === base_model_id)
+          ?.data_strategies_supported ?? []
+
+    data_strategy_select_options = compatible_data_strategies.map(
+      (strategy) => [strategy, data_strategies_labels[strategy]],
+    ) as [FinetuneDataStrategy, string][]
+
+    data_strategy = compatible_data_strategies[0]
   }

-  $: data_strategy_select_options = get_data_strategies_supported(
-    base_model_id,
-    is_download,
-  ).map((strategy) => [strategy, data_strategies_labels[strategy]]) as [
-    FinetuneDataStrategy,
-    string,
-  ][]
+  $: update_data_strategies_supported(base_model_id, is_download)
 </script>

 <div class="max-w-[1400px]">
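For readers following the second hunk, here is a minimal sketch of the same selection logic written as a pure function, which may be easier to unit test than the reactive statement. It is not part of the PR. The name `buildDataStrategyOptions`, the `StrategyOption` alias, and the simplified `FinetuneProvider` / `FinetuneProviderModel` shapes are assumptions for illustration; only the strategy identifiers and label strings are taken from the diff.

```typescript
// Hypothetical, simplified types. The real ones come from the project's API schema.
type FinetuneDataStrategy =
  | "final_only"
  | "final_and_intermediate"
  | "final_and_intermediate_r1_compatible"

interface FinetuneProviderModel {
  id: string
  name: string
  data_strategies_supported?: FinetuneDataStrategy[]
}

interface FinetuneProvider {
  id: string
  models: FinetuneProviderModel[]
}

type StrategyOption = [FinetuneDataStrategy, string]

const REASONING_LABEL =
  "Reasoning - Learn intermediate thinking and final response"

// Pure variant (an assumption, not the PR's code): returns the dropdown
// options and the default strategy instead of assigning component state.
function buildDataStrategyOptions(
  availableModels: FinetuneProvider[] | null,
  baseModelId: string,
  isDownload: boolean,
): { options: StrategyOption[]; defaultStrategy: FinetuneDataStrategy | undefined } {
  // The R1 label is only distinguished when downloading, as in the diff.
  const labels: Record<FinetuneDataStrategy, string> = {
    final_only: "Standard - Learn only from final response",
    final_and_intermediate: REASONING_LABEL,
    final_and_intermediate_r1_compatible: isDownload
      ? "Reasoning (R1 compatible) - Learn intermediate thinking and final response"
      : REASONING_LABEL,
  }

  let strategies: FinetuneDataStrategy[] = []
  if (isDownload) {
    // Downloads expose every strategy, in the order used by the diff.
    strategies = [
      "final_and_intermediate",
      "final_and_intermediate_r1_compatible",
      "final_only",
    ]
  } else {
    // Otherwise use whatever the first model with a matching id declares.
    outer: for (const provider of availableModels ?? []) {
      for (const model of provider.models) {
        if (model.id === baseModelId) {
          strategies = model.data_strategies_supported ?? []
          break outer
        }
      }
    }
  }

  return {
    options: strategies.map((s): StrategyOption => [s, labels[s]]),
    defaultStrategy: strategies[0],
  }
}
```

One thing this makes visible: when no model matches and `is_download` is false, the strategy list is empty, so the default is `undefined`. The merged code assigns `compatible_data_strategies[0]` to `data_strategy` in the same situation, so callers may want to handle that case.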