Is GRPO as described in the R1 paper actually using sequence likelihood, as in GSPO? #725

@hannibalhuang

Original question from Grad on X. Would be great to clarify :)
