Negative reward when serving ArmoRM-Llama3-8B-v0.1 #23

@maoliyuan

Description

Hello! When I serve ArmoRM-Llama3-8B-v0.1 using OpenRLHF, the output rewards are almost all negative (around -2.0). I've attached some screenshots of how I serve the reward model. Is the output of this RM naturally around -2.0, or is the way I serve the RM wrong? (The prompt datasets are also from RLHFlow, e.g. "RLHFlow/iterative-prompt-v1-iter7-20K", and the responses are generated from "RLHFlow/LLaMA3-iterative-DPO-final". We also apply the chat template when creating the prompt-response dataset.)

[Screenshot: serve-armo-reward-model]
[Screenshot: serve-armo-reward-model1]
[Screenshot: serve-armo-reward-model2]
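
For context, here is a minimal sketch of how a single prompt-response pair could be scored directly with transformers, assuming the model's custom head exposes an aggregated `output.score` as described on the RLHFlow/ArmoRM-Llama3-8B-v0.1 model card. The messages below are placeholders, not our actual data; our actual serving setup is in the screenshots above.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = "cuda"
path = "RLHFlow/ArmoRM-Llama3-8B-v0.1"

# Load the multi-objective reward model (custom code on the Hub, hence trust_remote_code)
model = AutoModelForSequenceClassification.from_pretrained(
    path, device_map=device, trust_remote_code=True, torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(path, use_fast=True)

# One prompt-response pair (placeholder strings for illustration)
messages = [
    {"role": "user", "content": "What are some synonyms for the word 'beautiful'?"},
    {"role": "assistant", "content": "Gorgeous, stunning, lovely, elegant, pretty."},
]

# Apply the chat template, as we do when building the prompt-response dataset
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device)

with torch.no_grad():
    output = model(input_ids)
    # Aggregated preference score (assumption: exposed as `output.score`
    # by the model's custom head, per the model card)
    preference_score = output.score.cpu().float()

print(preference_score)
```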
