[Rust] GrpoRewards: Implement Policy Evaluation Functions #17

@kyjohnso

Description

@nikg4 @taenin What other functions should we put into the GrpoRewards struct? Other reward functions? Here is a first cut from Sonnet:

  1. Policy Evaluation Functions:
    • evaluate_policy(policy, environment) - Measure performance of a given policy
    • compute_advantage(values, rewards) - Calculate advantage estimates
    • compute_returns(rewards, gamma) - Calculate cumulative returns

Implement the evaluate_policy, compute_advantage, and compute_returns functions in Rust.
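To make the discussion concrete, here is a rough sketch of what the three functions could look like. It is only a starting point under several assumptions that are not in this issue: the `Environment` and `Policy` traits are hypothetical, `evaluate_policy` takes extra `episodes` and `max_steps` parameters, `compute_advantage` is the simple "return minus value baseline" form (not necessarily the group-relative estimate GRPO would use), and all three are written as associated functions on an empty `GrpoRewards` struct since its fields are not specified here.

```rust
/// Placeholder for the struct named in the issue; its real fields are unknown.
pub struct GrpoRewards;

/// Hypothetical environment interface (an assumption, not from the issue).
pub trait Environment {
    type State;
    type Action;
    fn reset(&mut self) -> Self::State;
    /// Returns (next_state, reward, done).
    fn step(&mut self, action: Self::Action) -> (Self::State, f64, bool);
}

/// Hypothetical policy interface (an assumption, not from the issue).
pub trait Policy<E: Environment> {
    fn act(&self, state: &E::State) -> E::Action;
}

impl GrpoRewards {
    /// Discounted returns G_t = r_t + gamma * G_{t+1}, filled from the last step backwards.
    pub fn compute_returns(rewards: &[f64], gamma: f64) -> Vec<f64> {
        let mut returns = vec![0.0; rewards.len()];
        let mut running = 0.0;
        for (i, &r) in rewards.iter().enumerate().rev() {
            running = r + gamma * running;
            returns[i] = running;
        }
        returns
    }

    /// Assumed form: A_t = rewards[t] - values[t], where the caller passes
    /// per-step returns (e.g. from `compute_returns`) as `rewards`.
    pub fn compute_advantage(values: &[f64], rewards: &[f64]) -> Result<Vec<f64>, String> {
        if values.len() != rewards.len() {
            return Err(format!(
                "length mismatch: {} values vs {} rewards",
                values.len(),
                rewards.len()
            ));
        }
        Ok(rewards.iter().zip(values).map(|(r, v)| r - v).collect())
    }

    /// Mean undiscounted episode reward over `episodes` rollouts,
    /// each capped at `max_steps` steps. Returns 0.0 when `episodes` is 0.
    pub fn evaluate_policy<E, P>(policy: &P, env: &mut E, episodes: usize, max_steps: usize) -> f64
    where
        E: Environment,
        P: Policy<E>,
    {
        if episodes == 0 {
            return 0.0;
        }
        let mut total = 0.0;
        for _ in 0..episodes {
            let mut state = env.reset();
            for _ in 0..max_steps {
                let (next, reward, done) = env.step(policy.act(&state));
                total += reward;
                state = next;
                if done {
                    break;
                }
            }
        }
        total / episodes as f64
    }
}
```

With this shape, a caller would chain `compute_returns` into `compute_advantage`. Whether `compute_advantage` should instead normalize rewards within a group of samples, as GRPO typically does, is an open design question for this issue.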

Originally posted by @kyjohnso in #13

Metadata

Labels

enhancement: New feature or request
