[Rust] GrpoRewards: Implement Policy Evaluation Functions #17

> @nikg4 @taenin what other functions should we put into the `GrpoRewards` struct? Other reward functions? Here is a first cut from Sonnet:
> 
> 
> 2. **Policy Evaluation Functions**:
>    - `evaluate_policy(policy, environment)` - Measure performance of a given policy
>    - `compute_advantage(values, rewards)` - Calculate advantage estimates
>    - `compute_returns(rewards, gamma)` - Calculate cumulative returns
> 

Implement the `evaluate_policy`, `compute_advantage`, and `compute_returns` functions in Rust.

 _Originally posted by @kyjohnso in [#13](https://github.com/oumi-ai/roumi/issues/13#issuecomment-2745276290)_
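
As a starting point for discussion, here is a minimal sketch of the three functions. Everything beyond the three function names is an assumption: the `Environment` and `Policy` traits are hypothetical stand-ins (the issue does not define them), `evaluate_policy` takes extra `episodes` and `max_steps` parameters that are not in the proposed signature, `compute_returns` is read as discounted reward-to-go, and `compute_advantage` is read as a plain baseline subtraction (return minus value estimate) rather than GRPO's group-normalized advantage. They are written as free functions here, since the layout of `GrpoRewards` is not shown.

```rust
/// Hypothetical environment interface (not defined in the issue).
pub trait Environment {
    type State;
    type Action;
    /// Start a new episode and return the initial state.
    fn reset(&mut self) -> Self::State;
    /// Apply an action; returns (next_state, reward, done).
    fn step(&mut self, action: &Self::Action) -> (Self::State, f64, bool);
}

/// Hypothetical policy interface (not defined in the issue).
pub trait Policy<E: Environment> {
    fn act(&self, state: &E::State) -> E::Action;
}

/// Discounted reward-to-go: G_t = r_t + gamma * G_{t+1}, with G past the end = 0.
pub fn compute_returns(rewards: &[f64], gamma: f64) -> Vec<f64> {
    let mut returns = vec![0.0; rewards.len()];
    let mut running = 0.0;
    // Walk backwards so each step reuses the return of the step after it.
    for (i, &r) in rewards.iter().enumerate().rev() {
        running = r + gamma * running;
        returns[i] = running;
    }
    returns
}

/// Assumed advantage estimate: A_t = R_t - V_t (return minus value baseline).
/// Returns `None` when the two slices differ in length.
pub fn compute_advantage(values: &[f64], rewards: &[f64]) -> Option<Vec<f64>> {
    if values.len() != rewards.len() {
        return None;
    }
    Some(rewards.iter().zip(values).map(|(r, v)| r - v).collect())
}

/// Mean undiscounted episode reward over `episodes` rollouts,
/// each capped at `max_steps` steps. Returns 0.0 when `episodes` is 0.
pub fn evaluate_policy<E, P>(policy: &P, env: &mut E, episodes: usize, max_steps: usize) -> f64
where
    E: Environment,
    P: Policy<E>,
{
    if episodes == 0 {
        return 0.0;
    }
    let mut total = 0.0;
    for _ in 0..episodes {
        let mut state = env.reset();
        for _ in 0..max_steps {
            let action = policy.act(&state);
            let (next_state, reward, done) = env.step(&action);
            total += reward;
            state = next_state;
            if done {
                break;
            }
        }
    }
    total / episodes as f64
}
```

In this sketch, the output of `compute_returns(rewards, gamma)` would be passed as the `rewards` argument of `compute_advantage`. If group-relative advantages are wanted instead, the second function would need a different signature, so that choice is worth settling before implementation.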
