4 repositories

CheemsRM
Public
ACL'25: Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch
reinforcement-learning large-language-model reward-model
Python • Apache License 2.0
0 • 10 • 0 • 0 • Updated Jun 10, 2025
Critic-CoT
Public
Repo of the ACL'25 Findings paper "Critic-CoT: Boosting the Reasoning Abilities of Large Language Model via Chain-of-Thought Critic"
Python • MIT License
0 • 5 • 0 • 0 • Updated May 27, 2025
RLFH
Public
ACL'25 Findings: On-Policy Fine-grained Knowledge Feedback for Hallucination Mitigation
reinforcement-learning hallucination large-language-model
Python • Apache License 2.0
0 • 8 • 0 • 0 • Updated May 26, 2025
Generalizable-MM-RM
Public
ICML'25: The Devil Is in the Details: Tackling Unimodal Spurious Correlations for Generalizable Multimodal Reward Models
Python
0 • 10 • 1 • 0 • Updated May 19, 2025