R1 Zero GRPO Resources

Tiu Mo

A curated collection of implementations, training frameworks, and documentation related to R1 Zero and GRPO (Group Relative Policy Optimization).
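For orientation, the "group relative" part of GRPO can be sketched in a few lines: several completions are sampled for the same prompt, and each completion's reward is normalized against the mean and standard deviation of its own group. The snippet below is a minimal illustration only, not code from any of the listed repositories. The function name, the `eps` constant, and the choice of population standard deviation are assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper for illustration; `eps` avoids division by zero
    when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four completions of one prompt (1.0 = correct answer).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's average get a positive advantage and those below get a negative one, so no separate value model is needed to provide a baseline.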

Official Implementations

Open R1 - Hugging Face's open reproduction of DeepSeek-R1
X-R1 - Low-cost reinforcement learning training framework
R1-Onevision - Vision-language model implementation
Open R1 Multimodal - Multimodal implementation

Training Tools & Frameworks

LLaMA Factory - Training framework with quickstart guide
EasyR1 - Simplified R1 implementation
VERL - Volcengine's reinforcement learning library for LLMs
VLM-R1 - Vision-Language Model implementation

T2I-R1 (https://github.com/CaraJ7/T2I-R1) - Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Documentation

Swift GRPO Documentation - GRPO documentation from the SWIFT (ms-swift) training framework

Additional Resources

Awesome LLM Resources - Comprehensive collection of LLM resources

https://github.com/qiwang067/awesome-visual-rl

easy-rl (https://github.com/datawhalechina/easy-rl?tab=readme-ov-file) - Reinforcement learning tutorial

RL-Factory (https://github.com/Simple-Efficient/RL-Factory) - Training framework

Cosmos-Reason1 (https://github.com/nvidia-cosmos/cosmos-reason1) - Physical-rule reasoning model