R1 Zero GRPO Resources
A curated collection of resources on R1 Zero and GRPO (Group Relative Policy Optimization): implementations, training frameworks, and research.
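As a quick orientation for the resources below, here is a minimal sketch of the group-relative idea behind GRPO: each completion sampled for a prompt is scored against the other completions in the same group, which removes the need for a separate value model. The function name `group_relative_advantages`, the `eps` parameter, and the use of the population standard deviation are assumptions made for illustration; the listed frameworks may normalize differently.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper for illustration only; not taken from any listed repo.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group (an assumption)
    # eps keeps the division finite when every reward in the group is equal
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, scored by a rule-based 0/1 reward
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that beat the group mean get a positive advantage and the rest get a negative one. If all rewards in a group are equal, every advantage is zero and that group contributes no learning signal.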
Official Implementations
- Open R1 - Hugging Face's open reproduction of DeepSeek-R1
- X-R1 - Low-cost reinforcement learning training framework
- R1-Onevision - Vision-language model implementation
- Open R1 Multimodal - Multimodal implementation
Training Tools & Frameworks
- LLaMA Factory - Training framework with quickstart guide
- EasyR1 - Simplified R1 implementation
- VERL - Volcengine's implementation
- VLM-R1 - Vision-Language Model implementation
- T2I-R1 - Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT (https://github.com/CaraJ7/T2I-R1)
Thinking with Images
- DeepEyes: Incentivizing "Thinking with Images" via Reinforcement Learning
- Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning
- VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection
- Thinking with Generated Images
- SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning
- Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
- GRIT: Teaching MLLMs to Think with Images
- Chain-of-Focus: Adaptive Visual Search and Zooming for Multimodal Reasoning via RL
- UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning
- OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning
- Perception-R1: Pioneering Perception Policy with Reinforcement Learning
Documentation
- Swift GRPO Documentation - Official GRPO documentation
Additional Resources
- Awesome LLM Resources - Comprehensive collection of LLM resources
- awesome-visual-rl (https://github.com/qiwang067/awesome-visual-rl)
- easy-rl - Reinforcement learning tutorial (https://github.com/datawhalechina/easy-rl?tab=readme-ov-file)
Training Frameworks
- RL-Factory (https://github.com/Simple-Efficient/RL-Factory)
- Cosmos-Reason1 - Physical reasoning model (https://github.com/nvidia-cosmos/cosmos-reason1)