A collection of robotics research papers that demonstrate reliability and robustness in the real world.
Inclusion criteria:
- Papers must include real-world results
Common themes include:
- Reward learning from human feedback and interventions
- Value and progress estimation
Contributions are welcome!
Name | Date | Real World Success Rate | Project | Paper | Code | Organization(s) | Notes |
---|---|---|---|---|---|---|---|
WSRL: Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data | 07/2025 | 100% success rate on a Franka peg insertion task within 18 minutes; SERL fails (0/20) even with 50 minutes. | Link | Link | Link | UC Berkeley | |
Dyna Robotics (Unknown Model) | 07/2025 | 99.9% success rate in folding towels for 8 hours/day over 3 days (dropped 1 towel on day 2). No intervention. | Link | | | Dyna Robotics | |
Figure (Helix) | 06/2025 | ~95% accuracy at correctly orienting barcodes; 4.05 seconds per package. | Link | | | Figure | Adds memory for more robust, long-term tasks and force feedback for improved grip. |
RSS 2025 Workshop: Human-in-the-Loop Robot Learning: Teaching, Correcting, and Adapting | 06/2025 | Various results | Link | | | Various universities | |
Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections | 06/2025 | Book-flipping success rate of 100% (a 60% improvement) and belt assembly success rate of 70% (a 50% improvement). | Link | Link | | Stanford | |
Dyna Robotics DYNA-1 Model | 04/2025 | 99.4% success rate in folding napkins over 24 hours. No intervention. | Link | | | Dyna Robotics | |
ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy | 02/2025 | 96.3% average success rate across tasks, compared to 31.9% with HIL-SERL. | Link | Link | | Chinese Academy of Sciences | Online and offline fine-tuning. |
HIL-SERL: Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning | 10/2024 | 100% success rate on a variety of tasks. | Link | Link | Link | UC Berkeley | Online fine-tuning with human intervention allowed; implementation available in LeRobot (see the sketch below this table). |
RLIF: Interactive Imitation Learning as Reinforcement Learning | 03/2024 | 95% success rate in cloth unfolding within 7 rounds; 100% success rate in peg insertion within 6 rounds. | Link | Link | Link | UC Berkeley | |
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning | 01/2024 | 100% success on PCB insertion, cable routing, and object relocation. | Link | Link | Link | UC Berkeley | |
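Several entries above (SERL, HIL-SERL, RLIF, Compliant Residual DAgger, WSRL) share a human-in-the-loop pattern: the learned policy acts autonomously, a human operator can take over at any step, and both autonomous and intervention transitions are logged for the next policy update. The following is a minimal, hypothetical sketch of that loop, not code from any of these papers; `get_human_action`, `policy`, and the Gymnasium-style `env` are illustrative stand-ins.

```python
# Minimal sketch (not from any specific paper above) of a human-in-the-loop
# rollout: the policy acts, an operator may intervene at any step, and every
# transition is stored with an "intervened" flag for the next update.

class ReplayBuffer:
    def __init__(self):
        self.transitions = []

    def add(self, obs, action, reward, next_obs, done, intervened):
        self.transitions.append((obs, action, reward, next_obs, done, intervened))


def get_human_action(obs):
    """Return a teleoperation action if the operator is intervening, else None."""
    return None  # placeholder: wire up a spacemouse / gamepad here


def rollout_with_interventions(env, policy, buffer, max_steps=300):
    obs, _ = env.reset()
    for _ in range(max_steps):
        human_action = get_human_action(obs)
        intervened = human_action is not None
        action = human_action if intervened else policy.act(obs)

        next_obs, reward, terminated, truncated, _ = env.step(action)
        buffer.add(obs, action, reward, next_obs, terminated or truncated, intervened)

        obs = next_obs
        if terminated or truncated:
            break
```

How the logged interventions are used differs by method: DAgger-style approaches imitate the human actions directly, while RLIF derives a reward signal from the fact that the human chose to intervene.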
Name | Date | Real World Success Rate | Project | Paper | Code | Organization(s) | Notes |
---|---|---|---|---|---|---|---|
ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations | 05/2025 | 50% - 100% success rate on new tasks, ~5x improvement over baseline. | Link | Link | | USC, Amazon, KAIST | Focused on new tasks. |
Opening Articulated Structures in the Real World | 05/2025 | 61% success rate across 13 real-world homes and offices on previously unseen cabinets and drawers | Link | Link | Link | UIUC | General mobile manipulation system with no environment-specific tuning. |
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation | 03/2025 | 87.5% - 100% success rate on unseen tasks; 56% improvement in success rates across tasks over baselines (ACT, VLA, Octo). | Link | Link | Link | Tsinghua | Focused on new tasks. Human-level inference/robot speed. |
GVL: Vision Language Models are In-Context Value Learners | 11/2024 | 15% - 90% success rate; 0.46 average improvement in VOC (on a -1.0 to 1.0 scale) over DP (diffusion policy). | Link | Link | | DeepMind, UPenn, Stanford | Focused on new tasks and value estimation using a VLM (see the sketch below this table). |
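The entries in this table estimate task progress or value directly from observations rather than from hand-specified rewards. GVL, for example, asks a VLM to score shuffled episode frames by task completion and evaluates the predictions with Value-Order Correlation (VOC), a rank correlation (on a -1 to 1 scale) between predicted values and the true chronological order. Below is a hedged sketch of that scoring; `predict_progress` is a placeholder for the VLM call, not GVL's actual interface.

```python
# Hedged sketch of VOC-style scoring: rank correlation between predicted
# per-frame progress and the true chronological order of an episode.
import random
from scipy.stats import spearmanr


def predict_progress(frames):
    """Placeholder for a VLM that scores each frame's task completion in [0, 1]."""
    return [random.random() for _ in frames]


def value_order_correlation(frames):
    """Return a correlation in [-1, 1]; higher means predictions recover the true order."""
    shuffled = list(range(len(frames)))
    random.shuffle(shuffled)

    # Predict on shuffled frames, then restore chronological order before correlating.
    preds = predict_progress([frames[i] for i in shuffled])
    preds_in_time_order = [p for _, p in sorted(zip(shuffled, preds))]

    rho, _ = spearmanr(preds_in_time_order, range(len(frames)))
    return rho
```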