BaiShuanghao/Awesome-Robotics-Manipulation

Awesome-Robotics-Manipulation

✨ About

This repo contains a curated list of papers on robot manipulation.

This repository will be continuously updated, and we warmly welcome contributions from the community. If you have papers, projects, or resources that are not yet included, please submit them via a pull request, open an issue for discussion, or email me.

📝 Awesome Papers

📄 Survey

Title Venue Date Code Notes
Vision-Language-Action Models: Concepts, Progress, Applications and Challenges arXiv 2025-05-07 - VLA Models
A Survey of Robotic Navigation and Manipulation with Physics Simulators in the Era of Embodied AI arXiv 2025-05-01 - Navigation and Manipulation
Diffusion Models for Robotic Manipulation: A Survey arXiv 2025-04-11 - DP for Manipulation
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision arXiv 2025-04-03 Star Github Robot Vision
Generative Artificial Intelligence in Robotic Manipulation: A Survey arXiv 2025-03-05 Star Github Manipulation
A Survey of Embodied Learning for Object-Centric Robotic Manipulation arXiv 2024-08-21 Star Github Manipulation
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI arXiv 2024-07-09 Star Github Embodied Agent
A Survey on Vision-Language-Action Models for Embodied AI arXiv 2024-05-23 - VLA Models
Survey of Learning-based Approaches for Robotic In-Hand Manipulation arXiv 2024-01-15 - In-hand Manipulation
Language-conditioned Learning for Robotic Manipulation: A Survey arXiv 2023-12-17 Star Github Manipulation
Deep Learning Approaches to Grasp Synthesis: A Review T-RO 2023 2023-07-06 Project Grasp

(back to top)

🦾 Grasp

Rectangle-based Grasp

Title Venue Date Code
RoboGrasp: A Universal Grasping Policy for Robust Robotic Control arXiv 2025-02-05 -
HMT-Grasp: A Hybrid Mamba-Transformer Approach for Robot Grasping in Cluttered Environments arXiv 2024-10-04 -
LLGD: Lightweight Language-driven Grasp Detection using Conditional Consistency Model IROS 2024 2024-07-25 Star Github
grasp_det_seg_cnn: End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and Semantic Segmentation from RGB ICRA 2021 2021-07-12 Star Github
GR-ConvNet: Antipodal Robotic Grasping using Generative Residual Convolutional Neural Network IROS 2020 2019-09-11 Star Github
Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach RSS 2018 2018-04-14 Star Github
Robotic Grasp Detection using Deep Convolutional Neural Networks IROS 2017 2016-11-24 Star Github

(back to top)

6-DoF Grasp

Title Venue Date Code
GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation arXiv 2025-05-16 Star Github
Exploiting Radiance Fields for Grasp Generation on Novel Synthetic Views RSSW 2025 2025-05-16 Star Github
Grasp the Graph (GtG) 2.0: Ensemble of GNNs for High-Precision Grasp Pose Detection in Clutter arXiv 2025-05-05 Star Github
PCF-Grasp: Converting Point Completion to Geometry Feature to Enhance 6-DoF Grasp arXiv 2025-04-22 Star Github
Real-to-Sim Grasp: Rethinking the Gap between Simulation and Real World in Grasp Detection CoRL 2024 2024-10-09 Project
OrbitGrasp: SE(3)-Equivariant Grasp Learning CoRL 2024 2024-07-03 Project
EquiGraspFlow: SE(3)-Equivariant 6-DoF Grasp Pose Generative Flows CoRL 2024 2024-09-06 Star Github
EconomicGrasp: An Economic Framework for 6-DoF Grasp Detection ECCV 2024 2024-07-11 Star Github
Generalizing 6-DoF Grasp Detection via Domain Prior Knowledge CVPR 2024 2024-04-02 Star Github
FlexLoG: Rethinking 6-Dof Grasp Detection: A Flexible Framework for High-Quality Grasping arXiv 2024-03-22 -
HGGD: Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes RA-L 2023 2024-03-27 Star Github
AnyGrasp: Robust and Efficient Grasp Perception in Spatial and Temporal Domains T-RO 2023 2022-12-16 Star Github
Contact-GraspNet: Efficient 6-DoF Grasp Generation in Cluttered Scenes ICRA 2021 2021-03-25 Star Github
GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping CVPR 2020 2020-08-05 Star Github
6-DOF GraspNet: Variational Grasp Generation for Object Manipulation ICCV 2019 2019-05-25 Star Github

(back to top)

Grasp with 3D Techniques

Title Venue Date Code
ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping CVPR 2025 2025-04-15 Project
SDF
IGD: Implicit Grasp Diffusion: Bridging the Gap between Dense Prediction and Sampling-based Grasping CoRL 2024 2024-09-05 Star Github
NeuGraspNet: Learning Any-View 6DoF Robotic Grasping in Cluttered Scenes via Neural Surface Rendering RSS 2024 2023-06-12 -
NeRF
LERF-TOGO: Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping CoRL 2023 2023-09-14 Star Github
GraspNeRF: Multiview-based 6-DoF Grasp Detection for Transparent and Specular Objects Using Generalizable NeRF ICRA 2023 2022-10-12 Star Github
3D Gaussian Splatting (3DGS)
SparseGrasp: Robotic Grasping via 3D Semantic Gaussian Splatting from Sparse Multi-View RGB Images arXiv 2024-12-03 -
GraspSplats: Efficient Manipulation with 3D Feature Splatting CoRL 2024 2024-09-03 Star Github
GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping RA-L 2024 2024-03-14 Star Github

(back to top)

Language-Driven Grasp

Title Venue Date Code
GraspCorrect: Robotic Grasp Correction via Vision-Language Model-Guided Feedback arXiv 2025-03-19 -
Free-form language-based robotic reasoning and grasping arXiv 2025-03-17 Project
AffordGrasp: In-Context Affordance Reasoning for Open-Vocabulary Task-Oriented Grasping in Clutter arXiv 2025-03-02 Project
RoboReflect: Robotic Reflective Reasoning for Grasping Ambiguous-Condition Objects arXiv 2025-01-16 -
Attribute-Based Robotic Grasping with Data-Efficient Adaptation T-RO 2024 2024-12-12 Project
RTAGrasp: Learning Task-Oriented Grasping from Human Videos via Retrieval, Transfer, and Alignment ICRA 2025 2024-09-24 Project
LGrasp6D: Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance ECCV 2024 2024-07-18 Star Github
Reasoning Grasping: Reasoning Grasping via Multimodal Large Language Model CoRL 2024 2024-02-09 Project
ThinkGrasp: A Vision-Language System for Strategic Part Grasping in Clutter CoRL 2024 2024-07-16 Star Github
OWG: Towards Open-World Grasping with Large Vision-Language Models CoRL 2024 2024-06-26 Project
RT-Grasp: Reasoning Tuning Robotic Grasping via Multi-modal Large Language Model IROS 2024 2024-11-07 Project
VL-Grasp: a 6-Dof Interactive Grasp Policy for Language-Oriented Objects in Cluttered Indoor Scenes IROS 2023 2023-08-01 Star Github
GraspGPT: Leveraging Semantic Knowledge from a Large Language Model for Task-Oriented Grasping RA-L 2023 2023-07-25 Star Github
A Joint Modeling of Vision-Language-Action for Target-oriented Grasping in Clutter ICRA 2023 2023-02-24 Star Github

(back to top)

Grasp for Transparent Objects

Title Venue Date Code
FuseGrasp: Radar-Camera Fusion for Robotic Grasping of Transparent Objects arXiv 2025-02-27 -
TranSplat: Surface Embedding-guided 3D Gaussian Splatting for Transparent Object Manipulation arXiv 2025-02-11 -
T2SQNet: A Recognition Model for Manipulating Partially Observed Transparent Tableware Objects CoRL 2024 2024-09-06 Star Github
ASGrasp: Generalizable Transparent Object Reconstruction and Grasping from RGB-D Active Stereo Camera ICRA 2024 2024-05-09 Star Github
Dex-NeRF: Using a Neural Radiance Field to Grasp Transparent Objects CoRL 2021 2021-10-27 Star Github

(back to top)

Dexterous Grasp

Title Venue Date Code
Grasp What You Want: Embodied Dexterous Grasping System Driven by Your Voice arXiv 2024-12-14 Project
UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping arXiv 2024-12-03 Star Github

(back to top)

Bimanual Grasp

Title Venue Date Code
COMBO-Grasp: Learning Constraint-Based Manipulation for Bimanual Occluded Grasping arXiv 2025-02-12 Project

(back to top)

🤖 Manipulation

Representation Learning with Auxiliary Tasks

Title Venue Date Code
Contrastive Learning (Alignment)
Σ-agent: Contrastive Imitation Learning for Language-guided Multi-Task Robotic Manipulation CoRL 2024 2024-06-14 Project
Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers RSS 2024 2024-03-19 Project
R3M: A Universal Visual Representation for Robot Manipulation CoRL 2022 2022-03-23 Star Github
HULC: What Matters in Language Conditioned Robotic Imitation Learning over Unstructured Data RA-L 2022 2022-04-13 Star Github
BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning CoRL 2021 2022-02-04 Star Github
Masked Reconstruction
STP: Spatiotemporal Predictive Pre-training for Robotic Motor Control arXiv 2024-03-08 -
MUTEX: Learning Unified Policies from Multimodal Task Specifications CoRL 2023 2023-09-25 Star Github
Robot Learning with Sensorimotor Pre-training CoRL 2023 2023-06-16 Project
Voltron: Language-Driven Representation Learning for Robotics RSS 2023 2023-02-24 Star Github
MVP: Real-World Robot Learning with Masked Visual Pre-training CoRL 2022 2022-10-06 Star Github
Text Goal Generation
RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning ICRA 2025 2024-09-23 Star Github
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought NeurIPS 2023 2023-05-24 Star Github
COTPC: Chain-of-Thought Predictive Control ICML 2024 2023-04-03 Star Github
Visual Goal Generation
VIRT: Vision Instructed Transformer for Robotic Manipulation arXiv 2024-10-09 Star Github
KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance CoRL 2024 2024-08-06 Star Github
GENIMA: Generative Image as Action Models CoRL 2024 2024-07-10 Star Github
ATM: Any-point Trajectory Modeling for Policy Learning RSS 2024 2023-12-28 Star Github
MPI: Learning Manipulation by Predicting Interaction RSS 2024 2024-06-01 Star Github
OCI: Object-Centric Instruction Augmentation for Robotic Manipulation ICRA 2024 2024-01-05 Project
HOPMan: Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans ICRA 2024 2023-12-01 Project
CALAMARI: Contact-Aware and Language conditioned spatial Action MApping for contact-RIch manipulation CoRL 2023 2023-08-30 Project
Image / Video Prediction
GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation ICLR 2025 2025-02-13 -
Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation ICLR 2025 2024-12-19 Star Github
Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations arXiv 2024-12-19 Project
GHIL-Glue: Hierarchical Control with Filtered Subgoal Images arXiv 2024-10-26 Project
FoAM: Foresight-Augmented Multi-Task Imitation Policy for Robotic Manipulation arXiv 2024-09-29 Project
VideoAgent: Self-Improving Video Generation arXiv 2024-10-14 Star Github
GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy RA-L 2025 2024-08-26 Star Github
GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation arXiv 2024-10-08 Project
VLMPC: Vision-Language Model Predictive Control for Robotic Manipulation RSS 2024 2024-07-13 Star Github
GR-1: Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation ICLR 2024 2023-12-20 Star Github
SuSIE: Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models ICLR 2024 2023-10-16 Star Github
VLP: Video Language Planning ICLR 2024 2023-10-16 Github

(back to top)

Visual Representation Learning

Title Venue Date Code
SEM: Enhancing Spatial Understanding for Robust Robot Manipulation arXiv 2025-05-22 -
H2R: A Human-to-Robot Data Augmentation for Robot Pre-training from Videos arXiv 2025-05-17 -
L2D2: Robot Learning from 2D Drawings arXiv 2025-05-17 Project
Exploring Pose-Guided Imitation Learning for Robotic Precise Insertion arXiv 2025-05-14 Star Github
Augmented Reality for RObots (ARRO): Pointing Visuomotor Policies Towards Visual Robustness arXiv 2025-05-13 Project
RoboGround: Robotic Manipulation with Grounded Vision-Language Priors CVPR 2025 2025-04-30 Star Github
CIVIL: Causal and Intuitive Visual Imitation Learning arXiv 2025-04-22 Project
SPECI: Skill Prompts based Hierarchical Continual Imitation Learning for Robot Manipulation arXiv 2025-04-22 -
Bi-LAT: Bilateral Control-Based Imitation Learning via Natural Language and Action Chunking with Transformers arXiv 2025-04-02 Project
X-IL: Exploring the Design Space of Imitation Learning Policies arXiv 2025-02-17 Star Github
Imit Diff: Semantics Guided Diffusion Transformer with Dual Resolution Fusion for Imitation Learning arXiv 2025-02-11 -
Rethinking Latent Redundancy in Behavior Cloning: An Information Bottleneck Approach for Robot Manipulation ICML 2025 2025-02-05 Star Github
MCR: Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets ICLR 2025 2024-10-29 Star Github
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation ICLR 2025 2024-10-10 Star Github
CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation NeurIPS 2024 2024-09-13 Star Github
Theia: Distilling Diverse Vision Foundation Models for Robot Learning CoRL 2024 2024-07-29 Star Github
BAKU: An Efficient Transformer for Multi-Task Policy Learning NeurIPS 2024 2024-06-11 Star Github
MPI: Learning Manipulation by Predicting Interaction RSS 2024 2024-06-01 Star Github
VC-1: Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence? NeurIPS 2023 2023-03-31 Star Github
MVP: Real-World Robot Learning with Masked Visual Pre-training CoRL 2022 2022-10-06 Star Github
LIV: Language-Image Representations and Rewards for Robotic Control ICML 2023 2023-06-01 Star Github
VIMA: General Robot Manipulation with Multimodal Prompts ICML 2023 2022-10-06 Star Github
ACT: Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware RSS 2023 2023-04-23 Star Github
Voltron: Language-Driven Representation Learning for Robotics RSS 2023 2023-02-24 Star Github
VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training ICLR 2023 2022-08-30 Star Github
R3M: A Universal Visual Representation for Robot Manipulation CoRL 2022 2022-03-23 Star Github
ZeST: Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation? L4DC 2022 2022-04-23 Project

(back to top)

Multimodal Representation Learning

Title Venue Date Code
MS-Bot: Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation CoRL 2024 2024-08-02 Star Github
MUTEX: Learning Unified Policies from Multimodal Task Specifications CoRL 2023 2023-09-25 Star Github

(back to top)

Latent Action Learning

Title Venue Date Code
CoMo: Learning Continuous Latent Motion from Internet Videos for Scalable Robot Learning arXiv 2025-05-22 Star Github
FLARE: Robot Learning with Implicit World Modeling arXiv 2025-05-21 Project
DreamGen: Unlocking Generalization in Robot Learning through Neural Trajectories arXiv 2025-05-19 Project
CLAM: Continuous Latent Action Models for Robot Learning from Unlabeled Demonstrations arXiv 2025-05-08 Star Github
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation arXiv 2024-12-05 Star Github
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation ICRA 2025 2024-09-27 Project
IGOR: Image-GOal Representations are the Atomic Control Units for Foundation Models in Embodied AI - 2024 Project
LAPA: Latent Action Pretraining from Videos ICLR 2025 2024-10-15 Star Github
GRIF: Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control CoRL 2023 2023-06-30 Star Github
MimicPlay: Long-Horizon Imitation Learning by Watching Human Play CoRL 2023 2023-02-24 Star Github
KOAP: Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers arXiv 2024-10-24 -
LAPO: Learning to Act without Actions ICLR 2024 2023-12-17 Star Github
ILPO: Imitating Latent Policies from Observation ICML 2019 2018-05-21 Star Github

(back to top)

World Model

Title Venue Date Code
Vid2World: Crafting Video Diffusion Models to Interactive World Models arXiv 2025-05-20 Project
Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning ICLR 2025 2025-05-13 Project
ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance arXiv 2025-04-23 -
Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets arXiv 2025-04-03 Star Github
Multi-Stage Manipulation with Demonstration-Augmented Reward, Policy, and World Model Learning arXiv 2025-03-03 -
Generalist World Model Pre-Training for Efficient Reinforcement Learning arXiv 2025-02-26 -
Learning View-invariant World Models for Visual Robotic Manipulation ICLR 2025 2025-01-23 Star Github
Cosmos World Foundation Model Platform for Physical AI arXiv 2025-01-07 Star Github
Sirius-Fleet: Multi-Task Interactive Robot Fleet Learning with Visual World Models CoRL 2024 2024-10-30 Project
MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning CoRL 2023 2024-01-06 Project
FOWM: Finetuning Offline World Models in the Real World CoRL 2023 2023-10-24 Star Github
SWIM: Structured World Models from Human Videos RSS 2023 2023-08-23 Project
Surfer: Progressive Reasoning with World Models for Robotic Manipulation arXiv 2023-06-20 Star Github

(back to top)

Asynchronous Action Learning

Title Venue Date Code
PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation NeurIPS 2024 2024-10-14 Star Github
RoboDual: Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation arXiv 2024-10-10 Star Github
HiRT: Enhancing Robotic Control with Hierarchical Robot Transformers CoRL 2024 2024-09-12 -
LCB: From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control IROS 2024 2024-05-08 Project
MResT: Multi-Resolution Sensing for Real-Time Control with Vision-Language Models CoRL 2023 2024-01-25 Star Github

(back to top)

Diffusion Policy Learning

Title Venue Date Code
3D Equivariant Visuomotor Policy Learning via Spherical Projection arXiv 2025-05-22 -
Latent Theory of Mind: A Decentralized Diffusion Architecture for Cooperative Manipulation arXiv 2025-05-14 Project
H3DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning arXiv 2025-05-12 Project
Demystifying Diffusion Policies: Action Memorization and Simple Lookup Table Alternatives arXiv 2025-05-09 Project
Latent Diffusion Planning for Imitation Learning arXiv 2025-04-23 Star Github
Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation CVPR 2025 2025-03-13 -
Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation RSS 2025 2025-03-04 Star Github
FRMD: Fast Robot Motion Diffusion with Consistency-Distilled Movement Primitives for Smooth Action Generation arXiv 2025-03-03 -
S2-Diffusion: Generalizing from Instance-level to Category-level Skills in Robot Manipulation arXiv 2025-02-13 -
MoDE: Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning ICLR 2025 2024-12-17 Star Github
Score and Distribution Matching Policy: Advanced Accelerated Visuomotor Policies via Matched Distillation arXiv 2024-12-12 Star Github
AffordDP: Generalizable Diffusion Policy with Transferable Affordance arXiv 2024-12-04 Project
Instant Policy: In-Context Imitation Learning via Graph Diffusion ICLR 2025 2024-11-19 Star Github
STMDP: Brain-inspired Action Generation with Spiking Transformer Diffusion Policy Model arXiv 2024-11-15 -
MBA: Motion Before Action: Diffusing Object Motion as Manipulation Condition arXiv 2024-11-14 Star Github
DiT Policy: Diffusion Transformer Policy arXiv 2024-10-21 -
CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation arXiv 2024-10-19 Project
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation ICLR 2025 2024-10-10 Star Github
ScaleDP: Scaling Diffusion Policy in Transformer to 1 Billion Parameters for Robotic Manipulation ICRA 2025 2024-09-22 Project
SDP: Spiking Diffusion Policy for Robotic Manipulation with Learnable Channel-Wise Membrane Thresholds arXiv 2024-09-17 -
DiT-Block Policy: The Ingredients for Robotic Diffusion Transformers arXiv 2024-10-14 Star Github
GenDP: 3D Semantic Fields for Category-Level Generalizable Diffusion Policy CoRL 2024 2024-10-23 Star Github
EquiBot: SIM(3)-Equivariant Diffusion Policy for Generalizable and Data Efficient Learning CoRL 2024 2024-07-01 Star Github
SDP: Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning CoRL 2024 2024-07-01 Star Github
ManiCM: Real-time 3D Diffusion Policy via Consistency Model for Robotic Manipulation arXiv 2024-06-03 Star Github
RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective IROS 2024 2024-04-18 Star Project
MDT: Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals RSS 2024 2024-07-08 Star Github
R&D: Render and Diffuse: Aligning Image and Action Spaces for Diffusion-based Behaviour Cloning RSS 2024 2024-05-28 Star Github
DP3: 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations RSS 2024 2024-03-06 Star Github
PlayFusion: Skill Acquisition via Diffusion from Language-Annotated Play CoRL 2023 2023-12-07 Project
EquiDiff: Equivariant Diffusion Policy CoRL 2024 2024-07-01 Star Code
StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects RSS 2023 2022-11-08 Star Github
BESO: Goal-Conditioned Imitation Learning using Score-based Diffusion Policies RSS 2023 2023-04-05 Star Github
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion RSS 2023 2023-03-07 Star Github

(back to top)

Other Policies

Title Venue Date Code
Dense Policy: Bidirectional Autoregressive Learning of Actions arXiv 2025-03-17 Star Github
RoboBERT: An End-to-end Multimodal Robotic Manipulation Model arXiv 2025-02-11 Star Github
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation arXiv 2025-01-03 Project
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction arXiv 2024-12-09 Star Github
FlowPolicy: Enabling Fast and Robust 3D Flow-based Policy via Consistency Flow Matching for Robot Manipulation AAAI 2025 2024-12-06 Star Github
Autoregressive Action Sequence Learning for Robotic Manipulation arXiv 2024-10-04 Star Github
MaIL: Improving Imitation Learning with Selective State Space Models CoRL 2024 2024-06-12 Star Github

(back to top)

Vision Language Action Models

Title Venue Date Code
Interactive Post-Training for Vision-Language-Action Models arXiv 2025-05-22 Star Github
BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization arXiv 2025-05-22 Project
ManipLVM-R1: Reinforcement Learning for Reasoning in Embodied Manipulation with Large Vision-Language Models arXiv 2025-05-22 -
From Grounding to Manipulation: Case Studies of Foundation Model Integration in Embodied Robotic Systems arXiv 2025-05-21 -
Saliency-Aware Quantized Imitation Learning for Efficient Robotic Control arXiv 2025-05-21 -
Incentivizing Multimodal Reasoning in Large Models for Direct Robot Manipulation arXiv 2025-05-19 -
VTLA: Vision-Tactile-Language-Action Model with Preference Learning for Insertion Manipulation arXiv 2025-05-14 Project
FSD: From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation arXiv 2025-05-13 Project
ECoT-Lite: Training Strategies for Efficient Embodied Reasoning arXiv 2025-05-13 -
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions RSS 2025 2025-05-09 Star Github
3D CAVLA: Leveraging Depth and 3D Context to Generalize Vision Language Action Models for Unseen Tasks arXiv 2025-05-09 Project
OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation arXiv 2025-05-06 Star Github
GraspVLA: a Grasping Foundation Model Pre-trained on Billion-scale Synthetic Action Data arXiv 2025-05-06 Star Github
Interleave-VLA: Enhancing Robot Manipulation with Interleaved Image-Text Instructions arXiv 2025-05-04 Project
CrayonRobo: Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation arXiv 2025-05-04 -
π0.5: a Vision-Language-Action Model with Open-World Generalization arXiv 2025-04-22 Project
A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation arXiv 2025-04-17 Star Github
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models CVPR 2025 2025-03-27 Project
MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation arXiv 2025-03-26 Star Github
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy arXiv 2025-03-25 Star Github
DataPlatter: Boosting Robotic Manipulation Generalization with Minimal Costly Data arXiv 2025-03-25 -
AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems - 2025-03-09 Star Github
VLA Model-Expert Collaboration for Bi-directional Manipulation Learning arXiv 2025-03-06 Project
DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping arXiv 2025-03-05 Star Github
OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction ICML 2025 2025-03-05 Star Github
Accelerating Vision-Language-Action Model Integrated with Action Chunking via Parallel Decoding arXiv 2025-02-28 -
FLOWER: Democratizing Generalist Robot Policies with Efficient Vision-Language-Action Flow Policies ICLR 2025 2025-02-28 -
RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete CVPR 2025 2025-02-28 Project
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success arXiv 2025-02-27 Star Github
Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models arXiv 2025-02-26 Project
ObjectVLA: End-to-End Open-World Object Manipulation Without Demonstration arXiv 2025-02-26 Project
ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model arXiv 2025-02-20 Project
Magma: A Foundation Model for Multimodal AI Agents CVPR 2025 2025-02-18 Star Github
DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control arXiv 2025-02-09 Star Github
HAMSTER: Hierarchical Action Models For Open-World Robot Manipulation ICLR 2025 2025-02-08 Project
ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy RSS 2025 2025-02-08 Project
Probing a Vision-Language-Action Model for Symbolic States and Integration into a Cognitive Architecture arXiv 2025-02-06 -
RAD: Action-Free Reasoning for Policy Generalization arXiv 2025-02-06 Project
VLA-Cache: Towards Efficient Vision-Language-Action Model via Adaptive Token Caching in Robotic Manipulation arXiv 2025-02-04 -
UP-VLA: A Unified Understanding and Prediction Model for Embodied Agent arXiv 2025-01-31 -
SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model arXiv 2025-01-27 Star Github
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation arXiv 2024-12-29 Star Github
Seer: Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation ICLR 2025 2024-12-19 Star Github
RoboVLMs: Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models arXiv 2024-12-18 Star Github
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning arXiv 2024-12-16 Star Github
VLAS: Vision-Language-Action Model with Speech Instructions for Customized Robot Manipulation ICLR 2025 2025-01-25 Star Github
TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies ICLR 2025 2024-12-13 Star Github
Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression ICML 2025 2024-12-14 Project
π0: A Vision-Language-Action Flow Model for General Robot Control arXiv 2024-10-31 Project
BYOVLA: Run-time Observation Interventions Make Vision-Language-Action Models More Visually Robust arXiv 2024-10-02 Star Github
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation RA-L 2025 2024-09-19 Star Github
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution NeurIPS 2024 2024-11-04 Github
QueST: Self-Supervised Skill Abstractions for Learning Continuous Control NeurIPS 2024 2024-07-22 Star Github
RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation NeurIPS 2024 2024-06-06 Star Github
DP-VLA: A Dual Process VLA: Efficient Robotic Manipulation Leveraging VLM CoRL 2024 2024-10-21 -
OpenVLA: An Open-Source Vision-Language-Action Model CoRL 2024 2024-06-13 Star Github
LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning CoRL 2024 2024-06-17 Star Github
ECoT: Robotic Control via Embodied Chain-of-Thought Reasoning CoRL 2024 2024-07-11 Star Github
3D-VLA: A 3D Vision-Language-Action Generative World Model ICML 2024 2024-03-14 Star Github
Octo: An Open-Source Generalist Robot Policy RSS 2024 2024-05-20 Star Github
RT-H: Action Hierarchies Using Language RSS 2024 2024-03-04 Project
RoboFlamingo: Vision-Language Foundation Models as Effective Robot Imitators ICLR 2024 2023-11-02 Star Github
Open X-Embodiment: Robotic Learning Datasets and RT-X Models ICRA 2024 2023-10-13 Star Github
MOO: Open-World Object Manipulation using Pre-trained Vision-Language Models CoRL 2023 2023-03-02 Project
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control CoRL 2023 2023-07-28 Project
RT-1: Robotics Transformer for Real-World Control at Scale RSS 2023 2022-12-13 Star Github

(back to top)

Reinforcement Learning

Title Venue Date Code
ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations arXiv 2025-05-16 Project
What Matters for Batch Online Reinforcement Learning in Robotics? arXiv 2025-05-12 Project
Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach ICML 2025 2025-05-10 Star Github
TREND: Tri-teaching for Robust Preference-based Reinforcement Learning with Demonstrations ICRA 2025 2025-05-09 Project
Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network arXiv 2025-02-01 -
Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model ICLR 2025 2024-12-18 Star Github
RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning arXiv 2024-12-13 Project
HIL-SERL: Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning arXiv 2024-10-29 Project
PointPatchRL - Masked Reconstruction Improves Reinforcement Learning on Point Clouds CoRL 2024 2024-10-24 Project
SPIRE: Synergistic Planning, Imitation, and Reinforcement for Long-Horizon Manipulation CoRL 2024 2024-10-23 Project
Maniwhere: Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning CoRL 2024 2024-07-22 Project
PSL: Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks ICLR 2024 2024-05-02 Star Github
TD-MPC2: Scalable, Robust World Models for Continuous Control ICLR 2024 2023-10-25 Star Github
VELAP: ** ** CoRL 2023 2023 -
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions CoRL 2023 2023-09-18 Project
PTR: Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials RSS 2023 2022-10-11 Project
TD-MPC: Temporal Difference Learning for Model Predictive Control ICML 2022 2022-03-09 Star Github

(back to top)

Motion, Trajectory and Flow

Title Venue Date Code
Path Planning
LACO: Language-Conditioned Path Planning CoRL 2023 2023-08-31 Star Github
Motion Planning
A Real-to-Sim-to-Real Approach to Robotic Manipulation with VLM-Generated Iterative Keypoint Rewards ICRA 2025 2025-02-12 Star Github
DiffusionSeeder: Seeding Motion Optimization with Diffusion for Rapid Motion Planning CoRL 2024 2024-10-22 Project
ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation CoRL 2024 2024-09-03 Star Github
CoPa: General Robotic Manipulation through Spatial Constraints of Parts with Foundation Models ICRAW 2024 2024-03-13 Star Github
Elastic-DS: Task Generalization with Stability Guarantees via Elastic Dynamical System Motion Policies CoRL 2023 2023-09-05 Star Github
Trajectory Optimization
ORION: Vision-based Manipulation from Single Human Video with Open-World Object Graphs arXiv 2024-05-30 Project
PointFlowMatch: Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching CoRL 2024 2024-09-11 Project
RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation ICRA 2024 2023-08-30 Star Github
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models CoRL 2023 2023-07-12 Star Github
LATTE: LAnguage Trajectory TransformEr ICRA 2023 2022-08-04 Star Github
Trajectory-conditioned Policy
Diffusion Trajectory-guided Policy for Long-horizon Robot Manipulation arXiv 2025-02-14 -
P3-PO: Prescriptive Point Priors for Visuo-Spatial Generalization of Robot Policies arXiv 2024-12-09 Star Github
Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation ECCV 2024 2024-05-02 Star Github
ATM: Any-point Trajectory Modeling for Policy Learning RSS 2024 2023-12-28 Star Github
AWE: Waypoint-Based Imitation Learning for Robotic Manipulation CoRL 2023 2023-07-26 Star Github
Flow-conditioned Policy
VIP: Vision Instructed Pre-training for Robotic Manipulation arXiv 2024-10-09 Star Github
ManiTrend: Bridging Future Generation and Action Prediction with 3D Flow for Robotic Manipulation arXiv 2025-02-14 -
Im2Flow2Act: Flow as the Cross-Domain Manipulation Interface CoRL 2024 2024-07-21 Star Github
AVDC: Learning to Act from Actionless Videos through Dense Correspondences ICLR 2024 2023-10-12 Star Github

(back to top)

Data Collection, Selection and Augmentation

Title Venue Date Code
Data Collection
Guiding Data Collection via Factored Scaling Curves arXiv 2025-05-12 Project
Kaiwu: A Multimodal Manipulation Dataset and Framework for Robot Learning and Human-Robot Interaction arXiv 2025-03-07 -
AVR: Active Vision-Driven Robotic Precision Manipulation with Viewpoint and Focal Length Optimization arXiv 2025-03-03 Project
Physics-Driven Data Generation for Contact-Rich Manipulation via Trajectory Optimization arXiv 2025-02-27 Project
Re3Sim: Generating High-Fidelity Simulation Data via 3D-Photorealistic Real-to-Sim for Robotic Manipulation arXiv 2025-02-12 Star Github
ALPHA-α and Bi-ACT Are All You Need: Importance of Position and Force Information/Control for Imitation Learning of Unimanual and Bimanual Robotic Manipulation with Low-Cost System arXiv 2024-11-15 Project
SkillMimicGen: Automated Demonstration Generation for Efficient Skill Learning and Deployment CoRL 2024 2024-10-24 Project
NILS: Scaling Robot Policy Learning via Zero-Shot Labeling with Foundation Models CoRL 2024 2024-10-23 Project
SOAR: Autonomous Improvement of Instruction Following Skills via Foundation Models CoRL 2024 2024-07-30 Star Github
Manipulate-Anything: Automating Real-World Robots using Vision-Language Models CoRL 2024 2024-06-27 Project
DexCap: Scalable and Portable Mocap Data Collection System for Dexterous Manipulation CoRL 2024 2024-03-12 Star Github
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots RSS 2024 2024-02-15 Star Github
AirExo: Low-Cost Exoskeletons for Learning Whole-Arm Manipulation in the Wild ICRA 2024 2023-09-26 Star Github
SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling ICRA 2024 2023-06-20 Star Github
Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition CoRL 2023 2023-07-26 Star Github
DIAL: Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models RSS 2023 2022-11-21 Project
RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation TMLR 2023 2023-06-20 Star Github
Data Selection
Robot Data Curation with Mutual Information Estimators arXiv 2025-02-12 Project
What Matters in Learning from Large-Scale Datasets for Robot Manipulation ICLR 2025 2025-01-23 Project
AMF: Active Fine-Tuning of Generalist Policies arXiv 2024-10-07 -
Re-Mix: Optimizing Data Mixtures for Large Scale Imitation Learning CoRL 2024 2024-08-26 Star Github
An Unbiased Look at Datasets for Visuo-Motor Pre-Training CoRL 2023 2023-10-13 Star Github
Data Quality in Imitation Learning NeurIPS 2023 2023-06-04 -
Data Retrieval
STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning ICLR 2025 2024-12-19 Project
Retrieval-Augmented Embodied Agents CVPR 2024 2024-04-17 -
Behavior Retrieval: Few-Shot Imitation Learning by Querying Unlabeled Datasets RSS 2023 2023-04-08 Star Github
Data Augmentation
RoboEngine: Plug-and-Play Robot Data Augmentation with Semantic Robot Segmentation and Background Generation arXiv 2025-03-24 Star Github
Predictive Red Teaming: Breaking Policies Without Breaking Robots arXiv 2025-02-10 Project
RoCoDA: Counterfactual Data Augmentation for Data-Efficient Robot Learning from Demonstrations arXiv 2024-11-25 Project
View-Invariant Policy Learning via Zero-Shot Novel View Synthesis CoRL 2024 2024-09-05 Star Github
RoVi-Aug: Robot and Viewpoint Augmentation for Cross-Embodiment Robot Learning CoRL 2024 2024-09-05 Project
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning CoLLAs 2024 2024-07-30 Project
Diffusion Meets DAgger: Supercharging Eye-in-hand Imitation Learning RSS 2024 2024-02-27 Star Github
ROSIE: Scaling Robot Learning with Semantically Imagined Experience RSS 2023 2023-02-22 Project
GenAug: Retargeting behaviors to unseen situations via Generative Augmentation RSS 2023 2023-02-13 Star Github
Evaluation
Efficient Evaluation of Multi-Task Robot Policies With Active Experiment Selection arXiv 2025-02-14 -
Contrast Sets for Evaluating Language-Guided Robot Policies CoRL 2024 2024-06-19 -

(back to top)

Affordance Learning

Title Venue Date Code
Articulated Object Affordance
ManipGPT: Is Affordance Segmentation by Large Vision Models Enough for Articulated Object Manipulation? arXiv 2024-12-13 -
UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models ICRA 2025 2024-09-16 Project
A3VLM: Actionable Articulation-Aware Vision Language Model CoRL 2024 2024-06-14 Star Github
AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation CoRL 2024 2024-06-17 Project
SAGE: Bridging Semantic and Actionable Parts for Generalizable Manipulation of Articulated Objects RSS 2024 2023-12-03 Star Github
Kinematic-aware Prompting for Generalizable Articulated Object Manipulation with LLMs ICRA 2024 2023-11-06 Star Github
Ditto: Building Digital Twins of Articulated Objects from Interaction CVPR 2022 2022-08-16 Star Github
Part-Based Object Affordance
3DAPNet: Language-Conditioned Affordance-Pose Detection in 3D Point Clouds ICRA 2024 2023-09-19 Star Github
CPM: Composable Part-Based Manipulation CoRL 2023 2024-05-09 Project
PartManip: Learning Cross-Category Generalizable Part Manipulation Policy from Point Cloud Observations CVPR 2023 2023-03-29 Star Github
GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts CVPR 2023 2022-11-10 Star Github
Spatial Affordance
Robotic Visual Instruction arXiv 2025-05-01 -
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics arXiv 2024-11-25 Project
SpatialBot: Precise Spatial Understanding with Vision Language Models ICRA 2025 2024-06-19 Star Github
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics CoRL 2024 2024-06-15 Star Github
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities CVPR 2024 2024-01-22 Project
Visual Affordance
RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation CoRL 2024 2024-07-05 Star Github
MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting RSS 2024 2024-03-05 Star Github
SLAP: Spatial-Language Attention Policies CoRL 2023 2023-04-21 Star Github
KITE: Keypoint-Conditioned Policies for Semantic Manipulation CoRL 2023 2023-06-29 Project
HULC++: Grounding Language with Visual Affordances over Unstructured Data ICRA 2023 2022-10-04 Star Github
CLIPort: What and Where Pathways for Robotic Manipulation CoRL 2022 2021-09-24 Star Github
VAPO: Affordance Learning from Play for Sample-Efficient Policy Learning ICRA 2022 2022-03-01 Project
Transporter Networks: Rearranging the Visual World for Robotic Manipulation CoRL 2020 2020-10-27 Star Github

(back to top)

3D Representation for Manipulation

Title Venue Date Code
RoboSplat: Novel Demonstration Generation with Gaussian Splatting Enables Robust One-Shot Manipulation RSS 2025 2025-04-17 Star Github
G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation arXiv 2024-11-27 Star Github
MSGField: A Unified Scene Representation Integrating Motion, Semantics, and Geometry for Robotic Manipulation arXiv 2024-10-21 Star Github
Splat-MOVER: Multi-Stage, Open-Vocabulary Robotic Manipulation via Editable Gaussian Splatting CoRL 2024 2024-05-07 Star Github
IMAGINATION POLICY: Using Generative Point Cloud Models for Learning Manipulation Policies CoRL 2024 2024-06-17 Project
Physically Embodied Gaussian Splatting: A Realtime Correctable World Model for Robotics CoRL 2024 2024-06-16 Project
RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation CoRL 2024 2024-03-28 Star Github
RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation CoRL 2024 2024-02-23 Star Github
D3Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Rearrangement CoRL 2024 2023-09-28 Star Github
Object-Aware Gaussian Splatting for Robotic Manipulation ICRAW 2024 2024-04-24 Project
F3RM: Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation CoRL 2023 2023-07-27 Star Github
R-NDF: SE(3)-Equivariant Relational Rearrangement with Neural Descriptor Fields CoRL 2022 2022-11-17 Star Github
NDF: Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation ICRA 2022 2021-12-09 Star Github

(back to top)

3D Representation Policy Learning

Title Venue Date Code
Diffusion Policy (DP)
PPI: Gripper Keypose and Object Pointflow as Interfaces for Bimanual Robotic Manipulation RSS 2025 2025-04-24 Star Github
GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation ICLR 2025 2024-09-30 Project
3D Diffuser Actor: Policy Diffusion with 3D Scene Representations CoRL 2024 2024-02-16 Star Github
DP3: 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations RSS 2024 2024-03-06 Star Github
Reconstruction
Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation arXiv 2024-11-27 Star Github
ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation ECCV 2024 2024-03-13 Star Github
SGRv2: Leveraging Locality to Boost Sample Efficiency in Robotic Manipulation CoRL 2024 2024-06-15 Star Github
RVT-2: Learning Precise Manipulation from Few Demonstrations RSS 2024 2024-01-12 Star Github
GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields CoRL 2023 2023-08-31 Star Github
3D4RL: Visual Reinforcement Learning with Self-Supervised 3D Representations RA-L 2023 2022-10-13 Star Github
PolarNet: 3D Point Clouds for Language-Guided Robotic Manipulation CoRL 2023 2023-09-27 Star Github
M2T2: Multi-Task Masked Transformer for Object-centric Pick and Place CoRL 2023 2023-11-02 Star Github
PerAct: Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation CoRL 2022 2022-09-12 Star Github
Visual Goal Generation
3D-MVP: 3D Multiview Pretraining for Robotic Manipulation CVPR 2025 2024-06-26 Project
ActAIM2: Discovering Robotic Interaction Modes with Discrete Representation Learning CoRL 2024 2024-10-26 Project
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation ICML 2024 2024-05-30 Star Github
RVT: Robotic View Transformer for 3D Object Manipulation CoRL 2023 2023-06-26 Star Github
GROOT: Learning Generalizable Manipulation Policies with Object-Centric 3D Representations CoRL 2023 2023-10-22 Star Github
3D Policy
EmbodiedMAE: A Unified 3D Multi-Modal Representation for Robot Manipulation arXiv 2025-05-15 -
FP3: A 3D Foundation Policy for Robotic Manipulation arXiv 2025-03-11 Star Github
VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation CVPR 2025 2025-03-10 Project
SPHINX: What's the Move? Hybrid Imitation Learning via Salient Points ICLR 2025 2024-12-06 Star Github
SGR: A Universal Semantic-Geometric Representation for Robotic Manipulation CoRL 2023 2023-06-18 Star Github

(back to top)

Reasoning, Planning and Code Generation

Title Venue Date Code
Task Planning
Imagine, Verify, Execute: Memory-Guided Agentic Exploration with Vision-Language Models arXiv 2025-05-12 Project
MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation arXiv 2024-11-26 Project
Socratic Planner: Inquiry-Based Zero-Shot Planning for Embodied Instruction Following arXiv 2024-04-21 -
Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models IROS 2024 2024-08-15 Project
PG-InstructBLIP: Physically Grounded Vision-Language Models for Robotic Manipulation ICRA 2024 2023-09-05 Project
RoCo: Dialectic Multi-Robot Collaboration with Large Language Models ICRA 2024 2023-07-10 Star Github
REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction CoRL 2023 2023-06-27 Star Github
SayCan: Do As I Can, Not As I Say: Grounding Language in Robotic Affordances CoRL 2022 2022-04-04 Star Github
LLM+P: Empowering Large Language Models with Optimal Planning Proficiency arXiv 2023-04-22 Star Github
Inner Monologue: Embodied Reasoning through Planning with Language Models CoRL 2022 2022-07-12 Project
SHOWTELL: Teaching Robots with Show and Tell: Using Foundation Models to Synthesize Robot Policies from Language and Visual Demonstrations CoRL 2024 2024-09-06 Project
GIRAF: Gesture-Informed Robot Assistance via Foundation Models CoRL 2023 2023-09-06 Project
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models ICCV 2023 2022-12-08 Star Github
Code Generation
Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation arXiv 2025-01-08 Project
Demo2Code: From Summarizing Demonstrations to Synthesizing Code via Extended Chain-of-Thought NeurIPS 2023 2023-05-26 Project
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model arXiv 2023-05-18 Star Github
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models ICRA 2023 2022-09-22 Star Github
ChatGPT for Robotics: Design Principles and Model Abilities IEEE Access 2023 2023-02-20 Star Github
Code as Policies: Language Model Programs for Embodied Control ICRA 2023 2022-09-16 Star Github
TidyBot: Personalized Robot Assistance with Large Language Models Autonomous Robots 2023 2023-05-09 Star Github
Statler: State-Maintaining Language Models for Embodied Reasoning ICRA 2024 2023-06-30 Star Github
InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning RSS 2024 2023-05-30 Star Github
Text2Motion: From Natural Language Instructions to Feasible Plans Autonomous Robots 2023 2023-03-21 Project
Multimodal Reasoning
EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks arXiv 2025-03-14 -
Can We Detect Failures Without Failure Data? Uncertainty-Aware Runtime Failure Detection for Imitation Learning Policies arXiv 2025-03-11 Project
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation arXiv 2025-02-18 Star Github
From Foresight to Forethought: VLM-In-the-Loop Policy Steering via Latent Alignment arXiv 2025-02-03 -
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection CVPR 2025 2024-12-05 Project
AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation ICLR 2025 2024-10-01 Project
λ-Repformer: Task Success Prediction for Open-Vocabulary Manipulation Based on Multi-Level Aligned Representations CoRL 2024 2024-10-01 Project
ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation CVPR 2024 2023-12-24 Star Github
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought NeurIPS 2023 2023-05-24 Star Github
Matcha: Chat with the Environment: Interactive Multimodal Perception Using Large Language Models IROS 2023 2023-03-14 Star Github
PaLM-E: An Embodied Multimodal Language Model ICML 2023 2023-03-06 Star Github
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language ICLR 2023 2022-04-01 Project

(back to top)

Generalization

Title Venue Date Code
Generalization with Benchmarks
A Taxonomy for Evaluating Generalist Robot Policies arXiv 2024-03-03 Project
Generalization using Data
Mirage: Cross-Embodiment Zero-Shot Policy Transfer with Cross-Painting RSS 2024 2024-02-29 Star Github
Decomposing the Generalization Gap in Imitation Learning for Visual Robotic Manipulation ICRA 2024 2024-02-29 Star Github
Disentangled Representation Learning
Zero-Shot Visual Generalization in Robot Manipulation arXiv 2022-05-16 Project
Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation arXiv 2025-05-07 Star Github
Disentangled Object-Centric Image Representation for Robotic Manipulation arXiv 2025-03-14 -
Compositional Generalization
Policy Architectures for Compositional Generalization in Control NeurIPSW 2022 2022-03-10 Star Github
PROGRAMPORT: Programmatically Grounded, Compositionally Generalizable Robotic Manipulation ICLR 2023 2023-04-26 Project
Efficient Data Collection for Robotic Manipulation via Compositional Generalization RSS 2024 2024-03-08 Project
Sim2Real or Real2Real Generalization
Real2Render2Real: Scaling Robot Data Without Dynamics Simulation or Robot Hardware arXiv 2025-05-14 Project
X-Sim: Cross-Embodiment Learning via Real-to-Sim-to-Real arXiv 2025-05-11 Star Github
Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation arXiv 2025-03-31 Project
Natural Language Can Help Bridge the Sim2Real Gap RSS 2024 2024-05-16 Star Github
RialTo: Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation RSS 2024 2024-03-06 Star Github
Domain Randomization: Sim-to-Real Transfer of Robotic Control with Dynamics Randomization ICRA 2018 2017-10-18
Generalization for Long-horizon and Complex Tasks
RoboHorizon: An LLM-Assisted Multi-View World Model for Long-Horizon Robotic Manipulation arXiv 2025-01-11 -
ManipGen: Local Policies Enable Zero-shot Long-horizon Manipulation CoRLW 2024 2024-10-29 Project
TBBF: A Backbone for Long-Horizon Robot Task Understanding RA-L 2025 2024-08-02 Project
STAP: Sequencing Task-Agnostic Policies ICRA 2023 2022-10-21 Star Github
BOSS: Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance CoRL 2023 2023-12-16 Star Github
BLADE: Learning Compositional Behaviors from Demonstration and Language CoRL 2024 2024 Project
PALO: Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation CoRL 2024 2024-08-29 Star Github
Lifelong Learning
Think Small, Act Big: Primitive Prompt Learning for Lifelong Robot Manipulation CVPR 2025 2025-04-01 -
Few-shot
Few-Shot Vision-Language Action-Incremental Policy Learning arXiv 2025-04-22 Star Github
You Only Teach Once: Learn One-Shot Bimanual Robotic Manipulation from Video Demonstrations arXiv 2025-01-24 Star Github
Learning Generalizable 3D Manipulation With 10 Demonstrations arXiv 2024-11-15 Star Github
Incremental Learning
iManip: Skill-Incremental Learning for Robotic Manipulation arXiv 2025-03-10 -

(back to top)

Generalist

Title Venue Date Code
Generalist with Different Embodiment Types
CrossFormer: Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation CoRL 2024 2024-08-21 Star Github
ARIO: All Robots in One: A New Standard and Unified Dataset for Versatile, General-Purpose Embodied Agents arXiv 2024-08-20 Project
HPT: Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers NeurIPS 2024 2024-09-30 Star Github
Generalist in Different Embodied Tasks
LEO: An Embodied Generalist Agent in 3D World ICML 2024 2023-11-18 Star Github
Manipulation Generalist
Beyond Sight: Finetuning Generalist Robot Policies with Heterogeneous Sensors via Language Grounding arXiv 2025-01-08 Star Github
RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning arXiv 2024-12-13 Project
RoboMM: All-in-One Multimodal Large Model for Robotic Manipulation arXiv 2024-12-10 Star Github
Effective Tuning Strategies for Generalist Robot Manipulation Policies arXiv 2024-10-02 -
Octo: An Open-Source Generalist Robot Policy RSS 2024 2024-05-20 Star Github
V-GPS: Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance CoRL 2024 2024-10-17 Project
Open X-Embodiment: Robotic Learning Datasets and RT-X Models ICRA 2024 2023-10-13 Star Github
RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking ICRA 2024 2023-09-05 Star Github
Maniwhere: Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning CoRL 2024 2024-07-22 Project
CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation arXiv 2024-10-19 Project
Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments arXiv 2024-09-09 Github
More for VLAs

(back to top)

Human-Robot Interaction and Collaboration

Title Venue Date Code
Maximizing Alignment with Minimal Feedback: Efficiently Learning Rewards for Visuomotor Robot Policy Alignment arXiv 2024-12-06 Project
Vocal Sandbox: Continual Learning and Adaptation for Situated Human-Robot Collaboration CoRL 2024 2024-09-06 Project
APRICOT: Active Preference Learning and Constraint-Aware Task Planning with LLMs CoRL 2024 - Project
Text2Interaction: Establishing Safe and Preferable Human-Robot Interaction CoRL 2024 2024-08-12 Star Github
GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation CVPR 2024 2024-01-01 Star Github
KNOWNO: Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners CoRL 2023 2023-07-04 Github
YAY Robot: Yell At Your Robot: Improving On-the-Fly from Language Corrections arXiv 2024-03-19 Star Github
LILAC: "No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy HRI 2023 2023-01-06 Star Github

(back to top)

Mobile Manipulation

Title Venue Date Code
Robi Butler: Remote Multimodal Interactions with Household Robot Assistant arXiv 2024-09-30 Project
TaMMa: Target-driven Multi-subscene Mobile Manipulation CoRL 2024 2024-09-06 -
SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Robot Task Planning CoRL 2023 2023-07-12 Project
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation CoRL 2024 2024-01-04 Star Github
GAMMA: Graspability-Aware Mobile MAnipulation Policy Learning based on Online Grasping Pose Fusion ICRA 2024 2023-09-27 Star Github

(back to top)

Tactile-based Manipulation

Title Venue Date Code
CLTP: Contrastive Language-Tactile Pre-training for 3D Contact Geometry Understanding arXiv 2025-05-13 Project
GelFusion: Enhancing Robotic Manipulation under Visual Constraints via Visuotactile Fusion arXiv 2025-05-12 Project
On the Importance of Tactile Sensing for Imitation Learning: A Case Study on Robotic Match Lighting arXiv 2025-04-18 Project
Look-to-Touch: A Vision-Enhanced Proximity and Tactile Sensor for Distance and Geometry Perception in Robotic Manipulation arXiv 2025-04-14 -
TLA: Tactile-Language-Action Model for Contact-Rich Manipulation arXiv 2025-03-11 Project
Digitizing Touch with an Artificial Multimodal Fingertip arXiv 2024-11-04 Star Github
Sparsh: Self-supervised touch representations for vision-based tactile sensing CoRL 2024 2024 Star Github
MimicTouch: Leveraging Multi-modal Human Tactile Demonstrations for Contact-rich Manipulation CoRL 2024 2023-10-25 Project
Octopi: Object Property Reasoning with Large Tactile-Language Models RSS 2024 2024-05-05 Star Github
RoboPack: Learning Tactile-Informed Dynamics Models for Dense Packing RSS 2024 2024-07-01 Project
RotateIt: General In-Hand Object Rotation with Vision and Touch CoRL 2023 2023-09-18 Project
T-DEX: Dexterity from Touch: Self-Supervised Pre-Training of Tactile Representations with Robotic Play CoRL 2023 2023-03-21 Star Github

(back to top)

Dexterous Manipulation

Title Venue Date Code
Object-Focus Actor for Data-efficient Robot Generalization Dexterous Manipulation arXiv 2025-05-21 Project
DORA: Object Affordance-Guided Reinforcement Learning for Dexterous Robotic Manipulation arXiv 2025-05-20 Project
MAPLE: Encoding Dexterous Robotic Manipulation Priors Learned From Egocentric Videos arXiv 2025-04-08 Star Code
CordViP: Correspondence-based Visuomotor Policy for Dexterous Manipulation in Real-World arXiv 2025-02-12 Project
D(R, O) Grasp: A Unified Representation of Robot and Object Interaction for Cross-Embodiment Dexterous Grasping CoRLW 2024 2024-10-02 Star Github
Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning arXiv 2024-07-03 Star Github
DexGraspNet 2.0: Learning Generative Dexterous Grasping in Large-scale Synthetic Cluttered Scenes CoRL 2024 2024 Star Github
DexGraspNet: A Large-Scale Robotic Dexterous Grasp Dataset for General Objects Based on Simulation ICRA 2023 2022-10-06 Star Github
Demonstrating Learning from Humans on Open-Source Dexterous Robot Hands RSS 2024 2024 2024-01-01
CyberDemo: Augmenting Simulated Human Demonstration for Real-World Dexterous Manipulation CVPR 2024 2024-02-22 Star Github
Dexterous Functional Grasping CoRL 2023 2023-12-05 Project
DEFT: Dexterous Fine-Tuning for Real-World Hand Policies CoRL 2023 2023-10-30 Project
REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation CoRL 2023 2023-09-06 Project
Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation CoRL 2023 2023-09-02 Star Github
AnyTeleop: A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System RSS 2023 2023-07-10 Project

(back to top)

Other Applications

Title Venue Date Code
Precise Manipulation
Find the Fruit: Designing a Zero-Shot Sim2Real Deep RL Planner for Occlusion Aware Plant Manipulation arXiv 2025-05-22 -
FoAR: Force-Aware Reactive Policy for Contact-Rich Robotic Manipulation arXiv 2024-11-24 Project
ForceMimic: Force-Centric Imitation Learning with Force-Motion Capture System for Contact-Rich Manipulation arXiv 2024-10-10 Project
Predicting Object Interactions with Behavior Primitives: An Application in Stowing Tasks CoRL 2023 2023-09-28 Star Github
VAPORS: Learning Sequential Acquisition Policies for Robot-Assisted Feeding CoRL 2023 2023-09-11 Project
HANDLOOM: Learned Tracing of One-Dimensional Objects for Inspection and Manipulation CoRL 2023 2023-03-15 Project
RoboCook: Long-Horizon Elasto-Plastic Object Manipulation with Diverse Tools CoRL 2023 2023-06-26 Star Github
Object Rearrangement
PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement WACV 2025 2024-10-29 -
LGMCTS: Language-Guided Monte-Carlo Tree Search for Executable Semantic Object Rearrangement IROS 2024 2023-09-27 Star Github
LLM-GROP: Task and Motion Planning with Large Language Models for Object Rearrangement IROS 2023 2023-03-10 Colab
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics RA-L 2023 2022-10-05 Project
Non-prehensile Manipulation
HACMan: Learning Hybrid Actor-Critic Maps for 6D Non-Prehensile Manipulation CoRL 2023 2023-05-06 Star Github
Tool Manipulation
Leveraging Language for Accelerated Learning of Tool Manipulation CoRL 2023 2022-06-27 Star Github
Responsible Manipulation
Adversarial Data Collection: Human-Collaborative Perturbations for Efficient and Robust Robotic Imitation Learning arXiv 2025-03-14 -
How vulnerable is my policy? Adversarial attacks on modern behavior cloning policies arXiv 2025-02-06 -
Don't Let Your Robot be Harmful: Responsible Robotic Manipulation arXiv 2024-11-27 Star Github
TrojanRobot: Backdoor Attacks Against LLM-based Embodied Robots in the Physical World arXiv 2024-11-18 Project
AI4Science
RoboCulture: A Robotics Platform for Automated Biological Experimentation arXiv 2025-05-20 -

(back to top)

📊 Awesome Benchmarks

Grasp Datasets

Title Venue Date Code
GraspClutter6D: A Large-scale Real-world Dataset for Robust Perception and Grasping in Cluttered Scenes arXiv 2025-04-09 Project
QDGset: A Large Scale Grasping Dataset Generated with Quality-Diversity arXiv 2024-10-03 Project
Real-to-Sim Grasp: Rethinking the Gap between Simulation and Real World in Grasp Detection CoRL 2024 2024-10-09 Project
Grasp-Anything-6D: Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance ECCV 2024 2024-07-18 Star Github
Grasp-Anything++: Language-driven Grasp Detection CVPR 2024 2024-06-13 Star Github
Grasp-Anything: Large-scale Grasp Dataset from Foundation Models ICRA 2024 2023-09-18 Star Github
GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping CVPR 2020 2020-08-05 Star Github
Jacquard: A Large Scale Dataset for Robotic Grasp Detection IROS 2018 2018-03-30 Project

(back to top)

Manipulation Benchmarks

Title Venue Date Code
Manipulation in Home Environment
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots RSS 2024 2024-06-04 Star Github
ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes ICCV 2023 2023-04-09 Star Github
HomeRobot: Open-Vocabulary Mobile Manipulation CoRL 2023 2023-06-20 Star Github
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks CVPR 2020 2019-12-03 Star Github
Manipulation in On-Table Environment
Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization arXiv 2025-05-21 Star Github
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins CVPR 2025 2025-04-17 Star Github
AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World arXiv 2025-03-31 Star Github
MuBlE: MuJoCo and Blender simulation Environment and Benchmark for Task Planning in Robot Manipulation arXiv 2025-03-03 Star Github
FLAME: A Federated Learning Benchmark for Robotic Manipulation arXiv 2025-03-03 -
VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks arXiv 2024-12-24 Star Github
Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy ICRA 2025 2024-10-02 Star Github
OBSBench: Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning NeurIPS 2024 2024-02-04 Star Github
GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs CoRL 2024 2024-10-04 Star Github
Evaluating Real-World Robot Manipulation Policies in Simulation CoRL 2024 2024-05-09 Star Github
THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation RSS 2024 2024-02-13 Star Github
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning NeurIPS 2023 2023-06-05 Star Github
VIMA: General Robot Manipulation with Multimodal Prompts ICML 2023 2022-10-06 Star Github
CALVIN: A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks RA-L 2021 2021-12-06 Star Github
RLBench: The Robot Learning Benchmark & Learning Environment RA-L 2020 2019-09-26 Star Github
KitchenShift: Evaluating Zero-Shot Generalization of Imitation-Based Policy Learning Under Domain Shifts NeurIPSW 2021 2021 Star Github
Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning CoRL 2019 2019-10-24 Star Github
Franka-Kitchen: Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning CoRL 2019 2019-10-25 Project
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation arXiv 2024-10-07 -
ClutterGen: A Cluttered Scene Generator for Robot Learning CoRL 2024 2024-07-07 Star Github
Tactile Manipulation
TacCompress: A Benchmark for Multi-Point Tactile Data Compression in Dexterous Manipulation arXiv 2025-05-22 Star Github
Efficient Tactile Simulation with Differentiability for Robotic Manipulation CoRL 2022 2022-09-10 Star Github
Functional Manipulation
FMB: a Functional Manipulation Benchmark for Generalizable Robotic Learning IJRR 2024 2024-01-16 Star Github
Robot Trajectory Datasets
RoboFAC: A Comprehensive Framework for Robotic Failure Analysis and Correction arXiv 2025-05-18 -
Open X-Embodiment: Robotic Learning Datasets and RT-X Models ICRA 2024 2023-10-13 Star Github
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset RSS 2024 2024-03-19 Star Github
BridgeData V2: A Dataset for Robot Learning at Scale CoRL 2023 2023-08-24 Star Github
RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot RSSW 2023 2023-07-02 Project
Embodied QA and Affordance Datasets
Robo2VLM: Visual Question Answering from Large-Scale In-the-Wild Robot Manipulation Datasets arXiv 2025-05-21 Huggingface
PointArena: Probing Multimodal Grounding Through Language-Guided Pointing arXiv 2025-05-15 Star Github
ManipBench: Benchmarking Vision-Language Models for Low-Level Robot Manipulation arXiv 2025-05-14 Project
ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models IROS 2024 2024-03-17 Star Github
OpenEQA: Embodied Question Answering in the Era of Foundation Models CVPR 2024 2024 Star Github
Others
Two by Two: Learning Multi-Task Pairwise Objects Assembly for Generalizable Robot Manipulation CVPR 2025 2025-04-09 Project

(back to top)

Cross-Embodiment Benchmarks

Title Venue Date Code
ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts arXiv 2025-05-15 Star Github
RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning RSS 2025 2025-04-26 Star Github
RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation arXiv 2024-12-18 Star Github
GENESIS: A generative world for general-purpose robotics & embodied AI learning - - Star Github
ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI arXiv 2024-10-01 Star Github
All Robots in One: A New Standard and Unified Dataset for Versatile, General-Purpose Embodied Agents arXiv 2024-08-20 Dataset
CortexBench: Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence? NeurIPS 2023 2023-03-31 Star Github
Isaac Lab: Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments RA-L 2023 2023-01-10 Star Github

(back to top)

🛠️ Awesome Techniques

Title Venue Date Code
Awesome-Implicit-NeRF-Robotics: Neural Fields in Robotics: A Survey - 2024-10-26 Star Github
Awesome-Video-Robotic-Papers - 2024 Star Github
Awesome-Generalist-Robots-via-Foundation-Models: Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis - 2024 Star Github
Awesome-Robotics-3D - 2024 Star Github
Awesome-Robotics-Foundation-Models: Foundation Models in Robotics: Applications, Challenges, and the Future - 2023-12-13 Star Github
Awesome-LLM-Robotics - 2022 Star Github

(back to top)

✨ Citation

If you find this repository useful, please consider citing this list:

@misc{bai2024roboticsmanipulation,
    title = {Awesome-Robotics-Manipulation},
    author = {Bai, Shuanghao and Ding, Pengxiang and Zhang, Haoran},
    journal = {GitHub repository},
    url = {https://github.com/BaiShuanghao/Awesome-Robotics-Manipulation},
    year = {2024},
}

About

A comprehensive list of Robot Manipulation resources, including papers, code, and related websites.
