CraftJarvis

This is the collection of our joint efforts to Craft an open-ended, multitask, generalist agent (Jarvis).

Welcome to Team CraftJarvis

At CraftJarvis, we're a passionate team committed to exploring the vast potential of AI in the dynamic, open-world environment of Minecraft. Our focus is on developing a generalist agent, an AI entity capable of mastering a wide range of tasks and challenges within this virtual world.

Publications

Here is a list of our latest publications on open-world agents, sorted in reverse chronological order.

  • JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse (ACL 2025)

    [Website] [Paper] [Videos] [Datasets] [Models]

  • MCU: An Evaluation Framework for Open-Ended Game Agents (ICML 2025)

    [Website] [Paper] [Code]

  • Open-World Skill Discovery from Unsegmented Demonstrations

    [Website] [Paper] [Code]

  • MineStudio: A Streamlined Package for Minecraft AI Agent Development

    [Paper] [Code] [Document]

  • ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment

    [Website] [Paper] [Code] [Demo]

  • ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting (CVPR 2025)

    [Website] [Paper] [Code] [Demo]

  • GROOT-2: Weakly Supervised Multi-Modal Instruction Following Agents (ICLR 2025)

    [Paper]

  • OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents (NeurIPS 2024)

    [Website] [Paper]

  • JARVIS-1: Open-world Multi-task Agents with Memory-Augmented Multimodal Language Models (T-PAMI 2024)

    [Website] [Paper] [Code]

  • GROOT: Learning to Follow Instructions by Watching Gameplay Videos (ICLR 2024)

    [Website] [Paper] [Code]

  • Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents (NeurIPS 2023)

    [Paper] [Code]

  • Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction (CVPR 2023)

    [Paper] [Code]

Popular repositories

  1. JARVIS-1

    JARVIS-1: Open-world Multi-task Agents with Memory-Augmented Multimodal Language Models

    Java, 367 stars, 21 forks

  2. MC-Planner

    Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents"

    Python, 279 stars, 23 forks

  3. MineStudio

    MineStudio: A Streamlined Package for Minecraft AI Agent Development

    Python, 261 stars, 14 forks

  4. RAT

    Implementation of "RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation".

    Python, 241 stars, 25 forks

  5. JarvisVLA

    Official Implementation of "JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse"

    Python, 77 stars, 7 forks

  6. GROOT

    GROOT: Learning to Follow Instructions by Watching Gameplay Videos (ICLR 2024 Spotlight)

    Java, 65 stars, 2 forks
