
# fbot_vision

A ROS 2 vision system for robotics applications featuring object detection, face recognition, person tracking, and vision-language model integration.

Overview • Architecture • Installation • Usage • fbot_vision messages and services • Contributing

## Overview

**fbot_vision** is a ROS 2 package suite designed for robotic vision applications. It provides real-time object detection, face recognition, person tracking with pose estimation, and vision-language model (VLM) capabilities for interactive robotics systems. It was designed for the RoboCup@Home competition and the robot BORIS, but is adaptable to a variety of robotics scenarios.
## Architecture

The system consists of three main packages:
```
fbot_vision/
├── 📁 fbot_recognition/             # Core recognition algorithms
│   ├── 📁 base_recognition/         # Abstract base class for all recognition modules
│   ├── 📁 face_recognition/         # Face detection and recognition
│   ├── 📁 moondream_recognition/    # Object recognition using the Moondream2 VLM
│   ├── 📁 yolo_tracker_recognition/ # People tracking
│   └── 📁 yolov8_recognition/       # Object detection with YOLOv8
├── 📁 fbot_vlm/                     # Vision Language Model integration
└── 📁 fbot_vision_msgs/             # Custom ROS message definitions
```
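The `base_recognition` module provides the abstract base class that the concrete recognizers inherit from. A minimal sketch of that pattern (class and method names here are illustrative, not the package's actual API):

```python
from abc import ABC, abstractmethod


class BaseRecognition(ABC):
    """Illustrative base: concrete recognizers implement detect()."""

    def __init__(self, topic_prefix: str):
        # Hypothetical: each recognizer publishes under its own prefix,
        # e.g. /fbot_vision/fr for object recognition.
        self.topic_prefix = topic_prefix
        self.enabled = False

    def start(self):
        """Mirrors the start services: enable frame processing."""
        self.enabled = True

    def stop(self):
        """Mirrors the stop services: disable frame processing."""
        self.enabled = False

    @abstractmethod
    def detect(self, image):
        """Return a list of detections for one image frame."""


class DummyRecognition(BaseRecognition):
    """Trivial subclass used only to show the contract."""

    def detect(self, image):
        return [{"label": "example"}] if self.enabled else []
```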
## Installation

### Prerequisites

- ROS 2 Humble
- Python 3.10+
- Ubuntu 22.04
- Dependencies listed in `package.xml` and `requirements.txt`
1. Clone the repository into your ROS 2 workspace:

   ```bash
   cd ~/fbot_ws/src
   git clone https://github.com/fbotathome/fbot_vision.git
   ```

2. Install dependencies:

   ```bash
   cd ~/fbot_ws
   sudo rosdep init  # Skip if already initialized
   rosdep update
   rosdep install --from-paths src --ignore-src -r -y
   pip install -r src/fbot_vision/requirements.txt
   ```

3. Build the workspace:

   ```bash
   cd ~/fbot_ws
   colcon build --packages-select fbot_recognition fbot_vlm fbot_vision_msgs
   source install/setup.bash
   ```
## Usage

### Object detection (YOLOv8)

```bash
# Launch YOLOv8 object detection
ros2 launch fbot_recognition yolov8_object_recognition.launch.py use_realsense:=True

# Start/stop the detection service
ros2 service call /fbot_vision/fr/object_start std_srvs/srv/Empty
ros2 service call /fbot_vision/fr/object_stop std_srvs/srv/Empty
```
### Person tracking with pose estimation

```bash
# Launch the YOLO tracker with pose estimation
ros2 launch fbot_recognition yolo_tracker_recognition.launch.py use_realsense:=True

# Start/stop tracking
ros2 service call /fbot_vision/pt/start std_srvs/srv/Empty
ros2 service call /fbot_vision/pt/stop std_srvs/srv/Empty
```
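The core job of any tracker is to associate fresh detections with existing track IDs from frame to frame. A toy sketch of greedy IoU-based association illustrates the idea (this is not the package's actual implementation, which delegates tracking to YOLO):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0


def associate(tracks, detections, threshold=0.3):
    """Greedily match each existing track to its best unused detection.

    tracks: dict of track_id -> box; detections: list of boxes.
    Returns dict of track_id -> detection index.
    """
    assignments, used = {}, set()
    for track_id, box in tracks.items():
        best, best_iou = None, threshold
        for i, det in enumerate(detections):
            if i in used:
                continue
            score = iou(box, det)
            if score > best_iou:
                best, best_iou = i, score
        if best is not None:
            assignments[track_id] = best
            used.add(best)
    return assignments
```

Detections left unmatched would typically spawn new track IDs, and tracks unmatched for several frames would be dropped.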
### Face recognition

```bash
# Launch face recognition
ros2 launch fbot_recognition face_recognition.launch.py

# Introduce a new person
ros2 service call /fbot_vision/face_recognition/people_introducing \
  fbot_vision_msgs/srv/PeopleIntroducing "{name: 'John Doe'}"

# Forget an existing person in the database
ros2 service call /fbot_vision/face_recognition/people_forgetting \
  fbot_vision_msgs/srv/PeopleForgetting "{name: 'John Doe'}"
```
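Conceptually, `people_introducing` stores face embeddings under a name and later recognition returns the closest stored identity, while `people_forgetting` removes it. A toy sketch of that idea (illustrative only; the actual face encoder and storage are internal to the package):

```python
import math


def _cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class FaceDatabase:
    """Toy identity store: name -> list of embedding vectors."""

    def __init__(self, threshold=0.6):
        self.people = {}
        self.threshold = threshold  # Minimum similarity to accept a match

    def introduce(self, name, embedding):
        # Mirrors the people_introducing service: remember this person.
        self.people.setdefault(name, []).append(embedding)

    def forget(self, name):
        # Mirrors the people_forgetting service: drop this person.
        self.people.pop(name, None)

    def identify(self, embedding):
        # Return the best-matching name, or None if nobody is close enough.
        best_name, best_sim = None, self.threshold
        for name, embeddings in self.people.items():
            for known in embeddings:
                sim = _cosine(embedding, known)
                if sim > best_sim:
                    best_name, best_sim = name, sim
        return best_name
```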
### Moondream object recognition

```bash
# Launch Moondream object recognition (local)
ros2 launch fbot_recognition moondream_object_recognition.launch.py use_remote:=false use_realsense:=True

# Set the object prompt (class to detect)
ros2 topic pub /fbot_vision/fr/object_prompt std_msgs/String "data: 'cup'"
```
### Vision Language Model (VLM)

```bash
# Launch the VLM service
ros2 launch fbot_vlm vlm.launch.py

# Ask questions about the current camera view (uses the live camera feed)
ros2 service call /fbot_vision/vlm/question_answering/query \
  fbot_vision_msgs/srv/VLMQuestionAnswering "{question: 'What do you see?', use_image: true}"

# Ask text-only questions (no image processing)
ros2 service call /fbot_vision/vlm/question_answering/query \
  fbot_vision_msgs/srv/VLMQuestionAnswering "{question: 'What is the capital of France?', use_image: false}"

# Ask questions about a specific image (provide a custom image)
ros2 service call /fbot_vision/vlm/question_answering/query \
  fbot_vision_msgs/srv/VLMQuestionAnswering "{
    question: 'Describe this image in detail',
    use_image: true,
    image: {
      header: {stamp: {sec: 0, nanosec: 0}, frame_id: 'camera_link'},
      height: 480, width: 640, encoding: 'rgb8',
      is_bigendian: false, step: 1920,
      data: [/* image data bytes */]
    }
  }"
```
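When filling the `image` field by hand, `step` must equal `width` times the bytes per pixel of the encoding, and `data` must hold `step * height` bytes (640 × 3 = 1920 bytes per row in the example above). A small helper to keep those fields consistent (a sketch; only a few common encodings are mapped here):

```python
# Bytes per pixel for a few common sensor_msgs/Image encodings.
BYTES_PER_PIXEL = {"rgb8": 3, "bgr8": 3, "mono8": 1, "rgba8": 4}


def image_msg_fields(width, height, encoding="rgb8", frame_id="camera_link"):
    """Build a consistent dict of sensor_msgs/Image fields."""
    bpp = BYTES_PER_PIXEL[encoding]
    step = width * bpp  # Bytes per image row
    return {
        "header": {"frame_id": frame_id},
        "height": height,
        "width": width,
        "encoding": encoding,
        "is_bigendian": False,
        "step": step,
        "expected_data_length": step * height,  # Required len(data)
    }
```

If `step` or the data length disagrees with `width`, `height`, and `encoding`, the image will be deserialized incorrectly on the receiving side.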
```bash
# Get the full VLM conversation history
ros2 service call /fbot_vision/vlm/answer_history/query \
  fbot_vision_msgs/srv/VLMAnswerHistory "{questions_filter: []}"

# Get history for specific questions only
ros2 service call /fbot_vision/vlm/answer_history/query \
  fbot_vision_msgs/srv/VLMAnswerHistory "{questions_filter: ['What do you see?', 'Describe the scene']}"
```
## fbot_vision messages and services

### Topics

| Topic | Type | Description |
|---|---|---|
| `/fbot_vision/fr/object_recognition` | `Detection3DArray` | 3D object detections |
| `/fbot_vision/pt/tracking3D` | `Detection3DArray` | 3D person tracking |
| `/fbot_vision/fr/face_recognition` | `Detection3DArray` | 3D face recognition |
| `/fbot_vision/vlm/question_answering/query` | `VLMQuestion` | VLM questions |
| `/fbot_vision/vlm/question_answering/answer` | `VLMAnswer` | VLM responses |
| `/fbot_vision/fr/object_prompt` | `std_msgs/String` | Object prompt for Moondream |
### Services

| Service | Type | Description |
|---|---|---|
| `/fbot_vision/fr/object_start` | `std_srvs/Empty` | Start object detection |
| `/fbot_vision/fr/object_stop` | `std_srvs/Empty` | Stop object detection |
| `/fbot_vision/pt/start` | `std_srvs/Empty` | Start person tracking |
| `/fbot_vision/pt/stop` | `std_srvs/Empty` | Stop person tracking |
| `/fbot_vision/vlm/question_answering/query` | `VLMQuestionAnswering` | Ask the VLM questions |
| `/fbot_vision/vlm/answer_history/query` | `VLMAnswerHistory` | Get VLM conversation history |
| `/fbot_vision/face_recognition/people_introducing` | `PeopleIntroducing` | Register a new person |
| `/fbot_vision/face_recognition/people_forgetting` | `PeopleForgetting` | Forget an existing person |
| `/fbot_vision/look_at_description` | `LookAtDescription3D` | Look at a specific 3D detection |
## Contributing

1. Create a feature branch (`git checkout -b feat/amazing-feature`)
2. Commit your changes (`git commit -m 'Add amazing feature'`)
3. Push to the branch (`git push origin feat/amazing-feature`)
4. Open a Pull Request