REGROUP-HRI 🤖

REGROUP: A Robot-Centric Group Detection and Tracking System

REGROUP is a new system that enables robots to detect and track groups of people from an ego-centric perspective using a crowd-aware, tracking-by-detection approach (see Figures 1-2).

[Paper] [Video]

Figure 1: REGROUP running on an RGB video stream captured from a mobile robot.


Introduction

To facilitate the Human-Robot Interaction field's transition from dyadic to group interaction, new methods are needed for robots to sense and understand team behavior. We introduce the Robot-Centric Group Detection and Tracking System (REGROUP), a new method that enables robots to detect and track groups of people from an ego-centric perspective using a crowd-aware, tracking-by-detection approach. Our system employs a novel technique that leverages person re-identification deep learning features to address the group data association problem. REGROUP is robust to real-world vision challenges such as occlusion, camera egomotion, shadows, and varying illumination. It also runs in real time on real-world data.

You can use this BibTeX entry if you would like to cite this work (Taylor and Riek, 2022):

@inproceedings{taylor_2022,
author = {Taylor, A. and Riek, L.D.},
title = {REGROUP: A Robot-Centric Group Detection and Tracking System},
booktitle = {Proceedings of the 17th Annual ACM/IEEE Conference on Human-Robot Interaction (HRI)},
year = {2022}
}

REGROUP Overview

Figure 2: Given an image sequence, REGROUP extracts pedestrian patches and computes appearance descriptors using a CNN. It then uses these feature vectors to track pedestrians. REGROUP's group detector uses these tracks to detect groups, then tracks them using group detections and our crowd indication feature (CIF), which enables REGROUP to handle high levels of occlusion.

We use the pedestrian tracker from Deep SORT.
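To make the tracking-by-detection idea concrete, here is a tiny, self-contained sketch (not REGROUP's actual algorithm) that clusters pedestrian boxes into groups by centroid proximity; REGROUP's group detector additionally uses appearance features and the CIF:

import numpy as np

def naive_groups(boxes, dist_thresh=60.0):
    """Toy grouping: merge pedestrian boxes (x, y, w, h) whose
    centroids fall within dist_thresh pixels (single linkage).
    Illustrative only; not REGROUP's group detector."""
    centers = np.array([(x + w / 2.0, y + h / 2.0) for x, y, w, h in boxes])
    labels = list(range(len(boxes)))  # each box starts as its own group
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if np.linalg.norm(centers[i] - centers[j]) < dist_thresh:
                old, new = labels[j], labels[i]
                labels = [new if l == old else l for l in labels]
    return labels

boxes = [(10, 20, 40, 100), (60, 25, 40, 100), (400, 30, 40, 100)]
print(naive_groups(boxes))  # -> [0, 0, 2]: the first two boxes form a group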

Installation

Install REGROUP

Clone the repository and install the Python dependencies with pip:

git clone https://github.com/UCSD-RHC-Lab/regroup-hri.git 
cd regroup-hri 
pip install opencv-python
pip install numpy

Additional libraries to install:

Prerequisites (Pedestrian Detection with YOLO)

Download the YOLOv3 helper script, configuration file, and class name file:

mkdir -p regroup/darknet/model
cd regroup/darknet/model

wget https://opencv-tutorial.readthedocs.io/en/latest/_downloads/549b18ea691a01b06e888f9bb6b35900/yolo1.py

wget https://opencv-tutorial.readthedocs.io/en/latest/_downloads/10e685aad953495a95c17bfecd1649e5/yolov3.cfg

wget https://opencv-tutorial.readthedocs.io/en/latest/_downloads/a9fb13cbea0745f3d11da9017d1b8467/coco.names
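Note: yolo1.py is a helper detection script from the OpenCV tutorial; the pretrained YOLOv3 weights file (yolov3.weights) is also needed in this folder. If you do not already have it, it can be downloaded, for example, from the official YOLO site:

wget https://pjreddie.com/media/files/yolov3.weights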

Prerequisites (Install Person Re-Identification Convolutional Neural Network models)

Download the person re-identification network weights and add the contents of the folder to the regroup/resources/networks folder. (The default model used in the Usage section below is mars-small128.pb from Deep SORT.)
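As a quick sanity check that the model is in place, the frozen graph can be parsed with TensorFlow (a minimal sketch; the path is the default from the Usage section below):

import tensorflow as tf

# Parse the frozen inference graph used for re-ID feature extraction.
path = "regroup/resources/networks/mars-small128.pb"
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile(path, "rb") as f:
    graph_def.ParseFromString(f.read())
print(f"loaded {len(graph_def.node)} graph nodes from {path}")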

Code Structure Overview

regroup-hri contains:

regroup/: stores files to run REGROUP

  • darknet/: contains the YOLO pedestrian detection system
  • resources/: stores person re-identification Convolutional Neural Network models from Deep SORT
  • tools/: stores helper scripts to generate pedestrian detections
  • application_util/: stores group tracking visualization code

images/: images used in this README

data/: stores the dataset discussed in the paper

  • test/: stores the testing dataset

Dataset Setup

The input data for REGROUP is stored in the regroup/data folder (see the example image in Figure 3). REGROUP requires pedestrian detections as input. The dataset should be formatted as follows:

data/: stores the dataset discussed in the paper

  • test/: stores the testing dataset
    • [sequence]/: stores a small dataset with a custom, user-defined sequence name
      • rgb/: folder that stores RGB data
      • det/: folder that stores pedestrian detection files

For example, to create a sequence named group-01:

mkdir -p data/test/group-01/rgb
mkdir -p data/test/group-01/det

Here, [sequence] is group-01.

Figure 3: Example input RGB image for REGROUP.

The pedestrian detection files are formatted as follows (see Figure 4):

<image frame ID>, <pedestrian track ID>, <top-left-x-coordinate>, <top-left-y-coordinate>, <width>, <height>, -1, -1, -1, -1

Figure 4: Example pedestrian detections from data/test/[sequence]/det/det.txt
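For illustration, a minimal sketch of loading such a detection file with NumPy (the file path assumes the group-01 sequence created above):

import numpy as np

# Each row: frame ID, track ID, x, y, width, height, -1, -1, -1, -1
dets = np.loadtxt("data/test/group-01/det/det.txt", delimiter=",")

# Collect all pedestrian boxes for a given frame and convert
# (x, y, w, h) to (x1, y1, x2, y2) corner coordinates.
frame_id = 1
rows = dets[dets[:, 0] == frame_id]
boxes = rows[:, 2:6].copy()
boxes[:, 2:4] += boxes[:, 0:2]
print(f"frame {frame_id}: {len(boxes)} pedestrian boxes")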

Usage

Running REGROUP on a video sample.

The REGROUP system outputs group tracks as bounding box coordinates for each video, consistent with the Multiple Object Tracking literature. It uses the same format as the pedestrian detections (see Figure 4 for an example).

The basic structure of commands is the following:

python regroup-pre-stored-dets.py --sequence_dir=<sequence_dir> --display=<display> --img_output_path=<img_output_path>

where <sequence_dir> is the dataset and group directory, <display> indicates whether the tracker displays the video as it runs, and <img_output_path> is the path to save the image data showing group tracks.

After the dataset is set up, run the tracker on a set of images (e.g., <sequence_dir> shown below) as follows:

python regroup-pre-stored-dets.py --sequence_dir=data/test/[sequence] --display=1 --output_file=data/test/[sequence].txt

Additional parameters include:

  • detection_file: path to pedestrian detections
  • distance_metric: distance metric for data association (default="cosine")
  • output_file: path to the tracking output file. This file will contain the tracking results on completion
  • min_confidence: detection confidence threshold. All detections with a confidence lower than this value are disregarded (default=-1)
  • min_detection_height: threshold on the detection bounding box height. Detections with a height smaller than this value are disregarded (default=40)
  • cif: crowd indication feature threshold (default=4)
  • nms_max_overlap: non-maxima suppression threshold (maximum detection overlap) (default=0.7)
  • max_cosine_distance: gating threshold for the cosine distance metric on object appearance (default=0.9)
  • nn_budget: maximum size of the appearance descriptor gallery. If None, no budget is enforced (default=100)
  • display: show intermediate tracking results (default=0)
  • img_output_path: path to write images
  • group_detector: group detection method (default='regroup')
  • model: path to the frozen TensorFlow inference graph protocol buffer, used to extract features from the person re-identification Convolutional Neural Network (default='regroup/resources/networks/mars-small128.pb')
  • gp_dist: ground plane distance threshold κ (default=20)
  • h_ratio: height ratio threshold α in [0,1] (default=0.8)
  • cost_metric: cost metric (0 = appearance, 1 = motion, 2 = both) (default=2)
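For example, to run on the group-01 sequence with a stricter appearance gate and a higher detection confidence threshold (the values here are illustrative, not tuned recommendations):

python regroup-pre-stored-dets.py --sequence_dir=data/test/group-01 --display=1 --output_file=data/test/group-01.txt --min_confidence=0.3 --max_cosine_distance=0.5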

Generate Pedestrian Detections

REGROUP requires pedestrian detections as input to predict group detections. The following instructions cover running pedestrian detection on a video stream and on pre-stored RGB image sequences extracted from a video. Be sure to store RGB data in the regroup/data/test/[sequence]/rgb directory.

To generate pedestrian detections on a video stream using the OpenCV library, provide the path to the video and the output pedestrian detection filename:

cd regroup/darknet/ 
python generate_ped_detections_video.py --sequence_dir=<../data/test/[sequence]> --video_path=<../data/test/[sequence].mp4> --output_filename=<../data/test/[sequence]/det/det.txt> 

To run YOLO on pre-stored image data using the OpenCV library, provide the path to the directory where the image data is stored and the output pedestrian detection filename:

cd regroup/darknet/
python generate_ped_detections_image.py --sequence_dir=<../data/test/[sequence]> --output_filename=<../data/test/[sequence]/det/det.txt>
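Under the hood, these scripts rely on OpenCV's DNN module with the YOLOv3 files downloaded in the Installation section. Below is a minimal, self-contained sketch of a single-image 'person' detection pass; the image path is an illustrative assumption, and this is not the repository's exact code:

import cv2
import numpy as np

# Load YOLOv3 via OpenCV's DNN module (files from the Installation section).
net = cv2.dnn.readNetFromDarknet("model/yolov3.cfg", "model/yolov3.weights")
classes = open("model/coco.names").read().strip().split("\n")

img = cv2.imread("../data/test/group-01/rgb/000001.jpg")
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())

h, w = img.shape[:2]
for output in outputs:
    for det in output:
        scores = det[5:]
        class_id = int(np.argmax(scores))
        # Keep confident 'person' detections; YOLO reports normalized
        # center-x, center-y, width, height in the first four values.
        if classes[class_id] == "person" and scores[class_id] > 0.5:
            cx, cy, bw, bh = det[:4] * np.array([w, h, w, h])
            print(int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh))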

Different Ways of Deploying REGROUP

REGROUP can be run either on a pre-recorded video stream or on pre-stored pedestrian detections.

REGROUP requires pedestrian detections as input to predict group detections, but there is no need to regenerate them each time you run REGROUP. After collecting pedestrian detections and storing them in the correct format, you can run REGROUP on the detections directly for computer vision benchmarking. This is useful for testing vision systems offline, particularly with the preprocessed pedestrian detections conveniently provided in this repository. For instance, REGROUP performs non-maximum suppression on bounding boxes and automatically removes 'bad' bounding boxes (see the parameters in the Usage section).

Run on video stream

To run REGROUP on a video stream using the OpenCV library, provide the sequence directory and the path to the video:

cd regroup/
python regroup-yolo-video-input.py --sequence_dir=<../data/test/[sequence]> --video_path=<../data/test/[sequence]/[sequence].mp4> 

regroup-yolo-video-input.py will generate an image sequence from the video at video_path and store the images in the sequence_dir.

REGROUP assumes a 640×360 image resolution and a frame rate of at least 30 frames per second, so be sure to pre-process your data accordingly.
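A minimal OpenCV sketch of this pre-processing step (the input and output file names are illustrative):

import cv2

# Re-encode a video at the 640x360 resolution REGROUP assumes.
cap = cv2.VideoCapture("data/test/group-01/group-01.mp4")
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back to 30 fps if unknown
out = cv2.VideoWriter("group-01-640x360.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"), fps, (640, 360))
while True:
    ok, frame = cap.read()
    if not ok:
        break
    out.write(cv2.resize(frame, (640, 360)))
cap.release()
out.release()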

Run on pre-stored pedestrian detections

To run REGROUP offline with pre-stored pedestrian detections, you first need to download YOLO (see the Installation section for details). Store the RGB images in the regroup/data/test/[sequence]/rgb directory, and store the pedestrian detections in a file located at data/test/[sequence]/det/det.txt:

cd regroup/
python regroup-yolo.py --sequence_dir=<../data/test/[sequence]> 

Alternate Group Detectors

We compared REGROUP's group detector to four state-of-the-art group detectors. In the near future, we will provide code for these comparative methods so that HRI researchers can benchmark their vision systems on the group perception problem.

Additional comments

The code in this repository generates the results found in our paper (see Figures 5-7).

Results for group detection experiments:

Figure 5: Results for group detection and tracking ablation experiments.

Figure 6: Group Detection and Tracking Results. We report Multiple Object Tracking Accuracy (MOTA ↑), Multiple Object Tracking Precision (MOTP ↑), Mostly Tracked Targets (MT ↑), Mostly Lost Targets (ML ↓), False Positives (FP ↓), False Negatives (FN ↓), Total Number of ID Switches (IDsw ↓), and end-to-end computation time in seconds per image (t(s)) where ↑ means higher is better and ↓ means lower is better.

Figure 7: Visual results from ablation experiments using REGROUP with NCuts and Self-Tuning-SC.

Further Issues and questions ❓

If you have issues or questions, don't hesitate to contact Angelique Taylor at amt062@eng.ucsd.edu or amt298@cornell.edu.
