This package is a modified version of the autoware_universe/perception/autoware_tensorrt_yolox package.
This package detects target objects, e.g., cars, trucks, bicycles, and pedestrians, and segments target objects such as cars, trucks, buses, pedestrians, buildings, vegetation, roads, and sidewalks in an image, based on the YOLOX model with a multi-header structure.
It also performs traffic light detection and classification using a custom YOLOX model trained on a custom dataset of roughly 2000 traffic light images.
Important
'tl' in all file names means traffic light. The model is not perfect, but it is suitable for testing purposes.

This custom package extends the original YOLOX inference node for Autoware to support traffic light recognition and publishes the results in the standardized `tier4_perception_msgs/msg/TrafficLightArray` message. The YOLOX detection results are shared via `SharedInferenceData` so that multiple nodes (such as the traffic light publisher) can access the same detections. YOLOX inference and traffic light publishing run in a single executable (combined launch via the `yolox_tl` launch file), ensuring tight coupling and shared memory without separate launch processes.
Zheng Ge, Songtao Liu, Feng Wang, Zeming Li, Jian Sun, "YOLOX: Exceeding YOLO Series in 2021", arXiv preprint arXiv:2107.08430, 2021 [ref]
Name | Type | Description |
---|---|---|
`in/image` | `sensor_msgs/Image` | The input image |
Name | Type | Description |
---|---|---|
`out/objects` | `tier4_perception_msgs/DetectedObjectsWithFeature` | The detected objects with 2D bounding boxes |
`out/image` | `sensor_msgs/Image` | The image with 2D bounding boxes for visualization |
`out/mask` | `sensor_msgs/Image` | The semantic segmentation mask |
`out/color_mask` | `sensor_msgs/Image` | The colorized image of the semantic segmentation mask for visualization |
`out/traffic_lights` | `tier4_perception_msgs/TrafficLightArray` | The detected traffic lights' id, type, color, shape, status, and confidence |
- {{json_to_markdown("perception/autoware_tensorrt_yolox/schema/yolox_s_plus_opt.schema.json")}}
- {{json_to_markdown("perception/autoware_tensorrt_yolox/schema/yolox_tiny.schema.json")}}
The label contained in detected 2D bounding boxes (i.e., `out/objects`) will be one of the following:
- CAR
- PEDESTRIAN ("PERSON" will also be categorized as "PEDESTRIAN")
- BUS
- TRUCK
- BICYCLE
- MOTORCYCLE
If other labels (case insensitive) are contained in the file specified via the `label_file` parameter, they are labeled as UNKNOWN, while the detected rectangles are still drawn in the visualization result (`out/image`).
The semantic segmentation mask is a grayscale image in which each pixel value is the index of one of the following classes:
index | semantic name |
---|---|
0 | road |
1 | building |
2 | wall |
3 | UNDER_DRIVABLE |
4 | traffic_light |
5 | traffic_sign |
6 | person |
7 | vehicle |
8 | bike |
9 | road |
10 | sidewalk |
11 | roadPaint |
12 | curbstone |
13 | crosswalk_others |
14 | vegetation |
15 | sky |
All traffic light IDs are 0 by default, but a system to assign different IDs could be implemented using ROIs.
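As a rough illustration of that idea, the sketch below assigns each detection the ID of the expected ROI it overlaps most, falling back to 0 when nothing matches. All names here (`ExpectedRoi`, the IoU helper, the 0.3 threshold) are hypothetical and not part of the package.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical types: a detected box and an expected ROI carrying a map-assigned ID.
struct Box { float x, y, w, h; };
struct ExpectedRoi { int64_t traffic_light_id; Box box; };

// Intersection-over-union of two axis-aligned boxes.
static float iou(const Box & a, const Box & b)
{
  const float x1 = std::max(a.x, b.x);
  const float y1 = std::max(a.y, b.y);
  const float x2 = std::min(a.x + a.w, b.x + b.w);
  const float y2 = std::min(a.y + a.h, b.y + b.h);
  const float inter = std::max(0.0f, x2 - x1) * std::max(0.0f, y2 - y1);
  const float uni = a.w * a.h + b.w * b.h - inter;
  return uni > 0.0f ? inter / uni : 0.0f;
}

// Return the ID of the best-overlapping expected ROI, or 0 when none matches.
int64_t assign_traffic_light_id(const Box & detection, const std::vector<ExpectedRoi> & rois)
{
  int64_t best_id = 0;
  float best_iou = 0.3f;  // minimum overlap required to accept a match (assumed)
  for (const auto & roi : rois) {
    const float overlap = iou(detection, roi.box);
    if (overlap > best_iou) {
      best_iou = overlap;
      best_id = roi.traffic_light_id;
    }
  }
  return best_id;
}
```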
A sample model (named `yolox-tiny.onnx`) is downloaded by the ansible script during the environment preparation stage; if it is not, please follow Manual downloading of artifacts.
To accelerate Non-Maximum Suppression (NMS), a common post-processing step after object detection inference, an `EfficientNMS_TRT` module is attached after the original YOLOX (tiny) network. The `EfficientNMS_TRT` module contains fixed values for `score_threshold` and `nms_threshold`, hence these parameters are ignored when users specify ONNX models that include this module. This package accepts both ONNX models with `EfficientNMS_TRT` attached and models published by the official YOLOX repository (referred to here as "plain" models).
In addition to `yolox-tiny.onnx`, a custom model named `yolox-sPlus-opt-pseudoV2-T4-960x960-T4-seg16cls` is also available. This is a multi-header model based on YOLOX-s, tuned to perform more accurate detection at an execution speed almost comparable to `yolox-tiny`. To get better results with this model, users are recommended to use specific run arguments such as `precision:=int8`, `calibration_algorithm:=Entropy`, and `clip_value:=6.0`. Users can refer to `launch/yolox_sPlus_opt.launch.xml` to see how this model can be used. Besides detection results, this model also outputs an image semantic segmentation result for point cloud filtering purposes.
All models are automatically converted to TensorRT format. The converted files are saved in the same directory as the specified ONNX files with the `.engine` filename extension and are reused from the next run. The conversion process may take a while (typically 10 to 20 minutes), and the inference process is blocked until the conversion completes, so it will take some time until detection results are published (or even appear in the topic list) on the first run.
The custom traffic light model (`tl_class.onnx`) is included in `/model` in ONNX format and is converted to a `.engine` file on the first run. The launch files are set to `fp16` precision by default, but `int8` and `fp32` are also available.
To convert your own model saved in PyTorch's `pth` format into ONNX, you can use the converter offered by the official repository. For convenience, only the procedure is described below; please refer to the official documentation for more details.
For a plain model:

1. Install dependencies

   ```bash
   git clone git@github.com:Megvii-BaseDetection/YOLOX.git
   cd YOLOX
   python3 setup.py develop --user
   ```

2. Convert pth into ONNX

   ```bash
   python3 tools/export_onnx.py \
     --output-name YOUR_YOLOX.onnx \
     -f YOUR_YOLOX.py \
     -c YOUR_YOLOX.pth
   ```

For a model with `EfficientNMS_TRT` attached:

1. Install dependencies

   ```bash
   git clone git@github.com:Megvii-BaseDetection/YOLOX.git
   cd YOLOX
   python3 setup.py develop --user
   pip3 install git+ssh://git@github.com/wep21/yolox_onnx_modifier.git --user
   ```

2. Convert pth into ONNX

   ```bash
   python3 tools/export_onnx.py \
     --output-name YOUR_YOLOX.onnx \
     -f YOUR_YOLOX.py \
     -c YOUR_YOLOX.pth \
     --decode_in_inference
   ```

3. Embed `EfficientNMS_TRT` at the end of YOLOX

   ```bash
   yolox_onnx_modifier YOUR_YOLOX.onnx -o YOUR_YOLOX_WITH_NMS.onnx
   ```
Download tl_class.onnx (35.8 MB)
This model was trained to detect traffic light states such as `red-circle`, `green-arrow`, etc.
class name | AP |
---|---|
UNKNOWN | 0.0966 |
RED-CIRCLE | 0.8132 |
RED-AMBER-CIRCLE | 1.0000 |
AMBER-CIRCLE | 0.9091 |
GREEN-CIRCLE | 0.8106 |
RED-ARROW-LEFT | 0.6659 |
RED-ARROW-RIGHT | 0.9218 |
RED-ARROW-STRAIGHT | 0.9788 |
AMBER-ARROW-LEFT | 0.5455 |
AMBER-ARROW-RIGHT | 0.0000 |
AMBER-ARROW-STRAIGHT | 0.0000 |
GREEN-ARROW-LEFT | 0.9000 |
GREEN-ARROW-RIGHT | 0.9437 |
GREEN-ARROW-STRAIGHT | 0.8075 |
Mean Average Precision (mAP):
- mAP@0.5:0.95: 0.4863
- mAP@0.5: 0.6709
A sample label file (named `label.txt`) and a semantic segmentation color map file (named `semseg_color_map.csv`) are also downloaded automatically during the environment preparation process (NOTE: this label file is incompatible with models that output labels for the COCO dataset, e.g., models from the official YOLOX repository). The label file represents the correspondence between class indices (integers output by the YOLOX network) and class labels (human-readable strings). This package maps class IDs (incremented from 0) to labels according to the order in this file.
The new label file for traffic lights can be found in `/label`, named `label_tl.txt`, and consists of the following 14 classes:
index | class name |
---|---|
0 | UNKNOWN |
1 | RED-CIRCLE |
2 | RED-AMBER-CIRCLE |
3 | AMBER-CIRCLE |
4 | GREEN-CIRCLE |
5 | RED-ARROW-LEFT |
6 | RED-ARROW-RIGHT |
7 | RED-ARROW-STRAIGHT |
8 | AMBER-ARROW-LEFT |
9 | AMBER-ARROW-RIGHT |
10 | AMBER-ARROW-STRAIGHT |
11 | GREEN-ARROW-LEFT |
12 | GREEN-ARROW-RIGHT |
13 | GREEN-ARROW-STRAIGHT |
- Ensure the label file and the ONNX model are included in `autoware_data/tensorrt_yolox`.
- Place the package under the `autoware/src/universe/autoware_universe/perception` directory.
- Build the package and source the environment.
You can use either the `yolox_tl.launch.xml` or the `yolox.launch.py` file to launch this package. For example:

`yolox_tl.launch.xml`:

```bash
ros2 launch autoware_tensorrt_yolox_traffic_light_classifier yolox_tl.launch.xml
```

`yolox.launch.py`:

```bash
ros2 launch autoware_tensorrt_yolox_traffic_light_classifier yolox.launch.py
```
Open a new terminal, play a ROS bag, and check the `out/traffic_lights` topic for the results. Example of playing a ROS bag:

```bash
ros2 bag play <PATH-TO-MCAP> --remap /sensing/camera/pinhole/image_raw:=/sensing/camera/camera0/image_rect_color --clock
```
The `main.cpp` file serves as the entry point for the `autoware_tensorrt_yolox_traffic_light_classifier` package. It is responsible for initializing and managing the execution of the YOLOX inference node and the traffic light publisher node. Below are the key details:

- Node Initialization:
  - The `TrtYoloXNode` is initialized to handle YOLOX-based object detection, including traffic light detection.
  - The `TrafficLightPublisherNode` is initialized to process and publish traffic light detection results.
- Shared Data Structure:
  - A `SharedInferenceData` object is created and shared between the two nodes. This structure ensures that the detected objects from the YOLOX inference node are accessible to the traffic light publisher node in a thread-safe manner.
- Multi-Threaded Execution:
  - A `rclcpp::executors::MultiThreadedExecutor` is used to manage the concurrent execution of the two nodes. This ensures that both nodes can operate simultaneously without blocking each other, enabling efficient processing and publishing of traffic light data.
- Lifecycle Management:
  - The executor is spun to keep the nodes running, ensuring that they continue to process data and publish results until the application is terminated.
- Error Handling:
  - Basic error handling is implemented to ensure that the application exits gracefully in case of initialization failures or runtime errors.

This file plays a crucial role in orchestrating the interaction between the YOLOX inference and traffic light publishing components, ensuring seamless and efficient operation of the package.
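A minimal sketch of how such an entry point can be wired together is shown below. The node class names match the description above, but the include paths and constructor signatures are assumptions rather than the package's verbatim code.

```cpp
#include <memory>

#include <rclcpp/rclcpp.hpp>

// Hypothetical headers; the actual include paths in the package may differ.
#include "autoware/tensorrt_yolox/shared_inference_data.hpp"
#include "autoware/tensorrt_yolox/tensorrt_yolox_node.hpp"
#include "autoware/tensorrt_yolox/traffic_light_publisher_node.hpp"

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  try {
    // One shared, mutex-guarded container for the latest detections.
    auto shared_data = std::make_shared<SharedInferenceData>();

    // Both nodes receive the same shared_data pointer (assumed constructors).
    auto yolox_node = std::make_shared<TrtYoloXNode>(rclcpp::NodeOptions(), shared_data);
    auto tl_node = std::make_shared<TrafficLightPublisherNode>(shared_data);

    // Run both nodes concurrently in a single process.
    rclcpp::executors::MultiThreadedExecutor executor;
    executor.add_node(yolox_node);
    executor.add_node(tl_node);
    executor.spin();
  } catch (const std::exception & e) {
    // Basic error handling: log and exit gracefully.
    RCLCPP_ERROR(rclcpp::get_logger("main"), "Initialization failed: %s", e.what());
    rclcpp::shutdown();
    return 1;
  }
  rclcpp::shutdown();
  return 0;
}
```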
The `TrafficLightPublisherNode` is a new addition designed to publish traffic light information. It subscribes to shared inference data and publishes traffic light arrays (`tier4_perception_msgs::msg::TrafficLightArray`) at regular intervals. Below are the key details:

- Node Initialization:
  - The node is initialized with a shared data structure (`SharedInferenceData`) that contains detected objects.
  - A reliable QoS policy is used for the publisher to ensure message delivery.
  - A timer is set up to periodically call the `publishTrafficLights` method.
- Traffic Light Publishing:
  - The `publishTrafficLights` method retrieves the latest detected objects from the shared data structure, using a mutex for thread safety.
  - For each detected object, a `tier4_perception_msgs::msg::TrafficLightElement` message is created based on the object's label (e.g., `RED-CIRCLE`, `GREEN-ARROW-LEFT`).
  - The color and shape of the traffic light are determined using a series of conditional checks.
  - The traffic light elements are added to a `tier4_perception_msgs::msg::TrafficLight` message, which is then added to a `TrafficLightArray` message.
  - The `TrafficLightArray` message is published to the `~/out/traffic_lights` topic.
- Logging:
  - Informational logs are added to provide insights into the detected objects and the publishing process.
  - Warnings are logged if no detected objects are available.
- Class Definition:
  - The `TrafficLightPublisherNode` class inherits from `rclcpp::Node`.
  - It includes a constructor that initializes the node and a destructor that logs when the node is destroyed.
- Private Members:
  - `shared_data_`: A shared pointer to the `SharedInferenceData` structure containing detected objects.
  - `tl_pub_`: A publisher for `tier4_perception_msgs::msg::TrafficLightArray` messages.
  - `timer_`: A timer that triggers the periodic publishing of traffic light messages.
These files work together to ensure that traffic light detection results are published in a standardized format, enabling downstream nodes to process the information effectively.
The `tensorrt_yolox` implementation has been extended to support traffic light detection. Key changes include:

- Integration of Traffic Light-Specific Attributes:
  - The `Object` structure has been modified to include traffic light-specific attributes such as `label` and `type`. This allows the model to classify and differentiate between various traffic light states.
- Traffic Light-Specific Preprocessing and Postprocessing:
  - While the original implementation already included preprocessing and postprocessing steps, these have been extended to handle traffic light-specific data. This ensures accurate scaling, alignment, and decoding of traffic light-specific outputs.
- Support for Multi-Task Models:
  - The implementation now supports multi-task models that can simultaneously detect traffic lights and perform other tasks such as semantic segmentation. This is achieved through the addition of new data structures and utility functions.
- Error Handling and Logging:
  - Improved error handling and logging mechanisms have been introduced to provide better insights into the model's performance and potential issues during inference.

These changes ensure that the `tensorrt_yolox` implementation is optimized for traffic light detection, providing accurate and efficient results. A sketch of the extended `Object` structure follows.
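The base fields below mirror the upstream `autoware::tensorrt_yolox::Object`; the traffic light-specific `label` member is an assumption based on the description above, added alongside the existing `type` class index.

```cpp
#include <cstdint>
#include <string>
#include <vector>

namespace autoware::tensorrt_yolox
{
// Upstream-style 2D detection, extended with a traffic light attribute.
struct Object
{
  // Base detection fields (as in the upstream package).
  int32_t x_offset;  // left edge of the bounding box [px]
  int32_t y_offset;  // top edge of the bounding box [px]
  int32_t height;    // bounding box height [px]
  int32_t width;     // bounding box width [px]
  float score;       // detection confidence
  int32_t type;      // class index from the label file

  // Traffic light-specific addition (assumed): human-readable label,
  // e.g. "RED-CIRCLE" or "GREEN-ARROW-LEFT" from label_tl.txt.
  std::string label;
};

using ObjectArray = std::vector<Object>;
using ObjectArrays = std::vector<ObjectArray>;
}  // namespace autoware::tensorrt_yolox
```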
The `SharedInferenceData` structure is a critical component of the package, designed to facilitate thread-safe communication between the YOLOX inference node and the traffic light publisher node. Below are the key details:

- Purpose:
  - Acts as a shared data structure to store and manage detected objects, ensuring seamless communication between nodes.
- Components:
  - `std::mutex`: Ensures thread-safe access to the shared data, preventing race conditions when multiple threads access or modify the data simultaneously.
  - `std::vector<autoware::tensorrt_yolox::ObjectArrays>`: Stores the detected objects, which include traffic light attributes such as label, type, and confidence.
- Usage:
  - The YOLOX inference node populates this structure with detection results.
  - The traffic light publisher node retrieves the data to create and publish `TrafficLightArray` messages.

This structure ensures efficient and safe data sharing in a multi-threaded environment, enabling accurate and timely traffic light detection and publishing.
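Given the components listed above, the structure itself can be as small as the following sketch (the member name `detected_objects` and the header path are assumptions):

```cpp
#include <mutex>
#include <vector>

#include "autoware/tensorrt_yolox/tensorrt_yolox.hpp"  // assumed path; defines ObjectArrays

// Written by TrtYoloXNode after each inference, read by TrafficLightPublisherNode.
struct SharedInferenceData
{
  std::mutex mutex;  // guards all access to detected_objects
  std::vector<autoware::tensorrt_yolox::ObjectArrays> detected_objects;
};
```

The inference node takes the lock and overwrites `detected_objects` after each frame; the publisher node performs the mirror-image locked read, as in the `publishTrafficLights` sketch above.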
In the original Autoware setup, launch files were primarily XML-based. These files were static and less flexible for complex configurations. With the introduction of Python-based launch files, users can now leverage Python's scripting capabilities to create more dynamic and modular launch configurations. This transition enhances maintainability and scalability, especially for large projects like Autoware.
The flow of launch files in the perception stack typically follows this sequence:

1. `autoware.launch.xml` → Main entry point for launching the Autoware system.
2. `tier4_perception_component.launch.xml` → Launches perception-related components.
3. `perception.launch.xml` → Configures and launches specific perception nodes.

In this package, the equivalent Python-based launch files are:

- `autoware.launch.py`
- `tier4_perception_component.launch.py`
- `perception.launch.py`
- `autoware.launch.py`:
  - Serves as the main entry point for launching the Autoware system.
  - Dynamically includes other launch files based on user-defined parameters.
- `perception.launch.py`:
  - Configures and launches perception-related nodes, including the traffic light classifier.
  - Allows for dynamic parameter adjustments, such as model paths and inference settings.
- `tier4_perception_component.launch.py`:
  - Launches specific perception components, such as object detection and traffic light classification.
  - Integrates seamlessly with the rest of the Autoware perception stack.
The perception stack in Autoware is designed to process sensor data and provide meaningful information for autonomous driving. The flow of launch files in the perception stack is as follows:

- `autoware.launch.xml`:
  - Launches the entire Autoware system, including perception, planning, and control modules.
- `tier4_perception_component.launch.xml`:
  - Focuses on perception-related components, such as object detection and traffic light classification.
- `perception.launch.xml`:
  - Configures and launches specific perception nodes, such as the YOLOX-based traffic light classifier.

This package provides Python-based equivalents for these XML launch files, enabling users to leverage the benefits of Python scripting.
To illustrate the transition from XML-based launch files to Python-based ones, refer to the following diagram:
This diagram illustrates how the original XML-based launch file structures map to their Python-based counterparts. You can replace the XML files with Python files while maintaining the same structure.
The `use_custom_yolox_tl` switch is a key feature of this package that allows users to toggle between the custom YOLOX-based traffic light classifier and the default traffic light recognition module. This switch is defined as a launch argument in the `perception.launch.py` file and is set to `false` by default.
- Custom YOLOX Traffic Light Classifier:
  - When `use_custom_yolox_tl` is set to `true`, the custom YOLOX-based traffic light classifier is used.
  - This includes the following components:
    - A decompression node for handling compressed images.
    - The YOLOX-based traffic light classifier node.
- Default Traffic Light Recognition:
  - When `use_custom_yolox_tl` is set to `false`, the default traffic light recognition module is used.
  - This module is defined in the `traffic_light.launch.xml` file.
To enable the custom YOLOX-based traffic light classifier, set the `use_custom_yolox_tl` argument to `true` when launching Autoware:

```bash
ros2 launch autoware_launch perception.launch.py <params> use_custom_yolox_tl:=true
```

To use the default traffic light recognition module, leave the argument as `false`.