
autoware_tensorrt_yolox_traffic_light_classifier

This package is a modified version of the autoware_universe/perception/autoware_tensorrt_yolox package.

Original Purpose

This package detects target objects (e.g., cars, trucks, bicycles, and pedestrians) and segments target regions (e.g., cars, trucks, buses, pedestrians, buildings, vegetation, roads, and sidewalks) in an image, based on a YOLOX model with a multi-header structure.

New Purpose

Traffic light detection and classification using a custom YOLOX model trained on a custom dataset of approximately 2,000 traffic light images.

Important: 'tl' in file names stands for traffic light. The model is not perfect, but it is suitable for testing purposes.

Demo

demo

Watch the full demo video

Inner-workings / Algorithms

This custom package extends the original YOLOX inference node for Autoware to support traffic light recognition and publishes the results in the standardized tier4_perception_msgs/msg/TrafficLightArray message. The YOLOX detection results are shared via SharedInferenceData so that multiple nodes (such as the traffic light publisher) can access the same detections. YOLOX inference and traffic light publishing run in a single executable (launched together via the yolox_tl launch file), ensuring tight coupling and shared memory without separate launch processes.

Cite

Zheng Ge, Songtao Liu, Feng Wang, Zeming Li, Jian Sun, "YOLOX: Exceeding YOLO Series in 2021", arXiv preprint arXiv:2107.08430, 2021 [ref]

Inputs / Outputs

Input

| Name | Type | Description |
| -------- | ----------------- | --------------- |
| in/image | sensor_msgs/Image | The input image |

Output

| Name | Type | Description |
| ------------------ | ------------------------------------------------ | ----------- |
| out/objects | tier4_perception_msgs/DetectedObjectsWithFeature | The detected objects with 2D bounding boxes |
| out/image | sensor_msgs/Image | The image with 2D bounding boxes for visualization |
| out/mask | sensor_msgs/Image | The semantic segmentation mask |
| out/color_mask | sensor_msgs/Image | The colorized semantic segmentation mask for visualization |
| out/traffic_lights | tier4_perception_msgs/TrafficLightArray | The detected traffic lights' ID, type, color, shape, status, and confidence |

Parameters

  • {{json_to_markdown("perception/autoware_tensorrt_yolox/schema/yolox_s_plus_opt.schema.json")}}

  • {{json_to_markdown("perception/autoware_tensorrt_yolox/schema/yolox_tiny.schema.json")}}

Assumptions / Known limits

Original

The label contained in detected 2D bounding boxes (i.e., out/objects) will be one of the following:

  • CAR
  • PEDESTRIAN ("PERSON" will also be categorized as "PEDESTRIAN")
  • BUS
  • TRUCK
  • BICYCLE
  • MOTORCYCLE

If other labels (case insensitive) are contained in the file specified via the label_file parameter, those are labeled as UNKNOWN, while detected rectangles are drawn in the visualization result (out/image).

The semantic segmentation mask is a grayscale image in which each pixel value is the index of one of the following classes:

| Index | Semantic name |
| ----- | ---------------- |
| 0 | road |
| 1 | building |
| 2 | wall |
| 3 | UNDER_DRIVABLE |
| 4 | traffic_light |
| 5 | traffic_sign |
| 6 | person |
| 7 | vehicle |
| 8 | bike |
| 9 | road |
| 10 | sidewalk |
| 11 | roadPaint |
| 12 | curbstone |
| 13 | crosswalk_others |
| 14 | vegetation |
| 15 | sky |

Modified

All traffic light IDs default to 0, but a system that assigns distinct IDs could be implemented using ROIs.

Onnx model

A sample model (named yolox-tiny.onnx) is downloaded by an Ansible script during the environment preparation stage; if it is not, please follow Manual downloading of artifacts. To accelerate non-maximum suppression (NMS), a common post-processing step after object detection inference, the EfficientNMS_TRT module is attached after the original YOLOX (tiny) network. The EfficientNMS_TRT module contains fixed values for score_threshold and nms_threshold, hence these parameters are ignored when users specify ONNX models that include this module.

This package accepts both ONNX models with EfficientNMS_TRT attached and models published from the official YOLOX repository (we refer to the latter as "plain" models).

In addition to yolox-tiny.onnx, a custom model named yolox-sPlus-opt-pseudoV2-T4-960x960-T4-seg16cls is also available. This multi-header model is based on YOLOX-s and is tuned to perform more accurate detection at an execution speed almost comparable to yolox-tiny. To get better results with this model, users are recommended to use specific run arguments such as precision:=int8, calibration_algorithm:=Entropy, and clip_value:=6.0. Refer to launch/yolox_sPlus_opt.launch.xml to see how this model can be used. Besides detection results, this model also outputs image semantic segmentation results for point cloud filtering purposes.

All models are automatically converted to TensorRT format. The converted files are saved in the same directory as the specified ONNX files with a .engine filename extension and are reused on subsequent runs. The conversion process may take a while (typically 10 to 20 minutes), and the inference process is blocked until the conversion completes, so on the first run it will take some time before detection results are published (or even appear in the topic list).

Modified

The custom traffic light model (tl_class.onnx) is included in /model in .onnx format and is converted to a .engine file on the first run. The launch files default to fp16, but int8 and fp32 are also available.

Package acceptable model generation

To convert a user's own model saved in PyTorch's pth format into ONNX, users can use the converter offered by the official repository. For convenience, only the procedure is described below. Please refer to the official documentation for more detail.

For plain models

  1. Install dependency

    git clone git@github.com:Megvii-BaseDetection/YOLOX.git
    cd YOLOX
    python3 setup.py develop --user
  2. Convert pth into ONNX

    python3 tools/export_onnx.py \
      --output-name YOUR_YOLOX.onnx \
      -f YOUR_YOLOX.py \
      -c YOUR_YOLOX.pth

For EfficientNMS_TRT embedded models

  1. Install dependency

    git clone git@github.com:Megvii-BaseDetection/YOLOX.git
    cd YOLOX
    python3 setup.py develop --user
    pip3 install git+ssh://git@github.com/wep21/yolox_onnx_modifier.git --user
  2. Convert pth into ONNX

    python3 tools/export_onnx.py \
      --output-name YOUR_YOLOX.onnx \
      -f YOUR_YOLOX.py \
      -c YOUR_YOLOX.pth \
      --decode_in_inference
  3. Embed EfficientNMS_TRT to the end of YOLOX

    yolox_onnx_modifier YOUR_YOLOX.onnx -o YOUR_YOLOX_WITH_NMS.onnx

Download ONNX model

Download tl_class.onnx (35.8 MB)

This model was trained to detect traffic light states like red-circle, green-arrow, etc.

AP for UNKNOWN: 0.0966
AP for RED-CIRCLE: 0.8132
AP for RED-AMBER-CIRCLE: 1.0000
AP for AMBER-CIRCLE: 0.9091
AP for GREEN-CIRCLE: 0.8106
AP for RED-ARROW-LEFT: 0.6659
AP for RED-ARROW-RIGHT: 0.9218
AP for RED-ARROW-STRAIGHT: 0.9788
AP for AMBER-ARROW-LEFT: 0.5455
AP for AMBER-ARROW-RIGHT: 0.0000
AP for AMBER-ARROW-STRAIGHT: 0.0000
AP for GREEN-ARROW-LEFT: 0.9000
AP for GREEN-ARROW-RIGHT: 0.9437
AP for GREEN-ARROW-STRAIGHT: 0.8075

Mean Average Precision (mAP):

  • mAP@0.5:0.95: 0.4863
  • mAP@0.5: 0.6709

Label file

A sample label file (named label.txt) and a semantic segmentation color map file (named semseg_color_map.csv) are also downloaded automatically during the environment preparation process. (NOTE: these files are incompatible with models that output labels for the COCO dataset, e.g., models from the official YOLOX repository.)

This file defines the correspondence between class indices (integers output by the YOLOX network) and class labels (human-readable strings). This package maps class IDs (starting from 0) to labels according to the order in this file.

The new label file for traffic lights, named label_tl.txt, can be found in /label and consists of the following 14 classes:

| Index | Class name |
| ----- | -------------------- |
| 0 | UNKNOWN |
| 1 | RED-CIRCLE |
| 2 | RED-AMBER-CIRCLE |
| 3 | AMBER-CIRCLE |
| 4 | GREEN-CIRCLE |
| 5 | RED-ARROW-LEFT |
| 6 | RED-ARROW-RIGHT |
| 7 | RED-ARROW-STRAIGHT |
| 8 | AMBER-ARROW-LEFT |
| 9 | AMBER-ARROW-RIGHT |
| 10 | AMBER-ARROW-STRAIGHT |
| 11 | GREEN-ARROW-LEFT |
| 12 | GREEN-ARROW-RIGHT |
| 13 | GREEN-ARROW-STRAIGHT |

How to use this package

Preparations

  • Ensure the label file and the ONNX model are included in autoware_data/tensorrt_yolox.
  • Place the package under the autoware/src/universe/autoware_universe/perception directory.

Launch

Build the package and source the environment.

You can use either the yolox_tl.launch.xml or the yolox.launch.py file to launch this package.

For example:

yolox_tl.launch.xml

ros2 launch autoware_tensorrt_yolox_traffic_light_classifier yolox_tl.launch.xml

yolox.launch.py

ros2 launch autoware_tensorrt_yolox_traffic_light_classifier yolox.launch.py

Open a new terminal, play the ROS bag, and check the out/traffic_lights topic for the results.

Example to run a ros bag

 ros2 bag play <PATH-TO-MCAP> --remap /sensing/camera/pinhole/image_raw:=/sensing/camera/camera0/image_rect_color --clock

Changes

main.cpp

The main.cpp file serves as the entry point for the autoware_tensorrt_yolox_traffic_light_classifier package. It is responsible for initializing and managing the execution of the YOLOX inference node and the traffic light publisher node. Below are the key details:

  1. Node Initialization:

    • The TrtYoloXNode is initialized to handle YOLOX-based object detection, including traffic light detection.
    • The TrafficLightPublisherNode is initialized to process and publish traffic light detection results.
  2. Shared Data Structure:

    • A SharedInferenceData object is created and shared between the two nodes. This structure ensures that the detected objects from the YOLOX inference node are accessible to the traffic light publisher node in a thread-safe manner.
  3. Multi-Threaded Execution:

    • A rclcpp::executors::MultiThreadedExecutor is used to manage the concurrent execution of the two nodes. This ensures that both nodes can operate simultaneously without blocking each other, enabling efficient processing and publishing of traffic light data.
  4. Lifecycle Management:

    • The executor's spin function keeps both nodes processing data and publishing results until the application is terminated.
  5. Error Handling:

    • Basic error handling is implemented to ensure that the application exits gracefully in case of initialization failures or runtime errors.

This file plays a crucial role in orchestrating the interaction between the YOLOX inference and traffic light publishing components, ensuring seamless and efficient operation of the package.
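
For illustration, a minimal sketch of such an entry point is shown below. Header paths and constructor signatures are assumptions inferred from the description above, not the verbatim source:

    // main.cpp sketch (illustrative; header paths and constructor
    // signatures are assumptions, not the verbatim source)
    #include <memory>

    #include <rclcpp/rclcpp.hpp>

    #include "shared_inference_data.hpp"         // SharedInferenceData
    #include "tensorrt_yolox_node.hpp"           // TrtYoloXNode
    #include "traffic_light_publisher_node.hpp"  // TrafficLightPublisherNode

    int main(int argc, char ** argv)
    {
      rclcpp::init(argc, argv);

      // One shared, mutex-protected container for the YOLOX detections.
      auto shared_data = std::make_shared<SharedInferenceData>();

      // Both nodes receive the same SharedInferenceData instance, so the
      // publisher can read what the inference node writes.
      auto yolox_node = std::make_shared<autoware::tensorrt_yolox::TrtYoloXNode>(
        rclcpp::NodeOptions(), shared_data);
      auto tl_node = std::make_shared<TrafficLightPublisherNode>(shared_data);

      // A multi-threaded executor runs both nodes concurrently, so
      // inference never blocks publishing.
      rclcpp::executors::MultiThreadedExecutor executor;
      executor.add_node(yolox_node);
      executor.add_node(tl_node);
      executor.spin();

      rclcpp::shutdown();
      return 0;
    }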

traffic_light_publisher_node.cpp and traffic_light_publisher_node.hpp

The TrafficLightPublisherNode is a new addition designed to publish traffic light information. It subscribes to shared inference data and publishes traffic light arrays (tier4_perception_msgs::msg::TrafficLightArray) at regular intervals. Below are the key details:

traffic_light_publisher_node.cpp

  1. Node Initialization:

    • The node is initialized with a shared data structure (SharedInferenceData) that contains detected objects.
    • A reliable QoS policy is used for the publisher to ensure message delivery.
    • A timer is set up to periodically call the publishTrafficLights method.
  2. Traffic Light Publishing:

    • The publishTrafficLights method retrieves the latest detected objects from the shared data structure using a mutex for thread safety.
    • For each detected object, a tier4_perception_msgs::msg::TrafficLightElement message is created based on the object's label (e.g., RED-CIRCLE, GREEN-ARROW-LEFT).
    • The color and shape of the traffic light are determined using a series of conditional checks.
    • The traffic light elements are added to a tier4_perception_msgs::msg::TrafficLight message, which is then added to a TrafficLightArray message.
    • The TrafficLightArray message is published to the ~/out/traffic_lights topic.
  3. Logging:

    • Informational logs are added to provide insights into the detected objects and the publishing process.
    • Warnings are logged if no detected objects are available.
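
A sketch of what this method might look like is shown below; the member names (shared_data_, tl_pub_) follow the description above, while the label field on the detected object and the exact message field names are assumptions:

    // publishTrafficLights() sketch (illustrative fragment)
    void TrafficLightPublisherNode::publishTrafficLights()
    {
      using tier4_perception_msgs::msg::TrafficLightElement;

      // The mutex guard makes the read race-free against the inference node.
      std::lock_guard<std::mutex> lock(shared_data_->mutex);
      if (shared_data_->objects.empty()) {
        RCLCPP_WARN(this->get_logger(), "No detected objects available");
        return;
      }

      tier4_perception_msgs::msg::TrafficLightArray tl_array;
      tl_array.stamp = this->now();

      tier4_perception_msgs::msg::TrafficLight tl;
      tl.traffic_light_id = 0;  // all IDs default to 0 (see "Known limits")

      // `objects` holds batches of per-image detections; walk the latest batch.
      for (const auto & object_array : shared_data_->objects.back()) {
        for (const auto & object : object_array) {
          TrafficLightElement element;
          // Map the detected label to color/shape via conditional checks.
          if (object.label == "RED-CIRCLE") {
            element.color = TrafficLightElement::RED;
            element.shape = TrafficLightElement::CIRCLE;
          } else if (object.label == "GREEN-ARROW-LEFT") {
            element.color = TrafficLightElement::GREEN;
            element.shape = TrafficLightElement::LEFT_ARROW;
          }  // ... the remaining classes are handled the same way
          element.status = TrafficLightElement::SOLID_ON;
          element.confidence = object.score;
          tl.elements.push_back(element);
        }
      }

      tl_array.signals.push_back(tl);
      tl_pub_->publish(tl_array);
    }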

traffic_light_publisher_node.hpp

  1. Class Definition:

    • The TrafficLightPublisherNode class inherits from rclcpp::Node.
    • It includes a constructor that initializes the node and a destructor that logs when the node is destroyed.
  2. Private Members:

    • shared_data_: A shared pointer to the SharedInferenceData structure containing detected objects.
    • tl_pub_: A publisher for tier4_perception_msgs::msg::TrafficLightArray messages.
    • timer_: A timer that triggers the periodic publishing of traffic light messages.
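
Put together, the header might look roughly like this (a sketch under the same assumptions as above):

    // traffic_light_publisher_node.hpp sketch (illustrative)
    #include <memory>

    #include <rclcpp/rclcpp.hpp>
    #include <tier4_perception_msgs/msg/traffic_light_array.hpp>

    #include "shared_inference_data.hpp"

    class TrafficLightPublisherNode : public rclcpp::Node
    {
    public:
      explicit TrafficLightPublisherNode(std::shared_ptr<SharedInferenceData> shared_data);
      ~TrafficLightPublisherNode() override;  // logs when the node is destroyed

    private:
      void publishTrafficLights();  // called periodically by timer_

      std::shared_ptr<SharedInferenceData> shared_data_;  // detections from YOLOX
      rclcpp::Publisher<tier4_perception_msgs::msg::TrafficLightArray>::SharedPtr tl_pub_;
      rclcpp::TimerBase::SharedPtr timer_;
    };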

These files work together to ensure that traffic light detection results are published in a standardized format, enabling downstream nodes to process the information effectively.

tensorrt_yolox.cpp and tensorrt_yolox.hpp

The tensorrt_yolox implementation has been extended to support traffic light detection. Key changes include:

  1. Integration of Traffic Light-Specific Attributes:

    • The Object structure has been modified to include traffic light-specific attributes such as label and type. This allows the model to classify and differentiate between various traffic light states.
  2. Traffic Light-Specific Preprocessing and Postprocessing:

    • While the original implementation already included preprocessing and postprocessing steps, these have been extended to handle traffic light-specific data. This ensures accurate scaling, alignment, and decoding of traffic light-specific outputs.
  3. Support for Multi-Task Models:

    • The implementation now supports multi-task models that can simultaneously detect traffic lights and perform other tasks such as semantic segmentation. This is achieved through the addition of new data structures and utility functions.
  4. Error Handling and Logging:

    • Improved error handling and logging mechanisms have been introduced to provide better insights into the model's performance and potential issues during inference.

These changes ensure that the tensorrt_yolox implementation is optimized for traffic light detection, providing accurate and efficient results.
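
As a rough sketch, the extended structure could look like the following; the base fields mirror the upstream autoware_tensorrt_yolox package, while the name of the traffic-light addition is an assumption:

    // Extended Object structure sketch (illustrative)
    #include <cstdint>
    #include <string>

    struct Object
    {
      // Base fields, as in the upstream package:
      std::int32_t x_offset;  // bounding-box top-left x
      std::int32_t y_offset;  // bounding-box top-left y
      std::int32_t height;    // bounding-box height
      std::int32_t width;     // bounding-box width
      float score;            // detection confidence
      std::int32_t type;      // class index into the label file

      // Traffic-light-specific addition described above (name assumed):
      std::string label;  // e.g., "RED-CIRCLE", resolved via label_tl.txt
    };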

shared_inference_data.hpp

The SharedInferenceData structure is a critical component of the package, designed to facilitate thread-safe communication between the YOLOX inference node and the traffic light publisher node. Below are the key details:

  1. Purpose:

    • Acts as a shared data structure to store and manage detected objects, ensuring seamless communication between nodes.
  2. Components:

    • std::mutex: Ensures thread-safe access to the shared data, preventing race conditions when multiple threads access or modify the data simultaneously.
    • std::vector<autoware::tensorrt_yolox::ObjectArrays>: Stores the detected objects, which include traffic light attributes such as label, type, and confidence.
  3. Usage:

    • The YOLOX inference node populates this structure with detection results.
    • The traffic light publisher node retrieves the data to create and publish TrafficLightArray messages.

This structure ensures efficient and safe data sharing in a multi-threaded environment, enabling accurate and timely traffic light detection and publishing.
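
A minimal sketch of the header, assuming the two components listed above are its only members:

    // shared_inference_data.hpp sketch (illustrative; the include path
    // for ObjectArrays is an assumption)
    #ifndef SHARED_INFERENCE_DATA_HPP_
    #define SHARED_INFERENCE_DATA_HPP_

    #include <mutex>
    #include <vector>

    #include "tensorrt_yolox/tensorrt_yolox.hpp"  // autoware::tensorrt_yolox::ObjectArrays

    struct SharedInferenceData
    {
      std::mutex mutex;  // guards concurrent access from both nodes
      std::vector<autoware::tensorrt_yolox::ObjectArrays> objects;  // latest detections
    };

    #endif  // SHARED_INFERENCE_DATA_HPP_

The inference node locks the mutex before appending new detections, and the publisher locks it before reading; that pairing is what makes the exchange race-free.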

Autoware Launch

Transition from XML to Python Launch Files

In the original Autoware setup, launch files were primarily XML-based. These files were static and less flexible for complex configurations. With the introduction of Python-based launch files, users can now leverage Python's scripting capabilities to create more dynamic and modular launch configurations. This transition enhances maintainability and scalability, especially for large projects like Autoware.

The flow of launch files in the perception stack typically follows this sequence:

  1. autoware.launch.xml → Main entry point for launching the Autoware system.
  2. tier4_perception_component.launch.xml → Launches perception-related components.
  3. perception.launch.xml → Configures and launches specific perception nodes.

In this package, the equivalent Python-based launch files are:

  1. autoware.launch.py
  2. tier4_perception_component.launch.py
  3. perception.launch.py

Python Launch Files

  1. autoware.launch.py:

    • Serves as the main entry point for launching the Autoware system.
    • Dynamically includes other launch files based on user-defined parameters.
  2. perception.launch.py:

    • Configures and launches perception-related nodes, including the traffic light classifier.
    • Allows for dynamic parameter adjustments, such as model paths and inference settings.
  3. tier4_perception_component.launch.py:

    • Launches specific perception components, such as object detection and traffic light classification.
    • Integrates seamlessly with the rest of the Autoware perception stack.

Perception Stack in Autoware

The perception stack in Autoware is designed to process sensor data and provide meaningful information for autonomous driving. The flow of launch files in the perception stack is as follows:

  1. autoware.launch.xml:

    • Launches the entire Autoware system, including perception, planning, and control modules.
  2. tier4_perception_component.launch.xml:

    • Focuses on perception-related components, such as object detection and traffic light classification.
  3. perception.launch.xml:

    • Configures and launches specific perception nodes, such as the YOLOX-based traffic light classifier.

This package provides Python-based equivalents for these XML launch files, enabling users to leverage the benefits of Python scripting.

Diagram

To illustrate the transition from XML-based launch files to Python-based ones, refer to the following diagram:

Launch File Transition

This diagram illustrates how the original XML-based launch file structures map to their Python-based counterparts. You can replace the XML files with Python files while maintaining the same structure.

Use Custom YOLOX Traffic Light Classifier

The use_custom_yolox_tl switch is a key feature in this package that allows users to toggle between the custom YOLOX-based traffic light classifier and the default traffic light recognition module. This switch is defined as a launch argument in the perception.launch.py file and is set to false by default.

How It Works

  1. Custom YOLOX Traffic Light Classifier:

    • When use_custom_yolox_tl is set to true, the custom YOLOX-based traffic light classifier is used.
    • This includes the following components:
      • A decompression node for handling compressed images.
      • The YOLOX-based traffic light classifier node.
  2. Default Traffic Light Recognition:

    • When use_custom_yolox_tl is set to false, the default traffic light recognition module is used.
    • This module is defined in the traffic_light.launch.xml file.

Example Usage

To enable the custom YOLOX-based traffic light classifier, set the use_custom_yolox_tl argument to true when launching Autoware:

ros2 launch autoware_launch perception.launch.py <params> use_custom_yolox_tl:=true

To use the default traffic light recognition module, leave the argument as false.

Reference repositories
