Task: CARLA to RDS-HQ Scene Converter
Overview
Implement a conversion system that extracts 3D road geometry, sensor data, and vehicle trajectories from CARLA and exports them in RDS-HQ format for use with the NVIDIA Cosmos pipeline and autonomous driving machine learning applications.
Background
RDS-HQ stores 3D road geometry with semantic information, sensor data, and temporal information. The format uses WebDataset TAR files containing JSON geometry data and numpy pose arrays to describe a clip which can be rendered into an HD Map.
CARLA provides OpenDRIVE road networks, lane topology, traffic signs, and a complete sensor system that can be extracted and converted to RDS-HQ format. The conversion requires both static geometry extraction and dynamic multi-camera pose recording.
HD Map Rendering Requirements: The Cosmos rendering pipeline requires more than static geometry. It also needs per-frame camera poses, calibration data, and dynamic object poses to generate correct HD map visualizations from different viewpoints.
Task Objectives
Goals
- Extract Static Road Geometry: Convert CARLA's roads to RDS-HQ geometric primitives using CosmosExporter classes.
- Dynamic Trajectory Recording: Capture vehicle movement with per-frame camera poses and export them.
- WebDataset Compatibility: Output data in proper TAR/JSON/NPY format for streaming.
- HD Map Validation: Ensure the generated JSON files are compatible with the existing Cosmos rendering pipeline.
RDS-HQ Elements for HD Map Generation
Static Geometry
- 3D Lanes (3d_lanes/): Lane boundary polylines.
- 3D Lane Lines (3d_lanelines/): Lane centerline polylines.
- 3D Road Boundaries (3d_road_boundaries/): Road boundary polylines.
- 3D Traffic Signs (3d_traffic_signs/): Extract sign positions as 3D bounding boxes.
- 3D Traffic Lights (3d_traffic_lights/): Extract traffic light positions and states.
- 3D Crosswalks (3d_crosswalks/): Extract crosswalk surfaces.
- 3D Poles (3d_poles/): Extract vertical structures (signs, lights, trees).
- 3D Road Markings (3d_road_markings/): Extract surface road markings.
Dynamic Poses and Sensor Data
- Vehicle Poses (pose/): Per-frame and per-sensor
  - Format: {frame:06d}.pose.{camera_name}.npy
- Transforms: Sensor relative transforms per dynamic object
  - Format:
- Camera Intrinsics: Camera calibration parameters
  - Format: {camera_type}_intrinsic.{camera_name}.npy
  - Camera projection parameters for 3D→2D mapping
  - Format:
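The two naming patterns listed above can be sketched as small helpers. The function names and the example camera name `front` are hypothetical; only the patterns themselves come from this issue.

```python
def pose_filename(frame: int, camera_name: str) -> str:
    """Per-frame, per-camera pose file: {frame:06d}.pose.{camera_name}.npy"""
    return f"{frame:06d}.pose.{camera_name}.npy"


def intrinsic_filename(camera_type: str, camera_name: str) -> str:
    """Camera calibration file: {camera_type}_intrinsic.{camera_name}.npy"""
    return f"{camera_type}_intrinsic.{camera_name}.npy"


# "front" is a made-up camera name used only for illustration.
print(pose_filename(42, "front"))
print(intrinsic_filename("pinhole", "front"))
```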
Format Requirements
- Coordinate System: Vehicle-relative ("rig" frame) in meters
- JSON Schema: Follow RDS-HQ label structure with proper attributes and metadata
- Geometry Types: Use appropriate polylines3d, polyline3d, surface, cuboid3d
- WebDataset: Package in TAR files with correct naming conventions
Implementation Plan
RDS-HQ object type exporters in Unreal Engine:
- ULanesExporter: Extend UCosmosStaticExporter for lane boundary polylines
- UCrosswalksExporter: Extend UCosmosStaticExporter for crosswalk polygons
- ULanelinesExporter: Extend UCosmosStaticExporter for painted lane marking centerlines
- URoadBoundariesExporter: Extend UCosmosStaticExporter for physical road edges
- UTrafficSignsExporter: Extend UCosmosStaticExporter for sign 3D bounding boxes
- UTrafficLightsExporter: Extend UCosmosStaticExporter for traffic light positions
- UPolesExporter: Extend UCosmosStaticExporter for vertical structures
- URoadMarkingsExporter: Extend UCosmosStaticExporter for surface road markings
- UWaitLinesExporter: Extend UCosmosStaticExporter for stop lines
(WIP) Pose Extractions
- Coordinate Transformation: Convert CARLA coordinates to RDS-HQ vehicle-relative frame
- Temporal Consistency: Maintain synchronized timestamps across all cameras
- Numpy Export: Save pose matrices in format expected by Cosmos pipeline
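A minimal sketch of the coordinate transformation step, under explicit assumptions. CARLA uses a left-handed frame (x forward, y right, z up). The sketch assumes the RDS-HQ rig frame is right-handed with x forward, y left, z up, which still needs to be confirmed against the Cosmos pipeline. It also handles yaw only, ignoring pitch and roll. The function name is hypothetical.

```python
import math


def carla_world_to_rig(point, vehicle_location, vehicle_yaw_deg):
    """Express a CARLA world point in a vehicle-relative frame.

    Assumptions (not confirmed by the RDS-HQ documentation):
      * target rig frame is right-handed: x forward, y left, z up
      * vehicle pitch and roll are ignored (yaw-only rotation)
    """
    dx = point[0] - vehicle_location[0]
    dy = point[1] - vehicle_location[1]
    dz = point[2] - vehicle_location[2]

    yaw = math.radians(vehicle_yaw_deg)
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)

    # Project the offset onto the vehicle's forward and right axes
    # (both expressed in CARLA's left-handed world frame).
    x_forward = dx * cos_y + dy * sin_y
    y_right = -dx * sin_y + dy * cos_y

    # Flip the lateral axis to go from y-right to y-left.
    return (x_forward, -y_right, dz)


# Vehicle at (10, 5, 0) facing CARLA +y (yaw = 90 degrees).
print(carla_world_to_rig((10.0, 7.0, 0.0), (10.0, 5.0, 0.0), 90.0))
```

A full implementation would use the complete vehicle transform (including pitch and roll) and emit 4x4 pose matrices. This sketch only illustrates the handedness flip.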
(WIP) Camera Parameters Export
- Export focal length, FOV... (TODO: determine which parameters match the Cosmos pinhole model parameters)
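As a starting point for the pinhole case, the focal length in pixels can be derived from the image width and horizontal field of view that a CARLA camera is configured with. The sketch below assumes square pixels and a principal point at the image center. The function name and the returned key names are hypothetical, and the layout Cosmos expects is still an open question.

```python
import math


def pinhole_from_fov(width: int, height: int, horizontal_fov_deg: float) -> dict:
    """Derive ideal pinhole parameters from image size and horizontal FOV.

    Assumes square pixels (fx == fy) and a centered principal point.
    The key names are placeholders, not the Cosmos pinhole layout.
    """
    half_fov = math.radians(horizontal_fov_deg) / 2.0
    fx = width / (2.0 * math.tan(half_fov))
    return {
        "fx": fx,
        "fy": fx,
        "cx": width / 2.0,
        "cy": height / 2.0,
    }


params = pinhole_from_fov(1920, 1080, 90.0)
print(params)
```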
(WIP) Data Packaging
- TAR File Creation: Package JSON geometry data into WebDataset format
- Numpy Array Handling: Store pose and intrinsic data as .npy files within TARs
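The packaging step could look roughly like the following, using only the Python standard library. The member names in the example are hypothetical, and the exact naming that the Cosmos pipeline expects still has to be verified. The `.npy` payloads are passed in as raw bytes; one way to produce them is `numpy.save` writing into an `io.BytesIO` buffer.

```python
import io
import json
import tarfile


def write_tar(tar_path: str, members: dict) -> None:
    """Write members into an uncompressed TAR file.

    `members` maps a member name to either raw bytes (for example a
    serialized .npy array) or a JSON-serializable object.
    Names are written in sorted order so files sharing a key prefix
    end up adjacent in the archive.
    """
    with tarfile.open(tar_path, "w") as tar:
        for name in sorted(members):
            payload = members[name]
            if isinstance(payload, bytes):
                data = payload
            else:
                data = json.dumps(payload).encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
```

Example usage, with made-up member names and an empty label list:

```python
write_tar("3d_lanelines_example.tar", {
    "clip_0001.lanelines.json": {"labels": []},
})
```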
HD Map Rendering Validation
- Cosmos Pipeline Compatibility: Determine the minimal output that still works with the existing render_from_rds_hq.py
- Multi-View Rendering: Test HD map generation from several different sensor perspectives
Output Integration
- carla_cosmos_gen.py: Extend existing video generation pipeline with RDS-HQ export functionality
- Cosmos Rendering: Full compatibility with render_from_rds_hq.py HD map generation
Tasks
- Extract all RDS-HQ object types from CARLA maps using CosmosExporter classes
- Export camera parameter data for both ftheta and pinhole models
- Generate valid WebDataset TAR files compatible with Cosmos pipeline
- Export lane information (speed limits, lane connectivity, traffic rules)
- Integrate with existing carla_cosmos_gen.py workflow
- Test output compatibility with render_from_rds_hq.py and visualize_rds_hq.py
- Test on all standard CARLA towns (Town01-Town15)
- Ensure resulting geometry handles complex road topologies (intersections, roundabouts, ramps)
- Verify that multi-camera HD map rendering matches expected viewpoints and projections
Potential Challenges
Undocumented Metadata
The RDS-HQ format includes undocumented fields in its JSON structures that don't have direct equivalents in CARLA. These fields appear throughout the geometry definitions. Without clear documentation on which fields are required versus optional for the Cosmos rendering pipeline, we'll need to reverse-engineer appropriate values through experimentation. Many fields will require synthetic generation or placeholder values that maintain format compatibility while not necessarily reflecting accurate data.
Example: Lane Line Metadata
A single lane line in RDS-HQ contains metadata fields beyond the geometric vertices:
{
"sessionId": "2d23a1f4-c269-46aa-8e7d-1bb595d1e421",
"assetRef": "av://clip/b54dce34-5043-46b5-bced-c78357630d73", // Unknown what this refers to exactly
"labelClassNamespace": "minimap", // Unknown if fixed or variable (what does minimap mean?)
"labelClassIdentifier": "lanelines:autolabels", // Unknown classes
"feature_id": "81026252", // What does it mean by feature, is this just the uuid of each object in the scene?
"feature_version": "37301", // Unknown versioning system
"clip_version_id": "6008533", // Unknown versioning system
"timestampMicroseconds": "2445376400000", // CARLA tick to Unix time conversion
"is_first_point_physical_end": "CUT",
"left_driving_direction": ["FORWARD"],
"colors": ["WHITE"],
"styles": ["DASHED"] // Lane markings are country-specific, so colours and styles vary; this data is not included in OpenDRIVE
}
The challenge will be determining the minimum viable set of these fields and generating plausible values that allow the Cosmos pipeline to render HD maps correctly, even if the semantic meaning isn't perfectly preserved from the CARLA source.
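One possible way to begin the experimentation is a generator that fills a subset of the example fields with synthetic values. Everything here is a placeholder: the function name is hypothetical, the choice of fields is a guess at a minimal set, and whether Cosmos accepts these values is exactly what needs testing. The timestamp is derived from simulation time in seconds, such as the elapsed seconds reported by a CARLA world snapshot.

```python
import uuid


def placeholder_laneline_metadata(elapsed_seconds: float, session_id: str = None) -> dict:
    """Build synthetic lane line metadata mirroring the example above.

    All values are placeholders for experimentation, not verified
    requirements of the Cosmos rendering pipeline.
    """
    return {
        # Random UUID standing in for the unknown session identifier.
        "sessionId": session_id or str(uuid.uuid4()),
        # Copied from the example; unknown whether these are fixed.
        "labelClassNamespace": "minimap",
        "labelClassIdentifier": "lanelines:autolabels",
        # Simulation seconds converted to microseconds, stored as a string
        # to match the example's representation.
        "timestampMicroseconds": str(int(round(elapsed_seconds * 1_000_000))),
        # Defaults, since OpenDRIVE does not provide this data here.
        "colors": ["WHITE"],
        "styles": ["DASHED"],
    }


print(placeholder_laneline_metadata(2.5, session_id="example-session"))
```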