Description
Describe the bug
We're in the weeds of the poorly documented TCPROS protocol, and this issue could equally validly be filed on rosrust or roslibrust, but this is where I encountered it.
Within abstract_bridge.rs, this chunk of code in process_query:
    let res = spawn_blocking_runtime(move || {
        let description = RawMessageDescription {
            msg_definition: String::from("*"),
            md5sum: topic.md5.clone(),
            msg_type: topic.datatype.clone(),
        };
        ros1_client.req_with_description(&rosrust::RawMessage(payload), description)
    })
    .await
does not resolve when it should when talking to some ROS1 service server implementations over TCPROS, specifically roslibrust (which is what I'm trying to do). The following happens:
- Request is sent over TCP socket
- Request is received by roslibrust and processed
- Response is sent over TCP socket
- roslibrust holds the TCP socket open waiting to see if more requests are going to come
- the zenoh-ros1-bridge / rosrust receives the response over the TCP socket, but doesn't process it and instead waits for the TCP socket to shut down
- This ultimately leads to the zenoh query timing out even though the bytes of the response were sent to the bridge
I can bypass this issue by modifying roslibrust to shut down the TCP socket from its end after it sends the payload, at which point everything behaves normally.
https://wiki.ros.org/ROS/TCPROS - describes the "persistent" header field of service requests, but poorly describes how service servers are supposed to behave when this field is not present. In the case of the bridge / rosrust this field is NOT included in the header.
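For reference, here is a minimal sketch of how a TCPROS connection header is laid out on the wire as I understand it from the wiki: a 4-byte little-endian total length, followed by one 4-byte little-endian length plus `key=value` string per field. The function name `encode_header` is hypothetical and is not taken from rosrust or roslibrust; it only illustrates where a `persistent=1` field would sit if a client chose to send it.

```rust
/// Hypothetical helper: encodes TCPROS-style header fields.
/// Layout assumed from the TCPROS wiki page:
///   [u32 LE total length][u32 LE field length]["key=value"]...
pub fn encode_header(fields: &[(&str, &str)]) -> Vec<u8> {
    let mut body = Vec::new();
    for (key, value) in fields {
        // Each field is a length-prefixed "key=value" string.
        let field = format!("{}={}", key, value);
        body.extend_from_slice(&(field.len() as u32).to_le_bytes());
        body.extend_from_slice(field.as_bytes());
    }
    // The total length covers every field, including its length prefix.
    let mut out = Vec::with_capacity(4 + body.len());
    out.extend_from_slice(&(body.len() as u32).to_le_bytes());
    out.extend_from_slice(&body);
    out
}
```

A client that wanted to state its intent explicitly would pass `("persistent", "1")` alongside its other fields; a client that omits the field, as the bridge / rosrust does, leaves the server to guess.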
roslibrust has chosen to keep the TCP socket open when the field is not present, which has proven compatible with all of the ROS1 ecosystem we've tested with so far (which isn't everything). I can (and likely will) modify roslibrust's behavior to work around this issue, but regardless, I don't think the bridge should wait for the TCP socket to close before responding to the query. It should respond as soon as the bytes are received.
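To illustrate what I mean by responding as soon as the bytes are received, here is a rough sketch of reading a single service response by its length prefix rather than reading until the socket closes. It assumes the response framing is a 1-byte `ok` flag, then a 4-byte little-endian length, then the payload, which is my understanding of TCPROS but is not spelled out clearly in the wiki. `read_service_response` is a hypothetical name and this is not rosrust's actual code; I haven't traced exactly where rosrust blocks.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads exactly one service response and returns
/// without waiting for EOF, so a server that keeps the socket open
/// does not stall the caller.
/// Assumed framing: [u8 ok][u32 LE length][payload bytes].
pub fn read_service_response<R: Read>(reader: &mut R) -> io::Result<(bool, Vec<u8>)> {
    // 1-byte success flag.
    let mut ok = [0u8; 1];
    reader.read_exact(&mut ok)?;

    // 4-byte little-endian payload length.
    let mut len = [0u8; 4];
    reader.read_exact(&mut len)?;
    let len = u32::from_le_bytes(len) as usize;

    // Read exactly `len` bytes and stop; anything after that belongs
    // to a later response on the same connection.
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;

    Ok((ok[0] != 0, payload))
}
```

The key point is `read_exact` with the advertised length: the call returns once the full response has arrived, whether or not the peer ever closes the connection. Something like `read_to_end` would instead block until the peer shuts the socket down, which matches the symptom described above.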
To reproduce
- Start a zenoh-ros1-bridge with a master
- Start a roslibrust ros1 service server
cargo run --features ros1 --example ros1_service_server
- Call the service over zenoh
Observe that the request makes it to the roslibrust ros1 service server and that the server responds, but the zenoh query times out and the response never makes it back to the source of the query.
System info
Ubuntu 20.04