MAVSDK-Python Hangs After Backend Logs "System discovered" on Ubuntu 20.04 #759

Open
zhouchenc1 opened this issue Apr 15, 2025 · 3 comments
@zhouchenc1

Environment:

  • OS: Ubuntu 20.04.6 LTS
  • Python: 3.9 (via new Conda environment, also tested 3.8)
  • MAVSDK-Python: 3.0.1 (installed via pip install mavsdk, also tested 1.3.0)
  • grpcio: Tested with 1.71.0 and 1.64.1
  • protobuf: Tested with 6.30.2 and 5.27.1
  • Simulator: PX4 SITL v1.14 with Gazebo Classic 11 (default Iris model, also observed with jMAVSim)

Description:

When running a basic Python script to connect to a running PX4 SITL instance using listening modes (udp://:14540 or udpin://:14540), the MAVSDK-Python library hangs.

Specifically, the DEBUG logs show that the mavsdk_server backend (v3.0.0 corresponding to MAVSDK-Python 3.0.1) successfully starts, receives MAVLink packets from SITL (from 127.0.0.1:14580), identifies the system (SysID 1, CompID 1, Vehicle Type 2), and explicitly logs [Info] System discovered. However, the Python script execution stops right after these backend logs are printed. The await drone.connect(...) call never completes, and the script does not proceed to the async for state in drone.core.connection_state(): loop (no Python-level connection state updates are printed).

A similar hang occurs in connecting mode (udpout://127.0.0.1:14550): the script stalls right after the backend version log, and in this mode the backend never logs system discovery.

Expected Behavior:

After the mavsdk_server backend logs "System discovered", the await drone.connect(...) call in Python should complete, and the async for state in drone.core.connection_state(): loop should begin yielding connection states, eventually indicating is_connected=True.

Steps to Reproduce:

  1. Set up Ubuntu 20.04.
  2. Install Conda.
  3. Create and activate a clean Conda environment:
    conda create --name mavsdk_bug_report python=3.9 -y
    conda activate mavsdk_bug_report
  4. Install the latest MAVSDK-Python:
    pip install mavsdk --no-cache-dir
  5. Start PX4 SITL:
    # In PX4-Autopilot directory
    make px4_sitl_default gazebo
  6. Run the following Python script (test_connect.py):
    import logging
    logging.basicConfig(level=logging.DEBUG) # Enable DEBUG logs
    import asyncio
    from mavsdk import System
    import time
    
    async def run():
        print("--- Script starting ---")
        drone = System()
        connect_addr = "udpin://:14540" # Also tested "udp://:14540" with same result
    
        print(f"Attempting to connect using: {connect_addr}")
        await drone.connect(system_address=connect_addr)
    
        # Script never reaches here
        print("Waiting for connection state updates...")
    
        start_time = time.time()
        timeout_seconds = 20
    
        try:
            async for state in drone.core.connection_state():
                current_time = time.time()
                elapsed = current_time - start_time
                print(f"[{elapsed:.2f}s] Connection state: {state}, Is connected: {state.is_connected}")
    
                if state.is_connected:
                    print(f"✅ Drone discovered! (after {elapsed:.2f}s)")
                    break
    
                if elapsed > timeout_seconds:
                    print(f"❌ Connection attempt timed out after {timeout_seconds} seconds.")
                    break
            else:
                print("Connection state stream ended unexpectedly without connecting.")
    
        except Exception as e:
            print(f"An error occurred during connection check: {e}")
        finally:
            print("--- Run function finished ---")
    
    
    if __name__ == "__main__":
        try:
            asyncio.run(run())
        except KeyboardInterrupt:
            print("\nScript interrupted by user.")
        except Exception as e:
            print(f"An error occurred in main execution: {e}")
        finally:
            print("--- Script finished ---")

Logs and Evidence:

  1. Python Script Output (Hangs after backend discovery):
    (This output is observed when running the script above with udpin://:14540 or udp://:14540)

    DEBUG:asyncio:Using selector: EpollSelector
    --- Script starting ---
    Attempting to connect using: udpin://:14540
    DEBUG:grpc._cython.cygrpc:Using AsyncIOEngine.POLLER as I/O engine
    DEBUG:mavsdk.async_plugin_manager:Waiting for mavsdk_server to be ready...
    DEBUG:mavsdk.system:[HH:MM:SS|Info ] MAVSDK version: v3.0.0 (mavsdk_impl.cpp:30) # Timestamp varies
    DEBUG:mavsdk.system:[HH:MM:SS|Info ] Waiting to discover system on udpin://:14540... (connection_initiator.h:20)
    DEBUG:mavsdk.system:[HH:MM:SS|Info ] New system on: 127.0.0.1:14580 (with system ID: 1) (udp_connection.cpp:206)
    DEBUG:mavsdk.system:[HH:MM:SS|Debug] New system ID: 1 Comp ID: 1 (mavsdk_impl.cpp:730)
    DEBUG:mavsdk.system:[HH:MM:SS|Debug] Component Autopilot (1) added. (system_impl.cpp:377)
    DEBUG:mavsdk.system:[HH:MM:SS|Warn ] Vehicle type changed (new type: 2, old type: 0) (system_impl.cpp:217)
    DEBUG:mavsdk.system:[HH:MM:SS|Debug] Discovered 1 component(s) (system_impl.cpp:497)
    DEBUG:mavsdk.system:[HH:MM:SS|Debug] Request message 148 using REQUEST_MESSAGE (mavlink_request_message.cpp:78)
    DEBUG:mavsdk.system:[HH:MM:SS|Info ] System discovered (connection_initiator.h:62)
    DEBUG:mavsdk.system:[HH:MM:SS|Info ] Server started (grpc_server.cpp:173)
    DEBUG:mavsdk.system:[HH:MM:SS|Info ] Server set to listen on 0.0.0.0:50051 (grpc_server.cpp:174)
    # --- HANGS HERE INDEFINITELY ---
    
  2. Manual mavsdk_server Execution (Works Correctly):
    (Output when running the backend standalone with default udp://:14540)

    $ /path/to/conda/envs/mavsdk_clean_test/lib/python3.9/site-packages/mavsdk/bin/mavsdk_server
    [HH:MM:SS|Info ] MAVSDK version: v3.0.0 (mavsdk_impl.cpp:30)
    [HH:MM:SS|Info ] Waiting to discover system on udp://:14540... (connection_initiator.h:20)
    [HH:MM:SS|Warn ] Connection using udp:// is deprecated, please use udpin:// or udpout:// (cli_arg.cpp:28)
    [HH:MM:SS|Info ] New system on: 127.0.0.1:14580 (with system ID: 1) (udp_connection.cpp:206)
    [HH:MM:SS|Debug] New system ID: 1 Comp ID: 1 (mavsdk_impl.cpp:730)
    [HH:MM:SS|Debug] Component Autopilot (1) added. (system_impl.cpp:377)
    [HH:MM:SS|Warn ] Vehicle type changed (new type: 2, old type: 0) (system_impl.cpp:217)
    [HH:MM:SS|Debug] Discovered 1 component(s) (system_impl.cpp:497)
    [HH:MM:SS|Debug] Request message 148 using REQUEST_MESSAGE (mavlink_request_message.cpp:78)
    [HH:MM:SS|Info ] System discovered (connection_initiator.h:62)
    [HH:MM:SS|Info ] Server started (grpc_server.cpp:173)
    [HH:MM:SS|Info ] Server set to listen on 0.0.0.0:50051 (grpc_server.cpp:174)
    [HH:MM:SS|Debug] MAVLink: info: GCS connection regained  (system_impl.cpp:243) # Might vary
    # --- STAYS RUNNING HERE until Ctrl+C ---
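Both runs report a gRPC server on port 50051, so one quick check while the script is hung is whether that port accepts TCP connections at all. The following is a minimal sketch using only the standard library. The helper name grpc_port_is_open is hypothetical, and the default host and port are taken from the log lines above. A successful TCP connect only shows that something is listening; it does not prove that the gRPC service behind it responds.

```python
import socket


def grpc_port_is_open(host: str = "127.0.0.1", port: int = 50051,
                      timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Run this from a second terminal while test_connect.py appears to hang.
    print("port 50051 reachable:", grpc_port_is_open())
```

If the port is reachable while the Python script is still stuck, that would point away from the backend failing to start and toward the client side of the exchange.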

Troubleshooting Steps Taken:

  • Verified network connectivity using tcpdump (PX4 SITL sends packets to UDP 14540).
  • Confirmed mavsdk_server C++ backend discovers the system and runs correctly when executed standalone (using default udp://:14540).
  • Tested in a completely new Conda environment (Python 3.9).
  • Tested with both MAVSDK-Python 3.0.1 and 1.3.0.
  • Tested with both latest dependencies (grpcio 1.71.0, protobuf 6.30.2) and minimum compatible versions (grpcio 1.64.1, protobuf 5.27.1).
  • Tested different connection strings (udp://:14540, udpin://:14540, udpout://127.0.0.1:14550).
  • The hanging behavior after backend discovery (in listening mode) is consistent across all tests.

This suggests a potential issue in the gRPC communication or async handling between the Python library and the backend after the system discovery event occurs on Ubuntu 20.04.
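The timeout in the script above is only checked inside the connection_state() loop, so it never fires if drone.connect(...) itself does not return. A minimal sketch of one way to turn such a hang into a visible error is to wrap the awaited call in asyncio.wait_for. The helper name await_with_timeout and the stand-in coroutine _never_returns are hypothetical, and whether cancelling connect() also stops the spawned mavsdk_server process is not verified here.

```python
import asyncio


async def await_with_timeout(awaitable, timeout_s: float, label: str = "operation"):
    """Await `awaitable`; raise a descriptive TimeoutError if it takes longer than timeout_s."""
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the inner awaitable before this exception is raised.
        raise TimeoutError(f"{label} did not complete within {timeout_s:.1f}s") from None


async def _never_returns():
    # Stand-in for a call that hangs, such as the connect() call described above.
    await asyncio.Event().wait()


async def demo() -> str:
    try:
        await await_with_timeout(_never_returns(), 0.1, label="connect")
    except TimeoutError as exc:
        return str(exc)
    return "completed"


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the reproduction script, the equivalent call would be await await_with_timeout(drone.connect(system_address=connect_addr), 20, "drone.connect"), which would make the script report the stall instead of waiting indefinitely.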

@julianoes
Collaborator

Thanks for the detailed issue. I'll try to reproduce this later this week.

@zhouchenc1
Author

> Thanks for the detailed issue. I'll try to reproduce this later this week.

Great, thank you.

@julianoes julianoes self-assigned this Apr 15, 2025
@julianoes
Collaborator

julianoes commented Apr 15, 2025

I think the devil is in the connection string.

This doesn't work:

connect_addr = "udpin://:14540"

These two work for me:

connect_addr = "udp://:14540"
connect_addr = "udpin://0.0.0.0:14540"

Tested in Ubuntu 22.04 against Gazebo Classic and Ubuntu 24.04 against new Gazebo, but not using conda, just pip.

I think the thing to improve here is to properly give error feedback for wrong connection strings.
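Until such feedback exists in the library, a script could run a small pre-flight check on the connection string before calling drone.connect(...). The following is only a sketch: check_connection_string is a hypothetical helper, not part of MAVSDK. Its rule that udpin:// and udpout:// need an explicit host is inferred solely from the results reported in this thread, and the udp:// deprecation message mirrors the backend warning in the logs above.

```python
from typing import List
from urllib.parse import urlsplit


def check_connection_string(address: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems: List[str] = []
    parts = urlsplit(address)

    if parts.scheme == "udp":
        # The backend log in this thread warns that udp:// is deprecated.
        problems.append("udp:// is deprecated, use udpin:// or udpout://")
    elif parts.scheme in ("udpin", "udpout"):
        # Assumption from this thread: an empty host such as udpin://:14540 is not accepted.
        if parts.hostname is None:
            problems.append(f"{parts.scheme}:// is missing an explicit host")
        try:
            port = parts.port  # raises ValueError for a non-numeric or out-of-range port
        except ValueError:
            port = None
        if port is None:
            problems.append(f"{parts.scheme}:// is missing a valid port")
    # Other schemes are intentionally not judged by this sketch.
    return problems


if __name__ == "__main__":
    for addr in ("udpin://:14540", "udpin://0.0.0.0:14540", "udp://:14540"):
        print(addr, "->", check_connection_string(addr) or "no problems found")
```

The check returns messages rather than raising, so a caller can decide whether a finding such as the udp:// deprecation should stop the script or only be logged.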
