Debugging DVC interactively

skshetry edited this page Oct 20, 2023 · 5 revisions

DVC is a command-line application, so debugging it usually means rerunning it with different commands and arguments, which quickly becomes cumbersome. This guide walks through debugging DVC interactively in VSCode with minimal effort.

Setting up configurations

  1. Install the Python extension for VSCode if you haven't already.

  2. Run pip install debugpy. debugpy must be installed in the same environment as DVC.

  3. Open "Run and Debug" from the VSCode sidebar (shortcut: Ctrl/Cmd + Shift + D).

  4. Click "Create a launch.json file" if you don't have one already.

    Otherwise, click "Add config" from the gear ⚙️ icon. Then select "Python" and then "Remote Attach".

  5. Keep the default hostname ("localhost") and then set the port.

This should create a .vscode/launch.json file similar to the following.

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Remote Attach",
            "type": "python",
            "request": "attach",
            "connect": {
                "host": "localhost",
                "port": 5678
            },
            "pathMappings": [
                {
                    "localRoot": "${workspaceFolder}",
                    "remoteRoot": "."
                }
            ]
        }
    ]
}
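Since debugpy has to live in the same environment as DVC (step 2), it can help to confirm that before going further. The following is a minimal sketch, not part of DVC: the helper name check_debug_environment is made up for illustration. Run it with the same Python interpreter that runs DVC.

```python
import importlib.util
import sys


def check_debug_environment(modules=("dvc", "debugpy")):
    """Map each module name to whether this interpreter can import it."""
    # find_spec returns None when a top-level module cannot be located.
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # The interpreter path shows which environment is actually being checked.
    print(f"interpreter: {sys.executable}")
    for name, found in check_debug_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If either module is reported as missing, install it into the environment shown on the first line.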

Debugging

  1. Set the breakpoints you want. If you are unsure where to start, set one at the top level, where the command is run.

  2. Run the command that you want to debug in a terminal (any terminal, not just the VSCode one :) ).

    python -m debugpy --wait-for-client --listen {port} -m dvc {command} 

    For example, to debug dvc push with the debugger listening on port 5678:

    python -m debugpy --wait-for-client --listen 5678 -m dvc push

    The command won't start running until the debugger is attached.

  3. From "Run and Debug", run the appropriate debugger config ("Python: Remote Attach" in the example above).
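If you retype the command from step 2 often, a small wrapper can assemble it for you. This is only a sketch under assumptions: build_debug_command is a hypothetical helper, not something DVC or debugpy ships, and the flags are the ones used in the command above.

```python
import shlex
import sys


def build_debug_command(dvc_args, port=5678):
    """Return the argv list that runs `dvc <dvc_args>` under debugpy."""
    return [
        sys.executable,        # same interpreter, so the same environment
        "-m", "debugpy",
        "--wait-for-client",   # block until the debugger attaches
        "--listen", str(port),
        "-m", "dvc",
        *dvc_args,
    ]


if __name__ == "__main__":
    # Print the command so it can be pasted into any terminal.
    print(shlex.join(build_debug_command(["push"])))
```

The returned list can also be passed directly to subprocess.run if you would rather launch the command from a script.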

Debugging daemon

Using debugpy, you can attach subprocesses and forked processes to the debugger. First, follow the same steps as above.

Then, copy the following snippet into the dvc.cli.main() function.

import os

if os.environ.get("DVC_DAEMON"):
    import debugpy

    debugpy.listen(5678)  # change the port if needed to match `launch.json`
    debugpy.wait_for_client()  # blocks execution until client is attached

    breakpoint()

If you are using subprocess-based daemons, it is better to put the snippet in dvc/__init__.py, at the top level, before any dvc imports. This does not work for forked daemons, because a forked process copies the modules that are already loaded; in that case, put the snippet inside main().

Then, run any dvc command and let it complete. After it finishes, the daemon process keeps waiting for the debugger client to attach.
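If editing the port in the snippet gets tedious, it can be wrapped in a function that reads the port from the environment. This is a sketch under assumptions: maybe_wait_for_debugger and the DVC_DEBUG_PORT variable are hypothetical names that DVC itself does not read; only DVC_DAEMON comes from the snippet above.

```python
import os


def maybe_wait_for_debugger(environ=os.environ, default_port=5678):
    """Block for a debugger in daemon processes; return the port or None."""
    if not environ.get("DVC_DAEMON"):
        return None  # not a daemon process, so do nothing

    # DVC_DEBUG_PORT is a hypothetical variable used only by this sketch.
    port = int(environ.get("DVC_DEBUG_PORT", default_port))

    import debugpy  # imported lazily so normal runs do not need debugpy

    debugpy.listen(port)       # must match the port in `launch.json`
    debugpy.wait_for_client()  # blocks execution until a client attaches
    return port
```

Call maybe_wait_for_debugger() from the same place you would otherwise paste the snippet, and add a breakpoint() right after it if you want execution to stop there.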

export DVC_DAEMON_LOGFILE="logfile"
dvc status -vv

In VSCode, from "Run and Debug", run the appropriate debugger config ("Python: Remote Attach" in the example above). It should take you to the breakpoint() call in the snippet. You can also add breakpoints in other places as you like.

Happy debugging!!!
