Add support for debugging PennyLane (Python) and Catalyst (C++) simultaneously #1712

Open
wants to merge 24 commits into
base: main
e28ffbe
Initial support for active debugger session for compiler attachment
mlxd May 2, 2025
52d848b
Add debug flag autocheck
mlxd May 2, 2025
6101c6f
Ensure VSCode debugger launch file is tracked
mlxd May 2, 2025
4d1ff94
Add debug_compiler args to CompileOptions and qjit
mlxd May 2, 2025
21686dd
Fix stdout stderr from popen
mlxd May 2, 2025
dfd5007
Merge branch 'main' into add_dualmode_debugging
mlxd May 2, 2025
97cd18b
Update debugger prompt
mlxd May 2, 2025
44fbf63
Merge branch 'add_dualmode_debugging' of github.com:PennyLaneAI/catal…
mlxd May 2, 2025
e6b4849
Merge branch 'main' into add_dualmode_debugging
mlxd May 6, 2025
a56a45c
Merge branch 'main' into add_dualmode_debugging
mlxd May 30, 2025
6d77976
Add external build type makefile arg for middle-end
mlxd May 30, 2025
afba2ce
Add debugging docs for mixed-mode
mlxd May 30, 2025
deb0502
Remove restriction on launching from a Python debugger
mlxd May 30, 2025
23a8cc5
Add caption for filenames
mlxd May 30, 2025
a6e7514
Add python debug session checker
mlxd May 30, 2025
fcc3895
Use editor interpreter path
mlxd May 30, 2025
f40a06b
Move debug config to docs
mlxd May 30, 2025
9c84fcb
Add CL
mlxd May 30, 2025
dbe0908
Rehide the internal active debug check
mlxd May 30, 2025
25583b3
fix format
mlxd May 30, 2025
fa50232
Ensure stdout,stderr tracked from subprocess
mlxd May 30, 2025
004ebff
Update doc/dev/debugging.rst
mlxd May 30, 2025
6ddad2a
Update doc/dev/debugging.rst
mlxd Jun 2, 2025
634f8fd
Update doc/dev/debugging.rst
mlxd Jun 2, 2025
82 changes: 82 additions & 0 deletions doc/dev/debugging.rst
Expand Up @@ -388,3 +388,85 @@ corresponding arguments.
$ /path/to/executable
MemRef: base@ = 0x64fc9dd5ffc0 rank = 0 offset = 0 sizes = [] strides = [] data =
25

Mixed-mode debugging of Python and C++
======================================

Catalyst supports mixed-mode debugging of Python and C++ code when the ``debug_compiler=True`` flag is passed to
the ``@qjit`` decorator. With this option enabled, the compiler process is suspended immediately after launch and waits
for a user-provided signal before continuing. Some notes on using this support:

* This functionality requires building Catalyst with debug symbols. This can be achieved via
``make all BUILD_TYPE="RelWithDebInfo"``. The debug symbols are only available
within the Catalyst-owned targets.
To enable debugging of LLVM and other associated external libraries and binaries, ensure the
``BUILD_TYPE_EXT="RelWithDebInfo"`` option is also set when building Catalyst.
* Launching the C++ debugger requires attaching to a running process. This often requires ``sudo`` privileges on the
running system.
* The compiler subprocess is sent a ``SIGSTOP`` signal immediately after it is spawned, which suspends the compiler.
Execution resumes once the process receives a ``SIGCONT`` signal, which should be sent after the C++ debugger has attached.
* To check whether code is running within an active (Python) debugger session, the function
:func:`~.debug.debugger.is_debugger_active` can be used.

The ``SIGCONT`` signal can be sent from an active terminal session with

.. code-block:: shell

$ kill -s SIGCONT <PID>

where ``<PID>`` is the process-ID. This can also be issued from an active Python debugger session, such as through VSCode's
debug terminal as

.. code-block:: python

import os, signal
os.kill(<PID>, signal.SIGCONT)

To enable support from VSCode, the following configuration files can be used to add debugger configurations for Python and
C++.

.. code-block:: json
:caption: Filename ``.vscode/launch.json``

{
"version": "0.2.0",
"configurations": [
{
"name": "(Python): Debug Current Python File",
"type": "debugpy",
"request": "launch",
"program": "${file}",
"console": "integratedTerminal",
"justMyCode": false
},
{
"name": "(C++): Attach To Executing Python Process",
"type": "cppdbg",
"request": "attach",
"program": "${command:python.interpreterPath}",
Contributor

@dime10 dime10 Jun 2, 2025

I'm confused by this parameter. I tried removing it but then VSCode complains that "program" is not specified. The thing is though, the process we are attaching to is not running Python, it is running the catalyst program (the compiler). Further, the value really doesn't seem to matter as long it's a valid path because I can put /bin/echo and it still works 😅

Member Author

Does it influence where the source directory is parsed for the debug information?
If not, all good to put in a no-op here, just to please the VSCode integrator.

Contributor

Does it influence where the source directory is parsed for the debug information?

Hmm, how can I check this? I used the echo path and it still stopped on the breakpoint in my local catalyst installation.

"processId": "${command:pickProcess}",
"MIMode": "gdb",
Contributor

Mentioned this in the internal guide, but with the given configuration I get

Unable to start debugging. Launch options string provided by the project system is invalid. Unable to determine path to debugger.  Please specify the "MIDebuggerPath" option.

Not sure what I would put for the path though since I can't find gdb on my mac. "MIMode": "lldb" works though.

Member Author

Yea, as in the other, I think we need some form of pre-execution script to define a system-dependent argument for this using a custom task

Contributor

@dime10 dime10 Jun 2, 2025

For a basic solution, I think it's ok to mention in the instructions to switch to lldb if on mac.

"setupCommands": [
{
"description": "Enable pretty-printing",
"text": "-enable-pretty-printing",
"ignoreFailures": true,
}
]
},
]
}


.. code-block:: json
:caption: Filename ``.vscode/settings.json``

{
"python.defaultInterpreterPath": "${env:VIRTUAL_ENV}",
"python.terminal.launchArgs": [],
}
Comment on lines +463 to +467
Contributor

Could you explain these options (and why they are needed)?

Member Author

Sure, will add some references and guidelines on how https://code.visualstudio.com/docs/configure/settings are used here with the debugger.



Note that on macOS ``gdb`` will alias ``lldb``, and will continue to function identically
to ``gdb`` on Linux using the editor's debugging interface. To explicitly use ``lldb`` on Linux, it may be necessary to also
install the `machine-interface driver <https://github.com/lldb-tools/lldb-mi>`_.
9 changes: 9 additions & 0 deletions doc/releases/changelog-dev.md
Expand Up @@ -212,6 +212,10 @@

<h3>Internal changes ⚙️</h3>

* `qjit` now supports a `debug_compiler` argument, which signals that the compiler driver should stop and wait
for a user-provided `SIGCONT`. This allows developers to attach a debugger session to the driver before it processes its input.
[(#1712)](https://github.com/PennyLaneAI/catalyst/pull/1712)

* `null.qubit` can now support an optional `track_resources` argument which allows it to record which gates are executed.
[(#1619)](https://github.com/PennyLaneAI/catalyst/pull/1619)

Expand Down Expand Up @@ -291,6 +295,10 @@

<h3>Documentation 📝</h3>

* Documentation for the configuration of mixed-mode (Python and C++) debugging with Catalyst has
been added. Configuration guidelines are provided for VSCode.
[(#1712)](https://github.com/PennyLaneAI/catalyst/pull/1712)

* The header (logo+title) images in the README and in the overview on RtD have been updated,
reflecting that Catalyst is now beyond the beta!
[(#1718)](https://github.com/PennyLaneAI/catalyst/pull/1718)
Expand All @@ -312,6 +320,7 @@ Christina Lee,
Mehrdad Malekmohammadi,
Anton Naim Ibrahim,
Erick Ochoa Lopez,
Lee J. O'Riordan,
Ritu Thombre,
Paul Haochen Wang,
Jake Zaia.
32 changes: 27 additions & 5 deletions frontend/catalyst/compiler.py
Expand Up @@ -21,6 +21,7 @@
import pathlib
import platform
import shutil
import signal
import subprocess
import sys
import tempfile
Expand Down Expand Up @@ -403,7 +404,7 @@
return cmd

@debug_logger
def run_from_ir(self, ir: str, module_name: str, workspace: Directory):

"""Compile a shared object from a textual IR (MLIR or LLVM).

Args:
Expand Down Expand Up @@ -438,15 +439,36 @@
output_ir_name = os.path.join(str(workspace), f"{module_name}.ll")

cmd = self.get_cli_command(tmp_infile_name, output_ir_name, module_name, workspace)

try:
if self.options.verbose:
print(f"[SYSTEM] {' '.join(cmd)}", file=self.options.logfile)
result = subprocess.run(cmd, check=True, capture_output=True, text=True)

with subprocess.Popen(
cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
) as p:
# Ensure process creation succeeds
if p.returncode not in {0, None}:
raise subprocess.CalledProcessError(p.returncode, cmd)


if self.options.debug_compiler:
print(f"Compiler PID={p.pid}")
print(

f"""Ensure C++ debugger is attached and running before continuing with:
kill -s SIGCONT {p.pid}"""
)
p.send_signal(signal.SIGSTOP)

Contributor

Do we know the process hasn't done any work yet by the time we get here?

Member Author

@mlxd mlxd May 30, 2025

It's a good question. The assumption (from most places I've read) is that this should be (reasonably) fine. But if we 100% need to be strict on it, we can modify the process launch options to ensure the signal is the first thing that is hit.

Contributor

@dime10 dime10 Jun 2, 2025

Is it just based on hoping that the OS doesn't schedule the subprocess thread until we hit this statement, or is there some other reason it would hold? For instance, does the process only start running once we call p.communicate (not sure how it works)?

Member Author

So communicate is something akin to a synchronisation point for the child-process --- it allows data to be sent to the process (assuming it is waiting for it) and receives data from the process.

For a quick 'n' dirty example of whether it is fast enough to ensure the process isn't running anything, we can opt for something akin to https://stackoverflow.com/questions/50002804/create-subprocess-in-python-in-suspended-state

In practice, use of the preexec_fn argument in Popen should work just fine here (as in, we create the PID, and before execution happens we immediately put it into the suspend state). This should be fine. Though, if we are ok with letting the child process set itself up first, chances are quite low anything will execute. Taking a simple compiled C binary that outputs a string test, we can try out the following to see when the output hits the screen (creating a file should work too):

import os
import signal
from subprocess import Popen

def f():
    "pre-execute function to print the new process PID"
    print(os.getpid())

print("Starting process")
p = Popen(["/tmp/test"], preexec_fn=f)
print("Running process")

# Stop process after creating the Process and a print call.
# If the `/tmp/test` binary can output before this line is hit,
# we need to stop within the preexec_fn. Otherwise, all good
# to set up the process.
p.send_signal(signal.SIGSTOP)

# Have some wait-time that we control 
i = input("enter/return to continue")

# Allow direct input/output from communicate
p.communicate()

# It should be done by now
print("Finished process")

# explicitly kill the process
p.kill()
print("Killed process")

The reason for not relying on preexec_fn is that there appear to be (valid) ongoing attempts to remove it from CPython (see python/cpython#82616).

Contributor

Thanks for the info! I'll leave it up to you then if we want to use preexec or not, although in the current approach maybe we could at least send the signal before printing? 😅

Member Author

@mlxd mlxd Jun 3, 2025

Fine by me. We can always revert if the above CPython changes make it to a release.

Update: As an aside, since the printing will happen from the subprocess, we may have issues with capturing the stdout and stderr for reporting. I'll see if this is something that can be mitigated, but if not, we just use the existing.
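
One way to narrow the window discussed in this thread, without relying on `preexec_fn`, might be to wait until the operating system reports the child as stopped before printing the attach prompt. The following is a minimal sketch under that assumption (POSIX only). `launch_stopped` is a hypothetical helper name, not part of Catalyst, and the check only confirms that the child is suspended; it does not prove that the child did no work before the signal arrived.

```python
import os
import signal
import subprocess
import sys


def launch_stopped(cmd):
    """Start `cmd`, send SIGSTOP, and wait until the OS reports the child as stopped.

    Hypothetical helper for illustration only. Returns (process, stopped_flag).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    proc.send_signal(signal.SIGSTOP)
    # WUNTRACED makes waitpid return for a stopped child as well as an exited one.
    # If the child had already exited, this call reaps it and the flag is False.
    _, status = os.waitpid(proc.pid, os.WUNTRACED)
    return proc, os.WIFSTOPPED(status)


# Usage: a trivial child process stands in for the compiler driver.
proc, stopped = launch_stopped([sys.executable, "-c", "print('ok')"])
print(f"Compiler PID={proc.pid}, stopped={stopped}")

# A C++ debugger would attach here; afterwards the child is resumed.
proc.send_signal(signal.SIGCONT)
stdout, _ = proc.communicate()
```

Because the pipes are still captured by the parent, `communicate()` collects the child's output once it has been resumed.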


res_stdout, res_stderr = p.communicate()
# Ensure process execution succeeds
if p.returncode not in {0, None}:
raise subprocess.CalledProcessError(p.returncode, cmd, res_stdout, res_stderr)

if self.options.verbose or os.getenv("ENABLE_DIAGNOSTICS"):
if result.stdout:
print(result.stdout.strip(), file=self.options.logfile)
if result.stderr:
print(result.stderr.strip(), file=self.options.logfile)
if res_stdout:
print(res_stdout.strip(), file=self.options.logfile)
if res_stderr:
print(res_stderr.strip(), file=self.options.logfile)
except subprocess.CalledProcessError as e: # pragma: nocover
raise CompileError(f"catalyst failed with error code {e.returncode}: {e.stderr}") from e
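
For reference, the error-reporting pattern in this hunk can be tried in isolation. The sketch below is not the Catalyst implementation; `run_checked` is a hypothetical name, and a small Python child process stands in for the compiler. Note that `Popen.returncode` stays `None` until `poll()`, `wait()` or `communicate()` has been called, so the sketch checks it only after `communicate()` returns.

```python
import subprocess
import sys


def run_checked(cmd):
    """Run `cmd` and return (stdout, stderr); raise CalledProcessError on a non-zero exit."""
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    ) as proc:
        out, err = proc.communicate()
        if proc.returncode != 0:
            # Attach the captured streams so callers can report them.
            raise subprocess.CalledProcessError(proc.returncode, cmd, out, err)
    return out, err


# Usage: a child that writes to stderr and exits with code 3.
failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
try:
    run_checked(failing)
except subprocess.CalledProcessError as exc:
    error = exc
    message = f"failed with error code {exc.returncode}: {exc.stderr}"
```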

Expand Down
1 change: 1 addition & 0 deletions frontend/catalyst/debug/__init__.py
Expand Up @@ -22,6 +22,7 @@
get_compilation_stages_groups,
replace_ir,
)
from catalyst.debug.debugger import is_debugger_active
from catalyst.debug.instruments import instrumentation
from catalyst.debug.printing import ( # pylint: disable=redefined-builtin
print,
Expand Down
26 changes: 26 additions & 0 deletions frontend/catalyst/debug/debugger.py
@@ -0,0 +1,26 @@
# Copyright 2025 Xanadu Quantum Technologies Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

# http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
This module adds functionality to check if the active Python session
is being run with an active debugger.
"""


import sys


def is_debugger_active() -> bool:
Contributor

How is this function useful, if we don't use it internally?

Contributor

Looking at this function again, I don't think we should add it to the Catalyst API, it's completely unrelated to Catalyst code.

"""Will return true in active debugger session"""
return hasattr(sys, "gettrace") and sys.gettrace() is not None
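
The behaviour of this check can be illustrated without a real debugger by installing a trace function by hand. This is a sketch only: it uses a variant of the function that returns a strict `bool`, and `_noop_trace` is a hypothetical stand-in for a debugger's hook. A debugger that does not install a trace function through `sys.settrace` would not be detected this way.

```python
import sys


def is_debugger_active() -> bool:
    """Return True when a trace function is installed on the current thread."""
    return hasattr(sys, "gettrace") and sys.gettrace() is not None


def _noop_trace(frame, event, arg):
    """Trace function that does nothing; stands in for a debugger's hook."""
    return None


previous = sys.gettrace()     # remember any tracer that is already installed
sys.settrace(_noop_trace)     # simulate a debugger installing its hook
during = is_debugger_active()
sys.settrace(previous)        # restore the original state
after = is_debugger_active()
```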

3 changes: 3 additions & 0 deletions frontend/catalyst/jit.py
Expand Up @@ -96,6 +96,7 @@ def qjit(
circuit_transform_pipeline=None,
pass_plugins=None,
dialect_plugins=None,
debug_compiler=False
Contributor

Personally I would have preferred not adding another keyword to this decorator, but it's not the end of the world 😌
The automatic detection based on the already active Python compiler session was actually really neat in that way.

Member Author

Yea, i'd agree it was nice -- maybe we can chat next week and see which seems better. This way allows attaching from a non-VSCode/non-Python debug session, so if people prefer to use it without, they can. But happy to defer either way.

Contributor

Good point. I was able to get it running now and the process is pretty straightforward, so I'm happy to go this route if you prefer :)

Although, I just thought of another alternative. If we want to contain this sort of functionality to the debug module, we could add a context manager for emitting the signal, similar to the instrumentation one. A bit more inline with what we already have, and keeps it out of the regular user options. Just an idea though.

Member Author

Actually, not a bad idea. Let me see what is needed to try that approach. If it becomes a pain, we can always support that in a follow-up, and leave this as usable in the current form for now.

): # pylint: disable=too-many-arguments,unused-argument
"""A just-in-time decorator for PennyLane and JAX programs using Catalyst.

Expand Down Expand Up @@ -162,6 +163,8 @@ def qjit(
If not specified, the default pass pipeline will be applied.
pass_plugins (Optional[List[Path]]): List of paths to pass plugins.
dialect_plugins (Optional[List[Path]]): List of paths to dialect plugins.
debug_compiler (Optional[bool]): Enable external debugger attachment to the compiler
driver when launching from an active Python debugging environment.

Returns:
QJIT object.
Expand Down
3 changes: 3 additions & 0 deletions frontend/catalyst/pipelines.py
Expand Up @@ -74,6 +74,8 @@ class CompileOptions:
Default is None.
pass_plugins (Optional[Set[Path]]): List of paths to pass plugins.
dialect_plugins (Optional[Set[Path]]): List of paths to dialect plugins.
debug_compiler (Optional[bool]): Enable external debugger attachment to the compiler
driver when launching from an active Python debugging environment.
"""

verbose: Optional[bool] = False
Expand All @@ -94,6 +96,7 @@ class CompileOptions:
circuit_transform_pipeline: Optional[dict[str, dict[str, str]]] = None
pass_plugins: Optional[Set[Path]] = None
dialect_plugins: Optional[Set[Path]] = None
debug_compiler: Optional[bool] = False

def __post_init__(self):
# Check that async runs must not be seeded
Expand Down
8 changes: 5 additions & 3 deletions mlir/Makefile
Expand Up @@ -13,6 +13,7 @@ RT_BUILD_DIR ?= $(MK_DIR)/../runtime/build
ENABLE_ASAN ?= OFF
STRICT_WARNINGS ?= ON
BUILD_TYPE ?= Release
BUILD_TYPE_EXT ?= Release
LLVM_EXTERNAL_LIT ?= $(LLVM_BUILD_DIR)/bin/llvm-lit

ifeq ($(shell uname), Darwin)
Expand Down Expand Up @@ -65,7 +66,7 @@ llvm:
patch -p1 $(TARGET_FILE) $(PATCH_FILE); \
fi
cmake -G Ninja -S llvm-project/llvm -B $(LLVM_BUILD_DIR) \
-DCMAKE_BUILD_TYPE=$(BUILD_TYPE) \
-DCMAKE_BUILD_TYPE=$(BUILD_TYPE_EXT) \
-DLLVM_BUILD_EXAMPLES=OFF \
-DLLVM_TARGETS_TO_BUILD="host" \
-DLLVM_ENABLE_PROJECTS="$(LLVM_PROJECTS)" \
Expand Down Expand Up @@ -101,7 +102,7 @@ mhlo:
patch -p1 $(TARGET_FILE) $(PATCH_FILE); \
fi
cmake -G Ninja -S mlir-hlo -B $(MHLO_BUILD_DIR) \
-DCMAKE_BUILD_TYPE=$(BUILD_TYPE) \
-DCMAKE_BUILD_TYPE=$(BUILD_TYPE_EXT) \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DMLIR_DIR=$(LLVM_BUILD_DIR)/lib/cmake/mlir \
-DPython3_EXECUTABLE=$(PYTHON) \
Expand All @@ -123,7 +124,7 @@ enzyme:
@echo "build enzyme"
cmake -G Ninja -S Enzyme/enzyme -B $(ENZYME_BUILD_DIR) \
-DENZYME_STATIC_LIB=ON \
-DCMAKE_BUILD_TYPE=$(BUILD_TYPE) \
-DCMAKE_BUILD_TYPE=$(BUILD_TYPE_EXT) \
-DLLVM_DIR=$(LLVM_BUILD_DIR)/lib/cmake/llvm \
-DCMAKE_C_COMPILER=$(C_COMPILER) \
-DCMAKE_CXX_COMPILER=$(CXX_COMPILER) \
Expand All @@ -144,6 +145,7 @@ plugin:
cmake -B standalone/build -G Ninja \
-DCMAKE_C_COMPILER=$(C_COMPILER) \
-DCMAKE_CXX_COMPILER=$(CXX_COMPILER) \
-DCMAKE_BUILD_TYPE=$(BUILD_TYPE_EXT) \
-DCMAKE_C_COMPILER_LAUNCHER=$(COMPILER_LAUNCHER) \
-DCMAKE_CXX_COMPILER_LAUNCHER=$(COMPILER_LAUNCHER) \
-DMLIR_DIR=$(LLVM_BUILD_DIR)/lib/cmake/mlir \
Expand Down