-
Notifications
You must be signed in to change notification settings - Fork 47
Add support for debugging PennyLane (Python) and Catalyst (C++) simultaneously #1712
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
To mitigate one could avoid building llvm itself with debug symbols, the catalyst code base itself is not that bad in debug mode 🤔 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome!!! Thanks Lee :)
This is a good point; I may try to restrict the debug symbols to the Catalyst additions and see what savings we get. |
This works, so extended the Makefile to offer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Lee, I would like to test it before approving, but I'm having some difficulties 😅
import sys | ||
|
||
|
||
def is_debugger_active() -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this function useful, if we don't use it internally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this function again, I don't think we should add it to the Catalyst API, it's completely unrelated to Catalyst code.
|
||
{ | ||
"python.defaultInterpreterPath": "${env:VIRTUAL_ENV}", | ||
"python.terminal.launchArgs": [], | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain these options (and why they are needed)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will add some references and guidelines on how https://code.visualstudio.com/docs/configure/settings are used here with the debugger.
f"""Ensure C++ debugger is attached and running before continuing with: | ||
kill -s SIGCONT {p.pid}""" | ||
) | ||
p.send_signal(signal.SIGSTOP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know the process hasn't done any work yet by the time we get here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good question. The assumption (from most places I've read) is that this should be (reasonably) fine. But if we 100% need to be strict on it, we can modify the process launch options to ensure the signal is the first thing that is hit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it just based on hoping that the OS doesn't schedule the subprocess thread until we hit this statement, or is there some other reason it would hold? For instance, does the process only start running once we call p.communicate
(not sure how it works)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So communicate
is something akin to a synchronisation point for the child-process --- it allows data to be sent to the process (assuming it is waiting for it) and receives data from the process.
For a quick 'n' dirty example of whether it is fast enough to ensure the process isn't running anything, we can opt for something akin to https://stackoverflow.com/questions/50002804/create-subprocess-in-python-in-suspended-state
In practice, use of the preexec_fn
argument in Popen
should work just fine here (as in, we create the PID, and before execution happens we immediately put it into the suspend state). This should be fine. Though, if we are ok with letting the child process set itself up first, chances are quite low anything will execute. Taking a simple compiled C binary that outputs a string test
, we can try out the following to see when the output hits the screen (creating a file should work too):
import os
import signal
def f():
"pre-execute function to print the new process PID"
print(os.getpid())
print("Starting process")
p = Popen(["/tmp/test"], preexec_fn=f)
print("Running process")
# Stop process after creating the Process and a print call.
# If the `/tmp/test` binary can output before this line is hit,
# we need to stop within the preexec_fn. Otherwise, all good
# to set up the process.
p.send_signal(signal.SIGSTOP)
# Have some wait-time that we control
i = input("enter/return to continue")
# Allow direct input/output from communicate
p.communicate()
# It should be done by now
print("Finished process")
# explicitly kill the process
p.kill()
print("Killed process")
Reasoning for favouring not relying on the preexec_fn
is there appears to be some (valid) ongoing attempts to remove this from CPython (see python/cpython#82616).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the info! I'll leave it up to you then if we want to use preexec or not, although in the current approach maybe we could at least send the signal before printing? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine by me. We can always revert if the above CPython changes make it to a release.
Update: As an aside, since the printing will happen from the subprocess, we may have issues with capturing the stdout and stderr for reporting. I'll see if this is something that can be mitigated, but if not, we just use the existing.
@@ -96,6 +96,7 @@ def qjit( | |||
circuit_transform_pipeline=None, | |||
pass_plugins=None, | |||
dialect_plugins=None, | |||
debug_compiler=False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I would have preferred not adding another keyword to this decorator, but it's not the end of the world 😌
The automatic detection based on the already active Python compiler session was actually really neat in that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, i'd agree it was nice -- maybe we can chat next week and see which seems better. This way allows attaching from a non-VSCode/non-Python debug session, so if people prefer to use it without, they can. But happy to defer either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I was able to get it running now and the process is pretty straightforward, so I'm happy to go this route if you prefer :)
Although, I just thought of another alternative. If we want to contain this sort of functionality to the debug module, we could add a context manager for emitting the signal, similar to the instrumentation one. A bit more inline with what we already have, and keeps it out of the regular user options. Just an idea though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, not a bad idea. Let me see what is needed to try that approach. If it becomes a pain, we can always support that in a follow-up, and leave this as usable in the current form for now.
Co-authored-by: David Ittah <dime10@users.noreply.github.com>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1712 +/- ##
==========================================
- Coverage 96.60% 96.53% -0.08%
==========================================
Files 82 83 +1
Lines 9211 9226 +15
Branches 872 875 +3
==========================================
+ Hits 8898 8906 +8
- Misses 254 259 +5
- Partials 59 61 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
"request": "attach", | ||
"program": "${command:python.interpreterPath}", | ||
"processId": "${command:pickProcess}", | ||
"MIMode": "gdb", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioned this in the internal guide, but with the given configuration I get
Unable to start debugging. Launch options string provided by the project system is invalid. Unable to determine path to debugger. Please specify the "MIDebuggerPath" option.
Not sure what I would put for the path though since I can't find gdb
on my mac. "MIMode": "lldb"
works though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, as in the other, I think we need some form of pre-execution script to define a system-dependent argument for this using a custom task
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a basic solution, I think it's ok to mention in the instructions to switch to lldb if on mac.
"name": "(C++): Attach To Executing Python Process", | ||
"type": "cppdbg", | ||
"request": "attach", | ||
"program": "${command:python.interpreterPath}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by this parameter. I tried removing it but then VSCode complains that "program" is not specified. The thing is though, the process we are attaching to is not running Python, it is running the catalyst
program (the compiler). Further, the value really doesn't seem to matter as long it's a valid path because I can put /bin/echo
and it still works 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it influence where the source directory is parsed for the debug information?
If not, all good to put in a no-op here, just to please the VSCode integrator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it influence where the source directory is parsed for the debug information?
Hmm, how can I check this? I used the echo path and it still stopped on the breakpoint in my local catalyst installation.
f"""Ensure C++ debugger is attached and running before continuing with: | ||
kill -s SIGCONT {p.pid}""" | ||
) | ||
p.send_signal(signal.SIGSTOP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it just based on hoping that the OS doesn't schedule the subprocess thread until we hit this statement, or is there some other reason it would hold? For instance, does the process only start running once we call p.communicate
(not sure how it works)?
@@ -96,6 +96,7 @@ def qjit( | |||
circuit_transform_pipeline=None, | |||
pass_plugins=None, | |||
dialect_plugins=None, | |||
debug_compiler=False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I was able to get it running now and the process is pretty straightforward, so I'm happy to go this route if you prefer :)
Although, I just thought of another alternative. If we want to contain this sort of functionality to the debug module, we could add a context manager for emitting the signal, similar to the instrumentation one. A bit more inline with what we already have, and keeps it out of the regular user options. Just an idea though.
Co-authored-by: David Ittah <dime10@users.noreply.github.com>
Co-authored-by: David Ittah <dime10@users.noreply.github.com>
Context: Debugging multi-language programming environments can be a challenge due to the intermingling and swapping between all participating languages. In the case of PennyLane and Catalyst, this is a mix of Python and C++, through our own libraries all the way through LLVM and back. Interactive debugging of a multi-language environment can require some custom setup to ensure the environment allows for useful interactive data inspection and validation.
VSCode provides support for mixed-mode debugging where a single editor can support (through the Python and C++ debugging extensions) interactive breakpointing, stepping, and inspection of the running program, where the view can switch between the Python debugger and the C++ debugger, once running. This PR adds preliminary support for this mixed-mode debugging, allowing the user to set breakpoints in their PennyLane program their Catalyst (+/- LLVM) internals, and step all the way through to investigate all aforementioned stages.
Some notes:
make all BUILD_TYPE="RelWithDebInfo"
, and may generated approximately 80GB of libraries and binaries. Added internal (BUILD_TYPE
) and external (BUILD_TYPE_EXT
) configuration for build-flags to restrict debug symbols to our libs and bins only.sudo
privileges.SIGCONT
to continue executing after the C++ debugger has attached.Setting the activeThis restriction has been removed.qml.qjit(debug_compiler=True)
option will not enable the support unless the Python process is called from a debugger (pdb, debugby, etc)Description of the Change: As above.
Benefits: Enables debugging of mixed mode (Python + C++/LLVM) programs directly through VSCode.
Possible Drawbacks:
Related GitHub Issues: