Skip to content

Conversation

@noahbkim
Copy link

@noahbkim noahbkim commented Apr 1, 2025

We've noticed a transient issue where py-spy will abort due to unmet assumptions about the contents of co_linetable (for example we've seen [231], which doesn't even satisfy the internal specification). To address this, I've patched the co_linetable parsing to propagate out-of-bounds errors so that the samples are simply ignored (and a warning is raised).

This change is not as minimal as it could be, for example it reorders a couple lines in python_interpreters.rs for clarity. If that's a problem, I'd be happy to go back and pare down such changes.

Taking a glance, I would guess this is the root cause for #735 and #569, although I have no explanation for why the co_linetable's appear invalid.

@noahbkim
Copy link
Author

noahbkim commented Apr 4, 2025

Updated the PR to include a fix for an off-by-one introduced by #635, trivially reproducible with the following:

# a.py
import b

# b.py
import time
time.sleep(10)

# Command
cargo run -F unwind --release -- record --native --rate 1 -- python3.12 a.py

@korniltsev
Copy link

This change is not as minimal as it could be, for example it reorders a couple lines in python_interpreters.rs for clarity. If that's a problem, I'd be happy to go back and pare down such changes.

Let's change only what's needed for the fix, so that it's easier to cherry-pick & revert & reason about

@Hnasar
Copy link

Hnasar commented Apr 22, 2025

@benfred can approve this workflow so the build can run?

Otherwise this LGTM, and we've manually built it and confirmed it's fixed the issues we were seeing.

@benfred
Copy link
Owner

benfred commented Jul 12, 2025

@benfred can approve this workflow so the build can run?

I believe that this branch needs to be updated to the latest master first (both to get some CI fixes in - and to let me approve the workflow run) - and I don't seem to have permissions on this branch to update it myself =(

@korniltsev
Copy link

I suggest we also fix int overflows either in the same PR or in a followup, so that debug builds of pyspy will not panic

@noahbkim
Copy link
Author

Updated! @benfred thanks for taking a look.
@korniltsev can you point to the overflows to fix?

@korniltsev
Copy link

korniltsev commented Jul 18, 2025

@korniltsev can you point to the overflows to fix?

Basically every time some uncontrolled user-provided number (from the linetable) is added, subtracted, or shifted, etc.
You can search for "checked" in this diff https://github.com/grafana/py-spy/pull/1/files
Fuzzing is a great and easy way to find some of those overflows.

Btw, I am not a maintainer here, this is just a suggestion for improvements. Maybe even better to do as a separate change. Feel free to ignore.

Thank you for working on this!

@benfred
Copy link
Owner

benfred commented Jul 31, 2025

Updated the PR to include a fix for an off-by-one introduced by #635, trivially reproducible with the following:

I believe this is fixed by #751 - can you verify if this still happens with the latest code?

Also, There was also an alternative fix for the same linetable problem in #736 - I've merged that one since it passed CI and I want to get a release out.

I like the unittests that you've added here though, any chance that you can merge in the other PR here so we can get those tests included?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants