Regression in crun 1.16.1 causing occasional error "corrupted size vs. prev_size in fastbins" #1537

@bduffany

We run some workloads under crun, and they started failing intermittently after we upgraded to crun 1.16.1. In a significant fraction of these workloads, we now see the error "corrupted size vs. prev_size in fastbins". I'm not very familiar with this error, but some basic searching indicates that it is one of glibc's malloc consistency-check aborts, which usually points to heap corruption caused by an invalid memory access.

Notably, the issue doesn't reproduce when running crun via podman. Our usage of crun is a bit unusual - we are not invoking it via podman and we also don't run conmon, because we are trying to reduce overhead as much as possible. Instead, we generate an OCI container spec and then directly invoke crun. We tried to make the generated container spec match podman's as closely as possible.
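For readers unfamiliar with this kind of setup, here is a minimal sketch of writing an OCI bundle and building a direct crun invocation. It is not the reporter's actual spec: the fields are an illustrative subset of the OCI runtime config and are probably not sufficient for a real workload (no mounts, for example). The helper names `minimal_spec`, `write_bundle`, and `crun_command`, the `rootfs` path, and the container ID are all hypothetical. The sketch only prints the command; it does not run crun.

```python
import json
import tempfile
from pathlib import Path


def minimal_spec(args, rootfs="rootfs"):
    """Return an illustrative subset of an OCI runtime config (config.json)."""
    return {
        "ociVersion": "1.0.2",
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": list(args),  # the workload command line
            "env": ["PATH=/usr/local/bin:/usr/bin:/bin"],
            "cwd": "/",
        },
        "root": {"path": rootfs, "readonly": False},
        "linux": {"namespaces": [{"type": "pid"}, {"type": "mount"}]},
    }


def write_bundle(bundle_dir, spec):
    """Create <bundle_dir>/config.json and an empty rootfs directory."""
    bundle = Path(bundle_dir)
    (bundle / spec["root"]["path"]).mkdir(parents=True, exist_ok=True)
    config_path = bundle / "config.json"
    config_path.write_text(json.dumps(spec, indent=2))
    return config_path


def crun_command(bundle_dir, container_id):
    """Build the argv for running the bundle directly, with no podman or conmon."""
    return ["crun", "run", "--bundle", str(bundle_dir), container_id]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # "node app.js" and "example-ctr" are placeholders, not the real workload.
        write_bundle(tmp, minimal_spec(["node", "app.js"]))
        print(" ".join(crun_command(tmp, "example-ctr")))
```

In a real run, the rootfs would have to be populated with the workload's filesystem before invoking the printed command.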

To try and find the exact commit where this issue was introduced, I did a git bisect between 1.16 (a known good version) and 1.16.1 (the known bad version), building statically linked crun with the nix method outlined in the README. The bisect revealed that this behavior started happening in commit 72b4eea.
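Because the failure is intermittent, each bisect step has to run the workload several times before a commit can be called good. Below is a minimal sketch of a checker that could be driven by `git bisect run`. It assumes the glibc message reaches the captured stderr, that the release tags are named `1.16` and `1.16.1`, and that rebuilding crun at each commit happens inside the (hypothetical) `./run-workload.sh` script. This is not necessarily how the bisect above was performed.

```python
import subprocess
import sys

ERROR_TEXT = "corrupted size vs. prev_size"


def classify(outputs):
    """Return 1 (bad) if any captured stderr contains the abort message, else 0 (good)."""
    return 1 if any(ERROR_TEXT in out for out in outputs) else 0


def run_attempts(cmd, attempts):
    """Run cmd up to `attempts` times, stopping early at the first heap-corruption abort."""
    outputs = []
    for _ in range(attempts):
        proc = subprocess.run(cmd, capture_output=True, text=True, errors="replace")
        outputs.append(proc.stderr)
        if ERROR_TEXT in proc.stderr:
            break
    return outputs


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage:
    #   git bisect start 1.16.1 1.16        # bad revision first, then good
    #   git bisect run python3 check.py 50 ./run-workload.sh
    # git bisect run treats exit code 0 as good and 1 as bad.
    attempts, cmd = int(sys.argv[1]), sys.argv[2:]
    sys.exit(classify(run_attempts(cmd, attempts)))
```

The attempt count is a trade-off: too few runs can mark a bad commit as good and send the bisect in the wrong direction, so it should be chosen generously relative to the observed failure rate.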

Unfortunately, I don't have a minimal repro yet. It's difficult to find one in this case because these are customer workloads where we don't have access to the source code. What I do know is that the executable for this workload appears to be Node.js 22. strace proved unhelpful, because the issue doesn't reproduce under strace. Even worse, if I wrap the workload with any wrapper process whatsoever (even just a simple sh -c 'exec <executable> <args...>'), the issue doesn't reproduce.

I am continuing to investigate to see if I can find a minimal repro, and I also plan to dig into that commit to find exactly which change might be causing this bug. I thought I would file a report anyway to raise an early warning, and so that other people can more easily find this issue if they start seeing this behavior too (it took me a long time to track down this commit as the culprit).
