Skip to content

fix(core): presize __main__ module to avoid heap fragmentation #4842

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 2, 2025

Conversation

romanz
Copy link
Contributor

@romanz romanz commented Mar 28, 2025

Also, check for main globals and sys.modules reallocations.

@romanz romanz added micropython Python interpreter and runtime that runs Trezor firmware rust Pull requests that update Rust code labels Mar 28, 2025
@github-project-automation github-project-automation bot moved this to 🔎 Needs review in Firmware Mar 28, 2025
@romanz romanz added the core Trezor Core firmware. Runs on Trezor Model T and T2B1. label Mar 28, 2025
Copy link

github-actions bot commented Mar 28, 2025

core UI changes device test click test persistence test
T2T1 Model T test(screens) main(screens) test(screens) main(screens) test(screens) main(screens)
T3B1 Safe 3 test(screens) main(screens) test(screens) main(screens) test(screens) main(screens)
T3T1 Safe 5 test(screens) main(screens) test(screens) main(screens) test(screens) main(screens)
All main(screens)

@romanz romanz force-pushed the romanz/dict-no-alloc branch 2 times, most recently from a86bd59 to 1d3ba69 Compare March 31, 2025 08:44
@romanz romanz changed the base branch from main to romanz/qstr-fixes March 31, 2025 08:45
@romanz romanz marked this pull request as ready for review March 31, 2025 09:38
@romanz romanz requested a review from prusnak as a code owner March 31, 2025 09:38
@romanz
Copy link
Contributor Author

romanz commented Mar 31, 2025

Currently, no_realloc is disabled for emulator build (due to core unittests and profiling support).

@romanz romanz self-assigned this Mar 31, 2025
@romanz romanz requested a review from matejcik March 31, 2025 09:40
@matejcik
Copy link
Contributor

Currently, no_realloc is disabled for emulator build (due to core unittests and profiling support).

that's less than useful

could we at least enable it for the frozen build? the test suite is where we would normally discover the overflows

@matejcik
Copy link
Contributor

in fact it's a good idea to only enable this flag in debug builds -- in production, we'll prefer performance degradation (fragmented memory) to a full crash

@romanz
Copy link
Contributor Author

romanz commented Mar 31, 2025

Sounds good - 40f4061 enables no_realloc on frozen emulator builds, and keeps it disabled on firmware builds.

@@ -76,6 +76,7 @@ int main(uint32_t cmd, void *arg) {
// Interpreter init
printf("CORE: Starting interpreter\n");
mp_init();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe omit this noise?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, my bad... I removed the code from firmware build, instead of disabling it for non-debug builds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewrote in 24a3284.

// .used = MP_ARRAY_SIZE(table_name),
// .alloc = MP_ARRAY_SIZE(table_name),
let bits = 0b111 | N << 3;
let bits = 0b1111 | N << 4;
Copy link
Contributor

@matejcik matejcik Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please have a look at bindgen, to see whether the above limitation was lifted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

24a3284 doesn't require changing MicroPython code & its Rust bindings.

@romanz romanz force-pushed the romanz/qstr-fixes branch from ac30529 to 887506d Compare March 31, 2025 13:22
@romanz romanz marked this pull request as draft March 31, 2025 19:10
@romanz romanz moved this from 🔎 Needs review to 🏃‍♀️ In progress in Firmware Mar 31, 2025
@romanz
Copy link
Contributor Author

romanz commented Apr 1, 2025

could we at least enable it for the frozen build? the test suite is where we would normally discover the overflows

It seems that most of our emulator device tests are using profiling, so they fail when we disallow reallocations :(
image

@matejcik
Copy link
Contributor

matejcik commented Apr 1, 2025

does the failure come from __main__ or sys.modules?

if the latter, we can just bump the limit

if __main__, that's a problem because it means that the profiler __main__ is different from main.py, correct? so even if we tweak the prealloc size, it won't catch the actual behavior that we care about.

iirc profiling only works if you make a special build for it? if yes, let's freeze __main__ only when not profiling. technically it would be nice to run a test suite in CI for non-profiling but with frozen __main__; practically it will be enough to run it locally and on HW

if there's no build flag for profiling, then uhhh let's not spend more time on it and only enable it on hw debug

@romanz romanz force-pushed the romanz/dict-no-alloc branch 2 times, most recently from 60d160c to 24a3284 Compare April 2, 2025 09:09
@romanz
Copy link
Contributor Author

romanz commented Apr 2, 2025

IIUC, we always build the emulator with tracing enabled so I think that we can simplify the PR by periodically checking for reallocations - as currently done by utils.unimport_end().

The new code in 24a3284 uses the pre-allocation constants from mpconfigport.h, and skips the check if the tracing profiler is detected in run-time.

WDYT?

@matejcik
Copy link
Contributor

matejcik commented Apr 2, 2025

yeah, this approach makes more sense
with regard to main.py, you could also look up main in sys.modules and check the capacity of that, which should work in the profiler case

@romanz romanz changed the title chore(vendor): upgrade MicroPython to use non-reallocateble dicts fix(core): presize __main__ module to avoid heap fragmentation Apr 2, 2025
@romanz romanz removed micropython Python interpreter and runtime that runs Trezor firmware rust Pull requests that update Rust code labels Apr 2, 2025
@romanz romanz moved this from 🏃‍♀️ In progress to 🔎 Needs review in Firmware Apr 2, 2025
@romanz romanz marked this pull request as ready for review April 2, 2025 11:54
romanz added 2 commits April 2, 2025 15:20
Rewrite the static comparison in `utils.unimport_end()` in C.

[no changelog]
@romanz romanz force-pushed the romanz/dict-no-alloc branch from f94735f to fe09331 Compare April 2, 2025 12:23
@romanz romanz changed the base branch from romanz/qstr-fixes to main April 2, 2025 13:01
@romanz romanz merged commit e6f9697 into main Apr 2, 2025
96 checks passed
@romanz romanz deleted the romanz/dict-no-alloc branch April 2, 2025 13:01
@github-project-automation github-project-automation bot moved this from 🔎 Needs review to 🤝 Needs QA in Firmware Apr 2, 2025
@bosomt bosomt moved this from 🤝 Needs QA to ✅ Approved in Firmware Apr 3, 2025
@romanz
Copy link
Contributor Author

romanz commented Apr 3, 2025

It seems that there is no main module on firmware - so this PR causes the firmware to not boot :(
Opened #4872 to fix this issue.

@romanz
Copy link
Contributor Author

romanz commented Apr 3, 2025

It has caused also the CI hardware tests to fail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Trezor Core firmware. Runs on Trezor Model T and T2B1.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants