-
Notifications
You must be signed in to change notification settings - Fork 7.6k
xtensa: support for C library provided by Xtensa toolchain #90351
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
799c2a7
to
5120c6a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notes and nitpicks. Do check the interrupt return path though, pretty sure that one's a bug.
This is something we've needed for a while, SOF targets generally build with the toolchain libc already but do it ad hoc and without OS support.
I do worry that with the need for all that synchronization (and maybe TLS?) support, that there's a lack of testing for this on an SMP platform. Is there a way to validate this on an existing qemu device somehow, maybe with a stub? Otherwise the Intel DSPs are going to end up being guinea pigs for upstream.
arch/xtensa/core/crt1.S
Outdated
#if defined(CONFIG_XTENSA_LIBC) | ||
rsr a5, PRID | ||
bnez a5, 1f | ||
CALL _init_reent_bss |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that some (actually most) xtensa SOCs don't come through this code, having their own bootstrap path (often coming out of e.g. bootloader ROMs that have already done the hardware setup). If this has complicated things to do we should consider adding a test somewhere to catch platforms that mess it up.
Alternatively, if this is something that isn't required for CPU0 (looks like it isn't) consider moving to arch_kernel_init()
, which runs on a cleanly-initialized CPU capable of running C code. (Though I guess this isn't a windowed call so would still need to be done in inline assembly)
Finally, my memory is that PRID is not guaranteed to be equal to zero for the boot CPU. On esp32, my memory (which is fuzzy, Espressif folks will want to comment) is that both cores had non-zero/not-even-small-integer PRID values. This may not work the way you want, which would require putting this code on the SMP bringup path.
Also: um... what exactly does "init bss" even mean? Is .bss not expected to be zero-filled?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, correcting typo/ambiguity: "isn't required for CPU1+". That is, this looks like code you want to run on the boot CPU only, after startup but before any subsystem code that might touch the libc runs. That's exactly what arch_kernel_init() is for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I successfully verified moving the _init_reent_bss()
call to arch_kernel_init()
, but in an SMP SoC is arch_kernel_init()
meant to run only on the boot CPU ? if not, is there an existing CONFIG_ macro which has the ID of the boot CPU ?
_init_reent_bss()
is a bit of misnomer; _init_reent_bss()
is to be called to initialize the single global copy of struct _reent
state for single-threaded executables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the latest update of this PR, calling _init_reent_bss()
is no longer needed.
__DYNAMIC_REENT__
support in xclib is used whereby the _reent
struct is retrieved using __getreent()
.
@andyross Is arch_kernel_init() guaranteed to run only on the boot CPU ?
if not, is there an existing CONFIG_ macro which has the ID of the boot CPU ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR no longer modifies the file arch/xtensa/include/kernel_arch_func.h
which holds arch_kernel_init()
.
But I would appreciate a response on whether arch_kernel_init()
is guaranteed to run only on the boot CPU.
arch/xtensa/core/xtensa_asm2_util.S
Outdated
#if defined(CONFIG_XTENSA_LIBC) && defined(CONFIG_MULTITHREADING) | ||
rsr a6, ZSR_CPU | ||
l32i a6, a6, ___cpu_t_current_OFFSET | ||
call4 _xclib_update_reent_ptr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems awkward to do this in assembly in the darkest center of the context switch. It's just a C function call, why not add e.g. _xclib_update_reent_ptr(_current);
to arch_switch()
, which is the inline wrapper around this. It's currently degenerate but at some point in the past had C code in it.
Also unless I'm missing it, this needs to be added to the context switch path on interrupt return too, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually reading that, I think the argument wants to be &_current->reent and not the bare thread pointer, right? The libc obviously doesn't know the layout of a zephyr thread, it wants that struct _reent I assume?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks just like newlib; can it use the same bits as that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andyross correct I need the reent structure address of the current thread to update the global struct _reent *_reent_ptr
used by xclib for threadsafety.
@keith-packard please could you point me to the sources for newlib ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The newlib integration bits for Zephyr are in lib/libc/newlib; you can either copy all of that and adjust to suit or try to share the code between newlib and the Xtensa toolchain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keith-packard The integration bits do not seem to implement anything for managing the current struct _reent
.
I would have expected to see __DYNAMIC_REENT
or _REENT_THREAD_LOCAL
set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the latest update of this PR, __DYNAMIC_REENT__
support in xclib is used whereby the _reent
struct is retrieved using __getreent()
.
It is no longer needed to do a _reent
pointer update during context switch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@keith-packard @andyross To further test SMP thread safety, I am using tinyraytracer
which does ray-tracing and render on the terminal:
https://github.com/BrunoLevy/TinyPrograms/blob/main/tinyraytracer.c
Each pixel is computed independently, so I saw an opportunity to use it as a multi-threading test.
I ported it to Zephyr, where as many thread as there are cores available get created:
williamtcdns@35d41e30
Both Xtensa C library and PICOLIBC successfully complete the rendering for a single core:
However, with more than one cores (ie: 4 cores in my tests), PICOLIBC fails, while I somewhat (noise in the rendering being debugged) get a complete rendering with Xtensa C library because it manages the current struct _reent
:
If the above Zephyr port of tinyraytracer
sounds like an interesting SMP thread safety test, I would be more than happy to submit it as a sample test.
set_property(TARGET linker PROPERTY cpp_base -lstdc++11) | ||
else() | ||
set_property(TARGET linker PROPERTY cpp_base -lstdc++) | ||
endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lib/libc/xclib/multithreading.c
Outdated
_Initlocks(); | ||
return 0; | ||
} | ||
SYS_INIT(xclib_Initlocks, APPLICATION, CONFIG_KERNEL_INIT_PRIORITY_DEFAULT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use a SYS_INIT() hook for arch code, it wastes two words for the record and however many instructions it takes for that wrapper that adds a needless return value. Arch code has copious spots to put initialization code in before calling z_cstart(), c.f. the arch_kernel_init() suggestion above.
Also, note that the "APPLICATION" init level is very late, and there may be driver/subsystem code from many sources (including out-of-the-zephyr-tree) that relies on libc behavior and whatever _Initlocks() needed to do. If for some reason you do need this to be a SYS_INIT() hook, I'm guessing you want it to be PRE_KERNEL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I also successfully verified moving the _Initlocks()
call to arch_kernel_init()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the latest update of this PR, calling _Initlocks()
is no longer needed.
In fact, this PR no longer modifies the file arch/xtensa/include/kernel_arch_func.h
which holds arch_kernel_init()
.
} | ||
SYS_INIT(xclib_Initlocks, APPLICATION, CONFIG_KERNEL_INIT_PRIORITY_DEFAULT); | ||
|
||
typedef struct k_mutex *_Rmtx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pet peeve: you probably want a spinlock or a k_sem here and not a mutex. k_mutex is very heavyweight and the only feature it brings is priority inheritance. Semaphores are a better general purpose blocking IPC lock for almost all purposes. But given the use case, I'm thinking that the libc is probably using this to protect tight critical sections, where spinlocks are much lighter still and objectively better choices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially used k_sem, but a recursive locking mechanism is needed to avoid hangs.
Thank you for all the feedbacks. Regarding the interrupt return path, I am assuming that xtensa_switch() is single point where context switch happens, regardless of whether in the interrupt return path. Xtensa Tools RJ-2025.5 has been released, which includes a 2 cores config called Until I complete the support for using the same xt-sim board with a selected Xtensa config in the environment variable XTENSA_CORE, separate patch sets are used. I am using those 2 SMP configs to validate this Xtensa toolchain C library support. |
ebdeba9
to
b1d3e55
Compare
Also, shouldn't the |
Yes, Where the linker puts it in It makes per-CPU local memory a requirement. |
b36ab5c
to
6945f41
Compare
With the latest update of this PR, |
a8f3156
to
3a8ca08
Compare
This change add an additional C library implementation provided by Xtensa toolchain. Signed-off-by: William Tambe <williamt@cadence.com>
Changes needed to build C++ code using Xtensa toolchain. Signed-off-by: William Tambe <williamt@cadence.com>
3a8ca08
to
295c7eb
Compare
|
This change add an additional C Library Implementation provided by Xtensa toolchain.