[WIP] Enable Rockchip arch in kernel #2556

Draft: wants to merge 9 commits into base: main

sambonbonne

Set CONFIG_ARCH_ROCKCHIP

After some discussion on Matrix about Odroid M1S (based on RK3566), I made this PR to enable Rockchip arch in the kernel.

The goal is to test the generated aarch64 image on my hardware and to check that the initrd size does not increase too much, before discussing the possible inclusion of this configuration in Flatcar.

⚠️ This PR is based on #2300 to use the 6.12 kernel, because Odroid M1S support was added in 6.12, so it can't be merged before #2300.

How to use

Installing the image on the Odroid M1S requires U-Boot binaries and multiple steps. I will add testing commands if it works on my hardware and someone wants to try it on real hardware.

Testing done

No testing for now. I plan to download the generated image (when it's generated) and install it on my hardware to test if it boots.

  • Changelog entries added in the respective changelog/ directory (user-facing change, bug fix, security fix, update)
  • Inspected CI output for image differences: /boot and /usr size, packages, list files for any missing binaries, kernel modules, config files, etc.

@@ -5,6 +5,7 @@ CONFIG_ARCH_BCM2835=y
 CONFIG_ARCH_BCM_IPROC=y
 # CONFIG_ARCH_MEDIATEK is not set
 # CONFIG_ARCH_QCOM is not set
+CONFIG_ARCH_ROCKCHIP=y
Contributor

CONFIG_ARCH_MULTI_V7 might be required too

Author

RK3566 is ARMv8-A (it only has four Cortex-A55 cores, which are ARMv8.2-A) and CONFIG_ARCH_MULTI_V7 is for ARMv7 (if I understand correctly). So at first sight it should not be required, but if my real test fails, I'll try adding it.

Contributor

Author

Thanks, this is weird because it says it's for Rockchip Cortex-A9 so I'll try without it and if it doesn't work, I'll try to add it.

If I add it, I'll try to use the SDK container to build locally and make the change if it helps.

@ader1990
Contributor

ader1990 commented Jan 3, 2025

@sambonbonne the image was built; you can download it from the GitHub Actions artifacts page and try it: https://github.com/flatcar/scripts/actions/runs/12575139169?pr=2556

github-actions bot commented Jan 3, 2025

Build action triggered: https://github.com/flatcar/scripts/actions/runs/14925630539

@ader1990
Contributor

ader1990 commented Jan 3, 2025

The vmlinuz image increase is pretty big, about 3MB, which is more than the space allowed for updates to work; that's why some of the tests failed. But as a PoC, you can first try the resulting image and see if it works.
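
To put a number on this kind of growth, the two kernel images can be compared directly. This is a minimal sketch, assuming both the baseline and the new vmlinuz files are available locally; the helper name `size_delta_bytes` and the example file names are hypothetical, not part of the Flatcar tooling.

```shell
#!/bin/sh
# Print how many bytes larger (or smaller, if negative) NEW is than OLD.
# size_delta_bytes is a hypothetical helper, not a Flatcar script.
size_delta_bytes() {
    old_size=$(wc -c < "$1") || return 1
    new_size=$(wc -c < "$2") || return 1
    echo $((new_size - old_size))
}

# Example (file names are placeholders):
#   size_delta_bytes vmlinuz-main vmlinuz-rockchip
```

The result is in bytes, so an increase of roughly 3MB shows up as a value around three million.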

@sambonbonne
Author

@ader1990 thank you! I had some problems when trying to edit the partition layout of my SD card with fdisk on Thursday (for U-Boot) and I couldn't try to fix it this weekend, but I may have time to work on it on Monday or Tuesday. I already know what I can try to fix my partitioning problem, so it should not take too long.

I understand 3MB is too big; unfortunately I don't know enough about "kernel things" to help with this, so if I manage to get a working image, I hope we will be able to reduce the added size or find an alternative.

@sambonbonne
Author

@ader1990 I'm trying to use flatcar-install /path/to/flatcar_production_image.bin but I have some errors, IDK if I did something wrong. Here are the logs:

$ flatcar-install -d /dev/sdb -B arm64-usr -i /path/to/ignition.json -f /path/to/flatcar_production_image.bin -u
Using existing image: /path/to/flatcar_production_image.bin
Writing /path/to/flatcar_production_image.bin...
Running in chroot, ignoring request.
Running in chroot, ignoring request.
mount: /tmp/flatcar-install.77HskzhsAm/oemfs: WARNING: source write-protected, mounted read-only.
Installing Ignition config /path/to/ignition.json...
cp: cannot create regular file '/tmp/flatcar-install.77HskzhsAm/oemfs/config.ign': Read-only file system
Error: return code 1 from [[ -n "${IGNITION}" ]]
/dev/sdb: 8 bytes were erased at offset 0x00000200 (gpt): 45 46 49 20 50 41 52 54
/dev/sdb: 2 bytes were erased at offset 0x000001fe (PMBR): 55 aa
/dev/sdb: calling ioctl to re-read partition table: Success

If you have any idea to help me, I will be very happy!

@sambonbonne
Author

@ader1990 I managed to fix my problem and to use flatcar-install with the generated image.

But it seems the first partition, the EFI one, is not mountable when installed so I could not write the boot.scr file. I mounted the partition of the image file to copy the boot.scr before running flatcar-install so I guess it's a workaround but I find it suspicious that the partition is not mountable after dd.

Anyway, the image does not seem to boot, so I added CONFIG_ARCH_MULTI_V7 as you recommended, but I can't run the pipeline myself. I tried to build using the SDK container but I ran into problems when running emerge and I don't have the knowledge to fix them.

Can you launch the pipeline so I can try a new image with the added kernel parameter? Otherwise, should I ask for help with the SDK container (if yes, where? The Matrix channel?)?
Thanks in advance!

Just FYI, here is the boot.txt file I use to generate the boot.scr:

load ${devtype} ${devnum}:1 ${kernel_addr_r} /EFI/boot/bootaa64.efi                                           
bootefi ${kernel_addr_r}  
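
For readers who want to reproduce the boot.scr step, here is a minimal sketch of compiling boot.txt with mkimage. The wrapper name `make_boot_scr` is hypothetical, and the `-A arm64 -O linux` values are an assumption about what suits this board; the exact invocation used in this thread is not shown.

```shell
#!/bin/sh
# Compile a U-Boot script source (boot.txt) into boot.scr.
# make_boot_scr is a hypothetical wrapper; mkimage ships with the U-Boot tools.
make_boot_scr() {
    src=${1:-boot.txt}
    out=${2:-boot.scr}
    command -v mkimage >/dev/null 2>&1 || {
        echo "mkimage not found (usually packaged with the U-Boot tools)" >&2
        return 127
    }
    # -T script marks the image as a U-Boot script; -C none disables compression.
    mkimage -A arm64 -O linux -T script -C none -d "$src" "$out"
}

# Example:
#   make_boot_scr boot.txt boot.scr
```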

@ader1990
Contributor

@sambonbonne I had to resolve some conflicts. I triggered a new build and you should be able to download the image artifact in a few hours, if all goes well.

@sambonbonne
Author

@ader1990 thanks! I hope the new image will boot. I'll give it a try when I'm able to.

@sambonbonne
Author

@ader1990 it seems I cannot set CONFIG_ARCH_MULTI_V7, the pipeline for ARM64 failed with:

ERROR: sys-kernel/coreos-modules-6.12.0::coreos-overlay failed (configure phase):
  Requested options not enabled in build:
    CONFIG_ARCH_MULTI_V7

See https://github.com/flatcar/scripts/actions/runs/12743950372/job/35527535677#step:7:4656.

So I guess I can't enable the CONFIG_ARCH_MULTI_V7 option?
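
One way to check up front whether an option can be enabled for a given architecture is to look for its Kconfig declaration under that architecture's directory in the kernel sources. The sketch below assumes an unpacked kernel tree; `symbol_defined_for_arch` is a hypothetical helper. As far as I know, ARCH_MULTI_V7 is declared only under arch/arm, which would explain the "Requested options not enabled in build" error on arm64, but that is an assumption to verify against the actual tree.

```shell
#!/bin/sh
# Succeed if Kconfig symbol SYMBOL (without the CONFIG_ prefix) is declared
# in a Kconfig file under arch/ARCH of the kernel source tree at KSRC.
# symbol_defined_for_arch is a hypothetical helper.
symbol_defined_for_arch() {
    ksrc=$1 arch=$2 symbol=$3
    grep -rqsE --include='Kconfig*' \
        "^(menu)?config[[:space:]]+${symbol}[[:space:]]*$" "$ksrc/arch/$arch"
}

# Example (the source path is a placeholder):
#   symbol_defined_for_arch /path/to/linux arm64 ARCH_MULTI_V7 || echo "not an arm64 symbol"
```

This only inspects architecture-specific Kconfig files; symbols declared under drivers/ or other shared directories would need a wider search.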

@ader1990
Contributor

> @ader1990 it seems I cannot set CONFIG_ARCH_MULTI_V7, the pipeline for ARM64 failed with:
>
> ERROR: sys-kernel/coreos-modules-6.12.0::coreos-overlay failed (configure phase):
>   Requested options not enabled in build:
>     CONFIG_ARCH_MULTI_V7
>
> See https://github.com/flatcar/scripts/actions/runs/12743950372/job/35527535677#step:7:4656.
>
> So I guess I can't enable the CONFIG_ARCH_MULTI_V7 option?

It seems that the newer 6.12 kernel does not need it anymore. You can push a new change and I can start the build.

For building the kernel properly with Flatcar, I have the following notes for https://www.flatcar.org/docs/latest/reference/developer-guides/sdk-modifying-flatcar/#getting-started:

./build_packages --board arm64-usr

# make sure the tmp is clean
sudo rm -rf /build/arm64-usr/var/tmp/portage/sys-kernel*

# if the kernel sources have been changed
emerge-arm64-usr sys-kernel/coreos-sources

# if the kernel config or patches have changed
emerge-arm64-usr sys-kernel/coreos-modules

# if the bootengine commit id has changed
emerge-arm64-usr sys-kernel/bootengine

# if the bootengine commit id has changed
sudo rm /build/arm64-usr/usr/share/bootengine/bootengine.cpio
emerge-arm64-usr sys-kernel/coreos-kernel

# do a build packages to make sure
./build_packages --board arm64-usr

# follow the official docs
# https://www.flatcar.org/docs/latest/reference/developer-guides/sdk-modifying-flatcar/#getting-started
# do build_image
# do image_to_vm

@sambonbonne
Author

sambonbonne commented Jan 18, 2025

Hello @ader1990 and thanks for those details!

I pushed a commit to remove the CONFIG_ARCH_MULTI_V7. I also got the error when trying to build locally.

Speaking of which, I tried to build again after removing this config: I entered the SDK container with ./run_sdk_container -a arm64 -t and ran ./build_packages --board arm64-usr, but I got another error and I don't understand how it's possible, as I use the SDK container:

sys-kernel/coreos-modules-6.12.0 is missing libraries:
	x86_64: libcrypto.so.3
WARNING build_packages: test_image_content: Failed dependency check
WARNING build_packages: This may be the result of having a long-lived SDK with binary
WARNING build_packages: packages that predate portage 2.2.18. If this is the case try:
    emerge-arm64-usr -agkuDN --rebuilt-binaries=y -j9  @world
    emerge-arm64-usr -a --depclean

I will try to run both emerge commands and rebuild, but if you think of something else, feel free to tell me.
Edit: I ran both commands; neither seemed to do anything specific (both succeeded) and the build fails with the same error.

I hope I don't ask for too much with all my questions, to be honest this is my first time building an entire distro, it's challenging and very instructive.

@ader1990
Contributor

What usually happens with build_packages is that the actual error is further up in the logs, so you might need to tee the logs to a file to search for it more easily: ./build_packages --board arm64-usr 2>&1 | tee -a build_packages.log.
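
Once the output has been captured with tee, the earliest error lines can be pulled out of the log. A minimal sketch, assuming the failures are reported with an `ERROR:` marker like the Portage message quoted earlier in this thread; `first_errors` is a hypothetical helper.

```shell
#!/bin/sh
# Print the first COUNT lines (default 5) of LOG containing "ERROR:",
# each prefixed with its line number. first_errors is a hypothetical helper.
first_errors() {
    log=$1
    count=${2:-5}
    grep -n 'ERROR:' "$log" | head -n "$count"
}

# Example:
#   first_errors build_packages.log
```

The line numbers make it easy to jump to the surrounding context in the log, which is usually where the real cause is.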

When I have errors with the build process, I usually start over from a very clean environment, as there might be leftovers or errors introduced by multiple builds. Since it is a dockerized environment, creating a new one is usually easy: remove the cloned repository, docker rm the dangling containers and images, and run docker system prune for safety (of course, make sure you are not using that environment for other work). Ideally, start from a fresh clone of flatcar/scripts; otherwise, do a git reset, git clean -fxd, and git rebase on the flatcar/scripts main branch, then start the process from step 1: ./run_sdk_container -t.

@ader1990
Contributor

@sambonbonne
Author

sambonbonne commented Jan 26, 2025

@ader1990 just wanted to tell you I'm still investigating the boot problem.

I tried multiple boot.txt files, and even tried to boot vmlinuz-a directly (with the bootz command), but even that doesn't work, so I think the problem is in U-Boot (the difficult part being: I don't have a UART cable, so I can't see the U-Boot logs).

Besides this, I managed to boot MicroOS with this simple boot.txt (using mkimage to make a boot.scr, of course):

btrload ${devtype} ${devnum}:${bootpart} ${ramdisk_addr_r} /boot/grub2/arm64-efi/kernel.img
bootm ${ramdisk_addr_r}  

So I know U-Boot is capable of booting a working OS, I just don't know why it doesn't boot Flatcar.

Edit: I just decided to order a serial cable to see if U-Boot logs messages to the UART port; it may take some time to arrive but I still want to work on this.

@sambonbonne sambonbonne marked this pull request as draft January 26, 2025 15:32
@sambonbonne sambonbonne force-pushed the feature/enable-rockchip-in-kernel branch from 9ed0ab0 to 87bc1d5 Compare January 26, 2025 15:38
@sambonbonne
Author

Latest push is due to me rebasing this branch from ader1990/linux_kernel_6_10 and removing the commits which added then removed CONFIG_ARCH_MULTI_V7. No need to rebuild for now.

@sambonbonne
Author

Good news: I got my cable and I already have some things.

Bad news: right now, it's still complicated to find why Flatcar doesn't boot.

I can see the Grub menu through the serial port but after booting Flatcar, even by adding a debug parameter, all I get is:

Booting a command list

EFI stub: Booting Linux Kernel...
EFI stub: EFI_RNG_PROTOCOL unavailable
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services...

I also tried to boot the A partition directly (still from Grub menu) with the debug parameter, nothing more than the above logs.

I think I'm missing a kernel configuration option, and I'm looking for it. Fortunately, Home Assistant provides a working image for the Odroid M1S, so I will try to find out what they use. I hope I'll be able to build locally this time; that would avoid running multiple pipelines just to find missing parameters.
I guess this will increase the initrd size, but let's find the right config first, then we'll see.

@ader1990
Contributor

I see that the support has been there since 6.12 (torvalds/linux@10dc64f), so it should work. For the debug logs, I would suggest adding more kernel command-line parameters, mainly console=ttyAMA0 console=ttyAMA1 console=ttyS0 console=ttyS1; maybe you can get some more information. Also, it would be helpful to share a kernel config file from an image that works, to compare and see what might be missing.

Thanks.

@sambonbonne
Author

sambonbonne commented Feb 12, 2025

My comment is a bit long, so I tried to structure it in two parts.

Debug

I tried different parameters for debug logs: the four console=… you gave and console=tty0 (I tried this last one because I found it in the boot partition of the HAOS image). I get nothing. I tried in the default entry and in the A partition entry.

Maybe I'm doing it wrong, but it's not my first time adding a temporary kernel parameter from Grub: I use cu as a serial console, I see some Odroid boot information, then I get the Grub menu. I edit the first entry, append the parameter at the end of the line starting with linux and ending with $linux_cmdline, and hit Ctrl-x to boot.

Kernel configs

As a theoretically working kernel config, I have two sources for Odroid M1S:

Gentoo wiki configs try

I tried to add configs from the Gentoo wiki, but that's where I got some build errors when running build_packages on my machine. I pulled my branch after you rebased it on main (thanks for that) but now I have a new build error, from podman. The podman build log (in /build/arm64-usr/var/log/portage/app-containers:podman-5.3.0:20250212-102732.log) just ends with this:

/build/arm64-usr/var/log/portage/app-containers:podman-5.3.0:20250212-102732.log
cd .
GOROOT='/usr/lib/go' /usr/lib/go/pkg/tool/linux_amd64/link -o $WORK/b001/exe/a.out -importcfg $WORK/b001/importcfg.link -installsuffix shared -X=runtime.godebugDefault=asynctimerchan=1,gotypesalias=0,httpservecontentkeepheaders=1,tls3des=1,tlskyber=0,x509keypairleaf=0,x509negativeserial=1 -buildmode=pie -buildid=Di7VbJOVEz0cCecPSPVh/lIE-3YL_RKIYFm-tCLNS/g5oi0ADpMmXJOtBy8UrJ/Di7VbJOVEz0cCecPSPVh -X github.com/containers/podman/v5/libpod/define.buildInfo=1739356126 -X github.com/containers/podman/v5/libpod/config._installPrefix=/usr -X github.com/containers/podman/v5/libpod/config._etcDir=/etc -X github.com/containers/podman/v5/pkg/systemd/quadlet._binDir=/usr/bin -X github.com/containers/common/pkg/config.additionalHelperBinariesDir= -extld=aarch64-cros-linux-gnu-gcc $WORK/b001/_pkg_.a
/usr/lib/go/pkg/tool/linux_amd64/buildid -w $WORK/b001/exe/a.out # internal
mkdir -p bin/
cp $WORK/b001/exe/a.out bin/podman-testing
/usr/lib/go/pkg/tool/linux_amd64/buildid -w $WORK/b001/exe/a.out # internal
mkdir -p bin/
cp $WORK/b001/exe/a.out bin/podman
test -z "" || chcon -t container_runtime_exec_t bin/podman

I built from a fresh environment (new clone, and all Flatcar containers and images removed), so I don't understand how I can still hit this kind of error (I run ./build_packages --board arm64-usr from the SDK container, started with ./run_sdk_container -a arm64 -t).

Is it possible to build a smaller set just to try the boot, without Podman for example?

Home Assistant OS info

For HAOS (short for Home Assistant OS), it bootloops when installed on an SD card, so I may have to install it directly on eMMC; maybe then I can find a way to copy /proc/config.gz and check whether there are useful kernel parameters. But even if I find other kernel parameters to add, I may face build errors.

@sambonbonne
Author

I managed to build with more parameters but still no luck. I created a PR on my repo for that and started a self-hosted runner with the required labels.

Build working but image still not booting: sambonbonne@1eb4d0b

Build failing: sambonbonne@98c0da9 (build log: https://github.com/sambonbonne/flatcar-scripts/actions/runs/13437246231/job/37542450165)

My next try, when I have the time, will be to run the HAOS image directly from eMMC (instead of SD) and see if the kernel config is available. 🤞

ader1990 and others added 5 commits April 29, 2025 07:05
pahole: added a revamped patch to remove the parallel implementation
kernel: use pahole 1.27 feature of reproducible builds
The out-of-tree nvidia driver requires symbols that are behind DRM_TTM_HELPER
if DRM_FBDEV_EMULATION is enabled, but DRM_TTM_HELPER can't be selected unless
we build more drm drivers (which is undesirable). To get out of this, disable
DRM_FBDEV_EMULATION.

Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
Remove CONFIG_AMD_IOMMU_V2, CONFIG_FB_ARMCLCD, CONFIG_MD_LINEAR, CONFIG_NET_ACT_IPT.

Add CONFIG_MODULE_COMPRESS.

See: torvalds/linux@5a0b11a

linux: remove CONFIG_MD_LINEAR

See: torvalds/linux@849d18e

linux: remove CONFIG_NET_ACT_IPT

See: torvalds/linux@86fe596

linux: add required CONFIG_MODULE_COMPRESS=y

See: torvalds/linux@c7ff693

linux: remove CONFIG_FB_ARMCLCD

See: torvalds/linux@dee56cc
@sambonbonne
Author

sambonbonne commented May 6, 2025

So, today I discovered earlycon and keep_bootcon Linux command line parameters (yeah, I'm that noob) and it helped a lot.

First, the kernel stops at clk: Disabling unused clocks. I guess this may come from my DTB file but I'm not sure. I can't manage to boot without passing a DTB file, so I will have to do something about this, but as a workaround I can pass the clk_ignore_unused parameter for now. Or find a way to boot without a DTB.

Then I got FATAL: iscsiroot requested but kernel/initrd does not support iscsi, followed by Refusing to continue and then a shutdown.

I see CONFIG_ISCSI_IBFT=y and CONFIG_ISCSI_IBFT_FIND=y in amd64_defconfig-6.12 so right now I'm building Flatcar again after adding the same configs in arm64_defconfig-6.12 to see if it boots.
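
This kind of cross-architecture comparison can be scripted. A minimal sketch, assuming both defconfig files are plain `CONFIG_*=value` lists as shown in this PR; `config_missing_in` is a hypothetical helper, not part of the Flatcar SDK.

```shell
#!/bin/sh
# List CONFIG_ options whose name contains PATTERN and that are set in
# REFERENCE but have no identical line in CANDIDATE.
# config_missing_in is a hypothetical helper, not part of the Flatcar SDK.
config_missing_in() {
    reference=$1 candidate=$2 pattern=$3
    grep -E "^CONFIG_[A-Za-z0-9_]*(${pattern})[A-Za-z0-9_]*=" "$reference" |
        while IFS= read -r line; do
            grep -qxF -- "$line" "$candidate" || printf '%s\n' "$line"
        done
}

# Example with the file names mentioned above:
#   config_missing_in amd64_defconfig-6.12 arm64_defconfig-6.12 ISCSI
```

A line reported here is only a candidate: an option that exists for one architecture may not be selectable on another, as the CONFIG_ARCH_MULTI_V7 attempt earlier in this thread showed.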

I hope I'll be able to fix the clocks problem but what a relief to get to that point!

@chewi
Contributor

chewi commented May 7, 2025

Ah, good idea, should have thought of that. Those messages might be red herrings though. iscsi support certainly isn't mandatory.

@sambonbonne
Author

Well, the added configs did not change anything, but by reading more carefully, I found this in the logs:

systemd[1]: systemd-modules-load.service: Main process exited, code=exited, status=1/FAILURE
systemd[1]: systemd-modules-load.service: Failed with result 'exit-code'.
systemd[1]: Failed to start systemd-modules-load.service - Load Kernel Modules.

It happens after the clock problem (when ignored with clk_ignore_unused) but just before the iSCSI problem, which makes me think the iSCSI failure is just a consequence of not being able to load kernel modules.

@chewi should I wait for the new Dracut version before retrying, or won't it change anything? The iSCSI failure logs come from Dracut, but as systemd-modules-load.service fails in the first place, I think it's more a systemd problem than a Dracut problem (but as I don't know a lot about kernel/systemd/dracut, I would be happy if the answer is "it's Dracut").

@chewi
Contributor

chewi commented May 7, 2025

Ah, so you are getting further than before. I thought it was still freezing before it even got to the initrd. Are you not able to get an emergency shell at this point? Then you could find out exactly why systemd-modules-load failed. Hard to guess why. Could be certain modules missing, the whole module directory being wrong, or some module just failing to load.

I doubt the new Dracut will help here. If you're building from recent master, then you should have it already anyway. Otherwise, it will be in the next Alpha, which should be out very soon.

@sambonbonne
Author

> Ah, so you are getting further than before. I thought it was still freezing before it even got to the initrd.

I'm getting further when I boot the vmlinuz-a image directly, but with Grub I still have no luck, unfortunately. So right now I do my tests by booting the vmlinuz-a image directly.

> Are you not able to get an emergency shell at this point?

How can I try? It fails and stops, but maybe I could try some kernel parameter?

I'm not sure it would work because everything is printed thanks to the earlyprintk and keep_bootcon parameters, so I'm not sure I could get something interactive. If I remove keep_bootcon, I get no console.

> If you're building from recent master, …

Right now, I'm still based on the 6.12 kernel branch (as Odroid M1S mainline support was only added in Linux 6.12) and I haven't rebased for a long time. I wanted to wait for the 6.12 merge before rebasing on it on a more regular basis.

@chewi
Contributor

chewi commented May 7, 2025

> How can I try? It fails and stops, but maybe I could try some kernel parameter?
>
> I'm not sure it would work because everything is printed thanks to the earlyprintk and keep_bootcon parameters, so I'm not sure I could get something interactive. If I remove keep_bootcon, I get no console.

Normally, it should either finish booting and reach the bash prompt or drop you to an emergency shell. You might not be seeing the latter. Then again, you might not be seeing the former either. Maybe it's actually booting? You can force an emergency shell with rd.break. There are various points you can tell it to break at. See man dracut.cmdline.

> Right now, I'm still based on the 6.12 kernel branch (as Odroid M1S mainline support has been added in Linux 6.12 only) and I didn't rebase for a long time. I wanted to wait for the 6.12 merge before rebasing on it on a more regular basis.

Ah yes, 6.12 isn't quite merged yet. Possibly waiting on feedback from me, I'll get on that. The branch (currently f0479a5) does include the new Dracut though.
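
The debugging parameters mentioned so far (earlycon, keep_bootcon, clk_ignore_unused, rd.break) can be assembled into one command line for a test boot. A minimal sketch; `add_cmdline_params` is a hypothetical helper, and `rd.break=pre-mount` is just one of the break points listed in dracut.cmdline(7), chosen here as an assumption about where stopping would be useful.

```shell
#!/bin/sh
# Append each extra parameter to a kernel command line unless it is already
# present. add_cmdline_params is a hypothetical helper for illustration.
add_cmdline_params() {
    cmdline=$1
    shift
    for param in "$@"; do
        case " $cmdline " in
            *" $param "*) ;;  # already present, skip it
            *) cmdline="${cmdline:+$cmdline }$param" ;;
        esac
    done
    printf '%s\n' "$cmdline"
}

# Example (the base command line is a placeholder):
#   add_cmdline_params "root=LABEL=ROOT" earlycon keep_bootcon clk_ignore_unused rd.break=pre-mount
```

The resulting string can be pasted at the end of the linux line in the Grub editor or into a U-Boot bootargs variable.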

@sambonbonne
Author

> Maybe it's actually booting?

I don't think so, here are the final logs.

Kernel logs
[    3.804011] systemd[1]: Starting dracut-cmdline-ask.service - dracut ask for additional cmdline parameters...
[    3.810784] systemd[1]: systemd-modules-load.service: Main process exited, code=exited, status=1/FAILURE
[    3.813810] systemd[1]: systemd-modules-load.service: Failed with result 'exit-code'.
[    3.817288] systemd[1]: Failed to start systemd-modules-load.service - Load Kernel Modules.
[    3.824375] systemd-journald[220]: Collecting audit messages is disabled.
[    3.837866] systemd[1]: Starting systemd-sysctl.service - Apply Kernel Variables...
[    3.887785] systemd[1]: Finished systemd-tmpfiles-setup-dev.service - Create Static Device Nodes in /dev.
[    3.932308] systemd[1]: Started systemd-journald.service - Journal Service.
[    4.635221] scsi_transport_iscsi: version magic '6.12.20-flatcar SMP preempt mod_unload aarch64RANDSTRUCT_4c1878ebf69f0970b95d3a28dd0d6256599204b7fc7c4e9c913a3ff975cebd24' should be '6.12.20-flatcar SMP preempt mod_unload aarch64RANDSTRUCT_4b2265eabc93dcbc1dc7f1aa2a6d6b3e44fdba91ec4a3be9063cfe0213443d79'
[    4.640048] dracut: FATAL: iscsiroot requested but kernel/initrd does not support iscsi
[    4.640987] dracut: Refusing to continue
[    4.822584] systemd-shutdown[1]: Syncing filesystems and block devices.
[    4.825414] systemd-shutdown[1]: Sending SIGTERM to remaining processes...
[    4.856711] systemd-journald[220]: Received SIGTERM from PID 1 (systemd-shutdow).
[    6.555001] random: crng init done
[    6.578620] systemd-shutdown[1]: Sending SIGKILL to remaining processes...
[    6.604126] systemd-shutdown[1]: Unmounting file systems.
[    6.608623] (sd-umount)[347]: Unmounting '/run/credentials/systemd-resolved.service'.
[    6.613167] (sd-umount)[348]: Unmounting '/run/credentials/systemd-tmpfiles-setup.service'.
[    6.617760] (sd-umount)[349]: Unmounting '/run/credentials/systemd-sysctl.service'.
[    6.622255] (sd-umount)[350]: Unmounting '/run/credentials/systemd-tmpfiles-setup-dev.service'.
[    6.626933] (sd-umount)[351]: Unmounting '/run/credentials/systemd-journald.service'.
[    6.631475] (sd-umount)[352]: Unmounting '/run/credentials/systemd-vconsole-setup.service'.
[    6.635995] (sd-remount)[353]: Remounting '/usr' read-only with options ''.
[    6.640223] (sd-remount)[354]: Remounting '/' read-only with options ''.
[    6.643166] systemd-shutdown[1]: All filesystems unmounted.
[    6.643776] systemd-shutdown[1]: Deactivating swaps.
[    6.644455] systemd-shutdown[1]: All swaps deactivated.
[    6.645061] systemd-shutdown[1]: Detaching loop devices.
[    6.646676] systemd-shutdown[1]: All loop devices detached.
[    6.647317] systemd-shutdown[1]: Stopping MD devices.
[    6.648458] systemd-shutdown[1]: All MD devices stopped.
[    6.649073] systemd-shutdown[1]: Detaching DM devices.
[    6.650159] systemd-shutdown[1]: All DM devices detached.
[    6.650733] systemd-shutdown[1]: All filesystems, swaps, loop devices, MD devices and DM devices detached.
[    6.653224] systemd-shutdown[1]: Syncing filesystems and block devices.
[    6.655495] systemd-shutdown[1]: Halting system.
[    6.657488] kvm: exiting hardware virtualization
[    6.658099] reboot: System halted

I think the reboot: System halted at the end tells everything, unfortunately.

> You can force an emergency shell with the rd.break.

Thanks, I'll give it a try when I'm able to.

> The branch (currently f0479a5) does include the new Dracut though.

Maybe I should rebase just to be sure I have the latest Dracut version, I'll try to do that first.

@ader1990
Contributor

ader1990 commented May 7, 2025

@sambonbonne I have these notes that might be helpful to make sure you are rebuilding correctly:

# make sure the tmp is clean
sudo rm -rf /build/arm64-usr/var/tmp/portage/sys-kernel*

# if the kernel sources have been changed
emerge-arm64-usr sys-kernel/coreos-sources

# if the kernel config or patches have changed
emerge-arm64-usr sys-kernel/coreos-modules

# if the bootengine commit id has changed
emerge-arm64-usr sys-kernel/bootengine

# if the bootengine commit id has changed
sudo rm /build/arm64-usr/usr/share/bootengine/bootengine.cpio
emerge-arm64-usr sys-kernel/coreos-kernel

# do a build packages to make sure
./build_packages --board arm64-usr

@sambonbonne
Author

sambonbonne commented May 7, 2025

Thanks @ader1990, I think mine are more aggressive:

sudo rm -rf sdk_container/.env sdk_container/.sdkenv sdk_container/.cache/ sdk_container/.config __build__/
sudo podman container rm $(sudo podman container ls -a | grep -F flatcar-sdk-arm64 | cut -d ' ' -f 1)

So I may try yours to see if the rebuild is quicker. If I understand correctly, I have to run yours directly in the SDK container and it won't rebuild all packages, which seems better.

@chewi
Contributor

chewi commented May 7, 2025

> [    4.635221] scsi_transport_iscsi: version magic '6.12.20-flatcar SMP preempt mod_unload aarch64RANDSTRUCT_4c1878ebf69f0970b95d3a28dd0d6256599204b7fc7c4e9c913a3ff975cebd24' should be '6.12.20-flatcar SMP preempt mod_unload aarch64RANDSTRUCT_4b2265eabc93dcbc1dc7f1aa2a6d6b3e44fdba91ec4a3be9063cfe0213443d79'

This does suggest you're not getting a clean build. The version magic of this module does not match that of the kernel. At least I think that's what it means, I've rarely seen this error! This could be true for all the modules.
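
To see at a glance which part of the version magic differs, the two RANDSTRUCT suffixes can be extracted from such a log line. A minimal sketch, assuming the message format shown above; `vermagic_seeds` is a hypothetical helper that prints the module's value followed by the kernel's.

```shell
#!/bin/sh
# Extract the RANDSTRUCT suffix of the module's and of the kernel's version
# magic from a "version magic '...' should be '...'" kernel log line.
# vermagic_seeds is a hypothetical helper; it prints "MODULE KERNEL".
vermagic_seeds() {
    printf '%s\n' "$1" | sed -n \
        "s/.*version magic '[^']*RANDSTRUCT_\([0-9a-f]*\)' should be '[^']*RANDSTRUCT_\([0-9a-f]*\)'.*/\1 \2/p"
}

# Example: feed it one (unwrapped) line copied from the serial console.
#   vermagic_seeds "$line"
```

In the line quoted above, everything except the RANDSTRUCT suffix is identical, so the kernel version and flags match and only that value differs between the module and the kernel. On a built image, `modinfo -F vermagic` on a module file shows the module's side of this comparison.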

@sambonbonne
Author

This is strange but maybe my "cleanup" commands are wrong?

Anyway, for this test I cloned Adrian's branch for 6.12 kernel in a new repo and just copied my arm64 kernel config before running the build. And the SDK image is more up to date than on my branch so it creates a new container. This should make a clean build.

I didn't know this line was an error, so thank you for pointing it out to me; if I see it again I'll know it's a problem!

@sambonbonne
Author

Well, even with a clean build I get these errors:

overlay: version magic '6.12.20-flatcar SMP preempt mod_unload aarch64RANDSTRUCT_71ad937ae63f3d79750961d1bfa81b890c5594f3d7b49584186880514fd81b44' should be '6.12.20-flatcar SMP preempt mod_unload aarch64RANDSTRUCT_9a728556cb6a7696af99ad13b13837dd5195c3206552972867870cfba04f37f5'
llc: version magic '6.12.20-flatcar SMP preempt mod_unload aarch64RANDSTRUCT_71ad937ae63f3d79750961d1bfa81b890c5594f3d7b49584186880514fd81b44' should be '6.12.20-flatcar SMP preempt mod_unload aarch64RANDSTRUCT_9a728556cb6a7696af99ad13b13837dd5195c3206552972867870cfba04f37f5'
scsi_transport_iscsi: version magic '6.12.20-flatcar SMP preempt mod_unload aarch64RANDSTRUCT_71ad937ae63f3d79750961d1bfa81b890c5594f3d7b49584186880514fd81b44' should be '6.12.20-flatcar SMP preempt mod_unload aarch64RANDSTRUCT_9a728556cb6a7696af99ad13b13837dd5195c3206552972867870cfba04f37f5'
dracut: FATAL: iscsiroot requested but kernel/initrd does not support iscsi
dracut: Refusing to continue

I did as I said I would:

  1. I cloned @ader1990's branch for kernel 6.12 in a new directory
  2. I copied and pasted my arm64 defconfig (I didn't commit this change)
  3. I ran the SDK container to build packages and images for the arm64-usr board (I can see the SDK container is new, it's not a pre-existing one)

What can create this "version magic" error in these steps?

@chewi
Contributor

chewi commented May 7, 2025

I don't know exactly how version magic works. Given that Flatcar builds the modules first and then the kernel itself later, it must get stored somewhere, but I can't work out where. In short though, it just means that the kernel and the modules weren't built from the same... config or maybe environment? I doubt you are, but make sure that you're not installing coreos-modules or coreos-kernel from a binary package.

@sambonbonne
Author

I'm sorry for this stupid question but how can I make sure of that?

@ader1990
Contributor

ader1990 commented May 7, 2025

I'm sorry for this stupid question but how can I make sure of that?

I would make sure by putting your changes in a git commit and cherry-picking it instead of copying and pasting. Then, to clean up properly, either wipe your scripts folder and do a fresh clone, or (the alternative I prefer) run git reset --hard (make sure you have done git add/git commit/git push before this) and then git clean -fxd.
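A minimal sketch of the second option, wrapped in a guard so it refuses to run while tracked files still have uncommitted changes. The clean_checkout name is hypothetical; the git commands are the ones mentioned above.

```shell
#!/bin/sh
# Sketch: reset and clean a checkout only when nothing tracked is uncommitted.
# clean_checkout is a hypothetical helper name.
clean_checkout() {
    # Any modified or staged tracked file counts as work that would be lost.
    if [ -n "$(git status --porcelain --untracked-files=no)" ]; then
        echo "uncommitted changes found; commit (and push) them first" >&2
        return 1
    fi
    git reset --hard || return 1
    # -f force, -x also remove ignored files, -d also remove directories
    git clean -fxd
}
```

Run it from the root of the scripts checkout. Note that git clean -fxd also deletes untracked and ignored files, which is the point here, since stale build output is what needs to go.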

@sambonbonne sambonbonne force-pushed the feature/enable-rockchip-in-kernel branch from 0702293 to 9deb546 Compare May 8, 2025 14:06
@sambonbonne
Author

sambonbonne commented May 8, 2025

@ader1990 thanks!
I decided to be lazy, so I copied the .github/workflows/ci.yaml from your repository. Last month you told me you were trying to make image builds possible for everyone (or maybe I misunderstood), so this may be simpler than always trying to keep a clean repository and pushing meaningless changes.
Also, I pushed my rebase onto your Linux 6.12 branch, so I should be up to date on everything.

Edit: it seems it's starting to run on my repo (Flatcar fork), many thanks for this!

@sambonbonne sambonbonne force-pushed the feature/enable-rockchip-in-kernel branch from 9deb546 to 153a6e7 Compare May 8, 2025 14:12
@ader1990
Contributor

ader1990 commented May 8, 2025

@ader1990 thanks! I decided to be lazy, so I copied the .github/workflows/ci.yaml from your repository. Last month you told me you were trying to make image builds possible for everyone (or maybe I misunderstood), so this may be simpler than always trying to keep a clean repository and pushing meaningless changes. Also, I pushed my rebase onto your Linux 6.12 branch, so I should be up to date on everything.

Edit: it seems it's starting to run on my repo (Flatcar fork), many thanks for this!

Great to hear that. In case you need a working example, here is a build that succeeded: https://github.com/ader1990/scripts/actions/runs/14442077036

@ader1990
Contributor

ader1990 commented May 9, 2025

sudo rm /build/arm64-usr/usr/share/bootengine/bootengine.cpio

It seems that sudo rm /build/arm64-usr/usr/share/bootengine/bootengine.cpio no longer works because bootengine.cpio now lives at a new path. @chewi, I know you recently made some changes around Dracut. Do you happen to know whether the bootengine.cpio artifact path has changed, and how we can test a new bootengine commit without a full clean Flatcar rebuild?

The file /build/arm64-usr/usr/share/bootengine/bootengine.cpio no longer exists, and there is a new path for bootengine.cpio in the coreos-modules folder.

As a result, this workflow for testing a new bootengine commit ID no longer works:

# if the bootengine commit id has changed
emerge-arm64-usr sys-kernel/bootengine

# remove the previously generated initrd archive, then rebuild the kernel
sudo rm /build/arm64-usr/usr/share/bootengine/bootengine.cpio
emerge-arm64-usr sys-kernel/coreos-kernel

# do a build packages to make sure
./build_packages --board arm64-usr

@ader1990
Contributor

ader1990 commented May 9, 2025

@chewi It seems that you made the following change:

https://github.com/flatcar/scripts/pull/2837/files#diff-85946bb23fcc3dd4b34d4ddf1ade124d142ac6c5b6214c88da68302a2a30aab0R90

and now the build path is /build/amd64-usr/usr/lib/modules/6.6.89-flatcar/build/bootengine.cpio, which is controlled by coreos-modules.

Then coreos-kernel has this change:

"${ESYSROOT}"/usr/bin/update-bootengine -k "${KV_FULL}" -o "${S}"/build/bootengine.cpio "${BE_ARGS[@]}" || die

This means that, to test a new bootengine commit, one now also needs to clean and rebuild coreos-modules:

# if the bootengine commit id has changed
emerge-arm64-usr sys-kernel/bootengine

# extra cleanup commands to be sure?
sudo rm -rf /build/arm64-usr/var/db/pkg/sys-kernel/bootengine-0.0.38-r37*
sudo rm -rf /build/arm64-usr/var/lib/portage/pkgs/sys-kernel/bootengine-0.0.38-r37*
sudo rm -rf /build/arm64-usr/usr/share/SLSA/sys-kernel_bootengine-0.0.38-r37.*

# if the bootengine commit id has changed
sudo rm /build/amd64-usr/usr/lib/modules/6.6.89-flatcar/build/bootengine.cpio
emerge-arm64-usr sys-kernel/coreos-modules

emerge-arm64-usr sys-kernel/coreos-kernel

# do a build packages to make sure
./build_packages --board arm64-usr

@chewi
Contributor

chewi commented May 9, 2025

I've just checked. The bootengine.cpio in coreos-modules at /build/*-usr/usr/lib/modules/*-flatcar/build is a 512-byte empty archive that was always there. I guess it's needed to satisfy the modules build before we build the kernel and initrd later. I'm not sure whether it really needs to be installed, but I see no reason to change it.

bootengine.cpio was only written to /build/*-usr/usr/share/bootengine because Dracut was previously run in a chroot where it couldn't write to the Portage ${WORKDIR}. Now that's no longer the case, it can just write it alongside the other build files like it should. It's therefore no longer persisted, so there's no need to delete it beforehand. I don't think there was really any need to delete it anyway, as it would have been overwritten regardless. I guess that documented step was just a precaution. I'll remove that step now. All you need to do is rebuild coreos-kernel.
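Put together, the shortened loop for testing a new bootengine commit might look like the sketch below. The emerge commands are the ones already used in this thread; the run and rebuild_initrd helpers and the BOARD and DRY_RUN variables are hypothetical additions so the sequence can be printed without running it.

```shell
#!/bin/sh
# Sketch: rebuild the initrd after a bootengine commit change, with no
# manual deletion of bootengine.cpio.
# Assumptions: run inside the SDK container; BOARD, DRY_RUN, run and
# rebuild_initrd are illustrative names, not part of the Flatcar scripts.

BOARD="${BOARD:-arm64-usr}"

# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

rebuild_initrd() {
    # Pick up the new bootengine commit.
    run "emerge-${BOARD}" sys-kernel/bootengine || return 1
    # coreos-kernel regenerates bootengine.cpio among its own build files.
    run "emerge-${BOARD}" sys-kernel/coreos-kernel || return 1
}
```

Calling DRY_RUN=1 rebuild_initrd prints the two emerge commands, and calling rebuild_initrd without it runs them. A final ./build_packages --board arm64-usr can still follow as a sanity check.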

@sambonbonne sambonbonne force-pushed the feature/enable-rockchip-in-kernel branch from 153a6e7 to 2940fb0 Compare May 10, 2025 11:26
Status: ⚒️ In Progress