Upgrade Linux kernel from 6.6 to 6.12 #2300
ZFS 2.2.5 does not support kernel 6.10. The zfs upgrade patches will be dropped after the portage stable update PR (with zfs 2.2.6) gets merged: #2298
Build action triggered: https://github.com/flatcar/scripts/actions/runs/14906773777
@@ -36,6 +36,5 @@ IUSE=""
# local patches overlap with the upstream patch.
UNIPATCH_LIST="
${PATCH_DIR}/z0001-kbuild-derive-relative-path-for-srctree-from-CURDIR.patch \
${PATCH_DIR}/z0002-revert-pahole-flags.patch \
Have you tested this?
When pahole is executed with -j (parallel), the BTF metadata order is non-deterministic, and the built kernel and modules don't match.
It doesn't have to be a revert, but we need to carry some patch (unless something significant changed).
Definitely, working on it. The pahole flags moved to scripts/Makefile.btf, so that needs to be addressed; I am working on a patch now.
We recently updated pahole to a newer version (1.27) that was supposed to be reproducible regardless of how many threads it uses, but dropping the patch didn't work for me.
Seems like we may still need a kernel patch to pass --btf_features=all,reproducible_build: https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?h=v1.27&id=43bd3efa85656565129063cdd6dd7499e44a7867
this could be upstreamed
I will test it asap and send it to LKML if it works.
Added the reproducible_build flag to the pahole params, although I don't like how that addition is done: that file is a beautiful soup and needs better management.
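To make the two options from this thread concrete, here is a minimal Python sketch. It is not the actual kernel patch and does not reflect how scripts/Makefile.btf assembles its flags. It assumes the pahole options are available as a plain list of strings, and the helper name make_reproducible is hypothetical. Only the -j option and the --btf_features=all,reproducible_build value come from the discussion above.

```python
def make_reproducible(flags):
    """Return a copy of `flags` adjusted for deterministic BTF output.

    Hypothetical helper for illustration only: it drops the parallel
    "-j" / "-jN" option and makes sure "reproducible_build" is listed
    in --btf_features (adding the option if it is missing).
    """
    result = []
    seen_features = False
    for flag in flags:
        # Drop "-j" and "-jN" (parallel BTF encoding).
        if flag == "-j" or (flag.startswith("-j") and flag[2:].isdigit()):
            continue
        if flag.startswith("--btf_features="):
            seen_features = True
            features = flag.split("=", 1)[1].split(",")
            if "reproducible_build" not in features:
                features.append("reproducible_build")
            flag = "--btf_features=" + ",".join(features)
        result.append(flag)
    if not seen_features:
        # Value quoted in the review comment above.
        result.append("--btf_features=all,reproducible_build")
    return result


if __name__ == "__main__":
    # Example input is made up; real flag lists depend on the pahole version.
    print(make_reproducible(["-j", "--btf_features=all"]))
```

The sketch applies both measures at once (no -j and reproducible_build), which matches the direction the thread ended up in; either one alone may be sufficient depending on the pahole version.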
Things like this make me want to wait with an upgrade to 6.10: it takes a couple of minor releases on a new stable branch before it is ready to make its way into Flatcar.
Adding the blocker bug here: https://bugzilla.kernel.org/show_bug.cgi?id=219229. A possible resolution from the bug discussion is to set this in the kernel config: CONFIG_TCG_TPM2_HMAC=n
The CONFIG_TCG_TPM2_HMAC feature was introduced in 6.10 as an extra security layer: https://github.com/torvalds/linux/blob/master/drivers/char/tpm/Kconfig#L37
cce5dc3
to
6dd0a5b
Compare
Managed to get the ARM64 image built, but the AMD64 image fails at the initrd/grub stage. Full error below:
Successful build for the AMD64:
The bpf amd64
...oreos-overlay/sys-kernel/coreos-sources/files/6.10/z0002-pahole-remove-parallel-j-flag.patch
5595c96
to
4ad039e
Compare
@t-lo I noticed that, starting with Linux kernel 6.10, a Hyper-V daemon binary has been renamed; see torvalds/linux@82b0945. Should we keep the same systemd unit name, though? I wonder how https://github.com/microsoft/azurelinux will handle it (I have not seen a patch yet). I am wavering between this approach (4ad039e) and changing the name in all places.
I think we should rename the systemd service to prevent confusion down the road.
The thing is that the binaries do the same thing and have the same interface, but internally use a different implementation, namely uio_hv_generic. The weird part is that the old implementation is still present, but its build is disabled. I will add a new service definition for the new version (it also has a different device path trigger), to keep things separate.
The /boot partition is very close to a critical level: 49% is already used, leaving around 1.5MB free to use:
04d37b9
to
0d9b839
Compare
Note: on AMD64 vmlinuz-a, build_library/extract-initramfs-from-vmlinuz.sh now fails because the script finds the corrupted CPIO first. More debugging is needed to find out why this happens in the first place (what has changed upstream).
06de6c8
to
f62e9ba
Compare
2987c4d
to
ce6b28b
Compare
sdk_container/src/third_party/coreos-overlay/sys-kernel/coreos-sources/Manifest
ce6b28b
to
86c9ab1
Compare
86c9ab1
to
f8e92b4
Compare
f8e92b4
to
9c0c7b3
Compare
Just a heads up that the latest CI run shows that both the amd64 and arm64 vmlinuz images will be the largest they've ever been, despite my recent changes to tackle the size. They're still within the limit, but obviously it's getting very tight now. I'll keep looking at moving the image.
How big are we now? And have we looked into what is contributing to an increase with this update?
61,065,720 bytes for amd64 and 63,002,616 bytes for arm64. The size has bumped up and down for various reasons over time, but the previous biggest sizes I have recorded are 59,903,488 and 62,138,880 respectively. I have kept track of the remaining space with two such images present alongside the other files in /boot across different releases. In the worst case, we have 2,022,928 and 3,070,992 remaining respectively. Some kernel modules have increased in size, but not massively. There are some new ones, but some have been removed too. There are a couple of new firmware files, but these are small. The report seems slightly bugged here.
Also bear in mind that the arm64 kernel itself cannot be compressed, but it does uniquely benefit from
Some new drivers have presumably been added, and I would expect the kernel to generally grow in size anyway, so such a jump from 6.6 to 6.12 does not surprise me. It's just something to keep an eye on.
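For reference, the growth over the previous largest images can be worked out from the figures quoted above. This is only a small arithmetic sketch in Python; the dictionary and function names are made up for illustration, and the byte counts are copied from the comment.

```python
# Sizes in bytes, copied from the comment above.
PREVIOUS_MAX = {"amd64": 59_903_488, "arm64": 62_138_880}
CURRENT = {"amd64": 61_065_720, "arm64": 63_002_616}


def growth(arch):
    """Return (absolute growth in bytes, relative growth in percent)."""
    delta = CURRENT[arch] - PREVIOUS_MAX[arch]
    return delta, 100.0 * delta / PREVIOUS_MAX[arch]


if __name__ == "__main__":
    for arch in ("amd64", "arm64"):
        delta, pct = growth(arch)
        print(f"{arch}: +{delta:,} bytes ({pct:.2f}%)")
```

This only compares against the previously recorded maximum sizes, not against the 6.6 images that this PR replaces, so it understates or overstates the effect of the kernel bump itself.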
9c0c7b3
to
f0479a5
Compare
Could you investigate this? There's likely some settings that got auto-enabled that are not needed.
One thing I've noticed.
pahole: added a revamped patch to remove the parallel implementation
kernel: use pahole 1.27 feature of reproducible builds
The out-of-tree nvidia driver requires symbols that are behind DRM_TTM_HELPER if DRM_FBDEV_EMULATION is enabled, but DRM_TTM_HELPER can't be selected unless we build more drm drivers (which is undesirable). To get out of this, disable DRM_FBDEV_EMULATION. Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
…_fcopy_uio_daemon See: torvalds/linux@82b0945ce2c2d636d5e893ad50210875c929f257 Also fix hv tools build for arm64.
Remove CONFIG_AMD_IOMMU_V2, CONFIG_FB_ARMCLCD, CONFIG_MD_LINEAR, CONFIG_NET_ACT_IPT. Add CONFIG_MODULE_COMPRESS. See: torvalds/linux@5a0b11a
linux: remove CONFIG_MD_LINEAR See: torvalds/linux@849d18e
linux: remove CONFIG_NET_ACT_IPT See: torvalds/linux@86fe596
linux: add required CONFIG_MODULE_COMPRESS=y See: torvalds/linux@c7ff693
linux: remove CONFIG_FB_ARMCLCD See: torvalds/linux@dee56cc
f0479a5
to
288e8c9
Compare
CONFIG_MODULE_COMPRESS was enabled because it is a new requirement for the CONFIG_MODULE_* option that we already had enabled, CONFIG_MODULE_COMPRESS_XZ=y. With CONFIG_MODULE_COMPRESS_XZ=y set but CONFIG_MODULE_COMPRESS unset, the build fails. See the commit message in 74c7e5d and the linked Linux commit description for details on why: torvalds/linux@c7ff693
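The dependency described here could be checked before a build with something like the following Python sketch. It is only an illustration under the assumption that the kernel config is available as plain text with one CONFIG_NAME=value entry per line; the function names are hypothetical and not part of the Flatcar scripts.

```python
def parse_config(text):
    """Parse CONFIG_NAME=value lines into a dict, skipping comments and blanks."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def xz_without_compress(options):
    """True if XZ module compression is selected without CONFIG_MODULE_COMPRESS."""
    return (
        options.get("CONFIG_MODULE_COMPRESS_XZ") == "y"
        and options.get("CONFIG_MODULE_COMPRESS") != "y"
    )


if __name__ == "__main__":
    # Config fragment as it looked before CONFIG_MODULE_COMPRESS was added.
    old_fragment = "CONFIG_MODULE_COMPRESS_XZ=y\n"
    print(xz_without_compress(parse_config(old_fragment)))
```

A check like this only covers the single dependency discussed in this comment; the kernel's own Kconfig tooling remains the authority on option dependencies.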
I see. I did some comparisons with both options disabled. The combined kernel image does indeed come out 1MB smaller. It's harder to measure the impact on /usr. zstd is used with btrfs, so if I recompress all the coreos-modules files with zstd, that comes out 119MB larger. This may not be representative. I think we can spare 1MB for now, and I still hope to move the kernel, so let's leave this as it is.
What I was trying to say is that the
That's okay, I'm just trying to find reasons for the size increase.
Some of the cryptographic options have changed from
I can't see anything else obviously unneeded and/or huge.
Upgrade Linux kernel from the 6.6.y stable branch to 6.12.y stable branch.
See: flatcar/Flatcar#1527
This PR is mostly to reveal any possible big blockers before getting to the new 6.12 LTS release.
Tested 6.10.y and it works as expected.
Tested 6.11.y and it works as expected.
Now testing 6.12.y.
Testing done
changelog/ directory (user-facing change, bug fix, security fix, update)
/boot and /usr size, packages, list files for any missing binaries, kernel modules, config files, kernel modules, etc.
Boot partition size: