-
Notifications
You must be signed in to change notification settings - Fork 7.6k
tests: posix: add one integration platform per suite #86634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
cfriedt
wants to merge
1
commit into
zephyrproject-rtos:main
from
cfriedt:issue-86633-tests-posix-ensure-each-testsuite-has-at-least-one-integration-platform
Closed
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing that I wasn't totally sure about, is whether there is a conflict between
platform_key
permutations andintegration_platforms
.Ideally, we would want one tier0 per supported platform across all supported libc's (which are currently separate test scenarios). Does
integration_platforms
restrict CI to only 1 platform per test scenario?Are tests only run on-diff for integration_platforms and then all tests run weekly?
I just want to avoid regressions, if at all possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side, note, the
common
testsuite will eventually be completely separated to smaller testsuites. There are quite a few Just blocked on #83386 for now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Integration_platforms are used when twister is called with --integration/-G. Currently, this is done only with on-pr trigger and not always, depending on the scope of the changes, but can change in near future. E.g. maybe samples will always use -G. This is being discussed currently in the context of test strategy for Zephyr. I believe at least once per day (e.g. nightly) there will be a run with a scope broader than just integration_platforms.
There is no hard limit in twister, but the idea is to restrict the list to a minimal viable scope. If there are key differences between boards affecting a test then it might be worth having more than a 1. In most cases 1 should be enough.
platform_key works as a filter: after a single tests falling under the key parameters is done all other tests falling under the same key should be skipped. I guess having two integration_platforms falling under the same key parameters with a platform_key filter would be a problem, since the second one would get skipped and twister would complain, that skipping mustn't happen on integration_platforms. I don't think it should cause other problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I suspected.
In this case, I don't think it's wise to restrict integration tests to only 1 board, because only testing a subset of architectures against a subset of C libraries has historically been where regressions have occurred in POSIX and I will close this PR.
Also what I suspected.
In that case, I would consider the output of
-ll qa
mentioned in the linked issue to be a false positive, and I'll close this PR, as it would undoubtedly invite regressions in the future.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean that a single platform in integration scope is not enough to catch majority of regressions that can be introduced in a PR related to C libraries?
If so please consider this approach to testing (that we might adapt as our strategy):
We have limited resources (ci time = money) so we have to be efficient with our testing. To achieve this we can do testing at multiple levels with different scope:
On-PR: the most frequent trigger, hence the most optimized testing. Running only a subset of configurations (e.g. only on few integration platforms) can already cover ~80% (arbitrary number) of possible failures. We accept the risk at this stage
on-main/nightly: several times day/once a day. Broader scope, e.g. all "default_platforms" (i.e. multiple sim/emulators). With this we cover these extra %. When a regression is detected the release team makes a decision: quarantine/wait for hotfix/revert.
If ~80% of coverage cannot be achieved with a few platforms, let's add more to integration_platforms to get this 80% (integration_platforms doesn't have to be limited to just 1 platform).
As for the QA output:
It highlight that these tests relays on
filter
(e.g.filter: not CONFIG_NATIVE_LIBC
).filter
requires to run a cmake stage before twister can decide if a given platform should be skipped or not. So if no integration_platforms is used to limit the CI choice on PR it would start with way broader scope of platforms, and for each scenario it will have to run this cmake step for all platforms (several seconds per platform+scenario intead of "no time" when strings are compared).Having the above in mind, please reconsider re-opening the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct. Are you surprised? POSIX touches quite a lot. Literally everything from thread stacks and MMUs to networking and IPC. As POSIX maintainer, I've seen regressions occur for just about every architecture with every C library that Zephyr supports.
I have. Reducing the scope of CI allowed quite a few regressions through, and this continued for the better part of a year. Most regressions were found because the testing was not being done up-front, and failures were seen in nightly or weekly runs. In several cases, months worth of work was reverted as workarounds which meant that months of my time were wasted.
There really was no alternative but to increase the scope again to prevent regressions.
My current strategy is to break up the larger-scoped, less structured, legacy POSIX testsuites, into testsuites that are smaller in scope. The primary testsuites that are of concern to me are 'common' and 'fs'.
Once that is done, it should be possible to use integration platforms more effectively and narrow the scope of testing. However, that work is effectively blocked at the moment due to lack of reviews.
Perhaps a subset of this PR can be integrated now. I'll have another look.
In any case, after the common and fs POSIX testsuites are split up, there should be less risk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PerMac - I'll see what I can do here right after 4.1.0 is released.