Skip to content

lib: os: timespec_util: reimplement and test with broad inputs #92709

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

cfriedt
Copy link
Member

@cfriedt cfriedt commented Jul 4, 2025

Originally, the timespec_to_timeout() and timespec_from_timeout() tests assumed that tv_sec and tv_nsec values divided evenly with the number of ticks per second (CONFIG_SYS_CLOCK_TICKS_PER_SEC). However, those assumptions broke down when testing with 32768 ticks/s .

Upon further investigation, there were additional corner cases discovered that were not handled in an ideal or safe way.

Part of this fix involved identifying several "domains" and special values that have semantic meaning that must be handled particularly carefully.

Fixes #92158
Fixes #92908
Fixes #92912
Fixes #90479 (Coverity CID 524770)

Some context for #92908 and #92912 is provided below.

Added Constants

Additional hidden-API constants were made to simplify the solution. These are "hidden" via an existing Doxygen @cond block.

K_TICK_MIN: 1
K_TICK_MAX: UINT32_MAX-1 in 32-bit, INT64_MAX in 64-bit

K_TIMESPEC_MIN: the equivalent of K_TICK_MIN as a timespec object
K_TIMESPEC_MAX: the equivalent of K_TICK_MAX as a timespec object

K_TIMESPEC_NO_WAIT: the equivalent of K_NO_WAIT as a timespec object
K_TIMESPEC_FOREVER: the equivalent of K_FOREVER as a timespec object

Coverage Details

  1. Converting a negative k_timeout_t

These only exist in 64-bit representation and actually encode absolute timepoints via ticks. Since the stated purpose of the conversion functions is to convert between durations, we must reject absolute time points in timespec_from_timeout(). The previous implementation did this as well.

  1. Converting a negative timespec

We assume that this duration means a timeout has already expired, and round these up to K_NO_WAIT in timespec_to_timeout(). The previous implementation also considered this case but it has been simplified with timespec_compare().

  1. Due to the larger numeric space in the timespec representation, the reverse mapping of timespec to k_timeout_t is "fuzzy". However, K_NO_WAIT (K_TIMESPEC_NO_WAIT) and K_FOREVER (K_TIMESPEC_FOREVER) must remain semantically equivalent. The previous implementation also held this to be true, but the test cases are a bit clearer about it now.

  2. Also, due to the larger numeric space in timespec representation, there was a special requirement to round up to the nearest tick boundary for any timespec in the strictly exclusive range (K_TIMESPEC_NO_WAIT, K_TIMESPEC_MAX). We must round up, since

  • rounding down to K_NO_WAIT is most certainly not representative of a non-zero finite duration delay
  • the kernel operates on tick boundaries

However, since it's possible that rounding up might not always be the right thing to do, in order to allow the application to make more informed decisions, we created a modified
timespec_to_timeout_rem() that also returns the remainder (or difference) between the requested time to convert and resulting k_timeout_t. The difference is expressed as a timespec object. The previous implementation did not provide this capability in a convenient manner.

  1. Above the K_TIMESPEC_MAX boundary, which is the highest possible timespec that can be represented by a tick, and specifically in the 64-bit timeout representation, there is a domain that cannot be rounded up to K_TIMESPEC_FOREVER and should always be rounded down to K_TIMESPEC_MAX.

This is to ensure that finite durations remain finite (even if they are beyond the heat death of the universe).

Added API

k_timeout_t timespec_to_timeout_rem(const struct timespec *req, struct timespec *rem)

The function above accounts for the fact that converting from timespec (which has a much larger numerical range) to k_timeout_t is a "fuzzy" conversion. The entire numerical range of timespec must be mapped to either a semantic k_timeout_t or must be aligned to an appropriate tick boundary.

Fortunately, it is not a "lossy" conversion.

Inline Functions changed

  • timespec_from_timeout() was changed from an inline to a regular function
  • timespec_to_timeout() was changed to a wrapper around timespec_to_timeout_rem() with rem = NULL

Testing Done

Additional test cases were defined to exercise permutations of:

  • 32-bit platform
  • 64-bit platform
  • CONFIG_SPEED_OPTIMIZATIONS = y / n
  • CONFIG_TIMEOUT_64BIT = y / n
  • low tick rate (1 Hz)
  • high tick rate (1000000 Hz)
  • 32k tick rate (32768 Hz)
  • Prime tick rate (10007 Hz)
$ twister -T tests/lib/timespec_util/
INFO    - Using Ninja..
INFO    - Zephyr version: v4.2.0-rc2-36-g05ebc9fc942f
INFO    - Using 'zephyr' toolchain.
INFO    - Selecting default platforms per testsuite scenario
INFO    - Building initial testsuite list...
INFO    - Writing JSON report /home/cfriedt/zephyrproject/zephyr/twister-out/testplan.json
INFO    - JOBS: 32
INFO    - Adding tasks to the queue...
INFO    - Added initial list of jobs to queue
INFO    - Total complete:   18/  18  100%  built (not run):    0, filtered:   86, failed:    0, error:    0
INFO    - 12 test scenarios (104 configurations) selected, 86 configurations filtered (86 by static filter, 0 at runtime).
INFO    - 18 of 18 executed test configurations passed (100.00%), 0 built (not run), 0 failed, 0 errored, with no warnings in 26.27 seconds.
INFO    - 198 of 198 executed test cases passed (100.00%) on 9 out of total 1115 platforms (0.81%).
INFO    - 18 test configurations executed on platforms, 0 test configurations were only built.
INFO    - Saving reports...
INFO    - Writing JSON report /home/cfriedt/zephyrproject/zephyr/twister-out/twister.json
INFO    - Writing xunit report /home/cfriedt/zephyrproject/zephyr/twister-out/twister.xml...
INFO    - Writing xunit report /home/cfriedt/zephyrproject/zephyr/twister-out/twister_report.xml...
INFO    - Run completed

@cfriedt cfriedt force-pushed the timespec-tests-rework branch 5 times, most recently from fa5790b to 09cba91 Compare July 6, 2025 22:40
@cfriedt cfriedt changed the title lib: os: timespec_util: reimplement and test with 32768 ticks/s lib: os: timespec_util: reimplement and test with broad inputs Jul 6, 2025
@cfriedt cfriedt force-pushed the timespec-tests-rework branch from 09cba91 to 54f04c4 Compare July 6, 2025 22:51
@cfriedt cfriedt force-pushed the timespec-tests-rework branch from 54f04c4 to 4cba546 Compare July 6, 2025 23:23
@cfriedt cfriedt requested a review from andyross July 6, 2025 23:23
@cfriedt
Copy link
Member Author

cfriedt commented Jul 6, 2025

I would strongly recommend merging this before v4.2.0 is tagged. If that is not possible, I would strongly recommend backporting to v4.2-branch as soon as possible.

@cfriedt cfriedt force-pushed the timespec-tests-rework branch from 4cba546 to 4aa56b8 Compare July 6, 2025 23:30
@cfriedt cfriedt marked this pull request as ready for review July 6, 2025 23:37
@github-actions github-actions bot added the area: Base OS Base OS Library (lib/os) label Jul 6, 2025
@github-actions github-actions bot requested a review from dcpleung July 6, 2025 23:38
@cfriedt cfriedt force-pushed the timespec-tests-rework branch 2 times, most recently from a79627d to e7fb518 Compare July 6, 2025 23:39
@cfriedt cfriedt added the Hotfix Fix for issues blocking development, i.e. CI issues, tests failing in CI, etc. label Jul 7, 2025
Copy link
Member

@aescolar aescolar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the code correctly there is some overflow issues.
Note also a few suggestions so we would be able to support other time_t sizes.
(I have not reviewed the test itself.)

@cfriedt
Copy link
Member Author

cfriedt commented Jul 7, 2025

  • merge lib/os/timeutil.c into lib/utils/timeutil.c
  • change the lower bound for low tick rate to 1 (present in some benchmark tests)
  • change the upper bound for high tick rate to 1000000 (used by some native sim tests)
  • rename test configurations to be clearer
  • define SYS_TIME_T_MIN / SYS_TIME_T_MAX, remove hard-selection of CONFIG_PICOLIBC=y
  • document timespec_timeout_rem()

I added more rounding test-cases above 1000 ticks, so that we could be sure the conversion is accurate (with rounding) in the slightly higher end of timespec_from_timetout() as well.

@aescolar
Copy link
Member

@cfriedt in the previous to the last commit there has been both quite significant changes to the PR content and a rebase in 1.
Doing the rebase and a change in one push makes it quite much more difficult for reviewers to handle it.
It is much nicer to do two separate pushes one with a rebase and the other with the changes in the code.

@cfriedt cfriedt force-pushed the timespec-tests-rework branch 2 times, most recently from 8cfbfc5 to 385daba Compare July 12, 2025 00:52
@cfriedt
Copy link
Member Author

cfriedt commented Jul 12, 2025

  • added modulo operator back, although it was completely unnecessary
  • corrected code formatting

@cfriedt cfriedt requested a review from aescolar July 12, 2025 00:57
@cfriedt
Copy link
Member Author

cfriedt commented Jul 12, 2025

@andyross - would be great if you could re-approve.

@danieldegrasse @dkalowsk - it would be really quite good to get this Priority: high fix in so that v4.2.0 is not released with bugs for all platforms 🤷‍♂️

I've made quite a few of the recommended changes by Alberto. Coverage and bisectability tests show that it's safe.

I think it's ultimately up to you as release managers.

#define K_TICK_MAX ((k_ticks_t)(IS_ENABLED(CONFIG_TIMEOUT_64BIT) ? INT64_MAX : UINT32_MAX - 1))

/* The semantic equivalent of K_NO_WAIT but expressed as a timespec object*/
#define K_TS_NO_WAIT K_TICKS_TO_TIMESPEC(0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess TS is for TIMESPEC? Can we please use the same convention everywhere, i.e. K_TIMESPEC_NO_WAIT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - I can do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

@aescolar
Copy link
Member

@cfriedt

The changes to support any time_t size are trivial, and you got them in #92709 (comment)

How I see it, you have two options:

a) you apply those changes, this code supports and time_t size, this PR can be merged and we move on.
b) you continue arguing. But then you can expect me to continue requesting changes due to the issues already mentioned.

@@ -166,6 +166,49 @@ typedef struct {
/* added tick needed to account for tick in progress */
#define _TICK_ALIGN 1

/* Maximum and minimum value TIME_T can hold (accounting for 32-bit time_t in glibc.h) */
Copy link
Member

@aescolar aescolar Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Maximum and minimum value TIME_T can hold (accounting for 32-bit time_t in glibc.h) */
/* Maximum and minimum value TIME_T can hold */

This supports any time_t size

cfriedt added 5 commits July 14, 2025 15:32
Define additional constants and macros for converting between timespec
objects and k_timeout_t.

There are four semantically important durations identified with new
constants identified via '.

1. K_NO_WAIT / K_TIMESPEC_NO_WAIT'
  *  duration of zero

2. K_TICK_MIN' / K_TIMESPEC_MIN'
  * minimum number of ticks strictly greater than K_NO_WAIT

3. K_TICK_MAX' / K_TIMESPEC_MAX'
  * maximum number of ticks strictly and semantically less than K_FOREVER

4. K_FOREVER / K_TIMESPEC_FOREVER'
  * the familiar non-expiring number of ticks (i.e. infinity)

With these definitions, there are 3 meaningful regions for conversion
between timespec and k_timeout_t representation:

1. < K_TIMESPEC_NO_WAIT (i.e. a negative timespec)
  * should be rounded up to K_TIMESPEC_NO_WAIT
  * < K_NO_WAIT in ticks is an absolute time point in 64-bit
    representation should no be permitted in the affected conversion
    routines, as they are explicitly  for converting time durations rather
    than absolute time points.

2. K_TIMESPEC_NO_WAIT < x < K_TIMESPEC_MAX
  * should be rounded-up to the next tick boundary
  * must be represented as the minimal non-zero duration in ticks

3. K_TIMESPEC_MAX <= x < K_TIMESPEC_FOREVER
  * must be represented as the maximum expiring duration in ticks
  * should be rounded-down to K_TIMESPEC_MAX

Signed-off-by: Chris Friedt <cfriedt@tenstorrent.com>
Originally, the timespec_to_timeout() and timespec_from_timeout() tests
assumed that tv_sec and tv_nsec values divided evenly with the number of
ticks per second (CONFIG_SYS_CLOCK_TICKS_PER_SEC). However, those
assumptions broke down when testing with 32768 ticks/s .

Upon further investigation, there were additional corner cases
discovered that were not handled in an ideal or safe way.

Part of this fix involved identifying several "domains" and special
values that have semantic meaning that must be handled particularly
carefully.

Additional hidden-API constants were made to simplify the solution.

K_TICK_MIN (1)
K_TICK_MAX (UINT32_MAX-1 in 32-bit, INT64_MAX in 64-bit)

K_TIMESPEC_NO_WAIT (like K_NO_WAIT)
K_TIMESPEC_FOREVER (like K_FOREVER)

1. Converting a negative k_timeout_t

These only exist in 64-bit representation and actually encode absolute
timepoints via ticks. Since the stated purpose of the conversion
functions is to convert between durations, we must reject absolute
time points in timespec_from_timeout().

2. Converting a negative timespec

We assume that this duration means a timeout has already expired, and
round these up to K_NO_WAIT in timespec_to_timeout().

3. Due to the larger numeric space in the timespec representation,
the reverse mapping of timespec to k_timeout_t is "fuzzy". However,
K_NO_WAIT (K_TIMESPEC_NO_WAIT) and K_FOREVER (K_TIMESPEC_FOREVER) must
remain semantically equivalent. The previous implementation also held this
to be true, but the test cases are a bit clearer about it now.

4. Also, due to the larger numeric space in timespec representation,
there was a special requirement to round up to the nearest tick
boundary for any timespec in the strictly exclusive range
(K_TIMESPEC_NO_WAIT, K_TIMESPEC_MAX). We must round up, since
a) rounding down to K_NO_WAIT is most certainly not representative
   of a non-zero finite duration delay
b) the kernel operates on tick boundaries

5. Above the K_TIMESPEC_MAX boundary, which is the highest possible
timespec that can be represented by a tick, and specifically in the
64-bit timeout representation, there is a domain that cannot be
rounded up to K_TIMESPEC_FOREVER and should always be rounded down to
K_TIMESPEC_MAX.

This is to ensure that finite durations remain finite (even if
they are beyond the heat death of the universe).

Signed-off-by: Chris Friedt <cfriedt@tenstorrent.com>
Elaborate on testing for functions that convert between timespec and
k_timeout_t.

We explicitly add tests to verify functionality in the regions of
interest, verify the semantic equivalence of specific time points.

Signed-off-by: Chris Friedt <cfriedt@tenstorrent.com>
Since it's possible that rounding up might not always be the right thing
to do in every situation, in order to allow the application to make more
informed decisions, we created a modified timespec_to_timeout() that also
returns the remainder (or difference) between the requested time to
convert and resulting k_timeout_t. The difference is expressed as a
timespec object.

Signed-off-by: Chris Friedt <cfriedt@tenstorrent.com>
This reverts commit 8f09262.

Signed-off-by: Chris Friedt <cfriedt@tenstorrent.com>
@cfriedt cfriedt force-pushed the timespec-tests-rework branch from 385daba to a2cb683 Compare July 14, 2025 19:42
@cfriedt cfriedt requested a review from nashif July 14, 2025 19:42
@cfriedt
Copy link
Member Author

cfriedt commented Jul 14, 2025

The changes to support any time_t size are trivial, and you got them in #92709 (comment)

@aescolar - thanks. I appreciate your technical input on this topic, and your reviews were very helpful especially in terms of the modulo arithmetic.

How I see it, you have two options:

a) you apply those changes, this code supports and time_t size, this PR can be merged and we move on. b) you continue arguing. But then you can expect me to continue requesting changes due to the issues already mentioned.

There is a third option. Where we merge this PR as a fix for the bugs that it's fixing, keep the scope of it narrow, and move on.

The idealogical differences that we have about whether Zephyr should explicitly support 32-bit time_t are irrelevant to the test suite that is being adjusted in PR, since it was also previously designed to enforce 64-bit time_t.

However, I did make use of your suggestion to incorporate SYS_TIME_T_MAX and SYS_TIME_T_MIN as a workaround to support timespec utlities on platforms with a 32-bit time_t, which comprise 0.06% of supported platforms, and 0% of shipping platforms.

Copy link

@cfriedt cfriedt added Coverity A Coverity detected issue or its fix and removed Hotfix Fix for issues blocking development, i.e. CI issues, tests failing in CI, etc. labels Jul 15, 2025
@cfriedt
Copy link
Member Author

cfriedt commented Jul 15, 2025

Aside from fixing 4 bugs, another reason to consider merging this before tagging v4.2.0 is to avoid forcing the deprecation process for timespec_to_timeout(req).

This PR adds the rem parameter to provide the remainder to the caller, whereas the original did not.

If this is merged within the release cycle, then timespec_to_timeout(req, rem) will become API within one release cycle without requiring any changes.

If this is not merged within this release cycle, then we will need to rename timespec_to_timeout(req, rem) to something like timespec_to_timeout_remainder(req, rem). The sub-par timespec_to_timeout(req) will keep the premium parking spot for not only 3 release cycles to deprecate it, but there would be an additional 2 release cycles required to rename timespec_to_timeout_remainder(req, rem) to timespec_to_timeout(req, rem).

In other words, timespec_to_timeout() would not be "fixed" until v4.7.0 (v5.???). Probably far longer than it would even take to ensure time_t is 64-bit consistently.

@danieldegrasse and @dkalowsk - I would be happy to bring it up with the TSC if necessary.

@andyross has already approved once and changed requested by @nashif have been made.

@aescolar
Copy link
Member

If this is not merged within this release cycle, then we will need to rename timespec_to_timeout(req, rem) to something like timespec_to_timeout_remainder(req, rem).

We can just backport this change to 4.2.
Our deprecation policy is for API we can expect users to have started using, we do not need to follow it for something which has been a couple of weeks in the tree.

About this code today requiring a 64bit time_t, it seems you are misunderstanding my motivation to ask for any time_t to be suported.
So I'll try to elaborate:

The size of time_t is decided by the C library given the platform.
We support proprietary toolchains (which tend to have their own C libraries), we also support user provided C libraries.
In the tree we default to Picolibc for most platforms, but that does not mean users can only use pico or that they do.
We also support out of tree architectures.

Several of our users and contributors (including TSC members) have their own out of tree architectures, toolchains and C libraries, and/or have clients who do.

For processors focused on audio processing 24/48bit architectures are very typical.
Where native integer types are 24/48 bits, you can also expect the c library to define a time_t of 48bits which does not have any Y2038 issue.

Many users will not want to care at all about time_t/timespec either because they do not have a need for that kind type, or because they are all together avoiding it due to the trouble around time_t and the functions that use it (see your own MISRA reference as an example).
Due to this, and the presence of legacy libraries which either assume time_t is a given size or are prebuilt for a given size, some libCs will keep using an old time_t size by default.
And many of those libC users just do not care or need to care.

This code (lib/os/clock.c and timeutil) is built always, and so are these new (for a couple of weeks) functions. Even for users do not have any interest in using time_t or timespec. These users will probably be a bit confused about what part of the zephyr code they may use could be using these or not.
A test failing in their platform with a requirement for time_t being a given size will confuse them.

Quite many users are using the native_simulator based targets for host testing, and prefer to build with their host libc (glibc typically) so they can interact with the host easier. A lot of this "host interaction" happens by using other host prebuilt libraries.
Forcing glibc to build with a 64bit time_t breaks the ABI for calls using that type or derived types, and that is a nasty bug to debug. If the library they use is prebuilt, the user may not have an option to rebuild it with a 64bit time_t.

In general many of Zephyr developers just want to be able to test their code with any libC and don't want unnecessary constraints set on them.
They don't want to depend on code that only works with a given size of time_t or only works with some libCs if they happen to be using it as part of their testing on their target or host.

You seem to think that the Zephyr project can easily force time_t to be 64bits. But it can't, because its size is defined by the libC. All the Zephyr project can do is make it more difficult to work with other time_t sizes and take time and frustrate our users/developers.

What I'm requesting is that this code would support any time_t size. Be it 32, 48, 64 or whatever the libC and user decides.
You have already done most of the changes for this to be the case (most replacements of UINT64_MAX/MIN with SYS_TIME_T_MAX/MIN).
The remaining changes are trivial. And none of them have any drawback for us or our users. Pico will still default to the 64bit time_t. For other libCs the users can still chose the size they want. If they care about time_t and the year 2038 issue they just need to ensure it is more than 32bits, which is the default in the tree for embedded targets anyhow.

If you do this change, the code supports any libc, so you can remove the filter on the library on this code, and stop selecting pico (if you want). And any other developer whose code uses this code can also avoid having any constraint about the libC choice or time_t size in their testing.

I just don't see any drawback in supporting any time_t size, but quite a lot of drawbacks of not supporting it.
While supporting any time_t size does not mean we force or default to using any time_t size in embedded products that has any issue.

@cfriedt cfriedt modified the milestones: v4.2.0, v4.2.1, future Jul 15, 2025
@cfriedt
Copy link
Member Author

cfriedt commented Jul 15, 2025

2025-07-15: release meeting

  • relatively new api, not stable, does not need to follow the deprecation process
  • can be merged after the release
  • backport to v4.2.1

@cfriedt cfriedt added the backport-v4.2-branch Request backport to the v4.2-branch label Jul 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Base OS Base OS Library (lib/os) area: Utilities backport-v4.2-branch Request backport to the v4.2-branch Coverity A Coverity detected issue or its fix
Projects
None yet
5 participants