New Feature: Constant Bandwidth Server (CBS) #78000

@alexpaschoaletto

Description

It is with great pleasure that I come before you on behalf of my research laboratory (SoftCPS - ISEP) to submit our first contribution to this amazing open-source RTOS: the Constant Bandwidth Server, or CBS.

Introduction

A CBS is an extension of the Earliest Deadline First (EDF) scheduler that allows tasks to execute virtually isolated from each other, so that if a task runs for longer than expected it doesn't interfere with the execution of the others. In other words, the CBS prevents a misbehaving task from causing other tasks to miss their deadlines.

Problem description

The regular EDF scheduler selects which task should be executing based on whichever has the earliest absolute deadline. It has the advantage of allowing higher CPU utilization (theoretically up to 100%) for any number of tasks in the system, but it comes with a downside: it has no protocol for when a task misses a deadline, and it doesn't account for tasks taking longer to execute than expected. The problem is that in hard real-time systems (such as an airplane), missing a deadline means catastrophic failure, whereas in soft real-time systems it means annoyance to the end user. Most complex systems nowadays have both types of tasks, making them hybrid systems.

The Solution

The CBS solves this problem in EDF by trading the variable (hard to predict) execution time of the task for a time budget that represents how long the task is allowed to execute, somewhat similar to the time slice in the round-robin scheduler.
Unlike round-robin, though, when the budget runs out the kernel automatically downgrades the thread priority by postponing its deadline. So if, even with the lowered priority, the task in question remains the one with the earliest deadline, it keeps executing as if nothing had happened.
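
As a rough sketch of that rule (names and types here are illustrative, not the actual implementation from the fork):

#include <stdint.h>

/* Illustrative server state, loosely mirroring the cbs_t shown later
 * in this RFC (names are hypothetical). */
struct cbs {
    uint64_t budget;        /* remaining execution budget */
    uint64_t max_budget;    /* configured budget Q        */
    uint64_t period;        /* configured period T        */
    uint64_t abs_deadline;  /* current absolute deadline  */
};

/* Budget-runout rule: refill the budget and postpone the deadline by
 * one period. The lowered EDF priority is the only penalty; if the
 * server still has the earliest deadline, its job simply resumes. */
static void on_budget_runout(struct cbs *server)
{
    server->budget = server->max_budget;
    server->abs_deadline += server->period;
}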

How it works

Assume we have two jobs, of tasks A and B, with execution times of 2 and 3 seconds respectively, that are pushed to a CBS with (budget, period) = (3, 7) seconds. Both jobs are pushed sequentially to the CBS, so job A will be served first and job B next. If the CBS is running alone in the system, this is what would happen:

[figure: example-1 - timeline of the CBS serving jobs A and B]

  • At instant t=1, both jobs are pushed. The CBS was idle, so the deadline is calculated as (t + period) = (1 + 7) = 8. As there is no other contending task in the system, the CBS starts executing job A immediately.
  • At instant t=3, job A finishes execution. The remaining budget is 1 second, since job A spent 2 seconds of it. Job B, next in line, starts here.
  • At instant t=4, the budget runs out. This triggers the kernel, which replenishes the budget at the cost of postponing the CBS deadline by 7 units (the CBS period). The new deadline is thus (8 + 7) = 15. Since there is no other contending task with an earlier deadline, the CBS resumes execution of job B.
  • At instant t=6, job B is completed. The remaining budget is 1 second and will stay like that until a new job is pushed.

Note that, originally, the CBS has a utilization factor (bandwidth) of (budget / period) = (3 / 7) ≈ 0.4286. When the budget runs out at t=4, the CBS is granted 3 more seconds of execution at the expense of a later deadline, so the ratio becomes (6 / 14) ≈ 0.4286. That means the CBS is always expected to use at most 42.86% of the CPU - hence the name Constant Bandwidth Server.
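
Stated generally: every replenishment adds one budget Q to the time granted and one period T to the deadline horizon, so the ratio never changes:

$$U = \frac{Q}{T} = \frac{3}{7} \approx 0.4286, \qquad \frac{nQ}{nT} = \frac{Q}{T} \ \text{after } n-1 \text{ replenishments}$$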

Now let's extend this example by imagining that two more jobs, C and D, with execution times of 1.3 and 1 seconds, are pushed to the CBS at instants t=8 and t=16, respectively. This is the execution outcome:

[figure: example_2 - timeline of the CBS serving jobs C and D]

  • At instant t=8, job C is pushed. The CBS was idle, so the kernel checks whether the available bandwidth of the server - that is, the budget left (1) divided by the time remaining until the deadline (7, since at this point the deadline is set at t=15) - is greater than or equal to the configured server bandwidth (0.4286). Since (1 / 7) ≈ 0.14, the current bandwidth is lower than configured. Thus, job C is served with the current budget and deadline.
  • At instant t=9, the budget runs out. The kernel then replenishes the budget and postpones the deadline by 7 seconds, leading to a new value of (15 + 7) = 22. As there are no other contending tasks, job C resumes execution.
  • At instant t=9.3, job C is completed. The remaining budget is 2.7 seconds.
  • At instant t=16, job D is pushed. The CBS was idle, so the kernel performs the same bandwidth check as above. This time the current budget is 2.7 seconds and the deadline is 22, so the available bandwidth is (2.7 / (22 - 16)) = (2.7 / 6) = 0.45. As that is higher than the configured 0.4286, serving job D with the current values could jeopardize the system schedulability (thus, it's better to intervene). The budget is replenished to 3, and the new deadline is set at (t + period) = (16 + 7) = 23.
  • At instant t=17, job D is completed. The remaining budget is 2 seconds and will stay like that until a new job is pushed.

The working principle is simple: the CBS executes jobs, which may arrive at random intervals, in FIFO order. Every time a new job arrives at an idle server, the kernel checks whether the available bandwidth is higher than configured; if so, it intervenes and recalculates the server deadline so as not to risk the system schedulability. Moreover, the deadline is postponed whenever the running job exhausts its budget. A sketch of that admission check is shown below.
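
A minimal sketch of that admission check, under the same illustrative struct as before (integer cross-multiplication stands in for the division; again, names are hypothetical and the fork is authoritative):

#include <stdint.h>

struct cbs {
    uint64_t budget;        /* remaining execution budget */
    uint64_t max_budget;    /* configured budget Q        */
    uint64_t period;        /* configured period T        */
    uint64_t abs_deadline;  /* current absolute deadline  */
};

/* Called when a job arrives at an idle server at time 'now':
 * if budget / (deadline - now) >= Q / T, keeping the current values
 * could jeopardize schedulability, so budget and deadline are reset;
 * otherwise the job is served with the leftover budget and deadline.
 * The comparison is cross-multiplied to avoid floating point. */
static void on_job_arrival_idle(struct cbs *s, uint64_t now)
{
    if (s->abs_deadline <= now ||
        s->budget * s->period >= s->max_budget * (s->abs_deadline - now)) {
        s->budget = s->max_budget;
        s->abs_deadline = now + s->period;
    }
}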

Use cases

You want to use the CBS when:

  1. You want an automated way of recalculating the task deadlines.
  2. You have a task set running under EDF and there needs to be a scheduling guarantee, that is, a guarantee that no task will ever starve because a higher priority one is executing for too long.
  3. You want to improve the response time of lower priority tasks without jeopardizing the guarantees of the higher priority tasks.
  4. You want to set task priorities by deciding their guaranteed CPU time (for example, task 1 needs 50%, task 2 needs 30% and task 3 needs 20%), like a round-robin on steroids. That is possible because the ratio of the CBS attributes (budget, period) is equivalent to the guaranteed bandwidth of the server. Note that in this case the sum of all tasks' utilizations must be less than or equal to 100%, as in the sketch after this list.
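
For instance, a hypothetical 50/30/20 split over a common 100 ms period could look like this (values and names are illustrative only):

#include <stdint.h>

/* Three servers sharing one 100 ms period: each guaranteed share is
 * budget / period, and 0.5 + 0.3 + 0.2 = 1.0, so the set is feasible. */
static const struct {
    uint32_t budget_ms;
    uint32_t period_ms;
} servers[] = {
    { 50, 100 },   /* task 1: 50% of the CPU */
    { 30, 100 },   /* task 2: 30% of the CPU */
    { 20, 100 },   /* task 3: 20% of the CPU */
};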

So, depending on what you need, the CBS can be seen as a safe way of serving tasks with no deadlines within an EDF task set (like a workqueue, but better), or as a fail-safe for existing EDF tasks that might overrun and thus endanger other tasks' deadlines.

Proposed change

We propose to bring this feature to the Zephyr kernel. As far as we know, Zephyr already has a feature - the workqueue - whose methods and goals are similar to those of the CBS. However, workqueues seem to support only static priorities, and there is no guarantee on their execution "boundaries" other than the configurable thread time slices and the yield behavior when a new item is submitted. Moreover, a work item is typically bound to a single handler function (i.e. a single job) at k_work_init time, and each k_work_submit call merely queues another execution of that same function. The CBS, on the other hand, has been designed to accept a different job and argument altogether at each new submission with k_cbs_push_job.

Thus, the CBS poses as the perfect complement to the workqueue rather than its substitute: it is meant to work alongside the EDF scheduler instead of static priorities, and to support a changing set of job functions instead of a single function defined from the start with only its arguments varying.
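
As a rough illustration of the contrast (the workqueue calls are today's Zephyr API; the CBS calls follow the API proposed below, assuming cbs_callback_t takes a single void * argument):

#include <zephyr/kernel.h>

/* workqueue today: one handler bound at definition, resubmitted as-is */
static void blink_handler(struct k_work *work)
{
    /* toggle an LED, etc. */
}
K_WORK_DEFINE(blink_work, blink_handler);

void workqueue_style(void)
{
    k_work_submit(&blink_work);  /* always runs blink_handler */
}

/* CBS as proposed: any job/argument pair per submission */
static void blink(void *arg) { /* ... */ }
static void beep(void *arg)  { /* ... */ }

static int beep_count = 3;

void cbs_style(cbs_t *my_cbs)
{
    k_cbs_push_job(my_cbs, blink, NULL, K_NO_WAIT);
    k_cbs_push_job(my_cbs, beep, &beep_count, K_NO_WAIT);
}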

Detailed RFC

The CBS has already been implemented and extensively tested on various targets (including physical and emulated ARM, Xtensa and RISC-V) with no problems detected so far, but I'm opening an RFC before submitting a pull request a) to follow the guidelines and b) to allow the community to inspect the work done so far and suggest changes. The changes made to Zephyr can be found here.

Proposed change (Detailed)

The work conducted on this feature is purely additive; it doesn't modify or remove any existing functionality, and is therefore unlikely to break any current features. The public API consists of just two elements: a K_CBS_DEFINE macro and a k_cbs_push_job function.

/*
  Statically defines and initializes a Constant Bandwidth Server.
  A CBS can then be referenced in the code by the name given to it:
  const struct cbs_t <name>;

  Note that the static priority has nothing to do with the CBS itself.
  It is necessary because the EDF policy in Zephyr is just a tie-breaker
  between two or more ready tasks with the same static priority. Therefore,
  a user should give the same static priority to all threads meant to run
  under EDF.
*/
#define K_CBS_DEFINE(name, budget, period, static_priority)

/*
  Pushes a job and a job argument to the CBS queue. The job is pushed to a
  dedicated message queue with the specified timeout, and the return value
  is the same as returned by the message queue.
*/
int k_cbs_push_job(cbs_t *cbs, cbs_callback_t job_function, void *job_arg, k_timeout_t timeout);
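
A minimal usage sketch, assuming the budget and period are given as timeout-style durations, that cbs_callback_t is void (*)(void *), and that the macro defines a cbs_t object referenced as &my_cbs (all assumptions; the exact types live in the fork):

#include <zephyr/kernel.h>
#include <zephyr/sys/printk.h>
#include <stdint.h>

/* hypothetical job function */
static void process_sample(void *arg)
{
    printk("processing sample %d\n", (int)(intptr_t)arg);
}

/* a server with 3 ms of budget every 7 ms (~42.86% bandwidth) */
K_CBS_DEFINE(my_cbs, K_MSEC(3), K_MSEC(7), 5);

void producer(void)
{
    /* returns like the underlying message queue, e.g. 0 on success */
    int ret = k_cbs_push_job(&my_cbs, process_sample,
                             (void *)(intptr_t)42, K_NO_WAIT);
    if (ret != 0) {
        printk("could not queue the job\n");
    }
}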

The proposed implementation also offers a logging feature that lets users keep track of CBS events. When enabled, the CBS uses the logging API to record key points of its execution - such as a job being pushed to the queue, the budget being replenished, or the CBS thread entering the CPU to serve the jobs given to it. Logs show up like this:

  [00:00:12.028,000] <inf> CBS: cbs_1     J_PUSH  43543     // first job is pushed
  [00:00:12.028,000] <inf> CBS: cbs_1     B_COND  100000    // condition met, budget replenished
  [00:00:12.028,000] <inf> CBS: cbs_1     J_PUSH  100000    // one more job pushed
  [00:00:12.028,000] <inf> CBS: cbs_1     SWT_TO  100000    // CBS thread enters the CPU to execute
  [00:00:12.031,000] <inf> CBS: cbs_1     J_COMP  68954     // first job completed
  [00:00:12.034,000] <inf> CBS: cbs_1     J_COMP  38914     // second job completed
  [00:00:12.034,000] <inf> CBS: cbs_1     SWT_AY  38914     // CBS thread leaves the CPU

If you're looking for more specific details, feel free to ask or to inspect the aforementioned fork with the proposed implementation.

Dependencies

The CBS implementation was developed with the goal of reusing as many existing kernel features as possible, to minimize doing things from scratch. Each CBS is therefore essentially composed of one thread, one message queue and one timer. The CBS structure holds references to those, plus a few extra pieces of metadata used internally:

typedef struct {
    struct k_timer *timer;      /* tracks budget consumption            */
    struct k_msgq *queue;       /* holds pushed jobs in FIFO order      */
    struct k_thread *thread;    /* serves the jobs                      */
    cbs_budget_t budget;        /* remaining execution budget           */
    cbs_cycle_t period;         /* server period                        */
    cbs_cycle_t abs_deadline;   /* current absolute deadline            */
    cbs_cycle_t start_cycle;    /* cycle count when execution started   */
    cbs_cycle_t bandwidth;      /* configured budget/period ratio       */
    bool is_initialized;
    bool is_idle;               /* no pending jobs                      */
} cbs_t;

I don't know what else I should include here, so if anything is missing please let me know. As I said before, nearly nothing of Zephyr's current implementation is removed or modified by the proposed code, so I don't think anything is likely to break or show unexpected behavior. Throughout my rounds of testing, everything has worked fine.

Overheads

The most frequent events expected in the CBS lifecycle - switch-in, switch-out and budget replenishment after a runout - were submitted to an overhead analysis that amassed over 50,000 samples for budget runout and 200,000 samples for context switching. The absolute worst-case overhead observed on a XIAO ESP32C3 (RISC-V, single-core, 160 MHz) was:

  • switched_in: 128 clock cycles;
  • switched_out: 139 clock cycles;
  • budget_run_out: 154 clock cycles;

However, considering the mean values and standard deviations, we have:

  • switched_in: 110.92 clock cycles (dev: 23.56)
  • switched_out: 90.94 clock cycles (dev: 8.05)
  • budget_run_out: 105.15 clock cycles (dev: 30.17)
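
For perspective, at 160 MHz one clock cycle is 6.25 ns, so even the absolute worst case stays under a microsecond:

$$\frac{154\ \text{cycles}}{160\ \text{MHz}} \approx 0.96\ \mu\text{s}, \qquad \frac{110.92\ \text{cycles}}{160\ \text{MHz}} \approx 0.69\ \mu\text{s}$$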

Metadata

Labels

RFC (Request For Comments: want input from the community) · area: Kernel
