-
Notifications
You must be signed in to change notification settings - Fork 7.7k
Video: Rework buffer management #92370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Video: Rework buffer management #92370
Conversation
Interesting! Instead of having to carry pointers around, one only says "I need 10 buffers", then to enqueue them, " [EDIT: answering my own questions after discussing on #92366] Q1: Is it important that the application knows exactly which buffer index was given to a device? If not, then enqueue could be A1: Not all buffers will have the same size, so cannot all be mixed between devices. Q2: It seems it is not possible anymore to manually craft a video buffer. For instance, if video data arrives from the network and then goes through PXP to adapt it to a display, how to submit the buffer coming directly from the network into a A2: It is possible to add a Q3: Would it make sense to start making use of A3: This does not seem related to what this PR does: buffer management and accounting vs queueing them to the devices. Possibly solved by Q4: In my impression, an application developer does not want to know how many bytes are allocated to video in total, but instead decide on a resolution and let all allocation be automated. The current approach is to open A4: This can come as an overlay on top of this eventually, these functions allow this to be implemented on top. A different topic to solve in parallel. |
There seems to be several topics involved: T1: Memory management: how much in total, where from, which back-end for it It seems like this current PR aims to address something living between T1 and T2. |
@josuah Hope you don't mind. I implemented the needed stuffs to generalize your RTIO driver demo and integtated it in this buffer management rework. Effectively, RTIO serves as the backend that replaces the FIFO queues (and some more) but it does not overlap or conflict with some changes in this PR which mainly reworks the buffer allocation and interfacing with application. The main idea in the RTIO generalization is It doesn't work yet :). I have error at stream_start(). I will try to debug later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for picking it up so early!
I will try with an i.MXRT1160 I have around.
Essential to get it working is probably just to call __frame_done_cb()
whenever a frame is submitted on io_q()
, which might require api->iodev_submit
...
drivers/video/video_common.c
Outdated
int video_dequeue(const struct device *dev, struct video_buffer **buf, | ||
k_timeout_t timeout) | ||
{ | ||
struct rtio_cqe *cqe = rtio_cqe_consume_block(&rtio); | ||
*buf = cqe->userdata; | ||
|
||
if (cqe->result < 0) { | ||
LOG_ERR("I/O operation failed"); | ||
return cqe->result; | ||
} | ||
|
||
return 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dev
and timeout
are unused so can go away.
rtio_cqe_consume_block()
acts as a combined k_poll()
and video_dequeue()
.
Maybe we need a pointer to struct video_interface
in struct video_buffer
to keep track of things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Buffers are kept track of by index and type ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right! ->index
type is fully user-defined, which allows user to organize a reference table in the application to store extra metadata about buffers.
I think this solves this point, feel free to fold this.
drivers/video/video_common.c
Outdated
if (sqe->op != RTIO_OP_RX) { | ||
LOG_ERR("Invalid operation %d of length %u for submission %p", sqe->op, | ||
sqe->rx.buf_len, (void *)iodev_sqe); | ||
rtio_iodev_sqe_err(iodev_sqe, -EINVAL); | ||
return NULL; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fragment was there for the driver to validate that the operation requested is supported.
It is the equivalent to if (vbuf->type != VIDEO_BUF_TYPE_INPUT) {
.
Should this go into video_mcux_csi.c
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understand is this is to get the buffer that is filled from the sqe, to do some check (buf type check or buf_addr check as in csi driver). The code is common for all driver so it could be refactored as a function ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense as a helper agreed!
Though this does not allow the caller to report completion/failure, so maybe only the 1st buffer would work.
So maybe something like this?
struct rtio_iodev_sqe *video_pop_io_q(struct mpsc *io_q)
{
struct mpsc_node *node;
struct rtio_iodev_sqe *iodev_sqe;
struct video_buffer *vbuf;
node = mpsc_pop(io_q);
if (node == NULL) {
return NULL;
}
iodev_sqe = CONTAINER_OF(node, struct rtio_iodev_sqe, q);
vbuf = iodev_sqe->sqe.userdata;
__ASSERT_NO_MSG(vbuf != NULL);
if ((vbuf->type == VIDEO_BUF_TYPE_OUTPUT && iodev_sqe->sqe.op == RTIO_OP_RX) ||
(vbuf->type == VIDEO_BUF_TYPE_INPUT && iodev_sqe->sqe.op == RTIO_OP_TX)) {
return iodev_sqe;
} else {
LOG_ERR("Unsupported RTIO operation (%d) or video buffer type (%d)",
iodev_sqe->sqe.op, vbuf->type);
rtio_iodev_sqe_err(iodev_sqe, -EINVAL);
return NULL;
}
}
It is only 1 line to get the vbuf out of it:
iodev_sqe = video_pop_io_q(&data->io_q);
if (iodev_sqe == NULL) {
return;
}
vbuf = iodev_sqe->sqe.userdata;
/* Submit `vbuf` contoent to the hardware */
rtio_iodev_sqe_ok(iodev_sqe, 0);
drivers/video/video_common.c
Outdated
rtio_sqe_prep_read(sqe, ri, RTIO_PRIO_NORM, video_buf[index].buffer, | ||
video_buf[index].size, &video_buf[index]); | ||
|
||
sqe->flags |= RTIO_SQE_MULTISHOT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe there need to be a video_enqueue_read()
and video_enqueue_write()
or use switch (vbuf->type){}
. Currently it is only doing read requests.
RTIO_SQE_MULTISHOT
might not be possible with write requests as the user has to specify the buffer to use every time, or the same buffer would be sent in loop.
c805080
to
d9db3cb
Compare
I think I see the problem. Unlike the SW generator, the HW device like CSI needs to enqueue buffer to the HW, I blindly remove the enqueue() API. |
d9db3cb
to
f4c7b63
Compare
I will follow-up on this conversation: #92370 (comment) It looks like we are close! |
f4c7b63
to
0cb2ab8
Compare
It already know and can track which buffer is enqueued.
It was in TODO list and done now by commit |
include/zephyr/drivers/video.h
Outdated
/** buffer memory is on the video heap */ | ||
VIDEO_MEMORY_VIDEO = 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about VIDEO_MEMORY_POOL
to avoid a repetition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
POOL is a set of buffers of same size and attribute, seems not describe well the situation. Are VIDEO_MEMORY_INTERNAL
and VIDEO_MEMORY_EXTERNAL
better ?
drivers/video/video_buffer.c
Outdated
void video_release_buf() | ||
{ | ||
/* Buffer will be re-queued thanks to RTIO_SQE_MULTISHOT */ | ||
rtio_cqe_release(&rtio, rtio_cqe_consume_block(&rtio)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is video_buffer_release()
and video_release_buf()
. To avoid confusion, how about video_buffer_done()
or other name?
Nice ! Thank you for the follow-up ! I 've just picked up your commit. It works for the sw-generator ! For the CSI, I just naively did:
(as seen in the commit) and see that it is still not working yet. Sorry, I didn't take a look in RTIO yet, just did it naively. |
About
and driver will implement .submit() instead of video_enqueue(). But I don't know if we could get the buffer from
|
Hopefully you do not have to for this PR!
It think you might need to remove uses of I will try and push it, there might not be much missing as it more or less implements the same API as |
Yes, it is |
Very good idea, I like it! This would look very similar to how RTIO works in other places of Zephyr, and help users understand it, and RTIO maintainers to understand how we use it. |
a09740a
to
4cd4e46
Compare
Sorry I ran out of time while trying to load anything on my I just noticed you pushed an update of the API as you described, I will propagate that to the other drivers. |
I removed driver's PS: Driver's implementation of |
That is great! This is the only one I cannot test easily... |
Sorry for not inform you before working on it to avoid wasting your time, but I 've just work on this about 45 minutes ago ... I will have to put this on hold for about the next 10 days due to some obligation. Thank you for the work ! |
No time was lost as I spent it getting familiar with the i.MXRT ecosystem and tooling host-side mostly.
There is time before the end of feature freeze, and plenty of other drivers to convert. What is left to do AFAIU:
Thank you for the work as well ! |
|
Currently, the video_enqueue() API enqueues the whole external video buffer container structure from the application and the driver stores this container in its FIFO queue. While it works in simple applications where the enqueued video_buffer container persists for the whole program lifecycle, it does not work in situations where we cannot keep this container, e.g. enqueuing a buffer inside a function, the local variable will be destroyed when the function returns and hence the buffer is no longer valid. Video buffers are maintained and tracked by the via their indices. Set the index field when buffers are allocated and enqueue the internal buffer rather than the external container will fix the issue. Signed-off-by: Phi Bang Nguyen <phibang.nguyen@nxp.com>
Add a video_buf_state field to track the current state of a buffer when it is flowing. This also helps to check whether a buffer can be safely enqueued or dequeued, data are valid or not, etc. Signed-off-by: Phi Bang Nguyen <phibang.nguyen@nxp.com>
The video subsystem holds an internal buffer pool and allocate buffers of this pool on its own heap. The application hence requests the subsystem to allocate buffers on behalf of it. Application does not need to get handles on the allocated buffers, just read back the first buffer index and the number of allocated buffer. Signed-off-by: Phi Bang Nguyen <phibang.nguyen@nxp.com>
Add support to allow user provide their own allocated buffers. Signed-off-by: Phi Bang Nguyen <phibang.nguyen@nxp.com>
The main device that interface with application on memory. Signed-off-by: Phi Bang Nguyen <phibang.nguyen@nxp.com>
- Application side nearly unchanged, except a call to video_release_buf() after finishing with the full buffer to queue it back to sqe. RTIO is opaque to application. - Driver just needs to hold an io_q and deal a bit with the buffer from sqe in isr. No need to implement rtio_iodev_api's submit() - The video subsystem does everything needed for buffer management. Signed-off-by: Phi Bang Nguyen <phibang.nguyen@nxp.com> Co-developed-by: Josuah Demangeon <me@josuah.net>
Move buffer management stuffs into a separate video_buffer.c file Signed-off-by: Phi Bang Nguyen <phibang.nguyen@nxp.com>
This contains only the mandatory modifications to get the drivers to work: - expose `struct rtio_iodev_sqe` to the drivers so that the functions `rtio_iodev_sqe_ok()` and `rtio_iodev_sqe_err()` can be called. - expose `stuct rtio_cqe` to the application so that the functions `rtio_cqe_release()` can be called with the same completion event (cqe) that was just dequeued. The implementation can be different, but these points seems to be mandatory to operate RTIO end-to-end.
… API Signed-off-by: Phi Bang Nguyen <phibang.nguyen@nxp.com>
Signed-off-by: Phi Bang Nguyen <phibang.nguyen@nxp.com>
Signed-off-by: Phi Bang Nguyen <phibang.nguyen@nxp.com>
4cd4e46
to
bfc86f6
Compare
Update:
TODO:
|
|
Thanks @ngphibang @josuah for all this work. I haven't yet look at all implementation details but more on the API for the time being. Looking at the API, and especially the request / queue mechanism I am wondering how use-case of dealing with several kind of buffers (type / size) are handled. I probably missed something, could you clarify this to me ? Another case that I can think of is when 2 IPs are in a sense pipelines, for example DCMI/DCMIPP + JPEG encoder. The output of the DCMI/DCMIPP (aka output buffer) will be also used as input to the JPEG encoder. That same buffer can be used it is up to the application to dequeue from one side and queue it to the other size. Does the current API allow to do that ? Aka, in a sense use a output type buffer (from DCMI/DCMIPP) into the JPEG input ? |
Hi @avolmat-st ,
Yes, the request() should return the first index of the batch as in above comment (TODO) and in the commit message. It's not implemented yet in the current code but will be done, sorry I am tight up at this moment.
To enqueue the "empty" buffer (freshly requested / allocated), we have the buffer index so we can enqueue it, video_enqueue() can be rewritten which takes index and type (input / output) To enqueue a "full" buffer dequeued from previous device to another device, since after dequeue, we have a full handle of the video_buffer struct so it's not a problem, we can call the same above video_enqueue(index, type). Another way of designing the video_enqueue() API is PS: I forgot that to support user provided buffers, we need to give the buffer address to the framework, so the video_enqueue() need to take the following parameters:
So, finally, it seems |
This PR reworks the buffer management in the video subsystem step by step to have a common buffer management framework.
It fixes the current issues of video buffer management and may help to facilitate #89858.
Still a work-in-progress. I will try to describe the change in more details and complete it gradually.
TODO: