The OpenCL implementation is faster in processing neighbouring pixels (with a window of 31x19) #7907
Unanswered
immortalsalomon asked this question in Q&A
Replies: 0 comments
Hello, everyone.
The general concept of the program is simple. For each pixel, I want to find the maximum and minimum values among its neighbouring pixels, using a window of 31x19 pixels. Before considering a pixel in the window, I also want to check that it is valid. Let us say that a pixel is valid if its value lies within a certain range.
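To make the description above concrete, here is a minimal plain C++ reference sketch of the computation. It is not the Halide or OpenCL code from this post. The names (`MinMax`, `window_min_max`, `lo`, `hi`), the `int` pixel type, the inclusive validity range, and the clamp-to-edge border handling are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Hypothetical result container: one min plane and one max plane.
struct MinMax {
    std::vector<int> min_img;
    std::vector<int> max_img;
};

// Brute-force windowed min/max over valid pixels only.
// Assumptions: a pixel is valid if lo <= value <= hi; coordinates outside
// the image are clamped to the nearest edge; if a window contains no valid
// pixel, the outputs keep the sentinels INT_MAX (min) and INT_MIN (max).
MinMax window_min_max(const std::vector<int>& img, int width, int height,
                      int lo, int hi, int win_w = 31, int win_h = 19) {
    const int rx = win_w / 2;  // 15 for a 31-wide window
    const int ry = win_h / 2;  // 9 for a 19-high window
    MinMax out{std::vector<int>(img.size()), std::vector<int>(img.size())};

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int mn = INT_MAX;
            int mx = INT_MIN;
            // The double loop over the neighbourhood.
            for (int dy = -ry; dy <= ry; ++dy) {
                for (int dx = -rx; dx <= rx; ++dx) {
                    const int yy = std::min(std::max(y + dy, 0), height - 1);
                    const int xx = std::min(std::max(x + dx, 0), width - 1);
                    const int v = img[yy * width + xx];
                    if (v >= lo && v <= hi) {  // validity check
                        mn = std::min(mn, v);
                        mx = std::max(mx, v);
                    }
                }
            }
            out.min_img[y * width + x] = mn;
            out.max_img[y * width + x] = mx;
        }
    }
    return out;
}
```

With a 31x19 window this reads 589 pixels per output pixel, which is the cost any optimised version would be compared against.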
To simplify things, I am leaving out the search for the maximum and minimum pixel for now.
I don't understand how the implementation in Halide takes around 70 ms (without the min/max search) while the one in OpenCL takes only 3-4 ms (with the min/max search).
Halide Algorithm:
Halide Scheduling:
OpenCL implementation:
I am sure that what slows things down is the validity check for each point and the double for loop that processes all neighbouring pixels. Could there be parameters to optimise the implementation for OpenCL?
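One general way to reduce the cost of the double loop is to exploit the fact that min and max over a rectangular window are separable: a horizontal pass over 31 pixels followed by a vertical pass over 19 rows gives the same result with 31 + 19 = 50 reads per pixel instead of 31 * 19 = 589. The validity check only needs to happen in the first pass, because invalid pixels are replaced by sentinels that are neutral for min and max. This is only a sketch of that idea in plain C++ under the same assumptions as before (hypothetical names, `int` pixels, inclusive range, clamp-to-edge borders); it is not taken from the post or from #7302, and whether it closes the gap seen here has not been measured.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Hypothetical result container: one min plane and one max plane.
struct MinMaxPlanes {
    std::vector<int> min_img;
    std::vector<int> max_img;
};

// Separable windowed min/max over valid pixels (lo <= value <= hi).
// Windows with no valid pixel yield INT_MAX (min) and INT_MIN (max).
MinMaxPlanes window_min_max_separable(const std::vector<int>& img,
                                      int width, int height, int lo, int hi,
                                      int win_w = 31, int win_h = 19) {
    const int rx = win_w / 2;
    const int ry = win_h / 2;
    std::vector<int> row_min(img.size());
    std::vector<int> row_max(img.size());

    // Pass 1: horizontal reduction with the validity check.
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int mn = INT_MAX;
            int mx = INT_MIN;
            for (int dx = -rx; dx <= rx; ++dx) {
                const int xx = std::min(std::max(x + dx, 0), width - 1);
                const int v = img[y * width + xx];
                if (v >= lo && v <= hi) {
                    mn = std::min(mn, v);
                    mx = std::max(mx, v);
                }
            }
            row_min[y * width + x] = mn;
            row_max[y * width + x] = mx;
        }
    }

    // Pass 2: vertical reduction over the per-row results.
    // Sentinels from pass 1 are neutral here, so no second validity check.
    MinMaxPlanes out{std::vector<int>(img.size()), std::vector<int>(img.size())};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int mn = INT_MAX;
            int mx = INT_MIN;
            for (int dy = -ry; dy <= ry; ++dy) {
                const int yy = std::min(std::max(y + dy, 0), height - 1);
                mn = std::min(mn, row_min[yy * width + x]);
                mx = std::max(mx, row_max[yy * width + x]);
            }
            out.min_img[y * width + x] = mn;
            out.max_img[y * width + x] = mx;
        }
    }
    return out;
}
```

In Halide, the same split would typically be written as two stages, each reducing over a one-dimensional `RDom`, with the schedule deciding where the intermediate stage is computed.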
I found this issue related to the median filter, #7302. Do you think the answer given there is also feasible in this situation? If so, could you give me some tips?
Thank you in advance for your time :).