How to measure a Halide::Func #5710
Unanswered
TongMoNumb
asked this question in
Q&A
Replies: 1 comment 1 reply
One thing you can do to get some understanding at the per-Func level is to use Halide's built-in profiler, which is enabled by adding `-profile` to the target. If your code is in a generator, in CMake you would do something like the following:

add_halide_library(<some_function> FROM <some_generator>
                   TARGETS ${Halide_CMAKE_TARGET}-profile  # <- `-profile` here
                   AUTOSCHEDULER ${SCHEDULER_TYPE}
                   SCHEDULE OUTVAR
                   PARAMS auto_schedule=true)

With that enabled, running the pipeline prints a report like this:

<pipeline>
 total time: 1071.243164 ms  samples: 995  runs: 10  time/run: 107.124313 ms
 average threads used: 7.522613
 heap allocations: 2880  peak heap usage: 1357504 bytes
  <func1>: 1.057ms (33%) threads: 6.400
  <func2>: 1.057ms (33%) threads: 6.400
  <func3>: 1.057ms (33%) threads: 6.400
Hi All,
I have been experimenting with the Adams2019 autoscheduler code recently. During the beam search I can extract the Halide::Funcs and stages of the pipeline, and I would like to know how to measure the pipeline on hardware in this situation.
Take the matmul case as an example. Based on the tutorials, to measure a pipeline I should follow steps like these:
(1) define var
(2) compile_jit(target)
(3) realize it.
But now I do not know the exact input size. All I have are the Funcs I obtained:
(1) Func output
produce output:
  for y.y in [0, 127]:
    unrolled y.yi in [0, 11]:
      for x.x in [0, 1]:
        for x.xi.xi in [0, 47]:
          produce matrix_mul:
            vectorized x.xi in [0, 15]:
              matrix_mul(...) = ...
            vectorized x.xi in [0, 15]:
              for r8 in [0, 1535]:
                matrix_mul(...) = ...
          consume matrix_mul:
            vectorized x.xi.xii in [0, 15]:
              output(...) = ...
(2) Func Matmul
produce matrix_mul:
  unrolled y:
    for x.x:
      vectorized x.xi in [0, 15]:
        matrix_mul(...) = ...
  unrolled y:
    for x.x:
      vectorized x.xi in [0, 15]:
        for r8 in [0, 1535]:
          matrix_mul(...) = ...
(3) Func Input
produce input_b_im:
  for _1:
    for _0:
      input_b_im(...) = ...
produce input_a_im:
  for _1:
    for _0:
      input_a_im(...) = ...
Those are the four Halide::Funcs I can obtain (output, matrix_mul, input_b_im and input_a_im). Based on these, how can I combine them into a multi-stage pipeline, and how can I determine the input size from the Funcs?
Thanks in advance!