
Commit 314ce59

hipudding authored and ggerganov committed
CANN: Add support for async operator submission (llama/12864)
Submit operators from asynchronous threads to improve performance. The environment variable GGML_CANN_ASYNC_MODE controls whether asynchronous submission is enabled; it is disabled by default. Testing shows a 10%–20% performance improvement for models with small parameter sizes, especially quantized models.
1 parent cb7642b commit 314ce59
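
The commit message describes two ideas: an environment-variable switch and operator submission from a separate thread. The sketch below illustrates that general pattern only; it is not the code from this commit. `async_mode_enabled` and `OpSubmitter` are hypothetical names, and treating the value `1` as "enabled" is an assumption, since the accepted values are not stated in the message.

```cpp
#include <condition_variable>
#include <cstdlib>
#include <cstring>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical helper: treats GGML_CANN_ASYNC_MODE=1 as "enabled".
// Unset (the default) means disabled. The real backend may accept other values.
inline bool async_mode_enabled() {
    const char* v = std::getenv("GGML_CANN_ASYNC_MODE");
    return v != nullptr && std::strcmp(v, "1") == 0;
}

// Hypothetical submitter: runs operators inline, or queues them for one worker thread.
class OpSubmitter {
public:
    explicit OpSubmitter(bool async) : async_(async) {
        if (async_) worker_ = std::thread([this] { run(); });
    }
    ~OpSubmitter() {
        if (!async_) return;
        {
            std::lock_guard<std::mutex> lock(mu_);
            stop_ = true;
        }
        cv_.notify_all();
        worker_.join();  // the worker drains the queue before exiting
    }
    OpSubmitter(const OpSubmitter&) = delete;
    OpSubmitter& operator=(const OpSubmitter&) = delete;

    // Submit one operator launch. In async mode this returns immediately.
    void submit(std::function<void()> op) {
        if (!async_) { op(); return; }
        {
            std::lock_guard<std::mutex> lock(mu_);
            tasks_.push(std::move(op));
        }
        cv_.notify_all();
    }

    // Block until every submitted operator has finished executing.
    void wait_all() {
        if (!async_) return;
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return tasks_.empty() && !busy_; });
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mu_);
        for (;;) {
            cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
            if (tasks_.empty()) return;  // stop requested and nothing left to run
            std::function<void()> op = std::move(tasks_.front());
            tasks_.pop();
            busy_ = true;
            lock.unlock();
            op();              // execute outside the lock so submit() is not blocked
            lock.lock();
            busy_ = false;
            cv_.notify_all();  // wake any thread blocked in wait_all()
        }
    }

    bool async_;
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stop_ = false;
    bool busy_ = false;
    std::thread worker_;
};
```

A caller would construct `OpSubmitter submitter(async_mode_enabled());`, call `submit` once per operator, and call `wait_all` before reading results. With a single worker thread, operators still execute in submission order, so the gain in a design like this comes from overlapping the caller's preparation work with operator launches rather than from reordering.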

4 files changed: +604 -356 lines changed
