Commit 4f007e8

[0.9.1] DBO support EP parallel and optimize dual stream overlap (#1589)
### What this PR does / why we need it?
1. The DBO model now supports EP parallelism.
2. Optimize dual-stream overlap.

Measured with max tokens: 32784, input_len: 1024, bs 32, dp2tp8ep16.

Before enabling DBO, TTFT: 4017 ms
![before](https://github.com/user-attachments/assets/8f9e338d-978f-42cf-9add-825a8dd3418f)

After enabling DBO, TTFT: 3017 ms
![after](https://github.com/user-attachments/assets/79f706fa-22c8-4c71-b5e3-ae3f53dac23b)

### Does this PR introduce _any_ user-facing change?
None

### How was this patch tested?

---------

Signed-off-by: shikang-hangzhou <459956190@qq.com>
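For readers who want the relative gain rather than the raw numbers, the TTFT figures reported in the description (4017 ms without DBO, 3017 ms with DBO) can be turned into a percentage reduction and a speedup factor. This is only a minimal arithmetic sketch; the function and variable names are illustrative and do not come from the PR.

```python
def ttft_reduction(before_ms: float, after_ms: float) -> tuple[float, float]:
    """Return (percent reduction, speedup factor) for two TTFT measurements."""
    if before_ms <= 0 or after_ms <= 0:
        raise ValueError("TTFT values must be positive")
    # Share of the original latency that was removed, as a percentage.
    reduction_pct = (before_ms - after_ms) / before_ms * 100.0
    # How many times faster the first token arrives.
    speedup = before_ms / after_ms
    return reduction_pct, speedup


# TTFT values quoted in the PR description (milliseconds).
reduction_pct, speedup = ttft_reduction(4017, 3017)
print(f"TTFT reduction: {reduction_pct:.1f}%  speedup: {speedup:.2f}x")
```

Both numbers come from a single reported configuration (dp2tp8ep16, bs 32, input_len 1024), so the result describes that one measurement and not a general bound.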
1 parent 10df64c commit 4f007e8

3 files changed: +207 -208 lines changed
