Pull requests: flashinfer-ai/flashinfer
refactor: Improved metainfo for trtllm-gen kernels
#1328
opened Jul 25, 2025 by cyx-6
5 tasks
feat: Support logits_soft_cap for Persistent attn; fix kv split limit
#1324
opened Jul 25, 2025 by Edenzzzz
5 tasks
Add k_scale and v_scale to persistent attention
#1322
opened Jul 24, 2025 by Edenzzzz
5 tasks
Add blockwise-scaled FP8 GEMM via TRTLLM-Gen.
#1320
opened Jul 24, 2025 by sergachev
5 tasks done
feat: support output nvfp4 in trtllm-gen function call.
#1318
opened Jul 24, 2025 by weireweire
2 of 5 tasks
Support loading autotuned results from json for cutlass fp4 moe backends
#1310
opened Jul 23, 2025 by kaixih
Api regression test for trtllmgen fp8 moe
#1308
opened Jul 23, 2025 by aleozlx
5 tasks done
fix: a workaround to make fp8 kv-cache work for prefill
#1304
opened Jul 22, 2025 by chenyang78
2 tasks
add mm_fp4 use cutlass backend for large bs
#1296
opened Jul 21, 2025 by ttyio
5 tasks done
Add native cudnn_decode for improved cudnn decode performance
#1283
opened Jul 18, 2025 by Anerudhan
5 tasks done
use NVSHMEM4Py instead of custom bindings for NVSHMEM MNNVL Allreduce
#1263
opened Jul 15, 2025 by Amir-19
5 tasks
feat(aot): add nvshmem module for aot compilation
#1261
opened Jul 15, 2025 by EmilienM
3 of 5 tasks
refactor: separate SM100 and legacy TRT-LLM comm modules
#1259
opened Jul 15, 2025 by EmilienM
3 of 5 tasks
bugfix: fix fp32 acc threshold for qk using math::inf according to dtype by AIDC-AI
#1247
opened Jul 14, 2025 by yongchaoding
5 tasks done