Releases: AD2605/llama.cpp
b5432
sycl: disable reorder for sycl mulmat (#13536)
b5423
mtmd : add vision support for llama 4 (#13282)
* wip llama 4 conversion
* rm redundant __init__
* fix conversion
* fix conversion
* test impl
* try this
* reshape patch_embeddings_0
* fix view
* rm ffn_post_norm
* cgraph ok
* f32 for pos embd
* add image marker tokens
* Llama4UnfoldConvolution
* correct pixel shuffle
* fix merge conflicts
* correct
* add debug_graph
* logits matched, but it still perceives the image incorrectly
* fix style
* add image_grid_pinpoints
* handle llama 4 preprocessing
* rm load_image_size
* rm unused line
* fix
* small fix 2
* add test & docs
* fix llava-1.6 test
* test: add notion of huge models
* add comment
* add warn about degraded quality
b5416
CANN: Support MOE Model MUL_MAT_ID (#13042)
Signed-off-by: noemotiovon <757486878@qq.com>
b5392
server : proper error handling for missing elements in messages array…
b5359
clip : cap max image size 1024 for qwen vl model (#13478)
b5329
metal : optimize MoE for large batches (#13388) ggml-ci
b5316
server : (webui) fix a very small misalignment (#13387)
* server : (webui) fix a very small misalignment
* restore font-bold
b5307
docker : disable arm64 and intel images (#13356)
b5303
llama : deci : support ffn-free with attention (#13296)
b5283
clip : fix confused naming ffn_up and ffn_down (#13290)
* clip : fix confused naming ffn_up and ffn_down
* rm ffn_i/o/g naming
* rename n_embd, n_ff
* small fix
* no check n_ff