**Description**
Add UI and backend features to support speculative decoding. Reference: https://github.com/ggml-org/llama.cpp/pull/10455

**Use Case**
Faster inference for slower (larger) models: a small draft model proposes tokens and the larger model verifies them. https://arxiv.org/abs/2211.17192
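
For anyone unfamiliar with the technique, here is a rough sketch of the greedy variant of speculative decoding. It is a toy illustration, not the llama.cpp implementation: the "models" are plain Python functions, and the names `speculative_decode`, `greedy_decode`, and `n_draft` are made up for this example. The linked paper also covers sampling with a rejection step, which is omitted here.

```python
from typing import Callable, List

# Hypothetical stand-in for a model: maps a token sequence to the next
# token under greedy decoding.
Model = Callable[[List[int]], int]


def greedy_decode(target: Model, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Baseline: one target-model call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(target(tokens))
    return tokens


def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       max_new_tokens: int, n_draft: int = 4) -> List[int]:
    """Greedy speculative decoding: draft proposes, target verifies."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. The cheap draft model proposes up to n_draft tokens.
        proposal: List[int] = []
        for _ in range(min(n_draft, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))

        # 2. The target model checks each proposed position. A real
        #    implementation scores all positions in one batched forward
        #    pass, which is where the speedup comes from; this toy calls
        #    the target once per position for clarity.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(tokens + accepted)
            accepted.append(expected)  # always keep the target's token
            if expected != tok:
                break  # first mismatch: discard the remaining drafts
        tokens.extend(accepted)
    return tokens


# Toy models: the target counts upward mod 10; the draft agrees except
# after a token divisible by 3, where it guesses wrong.
def toy_target(seq: List[int]) -> int:
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[int]) -> int:
    return 0 if seq[-1] % 3 == 0 else (seq[-1] + 1) % 10


if __name__ == "__main__":
    print(speculative_decode(toy_target, toy_draft, [1], 8))
    print(greedy_decode(toy_target, [1], 8))
```

Both calls produce the same sequence, because every kept token is the one the target model would have chosen. The benefit in a real backend depends on how often the draft model agrees with the target, so the UI would probably need a way to pick a draft model and a draft length.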