[Rust] Prototype in-browser serving for small LLMs

* Understand performance characteristics, e.g., latency vs. model size
* Explore potential speedups: GPUs, other hardware acceleration, parallelization
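
As a starting point for the latency vs. model size measurement, here is a minimal sketch of a timing harness. It makes several assumptions that are not part of the project: a dense matrix-vector product stands in for real model inference, `matvec` and `median_latency` are hypothetical helper names, and the matrix dimensions are arbitrary. It runs natively. An in-browser build would likely need a different clock, since `std::time::Instant` is generally not usable on the `wasm32-unknown-unknown` target.

```rust
use std::time::{Duration, Instant};

/// Dense matrix-vector product, used here only as a stand-in workload.
/// `weights` is row-major with `cols` columns per row.
fn matvec(weights: &[f32], input: &[f32], cols: usize) -> Vec<f32> {
    weights
        .chunks_exact(cols)
        .map(|row| row.iter().zip(input).map(|(w, x)| w * x).sum::<f32>())
        .collect()
}

/// Median wall-clock time over `runs` calls to `f` (hypothetical helper).
fn median_latency<F: FnMut()>(mut f: F, runs: usize) -> Duration {
    assert!(runs > 0, "need at least one run");
    let mut samples: Vec<Duration> = (0..runs)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed()
        })
        .collect();
    samples.sort();
    samples[samples.len() / 2]
}

fn main() {
    // Assumed sizes; the parameter count grows as dim * dim.
    for dim in [128usize, 256, 512, 1024] {
        let weights = vec![0.5f32; dim * dim];
        let input = vec![1.0f32; dim];
        let latency = median_latency(
            || {
                // black_box keeps the compiler from optimizing the work away.
                std::hint::black_box(matvec(&weights, &input, dim));
            },
            20,
        );
        println!("params={:>8} median_latency={:?}", dim * dim, latency);
    }
}
```

Swapping the closure for a real forward pass, or for a GPU-backed or multi-threaded variant, would let the same harness compare the speedup options listed above.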

[Doc link](https://docs.google.com/document/d/1FcZYLK4ylSAogvnZNc0PMX_2PS-eyf4IP3qCjGuGBxI/edit?tab=t.0#bookmark=id.6qfx36vpf8q2)