docs/features/spec_decode.md
6 additions & 6 deletions
@@ -256,12 +256,12 @@ speculative decoding, breaking down the guarantees into three key areas:
 2. **Algorithmic Losslessness**
 \- vLLM’s implementation of speculative decoding is algorithmically validated to be lossless. Key validation tests include:
 
-> - **Rejection Sampler Convergence**: Ensures that samples from vLLM’s rejection sampler align with the target
-> distribution. [View Test Code](https://github.com/vllm-project/vllm/blob/47b65a550866c7ffbd076ecb74106714838ce7da/tests/samplers/test_rejection_sampler.py#L252)
-> - **Greedy Sampling Equality**: Confirms that greedy sampling with speculative decoding matches greedy sampling
-> without it. This verifies that vLLM's speculative decoding framework, when integrated with the vLLM forward pass and the vLLM rejection sampler,
-> provides a lossless guarantee. Almost all of the tests in <gh-dir:tests/spec_decode/e2e>.
-> verify this property using [this assertion implementation](https://github.com/vllm-project/vllm/blob/b67ae00cdbbe1a58ffc8ff170f0c8d79044a684a/tests/spec_decode/e2e/conftest.py#L291)
+> - **Rejection Sampler Convergence**: Ensures that samples from vLLM’s rejection sampler align with the target
+> distribution. [View Test Code](https://github.com/vllm-project/vllm/blob/47b65a550866c7ffbd076ecb74106714838ce7da/tests/samplers/test_rejection_sampler.py#L252)
+> - **Greedy Sampling Equality**: Confirms that greedy sampling with speculative decoding matches greedy sampling
+> without it. This verifies that vLLM's speculative decoding framework, when integrated with the vLLM forward pass and the vLLM rejection sampler,
+> provides a lossless guarantee. Almost all of the tests in <gh-dir:tests/spec_decode/e2e>.
+> verify this property using [this assertion implementation](https://github.com/vllm-project/vllm/blob/b67ae00cdbbe1a58ffc8ff170f0c8d79044a684a/tests/spec_decode/e2e/conftest.py#L291)
 
 3. **vLLM Logprob Stability**
 \- vLLM does not currently guarantee stable token log probabilities (logprobs). This can result in different outputs for the
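
Note on the two validation properties named in this hunk: they can be illustrated with a toy sketch. The Python below is not vLLM's rejection sampler or its test code. It is a minimal, standard-library illustration of the textbook speculative rejection rule, assuming a three-token vocabulary and fixed distributions. The function names `speculative_sample` and `empirical_distribution` are hypothetical and chosen only for this example.

```python
import random


def speculative_sample(p_target, q_draft, rng):
    """Draw one token using the textbook speculative rejection rule.

    p_target and q_draft are probability lists over the same toy vocabulary.
    This is an illustrative sketch, not vLLM's implementation.
    """
    vocab = range(len(p_target))
    # The draft distribution proposes a token.
    x = rng.choices(vocab, weights=q_draft)[0]
    # Accept the proposal with probability min(1, p(x) / q(x)).
    if rng.random() < min(1.0, p_target[x] / q_draft[x]):
        return x
    # On rejection, resample from the residual max(0, p - q);
    # rng.choices normalizes the weights itself.
    residual = [max(0.0, p - q) for p, q in zip(p_target, q_draft)]
    return rng.choices(vocab, weights=residual)[0]


def empirical_distribution(p_target, q_draft, n_samples=100_000, seed=0):
    """Estimate the output distribution of speculative_sample by counting."""
    rng = random.Random(seed)
    counts = [0] * len(p_target)
    for _ in range(n_samples):
        counts[speculative_sample(p_target, q_draft, rng)] += 1
    return [c / n_samples for c in counts]


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]  # assumed target distribution
    q = [0.2, 0.2, 0.6]  # assumed draft distribution
    # Convergence: the estimate should be close to p, not q.
    print(empirical_distribution(p, q))

    # Greedy case: a one-hot target always yields its argmax token.
    greedy_p = [0.0, 1.0, 0.0]
    rng = random.Random(1)
    print({speculative_sample(greedy_p, q, rng) for _ in range(1000)})
```

In the greedy case the target is one-hot, so any proposal other than the argmax is rejected and the residual places all of its weight on the argmax. That is why, in this sketch, speculative sampling returns the same token as plain greedy sampling regardless of the draft distribution.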