Commit c4e3b12

[Docs] Add minimal demo of Ray Data API usage (#21080)
Signed-off-by: Ricardo Decal <rdecal@anyscale.com>
1 parent 8dfb45c commit c4e3b12

1 file changed: +26 -3 lines changed

docs/serving/offline_inference.md

Lines changed: 26 additions & 3 deletions
@@ -30,8 +30,31 @@ This API adds several batteries-included capabilities that simplify large-scale,
 - Automatic sharding, load balancing, and autoscaling distribute work across a Ray cluster with built-in fault tolerance.
 - Continuous batching keeps vLLM replicas saturated and maximizes GPU utilization.
 - Transparent support for tensor and pipeline parallelism enables efficient multi-GPU inference.
-
-The following example shows how to run batched inference with Ray Data and vLLM:
-<gh-file:examples/offline_inference/batch_llm_inference.py>
+- Reading and writing to most popular file formats and cloud object storage.
+- Scaling up the workload without code changes.
+
+??? code
+
+    ```python
+    import ray  # Requires ray>=2.44.1
+    from ray.data.llm import vLLMEngineProcessorConfig, build_llm_processor
+
+    config = vLLMEngineProcessorConfig(model_source="unsloth/Llama-3.2-1B-Instruct")
+    processor = build_llm_processor(
+        config,
+        preprocess=lambda row: {
+            "messages": [
+                {"role": "system", "content": "You are a bot that completes unfinished haikus."},
+                {"role": "user", "content": row["item"]},
+            ],
+            "sampling_params": {"temperature": 0.3, "max_tokens": 250},
+        },
+        postprocess=lambda row: {"answer": row["generated_text"]},
+    )
+
+    ds = ray.data.from_items(["An old silent pond..."])
+    ds = processor(ds)
+    ds.write_parquet("local:///tmp/data/")
+    ```
 
 For more information about the Ray Data LLM API, see the [Ray Data LLM documentation](https://docs.ray.io/en/latest/data/working-with-llms.html).
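
The `preprocess` and `postprocess` lambdas in the added snippet are plain row-to-dict functions, so their shapes can be checked without a Ray cluster or a GPU. The sketch below is not part of this commit: the names `SYSTEM_PROMPT`, `preprocess`, and `postprocess` are hypothetical stand-ins for the lambdas, and the sample completion is made up for illustration. It relies on the fact that `ray.data.from_items` wraps each non-dict element in a row keyed by `"item"`.

```python
# Hypothetical named equivalents of the lambdas in the committed snippet.
SYSTEM_PROMPT = "You are a bot that completes unfinished haikus."


def preprocess(row: dict) -> dict:
    """Turn one input row into a chat request plus sampling parameters."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": row["item"]},
        ],
        "sampling_params": {"temperature": 0.3, "max_tokens": 250},
    }


def postprocess(row: dict) -> dict:
    """Keep only the generated text, stored under the key "answer"."""
    return {"answer": row["generated_text"]}


# Local check on sample rows; the generated text here is invented.
request = preprocess({"item": "An old silent pond..."})
result = postprocess({"generated_text": "A frog jumps in."})
print(request["messages"][1]["content"])
print(result)
```

With Ray and vLLM installed, these functions could presumably be passed in place of the lambdas, as `build_llm_processor(config, preprocess=preprocess, postprocess=postprocess)`.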
