Motivation
GuideLLM currently accepts Hugging Face (HF) hosted datasets (DSs), paths to local DSs, or synthetic DSs.
To demonstrate the strengths of KVCache-aware routing, we need an easy way to create DSs that represent the use cases that benefit most from this kind of routing, such as RAG-based apps and agentic apps.
The Plan
Create a DS generation engine that receives use-case requirements as parameters and returns a full, guideLLM-ready DS matching that use case. The DS will be ready to be fed to GuideLLM Benchmark as the --data parameter without any changes.
High-level steps of implementation
The engine will consist of two consecutive layers:
The first layer will receive requirements as parameters (e.g. number of different Apps, system-prompt length, tools length, RAG-doc length, number of RAG docs per App, etc.) and will return a JSON file containing all the Apps required by the user, in a textual, human-readable form, where all lengths are fully configurable.
simplified example:
{
  "systemPrompt": "8DzB0vXMMDO1ihCpCNsEBDH2FrHfmnR",
  "tools": "iSvQglvUQgoapyEWuYjNvgrqRR8DeX6zH6vQfQoC0OSSzcafs1XHHHLnxYS9O",
  "ragDocs": [
    "Iwat4dvnPdrmsLhYEP8RTsR9Es1kc4MI0wIfsFG55",
    "0xYplap6ennnt6nlhBFMjlJTHNU8kW68JhaHY6TK"
  ]
}
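As a rough sketch of what the first layer's generator could look like (all function names and parameters here are hypothetical illustrations, not existing GuideLLM APIs):

```python
import random
import string


def _rand_text(length: int) -> str:
    # Hypothetical helper: random printable filler standing in for
    # real text of a configurable length.
    return "".join(random.choices(string.ascii_letters + string.digits, k=length))


def generate_apps(
    num_apps: int,
    system_prompt_len: int,
    tools_len: int,
    rag_doc_len: int,
    rag_docs_per_app: int,
) -> list[dict]:
    # Layer 1: build a human-readable description of each App,
    # with every length fully configurable by the caller.
    return [
        {
            "systemPrompt": _rand_text(system_prompt_len),
            "tools": _rand_text(tools_len),
            "ragDocs": [_rand_text(rag_doc_len) for _ in range(rag_docs_per_app)],
        }
        for _ in range(num_apps)
    ]


# Example: two Apps shaped like the JSON above.
apps = generate_apps(num_apps=2, system_prompt_len=31, tools_len=61,
                     rag_doc_len=41, rag_docs_per_app=2)
```

The list of App dicts could then be serialized with `json.dump` to produce the layer's output file.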
The second layer will receive the first layer's output JSON as input, along with use-case-related parameters (e.g. number of users, number of requests per user session, number of users sharing the same App, number of documents per user, etc.), and will compress and flatten it into a guideLLM-ready, prompt-based DS in a way that takes the use case into consideration.
e.g. 10 users, where every 2 users share the same App and use 2 of its documents - the layer will create pairs of consecutive prompts sharing the same App's system prompt and tools, differing only in the RAG docs chosen (possibly) and in the user prompt.
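A minimal sketch of how the second layer could flatten the App descriptions into consecutive, prefix-sharing prompts (the function and its parameters are hypothetical, assuming the App structure produced by the first layer):

```python
def flatten_to_prompts(apps, num_users, requests_per_user,
                       users_per_app, docs_per_user):
    # Layer 2 (sketch): expand App descriptions into a flat, prompt-based DS.
    # Users assigned to the same App produce consecutive prompts that share
    # the App's system prompt and tools, differing only in the chosen RAG
    # docs and the per-request user prompt.
    prompts = []
    for user in range(num_users):
        app = apps[(user // users_per_app) % len(apps)]
        docs = app["ragDocs"][:docs_per_user]
        for req in range(requests_per_user):
            prompts.append(
                app["systemPrompt"] + "\n" + app["tools"] + "\n"
                + "\n".join(docs)
                + f"\nuser-{user} request-{req}"  # placeholder user prompt
            )
    return prompts
```

With `num_users=10` and `users_per_app=2`, pairs of consecutive users emit prompts with an identical system-prompt/tools/docs prefix, which is exactly the shared-prefix pattern a KVCache-aware router can exploit.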
@SharonGil I like the idea of an extension to enable RAG-style use cases, especially as it relates to the piece you mentioned here for testing KV Cache setups. I'd recommend rather than going to JSON, though, we build this functionality in natively either by extending the SyntheticDatasetCreator class and SyntheticDatasetConfig, which would take in an optional type for the synthetic data to create and create the desired output formats. Either that, or we add in a new DatasetCreator for this specific use case and we can route it based on a type passed in the input args config for the --data parameter. This way we can avoid going to the intermediate JSON and enable a more portable and dynamic solution for command sharing and reproducibility.
If you have an example of what that generator would look like, I can help advise on how I think it would fit in easiest to the existing DatasetCreator flows.
Also, one other note we'll need to consider with these is that there isn't currently any assumption about session / user based requests at the data / load generation level. So, we'll likely need to add some logic there so we can better simulate those users and ensure we aren't sending multiple requests from this concept of a single user at once. This could be a follow-up or a feature we leave open for the future, though.
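One hedged sketch of the session logic this implies - each simulated user awaits its requests strictly one at a time, while distinct users run concurrently (the names and structure here are illustrative, not existing GuideLLM code):

```python
import asyncio


async def run_user(user_id, prompts, send):
    # A single simulated user: each request is awaited before the next is
    # issued, so this user never has more than one request in flight.
    results = []
    for prompt in prompts:
        results.append(await send(user_id, prompt))
    return results


async def run_benchmark(sessions, send):
    # Distinct users run concurrently; ordering is only enforced per user.
    # `sessions` maps user id -> list of prompts; `send` is the request fn.
    return await asyncio.gather(
        *(run_user(uid, prompts, send) for uid, prompts in sessions.items())
    )
```

This keeps per-user request ordering without serializing the whole benchmark, which seems like the property needed to simulate user sessions faithfully.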
Link to issue in Distributed-KV-Cache repo - llm-d/llm-d-kv-cache-manager#4