-
The source of the model manifest data is also interesting here. Inference libs are hardcoded in a way: they can only read specific data from the parameter dicts. This means that at least a part of the schema is already present (as code) in the inference API implementations. We could (and likely should) make use of this and generate some schema from the code itself.
-
### Interaction between Manifest and AC

We use dictionaries to interact with AC. In the code we access dictionary values directly. This means that it is very simple to make a change to the code which changes the schema. Calling any method on the dictionary can potentially do that:

```cpp
value = params.find("valueWittTypo");
```

This flexibility makes writing inference models easy, but keeping up with the schema becomes difficult.

Note that some parameters can have default values. E.g.

```cpp
auto antiprompts = Dict_optValueAt(params, "antoprompts", std::vector<std::string>{});
```

This adds a further complication: combined with a default value, a mistyped key (like "antoprompts" above) doesn't even fail at runtime; it silently yields the default.

Here are some ideas for how to keep the model manifest in sync with the code:

### 1. Write (property-based) tests

We can try to generate tests automatically from the model manifest description. Each test could run a function and be considered successful if the function doesn't return an error. We can generate parameter dictionaries based on the model manifest, run the function, and make sure that it finishes correctly. Then, we can also validate the result dictionary. This testing approach is called property-based testing; one of the first tools to implement it was QuickCheck for Haskell. A minimal sketch follows after the lists below.

Problems:
Work that needs to be done:
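As mentioned above, here is a minimal sketch of what a generated property-based test could look like. All names are hypothetical: `Schema`, `generateRandomDict`, `runOpSync`, and `validateAgainst` don't exist yet and would have to be written (or generated) as part of this work.

```cpp
#include <cassert>
#include <random>

// Schema, generateRandomDict, runOpSync, and validateAgainst are hypothetical.
void propertyTestRunOp(const Schema& paramsSchema, const Schema& resultSchema) {
    std::mt19937 rng(42); // fixed seed so a failing case is reproducible
    for (int i = 0; i < 1000; ++i) {
        // generate a random dict whose keys, types, and value ranges
        // conform to the params schema from the model manifest
        ac::Dict params = generateRandomDict(paramsSchema, rng);
        // hypothetical synchronous wrapper over runOp
        auto result = runOpSync("run", params);
        // property 1: a schema-conforming input never produces an error
        assert(!result.has_error());
        // property 2: the result dict conforms to the result schema
        assert(validateAgainst(resultSchema, result.value()));
    }
}
```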
### 2. Use proper types instead of dicts

We can still retain the dictionary interface to the outside world. However, internally we use proper types: we convert from dicts as the first step of executing a function and back to dicts just before we return the result. The types and the conversion/validation should be generated from the model manifest schema. This solution will solve both problems stated above. A sketch follows after the lists below.

Problems:
Work that needs to be done:
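For illustration, here is a sketch of the kind of generated type this idea implies, assuming `ac::Dict` has the json-like `at`/`get`/`find` interface seen in the snippets below; the struct and its field names are invented for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical: would be code-generated from the manifest schema.
struct LlamaRunParams {
    std::string prompt;                   // required
    int maxTokens = 0;                    // required
    std::vector<std::string> antiprompts; // optional, defaults to empty

    // first step of executing the op: convert and validate in one place
    static LlamaRunParams fromDict(const ac::Dict& d) {
        LlamaRunParams p;
        p.prompt = d.at("prompt").get<std::string>(); // throws on a missing key
        p.maxTokens = d.at("max_tokens").get<int>();
        if (auto it = d.find("antiprompts"); it != d.end()) {
            p.antiprompts = it->get<std::vector<std::string>>();
        }
        return p;
    }
};
```

Inside the op the code would then touch `p.antiprompts` instead of a string literal, so a typo becomes a compile error instead of a silent schema change.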
### 3. Add validation to dicts

This might go beyond my current understanding of C++, but it probably can be done. Currently we write:

```cpp
instance->runOp("run", {{"prompt", prompt}, {"max_tokens", 20}, {"antiprompts", antiprompts}}, {
[&](ac::CallbackResult<void> result) {
if (result.has_error()) {
opError = std::move(result.error().text);
return;
}
latch->count_down();
},
[](std::string_view, ac::Dict result) {
std::cout << result.at("result").get<std::string_view>();
}
});
```
We could instead do:
```cpp
ac::ValidatedDict<LLAMA_INSTANCE_RUN_PARAMS_SCHEMA> dict({{"prompt", prompt}, {"max_tokens", 20}, {"antiprompts", antiprompts}});
instance->runOp("run", dict, {
[&](ac::CallbackResult<void> result) {
if (result.has_error()) {
opError = std::move(result.error().text);
return;
}
latch->count_down();
},
[](std::string_view, ac::ValidatedDict<LLAMA_INSTANCE_RUN_RESULT_SCHEMA> result) {
std::cout << result.at("result").get<std::string_view>();
}
});
```

In this example both the params and the result are validated against their respective schemas.

This way we can test the schema using our normal tests. Also, if we decide to use ValidatedDict in the core library, we can use a macro to remove any validation for the release build. But such optimisations are probably not necessary, because we don't expect validation to be a performance bottleneck for the ways the library will be used. This solution avoids rewriting the core inference code. A sketch of a possible ValidatedDict follows below.

Work to be done:
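For illustration, here is what a ValidatedDict could look like. It assumes each generated schema is a type with a static `validate(dict)` function that throws on mismatch; the class, the schema tag types, and the `AC_SKIP_DICT_VALIDATION` macro are all hypothetical.

```cpp
// Hypothetical sketch, not an existing API.
template <typename SchemaT>
class ValidatedDict {
public:
    explicit ValidatedDict(ac::Dict d)
        : m_dict(std::move(d))
    {
#ifndef AC_SKIP_DICT_VALIDATION // could be defined for release builds
        SchemaT::validate(m_dict); // throws if the dict doesn't match the schema
#endif
    }

    // forward the read interface to the underlying dict
    decltype(auto) at(const std::string& key) const { return m_dict.at(key); }
    const ac::Dict& dict() const { return m_dict; }

private:
    ac::Dict m_dict;
};
```

The macro guard mirrors the point above: validation runs in tests and debug builds, and can be compiled out entirely if it ever matters.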
### 4. Write schema from code?

This one is very hypothetical. In all other ideas the model manifest is the source of truth that the code should follow. Could we reverse this? We can try to use code-gen on the inference code to generate a part of the model manifest. Then this part must be merged with the human-written portions to create the final model manifest.

In order to do this we must be able to collect information from all occurrences of dict constructors or any other dict methods. For example, if we find the following code:

```cpp
void run(Dict params, ...) {
...
auto antiprompts = Dict_optValueAt(params, "antiprompts", std::vector<std::string>{});
}
```

we must be able to deduce that `params` may contain an optional "antiprompts" entry whose type is array of strings, with an empty array as the default.

We must be able to extract all such facts from the code and merge them to produce the final schema. The only way that I can think of doing this is to run example code or tests using a special Dict subclass which records all Dict ops that were called and produces a schema based on them. It would look something like:

```cpp
ac::SchemaWritingDict params({{"prompt", prompt}, {"max_tokens", 20}, {"antiprompts", antiprompts}});
instance->runOp("run", params, {
[&](ac::CallbackResult<void> result) {
if (result.has_error()) {
opError = std::move(result.error().text);
return;
}
latch->count_down();
},
[](std::string_view, ac::SchemaWritingDict result) {
// Add the discovered schema to the schema discovered so far by other runs. Use the schema discovered from write ops
result.mergeWriteSchemaWith("llama_instance_run_result.schema");
std::cout << result.at("result").get<std::string_view>();
}
});
// Add the discovered schema to the schema which was discovered so far by other runs. Use the schema discovered from read ops
params.mergeReadSchemaWith("llama_instance_run_params.schema");
```

The advantage of this approach is that the C++ programmers who will write the inference code get an easy-to-use way to generate schemas in their preferred language. The disadvantage is that this approach looks to be the craziest of all the current approaches.

Problems:

I don't know if this can be done. For example, how do we deal with cases where the existence of certain fields depends on the value of another field? Consider the following code:

```cpp
auto type = params.find("type");
if (type == "type1") {
auto val1 = params.find("param1");
...
} else {
auto val2 = params.find("param2");
...
}
```

Work to be done:
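For illustration, a rough sketch of what such a recording dict could look like, assuming the json-like `ac::Dict` interface used above; the class, `typeNameOf`, and the merge method are all hypothetical.

```cpp
#include <map>
#include <string>

// Hypothetical sketch: wraps a dict and records every key that is read,
// the C++ type it is read as, and whether a default was provided.
class SchemaWritingDict {
public:
    explicit SchemaWritingDict(ac::Dict d) : m_dict(std::move(d)) {}

    template <typename T>
    T valueAt(const std::string& key) {
        m_schema[key] = {typeNameOf<T>(), /*required=*/true}; // typeNameOf is hypothetical
        return m_dict.at(key).get<T>();
    }

    template <typename T>
    T optValueAt(const std::string& key, T defaultValue) {
        m_schema[key] = {typeNameOf<T>(), /*required=*/false};
        auto it = m_dict.find(key);
        return it == m_dict.end() ? std::move(defaultValue) : it->get<T>();
    }

    // merge the recorded facts into a schema file on disk (hypothetical)
    void mergeReadSchemaWith(const std::string& schemaFile);

private:
    struct Entry {
        std::string type;
        bool required;
    };
    std::map<std::string, Entry> m_schema;
    ac::Dict m_dict;
};
```

Note that this records only the keys touched on code paths that actually ran, which is exactly the conditional-fields problem above: the `type1`/`type2` branches would each need at least one run, and the merge step would have to express the result as a union of the two shapes.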
-
So far I'm favoring generating schema from code, though it doesn't need to be that seamless. We could add (hardcode) a formal mapping from dict entries to symbols; a sketch of what I mean follows below. It is true that the generated schemas will have to be merged with the model manifest separately.
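One hypothetical reading of such a mapping (`DictKey` and the namespace are invented for the example): every dict entry gets exactly one named symbol, and a code-gen tool, or even grep, can collect a schema from these definitions.

```cpp
#include <string>
#include <vector>

// Hypothetical: a typed tag tying a dict key to the type it is read as.
template <typename T>
struct DictKey {
    const char* name;
};

namespace llama::run_params {
constexpr DictKey<std::string> prompt{"prompt"};
constexpr DictKey<int> max_tokens{"max_tokens"};
constexpr DictKey<std::vector<std::string>> antiprompts{"antiprompts"};
}

// usage in the inference code (illustrative):
//   auto a = Dict_optValueAt(params, llama::run_params::antiprompts, {});
```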
-
I think that at this point we should focus on the actual model manifest structure rather than the source of the usage schemas. I think it's important to make a distinction between the two. The usage schema is not the model manifest, but just a part of it. I'll edit the titles of the existing issues and add new ones.
-
@iboB What do you envision to be the purpose of DRY? Is it just to simplify the work of manifest writers or is there anything else?
-
As part of this project, is it a business goal to incentivise users to create as many models (i.e. model manifests) as possible, like Huggingface or Ollama? Or do we rather expect that we will be creating the models and users will just use them (more like jan.ai)? I suppose we would prefer the first option, but I want to make sure.
-
Do we plan to include default prompts as part of a model, like ollama does? This can be a good hook to incentivise people to write models, because it is very easy to take a model, add a new prompt, and publish that. Ollama uses Go templates for its prompts.
-
A provider should be able to er... well... provide a model manifest. This is the list of available models and data about them.
This data includes:
There are questions about how to structure and present this data. For each item there is also always the question of whether it should be visible to the user. A potential resolution might be that we simply hide this. Nevertheless, the data should always be available to us.
### Data structure
The right structure is not obvious. We would ideally like to have some DRY in the manifest, but how much?
DRY can be achieved by having multiple tables which reference each other, or by coming up with a magical hierarchical structure which somehow minimizes repetition.
Here are some examples of repeated/reusable data:
The finer-grained the allowed invariants, the less benefit there is from tables referencing each other. An illustrative sketch of the tables option follows below.
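To make the tables option concrete, here is an illustrative sketch in C++ terms (all struct and field names are invented for the example): repeated data lives in its own table, and model entries reference it by id instead of repeating it inline.

```cpp
#include <string>
#include <vector>

// Hypothetical: a shared table of quantization descriptions, written once.
struct QuantizationInfo {
    std::string id; // referenced by ModelEntry::quantizationId
    int bits;
    std::string method;
};

struct ModelEntry {
    std::string name;
    std::string quantizationId; // reference into the quantization table
    std::vector<std::string> assetUrls;
};

struct Manifest {
    std::vector<QuantizationInfo> quantizations; // shared data, one row each
    std::vector<ModelEntry> models;              // many entries reference the same row
};
```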
### Provider specific data
Some data is provider specific. This includes: