LeapfrogAI currently relies on two primary backend models, but more should be added and tested. By implementing certain small models and evaluating their efficacy from a human perspective, we can make better-informed decisions about which models to use and evaluate against.
## Goal
To determine a list of models and model configs that work well in LFAI from a human-in-the-loop perspective (no automated evals).
## Methodology
- Determine a short list (up to 5) of models to test in LFAI (sourcing candidates from Hugging Face is the simplest way to do this)
- Run LFAI in a local dev context, replacing the model backend with one of these choices (a minimal backend sketch follows this list)
- Change the config parameters to gauge performance
- Record the config and the subjective results for each run
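To make the "replace the backend and vary the config" step concrete, below is a minimal sketch of exercising a candidate model with llama-cpp-python. The model path, prompt, and parameter values are illustrative placeholders rather than LeapfrogAI's actual backend configuration; `n_ctx`, `n_gpu_layers`, and the sampling parameters are the kinds of knobs worth varying and recording per run.

```python
# Minimal sketch: exercising a candidate GGUF model with llama-cpp-python.
# The model path and parameter values are illustrative placeholders,
# not LeapfrogAI's actual backend configuration.
from llama_cpp import Llama

llm = Llama(
    model_path="models/candidate-7b.Q4_K_M.gguf",  # hypothetical quantized weights
    n_ctx=4096,        # context window -- one of the knobs to vary per run
    n_gpu_layers=-1,   # offload all layers to GPU (stay within the 12-16 GB budget)
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the LeapfrogAI project in two sentences."}],
    temperature=0.2,   # sampling parameters are the other knobs to sweep
    top_p=0.9,
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```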
## Limitations
- Model licenses must be permissive (e.g., MIT, Apache-2.0)
- Must be compatible with vllm or llama-cpp-python (the two frameworks currently supported by LFAI)
- Must fit within limited VRAM (12-16 GB including model weights and context); see the back-of-envelope estimate after this list
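To screen candidates against the VRAM budget before downloading anything, a rough back-of-envelope estimate can help. The sketch below assumes an fp16 KV cache and uses illustrative architecture numbers for a generic 7B model; it ignores framework overhead and activations, so treat the result as a lower bound.

```python
# Rough back-of-envelope VRAM estimate for screening candidates against the
# 12-16 GB budget. The architecture numbers below are illustrative values for
# a generic 7B model, not measurements from any specific checkpoint.
def estimate_vram_gb(
    n_params_b: float = 7.0,       # parameters, in billions
    bytes_per_param: float = 0.5,  # ~4-bit quantization; use 2.0 for fp16
    n_layers: int = 32,
    n_kv_heads: int = 8,
    head_dim: int = 128,
    n_ctx: int = 4096,
    kv_bytes: int = 2,             # fp16 KV cache
) -> float:
    weights = n_params_b * 1e9 * bytes_per_param
    # KV cache: keys + values, per layer, per token in the context window
    kv_cache = 2 * n_layers * n_ctx * n_kv_heads * head_dim * kv_bytes
    return (weights + kv_cache) / 1e9

print(f"~{estimate_vram_gb():.1f} GB")  # excludes framework overhead and activations
```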
## Delivery
- A list of models, the config options tested, and corresponding subjective scores gauging their performance within LFAI (a possible record format is sketched after this list)
- A report of the methodology used to evaluate these models (so the experiment can be replicated)
- Potentially, a repository containing the code used to run these evaluations (and the inputs they were evaluated on)
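As one option for capturing the deliverable, the hypothetical record format below stores each run's model, backend, config, and subjective scores in a CSV so the experiment can be replicated. The field names and 1-5 scoring scale are assumptions, not a format LeapfrogAI prescribes.

```python
# A possible record format for capturing each run's config and subjective
# scores; field names and the 1-5 scale are hypothetical, not prescribed by LFAI.
import csv
from dataclasses import asdict, dataclass, fields

@dataclass
class EvalRecord:
    model: str              # e.g. Hugging Face repo id
    backend: str            # "vllm" or "llama-cpp-python"
    quantization: str       # e.g. "Q4_K_M", "fp16"
    n_ctx: int
    temperature: float
    top_p: float
    coherence_score: int    # subjective 1-5 rating
    helpfulness_score: int  # subjective 1-5 rating
    notes: str

records = [
    EvalRecord("example-org/example-7b-instruct", "llama-cpp-python",
               "Q4_K_M", 4096, 0.2, 0.9, 4, 3, "occasionally truncates lists"),
]

with open("lfai_model_evals.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=[fld.name for fld in fields(EvalRecord)])
    writer.writeheader()
    writer.writerows(asdict(r) for r in records)
```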