ARCHIVED: replaced by clowder-dev/storage-manager
Per-worker-node storage manager, responsible for retrieving aimages, models and components, storing them locally and making them available to inference engines.
- Downloading and storing images, models and components to a configurable cache directory
- Managing cached content in the OCI layout format, which provides content-addressable storage (CAS) and deduplication
- Exposing an API that accepts commands to:
  - Download a specific aimage, model or component from a provided URL
  - Check if a specific aimage, model or component is available and complete
  - Remove a specific aimage, model or component from the cache
- Enabling configuration of options via CLI flags and environment variables, with reasonable defaults
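To illustrate why an OCI layout gives deduplication, here is a minimal sketch of content-addressable storage. It relies on the `blobs/sha256/<hex>` directory convention of the OCI image layout; the helper name `store_blob` is hypothetical and this is not the storage manager's actual code.

```python
import hashlib
import tempfile
from pathlib import Path


def store_blob(layout_root: Path, data: bytes) -> str:
    """Write data under blobs/sha256/<hex> and return its digest string.

    Hypothetical helper for illustration only.
    """
    hex_digest = hashlib.sha256(data).hexdigest()
    blob_path = layout_root / "blobs" / "sha256" / hex_digest
    blob_path.parent.mkdir(parents=True, exist_ok=True)
    if not blob_path.exists():
        # Identical content maps to the same path, so it is written only once.
        blob_path.write_bytes(data)
    return f"sha256:{hex_digest}"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first = store_blob(root, b"model weights")
    second = store_blob(root, b"model weights")  # same bytes, same digest
    blob_count = len(list((root / "blobs" / "sha256").iterdir()))
```

Because the file name is derived from the content, storing the same bytes twice leaves a single blob on disk.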
Option | Flag | Env Var | Description | Default |
---|---|---|---|---|
Cache Directory | --cache-dir | CACHE_DIR | Directory where images, models and components are stored | /var/lib/nekko/cache |
Address | --address | NEKKO_ADDRESS | Address and port or Unix-domain socket where the API listens | localhost:8050 |
Log Level | --verbose | VERBOSE | Log level for the application | 0 |
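The following sketch shows one way the options above could be resolved. The flag names, environment variables and defaults come from the table; the precedence (flag, then environment variable, then default), the assumption that `--verbose` takes an integer value, and the function name `resolve_config` are illustrative assumptions, not documented behavior.

```python
import argparse

# Defaults and environment variable names taken from the options table.
DEFAULTS = {
    "cache_dir": "/var/lib/nekko/cache",
    "address": "localhost:8050",
    "verbose": "0",
}
ENV_VARS = {
    "cache_dir": "CACHE_DIR",
    "address": "NEKKO_ADDRESS",
    "verbose": "VERBOSE",
}


def resolve_config(argv, environ):
    """Resolve options; assumed precedence is flag > env var > default."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--cache-dir")
    parser.add_argument("--address")
    parser.add_argument("--verbose")  # assumed to take a numeric level
    args = vars(parser.parse_args(argv))

    config = {}
    for key, default in DEFAULTS.items():
        if args[key] is not None:
            config[key] = args[key]
        else:
            config[key] = environ.get(ENV_VARS[key], default)
    config["verbose"] = int(config["verbose"])
    return config


config = resolve_config(
    ["--address", "localhost:9000"],
    {"CACHE_DIR": "/tmp/cache"},
)
```

In real use, `argv` would be `sys.argv[1:]` and `environ` would be `os.environ`; they are passed explicitly here so the resolution logic is easy to test.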
The storage manager exposes an API with the following endpoints:
GET /content/<URL>
: Check if URL is available in cache.
POST /content/
: Download content from the provided URL and store it in the cache.
DELETE /content/<URL>
: Remove content from the cache.
GET /content/<URL> checks whether a specific URL is available in the cache. Returns 404 if the content is not available, 200 if it is available and complete, and 206 if it is available but incomplete. The URL in the path is base64-encoded.
Response:
{
"url": "<URL>",
"digest": "<DIGEST>"
}
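As a rough sketch of the client side of this check, the snippet below builds the request path and maps the documented status codes to a cache state. The source only says the URL is base64-encoded; the URL-safe alphabet used here is an assumption, and the helper names and the example registry are hypothetical.

```python
import base64

# Status codes as documented for the availability check.
STATUS_MEANING = {200: "complete", 206: "incomplete", 404: "missing"}


def content_path(url: str) -> str:
    """Build the /content/<URL> path.

    Assumption: URL-safe base64 with padding; the exact variant the
    server expects is not stated in this document.
    """
    encoded = base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")
    return f"/content/{encoded}"


def cache_state(status_code: int) -> str:
    """Translate a response status code into a cache state label."""
    return STATUS_MEANING.get(status_code, "unexpected")


example_url = "oci://registry.example.com/repo/image:latest"
path = content_path(example_url)
```

Sending a GET request for `path` to the configured address (localhost:8050 by default) and passing the response status to `cache_state` gives the availability of the content.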
Downloads content from the provided URL and stores it in the cache. The request body contains JSON with the URL of the content. Returns 201 if successful.
Response:
{
"url": "<URL>",
"digest": "<DIGEST>"
}
Body is as follows:
{
"url": "<URL>"
}
URL is not base64-encoded.
The URL format determines which downloader is used.
You can provide credentials via the field "credentials" and an optional "credentialsType" field. E.g.:
{
"url": "<URL>",
"credentials": "<TOKEN>"
}
or
{
"url": "<URL>",
"credentials": "<TOKEN>",
"credentialsType": "Bearer"
}
The interpretation of the token is up to the individual downloader.
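A small sketch of assembling the download request body follows. The field names `url`, `credentials` and `credentialsType` come from the examples above; the function name `download_request_body` is hypothetical.

```python
import json


def download_request_body(url, credentials=None, credentials_type=None):
    """Serialize the JSON body for POST /content/ (illustrative helper)."""
    body = {"url": url}  # the URL is sent as-is, not base64-encoded
    if credentials is not None:
        body["credentials"] = credentials
        if credentials_type is not None:
            # Optional; only meaningful when credentials are present.
            body["credentialsType"] = credentials_type
    return json.dumps(body)


body = download_request_body(
    "hf:///unsloth/SmolLM2-135M-Instruct-GGUF/SmolLM2-135M-Instruct-Q2_K.gguf",
    credentials="<TOKEN>",
    credentials_type="Bearer",
)
anonymous_body = download_request_body("https://example.com/model.gguf")
```

The resulting string is what a client would send as the body of `POST /content/`; a 201 response indicates the content was downloaded and stored.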
Removes the content from the cache. The URL in the path is base64-encoded. Returns 204 if successful.
Response: No content in the response body.
The following downloaders and request formats are supported.
- URL format: oci://<registry>/<repository>/<image>:<tag> or oci://<registry>/<repository>/<image>@<digest>
- Credentials: token
- Credentials Type: Only Bearer supported, defaults to Bearer
- URL format: huggingface://<registry>/<model>/<file> or hf://<registry>/<model>/<file>; if no <registry> is supplied, defaults to huggingface.co, e.g. hf:///unsloth/SmolLM2-135M-Instruct-GGUF/SmolLM2-135M-Instruct-Q2_K.gguf (note three / following hf)
- Credentials: token
- Credentials Type: Only Bearer supported, defaults to Bearer
Supports both http and https
- URL format: http://<host>/<path> or https://<host>/<path>
- Credentials: token or username-password, :-separated and base64-encoded
- Credentials Type: Bearer or Basic, defaults to Bearer
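For the Basic case, the credentials value could be produced as sketched below: username and password joined with a colon, then base64-encoded. The standard base64 alphabet is assumed (as in HTTP Basic authentication); the function name and the sample username and password are made up for illustration.

```python
import base64


def basic_credentials(username: str, password: str) -> str:
    """Join username and password with ':' and base64-encode the result."""
    raw = f"{username}:{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


token = basic_credentials("alice", "s3cret")

# Fields that would accompany the URL in the download request body.
credential_fields = {"credentials": token, "credentialsType": "Basic"}
```

Since the credentials type defaults to Bearer, "credentialsType" has to be set to "Basic" explicitly when using this form.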
Future planned support. Closely resembles OCI.
- URL format: ollama://<host>/<path>; if no host is provided, defaults to ollama.com, e.g. ollama:///<path> (note three / following ollama)
- Credentials: token
- Credentials Type: Only Bearer supported, defaults to Bearer
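The scheme-based selection and default hosts described above can be sketched as follows. The schemes and defaults are taken from the lists; the downloader labels, the function name `select_downloader` and the overall structure are assumptions for illustration, not the storage manager's implementation.

```python
from urllib.parse import urlsplit

# scheme -> (downloader label, default host when none is given in the URL)
DOWNLOADERS = {
    "oci": ("oci", None),
    "huggingface": ("huggingface", "huggingface.co"),
    "hf": ("huggingface", "huggingface.co"),
    "http": ("http", None),
    "https": ("http", None),
    "ollama": ("ollama", "ollama.com"),  # listed as future planned support
}


def select_downloader(url: str):
    """Return (downloader, host, path) for a content URL."""
    parts = urlsplit(url)
    if parts.scheme not in DOWNLOADERS:
        raise ValueError(f"unsupported URL scheme: {parts.scheme!r}")
    name, default_host = DOWNLOADERS[parts.scheme]
    # An empty host, as in hf:///..., falls back to the default if one exists.
    host = parts.netloc or default_host
    if not host:
        raise ValueError(f"URL has no host and {name} has no default: {url}")
    return name, host, parts.path.lstrip("/")


hf_target = select_downloader(
    "hf:///unsloth/SmolLM2-135M-Instruct-GGUF/SmolLM2-135M-Instruct-Q2_K.gguf"
)
```

The three slashes after the scheme leave the host component empty, which is what triggers the default host in this sketch.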