Commit b489520

gguf.md: GGUF Filename Parsing Strategy
1 parent 1bf1ab5 commit b489520

1 file changed

+23
-3
lines changed

docs/gguf.md

Lines changed: 23 additions & 3 deletions
@@ -20,12 +20,12 @@ The key difference between GGJT and GGUF is the use of a key-value structure for
2020

2121
### GGUF Naming Convention
2222

23-
GGUF follow a naming convention of `<Model>-<Version>-<ExpertsCount>x<Parameters>-<Quantization>.gguf`.
23+
GGUF files follow the naming convention `<Model>-<Version>-<ExpertsCount>x<Parameters>-<Quantization>.gguf`.
2424

2525
The components are:
2626
1. **Model**: A descriptive name for the model type or architecture.
27-
2. **Version (Optional)**: Denotes the model version number, starting at `v1` if not specified, formatted as `v<Major>.<Minor>`.
28-
- Best practice to include model version number only if model has multiple versions and assume the unversioned model to be the first version and/or check the model card.
27+
2. **Version**: (Optional) Denotes the model version number, formatted as `v<Major>.<Minor>`.
28+
- If the filename has no version number, assume `v0.0` (prerelease).
2929
3. **ExpertsCount**: Indicates the number of experts found in a Mixture of Experts based model.
3030
4. **Parameters**: Indicates the number of parameters and their scale, represented as `<count><scale-prefix>`:
3131
- `T`: Trillion parameters.
@@ -45,6 +45,26 @@ The components are:
4545
- Even Number (0 or 2): `<model weights> = <scaling factor> * <quantised weight>`
4646
- Odd Number (1 or 3): `<model weights> = <offset factor> + <scaling factor> * <quantised weight>`
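
The two forms above can be sketched in Python as follows. This is only an illustration of the arithmetic: the function name `dequantize`, its parameters, and the use of an integer `variant` to choose between the even and odd forms are assumptions made for this example, not part of any GGUF or ggml API.

```python
def dequantize(variant: int, quantised_weight: float,
               scaling_factor: float, offset_factor: float = 0.0) -> float:
    """Recover a model weight from a quantised weight (illustrative only)."""
    if variant % 2 == 0:
        # Even number (0 or 2): scale only.
        return scaling_factor * quantised_weight
    # Odd number (1 or 3): scale, then add the offset.
    return offset_factor + scaling_factor * quantised_weight
```

For instance, with a scaling factor of `0.5` and a quantised weight of `3`, the even form gives `1.5`, and the odd form with an offset of `1.0` gives `2.5`.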
4747

48+
#### Parsing Above Naming Convention
49+
50+
To parse a well-formed GGUF filename that follows this naming convention, read it from right to left using `-` as the delimiter. This lets the model name itself contain dashes while keeping the version string optional. It also leaves room to extend the format in the future.
51+
52+
For example:
53+
54+
* `mixtral-v0.1-8x7B-Q2_K.gguf`:
55+
- Model Name: Mixtral
56+
- Version Number: v0.1
57+
- Expert Count: 8
58+
- Parameter Count: 7B
59+
- Quantization: Q2_K
60+
61+
* `Hermes-2-Pro-Llama-3-8B-F16.gguf`:
62+
- Model Name: Hermes 2 Pro Llama 3 (the trailing `3` is not in `v<Major>.<Minor>` form, so it stays part of the model name)
63+
- Version Number: v0.0 (`<Version>-` missing)
64+
- Expert Count: 0 (`<ExpertsCount>x` missing)
65+
- Parameter Count: 8B
66+
- Quantization: F16
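
The right-to-left strategy described above could be sketched in Python roughly as follows. The names `GGUFName` and `parse_gguf_filename` are hypothetical, and the regular expressions encode assumptions drawn only from the convention text (a version must look like `v<Major>.<Minor>`, and the parameter scale is a single letter); this is not an official parser.

```python
import re
from typing import NamedTuple


class GGUFName(NamedTuple):
    model: str
    version: str
    experts: int
    parameters: str
    quantization: str


# Assumption: a version is exactly v<Major>.<Minor>.
_VERSION_RE = re.compile(r"v\d+\.\d+")
# Assumption: optional "<ExpertsCount>x", then "<count><scale-letter>".
_PARAMS_RE = re.compile(r"(?:(\d+)x)?(\d+(?:\.\d+)?[A-Za-z])")


def parse_gguf_filename(filename: str) -> GGUFName:
    stem = filename[: -len(".gguf")] if filename.endswith(".gguf") else filename
    parts = stem.split("-")
    if len(parts) < 3:
        raise ValueError(f"not enough components in {filename!r}")

    # Read from the right: quantization first, then experts/parameters.
    quantization = parts.pop()
    match = _PARAMS_RE.fullmatch(parts.pop())
    if match is None:
        raise ValueError(f"no parameter count found in {filename!r}")
    experts = int(match.group(1)) if match.group(1) else 0
    parameters = match.group(2)

    # The version is optional; default to v0.0 when it is missing.
    version = "v0.0"
    if _VERSION_RE.fullmatch(parts[-1]):
        version = parts.pop()
    if not parts:
        raise ValueError(f"no model name found in {filename!r}")

    # Whatever remains on the left is the model name, dashes included.
    return GGUFName("-".join(parts), version, experts, parameters, quantization)
```

Calling `parse_gguf_filename("Hermes-2-Pro-Llama-3-8B-F16.gguf")` under these assumptions yields the model name `Hermes-2-Pro-Llama-3`, version `v0.0`, and an expert count of `0`, because the `3` does not match the version pattern and so remains part of the name.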
67+
4868
### File Structure
4969

5070
![image](https://github.com/ggerganov/ggml/assets/1991296/c3623641-3a1d-408e-bfaf-1b7c4e16aa63)
