[GGUF] typed metadata #1649

Merged
merged 3 commits into from
Jul 24, 2025

mishig25
Collaborator

@mishig25 mishig25 commented Jul 23, 2025

Description

Enhance GGUF functionality by adding typedMetadata support

This update introduces typedMetadata to the gguf function, allowing users to request structured metadata alongside the standard output. The implementation includes checks for both V1 and V2 file formats, ensuring compatibility and consistency in metadata retrieval. Additionally, tests have been added to validate the new functionality and ensure that metadata values align correctly between standard and typed formats.

Usage

import { GGMLQuantizationType, GGUFValueType, gguf } from "@huggingface/gguf";

const URL_LLAMA = "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/191239b/llama-2-7b-chat.Q2_K.gguf";

const { metadata, typedMetadata } = await gguf(URL_LLAMA, { typedMetadata: true });

console.log(typedMetadata);
// {
//     version: { value: 2, type: GGUFValueType.UINT32 },
//     tensor_count: { value: 291n, type: GGUFValueType.UINT64 },
//     kv_count: { value: 19n, type: GGUFValueType.UINT64 },
//     "general.architecture": { value: "llama", type: GGUFValueType.STRING },
//     "general.file_type": { value: 10, type: GGUFValueType.UINT32 },
//     "general.name": { value: "LLaMA v2", type: GGUFValueType.STRING },
//     "llama.attention.head_count": { value: 32, type: GGUFValueType.UINT32 },
//     "llama.attention.layer_norm_rms_epsilon": { value: 9.999999974752427e-7, type: GGUFValueType.FLOAT32 },
//     "tokenizer.ggml.tokens": { value: ["<unk>", "<s>", "</s>", ...], type: GGUFValueType.ARRAY },
//     ...
// }

// Access both value and type information
console.log(typedMetadata["general.architecture"].value); // "llama"
console.log(typedMetadata["general.architecture"].type);  // GGUFValueType.STRING (8)
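
The description mentions tests that check values agree between the standard and typed outputs. A minimal sketch of that relationship, assuming each typed entry is an object with `value` and `type` fields as in the output above. `TypedEntry` and `stripTypes` are hypothetical names used for illustration only; they are not part of `@huggingface/gguf`, and the sample data is copied by hand rather than read from a file.

```typescript
// Hypothetical shape mirroring the console output above; not the library's exported type.
type TypedEntry = { value: unknown; type: number };

// Collapse a typed metadata record into a plain key -> value record.
function stripTypes(typed: Record<string, TypedEntry>): Record<string, unknown> {
  const plain: Record<string, unknown> = {};
  for (const [key, entry] of Object.entries(typed)) {
    plain[key] = entry.value;
  }
  return plain;
}

// Sample entries copied from the output above (GGUF spec ids: 8 = STRING, 4 = UINT32).
const sampleTyped: Record<string, TypedEntry> = {
  "general.architecture": { value: "llama", type: 8 },
  "llama.attention.head_count": { value: 32, type: 4 },
};

const samplePlain = stripTypes(sampleTyped);
console.log(samplePlain["general.architecture"]); // "llama"
```

If the two outputs are consistent, `stripTypes(typedMetadata)` should match `metadata` key for key.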

@mishig25 mishig25 requested review from ngxson and julien-c as code owners July 23, 2025 21:20

// Check if scores array is properly handled
if (typedMetadata["tokenizer.ggml.scores"]) {
	expect(typedMetadata["tokenizer.ggml.scores"].type).toEqual(GGUFValueType.ARRAY);
Member

Just wondering, should we now split GGUFValueType.ARRAY into GGUFValueType.ARRAY_INT32, GGUFValueType.ARRAY_STRING, etc.?

It would come in handy when we want to distinguish between arrays of uint, int, or float.

Collaborator Author

That would diverge from the GGUF spec, no?

https://github.com/ggml-org/ggml/blob/master/docs/gguf.md?plain=1#L191

There is no ARRAY_STRING or ARRAY_INT32 in enum gguf_metadata_value_type.

Collaborator Author

Pushed 0ba56b1.

If the type is ARRAY, the entry now also has a subType property giving the element type:

//     "tokenizer.ggml.tokens": { value: ["<unk>", "<s>", "</s>", ...], type: GGUFValueType.ARRAY, subType: GGUFValueType.STRING },
//     "tokenizer.ggml.scores": { value: [0.0, -1000.0, -1000.0, ...], type: GGUFValueType.ARRAY, subType: GGUFValueType.FLOAT32 },
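
One way to model this shape in TypeScript is a discriminated union in which `subType` exists only on array entries. The sketch below declares its own table of value-type ids (numbering taken from the GGUF spec's `gguf_metadata_value_type`) so that it stands alone. `ValueType`, `ScalarEntry`, `ArrayEntry`, `TypedEntry`, `typeName`, and `describeEntry` are hypothetical names; the types actually exported by the library may differ.

```typescript
// Local copy of the GGUF spec's value-type ids, so the sketch is self-contained.
const ValueType = {
  UINT8: 0, INT8: 1, UINT16: 2, INT16: 3, UINT32: 4, INT32: 5, FLOAT32: 6,
  BOOL: 7, STRING: 8, ARRAY: 9, UINT64: 10, INT64: 11, FLOAT64: 12,
} as const;

type ValueTypeId = (typeof ValueType)[keyof typeof ValueType];
type ArrayId = typeof ValueType.ARRAY;

// Scalars carry only `type`; arrays additionally carry the element type in `subType`.
type ScalarEntry = { value: unknown; type: Exclude<ValueTypeId, ArrayId> };
type ArrayEntry = { value: unknown[]; type: ArrayId; subType: ValueTypeId };
type TypedEntry = ScalarEntry | ArrayEntry;

// Reverse lookup from numeric id to its name.
function typeName(id: ValueTypeId): string {
  const found = Object.entries(ValueType).find(([, v]) => v === id);
  return found ? found[0] : "UNKNOWN";
}

function describeEntry(entry: TypedEntry): string {
  if (entry.type === ValueType.ARRAY) {
    // Narrowed to ArrayEntry here, so `subType` is available.
    return `ARRAY<${typeName(entry.subType)}>`;
  }
  return typeName(entry.type);
}

const tokens: TypedEntry = {
  value: ["<unk>", "<s>", "</s>"],
  type: ValueType.ARRAY,
  subType: ValueType.STRING,
};
const arch: TypedEntry = { value: "llama", type: ValueType.STRING };

console.log(describeEntry(tokens)); // "ARRAY<STRING>"
console.log(describeEntry(arch)); // "STRING"
```

Keeping `type` as plain ARRAY and adding `subType` leaves the top-level ids identical to the spec while still exposing the element type.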

Member

I would keep things simple personally, but 🤷

Member

Yes, subType would work too!

This is necessary because we need to reconstruct the array with its original element type. Otherwise the GGUF file will fail to load if the element type is mismatched (which can happen with float/uint/int).
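
To illustrate why the element type has to survive a round trip: the GGUF spec serializes an array as the element type (uint32), the element count (uint64 in v2 and later), then the elements. The same JavaScript numbers therefore produce different bytes depending on the element type. A minimal sketch, assuming little-endian v2+ layout and handling only INT32 and FLOAT32; `encodeNumericArray` is a hypothetical helper and not part of `@huggingface/gguf`.

```typescript
// Value-type ids from the GGUF spec.
const INT32 = 5;
const FLOAT32 = 6;

// Hypothetical helper: encode a numeric array the way a GGUF v2+ writer would.
function encodeNumericArray(
  values: number[],
  subType: typeof INT32 | typeof FLOAT32,
): Uint8Array {
  // 4 bytes element type + 8 bytes count + 4 bytes per element.
  const buffer = new ArrayBuffer(4 + 8 + 4 * values.length);
  const view = new DataView(buffer);
  view.setUint32(0, subType, true);
  // uint64 count written as low and high 32-bit words (assumes length < 2^32).
  view.setUint32(4, values.length, true);
  view.setUint32(8, 0, true);
  values.forEach((v, i) => {
    const offset = 12 + 4 * i;
    if (subType === FLOAT32) {
      view.setFloat32(offset, v, true);
    } else {
      view.setInt32(offset, v, true);
    }
  });
  return new Uint8Array(buffer);
}

const asInt = encodeNumericArray([1, 2], INT32);
const asFloat = encodeNumericArray([1, 2], FLOAT32);

// The first element occupies bytes 12..15 in both encodings.
console.log(Array.from(asInt.slice(12, 16))); // [1, 0, 0, 0]
console.log(Array.from(asFloat.slice(12, 16))); // [0, 0, 128, 63]
```

Since both `[1, 2]` inputs are indistinguishable as JavaScript arrays, a writer needs `subType` to pick the right encoding.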

@mishig25 mishig25 merged commit ca155c1 into main Jul 24, 2025
5 checks passed
@mishig25 mishig25 deleted the gguf_typed_metadata branch July 24, 2025 11:47
3 participants