Commit f50e66c

EAddario authored and qnixsynapse committed
quantize : handle user-defined pruning of whole layers (blocks) (ggml-org#13037)
1 parent ff05727 commit f50e66c

1 file changed

+1
-0
lines changed

include/llama.h

Lines changed: 1 addition & 0 deletions
@@ -390,6 +390,7 @@ extern "C" {
         void * imatrix;      // pointer to importance matrix data
         void * kv_overrides; // pointer to vector containing overrides
         void * tensor_types; // pointer to vector containing tensor types
+        void * prune_layers; // pointer to vector containing layer indices to prune
     } llama_model_quantize_params;
 
     typedef struct llama_logit_bias {
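
The new field is an opaque `void *`, so the header alone does not say what the pointed-to vector holds. The sketch below shows one way a caller and a consumer might agree on it, assuming the vector is a `std::vector<int>` of layer indices. That element type, the trimmed `quantize_params_sketch` struct, and the `should_prune` and `count_pruned` helpers are all hypothetical stand-ins for illustration, not part of `llama.h`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, trimmed stand-in for the tail of llama_model_quantize_params.
// Only the four pointer fields visible in the diff are reproduced here.
struct quantize_params_sketch {
    void * imatrix;
    void * kv_overrides;
    void * tensor_types;
    void * prune_layers; // ASSUMPTION: points at a std::vector<int> of layer indices
};

// Hypothetical helper: true if layer index `il` appears in the prune list.
// A null pointer is treated as "prune nothing".
static bool should_prune(const quantize_params_sketch & params, int il) {
    if (params.prune_layers == nullptr) {
        return false;
    }
    const auto & layers = *static_cast<const std::vector<int> *>(params.prune_layers);
    return std::find(layers.begin(), layers.end(), il) != layers.end();
}

// Hypothetical helper: counts how many of the first `n_layers` layers would be dropped.
static int count_pruned(const quantize_params_sketch & params, int n_layers) {
    int n = 0;
    for (int il = 0; il < n_layers; ++il) {
        if (should_prune(params, il)) {
            ++n;
        }
    }
    return n;
}
```

A caller would keep the vector alive for the duration of the quantize call and store its address, for example `std::vector<int> prune = {3, 7}; params.prune_layers = &prune;`. Because the field is untyped, both sides must agree on the element type out of band; a mismatch would not be caught by the compiler.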