Releases · huggingface/setfit

08 Nov 15:26

lewtun

v0.4.1

7a76448

v0.4.1 Patch release

Fixes an issue on Google Colab, whose default Python version (3.7) does not provide the Literal type in the typing module (it was added in Python 3.8). See #162 for more details.
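A common way to keep a Literal annotation importable on Python 3.7 is a version-guarded import. The sketch below shows that general pattern only; it is not necessarily the change made in #162. It assumes the typing_extensions backport is installed on interpreters older than 3.8, and `Direction` and `describe` are hypothetical names used purely for illustration.

```python
import sys

if sys.version_info >= (3, 8):
    from typing import Literal
else:
    # Assumption: the `typing_extensions` backport is installed on Python 3.7.
    from typing_extensions import Literal

# Hypothetical type alias, only here to show the import in use.
Direction = Literal["minimize", "maximize"]


def describe(direction: Direction) -> str:
    """Return a short description of an optimisation direction."""
    return f"direction={direction}"
```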


08 Nov 14:11

lewtun

v0.4.0

c47598c

v0.4.0 Differentiable heads & various quality of life improvements

Differentiable heads for `SetFitModel`

@blakechi has implemented a differentiable head in PyTorch for SetFitModel that enables the model to be trained end-to-end. The implementation is backwards compatible with the scikit-learn heads and can be activated by setting use_differentiable_head=True when loading SetFitModel. Here's a full example:

from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss

from setfit import SetFitModel, SetFitTrainer


# Load a dataset from the Hugging Face Hub
dataset = load_dataset("sst2")

# Simulate the few-shot regime by sampling 8 examples per class
num_classes = 2
train_dataset = dataset["train"].shuffle(seed=42).select(range(8 * num_classes))
eval_dataset = dataset["validation"]

# Load a SetFit model from Hub
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",
    use_differentiable_head=True,
    head_params={"out_features": num_classes},
)

# Create trainer
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss_class=CosineSimilarityLoss,
    metric="accuracy",
    batch_size=16,
    num_iterations=20, # The number of text pairs to generate for contrastive learning
    num_epochs=1, # The number of epochs to use for contrastive learning
    column_mapping={"sentence": "text", "label": "label"} # Map dataset columns to text/label expected by trainer
)

# Train and evaluate
trainer.freeze() # Freeze the head
trainer.train() # Train only the body

# Unfreeze the head and freeze the body -> head-only training
trainer.unfreeze(keep_body_frozen=True)
# or
# Unfreeze the head and unfreeze the body -> end-to-end training
trainer.unfreeze(keep_body_frozen=False)

trainer.train(
    num_epochs=25, # The number of epochs to train the head or the whole model (body and head)
    batch_size=16,
    body_learning_rate=1e-5, # The body's learning rate
    learning_rate=1e-2, # The head's learning rate
    l2_weight=0.0, # Weight decay on **both** the body and head. If `None`, will use 0.01.
)
metrics = trainer.evaluate()

# Push model to the Hub
trainer.push_to_hub("my-awesome-setfit-model")

# Download from Hub and run inference
model = SetFitModel.from_pretrained("lewtun/my-awesome-setfit-model")
# Run inference
preds = model(["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"])

Bug fixes and improvements

add num_epochs to train_step calculation by @PhilipMay in #139
Support for the differentiable head by @blakechi in #112
redirect call to predict by @PhilipMay in #142
fix: templated examples copy empty vector by @pdhall99 in #148
Add support to kwargs in compute() method called by trainer.evaluate() by @mpangrazzi in #125
Small fix on hyperparameter search by @Mouhanedg56 in #150
Fix typo: temerature => temperature by @tomaarsen in #155
Add the usage and relevant info. of the differentiable head to README by @blakechi in #149
Fix non default loss_class issue by @PhilipMay in #154
Add sampling function & update notebooks by @lewtun in #146
Fix typos: image(s) -> sentence(s) by @victorjmarin in #160
Add more loss function options by @PhilipMay in #159

Significant community contributions

The following contributors have made significant changes to the library over the last release:

@pdhall99
- fix: allow load of pretrained model without head
- fix: templated examples copy empty vector (#148)
@PhilipMay
- add num_epochs to train_step calculation (#139)
- redirect call to predict (#142)
- Fix non default loss_class issue (#154)
- Add more loss function options (#159)
@blakechi
- Support for the differentiable head (#112)
- Add the usage and relevant info. of the differentiable head to README (#149)
@mpangrazzi
- Add support to kwargs in compute() method called by trainer.evaluate() (#125)

Contributors

PhilipMay, mpangrazzi, and 6 other contributors


14 Oct 10:45

lewtun

v0.3.0

db300c9

v0.3.0 Improved hyperparameter search

This release includes improvements to the hyperparameter_search() function of SetFitTrainer, along with several small fixes in saving fine-tuned models.

Thanks to @sanderland, @bradleyfowler123, and @Mouhanedg56 for their contributions 🤗!

Contributors

bradleyfowler123, Mouhanedg56, and sanderland


11 Oct 15:40

lewtun

v0.2.0

3fa0e35

v0.2.0 Hyperparameter search and multilabel text classification

This release comes with two main features:

- Support for training models on multilabel text classification datasets
- An Optuna integration to run hyperparameter search over both the SetFitModel head and the hyperparameters used during training
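To make the second feature more concrete, here is a minimal sketch of a search space written in Optuna's define-by-run style, where a function receives a trial object and asks it for values. The parameter names and ranges are illustrative assumptions, not the library's defaults, and `LowerBoundTrial` is a hypothetical stand-in that lets the sketch run without Optuna installed.

```python
def hp_space(trial):
    """Describe a search space in Optuna's define-by-run style.

    `trial` is expected to behave like an `optuna.trial.Trial`.
    The parameter names and ranges below are illustrative assumptions.
    """
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "num_epochs": trial.suggest_int("num_epochs", 1, 5),
        "batch_size": trial.suggest_categorical("batch_size", [4, 8, 16, 32]),
    }


class LowerBoundTrial:
    """Hypothetical stand-in for an Optuna trial.

    It always returns the lower bound of a range or the first choice.
    """

    def suggest_float(self, name, low, high, log=False):
        return low

    def suggest_int(self, name, low, high):
        return low

    def suggest_categorical(self, name, choices):
        return choices[0]


params = hp_space(LowerBoundTrial())
```

With Optuna installed, a real trial would sample a different combination on each call. See the library documentation for how such a function is passed to hyperparameter_search().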

Significant community contributions

The following contributors have made significant changes to the library over the last release:

Contributors

mpangrazzi and Mouhanedg56


06 Oct 10:38

lewtun

v0.1.1

c0522a9

v0.1.1 Patch release

Fixes a bug where the column mapping checks threw an error when a column mapping wasn't provided for datasets with valid column names.

See #82 for more details.
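The intended behaviour can be sketched as a small check. This is a hypothetical illustration of the rule described above, not the library's actual code: `validate_columns` and `REQUIRED_COLUMNS` are assumed names, and the check passes when no mapping is given as long as the dataset already uses the expected column names.

```python
REQUIRED_COLUMNS = {"text", "label"}


def validate_columns(column_names, column_mapping=None):
    """Raise ValueError unless the (optionally renamed) columns include text and label."""
    if column_mapping is None:
        # No mapping given: the dataset must already use the expected names.
        available = set(column_names)
    else:
        unknown = set(column_mapping) - set(column_names)
        if unknown:
            raise ValueError(f"Mapping refers to unknown columns: {sorted(unknown)}")
        # Rename mapped columns and keep the rest under their own names.
        available = {column_mapping.get(name, name) for name in column_names}
    missing = REQUIRED_COLUMNS - available
    if missing:
        raise ValueError(f"Missing required columns: {sorted(missing)}")
```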


04 Oct 07:18

lewtun

v0.1.0

53196ff

v0.1.0 Column mapping for SetFitTrainer

Column mapping for `SetFitTrainer`

The SetFitTrainer assumes that the training and evaluation datasets contain text and label columns. Previously, this required users to manually rename their dataset columns before creating the trainer. In #75 we added support for users to specify the column mapping directly in the trainer.
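As an illustration of what a column mapping expresses, the sketch below renames columns in plain Python dictionaries. Keys are the dataset's column names and values are the names the trainer expects, as in the mapping `{"sentence": "text", "label": "label"}`. The helper `apply_column_mapping` is a hypothetical name for this sketch and does not show how the trainer applies the mapping internally.

```python
def apply_column_mapping(rows, column_mapping):
    """Return new rows that keep only the mapped columns, renamed per the mapping."""
    return [
        {target: row[source] for source, target in column_mapping.items()}
        for row in rows
    ]


# A toy row with an extra column that the mapping does not mention.
rows = [{"sentence": "a great movie", "label": 1, "idx": 0}]
mapped = apply_column_mapping(rows, {"sentence": "text", "label": "label"})
```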
