The poor STS correlation is because SMART uses an MSE loss when computing its adversarial loss.
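A minimal sketch of what this looks like, assuming a one-step SMART-style perturbation where `model` maps input embeddings directly to predictions (the function and parameter names here are hypothetical, not the repository's actual API). For STS, a regression task, the clean and perturbed predictions are compared with MSE:

```python
import torch
import torch.nn.functional as F

def smart_adv_loss(model, embeds, step_size=1e-3, noise_eps=1e-5):
    """One-step SMART-style adversarial regularizer (sketch).

    Compares predictions on clean vs. perturbed embeddings with MSE,
    the divergence choice discussed above for the STS head.
    """
    clean = model(embeds).detach()
    # start from a small random perturbation of the embeddings
    noise = (torch.randn_like(embeds) * noise_eps).requires_grad_()
    adv_loss = F.mse_loss(model(embeds + noise), clean)
    grad, = torch.autograd.grad(adv_loss, noise)
    # step the perturbation in the direction that increases the loss
    noise = (noise + step_size * grad.sign()).detach()
    return F.mse_loss(model(embeds + noise), clean)
```

The returned value is added to the task loss as a smoothness regularizer; swapping the MSE for a different divergence only requires changing the two `F.mse_loss` calls.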
We did not change it **yet**.
### Final model
We combined some of our results in the final model.
This could be achieved by generating more (true) data from the datasets sst and
- give other losses different weights.
- with or without combined losses.
- maybe based on dev_acc performance in the previous epoch.
- implement SMART for BERT-STS
- Dropout and weight decay tuning for BERT (AdamW and Sophia)
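The loss-weighting ideas above can be sketched in a few lines. This is a hypothetical illustration, not code from the repository: weights are derived from the previous epoch's dev accuracies so that weaker tasks get a larger share of the combined loss.

```python
def accuracy_based_weights(dev_accs):
    """Hypothetical weighting: give tasks with lower dev accuracy
    a proportionally larger share of the combined loss."""
    inverse = [1.0 - acc for acc in dev_accs]
    total = sum(inverse)
    return [x / total for x in inverse]

def combined_loss(task_losses, weights):
    """Weighted sum of the per-task losses."""
    return sum(w * l for w, l in zip(weights, task_losses))
```

For example, with dev accuracies of 0.5 and 0.75, the first (harder) task would receive twice the weight of the second.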
## Member Contributions
Dawor, Moataz: Generalisations on Custom Attention, Split and reordered batches, analysis_dataset
Lübbers, Christopher L.: Part 1 complete; Part 2: sBERT, Tensorboard (metrics + profiler), sBERT-Baseline, SOPHIA, SMART, Optuna, sBERT-Optuna for Optimizer, Optuna for sBERT and BERT-SMART, Optuna for sBERT-regularization, sBERT with combined losses, sBERT with gradient surgery, README for those tasks
Niegsch, Lukas*: Generalisations on Custom Attention, Split and reordered batches,
Schmidt, Finn Paul:
## Submit commands
To train Sophia base with the optimized parameters: