Multitask learning reference repositories

Surveys

  • Kouw, W. M., & Loog, M. (2018). An introduction to domain adaptation and transfer learning. arXiv. paper
  • Vandenhende, S., Georgoulis, S., Van Gansbeke, W., Proesmans, M., Dai, D., & Van Gool, L. (2021). Multi-task learning for dense prediction tasks: A survey. T-PAMI. paper
  • Zhang, Y., & Yang, Q. (2021). A survey on multi-task learning. IEEE Transactions on Knowledge and Data Engineering.
  • Jiang et al. (2022). Transferability in deep learning: A survey. arXiv.
  • Lin, B., & Zhang, Y. (2022). LibMTL: A Python library for multi-task learning. JMLR. paper

Multitask Learning Basics

  • Caruana, R. (1997). Multitask learning. Machine Learning. paper (the shared-trunk architecture is sketched after this list)
  • Caruana, R. (1996). Algorithms and applications for multitask learning. In ICML. paper
  • Duong et al. (2015). Low resource dependency parsing: Cross-lingual parameter sharing in a neural network parser. In ACL.
  • Yang, Y., & Hospedales, T. (2016). Deep multi-task representation learning: A tensor factorisation approach. ICLR. paper
  • GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. ICLR 2019. paper
  • SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems. NeurIPS 2019. paper
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018. paper
  • Multi-task Sequence to Sequence Learning. ICLR 2016. paper
  • The natural language decathlon: Multitask learning as question answering. arXiv 2018. paper
  • Understanding and Improving Information Transfer in Multi-Task Learning. ICLR 2020. paper
  • Multi-Task Deep Neural Networks for Natural Language Understanding. ACL 2019. paper
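
Most of the classic work above builds on hard parameter sharing: tasks share a common trunk and keep task-specific output heads (Caruana, 1997). A minimal PyTorch sketch, with the two-layer trunk and all dimensions chosen purely for illustration:

```python
import torch.nn as nn

class HardSharingMTL(nn.Module):
    """Hard parameter sharing: one shared trunk, one head per task."""
    def __init__(self, in_dim, hidden_dim, task_out_dims):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_dim, d) for d in task_out_dims]
        )

    def forward(self, x):
        h = self.trunk(x)                         # shared representation
        return [head(h) for head in self.heads]   # one output per task
```

Training typically minimizes a (possibly weighted) sum of per-task losses over the per-task outputs.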

Task Relatedness

Theoretical notions of task relatedness.

  • Ben-David, S., & Schuller, R. (2003). Exploiting task relatedness for multiple task learning. In Learning Theory and Kernel Machines. paper
  • Ben-David et al. (2010). A theory of learning from different domains. Machine Learning. paper
  • Hanneke, S., & Kpotufe, S. (2019). On the value of target data in transfer learning. NeurIPS. paper
  • Du et al. (2020). Few-shot learning via learning the representation, provably. ICLR. paper

Measures of task relatedness in deep neural networks.

Gradients

  • Yu et al. (2020). Gradient surgery for multi-task learning. NeurIPS. paper (projection sketched below)
  • Dery et al. (2021). Auxiliary task update decomposition: The good, the bad and the neutral. ICLR. paper
  • Chen et al. (2021). Weighted training for cross-task learning. ICLR. paper
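
The gradient-surgery idea from Yu et al. (PCGrad) is mechanically simple: whenever two task gradients conflict (negative inner product), project one onto the normal plane of the other before combining. A minimal sketch, assuming each task's gradient has already been flattened into a single vector:

```python
import random
import torch

def pcgrad_combine(task_grads):
    """PCGrad-style combination of flattened per-task gradients."""
    projected = [g.clone() for g in task_grads]
    for i, g_i in enumerate(projected):
        others = [g for j, g in enumerate(task_grads) if j != i]
        random.shuffle(others)               # random projection order
        for g_j in others:
            dot = torch.dot(g_i, g_j)
            if dot < 0:                      # gradients conflict
                g_i -= dot / g_j.norm() ** 2 * g_j  # remove conflicting component
    return torch.stack(projected).mean(dim=0)  # combined update direction
```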

Predicted probabilities between tasks

  • Nguyen et al. (2020). LEEP: A new measure to evaluate transferability of learned representations. ICML. paper (see the sketch after this list)

  • Identifying beneficial task relations for multi-task learning in deep neural networks. EACL 2017. paper
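
LEEP needs only the source model's predicted probabilities on the target data: it forms an empirical joint distribution between source-model outputs z and target labels y, and scores transfer by the average log-likelihood of the resulting "expected empirical predictor". A NumPy sketch, where `source_probs` is assumed to hold the source model's softmax outputs on the target examples:

```python
import numpy as np

def leep_score(source_probs, target_labels, num_target_classes):
    """LEEP (Nguyen et al., 2020): higher means easier transfer."""
    n, num_source_classes = source_probs.shape
    # Empirical joint distribution P(y, z) of target labels and source outputs.
    joint = np.zeros((num_target_classes, num_source_classes))
    for probs, y in zip(source_probs, target_labels):
        joint[y] += probs
    joint /= n
    cond = joint / joint.sum(axis=0, keepdims=True)  # P(y | z)
    # Expected empirical predictor: sum_z P(y | z) * theta(x)_z.
    eep = source_probs @ cond.T                      # (n, num_target_classes)
    return np.mean(np.log(eep[np.arange(n), target_labels]))
```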

Task affinity

  • Standley et al. (2020). Which tasks should be learned together in multi-task learning? ICML. paper
  • Fifty et al. (2021). Efficiently identifying task groupings for multi-task learning. NeurIPS. paper (a one-step affinity sketch follows)
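
Fifty et al.'s inter-task affinity asks whether a lookahead gradient step on task i reduces task j's loss. A one-step sketch; `loss_i` and `loss_j` are hypothetical callables mapping (model, batch) to scalar losses, and a plain SGD step stands in for the real optimizer:

```python
import copy
import torch

def inter_task_affinity(model, loss_i, loss_j, batch, lr=1e-3):
    """Positive affinity: task i's update helped task j on this batch."""
    base_j = loss_j(model, batch).item()
    lookahead = copy.deepcopy(model)       # don't disturb the real model
    lookahead.zero_grad()
    loss_i(lookahead, batch).backward()
    with torch.no_grad():
        for p in lookahead.parameters():
            if p.grad is not None:
                p -= lr * p.grad           # one SGD step on task i only
    return 1.0 - loss_j(lookahead, batch).item() / base_j
```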

Multitask Learning Architectures

Mixture-of-Experts

  • Ma et al. (2018). Modeling task relationships in multi-task learning with multi-gate mixture-of-experts. In KDD. paper (sketched after this list)
  • Hazimeh, H., Zhao, Z., Chowdhery, A., Sathiamoorthy, M., Chen, Y., Mazumder, R., ... & Chi, E. (2021). Dselect-k: Differentiable selection in the mixture of experts with applications to multi-task learning. NeurIPS. paper
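
In the multi-gate mixture-of-experts (MMoE) of Ma et al., all tasks share a pool of experts, but each task learns its own softmax gate over them, so tasks can agree or disagree about which experts to use. A minimal sketch with scalar regression heads; all sizes are illustrative:

```python
import torch
import torch.nn as nn

class MMoE(nn.Module):
    """Multi-gate mixture-of-experts: shared experts, per-task gates."""
    def __init__(self, in_dim, expert_dim, num_experts, num_tasks):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim, expert_dim), nn.ReLU())
             for _ in range(num_experts)]
        )
        self.gates = nn.ModuleList(
            [nn.Linear(in_dim, num_experts) for _ in range(num_tasks)]
        )
        self.heads = nn.ModuleList(
            [nn.Linear(expert_dim, 1) for _ in range(num_tasks)]
        )

    def forward(self, x):
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, D)
        outputs = []
        for gate, head in zip(self.gates, self.heads):
            w = torch.softmax(gate(x), dim=-1).unsqueeze(-1)  # (B, E, 1)
            outputs.append(head((w * expert_out).sum(dim=1))) # task output
        return outputs
```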

Branching

  • Guo et al. (2020). Learning to branch for multi-task learning. In ICML. paper
  • Lu et al. (2017). Fully-adaptive feature sharing in multi-task networks with applications in person attribute classification. In CVPR. paper
  • Huang et al. (2018). GNAS: A greedy neural architecture search method for multi-attribute learning. In ACM MM.
  • Ruder et al. (2019). Latent multi-task architecture learning. In AAAI.

Soft-parameter sharing

  • Liu et al. (2019). End-to-end multi-task learning with attention. In CVPR. paper

  • Cross-stitch Networks for Multi-task Learning. CVPR 2016. paper (the cross-stitch unit is sketched after this list)

  • Gated multi-task network for text classification. NAACL 2018. paper

  • A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks. EMNLP 2017. paper

  • Latent Multi-task Architecture Learning. AAAI 2019. paper

  • Learning Multiple Tasks with Multilinear Relationship Networks. NIPS 2017. paper
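
As a concrete soft-sharing mechanism, a cross-stitch unit (Misra et al., CVPR 2016) inserts a small learned mixing matrix between two task networks at a chosen layer, letting each network draw on the other's activations. A minimal two-task sketch, with the near-identity initialization chosen for illustration:

```python
import torch
import torch.nn as nn

class CrossStitch(nn.Module):
    """Learned 2x2 linear mixing of two task networks' activations."""
    def __init__(self):
        super().__init__()
        # Near-identity init: each task starts mostly private.
        self.alpha = nn.Parameter(torch.tensor([[0.9, 0.1],
                                                [0.1, 0.9]]))

    def forward(self, x_a, x_b):
        out_a = self.alpha[0, 0] * x_a + self.alpha[0, 1] * x_b
        out_b = self.alpha[1, 0] * x_a + self.alpha[1, 1] * x_b
        return out_a, out_b
```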

Optimization Methods for Multi-Task Learning

  • Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. CVPR 2018. paper
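
Kendall et al. weight each task loss with a learned homoscedastic uncertainty: with a log-variance s_t per task, the combined objective is (up to constants) sum_t exp(-s_t) * L_t + s_t, so noisier tasks are automatically down-weighted while the s_t term keeps weights from collapsing to zero. A minimal sketch:

```python
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    """Learned loss weighting via per-task log-variances s_t."""
    def __init__(self, num_tasks):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for s, loss in zip(self.log_vars, task_losses):
            total = total + torch.exp(-s) * loss + s
        return total
```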

Benchmarks

  • GLUE: Natural Language Understanding
  • decaNLP: 10 NLP Tasks

Software and Open-Source Libraries

  • LibMTL: an open-source library built on PyTorch for multitask learning.

Meta Learning

Survey

Meta-Learning in Neural Networks: A Survey. paper

Black-Box Approaches

Recurrent Neural Network

(MANN) Meta-learning with memory-augmented neural networks. ICML 2016. paper

Attention-Based Network

Matching Networks for One-Shot Learning. NIPS 2016. paper

(SNAIL) A Simple Neural Attentive Meta-Learner. ICLR 2018. paper

Optimization-Based Methods

(MAML) Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. ICML 2017. paper

(Reptile; First-order method) On First-Order Meta-Learning Algorithms. arXiv 2018. paper
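
Both methods optimize the same bi-level objective: adapt to each task on a support set in an inner loop, then update the shared initialization so the post-adaptation query loss is low. A minimal MAML sketch using torch.func.functional_call (PyTorch 2.x); setting first_order=True drops the second-order terms, as in first-order MAML:

```python
import torch
from torch.func import functional_call

def maml_meta_loss(model, loss_fn, tasks, inner_lr=0.01, first_order=False):
    """One MAML meta-objective over tasks of (x_support, y_support,
    x_query, y_query); backpropagate the result with a meta-optimizer."""
    params = dict(model.named_parameters())
    meta_loss = 0.0
    for x_s, y_s, x_q, y_q in tasks:
        # Inner loop: one gradient step on the task's support set.
        support_loss = loss_fn(functional_call(model, params, (x_s,)), y_s)
        grads = torch.autograd.grad(
            support_loss, list(params.values()),
            create_graph=not first_order)  # keep graph for second-order MAML
        adapted = {name: p - inner_lr * g
                   for (name, p), g in zip(params.items(), grads)}
        # Outer objective: post-adaptation loss on the query set.
        meta_loss = meta_loss + loss_fn(
            functional_call(model, adapted, (x_q,)), y_q)
    return meta_loss / len(tasks)
```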

Other Forms of Prior on MAML

(Implicit MAML) Meta-Learning with Implicit Gradients. NeurIPS 2019. paper

(Implicit Differentiation; SVM) Meta-Learning with Differentiable Convex Optimization. CVPR 2019. paper

(Bayesian linear regression) Meta-Learning Priors for Efficient Online Bayesian Regression. Workshop on the Algorithmic Foundations of Robotics 2018. paper

(Ridge regression; Logistic regression) Meta-learning with Differentiable Closed-Form Solvers. ICLR 2019. paper

Understanding MAML

(MAML expressive power and universality) Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm. ICLR 2018. paper

(Map MAML to Bayes Framework) Recasting Gradient-Based Meta-Learning as Hierarchical Bayes. ICLR 2018. paper

Tricks to Optimize MAML

Choose an architecture that is effective for the inner gradient step

Auto-Meta: Automated Gradient Based Meta Learner Search. NeurIPS 2018 Workshop on Meta-Learning. paper

Automatically learn a per-parameter inner learning rate and tune the outer learning rate

Alpha MAML: Adaptive Model-Agnostic Meta-Learning. ICML 2019 Workshop on Automated Machine Learning. paper

Meta-SGD: Learning to Learn Quickly for Few-Shot Learning. arXiv 2017. paper

Optimize only a subset of the parameters in the inner loop

(DEML) Deep Meta-Learning: Learning to Learn in the Concept Space. arXiv 2018. paper

(CAVIA) Fast Context Adaptation via Meta-Learning. ICML 2019. paper

Decouple the inner learning rate and batch-norm statistics per step

(MAML++) How to train your MAML. ICLR 2019. paper

Introduce context variables for increased expressive power

(CAVIA) Fast Context Adaptation via Meta-Learning. ICML 2019. paper

(Bias transformation) Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm. ICLR 2018. paper

Non-Parametric Methods via Metric Learning

Siamese Neural Networks for One-shot Image Recognition. ICML 2015. paper

Matching Networks for One-Shot Learning. NIPS 2016. paper

Prototypical Networks for Few-shot Learning. NIPS 2017. paper
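
Prototypical networks make the metric-learning recipe explicit: embed the support set, average each class's embeddings into a prototype, and classify queries by (negative squared) distance to the prototypes. A minimal sketch; `embed` is any embedding network:

```python
import torch

def prototypical_logits(embed, support_x, support_y, query_x, num_classes):
    """Logits are negative squared distances to class prototypes."""
    z_support = embed(support_x)                   # (N_support, D)
    z_query = embed(query_x)                       # (N_query, D)
    prototypes = torch.stack([
        z_support[support_y == c].mean(dim=0)      # mean embedding per class
        for c in range(num_classes)
    ])                                             # (C, D)
    return -torch.cdist(z_query, prototypes) ** 2  # (N_query, C)
```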

Learn non-linear relation module on embeddings

Learning to Compare: Relation Network for Few-Shot Learning. CVPR 2018. paper

Learn infinite mixture of prototypes

Infinite Mixture Prototypes for Few-Shot Learning. ICML 2019. paper

Perform message passing on embeddings

Few-Shot Learning with Graph Neural Networks. ICLR 2018. paper

Bayesian Meta-Learning & Generative Models

Amortized Inference

Amortized Bayesian Meta-Learning. ICLR 2019. paper

Ensemble Method

Bayesian Model-Agnostic Meta-Learning. NeurIPS 2018. paper

Sampling & Hybrid Inference

Probabilistic Model-Agnostic Meta-Learning. NeurIPS 2018. paper

Meta-Learning Probabilistic Inference for Prediction. ICLR 2019. paper

Hybrid Meta-Learning Approaches

Meta-Learning with Latent Embedding Optimization. ICLR 2019. paper

Fast Context Adaptation via Meta-Learning. ICML 2019. paper

Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. ICLR 2020. paper

Few-Shot Learning with Graph Neural Networks. ICLR 2018. paper

(CAML) Learning to Learn with Conditional Class Dependencies. ICLR 2019. paper

Meta Reinforcement Learning

Policy Gradient RL

MAML and black-box meta-learning approaches can be applied directly to policy-gradient RL methods.

Value-Based RL

It is not straightforward to apply existing meta-learning approaches to value-based RL, because value-based RL is a dynamic programming method rather than a gradient-descent procedure.

Meta-Q-Learning. ICLR 2020. paper

(Goal-conditioned RL with hindsight relabeling; multi-task RL) Hindsight Experience Replay. NIPS 2017. paper

(learn from unstructured play data) Learning Latent Plans from Play. CoRL 2019. paper

(learn a better goal representation)

Universal Planning Networks. ICML 2018. paper

Unsupervised Visuomotor Control through Distributional Planning Networks. RSS 2019. paper

Applications

Meta-Learning for Low-Resource Neural Machine Translation. EMNLP 2018. paper

Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions. ICLR 2018. paper

One-Shot Imitation Learning. NIPS 2017. paper

Massively Multitask Networks for Drug Discovery. ICML 2015. paper
