Multitask learning reference repositories

Surveys

  • Kouw, W. M., & Loog, M. (2018). An introduction to domain adaptation and transfer learning. arXiv. paper
  • Vandenhende, S., Georgoulis, S., Van Gansbeke, W., Proesmans, M., Dai, D., & Van Gool, L. (2021). Multi-task learning for dense prediction tasks: A survey. T-PAMI. paper
  • Zhang, Y., & Yang, Q. (2021). A survey on multi-task learning. IEEE Transactions on Knowledge and Data Engineering.
  • Jiang et al. (2022). Transferability in deep learning: A survey. arXiv.
  • Lin, B., & Zhang, Y. (2022). LibMTL: A Python library for multi-task learning. JMLR. paper

Multitask Learning Basics

  • Caruana, R. (1997). Multitask learning. Machine Learning. paper (the shared-trunk architecture is sketched after this list)
  • Caruana, R. (1996). Algorithms and applications for multitask learning. In ICML. paper
  • Duong et al. (2015). Low resource dependency parsing: Cross-lingual parameter sharing in a neural network parser. In ACL.
  • Yang, Y., & Hospedales, T. (2016). Deep multi-task representation learning: A tensor factorisation approach. ICLR. paper
  • GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. ICLR 2019. paper
  • SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems. NeurIPS 2019. paper
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018. paper
  • Multi-task Sequence to Sequence Learning. ICLR 2016. paper
  • The natural language decathlon: Multitask learning as question answering. arXiv 2018. paper
  • Understanding and Improving Information Transfer in Multi-Task Learning. ICLR 2020. paper
  • Multi-Task Deep Neural Networks for Natural Language Understanding. ACL 2019. paper
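
Most of the classic work above builds on hard parameter sharing: tasks share a common trunk and keep task-specific output heads (Caruana, 1997). A minimal PyTorch sketch, with the two-layer trunk and all dimensions chosen purely for illustration:

```python
import torch.nn as nn

class HardSharingMTL(nn.Module):
    """Hard parameter sharing: one shared trunk, one head per task."""
    def __init__(self, in_dim, hidden_dim, task_out_dims):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_dim, d) for d in task_out_dims]
        )

    def forward(self, x):
        h = self.trunk(x)                         # shared representation
        return [head(h) for head in self.heads]   # one output per task
```

Training typically minimizes a (possibly weighted) sum of per-task losses over the per-task outputs.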

Task Relatedness

Theoretical notions of task relatedness.

  • Ben-David, S., & Schuller, R. (2003). Exploiting task relatedness for multiple task learning. In Learning Theory and Kernel Machines. paper
  • Ben-David et al. (2010). A theory of learning from different domains. Machine Learning. paper
  • Hanneke, S., & Kpotufe, S. (2019). On the value of target data in transfer learning. NeurIPS. paper
  • Du et al. (2020). Few-shot learning via learning the representation, provably. ICLR. paper

Measures of task relatedness in deep neural networks.

Gradients

  • Yu et al. (2020). Gradient surgery for multi-task learning. NeurIPS. paper (projection sketched below)
  • Dery et al. (2021). Auxiliary task update decomposition: The good, the bad and the neutral. ICLR. paper
  • Chen et al. (2021). Weighted training for cross-task learning. ICLR. paper
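
The gradient-surgery idea from Yu et al. (PCGrad) is mechanically simple: whenever two task gradients conflict (negative inner product), project one onto the normal plane of the other before combining. A minimal sketch, assuming each task's gradient has already been flattened into a single vector:

```python
import random
import torch

def pcgrad_combine(task_grads):
    """PCGrad-style combination of flattened per-task gradients."""
    projected = [g.clone() for g in task_grads]
    for i, g_i in enumerate(projected):
        others = [g for j, g in enumerate(task_grads) if j != i]
        random.shuffle(others)               # random projection order
        for g_j in others:
            dot = torch.dot(g_i, g_j)
            if dot < 0:                      # gradients conflict
                g_i -= dot / g_j.norm() ** 2 * g_j  # remove conflicting component
    return torch.stack(projected).mean(dim=0)  # combined update direction
```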

Predicted probabilities between tasks

  • Nguyen et al. (2020). LEEP: A new measure to evaluate transferability of learned representations. ICML. paper (see the sketch after this list)

  • Identifying beneficial task relations for multi-task learning in deep neural networks. EACL 2017. paper
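
LEEP needs only the source model's predicted probabilities on the target data: it forms an empirical joint distribution between source-model outputs z and target labels y, and scores transfer by the average log-likelihood of the resulting "expected empirical predictor". A NumPy sketch, where `source_probs` is assumed to hold the source model's softmax outputs on the target examples:

```python
import numpy as np

def leep_score(source_probs, target_labels, num_target_classes):
    """LEEP (Nguyen et al., 2020): higher means easier transfer."""
    n, num_source_classes = source_probs.shape
    # Empirical joint distribution P(y, z) of target labels and source outputs.
    joint = np.zeros((num_target_classes, num_source_classes))
    for probs, y in zip(source_probs, target_labels):
        joint[y] += probs
    joint /= n
    cond = joint / joint.sum(axis=0, keepdims=True)  # P(y | z)
    # Expected empirical predictor: sum_z P(y | z) * theta(x)_z.
    eep = source_probs @ cond.T                      # (n, num_target_classes)
    return np.mean(np.log(eep[np.arange(n), target_labels]))
```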

Task affinity

  • Standley et al. (2020). Which tasks should be learned together in multi-task learning? ICML. paper
  • Fifty et al. (2021). Efficiently identifying task groupings for multi-task learning. NeurIPS. paper (a one-step affinity sketch follows)
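
Fifty et al.'s inter-task affinity asks whether a lookahead gradient step on task i reduces task j's loss. A one-step sketch; `loss_i` and `loss_j` are hypothetical callables mapping (model, batch) to scalar losses, and a plain SGD step stands in for the real optimizer:

```python
import copy
import torch

def inter_task_affinity(model, loss_i, loss_j, batch, lr=1e-3):
    """Positive affinity: task i's update helped task j on this batch."""
    base_j = loss_j(model, batch).item()
    lookahead = copy.deepcopy(model)       # don't disturb the real model
    lookahead.zero_grad()
    loss_i(lookahead, batch).backward()
    with torch.no_grad():
        for p in lookahead.parameters():
            if p.grad is not None:
                p -= lr * p.grad           # one SGD step on task i only
    return 1.0 - loss_j(lookahead, batch).item() / base_j
```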

Multitask Learning Architectures

Mixture-of-Experts

  • Ma et al. (2018). Modeling task relationships in multi-task learning with multi-gate mixture-of-experts. In KDD. paper (sketched after this list)
  • Hazimeh, H., Zhao, Z., Chowdhery, A., Sathiamoorthy, M., Chen, Y., Mazumder, R., ... & Chi, E. (2021). Dselect-k: Differentiable selection in the mixture of experts with applications to multi-task learning. NeurIPS. paper
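
In the multi-gate mixture-of-experts (MMoE) of Ma et al., all tasks share a pool of experts, but each task learns its own softmax gate over them, so tasks can agree or disagree about which experts to use. A minimal sketch with scalar regression heads; all sizes are illustrative:

```python
import torch
import torch.nn as nn

class MMoE(nn.Module):
    """Multi-gate mixture-of-experts: shared experts, per-task gates."""
    def __init__(self, in_dim, expert_dim, num_experts, num_tasks):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim, expert_dim), nn.ReLU())
             for _ in range(num_experts)]
        )
        self.gates = nn.ModuleList(
            [nn.Linear(in_dim, num_experts) for _ in range(num_tasks)]
        )
        self.heads = nn.ModuleList(
            [nn.Linear(expert_dim, 1) for _ in range(num_tasks)]
        )

    def forward(self, x):
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, D)
        outputs = []
        for gate, head in zip(self.gates, self.heads):
            w = torch.softmax(gate(x), dim=-1).unsqueeze(-1)  # (B, E, 1)
            outputs.append(head((w * expert_out).sum(dim=1))) # task output
        return outputs
```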

Branching

  • Guo et al. (2020). Learning to branch for multi-task learning. In ICML. paper
  • Lu et al. (2017). Fully-adaptive feature sharing in multi-task networks with applications in person attribute classification. In CVPR. paper
  • Huang et al. (2018). GNAS: A greedy neural architecture search method for multi-attribute learning. In ACM MM.
  • Ruder et al. (2019). Latent multi-task architecture learning. In AAAI.

Soft-parameter sharing

  • Liu et al. (2019). End-to-end multi-task learning with attention. In CVPR. paper

  • Cross-stitch Networks for Multi-task Learning. CVPR 2016. paper (the cross-stitch unit is sketched after this list)

  • Gated multi-task network for text classification. NAACL 2018. paper

  • A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks. EMNLP 2017. paper

  • Latent Multi-task Architecture Learning. AAAI 2019. paper

  • Learning Multiple Tasks with Multilinear Relationship Networks. NIPS 2017. paper
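
As a concrete soft-sharing mechanism, a cross-stitch unit (Misra et al., CVPR 2016) inserts a small learned mixing matrix between two task networks at a chosen layer, letting each network draw on the other's activations. A minimal two-task sketch, with the near-identity initialization chosen for illustration:

```python
import torch
import torch.nn as nn

class CrossStitch(nn.Module):
    """Learned 2x2 linear mixing of two task networks' activations."""
    def __init__(self):
        super().__init__()
        # Near-identity init: each task starts mostly private.
        self.alpha = nn.Parameter(torch.tensor([[0.9, 0.1],
                                                [0.1, 0.9]]))

    def forward(self, x_a, x_b):
        out_a = self.alpha[0, 0] * x_a + self.alpha[0, 1] * x_b
        out_b = self.alpha[1, 0] * x_a + self.alpha[1, 1] * x_b
        return out_a, out_b
```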

Optimization Methods for Multi-Task Learning

  • Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. CVPR 2018. paper
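
Kendall et al. weight each task loss with a learned homoscedastic uncertainty: with a log-variance s_t per task, the combined objective is (up to constants) sum_t exp(-s_t) * L_t + s_t, so noisier tasks are automatically down-weighted while the s_t term keeps weights from collapsing to zero. A minimal sketch:

```python
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    """Learned loss weighting via per-task log-variances s_t."""
    def __init__(self, num_tasks):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        total = 0.0
        for s, loss in zip(self.log_vars, task_losses):
            total = total + torch.exp(-s) * loss + s
        return total
```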

Benchmarks

  • GLUE: Natural Language Understanding
  • decaNLP: 10 NLP Tasks

Software and Open-Source Libraries

  • LibMTL: an open-source library built on PyTorch for multitask learning.

Meta Learning

Survey

Meta-Learning in Neural Networks: A Survey. paper

Black-Box Approaches

Recurrent Neural Network

(MANN) Meta-learning with memory-augmented neural networks. ICML 2016. paper

Attention-Based Network

Matching Networks for One-Shot Learning. NIPS 2016. paper

(SNAIL) A Simple Neural Attentive Meta-Learner. ICLR 2018. paper

Optimization-Based Methods

(MAML) Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. ICML 2017. paper

(Reptile; First-order method) On First-Order Meta-Learning Algorithms. arXiv 2018. paper
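
Both methods optimize the same bi-level objective: adapt to each task on a support set in an inner loop, then update the shared initialization so the post-adaptation query loss is low. A minimal MAML sketch using torch.func.functional_call (PyTorch 2.x); setting first_order=True drops the second-order terms, as in first-order MAML:

```python
import torch
from torch.func import functional_call

def maml_meta_loss(model, loss_fn, tasks, inner_lr=0.01, first_order=False):
    """One MAML meta-objective over tasks of (x_support, y_support,
    x_query, y_query); backpropagate the result with a meta-optimizer."""
    params = dict(model.named_parameters())
    meta_loss = 0.0
    for x_s, y_s, x_q, y_q in tasks:
        # Inner loop: one gradient step on the task's support set.
        support_loss = loss_fn(functional_call(model, params, (x_s,)), y_s)
        grads = torch.autograd.grad(
            support_loss, list(params.values()),
            create_graph=not first_order)  # keep graph for second-order MAML
        adapted = {name: p - inner_lr * g
                   for (name, p), g in zip(params.items(), grads)}
        # Outer objective: post-adaptation loss on the query set.
        meta_loss = meta_loss + loss_fn(
            functional_call(model, adapted, (x_q,)), y_q)
    return meta_loss / len(tasks)
```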

Other Forms of Prior on MAML

(Implicit MAML) Meta-Learning with Implicit Gradients. NeurIPS 2019. paper

(Implicit Differentiation; SVM) Meta-Learning with Differentiable Convex Optimization. CVPR 2019. paper

(Bayesian linear regression) Meta-Learning Priors for Efficient Online Bayesian Regression. Workshop on the Algorithmic Foundations of Robotics 2018. paper

(Ridge regression; Logistic regression) Meta-learning with Differentiable Closed-Form Solvers. ICLR 2019. paper

Understanding MAML

(MAML expressive power and universality) Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm. ICLR 2018. paper

(Map MAML to Bayes Framework) Recasting Gradient-Based Meta-Learning as Hierarchical Bayes. ICLR 2018. paper

Tricks to Optimize MAML

Choose an architecture that is effective for the inner gradient step

Auto-Meta: Automated Gradient Based Meta Learner Search. NeurIPS 2018 Workshop on Meta-Learning. paper

Automatically learn a per-parameter inner learning rate and tune the outer learning rate

Alpha MAML: Adaptive Model-Agnostic Meta-Learning. ICML 2019 Workshop on Automated Machine Learning. paper

Meta-SGD: Learning to Learn Quickly for Few-Shot Learning. arXiv 2017. paper

Optimize only a subset of the parameters in the inner loop

(DEML) Deep Meta-Learning: Learning to Learn in the Concept Space. arXiv 2018. paper

(CAVIA) Fast Context Adaptation via Meta-Learning. ICML 2019. paper

Decouple the inner learning rate and batch-norm statistics per step

(MAML++) How to train your MAML. ICLR 2019. paper

Introduce context variables for increased expressive power

(CAVIA) Fast Context Adaptation via Meta-Learning. ICML 2019. paper

(Bias transformation) Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm. ICLR 2018. paper

Non-Parametric Methods via Metric Learning

Siamese Neural Networks for One-shot Image Recognition. ICML 2015. paper

Matching Networks for One-Shot Learning. NIPS 2016. paper

Prototypical Networks for Few-shot Learning. NIPS 2017. paper
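
Prototypical networks make the metric-learning recipe explicit: embed the support set, average each class's embeddings into a prototype, and classify queries by (negative squared) distance to the prototypes. A minimal sketch; `embed` is any embedding network:

```python
import torch

def prototypical_logits(embed, support_x, support_y, query_x, num_classes):
    """Logits are negative squared distances to class prototypes."""
    z_support = embed(support_x)                   # (N_support, D)
    z_query = embed(query_x)                       # (N_query, D)
    prototypes = torch.stack([
        z_support[support_y == c].mean(dim=0)      # mean embedding per class
        for c in range(num_classes)
    ])                                             # (C, D)
    return -torch.cdist(z_query, prototypes) ** 2  # (N_query, C)
```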

Learn non-linear relation module on embeddings

Learning to Compare: Relation Network for Few-Shot Learning. CVPR 2018. paper

Learn infinite mixture of prototypes

Infinite Mixture Prototypes for Few-Shot Learning. ICML 2019. paper

Perform message passing on embeddings

Few-Shot Learning with Graph Neural Networks. ICLR 2018. paper

Bayesian Meta-Learning & Generative Models

Amortized Inference

Amortized Bayesian Meta-Learning. ICLR 2019. paper

Ensemble Method

Bayesian Model-Agnostic Meta-Learning. NeurIPS 2018. paper

Sampling & Hybrid Inference

Probabilistic Model-Agnostic Meta-Learning. NeurIPS 2018. paper

Meta-Learning Probabilistic Inference for Prediction. ICLR 2019. paper

Hybrid Meta-Learning Approaches

Meta-Learning with Latent Embedding Optimization. ICLR 2019. paper

Fast Context Adaptation via Meta-Learning. ICML 2019. paper

Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. ICLR 2020. paper

Few-Shot Learning with Graph Neural Networks. ICLR 2018. paper

(CAML) Learning to Learn with Conditional Class Dependencies. ICLR 2019. paper

Meta Reinforcement Learning

Policy Gradient RL

MAML and black-box meta-learning approaches can be applied directly to policy-gradient RL methods.

Value-Based RL

It is not straightforward to apply existing meta-learning approaches to value-based RL, because value-based RL is a dynamic programming method rather than a gradient-descent procedure.

Meta-Q-Learning. ICLR 2020. paper

(Goal-conditioned RL with hindsight relabeling; multi-task RL) Hindsight Experience Replay. NIPS 2017. paper

(learn from unstructured play data) Learning Latent Plans from Play. CoRL 2019. paper

(learn a better goal representation)

Universal Planning Networks. ICML 2018. paper

Unsupervised Visuomotor Control through Distributional Planning Networks. RSS 2019. paper

Applications

Meta-Learning for Low-Resource Neural Machine Translation. EMNLP 2018. paper

Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions. ICLR 2018. paper

One-Shot Imitation Learning. NIPS 2017. paper

Massively Multitask Networks for Drug Discovery. ICML 2015. paper
