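A curated list of research on music generation and related tasks: song generation, controllable generation and editing, symbolic music, representation learning, transcription, source separation, and evaluation.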
-
InspireMusic: Integrating Super Resolution and Large Language Model for High-Fidelity Long-Form Music Generation [2025] [Alibaba] [Paper] [Code] [Demo]
-
FluxMusic: FLUX that Plays Music [2024] [Skywork] [Paper] [Code]
-
MusicGen: Simple and Controllable Music Generation [2024] [Meta] [Paper] [Code]
-
MusicLM: Generating Music From Text [2023] [Google] [Paper]
-
LeVo: High-Quality Song Generation with Multi-Preference Alignment [2025] [Tencent] [Paper] [GitHub] [Demo]
-
SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement [2025] [CUHK-Shenzhen] [Paper] [Demo] [GitHub]
-
ACE-Step: A Step Towards Music Generation Foundation Model [2025] [GitHub]
-
YuE: Scaling Open Foundation Models for Long-Form Music Generation [2025] [m-a-p] [Paper] [Code] [Demo]
-
DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion [2025] [ASLP-lab] [Paper] [Code] [Demo] [HuggingFace]
-
SongCreator: Lyrics-based Universal Song Generation [Paper] [Demo]
-
SongEditor: Adapting Zero-Shot Song Generation Language Model as a Multi-Task Editor [Paper] [Demo]
-
MuseControlLite: Multifunctional Music Generation with Lightweight Conditioners [2025] [Paper]
-
Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer [2024] [ICASSP] [Paper]
-
Music ControlNet: Multiple Time-Varying Controls for Music Generation [2024] [Demo]
-
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning [2024] [Paper]
-
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation [2024] [Demo]
-
FastSAG: Towards fast non-autoregressive singing accompaniment generation [2024] [Paper] [Code]
-
SingSong: Generating musical accompaniments from singing [2023] [Google] [Paper]
-
CSL-L2M: Controllable Song-Level Lyric-to-Melody Generation Based on Conditional Transformer with Fine-Grained Lyric and Musical Controls [2025] [AAAI] [Paper] [Code] [Demo]
-
SongComposer: A large language model for lyric and melody composition in song generation [2024] [Paper] [GitHub]
-
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms [2025] [Tsinghua] [Paper] [Code] [Demo]
-
MuPT: A generative symbolic music pretrained transformer [2024] [m-a-p] [Paper] [Demo]
-
MuQ: Self-supervised music representation learning with mel residual vector quantization [2025] [Tencent] [Code]
-
MusicFM: A foundation model for music informatics [2024] [Paper] [Code]
-
MERT: Acoustic music understanding model with large-scale self-supervised training [2023] [m-a-p] [Paper] [GitHub]
-
YourMT3+: Multi-Instrument Music Transcription with Enhanced Transformer Architectures and Cross-Dataset STEM Augmentation [2024] [Paper] [GitHub]
-
Perceiver TF: Multitrack music transcription with a time-frequency perceiver [2023] [ByteDance] [ICASSP] [Paper]
-
MT3: Multi-task multitrack music transcription [2021] [Paper]
-
SCNet: Sparse compression network for music source separation [2025] [ICASSP] [Paper]
-
Music source separation with band-split RoPE transformer [2024] [ICASSP] [Paper]
-
Music source separation with band-split RNN [2023] [TASLP] [Paper]
-
SongEval: A Benchmark Dataset for Song Aesthetics Evaluation [2025] [ASLP-lab] [Paper] [GitHub] [Dataset]
-
MusicEval: A Generative Music Corpus with Expert Ratings for Automatic Text-to-Music Evaluation [2025] [AISHELL] [Paper] [Dataset]
-
Frechet Music Distance: A Metric For Generative Symbolic Music Evaluation [2024] [Paper] [GitHub]