[ICML 2025 Spotlight] MODA: MOdular Duplex Attention for Multimodal Perception, Cognition, and Emotion Understanding


KwaiVGI/MODA


MODA: MOdular Duplex Attention for Multimodal Perception, Cognition, and Emotion Understanding

Zhicheng Zhang1,2,†, Wuyou Xia1, Chenxi Zhao1,†, Yan Zhou3, Xiaoqiang Liu3, Yongjie Zhu3,‡, Wenyu Qin3, Pengfei Wan3, Di Zhang3, Jufeng Yang1,2,✉
1 Nankai University, 2 Pengcheng Laboratory, 3 Kuaishou Technology

† Work done at KlingAI, ‡ Project Leader, ✉ Corresponding Author

🎉 Accepted by ICML 2025 Spotlight 🎉

[📃 Paper] [📦 Code] [⚒️ Project] [📅 Slide]

TL;DR: We (i) identify attention deficit disorder as a critical barrier to fine-grained content understanding in MLLMs; (ii) introduce a modular duplex attention mechanism to mitigate modality bias and enhance attention score justification; and (iii) develop MODA-based MLLMs that enable fine-grained multimodal understanding across perception, cognition, and emotion tasks.

📈 1. News

  • 🔥2025-07-10: Repository created. The code is being uploaded ...
  • 2025-05-01: MODA has been accepted to ICML 2025!
