    Repositories list

    • [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
      Python
      830500 Updated Oct 14, 2025
    • RuleR

      Public
      [NAACL'25] RuleR: Improving LLM Controllability by Rule-based Data Recycling
      Python
      11410 Updated Sep 27, 2025
    • Mosaic-IT

      Public
      [ACL'25] Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning
      Python
      32000 Updated Sep 27, 2025
    • [NeurIPS'25] ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness
      Python
      02700 Updated Sep 27, 2025
    • DisCL

      Public
      [ICCV 2025] Diffusion Curriculum (DisCL)
      Jupyter Notebook
      01320 Updated Sep 26, 2025
    • FaSTAR

      Public
      Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing
      Jupyter Notebook
      22800 Updated Jun 27, 2025
    • [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
      Python
      1417900 Updated Jun 25, 2025
    • [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
      Python
      2639810 Updated Jun 25, 2025
    • [COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
      Python
      13510 Updated Jun 5, 2025
    • C3PO

      Public
      [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"
      Jupyter Notebook
      11810 Updated Apr 9, 2025
    • CoSTAR

      Public
      Cost-Sensitive Toolpath Agent for Multi-turn Image Editing
      Jupyter Notebook
      12300 Updated Mar 26, 2025
    • R2-T2

      Public
      [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"
      Python
      21610 Updated Mar 10, 2025
    • MosT

      Public
      Code for "Many-objective multi-solution transport"
      Python
      0200 Updated Feb 28, 2025
    • BenTo

      Public
      [ICLR 2025] "BENTO: benchmark reduction with in-context learning transferability"
      Python
      0400 Updated Oct 18, 2024
    • [ICLR 2025 Oral] "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"
      Python
      118211 Updated Oct 15, 2024
    • DEBATunE

      Public
      [ACL'24] Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements
      Python
      22200 Updated Sep 14, 2024
    • [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
      Python
      3036300 Updated Sep 6, 2024
    • mctune

      Public
      [ACL'24] Multi-Objective Linguistic Control of Large Language Models
      Python
      0200 Updated Jun 30, 2024