Tianyi Lab @ UMD

18 repositories

HallusionBench
Public
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
benchmark, benchmarks, lmm, hallucination, gpt-4, large-language-models, llm, llava, large-vision-language-models, vlms
Python • BSD 3-Clause "New" or "Revised" License • 8 • 305 • 0 • 0 • Updated Oct 14, 2025
RuleR
Public
[NAACL'25] RuleR: Improving LLM Controllability by Rule-based Data Recycling
Python • 1 • 14 • 1 • 0 • Updated Sep 27, 2025
Mosaic-IT
Public
[ACL'25] Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning
Python • 3 • 20 • 0 • 0 • Updated Sep 27, 2025
ColorBench
Public
[NeurIPS'25] ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness
Python • Apache License 2.0 • 0 • 27 • 0 • 0 • Updated Sep 27, 2025
DisCL
Public
[ICCV 2025] Diffusion Curriculum (DisCL)
Jupyter Notebook • 0 • 13 • 2 • 0 • Updated Sep 26, 2025
FaSTAR
Public
Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing
Jupyter Notebook • BSD 3-Clause "New" or "Revised" License • 2 • 28 • 0 • 0 • Updated Jun 27, 2025
Superfiltering
Public
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
Python • 14 • 179 • 0 • 0 • Updated Jun 25, 2025
Cherry_LLM
Public
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
Python • 26 • 398 • 1 • 0 • Updated Jun 25, 2025
MiP-Overthinking
Public
[COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
Python • MIT License • 1 • 35 • 1 • 0 • Updated Jun 5, 2025
C3PO
Public
[COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"
Jupyter Notebook • Apache License 2.0 • 1 • 18 • 1 • 0 • Updated Apr 9, 2025
CoSTAR
Public
Cost-Sensitive Toolpath Agent for Multi-turn Image Editing
Jupyter Notebook • BSD 3-Clause "New" or "Revised" License • 1 • 23 • 0 • 0 • Updated Mar 26, 2025
R2-T2
Public
[ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"
Python • MIT License • 2 • 16 • 1 • 0 • Updated Mar 10, 2025
MosT
Public
Code for "Many-objective multi-solution transport"
Python • 0 • 2 • 0 • 0 • Updated Feb 28, 2025
BenTo
Public
[ICLR 2025] "BENTO: benchmark reduction with in-context learning transferability"
Python • Apache License 2.0 • 0 • 4 • 0 • 0 • Updated Oct 18, 2024
MoE-Embedding
Public
[ICLR 2025 Oral] "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"
Python • 11 • 82 • 1 • 1 • Updated Oct 15, 2024
DEBATunE
Public
[ACL'24] Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements
Python • 2 • 22 • 0 • 0 • Updated Sep 14, 2024
Reflection_Tuning
Public
[ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
Python • 30 • 363 • 0 • 0 • Updated Sep 6, 2024
mctune
Public
[ACL'24] Multi-Objective Linguistic Control of Large Language Models
Python • 0 • 2 • 0 • 0 • Updated Jun 30, 2024