23 repositories
FTIIBench (Public)
GuessArena (Public): [ACL 2025] GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning
QAEncoder (Public)
RolePlay_LLMDoctor (Public): - Academic Survey Paper Generation.
SEAP (Public)
Meta-Chunking (Public)
UHGEval (Public): [ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.
MARA (Public)
MaintainCoder (Public)
CRUD_RAG (Public): CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models
xVerify (Public): xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
SafeRAG (Public)
Awesome-Attention-Heads (Public): An awesome repository & A comprehensive survey on interpretability of LLM attention heads.
xFinder (Public): [ICLR 2025] xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation
ICSFSurvey (Public): Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.
FastMem (Public)
DATG (Public): [ACL 2024] Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs
CTGSurvey (Public): Controllable Text Generation for Large Language Models: A Survey
PGRAG (Public)
NewsBench (Public): [ACL 2024 Main] NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism
Grimoire (Public)
UHGEval-dataset (Public): The full pipeline of creating UHGEval hallucination dataset