HypoEval: Hypothesis-Guided Evaluation for Natural Language Generation (Evaluators)

This repository provides off-the-shelf zero-shot LLM-based evaluators for summarization and story generation, from the paper HypoEval: Hypothesis-Guided Evaluation for Natural Language Generation. For more details on the HypoEval implementation, reproducing results, and adding new evaluated aspects, please refer to ChicagoHAI/HypoEval-Gen.

Updates

May '25: HypoEval is now incorporated into quotient-ai/judges!

Usage

To evaluate summaries on an aspect in ["coherence", "consistency", "informativeness", "fluency", "relevance"]:

from hypoeval.evaluator import SummaryEvaluator

# model_path is optional; specify it only when loading a local model
evaluator = SummaryEvaluator(model_name=MODEL_NAME, model_path=MODEL_PATH)
evaluated_aspect = "coherence"
summary_list = ["...", "..."]
source_text_list = ["...", "..."]
# returns one score per (summary, source text) pair
evaluation_scores = evaluator.batched_evaluate(aspect=evaluated_aspect, summaries=summary_list, source_texts=source_text_list)
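
For example, a minimal end-to-end call might look like the sketch below. The model name "gpt-4o-mini" and the assumption that batched_evaluate returns one float per summary are illustrative, not confirmed by this README:

# Minimal sketch, assuming the evaluator accepts an API model name such as
# "gpt-4o-mini" (assumption) and batched_evaluate returns a list of floats.
from hypoeval.evaluator import SummaryEvaluator

evaluator = SummaryEvaluator(model_name="gpt-4o-mini")
summaries = ["Sales rose 10% in Q2, driven by new products.", "Sales went up."]
sources = ["Quarterly report: sales increased 10% in Q2, largely due to the new product line.",
           "Quarterly report: sales increased 10% in Q2, largely due to the new product line."]
scores = evaluator.batched_evaluate(aspect="coherence", summaries=summaries, source_texts=sources)
for summary, score in zip(summaries, scores):
    print(f"{score:.2f}  {summary}")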

To evaluate stories on an aspect in ["coherence", "cohesiveness", "complexity", "empathy", "engagement", "grammaticality", "likability", "relevance", "surprise"]:

from hypoeval.evaluator import StoryEvaluator

# model_path is optional; specify it only when loading a local model
evaluator = StoryEvaluator(model_name=MODEL_NAME, model_path=MODEL_PATH)
evaluated_aspect = "coherence"
story_list = ["...", "..."]
story_prompt_list = ["...", "..."]
# returns one score per (story, story prompt) pair
evaluation_scores = evaluator.batched_evaluate(aspect=evaluated_aspect, stories=story_list, story_prompts=story_prompt_list)
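
To score the same stories on several aspects, one option is to loop over the aspect list. A sketch under the same assumptions as above (illustrative API model name, list-of-floats return value):

# Minimal sketch: score each story on several aspects in turn.
from hypoeval.evaluator import StoryEvaluator

evaluator = StoryEvaluator(model_name="gpt-4o-mini")  # model name is an assumption
stories = ["Once upon a time, a fox learned to read...", "The robot woke up alone..."]
prompts = ["Write a fairy tale about an animal.", "Write a story about a robot."]
for aspect in ["coherence", "engagement", "surprise"]:
    scores = evaluator.batched_evaluate(aspect=aspect, stories=stories, story_prompts=prompts)
    print(aspect, [round(s, 2) for s in scores])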

Citation

Please consider citing our work if it contributes to your research:

@misc{li2025hypoevalhypothesisguidedevaluationnatural,
      title={HypoEval: Hypothesis-Guided Evaluation for Natural Language Generation}, 
      author={Mingxuan Li and Hanchen Li and Chenhao Tan},
      year={2025},
      eprint={2504.07174},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.07174}, 
}
