Data, scripts, and recipes for Sci2Pol-Bench, a comprehensive benchmark for evaluating large language models.
The data consists of policy briefs obtained from Nature Energy, Nature Climate, Nature Cities, and the Journal of Health and Social Behavior Policy Briefs.
Policy briefs were originally introduced in the Nature Energy journal, which describes their goal as follows:
"This format aims to provide policy professionals with accessible summaries of research papers published in our journal, written by the paper’s authors on invitation by our editors."
Finding the scientific paper behind each policy brief is relatively straightforward for the Nature-x briefs, since each brief states "based on: title doi" for the associated paper. The Journal of Health and Social Behavior Policy Briefs follow a similar principle, but there the original paper had to be found manually by searching the list of articles written by the brief's authors. No title disambiguation was necessary, since most papers had the same title as the corresponding policy brief, or a very similar one.
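As an illustration of the "based on: title doi" lookup for the Nature-x briefs, the sketch below pulls the title and DOI out of a single line. It is a minimal example, not the script used to build the dataset: the function name `parse_based_on`, the exact layout of the line, and the sample DOI are assumptions made for illustration only.

```python
import re

# Assumption: the line looks like "Based on: <title> <doi>", where the DOI may
# be bare ("10.xxxx/...") or written as a doi.org URL.
DOI_RE = re.compile(
    r"(?:https?://(?:dx\.)?doi\.org/)?(10\.\d{4,9}/\S+)", re.IGNORECASE
)


def parse_based_on(line):
    """Return (title, doi) from a 'based on' line, or None if it does not match."""
    prefix, sep, rest = line.partition(":")
    if not sep or prefix.strip().lower() != "based on":
        return None
    match = DOI_RE.search(rest)
    if match is None:
        return None
    # Everything before the DOI is treated as the paper title.
    title = rest[:match.start()].strip(" .,;")
    # Drop punctuation that may trail the DOI at the end of a sentence.
    doi = match.group(1).rstrip(".,;)")
    return title, doi


# Hypothetical input; the title and DOI are made up.
example = "Based on: A hypothetical paper title https://doi.org/10.1234/example.5678"
result = parse_based_on(example)
```

For the Journal of Health and Social Behavior briefs there is no such line to parse, so a helper like this would not apply; those papers were matched by hand as described above.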
TBA
Weimin Wu, Alexander Furnas, Eddie, Akhil, Guo Ye, Xuefeng Song.