This repository corresponds to the paper:
A Position Paper on the Automatic Generation of Machine Learning Leaderboards
Figure 1: A conceptual diagram illustrating the flow of information extraction from scientific papers to generate leaderboards.
This curated list contains research papers and resources that explore various methods for extracting leaderboard tuples from scientific literature. The collection will be continually updated to support the ongoing leaderboard survey paper.
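As a rough illustration of what a "leaderboard tuple" can look like in practice, the sketch below models one extracted result as a (task, dataset, metric, score) record and groups such records into ranked leaderboards. The class name `LeaderboardTuple`, the `paper` field, the helper `build_leaderboards`, and all sample values are hypothetical placeholders chosen for this example; they are not taken from any of the listed papers or their code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaderboardTuple:
    """One extracted result: (task, dataset, metric, score) plus its source."""
    task: str
    dataset: str
    metric: str
    score: float
    paper: str  # hypothetical field: identifier of the source paper


def build_leaderboards(tuples, higher_is_better=True):
    """Group tuples by (task, dataset, metric) and rank entries by score."""
    boards = defaultdict(list)
    for t in tuples:
        boards[(t.task, t.dataset, t.metric)].append(t)
    return {
        key: sorted(entries, key=lambda t: t.score, reverse=higher_is_better)
        for key, entries in boards.items()
    }


# Placeholder data for illustration only; not taken from any real paper.
extracted = [
    LeaderboardTuple("Task X", "Dataset Y", "Accuracy", 81.5, "Paper A"),
    LeaderboardTuple("Task X", "Dataset Y", "Accuracy", 84.2, "Paper B"),
    LeaderboardTuple("Task X", "Dataset Z", "F1", 70.0, "Paper A"),
]

leaderboards = build_leaderboards(extracted)
for (task, dataset, metric), entries in leaderboards.items():
    print(f"{task} / {dataset} / {metric}")
    for rank, entry in enumerate(entries, start=1):
        print(f"  {rank}. {entry.paper}: {entry.score}")
```

Ranking assumes a single direction per call (`higher_is_better`); a real pipeline would likely need a per-metric setting, since metrics such as error rate improve as they decrease.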
- 📝 Hou et al. (2019)
  Identification of tasks, datasets, evaluation metrics, and numeric scores for scientific leaderboard construction.
  🔗 Read the Paper
- 📝 Singh et al. (2019)
  Automated early leaderboard generation from comparative tables.
  🔗 Read the Paper
- 📝 Kardas et al. (2020)
  AxCell: Automatic extraction of results from machine learning papers.
  🔗 Read the Paper
- 📝 Jain et al. (2020)
  SciREX: A challenge dataset for document-level information extraction.
  🔗 Read the Paper
- 📝 Kabongo et al. (2021)
  Automated mining of leaderboards for empirical AI research.
  🔗 Read the Paper
- 📝 Yang et al. (2022)
  TELIN: Table entity linker for extracting leaderboards from machine learning publications.
  🔗 Read the Paper
- 📝 Kabongo et al. (2023)
  ORKG-Leaderboards: A systematic workflow for mining leaderboards as a knowledge graph.
  🔗 Read the Paper
- 📝 Kabongo et al. (2024)
  Effective context selection in LLM-based leaderboard generation: An empirical study.
  🔗 Read the Paper
- 📝 Singh et al. (2024)
  LegoBench: Scientific leaderboard generation benchmark.
  🔗 Read the Paper
- 📝 Şahinuç et al. (2024)
  Efficient performance tracking: Leveraging large language models for automated construction of scientific leaderboards.
  🔗 Read the Paper
📰 Paper | 🔍 Method | 🔥 LLMP | 👥 HIL | 🔗 CR | 📄 Doc-TAET | 🧾 Doc-REC | 📑 Full Paper | 📊 Tab | 🧠 NLP Models |
---|---|---|---|---|---|---|---|---|---|
Hou et al. (2019) | TDMS-IE | ✗ | ✗ | ✗ | ✓ | BERT | |||
Singh et al. (2019) | PIG | ✗ | ✗ | ✗ | ✓ | N/A | |||
Kardas et al. (2020) | AXCELL | ✗ | ✗ | ✗ | ∧ | ∧ | ULMFiT, BM25 | ||
Jain et al. (2020) | SCIREX-IE | ✗ | ✗ | ✓ | ✓ | SciBERT ∨ BiLSTM | |||
Kabongo et al. (2021) | ORKG-TDM | ✗ | ✗ | ✗ | ✓ | XLNet ∨ SciBERT ∨ BERTbase | |||
Yang et al. (2022) | TELIN | ✗ | ✓ | ✗ | ∧ | ∧ | |||
Kabongo et al. (2023) | ORKG-LB | ✗ | ✗ | ✗ | ✓ | BERT ∨ SciBERT ∨ XLNet ∨ BigBird | |||
Kabongo et al. (2024) | TDMS-PR | ✓ | ✗ | ✗ | ∨ | ∨ | ∨ | Llama 2 ∨ Mistral | |
Singh et al. (2024) | MS-PR | ✓ | ✗ | ✗ | ✓ | Falcon ∨ Galactica ∨ Llama ∨ Mistral ∨ Vicuna ∨ Zephyr ∨ Gemini ∨ GPT-4 | |||
Şahinuç et al. (2024) | TDMR-PR | ✓ | ✗ | ✗ | ✗ | ✗ | ✓ | Llama 2 ∨ Llama 3 ∨ Mixtral ∨ GPT-4 |
🔍 Method | 📑 Paper | 🌍 Open Domain | 🛠 Works w/o Tables | 📊 Extract all Results |
---|---|---|---|---|
TDMS-IE | Hou et al. (2019) | ✗ | ✓ | ✗ |
PIG | Singh et al. (2019) | ✗ | ||
AXCELL | Kardas et al. (2020) | ✗ | ✗ | ✗ |
SCIREX-IE | Jain et al. (2020) | ✓ | ✓ | - |
ORKG-TDM | Kabongo et al. (2021) | ✗ | ✓ | - |
TELIN | Yang et al. (2022) | ✓ | ✗ | ✗ |
ORKG-LB | Kabongo et al. (2023) | ✗ | ✓ | - |
TDMS-LLMP | Kabongo et al. (2024) | ✓ | ✓ | |
MS-LLMP | Singh et al. (2024) | ✓ | ✓ | |
TDMR-LLMP | Şahinuç et al. (2024) | ✓ | ✓ | ✗ |
To contribute new research papers or make updates:
- Fork the repository. 🍴
- Create a new branch with your changes. 🌿
- Submit a pull request with your additions. ✅
This page serves as an evolving resource to advance research on the automated extraction of leaderboard results. Contributions are encouraged to maintain a comprehensive and up-to-date list. 🎓