Add AI2ARC dataset #8502
Pull Request Overview
This PR adds support for the AI2 Reasoning Challenge (ARC) dataset via a new `AI2ARC` class and related utilities.
- Introduces `AI2ARC` to load and process the ARC-Challenge and ARC-Easy subsets into DSPy examples.
- Adds `ai2_arc_metric` and `parse_arc_answer` functions for evaluation and parsing model outputs.
- Exposes `AI2ARC` in `dspy.datasets` and provides convenience loaders `ARC_Challenge` and `ARC_Easy`.
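For readers without the diff at hand, the parsing and metric utilities described above could look roughly like the following. This is a minimal sketch, not the code from this PR: only the names `parse_arc_answer` and `ai2_arc_metric` and the `(gold, pred, trace=None)` signature come from the review, while the `answer` field name, the accepted label set (A-E or 1-5), and the regular expressions are assumptions made for illustration.

```python
import re
from types import SimpleNamespace

# Assumed label set: letters A-E or digits 1-5.
_LABEL = r"([A-E]|[1-5])"


def parse_arc_answer(text):
    """Extract a choice label from free-form model output, or return None."""
    cleaned = text.strip().upper()
    # Case 1: the output is only the label, optionally wrapped: "B", "(B)", "B."
    match = re.fullmatch(rf"\(?{_LABEL}\)?[.:]?", cleaned)
    if match:
        return match.group(1)
    # Case 2: a phrase such as "Answer: B" or "The answer is (B)".
    match = re.search(rf"ANSWER\s*(?:IS|:)?\s*\(?{_LABEL}\b", cleaned)
    return match.group(1) if match else None


def ai2_arc_metric(gold, pred, trace=None):
    """Exact match between the parsed prediction and the gold label."""
    # `answer` is an assumed field name on both objects.
    predicted = parse_arc_answer(str(pred.answer))
    return predicted is not None and predicted == str(gold.answer).strip().upper()


# SimpleNamespace stands in for example/prediction objects here.
gold = SimpleNamespace(answer="B")
pred = SimpleNamespace(answer="The answer is (B)")
print(ai2_arc_metric(gold, pred))  # True
```

Returning `None` for unparseable output keeps a malformed prediction from accidentally matching a gold label; the metric then scores it as incorrect.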
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `dspy/datasets/ai2_arc.py` | New dataset loader, processing logic, metric/parsing utilities |
| `dspy/datasets/__init__.py` | Registers `AI2ARC` in the module's public API |
Comments suppressed due to low confidence (2)
dspy/datasets/ai2_arc.py:144
- [nitpick] Function names should follow snake_case per PEP 8. Rename `ARC_Challenge` to `arc_challenge` (and similarly `ARC_Easy`).
def ARC_Challenge():
dspy/datasets/ai2_arc.py:84
- Consider adding unit tests for `ai2_arc_metric` and `parse_arc_answer` to verify that answer parsing and evaluation behave as expected.
def ai2_arc_metric(gold, pred, trace=None):
This is pretty interesting, I am down to check this one in.
I will defer to @okhat on whether we want to expand the dataset collection.
To my knowledge, the datasets we have in the repo only exist to maintain some past example notebooks/tests (though maybe we should actually just remove them, given that we've ported these into tutorials; e.g. I don't see any instances of ...). It's probably just cleaner to refactor out these datasets and their references in tutorials (so just adding ai2arc for demonstrative purposes in a tutorial) and only keep `Dataset` and `DataLoader` in `dspy/datasets/`.
yeah it’s rare that we add datasets anymore 😅
Oh, thank you for the context. So we won't add more datasets and will deprecate the existing datasets someday?
Add a dataset for https://huggingface.co/datasets/allenai/ai2_arc
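
As a rough illustration of the processing step such a loader needs, the sketch below flattens one ARC-style record into a question string and an answer label. It assumes the record layout shown on the dataset card (`question`, `choices` with parallel `text` and `label` lists, and `answerKey`); the helper name `format_arc_record` and the sample record are hypothetical and are not taken from this PR or from the dataset.

```python
def format_arc_record(record):
    """Flatten an ARC-style record into question and answer fields (hypothetical helper)."""
    labels = record["choices"]["label"]
    texts = record["choices"]["text"]
    # Render each option as "<label>. <text>" on its own line.
    options = "\n".join(f"{label}. {text}" for label, text in zip(labels, texts))
    return {
        "question": f"{record['question']}\n{options}",
        "answer": record["answerKey"],
    }


# Made-up record in the assumed layout, for illustration only.
sample = {
    "id": "example-1",
    "question": "Which object is attracted to a magnet?",
    "choices": {
        "text": ["a wooden spoon", "an iron nail", "a rubber band", "a glass cup"],
        "label": ["A", "B", "C", "D"],
    },
    "answerKey": "B",
}

formatted = format_arc_record(sample)
print(formatted["question"])
print(formatted["answer"])  # B
```

In practice the records would likely come from the Hugging Face `datasets` library, e.g. `load_dataset("allenai/ai2_arc", "ARC-Challenge")`, which needs network access and is left out here so the sketch runs offline.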