
Regarding the validation and test set split of datasets like MultiArith/SVAMP #3

@hccngu

Hello, I see in your paper that for datasets like MultiArith/SVAMP, you randomly sampled 500 data points to serve as a validation set and used the rest as the test set. Have you made this validation/test split public, or the corresponding index files? I only found val_index.npy for gsm8k, and it samples only 200 data points from the training set. That does not seem consistent with what the paper describes as "sampling 500 data points from the test set to serve as the validation set".
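
To make the request concrete, here is a minimal sketch of the kind of seeded split and index file I have in mind. The function name `make_val_test_split`, the seed, and the dataset size of 1000 are my own placeholders, not values taken from the paper or the repository; only the validation size of 500 and the file name val_index.npy come from the text above.

```python
import numpy as np


def make_val_test_split(n_total, n_val=500, seed=0):
    """Split range(n_total) into disjoint validation and test index arrays.

    Hypothetical helper: the seed and sizes are illustrative assumptions,
    not the settings actually used by the authors.
    """
    if n_val > n_total:
        raise ValueError("n_val cannot exceed n_total")
    rng = np.random.default_rng(seed)   # fixed seed makes the split reproducible
    perm = rng.permutation(n_total)     # random ordering of all example indices
    val_idx = np.sort(perm[:n_val])     # first n_val shuffled indices -> validation
    test_idx = np.sort(perm[n_val:])    # everything else -> test
    return val_idx, test_idx


# Placeholder dataset size; replace with the real number of examples.
val_idx, test_idx = make_val_test_split(n_total=1000, n_val=500, seed=0)

# Saving the indices would let others reproduce the exact split:
# np.save("val_index.npy", val_idx)
```

Publishing the saved index array, or the seed together with the sampling code, would be enough for others to evaluate on exactly the same test examples.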
