Open
Description
I'm planning on generating some training data (hopefully on the order of tens of thousands of pages, depending on cost) using olmocr/data/buildsilver.py (at least, I assume this is how the data was generated). I've been running this on a lot of medicinal-chemistry-style papers, and it's been struggling with them.
If you are open to me donating the data, I can use open-access papers exclusively for this; otherwise, I'll just throw what I have into a private mix.