Automatically download research data linked to your bibliography references.
civic-paperkit fetches datasets, CSVs, and supplementary materials referenced in your academic papers.
It reads your .bib file and a companion metadata file to archive all your sources.
pip install civic-paperkit
- Create a refs_meta.yaml file mapping your BibTeX keys to data sources:
# refs_meta.yaml
cdc_pmdr:
  notes: "CDC Maternal Mortality Rates 2018-2022"
  assets:
    - url: "https://data.cdc.gov/api/views/e2d5-ggg7/rows.csv"
      filename: "maternal_mortality.csv"
smith2024:
  assets:
    - page_url: "https://example.org/supplementary"
      allow_ext: [".csv", ".xlsx", ".zip"]
- Run the tool:
ci-paperkit --bib paper/refs.bib --meta paper/refs_meta.yaml
- Find your data in data/raw/<bibkey>/
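To illustrate how metadata entries relate to the data/raw/<bibkey>/ layout, here is a minimal sketch. It is not civic-paperkit's actual code: the `plan_downloads` helper, the `META` dictionary (a hand-written stand-in for a parsed refs_meta.yaml), and the fallback to the last URL segment when `filename` is missing are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical in-memory form of refs_meta.yaml, as a YAML loader might return it.
META = {
    "cdc_pmdr": {
        "notes": "CDC Maternal Mortality Rates 2018-2022",
        "assets": [
            {
                "url": "https://data.cdc.gov/api/views/e2d5-ggg7/rows.csv",
                "filename": "maternal_mortality.csv",
            }
        ],
    },
    "smith2024": {
        "assets": [
            {
                "page_url": "https://example.org/supplementary",
                "allow_ext": [".csv", ".xlsx", ".zip"],
            }
        ],
    },
}


def plan_downloads(meta, out_root="data/raw"):
    """Return (url, destination path) pairs for direct-URL assets."""
    plan = []
    for bibkey, entry in meta.items():
        for asset in entry.get("assets", []):
            url = asset.get("url")
            if url is None:
                # page_url assets would need scraping first; skipped in this sketch.
                continue
            # Assumption: use the last URL path segment when no filename is given.
            name = asset.get("filename") or url.rstrip("/").rsplit("/", 1)[-1]
            plan.append((url, Path(out_root) / bibkey / name))
    return plan


for url, dest in plan_downloads(META):
    print(f"{url} -> {dest.as_posix()}")
```

Each direct URL is paired with a destination under its citation key, which is the organization the steps above describe.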
- Download direct file URLs (CSV, Excel, PDF, etc.)
- Scrape pages for data files
- Organize downloads by citation key
- Verify checksums (optional)
- Configure output directories
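Two of the features above, scraping a page for data files and verifying checksums, can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the tool's implementation: `LinkCollector`, `find_data_links`, and `sha256_matches` are hypothetical names, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_data_links(html, page_url, allow_ext):
    """Return absolute links whose URL path ends with an allowed extension."""
    parser = LinkCollector()
    parser.feed(html)
    allowed = tuple(ext.lower() for ext in allow_ext)
    links = []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        # Compare against the path only, so query strings do not hide the extension.
        if urlparse(absolute).path.lower().endswith(allowed):
            links.append(absolute)
    return links


def sha256_matches(data: bytes, expected_hex: str) -> bool:
    """Check downloaded bytes against an expected SHA-256 hex digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()


page = '<a href="files/data.csv">Data</a> <a href="/about.html">About</a>'
print(find_data_links(page, "https://example.org/supplementary", [".csv", ".xlsx", ".zip"]))
```

Matching on the parsed URL path rather than the raw href keeps links such as `archive.zip?download=1` eligible, at the cost of missing files served from extensionless URLs.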
- Python 3.12+
- A BibTeX file with your references
- A YAML metadata file mapping references to data sources
MIT