Fetch research papers from the PubMed API and filter for authors affiliated with pharmaceutical or biotech companies.
✅ Fetches research papers using the PubMed API
✅ Filters authors affiliated with pharma/biotech companies
✅ Saves results as a CSV file
✅ Supports CLI arguments for flexibility
✅ Uses Poetry for dependency management
pubmedapitask/
├── researchpapers/
│   ├── __init__.py
│   ├── main.py             # CLI script for fetching and processing papers
│   ├── pubmedapi.py        # Fetches data from the PubMed API
│   └── data_processing.py  # Extracts and filters authors from XML data
├── .gitignore
├── pyproject.toml          # Poetry dependencies and setup
└── README.md               # Documentation
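As a rough illustration of what pubmedapi.py might do, the sketch below builds the two NCBI E-utilities requests commonly used for this kind of task: esearch to turn a query into PubMed IDs, and efetch to download the matching records as XML. The function names (build_esearch_url, build_efetch_url, fetch_xml) are hypothetical and not taken from the repository. The project lists Requests as its HTTP library; this sketch uses only the standard library so it runs without extra installs.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(query: str, max_results: int = 20) -> str:
    """Return the esearch URL that looks up PubMed IDs for a query."""
    params = {"db": "pubmed", "term": query, "retmax": max_results, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_efetch_url(pubmed_ids: list[str]) -> str:
    """Return the efetch URL that downloads full records as XML."""
    params = {"db": "pubmed", "id": ",".join(pubmed_ids), "retmode": "xml"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


def fetch_xml(pubmed_ids: list[str], timeout: float = 30.0) -> str:
    """Download the XML for the given IDs (performs a network request)."""
    with urlopen(build_efetch_url(pubmed_ids), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Keeping URL construction separate from the network call makes the request parameters easy to inspect and test offline.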
git clone https://github.com/ShreyasDankhade/pubmedapitask.git
cd pubmedapitask
pip install poetry
poetry install
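Poetry installs every dependency declared in pyproject.toml, so no separate step is normally needed. The command below applies only if the repository also ships a requirements.txt, which the project layout above does not list:
pip install -r requirements.txt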
Run the script via Poetry:
poetry run get-papers-list "cancer immunotherapy"
poetry run get-papers-list "diabetes research" -f research_results.csv
poetry run get-papers-list "genetic engineering" -d
| Option | Description |
|---|---|
| query | (Required) Search term for fetching research papers |
| -f, --file | Specify filename to save results as a CSV |
| -d, --debug | Enable debug mode for detailed logs |
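The options in the table could be declared with argparse roughly as follows. This is a minimal sketch, not the contents of main.py: the build_parser helper is hypothetical, and the real script may use a different parser or defaults.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the positional query and the -f / -d options from the table."""
    parser = argparse.ArgumentParser(
        prog="get-papers-list",
        description="Fetch PubMed papers with pharma/biotech-affiliated authors.",
    )
    parser.add_argument("query", help="Search term for fetching research papers")
    parser.add_argument("-f", "--file", help="Filename to save results as a CSV")
    parser.add_argument(
        "-d", "--debug", action="store_true", help="Enable debug mode for detailed logs"
    )
    return parser


# Parse the same arguments as one of the example commands.
args = build_parser().parse_args(["diabetes research", "-f", "research_results.csv"])
print(args.query, args.file, args.debug)
# prints: diabetes research research_results.csv False
```

With this layout, omitting -f leaves args.file as None, which a script can use to decide whether to write a file at all.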
If poetry run doesn't work, enter the Poetry virtual environment first (on Poetry 2.x, the shell command is provided by the poetry-plugin-shell plugin):
poetry shell
get-papers-list "cancer immunotherapy" -f output.csv
PubmedID,Title,Publication Date,Authors with Pharma/Biotech Affiliations
123456,"Breakthrough in Cancer Research",2024-03,"Dr. John Doe, Dr. Emily Smith"
789101,"Genetic Engineering in Medicine",2023-11,"Dr. Alex Brown"
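A file with these columns can be produced in a few lines. The sketch below uses the standard-library csv module rather than Pandas (which the project lists), and the rows_to_csv helper and the row dictionaries are illustrative assumptions, reusing the placeholder values from the sample output.

```python
import csv
import io

COLUMNS = [
    "PubmedID",
    "Title",
    "Publication Date",
    "Authors with Pharma/Biotech Affiliations",
]


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize paper records to CSV text using the column order above."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Join the list of author names into a single comma-separated cell.
        writer.writerow({**row, COLUMNS[3]: ", ".join(row[COLUMNS[3]])})
    return buffer.getvalue()


sample_rows = [
    {
        "PubmedID": "123456",
        "Title": "Breakthrough in Cancer Research",
        "Publication Date": "2024-03",
        "Authors with Pharma/Biotech Affiliations": ["Dr. John Doe", "Dr. Emily Smith"],
    },
]
print(rows_to_csv(sample_rows), end="")
```

The csv module quotes a field only when it needs to, so the author cell (which contains a comma) is quoted while the title is not; both forms parse to the same values.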
To update dependencies:
poetry update
To pull latest changes from GitHub:
git pull origin main
poetry install
If Poetry cannot find modules:
poetry install
poetry run get-papers-list "cancer research"
If that still fails, run the script directly with Python:
poetry run python researchpapers/main.py "cancer research"
- Python 3.9+
- Poetry (Dependency Management)
- Requests (HTTP Requests)
- Pandas (Data Processing)
- XML Parsing (PubMed Data Extraction)
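To make the XML parsing and author filtering step concrete, here is a small sketch using xml.etree.ElementTree on a trimmed PubMed-style record. The keyword lists and the function names (is_company_affiliation, company_authors) are assumptions chosen for illustration; the rules in data_processing.py may differ.

```python
import xml.etree.ElementTree as ET

# Assumed keyword heuristics; the repository's actual rules may differ.
COMPANY_HINTS = ("pharma", "biotech", "therapeutics", "inc.", "ltd", "gmbh")
ACADEMIC_HINTS = ("university", "college", "institute", "hospital", "school")


def is_company_affiliation(affiliation: str) -> bool:
    """Return True if the text looks like a company rather than academia."""
    text = affiliation.lower()
    if any(hint in text for hint in ACADEMIC_HINTS):
        return False
    return any(hint in text for hint in COMPANY_HINTS)


def company_authors(xml_text: str) -> dict[str, list[str]]:
    """Map each PMID to its authors with a company-looking affiliation."""
    result: dict[str, list[str]] = {}
    root = ET.fromstring(xml_text)
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID", default="")
        names = []
        for author in article.iter("Author"):
            affiliations = [a.text or "" for a in author.iter("Affiliation")]
            if any(is_company_affiliation(a) for a in affiliations):
                fore = author.findtext("ForeName", default="")
                last = author.findtext("LastName", default="")
                names.append(f"{fore} {last}".strip())
        result[pmid] = names
    return result


# Trimmed, made-up record in the shape efetch returns for PubMed.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>123456</PMID>
      <Article>
        <ArticleTitle>Example title</ArticleTitle>
        <AuthorList>
          <Author>
            <LastName>Doe</LastName><ForeName>John</ForeName>
            <AffiliationInfo><Affiliation>Example Pharma Inc., Boston, USA</Affiliation></AffiliationInfo>
          </Author>
          <Author>
            <LastName>Smith</LastName><ForeName>Emily</ForeName>
            <AffiliationInfo><Affiliation>Department of Biology, Example University</Affiliation></AffiliationInfo>
          </Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""

print(company_authors(SAMPLE))
# prints: {'123456': ['John Doe']}
```

Keyword matching of this kind is only a heuristic: it will miss companies whose names carry none of the hints and may exclude industry research institutes, so the lists usually need tuning against real results.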
For questions or support, contact Shreyas Dankhade at shreyasdankhade75@gmail.com.