feat(cli): add geo-matrix command to download and convert GEO matrix files to TSV #233

Open · wants to merge 3 commits into base: develop

aditi75432

📋 Description

Adds a new CLI command, geo-matrix-to-tsv, to pysradb that downloads GEO Series Matrix files (GSE accessions) and converts them to clean TSV.
With --skip-download, the download step is skipped when the matrix file already exists.


✅ Type of Change

  • ✨ New feature (non-breaking change which adds functionality)

ℹ️ Additional Information

🔧 Functionality:

  • Command: pysradb geo-matrix-to-tsv GSE12345 --outdir ./output --skip-download
  • Downloads the gzipped series matrix file (.txt.gz) from GEO FTP.
  • Converts it into clean .tsv format for downstream analysis.
  • Output is saved in the specified --outdir.
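
For reviewers, the download-and-convert flow described above can be sketched roughly as follows. This is an illustrative outline, not the code in this PR: the function names (matrix_url, table_rows, convert, geo_matrix_to_tsv) are hypothetical, the URL layout is the usual GEO series directory scheme for single-platform series, and "clean" is assumed to mean keeping only the rows between the `!series_matrix_table_begin` and `!series_matrix_table_end` markers with surrounding quotes removed.

```python
import gzip
import re
import urllib.request
from pathlib import Path

GEO_BASE = "https://ftp.ncbi.nlm.nih.gov/geo/series"


def matrix_url(accession):
    """Build the usual GEO location of a series matrix file (hypothetical helper).

    Assumes a single-platform series; multi-platform series use file names
    that also carry a GPL identifier, which this sketch does not handle.
    """
    if not re.fullmatch(r"GSE\d+", accession):
        raise ValueError(f"not a GSE accession: {accession!r}")
    digits = accession[3:]
    # GEO groups series in directories such as GSE10nnn (last three digits masked).
    stub = "GSE" + digits[:-3] + "nnn"
    return f"{GEO_BASE}/{stub}/{accession}/matrix/{accession}_series_matrix.txt.gz"


def table_rows(lines):
    """Yield tab-separated data rows found between the table markers."""
    inside = False
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("!series_matrix_table_begin"):
            inside = True
        elif line.startswith("!series_matrix_table_end"):
            break
        elif inside and line:
            # Assumption: "clean" means dropping the quotes GEO puts around fields.
            yield "\t".join(field.strip('"') for field in line.split("\t"))


def convert(gz_path, tsv_path):
    """Write the data table of a gzipped series matrix file as plain TSV."""
    with gzip.open(gz_path, "rt", encoding="utf-8", errors="replace") as src, \
            open(tsv_path, "w", encoding="utf-8") as dst:
        for row in table_rows(src):
            dst.write(row + "\n")


def geo_matrix_to_tsv(accession, outdir, skip_download=False):
    """Download (unless skipped and already present) and convert one series."""
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    gz_path = out / f"{accession}_series_matrix.txt.gz"
    tsv_path = out / f"{accession}_series_matrix.tsv"
    if not (skip_download and gz_path.exists()):
        urllib.request.urlretrieve(matrix_url(accession), str(gz_path))
    convert(gz_path, tsv_path)
    return tsv_path
```

Under these assumptions, `geo_matrix_to_tsv("GSE10072", "./tsv_output")` would leave `GSE10072_series_matrix.tsv` next to the downloaded archive, and passing `skip_download=True` would reuse an archive that is already in the output directory.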

🧪 Testing:

  • Manual testing performed using GSE10072 and GSE60424.
  • Verified the integrity of the output TSV and that the download is skipped correctly when --skip-download is set.
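
One way the integrity check could be made repeatable is a small script along these lines. It is a sketch only: check_tsv is a hypothetical helper, and "integrity" is assumed here to mean that the file is non-empty and every row has the same number of columns as the header.

```python
import csv


def check_tsv(path):
    """Return (data_rows, columns) for a TSV file; raise ValueError if it is
    empty or if any row's width differs from the header (hypothetical check)."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, None)
        if header is None:
            raise ValueError(f"{path} is empty")
        data_rows = 0
        # Data lines start at line 2 because line 1 is the header.
        for line_no, row in enumerate(reader, start=2):
            if len(row) != len(header):
                raise ValueError(
                    f"{path}:{line_no}: expected {len(header)} fields, got {len(row)}"
                )
            data_rows += 1
    return data_rows, len(header)
```

Running it against the converted files for the accessions above would report the table dimensions, which can then be compared with the sample and probe counts shown on the GEO series page.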

📸 Screenshot/Logs

$ pysradb geo-matrix-to-tsv GSE10072 --outdir ./tsv_output

✔️ Download complete
✔️ Converted GSE10072_series_matrix.txt.gz → GSE10072_series_matrix.tsv

🔗 Link to Ticket

Fixes #229
