20240624 developer call notes #1236
tomwhite started this conversation in Meeting Notes
20240624
Pre-notes
PRs
Issues
Discussions
Notes
Attendees
Discussion
Sgkit next
JK: made a vcztools prototype using Tom’s sgkit code
TW: the next step could be a CLI with Cubed and JAX underneath
JK: the hard part is implementing the bcftools filtering language, which is quite arcane (regex)
JH: our lab once implemented a filter language with PEG.js that was not so bad: https://github.com/hammerlab/cycledash/blob/master/grammars/querylanguage.pegjs
JH: get an AI to generate the code? Feed it https://vcftools.github.io/man_latest.html.
BJ: new Claude is very good
JK: would be nice to implement a subset of PLINK too (for GWAS, e.g. BOLT-LMM) to attract users, since they can’t run PLINK on UKB today.
JH: VCF manipulation may be more amenable to getting new users
JK: long term goal is to get methods authors to write new methods against our format
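As a rough illustration of the filter-language idea discussed above, here is a minimal sketch of a hand-written recursive-descent parser and evaluator for a tiny, made-up subset of bcftools-style expressions. It is not a PEG grammar and does not reproduce bcftools semantics; the supported operators, the function names (`tokenize`, `parse`, `evaluate`) and the dict-based records are all assumptions made for the example.

```python
import operator
import re

# Token pattern for a tiny, hypothetical subset of a bcftools-style filter language.
TOKEN = re.compile(
    r'\s*(?:(?P<num>\d+(?:\.\d+)?)'
    r'|(?P<str>"[^"]*")'
    r'|(?P<name>[A-Za-z_][A-Za-z0-9_/]*)'
    r'|(?P<op>&&|\|\||>=|<=|==|!=|[<>=()]))'
)

COMPARE = {
    ">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le,
    "=": operator.eq, "==": operator.eq, "!=": operator.ne,
}


def tokenize(text):
    """Split a filter expression into (kind, value) tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}: {text[pos:]!r}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens


def parse(tokens):
    """Parse tokens into nested tuples; '||' binds looser than '&&'."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def take():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def or_expr():
        node = and_expr()
        while peek() == ("op", "||"):
            take()
            node = ("or", node, and_expr())
        return node

    def and_expr():
        node = comparison()
        while peek() == ("op", "&&"):
            take()
            node = ("and", node, comparison())
        return node

    def comparison():
        if peek() == ("op", "("):
            take()
            node = or_expr()
            if take() != ("op", ")"):
                raise ValueError("expected ')'")
            return node
        kind, name = take()
        if kind != "name":
            raise ValueError(f"expected a field name, got {name!r}")
        kind, op = take()
        if kind != "op" or op not in COMPARE:
            raise ValueError(f"expected a comparison operator, got {op!r}")
        kind, raw = take()
        if kind == "num":
            value = float(raw)
        elif kind == "str":
            value = raw[1:-1]  # strip the surrounding double quotes
        else:
            raise ValueError(f"expected a literal, got {raw!r}")
        return ("cmp", op, name, value)

    tree = or_expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected trailing token {tokens[pos][1]!r}")
    return tree


def evaluate(tree, record):
    """Evaluate a parsed tree against one record (a plain dict of field values)."""
    if tree[0] == "or":
        return evaluate(tree[1], record) or evaluate(tree[2], record)
    if tree[0] == "and":
        return evaluate(tree[1], record) and evaluate(tree[2], record)
    _, op, name, value = tree
    return COMPARE[op](record[name], value)


# Hypothetical records standing in for per-variant fields.
expr = parse(tokenize('QUAL>=30 && (DP<100 || FILTER="PASS")'))
records = [
    {"QUAL": 50.0, "DP": 20, "FILTER": "q10"},
    {"QUAL": 10.0, "DP": 20, "FILTER": "PASS"},
    {"QUAL": 45.0, "DP": 300, "FILTER": "PASS"},
]
kept = [r for r in records if evaluate(expr, r)]
```

A real implementation would also need per-sample fields, functions, missing values and vectorised evaluation over arrays, none of which this sketch attempts.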
TW: NumPy 2, Zarr 3
JK: NumPy 2 looks straightforward
TW: cyvcf2 already supports NumPy 2; Numba does not (it’s compatible, but doesn’t reproduce behaviour yet)
TW: Let’s have people use bio2zarr for VCF reading
TW: Zarr 3 has broken the API more than expected; have raised several upstream issues while working on Cubed
JK: will give it a week then try bio2zarr again
JH: Who are the main Zarr 3 implementers?
TW: Joe Hamman, Davis Bennett, Norman Rzepka at Scalable Minds
TW: Xarray has pinned to Zarr < 3
TW: has done work to pull the Hypothesis VCF code out of sgkit; will move it to a new project under sgkit-dev
TW: https://scikit.bio has had a reboot