Releases: DiffAPF/torchlpc
v0.7.1
This release is identical to v0.7. No new features were added.
Full Changelog: v0.7...v0.7.1
v0.7
What's Changed
- feat: parallel scan extension for CPU by @yoyolicoris in #17
- feat: enhance CI workflow to support multiple OS environments by @yoyolicoris in #19
- feat: cpp extension for LPC by @yoyolicoris in #18
- feat: use original cuda scan from linear RNN by @yoyolicoris in #22
- feat: LPC CUDA kernel by @yoyolicoris in #24
- feat: return zf by @yoyolicoris in #25
- fix: linking openMP on mac by @yoyolicoris in #20
No pre-built wheels
This release contains C++/CUDA implementations of time-varying LPC/scan as replacements for the existing Numba kernels, which will be dropped entirely in v1.0.
Starting with this release, no pre-built wheels are provided; only source distributions are available.
Users must ensure that the corresponding compilers (e.g., gcc, nvcc) are available on their system so the binaries can be built during installation.
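Because the build now happens on the user's machine, it can help to confirm that the compilers are reachable before installing. The following is a minimal sketch using only the Python standard library; the helper name `find_missing_tools` is hypothetical and not part of torchlpc, and the assumption that nvcc only matters for the CUDA kernels is ours, not stated in these notes.

```python
import shutil


def find_missing_tools(tools=("gcc", "nvcc")):
    """Return the names in `tools` that cannot be found on PATH.

    Hypothetical helper for illustration; not part of torchlpc.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        # Assumption: nvcc is presumably only required for the CUDA kernels.
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("gcc and nvcc found; a source build can be attempted.")
```

If nothing is reported missing, installing the source distribution with pip should be able to compile the extensions, although other toolchain requirements may still apply.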
Performance changes
Compared with v0.6, v0.7 runs 1.2 to 2 times faster on Nvidia GPUs, as tested on an RTX 5060 Ti.
However, on the CPU it is sometimes two to three times slower, as tested on an Ubuntu 24.10 machine with an Intel i7-7700K.
We are exploring ways to address this gap in the future, to achieve the same speed again without the Numba compiler.
Full Changelog: v0.6...v0.7
v0.6
What's Changed
- v0.6: drop support for <2.0, support jacobian and hessian computation by @yoyololicon in #12
Summary:
- Drop support for torch<2.0 to use the new schema for writing autograd.Function
- Add vmap for calculating jacfwd/jacrev/hessian
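
To illustrate what these transforms compute, here is a rough sketch built on `torch.func` (available in torch 2.0 and later). It deliberately does not call the library's own LPC function, so it runs without any compiled extension; `allpole_reference` is a hypothetical, naive pure-PyTorch stand-in for a time-varying all-pole filter, and the shapes and coefficient scale are arbitrary choices for the example.

```python
import torch
from torch.func import hessian, jacfwd, jacrev, vmap


def allpole_reference(x, a):
    """Naive all-pole filter: y[t] = x[t] - sum_i a[t, i] * y[t - 1 - i].

    Hypothetical stand-in for illustration; x has shape (T,), a has shape (T, N).
    """
    num_steps, order = a.shape
    y = []
    for t in range(num_steps):
        acc = x[t]
        # Only past outputs that exist (t - 1 - i >= 0) contribute.
        for i in range(min(order, t)):
            acc = acc - a[t, i] * y[t - 1 - i]
        y.append(acc)
    return torch.stack(y)


torch.manual_seed(0)
x = torch.randn(6, dtype=torch.double)
a = 0.1 * torch.randn(6, 2, dtype=torch.double)

jac_rev = jacrev(allpole_reference)(x, a)             # dy/dx via reverse mode, shape (6, 6)
jac_fwd = jacfwd(allpole_reference)(x, a)             # dy/dx via forward mode, shape (6, 6)
hess_a = hessian(allpole_reference, argnums=1)(x, a)  # second derivatives w.r.t. a

# vmap adds a batch dimension: one Jacobian per (x, a) pair.
xb = torch.randn(4, 6, dtype=torch.double)
ab = 0.1 * torch.randn(4, 6, 2, dtype=torch.double)
batched_jac = vmap(jacrev(allpole_reference))(xb, ab)  # shape (4, 6, 6)
```

Since the filter is causal, the Jacobian with respect to `x` is lower triangular with ones on the diagonal, which gives a quick sanity check when swapping in a real implementation.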
Full Changelog: v0.5...v0.6