Skip to content

A new backend for Lomb Scargle Fourier Transform based on Low Rank Approximation #925

@pupperemeritus

Description

@pupperemeritus

The paper introducing the method

The method seems to provide much closer accuracy to the slow method, while maintaining the same time complexity (just scaled up by a constant). I suggest we parallellize the FFTs in the LRA optionally using numba. This method is currently in development at astropy. Its a much smaller hassle to implement in Stingray.jl as FastTransforms.jl has an implementation. But nonetheless that is not really my forte someone else can take it up.

TLDR

I propose implementing a newer, high-performance algorithm for the Lomb-Scargle periodogram based on the Non-Uniform Fast Fourier Transform (NUFFT) via Low-Rank Approximation (LRA). This method offers significant advantages over the traditional fast method (Press & Rybicki) in terms of parallelism, accuracy control, and memory efficiency.

This method is slowly being adopted already in the FOSS community, with a reference implementation in development for astropy (astropy/astropy#17842).

The Method

The core idea is to re-frame the non uniform discrete fourier transform problem. Instead of "spreading" non-uniform data onto a uniform grid (like Press & Rybicki), this method decomposes the non-uniform transform matrix itself into a sum of a small, fixed number (K) of diagonally-scaled, uniform DFTs.

Paper: A Nonuniform Fast Fourier Transform based on Low Rank Approximation

Key Advantages

  • Parallelism: The algorithm consists of K (typically ~16 for double precision) completely independent FFTs. This is a perfect fit for modern multi-core CPUs and can be parallelized trivially using tools like numba, potentially leading to a massive speedup over the mostly-sequential Press & Rybicki method.
  • Tunable and Guaranteed Accuracy: The accuracy is explicitly controlled by the parameter K, which is chosen based on a desired working precision ε. Unlike the spreading method, where accuracy is indirectly tied to the oversampling factor, this method provides a direct and a priori error bound. This means we can offer users a much more predictable trade-off between speed and precision. It is less costly than the oversampling in the Press & Rybicki method.
  • Memory Efficiency: This method avoids creating a large, oversampled grid for the FFT. It operates on vectors of the original data size N, making it more memory-friendly, especially for very large datasets where the oversampling grid of other methods might become a bottleneck.

Implementation Pointers

  • The K independent FFTs are an ideal use case for numba.njit(parallel=True). A prange loop could compute all K FFTs concurrently.
  • The Astropy PR has an implementation we can refer to in order to modify our lsft_fast in fourier.py

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions