What's an effective approach for storing discontinuous signal data? #7673
Context
I've been using Pandas DataFrames for a while, but I've become frustrated with its lack of support for multi-dimensional datasets, so I'm experimenting with switching to xarray for my use case. For context, I'm working with laser spectroscopy data, which presents itself as a signal. A single sample contains 3 components of the original signal: fundamental, harmonic, and a ratio. Each component has a fixed length N. In addition, there's a length-N index of the wavelengths for each data point, plus some metadata. I've been able to implement xarray nicely for my data so far:
Problem
However, sometimes the data will come from multiple lasers spanning different, discontinuous wavelength regions. For example, we might have a laser scanning a sample from 8-8.5µm, and then 9-9.5µm. Here's a screenshot of some data with two lasers in HDF5 format for a better visualization of what I mean:
Each laser will collect the same 3 components as mentioned previously, just over a different wavelength region. How can I best represent this discontinuous signal data with xarray?
Attempts at a solution
Is there another approach I haven't considered, or are the drawbacks I mentioned in my attempts not really drawbacks at all?
Replies: 1 comment 1 reply
You can stick with your first solution of simply concatenating along the wavelength dimension. If you add a categorical coordinate, maybe called
laser_num = [0, 0, ..., 1, 1, ..., 2, 2, ...]
you can apply your filters per laser using groupby.
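Here's a rough sketch of what that could look like. Only the laser_num name comes from the suggestion above. The wavelength grids, the random data, and the remove_offset helper are made up for illustration, and the mean subtraction is just a stand-in for whatever filter you actually apply.

```python
import numpy as np
import xarray as xr

# Hypothetical wavelength grids (in micrometres) for two lasers
# covering disjoint regions; real data would come from the instrument.
wl_a = np.linspace(8.0, 8.5, 6)
wl_b = np.linspace(9.0, 9.5, 6)
wavelength = np.concatenate([wl_a, wl_b])

# Categorical label: which laser produced each data point.
laser_num = np.repeat([0, 1], [wl_a.size, wl_b.size])

# Random placeholder signals, one value per wavelength.
rng = np.random.default_rng(0)
ds = xr.Dataset(
    data_vars={
        "fundamental": ("wavelength", rng.normal(size=wavelength.size)),
        "harmonic": ("wavelength", rng.normal(size=wavelength.size)),
    },
    coords={
        "wavelength": wavelength,
        # Non-index coordinate that lives along the wavelength dimension.
        "laser_num": ("wavelength", laser_num),
    },
)
ds["ratio"] = ds["harmonic"] / ds["fundamental"]

# Reduce each laser's segment separately.
per_laser_mean = ds.groupby("laser_num").mean()


def remove_offset(segment):
    """Placeholder filter: subtract the segment's own mean."""
    return segment - segment.mean()


# Apply a function to each laser's segment; the result keeps the
# full concatenated wavelength dimension.
detrended = ds["fundamental"].groupby("laser_num").map(remove_offset)

# Pull out a single laser's region when needed.
laser_0 = ds.where(ds["laser_num"] == 0, drop=True)
```

Because laser_num is attached to the wavelength dimension, the gap between regions never has to be padded with NaN. Each group just covers its own contiguous stretch of the concatenated axis.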