Description
I have a TDMS file from which I'm trying to read a channel in chunks.
It works for some chunk sizes but not for others.
import numpy as np
import nptdms

# fp (path to the TDMS file) and sensor (channel name) are defined elsewhere
data_read_sliced = []
with nptdms.TdmsFile.open(fp) as tdms_file:
    len_data = len(tdms_file['Messdaten'][sensor])
    dt = tdms_file['Messdaten'][sensor].properties['wf_increment']
    # Chunk sizes tried, one at a time
    len_slice = 2705
    len_slice = 2725
    len_slice = 4000  # Hardcoding 4000 reading works
    for idx in range(int(np.floor(len_data/len_slice))):
        data_slice = tdms_file['Messdaten'][sensor].read_data(offset=len_slice*idx,length=len_slice)
        data_read_sliced.append(data_slice)
    data_read_sliced = np.concatenate(data_read_sliced)
    data_read_once = tdms_file['Messdaten'][sensor].read_data()

np.sum(data_read_sliced - data_read_once)
Result with len_slice = 2705:
----> 1 np.sum(data_read_sliced - data_read_once)
ValueError: operands could not be broadcast together with shapes (16359840,) (16360000,)
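A quick check of the arithmetic suggests that the length mismatch for len_slice = 2705 comes from the loop bound rather than from the reader: floor(len_data/len_slice) only counts full chunks, so the final partial chunk is never requested. The sketch below reproduces the two lengths from the error message; `chunk_bounds` is a hypothetical helper written for illustration, not part of npTDMS.

```python
def chunk_bounds(total_length, chunk_length):
    """Yield (offset, length) pairs covering total_length, including a shorter final chunk."""
    for offset in range(0, total_length, chunk_length):
        yield offset, min(chunk_length, total_length - offset)


len_data = 16_360_000  # channel length taken from the error message
len_slice = 2705

# The original loop runs floor(len_data / len_slice) times, i.e. full chunks only.
full_chunks = len_data // len_slice
values_read = full_chunks * len_slice      # matches the shorter shape in the error
values_skipped = len_data - values_read    # values in the partial chunk that is never read

# Iterating over explicit bounds also covers the final partial chunk.
bounds = list(chunk_bounds(len_data, len_slice))
covered = sum(length for _, length in bounds)

print(values_read, values_skipped, bounds[-1], covered)
```

Each (offset, length) pair could be passed to read_data(offset=..., length=...) in place of the fixed len_slice. This only accounts for the 2705 case; it does not explain the exception raised for 2725.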
Result with len_slice = 2725:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[19], line 11
8 len_slice = 2725
10 for idx in range(int(np.floor(len_data/len_slice))):
---> 11 data_slice = tdms_file['Messdaten'][sensor].read_data(offset=len_slice*idx,length=len_slice)
12 data_read_sliced.append(data_slice)
14 data_read_sliced = np.concatenate(data_read_sliced)
File /opt/pyenvs/DSS04/lib/python3.10/site-packages/nptdms/tdms.py:604, in TdmsChannel.read_data(self, offset, length, scaled)
591 """ Reads data for this channel from the TDMS file and returns it as a numpy array
592
593 Indexing into the channel with a slice should be preferred over using
(...)
601 For DAQmx data a dictionary of scaler id to raw scaler data will be returned.
602 """
603 if self._raw_data is None:
--> 604 raw_data = self._read_channel_data(offset, length)
605 else:
606 raw_data = slice_raw_data(self._raw_data, offset, length)
File /opt/pyenvs/DSS04/lib/python3.10/site-packages/nptdms/tdms.py:810, in TdmsChannel._read_channel_data(self, offset, length)
808 for chunk in self._reader.read_raw_data_for_channel(self.path, offset, length):
809 if chunk.data is not None:
--> 810 channel_data.append_data(chunk.data)
811 if chunk.scaler_data is not None:
812 for scaler_id, scaler_data in chunk.scaler_data.items():
File /opt/pyenvs/DSS04/lib/python3.10/site-packages/nptdms/channel_data.py:92, in NumpyDataReceiver.append_data(self, new_data)
90 start_pos = self._data_insert_position
91 end_pos = self._data_insert_position + len(new_data)
---> 92 self.data[start_pos:end_pos] = new_data
93 self._data_insert_position += len(new_data)
ValueError: could not broadcast input array from shape (200,) into shape (0,)
Result with len_slice = 4000:
Works and gives a sum of 0.
Reading the data in one go always works.
For len_slice = 2725 (the one I actually want), the error arises because new_data is supposed to be written to self.data at 2725:2925, but self.data has only 2725 elements. In reader.py, num_chunk is somehow 2 for the last chunk of 200 values, so that chunk is read twice. Also, end_segment is too large, so the trimming code for the segment doesn't trigger.
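The shapes in the error message can be reproduced with plain NumPy, independent of npTDMS. This is a minimal sketch, assuming the receiving array has exactly len_slice elements and is already full when another 200 values arrive; the variable names are illustrative stand-ins, not the library's internals.

```python
import numpy as np

len_slice = 2725
data = np.zeros(len_slice)   # stand-in for the preallocated receiver array
new_data = np.ones(200)      # stand-in for the extra chunk of 200 values

insert_position = len_slice  # the array is already full at this point
start_pos = insert_position
end_pos = insert_position + len(new_data)

# Slicing past the end of an array yields an empty view instead of raising.
target_shape = data[start_pos:end_pos].shape

message = None
try:
    # Assigning 200 values into that empty view cannot be broadcast.
    data[start_pos:end_pos] = new_data
except ValueError as exc:
    message = str(exc)

print(target_shape, message)
```

Because the out-of-range slice silently becomes empty, the failure only shows up as a broadcast error at the assignment, which is consistent with the 200-value chunk being delivered once more than expected.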
That is as far as my debugging got. I tried changing a few things, but it didn't help, as I don't know anything about the internals of the TDMS format.
I can provide the file in question if required, but it is >2 GB, so I can't just upload it here.
I tried with nptdms 1.7.1 and 1.9.0.