
[Bug] Reading tdms data chunks works only for certain chunk sizes #337

@Nikolai-Hlubek

Description


I have a TDMS file from which I'm trying to read a channel in chunks.

For certain chunk sizes this works; for others it does not.

import numpy as np
import nptdms

# fp is the path to the TDMS file, sensor is the channel name (defined elsewhere)
data_read_sliced = []

with nptdms.TdmsFile.open(fp) as tdms_file:
    len_data = len(tdms_file['Messdaten'][sensor])

    dt = tdms_file['Messdaten'][sensor].properties['wf_increment']

    # Only the last assignment takes effect; 2705 and 2725 fail, 4000 works
    len_slice = 2705
    len_slice = 2725
    len_slice = 4000  # Hardcoding 4000 works

    for idx in range(int(np.floor(len_data/len_slice))):
        data_slice = tdms_file['Messdaten'][sensor].read_data(offset=len_slice*idx, length=len_slice)
        data_read_sliced.append(data_slice)

    data_read_sliced = np.concatenate(data_read_sliced)

    data_read_once = tdms_file['Messdaten'][sensor].read_data()

np.sum(data_read_sliced - data_read_once)

len_slice = 2705

----> 1 np.sum(data_read_sliced - data_read_once)
ValueError: operands could not be broadcast together with shapes (16359840,) (16360000,)

(Here the individual reads succeed; the 160-value difference matches the tail the loop never reads, since 16360000 mod 2705 = 160.)

len_slice = 2725

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[19], line 11
      8 len_slice = 2725
     10 for idx in range(int(np.floor(len_data/len_slice))):
---> 11     data_slice = tdms_file['Messdaten'][sensor].read_data(offset=len_slice*idx,length=len_slice)
     12     data_read_sliced.append(data_slice)
     14 data_read_sliced = np.concatenate(data_read_sliced)

File /opt/pyenvs/DSS04/lib/python3.10/site-packages/nptdms/tdms.py:604, in TdmsChannel.read_data(self, offset, length, scaled)
    591 """ Reads data for this channel from the TDMS file and returns it as a numpy array
    592 
    593 Indexing into the channel with a slice should be preferred over using
   (...)
    601     For DAQmx data a dictionary of scaler id to raw scaler data will be returned.
    602 """
    603 if self._raw_data is None:
--> 604     raw_data = self._read_channel_data(offset, length)
    605 else:
    606     raw_data = slice_raw_data(self._raw_data, offset, length)

File /opt/pyenvs/DSS04/lib/python3.10/site-packages/nptdms/tdms.py:810, in TdmsChannel._read_channel_data(self, offset, length)
    808 for chunk in self._reader.read_raw_data_for_channel(self.path, offset, length):
    809     if chunk.data is not None:
--> 810         channel_data.append_data(chunk.data)
    811     if chunk.scaler_data is not None:
    812         for scaler_id, scaler_data in chunk.scaler_data.items():

File /opt/pyenvs/DSS04/lib/python3.10/site-packages/nptdms/channel_data.py:92, in NumpyDataReceiver.append_data(self, new_data)
     90 start_pos = self._data_insert_position
     91 end_pos = self._data_insert_position + len(new_data)
---> 92 self.data[start_pos:end_pos] = new_data
     93 self._data_insert_position += len(new_data)

ValueError: could not broadcast input array from shape (200,) into shape (0,)

len_slice = 4000

Works and gives a sum of 0.

Reading the data in one go always works.
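
A possible workaround might be slice indexing into the channel, which the read_data docstring in the traceback above says should be preferred anyway. A minimal sketch (I haven't verified that it avoids this bug):

# Possible workaround (untested against the failing file): slice indexing
# may take a different code path than read_data(offset, length).
with nptdms.TdmsFile.open(fp) as tdms_file:
    channel = tdms_file['Messdaten'][sensor]
    n_slices = len(channel) // len_slice
    # channel[start:stop] reads only the requested range from disk
    data_read_sliced = np.concatenate(
        [channel[i * len_slice:(i + 1) * len_slice] for i in range(n_slices)]
    )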

For len_slice = 2725 (the one I actually want), the error shown means that new_data should be written into self.data at positions 2725:2925, but self.data only has 2725 elements. In reader.py, num_chunk somehow ends up as 2 for the last chunk of 200 values, so it is read twice. Also, end_segment is too large, so the trimming code for the segment doesn't trigger.
So much for my debugging attempts. I tried changing a few things, but it didn't get any better, as I don't know anything about the internals of the TDMS format.
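
To illustrate what I'd expect: assuming a raw chunk size of 4000 values per channel (my guess, since len_slice = 4000 works), a read of (offset, length) should touch the following file chunks and keep only the overlapping part of each. This is a sketch of the arithmetic only, not npTDMS code:

CHUNK_SIZE = 4000  # assumed raw chunk size; the real value comes from the file

def chunks_for_read(offset, length, chunk_size=CHUNK_SIZE):
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    for chunk in range(first, last + 1):
        chunk_start = chunk * chunk_size
        # part of this chunk that falls inside [offset, offset + length)
        keep_from = max(offset, chunk_start) - chunk_start
        keep_to = min(offset + length, chunk_start + chunk_size) - chunk_start
        yield chunk, keep_from, keep_to

# The kept parts must sum to exactly `length`; with the bug, the last read
# apparently yields 200 values too many.
assert sum(b - a for _, a, b in chunks_for_read(2725, 2725)) == 2725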

I could provide you with the file in question if required, but it is >2 GB, so I can't just upload it here.

I tried with nptdms 1.7.1 and 1.9.0.
