Preprocessing a species in BirdFlowR, running under R 4.5 with rhdf5, produces a warning:
You created a large dataset with compression and chunking.
The chunk size is equal to the dataset dimensions.
If you want to read subsets of the dataset, you should test smaller chunk sizes to improve read times.
We should probably heed this. It's unclear to me whether each individual transition matrix is written as its own chunk, which would be fine, or the entire file is one chunk, which would not be good. Some testing is in order.
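As a first check, we could inspect the chunk layout of an existing model file with rhdf5's low-level API. A sketch, assuming a 2-D dataset; the file name and dataset path below are placeholders, not the actual layout of BirdFlow HDF5 files:

```r
library(rhdf5)

# Sketch: report the chunk dimensions of one dataset in a model file.
# "model.hdf5" and "marginals/M01" are hypothetical names.
fid <- H5Fopen("model.hdf5")
did <- H5Dopen(fid, "marginals/M01")
pid <- H5Dget_create_plist(did)
H5Pget_chunk(pid)   # chunk dimensions; compare to the dataset's dims
H5Pclose(pid)
H5Dclose(did)
H5Fclose(fid)
```

If the reported chunk dimensions match a single transition matrix rather than the whole file, the warning is probably harmless.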
There are several functions that read and write just part of the file that could be used for comparison testing:
- `read_geom()` reads just the `$geom` component of an HDF5 model file.
- `extend_birdflow()` reads, edits, and overwrites the `$geom` component of an HDF5 file.
To test, write the HDF5 file with different chunk sizes and then see whether it affects:
- Reading the geometry
- Updating the geometry
- Reading the entire file

Then either set a chunk size or suppress the warning.
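The timing comparison could be sketched with plain rhdf5, independent of BirdFlowR, before wiring it into `read_geom()`/`extend_birdflow()` tests. The matrix size, chunk sizes, and dataset name below are illustrative, not taken from real model files:

```r
library(rhdf5)

# Sketch: write the same matrix with several chunk sizes and time a
# partial read. All sizes and names here are illustrative.
m <- matrix(rnorm(2000 * 2000), 2000, 2000)

for (chunk_dim in list(c(2000, 2000), c(500, 500), c(100, 100))) {
  f <- tempfile(fileext = ".h5")
  h5createFile(f)
  h5createDataset(f, "mat", dims = dim(m), chunk = chunk_dim, level = 6)
  h5write(m, f, "mat")
  elapsed <- system.time(
    h5read(f, "mat", index = list(1:100, 1:100))  # subset read
  )["elapsed"]
  cat(sprintf("chunk %s: subset read %.3f s, file %.1f MB\n",
              paste(chunk_dim, collapse = "x"), elapsed,
              file.size(f) / 1e6))
  unlink(f)
}
```

Whole-file reads and the `$geom` rewrite path would get the same treatment; if smaller chunks help subset reads without hurting full reads or file size much, we set an explicit `chunk` argument, otherwise we wrap the write in `suppressWarnings()`.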