Description
Describe the bug
While working on #8470 I noticed that the API for reporting memory usage undercounts the actual memory used when encryption is enabled.
`ParquetMetaData::memory_size` is used for memory accounting in in-memory Parquet caches, and thus should be accurate.
To Reproduce
Specifically, this function
arrow-rs/parquet/src/file/metadata/mod.rs
Lines 281 to 286 in b8ae8e0
pub fn memory_size(&self) -> usize {
    std::mem::size_of::<Self>()
        + self.file_metadata.heap_size()
        + self.row_groups.heap_size()
        + self.column_index.heap_size()
        + self.offset_index.heap_size()
does not account for the heap allocations in the `file_decryptor` field:
arrow-rs/parquet/src/file/metadata/mod.rs
Line 191 in b8ae8e0
file_decryptor: Option<FileDecryptor>,
Expected behavior
`ParquetMetaData::memory_size` should report its actual heap allocation size (by implementing the `HeapSize` trait for `FileDecryptor` and all its subfields).
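A minimal sketch of the shape such a fix could take. Everything here is a self-contained stand-in: the `HeapSize` trait is redefined locally, and the `FileDecryptor` fields (`footer_key`, `aad_prefix`) are hypothetical, since the real struct's subfields are not shown in this issue.

```rust
use std::mem::size_of;

/// Local stand-in for the crate's `HeapSize` trait: bytes owned on the heap,
/// not counting the inline size of `Self`.
trait HeapSize {
    fn heap_size(&self) -> usize;
}

impl HeapSize for Vec<u8> {
    fn heap_size(&self) -> usize {
        // A Vec<u8> owns `capacity` bytes on the heap.
        self.capacity()
    }
}

impl<T: HeapSize> HeapSize for Option<T> {
    fn heap_size(&self) -> usize {
        // `None` owns no heap memory; `Some` defers to the inner value.
        self.as_ref().map_or(0, |inner| inner.heap_size())
    }
}

/// Hypothetical stand-in; the real `FileDecryptor` has different fields.
struct FileDecryptor {
    footer_key: Vec<u8>,
    aad_prefix: Vec<u8>,
}

impl HeapSize for FileDecryptor {
    fn heap_size(&self) -> usize {
        // Sum the heap usage of every subfield.
        self.footer_key.heap_size() + self.aad_prefix.heap_size()
    }
}

/// Reduced stand-in that keeps only the field relevant to this issue.
struct ParquetMetaData {
    file_decryptor: Option<FileDecryptor>,
}

impl ParquetMetaData {
    fn memory_size(&self) -> usize {
        // The Option is stored inline, so size_of::<Self>() already covers it;
        // only the decryptor's heap allocations need to be added.
        size_of::<Self>() + self.file_decryptor.heap_size()
    }
}
```

With this shape, metadata without a decryptor reports only its inline size, while encrypted metadata additionally reports the decryptor's key buffers.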
Additional context