Description
Is there an existing proposal for this?
- I have searched the existing proposals
Is your feature request related to a problem?
I'm currently using memray to try and track down where my application is consuming too much memory. My application is one that currently reads very large files into memory, so the mitigation that I am applying is to mmap the file and iterate over the mmap'd object instead.
However, the result of my fix is not being made clear in memray. There are two issues:
- The mmap itself is shown as an allocation the size of the file (I understand that virtual address space is being reserved, so this may technically count as an allocation, but it is not memory committed to my application)
- In the graph for "Resident size", what is graphed appears to be VmRSS, which includes RssFile. RssFile is data that is currently resident for the process, but is backed by a file and thus can be reclaimed at any time by the OS.
In other words, I cannot see the positive impact of my changes in memray and have to look to procfs to verify my fix is working.
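For reference, the procfs check I am doing by hand can be sketched roughly as below. This is only an illustration of reading the `Rss*` fields from `/proc/self/status`, not anything memray does; the helper name `rss_breakdown` is made up for this example, and it assumes a Linux kernel recent enough to report `RssAnon`, `RssFile` and `RssShmem`.

```python
from pathlib import Path


def rss_breakdown(status_text):
    """Return the resident-memory fields (in kB) found in /proc/<pid>/status text.

    Hypothetical helper for illustration; not part of memray.
    """
    wanted = {"VmRSS", "RssAnon", "RssFile", "RssShmem"}
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Values look like "   5164776 kB"; keep only the integer part.
            fields[key] = int(rest.split()[0])
    return fields


status = Path("/proc/self/status")  # Linux-only path
if status.exists():
    print(rss_breakdown(status.read_text()))
```

Calling this before and after the hashing step shows whether growth lands in `RssAnon` (memory owned by the process) or in `RssFile` (page cache mapped into the process).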
Describe the solution you'd like
If the memray flamegraph charted private, non-file-backed resident memory (for example RssAnon from /proc/self/status) in addition to, or instead of, VmRSS, I would be able to verify my fix more easily. Other users would also be able to see the distinction between memory that is dedicated to their process and memory that is reclaimable by the OS.
Additionally, it would be very helpful if file-backed allocations could optionally be shown or hidden in the flame graph.
Alternatives you considered
No response
Sample code for reproduction of the issue
First:
dd if=/dev/zero of=$(pwd)/test_file bs=1M count=5000
memray.py
import mmap
from hashlib import md5
from time import sleep


def hasher(mmap_):
    # Hash the mapped file in 8 KiB chunks so the whole file is paged in once.
    hash_ = md5()
    while values := mmap_.read(8192):
        hash_.update(values)
    return hash_.hexdigest()


with open("test_file", 'rb') as f:
    y = mmap.mmap(f.fileno(), length=0, prot=mmap.PROT_READ)
    hashed = hasher(y)
    print("done reading")
    print(open("/proc/self/status", 'r').read())
    # Keep the process (and the mapping) alive so it can be inspected.
    while True:
        sleep(10)
Relevant output:
VmPeak: 5231640 kB
VmSize: 5231640 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 5164776 kB
VmRSS: 5164776 kB
RssAnon: 27772 kB
RssFile: 5137004 kB
RssShmem: 0 kB
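To make the breakdown explicit, here is a small arithmetic check using only the numbers from the output above (VmRSS is documented in proc(5) as the sum of RssAnon, RssFile and RssShmem). It is just a sanity check on these figures, not a claim about how memray computes its graph.

```python
# Values in kB, copied from the /proc/self/status output above.
vm_rss, rss_anon, rss_file, rss_shmem = 5164776, 27772, 5137004, 0

# VmRSS is the sum of the three Rss* components.
assert vm_rss == rss_anon + rss_file + rss_shmem

# Fraction of resident memory that is file-backed (reclaimable page cache).
file_share = rss_file / vm_rss
print(f"file-backed share of VmRSS: {file_share:.1%}")
print(f"anonymous memory: {rss_anon / 1024:.1f} MiB")
```

Almost all of the resident size here is the mapped file; the anonymous portion that the process actually owns is a few tens of MiB.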
And the graphs shown:

