Description
`FragmentedMapper::get_or_optionally_allocate_slab_table(addr, allocate)` finds the slab table for an address, or optionally allocates a slab table for it if not found. It is essentially a hash map. Its algorithm is:

1. Compute the hash value of the base address derived from `addr`, and use the hash value as the current `index`.
2. Look up the `slab_map` at the `index`.
3. If it hits, return the `slab_table` entry.
4. Otherwise, acquire the `lock`.
5. If the entry at `index` is free, either return `None` or allocate a slab table there and return it.
6. Otherwise, the entry must be occupied by the slab table of another address.
7. Increment `index` by one.
8. Release the `lock`.
9. Go to step 2.
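The steps above can be sketched as follows. This is a toy model with hypothetical names and a small 8-entry table, not the real MMTk types; its point is to show that every probe that misses (steps 4 to 8) acquires and releases the mutex once.

```rust
use std::sync::Mutex;

const TABLE_SIZE: usize = 8; // toy size for illustration

/// Toy model of the slab map: each entry holds (base_address, slab_id).
/// Names are hypothetical; the real structure stores `Slab` tables.
struct SlabMap {
    entries: Vec<Option<(usize, usize)>>,
    lock: Mutex<()>, // models `FragmentedMapper::lock`
}

impl SlabMap {
    fn new() -> Self {
        SlabMap { entries: vec![None; TABLE_SIZE], lock: Mutex::new(()) }
    }

    fn hash(base: usize) -> usize {
        base % TABLE_SIZE // stand-in for the real hash function
    }

    /// Steps 1-9 from the list above.
    fn get_or_allocate(&mut self, base: usize, allocate: bool) -> Option<usize> {
        let mut index = Self::hash(base); // step 1
        loop {
            // Steps 2-3: lock-free read; return on a hit.
            if let Some((b, slab)) = self.entries[index] {
                if b == base {
                    return Some(slab);
                }
            }
            // Step 4: acquire the lock on any miss.
            let guard = self.lock.lock().unwrap();
            match self.entries[index] {
                // Step 5: the entry is free.
                None if allocate => {
                    let slab_id = base; // pretend we allocated a slab here
                    self.entries[index] = Some((base, slab_id));
                    return Some(slab_id);
                }
                None => return None,
                // Another thread may have filled it since the unlocked read.
                Some((b, slab)) if b == base => return Some(slab),
                // Steps 6-7: occupied by another address; probe the next entry.
                _ => index = (index + 1) % TABLE_SIZE,
            }
            drop(guard); // step 8: release the lock, then loop back to step 2
        }
    }
}
```

When two bases hash to the same index, the second lookup pays one lock round-trip per occupied entry it skips, which is the churn described below.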
It works normally if step 2 hits immediately. But if the index happens to be used by another address, it goes through steps 4 to 9 and repeats from step 2, linearly scanning the hash table until it finds a matching or vacant entry. During this process, it repeatedly acquires and releases the mutex `FragmentedMapper::lock`, causing a significant slowdown.
The effect is amplified by frequent calls to `is_mmtk_object`, which checks whether an arbitrary address points to a valid MMTk object. In the mmtk-ruby binding, I added an aggressive assertion at every write barrier to check that both the source and the target objects are valid MMTk objects. It normally costs very little compared to the rest of the VM. But remember that MMTk must check whether an address is mapped before accessing the VO bits, and `FragmentedMapper::get_or_optionally_allocate_slab_table` is called when checking whether a given metadata address is mapped. If the metadata address happens to have a conflict in `FragmentedMapperInner::slab_map`, it reaches the pathological case described above. When running optcarrot, I observed an almost 50% drop in FPS with certain GC plans. The problem goes away when I disable the assertion.
Solution
We may argue that we should remove such assertions when measuring performance. But I still believe the current implementation of `FragmentedMapper::get_or_optionally_allocate_slab_table` should be improved to eliminate the pathological case.
First of all, why does `FragmentedMapper` have to be a hash table? Each `Slab` governs 2048 chunks, and `FragmentedMapper::slab_map` has 2048 entries. That is 4M chunks, or 16TB, which is already a 44-bit address space, only slightly smaller than the typical 48-bit (256TB) address space of common architectures such as x86_64. If we increase both `MMAP_NUM_CHUNKS` (chunks per `Slab`) and `SLAB_TABLE_SIZE` (total `Slab` entries in `slab_map`) to 8192, it will cover a 48-bit address space. Even if we have to support a 56-bit address space one day, we can still adjust both of them to 131072, or introduce a three-level table.
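The arithmetic can be checked with a few lines. The 4 MB chunk size (`LOG_BYTES_IN_CHUNK = 22`) is an assumption matching MMTk's usual chunk size:

```rust
const LOG_BYTES_IN_CHUNK: u32 = 22; // 4 MB chunks (assumed)

/// log2 of the address space covered by `chunks_per_slab * slabs` chunks.
/// Exact when both arguments are powers of two.
fn covered_address_bits(chunks_per_slab: u64, slabs: u64) -> u32 {
    let bytes = (chunks_per_slab * slabs) << LOG_BYTES_IN_CHUNK;
    63 - bytes.leading_zeros()
}
```

With 2048 x 2048 this gives 44 bits, with 8192 x 8192 it gives 48 bits, and with 131072 x 131072 it gives 56 bits.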
Secondly, even if we keep `FragmentedMapper` as a hash table, the table must be able to resize and rehash. This is necessary if the heap plus metadata can grow beyond 16TB.
Thirdly, I think it should be possible to make the hash table lock-free, at least for the case where we don't need to insert new entries. Checking VO bits should not cause more chunks to be mapped, so for that purpose `get_or_optionally_allocate_slab_table` does not need to insert new slabs. It then just needs to visit entries starting from the hash code without holding the lock. Currently, it holds the lock to see whether another thread has allocated the slab concurrently. But if the query is for the VO bits, it can simply return `false`, because the "arbitrary address" existed before we called this function, and another thread cannot concurrently make that address a "valid MMTk object" by allocating anything.
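A lock-free read-only lookup along these lines might look like the sketch below. The layout is hypothetical: the table stores base addresses in atomics, with 0 as the "vacant" sentinel (assuming 0 is never a valid base).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

const TABLE_SIZE: usize = 8; // toy size
const VACANT: usize = 0;     // sentinel; assumes 0 is never a valid base

/// Probe from the hash without taking any lock. A vacant entry means the
/// address is not mapped, which is a final answer for a VO-bit query:
/// no concurrent insertion can make a pre-existing address become valid.
fn lookup(table: &[AtomicUsize; TABLE_SIZE], base: usize) -> Option<usize> {
    let mut index = base % TABLE_SIZE; // stand-in for the real hash
    loop {
        match table[index].load(Ordering::Acquire) {
            VACANT => return None,
            b if b == base => return Some(index),
            _ => index = (index + 1) % TABLE_SIZE, // collision: keep probing
        }
    }
}
```

Insertions (by the path that does map new chunks) would still need a lock or a CAS, but the hot read-only path never blocks.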
Fourthly, can we just use a third-party lock-free hash table implementation instead of writing one ourselves?

The third solution probably needs the least invasive changes, but we should also ask why we didn't take the first approach (making it a two-level lazy array instead of a hash map) in the first place, because it should be both easier and faster.
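For comparison, the first approach needs no probing at all: the slab index becomes a pure function of the address. This sketch assumes 4 MB chunks and 8192-chunk slabs covering a 48-bit space; the constants are illustrative, not the current MMTk values.

```rust
const LOG_BYTES_IN_CHUNK: u32 = 22;  // 4 MB chunks (assumed)
const LOG_CHUNKS_IN_SLAB: u32 = 13;  // 8192 chunks per Slab (assumed)
const LOG_SLABS: u32 = 48 - LOG_BYTES_IN_CHUNK - LOG_CHUNKS_IN_SLAB;

/// Direct indexing: O(1), no hashing, no probing, no lock on the read path.
/// The slab entries themselves can still be allocated lazily on first use.
fn slab_index(addr: usize) -> usize {
    (addr >> (LOG_BYTES_IN_CHUNK + LOG_CHUNKS_IN_SLAB)) & ((1 << LOG_SLABS) - 1)
}
```

Every address maps to exactly one slot, so the pathological linear scan cannot occur by construction.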