-
Notifications
You must be signed in to change notification settings - Fork 292
CP-49078: Preprocess fields into a Hashtbl within get_record #6114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CP-49078: Preprocess fields into a Hashtbl within get_record #6114
Conversation
Signed-off-by: Colin James <colin.barr@cloud.com>
8e4d0ea
to
b60c587
Compare
BVT+BST (208107) all green except 2 known issues (I think). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are tables shrining or removed?
The database internals are unchanged. It's still The 3 relevant commits do the following:
In these cases, the hash tables are ephemeral. In the third commit:
This probably persists for the entirety of xapi since the Schema is constructed from the datamodel and consulted often. The benefit is purely removing the huge cost of comparing strings. The 3 relevant commits remove 3s off the total time taken by my serial In each case, I experimented with precomputing a perfect hash table instead. I don't think it's worth the effort, because:
Hashtbl won out against all other implementations: balanced trees ( The other side effect is that it kind of "normalises" the shape of flame graphs we collect. Instead of their width being dominated by |
Flame graphs indicate that, under load created by parallel "xe vm-list" commands, the DB action get_record is hit often. This function constructs an API-level record by marshalling an association list that maps field names to unmarshalled string values. To do this, it serially queries all the field names using List.assoc. This has rather large cost in doing lexicographical string comparisons (caml_compare on string keys). To avoid this, regardless of record size, we preprocess the association lists __regular_fields and __set_refs into a (string, string) Hashtbl.t and query that to construct each record field. Signed-off-by: Colin James <colin.barr@cloud.com>
Rewrite the API.assocer function to internally construct a hash table from an association list. Then, shadow its usage in relevant places. Signed-off-by: Colin James <colin.barr@cloud.com>
Speed up "find" operation on Schema Tables and Database. Previously, these would use List.find which involves costly comparisons on string keys. Now, we use a hash table, keyed by names, and use that to identify columns and tables. Signed-off-by: Colin James <colin.barr@cloud.com>
6ddd5ef
to
3a49e86
Compare
Flame graphs indicate that, under load created by parallel "xe vm-list" commands, the DB action get_record is hit often. This function constructs an API-level record by marshalling an association list that maps field names to unmarshalled string values. To do this, it serially queries all the field names using
List.assoc
. This has rather large cost in doing lexicographical string comparisons (caml_compare
on string keys).To avoid this, regardless of record size, we preprocess the association lists
__regular_fields
and__set_refs
into a (string, string) Hashtbl.t and query that to construct each record field.This benefit of this change is most notable for large records (such as
VM
andpool
). The cost of the previously generated code, which does a bunch of serialList.assoc
calls, incurs the quadratic cost of list traversal (compounded by the costly lexicographical comparison of strings during the search).To produce measurements, I sampled xapi under a load of 500 consecutive
xe vm-list
invocations (using the same sampling rate withperf
) on a host with a single VM.Without the change, the
get_record
done internally byxe vm-list
makes up for ~33.59% of the samples (33.59% = 1,592,782,255 samples). With the change,get_record
accounts for ~7.56 of the samples (of which there are substantially fewer collected: 7.56% = 264,948,239). So the number of samples forget_record
has dropped from 1,592,782,255 to 264,948,239 (assumingperf
's sampling is reliable).You can see the visual difference in the flame graphs:
Before:

After:

Of course, this is benefit as measured in aggregate (
perf
sampling), so quite a fast and loose comparison. In practice, thexe vm-list
stress test goes from 7.4s to 6.2s (asget_record
makes up only a small part of the work done for a singlexe vm-list
).