-
Notifications
You must be signed in to change notification settings - Fork 292
CP-54826/CP-54827: optimize handling of the Pool object #6445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3a87c35
to
ae60d68
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would also be great to see some performance comparisons from before and after.
invalid_arg (Printf.sprintf "%s: %s" __FUNCTION__ x) | ||
| Ok (t, tz, _) -> | ||
{t; tz} | ||
if String.length x > 5 && x.[4] <> '-' && x.[String.length x - 1] = 'Z' then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have some comment explaining the particular reasons behind each condition here
I can adapt some microbenchmarks that I have to measure these situations, I think the most impactful optimization is the 'sexpr' one but lets see. |
Unused. Avoids having to maintain the function when the code around it changes. Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Added a benchmark, it shows an ~15% improvement, but only when considering all commits cumulatively, when measuring one-by-one there is no discernable change at 95% confidence level. Probably not benchmarking the same thing I've seen in the flamegraph though, will add a few more benchmarks. |
Added benchmarks for individual functions and we can more reliably measure improvements.
|
Found a commit from 2024 that can improve Db.Pool.get_record (by about 28%). |
f6c2fc2
to
27ed5d7
Compare
There is a bug in the bechamel+ministat interaction here, will push a fixup. diff --git a/ocaml/tests/bench/bechamel_simple_cli.ml b/ocaml/tests/bench/bechamel_simple_cli.ml
index c105e5576..9e68a960c 100644
--- a/ocaml/tests/bench/bechamel_simple_cli.ml
+++ b/ocaml/tests/bench/bechamel_simple_cli.ml
@@ -177,7 +177,10 @@ let cli ~always ~workloads cfg tests store =
let label = Measure.label Instance.monotonic_clock in
results.Benchmark.lr
|> Array.iter @@ fun measurement ->
- Printf.fprintf out "%.16g\n" (Measurement_raw.get ~label measurement)
+ Printf.fprintf out "%.16g\n"
+ (Measurement_raw.get ~label measurement
+ /. Measurement_raw.run measurement
+ ) |
Updated results:
|
3a52289
to
30afc28
Compare
Better progress indication, CLI tunables, and the ability to dump raw data to a directory, that can be used by `ministat`. Example usage: ``` mkdir /tmp/pool dune exec ./bench_pool_field.exe --profile=release -- -d /tmp/pool --quota 20 ... dune exec ./bench_pool_field.exe --profile=release -- -d /tmp/pool --quota 20 for i in 'Db.Pool.get_all_records' 'Rpc.t -> pool_t' 'pool_t -> Rpc.t'; do ~/git/ministat/ministat -s "/tmp/pool/${OLD}/${i}.dat" "/tmp/pool/${NEW}/$i.dat" -c 99.5; done ``` Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Write directly to the buffer, instead of writing to a buffer, returning a string, and writing to a buffer again. Escaping shows up in performance profiles, because `trusted_on_first_use` field in the pool contains a JSON, which contains \" characters that need escaping. Reduces memory allocations from 324.4308 mnw/run to 196.3633 mnw/run. Slight change in performance: ``` Db.Pool.get_all_records: N Min Max Median Avg Stddev x 384 81858.452 513398.88 87109.044 88314.45 21963.255 + 389 78006.031 559052.12 83431.429 84780.535 24315.561 Difference at 95.0% confidence -3533.91 +/- 3267.84 -4.00151% +/- 3.63493% (Student's t, pooled s = 23176.9) ``` Signed-off-by: Edwin Török <edwin.torok@cloud.com>
The non-standard S-expression serializer wraps all strings in `'`. The lexer also ignores every non-escaped character after a `'`, until it sees the next `'` or escape char. So it should be safe to avoid escaping `"`. Unescaping is unchanged: any character can be escaped with `\`, and it returns it unchanged after removing the escape char, so this preserves backwards compatibility when loading an old database. Escaping shows up in performance profiles, because `trusted_on_first_use` field in the pool contains a JSON, which contains \" characters that needed escaping. With this change it won't anymore. `ministat` confirms that there is an improvement: ``` sexpr_of_json_string: N Min Max Median Avg Stddev x 806 1438.7128 63663.127 1490.3661 1594.5095 2192.0548 + 850 911.23529 48528.173 967.25054 1037.7674 1632.6855 Difference at 95.0% confidence -556.742 +/- 185.531 -34.9162% +/- 9.28192% (Student's t, pooled s = 1925.34) str_of_sexp_json: N Min Max Median Avg Stddev x 792 1622.9135 49591.388 1719.5377 1786.3412 1702.6472 + 893 605.37329 3354.8035 626.51812 636.34457 107.24734 Difference at 95.0% confidence -1150 +/- 111.92 -64.3772% +/- 2.26583% (Student's t, pooled s = 1169.88) ``` Signed-off-by: Edwin Török <edwin.torok@cloud.com>
For backwards compatibility the serialized form looks like 20250319T04:16:24Z, which had to go through several transformations before it was parsed: 2 sscanf, 2 sprintf, and then the Ptime parser. Since we're parsing the format with sscanf anyway, add a fastpath that builds a Ptime.t directly without going through reformatting the string and reparsing it. This speeds up API replies that contain dates, like pool which contains 'telemetry_next_collection' as a date. `ministat` confirms: ``` Date.of_iso8601: N Min Max Median Avg Stddev x 786 1703.462 98255.061 1796.5826 2031.7502 3858.1277 + 905 525.1954 73923.347 558.17972 711.02195 2732.7547 Difference at 95.0% confidence -1320.73 +/- 315.725 -65.0045% +/- 10.3523% (Student's t, pooled s = 3303.82) Db.Pool.get_all_records: N Min Max Median Avg Stddev x 390 76966.273 498179.5 82995.374 84536.667 21266.749 + 401 69709.657 546133 74811.568 76379.246 23782.821 Difference at 95.0% confidence -8157.42 +/- 3147.12 -9.64957% +/- 3.57009% (Student's t, pooled s = 22577.4) Rpc.t -> pool_t : N Min Max Median Avg Stddev x 554 16945.375 267477.23 17620.482 18195.16 10648.914 + 594 11432.375 251226.67 12011.373 12493.367 9824.9986 Difference at 95.0% confidence -5701.79 +/- 1184.38 -31.3369% +/- 5.53746% (Student's t, pooled s = 10230.9) str_of_sexp_json: N Min Max Median Avg Stddev x 893 605.37329 3354.8035 626.51812 636.34457 107.24734 + 911 510.23412 3298.9936 523.73684 532.25916 98.007142 Difference at 95.0% confidence -104.085 +/- 9.47756 -16.3568% +/- 1.36325% (Student's t, pooled s = 102.685) ``` Signed-off-by: Edwin Török <edwin.torok@cloud.com>
We called the wrong finally here. In general we mark the backtrace in finally as important (which involves formatting it, this could be optimized separately). However Mutex.unlock doesn't raise (in well-behaved code, unless you double unlock), so we can use Fun.protect instead. Hashtbl.find in xapi_local_session.ml raises every time in the RBAC checks, this change avoids the costly backtrace formatting (which was discarded by a try/with later anyway). Signed-off-by: Edwin Török <edwin.torok@cloud.com>
We can use Hashtbl.mem instead of catching the exception from Hashtbl.find. `ministat` confirms: ``` local_session_hook : N Min Max Median Avg Stddev x 1057 117.49169 15160.913 123.5201 142.19665 464.12468 + 1166 38.851635 14365.19 41.619012 57.978805 423.21452 Difference at 95.0% confidence -84.2178 +/- 36.8873 -59.2263% +/- 19.5015% (Student's t, pooled s = 443.137) ``` Although there are also some unexplained, but reproducible slowdowns in code that isn't touched by this commit at all: ``` str_of_sexp_json : N Min Max Median Avg Stddev x 911 510.16149 3075.0192 523.44043 531.48578 97.085991 + 892 605.73462 2951.0948 632.27053 652.10806 95.96889 Difference at 95.0% confidence 120.622 +/- 8.91245 22.6953% +/- 1.88103% (Student's t, pooled s = 96.5349) ``` Perhaps this is due to code layout changes? Signed-off-by: Edwin Török <edwin.torok@cloud.com>
read_record was marshaling the maps all the time, and some of these maps can be quite big. Cache the marshaled form in the row itself. This is a trade-off between memory usage and performance (caching increases global memory usage, but avoids repeated allocations every time get_record is called, which may decrease memory usage/GC pressure under load). Eventually we may be able to drop the marshaled form, but we need to change the db_rpc remote protocol for that (it currently cannot marshal/unmarshal values on its own, because it doesn't have access to the type information). `ministat` confirms: ``` Db.Pool.get_all_records : N Min Max Median Avg Stddev x 400 71008.769 489829.75 75874.584 76957.497 20852.389 + 871 51638.867 469264.11 54400.946 55511.462 19845.992 Difference at 95.0% confidence -21446 +/- 2387.53 -27.8674% +/- 2.84131% (Student's t, pooled s = 20167.8) sexpr_of_json_string : N Min Max Median Avg Stddev x 862 819.74359 44758.053 851.53894 915.0725 1496.4717 + 1688 970.16146 51923.24 1024.1987 1097.3771 1748.4532 Difference at 95.0% confidence 182.305 +/- 136.827 19.9224% +/- 15.8188% (Student's t, pooled s = 1667.57) str_of_sexp_json : N Min Max Median Avg Stddev x 1896 605.00587 2932.3931 627.51199 640.73142 89.48524 + 1812 537.56839 3105.1991 550.25275 558.89761 93.633252 Difference at 95.0% confidence -81.8338 +/- 5.89411 -12.7719% +/- 0.864486% (Student's t, pooled s = 91.5357) ``` Slight increase in 'sexpr_of_json_string'. Signed-off-by: Edwin Török <edwin.torok@cloud.com>
This doesn't have feature flags, they are optimizations.
The pool object has gained a JSON field recently, which slowed it down significantly, because the
"
was getting escaped (so the fastpath that checks whether there are any escapable characters or not couldn't be triggerred).The date field is also very slow to handle due to various backward compatibility format transformations. Add a direct parser for the format used in memory (it'd be good to skip all these serializing/deserializing, but that is a bigger change).
Handling the database requires several
finally
calls, which perform a lot of backtrace formatting, and some code keeps raising exceptions even on happy paths (e.g. Hashtbl.find). We need to fix those too, but we should also speed upfinally
.Xapi_local_session was also raising exceptions all the time, replace this with Hashtbl.mem