Merge feature/perf to master #6229

Merged 75 commits on Jan 31, 2025
75 commits
4e0ecd6
Merge master into feature/perf (#6078)
edwintorok Oct 25, 2024
8c3438d
Update feature/perf from master (#6111)
edwintorok Nov 12, 2024
b6952d6
Update feature branch (#6120)
edwintorok Nov 18, 2024
6011545
CP-49158: [prep] Add Task completion latency benchmark
edwintorok Apr 21, 2024
95dbc42
CP-51690: [prep] Xapi_periodic_scheduler: Factor out Delay.wait call
edwintorok Apr 21, 2024
68af6ce
CP-51690: [bugfix] Xapi_periodic_scheduler: avoid 10s sleep on empty …
edwintorok Apr 21, 2024
a2f3441
CP-51693: feat(use-xmlrpc): [perf] use JSONRPC instead of XMLRPC for …
edwintorok Apr 21, 2024
71a4a84
CP-51701: [perf] Xapi_event: do not convert to lowercase if already l…
edwintorok Apr 21, 2024
774316f
CP-51701: [perf] Xapi_event: drop duplicate lowercase_ascii
edwintorok Apr 21, 2024
4b86134
CP-51701: [perf] Xapi_events: replace List.any+map with List.exists
edwintorok Apr 21, 2024
a8f9bc6
CP-51693: introduce feature flag to use JSONRPC for internal pool com…
psafont Nov 19, 2024
5115fa1
CP-49064:`Tgroup` library
GabrielBuica Sep 25, 2024
ce7de90
CP-51493: Add `set_cgroup`
GabrielBuica Sep 25, 2024
0714ce2
CP-51488: Set `tgroup` based on request header.
GabrielBuica Sep 25, 2024
3d822a7
CP-49064: Init cgroups at xapi startup
GabrielBuica Oct 2, 2024
e90b32c
CP-50537: Always reset `_extra_headers` when making a connection.
GabrielBuica Oct 28, 2024
4f5dbb5
CP-50537: Propagate originator as a http request header
GabrielBuica Oct 23, 2024
13cff9f
CP-51489: Classify threads based on http requests.
GabrielBuica Oct 22, 2024
efaf3f0
CP-50537: Add a guard in `xapi_globs`, `Xapi_globs.tgroups_enabled`.
GabrielBuica Nov 4, 2024
f713c79
CP-51692: feat(use-event-next): introduce use-event-next configuratio…
edwintorok Apr 21, 2024
3a36ed9
CP-52625: workaround Rpc.Int32 parsing bug
edwintorok Apr 21, 2024
e40c1ae
CP-51692: feat(use-event-next): cli_util: use Event.from instead of E…
edwintorok Apr 21, 2024
527e124
CP-50537: TGroup library to manage the priority and classify xapi exe…
edwintorok Nov 19, 2024
ace50ae
CP-51692: feat(use-event-next): xe event-wait: use Event.from instead…
edwintorok Apr 21, 2024
7b5dbcb
CP-51690: fix timeouts shorter than 10s in the periodic scheduler (#6…
robhoes Nov 19, 2024
4068f9d
CA-401651: stunnel_cache: run the cache expiry code periodically
edwintorok Nov 7, 2024
f9a523d
CA-401652: stunnel_cache: set stunnel size limit based on host role
edwintorok Nov 7, 2024
02cec08
CA-388210: rename vm' to vm
edwintorok Nov 21, 2024
ed55521
CA-388210: drop unused domain parameter
edwintorok Aug 7, 2024
bee9e05
CA-388210: factor out computing the domain parameter
edwintorok Aug 7, 2024
77b8ae9
CA-388210: SMAPIv3 concurrency safety: send the (unique) datapath arg…
edwintorok Aug 7, 2024
2686c6f
CA-388210: SMAPIv3 debugging: log PID
edwintorok Aug 7, 2024
b93ce07
CP-52707: Improve Event.from/next API documentation
edwintorok Nov 21, 2024
7c20ec7
CP-52707: Improve Event.from/next API documentation (#6130)
robhoes Nov 21, 2024
d2d21f1
CA-388210: use unique datapaths for concurrent VDI copies (#5920)
edwintorok Nov 25, 2024
5f1a59c
CP-51692: use Event.from instead of Event.next (#6125)
edwintorok Nov 25, 2024
f1c3cee
CA-388210: SMAPIv3 concurrency: turn on concurrent operations by default
edwintorok Aug 7, 2024
3e36355
CA-388210: delete comment about deadlock bug, they are fixed
edwintorok Dec 3, 2024
c69162b
CA-388210: SMAPIv3 concurrency: turn on concurrent operations by defa…
edwintorok Dec 3, 2024
44fb57a
CA-401650: reduce open connections between pool members and the coord…
robhoes Dec 3, 2024
d2804d6
CA-388564: move qemu-dm to vm.slice
edwintorok Dec 3, 2024
59da2e0
CA-388564: move qemu-dm to vm.slice (#6150)
edwintorok Dec 4, 2024
51e9cf6
Merge master into feature/perf
edwintorok Dec 10, 2024
ce5abab
Update feature/perf from master (#6167)
edwintorok Dec 10, 2024
1ac3f07
Update feature/perf from master
edwintorok Dec 11, 2024
0b34302
Update feature/perf again (#6173)
edwintorok Dec 12, 2024
6b02474
CP-52821: Xapi_periodic_scheduler: introduce add_to_queue_span
edwintorok Nov 28, 2024
68cb0b8
CP-52821: Xapi_event: use Clock.Timer instead of gettimeofday
edwintorok Nov 28, 2024
6b6c6c5
CP-52821: xapi_periodic_scheduler: use Mtime.span instead of Mtime.t
edwintorok Dec 5, 2024
83f4517
CP-52821: use Mtime in Xapi_periodic_scheduler (#6161)
edwintorok Dec 12, 2024
0e909ec
CP-49158: [prep] batching: add a helper for recursive, batched calls …
edwintorok Apr 18, 2024
efaa606
CP-49158: [prep] Event.from: replace recursion with Batching.with_rec…
edwintorok Apr 18, 2024
3e1d8a2
CP-51692: Event.next: use same batching as Event.from
edwintorok Apr 18, 2024
2b4e0db
CP-49158: [prep] Event.{from,next}: make delays configurable and prep…
edwintorok Apr 20, 2024
9435eea
CP-49158: Event.next is deprecated: increase delays
edwintorok Apr 20, 2024
0beb5c1
CP-49158: Use exponential backoff for delay between recursive calls
edwintorok Apr 20, 2024
257af94
CP-49158: Throttle: add Thread.yield
edwintorok Apr 18, 2024
8a427b9
CP-52526: rate limit event updates (#6126)
edwintorok Dec 12, 2024
767b3dd
CP-49141: add OCaml timeslice setter
edwintorok Aug 20, 2024
7e42f49
CP-52709: add timeslice configuration to all services
edwintorok Aug 20, 2024
3ad905e
CP-52709: add simple measurement code
edwintorok Aug 20, 2024
38e1ad8
CP-52709: recommended measurement
edwintorok Aug 20, 2024
93f85be
CP-52709: Enable timeslice setting during unit tests by default
edwintorok Aug 22, 2024
a454548
CP-52320: Improve xapi thread classification
GabrielBuica Dec 4, 2024
76c8556
CP-52320 & CP-52795: Add unit tests for tgroup library
GabrielBuica Dec 4, 2024
63391ba
CP-52320 & CP-52743: Classify xapi threads.
GabrielBuica Dec 4, 2024
6589d9a
Xapi thread classification - part 2 (#6154)
mg12 Jan 8, 2025
8b8af63
Merge remote-tracking branch 'upstream/master' into feature/perf
edwintorok Jan 13, 2025
b418d69
Update feature/perf from master (#6218)
edwintorok Jan 13, 2025
9c5c8dd
CP-52709: use timeslices shorter than 50ms (#6177)
edwintorok Jan 13, 2025
77147a3
CP-51692: Do not enable Event.next ratelimiting if Event.next is stil…
edwintorok Jan 7, 2025
b892c6e
CP-51692: Do not enable Event.next ratelimiting if Event.next is stil…
edwintorok Jan 14, 2025
6ccaf7b
Update feature/perf with latest blocker fixes (#6237)
edwintorok Jan 20, 2025
c69b19d
Merge remote-tracking branch 'upstream/master' into feature/perf
edwintorok Jan 31, 2025
e39baa6
Merge master into feature/perf and fix conflicts (#6265)
edwintorok Jan 31, 2025
10 changes: 10 additions & 0 deletions dune-project
@@ -37,6 +37,13 @@
)
)

(package
(name tgroup)
(depends
xapi-log
xapi-stdext-unix)
)

(package
(name xml-light2)
)
@@ -321,6 +328,7 @@
(synopsis "The toolstack daemon which implements the XenAPI")
(description "This daemon exposes the XenAPI and is used by clients such as 'xe' and 'XenCenter' to manage clusters of Xen-enabled hosts.")
(depends
(ocaml (>= 4.09))
(alcotest :with-test)
angstrom
astring
@@ -374,6 +382,7 @@
tar
tar-unix
uri
tgroup
(uuid (= :version))
uutf
uuidm
@@ -587,6 +596,7 @@ This package provides an Lwt compatible interface to the library.")
(safe-resources(= :version))
sha
(stunnel (= :version))
tgroup
uri
(uuid (= :version))
xapi-backtrace
1 change: 1 addition & 0 deletions http-lib.opam
@@ -22,6 +22,7 @@ depends: [
"safe-resources" {= version}
"sha"
"stunnel" {= version}
"tgroup"
"uri"
"uuid" {= version}
"xapi-backtrace"
22 changes: 17 additions & 5 deletions ocaml/idl/datamodel.ml
@@ -8517,11 +8517,18 @@ module Event = struct
]
~doc:
"Blocking call which returns a (possibly empty) batch of events. This \
- method is only recommended for legacy use. New development should use \
- event.from which supersedes this method."
+ method is only recommended for legacy use. It stores events in a \
+ buffer of limited size, raising EVENTS_LOST if too many events got \
+ generated. New development should use event.from which supersedes \
+ this method."
~custom_marshaller:true ~flags:[`Session]
~result:(Set (Record _event), "A set of events")
- ~errs:[Api_errors.session_not_registered; Api_errors.events_lost]
+ ~errs:
+ [
+ Api_errors.session_not_registered
+ ; Api_errors.events_lost
+ ; Api_errors.event_subscription_parse_failure
+ ]
~allowed_roles:_R_ALL ()

let from =
@@ -8551,7 +8558,8 @@ module Event = struct
~doc:
"Blocking call which returns a new token and a (possibly empty) batch \
of events. The returned token can be used in subsequent calls to this \
- function."
+ function. It eliminates redundant events (e.g. same field updated \
+ multiple times)."
~custom_marshaller:true ~flags:[`Session]
~result:
( Set (Record _event)
@@ -8562,7 +8570,11 @@ module Event = struct
(*In reality the event batch is not a set of records as stated here.
Due to the difficulty of representing this in the datamodel, the doc is generated manually,
so ensure the markdown_backend.ml and gen_json.ml is updated if something changes. *)
- ~errs:[Api_errors.session_not_registered; Api_errors.events_lost]
+ ~errs:
+ [
+ Api_errors.event_from_token_parse_failure
+ ; Api_errors.event_subscription_parse_failure
+ ]
~allowed_roles:_R_ALL ()

let get_current_id =
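
Review note on the event.from doc string above: it describes a token that is passed back on the next call, plus elimination of redundant events. The following is a minimal, self-contained sketch of that idea only. The `event` record, the integer token, and the names `coalesce` and `poll_from` are all hypothetical stand-ins for illustration; they are not xapi's actual types or functions (the real token is an opaque string and the real event record carries more fields).

```ocaml
(* Hypothetical event shape, for illustration only. *)
type event = {id: int; obj_ref: string; snapshot: string}

(* Keep only the most recent event per object, then order survivors by id. *)
let coalesce events =
  let latest = Hashtbl.create 16 in
  List.iter
    (fun e ->
      match Hashtbl.find_opt latest e.obj_ref with
      | Some prev when prev.id >= e.id ->
          ()
      | _ ->
          Hashtbl.replace latest e.obj_ref e
    )
    events ;
  Hashtbl.fold (fun _ e acc -> e :: acc) latest []
  |> List.sort (fun a b -> compare a.id b.id)

(* Assumption: the token is modelled as the highest event id already seen. *)
let poll_from ~token log =
  let fresh = List.filter (fun e -> e.id > token) log in
  let token' = List.fold_left (fun m e -> max m e.id) token fresh in
  (token', coalesce fresh)

let log =
  [
    {id= 1; obj_ref= "vm1"; snapshot= "a"}
  ; {id= 2; obj_ref= "vm1"; snapshot= "b"}
  ; {id= 3; obj_ref= "vm2"; snapshot= "c"}
  ]
```

Calling `poll_from ~token:0 log` in this sketch returns the new token together with one event per object; calling it again with the returned token yields an empty batch, which matches the "(possibly empty) batch" wording in the doc string.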
1 change: 1 addition & 0 deletions ocaml/libs/http-lib/dune
@@ -43,6 +43,7 @@
ipaddr
polly
tgroup
threads.posix
tracing
tracing_propagator
10 changes: 10 additions & 0 deletions ocaml/libs/http-lib/http.ml
@@ -132,6 +132,8 @@ module Hdr = struct
let location = "location"

let originator = "originator"

let hsts = "strict-transport-security"
end

@@ -674,6 +676,14 @@ module Request = struct
let headers, body = to_headers_and_body x in
let frame_header = if x.frame then make_frame_header headers else "" in
frame_header ^ headers ^ body

let with_originator_of req f =
Option.iter
(fun req ->
let originator = List.assoc_opt Hdr.originator req.additional_headers in
f originator
)
req
end

module Response = struct
2 changes: 2 additions & 0 deletions ocaml/libs/http-lib/http.mli
@@ -126,6 +126,8 @@ module Request : sig
val to_wire_string : t -> string
(** [to_wire_string t] returns a string which could be sent to a server *)

val with_originator_of : t option -> (string option -> unit) -> unit
end

(** Parsed form of the HTTP response *)
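
Review note on `with_originator_of`: the new value has no doc comment in the .mli, so its behaviour for a missing request or a missing header is only visible from the implementation. The sketch below reproduces the function body from http.ml against a cut-down, hypothetical `request` record (the real `Http.Request.t` has many more fields) to show the three cases. `record` and `calls` are illustration-only helpers.

```ocaml
(* Cut-down stand-in for Http.Request.t; only the field used here. *)
type request = {additional_headers: (string * string) list}

let originator_hdr = "originator"

(* Same shape as the function added in http.ml. *)
let with_originator_of req f =
  Option.iter
    (fun req ->
      let originator = List.assoc_opt originator_hdr req.additional_headers in
      f originator
    )
    req

(* Records every value the callback receives, most recent first. *)
let calls : string option list ref = ref []

let record o = calls := o :: !calls

let () =
  (* No request parsed: the callback is not invoked at all. *)
  with_originator_of None record ;
  (* Request without the header: the callback receives None. *)
  with_originator_of (Some {additional_headers= []}) record ;
  (* Request with the header: the callback receives its value. *)
  with_originator_of
    (Some {additional_headers= [(originator_hdr, "cli")]})
    record
```

In this sketch the callback runs twice, not three times, so a consumer such as `Tgroup.of_req_originator` would have to treat `None` as "request present, header absent" rather than "no request".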
2 changes: 2 additions & 0 deletions ocaml/libs/http-lib/http_svr.ml
@@ -574,6 +574,8 @@ let handle_connection ~header_read_timeout ~header_total_timeout
in

Http.Request.with_originator_of req Tgroup.of_req_originator ;

(* 2. now we attempt to process the request *)
let finished =
Option.fold ~none:true
14 changes: 10 additions & 4 deletions ocaml/libs/stunnel/stunnel_cache.ml
@@ -40,15 +40,19 @@ let debug = if debug_enabled then debug else ignore_log

(* Need to limit the absolute number of stunnels as well as the maximum age *)
- let max_stunnel = 70
+ let max_stunnel = Atomic.make 70

- let max_age = 180. *. 60. (* seconds *)
+ let set_max_stunnel n =
+ D.info "Setting max_stunnel = %d" n ;
+ Atomic.set max_stunnel n

- let max_idle = 5. *. 60. (* seconds *)
+ let max_age = ref (180. *. 60.) (* seconds *)

+ let max_idle = ref (5. *. 60.) (* seconds *)

(* The add function adds the new stunnel before doing gc, so the cache *)
(* can briefly contain one more than maximum. *)
- let capacity = max_stunnel + 1
+ let capacity = Atomic.get max_stunnel + 1

(** An index of endpoints to stunnel IDs *)
let index : (endpoint, int list) Hashtbl.t ref = ref (Hashtbl.create capacity)
@@ -104,6 +108,7 @@ let unlocked_gc () =
let to_gc = ref [] in
(* Find the ones which are too old *)
let now = Unix.gettimeofday () in
+ let max_age = !max_age and max_idle = !max_idle in
Tbl.iter !stunnels (fun idx stunnel ->
match Hashtbl.find_opt !times idx with
| Some time ->
@@ -122,6 +127,7 @@ let unlocked_gc () =
debug "%s: found no entry for idx=%d" __FUNCTION__ idx
) ;
let num_remaining = List.length all_ids - List.length !to_gc in
+ let max_stunnel = Atomic.get max_stunnel in
if num_remaining > max_stunnel then (
let times' = Hashtbl.fold (fun k v acc -> (k, v) :: acc) !times [] in
let times' =
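
Review note on the `Atomic` change: `capacity` is a top-level `let`, so it reads `max_stunnel` once at module initialisation, whereas `unlocked_gc` re-reads it on each run. The sketch below illustrates that difference with stand-in definitions. It omits the `D.info` logging, `over_limit` is a hypothetical helper that does not exist in the diff, and `Atomic` needs OCaml 4.12 or later.

```ocaml
(* Stand-in for the cache limit in stunnel_cache.ml (logging omitted). *)
let max_stunnel = Atomic.make 70

let set_max_stunnel n = Atomic.set max_stunnel n

(* Evaluated once, when the module is initialised: a snapshot of the limit. *)
let capacity = Atomic.get max_stunnel + 1

(* Hypothetical helper: re-reads the limit on every call, as unlocked_gc does. *)
let over_limit num_remaining = num_remaining > Atomic.get max_stunnel

let () =
  set_max_stunnel 10 ;
  Printf.printf "capacity=%d over_limit(20)=%b\n" capacity (over_limit 20)
```

In the diff `capacity` is only used as the initial size hint for the hash tables, so a stale snapshot looks harmless there; the limit that matters is the one read inside the GC pass.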
11 changes: 11 additions & 0 deletions ocaml/libs/stunnel/stunnel_cache.mli
@@ -19,6 +19,11 @@
*)

val set_max_stunnel : int -> unit
(** [set_max_stunnel n] sets the maximum number of unused but cached client stunnel connections.
This should be a low number on pool members, to avoid hitting limits on the coordinator with large pools.
*)

val with_connect :
?use_fork_exec_helper:bool
-> ?write_to_log:(string -> unit)
@@ -46,3 +51,9 @@ val flush : unit -> unit

val gc : unit -> unit
(** GCs old stunnels *)

val max_age : float ref
(** maximum time a connection is kept in the stunnel cache, counted from the time it got initially added to the cache *)

val max_idle : float ref
(** maximum time a connection is kept in the stunnel cache, counted from the most recent time it got (re)added to the cache. *)
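
Review note on `max_age` versus `max_idle`: the two doc comments differ only in which timestamp the limit is counted from. A minimal sketch of an expiry predicate consistent with those comments follows. The function `expired` and its labelled arguments `added` and `last_used` are hypothetical names chosen for illustration, and the exact comparison in `unlocked_gc` is not visible in this hunk, so treat the inequalities as an assumption.

```ocaml
(* Same defaults as in the diff, in seconds. *)
let max_age = ref (180. *. 60.)

let max_idle = ref (5. *. 60.)

(* [added]: when the connection first entered the cache.
   [last_used]: when it was most recently (re)added.
   Both names are hypothetical. *)
let expired ~now ~added ~last_used =
  now -. added > !max_age || now -. last_used > !max_idle
```

Because both limits are exposed as `float ref`, a caller such as a unit test can shorten them, for example `max_idle := 1.`, without rebuilding the library.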
11 changes: 11 additions & 0 deletions ocaml/libs/tgroup/dune
@@ -0,0 +1,11 @@
(library
(name tgroup)
(modules tgroup)
(public_name tgroup)
(libraries xapi-log xapi-stdext-unix xapi-stdext-std))

(test
(name test_tgroup)
(modules test_tgroup)
(package tgroup)
(libraries tgroup alcotest xapi-log))
83 changes: 83 additions & 0 deletions ocaml/libs/tgroup/test_tgroup.ml
@@ -0,0 +1,83 @@
module D = Debug.Make (struct let name = __MODULE__ end)

let test_identity () =
let specs =
[
((Some "XenCenter2024", "u1000"), "u1000/XenCenter2024")
; ((None, "u1001"), "u1001")
; ((None, "Special!@#"), "Special")
; ((Some "With-Hyphen", "123"), "123/WithHyphen")
; ((Some "", ""), "root")
; ((Some " Xen Center 2024 ", ", u 1000 "), "u1000/XenCenter2024")
; ((Some "Xen Center ,/@.~# 2024", "root"), "root/XenCenter2024")
; ((Some "XenCenter 2024.3.18", ""), "root/XenCenter2024318")
; ((Some "", "S-R-X-Y1-Y2-Yn-1-Yn"), "SRXY1Y2Yn1Yn")
; ( (Some "XenCenter2024", "S-R-X-Y1-Y2-Yn-1-Yn")
, "SRXY1Y2Yn1Yn/XenCenter2024"
)
]
in

let test_make ((user_agent, subject_sid), expected_identity) =
let actual_identity =
Tgroup.Group.Identity.(make ?user_agent subject_sid |> to_string)
in
Alcotest.(check string)
"Check expected identity" expected_identity actual_identity
in
List.iter test_make specs

let test_of_creator () =
let dummy_identity =
Tgroup.Group.Identity.make ~user_agent:"XenCenter2024" "root"
in
let specs =
[
((None, None, None, None), "external/unauthenticated")
; ((Some true, None, None, None), "external/intrapool")
; ( ( Some true
, Some Tgroup.Group.Endpoint.External
, Some dummy_identity
, Some "sm"
)
, "external/intrapool"
)
; ( ( Some true
, Some Tgroup.Group.Endpoint.Internal
, Some dummy_identity
, Some "sm"
)
, "external/intrapool"
)
; ( ( None
, Some Tgroup.Group.Endpoint.Internal
, Some dummy_identity
, Some "cli"
)
, "internal/cli"
)
; ( (None, None, Some dummy_identity, Some "sm")
, "external/authenticated/root/XenCenter2024"
)
]
in
let test_make ((intrapool, endpoint, identity, originator), expected_group) =
let originator = Option.map Tgroup.Group.Originator.of_string originator in
let actual_group =
Tgroup.Group.(
Creator.make ?intrapool ?endpoint ?identity ?originator ()
|> of_creator
|> to_string
)
in
Alcotest.(check string) "Check expected group" expected_group actual_group
in
List.iter test_make specs

let tests =
[
("identity make", `Quick, test_identity)
; ("group of creator", `Quick, test_of_creator)
]

let () = Alcotest.run "Tgroup library" [("Thread classification", tests)]
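
Review note on `test_identity`: the expectation table implies a sanitisation rule (drop everything that is not an ASCII letter or digit, fall back to `root` for an empty subject, omit an empty or absent user agent). The sketch below is a hypothetical re-implementation inferred from the table only; it is not the code of `Tgroup.Group.Identity`, and `sanitize` and `identity_to_string` are illustration-only names.

```ocaml
let is_alnum = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' ->
      true
  | _ ->
      false

(* Drop every character that is not an ASCII letter or digit. *)
let sanitize s = String.to_seq s |> Seq.filter is_alnum |> String.of_seq

(* Inferred from the test table: empty subject becomes "root",
   empty or missing user agent is left out. *)
let identity_to_string ?user_agent subject_sid =
  let sid = match sanitize subject_sid with "" -> "root" | s -> s in
  match Option.map sanitize user_agent with
  | None | Some "" ->
      sid
  | Some ua ->
      sid ^ "/" ^ ua
```

If the real implementation follows a different rule for some input outside the table (non-ASCII characters, for instance), this sketch would diverge; it only reproduces the cases listed in `specs`.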