-
Notifications
You must be signed in to change notification settings - Fork 292
Update feature/perf from master #5913
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Prior to this commit, the xapi on the coordinator host records the 'Unix.gettimeofday' as the timestamps of the heartbeat with other pool supporter hosts. When the system clock is updated with a huge jump forward, the timestamps would be far back into the past. This would cause the xapi assumes that the hosts are offline as long time no heartbeats. In this commit, the timestamps are changed to get from a monotonic clock. In this way, the system clock changes will not impact the heartbeats' timestamps any more. Additionally, Host_metrics.last_updated is only set when the object is created. It's useless in check_host_liveness at all. Signed-off-by: Ming Lu <ming.lu@cloud.com>
…to-non-cdn-update Merge master to feature/non-cdn-update
…to-non-cdn-update Merge master to feature/non-cdn-update
Add a field "origin" in "repository" to indicate the origin of the repository. It's an enum type with 2 values: "remote" and "bundle". Signed-off-by: Bengang Yuan <bengang.yuan@cloud.com>
Signed-off-by: Bengang Yuan <bengang.yuan@cloud.com>
CP-49212: Update datamodel for non-CDN update
By default message-switch calls are serialized for backwards compatibility reasons in the Lwt and Async modes. (We tried enabling parallel actions by default but got some hard to debug failures in the CI). This causes very long VM start times when multiple VBDs are plugged/unplugged concurrently: the operations are seen concurrently by message-switch, but xapi-storage-script only retrieves and dispatches them sequentially, so any opportunity for parallel execution is lost. Even though the actions themselves only take seconds, due to serialization, a VM start may take minutes. Enable parallel processing explicitly here (instead of for all message-switch clients). SMAPIv3 should expect to be called concurrently (on different hosts even), so in theory this change should be safe and improve performance, but there are some known bugs in SMAPIv3 plugins currently. So introduce a config file flag 'concurrent' for now, that defaults to false, but that can be turned to 'true' for testing purposes. When all SMAPIv3 concurrency bugs are believed to be fixed we can flip the default, and eventually remove this flag once no more bugs are reported. The configuration value is done as a global to simplify integrating intot he Lwt port, instead of changing a lot of functions to thread through an argument. This doesn't change the behaviour of xapi-storage-script in its default configuration. Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Add a module Tar_ext to unpack a tar file. During the unpacking process, verify the tar file, containing total file size, file type, file path, file integrity, etc. Signed-off-by: Bengang Yuan <bengang.yuan@cloud.com>
Signed-off-by: Bengang Yuan <bengang.yuan@cloud.com>
CP-49213: Add new tar unpacking module
Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Signed-off-by: Edwin Török <edwin.torok@cloud.com>
If we want to reproduce a failure we need to know the exact commandline that was used. Longer than 80 chars is not a problem, this is a logfile, and a truncated line is worse than a long line. Signed-off-by: Edwin Török <edwin.torok@cloud.com>
Non-running VMs' metrics are stored in the coordinator. When the coordinator is asked about the metrics try to unarchive them instead of failing while trying to fetch the coordinator's IP address. This needs to force the HTTP method of the query to be POST Also returns a Service Unavailable when the host is marked as Broken. Signed-off-by: Pau Ruiz Safont <pau.ruizsafont@cloud.com>
Forces users to use an address, instead of being implicit, this avoid the underlying cause for the issue fixed in the previous commit: it allowed a coordinator to call Pool_role.get_master_address, which always fails. Signed-off-by: Pau Ruiz Safont <pau.ruizsafont@cloud.com>
This makes the selection of the action obvious, previously the two booleans made it hazy to understand the decision, and was part of the error why the coordinator tried to get the coordinator address from the pool_role file (and failed badly) Signed-off-by: Pau Ruiz Safont <pau.ruizsafont@cloud.com>
Signed-off-by: Pau Ruiz Safont <pau.ruizsafont@cloud.com>
Currently a List.assoc is used, which raises an unhandled exception. Signed-off-by: Pau Ruiz Safont <pau.ruizsafont@cloud.com>
Shell quoting changes in xen-api.git 65f152d broke the dry-run functionality, as by quoting parameters in the way it was done it meant the space separation was not properly handled to separate parameters etc. Signed-off-by: Alex Brett <alex.brett@cloud.com>
The xe-[backup,restore]-metadata scripts have cleanup logic designed to ensure we do not leave any vbd objects etc behind. This logic calls `vbd-unplug` with a 20s timeout, and then (regardless of the result) allows up to 10s for any device specified in the VBD to disappear - if it does not, it does not trigger a `vbd-destroy`. This logic fails in the case where a VDI is attached to a VM running on the same host, as the `device` field in the new VBD will be populated with the backend device for the running VM. In this case, the `vbd-unplug` fails immediately (as the vbd is not plugged because the original `vbd-plug` attempt fails as the VDI is in use), but then we sit waiting for 10s for the device to disappear (which is obviously does not), and then fail to trigger a `vbd-destroy`, leaving the VBD behind. Fix this by simply removing the wait for the device to disappear and always attempting a `vbd-destroy`, as I am not aware of any situation where this additional 10s wait will give any benefit given current behaviours. Signed-off-by: Alex Brett <alex.brett@cloud.com>
This patch makes the feature use the debugfs utility, part of e2fsprogs. This makes the system as a whole a heck of a lot better, if only because it won't be able to parse XFS, ReiserFS or any of the many plugins of libfsimage. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Add a new option `-o` to xe-restore-metadata, which is used to distinguish whether to allow use of legacy backup VDIs, or enforce only use of the new format VDIs with known UUIDs. Also modify xe-restore-metadata such that it no longer stops searching the candidate list if only one VDI is found, but instead identifies all possible backup VDIs. If more than one is found, and you are doing anything other than listing the VDIs, the script will abort. This is to cover the case where a malicious legacy format VDI is present - we will detect it and the expected 'real' backup VDI. Modify xe-backup-metadata to always expect to use the deterministic UUID to identify the VDI to add backups to, do not rely on the `other-config:ctxs-pool-backup` property for identification in any way. This is XSA-459 / CVE-2024-31144 Signed-off-by: Alex Brett <alex.brett@cloud.com>
- Quote a parameter - Adjust how we check the returncode of some function calls to satifsy shellcheck - Disable the warnings where we are explicitly relying on string splitting Signed-off-by: Alex Brett <alex.brett@cloud.com>
This parameter was added in 7f1d315, but the changes to always use the new metadata VDIs with known UUIDs means it is no longer required, so remove it. Signed-off-by: Alex Brett <alex.brett@cloud.com>
Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
* Remove redundant parameter wiping Removes ineffectual parameter wiping introduced by 6e24ca4. Signed-off-by: Colin James <colin.barr@cloud.com>
Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Other selectors help strings have spaces before them, VM doesn't, so the final combined help string looked like this: "Suspend the selected VM(s).VMs...." Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Currently, `xe <TAB>` shows a pager after asking whether or not to show 600+ commands. This is extremely unhelpful and does not speed up working with the CLI. Instead, determine unique prefixes ("groups", or "modules", such as 'vm', or 'repository', or 'pool') for all commands, and show them first (there's only 50). Once the command entered by the user unambiguously refers to only one such group, only then show the subcommands in said group (e.g. if 'vm-' is entered, show 'vm-copy', 'vm-migrate', etc.) This does not change the underlying CLI workings, just how autocompletion presents existing commands. Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Group data sources. Leave groups with '-' as suffix, and unique data source names without '-' (i.e. given input of "cpu0-foo,cpu0-bar,xapi_open_fds", return "cpu0-,xapi_open_fds"). Once the data source group is referred to unambiguously, show all suggestions Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
These are cleaned up from the final string before it's autocompleted into the command line. Stylistically, the auxiliary information is right-padded to the longest command so that it starts on the same vertical line. The help string is cut to the first sentence since some of these can be quite long. Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Covers all kinds of UUIDs (options 'uuid=', 'sr-uuid=') These are stripped before the suggestion is entered into the command line. Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Until autocompletion can be generated from the datamodel, this is just naive hardcoding. These cases were obtained by looking for boolean fields: 'xapi-cli-server/records.ml | grep bool_of_string' and 'grep string_of_bool' Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
It was a duplicate switch case that would never be hit - unify them instead. Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
':' as separators did not work, as colon was treated as a separator and wasn't parsed correctly in the initial OLDSTYLE_WORDS phase. This makes the behaviour correct and improves the code around it. Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Dereferences need to be quoted, but curly brackets are only needed when the variable name is inside a sentence where characters after it can still potentially refer to a variable name; or when indexing into an array. Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Check if the parameter has already been provided and if so, do not include it in the list of autocompletions anymore. This should be safe - xe has other ways of entering parameters multiple times: with commas, or as parts of records. To verify, cli_operations.ml receives the command line arguments as an associative list and only contains List.assoc and the like, no List.find_all (apart from the special case in 5d1601c which detects if a parameter was entered multiple times and errors out, warning the user - which only underlines that entering a parameter multiple times is not the desired behaviour, can confuse things and lead to errors) Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
…ompletion If command has both optional and required parameters, shows a prefix before the parameter name specifying its status: ``` $ xe vmss-create <TAB><TAB> OPTIONAL: enabled= REQUIRED: frequency= OPTIONAL: name-description= REQUIRED: name-label= OPTIONAL: retained-snapshots= REQUIRED: type= OPTIONAL: schedule: $ xe vmss-create n<TAB><TAB> OPTIONAL: name-description= REQUIRED: name-label= ``` But: ``` $ xe observer-create <TAB><TAB> OPTIONAL: attributes= OPTIONAL: endpoints= REQUIRED: name-label= OPTIONAL: components= OPTIONAL: host-uuids= OPTIONAL: enabled= OPTIONAL: name-description= $ xe observer-create en<TAB><TAB> enabled= endpoints= ``` (If suggested parameters all come from one class, this status is not shown due to the limitation in the architecture of the autocompletion - this prefix would be common to all commands and would be inserted into the user's command line) Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
…ters ``` $ xe vdi-create <Ctrl-r><q> $ xe vdi-create sr-uuid= name-label= virtual-size= ``` Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Factors out logic for transforming custom specifications into standard specifications in order to permit custom argument specifications to be provided alongside descriptions. Introduces a new list of options for this purpose. Signed-off-by: Colin James <colin.barr@cloud.com>
In preparation for memoization, we wrap a region of "login_with_password" (the part calling out to external authentication code) within a nested function ("query_external_auth"). We introduce a new record type "external_auth_result" that represents the set of downward exposed variables that that are bound within the code our nested function has incorporated. This record becomes the result of "query_external_auth". Consequently, the function is immediately invoked and its result is matched upon to bring all of the fields into scope. The future memoization code will treat a call to this function as the "slow path". Signed-off-by: Colin James <colin.barr@cloud.com>
Implements a wrapper for Psq that provides maximum capacity behaviour. Signed-off-by: Colin James <colin.barr@cloud.com>
Adds basic functional testing for a bounded priority search queue. Signed-off-by: Colin James <colin.barr@cloud.com>
Provides a simple authentication cache built using a bounded priority search queue. The structure of the cache is effectively a (bounded) priority search queue, mapping usernames to entries - where entries have an explicit time-to-live and can store arbitrary data (optionally secured with a cryptographic hash). Also provides a configurable expiry span in the form of a global in Xapi_globs, "external_authentication_expiry". This time span is added to the current time - upon entering data into the cache - to provide an expiration time. Its default value is 5 minutes. The input is specified in seconds. The underlying priority search queue treats the minimum entry as being the entry with highest priority. Therefore, as entries are compared by their expiration time, the entry that is soonest to expire is the one that is evicted when attempting to add a new entry when the cache is at capacity. Signed-off-by: Colin James <colin.barr@cloud.com>
Signed-off-by: Colin James <colin.barr@cloud.com>
- Adds mirage-crypto-rng and mirage-crypto-rng.unix libraries to xapi's dune file. These are used to provide random number generation used for generating salts. - Introduces a cache implementation that secures authenticated sessions (results) using SHA-512 hashes (computed using libcrypt's crypt_r function). Signed-off-by: Colin James <colin.barr@cloud.com>
Adds configurable options "enable-external-authentication-cache" and "external-authentication-cache-size" to Xapi_globs, which control whether the cache is enabled and the maximum capacity of the cache, respectively. Also adds "external-authentication-expiry" option, which permits configuration of the time-to-live assigned to each cache entry. The global that it manipulates in Xapi_globs was introduced earlier. In future, these configuration options may become a pool-wide feature. Signed-off-by: Colin James <colin.barr@cloud.com>
This change provides the workings required to allow the previously functionalised section of login_with_password to be bypassed by consulting an authentication cache. Before attempting to execute that section of login_with_password (which calls out to external authentication plugins), a cache is consulted. If a previously-computed result resides within the cache, that is provided. If there is no such result present in the cache, or external authentication caching is disabled, the previous section of code (the "slow path") is invoked. Signed-off-by: Colin James <colin.barr@cloud.com>
tests for Unix.select and introduce select-as-epoll
The epoll tests have been merged to master, so we need to update |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Draft.
Waiting on:
#5912
#5910