Skip to content

Conversation

@georgeliao
Copy link
Contributor

@georgeliao georgeliao commented Jan 30, 2025

MULTI-1765
Public side of https://github.com/canonical/multipass-private/pull/719

A few things to note
The make_cert_key_pair function originally handles the generation of both server and client key-certificate pairs . The server_name parameter determines the use case, when server_name is an empty string, it indicated the client key-certificate case. In the standard gRPC TLS scheme, the root key-certificate pair is added.

Now, the role of the make_cert_key_pair function is still generating and holding server and client key-certificate pairs. However, the server certificate has evolved into a signed server certificate. As the result, the server side branch must first generate root key-certificate pair and use it to sign the newly created server certificate.

Previously the constructor of X509Cert only handled server and client certifcate generation, distinguishing between them based on whether server_address was empty. Now, with the introduction of root, server and client certifcate, the dispatch is managed by using the CertType enumeration.

Additionally, the X509Cert constructor has been refined to generate certificates in a standard format. Key differences between before and after include adjustments to the serial number format and the inclusion of X509v3 extensions. To inspect a certificate, you can use the following command: openssl x509 -in <cert_path> -noout -text

The certificate paths are as follows:

  • Server certificate:/root/.local/share/multipassd/certificates/localhost.pem
  • Client certificate: /home/<user name>/.local/share/multipass-client-certificate/multipass_cert.pem
  • Root certificate: /usr/local/share/ca-certificates/multipass_root_cert.pem

Snap environment considerations
In the snap environment, the root certificate is stored at /var/snap/multipass/common/data/multipassd/certificates/multipass_root_cert.pem
Unlike /root/, /var/snap/ directory allows other users view its files, making this setup feasible.

Certificate regeneration and migration
The root certificate and server key-certificate pairs area automatically regenerated if either is missing. This mechanism ensures a smooth server key-certificate pair migration when updating Multipass. Upon upgrading, the server startup process will automatically generates root key-certificate pair and use it to sign a fresh server certificate. Consequently, the original server key-certificate pair will be overwritten, enabling successful verification under the new standard grpc TLS.

The Multipass upgrade process should be included in the functional testing as well, both the cmd and gui clients should be tested.

Unit test adaptations
The unit tests have been modified to accommodate changes in the gRPC TLS verification process. The key adjustments include:

  1. Mocking the get_root_cert_path to allow usage of string-based certificates.
  2. Merge the two server and client key-certificate pairs into one, With MockCertProvider being used on both server side and client to provide key-certificate pair. In the unit testing environment, they can be the same.

Following the commit history and messages is also a helpful way to understand changes that have been made.

@georgeliao georgeliao changed the title Standard grpc mTLs Standard grpc mTLS Jan 30, 2025
@georgeliao georgeliao marked this pull request as draft January 30, 2025 11:42
@codecov
Copy link

codecov bot commented Jan 31, 2025

Codecov Report

Attention: Patch coverage is 93.38235% with 9 lines in your changes missing coverage. Please review.

Project coverage is 89.33%. Comparing base (23e6620) to head (6c97765).
Report is 3520 commits behind head on main.

Files with missing lines Patch % Lines
src/cert/ssl_cert_provider.cpp 95.68% 5 Missing ⚠️
src/daemon/daemon_config.cpp 57.14% 3 Missing ⚠️
src/utils/utils.cpp 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3909      +/-   ##
==========================================
+ Coverage   89.26%   89.33%   +0.07%     
==========================================
  Files         260      260              
  Lines       14693    14735      +42     
==========================================
+ Hits        13116    13164      +48     
+ Misses       1577     1571       -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@georgeliao georgeliao force-pushed the standard_grpc_mTLS branch 5 times, most recently from 5957e41 to c26c20c Compare February 7, 2025 09:03
@georgeliao georgeliao marked this pull request as ready for review February 7, 2025 09:59
Copy link
Contributor

@andrei-toterman andrei-toterman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick pass with some nitpicks, didn't test functionally yet.

@andrei-toterman andrei-toterman self-requested a review February 7, 2025 15:01
xmkg
xmkg previously approved these changes Feb 11, 2025
Copy link
Member

@xmkg xmkg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except for one small nit.

EDIT: Updated review for the new code changes.

Copy link
Contributor

@andrei-toterman andrei-toterman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, @georgeliao! So now, in the local environment, the GUI works fine both when upgrading from the old certificates and when starting with no certificates. But in the case of the CLI, it works when starting with no certificates, but if I upgrade from old certificates, I get the following error

E0211 14:23:49.406977684   36809 ssl_transport_security.cc:1316]       Handshake failed with fatal error SSL_ERROR_SSL: error:0A000086:SSL routines::certificate verify failed.

And in the snap, both when starting fresh or when upgrading, I get the following

[error] [client] Caught an unhandled exception: failed to open file '/var/snap/multipass/commondata/multipassd/certificates/multipass_root_cert.pem': No such file or directory(2)

@georgeliao
Copy link
Contributor Author

E0211 14:23:49.406977684 36809 ssl_transport_security.cc:1316] Handshake failed with fatal error SSL_ERROR_SSL: error:0A000086:SSL routines::certificate verify failed.

About this root certificate is unsync with server certificate issue in development environment, maybe we can add a check at here, which checks not only the existence of the certificates but also whether the root certificate is the one who signed the server certificate. If not, everything will be re-created. This check can be done by openssl c-api. However, not sure the juice worth the squeeze.

[error] [client] Caught an unhandled exception: failed to open file '/var/snap/multipass/commondata/multipassd/certificates/multipass_root_cert.pem': No such file or directory(2)

This is fixed in the latest version of the PR.

@andrei-toterman
Copy link
Contributor

Yeah, for the dev environment, I don't think it's worth doing the effort. We already have to remove stuff manually, so this is just an extra thing to remove.

Copy link
Contributor

@andrei-toterman andrei-toterman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, now both the CLI and the GUI work as expected, both locally and in the snap, both when upgrading or installing fresh. Thanks, @georgeliao!

Now, one thing to discuss with everyone: the purpose of these changes was so that we could remove our grpc patches, and that goal is accomplished by this PR. And now the CLI verifies the server certificate, which couldn't be avoided without the patches. But the GUI can avoid verifying the server certificate, using the plain vanilla grpc dart library. My question is: should the GUI also verify the server certificate, to be in line with the CLI, or should we keep the existing, functioning behavior of not having the GUI verify the server certificate?

@ricab
Copy link
Collaborator

ricab commented Feb 12, 2025

As discussed elsewhere, let's have the GUI also adhere to the new scheme, preferably in a separate PR.

Copy link
Collaborator

@ricab ricab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @georgeliao, lots of good work in here!

I am only doing a tertiary review in this case, so I glossed over most of the scary low-level certificate stuff. @andrei-toterman and @xmkg, we're relying on you on that front 💪 If you could provide assurance that you've verified sanity of all the nitty-gritty, that would be much appreciated. All of this would be for nothing if security were somehow broken on the foundation...

Other than that, I have some proposals for path derivation and the initialization. Let me know what you think.

@georgeliao georgeliao force-pushed the standard_grpc_mTLS branch 2 times, most recently from ac0e4ee to ee42308 Compare February 17, 2025 10:05
…ts sub-folder."

This reverts commit 26415bc.

Revert "[qemu platform] enforce ower_all permission to network sub-folder."

This reverts commit 6a1879f.

Revert "[open ssh] enforce  permission to ssh-keys sub-folder."

This reverts commit b2afed4.

Revert "[vault] enforce ower_all permission to vault sub-folder."

This reverts commit 1ad954d.

Revert "[daemon] enforce ower_all permission to multipassd-vm-instances.json file."

This reverts commit 72f55a9.
Step1: recursively restrict everything before the creation of certificates directory, opens up the data directory only for other user read.
Step 2: create certificate directory and certificates with the right permission. Or overwrite the permissions in the case of existing certificates directory and certificate files.
Sploder12
Sploder12 previously approved these changes Mar 27, 2025
Copy link
Contributor

@Sploder12 Sploder12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, everything works as expected on Ubuntu!

@georgeliao
Copy link
Contributor Author

@Sploder12

Can you do a review on the last two commits? Basically, the recursive permission setting on the data folder overwrites the root cert permission to be not accessible to client process. The exact behavior is that the server restart causes the client process unable to read the root cert due to permission issue. The last two commits are to fix that.

By the way, also feel free to review the file permission related code, very little code in daemon_config.cpp. The whole idea is calling restrict_permissions on the data folder first, and overwrite the permissions of root cert related files and folders. It includes data folder itself, certificates sub folder and multipass_root_cert.pem file itself.

The functional testing is the same, includes multipass upgrade, new installation and plus the server restart.

Sploder12
Sploder12 previously approved these changes Mar 27, 2025
Copy link
Contributor

@Sploder12 Sploder12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, good catch with the restart issue! This time I tested the restart as well. Everything looks good to me on the Linux side of things!

ricab
ricab previously approved these changes Mar 27, 2025
Copy link
Collaborator

@ricab ricab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jia, just a small comment, but otherwise looks good. Approving modulo suggestion

Co-authored-by: Ricardo Abreu <6698114+ricab@users.noreply.github.com>
Signed-off-by: George Liao <georgeliaojia@gmail.com>
@georgeliao georgeliao dismissed stale reviews from ricab and Sploder12 via 6c97765 March 27, 2025 21:39
Copy link
Member

@xmkg xmkg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving the latest changes.

@ricab ricab added this pull request to the merge queue Mar 28, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 28, 2025
@georgeliao georgeliao added this pull request to the merge queue Mar 28, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 28, 2025
@georgeliao georgeliao added this pull request to the merge queue Mar 28, 2025
github-merge-queue bot pushed a commit that referenced this pull request Mar 28, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 28, 2025
@xmkg
Copy link
Member

xmkg commented Mar 28, 2025

The merge queue pipeline should be fixed once #4017 lands.

@georgeliao georgeliao added this pull request to the merge queue Mar 28, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to no response for status checks Mar 29, 2025
@ricab ricab merged commit 5385fac into main Mar 31, 2025
16 checks passed
@ricab ricab deleted the standard_grpc_mTLS branch March 31, 2025 09:52
@ricab ricab added this to the 1.16.0 milestone Jun 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants