Skip to content

CA-404460: Expose Stunnel_verify_error for mismatched or corrupted certificate, and expose ssl_verify_error during update syncing #6376

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 21, 2025

Conversation

gangj
Copy link
Contributor

@gangj gangj commented Mar 20, 2025

Pls refer to commit msg for details

Copy link
Contributor

@lindig lindig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the overall (existing) look of check_verify_error as it looks fragile and likely to fail when Stunnel changes. Would extracting the desired information with a regex be simpler?

| Some e ->
raise
(Stunnel_verify_error
(split_1 "," (sub_after (e + String.length "error=") line))
(split_1 "," (sub_after (e + String.length "error:") line))
)
| None ->
raise (Stunnel_verify_error "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the string empty and not providing details? Could at least use __LOC__.

Copy link
Contributor Author

@gangj gangj Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is to expose the error with reason found in the stunnel log. Raise the empty reason when no error can be found in the log.

Copy link
Contributor

@lindig lindig Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For documentation, could you add in a comment an example for an actual error that we would like to parse? Otherwise it is difficult to assess how the current code works. I would suggest we define a single regular expression (it may contain alternatives (...|...|...) that catches the error and that is easier to maintain than a sequence of matches.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment for the mismatched certificate case, but it only looks for log line containing "certificate verify failed", and after "certificate verify failed" is found in the line, it tries to look for detailed error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still maintain that using Re would be simple; the code below matches the string at the end and extracts the error code.

  4 let rx =
  3   Re.Posix.re {|error:([^:]*).*certificate verify failed|} |> Re.compile
  2 
  1 let str = "SSL_connect: ssl/statem/statem_clnt.c:1889: error:0A000086:SSL routines::certificate verify failed";;
  0                                                                               
  1 let m = Re.exec rx str in Re.Group.get m 1 |> print_endline
  2 

@gangj
Copy link
Contributor Author

gangj commented Mar 20, 2025

I don't like the overall (existing) look of check_verify_error as it looks fragile and likely to fail when Stunnel changes. Would extracting the desired information with a regex be simpler?

Yeah, I agree it is very fragile.
But after upgrading stunnel or openssl, the log might change significantly, as we can see here: the original log "VERIFY ERROR" is not printed by stunnel now. So I am afraid a regex might not made the situation better, maybe we can add XenRT test cases for these scenarios to make sure it works.

gangj added 4 commits March 21, 2025 10:41
Xapi uses stunnel to connect to remote peer and exposes certificate
verify error by parsing stunnel logs. And when connect with a
mismatched certificate, the log from stunnel would be:

stunnel 5.60 on x86_64-koji-linux-gnu platform
Compiled/running with OpenSSL 3.0.9 30 May 2023
Threading:PTHREAD Sockets:POLL,IPv6 TLS:ENGINE,OCSP,SNI Auth:LIBWRAP
Reading configuration from descriptor 8
UTF-8 byte order mark not detected
FIPS mode disabled
Configuration successful
Service [stunnel] accepted connection from unnamed socket
s_connect: connected 10.63.96.116:443
Service [stunnel] connected remote server from 10.63.97.76:34138
CERT: Pre-verification error: self-signed certificate
Rejected by CERT at depth=0: CN=10.63.96.116
SSL_connect: ssl/statem/statem_clnt.c:1889: error:0A000086:SSL routines::certificate verify failed
Connection reset: 0 byte(s) sent to TLS, 0 byte(s) sent to socket

This commit fixes the exposing of Stunnel_verify_error by checking
"certificate verify failed" in the log, and expose it with reason
"0A000086:SSL routines::certificate verify failed".

We can find that the log "VERIFY ERROR" is not print by stunnel 5.60,
which is the version of stunnel used in XS now, but it indeed was
printed before:

20d6d2faf740ee5eb9b13752b076ee583fec94d8:src/verify.c:        s_log(LOG_WARNING, "VERIFY ERROR: depth=%d, error=%s: %s",

[gangj@xenrt10715872 stunnel]$ git branch --contains 20d6d2faf740ee5eb9b13752b076ee583fec94d8
  master
* private/gangj/stunnel-5.60

While we can find the log "certificate verify failed" which comes from
openssl library:
https://github.com/openssl/openssl/blob/openssl-3.0.9/ssl/ssl_err.c

    {ERR_PACK(ERR_LIB_SSL, 0, SSL_R_CERTIFICATE_VERIFY_FAILED),
    "certificate verify failed"},

Signed-off-by: Gang Ji <gang.ji@cloud.com>
Xapi uses stunnel to connect to remote peer and exposes certificate
verify error by parsing stunnel logs. And when connecting with a
corrupted certificate, the log from stunnel would be:

Initializing inetd mode configuration
Clients allowed=500
stunnel 5.60 on x86_64-koji-linux-gnu platform
Compiled/running with OpenSSL 3.0.9 30 May 2023
Threading:PTHREAD Sockets:POLL,IPv6 TLS:ENGINE,OCSP,SNI Auth:LIBWRAP
errno: (*__errno_location ())
Initializing inetd mode configuration
Reading configuration from descriptor 8
UTF-8 byte order mark not detected
FIPS mode disabled
No PRNG seeding was required
stunnel default security level set: 2
Ciphers: ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256
TLSv1.3 ciphersuites: TLS_AES_256_GCM_SHA384:TLS_AES_128_GCM_SHA256:TLS_CHACHA20_POLY1305_SHA256
TLS options: 0x02100000 (+0x00000000, -0x00000000)
Session resumption enabled
No certificate or private key specified
error queue: crypto/x509/by_file.c:234: error:05880009:x509 certificate routines::PEM lib
error queue: crypto/pem/pem_info.c:169: error:0488000D:PEM routines::ASN1 lib
error queue: crypto/asn1/tasn_dec.c:349: error:0688010A:asn1 encoding routines::nested asn1 error
error queue: crypto/asn1/tasn_dec.c:1178: error:06800066:asn1 encoding routines::bad object header
SSL_CTX_load_verify_locations: crypto/asn1/asn1_lib.c:95: error:0680009B:asn1 encoding routines::too long
Inetd mode: Failed to initialize TLS context
Configuration failed
Deallocating temporary section defaults

This commit exposes Stunnel_verify_error by checking
"No certificate or private key specified" in the log, and expose it with
reason "The specified certificate is corrupt".

And the log "No certificate or private key specified" comes from
stunnel:
https://github.com/mtrojnar/stunnel/blob/9f291d5ba27f0fa45353ae87cf9ac5f05401b012/src/ctx.c#L690

    /* load the certificate and private key */
    if(!section->cert || !section->key) {
        s_log(LOG_DEBUG, "No certificate or private key specified");
        return 0; /* OK */
    }

Signed-off-by: Gang Ji <gang.ji@cloud.com>
There are 4 error logs are checked in check_error:
"Connection refused"
"No host resolved"
"No route to host"
"Invalid argument"

We can indeed find the logging in stunnel for 2 of them in stunnel 5.60,
which is the version used in XS now:

[gangj@xenrt10715872 stunnel]$ git grep -C 1 -wn "Connection refused"
src/log.c-493-    case 10061:
src/log.c:494:        return "Connection refused (WSAECONNREFUSED)";
src/log.c-495-    case 10062:
--
src/protocol.c-240-            s_log(LOG_ERR,
src/protocol.c:241:                "SOCKS5 request failed: Connection refused");
src/protocol.c-242-            break;
[gangj@xenrt10715872 stunnel]$

[gangj@xenrt10715872 stunnel]$ git grep -C 1 -wn "Invalid argument"
src/log.c-437-    case 10022:
src/log.c:438:        return "Invalid argument (WSAEINVAL)";
src/log.c-439-    case 10024:

While the other 2 are not found:
[gangj@xenrt10715872 stunnel]$ git grep -C 1 -wn "No host resolved"
[gangj@xenrt10715872 stunnel]$
[gangj@xenrt10715872 stunnel]$ git grep -C 1 -wn "No route to host"
[gangj@xenrt10715872 stunnel]$

But seems "No host resolved" was in the history of stunnel:

ddef8f192ecfe195610000c6f6272f6b77b97e53:src/client.c:        s_log(LOG_ERR, "No host resolved");
[gangj@xenrt10715872 stunnel]$ git branch --contains ddef8f192ecfe195610000c6f6272f6b77b97e53
  master
* private/gangj/stunnel-5.60

And I failed to find the log "No route to host" in any historical code
of stunnel or openssl.

So at least for the two errors "No host resolved" and "No route to
host", I think we will need to test and fix them later.

Signed-off-by: Gang Ji <gang.ji@cloud.com>
Signed-off-by: Gang Ji <gang.ji@cloud.com>
@gangj gangj force-pushed the private/gangj/CA-404460 branch from 92331ee to 15df700 Compare March 21, 2025 02:42
@gangj gangj added this pull request to the merge queue Mar 21, 2025
Merged via the queue into xapi-project:master with commit afe37ec Mar 21, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants