| 1 | +--- |
| 2 | +layout: blog |
| 3 | +title: "Post-Quantum Cryptography in Kubernetes" |
| 4 | +slug: pqc-in-k8s |
| 5 | +date: 2025-06-03 |
| 6 | +author: "Fabian Kammel (ControlPlane)" |
| 7 | +draft: true |
| 8 | +--- |
| 9 | + |
| 10 | +The world of cryptography is on the cusp of a major shift with the advent of |
| 11 | +quantum computing. While powerful quantum computers are still largely |
| 12 | +theoretical for many applications, their potential to break current |
| 13 | +cryptographic standards is a serious concern, especially for long-lived |
| 14 | +systems. This is where _Post-Quantum Cryptography_ (PQC) comes in. In this |
| 15 | +article, I'll dive into what PQC means for TLS and, more specifically, for the |
| 16 | +Kubernetes ecosystem. I'll explain what the (surprising) state of PQC in |
| 17 | +Kubernetes is and what the implications are for current and future clusters. |
| 18 | + |
| 19 | +## What is Post-Quantum Cryptography |
| 20 | + |
| 21 | +Post-Quantum Cryptography refers to cryptographic algorithms that are thought to |
| 22 | +be secure against attacks by both classical and quantum computers. The primary |
| 23 | +concern is that quantum computers, using algorithms like [Shor\'s Algorithm], |
| 24 | +could efficiently break widely used public-key cryptosystems such as RSA and |
| 25 | +Elliptic Curve Cryptography (ECC), which underpin much of today's secure |
| 26 | +communication, including TLS. The industry is actively working on standardizing |
| 27 | +and adopting PQC algorithms. One of the first to be standardized by [NIST] is |
| 28 | +the Module-Lattice Key Encapsulation Mechanism (`ML-KEM`), formerly known as |
| 29 | +Kyber, and now standardized as [FIPS\-203] (PDF download). |
| 30 | + |
| 31 | +It is difficult to predict when quantum computers will be able to break |
| 32 | +classical algorithms. However, it is clear that we need to start migrating to |
| 33 | +PQC algorithms now, as the next section shows. To get a feeling for the |
| 34 | +predicted timeline, we can look at a [NIST report] covering the transition to |
| 35 | +post-quantum cryptography standards. It states that systems using classical |
| 36 | +cryptography should be deprecated after 2030 and disallowed after 2035. |
| 37 | + |
| 38 | +## Key exchange vs. digital signatures: different needs, different timelines {#timelines} |
| 39 | + |
| 40 | +In TLS, there are two main cryptographic operations we need to secure: |
| 41 | + |
| 42 | +**Key Exchange**: This is how the client and server agree on a shared secret to |
| 43 | +encrypt their communication. If an attacker records encrypted traffic today, |
| 44 | +they could decrypt it in the future once they gain access to a quantum computer |
| 45 | +capable of breaking the key exchange. This makes migrating key exchange to PQC an |
| 46 | +immediate priority. |
| 47 | + |
| 48 | +**Digital Signatures**: These are primarily used to authenticate the server (and |
| 49 | +sometimes the client) via certificates. The authenticity of a server is |
| 50 | +verified at the time of connection. While important, the risk of an attack |
| 51 | +today is much lower, because the decision to trust a server cannot be abused |
| 52 | +after the fact. Additionally, current PQC signature schemes often come with |
| 53 | +significant computational overhead and larger key/signature sizes compared to |
| 54 | +their classical counterparts. |
| 55 | + |
| 56 | +Another significant hurdle in the migration to PQ certificates is the upgrade |
| 57 | +of root certificates. These certificates have long validity periods and are |
| 58 | +installed in many devices and operating systems as trust anchors. |
| 59 | + |
| 60 | +Given these differences, the focus for immediate PQC adoption in TLS has been |
| 61 | +on hybrid key exchange mechanisms. These combine a classical algorithm (such as |
| 62 | +Elliptic Curve Diffie-Hellman Ephemeral (ECDHE)) with a PQC algorithm (such as |
| 63 | +`ML-KEM`). The resulting shared secret is secure as long as at least one of the |
| 64 | +component algorithms remains unbroken. The `X25519MLKEM768` hybrid scheme is the |
| 65 | +most widely supported one. |
| 66 | + |
| 67 | +## State of PQC key exchange mechanisms (KEMs) today {#state-of-kems} |
| 68 | + |
| 69 | +Support for PQC KEMs is rapidly improving across the ecosystem. |
| 70 | + |
| 71 | +**Go**: The Go standard library's `crypto/tls` package introduced support for |
| 72 | +`X25519MLKEM768` in version 1.24 (released February 2025). Crucially, it's |
| 73 | +enabled by default when there is no explicit configuration, i.e., |
| 74 | +`Config.CurvePreferences` is `nil`. |
| 75 | + |
| 76 | +**Browsers & OpenSSL**: Major browsers like Chrome (version 131, November 2024) |
| 77 | +and Firefox (version 135, February 2025), as well as OpenSSL (version 3.5.0, |
| 78 | +April 2025), have also added support for the `ML-KEM` based hybrid scheme. |
| 79 | + |
| 80 | +Apple is also [rolling out support][ApplePQC] for `X25519MLKEM768` in version |
| 81 | +26 of their operating systems. Given the proliferation of Apple devices, this |
| 82 | +will have a significant impact on global PQC adoption. |
| 83 | + |
| 84 | +For a more detailed overview of the state of PQC in the wider industry, |
| 85 | +see [this blog post by Cloudflare][PQC2024]. |
| 86 | + |
| 87 | +## Post-quantum KEMs in Kubernetes: an unexpected arrival |
| 88 | + |
| 89 | +So, what does this mean for Kubernetes? Kubernetes components, including the |
| 90 | +API server and kubelet, are built with Go. |
| 91 | + |
| 92 | +As of Kubernetes v1.33, released in April 2025, the project uses Go 1.24. A |
| 93 | +quick check of the Kubernetes codebase reveals that `Config.CurvePreferences` |
| 94 | +is not explicitly set. This leads to a fascinating conclusion: Kubernetes |
| 95 | +v1.33, by virtue of using Go 1.24, supports hybrid post-quantum |
| 96 | +`X25519MLKEM768` for TLS connections by default! |
| 97 | + |
| 98 | +You can test this yourself. If you set up a Minikube cluster running Kubernetes |
| 99 | +v1.33.0, you can connect to the API server using a recent OpenSSL client: |
| 100 | + |
| 101 | +```console |
| 102 | +$ minikube start --kubernetes-version=v1.33.0 |
| 103 | +$ kubectl cluster-info |
| 104 | +Kubernetes control plane is running at https://127.0.0.1:<PORT> |
| 105 | +$ kubectl config view --minify --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 -d > ca.crt |
| 106 | +$ openssl version |
| 107 | +OpenSSL 3.5.0 8 Apr 2025 (Library: OpenSSL 3.5.0 8 Apr 2025) |
| 108 | +$ echo -n "Q" | openssl s_client -connect 127.0.0.1:<PORT> -CAfile ca.crt |
| 109 | +[...] |
| 110 | +Negotiated TLS1.3 group: X25519MLKEM768 |
| 111 | +[...] |
| 112 | +DONE |
| 113 | +``` |
| 114 | + |
| 115 | +Lo and behold, the negotiated group is `X25519MLKEM768`! This is a significant |
| 116 | +step towards making Kubernetes quantum-safe, seemingly without a major |
| 117 | +announcement or dedicated KEP (Kubernetes Enhancement Proposal). |
| 118 | + |
| 119 | +## The Go version mismatch pitfall |
| 120 | + |
| 121 | +An interesting wrinkle emerged with Go versions 1.23 and 1.24. Go 1.23 |
| 122 | +included experimental support for a draft version of `ML-KEM`, identified as |
| 123 | +`X25519Kyber768Draft00`. This was also enabled by default if |
| 124 | +`Config.CurvePreferences` was `nil`. Kubernetes v1.32 used Go 1.23. However, |
| 125 | +Go 1.24 removed the draft support and replaced it with the standardized version |
| 126 | +`X25519MLKEM768`. |
| 127 | + |
| 128 | +What happens if a client and server are using mismatched Go versions (one on |
| 129 | +1.23, the other on 1.24)? They won't have a common PQC KEM to negotiate, and |
| 130 | +the handshake will fall back to a classical elliptic curve (e.g., `X25519`). How |
| 131 | +could this happen in practice? |
| 132 | + |
| 133 | +Consider a scenario: |
| 134 | + |
| 135 | +A Kubernetes cluster is running v1.32 (using Go 1.23 and thus |
| 136 | +`X25519Kyber768Draft00`). A developer upgrades their `kubectl` to v1.33, |
| 137 | +compiled with Go 1.24, only supporting `X25519MLKEM768`. Now, when `kubectl` |
| 138 | +communicates with the v1.32 API server, they no longer share a common PQC |
| 139 | +algorithm. The connection will downgrade to classical cryptography, silently |
| 140 | +losing the PQC protection that had been in place. This highlights the |
| 141 | +importance of understanding the implications of Go version upgrades, and the |
| 142 | +details of the TLS stack. |
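
The downgrade in this scenario can be reproduced with a toy model of group negotiation. This is a simplified sketch, not the `crypto/tls` implementation: the group lists are shortened assumptions based on the behaviour described above, and `negotiate` is a hypothetical helper that picks the first group in the server's preference order that the client also offers.

```go
package main

import "fmt"

// Simplified default group lists per Go version. These are assumptions for
// illustration, not copied from the crypto/tls source.
var (
	go123Groups = []string{"X25519Kyber768Draft00", "X25519", "P-256"}
	go124Groups = []string{"X25519MLKEM768", "X25519", "P-256"}
)

// negotiate (hypothetical helper) returns the first group in the server's
// preference order that the client also offers.
func negotiate(server, client []string) (string, bool) {
	offered := make(map[string]bool, len(client))
	for _, g := range client {
		offered[g] = true
	}
	for _, g := range server {
		if offered[g] {
			return g, true
		}
	}
	return "", false
}

func main() {
	cases := []struct {
		name           string
		server, client []string
	}{
		{"v1.33 server, v1.33 kubectl", go124Groups, go124Groups},
		{"v1.32 server, v1.32 kubectl", go123Groups, go123Groups},
		{"v1.32 server, v1.33 kubectl", go123Groups, go124Groups},
	}
	for _, c := range cases {
		group, ok := negotiate(c.server, c.client)
		fmt.Println(c.name, "->", group, ok)
	}
}
```

In the mixed case the only shared entries are classical, so the model selects `X25519` without any error, which mirrors the silent nature of the downgrade.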
| 143 | + |
| 144 | +## Limitations: packet size {#limitation-packet-size} |
| 145 | + |
| 146 | +One practical consideration with `ML-KEM` is the size of its public keys: |
| 147 | +an encoded `ML-KEM-768` key is around 1.2 kilobytes. |
| 148 | +This can cause the initial TLS `ClientHello` message not to fit inside |
| 149 | +a single TCP/IP packet, given typical networking constraints |
| 150 | +(most commonly, the standard Ethernet MTU of 1500 |
| 151 | +bytes). Some TLS libraries or network appliances might not handle this |
| 152 | +gracefully, assuming the `ClientHello` always fits in one packet. This issue |
| 153 | +has been observed in some Kubernetes-related projects and networking |
| 154 | +components, potentially leading to connection failures when PQC KEMs are used. |
| 155 | +More details can be found at [tldr.fail]. |
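
A back-of-the-envelope calculation shows why the `ClientHello` outgrows a single packet. The sketch below uses the ML-KEM-768 encapsulation key size from FIPS 203 (1184 bytes) and assumes IPv4 and TCP headers without options. The 300-byte baseline for a classical `ClientHello` is a hypothetical figure for illustration; real sizes vary with extensions and TLS record framing.

```go
package main

import "fmt"

const (
	mlkem768EncapKeyLen = 1184 // ML-KEM-768 encapsulation key, bytes (FIPS 203)
	x25519PublicKeyLen  = 32   // X25519 public key, bytes
	ethernetMTU         = 1500 // typical Ethernet MTU, bytes
	ipv4HeaderLen       = 20   // IPv4 header without options (assumption)
	tcpHeaderLen        = 20   // TCP header without options (assumption)
)

// hybridKeyShareLen is the size of the client's X25519MLKEM768 key share:
// the ML-KEM encapsulation key plus the X25519 public key.
func hybridKeyShareLen() int {
	return mlkem768EncapKeyLen + x25519PublicKeyLen
}

// fitsInOneSegment reports whether a ClientHello of the given size fits in a
// single TCP segment under the header assumptions above.
func fitsInOneSegment(clientHelloLen int) bool {
	return clientHelloLen <= ethernetMTU-ipv4HeaderLen-tcpHeaderLen
}

func main() {
	const baseline = 300 // hypothetical ClientHello size without a PQC key share

	fmt.Println("hybrid key share bytes:", hybridKeyShareLen())
	fmt.Println("classical hello fits:", fitsInOneSegment(baseline))
	fmt.Println("hybrid hello fits:", fitsInOneSegment(baseline+hybridKeyShareLen()))
}
```

Under these assumptions the hybrid key share alone takes 1216 of the 1460 bytes available in one segment, so even a modest set of other extensions pushes the message into a second packet.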
| 156 | + |
| 157 | +## State of Post-Quantum Signatures |
| 158 | + |
| 159 | +While KEMs are seeing broader adoption, PQC digital signatures are further |
| 160 | +behind in terms of widespread integration into standard toolchains. NIST has |
| 161 | +published standards for PQC signatures, such as `ML-DSA` (`FIPS-204`) and |
| 162 | +`SLH-DSA` (`FIPS-205`). However, implementing these in a way that's broadly |
| 163 | +usable (e.g., for PQC Certificate Authorities) [presents challenges]: |
| 164 | + |
| 165 | +**Larger Keys and Signatures**: PQC signature schemes often have significantly |
| 166 | +larger public keys and signature sizes compared to classical algorithms like |
| 167 | +Ed25519 or RSA. For instance, Dilithium2 keys can be 30 times larger than |
| 168 | +Ed25519 keys, and certificates can be 12 times larger. |
| 169 | + |
| 170 | +**Performance**: Signing and verification operations [can be substantially slower]. |
| 171 | +While some algorithms are on par with classical algorithms, others may have a |
| 172 | +much higher overhead, sometimes on the order of 10x to 1000x worse performance. |
| 173 | +To improve this situation, NIST is running a |
| 174 | +[second round of standardization][NIST2ndRound] for PQC signatures. |
| 175 | + |
| 176 | +**Toolchain Support**: Mainstream TLS libraries and CA software do not yet have |
| 177 | +mature, built-in support for these new signature algorithms. The Go team, for |
| 178 | +example, has indicated that `ML-DSA` support is a high priority, but the |
| 179 | +soonest it might appear in the standard library is Go 1.26 [(as of May 2025)]. |
| 180 | + |
| 181 | +[Cloudflare\'s CIRCL] (Cloudflare Interoperable Reusable Cryptographic Library) |
| 182 | +implements some PQC signature schemes, such as variants of Dilithium, and |
| 183 | +Cloudflare maintains a [fork of Go (cfgo)] that integrates CIRCL. Using `cfgo`, it's |
| 184 | +possible to experiment with generating certificates signed with PQC algorithms |
| 185 | +like Ed25519-Dilithium2. However, this requires using a custom Go toolchain and |
| 186 | +is not yet part of the mainstream Kubernetes or Go distributions. |
| 187 | + |
| 188 | +## Conclusion |
| 189 | + |
| 190 | +The journey to a post-quantum secure Kubernetes is underway, and perhaps |
| 191 | +further along than many realize, thanks to the proactive adoption of `ML-KEM` |
| 192 | +in Go. With Kubernetes v1.33, users are already benefiting from hybrid post-quantum key |
| 193 | +exchange in many TLS connections by default. |
| 194 | + |
| 195 | +However, awareness of potential pitfalls, such as Go version mismatches leading |
| 196 | +to downgrades and issues with `ClientHello` packet sizes, is crucial. While PQC |
| 197 | +for KEMs is becoming a reality, PQC for digital signatures and certificate |
| 198 | +hierarchies is still in earlier stages of development and adoption for |
| 199 | +mainstream use. As Kubernetes maintainers and contributors, staying informed |
| 200 | +about these developments will be key to ensuring the long-term security of the |
| 201 | +platform. |
| 202 | + |
| 203 | +[Shor\'s Algorithm]: https://en.wikipedia.org/wiki/Shor%27s_algorithm |
| 204 | +[NIST]: https://www.nist.gov/ |
| 205 | +[FIPS\-203]: https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.203.pdf |
| 206 | +[NIST report]: https://nvlpubs.nist.gov/nistpubs/ir/2024/NIST.IR.8547.ipd.pdf |
| 207 | +[tldr.fail]: https://tldr.fail/ |
| 208 | +[presents challenges]: https://blog.cloudflare.com/another-look-at-pq-signatures/#the-algorithms |
| 209 | +[can be substantially slower]: https://pqshield.github.io/nist-sigs-zoo/ |
| 210 | +[(as of May 2025)]: https://github.com/golang/go/issues/64537#issuecomment-2877714729 |
| 211 | +[Cloudflare\'s CIRCL]: https://github.com/cloudflare/circl |
| 212 | +[fork of Go (cfgo)]: https://github.com/cloudflare/go |
| 213 | +[PQC2024]: https://blog.cloudflare.com/pq-2024/ |
| 214 | +[NIST2ndRound]: https://csrc.nist.gov/news/2024/pqc-digital-signature-second-round-announcement |
| 215 | +[ApplePQC]: https://support.apple.com/en-lb/122756 |