Skip to content

Conversation

@Sploder12
Copy link
Contributor

@Sploder12 Sploder12 commented Sep 30, 2025

This PR removes all occurrences where we use strings to represent subnets and replaces them with a dedicated and robust Subnet class, properly tested and made platform independent. It also combines and mocks all usage of uniform_int random number generation to aid in testing. Making these changes removes some dangerous assumptions we are making such as every subnet (including ones we do not own) having the mask 255.255.255.0, ensures subnets are valid subnets, and gives us significantly more flexibility for future subnetting.

Also modifies Linux Qemu to use AZ subnets instead of its own.

Windows and MacOS do not implement the platform dependent parts yet, this will come in later PRs that include their AZ networking implementation.

The Subnet class does not handle prefix lengths /31 and /32 since I don't believe we plan on using them and they require special logic.

MULTI-2298

@Sploder12 Sploder12 force-pushed the az-networking-refactor branch 2 times, most recently from 7073d06 to bc50b40 Compare September 30, 2025 10:28
@codecov
Copy link

codecov bot commented Sep 30, 2025

Codecov Report

❌ Patch coverage is 98.86364% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.66%. Comparing base (acba722) to head (6b837ce).

Files with missing lines Patch % Lines
src/network/subnet.cpp 98.50% 1 Missing ⚠️
src/utils/utils.cpp 90.00% 1 Missing ⚠️
Additional details and impacted files
@@                  Coverage Diff                   @@
##           availability-zones    #4398      +/-   ##
======================================================
+ Coverage               89.24%   89.66%   +0.42%     
======================================================
  Files                     246      249       +3     
  Lines                   15756    14320    -1436     
======================================================
- Hits                    14061    12840    -1221     
+ Misses                   1695     1480     -215     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Sploder12 Sploder12 marked this pull request as ready for review September 30, 2025 14:13
@Sploder12 Sploder12 requested review from levkropp and xmkg September 30, 2025 14:14
Copy link
Member

@xmkg xmkg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass of review -- the overall shape is looking good!

I have comments about the terminology, naming, and potential issues of lending out random subnets relying on heuristics.

}

mp::Subnet::Subnet(const std::string& cidr_string) : Subnet(parse(cidr_string))
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As insurance, it'd also be nice to validate that the given subnet.network_addr() is canonical, (i.e., the last N bits are 0, depending on the prefix_length).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would make sense to do in case of a non-canonical network address? Throw an error, log a message? ifconfig (I intended to use for MacOS) returns canonical + 1 instead of the canonical for its subnets but if we stop relying on heuristics that'd no longer matter. Window's Get-NetIPAddress returns an IP that is assigned to the network interface inside its subnet

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see two possible avenues:

a) Throw an exception and make the canonicalization the caller's responsibility
b) Subnet class canonicalizes the inputs by default

I think option b might be the most convenient one for the use case you described. I'll leave it up to you.

if (MP_PLATFORM.subnet_used_locally(subnet))
continue;

if (MP_PLATFORM.can_reach_gateway(subnet.min_address()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the gateway is temporarily unreachable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then a conflicting subnet is returned

// @TODO don't rely on pure randomness
for (auto i = 0; i < 100; ++i)
{
const auto subnet = ::generate_random_subnet(cidr, range);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think instead of doing a spray and pray, we need to do some bookkeeping about the subnets we're lending, meaning:

  • Using a configurable subnet range, and asking the user to specify a subnet range during the installation while providing a sensible default
  • Determining minimums for both network count and host count so we can ask the user to give a subnet range that has plenty of space
  • Handing out the subnets deterministically, and persisting which address spaces are used by whom

This will eliminate the need for heuristic probing and make it a hard contract that the user is responsible for providing a viable, unused address range if the default range already collides with the user's existing networks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely need a better way of tracking and generating as you say, AZs track their own subnet but that isn't checked directly (possibly indirectly) in the generation of new subnets. I personally like your idea but anything user interaction probably needs eyes from design on it and would probably be out of scope for this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A relevant issue: #3147

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, what we could do is provide a sensible default now, and introduce the customization later. We can try to be smart about it by checking which blocks popular Container/VM/VPN technologies prefer by default (e.g., Docker, LXD, VirtualBox, OpenVPN). User issues could be a reference on which blocks to avoid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I'll look into it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like 192.168.0.0/16 is out of the question since it's too small and populated, Docker uses 172.16.0.0/12 by default. That leaves the 10.0.0.0/8 range which we already use. The low [10.0.0.0/16, 10.10.0.0/16] range seems pretty heavily used. The approach differs between applications, for example, Docker does not check for conflicts, LXD uses almost the exact same code we already do...

With the current AZs we only need 3 subnets, but I imagine that will increase in the future, my initial thought is 32 is plenty. For VMs, 62 seems reasonable per subnet. that would give us a minimum range of /21 allocating /26 subnets.

The random number generator decided 10.97.0.0/16 as a sensible default, we can hand out 256 /24 subnets from this range. Although, I'd want to get other opinions before removing the existing heuristics.

@Sploder12 Sploder12 force-pushed the az-networking-refactor branch 4 times, most recently from 72cd33e to 8f07861 Compare October 2, 2025 14:13

virtual Path default_mount_target(const Path& source) const;

virtual long long random_int(long long a, long long b) const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe [[nodiscard]] ?


namespace
{
mp::Subnet parse(const std::string& cidr_string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[[nodiscard]]

// @TODO don't rely on pure randomness
for (auto i = 0; i < 100; ++i)
{
const auto subnet = ::generate_random_subnet(cidr, range);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, what we could do is provide a sensible default now, and introduce the customization later. We can try to be smart about it by checking which blocks popular Container/VM/VPN technologies prefer by default (e.g., Docker, LXD, VirtualBox, OpenVPN). User issues could be a reference on which blocks to avoid.

@Sploder12 Sploder12 force-pushed the az-networking-refactor branch from 8f07861 to 0a7ec29 Compare October 7, 2025 13:53
@Sploder12 Sploder12 force-pushed the az-networking-refactor branch from 0a7ec29 to 2051c37 Compare October 21, 2025 20:06
@Sploder12 Sploder12 force-pushed the az-networking-refactor branch from 2051c37 to 2532d5c Compare October 21, 2025 20:33
@Sploder12 Sploder12 force-pushed the availability-zones branch 6 times, most recently from 2af4bcb to acba722 Compare October 22, 2025 00:51
Sploder12 and others added 20 commits October 21, 2025 21:59
previously this missed cases of larger subnets
Signed-off-by: trevor-shoe <trevor.schupbach@canonical.com>
Signed-off-by: trevor-shoe <trevor.schupbach@canonical.com>
Signed-off-by: trevor-shoe <trevor.schupbach@canonical.com>
Signed-off-by: trevor-shoe <trevor.schupbach@canonical.com>
Signed-off-by: trevor-shoe <trevor.schupbach@canonical.com>
Signed-off-by: trevor-shoe <trevor.schupbach@canonical.com>
Signed-off-by: trevor-shoe <trevor.schupbach@canonical.com>
Signed-off-by: trevor-shoe <trevor.schupbach@canonical.com>
Co-Authored-By: Mustafa Kemal Gılor <hello@mkg.dev>
@Sploder12 Sploder12 force-pushed the az-networking-refactor branch from 2532d5c to f433397 Compare October 22, 2025 01:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants