Skip to content

Solving slows down dramatically when testing hundreds of versions #135

Closed
@charliermarsh

Description

@charliermarsh

When testing many consecutive versions of a given package, the cost of computing compatible versions seems to increase non-linearly over time.

Concretely, running this example slows down noticeably for me at around 100 iterations, and grinds to a ~halt in the 200s:

use pubgrub::error::PubGrubError;
use pubgrub::range::Range;
use pubgrub::report::{DefaultStringReporter, Reporter};
use pubgrub::solver::{OfflineDependencyProvider, resolve};
use pubgrub::version::SemanticVersion;

fn main() {
    let mut dependency_provider = OfflineDependencyProvider::<&str, SemanticVersion>::new();

    // root depends on foo...
    dependency_provider.add_dependencies(
        "root", (1, 0, 0),
        vec![
            ("foo", Range::any()),
        ],
    );

    for i in 1..1000 {
        dbg!(i);

        // foo depends on bar...
        dependency_provider.add_dependencies(
            "foo", (i, 0, 0),
            vec![
                ("bar", Range::any()),
            ],
        );

        match resolve(&dependency_provider, "root", (1, 0, 0)) {
            Ok(sol) => println!("{:?}", sol),
            Err(PubGrubError::NoSolution(mut derivation_tree)) => {
                derivation_tree.collapse_no_versions();
                eprintln!("{}", DefaultStringReporter::report(&derivation_tree));
            }
            Err(err) => panic!("{:?}", err),
        };
    }
}

I suspect that this may have to do with the increased size of the intersected versions. For example, after testing and rejecting Django 5.0a1, we produce a term like:

[ 0a0.dev0, 5.0a1 [  [ 5.0a1.post0.dev0, ∞ [

We then test Django 4.2.6, which gives us:

[ 0a0.dev0, 4.2.6 [  [ 4.2.6.post0.dev0, ∞ [

We then take the intersection of these terms, which gives us:

[ 0a0.dev0, 4.2.6 [  [ 4.2.6.post0.dev0, 5.0a1 [  [ 5.0a1.post0.dev0, ∞ [

This continues until we have a range like:

[ 0a0.dev0, 4.2rc1 [  [ 4.2rc1.post0.dev0, 4.2 [  [ 4.2.post0.dev0, 4.2.1 [  [ 4.2.1.post0.dev0, 4.2.2 [  [ 4.2.2.post0.dev0, 4.2.3 [  [ 4.2.3.post0.dev0, 4.2.4 [  [ 4.2.4.post0.dev0, 4.2.5 [  [ 4.2.5.post0.dev0, 4.2.6 [  [ 4.2.6.post0.dev0, 5.0a1 [  [ 5.0a1.post0.dev0, ∞ [

If you're testing hundreds of versions, the terms continue to grow, and I suspect this leads to some kind of quadratic behavior, since we're increasing the number of terms linearly with the number of tested versions.

The intersections aren't "wrong", and it's possible that we're doing something wrong with our version formulations -- but could we take advantage of the fact that we know there are no versions in these partial ranges?

For example, given [ 0a0.dev0, 5.0a1 [ [ 5.0a1.post0.dev0, ∞ [ and [ 0a0.dev0, 4.2.6 [ [ 4.2.6.post0.dev0, ∞ [, it would be preferable to reduce to [ 0a0.dev0, 4.2.6 [, if I'm understanding the syntax correctly.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingenhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions