Commit 368333c

Auto merge of #7361 - Eh2406:public_dependency-as-type_4, r=alexcrichton
Public dependency refactor and re-allow backjumping

There were **three** attempts at vanquishing exponential time spent in Public dependency resolution. All failures. All three started with some refactoring that seems worth saving. Specifically, the data structure `public_dependency` that is used to test for Public dependency conflicts is large, tricky, and modified inline. So let's make it a type with a name and move the interactions into methods.

Next, each attempt needed to know how far back to jump to undo any given dependency edge. I am fairly confident that any full solution will need this functionality. I also think any solution will need a way to represent richer conflicts than the existing "is this pid active". So let's keep the `still_applies` structure from the last attempt.

Last, each attempt needs to pick a way to represent a Public dependency conflict. The last attempt used three facts about a situation.

- `a1`: `PublicDependency(p)`, which can be read as "the package `p` can see the package `a1`"
- `b`: `PublicDependency(p)`, which can be read as "the package `p` can see the package `b`"
- `a2`: `PubliclyExports(b)`, which can be read as "the package `b` has the package `a2` in its public interface"

This representation is good enough to allow for `backjumping`, i.e. `find_candidate` can go back several frames until the `age` when the Public dependency conflict was introduced. This optimization, added for normal dependencies in #4834, saves the most time in practice. So having it for Public dependency conflicts is important for allowing real-world experimentation with the Public dependencies feature. We will have to alter/improve/replace this representation to unlock all of the important optimizations. But I don't know one that will work for all of them, and this is a major step forward.

Can be read one commit at a time.
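The backjumping idea described above can be illustrated with a minimal sketch. It is not the code from `find_candidate`: the function name `backjump`, the plain `Vec` of frame ages, and the exact comparison are assumptions made for illustration. The point is only that once the age at which a conflict was introduced is known, every saved frame at or after that age can be skipped in one pass.

```rust
/// Hypothetical stand-in for the resolver's `ContextAge`.
type Age = usize;

/// Pops saved backtrack frames (represented here only by their ages) until
/// one is older than the conflict. Returns the age of the frame to resume
/// from, or `None` if every frame already contained the conflict.
fn backjump(frame_ages: &mut Vec<Age>, conflict_age: Age) -> Option<Age> {
    while let Some(age) = frame_ages.pop() {
        if age >= conflict_age {
            // Assumption: a frame at or after the conflict's age still
            // contains the conflict, so retrying from it cannot succeed.
            continue;
        }
        return Some(age);
    }
    None
}
```

With frames of ages `[1, 2, 4, 6, 7]` and a conflict introduced at age 5, the sketch discards the frames aged 7 and 6 and resumes from the frame aged 4.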
2 parents f4d1b77 + 0750caf commit 368333c


5 files changed, 289 additions and 111 deletions


crates/resolver-tests/tests/resolve.rs

Lines changed: 35 additions & 0 deletions
@@ -1047,6 +1047,41 @@ fn resolving_with_constrained_sibling_backtrack_activation() {
     );
 }

+#[test]
+fn resolving_with_public_constrained_sibling() {
+    // It makes sense to resolve most-constrained deps first, but
+    // with that logic the backtrack traps here come between the two
+    // attempted resolutions of 'constrained'. When backtracking,
+    // cargo should skip past them and resume resolution once the
+    // number of activations for 'constrained' changes.
+    let mut reglist = vec![
+        pkg!(("foo", "1.0.0") => [dep_req("bar", "=1.0.0"),
+                                  dep_req("backtrack_trap1", "1.0"),
+                                  dep_req("backtrack_trap2", "1.0"),
+                                  dep_req("constrained", "<=60")]),
+        pkg!(("bar", "1.0.0") => [dep_req_kind("constrained", ">=60", Kind::Normal, true)]),
+    ];
+    // Bump these to make the test harder, but you'll also need to
+    // change the version constraints on `constrained` above. To correctly
+    // exercise Cargo, the relationship between the values is:
+    // NUM_CONSTRAINED - vsn < NUM_TRAPS < vsn
+    // to make sure the traps are resolved between `constrained`.
+    const NUM_TRAPS: usize = 45; // min 1
+    const NUM_CONSTRAINED: usize = 100; // min 1
+    for i in 0..NUM_TRAPS {
+        let vsn = format!("1.0.{}", i);
+        reglist.push(pkg!(("backtrack_trap1", vsn.clone())));
+        reglist.push(pkg!(("backtrack_trap2", vsn.clone())));
+    }
+    for i in 0..NUM_CONSTRAINED {
+        let vsn = format!("{}.0.0", i);
+        reglist.push(pkg!(("constrained", vsn.clone())));
+    }
+    let reg = registry(reglist);
+
+    let _ = resolve_and_validated(vec![dep_req("foo", "1")], &reg, None);
+}
+
 #[test]
 fn resolving_with_constrained_sibling_transitive_dep_effects() {
     // When backtracking due to a failed dependency, if Cargo is
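The comment in the new test states the relationship `NUM_CONSTRAINED - vsn < NUM_TRAPS < vsn`. As a small worked check, here is a sketch of that inequality; the helper name `traps_between` is hypothetical, and `vsn` is taken to be the pivot version `60` used in the `<=60` and `>=60` requirements.

```rust
/// Hypothetical helper: checks the relationship the test comment asks for.
/// Signed integers are used so the subtraction cannot underflow.
fn traps_between(num_constrained: i64, num_traps: i64, vsn: i64) -> bool {
    num_constrained - vsn < num_traps && num_traps < vsn
}
```

For the values in the test, `100 - 60 = 40 < 45 < 60`, so the relationship holds. Lowering `NUM_TRAPS` to 30 would break the first inequality.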

src/cargo/core/resolver/conflict_cache.rs

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@ use std::collections::{BTreeMap, HashMap, HashSet};

 use log::trace;

-use super::types::{ConflictMap, ConflictReason};
+use super::types::ConflictMap;
 use crate::core::resolver::Context;
 use crate::core::{Dependency, PackageId};

@@ -194,7 +194,7 @@ impl ConflictCache {
     /// `dep` is known to be unresolvable if
     /// all the `PackageId` entries are activated.
     pub fn insert(&mut self, dep: &Dependency, con: &ConflictMap) {
-        if con.values().any(|c| *c == ConflictReason::PublicDependency) {
+        if con.values().any(|c| c.is_public_dependency()) {
             // TODO: needs more info for back jumping
             // for now refuse to cache it.
             return;
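The switch from `*c == ConflictReason::PublicDependency` to `c.is_public_dependency()` follows from the variant now carrying a payload (see `ConflictReason::PublicDependency(name)` in `context.rs`), so a bare equality test can no longer match every instance. The definition of `is_public_dependency` lives in `types.rs`, which is not shown here, so the following is only a sketch with a hypothetical `Reason` enum and `u32` ids. Whether the real helper also covers `PubliclyExports` is an assumption.

```rust
/// Hypothetical, simplified stand-in for `ConflictReason`.
#[derive(Clone, Debug, PartialEq, Eq)]
enum Reason {
    Semver,
    /// Payload: id of the package that can see the conflicting package.
    PublicDependency(u32),
    /// Payload: id of the package whose public interface exposes it.
    PubliclyExports(u32),
}

impl Reason {
    /// True for any conflict caused by the public dependency rules,
    /// whatever payload the variant carries.
    fn is_public_dependency(&self) -> bool {
        matches!(
            self,
            Reason::PublicDependency(_) | Reason::PubliclyExports(_)
        )
    }
}
```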

src/cargo/core/resolver/context.rs

Lines changed: 213 additions & 21 deletions
@@ -23,15 +23,15 @@ pub use super::resolve::Resolve;
 // possible.
 #[derive(Clone)]
 pub struct Context {
+    pub age: ContextAge,
     pub activations: Activations,
     /// list the features that are activated for each package
     pub resolve_features: im_rc::HashMap<PackageId, FeaturesSet>,
     /// get the package that will be linking to a native library by its links attribute
     pub links: im_rc::HashMap<InternedString, PackageId>,
     /// for each package the list of names it can see,
     /// then for each name the exact version that name represents and whether the name is public.
-    pub public_dependency:
-        Option<im_rc::HashMap<PackageId, im_rc::HashMap<InternedString, (PackageId, bool)>>>,
+    pub public_dependency: Option<PublicDependency>,

     /// a way to look up for a package in activations what packages required it
     /// and all of the exact deps that it fulfilled.
@@ -49,13 +49,13 @@ pub type ContextAge = usize;
 /// By storing this in a hash map we ensure that there is only one
 /// semver compatible version of each crate.
 /// This also stores the `ContextAge`.
-pub type Activations =
-    im_rc::HashMap<(InternedString, SourceId, SemverCompatibility), (Summary, ContextAge)>;
+pub type ActivationsKey = (InternedString, SourceId, SemverCompatibility);
+pub type Activations = im_rc::HashMap<ActivationsKey, (Summary, ContextAge)>;

 /// A type that represents when cargo treats two Versions as compatible.
 /// Versions `a` and `b` are compatible if their left-most nonzero digit is the
 /// same.
-#[derive(Clone, Copy, Eq, PartialEq, Hash, Debug)]
+#[derive(Clone, Copy, Eq, PartialEq, Hash, Debug, PartialOrd, Ord)]
 pub enum SemverCompatibility {
     Major(NonZeroU64),
     Minor(NonZeroU64),
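The doc comment above says two versions are compatible when their left-most nonzero digit is the same. A minimal sketch of that rule follows. The `Compat` enum and the `compat` function are hypothetical names, and the `Patch` variant is an assumption, since the diff cuts off after `Minor`.

```rust
use std::num::NonZeroU64;

/// Hypothetical mirror of `SemverCompatibility`; `Patch` is assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
enum Compat {
    Major(NonZeroU64),
    Minor(NonZeroU64),
    Patch(u64),
}

/// Buckets a version by its left-most nonzero component.
fn compat(major: u64, minor: u64, patch: u64) -> Compat {
    if let Some(m) = NonZeroU64::new(major) {
        return Compat::Major(m);
    }
    if let Some(m) = NonZeroU64::new(minor) {
        return Compat::Minor(m);
    }
    Compat::Patch(patch)
}
```

Under this sketch `1.2.3` and `1.9.0` share a bucket, while `0.2.3` and `0.3.0` do not. Deriving `PartialOrd` and `Ord`, as the diff does, lets the bucket be used inside ordered keys.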
@@ -75,18 +75,19 @@ impl From<&semver::Version> for SemverCompatibility {
 }

 impl PackageId {
-    pub fn as_activations_key(self) -> (InternedString, SourceId, SemverCompatibility) {
+    pub fn as_activations_key(self) -> ActivationsKey {
         (self.name(), self.source_id(), self.version().into())
     }
 }

 impl Context {
     pub fn new(check_public_visible_dependencies: bool) -> Context {
         Context {
+            age: 0,
             resolve_features: im_rc::HashMap::new(),
             links: im_rc::HashMap::new(),
             public_dependency: if check_public_visible_dependencies {
-                Some(im_rc::HashMap::new())
+                Some(PublicDependency::new())
             } else {
                 None
             },
@@ -109,7 +110,7 @@ impl Context {
         parent: Option<(&Summary, &Dependency)>,
     ) -> ActivateResult<bool> {
         let id = summary.package_id();
-        let age: ContextAge = self.age();
+        let age: ContextAge = self.age;
         match self.activations.entry(id.as_activations_key()) {
             im_rc::hashmap::Entry::Occupied(o) => {
                 debug_assert_eq!(
@@ -181,20 +182,49 @@ impl Context {
         })
     }

-    /// Returns the `ContextAge` of this `Context`.
-    /// For now we use (len of activations) as the age.
-    /// See the `ContextAge` docs for more details.
-    pub fn age(&self) -> ContextAge {
-        self.activations.len()
-    }
-
     /// If the package is active returns the `ContextAge` when it was added
     pub fn is_active(&self, id: PackageId) -> Option<ContextAge> {
         self.activations
             .get(&id.as_activations_key())
             .and_then(|(s, l)| if s.package_id() == id { Some(*l) } else { None })
     }

+    /// If the conflict reason on the package still applies returns the `ContextAge` when it was added
+    pub fn still_applies(&self, id: PackageId, reason: &ConflictReason) -> Option<ContextAge> {
+        self.is_active(id).and_then(|mut max| {
+            match reason {
+                ConflictReason::PublicDependency(name) => {
+                    if &id == name {
+                        return Some(max);
+                    }
+                    max = std::cmp::max(max, self.is_active(*name)?);
+                    max = std::cmp::max(
+                        max,
+                        self.public_dependency
+                            .as_ref()
+                            .unwrap()
+                            .can_see_item(*name, id)?,
+                    );
+                }
+                ConflictReason::PubliclyExports(name) => {
+                    if &id == name {
+                        return Some(max);
+                    }
+                    max = std::cmp::max(max, self.is_active(*name)?);
+                    max = std::cmp::max(
+                        max,
+                        self.public_dependency
+                            .as_ref()
+                            .unwrap()
+                            .publicly_exports_item(*name, id)?,
+                    );
+                }
+                _ => {}
+            }
+            Some(max)
+        })
+    }
+
     /// Checks whether all of `parent` and the keys of `conflicting activations`
     /// are still active.
     /// If so returns the `ContextAge` when the newest one was added.
@@ -204,12 +234,12 @@ impl Context {
         conflicting_activations: &ConflictMap,
     ) -> Option<usize> {
         let mut max = 0;
-        for &id in conflicting_activations.keys().chain(parent.as_ref()) {
-            if let Some(age) = self.is_active(id) {
-                max = std::cmp::max(max, age);
-            } else {
-                return None;
-            }
+        if let Some(parent) = parent {
+            max = std::cmp::max(max, self.is_active(parent)?);
+        }
+
+        for (id, reason) in conflicting_activations.iter() {
+            max = std::cmp::max(max, self.still_applies(*id, reason)?);
         }
         Some(max)
     }
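The rewritten loop above leans on `?` inside a function returning `Option`: the first fact that no longer holds short-circuits the whole check to `None`, and otherwise the newest age wins. A minimal standalone sketch of that pattern, using a hypothetical `newest_conflict_age` function and string names in place of `PackageId`:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the resolver's `ContextAge`.
type Age = usize;

/// Returns the age of the newest conflicting entry, or `None` as soon as
/// one of them is no longer active (so the recorded conflict is stale).
fn newest_conflict_age(active: &HashMap<&str, Age>, conflicting: &[&str]) -> Option<Age> {
    let mut max: Age = 0;
    for name in conflicting {
        // `?` returns `None` from the whole function if `name` is absent.
        max = max.max(*active.get(name)?);
    }
    Some(max)
}
```

Compared with the removed `if let ... else { return None; }` form, the behaviour is the same for the plain "is it active" case; the gain is that a richer per-reason check can be slotted in without changing the loop's shape.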
@@ -240,3 +270,165 @@ impl Context {
         graph
     }
 }
+
+impl Graph<PackageId, Rc<Vec<Dependency>>> {
+    pub fn parents_of(&self, p: PackageId) -> impl Iterator<Item = (PackageId, bool)> + '_ {
+        self.edges(&p)
+            .map(|(grand, d)| (*grand, d.iter().any(|x| x.is_public())))
+    }
+}
+
+#[derive(Clone, Debug, Default)]
+pub struct PublicDependency {
+    /// For each active package the set of all the names it can see,
+    /// for each name the exact package that name resolves to,
+    ///     the `ContextAge` when it was first visible,
+    ///     and the `ContextAge` when it was first exported.
+    inner: im_rc::HashMap<
+        PackageId,
+        im_rc::HashMap<InternedString, (PackageId, ContextAge, Option<ContextAge>)>,
+    >,
+}
+
+impl PublicDependency {
+    fn new() -> Self {
+        PublicDependency {
+            inner: im_rc::HashMap::new(),
+        }
+    }
+    fn publicly_exports(&self, candidate_pid: PackageId) -> Vec<PackageId> {
+        self.inner
+            .get(&candidate_pid) // if we have seen it before
+            .iter()
+            .flat_map(|x| x.values()) // all the things we have stored
+            .filter(|x| x.2.is_some()) // as publicly exported
+            .map(|x| x.0)
+            .chain(Some(candidate_pid)) // but even if not we know that everything exports itself
+            .collect()
+    }
+    fn publicly_exports_item(
+        &self,
+        candidate_pid: PackageId,
+        target: PackageId,
+    ) -> Option<ContextAge> {
+        debug_assert_ne!(candidate_pid, target);
+        let out = self
+            .inner
+            .get(&candidate_pid)
+            .and_then(|names| names.get(&target.name()))
+            .filter(|(p, _, _)| *p == target)
+            .and_then(|(_, _, age)| *age);
+        debug_assert_eq!(
+            out.is_some(),
+            self.publicly_exports(candidate_pid).contains(&target)
+        );
+        out
+    }
+    pub fn can_see_item(&self, candidate_pid: PackageId, target: PackageId) -> Option<ContextAge> {
+        self.inner
+            .get(&candidate_pid)
+            .and_then(|names| names.get(&target.name()))
+            .filter(|(p, _, _)| *p == target)
+            .map(|(_, age, _)| *age)
+    }
+    pub fn add_edge(
+        &mut self,
+        candidate_pid: PackageId,
+        parent_pid: PackageId,
+        is_public: bool,
+        age: ContextAge,
+        parents: &Graph<PackageId, Rc<Vec<Dependency>>>,
+    ) {
+        // one tricky part is that `candidate_pid` may already be active and
+        // have public dependencies of its own. So we not only need to mark
+        // `candidate_pid` as visible to its parents but also all of its existing
+        // publicly exported dependencies.
+        for c in self.publicly_exports(candidate_pid) {
+            // for each (transitive) parent that can newly see `c`
+            let mut stack = vec![(parent_pid, is_public)];
+            while let Some((p, public)) = stack.pop() {
+                match self.inner.entry(p).or_default().entry(c.name()) {
+                    im_rc::hashmap::Entry::Occupied(mut o) => {
+                        // the (transitive) parent can already see something by `c`s name, it had better be `c`.
+                        assert_eq!(o.get().0, c);
+                        if o.get().2.is_some() {
+                            // The previous time the parent saw `c`, it was a public dependency.
+                            // So all of its parents already know about `c`
+                            // and we can save some time by stopping now.
+                            continue;
+                        }
+                        if public {
+                            // Mark that `c` has now been seen publicly
+                            let old_age = o.get().1;
+                            o.insert((c, old_age, if public { Some(age) } else { None }));
+                        }
+                    }
+                    im_rc::hashmap::Entry::Vacant(v) => {
+                        // The (transitive) parent does not have anything by `c`s name,
+                        // so we add `c`.
+                        v.insert((c, age, if public { Some(age) } else { None }));
+                    }
+                }
+                // if `candidate_pid` was a private dependency of `p` then `p` parents can't see `c` through `p`
+                if public {
+                    // if it was public, then we add all of `p`s parents to be checked
+                    stack.extend(parents.parents_of(p));
+                }
+            }
+        }
+    }
+    pub fn can_add_edge(
+        &self,
+        b_id: PackageId,
+        parent: PackageId,
+        is_public: bool,
+        parents: &Graph<PackageId, Rc<Vec<Dependency>>>,
+    ) -> Result<
+        (),
+        (
+            ((PackageId, ConflictReason), (PackageId, ConflictReason)),
+            Option<(PackageId, ConflictReason)>,
+        ),
+    > {
+        // one tricky part is that `b_id` may already be active and
+        // have public dependencies of its own. So we not only need to check
+        // `b_id` as visible to its parents but also all of its existing
+        // publicly exported dependencies.
+        for t in self.publicly_exports(b_id) {
+            // for each (transitive) parent that can newly see `t`
+            let mut stack = vec![(parent, is_public)];
+            while let Some((p, public)) = stack.pop() {
+                // TODO: don't look at the same thing more than once
+                if let Some(o) = self.inner.get(&p).and_then(|x| x.get(&t.name())) {
+                    if o.0 != t {
+                        // the (transitive) parent can already see a different version by `t`s name.
+                        // So, adding `b` will cause `p` to have a public dependency conflict on `t`.
+                        return Err((
+                            (o.0, ConflictReason::PublicDependency(p)), // p can see the other version and
+                            (parent, ConflictReason::PublicDependency(p)), // p can see us
+                        ))
+                        .map_err(|e| {
+                            if t == b_id {
+                                (e, None)
+                            } else {
+                                (e, Some((t, ConflictReason::PubliclyExports(b_id))))
+                            }
+                        });
+                    }
+                    if o.2.is_some() {
+                        // The previous time the parent saw `t`, it was a public dependency.
+                        // So all of its parents already know about `t`
+                        // and we can save some time by stopping now.
+                        continue;
+                    }
+                }
+                // if `b` was a private dependency of `p` then `p` parents can't see `t` through `p`
+                if public {
+                    // if it was public, then we add all of `p`s parents to be checked
+                    stack.extend(parents.parents_of(p));
+                }
+            }
+        }
+        Ok(())
+    }
+}
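To make the propagation rule in `add_edge` and `can_add_edge` easier to follow, here is a much-reduced sketch. Everything in it is an assumption made for illustration: packages are `(name, version)` tuples, ages are dropped in favour of a plain "exported" flag, std `HashMap` replaces `im_rc::HashMap`, and the parent graph is stored inside the struct instead of being passed in. It keeps only the core idea that visibility flows upward through public edges and stops at private ones.

```rust
use std::collections::HashMap;

/// Hypothetical simplified package id: (name, version).
type Pkg = (&'static str, u32);

#[derive(Default)]
struct Visibility {
    /// package -> name -> (exact package seen under that name, publicly exported?)
    seen: HashMap<Pkg, HashMap<&'static str, (Pkg, bool)>>,
    /// child -> list of (parent, whether the edge is public)
    parents: HashMap<Pkg, Vec<(Pkg, bool)>>,
}

impl Visibility {
    /// Everything `pkg` publicly exports, plus `pkg` itself.
    fn exports(&self, pkg: Pkg) -> Vec<Pkg> {
        let mut out: Vec<Pkg> = self
            .seen
            .get(&pkg)
            .into_iter()
            .flat_map(|m| m.values())
            .filter(|(_, exported)| *exported)
            .map(|(p, _)| *p)
            .collect();
        out.push(pkg);
        out
    }

    /// `Err(p)` if some (transitive) parent `p` already sees a different
    /// package under a name that the new edge would make visible.
    fn can_add_edge(&self, child: Pkg, parent: Pkg, is_public: bool) -> Result<(), Pkg> {
        for t in self.exports(child) {
            let mut stack = vec![(parent, is_public)];
            while let Some((p, public)) = stack.pop() {
                if let Some(&(existing, exported)) = self.seen.get(&p).and_then(|m| m.get(&t.0)) {
                    if existing != t {
                        return Err(p);
                    }
                    if exported {
                        continue; // ancestors of `p` were already checked
                    }
                }
                if public {
                    stack.extend(self.parents.get(&p).into_iter().flatten().copied());
                }
            }
        }
        Ok(())
    }

    /// Records the edge and marks every newly visible item on each
    /// (transitive) parent reachable through public edges.
    fn add_edge(&mut self, child: Pkg, parent: Pkg, is_public: bool) {
        let items = self.exports(child);
        self.parents.entry(child).or_default().push((parent, is_public));
        for t in items {
            let mut stack = vec![(parent, is_public)];
            while let Some((p, public)) = stack.pop() {
                let entry = self.seen.entry(p).or_default().entry(t.0).or_insert((t, false));
                debug_assert_eq!(entry.0, t);
                if entry.1 {
                    continue; // already exported, ancestors already know
                }
                entry.1 = public;
                if public {
                    stack.extend(self.parents.get(&p).into_iter().flatten().copied());
                }
            }
        }
    }
}
```

In this sketch, if `a` publicly depends on `b` version 1 and `root` depends on `a`, then `root` can see `b` version 1, and a later attempt to give `root` a dependency on `b` version 2 is rejected. If `a` depends on `b` only privately, `root` never sees it and the second version is accepted. The real code additionally records the `ContextAge` of each fact so the resolver knows how far back to jump.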
