|
1 |
| -//! This module provides functions to deconstruct and reconstruct patterns into a constructor |
2 |
| -//! applied to some fields. This is used by the `_match` module to compute pattern |
3 |
| -//! usefulness/exhaustiveness. |
| 1 | +//! [`super::usefulness`] explains most of what is happening in this file. As explained there, |
| 2 | +//! values and patterns are made from constructors applied to fields. This file defines a |
| 3 | +//! `Constructor` enum, a `Fields` struct, and various operations to manipulate them and convert |
| 4 | +//! them from/to patterns. |
| 5 | +//! |
| 6 | +//! There's one idea that is not detailed in [`super::usefulness`] because the details are not |
| 7 | +//! needed there: _constructor splitting_. |
| 8 | +//! |
| 9 | +//! # Constructor splitting |
| 10 | +//! |
| 11 | +//! The idea is as follows: given a constructor `c` and a matrix, we want to specialize in turn |
| 12 | +//! with all the value constructors that are covered by `c`, and compute usefulness for each. |
| 13 | +//! Instead of listing all those constructors (which is intractable), we group those value |
| 14 | +//! constructors together as much as possible. Example: |
| 15 | +//! |
| 16 | +//! ``` |
| 17 | +//! match (0, false) { |
| 18 | +//! (0 ..=100, true) => {} // `p_1` |
| 19 | +//! (50..=150, false) => {} // `p_2` |
| 20 | +//! (0 ..=200, _) => {} // `q` |
| 21 | +//! } |
| 22 | +//! ``` |
| 23 | +//! |
| 24 | +//! The naive approach would try all numbers in the range `0..=200`. But we can be a lot more |
| 25 | +//! clever: `0` and `1` for example will match the exact same rows, and return equivalent |
| 26 | +//! witnesses. In fact all of `0..50` would. We can thus restrict our exploration to 4 |
| 27 | +//! constructors: `0..50`, `50..=100`, `101..=150` and `151..=200`. That is enough and infinitely |
| 28 | +//! more tractable. |
| 29 | +//! |
| 30 | +//! We capture this idea in a function `split(p_1 ... p_n, c)` which returns a list of constructors |
| 31 | +//! `c'` covered by `c`. Given such a `c'`, we require that all value ctors `c''` covered by `c'` |
| 32 | +//! return an equivalent set of witnesses after specializing and computing usefulness. |
| 33 | +//! In the example above, witnesses for specializing by `c''` covered by `0..50` will only differ |
| 34 | +//! in their first element. |
| 35 | +//! |
| 36 | +//! We usually also ask that the `c'` together cover all of the original `c`. However we allow |
| 37 | +//! skipping some constructors as long as it doesn't change whether the resulting list of witnesses |
| 38 | +//! is empty of not. We use this in the wildcard `_` case. |
| 39 | +//! |
| 40 | +//! Splitting is implemented in the [`Constructor::split`] function. We don't do splitting for |
| 41 | +//! or-patterns; instead we just try the alternatives one-by-one. For details on splitting |
| 42 | +//! wildcards, see [`SplitWildcard`]; for integer ranges, see [`SplitIntRange`]; for slices, see |
| 43 | +//! [`SplitVarLenSlice`]. |
| 44 | +
|
4 | 45 | use self::Constructor::*;
|
5 | 46 | use self::SliceKind::*;
|
6 | 47 |
|
@@ -260,13 +301,13 @@ enum IntBorder {
|
260 | 301 | AfterMax,
|
261 | 302 | }
|
262 | 303 |
|
263 |
| -/// A range of integers that is partitioned into disjoint subranges. |
264 |
| -/// |
265 |
| -/// This is fed an input of multiple ranges, and returns an output that covers the union of the |
266 |
| -/// inputs but is split so that an output range only intersects an input range by being a subrange |
267 |
| -/// of it. No output range straddles the boundary of one of the inputs. This does constructor |
| 304 | +/// A range of integers that is partitioned into disjoint subranges. This does constructor |
268 | 305 | /// splitting for integer ranges as explained at the top of the file.
|
269 | 306 | ///
|
| 307 | +/// This is fed multiple ranges, and returns an output that covers the input, but is split so that |
| 308 | +/// the only intersections between an output range and a seen range are inclusions. No output range |
| 309 | +/// straddles the boundary of one of the inputs. |
| 310 | +/// |
270 | 311 | /// The following input:
|
271 | 312 | /// ```
|
272 | 313 | /// |-------------------------| // `self`
|
@@ -405,54 +446,67 @@ impl Slice {
|
405 | 446 | }
|
406 | 447 | }
|
407 | 448 |
|
408 |
| -/// The exhaustiveness-checking paper does not include any details on checking variable-length |
409 |
| -/// slice patterns. However, they may be matched by an infinite collection of fixed-length array |
410 |
| -/// patterns. |
411 |
| -/// |
412 |
| -/// Checking the infinite set directly would take an infinite amount of time. However, it turns out |
413 |
| -/// that for each finite set of patterns `P`, all sufficiently large array lengths are equivalent: |
414 |
| -/// |
415 |
| -/// Each slice `s` with a "sufficiently-large" length `l ≥ L` that applies to exactly the subset |
416 |
| -/// `Pₜ` of `P` can be transformed to a slice `sₘ` for each sufficiently-large length `m` that |
417 |
| -/// applies to exactly the same subset of `P`. |
418 |
| -/// |
419 |
| -/// Because of that, each witness for reachability-checking of one of the sufficiently-large |
420 |
| -/// lengths can be transformed to an equally-valid witness of any other length, so we only have to |
421 |
| -/// check slices of the "minimal sufficiently-large length" and less. |
422 |
| -/// |
423 |
| -/// Note that the fact that there is a *single* `sₘ` for each `m` not depending on the specific |
424 |
| -/// pattern in `P` is important: if you look at the pair of patterns `[true, ..]` `[.., false]` |
425 |
| -/// Then any slice of length ≥1 that matches one of these two patterns can be trivially turned to a |
426 |
| -/// slice of any other length ≥1 that matches them and vice-versa, but the slice of length 2 |
427 |
| -/// `[false, true]` that matches neither of these patterns can't be turned to a slice from length 1 |
428 |
| -/// that matches neither of these patterns, so we have to consider slices from length 2 there. |
| 449 | +/// This computes constructor splitting for variable-length slices, as explained at the top of the |
| 450 | +/// file. |
429 | 451 | ///
|
430 |
| -/// Now, to see that that length exists and find it, observe that slice patterns are either |
431 |
| -/// "fixed-length" patterns (`[_, _, _]`) or "variable-length" patterns (`[_, .., _]`). |
| 452 | +/// A slice pattern `[x, .., y]` behaves like the infinite or-pattern `[x, y] | [x, _, y] | [x, _, |
| 453 | +/// _, y] | ...`. The corresponding value constructors are fixed-length array constructors above a |
| 454 | +/// given minimum length. We obviously can't list all of this infinity of constructors. Thankfully, |
| 455 | +/// it turns out that for each finite set of slice patterns, all sufficiently large array lengths |
| 456 | +/// are equivalent. |
432 | 457 | ///
|
433 |
| -/// For fixed-length patterns, all slices with lengths *longer* than the pattern's length have the |
434 |
| -/// same outcome (of not matching), so as long as `L` is greater than the pattern's length we can |
435 |
| -/// pick any `sₘ` from that length and get the same result. |
| 458 | +/// Let's look at an example, where we are trying to split the last pattern: |
| 459 | +/// ``` |
| 460 | +/// match x { |
| 461 | +/// [true, true, ..] => {} |
| 462 | +/// [.., false, false] => {} |
| 463 | +/// [..] => {} |
| 464 | +/// } |
| 465 | +/// ``` |
| 466 | +/// Here are the results of specialization for the first few lengths: |
| 467 | +/// ``` |
| 468 | +/// // length 0 |
| 469 | +/// [] => {} |
| 470 | +/// // length 1 |
| 471 | +/// [_] => {} |
| 472 | +/// // length 2 |
| 473 | +/// [true, true] => {} |
| 474 | +/// [false, false] => {} |
| 475 | +/// [_, _] => {} |
| 476 | +/// // length 3 |
| 477 | +/// [true, true, _ ] => {} |
| 478 | +/// [_, false, false] => {} |
| 479 | +/// [_, _, _ ] => {} |
| 480 | +/// // length 4 |
| 481 | +/// [true, true, _, _ ] => {} |
| 482 | +/// [_, _, false, false] => {} |
| 483 | +/// [_, _, _, _ ] => {} |
| 484 | +/// // length 5 |
| 485 | +/// [true, true, _, _, _ ] => {} |
| 486 | +/// [_, _, _, false, false] => {} |
| 487 | +/// [_, _, _, _, _ ] => {} |
| 488 | +/// ``` |
436 | 489 | ///
|
437 |
| -/// For variable-length patterns, the situation is more complicated, because as seen above the |
438 |
| -/// precise value of `sₘ` matters. |
| 490 | +/// If we went above length 5, we would simply be inserting more columns full of wildcards in the |
| 491 | +/// middle. This means that the set of witnesses for length `l >= 5` if equivalent to the set for |
| 492 | +/// any other `l' >= 5`: simply add or remove wildcards in the middle to convert between them. |
439 | 493 | ///
|
440 |
| -/// However, for each variable-length pattern `p` with a prefix of length `plₚ` and suffix of |
441 |
| -/// length `slₚ`, only the first `plₚ` and the last `slₚ` elements are examined. |
| 494 | +/// This applies to any set of slice patterns: there will be a length `L` above which all length |
| 495 | +/// behave the same. This is exactly what we need for constructor splitting. Therefore a |
| 496 | +/// variable-length slice can be split into a variable-length slice of minimal length `L`, and many |
| 497 | +/// fixed-length slices of lengths `< L`. |
442 | 498 | ///
|
443 |
| -/// Therefore, as long as `L` is positive (to avoid concerns about empty types), all elements after |
444 |
| -/// the maximum prefix length and before the maximum suffix length are not examined by any |
445 |
| -/// variable-length pattern, and therefore can be added/removed without affecting them - creating |
446 |
| -/// equivalent patterns from any sufficiently-large length. |
| 499 | +/// For each variable-length pattern `p` with a prefix of length `plₚ` and suffix of length `slₚ`, |
| 500 | +/// only the first `plₚ` and the last `slₚ` elements are examined. Therefore, as long as `L` is |
| 501 | +/// positive (to avoid concerns about empty types), all elements after the maximum prefix length |
| 502 | +/// and before the maximum suffix length are not examined by any variable-length pattern, and |
| 503 | +/// therefore can be added/removed without affecting them - creating equivalent patterns from any |
| 504 | +/// sufficiently-large length. |
447 | 505 | ///
|
448 | 506 | /// Of course, if fixed-length patterns exist, we must be sure that our length is large enough to
|
449 | 507 | /// miss them all, so we can pick `L = max(max(FIXED_LEN)+1, max(PREFIX_LEN) + max(SUFFIX_LEN))`
|
450 | 508 | ///
|
451 | 509 | /// `max_slice` below will be made to have arity `L`.
|
452 |
| -/// |
453 |
| -/// For example, with the above pair of patterns, all elements but the first and last can be |
454 |
| -/// added/removed, so any witness of length ≥2 (say, `[false, false, true]`) can be turned to a |
455 |
| -/// witness from any other length ≥2. |
456 | 510 | #[derive(Debug)]
|
457 | 511 | struct SplitVarLenSlice {
|
458 | 512 | /// If the type is an array, this is its size.
|
@@ -787,11 +841,19 @@ impl<'tcx> Constructor<'tcx> {
|
787 | 841 |
|
788 | 842 | /// A wildcard constructor that we split relative to the constructors in the matrix, as explained
|
789 | 843 | /// at the top of the file.
|
790 |
| -/// For splitting wildcards, there are two groups of constructors: there are the constructors |
791 |
| -/// actually present in the matrix (`matrix_ctors`), and the constructors not present. Two |
792 |
| -/// constructors that are not in the matrix will either both be covered (by a wildcard), or both |
793 |
| -/// not be covered by any given row. Therefore we can keep the missing constructors grouped |
794 |
| -/// together. |
| 844 | +/// |
| 845 | +/// A constructor that is not present in the matrix rows will only be covered by the rows that have |
| 846 | +/// wildcards. Thus we can group all of those constructors together; we call them "missing |
| 847 | +/// constructors". Splitting a wildcard would therefore list all present constructors individually |
| 848 | +/// (or grouped if they are integers or slices), and then all missing constructors together as a |
| 849 | +/// group. |
| 850 | +/// |
| 851 | +/// However we can go further: since any constructor will match the wildcard rows, and having more |
| 852 | +/// rows can only reduce the amount of usefulness witnesses, we can skip the present constructors |
| 853 | +/// and only try the missing ones. |
| 854 | +/// This will not preserve the whole list of witnesses, but will preserve whether the list is empty |
| 855 | +/// or not. In fact this is quite natural from the point of view of diagnostics too. This is done |
| 856 | +/// in `to_ctors`: in some cases we only return `Missing`. |
795 | 857 | #[derive(Debug)]
|
796 | 858 | pub(super) struct SplitWildcard<'tcx> {
|
797 | 859 | /// Constructors seen in the matrix.
|
|
0 commit comments