Skip to content

Adding Property Tests to DlxLib

Jonathan Taylor edited this page Mar 8, 2015 · 3 revisions

Introduction

I wrote DlxLib a couple of years ago and included unit tests in the solution. Since then, I have discovered property-based testing - originally in the context of Scala as I was working through the book, Functional Programming in Scala, and subsequently in the form of QuickCheck whilst learning Haskell. Then I realised that property-based testing was available in the .NET world in the form of FsCheck. I added property tests to some of my other projects. And then it occurred to me that I could also add property tests to DlxLib. I have now managed to achieve this - in both C# and F#. I am happy with the results but I must admit that it was a struggle - even though I already had experience of adding property tests to other projects.

Quick overview of DlxLib

Given a matrix of 0s and 1s, DlxLib can find all solutions where a solution consists of a subset of the rows of the matrix such that each column contains exactly one 1. For example, the following matrix has 3 solutions:

Row 0:	{1, 0, 0, 0}
Row 1:	{0, 1, 1, 0}
Row 2:	{1, 0, 0, 1}
Row 3:	{0, 0, 1, 1}
Row 4:	{0, 1, 0, 0}
Row 5:	{0, 0, 1, 0}

Here are the solutions:

Row 0:	{1, 0, 0, 0}
Row 3:	{0, 0, 1, 1}
Row 4:	{0, 1, 0, 0}
Row 1:	{0, 1, 1, 0}
Row 2:	{1, 0, 0, 1}
Row 2:	{1, 0, 0, 1}
Row 4:	{0, 1, 0, 0}
Row 5:	{0, 0, 1, 0}

Quick overview of FsCheck

With FsCheck, you write a property that you believe should be true for all valid inputs. FsCheck will then generate random input data and try to falsify your property. Here is a quick example:

[Test]
public void IntTest()
{
    var arb = Arb.fromGen(Gen.choose(1, 10));
    var body = FSharpFunc<int, bool>.FromConverter(n => n * n <= 100);
    var property = Prop.forAll(arb, body);
    Check.One(Config.QuickThrowOnFailure, property);
}

Here, we generate random numbers between 1 and 10 inclusive. Our property then states that the each such number squared is less than or equal to 100. When we run this test, FsCheck will execute the above code 100 times (this is configurable). However, it will stop if it generates a value that falsifies the property.

I could have written this simple example more succinctly using FsCheck's Fluent API:

[Test]
public void IntTest()
{
    Spec.For(Any.IntBetween(1, 10), n => n * n <= 100).QuickCheckThrowOnFailure();
}

In writing the C# version of the DlxLib property tests, I have used the Fluent API when generating random matrices but I have used the regular API to implement the actual tests.

Matrices with no solutions

I figured that the easiest place to start was to generate random matrices with no solutions. Since a solution must have exactly one 1 in each column, all I had to do was ensure that one column in the matrix contained no 1s so that no combination of rows could possibly form a valid solution. Here is how I achieved it:

private static Gen<int[,]> GenMatrixOfIntWithNoSolutions()
{
    return
        from numCols in Any.IntBetween(2, 20)
        from numRows in Any.IntBetween(2, 20)
        from indexOfAlwaysZeroColumn in Any.IntBetween(0, numCols - 1)
        from rows in GenRowWithZeroInGivenColumn(numCols, indexOfAlwaysZeroColumn).MakeListOfLength(numRows)
        select rows.To2DArray();
}

These are the steps:

  1. Choose a random number of columns between 2 and 20 inclusive
  2. Choose a random number of rows between 2 and 20 inclusive
  3. Randomly choose the column that will contain all 0s
  4. Use the GenRowWithZeroInGivenColumn auxiliary method to generate a list of rows
  5. Use the To2DArray extension method to convert the List<List<int>> to a 2-dimensional array, int[,]

GenRowWithZeroInGivenColumn looks like this:

private static Gen<List<int>> GenRowWithZeroInGivenColumn(int numCols, int indexOfAlwaysZeroColumn)
{
    return
        from row in Any.ValueIn(0, 1).MakeListOfLength(numCols).Select(r =>
        {
            r[indexOfAlwaysZeroColumn] = 0;
            return r;
        })
        select row;
}

These are the steps:

  1. Make a list of length of numCols where each value is randomly chosen to be either 0 or 1
  2. Map the generated row using GeneratorExtensions.Select which allows us to ensure that the value at the given index is a 0

Here is an example of a matrix generated by GenMatrixOfIntWithNoSolutions. Note that, in this particular case, the fourth column contains all zeros.

[[0; 0; 0; 0; 0; 1; 1; 1; 1; 0; 0; 0; 0]
 [0; 1; 0; 0; 0; 0; 1; 1; 1; 1; 0; 1; 0]
 [0; 0; 1; 0; 1; 1; 0; 0; 0; 1; 1; 0; 1]
 [0; 1; 1; 0; 0; 0; 1; 1; 0; 0; 0; 1; 1]
 [1; 1; 0; 0; 1; 1; 0; 1; 1; 0; 0; 0; 0]
 [1; 0; 1; 0; 0; 0; 0; 1; 0; 1; 0; 0; 0]
 [0; 0; 0; 0; 0; 1; 1; 1; 1; 0; 0; 0; 0]
 [1; 1; 1; 0; 0; 0; 0; 0; 0; 1; 1; 1; 1]
 [0; 0; 0; 0; 1; 1; 1; 1; 1; 0; 0; 0; 0]
 [1; 1; 1; 0; 0; 0; 0; 0; 0; 1; 1; 1; 1]]

Here is the test that uses GenMatrixOfIntWithNoSolutions to generate random matrices and checks that DlxLib correctly finds no solutions:

[Test]
public void ExactCoverProblemsWithNoSolutionsTest()
{
    Func<int, string> makeLabel = numSolutions => string.Format(
        "Expected no solutions but got {0}",
        numSolutions);

    var arbMatrix = Arb.fromGen(GenMatrixOfIntWithNoSolutions());

    var property = Prop.forAll(arbMatrix, FSharpFunc<int[,], Property>.FromConverter(matrix =>
    {
        var solutions = new Dlx().Solve(matrix).ToList();
        return PropExtensions.Label(!solutions.Any(), makeLabel(solutions.Count()));
    }));

    Check.One(Config, property);
}

Matrices with a single solution

Next, I wanted to generate random matrices with exactly one solution. To do this, I randomly pick the number of rows that the solution will be spread over. Then, for each column, I randomly pick the solution row that will contain the 1. I then hide the solution rows in a large sea of rows containing all 0s. I then check that DlxLib finds exactly one solution and that the solution is correct.

Here is the code for the main generator:

private static Gen<int[,]> GenMatrixOfIntWithSingleSolution()
{
    return
        from numCols in Any.IntBetween(2, 20)
        from solution in GenSolution(numCols)
        from numRows in Any.IntBetween(solution.Count, solution.Count * 5)
        from matrix in Any.Value(0).MakeListOfLength(numCols).MakeListOfLength(numRows)
        from randomRowIdxs in GenExtensions.PickValues(solution.Count, Enumerable.Range(0, numRows))
        select PokeSolutionRowsIntoMatrix(matrix, solution, randomRowIdxs).To2DArray();
}

GenSolution generates a random solution with numCols columns and split across a random number of rows. I then use GenExtensions.PickValues, part of my FsCheckUtils, to randomly choose the locations in the larger matrix of all zeros at which to distribute the solution rows.

Here is the code generate a random solution:

private static Gen<List<List<int>>> GenSolution(int numCols)
{
    return
        from numSolutionRows in Any.IntBetween(1, Math.Min(5, numCols))
        from solutionRows in GenSolutionRows(numCols, numSolutionRows)
        select solutionRows;
}

I first randomly choose the number of rows that will comprise the solution from the range 1 to 5 inclusive. However, the upper bound needs to be reduced if the number of columns is less than 5. I then use GenSolutionRows to generate the solution rows.

Here is the code that generates the solution rows:

private static Gen<List<List<int>>> GenSolutionRows(int numCols, int numSolutionRows)
{
    return
        from solutionRows in Any.Value(0).MakeListOfLength(numCols).MakeListOfLength(numSolutionRows)
        from randomRowIdxs in Any.IntBetween(0, numSolutionRows - 1).MakeListOfLength(numCols)
        select RandomlySprinkleOnesIntoSolutionRows(solutionRows, randomRowIdxs);
}

I initially generate the given number of rows with the given number of columns and set all the values to 0. Then, for each column, I randomly choose which solution row will contain the 1. I then use the following auxiliary function to store the 1s in the chosen places:

private static List<List<int>> RandomlySprinkleOnesIntoSolutionRows(
    List<List<int>> solutionRows,
    IEnumerable<int> randomRowIdxs,
    int startIdx = 0)
{
    var colIndex = startIdx;
    foreach (var randomRowIdx in randomRowIdxs) solutionRows[randomRowIdx][colIndex++] = 1;
    return solutionRows;
}

The last step of GenMatrixOfIntWithSingleSolution uses the following auxiliary function to store the solution rows in the larger matrix of 0s:

private static List<List<int>> PokeSolutionRowsIntoMatrix(
    List<List<int>> matrix,
    IReadOnlyList<List<int>> solutionRows,
    IEnumerable<int> randomRowIdxs)
{
    var fromIdx = 0;
    foreach (var toIdx in randomRowIdxs) matrix[toIdx] = solutionRows[fromIdx++];
    return matrix;
}

A typical randomly generated matrix with one solution looks like this:

[[0; 0; 0; 1; 1; 1; 0; 1; 0; 0; 0; 0; 0; 0; 1; 0]
 [0; 0; 0; 0; 0; 0; 0; 0; 1; 0; 0; 1; 0; 1; 0; 0]
 [0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0]
 [0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0]
 [1; 1; 1; 0; 0; 0; 1; 0; 0; 1; 1; 0; 1; 0; 0; 1]
 [0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0]]

Here is the test that uses GenMatrixOfIntWithSingleSolution to generate random matrices and checks that DlxLib finds exactly one solution and that the solution is correct:

[Test]
public void ExactCoverProblemsWithSingleSolutionTest()
{
    Func<int, string> makeLabel = numSolutions => string.Format(
        "Expected exactly one solution but got {0}",
        numSolutions);

    var arbMatrix = Arb.fromGen(GenMatrixOfIntWithSingleSolution());

    var property = Prop.forAll(arbMatrix, FSharpFunc<int[,], Property>.FromConverter(matrix =>
    {
        var solutions = new Dlx().Solve(matrix).ToList();
        var p1 = PropExtensions.Label(solutions.Count() == 1, makeLabel(solutions.Count()));
        var p2 = CheckSolutions(solutions, matrix);
        return PropExtensions.And(p1, p2);
    }));

    Check.One(Config, property);
}

I discuss CheckSolutions later.

Matrices with multiple solutions

Next, I wanted to be able to generate random matrices with a given number of solutions. The obvious approach would be to use GenSolution to generate multiple solutions. However, the problem with this approach is that subsets of the rows from the various solutions may also combine together to form additional unwanted solutions. I needed to find a way to avoid this interference between sets of solution rows.

The answer was to partition the width of the matrix. For example, lets say that the matrix width has been randomly chosen to be 15 columns and we want the matrix to have 2 solutions where the first solution has 2 rows and the second solution has 3 rows. Say I choose random partition lengths that add up to 15 e.g. 6 and 9. So, the first solution will contain random values in the first 6 columns and filler values elsewhere. The second solution will contain random values in the last 9 columns and filler values elsewhere. The filler values need to be 1s for the first row of a solution and 0s for the other rows of a solution. Here is an example:

First solution, Row 0:	{x, x, x, x, x, x, 1, 1, 1, 1, 1, 1, 1, 1, 1} 
First solution, Row 1:	{x, x, x, x, x, x, 0, 0, 0, 0, 0, 0, 0, 0, 0}
Second solution, Row 0:	{1, 1, 1, 1, 1, 1, y, y, y, y, y, y, y, y, y}
Second solution, Row 1:	{0, 0, 0, 0, 0, 0, y, y, y, y, y, y, y, y, y}
Second solution, Row 2:	{0, 0, 0, 0, 0, 0, y, y, y, y, y, y, y, y, y}

In the first solution, the 'x's denote the columns in which we need to randomly sprinkle our 1s ensuring that each column contains exactly one 1 as usual. Similarly, in the second solution, the 'y's denote the columns in which we need to randomly sprinkle our 1s. The filler values ensure that the sets of solution rows do not interfere with each other.

However, there is a remaining subtlety that needs to be handled. We need to disallow the random part (the 'x's and 'y's above) from being all 0s otherwise the following situation can occur where the two indicated rows form an additional unwanted solution:

First solution, Row 0:	{0, 0, 0, 0, 1, 1, 1, 1}	<--
First solution, Row 1:	{0, 1, 0, 1, 0, 0, 0, 0}
First solution, Row 2:	{0, 0, 0, 0, 0, 0, 0, 0}
First solution, Row 3:	{1, 0, 1, 0, 0, 0, 0, 0}
Second solution, Row 0:	{1, 1, 1, 1, 0, 0, 0, 0}	<--
Second solution, Row 1:	{0, 0, 0, 0, 0, 1, 1, 1}
Second solution, Row 2:	{0, 0, 0, 0, 1, 0, 0, 0}

This is the main method that generates the random matrices:

private static Gen<int[,]> GenMatrixOfIntWithMultipleSolutions(int numSolutions)
{
    return
        from numCols in Any.IntBetween(numSolutions, numSolutions * 10)
        from partitions in GenPartitions(numCols, numSolutions)
        from solutions in GenPartitionedSolutions(numCols, partitions)
        let combinedSolutions = CombineSolutions(solutions)
        from numRows in Any.IntBetween(combinedSolutions.Count, combinedSolutions.Count * 5)
        from matrix in Any.Value(0).MakeListOfLength(numCols).MakeListOfLength(numRows)
        from randomRowIdxs in GenExtensions.PickValues(combinedSolutions.Count, Enumerable.Range(0, numRows))
        select PokeSolutionRowsIntoMatrix(matrix, combinedSolutions, randomRowIdxs).To2DArray();
}

The key parts are the calls to GenPartitions and GenPartitionedSolutions. GenPartitions randomly chooses the lengths of the partitions and then creates a sequence of Tuples (pairs of ints specifying the start index and one-after-the-end index of each partition) to describe the chosen partitions. In the example above, we wanted to partition 15 columns for 2 solutions with one being 6 columns wide and the other being 9 columns wide. The sequence of tuples would thus look like (0, 6), (6, 15).

private static Gen<IEnumerable<Tuple<int, int>>> GenPartitions(int numCols, int numSolutions)
{
    return
        from partitionLengths in GenPartitionLengths(numCols, numSolutions)
        select MakePartitions(partitionLengths);
}

private static Gen<IEnumerable<int>> GenPartitionLengths(int numCols, int numSolutions)
{
    return
        from partitionLengths in Any.IntBetween(1, numCols / 2).MakeListOfLength(numSolutions - 1)
        let sum = partitionLengths.Sum()
        where sum < numCols
        select partitionLengths.Concat(new[] { numCols - sum });
}

private static IEnumerable<Tuple<int, int>> MakePartitions(IEnumerable<int> partitionLengths)
{
    var currentStartIdx = 0;

    return partitionLengths.Select(partitionLength =>
    {
        var tuple = Tuple.Create(currentStartIdx, currentStartIdx + partitionLength);
        currentStartIdx += partitionLength;
        return tuple;
    });
}

Having randomly chosen the partitions, we can then generate the solutions. GenPartitionedSolutions invokes GenPartitionedSolution for each partition. This results in an enumerable of generators of solutions with the type IEnumerable<Gen<List<List<int>>>>. We can pass this enumerable to the Fluent API's Any.SequenceOf which converts a sequence of generators of values into a generator of a list of values. In our case, it converts a sequence of solution generators into a generator of a list of solutions.

private static Gen<List<List<List<int>>>> GenPartitionedSolutions(int numCols, IEnumerable<Tuple<int, int>> partitions)
{
    return Any.SequenceOf(partitions.Select(partition =>
    {
        var startIdx = partition.Item1;
        var endIdx = partition.Item2;
        return GenPartitionedSolution(numCols, startIdx, endIdx);
    }));
}

GenPartitionedSolution is very similar to GenSolution that we saw earlier except that it uses GenPartitionedSolutionRows instead of GenSolutionRows to generate rows for the solution.

private static Gen<List<List<int>>> GenPartitionedSolution(int numCols, int startIdx, int endIdx)
{
    return
        from numSolutionRows in Any.IntBetween(1, Math.Min(5, endIdx - startIdx))
        from solutionRows in GenPartitionedSolutionRows(numCols, startIdx, endIdx, numSolutionRows)
        select solutionRows;
}

GenPartitionedSolutionRows is very similar to GenSolutionRows. The differences are:

  1. We use InitPartitionedSolutionRows to initialise the solution rows with the correct filler values. The sections of each solution row that will eventually contain random 0s and 1s are initially set to all 0s.

  2. We have an extra line of code that ensures that the randomly picked randomRowIdxs covers all the solution rows to handle the subtlety referred to above - we don't allow the random part of any generated solution row to be all 0s.

private static Gen<List<List<int>>> GenPartitionedSolutionRows(int numCols, int startIdx, int endIdx, int numSolutionRows)
{
    return
        from solutionRows in Any.Value(InitPartitionedSolutionRows(numCols, startIdx, endIdx, numSolutionRows))
        from randomRowIdxs in Any.IntBetween(0, numSolutionRows - 1).MakeListOfLength(endIdx - startIdx)
        where Enumerable.Range(0, numSolutionRows).All(randomRowIdxs.Contains)
        select RandomlySprinkleOnesIntoSolutionRows(solutionRows, randomRowIdxs, startIdx);
}

InitPartitionedSolutionRows initialises all the solution rows for a partitioned solution. The filler parts are set to the appropriate filler value which means 1s for the first row in a solution and 0s for the other rows in a solution. The random part of each solution row is initialised to all 0s. The 1s will be randomly sprinkled in later by the caller, i.e. GenPartitionedSolutionRows.

private static List<List<int>> InitPartitionedSolutionRows(int numCols, int startIdx, int endIdx, int numSolutionRows)
{
    var firstRow = InitPartitionedSolutionFirstRow(numCols, startIdx, endIdx);
    var otherRows = InitPartitionedSolutionOtherRows(numCols, startIdx, endIdx, numSolutionRows - 1);
    var combinedRows = new List<List<int>> {firstRow};
    combinedRows.AddRange(otherRows);
    return combinedRows;
}

private static List<int> InitPartitionedSolutionFirstRow(int numCols, int startIdx, int endIdx)
{
    return InitPartitionedSolutionRow(numCols, startIdx, endIdx, true);
}

private static IEnumerable<List<int>> InitPartitionedSolutionOtherRows(int numCols, int startIdx, int endIdx, int numOtherRows)
{
    return Enumerable.Range(0, numOtherRows)
        .Select(_ => InitPartitionedSolutionRow(numCols, startIdx, endIdx, false));
}

private static List<int> InitPartitionedSolutionRow(int numCols, int startIdx, int endIdx, bool isFirstRow)
{
    var fillerValue = (isFirstRow) ? 1 : 0;

    var prefixPartLength = startIdx;
    var randomPartLength = endIdx - startIdx;
    var suffixPartLength = numCols - endIdx;

    var prefixPart = Enumerable.Repeat(fillerValue, prefixPartLength);
    var randomPart = Enumerable.Repeat(0, randomPartLength);
    var suffixPart = Enumerable.Repeat(fillerValue, suffixPartLength);

    return prefixPart.Concat(randomPart).Concat(suffixPart).ToList();
}

Here is an example of a matrix generated by GenMatrixOfIntWithMultipleSolutions:

[[0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0]
 [0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0]
 [0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0]
 [1; 1; 0; 1; 0; 0; 1; 0; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1]
 [0; 0; 0; 0; 1; 1; 0; 1; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0]
 [0; 0; 1; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0]
 [0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0]
 [0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0]
 [1; 1; 1; 1; 1; 1; 1; 1; 0; 0; 0; 0; 0; 1; 1; 0; 1; 0; 1]
 [0; 0; 0; 0; 0; 0; 0; 0; 1; 0; 0; 1; 0; 0; 0; 0; 0; 0; 0]
 [0; 0; 0; 0; 0; 0; 0; 0; 0; 1; 1; 0; 1; 0; 0; 1; 0; 1; 0]]

Here is the test that uses GenMatrixOfIntWithMultipleSolutions to generate random matrices and checks that DlxLib finds the correct number of solutions and that the solutions are correct:

[Test]
public void ExactCoverProblemsWithMultipleSolutionsTest()
{
    var arbNumSolutions = Arb.fromGen(Any.IntBetween(2, 5));

    var property = Prop.forAll(arbNumSolutions, FSharpFunc<int, Property>.FromConverter(numSolutions =>
    {
        Func<int, string> makeLabel = actualNumSolutions => string.Format(
            "Expected exactly {0} solutions but got {1}",
            numSolutions, actualNumSolutions);

        var arbMatrix = Arb.fromGen(GenMatrixOfIntWithMultipleSolutions(numSolutions));

        return Prop.forAll(arbMatrix, FSharpFunc<int[,], Property>.FromConverter(matrix =>
        {
            var solutions = new Dlx().Solve(matrix).ToList();
            var actualNumSolutions = solutions.Count();
            var p1 = PropExtensions.Label(actualNumSolutions == numSolutions, makeLabel(actualNumSolutions));
            var p2 = CheckSolutions(solutions, matrix);
            return PropExtensions.And(p1, p2);
        }));
    }));

    Check.One(Config, property);
}

Checking solutions

TODO