Commit 991ff9a

clusterlin: use bounded BFS exploration (optimization)
Switch to BFS exploration of the search tree in SearchCandidateFinder instead of DFS exploration. This appears to behave better for real world clusters. As BFS has the downside of needing far larger search queues, switch back to DFS temporarily when the queue grows too large.
1 parent d9b235e commit 991ff9a

1 file changed: +32 -4 lines changed

src/cluster_linearize.h

Lines changed: 32 additions & 4 deletions
@@ -13,6 +13,7 @@
 #include <utility>

 #include <util/feefrac.h>
+#include <util/vecdeque.h>

 namespace cluster_linearize {

@@ -415,7 +416,8 @@ class SearchCandidateFinder
         };

         /** The queue of work items. */
-        std::vector<WorkItem> queue;
+        VecDeque<WorkItem> queue;
+        queue.reserve(std::max<size_t>(256, 2 * m_todo.Count()));

         // Create an initial entry with m_todo as undecided. Also use it as best if not provided,
         // so that during the work processing loop below, and during the add_fn/split_fn calls, we
@@ -445,7 +447,10 @@ class SearchCandidateFinder
             // Make sure there are undecided transactions left to split on.
             if (und.None()) return;

-            // Actually construct a new work item on the queue.
+            // Actually construct a new work item on the queue. Due to the switch to DFS when queue
+            // space runs out (see below), we know that no reallocation of the queue should ever
+            // occur.
+            Assume(queue.size() < queue.capacity());
             queue.emplace_back(std::move(inc), std::move(und));
         };

@@ -479,10 +484,33 @@ class SearchCandidateFinder
         };

         // Work processing loop.
+        //
+        // New work items are always added at the back of the queue, but items to process use a
+        // hybrid approach where they can be taken from the front or the back.
+        //
+        // Depth-first search (DFS) corresponds to always taking from the back of the queue. This
+        // is very memory-efficient (linear in the number of transactions). Breadth-first search
+        // (BFS) corresponds to always taking from the front, which potentially uses more memory
+        // (up to exponential in the transaction count), but seems to work better in practice.
+        //
+        // The approach here combines the two: use BFS until the queue grows too large, at which
+        // point we temporarily switch to DFS until the size shrinks again.
         while (!queue.empty()) {
+            // Processing the first queue item, and then using DFS for everything it gives rise to,
+            // may increase the queue size by the number of undecided elements in there, minus 1
+            // for the first queue item being removed. Thus, only when that pushes the queue over
+            // its capacity can we not process from the front (BFS), and should we use DFS.
+            while (queue.size() - 1 + queue.front().und.Count() > queue.capacity()) {
+                if (!iterations_left) break;
+                auto elem = queue.back();
+                queue.pop_back();
+                split_fn(std::move(elem));
+            }
+
+            // Process one entry from the front of the queue (BFS exploration)
             if (!iterations_left) break;
-            auto elem = queue.back();
-            queue.pop_back();
+            auto elem = queue.front();
+            queue.pop_front();
             split_fn(std::move(elem));
         }
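
The hybrid strategy described in the work processing loop comments can be tried out in isolation. The following is a minimal sketch, not the code from this commit. It assumes a toy search tree in which every item with u undecided choices splits into two children with u - 1 each, and items with nothing left to decide are never queued (mirroring the `und.None()` early return). The names `Item`, `Stats` and `ExploreHybrid` are hypothetical, `std::deque` stands in for `VecDeque`, `capacity` is a plain number rather than reserved storage, and the iteration budget (`iterations_left`) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <stdexcept>

// Hypothetical work item: how many binary choices are still undecided below it.
struct Item {
    std::size_t undecided;
};

// Hypothetical result type for the sketch.
struct Stats {
    std::size_t processed = 0; // number of work items split
    std::size_t max_queue = 0; // largest queue size observed
};

// Explore the toy tree, using BFS while the queue bound allows it and DFS otherwise.
Stats ExploreHybrid(std::size_t depth, std::size_t capacity)
{
    if (depth == 0 || capacity < depth) {
        throw std::invalid_argument("need 1 <= depth <= capacity");
    }
    std::deque<Item> queue;
    Stats stats;

    // New work is always appended at the back, and only if something is left to decide.
    auto add = [&](std::size_t undecided) {
        if (undecided == 0) return;
        assert(queue.size() < capacity); // the DFS fallback keeps this true
        queue.push_back(Item{undecided});
        if (queue.size() > stats.max_queue) stats.max_queue = queue.size();
    };
    // Splitting decides one choice, in both possible ways.
    auto split = [&](Item item) {
        ++stats.processed;
        add(item.undecided - 1);
        add(item.undecided - 1);
    };

    add(depth);
    while (!queue.empty()) {
        // Handling the front item and its whole subtree depth-first can grow the queue
        // by up to undecided - 1. While that would exceed the bound, work from the back.
        while (queue.size() - 1 + queue.front().undecided > capacity) {
            Item item = queue.back();
            queue.pop_back();
            split(item);
        }
        // There is room: take the oldest item (BFS).
        Item item = queue.front();
        queue.pop_front();
        split(item);
    }
    return stats;
}
```

In this toy model, `ExploreHybrid(10, 16)` visits the same items as an effectively unbounded run such as `ExploreHybrid(10, 1 << 20)`, but the first never holds more than 16 queued items, while the second behaves as pure BFS and the queue grows to the width of the widest tree level.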
