Commit 37bd70a

Merge bitcoin/bitcoin#30126: cluster mempool: cluster linearization algorithm
647fa37 bench: add cluster linearization improvement benchmark (Pieter Wuille)
2854979 clusterlin: permit passing in existing linearization to Linearize (Pieter Wuille)
97d9871 clusterlin: add LinearizationChunking class (Pieter Wuille)
d5918dc clusterlin: randomize the SearchCandidateFinder search order (Pieter Wuille)
991ff9a clusterlin: use bounded BFS exploration (optimization) (Pieter Wuille)
d9b235e bench: Candidate finding and linearization benchmarks (Pieter Wuille)
46aad9b clusterlin: add Linearize function (Pieter Wuille)
ee0ddfe clusterlin: add chunking algorithm (Pieter Wuille)
2a41f15 clusterlin: add SearchCandidateFinder class (Pieter Wuille)
4828079 clusterlin: add AncestorCandidateFinder class (Pieter Wuille)
58f7e01 tests: framework for testing DepGraph class (Pieter Wuille)
a6e07e7 clusterlin: introduce cluster_linearize.h with Cluster and DepGraph types (Pieter Wuille)

Pull request description:

  Part of cluster mempool: #30289

  This introduces low-level cluster linearization code, including tests and some benchmarks. It is currently not hooked up to anything.

  Ultimately, what this PR adds is a function `Linearize` which operates on instances of `DepGraph` (instances of which represent pre-processed transaction clusters) to produce and/or improve linearizations for that cluster.

  To provide assurance, the code heavily relies on fuzz tests. A novel approach is used here, where the fuzz input is parsed using the serialization.h framework rather than `FuzzedDataProvider`, with a custom serializer/deserializer for `DepGraph` objects. By including serialization, it's possible to ascertain that the format can represent every relevant cluster, as well as potentially permitting the construction of ad-hoc fuzz inputs from clusters (not included in this PR, but used during development).

  ---

  The `Linearize(depgraph, iteration_limit, rng_seed, old_linearization)` function is an implementation of the (single) LIMO algorithm, with the $S$ in every iteration found as the best out of (a) the best remaining ancestor set and (b) randomized computationally-bounded search. It incrementally builds up a linearization by finding good topologically-valid subsets to move to the front, in such a way that the resulting linearization has a diagram that is at least as good as the `old_linearization` passed in (if any).
  * Despite using both best ancestor set and search, this is not Double LIMO, as no intersections between these are involved; just the best of the two.
  * The `iteration_limit` and `rng_seed` only control the (b) randomized search. Even with 0 iterations, the result will be as good as the old linearization, and the included sets at every point will have a feerate at least as high as the best remaining ancestor set at that point.

  The search algorithm used in the (b) step is very basic, and largely matches Section 2.1 of [How to Linearize your Cluster.](https://delvingbitcoin.org/t/how-to-linearize-your-cluster/303#h-21-searching-6). See #30286 for optimizations to make it more efficient.

  For background and references, see [Introduction to cluster linearization](https://delvingbitcoin.org/t/introduction-to-cluster-linearization/1032).

ACKs for top commit:
  instagibbs:
    reACK 647fa37
  glozow:
    reACK 647fa37, both code and mermaid diagram look correct to me
  sdaftuar:
    ACK 647fa37

Tree-SHA512: 52c8aa3d1d91190bf1265a947d2712e9d12f745313ffceef6ae7e3ff517d01d8b3b9b4ce6066298d59751c4ba90555a3c0171229868ba50100f588a2aa6a486d
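The "best remaining ancestor set" step, which the description says still applies with 0 search iterations, can be illustrated with a minimal sketch. This is not the code from the PR: the `Tx` struct, the 32-bit ancestor bitmask and the function name `AncestorSetLinearize` are hypothetical simplifications, and the sketch assumes each `ancestors` mask is already the full transitive closure (including the transaction itself). It ignores any old linearization and performs no search.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a cluster entry; not the DepGraph/FeeFrac types of the PR.
struct Tx {
    int64_t fee;
    int32_t size;
    uint32_t ancestors; // bitmask of all ancestors, including the transaction itself
};

// Repeatedly move the remaining ancestor set with the highest feerate to the front.
std::vector<int> AncestorSetLinearize(const std::vector<Tx>& txs)
{
    assert(txs.size() <= 32);
    uint32_t todo = txs.size() == 32 ? ~uint32_t{0} : (uint32_t{1} << txs.size()) - 1;
    std::vector<int> result;
    while (todo != 0) {
        uint32_t best = 0;
        int64_t best_fee = 0, best_size = 0;
        for (size_t i = 0; i < txs.size(); ++i) {
            if (((todo >> i) & 1) == 0) continue;
            // Ancestor set of i, restricted to what has not been included yet.
            const uint32_t anc = txs[i].ancestors & todo;
            int64_t fee = 0, size = 0;
            for (size_t j = 0; j < txs.size(); ++j) {
                if ((anc >> j) & 1) {
                    fee += txs[j].fee;
                    size += txs[j].size;
                }
            }
            // Compare feerates fee/size and best_fee/best_size by cross-multiplication.
            if (best == 0 || fee * best_size > best_fee * size) {
                best = anc;
                best_fee = fee;
                best_size = size;
            }
        }
        // Emit the chosen set parents-first: a transaction is ready once none of
        // its other ancestors remain in todo.
        uint32_t left = best;
        while (left != 0) {
            for (size_t i = 0; i < txs.size(); ++i) {
                const uint32_t bit = uint32_t{1} << i;
                if ((left & bit) == 0) continue;
                if ((txs[i].ancestors & todo & ~bit) == 0) {
                    result.push_back(static_cast<int>(i));
                    left &= ~bit;
                    todo &= ~bit;
                }
            }
        }
    }
    return result;
}
```

For example, with a low-fee parent T0 (fee 1), a high-fee child T1 (fee 10) and an unrelated T2 (fee 6), all of size 1, T2 alone has feerate 6 while the ancestor set {T0, T1} has feerate 5.5, so the sketch places T2 first and then T0, T1.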
2 parents ec700f0 + 647fa37 commit 37bd70a

9 files changed, +2142 -0 lines changed

src/Makefile.am

Lines changed: 1 addition & 0 deletions
@@ -132,6 +132,7 @@ BITCOIN_CORE_H = \
   chainparamsseeds.h \
   checkqueue.h \
   clientversion.h \
+  cluster_linearize.h \
   coins.h \
   common/args.h \
   common/bloom.h \

src/Makefile.bench.include

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ bench_bench_bitcoin_SOURCES = \
   bench/checkblock.cpp \
   bench/checkblockindex.cpp \
   bench/checkqueue.cpp \
+  bench/cluster_linearize.cpp \
   bench/crypto_hash.cpp \
   bench/data.cpp \
   bench/data.h \

src/Makefile.test.include

Lines changed: 2 additions & 0 deletions
@@ -83,6 +83,7 @@ BITCOIN_TESTS =\
   test/bloom_tests.cpp \
   test/bswap_tests.cpp \
   test/checkqueue_tests.cpp \
+  test/cluster_linearize_tests.cpp \
   test/coins_tests.cpp \
   test/coinstatsindex_tests.cpp \
   test/common_url_tests.cpp \
@@ -302,6 +303,7 @@ test_fuzz_fuzz_SOURCES = \
   test/fuzz/buffered_file.cpp \
   test/fuzz/chain.cpp \
   test/fuzz/checkqueue.cpp \
+  test/fuzz/cluster_linearize.cpp \
   test/fuzz/coins_view.cpp \
   test/fuzz/coinscache_sim.cpp \
   test/fuzz/connman.cpp \

src/Makefile.test_util.include

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ EXTRA_LIBRARIES += \
 TEST_UTIL_H = \
   test/util/blockfilter.h \
   test/util/chainstate.h \
+  test/util/cluster_linearize.h \
   test/util/coins.h \
   test/util/index.h \
   test/util/json.h \

src/bench/cluster_linearize.cpp

Lines changed: 214 additions & 0 deletions
@@ -0,0 +1,214 @@
+// Copyright (c) The Bitcoin Core developers
+// Distributed under the MIT software license, see the accompanying
+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
+
+#include <bench/bench.h>
+
+#include <util/bitset.h>
+#include <cluster_linearize.h>
+
+using namespace cluster_linearize;
+
+namespace {
+
+/** Construct a linear graph. These are pessimal for AncestorCandidateFinder, as they maximize
+ * the number of ancestor set feerate updates. The best ancestor set is always the topmost
+ * remaining transaction, whose removal requires updating all remaining transactions' ancestor
+ * set feerates. */
+template<typename SetType>
+DepGraph<SetType> MakeLinearGraph(ClusterIndex ntx)
+{
+    DepGraph<SetType> depgraph;
+    for (ClusterIndex i = 0; i < ntx; ++i) {
+        depgraph.AddTransaction({-int32_t(i), 1});
+        if (i > 0) depgraph.AddDependency(i - 1, i);
+    }
+    return depgraph;
+}
+
+/** Construct a wide graph (one root, with N-1 children that are otherwise unrelated, with
+ * increasing feerates). These graphs are pessimal for the LIMO step in Linearize, because
+ * rechunking is needed after every candidate (the last transaction gets picked every time).
+ */
+template<typename SetType>
+DepGraph<SetType> MakeWideGraph(ClusterIndex ntx)
+{
+    DepGraph<SetType> depgraph;
+    for (ClusterIndex i = 0; i < ntx; ++i) {
+        depgraph.AddTransaction({int32_t(i) + 1, 1});
+        if (i > 0) depgraph.AddDependency(0, i);
+    }
+    return depgraph;
+}
+
+// Construct a difficult graph. These need at least sqrt(2^(n-1)) iterations in the best
+// known algorithms (purely empirically determined).
+template<typename SetType>
+DepGraph<SetType> MakeHardGraph(ClusterIndex ntx)
+{
+    DepGraph<SetType> depgraph;
+    for (ClusterIndex i = 0; i < ntx; ++i) {
+        if (ntx & 1) {
+            // Odd cluster size.
+            //
+            // Mermaid diagram code for the resulting cluster for 11 transactions:
+            // ```mermaid
+            // graph BT
+            // T0["T0: 1/2"];T1["T1: 14/2"];T2["T2: 6/1"];T3["T3: 5/1"];T4["T4: 7/1"];
+            // T5["T5: 5/1"];T6["T6: 7/1"];T7["T7: 5/1"];T8["T8: 7/1"];T9["T9: 5/1"];
+            // T10["T10: 7/1"];
+            // T1-->T0;T1-->T2;T3-->T2;T4-->T3;T4-->T5;T6-->T5;T4-->T7;T8-->T7;T4-->T9;T10-->T9;
+            // ```
+            if (i == 0) {
+                depgraph.AddTransaction({1, 2});
+            } else if (i == 1) {
+                depgraph.AddTransaction({14, 2});
+                depgraph.AddDependency(0, 1);
+            } else if (i == 2) {
+                depgraph.AddTransaction({6, 1});
+                depgraph.AddDependency(2, 1);
+            } else if (i == 3) {
+                depgraph.AddTransaction({5, 1});
+                depgraph.AddDependency(2, 3);
+            } else if ((i & 1) == 0) {
+                depgraph.AddTransaction({7, 1});
+                depgraph.AddDependency(i - 1, i);
+            } else {
+                depgraph.AddTransaction({5, 1});
+                depgraph.AddDependency(i, 4);
+            }
+        } else {
+            // Even cluster size.
+            //
+            // Mermaid diagram code for the resulting cluster for 10 transactions:
+            // ```mermaid
+            // graph BT
+            // T0["T0: 1"];T1["T1: 3"];T2["T2: 1"];T3["T3: 4"];T4["T4: 0"];T5["T5: 4"];T6["T6: 0"];
+            // T7["T7: 4"];T8["T8: 0"];T9["T9: 4"];
+            // T1-->T0;T2-->T0;T3-->T2;T3-->T4;T5-->T4;T3-->T6;T7-->T6;T3-->T8;T9-->T8;
+            // ```
+            if (i == 0) {
+                depgraph.AddTransaction({1, 1});
+            } else if (i == 1) {
+                depgraph.AddTransaction({3, 1});
+                depgraph.AddDependency(0, 1);
+            } else if (i == 2) {
+                depgraph.AddTransaction({1, 1});
+                depgraph.AddDependency(0, 2);
+            } else if (i & 1) {
+                depgraph.AddTransaction({4, 1});
+                depgraph.AddDependency(i - 1, i);
+            } else {
+                depgraph.AddTransaction({0, 1});
+                depgraph.AddDependency(i, 3);
+            }
+        }
+    }
+    return depgraph;
+}
+
+/** Benchmark that does search-based candidate finding with 10000 iterations.
+ *
+ * Its goal is measuring how much time every additional search iteration in linearization costs.
+ */
+template<typename SetType>
+void BenchLinearizePerIterWorstCase(ClusterIndex ntx, benchmark::Bench& bench)
+{
+    const auto depgraph = MakeHardGraph<SetType>(ntx);
+    const auto iter_limit = std::min<uint64_t>(10000, uint64_t{1} << (ntx / 2 - 1));
+    uint64_t rng_seed = 0;
+    bench.batch(iter_limit).unit("iters").run([&] {
+        SearchCandidateFinder finder(depgraph, rng_seed++);
+        auto [candidate, iters_performed] = finder.FindCandidateSet(iter_limit, {});
+        assert(iters_performed == iter_limit);
+    });
+}
+
+/** Benchmark for linearization improvement of a trivial linear graph using just ancestor sort.
+ *
+ * Its goal is measuring how much time linearization may take without any search iterations.
+ *
+ * If P is the resulting time of BenchLinearizePerIterWorstCase, and N is the resulting time of
+ * BenchLinearizeNoItersWorstCase*, then an invocation of Linearize with max_iterations=m should
+ * take no more than roughly N+m*P time. This may however be an overestimate, as the worst cases
+ * do not coincide (the ones that are worst for linearization without any search happen to be ones
+ * that do not need many search iterations).
+ *
+ * This benchmark exercises a worst case for AncestorCandidateFinder, but for which improvement is
+ * cheap.
+ */
+template<typename SetType>
+void BenchLinearizeNoItersWorstCaseAnc(ClusterIndex ntx, benchmark::Bench& bench)
+{
+    const auto depgraph = MakeLinearGraph<SetType>(ntx);
+    uint64_t rng_seed = 0;
+    std::vector<ClusterIndex> old_lin(ntx);
+    for (ClusterIndex i = 0; i < ntx; ++i) old_lin[i] = i;
+    bench.run([&] {
+        Linearize(depgraph, /*max_iterations=*/0, rng_seed++, old_lin);
+    });
+}
+
+/** Benchmark for linearization improvement of a trivial wide graph using just ancestor sort.
+ *
+ * Its goal is measuring how much time improving a linearization may take without any search
+ * iterations, similar to the previous function.
+ *
+ * This benchmark exercises a worst case for improving an existing linearization, but for which
+ * AncestorCandidateFinder is cheap.
+ */
+template<typename SetType>
+void BenchLinearizeNoItersWorstCaseLIMO(ClusterIndex ntx, benchmark::Bench& bench)
+{
+    const auto depgraph = MakeWideGraph<SetType>(ntx);
+    uint64_t rng_seed = 0;
+    std::vector<ClusterIndex> old_lin(ntx);
+    for (ClusterIndex i = 0; i < ntx; ++i) old_lin[i] = i;
+    bench.run([&] {
+        Linearize(depgraph, /*max_iterations=*/0, rng_seed++, old_lin);
+    });
+}
+
+} // namespace
+
+static void LinearizePerIter16TxWorstCase(benchmark::Bench& bench) { BenchLinearizePerIterWorstCase<BitSet<16>>(16, bench); }
+static void LinearizePerIter32TxWorstCase(benchmark::Bench& bench) { BenchLinearizePerIterWorstCase<BitSet<32>>(32, bench); }
+static void LinearizePerIter48TxWorstCase(benchmark::Bench& bench) { BenchLinearizePerIterWorstCase<BitSet<48>>(48, bench); }
+static void LinearizePerIter64TxWorstCase(benchmark::Bench& bench) { BenchLinearizePerIterWorstCase<BitSet<64>>(64, bench); }
+static void LinearizePerIter75TxWorstCase(benchmark::Bench& bench) { BenchLinearizePerIterWorstCase<BitSet<75>>(75, bench); }
+static void LinearizePerIter99TxWorstCase(benchmark::Bench& bench) { BenchLinearizePerIterWorstCase<BitSet<99>>(99, bench); }
+
+static void LinearizeNoIters16TxWorstCaseAnc(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseAnc<BitSet<16>>(16, bench); }
+static void LinearizeNoIters32TxWorstCaseAnc(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseAnc<BitSet<32>>(32, bench); }
+static void LinearizeNoIters48TxWorstCaseAnc(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseAnc<BitSet<48>>(48, bench); }
+static void LinearizeNoIters64TxWorstCaseAnc(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseAnc<BitSet<64>>(64, bench); }
+static void LinearizeNoIters75TxWorstCaseAnc(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseAnc<BitSet<75>>(75, bench); }
+static void LinearizeNoIters99TxWorstCaseAnc(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseAnc<BitSet<99>>(99, bench); }
+
+static void LinearizeNoIters16TxWorstCaseLIMO(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseLIMO<BitSet<16>>(16, bench); }
+static void LinearizeNoIters32TxWorstCaseLIMO(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseLIMO<BitSet<32>>(32, bench); }
+static void LinearizeNoIters48TxWorstCaseLIMO(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseLIMO<BitSet<48>>(48, bench); }
+static void LinearizeNoIters64TxWorstCaseLIMO(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseLIMO<BitSet<64>>(64, bench); }
+static void LinearizeNoIters75TxWorstCaseLIMO(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseLIMO<BitSet<75>>(75, bench); }
+static void LinearizeNoIters99TxWorstCaseLIMO(benchmark::Bench& bench) { BenchLinearizeNoItersWorstCaseLIMO<BitSet<99>>(99, bench); }
+
+BENCHMARK(LinearizePerIter16TxWorstCase, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizePerIter32TxWorstCase, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizePerIter48TxWorstCase, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizePerIter64TxWorstCase, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizePerIter75TxWorstCase, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizePerIter99TxWorstCase, benchmark::PriorityLevel::HIGH);
+
+BENCHMARK(LinearizeNoIters16TxWorstCaseAnc, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizeNoIters32TxWorstCaseAnc, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizeNoIters48TxWorstCaseAnc, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizeNoIters64TxWorstCaseAnc, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizeNoIters75TxWorstCaseAnc, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizeNoIters99TxWorstCaseAnc, benchmark::PriorityLevel::HIGH);
+
+BENCHMARK(LinearizeNoIters16TxWorstCaseLIMO, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizeNoIters32TxWorstCaseLIMO, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizeNoIters48TxWorstCaseLIMO, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizeNoIters64TxWorstCaseLIMO, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizeNoIters75TxWorstCaseLIMO, benchmark::PriorityLevel::HIGH);
+BENCHMARK(LinearizeNoIters99TxWorstCaseLIMO, benchmark::PriorityLevel::HIGH);
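
The benchmark comments give two pieces of arithmetic that are easy to misread: the per-benchmark iteration limit `min(10000, 2^(ntx/2 - 1))` and the rough cost bound `N + m*P`. The following sketch restates both as standalone functions. The function names are hypothetical and not part of the commit, and the first function assumes `ntx >= 2` so that the shift amount is non-negative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Same expression as iter_limit in BenchLinearizePerIterWorstCase (hypothetical helper name).
uint64_t PerIterBenchLimit(uint32_t ntx)
{
    assert(ntx >= 2); // otherwise ntx / 2 - 1 would underflow
    return std::min<uint64_t>(10000, uint64_t{1} << (ntx / 2 - 1));
}

// Rough upper bound N + m*P from the benchmark comment, where N is the measured time
// of a no-iterations benchmark and P the measured time per search iteration.
double EstimateLinearizeUpperBound(double no_iters_time, double per_iter_time, uint64_t max_iterations)
{
    return no_iters_time + static_cast<double>(max_iterations) * per_iter_time;
}
```

Evaluating the first function for the registered sizes shows that only the 16-transaction benchmark runs fewer than 10000 iterations per batch (2^7 = 128); for 32 transactions and up the cap of 10000 applies. As the comment notes, the `N + m*P` figure may overestimate, because the worst cases for the two measurements do not coincide.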
