@paullegranddc paullegranddc commented Apr 14, 2025

What does this PR do?

This refactor splits the logic in collect_trace_chunks between the trace exporter spans (v04 and v05) and the mini agent spans (pb::Spans).
It completely removes usage of the TraceCollection struct from data-pipeline and instead introduces the TraceChunks enum to differentiate between v04 and v05.

Currently, the way the code is structured makes replacing ByteString with a slice harder because of the shared lifetime.
Furthermore, the enum encodes two different code paths, SpanBytes and pb spans, which never interact with each other, so having functions that handle both is pure complexity overhead.
This refactor also removes a bunch of panics and lines of code that were only there to handle the overlap between "fake" pb spans and trace exporter spans, which in practice never happens.
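
As a rough illustration of the split described above, the sketch below models the two trace exporter formats as variants of a single enum, with no variant for pb spans. Only the name TraceChunks comes from this PR; the span structs, their fields, the variant payloads, and the group_v04_spans helper are hypothetical stand-ins and not the actual data-pipeline definitions.

```rust
// Hypothetical stand-ins for the real span types; fields are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct SpanV04 {
    pub name: String,
    pub trace_id: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct SpanV05 {
    // Assumed layout: v05 spans refer to strings by index into a shared dictionary.
    pub name_idx: u32,
    pub trace_id: u64,
}

/// Chunks handled by the trace exporter path: v04 or v05, never pb spans.
#[derive(Debug, PartialEq)]
pub enum TraceChunks {
    V04(Vec<Vec<SpanV04>>),
    /// String dictionary plus chunks of spans indexing into it.
    V05((Vec<String>, Vec<Vec<SpanV05>>)),
}

impl TraceChunks {
    /// Number of trace chunks, regardless of format.
    pub fn size(&self) -> usize {
        match self {
            TraceChunks::V04(chunks) => chunks.len(),
            TraceChunks::V05((_, chunks)) => chunks.len(),
        }
    }
}

/// Hypothetical helper: group a flat list of v04 spans into chunks by trace id,
/// keeping chunks in first-seen order.
pub fn group_v04_spans(spans: Vec<SpanV04>) -> TraceChunks {
    let mut chunks: Vec<Vec<SpanV04>> = Vec::new();
    for span in spans {
        let id = span.trace_id;
        match chunks.iter_mut().find(|c| c[0].trace_id == id) {
            Some(chunk) => chunk.push(span),
            None => chunks.push(vec![span]),
        }
    }
    TraceChunks::V04(chunks)
}
```

Because the enum has no pb variant, code that matches on it never needs a panic or unreachable arm for a combination that cannot occur on the trace exporter path.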

Lastly, this removes the TracerParams struct. Every use of it created the struct and then immediately invoked TryInto<TracerCollection> on it, so replacing it with a simple function is a lot less complex for the same feature set.
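
To make the last point concrete, here is a minimal sketch of the two shapes: a parameter struct that exists only to be converted, next to a plain function that does the same work. Every name here (Format, Chunks, Params, decode) is a hypothetical placeholder chosen for illustration, and the payload handling is deliberately trivial; it is not the code touched by this PR.

```rust
use std::convert::{TryFrom, TryInto};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Format {
    V04,
    V05,
}

#[derive(Debug, PartialEq)]
pub enum Chunks {
    V04(Vec<u8>),
    V05(Vec<u8>),
}

// "Before" shape: a struct that is built only to be converted right away.
pub struct Params<'a> {
    pub data: &'a [u8],
    pub format: Format,
}

impl<'a> TryFrom<Params<'a>> for Chunks {
    type Error = String;

    fn try_from(params: Params<'a>) -> Result<Self, Self::Error> {
        decode(params.data, params.format)
    }
}

// "After" shape: a plain function taking the same inputs as arguments.
pub fn decode(data: &[u8], format: Format) -> Result<Chunks, String> {
    if data.is_empty() {
        return Err("empty payload".to_string());
    }
    Ok(match format {
        Format::V04 => Chunks::V04(data.to_vec()),
        Format::V05 => Chunks::V05(data.to_vec()),
    })
}

/// Both call styles yield the same result; the function needs no extra type.
pub fn both_styles_agree(data: &[u8], format: Format) -> bool {
    let via_struct: Result<Chunks, String> = Params { data, format }.try_into();
    via_struct == decode(data, format)
}
```

When every call site builds the struct and converts it on the next line, the struct and the trait implementation add indirection without adding capability, which is the argument made above for dropping them.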

Motivation

Prepare for using SpanSlice<'a> instead of SpanBytes in the trace exporter.

Additional Notes

Anything else we should know when reviewing?

How to test the change?

Describe here in detail how the change can be validated.

pr-commenter bot commented Apr 14, 2025

Benchmarks

Comparison

Benchmark execution time: 2025-04-16 12:02:48

Comparing candidate commit edfc921 in PR branch paullgdc/data-pipeline/split_trace_collect with baseline commit daf50ad in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 52 metrics, 2 unstable metrics.

Candidate

Candidate benchmark details

Group 1

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time 503.697µs 505.205µs ± 0.781µs 505.200µs ± 0.263µs 505.436µs 505.760µs 506.119µs 514.639µs 1.87% 8.753 104.601 0.15% 0.055µs 1 200
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... throughput 1943108.378op/s 1979398.440op/s ± 3021.024op/s 1979414.746op/s ± 1030.018op/s 1980502.993op/s 1982269.187op/s 1984041.573op/s 1985321.635op/s 0.30% -8.615 102.465 0.15% 213.619op/s 1 200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time 453.838µs 454.903µs ± 0.468µs 454.883µs ± 0.313µs 455.230µs 455.615µs 455.951µs 456.198µs 0.29% 0.099 -0.352 0.10% 0.033µs 1 200
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて throughput 2192030.173op/s 2198272.717op/s ± 2262.255op/s 2198365.559op/s ± 1511.630op/s 2199726.959op/s 2201880.407op/s 2203232.537op/s 2203429.175op/s 0.23% -0.094 -0.354 0.10% 159.966op/s 1 200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time 176.782µs 177.389µs ± 0.254µs 177.418µs ± 0.202µs 177.596µs 177.733µs 177.822µs 177.991µs 0.32% -0.316 -0.669 0.14% 0.018µs 1 200
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters throughput 5618264.604op/s 5637326.719op/s ± 8076.045op/s 5636405.171op/s ± 6420.246op/s 5643434.127op/s 5652315.420op/s 5655443.442op/s 5656692.885op/s 0.36% 0.322 -0.666 0.14% 571.063op/s 1 200
normalization/normalize_service/normalize_service/[empty string] execution_time 37.580µs 37.682µs ± 0.041µs 37.677µs ± 0.027µs 37.711µs 37.748µs 37.799µs 37.840µs 0.43% 0.556 0.655 0.11% 0.003µs 1 200
normalization/normalize_service/normalize_service/[empty string] throughput 26426718.039op/s 26537709.420op/s ± 28663.920op/s 26541456.970op/s ± 19292.746op/s 26557877.406op/s 26580409.385op/s 26588719.840op/s 26609910.884op/s 0.26% -0.548 0.638 0.11% 2026.845op/s 1 200
normalization/normalize_service/normalize_service/test_ASCII execution_time 48.085µs 48.294µs ± 0.221µs 48.118µs ± 0.031µs 48.531µs 48.618µs 48.684µs 48.739µs 1.29% 0.365 -1.665 0.46% 0.016µs 1 200
normalization/normalize_service/normalize_service/test_ASCII throughput 20517524.108op/s 20707027.035op/s ± 94798.346op/s 20782397.927op/s ± 13488.053op/s 20792430.367op/s 20795040.086op/s 20795944.720op/s 20796377.865op/s 0.07% -0.362 -1.669 0.46% 6703.255op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... execution_time [505.097µs; 505.313µs] or [-0.021%; +0.021%] None None None
normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000... throughput [1978979.756op/s; 1979817.125op/s] or [-0.021%; +0.021%] None None None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて execution_time [454.838µs; 454.968µs] or [-0.014%; +0.014%] None None None
normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて throughput [2197959.190op/s; 2198586.244op/s] or [-0.014%; +0.014%] None None None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters execution_time [177.354µs; 177.425µs] or [-0.020%; +0.020%] None None None
normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters throughput [5636207.457op/s; 5638445.981op/s] or [-0.020%; +0.020%] None None None
normalization/normalize_service/normalize_service/[empty string] execution_time [37.677µs; 37.688µs] or [-0.015%; +0.015%] None None None
normalization/normalize_service/normalize_service/[empty string] throughput [26533736.876op/s; 26541681.963op/s] or [-0.015%; +0.015%] None None None
normalization/normalize_service/normalize_service/test_ASCII execution_time [48.263µs; 48.324µs] or [-0.064%; +0.064%] None None None
normalization/normalize_service/normalize_service/test_ASCII throughput [20693888.895op/s; 20720165.174op/s] or [-0.063%; +0.063%] None None None

Group 2

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
two way interface execution_time 17.720µs 26.170µs ± 11.089µs 17.921µs ± 0.096µs 35.158µs 44.296µs 46.728µs 90.155µs 403.06% 1.660 5.320 42.27% 0.784µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
two way interface execution_time [24.633µs; 27.707µs] or [-5.872%; +5.872%] None None None
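
The report does not say how the "95% CI mean" column is derived. The reported values are consistent with a normal approximation, mean ± 1.96 · sem with sem = sd / √sample_size, so the sketch below reproduces the interval for the "two way interface" row under that assumption. The formula is inferred from the numbers, not taken from the benchmarking tool's documentation, and the function and struct names are made up for this example.

```rust
/// Normal-approximation confidence interval around a sample mean.
#[derive(Debug)]
pub struct ConfidenceInterval {
    pub lower: f64,
    pub upper: f64,
    /// Half-width of the interval as a percentage of the mean.
    pub relative_half_width_pct: f64,
}

/// Assumed formula: mean ± 1.96 * sd / sqrt(n).
pub fn ci95(mean: f64, sd: f64, sample_size: usize) -> ConfidenceInterval {
    let sem = sd / (sample_size as f64).sqrt();
    let half_width = 1.96 * sem;
    ConfidenceInterval {
        lower: mean - half_width,
        upper: mean + half_width,
        relative_half_width_pct: 100.0 * half_width / mean,
    }
}

/// Values taken from the "two way interface" row above (microseconds).
pub fn two_way_interface_ci() -> ConfidenceInterval {
    ci95(26.170, 11.089, 200)
}
```

A relative half-width near 6% together with a cv above 40% is what separates this row from the normalization benchmarks, whose intervals sit well under 0.1%.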

Group 3

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
write only interface execution_time 1.170µs 3.204µs ± 1.425µs 3.011µs ± 0.028µs 3.034µs 3.666µs 13.845µs 14.953µs 396.69% 7.386 55.572 44.37% 0.101µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
write only interface execution_time [3.007µs; 3.402µs] or [-6.165%; +6.165%] None None None

Group 4

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
redis/obfuscate_redis_string execution_time 33.042µs 33.926µs ± 1.097µs 33.263µs ± 0.144µs 35.344µs 35.701µs 35.986µs 36.565µs 9.93% 0.918 -1.034 3.23% 0.078µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
redis/obfuscate_redis_string execution_time [33.774µs; 34.079µs] or [-0.448%; +0.448%] None None None

Group 5

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
concentrator/add_spans_to_concentrator execution_time 5.976ms 5.989ms ± 0.011ms 5.987ms ± 0.003ms 5.991ms 5.999ms 6.029ms 6.098ms 1.84% 6.064 48.938 0.19% 0.001ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
concentrator/add_spans_to_concentrator execution_time [5.988ms; 5.991ms] or [-0.026%; +0.026%] None None None

Group 6

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching string interning on wordpress profile execution_time 148.564µs 149.530µs ± 0.274µs 149.525µs ± 0.136µs 149.641µs 149.936µs 150.277µs 150.739µs 0.81% 0.346 2.950 0.18% 0.019µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching string interning on wordpress profile execution_time [149.492µs; 149.568µs] or [-0.025%; +0.025%] None None None

Group 7

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
credit_card/is_card_number/ execution_time 3.894µs 3.913µs ± 0.003µs 3.913µs ± 0.001µs 3.914µs 3.917µs 3.919µs 3.921µs 0.21% -1.482 9.380 0.07% 0.000µs 1 200
credit_card/is_card_number/ throughput 255036013.360op/s 255575810.444op/s ± 184088.588op/s 255569012.258op/s ± 89910.473op/s 255657363.122op/s 255851285.739op/s 256009354.430op/s 256811766.319op/s 0.49% 1.502 9.518 0.07% 13017.029op/s 1 200
credit_card/is_card_number/ 3782-8224-6310-005 execution_time 76.578µs 77.627µs ± 0.468µs 77.613µs ± 0.331µs 77.904µs 78.474µs 78.716µs 78.935µs 1.70% 0.324 -0.285 0.60% 0.033µs 1 200
credit_card/is_card_number/ 3782-8224-6310-005 throughput 12668714.874op/s 12882587.146op/s ± 77507.656op/s 12884368.727op/s ± 55222.645op/s 12941337.556op/s 12998702.070op/s 13024951.649op/s 13058649.533op/s 1.35% -0.295 -0.312 0.60% 5480.619op/s 1 200
credit_card/is_card_number/ 378282246310005 execution_time 70.929µs 71.416µs ± 0.289µs 71.391µs ± 0.181µs 71.569µs 71.973µs 72.322µs 72.374µs 1.38% 0.777 0.650 0.40% 0.020µs 1 200
credit_card/is_card_number/ 378282246310005 throughput 13817154.556op/s 14002739.373op/s ± 56581.608op/s 14007436.061op/s ± 35600.469op/s 14044691.304op/s 14080610.980op/s 14095272.954op/s 14098635.252op/s 0.65% -0.752 0.590 0.40% 4000.924op/s 1 200
credit_card/is_card_number/37828224631 execution_time 3.893µs 3.912µs ± 0.003µs 3.913µs ± 0.002µs 3.914µs 3.918µs 3.919µs 3.919µs 0.15% -1.216 6.721 0.08% 0.000µs 1 200
credit_card/is_card_number/37828224631 throughput 255185393.136op/s 255591667.256op/s ± 204265.381op/s 255572367.169op/s ± 114541.448op/s 255694630.501op/s 255912974.642op/s 255976972.340op/s 256885320.245op/s 0.51% 1.233 6.841 0.08% 14443.744op/s 1 200
credit_card/is_card_number/378282246310005 execution_time 67.079µs 67.661µs ± 0.331µs 67.631µs ± 0.200µs 67.860µs 68.256µs 68.416µs 69.096µs 2.17% 0.616 0.957 0.49% 0.023µs 1 200
credit_card/is_card_number/378282246310005 throughput 14472701.307op/s 14779847.882op/s ± 72047.837op/s 14786036.808op/s ± 43643.418op/s 14825226.613op/s 14893568.547op/s 14907154.835op/s 14907823.817op/s 0.82% -0.579 0.841 0.49% 5094.551op/s 1 200
credit_card/is_card_number/37828224631000521389798 execution_time 51.745µs 51.835µs ± 0.040µs 51.838µs ± 0.025µs 51.860µs 51.903µs 51.927µs 51.942µs 0.20% 0.024 -0.165 0.08% 0.003µs 1 200
credit_card/is_card_number/37828224631000521389798 throughput 19252118.057op/s 19291866.555op/s ± 14818.334op/s 19290852.282op/s ± 9387.288op/s 19300650.433op/s 19318856.943op/s 19323975.257op/s 19325529.024op/s 0.18% -0.020 -0.167 0.08% 1047.814op/s 1 200
credit_card/is_card_number/x371413321323331 execution_time 6.026µs 6.057µs ± 0.030µs 6.041µs ± 0.008µs 6.075µs 6.116µs 6.153µs 6.191µs 2.48% 1.453 2.139 0.49% 0.002µs 1 200
credit_card/is_card_number/x371413321323331 throughput 161530562.248op/s 165091701.019op/s ± 812731.235op/s 165541997.133op/s ± 231590.836op/s 165670972.590op/s 165818165.139op/s 165926772.292op/s 165948894.977op/s 0.25% -1.424 1.993 0.49% 57468.777op/s 1 200
credit_card/is_card_number_no_luhn/ execution_time 3.896µs 3.913µs ± 0.002µs 3.913µs ± 0.001µs 3.914µs 3.916µs 3.917µs 3.919µs 0.16% -2.579 19.336 0.05% 0.000µs 1 200
credit_card/is_card_number_no_luhn/ throughput 255143726.474op/s 255560423.569op/s ± 140920.351op/s 255545083.376op/s ± 67066.592op/s 255623709.099op/s 255772148.804op/s 255876530.416op/s 256686528.406op/s 0.45% 2.603 19.562 0.06% 9964.574op/s 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time 65.880µs 66.330µs ± 0.212µs 66.312µs ± 0.123µs 66.437µs 66.719µs 66.922µs 67.038µs 1.09% 0.685 0.675 0.32% 0.015µs 1 200
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput 14916867.114op/s 15076199.137op/s ± 48142.835op/s 15080134.938op/s ± 28006.033op/s 15106720.225op/s 15149219.080op/s 15168664.652op/s 15179076.919op/s 0.66% -0.664 0.635 0.32% 3404.213op/s 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time 59.484µs 59.667µs ± 0.070µs 59.661µs ± 0.040µs 59.710µs 59.776µs 59.796µs 60.177µs 0.86% 1.860 12.701 0.12% 0.005µs 1 200
credit_card/is_card_number_no_luhn/ 378282246310005 throughput 16617674.202op/s 16759653.418op/s ± 19630.006op/s 16761365.890op/s ± 11118.305op/s 16771319.368op/s 16787281.539op/s 16797783.512op/s 16811327.472op/s 0.30% -1.821 12.371 0.12% 1388.051op/s 1 200
credit_card/is_card_number_no_luhn/37828224631 execution_time 3.896µs 3.913µs ± 0.003µs 3.914µs ± 0.002µs 3.915µs 3.918µs 3.920µs 3.921µs 0.19% -1.100 6.372 0.07% 0.000µs 1 200
credit_card/is_card_number_no_luhn/37828224631 throughput 255032654.357op/s 255526716.100op/s ± 190642.518op/s 255520022.293op/s ± 112408.078op/s 255634612.960op/s 255815003.567op/s 255959196.486op/s 256702047.560op/s 0.46% 1.116 6.471 0.07% 13480.462op/s 1 200
credit_card/is_card_number_no_luhn/378282246310005 execution_time 56.179µs 56.410µs ± 0.118µs 56.395µs ± 0.074µs 56.473µs 56.615µs 56.702µs 56.837µs 0.78% 0.606 0.141 0.21% 0.008µs 1 200
credit_card/is_card_number_no_luhn/378282246310005 throughput 17594303.241op/s 17727513.397op/s ± 37082.880op/s 17732136.454op/s ± 23343.877op/s 17752608.967op/s 17779502.900op/s 17792568.669op/s 17800119.261op/s 0.38% -0.594 0.118 0.21% 2622.156op/s 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time 51.718µs 51.829µs ± 0.035µs 51.830µs ± 0.019µs 51.849µs 51.878µs 51.911µs 51.972µs 0.27% 0.042 1.526 0.07% 0.002µs 1 200
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput 19241062.701op/s 19294403.068op/s ± 13101.078op/s 19293942.667op/s ± 6944.031op/s 19300387.938op/s 19318585.252op/s 19325093.982op/s 19335602.645op/s 0.22% -0.035 1.516 0.07% 926.386op/s 1 200
credit_card/is_card_number_no_luhn/x371413321323331 execution_time 6.026µs 6.058µs ± 0.025µs 6.043µs ± 0.011µs 6.076µs 6.113µs 6.120µs 6.134µs 1.50% 0.875 -0.193 0.42% 0.002µs 1 200
credit_card/is_card_number_no_luhn/x371413321323331 throughput 163035025.292op/s 165082001.677op/s ± 690477.846op/s 165475558.185op/s ± 305163.394op/s 165641035.822op/s 165780902.083op/s 165881350.383op/s 165958013.225op/s 0.29% -0.862 -0.230 0.42% 48824.157op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
credit_card/is_card_number/ execution_time [3.912µs; 3.913µs] or [-0.010%; +0.010%] None None None
credit_card/is_card_number/ throughput [255550297.536op/s; 255601323.352op/s] or [-0.010%; +0.010%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 execution_time [77.562µs; 77.692µs] or [-0.084%; +0.084%] None None None
credit_card/is_card_number/ 3782-8224-6310-005 throughput [12871845.331op/s; 12893328.962op/s] or [-0.083%; +0.083%] None None None
credit_card/is_card_number/ 378282246310005 execution_time [71.376µs; 71.456µs] or [-0.056%; +0.056%] None None None
credit_card/is_card_number/ 378282246310005 throughput [13994897.706op/s; 14010581.039op/s] or [-0.056%; +0.056%] None None None
credit_card/is_card_number/37828224631 execution_time [3.912µs; 3.913µs] or [-0.011%; +0.011%] None None None
credit_card/is_card_number/37828224631 throughput [255563358.038op/s; 255619976.473op/s] or [-0.011%; +0.011%] None None None
credit_card/is_card_number/378282246310005 execution_time [67.615µs; 67.707µs] or [-0.068%; +0.068%] None None None
credit_card/is_card_number/378282246310005 throughput [14769862.744op/s; 14789833.019op/s] or [-0.068%; +0.068%] None None None
credit_card/is_card_number/37828224631000521389798 execution_time [51.830µs; 51.841µs] or [-0.011%; +0.011%] None None None
credit_card/is_card_number/37828224631000521389798 throughput [19289812.876op/s; 19293920.233op/s] or [-0.011%; +0.011%] None None None
credit_card/is_card_number/x371413321323331 execution_time [6.053µs; 6.062µs] or [-0.069%; +0.069%] None None None
credit_card/is_card_number/x371413321323331 throughput [164979064.286op/s; 165204337.752op/s] or [-0.068%; +0.068%] None None None
credit_card/is_card_number_no_luhn/ execution_time [3.913µs; 3.913µs] or [-0.008%; +0.008%] None None None
credit_card/is_card_number_no_luhn/ throughput [255540893.364op/s; 255579953.775op/s] or [-0.008%; +0.008%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 execution_time [66.301µs; 66.360µs] or [-0.044%; +0.044%] None None None
credit_card/is_card_number_no_luhn/ 3782-8224-6310-005 throughput [15069527.004op/s; 15082871.271op/s] or [-0.044%; +0.044%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 execution_time [59.657µs; 59.677µs] or [-0.016%; +0.016%] None None None
credit_card/is_card_number_no_luhn/ 378282246310005 throughput [16756932.888op/s; 16762373.948op/s] or [-0.016%; +0.016%] None None None
credit_card/is_card_number_no_luhn/37828224631 execution_time [3.913µs; 3.914µs] or [-0.010%; +0.010%] None None None
credit_card/is_card_number_no_luhn/37828224631 throughput [255500294.881op/s; 255553137.320op/s] or [-0.010%; +0.010%] None None None
credit_card/is_card_number_no_luhn/378282246310005 execution_time [56.393µs; 56.426µs] or [-0.029%; +0.029%] None None None
credit_card/is_card_number_no_luhn/378282246310005 throughput [17722374.066op/s; 17732652.727op/s] or [-0.029%; +0.029%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 execution_time [51.824µs; 51.833µs] or [-0.009%; +0.009%] None None None
credit_card/is_card_number_no_luhn/37828224631000521389798 throughput [19292587.384op/s; 19296218.751op/s] or [-0.009%; +0.009%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 execution_time [6.054µs; 6.061µs] or [-0.058%; +0.058%] None None None
credit_card/is_card_number_no_luhn/x371413321323331 throughput [164986308.088op/s; 165177695.266op/s] or [-0.058%; +0.058%] None None None

Group 8

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_trace/test_trace execution_time 244.468ns 255.016ns ± 12.963ns 249.003ns ± 2.299ns 256.046ns 282.057ns 300.107ns 301.790ns 21.20% 2.073 3.759 5.07% 0.917ns 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_trace/test_trace execution_time [253.220ns; 256.813ns] or [-0.704%; +0.704%] None None None

Group 9

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
ip_address/quantize_peer_ip_address_benchmark execution_time 4.941µs 5.026µs ± 0.044µs 5.015µs ± 0.018µs 5.035µs 5.105µs 5.108µs 5.113µs 1.95% 0.734 -0.636 0.88% 0.003µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
ip_address/quantize_peer_ip_address_benchmark execution_time [5.020µs; 5.033µs] or [-0.123%; +0.123%] None None None

Group 10

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
benching deserializing traces from msgpack to their internal representation execution_time 73.390ms 73.894ms ± 0.281ms 73.950ms ± 0.164ms 74.051ms 74.289ms 74.838ms 75.023ms 1.45% 0.703 1.582 0.38% 0.020ms 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
benching deserializing traces from msgpack to their internal representation execution_time [73.855ms; 73.933ms] or [-0.053%; +0.053%] None None None

Group 11

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
tags/replace_trace_tags execution_time 2.357µs 2.409µs ± 0.027µs 2.404µs ± 0.005µs 2.408µs 2.490µs 2.497µs 2.502µs 4.11% 1.894 4.130 1.12% 0.002µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
tags/replace_trace_tags execution_time [2.406µs; 2.413µs] or [-0.156%; +0.156%] None None None

Group 12

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time 208.626µs 209.037µs ± 0.150µs 209.023µs ± 0.087µs 209.140µs 209.283µs 209.432µs 209.444µs 0.20% 0.035 0.529 0.07% 0.011µs 1 200
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... throughput 4774549.107op/s 4783834.252op/s ± 3442.295op/s 4784164.416op/s ± 1994.015op/s 4785758.540op/s 4789275.253op/s 4792709.905op/s 4793270.887op/s 0.19% -0.030 0.530 0.07% 243.407op/s 1 200
normalization/normalize_name/normalize_name/bad-name execution_time 18.619µs 18.681µs ± 0.091µs 18.667µs ± 0.017µs 18.694µs 18.724µs 18.776µs 19.846µs 6.31% 10.731 131.253 0.49% 0.006µs 1 200
normalization/normalize_name/normalize_name/bad-name throughput 50389149.423op/s 53532695.934op/s ± 249198.845op/s 53569547.815op/s ± 50148.394op/s 53609638.013op/s 53666724.515op/s 53699963.928op/s 53708161.683op/s 0.26% -10.467 126.398 0.46% 17621.019op/s 1 200
normalization/normalize_name/normalize_name/good execution_time 10.926µs 10.998µs ± 0.048µs 10.993µs ± 0.041µs 11.034µs 11.081µs 11.101µs 11.138µs 1.32% 0.397 -0.728 0.43% 0.003µs 1 200
normalization/normalize_name/normalize_name/good throughput 89780803.621op/s 90931069.865op/s ± 392402.529op/s 90967520.533op/s ± 334511.713op/s 91260779.961op/s 91443736.223op/s 91470476.576op/s 91524738.785op/s 0.61% -0.383 -0.753 0.43% 27747.049op/s 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... execution_time [209.017µs; 209.058µs] or [-0.010%; +0.010%] None None None
normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo... throughput [4783357.183op/s; 4784311.321op/s] or [-0.010%; +0.010%] None None None
normalization/normalize_name/normalize_name/bad-name execution_time [18.668µs; 18.693µs] or [-0.068%; +0.068%] None None None
normalization/normalize_name/normalize_name/bad-name throughput [53498159.371op/s; 53567232.497op/s] or [-0.065%; +0.065%] None None None
normalization/normalize_name/normalize_name/good execution_time [10.991µs; 11.004µs] or [-0.060%; +0.060%] None None None
normalization/normalize_name/normalize_name/good throughput [90876686.649op/s; 90985453.082op/s] or [-0.060%; +0.060%] None None None

Group 13

cpu_model git_commit_sha git_commit_date git_branch
Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz edfc921 1744804265 paullgdc/data-pipeline/split_trace_collect
scenario metric min mean ± sd median ± mad p75 p95 p99 max peak_to_median_ratio skewness kurtosis cv sem runs sample_size
sql/obfuscate_sql_string execution_time 67.081µs 67.277µs ± 0.226µs 67.244µs ± 0.049µs 67.301µs 67.447µs 67.702µs 69.900µs 3.95% 8.653 92.283 0.34% 0.016µs 1 200
scenario metric 95% CI mean Shapiro-Wilk pvalue Ljung-Box pvalue (lag=1) Dip test pvalue
sql/obfuscate_sql_string execution_time [67.245µs; 67.308µs] or [-0.047%; +0.047%] None None None

Baseline

Omitted due to size.

This refactor is needed because of the shared logic in collect_trace_chunks and TracerPayloadParams. The way these structs are created makes replacing ByteString with a slice harder because of the shared lifetime.

Furthermore, the enum encodes two different code paths, SpanBytes and pb spans, which never interact with each other, so having functions that handle both is pure complexity overhead.

This refactor also has the advantage of removing a bunch of panics and lines of code that were only there because of the overlap between "fake" pb spans and trace exporter spans.
@paullegranddc paullegranddc force-pushed the paullgdc/data-pipeline/split_trace_collect branch from 0c8f800 to f6aae6b Compare April 15, 2025 09:34
codecov-commenter commented Apr 15, 2025

Codecov Report

Attention: Patch coverage is 92.95302% with 21 lines in your changes missing coverage. Please review.

Project coverage is 71.52%. Comparing base (daf50ad) to head (edfc921).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1021      +/-   ##
==========================================
- Coverage   71.52%   71.52%   -0.01%     
==========================================
  Files         339      339              
  Lines       50751    50628     -123     
==========================================
- Hits        36302    36214      -88     
+ Misses      14449    14414      -35     
Components Coverage Δ
crashtracker 42.82% <ø> (-0.03%) ⬇️
crashtracker-ffi 6.30% <ø> (ø)
datadog-alloc 98.73% <ø> (ø)
data-pipeline 91.00% <88.46%> (+0.06%) ⬆️
data-pipeline-ffi 90.35% <ø> (ø)
ddcommon 78.57% <ø> (ø)
ddcommon-ffi 66.37% <ø> (ø)
ddtelemetry 60.29% <ø> (ø)
ddtelemetry-ffi 21.43% <ø> (ø)
dogstatsd-client 82.57% <ø> (ø)
ipc 82.41% <ø> (ø)
profiling 77.49% <ø> (ø)
profiling-ffi 62.12% <ø> (ø)
serverless 0.00% <ø> (ø)
sidecar 41.18% <0.00%> (+0.03%) ⬆️
sidecar-ffi 2.05% <ø> (ø)
spawn-worker 54.37% <ø> (ø)
tinybytes 89.86% <ø> (ø)
trace-mini-agent 73.80% <100.00%> (-0.03%) ⬇️
trace-normalization 98.24% <ø> (ø)
trace-obfuscation 96.00% <ø> (ø)
trace-protobuf 78.50% <ø> (ø)
trace-utils 93.13% <96.18%> (+0.30%) ⬆️

r1viollet commented Apr 15, 2025

Artifact Size Benchmark Report

aarch64-alpine-linux-musl
Artifact Baseline Commit Change
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.so 7.78 MB 7.77 MB -0.14% (-11.57 KB) 💪
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.so.debug 24.21 MB 24.06 MB -0.61% (-151.58 KB) 💪
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.a 78.03 MB 77.61 MB -0.53% (-429.60 KB) 💪
aarch64-unknown-linux-gnu
Artifact Baseline Commit Change
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so.debug 22.82 MB 22.67 MB -0.64% (-150.05 KB) 💪
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.a 72.41 MB 71.99 MB -0.57% (-429.12 KB) 💪
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so 7.72 MB 7.70 MB -0.15% (-12.14 KB) 💪
libdatadog-x64-windows
Artifact Baseline Commit Change
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.dll 17.01 MB 16.89 MB -0.71% (-124.00 KB) 💪
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.lib 61.83 KB 61.83 KB 0% (0 B) 👌
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.pdb 114.25 MB 113.76 MB -0.42% (-496.00 KB) 💪
/libdatadog-x64-windows/debug/static/datadog_profiling_ffi.lib 633.71 MB 633.28 MB -0.06% (-437.70 KB) 💪
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.dll 5.06 MB 5.02 MB -0.74% (-38.50 KB) 💪
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.lib 61.83 KB 61.83 KB 0% (0 B) 👌
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.pdb 16.15 MB 16.03 MB -0.72% (-120.00 KB) 💪
/libdatadog-x64-windows/release/static/datadog_profiling_ffi.lib 26.84 MB 26.64 MB -0.73% (-202.58 KB) 💪
libdatadog-x86-windows
Artifact Baseline Commit Change
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.dll 14.40 MB 14.29 MB -0.73% (-109.00 KB) 💪
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.lib 62.78 KB 62.78 KB 0% (0 B) 👌
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.pdb 116.21 MB 115.72 MB -0.42% (-504.00 KB) 💪
/libdatadog-x86-windows/debug/static/datadog_profiling_ffi.lib 625.80 MB 625.38 MB -0.06% (-429.45 KB) 💪
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.dll 3.83 MB 3.80 MB -0.70% (-27.50 KB) 💪
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.lib 62.78 KB 62.78 KB 0% (0 B) 👌
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.pdb 16.78 MB 16.66 MB -0.74% (-128.00 KB) 💪
/libdatadog-x86-windows/release/static/datadog_profiling_ffi.lib 24.74 MB 24.56 MB -0.74% (-188.35 KB) 💪
x86_64-alpine-linux-musl
Artifact Baseline Commit Change
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.a 67.28 MB 66.92 MB -0.53% (-371.86 KB) 💪
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.so 8.27 MB 8.22 MB -0.50% (-43.17 KB) 💪
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.so.debug 23.35 MB 23.22 MB -0.54% (-130.46 KB) 💪
x86_64-unknown-linux-gnu
Artifact Baseline Commit Change
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.a 68.23 MB 67.86 MB -0.54% (-380.20 KB) 💪
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so 8.14 MB 8.11 MB -0.37% (-31.49 KB) 💪
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so.debug 20.98 MB 20.85 MB -0.61% (-132.32 KB) 💪

@paullegranddc paullegranddc marked this pull request as ready for review April 15, 2025 13:33
@paullegranddc paullegranddc requested review from a team as code owners April 15, 2025 13:33
@VianneyRuhlmann VianneyRuhlmann left a comment

LGTM

@paullegranddc paullegranddc force-pushed the paullgdc/data-pipeline/split_trace_collect branch from fe35ab6 to 5db3bf2 Compare April 15, 2025 14:50
@paullegranddc paullegranddc enabled auto-merge (squash) April 16, 2025 11:58
@paullegranddc paullegranddc disabled auto-merge April 16, 2025 12:33
@paullegranddc paullegranddc merged commit 3dab0be into main Apr 16, 2025
35 checks passed
@paullegranddc paullegranddc deleted the paullgdc/data-pipeline/split_trace_collect branch April 16, 2025 12:33
duncanpharvey pushed a commit that referenced this pull request Apr 16, 2025