Skip to content

feat(api): improve isValidSpanId, isValidTraceId performance #5714

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jun 13, 2025
2 changes: 2 additions & 0 deletions api/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,8 @@ All notable changes to this project will be documented in this file.

### :rocket: (Enhancement)

* feat(api): improve isValidSpanId, isValidTraceId performance [#5714](https://github.com/open-telemetry/opentelemetry-js/pull/5714) @seemk

### :bug: (Bug Fix)

### :books: (Refine Doc)
Expand Down
32 changes: 28 additions & 4 deletions api/src/trace/spancontext-utils.ts
Original file line number Diff line number Diff line change
Expand Up @@ -18,21 +18,45 @@ import { NonRecordingSpan } from './NonRecordingSpan';
import { Span } from './span';
import { SpanContext } from './span_context';

const VALID_TRACEID_REGEX = /^([0-9a-f]{32})$/i;
const VALID_SPANID_REGEX = /^[0-9a-f]{16}$/i;
// Valid characters (0-9, a-f, A-F) are marked as 1.
const isHex = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, Node core implements this with an Int8Array, which might be more efficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be about the same, with the Int8Array based version being a few percent slower (no idea why, could be noise):

isValidTraceId (LUT, unroll x4) 1474.1 ms
isValidTraceId (Int8Array LUT, unroll x4) 1562.7 ms
isValidSpanId (LUT, unroll x4) 748.7 ms
isValidSpanId (Int8Array LUT, unroll x4) 776.7 ms

isValidTraceId (LUT, unroll x4) x 29,297,057 ops/sec ±0.48% (96 runs sampled)
isValidTraceId (Int8Array LUT, unroll x4) x 31,043,607 ops/sec ±0.61% (95 runs sampled)
isValidSpanId (LUT, unroll x4) x 51,491,271 ops/sec ±0.79% (96 runs sampled)
isValidSpanId (Int8Array LUT, unroll x4) x 54,356,052 ops/sec ±1.02% (94 runs sampled)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I benchmarked this with jsbench.me and changing this to be a Uint8Array sped it up by about 12% for me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be slightly faster on x64, see the results here: #5714 (comment) (there's a link to the benchmark script if you want to run it yourself).

I changed the implementation to use Uint8Array.

0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
] as const;

function isValidHex(id: string, length: number): boolean {
// As of 1.9.0 the id was allowed to be a non-string value,
// even though it was not possible in the types.
if (typeof id !== 'string' || id.length !== length) return false;

let r = 0;
for (let i = 0; i < id.length; i += 4) {
// Mask by 0x7f (127) to avoid LUT overflow.
r +=
isHex[id.charCodeAt(i) & 0x7f] +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe also a comment about masking so that this only lowest 7 bits are retained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an explanation for the bitwise and

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible using charCodeAt to get partial code points from an extended-unicode code point containing string. Should we be using codePointAt? It would then also mean the number might be larger than 128 though (up to 2^16 I think)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment got me thinking that this implementation had an error, where an UTF-16 value like \u4141 (䅁) would be truncated to 0x41, or A, which is a valid character and an invalid ID containing such value would pass.

I changed the implementation to LUT[id.charCodeAt(i)] | 0 which coerces undefined to 0 and fixes the issue (while being faster than the original). See the chart - LUT unroll x 4 vs LUT unroll x 4 old.

As for codePointAt, I can't think of a case where a Unicode character taking up multiple UTF-16 values would have all of the UTF-16 characters in the range in the lookup table ('a-z', 'A-Z, '0-9'). Besides that, it seems String.length should not be used with codePointAt.

I've profiled different implementations now, here's the data:
image
image

The implementation in the PR is U8 LUT, unroll x 4.

You can try it out yourself: https://gist.github.com/seemk/f00c432d8779a03dbe1a28236f9dd198

isHex[id.charCodeAt(i + 1) & 0x7f] +
isHex[id.charCodeAt(i + 2) & 0x7f] +
isHex[id.charCodeAt(i + 3) & 0x7f];
}

return r === length;
}

/**
* @since 1.0.0
*/
export function isValidTraceId(traceId: string): boolean {
return VALID_TRACEID_REGEX.test(traceId) && traceId !== INVALID_TRACEID;
return isValidHex(traceId, 32) && traceId !== INVALID_TRACEID;
}

/**
* @since 1.0.0
*/
export function isValidSpanId(spanId: string): boolean {
return VALID_SPANID_REGEX.test(spanId) && spanId !== INVALID_SPANID;
return isValidHex(spanId, 16) && spanId !== INVALID_SPANID;
}

/**
Expand Down