Improvements to the speed of latin1_to_string. #587

Narfinger · 2025-06-27T09:10:27Z

Improvements to the speed of latin1_to_string.
On the benchmark, the conversion time drops from around 9 microseconds to around 6
microseconds.

This PR also adds the benchmark itself, set up as a criterion benchmark.

Signed-off-by: Narfinger <Narfinger@users.noreply.github.com>
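For context, here is a minimal sketch of what a Latin-1 to String conversion has to do. It is not the code in this PR: the function name `latin1_to_string_sketch` is hypothetical, and the ASCII check is only an assumption about where a cheap shortcut could sit. Each Latin-1 byte is the Unicode code point of the same value, so ASCII bytes can be copied as-is while bytes from 0x80 upward grow to two UTF-8 bytes.

```rust
/// Converts Latin-1 bytes to a `String` (hypothetical helper, not the PR's code).
fn latin1_to_string_sketch(bytes: &[u8]) -> String {
    // Shortcut: pure ASCII is already valid UTF-8, so one bulk copy is enough.
    if bytes.is_ascii() {
        return String::from_utf8(bytes.to_vec()).expect("ASCII is valid UTF-8");
    }
    // General case: each byte is the code point of the same value;
    // bytes >= 0x80 encode to two UTF-8 bytes, hence the doubled capacity.
    let mut out = String::with_capacity(bytes.len() * 2);
    out.extend(bytes.iter().map(|&b| char::from(b)));
    out
}
```

Calling it with `&[0x63, 0x61, 0x66, 0xE9]` (the Latin-1 bytes of "café") yields a five-byte UTF-8 string, because the final byte expands to two bytes.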

mozjs/benches/latin1_string_conversion.rs

mozjs/src/conversions.rs

mrobinson · 2025-06-27T12:49:27Z

Out of curiosity how does this compare to something like https://docs.rs/encoding_rs/latest/encoding_rs/mem/fn.convert_latin1_to_str.html?

Narfinger · 2025-06-27T14:39:16Z

> Out of curiosity how does this compare to something like https://docs.rs/encoding_rs/latest/encoding_rs/mem/fn.convert_latin1_to_str.html?

Good catch! It is actually faster in my tests; the benchmark is now down to 5.4 microseconds.

Narfinger · 2025-06-30T15:13:18Z

I worked a bit more on it and now have SIMD acceleration for SSE2 and AVX. With the current test case we go down from 5.4 microseconds to 406 ns. Quite an improvement.
However, once SIMD is stable in Rust, this should be switched to the encoding_rs path and use their flag.
I also added more test cases whose input lengths should sit on the boundaries of the SIMD register widths.
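The boundary test cases mentioned above could look roughly like the sketch below. It assumes 16-byte (SSE2) and 32-byte (AVX) register widths and picks lengths just below, at, and just above those widths; the helper names are hypothetical and the function under test is passed in by the caller, since the PR's actual tests are not shown here.

```rust
/// Scalar reference: every Latin-1 byte maps to the code point of the same value.
fn latin1_reference(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}

/// Builds inputs whose lengths sit around the assumed SIMD register widths
/// (16 bytes for SSE2, 32 bytes for AVX).
fn boundary_inputs() -> Vec<Vec<u8>> {
    let mut inputs = Vec::new();
    for width in [16usize, 32] {
        for len in [width - 1, width, width + 1, 2 * width - 1, 2 * width + 1] {
            // Mix low and high byte values so non-ASCII bytes show up at
            // varying positions within each chunk.
            inputs.push((0..len).map(|i| (i * 37 % 256) as u8).collect());
        }
    }
    inputs
}

/// Runs `convert` on every boundary input and reports whether it always
/// agrees with the scalar reference.
fn agrees_on_boundaries(convert: impl Fn(&[u8]) -> String) -> bool {
    boundary_inputs()
        .iter()
        .all(|input| convert(input) == latin1_reference(input))
}
```

A vectorized implementation that mishandles the tail after the last full chunk would be expected to fail on the `width + 1` and `2 * width + 1` lengths.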

mrobinson · 2025-07-01T08:31:52Z

Can you please move the SIMD implementation to a separate PR and keep this PR about the non-SIMD optimizations only? That way the two topics are separated, and if a revert is needed, maybe only one of them will be affected.

Narfinger · 2025-07-01T08:36:34Z

OK, probably a wise choice. I hope it is fine to keep the fast copy function; because it is not public, it should be inlined anyway. I would also like to keep the additional test cases.

jschwe · 2025-07-04T03:04:43Z

mozjs/src/conversions.rs

+/// Copies chars to the string
+unsafe fn fast_copy(chars: &[u8]) -> String {
+    let mut v = Vec::with_capacity(chars.len() * 2);
+    v.set_len(chars.len() * 2);


This is undefined behavior. See the safety requirements of Vec::set_len.

There is some existing discussion over at encoding_rs: hsivonen/encoding_rs#79
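To illustrate the concern: `Vec::set_len` requires that all elements up to the new length are already initialized, which freshly reserved capacity is not. One way to sidestep this without any `unsafe`, sketched here under the assumption that a zero-fill of the buffer is affordable, is to allocate an initialized buffer and truncate it afterwards. The name `copy_latin1_zeroed` is hypothetical and this is not the fix that landed in the PR.

```rust
/// Hypothetical UB-free variant of the buffer-based copy: the buffer is
/// zero-initialized instead of having its length set over raw capacity.
fn copy_latin1_zeroed(chars: &[u8]) -> String {
    // Worst case is two UTF-8 bytes per Latin-1 byte.
    let mut buf = vec![0u8; chars.len() * 2];
    let mut written = 0;
    for &b in chars {
        if b < 0x80 {
            buf[written] = b;
            written += 1;
        } else {
            // Code points 0x80..=0xFF take two UTF-8 bytes: 110000xx 10xxxxxx.
            buf[written] = 0xC0 | (b >> 6);
            buf[written + 1] = 0x80 | (b & 0x3F);
            written += 2;
        }
    }
    buf.truncate(written);
    // The bytes are valid UTF-8 by construction; the checked conversion
    // keeps this sketch free of `unsafe` at the cost of a validation pass.
    String::from_utf8(buf).expect("manually encoded UTF-8 is valid")
}
```

The zero-fill and the final validation both cost time, so this trades some of the speed the PR is after for safety.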

Narfinger · 2025-07-04T08:28:45Z

I also tested the external API encoding_rs::UTF_16BE.decode, making sure not to duplicate the work that JS_DeprecatedStringHasLatin1Chars does, and that path still seems to be slower than the current one. I am not sure why.
As for the undefined behavior above, Jonathan wanted to see if he can use MaybeUninit to make the API safer.
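As a rough idea of what a MaybeUninit-based approach might look like (this is a sketch under assumptions, not Jonathan's actual change, and `copy_latin1_maybe_uninit` is a hypothetical name): the unused capacity is written through `Vec::spare_capacity_mut`, and `set_len` is only called once those bytes are initialized.

```rust
use std::mem::MaybeUninit;

/// Hypothetical sketch: fill the spare capacity through `MaybeUninit`
/// slots, then publish only the initialized prefix via `set_len`.
fn copy_latin1_maybe_uninit(chars: &[u8]) -> String {
    let mut buf: Vec<u8> = Vec::with_capacity(chars.len() * 2);
    // The spare capacity is exposed as `MaybeUninit<u8>` slots, so writing
    // to it never creates a reference to uninitialized `u8` values.
    let spare: &mut [MaybeUninit<u8>] = buf.spare_capacity_mut();
    let mut written = 0;
    for &b in chars {
        if b < 0x80 {
            spare[written].write(b);
            written += 1;
        } else {
            // Two-byte UTF-8 encoding for code points 0x80..=0xFF.
            spare[written].write(0xC0 | (b >> 6));
            spare[written + 1].write(0x80 | (b & 0x3F));
            written += 2;
        }
    }
    // SAFETY: the first `written` bytes were initialized in the loop above,
    // and `written <= chars.len() * 2 <= buf.capacity()`.
    unsafe { buf.set_len(written) };
    String::from_utf8(buf).expect("manually encoded UTF-8 is valid")
}
```

Compared with zero-filling the buffer, this avoids touching the memory twice, while the single remaining `unsafe` call has its precondition established right above it.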


jschwe · 2025-07-10T03:48:00Z

@Narfinger I pushed a commit which refactors the benchmark a bit, so that we run with different input sizes and can also test the slow path.

I can confirm that encoding_rs::mem::decode_latin1 is also slower on my MacBook, although it depends on the input size: at 1K on the fast path it seems to be quite a bit faster. On the slow path with high bytes, decode_latin1 is quite a bit slower than the current solution in your PR.
In any case, it doesn't seem like we can easily or quickly change encoding_rs. Did your manual SIMD implementation still use encoding_rs, or would that be UB-free then?
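The kind of measurement described above, several input sizes with ASCII-only and high-byte inputs, can be sketched with the standard library alone. The PR itself uses criterion; this stand-in uses `std::time::Instant` only to stay self-contained, and the sizes, iteration count, and the `convert` placeholder are all assumptions for illustration.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Stand-in for the conversion under test (hypothetical, not the PR's code).
fn convert(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}

/// Input with ASCII only (the "fast path") or with high bytes mixed in
/// (the "slow path", where bytes >= 0x80 need two UTF-8 bytes).
fn make_input(len: usize, high_bytes: bool) -> Vec<u8> {
    (0..len)
        .map(|i| if high_bytes && i % 2 == 0 { 0xE9 } else { b'a' })
        .collect()
}

/// Average wall-clock time per call over `iters` runs.
fn time_per_call(input: &[u8], iters: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        // `black_box` keeps the optimizer from discarding the work.
        black_box(convert(black_box(input)));
    }
    start.elapsed() / iters
}

/// Collects (size, high_bytes, average time) rows for a few input sizes.
fn run_sizes() -> Vec<(usize, bool, Duration)> {
    let mut rows = Vec::new();
    for size in [16usize, 1024, 65536] {
        for high_bytes in [false, true] {
            let input = make_input(size, high_bytes);
            rows.push((size, high_bytes, time_per_call(&input, 100)));
        }
    }
    rows
}
```

Numbers from a loop like this are much noisier than criterion's statistics, so it is only useful for spotting large differences between sizes or paths.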

Narfinger · 2025-07-10T07:40:28Z

The manual SIMD part still falls back to the encoding_rs::mem solution for basically everything that doesn't support SSE2.
I tested it and did not find an improvement for the phone case, but I would be interested to know whether it is faster on Apple chips.

I am not quite sure what you mean by fast and slow path here. The function is only called for code points that fit in the Latin-1 range; anything else is not valid behaviour.

At the moment we have to use encoding_rs to do this encoding, which leads to a bit of cyclical testing. But tests should prepare for future refactoring, so I think this is fine. Signed-off-by: Narfinger <Narfinger@users.noreply.github.com>

Narfinger · 2025-07-10T08:47:46Z

OK, I added more tests that check the complete Latin-1 range. Currently they use encoding_rs::mem, because otherwise we cannot really convert the input. This leads to a bit of cyclical testing, but I think it is fine, so that anybody who tries to improve this function in the future can test against a sane baseline.
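A full-range check of this kind might be sketched as follows. The PR's tests use encoding_rs::mem as the baseline; this sketch instead assumes the standard library's `char::from(u8)` as an independent baseline, which maps each byte to the code point of the same value. The helper names are hypothetical.

```rust
/// Independent baseline: `char::from(u8)` maps a byte to the code point
/// with the same value, which is the Latin-1 mapping.
fn baseline(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}

/// Returns the first byte value for which `convert` disagrees with the
/// baseline, or `None` if all 256 single-byte inputs match.
fn first_mismatch(convert: impl Fn(&[u8]) -> String) -> Option<u8> {
    (0..=u8::MAX).find(|&b| convert(&[b]) != baseline(&[b]))
}
```

Returning the offending byte rather than a plain boolean makes a failing conversion easier to debug, since the first wrong value is reported directly.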

In "completely unrelated" news, my SIMD improvements do not work and produce wrong results.

mrobinson requested changes Jun 27, 2025

mozjs/benches/latin1_string_conversion.rs (outdated)

mozjs/src/conversions.rs (outdated)

mozjs/src/conversions.rs (outdated)

mozjs/src/conversions.rs (outdated)

sagudev reviewed Jun 27, 2025

mozjs/src/conversions.rs

Narfinger force-pushed the latin1-improvements branch from e3d7773 to f9993c2 Compare June 27, 2025 09:42

Narfinger force-pushed the latin1-improvements branch 2 times, most recently from 895dba4 to c774a5c Compare June 30, 2025 15:10

Narfinger force-pushed the latin1-improvements branch 2 times, most recently from c1e2bdb to 661283a Compare July 1, 2025 08:12

Narfinger force-pushed the latin1-improvements branch 2 times, most recently from 3f9d706 to 50541e7 Compare July 1, 2025 08:35

Narfinger force-pushed the latin1-improvements branch from 6298a9e to ff2b4c4 Compare July 1, 2025 08:53

jschwe reviewed Jul 4, 2025

Narfinger force-pushed the latin1-improvements branch from ff2b4c4 to 7cc0787 Compare July 9, 2025 10:15

Narfinger and others added 2 commits July 9, 2025 12:17

Improvements to the speed of latin1_to_string.

7cc0787

On the benchmark it improves the encoding from around 9 microseconds to 6 microseconds. This PR also includes the benchmark and setup as criterion benchmark. Signed-off-by: Narfinger <Narfinger@users.noreply.github.com>

latin1 benches: measure multiple input sizes

f09f8cb

Signed-off-by: Jonathan Schwender <schwenderjonathan@gmail.com>
