Added windows cross compile builds & fixed build issues #169

ashbob999 · 2024-09-15T16:54:53Z

Added cross compiling builds (x64, x86, ARM64) for windows (MSVC), and fixed build issues.
Added support for using MSVC ARM intrinsics.
Added tests for sz_u32_clz, which help catch cases where the tests are run without BMI support.
Fixed build issues when building MacOS Universal2 binaries.
Disabled AVX for 32-bit binaries.
Fixed lots of issues when building a Windows 32-bit binary.
Updated release.yml to build x86/x64/arm64 windows versions, and included the .lib file in the archive.
Added checks for making sure stringzillite is built without any dependencies.
Fixed "Windows stringzillite might be broken" (#185).
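As a rough illustration of the kind of check the sz_u32_clz tests could perform, here is a minimal sketch that compares a count-leading-zeros routine against a bit-by-bit reference. The names `ref_clz32`, `clz32_under_test`, and `count_clz32_mismatches` are hypothetical and are not taken from the PR. The stand-in under test simply uses the GCC/Clang builtin, not the library's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: count leading zeros one bit at a time; returns 32 for zero. */
static int ref_clz32(uint32_t x) {
    int n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1) ++n;
    return n;
}

/* Stand-in for the function under test (hypothetical; not the library's implementation). */
static int clz32_under_test(uint32_t x) {
    assert(x != 0); /* the compiler builtin is undefined for zero */
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_clz(x);
#else
    return ref_clz32(x);
#endif
}

/* Checks every bit position, once with a single set bit and once with all lower bits filled in. */
static int count_clz32_mismatches(void) {
    int mismatches = 0;
    for (int shift = 0; shift < 32; ++shift) {
        uint32_t single = (uint32_t)1 << shift;
        uint32_t filled = 0xFFFFFFFFu >> (31 - shift);
        mismatches += clz32_under_test(single) != 31 - shift;
        mismatches += clz32_under_test(filled) != ref_clz32(filled);
    }
    return mismatches;
}
```

A test runner would call `count_clz32_mismatches()` and expect zero. A non-zero count would point at a leading-zero-count path that does not behave as expected on the machine running the tests.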

CMakeLists.txt

Also added popcnt32 intrinsic support for win32.

Fixes issue with building MacOS universal2, as both the x86 and arm feature flags can be enabled at the same time.

ashbob999 · 2024-09-21T18:01:24Z

@ashvardanian
For the x86 (32-bit) builds, there are warnings about conversions with possible loss of data, due to conversion between sz_size_t and sz_u64_t.

sz_sorted_idx_t is defined as a sz_u64_t.

Functions where this happens: sz_partition, sz_merge, sz_sort_introsort_recursion, sz_sort_partial.

Plus, there are other places where sz_u64_t values are used directly, only to be converted to sz_size_t.

StringZilla/include/stringzilla/stringzilla.h

Lines 3478 to 3487 in a34d836

    
    SZ_PUBLIC void sz_sort_insertion(sz_sequence_t *sequence, sz_sequence_comparator_t less) {
        sz_u64_t *keys = sequence->order;
        sz_size_t keys_count = sequence->count;
        for (sz_size_t i = 1; i < keys_count; i++) {
            sz_u64_t i_key = keys[i];
            sz_size_t j = i;
            for (; j > 0 && less(sequence, i_key, keys[j - 1]); --j) keys[j] = keys[j - 1];
            keys[j] = i_key;
        }
    }

Other functions it occurs in: _sz_sift_down, sz_sort_introsort_recursion, sz_string_unpack.

So should all of these be changed to use sz_size_t?

Also should we be running the NumPy tests when building the python binaries?

ashvardanian · 2024-09-21T18:03:00Z

Hey @ashbob999! Those places require the unsigned integer to be 64 bit, so it shouldn't be changed.

ashbob999 · 2024-09-21T18:25:29Z

> Hey @ashbob999! Those places require the unsigned integer to be 64 bit, so it shouldn't be changed.

Although, if I have understood the sz_sort_insertion function and the sz_sequence_t struct correctly, the order member is a pointer to an array of indexes that specify the sorted order, and count is the number of indexes in that array. Then don't we already have a mismatch, because we have a uint32 for the count and a uint64 for the index of each string?

The order array is also populated from the data pointer of a std::vector, whose maximum size is limited to 32 bits on these builds.
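To make the mismatch described above concrete, here is a minimal sketch. The struct `sequence_sketch_t` is a hypothetical, simplified stand-in that keeps only the two fields discussed (a 64-bit `order` array and a `size_t` `count`); it is not the real sz_sequence_t layout. The helpers `order_fits_count` and `order_at` are likewise assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for sz_sequence_t: only the two fields discussed above. */
typedef struct {
    uint64_t *order; /* permutation of indexes, always 64 bits wide */
    size_t count;    /* number of entries, 32 bits wide on a 32-bit target */
} sequence_sketch_t;

/* Returns 1 if every 64-bit index addresses a valid entry, so it can be narrowed to size_t. */
static int order_fits_count(sequence_sketch_t const *sequence) {
    for (size_t i = 0; i < sequence->count; ++i)
        if (sequence->order[i] >= (uint64_t)sequence->count) return 0;
    return 1;
}

/* Narrows one index with an explicit cast, which is what silences the conversion warning. */
static size_t order_at(sequence_sketch_t const *sequence, size_t i) {
    assert(i < sequence->count);
    assert(sequence->order[i] < (uint64_t)sequence->count);
    return (size_t)sequence->order[i];
}
```

On a 64-bit target both types have the same width and the cast is a no-op. On a 32-bit target the cast is where the upper half of each index is dropped, which is why the range check comes first in this sketch.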

ashbob999 · 2024-10-11T15:26:13Z

@ashvardanian what should I do about the warnings for the 32-bit builds (for the sorting functions)?

ashvardanian · 2024-10-11T15:29:05Z

@ashbob999, what's the cleanest way to reproduce some of those warnings/errors with GCC/Clang? It would be easier to choose a path forward if I can better understand their nature.

ashbob999 · 2024-10-11T21:42:18Z

@ashvardanian

Just building it as 32-bit (with -m32, plus -Wconversion, although that adds other warnings for Clang) should be enough. However, I cannot seem to reproduce some of the warnings that MSVC emits.

For example, the warning about converting the result of sz_u64_blend doesn't get emitted, even though it truncates from u64 to u32.

StringZilla/include/stringzilla/stringzilla.h

Line 3162 in dd2b949

    
    *space = sz_u64_blend(SZ_STRING_INTERNAL_SPACE, string->external.space, is_big_mask);

Only a few warnings actually get emitted by GCC:

[1/2] Building CXX object CMakeFiles/stringzilla_test_cpp14.dir/scripts/test.cpp.o
In file included from /usr/include/c++/11/cassert:44,
                 from ../include/stringzilla/stringzilla.hpp:57,
                 from ../scripts/test.cpp:21:
../scripts/test.cpp: In function ‘void test_sequence_algorithms()’:
../scripts/test.cpp:1347:90: warning: conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<long long unsigned int>, long long unsigned int>::value_type’ {aka ‘long long unsigned int’} to ‘std::vector<std::__cxx11::basic_string<char> >::size_type’ {aka ‘unsigned int’} may change value [-Wconversion]
 1347 |             for (std::size_t i = 1; i != dataset_size; ++i) { assert(dataset[order[i - 1]] <= dataset[order[i]]); }
      |                                                                                          ^
../scripts/test.cpp:1347:111: warning: conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<long long unsigned int>, long long unsigned int>::value_type’ {aka ‘long long unsigned int’} to ‘std::vector<std::__cxx11::basic_string<char> >::size_type’ {aka ‘unsigned int’} may change value [-Wconversion]
 1347 |             for (std::size_t i = 1; i != dataset_size; ++i) { assert(dataset[order[i - 1]] <= dataset[order[i]]); }
      |                                                                                                               ^
[2/2] Linking CXX executable stringzilla_test_cpp14

For whatever reason (maybe my 32-bit build with GCC/Clang is broken), if I build it as 64-bit but with sz_size_t and sz_ssize_t changed to uint32_t and int32_t respectively, then I get the warnings that MSVC emitted.
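The reproduction trick described above (keep a 64-bit build, but make the size type 32 bits wide) can be sketched in isolation. Everything here is hypothetical: `narrow_size_t` stands in for a 32-bit sz_size_t, and `u64_blend_sketch` is an assumed branchless select with the same call shape as the sz_u64_blend line quoted earlier, not the library's definition.

```c
#include <stdint.h>

typedef uint32_t narrow_size_t; /* hypothetical stand-in for a 32-bit sz_size_t */

/* Assumed branchless select: takes bits of `b` where `mask` is set, bits of `a` elsewhere. */
static uint64_t u64_blend_sketch(uint64_t a, uint64_t b, uint64_t mask) {
    return (a & ~mask) | (b & mask);
}

static void read_space(narrow_size_t *space, uint64_t internal, uint64_t external,
                       uint64_t is_big_mask) {
    /* The implicit form `*space = u64_blend_sketch(...)` narrows u64 to u32 and is the
       kind of assignment the MSVC warnings in this thread point at. The explicit cast
       below states the truncation on purpose. */
    *space = (narrow_size_t)u64_blend_sketch(internal, external, is_big_mask);
}
```

Whether a given compiler reports the implicit form depends on its flags and version, as the GCC output in this thread suggests. The explicit cast only documents the truncation; it does not make a value above 32 bits fit.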

ashvardanian · 2024-10-22T13:54:47Z

Can you think of any better way of forcing the 4B limit on the user?

I would stick to size_t for sizes and focus on designing a better sorting algorithm. Long-term sustainable solutions are better than short-term gains 😉

ashvardanian · 2024-10-30T10:33:02Z

Hey @ashbob999! I'll have more time to look into this next week.

I want to thank you for following the git message style - it's very pleasing to see! Unlike most PRs, where I end up squashing the patches and merging under a new name, I would love to preserve your entire commit history. I'd just ask you to reformat/squash the last three commits that deviate from the style. Thanks again!

ashbob999 · 2024-10-30T10:59:35Z

@ashvardanian no problem. I purposely did not format them that way, just because they were fixing issues that I noticed from previous commits in the merge.

I can squash them into a single one, labelled as minor fixes (or equivalent). Do you also want me to rebase to fix the one in the middle, a35cc50?


recurseml · 2025-05-02T15:29:27Z

⚠️ Only 5 files will be analyzed due to processing limits.


recurseml · 2025-05-02T15:32:39Z

c/64-bit-math-emulation.c

+#ifdef _M_IX86
+
+// 64-bit math operators for 32-bit systems
+void __declspec(naked) _allmul()


Incorrect function declaration. _allmul is a compiler intrinsic that must return ULONGLONG (64-bit unsigned integer), but it's declared as void. This could cause issues with the compiler's handling of the return value and potentially lead to runtime errors. The function should be declared as: 'ULONGLONG __declspec(naked) _allmul()'.


recurseml · 2025-05-02T15:32:41Z

c/64-bit-math-emulation.c

+    /* *INDENT-ON* */
+}
+
+void __declspec(naked) _aullrem()


Incorrect function declaration. _aullrem is a compiler intrinsic that must return ULONGLONG (64-bit unsigned integer), but it's declared as void. This mismatched return type could cause improper handling of the remainder value in arithmetic operations. The function should be declared as: 'ULONGLONG __declspec(naked) _aullrem()'.


recurseml · 2025-05-02T15:32:43Z

include/stringzilla/stringzilla.h

+#else
+    sz_u32_vec_t h_even_vec, h_odd_vec, n_vec, matches_even_vec, matches_odd_vec;
+    n_vec.u32 = 0;
+    n_vec.u8s[0] = n[0], n_vec.u8s[1] = n[1];


Potential integer overflow and incorrect size handling in sz_fill_serial(). The multiplication of value by 0x0101010101010101ull assumes a 64-bit size_t. On 32-bit platforms, this will result in undefined behavior due to integer overflow, as sz_size_t would be 32-bit. This affects memory filling operations.
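For context on the multiplication mentioned in this comment, here is a minimal sketch of the byte-broadcast idiom, written independently of the library. The function names are hypothetical. Two facts about C apply: unsigned arithmetic wraps rather than being undefined, and the product of an 8-bit value with the all-ones-bytes constant never exceeds the width of the type it is computed in, provided that type is chosen explicitly.

```c
#include <stdint.h>

/* Broadcast one byte into all eight lanes of a 64-bit word.
   The largest product is 0xFF * 0x0101010101010101 = 0xFFFFFFFFFFFFFFFF, so nothing is lost. */
static uint64_t broadcast_u8_to_u64(uint8_t value) {
    return (uint64_t)value * 0x0101010101010101ull;
}

/* Same idea for a 32-bit word, with the constant sized to match. */
static uint32_t broadcast_u8_to_u32(uint8_t value) {
    return (uint32_t)value * 0x01010101u;
}
```

The point of the sketch is that the result depends on the type used for the multiplication, not on the width of size_t. Truncation only happens if the 64-bit product is later assigned to a narrower variable.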


recurseml · 2025-05-02T15:32:45Z

include/stringzilla/stringzilla.h

    if (!h_length) return SZ_NULL_CHAR;
    sz_cptr_t const h_end = h + h_length;

 #if !SZ_DETECT_BIG_ENDIAN    // Use SWAR only on little-endian platforms for brevety.
-#if !SZ_USE_MISALIGNED_LOADS // Process the misaligned head, to void UB on unaligned 64-bit loads.
-    for (; ((sz_size_t)h & 7ull) && h < h_end; ++h)
+#if !SZ_USE_MISALIGNED_LOADS // Process the misaligned head, to void UB on unaligned 32/64 bit loads.


Incorrect comment indicates misaligned memory access prevention, but the code actually doesn't handle alignment properly for both 32-bit and 64-bit architectures, which could lead to undefined behavior on certain platforms. The alignment check should use sizeof(sz_size_t) consistently rather than hardcoded values.
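The suggestion above (derive the alignment mask from the word size instead of hardcoding it) could look roughly like the following. This is a sketch under assumptions: `find_byte_sketch` is a hypothetical function, not the library's byte search, and it uses `memcpy` for the word load so the example stays valid C on any target.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical byte search: scalar head until aligned, word-sized body, scalar tail. */
static char const *find_byte_sketch(char const *h, size_t h_length, char needle) {
    char const *const h_end = h + h_length;
    size_t const word_mask = sizeof(size_t) - 1;  /* typically 3 on 32-bit, 7 on 64-bit */
    size_t const ones = (size_t)-1 / 0xFF;        /* 0x0101...01 */
    size_t const highs = ones * 0x80;             /* 0x8080...80 */
    size_t const pattern = ones * (unsigned char)needle;

    /* Misaligned head: one byte at a time until `h` sits on a word boundary. */
    for (; ((uintptr_t)h & word_mask) && h < h_end; ++h)
        if (*h == needle) return h;

    /* Aligned body: XOR turns matching bytes into zero, then a SWAR test detects any zero byte. */
    for (; (size_t)(h_end - h) >= sizeof(size_t); h += sizeof(size_t)) {
        size_t word;
        memcpy(&word, h, sizeof(word));
        size_t const x = word ^ pattern;
        if ((x - ones) & ~x & highs) break; /* a match is somewhere in this word */
    }

    /* Tail, which also pinpoints a match inside the word where the body loop stopped. */
    for (; h < h_end; ++h)
        if (*h == needle) return h;
    return NULL;
}
```

Because the mask and the SWAR constants are all derived from `sizeof(size_t)`, the same source works for 32-bit and 64-bit words without a separate code path.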


recurseml · 2025-05-02T15:32:47Z

😱 Found 4 issues. Time to roll up your sleeves! 😱

ashbob999 force-pushed the msvc-arm branch 10 times, most recently from 8279247 to 713afd3 Compare September 21, 2024 16:18

ashvardanian reviewed Sep 21, 2024


ashbob999 force-pushed the msvc-arm branch from 713afd3 to c42ea5e Compare September 21, 2024 16:48

ashbob999 added 7 commits September 21, 2024 18:08

Make: Fixed issue checking for cross compiling with MSVC

c32bbb4

Improve: Added MSVC ARM intrinsics

b6e6f52

Also added popcnt32 intrinsic support for win32.

Improve: Added sz_u32_clz tests

a8acd2f

Make: Added windows cross compile builds

a3862fd

Fixed MSVC not supporting subscript operator on vector types

a35cc50

Make: undef hardware feature flags if the hardware will never support it

1490504

Fixes issue with building MacOS universal2, as both the x86 and arm feature flags can be enabled at the same time.

Merge branch 'main-dev' into msvc-arm

f363a15

ashbob999 force-pushed the msvc-arm branch from c42ea5e to f363a15 Compare September 21, 2024 17:25

ashbob999 added 3 commits September 21, 2024 18:27

Make: Fixed ARM serial still using NEON

d3e5d32

Fix: Removed debug pragma messages

a03a95b

Make: Made the MacOS universal2 build support AVX+NEON

d08a183

ashvardanian mentioned this pull request Sep 27, 2024

New Sorting Algorithm #173

Open

3 tasks

Make: Removed SVE support for MSVC ARM builds

21dec6e

ashbob999 added 4 commits October 26, 2024 09:09

Make: Updated release to include cross-compiled windows builds

36fcb3a

Make: Added .lib files to uploaded windows archives

85b440b

Make: Updated neon tests to target arm8.2

c40ca24

Make: Add checks to make sure stringzillite is built correctly

c39c590

ashbob999 force-pushed the msvc-arm branch from 15fb92d to c39c590 Compare October 26, 2024 14:33

Make: Fixed the stringzillite still having dependencies dlls on windows

ecebb0d

ashbob999 force-pushed the msvc-arm branch 2 times, most recently from e3e8491 to 4ee87f4 Compare October 26, 2024 21:55

Make: Fixed the stringzillite check for Alpine Linux

16896c3

ashbob999 force-pushed the msvc-arm branch from 4ee87f4 to 16896c3 Compare October 27, 2024 15:12

ashbob999 marked this pull request as ready for review October 27, 2024 16:23

ashbob999 added 3 commits November 2, 2024 08:38

Fix: MacOS python builds not having the correct arch set

985cf0e

Fix: Reverted incorrect sized variables from a35cc50

7741244

Fix: Removed temp _sz_heapsort bench sort

a9b8065

ashbob999 force-pushed the msvc-arm branch from 7b99879 to a9b8065 Compare November 2, 2024 08:41

ashvardanian force-pushed the main-dev branch from e0a9e4e to c8c6c7c Compare December 1, 2024 09:36

ashvardanian mentioned this pull request Dec 8, 2024

StringZilla 4.0! #201

Open

ashvardanian added a commit that referenced this pull request Dec 8, 2024

Make: Detect Apple Universal builds

6d61c21

Imported from #169 Co-authored-by: ashbob999 <32575256+ashbob999@users.noreply.github.com>

ashvardanian added a commit that referenced this pull request Dec 8, 2024

Fix: Overriding LibC in 32-bit Windows

645539b

Imported from #169 Co-authored-by: ashbob999 <32575256+ashbob999@users.noreply.github.com>

recurseml bot reviewed May 2, 2025
