Skip to content

[LLVM] Fix bug in lookupLLVMIntrinsicByName() with StringRefs th… #145629

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

jfurtek
Copy link
Contributor

@jfurtek jfurtek commented Jun 25, 2025

…at are not null-terminated

This PR modifies the implement of lookupLLVMIntrinsicByName(), which may access uninitialized memory when the StringRef Name argument is not null-terminated.

valgrind reports the following error:

==43697== Conditional jump or move depends on uninitialised value(s)
==43697==    at 0x11B06D28: strlen (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==43697==    by 0x5901C6: std::char_traits<char>::length(char const*) (char_traits.h:399)
==43697==    by 0xAD22C3: llvm::StringRef::StringRef(char const*) (StringRef.h:94)
==43697==    by 0x944BD4A: auto lookupLLVMIntrinsicByName(llvm::ArrayRef<unsigned int>, llvm::StringRef, llvm::StringRef)::{lambda(auto:1, auto:2)#1}::operator()<unsigned int, char const*>(unsigned int, char const*) const (Intrinsics.cpp:768)

Prior to this MR, in the binary search loop, std::equal_range() is called with the StringRef data pointer. The lambda comparison function assigns an instance of a StringRef with the const char * value of Name.data() (either LHSStr or RHSStr). This statement creates an instance of StringRef from the pointer before invoking the assigment operator. Constructing a StringRef instance from a pointer calls strlen(), which is incorrect for strings that are not null terminated.

The MR uses the StringRef as the std::equal_range() value instead of using the data pointer.

…at are not null-terminated

This PR modifies the implement of lookupLLVMIntrinsicByName(), which may access
uninitialized memory when the StringRef Name argument is not null-terminated.

valgrind reports the following error:

==43697== Conditional jump or move depends on uninitialised value(s)
==43697==    at 0x11B06D28: strlen (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==43697==    by 0x5901C6: std::char_traits<char>::length(char const*) (char_traits.h:399)
==43697==    by 0xAD22C3: llvm::StringRef::StringRef(char const*) (StringRef.h:94)
==43697==    by 0x944BD4A: auto lookupLLVMIntrinsicByName(llvm::ArrayRef<unsigned int>, llvm::StringRef, llvm::StringRef)::{lambda(auto:1, auto:2)llvm#1}::operator()<unsigned int, char const*>(unsigned int, char const*) const (Intrinsics.cpp:768)

Prior to this MR, in the binary search loop, std::equal_range() is called with
the StringRef data pointer. The lambda comparison function assigns an instance
of a StringRef with the const char * value of Name.data() (either LHSStr or
RHSStr). This statement creates an instance of StringRef from the pointer before
invoking the assigment operator. Constructing a StringRef instance from a
pointer calls strlen(), which is incorrect for strings that are not null
terminated.

The MR uses the StringRef as the std::equal_range() value instead of using the
data pointer.
@llvmbot
Copy link
Member

llvmbot commented Jun 25, 2025

@llvm/pr-subscribers-llvm-ir

Author: Jeremy Furtek (jfurtek)

Changes

…at are not null-terminated

This PR modifies the implement of lookupLLVMIntrinsicByName(), which may access uninitialized memory when the StringRef Name argument is not null-terminated.

valgrind reports the following error:

==43697== Conditional jump or move depends on uninitialised value(s)
==43697==    at 0x11B06D28: strlen (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==43697==    by 0x5901C6: std::char_traits&lt;char&gt;::length(char const*) (char_traits.h:399)
==43697==    by 0xAD22C3: llvm::StringRef::StringRef(char const*) (StringRef.h:94)
==43697==    by 0x944BD4A: auto lookupLLVMIntrinsicByName(llvm::ArrayRef&lt;unsigned int&gt;, llvm::StringRef, llvm::StringRef)::{lambda(auto:1, auto:2)#<!-- -->1}::operator()&lt;unsigned int, char const*&gt;(unsigned int, char const*) const (Intrinsics.cpp:768)

Prior to this MR, in the binary search loop, std::equal_range() is called with the StringRef data pointer. The lambda comparison function assigns an instance of a StringRef with the const char * value of Name.data() (either LHSStr or RHSStr). This statement creates an instance of StringRef from the pointer before invoking the assigment operator. Constructing a StringRef instance from a pointer calls strlen(), which is incorrect for strings that are not null terminated.

The MR uses the StringRef as the std::equal_range() value instead of using the data pointer.


Full diff: https://github.com/llvm/llvm-project/pull/145629.diff

1 Files Affected:

  • (modified) llvm/lib/IR/Intrinsics.cpp (+1-1)
diff --git a/llvm/lib/IR/Intrinsics.cpp b/llvm/lib/IR/Intrinsics.cpp
index e631419d5e1c2..2b3269ecd5b17 100644
--- a/llvm/lib/IR/Intrinsics.cpp
+++ b/llvm/lib/IR/Intrinsics.cpp
@@ -676,7 +676,7 @@ static int lookupLLVMIntrinsicByName(ArrayRef<unsigned> NameOffsetTable,
                      CmpEnd - CmpStart) < 0;
     };
     LastLow = Low;
-    std::tie(Low, High) = std::equal_range(Low, High, Name.data(), Cmp);
+    std::tie(Low, High) = std::equal_range(Low, High, Name, Cmp);
   }
   if (High - Low > 0)
     LastLow = Low;

@jfurtek
Copy link
Contributor Author

jfurtek commented Jun 25, 2025

Tagging @chandlerc, @jurahul for review.

@jurahul
Copy link
Contributor

jurahul commented Jun 25, 2025

Can you add a unit test if possible?

@hstk30-hw hstk30-hw requested review from jurahul and chandlerc June 25, 2025 06:37
@jurahul
Copy link
Contributor

jurahul commented Jun 25, 2025

Thanks for the fix, the change looks good. Can you adjust the PR title to be concise, so it does not wrap around into the description? And add a unit test (in llvm/unittests/IR/IntrinsicsTests.cpp)?

@jurahul
Copy link
Contributor

jurahul commented Jun 25, 2025

I am wondering if a better fix is to change LHSStr and RHSStr in the Cmp lambda to const char *? Because currently we create a StringRef and then just use .data(). That is, we do a bunch of unneeded strlen() computations when converting from const char * to StringRef here. We can avoid those as well if we make them const char * and fix this bug as well.

@jfurtek
Copy link
Contributor Author

jfurtek commented Jul 9, 2025

Closing - superseded by:
#147444

@jfurtek jfurtek closed this Jul 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants