[IR2Vec] Add llvm-ir2vec tool for generating triplet embeddings #147842
base: users/svkeerthy/07-09-_nfc_ir2vec_exposing_helpers_in_ir2vec_vocabulary
The CI failures look relevant.
It would also be good to add a documentation file to the command guide. I would be fine with that happening in a separate patch though to keep the review focused.
Also, what's the point of generating these triplets? Nothing immediately springs to mind on how they would be useful.
Thanks! Looking into it.
Yes, will add it in the next patch.
The triplets collected from various .ll files act as the corpus for training the vocabulary. I'll add the helper scripts for training in subsequent patches.
Add a new LLVM tool, `llvm-ir2vec`. This tool is primarily intended to generate triplets for training the vocabulary (#141834) and potentially to generate embeddings in a standalone manner.

This PR introduces the tool with triplet generation functionality. In upcoming PRs I'll add scripts under `utils/mlgo` to complete the vocabulary tooling. #147844 adds embedding generation logic to the tool.

(Tracking issue - #141817)