Add torchao kernels to llama runner #6195
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6195
Note: links to docs will display an error until the docs builds have been completed.
✅ No failures as of commit 444d44c with merge base ddc8ea6.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
```cmake
endif()

if(EXECUTORCH_BUILD_TORCHAO)
  list(APPEND link_libraries "$<LINK_LIBRARY:WHOLE_ARCHIVE,${CMAKE_CURRENT_BINARY_DIR}/../../../lib/libtorchao_ops_executorch.a>")
```
Exporting targets with a config in torchao makes this bit nicer. It could be:

```cmake
set(torchao_DIR ${CMAKE_CURRENT_BINARY_DIR}/../../../lib/cmake/torchao)
find_package(torchao REQUIRED)
target_link_options_shared_lib(torchao::torchao_ops_executorch)
list(APPEND link_libraries torchao::torchao_ops_executorch)
```
You already called add_subdirectory, so here you can do something like this:

```cmake
target_link_options(torchao_ops_executorch INTERFACE -Wl,--whole-archive -Wl,--no-whole-archive)
```

Or actually, this line should live in torchao/experimental/CMakeLists.txt.
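For reference, a sketch of an alternative on CMake 3.24+ that whole-archive links the target rather than a hardcoded .a path; `llama_runner` is a placeholder consumer target, not a name from this PR:

```cmake
# Sketch (CMake >= 3.24): whole-archive link the torchao ops target via the
# LINK_LIBRARY generator expression instead of spelling out the archive path.
# "llama_runner" is a hypothetical consumer target.
target_link_libraries(llama_runner PRIVATE
  "$<LINK_LIBRARY:WHOLE_ARCHIVE,torchao_ops_executorch>"
)
```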
CMakeLists.txt (outdated):

```cmake
if(EXECUTORCH_BUILD_TORCHAO)
  add_compile_options("-frtti")
  set(EXECUTORCH_INCLUDE_DIRS ${CMAKE_CURRENT_SOURCE_DIR}/..)
```
The find_package(ExecuTorch) in torchao does not appear to define the EXECUTORCH_INCLUDE_DIRS and EXECUTORCH_LIBRARIES variables unless I run the install_requirements script in torchao/experimental. But then I don't know whether the op registration will work in examples/models/llama2.
I don't think these lines should live in the root-level CMakeLists.txt; it looks like you only need them for the runner build or the custom_ops build.
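A minimal sketch of that relocation, assuming the runner's CMakeLists.txt is the right home; these are the same lines from the diff above, just moved:

```cmake
# Sketch: the same flags as in the root-level diff above, scoped to the
# runner's CMakeLists.txt instead of the project root.
if(EXECUTORCH_BUILD_TORCHAO)
  add_compile_options("-frtti")
  set(EXECUTORCH_INCLUDE_DIRS ${CMAKE_CURRENT_SOURCE_DIR}/..)
endif()
```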
Force-pushed from ea358d5 to 3cc1d26.
@metascroy has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
```cmake
if(EXECUTORCH_BUILD_TORCHAO)
  set(TORCHAO_BUILD_EXECUTORCH_OPS ON)
  add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/../../../third-party/ao/torchao/experimental ${CMAKE_CURRENT_BINARY_DIR}/../../../third-party/ao/torchao/experimental)
```
I think you can do `${EXECUTORCH_ROOT}/third-party/ao/torchao/experimental`.
But one is the source directory and the other the binary directory?
Yeah, I meant we can make the first one simpler. But it's up to you.
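For context, a minimal sketch of the two-argument add_subdirectory form being discussed, assuming ${EXECUTORCH_ROOT} points at the repository root:

```cmake
# Sketch: the first argument is the source tree and can be simplified via
# ${EXECUTORCH_ROOT}; the second is where the subtree's build artifacts go.
add_subdirectory(
  ${EXECUTORCH_ROOT}/third-party/ao/torchao/experimental                    # source dir
  ${CMAKE_CURRENT_BINARY_DIR}/../../../third-party/ao/torchao/experimental  # binary dir
)
```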
```python
os.path.abspath(
    os.path.join(
        os.path.dirname(__file__),
        "../../../../cmake-out/third-party/ao/torchao/experimental/libtorchao_ops_aten.*",
```
This hardcoded path is not ideal. Can we add install() in torchao/experimental/CMakeLists.txt, and then find the installed library under CMAKE_INSTALL_PREFIX?
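A minimal sketch of such an install() rule; the target name torchao_ops_aten is inferred from the library filename and may not match the actual target:

```cmake
# Sketch for torchao/experimental/CMakeLists.txt: install the ops library so
# consumers can look under ${CMAKE_INSTALL_PREFIX}/lib instead of the build tree.
install(
  TARGETS torchao_ops_aten
  LIBRARY DESTINATION lib   # shared library (.so / .dylib)
  ARCHIVE DESTINATION lib   # static library (.a)
)
```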
If I understand correctly, you want the glob path to be something like
`{CMAKE_INSTALL_PREFIX}/lib/libtorchao_ops_aten.*`
instead of
`../../../../cmake-out/third-party/ao/torchao/experimental/libtorchao_ops_aten.*`.
How is the Python script going to know what CMAKE_INSTALL_PREFIX is? Do you want the user to define it as an environment variable?
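A minimal sketch of the environment-variable approach, assuming a CMAKE_INSTALL_PREFIX variable that the script would newly read (it does not exist in the script today):

```python
import glob
import os

# Sketch: prefer the installed location when the (hypothetical)
# CMAKE_INSTALL_PREFIX environment variable is set; otherwise fall back to
# the hardcoded build-tree path from the diff above.
prefix = os.environ.get("CMAKE_INSTALL_PREFIX")
if prefix:
    pattern = os.path.join(prefix, "lib", "libtorchao_ops_aten.*")
else:
    pattern = os.path.abspath(
        os.path.join(
            os.path.dirname(__file__),
            "../../../../cmake-out/third-party/ao/torchao/experimental/libtorchao_ops_aten.*",
        )
    )
libs = glob.glob(pattern)  # candidate shared/static libraries to load
```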
Set up ExecuTorch by following the directions here: https://pytorch.org/executorch/stable/getting-started-setup

Once ExecuTorch is set up, we can build the Llama runner with the torchao kernels.
Step 1 (build ExecuTorch):
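The command block here is a sketch of a typical ExecuTorch CMake build; the exact flag set is an assumption, and the getting-started guide above is canonical:

```bash
# Sketch only: configure and build ExecuTorch into cmake-out. The
# -DEXECUTORCH_BUILD_* options your setup needs may differ.
cmake -DCMAKE_BUILD_TYPE=Release -Bcmake-out .
cmake --build cmake-out -j16 --target install
```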
Step 2 (build the runner with torchao):
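A sketch of the runner build, assuming the llama runner lives under examples/models/llama2 as referenced earlier in this thread; EXECUTORCH_BUILD_TORCHAO is the option this PR adds:

```bash
# Sketch: configure and build the llama runner with torchao kernels enabled.
# Paths are assumptions; EXECUTORCH_BUILD_TORCHAO comes from this PR.
cmake -DEXECUTORCH_BUILD_TORCHAO=ON \
      -Bcmake-out/examples/models/llama2 \
      examples/models/llama2
cmake --build cmake-out/examples/models/llama2 -j16
```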
Step 3 (install runner requirements):
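A sketch, assuming the usual ExecuTorch example layout for the requirements script:

```bash
# Sketch: the script path is an assumption based on the example layout.
sh examples/models/llama2/install_requirements.sh
```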
Step 4 (export the model):
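A sketch of the export invocation; the -E and -qmode values are the ones described below, while the module path and the checkpoint/params/output arguments are assumptions:

```bash
# Sketch: only -E and -qmode are taken from this PR's description; the
# module path and file names are assumptions.
python -m examples.models.llama2.export_llama \
    --checkpoint /path/to/consolidated.00.pth \
    --params /path/to/params.json \
    -E "torchao:3,32" \
    -qmode "torchao:8da3w" \
    --output_name llama_torchao.pte
```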
The above quantizes the embeddings to 3 bits with group size 32 (-E torchao:3,32), and quantizes the linear layers with 8-bit dynamically quantized activations and 3-bit weights with group size 128 (-qmode torchao:8da3w). You can experiment with other quantization schemes; starting with 4-bit (instead of 3-bit) weights is a good baseline for model quality. torchao supports 1-7 bit quantization for both linear and embedding layers.
Step 5 (run the model):
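A sketch of running the exported model with the runner binary from Step 2; the binary location and flag names are assumptions based on the usual llama runner example:

```bash
# Sketch: run the exported PTE; paths and flags are assumptions.
cmake-out/examples/models/llama2/llama_main \
    --model_path=llama_torchao.pte \
    --tokenizer_path=tokenizer.model \
    --prompt="Once upon a time"
```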
Note: you can also export quantized model PTE files with torchchat (https://github.com/pytorch/torchchat/blob/main/docs/quantization.md#executorch-1) and then run them using Step 5.