This repository was archived by the owner on Apr 28, 2023. It is now read-only.
tc_pybind_example.py occasionally fails on tuning cache check #523
Comments
nicolasvasilache
added a commit
to nicolasvasilache/TensorComprehensions
that referenced
this issue
Jul 23, 2018
This commit supposedly addresses issue facebookresearch#523 (only supposedly because there is no easy repro). The problem is conjectured to come from the tuner keeping the best time/option in a private field, whereas the functions that interact with the cache files operate on the cache. When multiple entries have the same runtime, it is conjectured (by @ftynse) that the ordering of the cache entries does not match the private field. In hindsight this can easily happen with thread/block sizes: once the number of threads/blocks is one per loop element, one can increase the values passed to mapping options, but the same code will be generated after tightening. It is not too much of a stretch to imagine that the same code will occasionally have the same runtime. This commit drops the private state and ensures we always fetch the required values from the options cache (under its lock).
Artix18
pushed a commit
that referenced
this issue
Jul 24, 2018
This commit supposedly addresses issue #523 (only supposedly because there is no easy repro). The problem is conjectured to come from the tuner keeping the best time/option in a private field, whereas the functions that interact with the cache files operate on the cache. When multiple entries have the same runtime, it is conjectured (by @ftynse) that the ordering of the cache entries does not match the private field. In hindsight this can easily happen with thread/block sizes: once the number of threads/blocks is one per loop element, one can increase the values passed to mapping options, but the same code will be generated after tightening. It is not too much of a stretch to imagine that the same code will occasionally have the same runtime. This commit drops the private state and ensures we always fetch the required values from the options cache (under its lock).
See, e.g., https://ci.pytorch.org/jenkins/job/tensorcomp-builds/job/tc-cuda9.0-cudnn7.1-ubuntu16.04-devel-build-test/259/consoleText
I was able to reproduce it on my system. top10 contains exactly one element, but it is different from top1. The element in top10 corresponds to the one in the protobuf file.
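For illustration only, here is a minimal Python sketch of the mismatch conjectured in the commit message: a tuner that tracks its best result in a private field can disagree with a cache that orders entries on its own when two options have the same runtime. Everything here is hypothetical (the class names `OptionsCache`, `PrivateBestTuner` and `CacheBackedTuner`, the option strings, and in particular the cache's tie-breaking rule); it is not the actual TensorComprehensions code, which is C++.

```python
import threading


class OptionsCache:
    """Hypothetical stand-in for the options cache, sorted by runtime."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = []  # (runtime, options) pairs, fastest first

    def record(self, runtime, options):
        with self._lock:
            # Assumption: a new entry is placed in front of existing
            # entries that have the same runtime.
            idx = 0
            while idx < len(self._entries) and self._entries[idx][0] < runtime:
                idx += 1
            self._entries.insert(idx, (runtime, options))

    def top_k(self, k):
        with self._lock:
            return [options for _, options in self._entries[:k]]


class PrivateBestTuner:
    """Keeps the best time/option in private fields, next to the cache."""

    def __init__(self, cache):
        self.cache = cache
        self._best_time = float("inf")
        self._best_options = None

    def record(self, runtime, options):
        self.cache.record(runtime, options)
        if runtime < self._best_time:  # strict: on a tie the first one stays
            self._best_time = runtime
            self._best_options = options

    def best(self):
        return self._best_options


class CacheBackedTuner:
    """No private state: the best option is always read from the cache."""

    def __init__(self, cache):
        self.cache = cache

    def record(self, runtime, options):
        self.cache.record(runtime, options)

    def best(self):
        top = self.cache.top_k(1)
        return top[0] if top else None


def run(tuner_cls):
    """Record two options with identical runtimes; return (tuner best, cache top)."""
    cache = OptionsCache()
    tuner = tuner_cls(cache)
    tuner.record(1.0, "threads=32")
    tuner.record(1.0, "threads=64")  # same runtime, e.g. same code after tightening
    return tuner.best(), cache.top_k(1)[0]


print(run(PrivateBestTuner))   # the two values differ
print(run(CacheBackedTuner))   # the two values agree
```

With the private field, the tuner's best and the cache's top entry are decided by two independent tie-breaking rules, so they can diverge. Reading the best entry from the cache, under its lock, leaves a single source of truth.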