-
One reason could be that your data is too small. By default, cudaMemcpy loads data into L2, so if your working set fits in L2, it's possible that there is no memory traffic. You can try turning off
Thanks
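As a rough sanity check on the "fits in L2" idea, here is a minimal C++ sketch that estimates the working set of a float vector addition and compares it with an L2 capacity. The function names are hypothetical and the 4 MiB figure is only a placeholder assumption; the real L2 size depends on the GPGPU-Sim configuration you are running with.

```cpp
#include <cstddef>

// Bytes touched by c[i] = a[i] + b[i] over n floats:
// two input vectors plus one output vector.
constexpr std::size_t vec_add_working_set_bytes(std::size_t n) {
    return 3 * n * sizeof(float);
}

// True if the whole working set could stay resident in an L2 of l2_bytes.
// l2_bytes is an assumption supplied by the caller, not read from the simulator.
constexpr bool fits_in_l2(std::size_t n, std::size_t l2_bytes) {
    return vec_add_working_set_bytes(n) <= l2_bytes;
}

// Placeholder L2 capacity (4 MiB); replace with the value from your config.
constexpr std::size_t kAssumedL2Bytes = 4u * 1024u * 1024u;

// A 1024-element vector add is far smaller than the assumed L2,
// while a 16M-element one is far larger.
static_assert(fits_in_l2(1024, kAssumedL2Bytes), "small case fits");
static_assert(!fits_in_l2(std::size_t{1} << 24, kAssumedL2Bytes), "large case does not fit");
```

If `fits_in_l2` is true for your vector length, the missing DRAM activity may simply be the behaviour described above; increasing the vector length until it is false is one way to test that.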
-
I want to use GPGPU-Sim to collect memory statistics and am trying to run a small vector addition kernel using PTX simulation. After the application finishes execution, the output verification passes but there is no DRAM activity. I enabled the flushing of L2 cache after kernel completion, but that didn't help. Running the simulation in a debugger confirms that `dram_t::push` is indeed never called either. Does anyone know what's happening here?