
What makes STQ faster versus SAP on the GPU? #3

Answered by zfergus
Q-Minh asked this question in Q&A

Hello,

Thank you for your interest in our work.

The different CUDA implementations can be found here. The difference between SAP and STQ on the GPU is how the work is divided between the threads.

In SAP, much like on the CPU, each thread is assigned a box and iterates through the sorted list until it reaches the first box that does not intersect its own. The limitation is that the workload can be unbalanced between threads. For example, one thread may find only 1 intersection while another finds 100. On the CPU this works well, as a thread with little work can be freed and assigned a new box. However, on the GPU the thread with little work will have to wait for all threads in its block to…
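To make the imbalance concrete, here is a minimal CPU-side sketch in Python (not the project's CUDA code). It simulates the one-box-per-thread sweep along a single axis and counts how many steps each simulated thread takes. The names `sap_work_per_thread` and `block_utilization` are hypothetical, and the cost model is an assumption made for illustration only: a block is treated as occupied for as long as its busiest thread.

```python
def sap_work_per_thread(intervals):
    """Simulate a one-box-per-thread sweep along a single axis.

    intervals: list of (lo, hi) box extents on the sweep axis.
    Returns (work, pairs): work[i] is the number of sweep steps taken by the
    simulated thread that owns the i-th box in sorted order, and pairs lists
    the overlapping (i, j) index pairs it found.
    """
    boxes = sorted(intervals)  # sort by lower bound
    work, pairs = [], []
    for i, (_, hi_i) in enumerate(boxes):
        steps = 0
        j = i + 1
        # Walk forward until the next box starts past the end of box i.
        while j < len(boxes) and boxes[j][0] <= hi_i:
            pairs.append((i, j))
            steps += 1
            j += 1
        work.append(steps)
    return work, pairs


def block_utilization(work, block_size):
    """Fraction of thread-steps doing useful work.

    Toy model (an assumption, not a measurement): every thread in a block
    stays occupied for as many steps as the busiest thread in that block.
    """
    useful = sum(work)
    occupied = 0
    for start in range(0, len(work), block_size):
        block = work[start:start + block_size]
        occupied += len(block) * max(block)
    return useful / occupied if occupied else 1.0


if __name__ == "__main__":
    # One long box overlapping eight small, mutually disjoint boxes.
    boxes = [(0.0, 10.0)] + [(k + 0.25, k + 0.75) for k in range(8)]
    work, pairs = sap_work_per_thread(boxes)
    print("steps per thread:", work)
    print("utilization with blocks of 3:", block_utilization(work, 3))
```

In this example a single thread does all the sweeping while the others in its block sit idle, which is the unbalanced case described above. A CPU task scheduler could hand those idle threads new boxes; this toy model cannot.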

Answer selected by zfergus
This discussion was converted from issue #2 on August 07, 2024 03:45.