How does one "select" which CUDA stream to run on? Is there a way to pass the stream into the functions somehow?