Make `braket_container.py` thread safe for CUDA-Q BYOC image

**Is your feature request related to a problem? Please describe.**
The [`braket_container.py`](https://github.com/amazon-braket/amazon-braket-examples/blob/main/examples/nvidia_cuda_q/container/braket_container.py) script that the CUDA-Q BYOC image uses to launch the user-provided algorithm script is not thread safe (more precisely, it is not safe when several processes run it concurrently). This can create race conditions, in particular in the step that downloads, extracts, and makes available the customer code to be executed in the job. This is a problem specifically for multi-GPU workflows (single-instance and multi-instance), an area where CUDA-Q can provide acceleration. The [original script](https://github.com/amazon-braket/amazon-braket-containers/blob/main/src/braket_container.py) in the amazon-braket-containers repository does not take multiple processes running in an MPI context into account at all. The script in this repository at least performs some basic handling of the MPI ranks [here](https://github.com/amazon-braket/amazon-braket-examples/blob/dd3828c5db4cdcabae99b6535a630dacf97656c6/examples/nvidia_cuda_q/container/braket_container.py#L316):

```python
# Add wait time to resolve race condition
import time

rank = int(os.getenv("OMPI_COMM_WORLD_NODE_RANK", "0"))
time.sleep(rank)
```

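However, this handling is both inefficient (each process sleeps for a number of seconds equal to its rank, whether or not that is needed) and ultimately not bulletproof: for example, the race can still occur if downloading the user-provided algorithm code from S3 takes longer than expected.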


**Describe the solution you'd like**
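The script should be refactored for real thread safety: the step that downloads and extracts the customer code should be guarded by explicit synchronization between the processes rather than by a rank-dependent sleep.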
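
As a rough illustration of what I have in mind (a minimal sketch, not a patch against the actual script), the setup step could be guarded by a file lock plus a "done" marker, so that exactly one process per instance does the work and the others block until it has finished. The helper name `run_once_per_node`, the lock and marker file names, and the `setup` callback are all hypothetical; the sketch also assumes a Linux container (`fcntl` is POSIX-only) and that all processes on one instance share the same local directory.

```python
import fcntl
from pathlib import Path
from typing import Callable


def run_once_per_node(setup: Callable[[], None], work_dir: str) -> bool:
    """Run `setup` exactly once among all processes sharing `work_dir`.

    Returns True in the process that ran `setup`, False in the others.
    All names here are hypothetical and not part of braket_container.py.
    """
    root = Path(work_dir)
    root.mkdir(parents=True, exist_ok=True)
    lock_path = root / ".code_setup.lock"  # hypothetical lock file name
    done_path = root / ".code_setup.done"  # hypothetical marker file name

    with open(lock_path, "w") as lock_file:
        # Blocks until no other process holds the exclusive lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if done_path.exists():
                # Another process already completed the setup.
                return False
            setup()
            # Written only after setup() succeeded, so a failed attempt
            # is retried by the next process that acquires the lock.
            done_path.touch()
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Every rank would call this with the same directory and pass the existing download-and-extract logic as `setup`. No process waits longer than the setup actually takes, and no process proceeds before the code is in place. Since instances do not share a local filesystem, the lock naturally applies per instance.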

**Describe alternatives you've considered**
It would be even better, IMO, to improve the [original script](https://github.com/amazon-braket/amazon-braket-containers/blob/main/src/braket_container.py) and copy it into the image directly from the Dockerfile rather than maintaining a duplicate in this repository, e.g.:

```dockerfile
FROM ...

# other instructions...

RUN git clone --depth=1 https://github.com/amazon-braket/amazon-braket-containers.git
RUN cp amazon-braket-containers/src/braket_container.py /opt/ml/code/braket_container.py
ENV SAGEMAKER_PROGRAM=braket_container.py
```

