
Conversation

@jchodera (Member) commented Feb 8, 2017

@andrrizzi: Here's the very basic test code I was playing with, in case you find it useful. It doesn't necessarily have to be merged, but it might at least illustrate something along the lines of what I was thinking.

I haven't tested this on the cluster yet.

@jchodera (Member, Author) commented Feb 8, 2017

We'll still need a way to start celery workers on individual GPUs. I bet we could do this with something like:

CUDA_VISIBLE_DEVICES=0 celery -A openmmtools.distributed worker -l info --concurrency=1 &
CUDA_VISIBLE_DEVICES=1 celery -A openmmtools.distributed worker -l info --concurrency=1 &
CUDA_VISIBLE_DEVICES=2 celery -A openmmtools.distributed worker -l info --concurrency=1 &
CUDA_VISIBLE_DEVICES=3 celery -A openmmtools.distributed worker -l info --concurrency=1 &

though @pgrinaway may have better ideas for how best to do this with multiple GPUs on a node.

It looks like there's also a way to restrict workers to specific queues with the --queues flag, like --queues gpu vs. --queues cpu. These are documented in the celery workers guide.
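
For example, here's a minimal sketch of how tasks could be routed to those queues on the Python side. This isn't from the PR; the broker URL and task name are assumptions for illustration:

# Sketch only: the module layout, task name, and broker URL below are
# assumptions for illustration, not the actual openmmtools.distributed code.
from celery import Celery

app = Celery('openmmtools.distributed', broker='redis://localhost:6379/0')

# Route the hypothetical GPU-bound task to the 'gpu' queue; a worker started
# with `--queues gpu` will then consume only from that queue.
app.conf.task_routes = {
    'openmmtools.distributed.propagate_replica': {'queue': 'gpu'},
}

@app.task
def propagate_replica(serialized_state):
    # Placeholder for a GPU-bound OpenMM propagation step.
    return serialized_state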

@andrrizzi (Contributor) commented

Thanks! I'll take a look at this tomorrow.

@jchodera (Member, Author) commented

This is still very much test code for experimenting. I think the next steps are:

  • Try to construct a benchmark that compares local vs. remote execution of a replica-exchange-like operation on a realistic test system (e.g. Src in explicit solvent) to see how well things perform (a rough timing sketch follows this list)
  • Make it easy to try both celery and redis
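
As a starting point for the first item, a rough timing sketch (again not from the PR, and assuming a celery task like the hypothetical propagate_replica above):

# Rough benchmark sketch: compare in-process execution of a propagation step
# against dispatching the same work to remote celery workers.
import time

def benchmark(states):
    # Local: call the task function directly in this process.
    start = time.time()
    local_results = [propagate_replica(state) for state in states]
    local_time = time.time() - start

    # Remote: dispatch to celery workers and block until all results return.
    start = time.time()
    async_results = [propagate_replica.delay(state) for state in states]
    remote_results = [result.get() for result in async_results]
    remote_time = time.time() - start

    return local_time, remote_time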
