What GPU model(s) did you use to train this model, how many GPUs were deployed, and approximately how long did the training process take?