Description
What you would like to be added?
I would like a simple way to create a TrainJob that uses the entry point of the image specified in the TrainingRuntime, without having to specify an explicit train function and its related properties (func_args, packages_to_install, pip_index_urls).
It should still be possible to provide job configuration such as num_nodes, resources_per_node, and env.
Alternatively, this could also be achieved by #47.
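To make the request concrete, here is a rough sketch of the shape I have in mind. It is not an existing SDK API: `EntrypointTrainer` and `build_trainjob` are hypothetical names made up for illustration, the runtime name is a placeholder, and the manifest field names (`runtimeRef`, `numNodes`, `resourcesPerNode`, `env`) and the `apiVersion` reflect my understanding of the TrainJob spec and may need adjusting. The point is only that no command, args, or train function is set, so the image entry point from the TrainingRuntime is what runs.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class EntrypointTrainer:
    """Hypothetical trainer config: job settings only, no func/func_args/packages_to_install."""
    num_nodes: Optional[int] = None
    resources_per_node: Optional[Dict[str, str]] = None
    env: Dict[str, str] = field(default_factory=dict)


def build_trainjob(name: str, runtime_name: str, trainer: EntrypointTrainer) -> dict:
    """Build a TrainJob-like manifest that leaves command/args unset.

    Because no command or args are written, the container would fall back to
    the entry point baked into the image referenced by the runtime.
    """
    trainer_spec: dict = {}
    if trainer.num_nodes is not None:
        trainer_spec["numNodes"] = trainer.num_nodes
    if trainer.resources_per_node:
        # Assumption: resources are passed as Kubernetes-style limits.
        trainer_spec["resourcesPerNode"] = {"limits": dict(trainer.resources_per_node)}
    if trainer.env:
        trainer_spec["env"] = [{"name": k, "value": v} for k, v in trainer.env.items()]
    return {
        "apiVersion": "trainer.kubeflow.org/v1alpha1",  # assumed API version
        "kind": "TrainJob",
        "metadata": {"name": name},
        "spec": {"runtimeRef": {"name": runtime_name}, "trainer": trainer_spec},
    }


# Example: only job-level settings are given; the runtime name is a placeholder.
job = build_trainjob(
    name="fms-tuning",
    runtime_name="my-fms-hf-tuning-runtime",
    trainer=EntrypointTrainer(
        num_nodes=2,
        resources_per_node={"nvidia.com/gpu": "1"},
        env={"LOG_LEVEL": "info"},
    ),
)
```

Whether this ends up as a separate trainer type or as making `func` optional on the existing one is open; the sketch only shows which inputs should remain available.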
Why is this needed?
Some training images (for example https://github.com/foundation-model-stack/fms-hf-tuning/tree/main/build) already ship their own training scripts and reference them in the image entry point.
It would be good to allow creating a TrainJob that runs those images as they are, while still being able to specify training properties such as environment variables, number of nodes, and resources.
Love this feature?
Give it a 👍 We prioritize the features with most 👍