Support creating TrainJob using image entry point #116

@sutaakar

What you would like to be added?

I would like a simple way to create a TrainJob that uses the entry point of the image specified in the TrainingRuntime, without having to specify an explicit train function and its related properties (func_args, packages_to_install, pip_index_urls).
It should still be possible to provide job configuration such as num_nodes, resources_per_node, and env.
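
As a rough illustration of the requested behaviour, here is a self-contained sketch. It is not the SDK's actual API: `JobOptions`, `build_overrides`, and the keys of the returned dict are hypothetical names chosen for this example. The only point is that the train function becomes optional, and when it is omitted no command is set, so the image entry point runs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class JobOptions:
    """Hypothetical job configuration; not a real SDK class."""
    num_nodes: Optional[int] = None
    resources_per_node: Dict[str, str] = field(default_factory=dict)
    env: Dict[str, str] = field(default_factory=dict)
    # Optional train function. None means "use the image entry point".
    func: Optional[Callable] = None


def build_overrides(options: JobOptions) -> dict:
    """Build a plain dict of overrides (illustrative, not the TrainJob schema)."""
    overrides: dict = {}
    if options.num_nodes is not None:
        overrides["num_nodes"] = options.num_nodes
    if options.resources_per_node:
        overrides["resources_per_node"] = dict(options.resources_per_node)
    if options.env:
        overrides["env"] = dict(options.env)
    # Only record a train function when one is given. Without it, no
    # command is set and the entry point baked into the image is used.
    if options.func is not None:
        overrides["func_name"] = options.func.__name__
    return overrides


# Job configuration only, no train function:
image_entrypoint_job = build_overrides(
    JobOptions(
        num_nodes=2,
        resources_per_node={"gpu": "1"},
        env={"LOG_LEVEL": "info"},
    )
)
```

In this sketch `image_entrypoint_job` carries only the node count, resources, and environment variables, which is the shape of request described above.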

Alternatively, this could be achieved through #47.

Why is this needed?

Some training images (for example https://github.com/foundation-model-stack/fms-hf-tuning/tree/main/build) already ship their own training scripts and reference them in the image entry point.

It would be good to allow creating a TrainJob that runs those images while still being able to specify training properties such as environment variables, number of nodes, and resources.

Love this feature?

Give it a 👍. We prioritize the features with the most 👍.
