Run a Script
************

This example shows you how to create a job that runs a "Hello World" Python script. The :py:class:`~ads.jobs.ScriptRuntime` used here is designed for defining job artifacts and configurations supported natively by OCI Data Science Jobs, and it can be used with any script type that the service supports, so you could also run Bash or shell scripts. The Logging service log and log group are defined in the infrastructure, and the output of the script appears in the logs.

Python
======

Suppose you would like to run the following "Hello World" Python script, named ``job_script.py``:

.. code-block:: python3

    print("Hello World")

First, initiate a job with a job name:

.. code-block:: python3

    from ads.jobs import Job
    job = Job(name="Job Name")

Next, you specify the desired infrastructure to run the job. If you are in a notebook session, ADS can automatically fetch the infrastructure configurations and use them for the job. If you aren't in a notebook session, or you want to customize the infrastructure, you can specify it using the methods of the ``DataScienceJob`` class:

.. code-block:: python3

    from ads.jobs import DataScienceJob

    job.with_infrastructure(
        DataScienceJob()
        .with_log_group_id("<log_group_ocid>")
        .with_log_id("<log_ocid>")
        # The following infrastructure configurations are optional
        # if you are in an OCI Data Science notebook session.
        # The configurations of the notebook session will be used as defaults.
        .with_compartment_id("<compartment_ocid>")
        .with_project_id("<project_ocid>")
        .with_subnet_id("<subnet_ocid>")
        .with_shape_name("VM.Standard.E3.Flex")
        .with_shape_config_details(memory_in_gbs=16, ocpus=1)  # Applicable only for the flexible shapes
        .with_block_storage_size(50)
    )

In this example, the artifact is a Python script, so the ``ScriptRuntime()`` class is used to define the name of the script using the ``.with_source()`` method:

.. code-block:: python3

    from ads.jobs import ScriptRuntime
    job.with_runtime(
        ScriptRuntime().with_source("job_script.py")
    )

Finally, you create and run the job, which gives you access to the ``job_run.id``:

.. code-block:: python3

    job.create()
    job_run = job.run()

Additionally, you can acquire the job run using the OCID:

.. code-block:: python3

    from ads.jobs import DataScienceJobRun
    job_run = DataScienceJobRun.from_ocid(job_run.id)

The ``.watch()`` method is useful to monitor the progress of the job run:

.. code-block:: python3

    job_run.watch()
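
Once the run finishes, you can also check its lifecycle state programmatically. A minimal sketch, assuming the ``status`` property available on the job run object:

.. code-block:: python3

    # Prints the lifecycle state of the run, for example, SUCCEEDED or FAILED.
    print(job_run.status)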

After the job has been created and runs successfully, you can find the output of the script in the logs if you configured logging.
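
Because the ``ScriptRuntime`` also supports shell scripts, the same job definition works for a Bash script. A minimal sketch, assuming a hypothetical script named ``job_script.sh``:

.. code-block:: python3

    from ads.jobs import ScriptRuntime

    # Only the runtime source changes; the infrastructure is configured the same way.
    job.with_runtime(
        ScriptRuntime().with_source("job_script.sh")
    )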

YAML
====

You could also initialize a job directly from a YAML string. For example, to create a job identical to the preceding example, you could simply run the following:

.. code-block:: python3

    job = Job.from_string("""
    kind: job
    spec:
      infrastructure:
        kind: infrastructure
        type: dataScienceJob
        spec:
          logGroupId: <log_group_ocid>
          logId: <log_ocid>
          compartmentId: <compartment_ocid>
          projectId: <project_ocid>
          subnetId: <subnet_ocid>
          shapeName: VM.Standard.E3.Flex
          shapeConfigDetails:
            memoryInGBs: 16
            ocpus: 1
          blockStorageSize: 50
      name: <resource_name>
      runtime:
        kind: runtime
        type: python
        spec:
          scriptPathURI: job_script.py
    """)
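
If you keep the configuration in a file instead of a string, you can load it from there. A minimal sketch, assuming a hypothetical local file ``my_job.yaml`` with the same content (``Job.from_yaml`` accepts either a YAML string or a ``uri``, and ``to_yaml`` serializes a job definition back to YAML):

.. code-block:: python3

    from ads.jobs import Job

    # Load a job definition from a local YAML file.
    job = Job.from_yaml(uri="my_job.yaml")

    # Serialize the job definition back to a YAML string.
    yaml_string = job.to_yaml()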

Command Line Arguments
======================

If the Python script that you want to run as a job requires CLI arguments, use the ``.with_argument()`` method to pass the arguments to the job.

Python
------

Suppose you want to run the following Python script named ``job_script_argument.py``:

.. code-block:: python3

    import sys
    print("Hello " + str(sys.argv[1]) + " and " + str(sys.argv[2]))

This example runs a job with CLI arguments:

.. code-block:: python3

    from ads.jobs import Job
    from ads.jobs import DataScienceJob
    from ads.jobs import ScriptRuntime

    job = Job()
    job.with_infrastructure(
        DataScienceJob()
        .with_log_group_id("<log_group_ocid>")
        .with_log_id("<log_ocid>")
    )

    # The CLI arguments can be passed in using `with_argument` when defining the runtime.
    job.with_runtime(
        ScriptRuntime()
        .with_source("job_script_argument.py")
        .with_argument("<first_argument>", "<second_argument>")
    )

    job.create()
    job_run = job.run()

After the job run is created, you can use the ``.watch()`` method to monitor its progress:

.. code-block:: python3

    job_run.watch()

This job run prints out ``Hello <first_argument> and <second_argument>``.
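
If your script parses named options rather than positional ones (for example, with ``argparse``), ``.with_argument()`` also accepts keyword arguments, which are passed to the script as ``--key value`` pairs. A minimal sketch, assuming a hypothetical script ``job_script_options.py`` that defines a ``--greeting`` option:

.. code-block:: python3

    from ads.jobs import ScriptRuntime

    job.with_runtime(
        ScriptRuntime()
        .with_source("job_script_options.py")
        # Keyword arguments are rendered as command line options: --greeting <value>
        .with_argument(greeting="<value>")
    )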

YAML
----

You could create the preceding example job with the following YAML file:

.. code-block:: yaml

    kind: job
    spec:
      infrastructure:
        kind: infrastructure
        type: dataScienceJob
        spec:
          logGroupId: <log_group_ocid>
          logId: <log_ocid>
          compartmentId: <compartment_ocid>
          projectId: <project_ocid>
          subnetId: <subnet_ocid>
          shapeName: VM.Standard.E3.Flex
          shapeConfigDetails:
            memoryInGBs: 16
            ocpus: 1
          blockStorageSize: 50
      runtime:
        kind: runtime
        type: python
        spec:
          args:
          - <first_argument>
          - <second_argument>
          scriptPathURI: job_script_argument.py

Environment Variables
=====================

Similarly, if the script you want to run requires environment variables, you pass them in as key-value pairs using the ``.with_environment_variable()`` method, and access them in the Python script through the ``os.environ`` dictionary.

Python
------

Suppose you want to run the following Python script named ``job_script_env.py``:

.. code-block:: python3

    import os
    print("Hello " + os.environ["KEY1"] + " and " + os.environ["KEY2"])

This example runs a job with environment variables:

.. code-block:: python3

    from ads.jobs import Job
    from ads.jobs import DataScienceJob
    from ads.jobs import ScriptRuntime

    job = Job()
    job.with_infrastructure(
        DataScienceJob()
        .with_log_group_id("<log_group_ocid>")
        .with_log_id("<log_ocid>")
        # The following infrastructure configurations are optional
        # if you are in an OCI Data Science notebook session.
        # The configurations of the notebook session will be used as defaults.
        .with_compartment_id("<compartment_ocid>")
        .with_project_id("<project_ocid>")
        .with_subnet_id("<subnet_ocid>")
        .with_shape_name("VM.Standard.E3.Flex")
        .with_shape_config_details(memory_in_gbs=16, ocpus=1)
        .with_block_storage_size(50)
    )

    job.with_runtime(
        ScriptRuntime()
        .with_source("job_script_env.py")
        .with_environment_variable(KEY1="<first_value>", KEY2="<second_value>")
    )
    job.create()
    job_run = job.run()

You can watch the progress of the job run using the ``.watch()`` method:

.. code-block:: python3

    job_run.watch()

This job run prints out ``Hello <first_value> and <second_value>``.
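
You can also override the environment variables for a single run without changing the job definition. A minimal sketch, assuming ``Job.run()`` accepts an ``env_var`` override, as in recent ADS releases:

.. code-block:: python3

    # Override KEY1 for this run only; the job definition is unchanged.
    job_run = job.run(env_var={"KEY1": "<another_value>"})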

YAML
----

You could create the preceding example job with the following YAML file:

.. code-block:: yaml

    kind: job
    spec:
      infrastructure:
        kind: infrastructure
        type: dataScienceJob
        spec:
          logGroupId: <log_group_ocid>
          logId: <log_ocid>
          compartmentId: <compartment_ocid>
          projectId: <project_ocid>
          subnetId: <subnet_ocid>
          shapeName: VM.Standard.E3.Flex
          shapeConfigDetails:
            memoryInGBs: 16
            ocpus: 1
          blockStorageSize: 50
      runtime:
        kind: runtime
        type: python
        spec:
          env:
          - name: KEY1
            value: <first_value>
          - name: KEY2
            value: <second_value>
          scriptPathURI: job_script_env.py

The source code can be a single script, files in a folder, or a zip/tar file. See also: `Preparing Job Artifacts <https://docs.oracle.com/en-us/iaas/data-science/using/jobs-artifact.htm>`_.
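
When the artifact is a folder or a zip file, the script to run is selected with an entry point. A minimal sketch, assuming a hypothetical archive ``job_artifact.zip`` that contains ``job_artifact/main.py`` (the ``entrypoint`` argument of ``.with_source()`` names the script to execute):

.. code-block:: python3

    from ads.jobs import ScriptRuntime

    runtime = ScriptRuntime().with_source(
        "job_artifact.zip",
        # Path of the script to run, relative to the root of the artifact.
        entrypoint="job_artifact/main.py",
    )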