Commit e8aaf3f

Configuration documentation (#535)

1 parent bfe25ec commit e8aaf3f
File tree

2 files changed: +186 -0 lines changed

docs/configuration.md

Lines changed: 185 additions & 0 deletions
@@ -0,0 +1,185 @@
# Configuration

Cubed uses [Donfig](https://donfig.readthedocs.io/en/latest/) for managing configuration of things like the directory used for temporary files, or for setting executor properties.

This page covers how to specify configuration properties, and provides a reference for all the configuration options you can use in Cubed.

## Specification

There are three main ways of specifying configuration in Cubed:

1. by instantiating a `Spec` object,
2. by using a YAML file and setting the `CUBED_CONFIG` environment variable, or
3. by setting environment variables for individual properties.

We look at each in turn.

### `Spec` object

This is how you configure Cubed directly from within a Python program - by instantiating a {py:class}`Spec <cubed.Spec>` object:

```python
import cubed

spec = cubed.Spec(
    work_dir="s3://cubed-tomwhite-temp",
    allowed_mem="2GB",
    executor_name="lithops",
    executor_options=dict(use_backups=False, runtime="cubed-runtime", runtime_memory=2000)
)
```

The `spec` instance is then passed to array creation functions as follows:

```python
import cubed.array_api as xp

a = cubed.random.random((50000, 50000), chunks=(5000, 5000), spec=spec)
b = cubed.random.random((50000, 50000), chunks=(5000, 5000), spec=spec)
c = xp.add(a, b)
```

All arrays in any given computation must share the same `spec` instance.
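
Nothing is computed until a result is requested. As a minimal sketch (assuming the executor named in `spec` is available), calling `compute()` on the final array runs the plan:

```python
# Executes the plan using the executor configured in `spec` and
# returns a NumPy array. A small slice keeps the result comfortably
# in memory for this illustration.
result = c[:10, :10].compute()
print(result.shape)  # (10, 10)
```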

### YAML file

A YAML file is a good way to encapsulate the configuration in a single file that lives outside the Python program.
It's a useful way to package up the settings for running with a particular executor, so they can be reused.
The Cubed [examples](https://github.com/cubed-dev/cubed/blob/main/examples/README.md) use YAML files for this reason.

```yaml
spec:
  work_dir: "s3://cubed-$USER-temp"
  allowed_mem: "2GB"
  executor_name: "lithops"
  executor_options:
    use_backups: False
    runtime: "cubed-runtime"
    runtime_memory: 2000
```

Note that YAML files can use environment variables - in this example `$USER` will be expanded appropriately.

To use the YAML file, set the `CUBED_CONFIG` environment variable to the file path before invoking your Python program:

```shell
CUBED_CONFIG=/path/to/cubed.yaml python ...
```

Donfig will actually look for YAML files in a variety of locations; see the [Donfig docs](https://donfig.readthedocs.io/en/latest/configuration.html#yaml-files) for details.
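
With `CUBED_CONFIG` set, the program itself doesn't need to construct a `Spec` at all. A minimal sketch, assuming Cubed falls back to the Donfig-loaded configuration when no `spec` argument is given:

```python
import cubed
import cubed.array_api as xp

# No Spec object here: with CUBED_CONFIG pointing at the YAML file above,
# work_dir, allowed_mem and the executor settings all come from the file.
a = cubed.random.random((50000, 50000), chunks=(5000, 5000))
b = cubed.random.random((50000, 50000), chunks=(5000, 5000))
c = xp.add(a, b)
```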

### Environment variables

You can also set [Donfig-style environment variables](https://donfig.readthedocs.io/en/latest/configuration.html#environment-variables) to set individual properties. Notice how double underscores are used to indicate nesting.

```shell
export CUBED_SPEC__WORK_DIR='s3://cubed-$USER-temp'
export CUBED_SPEC__ALLOWED_MEM=2GB
export CUBED_SPEC__EXECUTOR_NAME=lithops
export CUBED_SPEC__EXECUTOR_OPTIONS__USE_BACKUPS=False
export CUBED_SPEC__EXECUTOR_OPTIONS__RUNTIME=cubed-runtime
export CUBED_SPEC__EXECUTOR_OPTIONS__RUNTIME_MEMORY=2000
```

This can be handy if you only have a couple of properties to set:

```shell
CUBED_SPEC__ALLOWED_MEM=2GB CUBED_SPEC__EXECUTOR_NAME=processes python ...
```
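
Because Donfig collects these variables when the configuration is created, they can also be set from within Python before `cubed` is imported. A hedged sketch, assuming the configuration is read at import time:

```python
import os

# Assumption: Donfig reads CUBED_* environment variables when cubed is
# first imported, so they must be set before the import below.
os.environ["CUBED_SPEC__ALLOWED_MEM"] = "2GB"
os.environ["CUBED_SPEC__EXECUTOR_NAME"] = "processes"

import cubed  # picks up the settings above
```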

## Reference

### Spec options

These properties can be passed directly to the {py:class}`Spec <cubed.Spec>` constructor. Or, equivalently, they are directly under `spec` in a YAML file.

| Property           | Default           | Description                                                                                                                              |
|--------------------|-------------------|------------------------------------------------------------------------------------------------------------------------------------------|
| `work_dir`         | `None`            | The directory path (specified as an fsspec URL) used for storing intermediate data. If not set, the user's temporary directory is used.  |
| `allowed_mem`      | `200MB`           | The total memory available to a worker for running a task. This includes any `reserved_mem` that has been set.                           |
| `reserved_mem`     | `100MB`           | The memory reserved on a worker for non-data use when running a task.                                                                    |
| `executor_name`    | `single-threaded` | The executor for running computations. One of `single-threaded`, `threads`, `processes`, `beam`, `coiled`, `dask`, `lithops`, `modal`.   |
| `executor_options` | `None`            | Options to pass to the executor on construction. See below for possible options for each executor.                                      |
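
Note that `allowed_mem` includes `reserved_mem`: with the defaults above, a task has 200MB in total, of which 100MB is set aside for non-data use, leaving roughly 100MB for data. A sketch of overriding both (the values are illustrative only):

```python
import cubed

# Illustrative values: allowed_mem includes reserved_mem, so each task
# here has 2GB in total, 500MB of which is reserved for non-data use.
spec = cubed.Spec(
    allowed_mem="2GB",
    reserved_mem="500MB",
)
```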

### Executor options

Different executors support different options; they are listed here for each executor.

These properties are keys in the `executor_options` passed to the {py:class}`Spec <cubed.Spec>` constructor. Or, equivalently, they are directly under `spec.executor_options` in a YAML file.

#### `single-threaded`

The `single-threaded` executor is a simple executor mainly used for testing. It doesn't support any configuration options, since it is deliberately designed to have only the most basic features.

#### `threads`

| Property                     | Default | Description                                                                                             |
|------------------------------|---------|-----------------------------------------------------------------------------------------------------------|
| `retries`                    | 2       | The number of times to retry a task if it fails.                                                        |
| `use_backups`                | `True`  | Whether to use backup tasks for mitigating stragglers.                                                  |
| `batch_size`                 | `None`  | Number of input tasks to submit to be run in parallel. The default is not to batch.                     |
| `compute_arrays_in_parallel` | `False` | Whether arrays are computed one at a time or in parallel.                                               |
| `max_workers`                | `None`  | The maximum number of workers to use in the `ThreadPoolExecutor`. Defaults to the number of CPU cores.  |
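
For instance, a sketch of configuring the `threads` executor from Python (the values are arbitrary):

```python
import cubed

# Arbitrary illustrative values for the threads executor options above.
spec = cubed.Spec(
    allowed_mem="2GB",
    executor_name="threads",
    executor_options=dict(
        retries=3,      # retry a failed task up to 3 times
        max_workers=8,  # cap the ThreadPoolExecutor at 8 workers
    ),
)
```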

#### `processes`

| Property                     | Default | Description                                                                                                                                                    |
|------------------------------|---------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `use_backups`                | `True`  | Whether to use backup tasks for mitigating stragglers.                                                                                                         |
| `batch_size`                 | `None`  | Number of input tasks to submit to be run in parallel. `None` means don't batch.                                                                               |
| `compute_arrays_in_parallel` | `False` | Whether arrays are computed one at a time or in parallel.                                                                                                      |
| `max_workers`                | `None`  | The maximum number of workers to use in the `ProcessPoolExecutor`. Defaults to the number of CPU cores.                                                        |
| `max_tasks_per_child`        | `None`  | The number of tasks to run in each child process. See the Python documentation for `concurrent.futures.ProcessPoolExecutor`. (Requires Python 3.11 or later.) |

Note that `retries` is not currently supported for the `processes` executor.

#### `beam`

The `beam` executor doesn't currently expose any configuration options.
When running on Google Cloud Dataflow, [four retry attempts](https://cloud.google.com/dataflow/docs/pipeline-lifecycle#error_and_exception_handling) are made for failing tasks.

#### `coiled`

| Property        | Default | Description                                                                                              |
|-----------------|---------|-------------------------------------------------------------------------------------------------------------|
| `coiled_kwargs` | `None`  | Keyword arguments to pass to [`coiled.function`](https://docs.coiled.io/user_guide/functions.html#api).  |

Note that there is currently no way to set retries or a timeout for the Coiled executor.

#### `dask`

| Property                     | Default | Description                                                                                                                       |
|------------------------------|---------|---------------------------------------------------------------------------------------------------------------------------------------|
| `retries`                    | 2       | The number of times to retry a task if it fails.                                                                                  |
| `use_backups`                | `True`  | Whether to use backup tasks for mitigating stragglers.                                                                            |
| `batch_size`                 | `None`  | Number of input tasks to submit to be run in parallel. The default is not to batch.                                               |
| `compute_arrays_in_parallel` | `False` | Whether arrays are computed one at a time or in parallel.                                                                         |
| `compute_kwargs`             | `None`  | Keyword arguments to pass to Dask's [`distributed.Client`](https://distributed.dask.org/en/latest/api.html#client) constructor.   |

Note that there is currently no way to set a timeout for the Dask executor.
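
A sketch of passing `compute_kwargs` through to the Dask client, assuming an existing scheduler (the address is illustrative):

```python
import cubed

# Illustrative: point the Dask executor at an existing distributed scheduler
# by forwarding an address to the distributed.Client constructor.
spec = cubed.Spec(
    allowed_mem="2GB",
    executor_name="dask",
    executor_options=dict(
        compute_kwargs=dict(address="tcp://127.0.0.1:8786"),
    ),
)
```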

#### `lithops`

| Property                     | Default | Description                                                                                                                                                                                                                                                                                                                                                         |
|------------------------------|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `retries`                    | 2       | The number of times to retry a task if it fails.                                                                                                                                                                                                                                                                                                                    |
| `timeout`                    | `None`  | Tasks that take longer than the timeout will be automatically killed and retried. Defaults to the timeout specified when [deploying the lithops runtime image](https://lithops-cloud.github.io/docs/source/cli.html#lithops-runtime-deploy-runtime-name). This is 180 seconds in the [examples](https://github.com/cubed-dev/cubed/blob/main/examples/README.md).  |
| `use_backups`                | `True`  | Whether to use backup tasks for mitigating stragglers.                                                                                                                                                                                                                                                                                                              |
| `compute_arrays_in_parallel` | `False` | Whether arrays are computed one at a time or in parallel.                                                                                                                                                                                                                                                                                                           |
| Other properties             | N/A     | Other properties will be passed as keyword arguments to the [`lithops.executors.FunctionExecutor`](https://lithops-cloud.github.io/docs/source/api_futures.html#lithops.executors.FunctionExecutor) constructor.                                                                                                                                                    |

Note that `batch_size` is not currently supported for Lithops.
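
Since unrecognised properties fall through to `FunctionExecutor`, backend-specific settings can be given alongside Cubed's own, as in this sketch (mirroring the runtime options used earlier on this page):

```python
import cubed

# `runtime` and `runtime_memory` are forwarded to
# lithops.executors.FunctionExecutor; `retries` is handled by Cubed itself.
spec = cubed.Spec(
    work_dir="s3://cubed-tomwhite-temp",  # bucket from the example above
    allowed_mem="2GB",
    executor_name="lithops",
    executor_options=dict(
        retries=2,
        runtime="cubed-runtime",
        runtime_memory=2000,
    ),
)
```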

#### `modal`

| Property                     | Default | Description                                                                           |
|------------------------------|---------|-----------------------------------------------------------------------------------------|
| `cloud`                      | `aws`   | The cloud to run on. One of `aws` or `gcp`.                                           |
| `use_backups`                | `True`  | Whether to use backup tasks for mitigating stragglers.                                |
| `batch_size`                 | `None`  | Number of input tasks to submit to be run in parallel. The default is not to batch.   |
| `compute_arrays_in_parallel` | `False` | Whether arrays are computed one at a time or in parallel.                             |

Currently the Modal executor in Cubed uses a hard-coded value of 2 for retries and a 300 second timeout, neither of which can be changed through configuration.

docs/index.md

Lines changed: 1 addition & 0 deletions

@@ -24,6 +24,7 @@ Intro slides <https://cubed-dev.github.io/cubed/cubed-intro.slides.html>
 Examples <https://github.com/tomwhite/cubed/tree/main/examples/README.md>
 api
 array-api
+configuration
 related-projects
 ```
