Skip to content

Commit 281b9e1

Browse files
authored
Merge pull request #3 from jpsamaroo/jps/readme
Add a README
2 parents 7047c4d + 3d75c3a commit 281b9e1

File tree

1 file changed

+55
-0
lines changed

1 file changed

+55
-0
lines changed

README.md

Lines changed: 55 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,55 @@
1+
# Distributed
2+
3+
The `Distributed` package provides functionality for creating and controlling multiple Julia processes remotely, and for performing distributed and parallel computing. It uses network sockets or other supported interfaces to communicate between Julia processes, and relies on Julia's `Serialization` stdlib package to transform Julia objects into a format that can be transferred between processes efficiently. It provides a full set of utilities to create and destroy new Julia processes and add them to a "cluster" (a collection of Julia processes connected together), as well as functions to perform Remote Procedure Calls (RPC) between the processes within a cluster. See [`API`](@ref) for details.
4+
5+
This package ships as part of the Julia stdlib.
6+
7+
## Using development versions of this package
8+
9+
To use a newer version of this package, you need to build Julia from scratch. The build process is the same as any other build except that you need to change the commit used in `stdlib/Distributed.version`.
10+
11+
It's also possible to load a development version of the package using [the trick used in the Section named "Using the development version of Pkg.jl" in the `Pkg.jl` repo](https://github.com/JuliaLang/Pkg.jl#using-the-development-version-of-pkgjl), but the capabilities are limited as all other packages will depend on the stdlib version of the package and will not work with the modified package.
12+
13+
## API
14+
15+
The public API of `Distributed` consists of a variety of functions for various tasks; for creating and destroying processes within a cluster:
16+
17+
- `addprocs` - create one or more Julia processes and connect them to the cluster
18+
- `rmprocs` - shutdown and remove one or more Julia processes from the cluster
19+
20+
For controlling other processes via RPC:
21+
22+
- `remotecall` - call a function on another process and return a `Future` referencing the result of that call
23+
- `Future` - an object that references the result of a `remotecall` that hasn't yet completed - use `fetch` to return the call's result, or `wait` to just wait for the remote call to finish
24+
- `remotecall_fetch` - the same as `fetch(remotecall(...))`
25+
- `remotecall_wait` - the same as `wait(remotecall(...))`
26+
- `remote_do` - like `remotecall`, but does not provide a way to access the result of the call
27+
- `@spawnat` - like `remotecall`, but in macro form
28+
- `@spawn` - like `@spawn`, but the target process is picked automatically
29+
- `@fetch` - macro equivalent of `fetch(@spawn expr)`
30+
- `@fetchfrom` - macro equivalent of `fetch(@spawnat p expr)`
31+
- `myid` - returns the `Int` identifier of the process calling it
32+
- `nprocs` - returns the number of processes in the cluster
33+
- `nworkers` - returns the number of worker processes in the cluster
34+
- `procs` - returns the set of IDs for processes in the cluster
35+
- `workers` - returns the set of IDs for worker processes in the cluster
36+
- `interrupt` - interrupts the specified process
37+
38+
For communicating between processes in the style of a channel or stream:
39+
40+
- `RemoteChannel` - a `Channel`-like object that can be `put!` to or `take!` from any process
41+
42+
For controlling multiple processes at once:
43+
44+
- `WorkerPool` - a collection of processes than can be passed instead a process ID to certain APIs
45+
- `CachingPool` - like `WorkerPool`, but caches functions (including closures which capture large data) on each process
46+
- `@everywhere` - runs a block of code on all (or a subset of all) processes and waits for them all to complete
47+
- `pmap` - performs a `map` operation where each element may be computed on another process
48+
- `@distributed` - implements a `for`-loop where each iteration may be computed on another process
49+
50+
### Process Identifiers
51+
52+
Julia processes connected with `Distributed` are all assigned a cluster-unique `Int` identifier, starting from `1`. The first Julia process within a cluster is given ID `1`, while other processes added via `addprocs` get incrementing IDs (`2`, `3`, etc.). Functions and macros which communicate from one process to another usually take one or more identifiers to determine which process they target - for example, `remotecall_fetch(myid, 2)` calls `myid()` on process 2.
53+
54+
!!! note
55+
Only process 1 (often called the "head", "primary", or "master") may add or remove processes, and manages the rest of the cluster. Other processes (called "workers" or "worker processes") may still call functions on each other and send and receive data, but `addprocs`/`rmprocs` on worker processes will fail with an error.

0 commit comments

Comments
 (0)