Commit cc050a0

Clear docs main page
1 parent 1613073 commit cc050a0

1 file changed: +1 -159 lines changed

docs/src/index.md

Lines changed: 1 addition & 159 deletions
@@ -1,161 +1,3 @@
 # ParallelProcessingTools.jl

-This Julia package provides some tools to ease multithreaded and distributed programming, especially for more complex use cases and when using multiple processes with multiple threads on each process.
-
-This package follows the SPMD (Single Program Multiple Data) paradigm (like, e.g., MPI, CUDA, OpenCL and
-`DistributedArrays.SPMD`): Run the same code on every execution unit (process or thread) and make the code responsible for figuring out which part of the data it should process. This differs from the approach of `Base.Threads.@threads` and `Distributed.@distributed`. SPMD is more appropriate for complex cases that the latter do not handle well (e.g. because some initial setup is required on each execution unit, the iteration scheme over the data is more complex, or control over SIMD processing is required).
-
-This package also implements thread-local variables and tooling to handle non-thread-safe code.
-
-In addition, `ParallelProcessingTools` provides functions and macros designed to ease the transition to the new multi-threading model introduced in Julia v1.3.
-
-Note: Some features may currently not work on Windows.
-
-
-## Work partitions
-
-`workpart` partitions an `AbstractArray` across a specified set of workers (i.e. processes or threads). E.g.
-
-```julia
-A = rand(100)
-workpart(A, 4:7, 5) == view(A, 26:50)
-```
-
-returns a view into the part of the array that worker `5` out of the set of workers `4:7` will be responsible for. The intended usage is
-
-```julia
-using Distributed
-@everywhere using Base.Threads, ParallelProcessingTools
-@everywhere data = rand(1000)
-@everywhere procsel = workers()
-@onprocs procsel begin
-    sub_A = workpart(data, procsel, myid())
-    threadsel = allthreads()
-    @onthreads threadsel begin
-        # ... some initialization, create local buffers, etc.
-        idxs = workpart(eachindex(sub_A), threadsel, threadid())
-        for i in idxs
-            # ... sub_A[i] ...
-        end
-    end
-end
-```
-
-See below for a full example.
-
-If `data` is a `DistributedArrays.DArray`, then `DistributedArrays.localpart(data)` should be used instead of `workpart(data, workers(), myid())`.
-
-
-## Thread-safety
-
-Use `@critical` to mark non-thread-safe code, e.g. for logging. For example
-
-```julia
-@onthreads allthreads() begin
-    @critical @info Base.Threads.threadid()
-end
-```
-
-would crash Julia without `@critical` because `@info` is not thread-safe.
-
-Note: This doesn't always work for multithreaded code on other processes yet.
-
-
-## Thread-local variables
-
-Thread-local variables can be created and initialized via
-
-```julia
-tl = ThreadLocal(0.0)
-```
-
-The API is similar to that of `Ref`: `tl[]` gets the value of `tl` for the current thread, `tl[] = 4.2` sets the value for the current thread. `getallvalues(tl)` returns the values for all threads as a vector, and can only be called from single-threaded code.
-
-
-## Multithreaded code execution
-
-The macro `@onthreads threadsel expr` will run the code in `expr` on the threads in `threadsel` (typically a range of thread IDs). For convenience, the package exports `allthreads() = 1:nthreads()`. Here's a simple example of how to use thread-local variables and `@onthreads` to sum up numbers in parallel:
-
-```julia
-tlsum = ThreadLocal(0.0)
-data = rand(100)
-@onthreads allthreads() begin
-    tlsum[] = sum(workpart(data, allthreads(), Base.Threads.threadid()))
-end
-sum(getallvalues(tlsum)) ≈ sum(data)
-```
-
-`@onthreads` forwards exceptions thrown by the code in `expr` to the caller. In contrast, `Base.Threads.@threads` will currently print an exception but not forward it, so when using `@threads`, program execution simply continues after a failure in multithreaded code.
-
-Note: Julia can currently run only one function on multiple threads at the same time (this restriction is likely to disappear in the future). So even if `threadsel` does not include all threads, the rest of the threads will be idle but blocked and cannot be used to run other code in parallel. However, the ability to run on a subset of the available threads is still useful to measure the scaling behavior of multithreaded code (without restarting Julia with a different value for `$JULIA_NUM_THREADS`).
-
-
-
-## Multiprocess code execution
-
-The macro `@onprocs procsel expr` will run the code in `expr` on the processes in `procsel` (typically an
-array of process IDs). `@onprocs` returns a vector with the result of `expr` on each process and
-will wait until all the results are available (but may of course be wrapped in `@async`). A
-simple example to get the process ID on each worker process:
-
-```julia
-using Distributed
-addprocs(2)
-workers() == @onprocs workers() myid()
-```
-
-Note: If the data can be expressed in terms of a `DistributedArrays.DArray`, it may be more appropriate and convenient to use the multiprocess execution tooling available in the package `DistributedArrays` (possibly combined with `ParallelProcessingTools.@onthreads`).
-
-
-### Example use case:
-
-As a simple real-world use case, let's histogram distributed data on multiple processes and threads:
-
-Set up a cluster of multithreaded workers and load the required packages:
-
-```julia
-using Distributed, ParallelProcessingTools
-addprocs(2)
-@everywhere using ParallelProcessingTools, Base.Threads,
-    DistributedArrays, Statistics, StatsBase
-```
-
-Create some distributed data and check how the data is distributed:
-
-```julia
-data = drandn(10^8)
-procsel = procs(data)
-@onprocs procsel size(localpart(data))
-```
-
-Check the number of threads on each worker holding a part of the data:
-
-```julia
-@onprocs procsel nthreads()
-```
-
-Create histograms in parallel on all threads of all workers and merge:
-
-```julia
-proc_hists = @onprocs procsel begin
-    local_data = localpart(data)
-    tl_hist = ThreadLocal(Histogram((-6:0.1:6,), :left))
-    @onthreads allthreads() begin
-        data_for_this_thread = workpart(local_data, allthreads(), threadid())
-        append!(tl_hist[], data_for_this_thread)
-    end
-    merged_hist = merge(getallvalues(tl_hist)...)
-end
-final_hist = merge(proc_hists...)
-```
-
-Check result:
-
-```julia
-sum(final_hist.weights) ≈ length(data)
-
-using Plots
-plot(final_hist)
-```
-
-Note: This example is meant to show how to combine the features of this package. The multi-process part of this particular use case can be written in a simpler way using functionality from `DistributedArrays`.
+This Julia package provides some tools to ease multithreaded and distributed programming.
