Commit c13ddec

author
Thomas Koenig
committed
Add chapter on collective subroutines.
1 parent 09c48f2 commit c13ddec

1 file changed

+180
-7
lines changed

tutorial.md

Lines changed: 180 additions & 7 deletions
@@ -119,7 +119,24 @@ which will get the intended result:
119119
Goodbye from image 4 of 4
120120
Goodbye from image 3 of 4
121121
```
122-
122+
The `SYNC ALL` statements do not have to be in the same place in the
123+
program. For example, this program will print the "Hello" message
124+
from image 1 later than all the others:
125+
```
126+
program main
127+
implicit none
128+
if (this_image() == 1) sync all
129+
write (*,*) "Hello from image", this_image()
130+
if (this_image() /= 1) sync all
131+
end program
132+
```
133+
Output is (for example)
134+
```
135+
Hello from image 2
136+
Hello from image 4
137+
Hello from image 3
138+
Hello from image 1
139+
```
123140
# Coarrays
124141
In order to be really useful, the images need a way to exchange data
125142
with other images. This can be done with coarrays.
@@ -198,15 +215,15 @@ where all images do work, while only one of them does I/O.
198215
```
199216
program main
200217
implicit none
201-
integer :: sq[*]
218+
integer :: me[*]
202219
integer :: i, s, n
203-
sq = this_image()
220+
me = this_image()
204221
sync all ! Do not forget this.
205222
if (this_image() == 1) then
206223
s = 0
207224
n = num_images()
208225
do i=1, n
209-
s = s + sq[i]
226+
s = s + me[i]
210227
end do
211228
write (*,'(*(A,I0))') "Number of images: ", n, " sum: ", s, &
212229
" expected: ", n*(n+1)/2
@@ -398,7 +415,7 @@ can adjust the bounds. This, for example, would be legal:
398415
and give you an index running from `1` to `num_images * n`, but
399416
you would still have to specify the correct coarray.
400417

401-
# More advanced synchronization -- `SYNC IMAGES`
418+
# More advanced synchronization - `SYNC IMAGES`
402419

403420
`SYNC ALL` is not everything that may be needed for synchronization.
404421
Suppose not every image needs to communicate with every other image,
@@ -454,10 +471,166 @@ end program
454471
Two images can issue `SYNC IMAGES` commands to each other multiple
455472
times. Execution will only continue if the numbers match.
456473

457-
# Coroutines
474+
# Collective subroutines
475+
476+
Data transfer between images can be repetitive to write. For
477+
example, setting a value on all images would require an
478+
explicit DO loop over all images, plus explicit synchronization.
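
For comparison, here is a minimal sketch of what such a hand-written broadcast might look like with a coarray. The program name `manual_broadcast` and the values are only illustrative; the point is the explicit loop and the `SYNC ALL` that the collective subroutines make unnecessary.

```fortran
program manual_broadcast
  implicit none
  integer :: a(3)[*]   ! has to be a coarray so that image 1 can write to it
  integer :: i
  if (this_image() == 1) then
    a = [2, 3, 5]
    ! Image 1 copies its value to every other image, one at a time.
    do i = 2, num_images()
      a(:)[i] = a
    end do
  end if
  sync all   ! no image may read a before image 1 has finished writing
  write (*,*) 'Image', this_image(), 'a =', a
end program manual_broadcast
```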
479+
480+
To simplify such tasks, the Fortran 2018 standard introduced the collective
481+
subroutines. Using these subroutines, you can transfer data between
482+
images using normal (i.e. non-coarray) variables.
483+
484+
## Setting a value on all images - `CO_BROADCAST`
458485

459-
Another method.
486+
You use the subroutine `CO_BROADCAST` to set the value of variables
487+
on all images from one particular image. This variable can be an
488+
array or a scalar. Here is an example:
489+
```
490+
program main
491+
integer, dimension(3) :: a
492+
if (this_image () == 1) then
493+
a = [2,3,5]
494+
end if
495+
call co_broadcast (a, 1)
496+
write (*,*) 'Image', this_image(), "a =", a
497+
end program main
498+
```
499+
The call to `co_broadcast` works as if the value of `a` on image 1 had
500+
been assigned to `a` on every other image.
501+
`a` is *not* a coarray (no square brackets), and no explicit
502+
synchronization is needed. The compiler does that for you. The
503+
example output is
504+
```
505+
Image 2 a = 2 3 5
506+
Image 4 a = 2 3 5
507+
Image 3 a = 2 3 5
508+
Image 1 a = 2 3 5
509+
```
510+
511+
## Common reductions - sum, maximum, minimum
512+
513+
You often want to know the sum, maximum or minimum of
514+
something that is calculated on each image. This is common
515+
enough that there is a subroutine for each of these tasks:
516+
`CO_SUM`, `CO_MAX`, `CO_MIN`, respectively. You can apply these
517+
subroutines to scalars or arrays.
518+
519+
These subroutines take as argument the variable to be reduced, plus
520+
an optional argument `RESULT_IMAGE` where the result should be
521+
stored. If you supply that image number, then the result is only
522+
stored on the corresponding image, and the variables on all other
523+
images become undefined. If you do not supply `RESULT_IMAGE`, the
524+
result is stored on every image. Here is an example without using
525+
`RESULT_IMAGE`:
526+
```
527+
program main
528+
integer :: a
529+
a = this_image()
530+
call co_sum(a)
531+
write (*,*) this_image(), a
532+
end
533+
```
534+
with the output
535+
```
536+
2 10
537+
4 10
538+
3 10
539+
1 10
540+
```
541+
And here is a variant which uses `RESULT_IMAGE` to assign
542+
the value to image 1 only:
543+
```
544+
program main
545+
implicit none
546+
integer :: me, n
547+
me = this_image ()
548+
n = num_images()
549+
call co_sum (me, result_image = 1)
550+
if (this_image() == 1) then
551+
write (*,'(*(A,I0))') "Number of images: ", n, " sum: ", me, &
552+
" expected: ", n*(n+1)/2
553+
end if
554+
end program main
555+
```
556+
with the output
557+
```
558+
Number of images: 4 sum: 10 expected: 10
559+
```
560+
Here is another example which calculates the sum, minimum and maximum
561+
of a value which is calculated for each image. The program prints out
562+
the values for each image, then the minimum, maximum and sum of
563+
each element.
564+
```
565+
program main
566+
implicit none
567+
integer, parameter :: n = 3
568+
integer :: i
569+
real, dimension(n) :: val
570+
real, dimension(n) :: val_min, val_max, val_sum
571+
val = [(cos(0.2*i*this_image()),i=1,n)]
572+
write (*,'(I4," ",3F12.5)') this_image(), val
573+
val_min = val
574+
call co_min (val_min, result_image = 1)
575+
val_max = val
576+
call co_max (val_max, result_image = 1)
577+
val_sum = val
578+
call co_sum (val_sum, result_image = 1)
579+
if (this_image() == 1) then
580+
write (*,'(A,3F12.5)') "Min: ", val_min, "Max: ", val_max, &
581+
"Sum: ", val_sum
582+
end if
583+
end program main
584+
```
585+
The output is, for four images
586+
```
587+
4 0.69671 -0.02920 -0.73739
588+
2 0.92106 0.69671 0.36236
589+
1 0.98007 0.92106 0.82534
590+
3 0.82534 0.36236 -0.22720
591+
Min: 0.69671 -0.02920 -0.73739
592+
Max: 0.98007 0.92106 0.82534
593+
Sum: 3.42317 1.95093 0.22310
594+
```
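
When every image needs the reduced value afterwards, leaving out `RESULT_IMAGE` avoids a separate broadcast. The following is a small sketch under that assumption; the names `mean_over_images`, `x` and `mean` are made up for illustration. It computes the mean of a per-image value on all images at once.

```fortran
program mean_over_images
  implicit none
  real :: x, mean
  x = cos(0.2*this_image())   ! some value calculated on each image
  mean = x                    ! co_sum overwrites its argument, so work on a copy
  call co_sum (mean)          ! no result_image: every image receives the sum
  mean = mean / num_images()
  write (*,'(I4,2F12.5)') this_image(), x, mean
end program mean_over_images
```

Copying `x` into `mean` first keeps the original per-image value available, in the same way that the example above copies `val` into `val_sum`.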
595+
## Generalized reduction - `CO_REDUCE`
596+
There is a possibility that the reduction that is needed is not among
597+
the supported ones above. In that case, you can define your own
598+
function to do the reduction and call `CO_REDUCE`.
460599

600+
The function needs to be `PURE`, and it needs to apply the operation
601+
to its two arguments. It also needs to be commutative, so
602+
`f(a,b)` needs to give the same result as `f(b,a)`, and associative. The following
603+
example checks, element by element, whether the logical array `flag` is
604+
true on all images, much as the `ALL` intrinsic would do for normal
605+
Fortran variables.
606+
```
607+
program main
608+
implicit none
609+
integer, parameter :: n = 3
610+
integer :: i
611+
logical, dimension(n) :: flag
612+
flag = [(cos(0.2*i*this_image()) > 0.,i=1,n)]
613+
write (*,'(I4," ",3L2)') this_image(), flag
614+
call co_reduce (flag, both, result_image=1)
615+
if (this_image() == 1) then
616+
write (*,'(A5,3L2)') "All: ", flag
617+
end if
618+
contains
619+
pure function both (lhs,rhs) result(res)
620+
logical, intent(in) :: lhs,rhs
621+
logical :: res
622+
res = lhs .AND. rhs
623+
end function both
624+
end program main
625+
```
626+
And here is its output:
627+
```
628+
2 T T T
629+
3 T T F
630+
4 T F F
631+
1 T T T
632+
All: T F F
633+
```
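
The same pattern covers other reductions that have no dedicated subroutine, such as a product. The sketch below multiplies the image numbers together; the names `product_reduce` and `mult` are illustrative only. With n images, image 1 ends up with n factorial.

```fortran
program product_reduce
  implicit none
  integer :: p
  p = this_image()
  call co_reduce (p, mult, result_image=1)
  if (this_image() == 1) then
    write (*,'(A,I0)') "Product: ", p
  end if
contains
  ! Multiplication is commutative and associative, as co_reduce requires.
  pure function mult (lhs, rhs) result(res)
    integer, intent(in) :: lhs, rhs
    integer :: res
    res = lhs * rhs
  end function mult
end program product_reduce
```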
461634
# Getting it to work
462635

463636
## Using gfortran
