@@ -413,7 +413,11 @@ can adjust the bounds. This, for example, would be legal:
allocate (a(from:to)[*])
```
and give you an index running from `1` to `num_images * n`, but
- you would still have to specify the correct coarray.
+ you would still have to specify the correct coindices.
+
+ `ALLOCATE` and `DEALLOCATE` also do implicit synchronization,
+ so you can use the allocated coarrays directly; there is no need to
+ specify any `SYNC` variant.
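+
+ As a minimal sketch of what this means in practice (the program and
+ variable names are made up for illustration), each image below writes
+ into its right-hand neighbour immediately after the `ALLOCATE`. The
+ `SYNC ALL` that follows only orders the remote writes before the
+ local read; it is not needed for the allocation itself.
+
+ ```fortran
+ program alloc_sync_sketch
+   implicit none
+   integer, allocatable :: a(:)[:]
+   integer :: me, right
+
+   me = this_image()
+   right = me + 1
+   if (right > num_images()) right = 1
+
+   ! ALLOCATE synchronizes implicitly, so a exists on every image afterwards
+   allocate (a(1)[*])
+   a(1)[right] = me          ! remote write, no SYNC needed before it
+
+   sync all                  ! order the remote writes before the local read
+   print *, "Image", me, "received", a(1)
+
+   deallocate (a)            ! synchronizes implicitly as well
+ end program alloc_sync_sketch
+ ```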

# More advanced synchronization - `SYNC IMAGES`

@@ -631,6 +635,62 @@ And here is its output:
1 T T T
All: T F F
```
+ # Errors, error discovery and program termination
+
+ What happens when errors occur and images terminate needs to be
+ defined carefully. Fortran has facilities to detect failures on
+ individual compute nodes and offers ways to deal with them.
+
+ ## Image states
+
+ There are three states that an image can be in. It can be
+ - an *active image* if it is running normally,
+ - a *stopped image* if it has terminated normally by reaching
+   the end of the main program or by executing a `STOP` statement, or
+ - a *failed image* if it has stopped working for some reason
+   (for example a hardware failure) or has executed a `FAIL IMAGE`
+   statement.
+
+ Once an image is in a stopped or failed state, there is no coming
+ back - it will always remain in that state. An image can also be
+ terminated by an *error condition*; all other images should then also
+ be terminated by the system as soon as possible. This is what
+ usually happens when you try to allocate an already allocated
+ variable or open a non-existent file for reading without specifying
+ a `STAT` (or, for `OPEN`, an `IOSTAT`) variable.
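+
+ The difference a `STAT` variable makes can be sketched as follows,
+ assuming a deliberately repeated allocation (all names here are
+ illustrative only). With `stat=` present, the second `ALLOCATE`
+ reports the problem instead of terminating the program.
+
+ ```fortran
+ program stat_sketch
+   implicit none
+   integer, allocatable :: a(:)
+   integer :: alloc_stat
+   character(len=200) :: msg
+
+   msg = ""
+   allocate (a(10))
+   ! a is already allocated: without stat= this would be an error condition
+   allocate (a(10), stat=alloc_stat, errmsg=msg)
+   if (alloc_stat /= 0) then
+     print *, "Allocation failed: ", trim(msg)
+   end if
+ end program stat_sketch
+ ```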
661
+
662
+ ## Look at the state you are in
663
+
664
+ If you synchronize with a failed or stopped image, try to
665
+ allocate or deallocate a variable there or other similar things,
666
+ what is the system to do? Without direction from the programmer,
667
+ it will simply terminate the program (an error condition, as above).
668
+ This is not very useful as a fail-safe tactic.
669
+
670
+ However, the programmer can specify a `STAT` and optionally an
+ `ERRMSG` argument to catch the error and act accordingly. It
+ is then possible to compare the value returned for the `STAT`
+ argument against predefined values from `iso_fortran_env` and
+ then use the intrinsic functions `FAILED_IMAGES()` and
+ `STOPPED_IMAGES()` to look up which images failed or stopped.
+
+ ```
+ program main
+   use iso_fortran_env, only : STAT_FAILED_IMAGE, STAT_STOPPED_IMAGE
+   implicit none
+   integer :: sync_stat
+   ! Synchronize all images; sync_stat is nonzero if something went wrong
+   sync all (stat=sync_stat)
+   if (sync_stat /= 0) then
+     if (sync_stat == STAT_FAILED_IMAGE) then
+       print *,"Failed images: ", failed_images()
+     else if (sync_stat == STAT_STOPPED_IMAGE) then
+       print *,"Stopped images: ", stopped_images()
+     else
+       print *,"Unforeseen error, aborting"
+       error stop
+     end if
+   end if
+ end program main
+ ```
+
# Getting it to work

## Using gfortran