Skip to content

Commit d05903f

Browse files
committed
specs and examples
1 parent 75945f1 commit d05903f

File tree

5 files changed

+153
-2
lines changed

5 files changed

+153
-2
lines changed

doc/specs/stdlib_intrinsics.md

Lines changed: 115 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ The `stdlib_intrinsics` module provides replacements for some of the well known
1515

1616
#### Description
1717

18-
The `fsum` function can replace the intrinsic `sum` for 1D `real` or `complex` arrays. It follows a chunked implementation which maximaxes vectorization potential as well as reducing the round-off error.
18+
The `fsum` function can replace the intrinsic `sum` for 1D `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential as well as reducing the round-off error. This procedure is recommended when summing large arrays, for repetitive summation of smaller arrays consider the classical `sum`.
1919

2020
#### Syntax
2121

@@ -32,7 +32,7 @@ Pure function.
3232
#### Argument(s)
3333

3434
`x`: 1D array of either `real` or `complex` type. This argument is `intent(in)`.
35-
`mask`: 1D array of `logical` values. This argument is `intent(in)`.
35+
`mask` (optional): 1D array of `logical` values. This argument is `intent(in)`.
3636

3737
#### Output value or Result value
3838

@@ -43,3 +43,116 @@ The output is a scalar of `type` and `kind` same as to that of `x`.
4343
```fortran
4444
{!example/math/example_intrinsics_sum.f90!}
4545
```
46+
47+
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->
48+
### `fsum_kahan` function
49+
50+
#### Description
51+
52+
The `fsum_kahan` function can replace the intrinsic `sum` for 1D `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential, complemented by an `elemental` kernel based on the [kahan summation](https://en.wikipedia.org/wiki/Kahan_summation_algorithm) strategy to reduce the round-off error:
53+
54+
```fortran
55+
elemental subroutine vkahan_<kind>(a,s,c)
56+
type(<kind>), intent(in) :: a
57+
type(<kind>), intent(inout) :: s
58+
type(<kind>), intent(inout) :: c
59+
type(<kind>) :: t, y
60+
y = a - c
61+
t = s + y
62+
c = (t - s) - y
63+
s = t
64+
end subroutine
65+
```
66+
67+
#### Syntax
68+
69+
`res = ` [[stdlib_intrinsics(module):fsum_kahan(interface)]] ` (x [,mask] )`
70+
71+
#### Status
72+
73+
Experimental
74+
75+
#### Class
76+
77+
Pure function.
78+
79+
#### Argument(s)
80+
81+
`x`: 1D array of either `real` or `complex` type. This argument is `intent(in)`.
82+
`mask` (optional): 1D array of `logical` values. This argument is `intent(in)`.
83+
84+
#### Output value or Result value
85+
86+
The output is a scalar of `type` and `kind` same as to that of `x`.
87+
88+
#### Example
89+
90+
```fortran
91+
{!example/math/example_intrinsics_sum.f90!}
92+
```
93+
94+
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->
95+
### `fprod` function
96+
97+
#### Description
98+
99+
The `fprod` function can replace the intrinsic `dot_product` for 1D `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential as well as reducing the round-off error. This procedure is recommended when crunching large arrays, for repetitive products of smaller arrays consider the classical `dot_product`.
100+
101+
#### Syntax
102+
103+
`res = ` [[stdlib_intrinsics(module):fprod(interface)]] ` (x, y)`
104+
105+
#### Status
106+
107+
Experimental
108+
109+
#### Class
110+
111+
Pure function.
112+
113+
#### Argument(s)
114+
115+
`x`: 1D array of either `real` or `complex` type. This argument is `intent(in)`.
116+
`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`.
117+
118+
#### Output value or Result value
119+
120+
The output is a scalar of `type` and `kind` same as to that of `x` and `y`.
121+
122+
#### Example
123+
124+
```fortran
125+
{!example/math/example_intrinsics_dot_duct.f90!}
126+
```
127+
128+
<!-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -->
129+
### `fprod_kahan` function
130+
131+
#### Description
132+
133+
The `fprod_kahan` function can replace the intrinsic `dot_product` for 1D `real` or `complex` arrays. It follows a chunked implementation which maximizes vectorization potential , complemented by the same `elemental` kernel based on the [kahan summation](https://en.wikipedia.org/wiki/Kahan_summation_algorithm) used for `fsum` to reduce the round-off error.
134+
135+
#### Syntax
136+
137+
`res = ` [[stdlib_intrinsics(module):fprod_kahan(interface)]] ` (x, y)`
138+
139+
#### Status
140+
141+
Experimental
142+
143+
#### Class
144+
145+
Pure function.
146+
147+
#### Argument(s)
148+
149+
`x`: 1D array of either `real` or `complex` type. This argument is `intent(in)`.
150+
`y`: 1D array of the same type and kind as `x`. This argument is `intent(in)`.
151+
152+
#### Output value or Result value
153+
154+
The output is a scalar of `type` and `kind` same as to that of `x` and `y`.
155+
156+
```fortran
157+
{!example/math/example_intrinsics_dot_duct.f90!}
158+
```

example/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@ add_subdirectory(constants)
1313
add_subdirectory(error)
1414
add_subdirectory(hashmaps)
1515
add_subdirectory(hash_procedures)
16+
add_subdirectory(intrinsics)
1617
add_subdirectory(io)
1718
add_subdirectory(linalg)
1819
add_subdirectory(logger)

example/intrinsics/CMakeLists.txt

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
ADD_EXAMPLE(sum)
2+
ADD_EXAMPLE(dot_product)
Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
program example_dot_product
2+
use stdlib_kinds, only: sp
3+
use stdlib_intrinsics, only: fprod, fprod_kahan
4+
implicit none
5+
6+
real(sp), allocatable :: x(:), y(:)
7+
real(sp) :: total_prod(3)
8+
9+
allocate( x(1000), y(1000) )
10+
call random_number(x)
11+
call random_number(y)
12+
13+
total_prod(1) = dot_product(x,y) !> compiler intrinsic
14+
total_prod(2) = fprod(x,y) !> chunked summation over inner product
15+
total_prod(3) = fprod_kahan(x,y) !> chunked kahan summation over inner product
16+
print *, total_prod(1:3)
17+
18+
end program example_dot_product

example/intrinsics/example_sum.f90

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
program example_sum
2+
use stdlib_kinds, only: sp
3+
use stdlib_intrinsics, only: fsum, fsum_kahan
4+
implicit none
5+
6+
real(sp), allocatable :: x(:)
7+
real(sp) :: total_sum(3)
8+
9+
allocate( x(1000) )
10+
call random_number(x)
11+
12+
total_sum(1) = sum(x) !> compiler intrinsic
13+
total_sum(2) = fsum(x) !> chunked summation
14+
total_sum(3) = fsum_kahan(x)!> chunked kahan summation
15+
print *, total_sum(1:3)
16+
17+
end program example_sum

0 commit comments

Comments
 (0)