Commit a402779

New issue from Mark Hoemmen: "Fix Mandates, Preconditions, and Complexity elements of [linalg] algorithms"
1 parent ee281a4 commit a402779

1 file changed

+349
-0
lines changed

xml/issue4137.xml

Lines changed: 349 additions & 0 deletions
@@ -0,0 +1,349 @@
1+
<?xml version='1.0' encoding='utf-8' standalone='no'?>
2+
<!DOCTYPE issue SYSTEM "lwg-issue.dtd">
3+
4+
<issue num="4137" status="New">
5+
<title>Fix <i>Mandates</i>, <i>Preconditions</i>, and <i>Complexity</i> elements of [linalg] algorithms</title>
6+
<section><sref ref="[linalg.algs.blas2]"/><sref ref="[linalg.algs.blas3]"/></section>
7+
<submitter>Mark Hoemmen</submitter>
8+
<date>08 Aug 2024</date>
9+
<priority>99</priority>
10+
11+
<discussion>
12+
<p>
13+
As <a href="https://github.com/ORNL/cpp-proposals-pub/issues/464">pointed out by Raffaele Solcà</a>
14+
(CSCS Swiss National Supercomputing Centre), some of the <i>Mandates</i>, <i>Preconditions</i>, and
15+
<i>Complexity</i> elements of some BLAS 2 and BLAS 3 algorithms in [linalg] are incorrect.
16+
</p>
17+
</discussion>
18+
19+
<resolution>
20+
<p>
21+
This wording is relative to <paper num="N4988"/>.
22+
</p>
23+
24+
<ol>
25+
26+
<li><p>Modify <sref ref="[linalg.algs.blas2.gemv]"/> as indicated:</p>
27+
28+
<blockquote class="note">
29+
<p>
30+
[<i>Drafting note</i>: This change is needed because the matrix <tt>A</tt> does not need to be square.
31+
<tt>x.extent(0)</tt> must equal <tt>A.extent(1)</tt>, while <tt>y.extent(0)</tt> must equal
32+
<tt>A.extent(0)</tt>.]
33+
</p>
34+
</blockquote>
35+
36+
<blockquote>
37+
<p>
38+
-3- <i>Mandates</i>:
39+
</p>
40+
<ol style="list-style-type: none">
41+
<li><p>(3.1) &mdash; <tt><i>possibly-multipliable</i>&lt;decltype(A), decltype(x), decltype(y)&gt;()</tt>
42+
is <tt>true</tt>, and</p></li>
43+
<li><p>(3.2) &mdash; <tt><i>possibly-addable</i>&lt;decltype(<ins>y</ins><del>x</del>), decltype(y),
44+
decltype(z)&gt;()</tt> is <tt>true</tt> for those overloads that take a <tt>z</tt> parameter.</p></li>
45+
</ol>
46+
<p>
47+
-4- <i>Preconditions</i>:
48+
</p>
49+
<ol style="list-style-type: none">
50+
<li><p>(4.1) &mdash; <tt><i>multipliable</i>(A, x, y)</tt> is <tt>true</tt>, and</p></li>
51+
<li><p>(4.2) &mdash; <tt><i>addable</i>(<ins>y</ins><del>x</del>, y, z)</tt> is <tt>true</tt>
52+
for those overloads that take a <tt>z</tt> parameter.</p></li>
53+
</ol>
54+
<p>
55+
-5- <i>Complexity</i>: &#x1d4aa;(<tt><ins>A</ins><del>x</del>.extent(0)</tt> ×
56+
<tt><ins>x</ins><del>A</del>.extent(<ins>0</ins><del>1</del>)</tt>).
57+
</p>
58+
</blockquote>
59+
60+
</li>
61+
62+
<li><p>Modify <sref ref="[linalg.algs.blas2.symv]"/> as indicated:</p>
63+
64+
<blockquote>
65+
<p>
66+
-3- <i>Mandates</i>:
67+
</p>
68+
<ol style="list-style-type: none">
69+
<li><p>(3.1) &mdash; [&hellip;]</p></li>
70+
<li><p>(3.2) &mdash; [&hellip;]</p></li>
71+
<li><p>(3.3) &mdash; <tt><i>possibly-multipliable</i>&lt;decltype(A), decltype(x), decltype(y)&gt;()</tt>
72+
is <tt>true</tt>, and</p></li>
73+
<li><p>(3.4) &mdash; <tt><i>possibly-addable</i>&lt;decltype(<ins>y</ins><del>x</del>), decltype(y),
74+
decltype(z)&gt;()</tt> is <tt>true</tt> for those overloads that take a <tt>z</tt> parameter.</p></li>
75+
</ol>
76+
<p>
77+
-4- <i>Preconditions</i>:
78+
</p>
79+
<ol style="list-style-type: none">
80+
<li><p>(4.1) &mdash; <tt>A.extent(0)</tt> equals <tt>A.extent(1)</tt>,</p></li>
81+
<li><p>(4.2) &mdash; <tt><i>multipliable</i>(A, x, y)</tt> is <tt>true</tt>, and</p></li>
82+
<li><p>(4.3) &mdash; <tt><i>addable</i>(<ins>y</ins><del>x</del>, y, z)</tt> is <tt>true</tt>
83+
for those overloads that take a <tt>z</tt> parameter.</p></li>
84+
</ol>
85+
<p>
86+
-5- <i>Complexity</i>: &#x1d4aa;(<tt><ins>A</ins><del>x</del>.extent(0)</tt> ×
87+
<tt><ins>x</ins><del>A</del>.extent(<ins>0</ins><del>1</del>)</tt>).
88+
</p>
89+
</blockquote>
90+
91+
</li>
92+
93+
<li><p>Modify <sref ref="[linalg.algs.blas2.hemv]"/> as indicated:</p>
94+
95+
<blockquote>
96+
<p>
97+
-3- <i>Mandates</i>:
98+
</p>
99+
<ol style="list-style-type: none">
100+
<li><p>(3.1) &mdash; [&hellip;]</p></li>
101+
<li><p>(3.2) &mdash; [&hellip;]</p></li>
102+
<li><p>(3.3) &mdash; <tt><i>possibly-multipliable</i>&lt;decltype(A), decltype(x), decltype(y)&gt;()</tt>
103+
is <tt>true</tt>, and</p></li>
104+
<li><p>(3.4) &mdash; <tt><i>possibly-addable</i>&lt;decltype(<ins>y</ins><del>x</del>), decltype(y),
105+
decltype(z)&gt;()</tt> is <tt>true</tt> for those overloads that take a <tt>z</tt> parameter.</p></li>
106+
</ol>
107+
<p>
108+
-4- <i>Preconditions</i>:
109+
</p>
110+
<ol style="list-style-type: none">
111+
<li><p>(4.1) &mdash; <tt>A.extent(0)</tt> equals <tt>A.extent(1)</tt>,</p></li>
112+
<li><p>(4.2) &mdash; <tt><i>multipliable</i>(A, x, y)</tt> is <tt>true</tt>, and</p></li>
113+
<li><p>(4.3) &mdash; <tt><i>addable</i>(<ins>y</ins><del>x</del>, y, z)</tt> is <tt>true</tt>
114+
for those overloads that take a <tt>z</tt> parameter.</p></li>
115+
</ol>
116+
<p>
117+
-5- <i>Complexity</i>: &#x1d4aa;(<tt><ins>A</ins><del>x</del>.extent(0)</tt> ×
118+
<tt><ins>x</ins><del>A</del>.extent(<ins>0</ins><del>1</del>)</tt>).
119+
</p>
120+
</blockquote>
121+
122+
</li>
123+
124+
<li><p>Modify <sref ref="[linalg.algs.blas2.trmv]"/> as indicated:</p>
125+
126+
<blockquote class="note">
127+
<p>
128+
[<i>Drafting note</i>: The extents compatibility conditions are expressed differently here than in the
129+
matrix-vector multiply sections above, perhaps for consistency with the TRSV section below.
130+
They look correct here. The original <i>Complexity</i> elements are technically correct,
131+
since <math><mi>A</mi></math> is square, but the changes below improve consistency with
132+
<sref ref="[linalg.algs.blas2.gemv]"/>.]
133+
</p>
134+
</blockquote>
135+
136+
<blockquote>
137+
<pre>
138+
template&lt;<i>in-matrix</i> InMat, class Triangle, class DiagonalStorage, <i>in-vector</i> InVec,
139+
<i>out-vector</i> OutVec&gt;
140+
void triangular_matrix_vector_product(InMat A, Triangle t, DiagonalStorage d, InVec x, OutVec y);
141+
template&lt;class ExecutionPolicy,
142+
<i>in-matrix</i> InMat, class Triangle, class DiagonalStorage, <i>in-vector</i> InVec,
143+
<i>out-vector</i> OutVec&gt;
144+
void triangular_matrix_vector_product(ExecutionPolicy&amp;&amp; exec,
145+
InMat A, Triangle t, DiagonalStorage d, InVec x, OutVec y);
146+
</pre>
147+
<blockquote>
148+
<p>
149+
-5- [&hellip;]
150+
<p/>
151+
-6- <i>Effects</i>: Computes <math><mi>y</mi> <mo>=</mo> <mi>A</mi><mi>x</mi></math>.
152+
<p/>
153+
-7- <i>Complexity</i>: &#x1d4aa;(<tt><ins>A</ins><del>x</del>.extent(0)</tt> ×
154+
<tt><ins>x</ins><del>A</del>.extent(<ins>0</ins><del>1</del>)</tt>).
155+
</p>
156+
</blockquote>
157+
<pre>
158+
template&lt;<i>in-matrix</i> InMat, class Triangle, class DiagonalStorage, <i>inout-vector</i> InOutVec&gt;
159+
void triangular_matrix_vector_product(InMat A, Triangle t, DiagonalStorage d, InOutVec y);
160+
template&lt;class ExecutionPolicy,
161+
<i>in-matrix</i> InMat, class Triangle, class DiagonalStorage, <i>inout-vector</i> InOutVec&gt;
162+
void triangular_matrix_vector_product(ExecutionPolicy&amp;&amp; exec,
163+
InMat A, Triangle t, DiagonalStorage d, InOutVec y);
164+
</pre>
165+
<blockquote>
166+
<p>
167+
-8- [&hellip;]
168+
<p/>
169+
-9- <i>Effects</i>: [&hellip;]
170+
<p/>
171+
-10- <i>Complexity</i>: &#x1d4aa;(<tt><ins>A</ins><del>y</del>.extent(0)</tt> ×
172+
<tt><ins>y</ins><del>A</del>.extent(<ins>0</ins><del>1</del>)</tt>).
173+
</p>
174+
</blockquote>
175+
<pre>
176+
template&lt;<i>in-matrix</i> InMat, class Triangle, class DiagonalStorage,
177+
<i>in-vector</i> InVec1, <i>in-vector</i> InVec2, <i>out-vector</i> OutVec&gt;
178+
void triangular_matrix_vector_product(InMat A, Triangle t, DiagonalStorage d,
179+
InVec1 x, InVec2 y, OutVec z);
180+
template&lt;class ExecutionPolicy,
181+
<i>in-matrix</i> InMat, class Triangle, class DiagonalStorage,
182+
<i>in-vector</i> InVec1, <i>in-vector</i> InVec2, <i>out-vector</i> OutVec&gt;
183+
void triangular_matrix_vector_product(ExecutionPolicy&amp;&amp; exec,
184+
InMat A, Triangle t, DiagonalStorage d,
185+
InVec1 x, InVec2 y, OutVec z);
186+
</pre>
187+
<blockquote>
188+
<p>
189+
-11- [&hellip;]
190+
<p/>
191+
-12- <i>Effects</i>: Computes <math><mi>z</mi> <mo>=</mo> <mi>y</mi> <mo>+</mo> <mi>A</mi><mi>x</mi></math>.
192+
<p/>
193+
-13- <i>Complexity</i>: &#x1d4aa;(<tt><ins>A</ins><del>x</del>.extent(0)</tt> ×
194+
<tt><ins>x</ins><del>A</del>.extent(<ins>0</ins><del>1</del>)</tt>).
195+
</p>
196+
</blockquote>
197+
</blockquote>
198+
199+
</li>
200+
201+
<li><p>Modify <sref ref="[linalg.algs.blas3.rankk]"/> as indicated:</p>
202+
203+
<blockquote class="note">
204+
<p>
205+
[<i>Drafting note</i>: <paper num="P3371R0"/>, to be submitted in the August 15 mailing for
206+
LEWG review, contains the same wording changes to <sref ref="[linalg.algs.blas3.rankk]"/>
207+
and <sref ref="[linalg.algs.blas3.rank2k]"/> as proposed here, with additional changes
208+
corresponding to that proposal. Please apply this LWG issue's changes first, before P3371 is merged.]
209+
</p>
210+
</blockquote>
211+
212+
<blockquote>
213+
<p>
214+
-3- <i>Mandates</i>:
215+
</p>
216+
<ol style="list-style-type: none">
217+
<li><p>(3.1) &mdash; If <tt>InOutMat</tt> has <tt>layout_blas_packed</tt> layout, then the
218+
layout's <tt>Triangle</tt> template argument has the same type as the function's
219+
<tt>Triangle</tt> template argument; <ins>and</ins></p></li>
220+
<li><p>(3.2) &mdash; <tt><ins><i>possibly-multipliable</i>&lt;decltype(A),
221+
decltype(transposed(A)), decltype(C)&gt;()</ins> <del><i>compatible-static-extents</i>&lt;decltype(A),
222+
decltype(A)&gt;(0, 1)</del></tt> is <tt>true</tt><ins>.</ins><del>;</del></p></li>
223+
<li><p><del>(3.3) &mdash; <tt><i>compatible-static-extents</i>&lt;decltype(C), decltype(C)&gt;(0, 1)</tt>
224+
is <tt>true</tt>; and</del></p></li>
225+
<li><p><del>(3.4) &mdash; <tt><i>compatible-static-extents</i>&lt;decltype(A), decltype(C)&gt;(0, 0)</tt>
226+
is <tt>true</tt>.</del></p></li>
227+
</ol>
228+
<p>
229+
-4- <i>Preconditions</i>: <ins><tt><i>multipliable</i>(A, transposed(A), C)</tt> is <tt>true</tt>.</ins>
230+
</p>
231+
<ol style="list-style-type: none">
232+
<li><p><del>(4.1) &mdash; <tt>A.extent(0)</tt> equals <tt>A.extent(1)</tt>,</del></p></li>
233+
<li><p><del>(4.2) &mdash; <tt>C.extent(0)</tt> equals <tt>C.extent(1)</tt>, and</del></p></li>
234+
<li><p><del>(4.3) &mdash; <tt>A.extent(0)</tt> equals <tt>C.extent(0)</tt>.</del></p></li>
235+
</ol>
236+
<p>
237+
-5- <i>Complexity</i>: &#x1d4aa;(<tt>A.extent(0)</tt> × <tt>A.extent(1)</tt> × <tt><ins>A</ins><del>C</del>.extent(0)</tt>).
238+
</p>
239+
</blockquote>
240+
241+
</li>
242+
243+
<li><p>Modify <sref ref="[linalg.algs.blas3.rank2k]"/> as indicated:</p>
244+
245+
<blockquote>
246+
<p>
247+
-3- <i>Mandates</i>:
248+
</p>
249+
<ol style="list-style-type: none">
250+
<li><p>(3.1) &mdash; If <tt>InOutMat</tt> has <tt>layout_blas_packed</tt> layout, then the
251+
layout's <tt>Triangle</tt> template argument has the same type as the function's
252+
<tt>Triangle</tt> template argument;</p></li>
253+
<li><p>(3.2) &mdash; <tt><ins><i>possibly-multipliable</i>&lt;decltype(A),
254+
decltype(transposed(B)), decltype(C)&gt;()</ins> <del><i>possibly-addable</i>&lt;decltype(A),
255+
decltype(B), decltype(C)&gt;()</del></tt>
256+
is <tt>true</tt>; and</p></li>
257+
<li><p>(3.3) &mdash; <tt><ins><i>possibly-multipliable</i>&lt;decltype(B),
258+
decltype(transposed(A)), decltype(C)&gt;()</ins> <del><i>compatible-static-extents</i>&lt;decltype(A),
259+
decltype(A)&gt;(0, 1)</del></tt> is <tt>true</tt>.</p></li>
260+
</ol>
261+
<p>
262+
-4- <i>Preconditions</i>:
263+
</p>
264+
<ol style="list-style-type: none">
265+
<li><p>(4.1) &mdash; <tt><ins><i>multipliable</i>(A, transposed(B), C)</ins>
266+
<del><i>addable</i>(A, B, C)</del></tt> is <tt>true</tt>, and</p></li>
267+
<li><p>(4.2) &mdash; <ins><tt><i>multipliable</i>(B, transposed(A), C)</tt> is <tt>true</tt></ins>
268+
<del><tt>A.extent(0)</tt> equals <tt>A.extent(1)</tt></del>.</p></li>
269+
</ol>
270+
<p>
271+
-5- <i>Complexity</i>: &#x1d4aa;(<tt>A.extent(0)</tt> × <tt>A.extent(1)</tt> × <tt><ins>B</ins><del>C</del>.extent(0)</tt>).
272+
</p>
273+
</blockquote>
274+
275+
</li>
276+
277+
<li><p>Modify <sref ref="[linalg.algs.blas3.trsm]"/> as indicated:</p>
278+
279+
<blockquote class="note">
280+
<p>
281+
[<i>Drafting note</i>: Nothing is wrong here, but it is preferable for the complexity clauses to depend
282+
only on input parameters where possible.]
283+
</p>
284+
</blockquote>
285+
286+
<blockquote>
287+
<pre>
288+
template&lt;<i>in-matrix</i> InMat1, class Triangle, class DiagonalStorage,
289+
<i>in-matrix</i> InMat2, <i>out-matrix</i> OutMat, class BinaryDivideOp&gt;
290+
void triangular_matrix_matrix_left_solve(InMat1 A, Triangle t, DiagonalStorage d,
291+
InMat2 B, OutMat X, BinaryDivideOp divide);
292+
template&lt;class ExecutionPolicy,
293+
<i>in-matrix</i> InMat1, class Triangle, class DiagonalStorage,
294+
<i>in-matrix</i> InMat2, <i>out-matrix</i> OutMat, class BinaryDivideOp&gt;
295+
void triangular_matrix_matrix_left_solve(ExecutionPolicy&amp;&amp; exec,
296+
InMat1 A, Triangle t, DiagonalStorage d,
297+
InMat2 B, OutMat X, BinaryDivideOp divide);
298+
</pre>
299+
<blockquote>
300+
<p>
301+
[&hellip;]
302+
<p/>
303+
-6- <i>Complexity</i>: &#x1d4aa;(<tt>A.extent(0)</tt> × <tt><ins>B</ins><del>X</del>.extent(1)</tt> × <tt><ins>B</ins><del>X</del>.extent(1)</tt>).
304+
</p>
305+
</blockquote>
306+
307+
</blockquote>
308+
309+
</li>
310+
311+
312+
<li><p>Modify <sref ref="[linalg.algs.blas3.inplacetrsm]"/> as indicated:</p>
313+
314+
<blockquote class="note">
315+
<p>
316+
[<i>Drafting note</i>: Nothing is wrong here, but it is preferable for the complexity clauses to depend
317+
only on input parameters where possible.]
318+
</p>
319+
</blockquote>
320+
321+
<blockquote>
322+
<pre>
323+
template&lt;<i>in-matrix</i> InMat, class Triangle, class DiagonalStorage,
324+
<i>inout-matrix</i> InOutMat, class BinaryDivideOp&gt;
325+
void triangular_matrix_matrix_right_solve(InMat A, Triangle t, DiagonalStorage d,
326+
InOutMat B, BinaryDivideOp divide);
327+
template&lt;class ExecutionPolicy,
328+
<i>in-matrix</i> InMat, class Triangle, class DiagonalStorage,
329+
<i>inout-matrix</i> InOutMat, class BinaryDivideOp&gt;
330+
void triangular_matrix_matrix_right_solve(ExecutionPolicy&amp;&amp; exec,
331+
InMat A, Triangle t, DiagonalStorage d,
332+
InOutMat B, BinaryDivideOp divide);
333+
</pre>
334+
<blockquote>
335+
<p>
336+
[&hellip;]
337+
<p/>
338+
-13- <i>Complexity</i>: &#x1d4aa;(<tt><ins>B</ins><del>A</del>.extent(0)</tt> ×
339+
<tt>A.extent(<ins>0</ins><del>1</del>)</tt> × <tt><ins>A</ins><del>B</del>.extent(1)</tt>).
340+
</p>
341+
</blockquote>
342+
343+
</blockquote>
344+
345+
</li>
346+
</ol>
347+
</resolution>
348+
349+
</issue>
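
The GEMV drafting note in the issue can be illustrated with a small sketch. It assumes a hypothetical row-major `Matrix` struct and a hand-written `gemv` function; neither is part of `std::linalg`, and the shape checks only mirror the intent of the *multipliable* precondition. For a non-square `A` with 2 rows and 3 columns, the operation count is `A.extent(0)` × `x.extent(0)` = 6, whereas the old expression `x.extent(0)` × `A.extent(1)` would give 9.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal row-major matrix; not a std::linalg or mdspan type.
struct Matrix {
    std::size_t rows;          // plays the role of A.extent(0)
    std::size_t cols;          // plays the role of A.extent(1)
    std::vector<double> data;  // rows * cols entries, row-major

    double operator()(std::size_t i, std::size_t j) const {
        return data[i * cols + j];
    }
};

// Computes y = A * x and returns the number of multiply-adds performed.
// Illustrative only; the name and signature are assumptions.
std::size_t gemv(const Matrix& A, const std::vector<double>& x,
                 std::vector<double>& y) {
    // Analogue of multipliable(A, x, y):
    // x must match A's columns, y must match A's rows.
    assert(x.size() == A.cols);
    assert(y.size() == A.rows);

    std::size_t ops = 0;
    for (std::size_t i = 0; i < A.rows; ++i) {        // A.extent(0) iterations
        double sum = 0.0;
        for (std::size_t j = 0; j < x.size(); ++j) {  // x.extent(0) iterations
            sum += A(i, j) * x[j];
            ++ops;
        }
        y[i] = sum;
    }
    return ops;
}
```

Calling `gemv` with `Matrix A{2, 3, {1, 2, 3, 4, 5, 6}}`, an `x` of three ones, and a `y` of size 2 performs 6 multiply-adds, which matches the corrected *Complexity* expression rather than the original one.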
