
Commit 95608b2

feat: add stats/strided/dcovarmtk
---
type: pre_commit_static_analysis_report
description: Results of running static analysis checks when committing changes.
report:
  - task: lint_filenames
    status: passed
  - task: lint_editorconfig
    status: passed
  - task: lint_markdown
    status: passed
  - task: lint_package_json
    status: passed
  - task: lint_repl_help
    status: passed
  - task: lint_javascript_src
    status: passed
  - task: lint_javascript_cli
    status: na
  - task: lint_javascript_examples
    status: passed
  - task: lint_javascript_tests
    status: passed
  - task: lint_javascript_benchmarks
    status: passed
  - task: lint_python
    status: na
  - task: lint_r
    status: na
  - task: lint_c_src
    status: passed
  - task: lint_c_examples
    status: passed
  - task: lint_c_benchmarks
    status: passed
  - task: lint_c_tests_fixtures
    status: na
  - task: lint_shell
    status: na
  - task: lint_typescript_declarations
    status: passed
  - task: lint_typescript_tests
    status: passed
  - task: lint_license_headers
    status: passed
---
1 parent 97af938 commit 95608b2

33 files changed

+3923
-0
lines changed
Lines changed: 386 additions & 0 deletions
@@ -0,0 +1,386 @@
1+
<!--
2+
3+
@license Apache-2.0
4+
5+
Copyright (c) 2025 The Stdlib Authors.
6+
7+
Licensed under the Apache License, Version 2.0 (the "License");
8+
you may not use this file except in compliance with the License.
9+
You may obtain a copy of the License at
10+
11+
http://www.apache.org/licenses/LICENSE-2.0
12+
13+
Unless required by applicable law or agreed to in writing, software
14+
distributed under the License is distributed on an "AS IS" BASIS,
15+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
16+
See the License for the specific language governing permissions and
17+
limitations under the License.
18+
19+
-->
20+
21+
<!-- lint disable maximum-heading-length -->
22+
23+
# dcovarmtk
24+
25+
> Calculate the [covariance][covariance] of two double-precision floating-point strided arrays provided known means and using a one-pass textbook algorithm.
26+
27+
<section class="intro">
28+
29+
The population [covariance][covariance] of a finite population of size `N` is given by
30+
31+
<!-- <equation class="equation" label="eq:population_covariance" align="center" raw="\operatorname{\mathrm{cov_N}} = \frac{1}{N} \sum_{i=0}^{N-1} (x_i - \mu_x)(y_i - \mu_y)" alt="Equation for the population covariance."> -->
32+
33+
```math
34+
\mathop{\mathrm{cov_N}} = \frac{1}{N} \sum_{i=0}^{N-1} (x_i - \mu_x)(y_i - \mu_y)
35+
```
36+
37+
<!-- </equation> -->
38+
39+
where the population means are given by
40+
41+
<!-- <equation class="equation" label="eq:population_mean_for_x" align="center" raw="\mu_x = \frac{1}{N} \sum_{i=0}^{N-1} x_i" alt="Equation for the population mean for first array."> -->
42+
43+
```math
44+
\mu_x = \frac{1}{N} \sum_{i=0}^{N-1} x_i
45+
```
46+
47+
<!-- </equation> -->
48+
49+
and
50+
51+
<!-- <equation class="equation" label="eq:population_mean_for_y" align="center" raw="\mu_y = \frac{1}{N} \sum_{i=0}^{N-1} y_i" alt="Equation for the population mean for second array."> -->
52+
53+
```math
54+
\mu_y = \frac{1}{N} \sum_{i=0}^{N-1} y_i
55+
```
56+
57+
<!-- </equation> -->
58+
59+
Often in the analysis of data, the true population [covariance][covariance] is not known _a priori_ and must be estimated from samples drawn from population distributions. Applying the formula for the population [covariance][covariance] to a sample yields a **biased sample covariance**. To compute an **unbiased sample covariance** for a sample of size `n`,
60+
61+
<!-- <equation class="equation" label="eq:unbiased_sample_covariance" align="center" raw="\operatorname{\mathrm{cov_n}} = \frac{1}{n-1} \sum_{i=0}^{n-1} (x_i - \bar{x}_n)(y_i - \bar{y}_n)" alt="Equation for computing an unbiased sample covariance."> -->
62+
63+
```math
64+
\mathop{\mathrm{cov_n}} = \frac{1}{n-1} \sum_{i=0}^{n-1} (x_i - \bar{x}_n)(y_i - \bar{y}_n)
65+
```
66+
67+
<!-- </equation> -->
68+
69+
where sample means are given by
70+
71+
<!-- <equation class="equation" label="eq:sample_mean_for_x" align="center" raw="\bar{x} = \frac{1}{n} \sum_{i=0}^{n-1} x_i" alt="Equation for the sample mean for first array."> -->
72+
73+
```math
74+
\bar{x} = \frac{1}{n} \sum_{i=0}^{n-1} x_i
75+
```
76+
77+
<!-- </equation> -->
78+
79+
and
80+
81+
<!-- <equation class="equation" label="eq:sample_mean_for_y" align="center" raw="\bar{y} = \frac{1}{n} \sum_{i=0}^{n-1} y_i" alt="Equation for the sample mean for second array."> -->
82+
83+
```math
84+
\bar{y} = \frac{1}{n} \sum_{i=0}^{n-1} y_i
85+
```
86+
87+
<!-- </equation> -->
88+
89+
The use of the term `n-1` is commonly referred to as Bessel's correction. Depending on the characteristics of the population distributions, other correction factors (e.g., `n-1.5`, `n+1`, etc.) can yield better estimators.
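
As a minimal sketch of how the divisor changes the estimate, the following plain JavaScript computes both quantities for a small data set. It does not use this package; the helper names `mean` and `covariance` are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper: arithmetic mean of an array-like object.
function mean( arr ) {
    var s = 0.0;
    var i;
    for ( i = 0; i < arr.length; i++ ) {
        s += arr[ i ];
    }
    return s / arr.length;
}

// Hypothetical helper: covariance using the divisor `n - c`.
function covariance( x, y, c ) {
    var mx = mean( x );
    var my = mean( y );
    var s = 0.0;
    var i;
    for ( i = 0; i < x.length; i++ ) {
        s += ( x[ i ] - mx ) * ( y[ i ] - my );
    }
    return s / ( x.length - c );
}

var x = [ 1.0, -2.0, 2.0 ];
var y = [ 2.0, -2.0, 1.0 ];

// Divisor `n`: treats the data as an entire population.
var popcov = covariance( x, y, 0 );

// Divisor `n-1`: applies Bessel's correction.
var samplecov = covariance( x, y, 1 );

console.log( popcov );
console.log( samplecov );
```

For this data the sum of products of deviations is `69/9`, so the two results differ only by the factor `n/(n-1) = 3/2`.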
90+
91+
</section>
92+
93+
<!-- /.intro -->
94+
95+
<section class="usage">
96+
97+
## Usage
98+
99+
```javascript
100+
var dcovarmtk = require( '@stdlib/stats/strided/dcovarmtk' );
101+
```
102+
103+
#### dcovarmtk( N, correction, meanx, x, strideX, meany, y, strideY )
104+
105+
Computes the [covariance][covariance] of two double-precision floating-point strided arrays provided known means and using a one-pass textbook algorithm.
106+
107+
```javascript
108+
var Float64Array = require( '@stdlib/array/float64' );
109+
110+
var x = new Float64Array( [ 1.0, -2.0, 2.0 ] );
111+
var y = new Float64Array( [ 2.0, -2.0, 1.0 ] );
112+
113+
var v = dcovarmtk( x.length, 1, 1.0/3.0, x, 1, 1.0/3.0, y, 1 );
114+
// returns ~3.8333
115+
```
116+
117+
The function has the following parameters:
118+
119+
- **N**: number of indexed elements.
120+
- **correction**: degrees of freedom adjustment. Setting this parameter to a value other than `0` has the effect of adjusting the divisor during the calculation of the [covariance][covariance] according to `N-c` where `c` corresponds to the provided degrees of freedom adjustment. When computing the population [covariance][covariance], setting this parameter to `0` is the standard choice (i.e., the provided arrays contain data constituting entire populations). When computing the unbiased sample [covariance][covariance], setting this parameter to `1` is the standard choice (i.e., the provided arrays contain data sampled from larger populations; this is commonly referred to as Bessel's correction).
121+
- **meanx**: mean of `x`.
122+
- **x**: first input [`Float64Array`][@stdlib/array/float64].
123+
- **strideX**: stride length for `x`.
124+
- **meany**: mean of `y`.
125+
- **y**: second input [`Float64Array`][@stdlib/array/float64].
126+
- **strideY**: stride length for `y`.
127+
128+
The `N` and stride parameters determine which elements in the strided arrays are accessed at runtime. For example, to compute the [covariance][covariance] of every other element in `x` and `y`,
129+
130+
```javascript
131+
var Float64Array = require( '@stdlib/array/float64' );
132+
133+
var x = new Float64Array( [ 1.0, 2.0, 2.0, -7.0, -2.0, 3.0, 4.0, 2.0 ] );
134+
var y = new Float64Array( [ -7.0, 2.0, 2.0, 1.0, -2.0, 2.0, 3.0, 4.0 ] );
135+
136+
var v = dcovarmtk( 4, 1, 1.25, x, 2, -1.0, y, 2 );
137+
// returns 6.0
138+
```
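
To make the stride semantics concrete, here is a rough plain-JavaScript sketch of a one-pass textbook covariance over strided data. It is an illustrative stand-in, not the package's source: the names `covarmtk` and `stridedMean` are hypothetical, and the handling of negative strides (starting from the last indexed element) is an assumption about the usual strided-array convention.

```javascript
// Hypothetical helper: mean of the `N` elements selected by a stride.
function stridedMean( N, x, strideX ) {
    // Assumption: a negative stride starts from the last indexed element.
    var ix = ( strideX < 0 ) ? ( 1 - N ) * strideX : 0;
    var s = 0.0;
    var i;
    for ( i = 0; i < N; i++ ) {
        s += x[ ix ];
        ix += strideX;
    }
    return s / N;
}

// Hypothetical stand-in: accumulate products of deviations from the known means in one pass.
function covarmtk( N, correction, meanx, x, strideX, meany, y, strideY ) {
    var dN = N - correction;
    var ix;
    var iy;
    var C;
    var i;
    if ( N <= 0 || dN <= 0.0 ) {
        return NaN;
    }
    ix = ( strideX < 0 ) ? ( 1 - N ) * strideX : 0;
    iy = ( strideY < 0 ) ? ( 1 - N ) * strideY : 0;
    C = 0.0;
    for ( i = 0; i < N; i++ ) {
        C += ( x[ ix ] - meanx ) * ( y[ iy ] - meany );
        ix += strideX;
        iy += strideY;
    }
    return C / dN;
}

var x = new Float64Array( [ 1.0, 2.0, 2.0, -7.0, -2.0, 3.0, 4.0, 2.0 ] );
var y = new Float64Array( [ -7.0, 2.0, 2.0, 1.0, -2.0, 2.0, 3.0, 4.0 ] );

// With `N = 4` and a stride of `2`, only indices 0, 2, 4, and 6 are read:
var mx = stridedMean( 4, x, 2 );
var my = stridedMean( 4, y, 2 );

var v = covarmtk( 4, 1, mx, x, 2, my, y, 2 );
console.log( v );
```

Because the means are supplied by the caller, they should be the means of the indexed elements only, not of the full underlying arrays.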
139+
140+
Note that indexing is relative to the first index. To introduce an offset, use [`typed array`][mdn-typed-array] views.
141+
142+
<!-- eslint-disable stdlib/capitalized-comments -->
143+
144+
```javascript
145+
var Float64Array = require( '@stdlib/array/float64' );
146+
147+
var x0 = new Float64Array( [ 2.0, 1.0, 2.0, -2.0, -2.0, 2.0, 3.0, 4.0 ] );
148+
var y0 = new Float64Array( [ 2.0, -2.0, 2.0, 1.0, -2.0, 4.0, 3.0, 2.0 ] );
149+
150+
var x1 = new Float64Array( x0.buffer, x0.BYTES_PER_ELEMENT*1 ); // start at 2nd element
151+
var y1 = new Float64Array( y0.buffer, y0.BYTES_PER_ELEMENT*1 ); // start at 2nd element
152+
153+
var v = dcovarmtk( 4, 1, 1.25, x1, 2, 1.25, y1, 2 );
154+
// returns ~1.9167
155+
```
156+
157+
#### dcovarmtk.ndarray( N, correction, meanx, x, strideX, offsetX, meany, y, strideY, offsetY )
158+
159+
Computes the [covariance][covariance] of two double-precision floating-point strided arrays provided known means and using a one-pass textbook algorithm and alternative indexing semantics.
160+
161+
```javascript
162+
var Float64Array = require( '@stdlib/array/float64' );
163+
164+
var x = new Float64Array( [ 1.0, -2.0, 2.0 ] );
165+
var y = new Float64Array( [ 2.0, -2.0, 1.0 ] );
166+
167+
var v = dcovarmtk.ndarray( x.length, 1, 1.0/3.0, x, 1, 0, 1.0/3.0, y, 1, 0 );
168+
// returns ~3.8333
169+
```
170+
171+
The function has the following additional parameters:
172+
173+
- **offsetX**: starting index for `x`.
174+
- **offsetY**: starting index for `y`.
175+
176+
While [`typed array`][mdn-typed-array] views mandate a view offset based on the underlying buffer, the offset parameters support indexing semantics based on starting indices. For example, to calculate the [covariance][covariance] for every other element in `x` and `y` starting from the second element,
177+
178+
```javascript
179+
var Float64Array = require( '@stdlib/array/float64' );
180+
181+
var x = new Float64Array( [ 2.0, 1.0, 2.0, -2.0, -2.0, 2.0, 3.0, 4.0 ] );
182+
var y = new Float64Array( [ -7.0, 2.0, 2.0, 1.0, -2.0, 2.0, 3.0, 4.0 ] );
183+
184+
var v = dcovarmtk.ndarray( 4, 1, 1.25, x, 2, 1, 2.25, y, 2, 1 );
185+
// returns ~2.9167
186+
```
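
The relationship between view offsets and index offsets can be sketched in plain JavaScript. The function `covarmtkNdarray` below is a hypothetical stand-in written for this illustration, not the package's implementation; it reads `N` elements starting at the given offsets and advances by the given strides.

```javascript
// Hypothetical stand-in for an offset-based one-pass covariance with known means.
function covarmtkNdarray( N, correction, meanx, x, strideX, offsetX, meany, y, strideY, offsetY ) {
    var dN = N - correction;
    var ix = offsetX;
    var iy = offsetY;
    var C = 0.0;
    var i;
    if ( N <= 0 || dN <= 0.0 ) {
        return NaN;
    }
    for ( i = 0; i < N; i++ ) {
        C += ( x[ ix ] - meanx ) * ( y[ iy ] - meany );
        ix += strideX;
        iy += strideY;
    }
    return C / dN;
}

var x0 = new Float64Array( [ 2.0, 1.0, 2.0, -2.0, -2.0, 2.0, 3.0, 4.0 ] );
var y0 = new Float64Array( [ 2.0, -2.0, 2.0, 1.0, -2.0, 4.0, 3.0, 2.0 ] );

// Option 1: create views which begin at the second element and use zero offsets.
var x1 = new Float64Array( x0.buffer, x0.BYTES_PER_ELEMENT*1 );
var y1 = new Float64Array( y0.buffer, y0.BYTES_PER_ELEMENT*1 );
var v1 = covarmtkNdarray( 4, 1, 1.25, x1, 2, 0, 1.25, y1, 2, 0 );

// Option 2: keep the original arrays and pass starting indices instead.
var v2 = covarmtkNdarray( 4, 1, 1.25, x0, 2, 1, 1.25, y0, 2, 1 );

// Both options read the same elements, so the results are identical:
console.log( v1 === v2 );
```

Passing offsets avoids allocating view objects, which can matter when the same buffers are processed at many different starting positions.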
187+
188+
</section>
189+
190+
<!-- /.usage -->
191+
192+
<section class="notes">
193+
194+
## Notes
195+
196+
- If `N <= 0`, both functions return `NaN`.
197+
- If `N - c` is less than or equal to `0` (where `c` corresponds to the provided degrees of freedom adjustment), both functions return `NaN`.
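
The two conditions above can be illustrated with a small sketch. The helper `divisor` is hypothetical and only mirrors the documented guards; it is not part of this package.

```javascript
// Hypothetical helper: return the divisor `N - c`, or `NaN` when the notes above say the result is undefined.
function divisor( N, correction ) {
    var dN = N - correction;
    if ( N <= 0 || dN <= 0.0 ) {
        return NaN;
    }
    return dN;
}

var d0 = divisor( 0, 0 ); // no indexed elements
var d1 = divisor( 1, 1 ); // `N - c` equals zero
var d2 = divisor( 3, 1 ); // valid: three elements with Bessel's correction

console.log( d0 );
console.log( d1 );
console.log( d2 );
```

In particular, a single observation combined with a correction of `1` leaves zero degrees of freedom, so no sample covariance is defined.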
198+
199+
</section>
200+
201+
<!-- /.notes -->
202+
203+
<section class="examples">
204+
205+
## Examples
206+
207+
<!-- eslint no-undef: "error" -->
208+
209+
```javascript
210+
var discreteUniform = require( '@stdlib/random/array/discrete-uniform' );
211+
var dcovarmtk = require( '@stdlib/stats/strided/dcovarmtk' );
212+
213+
var opts = {
214+
'dtype': 'float64'
215+
};
216+
var x = discreteUniform( 10, -50, 50, opts );
217+
console.log( x );
218+
219+
var y = discreteUniform( 10, -50, 50, opts );
220+
console.log( y );
221+
222+
var v = dcovarmtk( x.length, 1, 0.0, x, 1, 0.0, y, 1 );
223+
console.log( v );
224+
```
225+
226+
</section>
227+
228+
<!-- /.examples -->
229+
230+
<!-- C interface documentation. -->
231+
232+
* * *
233+
234+
<section class="c">
235+
236+
## C APIs
237+
238+
<!-- Section to include introductory text. Make sure to keep an empty line after the intro `section` element and another before the `/section` close. -->
239+
240+
<section class="intro">
241+
242+
</section>
243+
244+
<!-- /.intro -->
245+
246+
<!-- C usage documentation. -->
247+
248+
<section class="usage">
249+
250+
### Usage
251+
252+
```c
253+
#include "stdlib/stats/strided/dcovarmtk.h"
254+
```
255+
256+
#### stdlib_strided_dcovarmtk( N, correction, meanx, \*X, strideX, meany, \*Y, strideY )
257+
258+
Computes the [covariance][covariance] of two double-precision floating-point strided arrays provided known means and using a one-pass textbook algorithm.
259+
260+
```c
261+
const double x[] = { 1.0, -2.0, 2.0 };
262+
const double y[] = { 2.0, -2.0, 1.0 };
263+
264+
double v = stdlib_strided_dcovarmtk( 3, 1.0, 1.0/3.0, x, 1, 1.0/3.0, y, 1 );
265+
// returns ~3.8333
266+
```
267+
268+
The function accepts the following arguments:
269+
270+
- **N**: `[in] CBLAS_INT` number of indexed elements.
271+
- **correction**: `[in] double` degrees of freedom adjustment. Setting this parameter to a value other than `0` has the effect of adjusting the divisor during the calculation of the [covariance][covariance] according to `N-c` where `c` corresponds to the provided degrees of freedom adjustment. When computing the population [covariance][covariance], setting this parameter to `0` is the standard choice (i.e., the provided arrays contain data constituting entire populations). When computing the unbiased sample [covariance][covariance], setting this parameter to `1` is the standard choice (i.e., the provided arrays contain data sampled from larger populations; this is commonly referred to as Bessel's correction).
272+
- **meanx**: `[in] double` mean of `X`.
273+
- **X**: `[in] double*` first input array.
274+
- **strideX**: `[in] CBLAS_INT` stride length for `X`.
275+
- **meany**: `[in] double` mean of `Y`.
276+
- **Y**: `[in] double*` second input array.
277+
- **strideY**: `[in] CBLAS_INT` stride length for `Y`.
278+
279+
```c
280+
double stdlib_strided_dcovarmtk( const CBLAS_INT N, const double correction, const double meanx, const double *X, const CBLAS_INT strideX, const double meany, const double *Y, const CBLAS_INT strideY );
281+
```
282+
283+
#### stdlib_strided_dcovarmtk_ndarray( N, correction, meanx, \*X, strideX, offsetX, meany, \*Y, strideY, offsetY )
284+
285+
Computes the [covariance][covariance] of two double-precision floating-point strided arrays provided known means and using a one-pass textbook algorithm and alternative indexing semantics.
286+
287+
```c
288+
const double x[] = { 1.0, -2.0, 2.0 };
289+
const double y[] = { 2.0, -2.0, 1.0 };
290+
291+
double v = stdlib_strided_dcovarmtk_ndarray( 3, 1.0, 1.0/3.0, x, 1, 0, 1.0/3.0, y, 1, 0 );
292+
// returns ~3.8333
293+
```
294+
295+
The function accepts the following arguments:
296+
297+
- **N**: `[in] CBLAS_INT` number of indexed elements.
298+
- **correction**: `[in] double` degrees of freedom adjustment. Setting this parameter to a value other than `0` has the effect of adjusting the divisor during the calculation of the [covariance][covariance] according to `N-c` where `c` corresponds to the provided degrees of freedom adjustment. When computing the population [covariance][covariance], setting this parameter to `0` is the standard choice (i.e., the provided arrays contain data constituting entire populations). When computing the unbiased sample [covariance][covariance], setting this parameter to `1` is the standard choice (i.e., the provided arrays contain data sampled from larger populations; this is commonly referred to as Bessel's correction).
299+
- **meanx**: `[in] double` mean of `X`.
300+
- **X**: `[in] double*` first input array.
301+
- **strideX**: `[in] CBLAS_INT` stride length for `X`.
302+
- **offsetX**: `[in] CBLAS_INT` starting index for `X`.
303+
- **meany**: `[in] double` mean of `Y`.
304+
- **Y**: `[in] double*` second input array.
305+
- **strideY**: `[in] CBLAS_INT` stride length for `Y`.
306+
- **offsetY**: `[in] CBLAS_INT` starting index for `Y`.
307+
308+
```c
309+
double stdlib_strided_dcovarmtk_ndarray( const CBLAS_INT N, const double correction, const double meanx, const double *X, const CBLAS_INT strideX, const CBLAS_INT offsetX, const double meany, const double *Y, const CBLAS_INT strideY, const CBLAS_INT offsetY );
310+
```
311+
312+
</section>
313+
314+
<!-- /.usage -->
315+
316+
<!-- C API usage notes. Make sure to keep an empty line after the `section` element and another before the `/section` close. -->
317+
318+
<section class="notes">
319+
320+
</section>
321+
322+
<!-- /.notes -->
323+
324+
<!-- C API usage examples. -->
325+
326+
<section class="examples">
327+
328+
### Examples
329+
330+
```c
331+
#include "stdlib/stats/strided/dcovarmtk.h"
332+
#include <stdio.h>
333+
334+
int main( void ) {
335+
// Create a strided array:
336+
const double x[] = { 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0 };
337+
338+
// Specify the number of elements:
339+
const int N = 4;
340+
341+
// Specify the stride length:
342+
const int strideX = 2;
343+
344+
// Compute the covariance of the indexed elements of `x` (1.0, 3.0, 5.0, 7.0; mean 4.0) with the same elements accessed in reverse order:
345+
double v = stdlib_strided_dcovarmtk( N, 1, 4.0, x, strideX, 4.0, x, -strideX );
346+
347+
// Print the result:
348+
printf( "covariance: %lf\n", v );
349+
}
350+
```
351+
352+
</section>
353+
354+
<!-- /.examples -->
355+
356+
</section>
357+
358+
<!-- /.c -->
359+
360+
<section class="references">
361+
362+
</section>
363+
364+
<!-- /.references -->
365+
366+
<!-- Section for related `stdlib` packages. Do not manually edit this section, as it is automatically populated. -->
367+
368+
<section class="related">
369+
370+
</section>
371+
372+
<!-- /.related -->
373+
374+
<!-- Section for all links. Make sure to keep an empty line after the `section` element and another before the `/section` close. -->
375+
376+
<section class="links">
377+
378+
[covariance]: https://en.wikipedia.org/wiki/Covariance
379+
380+
[@stdlib/array/float64]: https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/array/float64
381+
382+
[mdn-typed-array]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray
383+
384+
</section>
385+
386+
<!-- /.links -->
