Description
The Array API Standard states:
Apart from array object attributes, such as ndim, device, and dtype, all operations in this standard return arrays (or tuples of arrays).
NEP 56 argued that this was not high enough priority to consider during the NumPy 2.0 transition:
Given that array scalars implement a largely array-compatible interface, this doesn’t seem like the highest-prio item regarding array API standard compatibility (or in general).
Consequently, NumPy 2.x is not compatible with this aspect of the standard. If there are other libraries in the ecosystem that run into problems because of this, I'd like to discuss it at the summit. My main goal would be to take a poll of whether the NEP 56 assessment is accurate. If not, I'd like to understand what resolution(s) projects would support. For instance:
- Support NumPy in adding all attributes and abilities of scalars to NumPy 0d arrays, and agree to ignore the distinction between the two entirely (e.g. dependent library functions are free to return one or the other, and which they return might even be input-dependent).
- Support NumPy in working out a plan for transitioning from returning scalars to returning 0-d arrays wherever it is supposed to (ENH: No longer auto-convert array scalars to numpy scalars in ufuncs (and elsewhere?) numpy/numpy#24897).
For background, NumPy scalars (e.g. np.float64(1.)) are not instances of the fundamental NumPy array type (ndarray). Although NumPy scalars have many attributes of arrays and are accepted like 0-d arrays by many NumPy functions, they do not have all attributes required of arrays (e.g. mT) and they do not support boolean index assignment like true NumPy arrays [1].
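The differences described above can be checked directly. The following is a minimal sketch, assuming a recent NumPy; the variable names `s`, `a`, and `scalar_error` are illustrative only, and the result of the mT check may depend on the installed NumPy version, so it is printed rather than assumed.

```python
import numpy as np

s = np.float64(1.)   # NumPy scalar
a = np.asarray(1.)   # 0-d NumPy array

print(isinstance(s, np.ndarray))  # False
print(isinstance(a, np.ndarray))  # True

# Look for mT on the types rather than on the instances, because accessing
# mT on an array with fewer than two dimensions can itself raise an error.
print(hasattr(type(s), "mT"), hasattr(type(a), "mT"))

# Boolean index assignment works in place on the 0-d array ...
a[a > 0] = 2.

# ... but the scalar does not support item assignment at all.
scalar_error = None
try:
    s[s > 0] = 2.
except TypeError as exc:
    scalar_error = exc
    print("scalar:", exc)
```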
Many NumPy functions accept NumPy arrays and return NumPy scalars when the standard would require the return value to be a 0D array. For instance:
import numpy as np
x = np.arange(10.)
x[0] # scalar
np.mean(x) # scalar
y = np.asarray(1.) # array
1.*y # scalar
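To see which of these results are scalars rather than arrays, one can inspect their types. This is a small sketch; the `results` dictionary and its labels are introduced here for illustration only.

```python
import numpy as np

x = np.arange(10.)
y = np.asarray(1.)

# Map a label to the value each expression produces.
results = {
    "x[0]": x[0],
    "np.mean(x)": np.mean(x),
    "y": y,
    "1.*y": 1. * y,
}

for label, value in results.items():
    kind = "0-d array" if isinstance(value, np.ndarray) else "scalar"
    print(f"{label:12s} -> {type(value).__name__} ({kind})")
```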
SciPy frequently runs into two kinds of problems because of this:
- We try to follow an "array type in = array type out" rule. If a SciPy function forgets to explicitly cast the result to an array immediately before return, it is likely to inadvertently return a NumPy scalar.
- We sometimes rely on arrays being mutable during calculations [2]. If the result of an intermediate calculation is a NumPy scalar instead of a proper array, it needs to be explicitly cast to an array to make it mutable.
These problems can be worked around by sprinkling xp.asarray liberally throughout the codebase, but this doesn't seem like an ideal long-term solution. Rather than having all NumPy-dependent libraries work around this standard incompatibility on a line-by-line basis, I'd like to address it at the source.
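For concreteness, here is a rough sketch of what the workaround looks like in practice. `clipped_mean` is a hypothetical helper, not an actual SciPy function, and the `xp=np` parameter is only a stand-in for however a library obtains its array namespace. The sketch also assumes a backend with mutable arrays, as NumPy has.

```python
import numpy as np

def clipped_mean(x, axis=None, xp=np):
    """Hypothetical helper: twice the mean along `axis`, with negative means set to 0."""
    x = xp.asarray(x)
    # With NumPy and axis=None, this is a scalar that cannot be assigned into.
    m = xp.mean(x, axis=axis)
    # Workaround for the mutability problem: cast so the intermediate is an array.
    m = xp.asarray(m)
    m[m < 0] = 0.
    # Workaround for the return-type problem: arithmetic on a 0-d array yields
    # a scalar again, so cast once more immediately before returning.
    return xp.asarray(2. * m)

res0 = clipped_mean([-1., -3.])                      # full reduction
res1 = clipped_mean([[1., 2.], [-3., -5.]], axis=1)  # reduction along one axis
print(type(res0).__name__, res0.ndim)
print(res1)
```

Both casts are needed only for the full-reduction case; along an axis, NumPy already returns arrays, which is part of what makes the omission easy to miss.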
Footnotes
1. It might be argued that they don't need to support boolean index assignment to be considered "arrays", since the standard seems to make accommodations for immutable array types. In that case, note that the NumPy scalar types do not support dimensionality other than 0 or size other than 1. So while it is true that NumPy scalars are almost interchangeable with NumPy 0d arrays (aside from the missing attribute(s) and behaviors), I doubt that they can be considered standard-compatible according to the current wording of the standard.
2. These days, we can't really rely on mutability of array types, and we are starting to use array_api_extra.at for all in-place-like operations. But this is actually a whole other can of worms that needs to be discussed at some point.