Commit 01f7b4f

[skip-ci] NamedArray: Add lazy indexing array refactoring plan (#8775)
1 parent 8e1dfcf commit 01f7b4f

1 file changed: +33 -0 lines changed

design_notes/named_array_design_doc.md

Lines changed: 33 additions & 0 deletions
@@ -87,6 +87,39 @@ The named-array package is designed to be interoperable with other scientific Py
Further implementation details are in Appendix: [Implementation Details](#appendix-implementation-details).

## Plan for decoupling lazy indexing functionality from NamedArray

Today's implementation of Xarray's lazy indexing functionality uses three private objects: `*Indexer`, `*IndexingAdapter`, and `*Array`.
These objects are needed for three reasons:
1. We need to translate from Xarray (NamedArray) indexing rules to bare array indexing rules.
   - `*Indexer` objects track the type of indexing: basic, orthogonal, or vectorized.
2. Not all arrays support the same indexing rules, so we need `*IndexingAdapter` objects.
   1. Indexing adapters today implement `__getitem__` and use the type of the `*Indexer` object to do the appropriate conversions.
3. We also want to support lazy indexing of on-disk arrays.
   1. These again support different types of indexing, so we have `explicit_indexing_adapter`, which understands `*Indexer` objects.
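
To make the division of labour above concrete, here is a minimal pure-Python sketch of how the type of an indexer object can select the conversion inside an adapter's `__getitem__`. The three indexer classes below are simplified stand-ins that only borrow the naming pattern, and `NestedListAdapter` is a hypothetical adapter invented for illustration. None of this is Xarray's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasicIndexer:
    """Key of integers (only ints in this sketch), plain array semantics."""
    key: tuple


@dataclass(frozen=True)
class OuterIndexer:
    """Key of index lists, applied independently along each axis."""
    key: tuple


@dataclass(frozen=True)
class VectorizedIndexer:
    """Key of equal-length index lists, paired up element by element."""
    key: tuple


class NestedListAdapter:
    """Hypothetical adapter around a 2-D list of lists (not an Xarray class)."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, indexer):
        # The indexer's type, not the raw key, selects the conversion.
        if isinstance(indexer, BasicIndexer):
            i, j = indexer.key
            return self.rows[i][j]
        if isinstance(indexer, OuterIndexer):
            row_idx, col_idx = indexer.key
            return [[self.rows[i][j] for j in col_idx] for i in row_idx]
        if isinstance(indexer, VectorizedIndexer):
            row_idx, col_idx = indexer.key
            return [self.rows[i][j] for i, j in zip(row_idx, col_idx)]
        raise TypeError(f"unsupported indexer: {type(indexer).__name__}")


adapter = NestedListAdapter([[0, 1, 2], [10, 11, 12], [20, 21, 22]])
basic = adapter[BasicIndexer((1, 2))]                       # single element
outer = adapter[OuterIndexer(([0, 2], [0, 1]))]             # 2x2 block
vectorized = adapter[VectorizedIndexer(([0, 2], [0, 1]))]   # two points
```

The same key `([0, 2], [0, 1])` yields a 2x2 block under orthogonal indexing and two individual points under vectorized indexing, which is why the indexing type has to be tracked separately from the key.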

### Goals
1. We would like to keep the lazy indexing array objects and backend array objects within Xarray. Thus NamedArray cannot treat these objects specially.
2. A key source of confusion (and coupling) is that lazy indexing arrays and indexing adapters both handle `*Indexer` objects and both subclass `ExplicitlyIndexedNDArrayMixin`. These are, however, conceptually different.

### Proposal

1. The `NumpyIndexingAdapter`, `DaskIndexingAdapter`, and `ArrayApiIndexingAdapter` classes will need to migrate to the NamedArray project, since we will want to support indexing of NumPy, Dask, and array-API arrays appropriately.
2. The `as_indexable` function, which wraps an array with the appropriate adapter, will also migrate over to NamedArray.
3. Lazy indexing arrays will implement `__getitem__` for basic indexing, `.oindex` for orthogonal indexing, and `.vindex` for vectorized indexing.
4. IndexingAdapter classes will similarly implement `__getitem__`, `.oindex`, and `.vindex`.
5. `NamedArray.__getitem__` (and `__setitem__`) will still use `*Indexer` objects internally (e.g. in `NamedArray._broadcast_indexes`), but will use `.oindex` and `.vindex` on the underlying indexing adapters.
6. We will move the `*Indexer` and `*IndexingAdapter` classes to NamedArray. These will be considered private in the long term.
7. `as_indexable` will no longer special-case `ExplicitlyIndexed` objects (we can special-case a new `IndexingAdapter` mixin class that will be private to NamedArray). To handle Xarray's lazy indexing arrays, we will introduce a new `ExplicitIndexingAdapter`, which will wrap any array that implements either `.oindex` or `.vindex`.
   1. This will be the last case in the if-chain; that is, we will try to wrap with all other `IndexingAdapter` objects before using `ExplicitIndexingAdapter` as a fallback. This adapter will be used for the lazy indexing arrays and backend arrays.
   2. As with other indexing adapters (point 4 above), this `ExplicitIndexingAdapter` will only implement `__getitem__` and will understand `*Indexer` objects.
8. For backwards compatibility with external backends, we will have to gracefully deprecate `indexing.explicit_indexing_adapter`, which translates from Xarray's indexing rules to the indexing supported by the backend.
   1. We could split `explicit_indexing_adapter` into three:
      - `basic_indexing_adapter`, `outer_indexing_adapter`, and `vectorized_indexing_adapter` for public use.
   2. Implement fallback `.oindex` and `.vindex` properties on the `BackendArray` base class. These will simply rewrap the `key` tuple with the appropriate `*Indexer` object and pass it on to `__getitem__` or `__setitem__`. These methods will also raise `DeprecationWarning` so that external backends will know to migrate to `.oindex` and `.vindex` over the next year.
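
Items 3, 4, and 7 can be illustrated with a small pure-Python sketch: an object exposing `__getitem__`, `.oindex`, and `.vindex`, and an `as_indexable` if-chain that tries known array types first and falls back to an explicit adapter last. The helper `_IndexCallable`, the `LazyRows` class, and `ListIndexingAdapter` are hypothetical names made up for this example. The real `as_indexable` dispatches on NumPy, Dask, and array-API types, which are replaced here by a plain `list` check so the example stays self-contained.

```python
class _IndexCallable:
    """Hypothetical helper so that `obj.oindex[key]` calls a bound getter."""

    def __init__(self, getter):
        self._getter = getter

    def __getitem__(self, key):
        return self._getter(key)


class LazyRows:
    """Hypothetical 2-D array-like exposing the three indexing entry points."""

    def __init__(self, rows):
        self._rows = rows

    def __getitem__(self, key):
        # Basic indexing (integers only in this sketch).
        i, j = key
        return self._rows[i][j]

    def _oindex_get(self, key):
        # Orthogonal indexing: each axis is indexed independently.
        row_idx, col_idx = key
        return [[self._rows[i][j] for j in col_idx] for i in row_idx]

    def _vindex_get(self, key):
        # Vectorized indexing: index lists are paired element by element.
        row_idx, col_idx = key
        return [self._rows[i][j] for i, j in zip(row_idx, col_idx)]

    @property
    def oindex(self):
        return _IndexCallable(self._oindex_get)

    @property
    def vindex(self):
        return _IndexCallable(self._vindex_get)


class ListIndexingAdapter:
    """Stand-in for a library-specific adapter such as `NumpyIndexingAdapter`."""

    def __init__(self, array):
        self.array = array


class ExplicitIndexingAdapter:
    """Fallback wrapper for any object that has `.oindex` or `.vindex`."""

    def __init__(self, array):
        self.array = array


def as_indexable(array):
    # Known array types are tried first; the explicit adapter is the last case.
    if isinstance(array, list):
        return ListIndexingAdapter(array)
    if hasattr(array, "oindex") or hasattr(array, "vindex"):
        return ExplicitIndexingAdapter(array)
    raise TypeError(f"cannot index {type(array).__name__}")


lazy = LazyRows([[0, 1, 2], [10, 11, 12]])
basic = lazy[1, 2]
outer = lazy.oindex[[0, 1], [0, 2]]
vectorized = lazy.vindex[[0, 1], [0, 2]]
wrapped = as_indexable(lazy)
```

With this shape, the caller chooses the indexing semantics by choosing the entry point, and `as_indexable` only needs duck typing (the presence of `.oindex` or `.vindex`) rather than knowledge of Xarray's lazy array classes.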

The most uncertain piece here is maintaining backward compatibility with external backends. We should first migrate a single internal backend and test out the proposed approach.
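
As a starting point for that experiment, the fallback described in item 8.2 might look roughly like the sketch below. It is an assumption-laden illustration, not the planned implementation: `BackendArray` here is a simplified stand-in for the real base class, `_Rewrap` and `LegacyBackendArray` are hypothetical names, the indexer classes are minimal placeholders, and only `__getitem__` is covered (`__setitem__` is omitted).

```python
import warnings


class OuterIndexer:
    """Minimal placeholder for an orthogonal-indexing key wrapper."""

    def __init__(self, key):
        self.key = key


class VectorizedIndexer:
    """Minimal placeholder for a vectorized-indexing key wrapper."""

    def __init__(self, key):
        self.key = key


class _Rewrap:
    """Hypothetical helper: `obj.oindex[key]` becomes `obj[Indexer(key)]`."""

    def __init__(self, array, indexer_cls, name):
        self._array = array
        self._indexer_cls = indexer_cls
        self._name = name

    def __getitem__(self, key):
        # Tell the backend author that the fallback path is temporary.
        warnings.warn(
            f"{type(self._array).__name__} uses the fallback .{self._name}; "
            f"implement .{self._name} directly",
            DeprecationWarning,
            stacklevel=2,
        )
        key = key if isinstance(key, tuple) else (key,)
        return self._array[self._indexer_cls(key)]


class BackendArray:
    """Simplified stand-in for the backend base class."""

    @property
    def oindex(self):
        return _Rewrap(self, OuterIndexer, "oindex")

    @property
    def vindex(self):
        return _Rewrap(self, VectorizedIndexer, "vindex")


class LegacyBackendArray(BackendArray):
    """Hypothetical external backend that only understands indexer objects."""

    def __init__(self, values):
        self._values = values

    def __getitem__(self, indexer):
        (idx,) = indexer.key
        # Return the indexer type too, to show what the backend received.
        return type(indexer).__name__, [self._values[i] for i in idx]


legacy = LegacyBackendArray([5, 6, 7])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = legacy.oindex[[0, 2]]
```

An unmigrated backend keeps working because its existing `__getitem__` still receives an indexer object, while the warning signals that it should eventually provide `.oindex` and `.vindex` itself.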

## Project Timeline and Milestones

We have identified the following milestones for the completion of this project:
