Commit b256e2b

timedelta64 (#12)
* add timedelta64 data type
* clarify step size lower bound
* prose
* prose
* lint and prose and typos
* use scale factor consistently
* update fill value section
* reflow text
* fix typo
* use numpy prefix
1 parent 49602bf commit b256e2b

2 files changed

+151
-0
lines changed

Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
# numpy.timedelta64 data type

This document defines a Zarr data type to model the `timedelta64` data type from NumPy.
The `timedelta64` data type represents signed temporal durations.

## Background

`numpy.timedelta64` is based on a data type with the same name defined in [NumPy](https://NumPy.org/).
To provide necessary context, this document first describes how `timedelta64` works in NumPy before
detailing its specification in Zarr.

The following references to NumPy are based on version 2.2 of that library.

NumPy defines a data type called `"timedelta64"` to represent signed temporal durations.
These durations arise when taking a difference between moments in time.
NumPy models moments in time with a related data type called `"datetime64"`.
Both data types are described in the [NumPy documentation](https://NumPy.org/doc/stable/reference/arrays.datetime.html),
which should be considered authoritative.

`timedelta64` data types are parametrized by a physical unit of duration, like seconds or minutes,
and a positive integral scale factor. For example, given a `timedelta64` data type defined with a
unit of seconds and a scale factor of 10, the scalar value `1` in that data type represents a duration of
10 seconds.

NumPy represents `timedelta64` scalars with 64-bit signed integers. Negative values are permitted.
The smallest 64-bit signed integer, i.e., `-2^63`, represents a non-duration value called
"Not a Time", or `NaT`. The `NaT` value serves a role similar to the "Not a Number" value defined in
some floating point data types.

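As a minimal sketch of the semantics described above, the following plain-Python snippet interprets a raw 64-bit integer under a given scale factor. The names `NAT` and `duration_in_units` are hypothetical helpers introduced for illustration; they are not part of NumPy or Zarr.

```python
from typing import Optional

# The smallest 64-bit signed integer is reserved for NaT ("Not a Time").
NAT = -(2**63)


def duration_in_units(raw: int, scale_factor: int) -> Optional[int]:
    """Return the duration, counted in the data type's unit, for a raw integer.

    Returns None when the raw value is the NaT sentinel.
    """
    if not -(2**63) <= raw <= 2**63 - 1:
        raise ValueError("raw value does not fit in a 64-bit signed integer")
    if raw == NAT:
        return None
    return raw * scale_factor


# With a unit of seconds and a scale factor of 10, the scalar 1 means 10 seconds.
print(duration_in_units(1, 10))    # 10
print(duration_in_units(-3, 10))   # -30
print(duration_in_units(NAT, 10))  # None
```

Returning `None` for `NaT` is only one possible convention; the point is that the sentinel must be checked before the scale factor is applied.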
### NumPy data type parameters

#### Scale factor
The NumPy `timedelta64` data type takes a scale factor. It must be an integer in the range
`[1, 2147483647]`, i.e., `[1, 2^31 - 1]`.

While it is possible to construct a NumPy `timedelta64` data type with a scale factor of `0`,
NumPy will automatically normalize this to `1`.

#### Unit
The NumPy `timedelta64` data type takes a unit parameter, which must be one of the following
temporal units:

| Identifier | Meaning     |
|------------|-------------|
| Y          | year        |
| M          | month       |
| W          | week        |
| D          | day         |
| h          | hour        |
| m          | minute      |
| s          | second      |
| ms         | millisecond |
| us         | microsecond |
| μs         | microsecond |
| ns         | nanosecond  |
| ps         | picosecond  |
| fs         | femtosecond |
| as         | attosecond  |

> Note: "us" and "μs" are treated as equivalent by NumPy.

> Note: NumPy permits the creation of `timedelta64` data types with an unspecified unit. In this
case, the unit is set to the special value `"generic"`.

#### Endianness

The NumPy `timedelta64` data type takes a byte order parameter, which must be either little-endian
or big-endian.

## Data type representation

### Name

The name of this data type is the string `"numpy.timedelta64"`.

### Configuration

This data type requires a configuration. The configuration for this data type is a JSON object with
the following fields:

| field name | type | required | notes |
|------------|------|----------|-------|
| `"unit"` | one of: `"Y"`, `"M"`, `"W"`, `"D"`, `"h"`, `"m"`, `"s"`, `"ms"`, `"us"`, `"μs"`, `"ns"`, `"ps"`, `"fs"`, `"as"`, `"generic"` | yes | None |
| `"scale_factor"` | `integer` | yes | The number must represent an integer from the inclusive range `[1, 2147483647]` |

> Note: the NumPy `timedelta64` data type is parametrized by an endianness (little or big), but the
Zarr `numpy.timedelta64` data type is not. In Zarr, the endianness of `numpy.timedelta64` arrays is determined
by the configuration of the codecs defined in metadata and is thus not part of the data type configuration.

> Note: as per NumPy, `"us"` and `"μs"` are equivalent and interchangeable representations of
microseconds.

No additional fields are permitted in the configuration.

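The rules above could be checked with a small validator along the following lines. This is a sketch using only the Python standard library; `validate_configuration`, `ALLOWED_UNITS`, and `MAX_SCALE_FACTOR` are hypothetical names, and the decision to reject Python `bool` values is an assumption of this sketch rather than something the specification states.

```python
ALLOWED_UNITS = {
    "Y", "M", "W", "D", "h", "m", "s",
    "ms", "us", "μs", "ns", "ps", "fs", "as", "generic",
}
MAX_SCALE_FACTOR = 2**31 - 1  # 2147483647


def validate_configuration(config: dict) -> None:
    """Raise ValueError if `config` violates the rules listed above."""
    # Both fields are required and no additional fields are permitted.
    if set(config) != {"unit", "scale_factor"}:
        raise ValueError("configuration must have exactly 'unit' and 'scale_factor'")

    unit = config["unit"]
    if not isinstance(unit, str) or unit not in ALLOWED_UNITS:
        raise ValueError(f"unsupported unit: {unit!r}")

    scale = config["scale_factor"]
    # bool is a subclass of int in Python, so it is excluded explicitly (assumption).
    if isinstance(scale, bool) or not isinstance(scale, int):
        raise ValueError("scale_factor must be an integer")
    if not 1 <= scale <= MAX_SCALE_FACTOR:
        raise ValueError("scale_factor must be in [1, 2147483647]")


validate_configuration({"unit": "us", "scale_factor": 10})  # passes silently
```

A configuration such as `{"unit": "s", "scale_factor": 0}` is rejected here, since the lower bound of the scale factor is 1.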
### Examples
The following is an example of the metadata for a `numpy.timedelta64` data type with a unit of
microseconds and a scale factor of 10. This configuration defines a data type equivalent to the
NumPy data type `timedelta64[10us]`:

```json
{
    "name": "numpy.timedelta64",
    "configuration": {
        "unit": "us",
        "scale_factor": 10
    }
}
```

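To make the correspondence with the NumPy data type string concrete, the sketch below builds that string from a metadata document like the one above. The helper `numpy_dtype_string` is hypothetical, and the `"generic"` unit is deliberately left out because its string form is not described in this document.

```python
import json


def numpy_dtype_string(metadata: dict) -> str:
    """Build a NumPy-style dtype string from numpy.timedelta64 metadata."""
    if metadata["name"] != "numpy.timedelta64":
        raise ValueError("not a numpy.timedelta64 data type")
    config = metadata["configuration"]
    unit = config["unit"]
    scale = config["scale_factor"]
    if unit == "generic":
        # Not covered by this sketch (assumption: handled separately).
        raise NotImplementedError("generic unit is not handled here")
    return f"timedelta64[{scale}{unit}]"


metadata = json.loads(
    '{"name": "numpy.timedelta64",'
    ' "configuration": {"unit": "us", "scale_factor": 10}}'
)
print(numpy_dtype_string(metadata))  # timedelta64[10us]
```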
## Fill value representation

For the `"fill_value"` field of array metadata, `numpy.timedelta64` scalars must be represented in one of
two forms:
- As a JSON number with no fraction or exponent part that is within the range `[-2^63, 2^63 - 1]`.
- As the string `"NaT"`, which denotes the value `NaT`.

> Note: the `NaT` value may be encoded as the JSON number `-9223372036854775808`, i.e.,
`-2^63`. That is, `"fill_value": "NaT"` and `"fill_value": -9223372036854775808` should be treated
as equivalent representations of the same scalar value (`NaT`).

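A reader of array metadata might normalize both forms to a single integer, as in the following sketch. `parse_fill_value` is a hypothetical helper. It relies on the fact that Python's `json` module decodes numbers without a fraction or exponent part as `int` and all other numbers as `float`, which lines up with the first form above.

```python
import json

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
NAT = INT64_MIN  # NaT shares its bit pattern with the smallest int64


def parse_fill_value(value) -> int:
    """Convert a decoded JSON fill value to the underlying 64-bit integer."""
    if value == "NaT":
        return NAT
    # Floats (fraction or exponent present), bools, and other strings are rejected.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"invalid fill value: {value!r}")
    if not INT64_MIN <= value <= INT64_MAX:
        raise ValueError("fill value is outside the 64-bit signed range")
    return value


from_string = parse_fill_value(json.loads('{"fill_value": "NaT"}')["fill_value"])
from_number = parse_fill_value(
    json.loads('{"fill_value": -9223372036854775808}')["fill_value"]
)
print(from_string == from_number)  # True
```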
## Codec compatibility

This data type is compatible with any codec that supports arrays of signed 64-bit integers.
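Since the stored values are ordinary signed 64-bit integers, byte order is a property of the encoded bytes rather than of the data type. The sketch below uses Python's standard `struct` module as a stand-in for a codec, purely for illustration, and shows the same values round-tripping through both byte orders.

```python
import struct

# Raw scalars; the last entry is the bit pattern reserved for NaT.
raw_values = [10, -1, -(2**63)]

little = struct.pack("<3q", *raw_values)  # little-endian int64
big = struct.pack(">3q", *raw_values)     # big-endian int64

# Both encodings decode to the same integers.
decoded_little = struct.unpack("<3q", little)
decoded_big = struct.unpack(">3q", big)

print(little[:8].hex())  # 0a00000000000000
print(big[:8].hex())     # 000000000000000a
print(decoded_little == decoded_big == tuple(raw_values))  # True
```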
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
{
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "timedelta64",
    "type": "object",
    "properties": {
        "name": {
            "const": "numpy.timedelta64"
        },
        "configuration": {
            "type": "object",
            "properties": {
                "unit": {
                    "type": "string",
                    "enum": ["Y", "M", "W", "D", "h", "m", "s", "ms", "us", "μs", "ns", "ps", "fs", "as", "generic"]
                },
                "scale_factor": {
                    "type": "integer",
                    "minimum": 1,
                    "maximum": 2147483647
                }
            },
            "required": ["unit", "scale_factor"],
            "additionalProperties": false
        }
    },
    "required": ["name", "configuration"],
    "additionalProperties": false
}