| 1 | +# numpy.datetime64 data type |
| 2 | + |
| 3 | +This document defines `numpy.datetime64`, a data type |
| 4 | +that represents moments in time relative to the Unix epoch. |
| 5 | +The `numpy.datetime64` data type closely models the `datetime64` data type from NumPy. |
| 6 | + |
| 7 | + |
| 8 | +## Background |
| 9 | + |
| 10 | +`numpy.datetime64` is based on the `datetime64` data type defined in [NumPy](https://NumPy.org/). |
| 11 | +To provide necessary context, this document first describes how `datetime64` works in NumPy before |
| 12 | +detailing how the corresponding Zarr data type is defined. |
| 13 | + |
| 14 | +The following references to NumPy are based on version 2.2 of that library. |
| 15 | + |
| 16 | +NumPy defines a data type called `"datetime64"` to represent moments in time relative to the Unix |
| 17 | +epoch. This data type is described in the [NumPy documentation](https://NumPy.org/doc/stable/reference/arrays.datetime.html), which should be considered authoritative. |
| 18 | + |
| 19 | +`datetime64` data types are parametrized by a physical unit of duration, like seconds or minutes, |
| 20 | +and a positive integral scale factor. For example, given a `datetime64` data type defined with a |
| 21 | +unit of seconds and a scale factor of 10, the scalar value `1` in that data type represents 10 seconds |
| 22 | +after the Unix epoch, i.e. 00:00:10 UTC on 1 January 1970. |
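
The scale-factor example above can be checked directly in NumPy. The following is a minimal sketch, not part of the specification; the variable names are illustrative only. It reinterprets the raw integer `1` as a `datetime64[10s]` scalar and converts it to plain seconds:

```python
import numpy as np

# A datetime64 data type with a unit of seconds and a scale factor of 10.
dtype = np.dtype("datetime64[10s]")

# np.datetime_data reports the (unit, scale factor) pair of the data type.
unit, scale_factor = np.datetime_data(dtype)

# Reinterpret the 64-bit integer 1 as a scalar of that data type.
raw = np.array([1], dtype="int64")
moment = raw.view(dtype)[0]

# Convert to a scale factor of 1 so the offset from the epoch is explicit.
as_seconds = moment.astype("datetime64[s]")
print(unit, scale_factor, as_seconds)
```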
| 23 | + |
| 24 | +NumPy represents `datetime64` scalars with 64-bit signed integers. The smallest 64-bit signed |
| 25 | +integer, i.e., `-2^63`, represents a non-temporal value called "Not a Time", or `NaT`. The `NaT` |
| 26 | +value serves a role similar to the "Not a Number" value used in floating point data types. |
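
The relationship between `NaT` and the smallest 64-bit signed integer can be observed by reinterpreting integer data, as in this short sketch. The unit of seconds is an arbitrary choice for illustration:

```python
import numpy as np

INT64_MIN = -(2**63)

# Reinterpret raw 64-bit integers as datetime64 values (unit: seconds).
raw = np.array([INT64_MIN, 0], dtype="int64")
values = raw.view("datetime64[s]")

# Only the element backed by -2^63 is reported as NaT.
is_nat = np.isnat(values)

# Going the other way: the integer stored for a NaT scalar.
nat_as_int = np.array(["NaT"], dtype="datetime64[s]").view("int64")[0]
print(is_nat, nat_as_int)
```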
| 27 | + |
| 28 | +### NumPy data type parameters |
| 29 | + |
| 30 | +#### Scale factor |
| 31 | +The NumPy `datetime64` data type takes a scale factor. It must be an integer in the range |
| 32 | +`[1, 2147483647]`, i.e., `[1, 2^31 - 1]`. |
| 33 | + |
| 34 | +While it is possible to construct a NumPy `datetime64` data type with a scale factor of `0`, |
| 35 | +NumPy will automatically normalize this value to `1`. |
| 36 | + |
| 37 | +#### Unit |
| 38 | +The NumPy `datetime64` data type takes a unit parameter, which must be one of the following temporal |
| 39 | +units: |
| 40 | + |
| 41 | +| Identifier | Meaning | |
| 42 | +|------------|----------| |
| 43 | +| Y | year | |
| 44 | +| M | month | |
| 45 | +| W | week | |
| 46 | +| D | day | |
| 47 | +| h | hour | |
| 48 | +| m | minute | |
| 49 | +| s | second | |
| 50 | +| ms | millisecond | |
| 51 | +| us | microsecond | |
| 52 | +| μs | microsecond | |
| 53 | +| ns | nanosecond | |
| 54 | +| ps | picosecond | |
| 55 | +| fs | femtosecond | |
| 56 | +| as | attosecond | |
| 57 | + |
| 58 | +> Note: "us" and "μs" are treated as equivalent by NumPy. |
| 59 | + |
| 60 | +> Note: NumPy permits the creation of `datetime64` data types with an unspecified unit. In this |
| 61 | +case, the unit is set to the special value `"generic"`. |
| 62 | + |
| 63 | +#### Endianness |
| 64 | +The NumPy `datetime64` data type takes a byte order parameter, which must be either |
| 65 | +little-endian or big-endian. |
| 66 | + |
| 67 | +## Data type representation |
| 68 | + |
| 69 | +### Name |
| 70 | + |
| 71 | +The name of this data type is the string `"numpy.datetime64"`. |
| 72 | + |
| 73 | +### Configuration |
| 74 | + |
| 75 | +This data type requires a configuration. The configuration for this data type is a JSON object with |
| 76 | +the following fields: |
| 77 | + |
| 78 | +| field name | type | required | notes | |
| 79 | +|------------|----------|---|---| |
| 80 | +| `"unit"` | one of: `"Y"`, `"M"`, `"W"`, `"D"`, `"h"`, `"m"`, `"s"`, `"ms"`, `"us"`, `"μs"`, `"ns"`, `"ps"`, `"fs"`, `"as"`, `"generic"` | yes | None | |
| 81 | +| `"scale_factor"` | `integer` | yes | The number must represent an integer from the inclusive range `[1, 2147483647]` | |
| 82 | + |
| 83 | +> Note: the NumPy `datetime64` data type is parametrized by an endianness (little or big), but the |
| 84 | +Zarr `numpy.datetime64` data type is not. In Zarr, the endianness of `numpy.datetime64` arrays is determined by |
| 85 | +the configuration of the codecs defined in metadata and is thus not part of the data type configuration. |
| 86 | + |
| 87 | +> Note: as per NumPy, `"us"` and `"μs"` are equivalent and interchangeable representations of |
| 88 | +microseconds. |
| 89 | + |
| 90 | +No additional fields are permitted in the configuration. |
| 91 | + |
| 92 | +### Examples |
| 93 | +The following is an example of the metadata for a `numpy.datetime64` data type with a unit of microseconds |
| 94 | +and a scale factor of 10. This configuration defines a data type equivalent to the NumPy data type |
| 95 | +`datetime64[10us]`: |
| 96 | + |
| 97 | +```json |
| 98 | +{ |
| 99 | + "name": "numpy.datetime64", |
| 100 | + "configuration": { |
| 101 | + "unit": "us", |
| 102 | + "scale_factor": 10 |
| 103 | + } |
| 104 | +} |
| 105 | +``` |
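
As an illustration of the equivalence described above, the sketch below maps such a metadata document to a NumPy data type. The helper `to_numpy_dtype` is hypothetical and not defined by this document. Two choices in it are assumptions: `"μs"` is normalized to `"us"` before the data type is built, and `"generic"` is mapped to the unit-less `datetime64` data type with the scale factor ignored.

```python
import json

import numpy as np

metadata = json.loads(
    """
    {
        "name": "numpy.datetime64",
        "configuration": {"unit": "us", "scale_factor": 10}
    }
    """
)


def to_numpy_dtype(metadata: dict) -> np.dtype:
    """Hypothetical helper: build a NumPy dtype from data type metadata."""
    if metadata["name"] != "numpy.datetime64":
        raise ValueError(f"unexpected data type name: {metadata['name']!r}")
    config = metadata["configuration"]
    unit = config["unit"]
    scale_factor = config["scale_factor"]
    if unit == "generic":
        # Assumption: the generic unit maps to the unit-less data type.
        return np.dtype("datetime64")
    if unit == "μs":
        # "us" and "μs" are equivalent; use the ASCII spelling.
        unit = "us"
    return np.dtype(f"datetime64[{scale_factor}{unit}]")


dtype = to_numpy_dtype(metadata)
print(dtype)
```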
| 106 | + |
| 107 | +## Fill value representation |
| 108 | + |
| 109 | +For the `"fill_value"` field of array metadata, `numpy.datetime64` scalars must be represented in one of |
| 110 | +two forms: |
| 111 | +- As a JSON number with no fraction or exponent part that is within the range `[-2^63, 2^63 - 1]`. |
| 112 | +- As the string `"NaT"`, which denotes the value `NaT`. |
| 113 | + |
| 114 | +> Note: the `NaT` value may be encoded as the JSON number `-9223372036854775808`, i.e., |
| 115 | +`-2^63`. That is, `"fill_value": "NaT"` and `"fill_value": -9223372036854775808` should be treated |
| 116 | +as equivalent representations of the same scalar value (`NaT`). |
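
A reader of array metadata might decode the two fill value forms along the following lines. This is a sketch under stated assumptions, not a reference implementation: `decode_fill_value` is a hypothetical name, the fill value is assumed to have already been parsed from JSON into a Python `int` or `str`, and the default data type is chosen only for illustration.

```python
import numpy as np

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def decode_fill_value(fill_value, dtype="datetime64[s]"):
    """Hypothetical helper: turn a parsed JSON fill value into a scalar."""
    if isinstance(fill_value, str):
        if fill_value != "NaT":
            raise ValueError(f"unsupported string fill value: {fill_value!r}")
        raw = INT64_MIN  # "NaT" is equivalent to the integer -2^63
    elif isinstance(fill_value, int) and not isinstance(fill_value, bool):
        if not INT64_MIN <= fill_value <= INT64_MAX:
            raise ValueError("fill value is outside the 64-bit signed range")
        raw = fill_value
    else:
        raise TypeError("fill value must be an integer or the string 'NaT'")
    # Reinterpret the 64-bit integer as a datetime64 scalar.
    return np.array([raw], dtype="int64").view(dtype)[0]


from_string = decode_fill_value("NaT")
from_number = decode_fill_value(-9223372036854775808)
ten_seconds = decode_fill_value(10)
```

Both `from_string` and `from_number` decode to `NaT`, reflecting the equivalence described in the note above.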
| 117 | + |
| 118 | +## Codec compatibility |
| 119 | + |
| 120 | +This data type is compatible with any codec that supports arrays of signed 64-bit integers. |