From 2c59326fef0aba74f060e26cd922e308c595b340 Mon Sep 17 00:00:00 2001
From: Dongdong Tian
Date: Tue, 5 Nov 2024 15:44:18 +0800
Subject: [PATCH 1/7] Add docs for the supported numeric dtypes

---
 doc/techref/array_dtypes.md | 35 +++++++++++++++++++++++++++++++++++
 doc/techref/index.md        |  1 +
 2 files changed, 36 insertions(+)
 create mode 100644 doc/techref/array_dtypes.md

diff --git a/doc/techref/array_dtypes.md b/doc/techref/array_dtypes.md
new file mode 100644
index 00000000000..6d9584b7d36
--- /dev/null
+++ b/doc/techref/array_dtypes.md
@@ -0,0 +1,35 @@
+# Supported Array Dtypes
+
+PyGMT uses NumPy arrays to store data and passes them to the GMT C library. In this way,
+PyGMT can support a wide range of dtypes. This page documents array dtypes supported by
+PyGMT.
+
+## Numeric Dtypes
+
+For 1-D and 2-D arrays, PyGMT supports most numeric dtypes provided by NumPy, pandas, and
+PyArrow.
+
+**Signed Integers:**
+
+- `numpy.int8`, `numpy.int16`, `numpy.int32`, `numpy.int64`
+- `pandas.Int8`, `pandas.Int16`, `pandas.Int32`, `pandas.Int64`
+- `pyarrow.int8`, `pyarrow.int16`, `pyarrow.int32`, `pyarrow.int64`
+
+**Unsigned Integers:**
+
+- `numpy.uint8`, `numpy.uint16`, `numpy.uint32`, `numpy.uint64`
+- `pandas.UInt8`, `pandas.UInt16`, `pandas.UInt32`, `pandas.UInt64`
+- `pyarrow.uint8`, `pyarrow.uint16`, `pyarrow.uint32`, `pyarrow.uint64`
+
+**Floating-point numbers:**
+
+- `numpy.float32`, `numpy.float64`
+- `pandas.Float32`, `pandas.Float64`
+- `pyarrow.float32`, `pyarrow.float64`
+
+For 3-D {class}`xarray.DataArray` objects representing raster images, only 8-bit unsigned
+intergers (i.e., `numpy.uint8`) are supported.
+
+## String Dtypes
+
+## Datetime Dtypes
diff --git a/doc/techref/index.md b/doc/techref/index.md
index bbba3ead6b4..96e75e0b493 100644
--- a/doc/techref/index.md
+++ b/doc/techref/index.md
@@ -8,6 +8,7 @@ visit the {gmt-docs}`GMT Technical Reference `.
 ```{toctree}
 :maxdepth: 1

+array_dtypes.md
 projections.md
 fonts.md
 patterns.md

From 466ef40c48bfef4a0d0061f96924875ce487c7bc Mon Sep 17 00:00:00 2001
From: Dongdong Tian
Date: Mon, 11 Nov 2024 10:18:13 +0800
Subject: [PATCH 2/7] Add notes about float16 and longdouble

---
 doc/techref/array_dtypes.md | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/doc/techref/array_dtypes.md b/doc/techref/array_dtypes.md
index 6d9584b7d36..99ee051d3e3 100644
--- a/doc/techref/array_dtypes.md
+++ b/doc/techref/array_dtypes.md
@@ -6,8 +6,8 @@ PyGMT.

 ## Numeric Dtypes

-For 1-D and 2-D arrays, PyGMT supports most numeric dtypes provided by NumPy, pandas, and
-PyArrow.
+For 1-D and 2-D arrays, PyGMT supports most numeric dtypes provided by NumPy, pandas,
+and PyArrow.

 **Signed Integers:**

@@ -27,8 +27,12 @@ PyArrow.
 - `pandas.Float32`, `pandas.Float64`
 - `pyarrow.float32`, `pyarrow.float64`

-For 3-D {class}`xarray.DataArray` objects representing raster images, only 8-bit unsigned
-intergers (i.e., `numpy.uint8`) are supported.
+.. notes::
+
+    1. Currently, `numpy.float16`, `numpy.longdouble` and `pyarrow.float16` are not
+       supported
+    2. For 3-D {class}`xarray.DataArray` objects representing raster images, only 8-bit
+       unsigned intergers (i.e., `numpy.uint8`) are supported.

 ## String Dtypes

From 9dc90afe3ed2fcd14424de340e159b64c2cd50f6 Mon Sep 17 00:00:00 2001
From: Dongdong Tian
Date: Mon, 11 Nov 2024 10:33:00 +0800
Subject: [PATCH 3/7] Add a few examples for numeric arrays

---
 doc/techref/array_dtypes.md | 50 +++++++++++++++++++++++++++----------
 1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/doc/techref/array_dtypes.md b/doc/techref/array_dtypes.md
index 99ee051d3e3..261e83eddf2 100644
--- a/doc/techref/array_dtypes.md
+++ b/doc/techref/array_dtypes.md
@@ -1,38 +1,62 @@
 # Supported Array Dtypes

-PyGMT uses NumPy arrays to store data and passes them to the GMT C library. In this way,
-PyGMT can support a wide range of dtypes. This page documents array dtypes supported by
-PyGMT.
+PyGMT uses NumPy arrays as its fundamental data structure for storing data, as well as
+for exchanging data with the GMT C library. In this way, PyGMT can support a wide
+range of dtypes, as long as they can be converted to a NumPy array. This page provides
+a comprehensive overview of the dtypes supported by PyGMT.

 ## Numeric Dtypes

-For 1-D and 2-D arrays, PyGMT supports most numeric dtypes provided by NumPy, pandas,
-and PyArrow.
+In addition to the Python built-in numeric types (i.e., `int` and `float`), PyGMT
+supports most of the numeric dtypes provided by NumPy, pandas, and PyArrow.

-**Signed Integers:**
+**Signed Integers**

 - `numpy.int8`, `numpy.int16`, `numpy.int32`, `numpy.int64`
 - `pandas.Int8`, `pandas.Int16`, `pandas.Int32`, `pandas.Int64`
 - `pyarrow.int8`, `pyarrow.int16`, `pyarrow.int32`, `pyarrow.int64`

-**Unsigned Integers:**
+**Unsigned Integers**

 - `numpy.uint8`, `numpy.uint16`, `numpy.uint32`, `numpy.uint64`
 - `pandas.UInt8`, `pandas.UInt16`, `pandas.UInt32`, `pandas.UInt64`
 - `pyarrow.uint8`, `pyarrow.uint16`, `pyarrow.uint32`, `pyarrow.uint64`

-**Floating-point numbers:**
+**Floating-point numbers**

 - `numpy.float32`, `numpy.float64`
 - `pandas.Float32`, `pandas.Float64`
 - `pyarrow.float32`, `pyarrow.float64`

-.. notes::
+```{note}
-
-    1. Currently, `numpy.float16`, `numpy.longdouble` and `pyarrow.float16` are not
-       supported
-    2. For 3-D {class}`xarray.DataArray` objects representing raster images, only 8-bit
-       unsigned intergers (i.e., `numpy.uint8`) are supported.
+1. Currently, `numpy.float16`, `numpy.longdouble` and `pyarrow.float16` are not
+   supported.
+2. For 3-D {class}`xarray.DataArray` objects representing raster images, only 8-bit
+   unsigned integers (i.e., `numpy.uint8`) are supported.
+```
+
+````{example}
+```python
+# A list of integers
+[1, 2, 3]
+
+# A NumPy array with np.int32 dtype
+np.array([1, 2, 3], dtype=np.int32)
+
+# A pandas.Series with pandas's Int32 dtype
+pd.Series([1, 2, 3], dtype="Int32")
+
+# A pandas.Series with pandas's nullable Int32 dtype
+pd.Series([1, 2, pd.NA], dtype="Int32")
+
+# A pandas.Series with PyArrow-backed float64 dtype
+pd.Series([1, 2, 3], dtype="float64[pyarrow]")
+
+# A PyArrow array with pyarrow.uint8 dtype
+pa.array([1, 2, 3], type=pa.uint8())
+```
+````

 ## String Dtypes

From 441de81555ef0359f09eb09880e94396b745cd5d Mon Sep 17 00:00:00 2001
From: Dongdong Tian
Date: Mon, 11 Nov 2024 14:27:04 +0800
Subject: [PATCH 4/7] Fix syntax

---
 doc/techref/array_dtypes.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/techref/array_dtypes.md b/doc/techref/array_dtypes.md
index 261e83eddf2..93afdc7bfd7 100644
--- a/doc/techref/array_dtypes.md
+++ b/doc/techref/array_dtypes.md
@@ -28,15 +28,15 @@ supports most of the numeric dtypes provided by NumPy, pandas, and PyArrow.
 - `pandas.Float32`, `pandas.Float64`
 - `pyarrow.float32`, `pyarrow.float64`

-```{note}
+:::{note}

 1. Currently, `numpy.float16`, `numpy.longdouble` and `pyarrow.float16` are not
    supported.
 2. For 3-D {class}`xarray.DataArray` objects representing raster images, only 8-bit
    unsigned integers (i.e., `numpy.uint8`) are supported.
-```
+:::

-````{example}
+:::{tip} Examples
 ```python
 # A list of integers
 [1, 2, 3]
@@ -56,7 +56,7 @@ pd.Series([1, 2, 3], dtype="float64[pyarrow]")
 # A PyArrow array with pyarrow.uint8 dtype
 pa.array([1, 2, 3], type=pa.uint8())
 ```
-````
+:::

 ## String Dtypes

From 9a032f21d9e5fcca6e828f18130b8b5eb3c6558d Mon Sep 17 00:00:00 2001
From: Dongdong Tian
Date: Mon, 11 Nov 2024 15:37:26 +0800
Subject: [PATCH 5/7] Fix some myst syntax

---
 doc/techref/array_dtypes.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/doc/techref/array_dtypes.md b/doc/techref/array_dtypes.md
index 93afdc7bfd7..9c4226acb26 100644
--- a/doc/techref/array_dtypes.md
+++ b/doc/techref/array_dtypes.md
@@ -36,7 +36,10 @@ supports most of the numeric dtypes provided by NumPy, pandas, and PyArrow.
    unsigned integers (i.e., `numpy.uint8`) are supported.
 :::

-:::{tip} Examples
+:::{note}
+
+Here are some examples for creating array-like objects that PyGMT supports:
+
 ```python
 # A list of integers
 [1, 2, 3]

From b725be7d6b66dac6f8a09886067e88c6bb2eca5f Mon Sep 17 00:00:00 2001
From: Dongdong Tian
Date: Tue, 12 Nov 2024 18:04:58 +0800
Subject: [PATCH 6/7] Add documents about string dtypes

---
 doc/techref/array_dtypes.md | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/doc/techref/array_dtypes.md b/doc/techref/array_dtypes.md
index 9c4226acb26..a0e8c7a2abc 100644
--- a/doc/techref/array_dtypes.md
+++ b/doc/techref/array_dtypes.md
@@ -38,7 +38,7 @@ supports most of the numeric dtypes provided by NumPy, pandas, and PyArrow.

 :::{note}

-Here are some examples for creating array-like objects that PyGMT supports:
+Here are some examples for creating array-like numeric objects that PyGMT supports:

 ```python
 # A list of integers
@@ -63,4 +63,32 @@ pa.array([1, 2, 3], type=pa.uint8())

 ## String Dtypes

+In addition to the Python built-in `str` type, PyGMT also supports the following string dtypes:
+
+- NumPy: `numpy.str_`
+- pandas: `pandas.StringDtype` (including `string[python]`, `string[pyarrow]` and
+  `string[pyarrow_numpy]`)
+- PyArrow: `pyarrow.string`, `pyarrow.large_string`, and `pyarrow.string_view`
+
+:::{note}
+Here are some examples for creating string arrays that PyGMT supports:
+
+```python
+# A list of strings
+["a", "b", "c"]
+
+# A NumPy string array
+np.array(["a", "b", "c"])
+np.array(["a", "b", "c"], dtype=np.str_)
+
+# A pandas.Series string array
+pd.Series(["a", "b", "c"], dtype="string")
+pd.Series(["a", "b", "c"], dtype="string[python]")
+pd.Series(["a", "b", "c"], dtype="string[pyarrow]")
+pd.Series(["a", "b", "c"], dtype="string[pyarrow_numpy]")
+
+# A PyArrow array with pyarrow.string dtype
+pa.array(["a", "b", "c"], type=pa.string())
+```
+
 ## Datetime Dtypes

From 52e9c3d2a5cbca6ae494dd96046fd4fd8316497d Mon Sep 17 00:00:00 2001
From: Dongdong Tian
Date: Thu, 14 Nov 2024 13:43:12 +0800
Subject: [PATCH 7/7] Add date32/date64 dtypes

---
 doc/techref/array_dtypes.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/techref/array_dtypes.md b/doc/techref/array_dtypes.md
index a0e8c7a2abc..7eb3ce46ee4 100644
--- a/doc/techref/array_dtypes.md
+++ b/doc/techref/array_dtypes.md
@@ -92,3 +92,6 @@ pa.array(["a", "b", "c"], type=pa.string())
 ```

 ## Datetime Dtypes
+
+- pandas: `date32[day][pyarrow]`, `date64[ms][pyarrow]`
+- PyArrow: `pyarrow.date32`, `pyarrow.date64`
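
A possible illustration of the two Arrow date types added in the last patch (a sketch, not part of the patch series): in the Arrow format, `date32` holds whole days since the UNIX epoch and `date64` holds milliseconds since the UNIX epoch, which is what the `[day]` and `[ms]` units in the pandas dtype strings refer to. The helper names `to_date32` and `to_date64` are hypothetical and use only the Python standard library; they do not call PyGMT, pandas, or PyArrow.

```python
import datetime

# Arrow date types count from the UNIX epoch.
EPOCH = datetime.date(1970, 1, 1)
MS_PER_DAY = 86_400_000


def to_date32(d: datetime.date) -> int:
    """Hypothetical helper: days since the UNIX epoch (the `day` unit of date32)."""
    return (d - EPOCH).days


def to_date64(d: datetime.date) -> int:
    """Hypothetical helper: milliseconds since the UNIX epoch (the `ms` unit of date64)."""
    return to_date32(d) * MS_PER_DAY


dates = [datetime.date(2024, 1, 1), datetime.date(2024, 11, 14)]
date32_values = [to_date32(d) for d in dates]
date64_values = [to_date64(d) for d in dates]
print(date32_values)
print(date64_values)
```

With PyArrow installed, arrays of these types would typically be built with `pa.array(dates, type=pa.date32())` or `pa.array(dates, type=pa.date64())`, mirroring the `pa.array([1, 2, 3], type=pa.uint8())` example in the numeric section.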