Skip to content

data_kind: Refactor the if-else statements into match-case statements #3481

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Oct 7, 2024
Merged
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 40 additions & 30 deletions pygmt/helpers/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -187,23 +187,27 @@
return "ISOLatin1+"


def data_kind(
data: Any = None, required: bool = True
def data_kind( # noqa: PLR0911
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PLR0911 means "too-many-return-statements".

https://docs.astral.sh/ruff/rules/too-many-return-statements/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR refactors the data_kind function, to use if-return statements instead of if-else statements. The new if-return statements are easier to read and maintain.

I don't see much of an improvement in readability/maintainability between the previous if-elif-else statements and the if-return style in this PR. Looking at https://stackoverflow.com/questions/72024229/how-to-solve-pylint-too-many-return-statements-elegantly, I'm thinking if we could either keep the if-elif-else syntax, or use match-case instead.

If using match-case, would it be more readable to do the isinstance() checks first, and then have a nested if-elif for case _ for the more complex cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel match-case would make it more complicated:

match data:
    case x if isinstance(x, io.StringIO):
        return "stringio"
    case x if isinstance(x, xr.DataArray):
        return "image" if len(x.dims) == 3 else "grid"
    case x if isinstance(x, bool | int | float) or (x is None and not required):
        return "arg"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using match-case also has the too-many-return-statements warning, and using if-elif-else means we need to type hints the kind variable like:

kind: Literal["arg", "file", "geojson", "grid", "image", "matrix", "stringio", "vectors"]

Copy link
Member

@weiji14 weiji14 Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't run tests on this, but we can remove some of the isinstance() calls like so using class patterns (xref https://stackoverflow.com/questions/67524641/convert-multiple-isinstance-checks-to-structural-pattern-matching/67524642#67524642)

    match data:
        case str() | pathlib.PurePath():
            kind = "file"
        case list() | tuple() if all(
            isinstance(_file, str | pathlib.PurePath) for _file in data
        ):
            kind = "file"
        case io.StringIO():
            kind = "stringio"
        case (bool() | int() | float()) | None if not required:
            kind = "arg"
        case xr.DataArray():
            kind = "image" if len(data.dims) == 3 else "grid"
        case x if hasattr(x, "__geo_interface__"):
            kind = "geojson"
        case x if x is not None:
            kind = "matrix"
        case _:
            kind = "vectors"
    return kind

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The match-case statements look more compact. Done in 33051ac.

The mypy warning is also suppressed.

data: Any, required: bool = True
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making data a required parameter here.

) -> Literal[
"arg", "file", "geojson", "grid", "image", "matrix", "stringio", "vectors"
]:
r"""
Check the kind of data that is provided to a module.

The ``data`` argument can be in any type, but only following types are supported:
The ``data`` argument can be in any types. Following data kinds are recognized:

- a string or a :class:`pathlib.PurePath` object or a sequence of them, representing
a file name or a list of file names
- a 2-D or 3-D :class:`xarray.DataArray` object
- a 2-D matrix
- None, bool, int or float type representing an optional arguments
- a geo-like Python object that implements ``__geo_interface__`` (e.g.,
geopandas.GeoDataFrame or shapely.geometry)
- ``"arg"``: data is ``None`` and ``required=False``, or bool, int, float,
representing an optional argument, used for dealing with optional virtual files
- ``"file"``: a string or a :class:`pathlib.PurePath` object or a sequence of them,
representing one or more file names
- ``"geojson"``: a geo-like Python object that implements ``__geo_interface__``
(e.g., geopandas.GeoDataFrame or shapely.geometry)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this highlighting exists also for shapely.geometry?

Suggested change
(e.g., geopandas.GeoDataFrame or shapely.geometry)
(e.g., :class:`geopandas.GeoDataFrame` or shapely.geometry)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is not documented in the API documentation page, so linking shapely.geometry or not makes little difference.

- ``"grid"``: a :class:`xarray.DataArray` object that is not 3-D
- ``"image"``: a 3-D :class:`xarray.DataArray` object
- ``"stringio"``: a :class:`io.StringIO` object
- ``"matrix"``: anything that is not None
- ``"vectors"``: data is ``None`` and ``required=True``

Parameters
----------
Expand Down Expand Up @@ -287,30 +291,36 @@
>>> data_kind(data=None)
'vectors'
"""
kind: Literal[
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type hint is no longer needed.

"arg", "file", "geojson", "grid", "image", "matrix", "stringio", "vectors"
]
# One file or a list/tuple of files.
if isinstance(data, str | pathlib.PurePath) or (
isinstance(data, list | tuple)
and all(isinstance(_file, str | pathlib.PurePath) for _file in data)
):
# One or more files
kind = "file"
elif isinstance(data, bool | int | float) or (data is None and not required):
kind = "arg"
elif isinstance(data, io.StringIO):
kind = "stringio"
elif isinstance(data, xr.DataArray):
kind = "image" if len(data.dims) == 3 else "grid"
elif hasattr(data, "__geo_interface__"):
# geo-like Python object that implements ``__geo_interface__``
# (geopandas.GeoDataFrame or shapely.geometry)
kind = "geojson"
elif data is not None:
kind = "matrix"
else:
kind = "vectors"
return kind
return "file"

# A StringIO object.
if isinstance(data, io.StringIO):
return "stringio"

# An option argument, mainly for dealing optional virtual files.
if isinstance(data, bool | int | float) or (data is None and not required):
return "arg"

# An xarray.DataArray object, representing a grid or an image.
if isinstance(data, xr.DataArray):
return "image" if len(data.dims) == 3 else "grid"

# Geo-like Python object that implements ``__geo_interface__`` (e.g.,
# geopandas.GeoDataFrame or shapely.geometry).
# Reference: https://gist.github.com/sgillies/2217756
if hasattr(data, "__geo_interface__"):
return "geojson"

Check warning on line 317 in pygmt/helpers/utils.py

View check run for this annotation

Codecov / codecov/patch

pygmt/helpers/utils.py#L317

Added line #L317 was not covered by tests

# Any not-None is considered as a matrix.
if data is not None:
return "matrix"

return "vectors"


def non_ascii_to_octal(
Expand Down