Description of the issue
In the GMT command-line world, some data processing modules can output to either a NetCDF grid or an ASCII table. When translating these to Python/PyGMT, do we want to 1) have a single function that can output to both (depending on some flag), or 2) have two functions/methods, one that outputs to a grid and one that outputs to a table?
This is a list of functions that need to be handled:
- triangulate (Wrap triangulate #731)
- grdhisteq (Wrap grdhisteq #1433)
- binstats (Wrap binstats #1652, pygmt.binstats: Add alias "tiling" for "T" #2409)
- etc
Originally posted by @weiji14 in #1433 (comment)
I changed the implementation a bit relative to #731 to support ASCII or pandas.DataFrame output for writing out the equalized histogram.
Still, the code is a bit clunky in order to support four different output types (pandas.DataFrame, xarray.DataArray, netCDF, or ASCII). What would you think about having two PyGMT functions for GMT's grdhisteq module rather than just one? One function could write out the data ranges of histogram equalization to a pd.DataFrame or ASCII table and the other could write out the cumulative distribution statistics to a netCDF file or xarray.DataArray. I guess coming up with the names for these would be harder than the current implementation, but I think it would be more user friendly long-term.
Yeah I've debated a bit on whether to have 2 functions too, something like a pygmt.grdhisteq.to_table() and pygmt.grdhisteq.to_grid() (implemented using Python classmethods), or maybe with an underscore like pygmt.grdhisteq_to_table() and pygmt.grdhisteq_to_grid() (implemented purely using Python functions). Tying this to #1318 (comment), I think the split into 2 may have to happen eventually, especially if we want to support more table-like outputs (ascii/numpy/pandas/geopandas/etc) like what Will is doing at grd2xyz #1284.
Possible implementation styles
This is how each implementation would look, using triangulate as an example.
Single function
def triangulate(data, outgrid=None, outfile=None):
    pass
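To make the trade-off of the single-function style concrete, here is a minimal sketch that does not call GMT at all. The returned dictionary is only a stand-in for the real output, and the rule that a table is produced when neither argument is given is an assumption made for illustration, not something decided in this issue.

```python
def triangulate(data, outgrid=None, outfile=None):
    """Hypothetical single entry point that picks the output kind from its arguments."""
    # The two output kinds are mutually exclusive, so the function has to validate this itself.
    if outgrid is not None and outfile is not None:
        raise ValueError("Pass either outgrid or outfile, not both.")
    if outgrid is not None:
        kind, target = "grid", outgrid
    else:
        # Assumption for this sketch: fall back to table output.
        kind, target = "table", outfile
    # Stand-in for the real GMT call: report what would have been produced.
    return {"kind": kind, "target": target, "npoints": len(data)}


points = [(0, 0, 1.0), (1, 0, 2.0), (0, 1, 3.0)]
grid_result = triangulate(points, outgrid="tri.nc")
table_result = triangulate(points, outfile="tri.txt")
```

The branching and the extra validation are the kind of clunkiness mentioned above: every new output type adds another case to the same function.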
Two Python functions
Have a common _triangulate function that handles grid or table outputs, with some similarities to _blockm.
def _triangulate(data, outgrid=None, outfile=None):
    pass

def triangulate_to_grid(data, outgrid=None):
    pass

def triangulate_to_table(data, outfile=None):
    pass
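As a rough illustration of the functional style, the sketch below has two thin public wrappers delegate to one private helper. Nothing here calls GMT. The `output_type` parameter and the dictionary return value are hypothetical names introduced only for this example.

```python
def _triangulate(data, output_type, outgrid=None, outfile=None):
    """Shared helper (hypothetical); a real version would run the GMT module here."""
    target = outgrid if output_type == "grid" else outfile
    # Stand-in for the real GMT call: report what would have been produced.
    return {"kind": output_type, "target": target, "npoints": len(data)}


def triangulate_to_grid(data, outgrid=None):
    """Grid output only, so there is no outfile argument to misuse."""
    return _triangulate(data, output_type="grid", outgrid=outgrid)


def triangulate_to_table(data, outfile=None):
    """Table output only, so there is no outgrid argument to misuse."""
    return _triangulate(data, output_type="table", outfile=outfile)


points = [(0, 0, 1.0), (1, 0, 2.0), (0, 1, 3.0)]
grid_result = triangulate_to_grid(points, outgrid="tri.nc")
table_result = triangulate_to_table(points, outfile="tri.txt")
```

Each public function exposes only the arguments that make sense for its output kind, so the conflicting-arguments check from the single-function style is no longer needed.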
Two methods in a single Python class ✔️
class triangulate:
    @staticmethod
    def _triangulate(data, outgrid=None, outfile=None):
        pass

    @staticmethod
    def to_grid(data, outgrid=None):
        pass

    @staticmethod
    def to_table(data, outfile=None):
        pass
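The class style mainly changes how the calls are spelled. The sketch below shows that static methods can be called directly on the class, so the lowercase class acts as a namespace and no instance is ever created. It does not call GMT, and the `output_type` parameter and dictionary return value are hypothetical stand-ins.

```python
class triangulate:  # lowercase on purpose, so calls read as triangulate.to_grid(...)
    @staticmethod
    def _triangulate(data, output_type, outgrid=None, outfile=None):
        """Shared helper (hypothetical); a real version would run the GMT module here."""
        target = outgrid if output_type == "grid" else outfile
        # Stand-in for the real GMT call: report what would have been produced.
        return {"kind": output_type, "target": target, "npoints": len(data)}

    @staticmethod
    def to_grid(data, outgrid=None):
        return triangulate._triangulate(data, output_type="grid", outgrid=outgrid)

    @staticmethod
    def to_table(data, outfile=None):
        return triangulate._triangulate(data, output_type="table", outfile=outfile)


points = [(0, 0, 1.0), (1, 0, 2.0), (0, 1, 3.0)]
# Called on the class itself; triangulate() is never instantiated.
grid_result = triangulate.to_grid(points, outgrid="tri.nc")
table_result = triangulate.to_table(points, outfile="tri.txt")
```

Compared with the two-function style, both variants stay grouped under one name, which keeps the mapping to the single GMT module visible.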
Are you willing to help implement and maintain this feature?
Vote for which API style you prefer!
- A. 👍 Single function to do both grid/table output, i.e. pygmt.triangulate(outgrid=True) or pygmt.triangulate(outfile=True)
- B. 🎉 The 'functional' style, i.e. pygmt.triangulate_to_grid() or pygmt.triangulate_to_table()
- C. 🚀 The 'class' method style, i.e. pygmt.triangulate.to_grid() or pygmt.triangulate.to_table()
- D. 👀 Other suggestions on the names, or API design, please comment below!
P.S. Also xref #896 where there is a similar API design discussion on wrapping GMT functions that do either plotting or data processing.