Replies: 2 comments 4 replies
-
I'm sure I don't follow the argument fully, but on the narrow point of "max
of all elements" pandas has an open issue for `.max(axis=None)` to reduce
over all rows. It's currently implemented for logical reductions like any
and all.
Again, I'm not sure how that relates to the "OOP" discussion.
…On Thu, Aug 27, 2020 at 7:53 AM Fredo Erxleben ***@***.***> wrote:
What is this *not* about?
This is not supposed to touch internal implementation issues of any of the
frameworks in any way shape or form. This is also not about *why* things
are designed how they are designed currently.
What is this about?
*TLDR:* It is about how the API allows its users to think about a problem
and formulate their solution steps.
The concerned APIs are all written in a language that has object oriented
programming as one of its core features. But besides the occasional
DataFrame class, they make nearly no use of it and promote purely
imperative programming semantic approaches to problem solving.
The way we write code also represents the way we think about the structure
of our problem, how we identify the pieces that make up our *chunk of
data*.
An Example for Illustration
Let's talk about *pandas* for am moment. Assume we want to get the
maximum value of all the elements in a *data_frame*:
Is it max(data_frame)? Perhaps data_frame.max()? No. The correct answer
is data_frame.max().max().
From the perspective of someone who is used to OOP, the second option
seems to be the more obvious one.
Let's explore some variations of the scenario and how they could be
expressed in a more object-oriented fashion:
# Get the maximum value of a data_framedata_frame.max()
# Get the maximum value of the third columndata_frame.column(2).max()
# Same with the third rowdata_frame.row(2).max()
# How about picking some very specific cells?
data_frame.column(range(0, 3)).row(4).max()
# Alternatively, in case we want to reuse the columns and rows:from typing import Listselected_columns: List[Column] = data_frame.column(range(0, 3))selected_row_slice: Row = selected_columns.row(4)selected_row_slice.max()selected_row_slice.whatever()
The point here is not to provide new functionality, but instead provide an
intuitive and clear to write way of
doing things.
It is about using the expressive benefits that come with the language
these frameworks are built for.
Disclaimer
I am seriously sorry for using *pandas* as the bad example, it is merely
the first DS framework I worked with and I have a record of beginner issues
I ran into available.
This by no means implies that other frameworks are better or worse.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#2>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAKAOIQJDYQ3TZBLHXTZVZLSCZJMZANCNFSM4QM6TTUQ>
.
|
Beta Was this translation helpful? Give feedback.
2 replies
-
@fer-rum you don't say this explicitly, but you somehow seem to assume that object-oriented APIs are by definition clearer or more expressive than functional ones? If so, that's probably a mistaken assumption - itmay be preferred for dataframes, but for the type of code one typically writes for arrays/tensors (e.g. for scientific computing) a functional style is preferred. If that's not what you meant, then can you please clarify? |
Beta Was this translation helpful? Give feedback.
2 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
What is this not about?
This is not supposed to touch internal implementation issues of any of the frameworks in any way shape or form. This is also not about why things are designed how they are designed currently.
What is this about?
TLDR: It is about how the API allows its users to think about a problem and formulate their solution steps.
The concerned APIs are all written in a language that has object oriented programming as one of its core features. But besides the occasional
DataFrame
class, they make nearly no use of it and promote purely imperative programming semantic approaches to problem solving.The way we write code also represents the way we think about the structure of our problem, how we identify the pieces that make up our chunk of data.
An Example for Illustration
Let's talk about pandas for am moment. Assume we want to get the maximum value of all the elements in a data_frame:
Is it
max(data_frame)
? Perhapsdata_frame.max()
? No. The correct answer isdata_frame.max().max()
.From the perspective of someone who is used to OOP, the second option seems to be the more obvious one.
Let's explore some variations of the scenario and how they could be expressed in a more object-oriented fashion:
The point here is not to provide new functionality, but instead provide an intuitive and clear to write way of
doing things.
It is about using the expressive benefits that come with the language these frameworks are built for.
Disclaimer
I am seriously sorry for using pandas as the bad example, it is merely the first DS framework I worked with and I have a record of beginner issues I ran into available.
This by no means implies that other frameworks are better or worse.
Beta Was this translation helpful? Give feedback.
All reactions