Replies: 6 comments 11 replies
-
I agree, the syntax you propose matches the way other functions in datatable are used. Shouldn't the output look more like this?
Otherwise it may not agree with how it's done in other packages.
-
Hmm, is that how it's done in other packages? This would mean that melting is merely splitting the frame into columns and then r-binding those columns. I thought that melting should preserve the local order of values: if some values are close in the original frame, they must remain close in the transformed frame. Given that for a typical frame I've looked into the packages listed above, and none of them documents the exact order of rows in the molten frame. De facto, however, they use the following orders:
I wonder why they chose differently, and whether there is a clearly better alternative here.
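To make the two candidate row orders concrete, here is a minimal pure-Python sketch. It is not datatable code: the toy frame and the helper names `melt_columns_first` and `melt_rows_first` are made up for illustration, and it makes no claim about which package uses which order.

```python
# A tiny frame stored column-wise: column name -> list of values.
frame = {"A": [1, 2, 3], "B": [10, 20, 30]}

def melt_columns_first(frame):
    """Stack whole columns one after another (split into columns, then r-bind)."""
    return [(name, value) for name, values in frame.items() for value in values]

def melt_rows_first(frame):
    """Emit all melted values of row 0, then of row 1, and so on."""
    names = list(frame)
    nrows = len(frame[names[0]])
    return [(name, frame[name][i]) for i in range(nrows) for name in names]

print(melt_columns_first(frame))
# [('A', 1), ('A', 2), ('A', 3), ('B', 10), ('B', 20), ('B', 30)]
print(melt_rows_first(frame))
# [('A', 1), ('B', 10), ('A', 2), ('B', 20), ('A', 3), ('B', 30)]
```

In the rows-first result, values that shared a row in the input stay adjacent; in the columns-first result they end up `nrows` positions apart.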
-
I'm not sure if this is a big deal in terms of performance, but data.table uses memcpy in the getvaluecols function (https://github.com/Rdatatable/data.table/blob/master/src/fmelt.c#L488) to copy entire columns from the input to the output, so if you want to do that I think you would need the columns-first approach. I guess SQL DBs use the rows-first approach because they have row-wise storage rather than the column-wise storage of R/data.table?
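A rough way to see why the bulk copy only fits the columns-first layout, sketched in plain Python rather than taken from fmelt.c (the function names `fill_columns_first` and `fill_rows_first` are hypothetical): with columns-first, each input column lands in one contiguous block of the output value column, whereas with rows-first its values are spread out with a stride equal to the number of melted columns.

```python
def fill_columns_first(columns):
    """Build the melted value column by copying each input column as one block."""
    nrows = len(columns[0])
    out = [None] * (nrows * len(columns))
    for k, col in enumerate(columns):
        # Contiguous destination range: the kind of copy a single memcpy can do.
        out[k * nrows:(k + 1) * nrows] = col
    return out

def fill_rows_first(columns):
    """Build the melted value column with values of the same row kept adjacent."""
    ncols = len(columns)
    out = [None] * (len(columns[0]) * ncols)
    for k, col in enumerate(columns):
        # Strided destination: consecutive values of one column sit ncols apart.
        out[k::ncols] = col
    return out

cols = [[1, 2, 3], [10, 20, 30]]
print(fill_columns_first(cols))  # [1, 2, 3, 10, 20, 30]
print(fill_rows_first(cols))     # [1, 10, 2, 20, 3, 30]
```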
-
I have recently added some advanced features to fmelt (Rdatatable/data.table#4731), so you may want to think about doing something similar, or at least planning for those features, in your Python version.
-
While considering this, it would be useful if a
-
Hi, I'm sorry if there's a better place to ask this question; if so, please feel free to tell me where to move it. I came across this package and I'm really excited about it! But one piece of functionality that I've noticed is missing is the reshaping functions (melt/dcast/pivot_table). Is there a timetable for this kind of functionality? I've checked the list of features planned for the 1.0.0 release in the documentation and haven't noticed it there.
-
So, I've been thinking about the `melt()` function as requested in #2499. The equivalent to this function exists in R data.table, pandas, tidyr, SQL Server, Snowflake.

Interestingly, in data.table, pandas and tidyr this function is a full-frame transform. That is, the method is applied to a whole frame, and produces another frame as a result. The method also has a dozen or so various options to control different aspects of the transformation. In SQL, on the other hand, the same functionality is done via a separate clause which looks like this: `UNPIVOT (vcol FOR ncol IN collist...)`. However, this clause is attached to `FROM`, so in this sense it's also a transformation of the frame.

What would be really nice for us, however, is to implement melting as just another simple building block that can be combined with other functions inside a `DT[i,j,...]` call. Something like this:

(where `columns` is an FExpr that selects 1 or more columns). The `melt()` function would then produce a 2-column FExpr where the first column contains column labels, and the second column is the values from the table. Internally, the resulting columns will be at a grouping level which expands every row into as many rows as the number of columns in the columnset. This means that `melt()` cannot be combined with a groupby. But it also means that all regular columns will be auto-expanded to match the shape of the melted columns.

For example, given the dataset

we can write

to produce
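Separately from the example above, the semantics described in this post can be emulated in plain Python as a rough sketch. Nothing here is datatable API: `melt()` is only a proposal, `melt_select` is a hypothetical helper, the output column names `variable` and `value` are assumed, and the rows-first order is an assumption based on "expands every row into as many rows as the number of columns".

```python
def melt_select(frame, id_cols, melt_cols):
    """Emulate selecting id_cols alongside a melt of melt_cols.

    `frame` is a dict mapping column names to equal-length lists.
    """
    nrows = len(next(iter(frame.values())))
    out = {name: [] for name in id_cols}
    out["variable"] = []  # assumed name for the column-label column
    out["value"] = []     # assumed name for the values column
    for i in range(nrows):
        # Each input row expands into len(melt_cols) output rows.
        for col in melt_cols:
            # Regular columns are auto-expanded: their value is repeated.
            for name in id_cols:
                out[name].append(frame[name][i])
            out["variable"].append(col)
            out["value"].append(frame[col][i])
    return out

frame = {"id": ["x", "y"], "A": [1, 2], "B": [3, 4]}
result = melt_select(frame, ["id"], ["A", "B"])
print(result)
# {'id': ['x', 'x', 'y', 'y'], 'variable': ['A', 'B', 'A', 'B'], 'value': [1, 3, 2, 4]}
```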