Open
Labels
bug: Something isn't working
Description
Describe the bug
with_column normally adds a single new column with the provided name. However, when the expression is a lag window function, we get two new columns: one with the provided name and one with an automatically generated name.
To Reproduce
import datafusion as dfn
from datafusion import col, lit, functions as F
import pyarrow as pa


def datafusion_example() -> None:
    table = pa.table({"a": [1.0, 2.0, 3.0]})
    ctx = dfn.SessionContext()
    df = ctx.from_arrow(table)

    # Window function: produces an extra, auto-named column (the bug).
    print(
        df.with_column(
            "previous_a",
            F.lag(
                col("a"),
                default_value=None,
            ),
        )
    )

    # Plain scalar expression: produces only the named column, as expected.
    print(df.with_column("something_else", col("a") + lit(1.0)))


if __name__ == "__main__":
    datafusion_example()
Output
DataFrame()
+-----+-----------------------------------------------------------------------------------------------------------------+------------+
| a   | lag(ced8c2b3710c14382bd5eb58d49ffbd53.a,Int64(1),NULL) ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING | previous_a |
+-----+-----------------------------------------------------------------------------------------------------------------+------------+
| 1.0 |                                                                                                                 |            |
| 2.0 | 1.0                                                                                                             | 1.0        |
| 3.0 | 2.0                                                                                                             | 2.0        |
+-----+-----------------------------------------------------------------------------------------------------------------+------------+
DataFrame()
+-----+----------------+
| a   | something_else |
+-----+----------------+
| 1.0 | 2.0            |
| 2.0 | 3.0            |
| 3.0 | 4.0            |
+-----+----------------+
Expected behavior
DataFrame()
+-----+------------+
| a   | previous_a |
+-----+------------+
| 1.0 |            |
| 2.0 | 1.0        |
| 3.0 | 2.0        |
+-----+------------+
DataFrame()
+-----+----------------+
| a   | something_else |
+-----+----------------+
| 1.0 | 2.0            |
| 2.0 | 3.0            |
| 3.0 | 4.0            |
+-----+----------------+
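To make the expected semantics explicit, here is a minimal pure-Python sketch of what the first call should produce. It does not use DataFusion at all: the `lag` and `with_column` helpers below are hypothetical stand-ins written only for illustration, and the table is modeled as a plain dict of column name to values. The point is simply that adding a lagged column should leave exactly one new column in the result.

```python
from typing import Optional, Sequence


def lag(values: Sequence[float], offset: int = 1,
        default: Optional[float] = None) -> list:
    """Shift values down by `offset` rows, filling the gap with `default`.

    Hypothetical helper: models lag semantics, not DataFusion's implementation.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    head = [default] * min(offset, len(values))
    return head + list(values[: max(len(values) - offset, 0)])


def with_column(table: dict, name: str, column: list) -> dict:
    """Return a copy of `table` with exactly one column added (or replaced)."""
    result = dict(table)
    result[name] = column
    return result


table = {"a": [1.0, 2.0, 3.0]}
result = with_column(table, "previous_a", lag(table["a"]))

print(list(result))          # ['a', 'previous_a']
print(result["previous_a"])  # [None, 1.0, 2.0]
```

A regression test for the real fix could assert the same thing against the DataFrame returned by with_column: the resulting column names should be the original names plus the one requested name, with no auto-generated window-expression column.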