Description
What happened + What you expected to happen
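Calling preprocess on a test set after fitting changes the output of subsequent make_future_dataframe calls on the same MLForecast object.
I would expect preprocess to be free of side effects, so that the future dataframe is the same before and after the call.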
Versions / Dependencies
macOS, Python 3.12.3, MLForecast 1.0.2
Reproduction script
from mlforecast.utils import generate_series
from mlforecast import MLForecast
import lightgbm as lgb
import pandas as pd
series = generate_series(200, equal_ends=True)
def train_test_split_last_n_pandas(df: pd.DataFrame, time_col: str = "ds", id_col: str = "unique_id", n: int = 3):
    df_sorted = df.sort_values(by=[id_col, time_col])
    test = df_sorted
    train = df_sorted.groupby(id_col).apply(lambda group: group.iloc[:-n] if len(group) > n else group.iloc[0:0])
    train = train.reset_index(drop=True)
    return train, test
train, test = train_test_split_last_n_pandas(series, n=3)
fcst = MLForecast(
    models=[lgb.LGBMRegressor()],
    freq='D',
    lags=[1],
)
fcst.fit(train)
fu1 = fcst.make_future_dataframe(h=1)
pre = fcst.preprocess(test)
fu2 = fcst.make_future_dataframe(h=1)
print(fu1.equals(fu2))  # expected True; reported to differ after the preprocess call
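A possible workaround until this is addressed is to run preprocess on a deep copy of the fitted object, so the original keeps its state. The sketch below shows the pattern with a toy class. ToyForecaster and preprocess_on_copy are hypothetical names made up for illustration, not part of MLForecast, and whether copy.deepcopy is cheap enough (or fully safe) for a real fitted MLForecast object is an assumption that has not been checked here.

```python
import copy


class ToyForecaster:
    """Hypothetical stand-in for a stateful forecaster (not MLForecast itself)."""

    def __init__(self):
        self.last_date = None

    def fit(self, dates):
        # Remember the last date seen during training.
        self.last_date = max(dates)
        return self

    def preprocess(self, dates):
        # Mimics the reported behavior: preprocessing overwrites fitted state.
        self.last_date = max(dates)
        return sorted(dates)

    def make_future(self, h=1):
        # Future dates start right after the stored last date.
        return [self.last_date + i for i in range(1, h + 1)]


def preprocess_on_copy(forecaster, data):
    """Run preprocess on a deep copy so the fitted object is left untouched."""
    return copy.deepcopy(forecaster).preprocess(data)


fcst = ToyForecaster().fit([1, 2, 3])
before = fcst.make_future(h=1)
features = preprocess_on_copy(fcst, [1, 2, 3, 4, 5, 6])
after = fcst.make_future(h=1)
print(before == after)
```

With the real library the same idea would be to call preprocess on copy.deepcopy(fcst) in place of fcst, at the cost of duplicating the fitted models in memory.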
Issue Severity
Low: It annoys or frustrates me.