[ML-49316] Support MonthMid and MonthEnd for DeepAR #159

Closed
wants to merge 1 commit into from

@Lanz-db Lanz-db commented Jan 30, 2025

This PR fixes a bug where DeepAR fails when the user's dataset has monthly frequency and its timestamps do not fall on the first day of the month. The bug comes from this line:

new_index_full = pd.date_range(total_min, total_max, freq=frequency)

Here freq is "MS" (month start), so every timestamp in new_index_full falls on the first day of a month. As a result, this line,

df.reindex(new_index_full)

produces a DataFrame in which every row of the target column is NaN, because none of the original timestamps match the new index.
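The failure can be reproduced in isolation with a small example. The sketch below uses a made-up series observed on the 15th of each month; the column name `y` and the values are assumptions for illustration only, not data from the project.

```python
import pandas as pd

# Hypothetical monthly series observed on the 15th of each month.
dates = pd.to_datetime(["2024-01-15", "2024-02-15", "2024-03-15", "2024-04-15"])
df = pd.DataFrame({"y": [1.0, 2.0, 3.0, 4.0]}, index=dates)

total_min, total_max = df.index.min(), df.index.max()

# freq="MS" snaps every generated timestamp to the first day of a month.
new_index_full = pd.date_range(total_min, total_max, freq="MS")

# None of the original timestamps (the 15th) appear in the new index,
# so reindexing discards every observed value and leaves only NaN.
reindexed = df.reindex(new_index_full)
print(reindexed)
```

Since the new index shares no timestamps with the original one, the target column carries no information after the reindex.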

To fix the bug, this PR:

  1. Introduces a helper function, validate_and_generate_index, that generates a complete time index for the given DataFrame at the specified frequency. For monthly data, it builds the index from the day of the month found in the data and also detects whether the timestamps fall on the end of the month.
  2. Uses this helper function in place of pd.date_range(total_min, total_max, freq=frequency).
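The idea behind the helper can be sketched roughly as follows. This is not the PR's implementation of validate_and_generate_index: the function name `monthly_index_sketch`, its signature, and its error handling are assumptions made for illustration, and it covers only the monthly case.

```python
import pandas as pd


def monthly_index_sketch(timestamps) -> pd.DatetimeIndex:
    """Hypothetical sketch: build a gap-free monthly index that keeps the
    day of month used by the data (or month end, if the data uses that)."""
    ts = pd.DatetimeIndex(timestamps).sort_values()
    start, end = ts.min(), ts.max()

    # Case 1: every timestamp is the last day of its month.
    if ts.is_month_end.all():
        return pd.date_range(start, end, freq=pd.offsets.MonthEnd())

    # Case 2: every timestamp shares one fixed day of month.
    days = ts.day.unique()
    if len(days) != 1:
        raise ValueError("timestamps do not share a single day of month")
    day = int(days[0])
    if day > 28:
        # Assumption of this sketch: days 29-31 that are not month ends
        # are rejected, because they do not exist in every month.
        raise ValueError("day of month > 28 is not supported in this sketch")

    # Enumerate every month in range, then shift month starts to the day.
    months = pd.period_range(start, end, freq="M")
    return months.to_timestamp() + pd.Timedelta(days=day - 1)


# Usage: the 15th of each month, with March missing.
dates = pd.to_datetime(["2024-01-15", "2024-02-15", "2024-04-15"])
df = pd.DataFrame({"y": [1.0, 2.0, 4.0]}, index=dates)
filled = df.reindex(monthly_index_sketch(df.index))
print(filled)
```

With an index built this way, reindexing keeps the observed values and leaves NaN only for the month that is actually missing.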

To test the change:

  • Unit tests are added in utils_test.py.
    Run the following command locally:
PYTHONPATH=~/automl/runtime/ pytest tests/automl_runtime/forecast/deepar/utils_test.py

@Lanz-db Lanz-db closed this Jan 31, 2025