Weekly Processing?
#10652
Hi @davies-w. A few options:
I'd like to run a DVC pipeline that maintains weekly state.
For example:
Input is:
s3:week-1.df.gz
s3:week-2.df.gz
Intermediate Output:
dvc_data/intermediate/week-1-processed.df
dvc_data/intermediate/week-2-processed.df
Final Output:
dvc_data/final/combined-formatted.dataset
Then, when a new s3:week-3.df.gz appears, DVC would run on just that file and produce:
dvc_data/intermediate/week-3-processed.df
and then combine all the weeks to regenerate:
dvc_data/final/combined-formatted.dataset
Extra credit if it can read in the existing version of dvc_data/final/combined-formatted.dataset and merge it with dvc_data/intermediate/week-3-processed.df.
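To make the desired behaviour concrete, here is a minimal sketch in plain Python. It is not DVC's API or a DVC feature, only an illustration of the incremental logic described above. The function names (`week_number`, `process_week`, `run_incremental`), the local input directory standing in for the S3 location, and the "upper-case each line" processing step are all hypothetical placeholders. It assumes a week counts as already processed if its intermediate file exists, so it would not notice an input file that changed after it was processed.

```python
import gzip
from pathlib import Path


def week_number(path: Path) -> int:
    """Extract N from names like week-N.df.gz or week-N-processed.df."""
    return int(path.name.split("-")[1].split(".")[0])


def process_week(src: Path, dst: Path) -> None:
    """Hypothetical per-week step: decompress and upper-case every line."""
    with gzip.open(src, "rt") as fin, dst.open("w") as fout:
        for line in fin:
            fout.write(line.upper())


def run_incremental(input_dir: Path, intermediate_dir: Path, final_path: Path) -> list:
    """Process only weeks that have no intermediate output yet,
    then rebuild the combined file from all intermediate outputs.

    Returns the names of the weeks processed in this run.
    """
    intermediate_dir.mkdir(parents=True, exist_ok=True)
    final_path.parent.mkdir(parents=True, exist_ok=True)

    newly_processed = []
    for src in sorted(input_dir.glob("week-*.df.gz"), key=week_number):
        week = src.name[: -len(".df.gz")]  # e.g. "week-3"
        dst = intermediate_dir / f"{week}-processed.df"
        if not dst.exists():  # assumption: existing output means "up to date"
            process_week(src, dst)
            newly_processed.append(week)

    # Rebuild the combined dataset from every processed week, in week order.
    parts = sorted(intermediate_dir.glob("week-*-processed.df"), key=week_number)
    with final_path.open("w") as out:
        for part in parts:
            out.write(part.read_text())

    return newly_processed
```

On a first run with week-1 and week-2 present, both are processed; after week-3 is added, a second run processes only week-3 and regenerates the combined file. The sketch rebuilds the combined file from all intermediates each time; the "extra credit" variant would instead append only the new week to the existing combined file.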