Egorsky/azurml_pipelines_in_action

azurml_pipelines_in_action

This repository contains code for deploying an ensemble of models. Don't forget to leave a star ⭐ if you find the content insightful, helpful, or interesting.

Data

The data for this example is synthetic and consists of columns representing date, sales, and complaints.

Models

Because the synthetic data is built from simple sine and cosine functions, random noise, and holidays in Poland, it does not represent the behaviour of any real product. All modelling decisions are therefore made only to provide an example.

For sales forecasting, the SARIMA model provided by statsmodels is used. The out-of-sample forecast is taken from the predicted mean. More information about the model may be found here. The hyperparameters for the SARIMA model are based on EDA.

For complaints forecasting, the Prophet model is used. You may read more about this model in this documentation. Optuna was used to get the hyperparameters for the Prophet model. More information about Optuna may be found here.

Pipeline

The pipeline consists of 5 components:

  • SARIMA model training. In this component the data is read directly from Azure Blob Storage, using SecretClient to access the SAS token.
  • Model forecasting. Reads the model and data from the previous step, makes a forecast, and creates an extended DataFrame.
  • Feature creation. Here features such as lags and moving averages are created.
  • Prophet model training. This component gets the data with features from the previous steps and trains the model.
  • Forecasting of complaints. This component gets the model and data from the previous step and makes a forecast, which is then uploaded to the Blob Storage container as a CSV file using a connection string.
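The feature creation step can be sketched with pandas as follows. The function name `add_lag_and_rolling_features`, the column names, and the lag and window sizes are hypothetical. Shifting by one step before computing the moving average, so that each row uses only past values, is also an assumption and not necessarily what the component does.

```python
import pandas as pd


def add_lag_and_rolling_features(df, column, lags=(1, 7), window=7):
    """Return a copy of df with lag columns and a trailing moving average.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    out = df.copy()
    for lag in lags:
        # Value of the column `lag` rows earlier.
        out[f"{column}_lag_{lag}"] = out[column].shift(lag)
    # Shift by one row first so the average only covers earlier observations.
    out[f"{column}_ma_{window}"] = out[column].shift(1).rolling(window).mean()
    return out


example = pd.DataFrame(
    {"sales": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]},
    index=pd.date_range("2023-01-01", periods=6, freq="D"),
)
features = add_lag_and_rolling_features(example, "sales", lags=(1, 2), window=3)
print(features)
```

The first rows contain NaN wherever there is not enough history, so a downstream training step would typically drop or fill them.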

Features & Artifacts

Throughout the pipeline, features are logged via MLflow. We also record plots and models, which makes it possible to keep version control of the models and track their performance.
