Releases: Galileo-Galilei/kedro-mlflow
Release 0.8.0
[0.8.0] - 2022-01-05
Added
- ✨ Add a `kedro mlflow modelify` command to export a pipeline as an mlflow model (#261)
- 📝 Format code blocks in documentation with `blacken-docs`
- 👷 Enforce the use of `black` and `isort` in the CI to enforce style guidelines for developers
Changed
- ✨ 💥 The `pipeline_ml_factory` accepts 2 new arguments: `log_model_kwargs` (which will be passed as is to `mlflow.pyfunc.log_model`) and `kpm_kwargs` (which will be passed as is to `KedroPipelineModel`). This ensures consistency with the mlflow API and offers new possibilities like saving the project source code alongside the model (#67). Note that the `model_signature`, `conda_env` and `model_name` arguments are removed, and replaced respectively by `log_model_kwargs["signature"]`, `log_model_kwargs["conda_env"]` and `log_model_kwargs["artifact_path"]`.
- ✨ 💥 The `KedroPipelineModel` custom mlflow model now accepts any kedro `Pipeline` as input (provided it has a single DataFrame input and a single output, because this is an mlflow limitation) instead of only `PipelineML` objects. This simplifies the API for users who want to customise the model logging (#171). The `KedroPipelineModel.__init__` argument `pipeline_ml` is renamed `pipeline` to reflect this change.
- 🗑️ `kedro_mlflow.io.metrics.MlflowMetricsDataSet` is no longer deprecated because there is currently no alternative for logging many metrics at the same time.
- 💥 Refactor `mlflow.yml` to match mlflow's API (#77). To migrate projects with `kedro-mlflow<0.8.0`, please update their `mlflow.yml` with the `kedro mlflow init --force` command.
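The renaming of the `pipeline_ml_factory` arguments described above can be sketched as a small translation helper. This is only an illustration under stated assumptions: `migrate_factory_kwargs` is a hypothetical function that does not exist in kedro-mlflow, and only the old and new argument names are taken from the changelog entry.

```python
def migrate_factory_kwargs(old_kwargs):
    """Translate pre-0.8.0 pipeline_ml_factory keyword arguments to the 0.8.0 layout.

    Hypothetical helper for illustration; it is not part of kedro-mlflow.
    """
    # Old argument name -> key expected inside log_model_kwargs (per the changelog).
    renamed = {
        "model_signature": "signature",
        "conda_env": "conda_env",
        "model_name": "artifact_path",
    }
    # Keep every argument that was not renamed.
    new_kwargs = {k: v for k, v in old_kwargs.items() if k not in renamed}
    # Start from any log_model_kwargs already provided by the caller.
    log_model_kwargs = dict(new_kwargs.get("log_model_kwargs", {}))
    for old_name, new_name in renamed.items():
        if old_name in old_kwargs:
            log_model_kwargs[new_name] = old_kwargs[old_name]
    if log_model_kwargs:
        new_kwargs["log_model_kwargs"] = log_model_kwargs
    return new_kwargs


old = {"model_name": "my_model", "conda_env": {"dependencies": ["python=3.8"]}}
print(migrate_factory_kwargs(old))
```

The returned dictionary could then be unpacked into a `pipeline_ml_factory(...)` call alongside the training and inference pipelines.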
Fixed
- 🐛 The `KedroMlflowConfig.setup()` method now sets the experiment globally to ensure all runs are launched under the experiment specified in the configuration, even in interactive mode (#256).
Removed
- 🔥 💥 `KedroMlflowConfig` and `get_mlflow_config` were deprecated since `0.7.3` and are now removed from `kedro_mlflow.framework.context`. Direct imports must now use `kedro_mlflow.config`.
Release 0.7.6
[0.7.6] - 2021-10-08
Fixed
- 🐛 The reserved keyword "databricks" is no longer converted to a local filepath before setting the `MLFLOW_TRACKING_URI`, to enable integration with the Databricks managed platform. (#248)
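The kind of check this fix implies can be sketched as follows. This is a minimal illustration, not the kedro-mlflow implementation: `resolve_tracking_uri` is a hypothetical name, and the rule "anything without a URI scheme is a local path" is an assumption made for the example.

```python
from pathlib import Path
from urllib.parse import urlparse


def resolve_tracking_uri(uri):
    """Return a tracking URI, leaving reserved keywords and existing URIs untouched.

    Hypothetical sketch; not the kedro-mlflow implementation.
    """
    if uri == "databricks":
        # Reserved keyword: it must reach mlflow unchanged.
        return uri
    if urlparse(uri).scheme:
        # Already a URI (e.g. http://..., file://...): keep it as is.
        return uri
    # Assumption: anything else is a local path, converted to an absolute file URI.
    return Path(uri).resolve().as_uri()


print(resolve_tracking_uri("databricks"))
print(resolve_tracking_uri("mlruns"))
```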
Release 0.7.5
[0.7.5] - 2021-09-21
Added
- ✨ Add support for notebook use. When a notebook is opened via a kedro command (e.g. `kedro jupyter notebook`), you can call the `%reload_kedro_mlflow` line magic to set up the mlflow configuration automatically. A `mlflow_client` to the database is also created and made available as a global variable (#124).
- 📝 Add automatic API documentation through docstrings for better consistency between code and docs (#110). Not all docstrings are updated yet; this will be a long-term effort.
Changed
- ♻️ `KedroMlflowConfig` was refactored with pydantic for improved type checking when loading configuration, overall robustness and autocompletion. Its keys have changed, but this is not considered a user-facing change since the public function `get_mlflow_config()` and `KedroMlflowConfig().setup()` are not modified.
- 🗑️ The `kedro_mlflow.framework.context` folder is moved to `kedro_mlflow.config` for consistency with the Kedro repo structure: the `get_mlflow_config` import must change from `from kedro_mlflow.framework.context import get_mlflow_config` to `from kedro_mlflow.config import get_mlflow_config`.
Release 0.7.4
[0.7.4] - 2021-08-30
Added
- ✨ Create an `MlflowMetricDataSet` to simplify the existing metric API. It enables logging a single float as a metric, optionally increasing the "step" automatically if the metric is going to be updated over time (#73)
- ✨ Create an `MlflowMetricHistoryDataSet` to simplify the existing metric API. It enables logging the evolution of a given metric during training. (#73)
Fixed
- 🐛 Dictionary parameters with integer keys are now properly logged in mlflow when `flatten_dict_params` is set to `True` in the `mlflow.yml`, instead of raising a `TypeError` (#224)
- 🐛 The user-defined `sep` parameter of the `mlflow.yml` (defined in the `node` section) is now used even if the parameters dictionary has a depth>=2 (#230)
Changed
- ♻️ Move the `flatten_dict` function to the `hooks.utils` folder and rename it `_flatten_dict` to make it more explicit that it is not a user-facing function: it should not be used directly and comes with no guarantee. This is not considered a breaking change since it is an undocumented function.
- 🗑️ Deprecate `MlflowMetricsDataSet` in favor of the 2 newly added datasets `MlflowMetricDataSet` and `MlflowMetricHistoryDataSet`. It will be removed in `kedro-mlflow==0.8.0`.
Release 0.7.3
[0.7.3] - 2021-08-16
Added
- ✨ Update the `MlflowArtifactDataSet.load()` method to download the data from the `run_id` if it is specified instead of using the local filepath. This can be used for instance to continue training from a pretrained model or to retrieve the best model from a hyperparameter search (#95)
Release 0.7.2
[0.7.2] - 2021-05-02
Fixed
- Remove the global CLI command `new` (which was not implemented yet) to make project CLI commands available. It is not possible to have 2 CLI groups (one at global level, one at project level) because of a bug in `kedro==0.17.3` (#193)
Release 0.7.1
[0.7.1] - 2021-04-09
Added
- It is now possible to deactivate tracking (for parameters and datasets) by specifying a key `disabled_tracking: pipelines: [<pipeline-name>]` in the `mlflow.yml` configuration file. (#92)
- The `kedro mlflow ui` command `host` and `port` keys can be overwritten at runtime (#187)
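To make the shape of the `disabled_tracking` key concrete, the sketch below checks whether a pipeline is excluded from tracking. It assumes the `mlflow.yml` content has already been parsed into a nested Python dictionary; `is_tracking_disabled` is a hypothetical helper and the pipeline names are made up for the example.

```python
def is_tracking_disabled(mlflow_config, pipeline_name):
    """Tell whether `pipeline_name` is listed under disabled_tracking.pipelines.

    Hypothetical helper; `mlflow_config` stands for the parsed mlflow.yml content.
    """
    disabled_section = mlflow_config.get("disabled_tracking") or {}
    # An empty YAML value is parsed as None, hence the `or []` fallback.
    disabled_pipelines = disabled_section.get("pipelines") or []
    return pipeline_name in disabled_pipelines


# Equivalent of `disabled_tracking: pipelines: [data_cleaning]` once parsed.
config = {"disabled_tracking": {"pipelines": ["data_cleaning"]}}
print(is_tracking_disabled(config, "data_cleaning"))
print(is_tracking_disabled(config, "training"))
```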
Fixed
- The `kedro mlflow ui` command now properly reads the `ui:host` and `ui:port` keys from the `mlflow.yml`, which were incorrectly ignored (#187)
Release 0.7.0
[0.7.0] - 2021-03-17
Added
- `kedro-mlflow` now supports `kedro>=0.17.1` (#144).
Changed
- Drop support for `kedro==0.17.0`, since the kedro core team made a breaking change in `0.17.1`. All future plugin updates will only be compatible with `kedro>=0.17.1`.
Release 0.6.0
[0.6.0] - 2021-03-14
Added
- `kedro-mlflow` now supports `kedro==0.17.0` (#144). Since the kedro core team made a breaking change in the patch release `0.17.1`, it is not supported yet. They also recommend downgrading to 0.17.0 for stability.
- Updated documentation
Fixed
- The support of `kedro==0.17.0` automatically makes the CLI commands available when the configuration is declared in a `pyproject.toml` instead of a `.kedro.yml`, which was not the case in previous versions despite our claim that it was (#157).
Changed
- Drop support for `kedro==0.16.x`. All future plugin updates will only be compatible with `kedro>=0.17.0`.
Release 0.5.0
[0.5.0] - 2021-02-21
Added
- A new `long_parameters_strategy` key is added in the `mlflow.yml` (under the hook/node section). You can specify different strategies (`fail`, `truncate` or `tag`) to handle parameters over 250 characters, which cause crashes for some mlflow backends. (#69)
- Add an `env` parameter to the `kedro mlflow init` command to specify under which `conf/` subfolder the `mlflow.yml` should be created. (#159)
- The input parameters of the `inference` pipeline of a `PipelineML` object are now automatically pickled and converted to artifacts. (#158)
- Detailed documentation on how to use the `pipeline_ml_factory` function, and more generally on how to use `kedro-mlflow` as an mlops framework. This comes from an example repo `kedro-mlflow-tutorial`. (#16)
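The three `long_parameters_strategy` values listed above can be illustrated with a small dispatcher. This is a sketch under assumptions, not the plugin's code: `handle_long_parameter` is a hypothetical function, and the reading of `tag` as "record the value as a tag instead of a parameter" is an interpretation of the strategy name.

```python
MAX_PARAM_LENGTH = 250  # limit quoted in the changelog entry


def handle_long_parameter(name, value, strategy="fail"):
    """Return a (params, tags) pair describing what would be logged.

    Hypothetical sketch; not the kedro-mlflow implementation.
    """
    text = str(value)
    if len(text) <= MAX_PARAM_LENGTH:
        # Short enough: log as a regular parameter.
        return {name: text}, {}
    if strategy == "fail":
        raise ValueError(
            f"Parameter '{name}' has {len(text)} characters (limit: {MAX_PARAM_LENGTH})"
        )
    if strategy == "truncate":
        # Keep only the first MAX_PARAM_LENGTH characters.
        return {name: text[:MAX_PARAM_LENGTH]}, {}
    if strategy == "tag":
        # Assumption: the full value is recorded as a tag rather than a parameter.
        return {}, {name: text}
    raise ValueError(f"Unknown strategy: {strategy!r}")


params, tags = handle_long_parameter("columns", "x" * 300, strategy="truncate")
print(len(params["columns"]))
```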
Fixed
- Pin the kedro version to force it to be strictly lower than `0.17`, which is not compatible with the current `kedro-mlflow` version (#143)
- It is no longer assumed for the project to run that the `mlflow.yml` is located under `conf/base`. The project will run as soon as the configuration file is discovered by the registered ConfigLoader (#159)
Changed
- The `KedroPipelineModel.load_context()` method now loads all the `DataSets` in memory in the `DataCatalog`. It is also now possible to specify the `runner` to execute the model as well as the `copy_mode` when executing the inference pipeline (instead of deepcopying the datasets between each node, which is kedro's default). This makes API serving with the `mlflow serve` command considerably faster (~20 times faster) for models which need compiling (e.g. keras, tensorflow ...) (#133)
- The CLI project commands are now always accessible even if you have not called `kedro mlflow init` yet to create a `mlflow.yml` configuration file (#159)