Releases: rodrigo-arenas/Sklearn-genetic-opt
0.6.0
This is a big release with several new features and enhancements! 🎊
Features:
- Added the ProgressBar callback. It uses a tqdm progress bar to show how many generations are left in the training process.
- Added the TensorBoard callback to log the generation metrics, watch them in real time while the models are trained, and compare different runs in your TensorBoard instance.
- Added the TimerStopping callback to stop the iterations after a total (threshold) fitting time has elapsed.
- Added a new parallel coordinates plot using plot_parallel_coordinates, by @Raul9595.
- Now if one or more callbacks decide to stop the algorithm, their class names are printed so you know which callbacks were responsible for the stopping.
- Added support for extra attributes and methods coming from scikit-learn's BaseSearchCV, like cv_results_, best_index_ and refit_time_, among others.
- Added the methods on_start and on_end to BaseCallback. Now the algorithms check for the callbacks like this:
  - on_start: When the evolutionary algorithm is called from the GASearchCV.fit method.
  - on_step: When the evolutionary algorithm finishes a generation (no change here).
  - on_end: At the end of the last generation.
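The callback lifecycle described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the classes below are toy stand-ins that only borrow the names BaseCallback and TimerStopping, the hook signatures and the total_seconds argument are assumptions, and GenerationLimit is a hypothetical callback invented for the example.

```python
import time


class BaseCallback:
    """Toy base class: every hook is optional; on_step returns False (keep going)."""

    def on_start(self, estimator=None):
        pass

    def on_step(self, record=None, logbook=None, estimator=None):
        return False

    def on_end(self, logbook=None, estimator=None):
        pass


class TimerStopping(BaseCallback):
    """Toy version: stops once the time elapsed since on_start exceeds total_seconds."""

    def __init__(self, total_seconds):
        self.total_seconds = total_seconds
        self.started_at = None

    def on_start(self, estimator=None):
        self.started_at = time.monotonic()

    def on_step(self, record=None, logbook=None, estimator=None):
        return time.monotonic() - self.started_at > self.total_seconds


class GenerationLimit(BaseCallback):
    """Hypothetical callback, for illustration only: stops at a given generation."""

    def __init__(self, last_generation):
        self.last_generation = last_generation

    def on_step(self, record=None, logbook=None, estimator=None):
        return record["gen"] >= self.last_generation


def run_generations(n_generations, callbacks):
    """Toy loop standing in for the evolutionary algorithm."""
    logbook = []
    for callback in callbacks:
        callback.on_start()
    for gen in range(n_generations):  # generation 0 is evaluated too
        record = {"gen": gen, "fitness": 0.5 + 0.1 * gen}
        logbook.append(record)
        # Ask every callback, then report which ones requested the stop.
        stoppers = [type(c).__name__ for c in callbacks
                    if c.on_step(record=record, logbook=logbook)]
        if stoppers:
            print(f"Stopped by: {', '.join(stoppers)}")
            break
    for callback in callbacks:
        callback.on_end(logbook=logbook)
    return logbook


# Prints "Stopped by: GenerationLimit" after generations 0, 1 and 2.
logbook = run_generations(10, [TimerStopping(total_seconds=60), GenerationLimit(2)])
```

Collecting the answers of all callbacks before breaking is what makes it possible to name every callback that asked for the stop, rather than only the first one.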
Bug Fixes:
- A missing statement caused the callbacks to be evaluated starting from generation 1, ignoring generation 0. This is now handled properly and callbacks run from generation 0.
API Changes:
- The modules sklearn_genetic.plots and sklearn_genetic.mlflow.MLflowConfig now require an explicit installation of seaborn and mlflow. Those are now optional dependencies, installed using pip install sklearn-genetic-opt[all].
- The GASearchCV.logbook property now has extra information that comes from the scikit-learn cross_validate function.
- An optional extra parameter was added to GASearchCV, named return_train_score: bool, default=False. As in scikit-learn, it controls whether cv_results_ should include the training scores.
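A common way to handle optional dependencies such as seaborn and mlflow is to import them lazily and fail with a message that names the pip extra. The sketch below shows that general pattern only; the helper name require_optional is hypothetical and this is not claimed to be how the package does it.

```python
import importlib


def require_optional(module_name, extra="all"):
    """Import an optional dependency or raise an error that names the pip extra."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is not installed. Install it directly, or run "
            f"'pip install sklearn-genetic-opt[{extra}]'."
        ) from exc


# A standard-library module stands in for an installed optional dependency.
json_module = require_optional("json")

# A made-up module name stands in for a missing one.
try:
    require_optional("surely_not_an_installed_module_xyz")
except ImportError as error:
    message = str(error)
```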
Docs:
- Edited all demos to be in the Jupyter notebook format.
- Added embedded Jupyter notebook examples to the Read the Docs page.
- The modules of the package now have a summary of their classes/functions in the docs.
- Updated the callbacks and custom callbacks tutorials to cover the new TensorBoard callback and the new methods on the base callback.
Internal:
- Now the HallofFame (hof) uses self.best_params_ for position 0, to be consistent with the scikit-learn API and parameters like self.best_index_.
- MLflow now has unit tests, by @Turtle24.
Thanks to new contributors for helping in this project! @Raul9595 @Turtle24
0.5.0
Features:
- Built-in integration with MLflow using the class sklearn_genetic.mlflow.MLflowConfig and the new parameter log_config of the class sklearn_genetic.GASearchCV.
- Implemented the callback sklearn_genetic.callbacks.LogbookSaver, which saves the estimator.logbook object with all the fitted hyperparameters and their cross-validation scores.
- Added the parameter estimator to all the functions in the module sklearn_genetic.callbacks.
Docs:
- Added user guide "Integrating with MLflow"
- Updated the tutorial "Custom Callbacks" for the new API inheritance behavior
Internal:
- Added a base class sklearn_genetic.callbacks.base.BaseCallback from which all callbacks must inherit
- The coverage report now ignores the lines marked with # pragma: no cover and # noqa
0.4.1
Docs:
- Added user guide on "Understanding the evaluation process"
- Added several guides on contributing and a code of conduct
- Added important links
- Docs requirements are now independent of the package requirements
Internal:
- Changed the test CI from Travis to GitHub Actions
0.4.0
Features:
- Implemented the callbacks module to stop the optimization process based on the current iteration metrics. Currently implemented: sklearn_genetic.callbacks.ThresholdStopping, sklearn_genetic.callbacks.ConsecutiveStopping and sklearn_genetic.callbacks.DeltaThreshold.
- The algorithms 'eaSimple', 'eaMuPlusLambda' and 'eaMuCommaLambda' are now implemented in the module sklearn_genetic.algorithms for more control over their options, rather than taking them from the deap.algorithms module.
- Implemented the sklearn_genetic.plots module and added the function sklearn_genetic.plots.plot_search_space. This function plots a mix of contour, scatter and histogram plots over all the fitted hyperparameters and their cross-validation scores.
- Documentation based on rst with Sphinx, hosted on Read the Docs. It includes documentation of the public classes and functions as well as several tutorials on how to use the package. Link: https://sklearn-genetic-opt.readthedocs.io/
- Added the best_params_ and best_estimator_ properties after fitting GASearchCV.
- Added the optional parameters refit, pre_dispatch and error_score.
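To make the three stopping callbacks more concrete, here is a rough sketch of the kind of rule each name suggests, written as plain functions over a list of per-generation metric values. The exact conditions are assumptions made for illustration (the notes above only say the callbacks stop based on the current iteration metrics), and the function names are hypothetical.

```python
def threshold_stop(history, threshold):
    """Stop when the latest metric reaches the threshold."""
    return history[-1] >= threshold


def consecutive_stop(history, generations):
    """Stop when the latest metric does not beat the best of the previous
    `generations` values, i.e. there was no improvement in that window."""
    if len(history) <= generations:
        return False  # not enough generations to compare against yet
    previous = history[-(generations + 1):-1]
    return history[-1] <= max(previous)


def delta_stop(history, threshold):
    """Stop when the latest metric changed by at most `threshold`
    compared with the generation before it."""
    if len(history) < 2:
        return False
    return abs(history[-1] - history[-2]) <= threshold


# Best cross-validation score per generation (made-up numbers).
history = [0.70, 0.80, 0.85, 0.85]

reached_target = threshold_stop(history, threshold=0.90)    # False: 0.85 < 0.90
no_improvement = consecutive_stop(history, generations=2)   # True: 0.85 <= max(0.80, 0.85)
converged = delta_stop(history, threshold=0.001)            # True: the change is 0.0
```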
API Changes:
- Removed support for Python 3.6 and changed the supported library versions to match those of the current scikit-learn version.
- Several internal changes to the documentation and variable naming style to be compatible with Sphinx.
- Removed the parameters continuous_parameters, categorical_parameters and integer_parameters from GASearchCV, replacing them with param_grid.
0.3.0
Features:
- Added the space module to better control the data type and range of each hyperparameter and the distribution to sample random values from, and to merge all data types into one Space class that works with the new param_grid parameter
- Replaced continuous_parameters, categorical_parameters and integer_parameters with param_grid; the old parameters still work but will be removed in a future version
- Added the option to use the eaMuCommaLambda algorithm from deap
- The mu and lambda_ parameters of the internal eaMuPlusLambda and eaMuCommaLambda are now expressed in terms of the initial population size rather than the number of generations
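The idea of a typed search space can be illustrated with a small standalone sketch: each hyperparameter declares its type and range and knows how to sample a random value, and a single dictionary groups them. The class names, the sample method and the hyperparameter names below are illustrative stand-ins and should not be read as the package's actual space API.

```python
import random
from dataclasses import dataclass


@dataclass
class Integer:
    lower: int
    upper: int

    def sample(self, rng):
        return rng.randint(self.lower, self.upper)  # both ends inclusive


@dataclass
class Continuous:
    lower: float
    upper: float

    def sample(self, rng):
        return rng.uniform(self.lower, self.upper)


@dataclass
class Categorical:
    choices: list

    def sample(self, rng):
        return rng.choice(self.choices)


def sample_individual(param_grid, rng):
    """Draw one random value per hyperparameter."""
    return {name: dimension.sample(rng) for name, dimension in param_grid.items()}


# One dictionary replaces three separate per-type parameter dictionaries.
param_grid = {
    "max_depth": Integer(2, 20),
    "learning_rate": Continuous(0.001, 0.3),
    "criterion": Categorical(["gini", "entropy"]),
}

rng = random.Random(0)
individual = sample_individual(param_grid, rng)
```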
0.2.1
Features:
- Enabled deap's eaMuPlusLambda algorithm for the optimization process; it is now the default routine
- Added the parameter keep_top_k to control the number of solutions kept in the hall of fame (hof)
- Changed the default value of crossover_probability from 1 to 0.8 and of generations from 50 to 40
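For readers unfamiliar with the (mu + lambda) strategy and the hall of fame, the following is a simplified standalone sketch on a one-dimensional toy problem. It is not deap's eaMuPlusLambda: crossover is left out, individuals are plain floats, and the function and argument names other than keep_top_k are assumptions chosen for the example.

```python
import heapq
import random


def fitness(x):
    """Toy objective to maximize; the best value is at x = 3."""
    return -(x - 3.0) ** 2


def mu_plus_lambda(mu, lambda_, generations, keep_top_k, seed=0):
    rng = random.Random(seed)
    population = [rng.uniform(-10, 10) for _ in range(mu)]
    hall_of_fame = []
    for _ in range(generations):
        # Create lambda_ children by mutating randomly chosen parents.
        offspring = [rng.choice(population) + rng.gauss(0, 1) for _ in range(lambda_)]
        # "Plus" strategy: parents compete with their children for survival.
        population = heapq.nlargest(mu, population + offspring, key=fitness)
        # Keep only the keep_top_k best distinct solutions seen so far.
        hall_of_fame = heapq.nlargest(
            keep_top_k, set(hall_of_fame + population), key=fitness
        )
    return population, hall_of_fame


population, hall_of_fame = mu_plus_lambda(mu=10, lambda_=20, generations=40, keep_top_k=4)
best = hall_of_fame[0]  # entries are sorted from best to worst
```

Because parents survive alongside their children, the best solution found so far is never lost, which is why the top of the hall of fame and the best member of the final population have the same fitness here.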
Internal:
- Changed parameters with pre-defined options to use pydantic models
Fixes:
- Fixed the logging of the scoring metric in the logbook; it is now part of the parameters and is shown only once
0.2.0
Features:
- Added the logbook and history properties to the fitted GASearchCV for post-fit analysis
- Elitism = False now implements a roulette selection instead of ignoring the parameter
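Roulette (fitness-proportionate) selection picks each individual with probability proportional to its fitness. The sketch below is a generic standalone version for illustration, assuming non-negative fitness values; the function name and arguments are hypothetical and it is not the routine the package calls.

```python
import random
from itertools import accumulate


def roulette_select(population, fitnesses, k, rng):
    """Pick k individuals with probability proportional to fitness.

    Assumes all fitness values are non-negative and at least one is positive.
    """
    cumulative = list(accumulate(fitnesses))  # running totals: the wheel's slice edges
    total = cumulative[-1]
    chosen = []
    for _ in range(k):
        spin = rng.uniform(0, total)
        # The first slice whose upper edge reaches the spin wins.
        for individual, bound in zip(population, cumulative):
            if spin <= bound:
                chosen.append(individual)
                break
    return chosen


rng = random.Random(42)
picks = roulette_select(["a", "b", "c"], [1.0, 1.0, 8.0], k=1000, rng=rng)
share_of_c = picks.count("c") / len(picks)  # expected to be close to 0.8
```

Unlike elitist selection, weak individuals still have a small chance of being picked, which keeps more diversity in the population.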
API Changes:
- Refactored the optimization algorithm to use the deap package instead of a custom implementation. This removes several methods, properties and variables from the GASearchCV class
- The parameter encoding_length has been removed; it is no longer required by the GASearchCV class
- Renamed the property of the fitted estimator from best_params_ to best_params
- The verbosity now prints the deap log of the fitness function, with its standard deviation, max and min values for each generation
- The variable GASearchCV._best_solutions was removed; it is meant to be replaced with GASearchCV.logbook and GASearchCV.history
0.1.1
Bug Fixes:
- Fixed unexpected overwrites in if statements
- Corrected the validation when the parameter dicts are empty
Enhancements:
- Criteria parameter to control if it's a minimization or maximization problem with respect to the scoring metric
- Plot fitness function over generations
- Unit tests for all the package
- Examples with regression problems
- Implementation of some magic methods
- Documentation of the GASearchCV class parameters
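The criteria idea from the list above can be sketched as a sign flip: if the optimizer always maximizes, a metric that should be minimized is negated first. The values 'max' and 'min' and the helper name to_maximization are assumptions made for this illustration.

```python
def to_maximization(score, criteria):
    """Map a raw score to a value the optimizer can always maximize."""
    if criteria == "max":
        return score
    if criteria == "min":
        return -score
    raise ValueError(f"criteria must be 'max' or 'min', got {criteria!r}")


# Accuracy-like metric: higher is better.
best_accuracy = max([0.81, 0.93, 0.88], key=lambda s: to_maximization(s, "max"))
# Error-like metric: lower is better.
best_error = max([0.42, 0.17, 0.30], key=lambda s: to_maximization(s, "min"))
```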
0.1.0
Features:
- GASearchCV for hyperparameter search on sklearn classifiers and regressors, using genetic algorithms and cross-validation