Commit 679c9a2

Merge branch 'scikit-learn:main' into submodulev3
2 parents e7c04f9 + 3b35ad0 commit 679c9a2

34 files changed, +809 -261 lines changed

build_tools/azure/install.sh

Lines changed: 4 additions & 1 deletion
@@ -49,7 +49,10 @@ pre_python_environment_install() {

 python_environment_install_and_activate() {
     if [[ "$DISTRIB" == "conda"* ]]; then
-        conda update -n base conda -y
+        # Install/update conda with the libmamba solver because the legacy
+        # solver can be slow at installing a specific version of conda-lock.
+        conda install -n base conda conda-libmamba-solver -y
+        conda config --set solver libmamba
         conda install -c conda-forge "$(get_dep conda-lock min)" -y
         conda-lock install --name $VIRTUALENV $LOCK_FILE
         source activate $VIRTUALENV

build_tools/azure/install_win.sh

Lines changed: 4 additions & 0 deletions
@@ -7,6 +7,10 @@ set -x
 source build_tools/shared.sh

 if [[ "$DISTRIB" == "conda" ]]; then
+    # Install/update conda with the libmamba solver because the legacy solver
+    # can be slow at installing a specific version of conda-lock.
+    conda install -n base conda conda-libmamba-solver -y
+    conda config --set solver libmamba
     conda install -c conda-forge "$(get_dep conda-lock min)" -y
     conda-lock install --name $VIRTUALENV $LOCK_FILE
     source activate $VIRTUALENV

doc/conftest.py

Lines changed: 0 additions & 7 deletions
@@ -145,13 +145,6 @@ def pytest_runtest_setup(item):
         setup_preprocessing()
     elif fname.endswith("statistical_inference/unsupervised_learning.rst"):
         setup_unsupervised_learning()
-    elif fname.endswith("metadata_routing.rst"):
-        # TODO: remove this once implemented
-        # Skip metarouting because is it is not fully implemented yet
-        raise SkipTest(
-            "Skipping doctest for metadata_routing.rst because it "
-            "is not fully implemented yet"
-        )

     rst_files_requiring_matplotlib = [
         "modules/partial_dependence.rst",

doc/developers/advanced_installation.rst

Lines changed: 18 additions & 6 deletions
@@ -69,6 +69,12 @@ feature, code or documentation improvement).
    .. prompt:: bash $

      conda create -n sklearn-env -c conda-forge python=3.9 numpy scipy cython
+
+   It is not always necessary but it is safer to open a new prompt before
+   activating the newly created conda environment.
+
+   .. prompt:: bash $
+
      conda activate sklearn-env

 #. **Alternative to conda:** If you run Linux or similar, you can instead use
@@ -287,6 +293,12 @@ scikit-learn from source:

     conda create -n sklearn-dev -c conda-forge python numpy scipy cython \
         joblib threadpoolctl pytest compilers llvm-openmp
+
+It is not always necessary but it is safer to open a new prompt before
+activating the newly created conda environment.
+
+.. prompt:: bash $
+
     conda activate sklearn-dev
     make clean
     pip install -v --no-use-pep517 --no-build-isolation -e .
@@ -307,12 +319,6 @@ forge using the following command:

 which should include ``compilers`` and ``llvm-openmp``.

-.. note::
-
-   If you installed these packages after creating and activating a new conda
-   environment, you will need to first deactivate and then reactivate the
-   environment for these changes to take effect.
-
 The compilers meta-package will automatically set custom environment
 variables:

@@ -428,6 +434,12 @@ in the user folder using conda:

     conda create -n sklearn-dev -c conda-forge python numpy scipy cython \
         joblib threadpoolctl pytest compilers
+
+It is not always necessary but it is safer to open a new prompt before
+activating the newly created conda environment.
+
+.. prompt:: bash $
+
     conda activate sklearn-dev
     pip install -v --no-use-pep517 --no-build-isolation -e .

doc/metadata_routing.rst

Lines changed: 5 additions & 5 deletions
@@ -76,15 +76,15 @@ metadata called ``sample_weight``::
   ...     lr,
   ...     X,
   ...     y,
-  ...     props={"sample_weight": my_weights, "groups": my_groups},
+  ...     params={"sample_weight": my_weights, "groups": my_groups},
   ...     cv=GroupKFold(),
   ...     scoring=weighted_acc,
   ... )

 Note that in this example, ``my_weights`` is passed to both the scorer and
 :class:`~linear_model.LogisticRegressionCV`.

-Error handling: if ``props={"sample_weigh": my_weights, ...}`` were passed
+Error handling: if ``params={"sample_weigh": my_weights, ...}`` were passed
 (note the typo), :func:`~model_selection.cross_validate` would raise an error,
 since ``sample_weigh`` was not requested by any of its underlying objects.

@@ -110,7 +110,7 @@ that :func:`~model_selection.cross_validate` does not pass the weights along::
   ...     X,
   ...     y,
   ...     cv=GroupKFold(),
-  ...     props={"sample_weight": my_weights, "groups": my_groups},
+  ...     params={"sample_weight": my_weights, "groups": my_groups},
   ...     scoring=weighted_acc,
   ... )

@@ -142,7 +142,7 @@ instance is set and ``sample_weight`` is not routed to it::
   ...     X,
   ...     y,
   ...     cv=GroupKFold(),
-  ...     props={"sample_weight": my_weights, "groups": my_groups},
+  ...     params={"sample_weight": my_weights, "groups": my_groups},
   ...     scoring=weighted_acc,
   ... )

@@ -166,7 +166,7 @@ consumers. In this example, we pass ``scoring_weight`` to the scorer, and
   ...     X,
   ...     y,
   ...     cv=GroupKFold(),
-  ...     props={
+  ...     params={
   ...         "scoring_weight": my_weights,
   ...         "fitting_weight": my_other_weights,
   ...         "groups": my_groups,

doc/modules/classes.rst

Lines changed: 2 additions & 0 deletions
@@ -995,6 +995,8 @@ details.
    metrics.median_absolute_error
    metrics.mean_absolute_percentage_error
    metrics.r2_score
+   metrics.root_mean_squared_log_error
+   metrics.root_mean_squared_error
    metrics.mean_poisson_deviance
    metrics.mean_gamma_deviance
    metrics.mean_tweedie_deviance

doc/modules/model_evaluation.rst

Lines changed: 9 additions & 1 deletion
@@ -94,8 +94,9 @@ Scoring Function
 'max_error'                            :func:`metrics.max_error`
 'neg_mean_absolute_error'              :func:`metrics.mean_absolute_error`
 'neg_mean_squared_error'               :func:`metrics.mean_squared_error`
-'neg_root_mean_squared_error'          :func:`metrics.mean_squared_error`
+'neg_root_mean_squared_error'          :func:`metrics.root_mean_squared_error`
 'neg_mean_squared_log_error'           :func:`metrics.mean_squared_log_error`
+'neg_root_mean_squared_log_error'      :func:`metrics.root_mean_squared_log_error`
 'neg_median_absolute_error'            :func:`metrics.median_absolute_error`
 'r2'                                   :func:`metrics.r2_score`
 'neg_mean_poisson_deviance'            :func:`metrics.mean_poisson_deviance`
@@ -2310,6 +2311,10 @@ function::
     for an example of mean squared error usage to
     evaluate gradient boosting regression.

+Taking the square root of the MSE, called the root mean squared error (RMSE), is another
+common metric that provides a measure in the same units as the target variable. RMSE is
+available through the :func:`root_mean_squared_error` function.
+
 .. _mean_squared_log_error:

 Mean squared logarithmic error
@@ -2347,6 +2352,9 @@ function::
   >>> mean_squared_log_error(y_true, y_pred)
   0.044...

+The root mean squared logarithmic error (RMSLE) is available through the
+:func:`root_mean_squared_log_error` function.
+
 .. _mean_absolute_percentage_error:

 Mean absolute percentage error
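
Reviewer note on the two metrics documented in this file's hunks: the sketch below shows, under stated assumptions, what "root mean squared error" and "root mean squared logarithmic error" compute. It uses only the standard library. The helper names `rmse` and `rmsle` are hypothetical and are not scikit-learn's implementation, and the `log(1 + y)` form is assumed from the usual definition of the mean squared logarithmic error.

```python
import math


def rmse(y_true, y_pred):
    """Square root of the mean squared error (hypothetical helper)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def rmsle(y_true, y_pred):
    """Square root of the mean squared logarithmic error (hypothetical helper).

    Assumes non-negative inputs, since it compares log(1 + y) values.
    """
    squared = [
        (math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared) / len(squared))


y_true = [3.0, 0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(rmse(y_true, y_pred))   # expressed in the same units as the target
print(rmsle(y_true, y_pred))  # penalizes relative rather than absolute error
```

Because the RMSE is the square root of the MSE, both rank models identically; the root form is simply easier to read against the scale of the target.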

doc/related_projects.rst

Lines changed: 8 additions & 0 deletions
@@ -228,6 +228,14 @@ Note scikit-learn own modern gradient boosting estimators
 - `Flower <https://flower.dev/>`_ A friendly federated learning framework with a
   unified approach that can federate any workload, any ML framework, and any programming language.

+**Privacy Preserving Machine Learning**
+
+- `Concrete ML <https://github.com/zama-ai/concrete-ml/>`_ A privacy preserving
+  ML framework built on top of `Concrete
+  <https://github.com/zama-ai/concrete>`_, with bindings to traditional ML
+  frameworks, thanks to fully homomorphic encryption. APIs of so-called
+  Concrete ML built-in models are very close to scikit-learn APIs.
+
 **Broad scope**

 - `mlxtend <https://github.com/rasbt/mlxtend>`_ Includes a number of additional

doc/whats_new/v1.3.rst

Lines changed: 58 additions & 1 deletion
@@ -9,6 +9,20 @@ Version 1.3.1

 **In development**

+Changed models
+--------------
+
+The following estimators and functions, when fit with the same data and
+parameters, may produce different models from the previous version. This often
+occurs due to changes in the modelling logic (bug fixes or enhancements), or in
+random sampling procedures.
+
+- |Fix| Ridge models with `solver='sparse_cg'` may have slightly different
+  results with scipy>=1.12, because of an underlying change in the scipy solver
+  (see `scipy#18488 <https://github.com/scipy/scipy/pull/18488>`_ for more
+  details)
+  :pr:`26814` by :user:`Loïc Estève <lesteve>`
+
 Changes impacting all modules
 -----------------------------

@@ -18,6 +32,13 @@ Changes impacting all modules
 Changelog
 ---------

+:mod:`sklearn.calibration`
+..........................
+
+- |Fix| :class:`calibration.CalibratedClassifierCV` can now handle models that
+  produce large prediction scores. Before it was numerically unstable.
+  :pr:`26913` by :user:`Omar Salman <OmarManzoor>`.
+
 :mod:`sklearn.cluster`
 ......................

@@ -26,7 +47,14 @@ Changelog
   :pr:`27167` by `Olivier Grisel`_.

 - |Fix| :class:`cluster.BisectingKMeans` now works with data that has a single feature.
-  :pr:`27243` by `Jérémie du Boisberranger <jeremiedbb>`.
+  :pr:`27243` by :user:`Jérémie du Boisberranger <jeremiedbb>`.
+
+:mod:`sklearn.cross_decomposition`
+..................................
+
+- |Fix| :class:`cross_decomposition.PLSRegression` now automatically ravels the output
+  of `predict` if fitted with one dimensional `y`.
+  :pr:`26602` by :user:`Yao Xiao <Charlie-XIAO>`.

 :mod:`sklearn.ensemble`
 .......................
@@ -36,13 +64,34 @@ Changelog
   the sum of the scores should sum to zero for a sample).
   :pr:`26521` by :user:`Guillaume Lemaitre <glemaitre>`.

+:mod:`sklearn.feature_selection`
+................................
+
+- |Fix| :func:`feature_selection.mutual_info_regression` now correctly computes the
+  result when `X` is of integer dtype. :pr:`26748` by :user:`Yao Xiao <Charlie-XIAO>`.
+
 :mod:`sklearn.impute`
 .....................

 - |Fix| :class:`impute.KNNImputer` now correctly adds a missing indicator column in
   ``transform`` when ``add_indicator`` is set to ``True`` and missing values are observed
   during ``fit``. :pr:`26600` by :user:`Shreesha Kumar Bhat <Shreesha3112>`.

+:mod:`sklearn.metrics`
+......................
+
+- |Fix| Scorers used with :func:`metrics.get_scorer` handle properly
+  multilabel-indicator matrix.
+  :pr:`27002` by :user:`Guillaume Lemaitre <glemaitre>`.
+
+:mod:`sklearn.mixture`
+......................
+
+- |Fix| The initialization of :class:`mixture.GaussianMixture` from user-provided
+  `precisions_init` for `covariance_type` of `full` or `tied` was not correct,
+  and has been fixed.
+  :pr:`26416` by :user:`Yang Tao <mchikyt3>`.
+
 :mod:`sklearn.neighbors`
 ........................

@@ -58,12 +107,20 @@ Changelog
   when the input to the `param_distributions` parameter is a list of dicts.
   :pr:`26893` by :user:`Stefanie Senger <StefanieSenger>`.

+- |Fix| Neighbors based estimators now correctly work when `metric="minkowski"` and the
+  metric parameter `p` is in the range `0 < p < 1`, regardless of the `dtype` of `X`.
+  :pr:`26760` by :user:`Shreesha Kumar Bhat <Shreesha3112>`.
+
 :mod:`sklearn.preprocessing`
 ............................

 - |Fix| :class:`preprocessing.LabelEncoder` correctly accepts `y` as a keyword
   argument. :pr:`26940` by `Thomas Fan`_.

+- |Fix| :class:`preprocessing.OneHotEncoder` shows a more informative error message
+  when `sparse_output=True` and the output is configured to be pandas.
+  :pr:`26931` by `Thomas Fan`_.
+
 :mod:`sklearn.tree`
 ...................
