@@ -1022,7 +1022,7 @@ The following table summarizes the penalties and multinomial multiclass supporte
+------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
| **Penalties**                | **'lbfgs'** | **'liblinear'** | **'newton-cg'** | **'newton-cholesky'** | **'sag'** | **'saga'** |
+------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
- | L2 penalty                   | yes         | no              | yes             | no                    | yes       | yes        |
+ | L2 penalty                   | yes         | yes             | yes             | yes                   | yes       | yes        |
+------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
| L1 penalty                   | no          | yes             | no              | no                    | no        | yes        |
+------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
@@ -1032,7 +1032,7 @@ The following table summarizes the penalties and multinomial multiclass supporte
+------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
| **Multiclass support**       |                                                                                                  |
+------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
- | multinomial multiclass       | yes         | no              | yes             | no                    | yes       | yes        |
+ | multinomial multiclass       | yes         | no              | yes             | yes                   | yes       | yes        |
+------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
| **Behaviors**                |                                                                                                  |
+------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+
@@ -1043,8 +1043,11 @@ The following table summarizes the penalties and multinomial multiclass supporte
| Robust to unscaled datasets  | yes         | yes             | yes             | yes                   | no        | no         |
+------------------------------+-------------+-----------------+-----------------+-----------------------+-----------+------------+

- The "lbfgs" solver is used by default for its robustness. For large datasets
- the "saga" solver is usually faster.
+ The "lbfgs" solver is used by default for its robustness. For
+ `n_samples >> n_features`, "newton-cholesky" is a good choice and can reach high
+ precision (tiny `tol` values). For large datasets
+ the "saga" solver is usually faster (than "lbfgs"), in particular for low precision
+ (high `tol`).
For large dataset, you may also consider using :class:`SGDClassifier`
with `loss="log_loss"`, which might be even faster but requires more tuning.

@@ -1101,13 +1104,12 @@ zero, is likely to be an underfit, bad model and you are advised to set
scaled datasets and on datasets with one-hot encoded categorical features with rare
categories.

- * The "newton-cholesky" solver is an exact Newton solver that calculates the hessian
+ * The "newton-cholesky" solver is an exact Newton solver that calculates the Hessian
matrix and solves the resulting linear system. It is a very good choice for
- `n_samples` >> `n_features`, but has a few shortcomings: Only :math:`\ell_2`
- regularization is supported. Furthermore, because the hessian matrix is explicitly
- computed, the memory usage has a quadratic dependency on `n_features` as well as on
- `n_classes`. As a consequence, only the one-vs-rest scheme is implemented for the
- multiclass case.
+ `n_samples` >> `n_features` and can reach high precision (tiny values of `tol`),
+ but has a few shortcomings: Only :math:`\ell_2` regularization is supported.
+ Furthermore, because the Hessian matrix is explicitly computed, the memory usage
+ has a quadratic dependency on `n_features` as well as on `n_classes`.

For a comparison of some of these solvers, see [9]_.

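To make the quadratic memory dependency mentioned for "newton-cholesky" concrete, here is a back-of-the-envelope estimate. It is a sketch under stated assumptions: the helper `dense_hessian_bytes` is hypothetical (not part of scikit-learn), it assumes a dense float64 Hessian with one row and one column per coefficient (`n_features * n_classes` of them), and it ignores intercepts and any solver workspace.

```python
def dense_hessian_bytes(n_features, n_classes=1, itemsize=8):
    """Estimate bytes for a dense square Hessian (hypothetical helper).

    Assumes one coefficient per (feature, class) pair and float64 entries.
    """
    n_coef = n_features * n_classes
    return n_coef * n_coef * itemsize


# Single coefficient vector versus a 10-class problem, 1000 features each.
binary = dense_hessian_bytes(n_features=1_000)
multiclass = dense_hessian_bytes(n_features=1_000, n_classes=10)

print(binary / 1e6, "MB")
print(multiclass / 1e6, "MB")
print("ratio:", multiclass // binary)
```

Under these assumptions, multiplying the number of classes by 10 multiplies the Hessian size by 100, which is why the solver is recommended mainly when `n_features` (and `n_classes`) stay moderate.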