@@ -54,9 +54,8 @@ The last estimator may be any type (transformer, classifier, etc.).
 Usage
 -----
 
-|details-start|
-**Construction**
-|details-split|
+Build a pipeline
+................
 
 The :class:`Pipeline` is built using a list of ``(key, value)`` pairs, where
 the ``key`` is a string containing the name you want to give this step and ``value``
@@ -70,6 +69,10 @@ is an estimator object::
     >>> pipe
     Pipeline(steps=[('reduce_dim', PCA()), ('clf', SVC())])
 
+|details-start|
+**Shorthand version using :func:`make_pipeline`**
+|details-split|
+
 The utility function :func:`make_pipeline` is a shorthand
 for constructing pipelines;
 it takes a variable number of estimators and returns a pipeline,
@@ -81,14 +84,26 @@ filling in the names automatically::
 
 |details-end|
 
+Access pipeline steps
+.....................
+
+The estimators of a pipeline are stored as a list in the ``steps`` attribute.
+A sub-pipeline can be extracted using the slicing notation commonly used
+for Python Sequences such as lists or strings (although only a step of 1 is
+permitted). This is convenient for performing only some of the transformations
+(or their inverse):
+
+    >>> pipe[:1]
+    Pipeline(steps=[('reduce_dim', PCA())])
+    >>> pipe[-1:]
+    Pipeline(steps=[('clf', SVC())])
+
 |details-start|
-**Accessing steps**
+**Accessing a step by name or position**
 |details-split|
 
-
-The estimators of a pipeline are stored as a list in the ``steps`` attribute,
-but can be accessed by index or name by indexing (with ``[idx]``) the
-Pipeline::
+A specific step can also be accessed by index or name by indexing (with ``[idx]``) the
+pipeline::
 
     >>> pipe.steps[0]
     ('reduce_dim', PCA())
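The step-access behaviour documented in this hunk can be tried as a standalone script. This is a minimal sketch, assuming scikit-learn is installed; the variable names (`first`, `last`, `auto_names`, `step_two_rejected`) are illustrative and not part of the change.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.svm import SVC

# Explicitly named steps, as in the documented example.
pipe = Pipeline(steps=[("reduce_dim", PCA()), ("clf", SVC())])

# Slicing returns a new Pipeline holding the selected steps.
first = pipe[:1]   # only the PCA step
last = pipe[-1:]   # only the SVC step

# Indexing by position or by name returns the estimator itself.
by_index = pipe[0]
by_name = pipe["reduce_dim"]
via_attr = pipe.named_steps.reduce_dim

# make_pipeline fills in step names from the lowercased class names.
auto_names = [name for name, _ in make_pipeline(PCA(), SVC()).steps]

# Only a slice step of 1 is permitted; anything else raises ValueError.
try:
    pipe[::2]
    step_two_rejected = False
except ValueError:
    step_two_rejected = True

print(first, last, auto_names, step_two_rejected)
```

Note that `pipe.steps[0]` returns the `(name, estimator)` tuple, whereas `pipe[0]` returns only the estimator.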
@@ -97,36 +112,61 @@ Pipeline::
     >>> pipe['reduce_dim']
     PCA()
 
-Pipeline's `named_steps` attribute allows accessing steps by name with tab
+`Pipeline`'s `named_steps` attribute allows accessing steps by name with tab
 completion in interactive environments::
 
     >>> pipe.named_steps.reduce_dim is pipe['reduce_dim']
     True
 
-A sub-pipeline can also be extracted using the slicing notation commonly used
-for Python Sequences such as lists or strings (although only a step of 1 is
-permitted). This is convenient for performing only some of the transformations
-(or their inverse):
+|details-end|
 
-    >>> pipe[:1]
-    Pipeline(steps=[('reduce_dim', PCA())])
-    >>> pipe[-1:]
-    Pipeline(steps=[('clf', SVC())])
+Tracking feature names in a pipeline
+....................................
 
-|details-end|
+To enable model inspection, :class:`~sklearn.pipeline.Pipeline` has a
+``get_feature_names_out()`` method, just like all transformers. You can use
+pipeline slicing to get the feature names going into each step::
 
-.. _pipeline_nested_parameters:
+    >>> from sklearn.datasets import load_iris
+    >>> from sklearn.feature_selection import SelectKBest
+    >>> iris = load_iris()
+    >>> pipe = Pipeline(steps=[
+    ...    ('select', SelectKBest(k=2)),
+    ...    ('clf', LogisticRegression())])
+    >>> pipe.fit(iris.data, iris.target)
+    Pipeline(steps=[('select', SelectKBest(...)), ('clf', LogisticRegression(...))])
+    >>> pipe[:-1].get_feature_names_out()
+    array(['x2', 'x3'], ...)
 
 |details-start|
-**Nested parameters**
+**Customize feature names**
 |details-split|
 
-Parameters of the estimators in the pipeline can be accessed using the
-``<estimator>__<parameter>`` syntax::
+You can also provide custom feature names for the input data using
+``get_feature_names_out``::
+
+    >>> pipe[:-1].get_feature_names_out(iris.feature_names)
+    array(['petal length (cm)', 'petal width (cm)'], ...)
+
+|details-end|
+
+.. _pipeline_nested_parameters:
+
+Access to nested parameters
+...........................
+
+It is common to adjust the parameters of an estimator within a pipeline. This parameter
+is therefore nested because it belongs to a particular sub-step. Parameters of the
+estimators in the pipeline are accessible using the ``<estimator>__<parameter>``
+syntax::
 
     >>> pipe.set_params(clf__C=10)
     Pipeline(steps=[('reduce_dim', PCA()), ('clf', SVC(C=10))])
 
+|details-start|
+**When does it matter?**
+|details-split|
+
 This is particularly important for doing grid searches::
 
     >>> from sklearn.model_selection import GridSearchCV
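Because the grid-search example spans a hunk boundary, a self-contained sketch of the `<estimator>__<parameter>` syntax may be useful here. It assumes scikit-learn is installed; the iris dataset, `cv=3`, and the smaller `n_components` grid are illustrative assumptions chosen so the script runs quickly, not values taken from the change.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Assumption: iris is used only because it ships with scikit-learn.
X, y = load_iris(return_X_y=True)

pipe = Pipeline(steps=[("reduce_dim", PCA()), ("clf", SVC())])

# "<step name>__<parameter>" reaches into the named step.
pipe.set_params(clf__C=10)

# The same syntax names the parameters to search over.
# Assumption: n_components is limited to [2, 3] because iris has 4 features.
param_grid = {
    "reduce_dim__n_components": [2, 3],
    "clf__C": [0.1, 10, 100],
}
grid_search = GridSearchCV(pipe, param_grid=param_grid, cv=3)
grid_search.fit(X, y)

best_params = grid_search.best_params_
print(best_params)
```

`GridSearchCV` clones the pipeline for each candidate, so `pipe` itself keeps the `C=10` set above.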
@@ -143,36 +183,11 @@ ignored by setting them to ``'passthrough'``::
     ...                   clf__C=[0.1, 10, 100])
     >>> grid_search = GridSearchCV(pipe, param_grid=param_grid)
 
-The estimators of the pipeline can be retrieved by index:
-
-    >>> pipe[0]
-    PCA()
-
-or by name::
-
-    >>> pipe['reduce_dim']
-    PCA()
-
-To enable model inspection, :class:`~sklearn.pipeline.Pipeline` has a
-``get_feature_names_out()`` method, just like all transformers. You can use
-pipeline slicing to get the feature names going into each step::
-
-    >>> from sklearn.datasets import load_iris
-    >>> from sklearn.feature_selection import SelectKBest
-    >>> iris = load_iris()
-    >>> pipe = Pipeline(steps=[
-    ...    ('select', SelectKBest(k=2)),
-    ...    ('clf', LogisticRegression())])
-    >>> pipe.fit(iris.data, iris.target)
-    Pipeline(steps=[('select', SelectKBest(...)), ('clf', LogisticRegression(...))])
-    >>> pipe[:-1].get_feature_names_out()
-    array(['x2', 'x3'], ...)
+.. topic:: See Also:
 
-You can also provide custom feature names for the input data using
-``get_feature_names_out``::
+ * :ref:`composite_grid_search`
 
-    >>> pipe[:-1].get_feature_names_out(iris.feature_names)
-    array(['petal length (cm)', 'petal width (cm)'], ...)
+|details-end|
 
 .. topic:: Examples:
 
@@ -184,11 +199,6 @@ You can also provide custom feature names for the input data using
  * :ref:`sphx_glr_auto_examples_compose_plot_compare_reduction.py`
  * :ref:`sphx_glr_auto_examples_miscellaneous_plot_pipeline_display.py`
 
-.. topic:: See Also:
-
- * :ref:`composite_grid_search`
-
-|details-end|
 
 .. _pipeline_cache:
 