Commit cd93ab3

DOC Improve Tfidf docstring (scikit-learn#26697)
1 parent b51f274 commit cd93ab3

1 file changed, +5 -4 lines changed

sklearn/feature_extraction/text.py

Lines changed: 5 additions & 4 deletions
@@ -633,7 +633,7 @@ class HashingVectorizer(
 'ascii' is a fast method that only works on characters that have
 a direct ASCII mapping.
 'unicode' is a slightly slower method that works on any character.
-None (default) does nothing.
+None (default) means no character normalization is performed.

 Both 'ascii' and 'unicode' use NFKD normalization from
 :func:`unicodedata.normalize`.
@@ -964,7 +964,7 @@ class CountVectorizer(_VectorizerMixin, BaseEstimator):
 'ascii' is a fast method that only works on characters that have
 a direct ASCII mapping.
 'unicode' is a slightly slower method that works on any characters.
-None (default) does nothing.
+None (default) means no character normalization is performed.

 Both 'ascii' and 'unicode' use NFKD normalization from
 :func:`unicodedata.normalize`.
@@ -1786,7 +1786,7 @@ class TfidfVectorizer(CountVectorizer):
 'ascii' is a fast method that only works on characters that have
 a direct ASCII mapping.
 'unicode' is a slightly slower method that works on any characters.
-None (default) does nothing.
+None (default) means no character normalization is performed.

 Both 'ascii' and 'unicode' use NFKD normalization from
 :func:`unicodedata.normalize`.
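
The three `strip_accents` settings described in the hunks above can be illustrated with the standard library alone. This is a minimal sketch, not scikit-learn's implementation: the helper name `strip_accents_sketch` and its `mode` argument are hypothetical, and the only behavior taken from the docstring is that None leaves text untouched and that both 'ascii' and 'unicode' go through NFKD normalization.

```python
import unicodedata


def strip_accents_sketch(text, mode=None):
    """Hypothetical stand-in for the documented `strip_accents` options."""
    if mode is None:
        # None (default): no character normalization is performed.
        return text

    # Both 'ascii' and 'unicode' start from the NFKD decomposition,
    # which splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)

    if mode == "ascii":
        # Keep only characters that have a direct ASCII mapping;
        # everything else (combining marks, non-Latin letters) is dropped.
        return decomposed.encode("ascii", "ignore").decode("ascii")
    if mode == "unicode":
        # Drop combining marks only, so non-Latin letters survive.
        return "".join(c for c in decomposed if not unicodedata.combining(c))
    raise ValueError(f"unknown mode: {mode!r}")


sample = "café Ελλάδα"
unchanged = strip_accents_sketch(sample)             # same string back
ascii_only = strip_accents_sketch(sample, "ascii")   # Greek letters are lost
any_script = strip_accents_sketch(sample, "unicode") # accents removed, Greek kept
```

The sketch shows why the reworded sentence is more precise than "does nothing": with None the text still flows through the rest of the preprocessing, it is only the character normalization step that is skipped.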
@@ -1881,7 +1881,8 @@ class TfidfVectorizer(CountVectorizer):
 binary : bool, default=False
 If True, all non-zero term counts are set to 1. This does not mean
 outputs will have only 0/1 values, only that the tf term in tf-idf
-is binary. (Set idf and normalization to False to get 0/1 outputs).
+is binary. (Set `binary` to True, `use_idf` to False and
+`norm` to None to get 0/1 outputs).

 dtype : dtype, default=float64
 Type of the matrix returned by fit_transform() or transform().
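
To see why `binary=True` alone does not yield 0/1 outputs, the toy computation below may help. It is a pure-Python sketch under stated assumptions, not scikit-learn's code: the helper `tfidf_rows` is hypothetical, tokenization is a plain whitespace split, and the smoothed idf formula `ln((1 + n) / (1 + df)) + 1` with L2 row normalization is assumed as the default weighting.

```python
import math
from collections import Counter


def tfidf_rows(docs, binary=False, use_idf=True, norm="l2"):
    """Toy dense tf-idf (hypothetical helper, for illustration only)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(tok for toks in tokenized for tok in set(toks))

    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        row = []
        for term in vocab:
            tf = counts[term]
            if binary:
                tf = 1 if tf > 0 else 0  # only the tf term becomes binary
            weight = float(tf)
            if use_idf:
                # Assumed smoothed idf: ln((1 + n) / (1 + df)) + 1
                weight *= math.log((1 + n_docs) / (1 + df[term])) + 1.0
            row.append(weight)
        if norm == "l2":
            length = math.sqrt(sum(w * w for w in row))
            if length > 0:
                row = [w / length for w in row]
        rows.append(row)
    return vocab, rows


docs = ["apple apple banana", "banana cherry"]

# binary=True alone: idf weighting and L2 scaling still produce fractions.
vocab, weighted = tfidf_rows(docs, binary=True)

# binary=True, use_idf=False, norm=None: plain 0/1 indicators.
_, zero_one = tfidf_rows(docs, binary=True, use_idf=False, norm=None)
```

In scikit-learn itself the corresponding settings named in the new docstring are `binary=True`, `use_idf=False` and `norm=None` on `TfidfVectorizer`.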
