Commit db33ca6

Docs: Improve docstring for Glove._binarize_vectors
1 parent d3826c4 commit db33ca6

1 file changed: +9 -8 lines changed

ingredient_parser/en/_embeddings.py

Lines changed: 9 additions & 8 deletions
@@ -89,25 +89,26 @@ def _load_vectors_from_file(self, vec_file: str) -> None:
                 self.vectors[token] = vector

     def _binarize_vectors(self):
-        """Binarize word vectors by converting continuous values into discrete values.
+        """Binarize vectors by converting continuous values into discrete values [1].

         For each word vector, calculate the average value of the positive elements and
         the negative elements. Replace each element of each word vector according to:
         if value < negative_average:
-            "NEG"
+            "VNEG"
         elif value > positive_average
-            "POS"
+            "VPOS"
         else
-            "0"
+            "V0"

         The resulting word vectors are stored in the binarized_vectors attribute.

         References
         ----------
-        J. Guo, W. Che, H. Wang, and T. Liu, ‘Revisiting Embedding Features for Simple
-        Semi-supervised Learning’, in Proceedings of the 2014 Conference on Empirical
-        Methods in Natural Language Processing (EMNLP), Doha, Qatar: Association for
-        Computational Linguistics, 2014, pp. 110–120. doi: 10.3115/v1/D14-1012.
+        .. [1] J. Guo, W. Che, H. Wang, and T. Liu, ‘Revisiting Embedding Features for
+           Simple Semi-supervised Learning’, in Proceedings of the 2014 Conference on
+           EmpiricalMethods in Natural Language Processing (EMNLP), Doha, Qatar:
+           Association for Computational Linguistics, 2014, pp. 110–120.
+           doi: 10.3115/v1/D14-1012.
         """
         self.binarized_vectors = {}
         for word, vec in self.vectors.items():
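
The diff cuts off before the body of the loop, so the actual implementation is not shown here. As a minimal sketch of the rule the updated docstring describes, the standalone function below maps one vector to the "VPOS" / "VNEG" / "V0" labels. The function name `binarize_vector`, the plain-list input, and the fallback to 0.0 when a vector has no positive or no negative elements are assumptions made for illustration, not taken from the commit.

```python
def binarize_vector(vec):
    """Map each element of vec to "VPOS", "VNEG" or "V0" (illustrative sketch)."""
    positives = [v for v in vec if v > 0]
    negatives = [v for v in vec if v < 0]

    # Average of the positive and of the negative elements.
    # Assumption: fall back to 0.0 if one of the groups is empty.
    positive_average = sum(positives) / len(positives) if positives else 0.0
    negative_average = sum(negatives) / len(negatives) if negatives else 0.0

    labels = []
    for value in vec:
        if value < negative_average:
            labels.append("VNEG")
        elif value > positive_average:
            labels.append("VPOS")
        else:
            labels.append("V0")
    return labels


# Positive average is 0.3 and negative average is -0.4 for this vector,
# so only 0.5 and -0.6 fall outside the "V0" band.
example = binarize_vector([0.5, 0.1, -0.2, -0.6, 0.0])
```

Applied per word, a result like this is what the docstring says ends up in the binarized_vectors attribute; how the real method stores or further encodes the labels is not visible in this diff.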
