Ability to obtain word count / word frequency from pretrained word vector corpus #5232
Unanswered
aced125 asked this question in Help: Coding & Implementations
Replies: 1 comment
-
Some of the provided spaCy md/lg models (German, Spanish, English, Greek) include word probabilities that come from a different source than the vectors, which you can access per token as |
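The attribute name is cut off in the reply above. As a rough sketch, and assuming the reply refers to spaCy's `Token.prob` attribute (documented as a smoothed log probability estimate of the token's word type), per-token probabilities could be read like this. The helper name `token_probabilities`, the `demo` function, and the model name `en_core_web_md` are illustrative choices, not part of the original reply, and treating the value as a natural logarithm is an assumption.

```python
import math
from typing import Dict, Iterable


def token_probabilities(doc: Iterable) -> Dict[str, float]:
    """Map each token's text to a unigram probability.

    Assumption: every token exposes `.text` and `.prob`, and `.prob`
    is a natural-log probability (as with spaCy's Token.prob).
    Repeated words simply overwrite the same dictionary key.
    """
    return {token.text: math.exp(token.prob) for token in doc}


def demo() -> None:
    """Hypothetical usage; needs spaCy and a model with probability data."""
    import spacy  # third-party, imported lazily so the helper works without it

    nlp = spacy.load("en_core_web_md")  # assumed model name
    doc = nlp("The cat sat on the mat")
    for word, prob in token_probabilities(doc).items():
        print(word, prob)
```

Whether the probability table is bundled with the model or has to be installed separately may depend on the spaCy version, so it is worth checking that the values are not all identical defaults before relying on them.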
-
Feature description
The idea would be to obtain word frequencies for pretrained vectors, e.g. GloVe vectors.
This could allow computing weighted sentence vectors, for example SIF (smooth inverse frequency) embeddings (https://openreview.net/pdf?id=SyK00v5xx).
There may be a way to do this already that I am not aware of.
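To illustrate why word frequencies matter here, the following is a minimal sketch of the SIF weighting step only: each word vector is scaled by a / (a + p(w)) before averaging, so frequent words contribute less. The function names, the plain-dictionary inputs, and the default a = 1e-3 are assumptions made for illustration. The second step described in the paper, removing the projection onto the first principal component across sentences, is omitted.

```python
from typing import Dict, List, Sequence


def sif_weight(prob: float, a: float = 1e-3) -> float:
    """SIF weight a / (a + p(w)); rarer words get weights closer to 1."""
    return a / (a + prob)


def sif_sentence_vector(
    tokens: Sequence[str],
    vectors: Dict[str, List[float]],
    probs: Dict[str, float],
    a: float = 1e-3,
) -> List[float]:
    """Frequency-weighted average of the word vectors in one sentence.

    Assumptions: `vectors` maps words to equal-length lists of floats and
    `probs` maps words to unigram probabilities in [0, 1]. Words without a
    vector are skipped; words without a probability are treated as p = 0,
    which gives them weight 1.
    """
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    count = 0
    for tok in tokens:
        vec = vectors.get(tok)
        if vec is None:
            continue
        weight = sif_weight(probs.get(tok, 0.0), a)
        for i, value in enumerate(vec):
            total[i] += weight * value
        count += 1
    if count == 0:
        return total  # zero vector when no token has a vector
    return [value / count for value in total]


# Toy data, made up for illustration only.
toy_vectors = {"the": [1.0, 0.0], "cat": [0.0, 1.0]}
toy_probs = {"the": 0.05, "cat": 0.0001}
sentence_vec = sif_sentence_vector(["the", "cat"], toy_vectors, toy_probs)
```

In the toy example the rare word "cat" dominates the result, while the frequent word "the" is almost entirely down-weighted.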
Could the feature be a custom component or spaCy plugin?
I will provide a custom spaCy component for SIF embeddings here: