Replies: 1 comment
-
Hi @sakib-NSL
If you can share a simple code sample, I can look deeper into this and see whether we need to improve our WordSegmenterModel() and its models.
-
Hello everyone,
I am a beginner in Spark NLP. I trained a model on a Japanese dataset in Spark NLP using BERT embeddings. I tokenized the data with the spaCy tokenizer, converted it to BIO format, and then used it for training. The results on the test data are satisfactory, but when I run the same test data through the prediction pipeline, the performance decreases. I have tried both Tokenizer() and WordSegmenterModel() in the prediction pipeline, but neither worked. Can I use a different, customized tokenizer in the pipeline?
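For context, the BIO conversion described above can be sketched in plain Python. This is a generic illustration, not the exact preprocessing used in the question; the token and entity formats (character-offset tuples) are assumptions:

```python
# Minimal BIO-tagging sketch. Assumed formats:
#   tokens:   list of (text, start, end) with character offsets
#   entities: list of (start, end, label) character spans
def to_bio(tokens, entities):
    tags = []
    for text, t_start, t_end in tokens:
        tag = "O"
        for e_start, e_end, label in entities:
            if t_start >= e_start and t_end <= e_end:
                # First token inside the span gets B-, later tokens get I-.
                tag = ("B-" if t_start == e_start else "I-") + label
                break
        tags.append(tag)
    return tags

tokens = [("東京", 0, 2), ("に", 2, 3), ("住む", 3, 5)]
entities = [(0, 2, "LOC")]
print(to_bio(tokens, entities))  # → ['B-LOC', 'O', 'O']
```

Note that the tags are tied to one specific tokenization: if a different tokenizer splits 東京 into two tokens, the same entity span yields `['B-LOC', 'I-LOC']` instead, which is exactly why the tokenizer must match between training and prediction.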
Here is the training pipeline:
Here is the prediction pipeline:
Questions:
Thank you in advance.
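The suspected cause (a mismatch between the training-time and prediction-time tokenizers) can be checked directly by running both tokenizers over the same sentences. The tokenizers below are placeholders standing in for spaCy and the pipeline's Tokenizer()/WordSegmenterModel(); the real ones would be plugged in:

```python
# Compare two tokenizations of the same sentences; any difference means the
# NER model sees token boundaries at prediction time that it was never
# trained on, which typically degrades token-level scores.
def mismatch_rate(sentences, tokenize_train, tokenize_predict):
    differing = sum(
        1 for s in sentences if tokenize_train(s) != tokenize_predict(s)
    )
    return differing / len(sentences)

# Placeholder tokenizers (assumptions, for illustration only):
train_tok = lambda s: s.split()                   # word-level split
predict_tok = lambda s: list(s.replace(" ", ""))  # character-level split

sentences = ["東京 に 住む", "大阪 へ 行く"]
print(mismatch_rate(sentences, train_tok, predict_tok))  # → 1.0
```

A rate near 0 rules tokenization out; a high rate suggests the prediction pipeline needs the same tokenizer that produced the BIO training data.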