Hi, thank you for releasing your code!
I ran the preprocessing script preprocess.py and hit a runtime error.
INFO:root:skip this step as /workspace/helo_word/data/conll2014 is NOT empty
INFO:root:STEP 0-8. Download language model
INFO:root:skip this step as /workspace/helo_word/data/language_model/data-bin is NOT empty
INFO:root:STEP 1. Word-tokenize the original files and merge them
INFO:root:STEP 1-1. gutenberg
INFO:root:skip this step as /workspace/helo_word/data/gutenberg/gutenberg.txt already exists
INFO:root:STEP 1-2. tatoeba
INFO:root:skip this step as /workspace/helo_word/data/tatoeba/tatoeba.txt already exists
INFO:root:STEP 1-3. wiki103
INFO:root:skip this step as /workspace/helo_word/data/wiki103/wiki103.txt already exists
INFO:root:STEP 2. Train bpe model
INFO:root:skip this step as /workspace/helo_word/data/bpe-model/gutenberg.model already exists
INFO:root:STEP 3. Split wi.dev into wi.dev.3k and wi.dev.1k
INFO:root:skip this step as /workspace/helo_word/data/bea19/wi+locness/m2/ABCN.dev.gold.bea19.3k.m2 already exists
INFO:root:STEP 4. Perturb and make parallel files
INFO:root:Track 1
INFO:root:STEP 4-1. writing perturbation scenario
INFO:root:STEP 4-2. gutenberg
# multiprocessing settings
# prepare inputs
# work
0%| | 0/1 [00:00<?, ?it/s]
--- SKIP ---
0%| | 0/1 [00:08<?, ?it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/pattern3/text/__init__.py", line 412, in _read
    raise StopIteration
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/workspace/helo_word/gec/perturb.py", line 160, in make_parallel
    perturbation = apply_perturbation(words, word2ptbs, word_change_prob, type_change_prob)
  File "/workspace/helo_word/gec/perturb.py", line 121, in apply_perturbation
    w = change_type(w, t, type_change_prob)
  File "/workspace/helo_word/gec/perturb.py", line 34, in change_type
    word = conjugate(word, verb_type)
  File "/opt/conda/lib/python3.7/site-packages/pattern3/text/__init__.py", line 2123, in conjugate
    b = self.lemma(verb, parse=kwargs.get("parse", True))
  File "/opt/conda/lib/python3.7/site-packages/pattern3/text/__init__.py", line 2088, in lemma
    self.load()
  File "/opt/conda/lib/python3.7/site-packages/pattern3/text/__init__.py", line 2042, in load
    for v in _read(self._path):
RuntimeError: generator raised StopIteration
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "preprocess.py", line 169, in <module>
    args.word_change_prob, args.type_change_prob))
  File "preprocess.py", line 15, in maybe_do
    func(*inputs)
  File "/workspace/helo_word/gec/perturb.py", line 183, in do
    p.map(make_parallel, inputs_li)
  File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
RuntimeError: generator raised StopIteration
I tried skipping the gutenberg corpus, but the same error was raised when processing the next corpus.
How can I fix it?
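For context, the traceback shows `raise StopIteration` inside the generator `_read` in pattern3, running under Python 3.7. Since Python 3.7 (PEP 479), a `StopIteration` that escapes a generator body is converted into `RuntimeError: generator raised StopIteration`, which matches the message above. The following is a minimal sketch of that behaviour only; the function names `read_old_style`, `read_fixed`, and `try_consume` are hypothetical and this is not pattern3's actual code.

```python
def read_old_style(lines):
    """Hypothetical stand-in for a generator that ends with `raise StopIteration`."""
    for line in lines:
        yield line
    # On Python 3.7+ this is converted to RuntimeError (PEP 479).
    raise StopIteration


def read_fixed(lines):
    """Same generator, but ending with a plain `return` instead."""
    for line in lines:
        yield line
    return


def try_consume(gen_func, lines):
    """Exhaust the generator; return its items, or the error text on RuntimeError."""
    try:
        return list(gen_func(lines))
    except RuntimeError as exc:
        return "RuntimeError: {}".format(exc)


if __name__ == "__main__":
    print(try_consume(read_old_style, ["a", "b"]))
    print(try_consume(read_fixed, ["a", "b"]))
```

If this is indeed the cause here, replacing `raise StopIteration` with `return` in the installed pattern3 `_read` function, or running the preprocessing under Python 3.6, might avoid the error. Neither option has been verified against this repository.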