Description
I have the following simple program to see how to run all the different models under
https://github.com/unicode-org/lstm_word_segmentation/tree/master/Models
It currently works for Thai_codepoints_exclusive_model4_heavy, but I can't figure out what values need to be passed in for the other models.
# Lint as: python3
"""
Read a file and output segmented results
"""
import getopt
import sys

from lstm_word_segmentation.word_segmenter import pick_lstm_model


def main(argv):
    inputfile = ''
    outputfile = ''
    try:
        opts, args = getopt.getopt(argv, "hi:o:", ["ifile=", "ofile="])
    except getopt.GetoptError:
        print('test.py -i <inputfile> -o <outputfile>')
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            print('test.py -i <inputfile> -o <outputfile>')
            sys.exit()
        elif opt in ("-i", "--ifile"):
            inputfile = arg
        elif opt in ("-o", "--ofile"):
            outputfile = arg
    print('Input file is "%s"' % inputfile)
    print('Output file is "%s"' % outputfile)
    # Read all lines up front; the file is closed when the block exits.
    # UTF-8 is set explicitly because the input is Thai or Burmese text.
    with open(inputfile, 'r', encoding='utf-8') as file1:
        lines = file1.readlines()
    word_segmenter = pick_lstm_model(model_name="Thai_codepoints_exclusive_model4_heavy",
                                     embedding="codepoints",
                                     train_data="exclusive BEST",
                                     eval_data="exclusive BEST")
    for line in lines:
        # Strip the newline character
        line = line.strip()
        print(line)
        print(word_segmenter.segment_arbitrary_line(line))


if __name__ == "__main__":
    main(sys.argv[1:])
Could you specify what values should be used for embedding, train_data and eval_data for the other models?
Burmese_codepoints_exclusive_model4_heavy
Burmese_codepoints_exclusive_model5_heavy
Burmese_codepoints_exclusive_model7_heavy
Burmese_genvec1235_model4_heavy
Burmese_graphclust_model4_heavy
Burmese_graphclust_model5_heavy
Burmese_graphclust_model7_heavy
Thai_codepoints_exclusive_model4_heavy
Thai_codepoints_exclusive_model5_heavy
Thai_codepoints_exclusive_model7_heavy
Thai_genvec123_model5_heavy
Thai_graphclust_model4_heavy
Thai_graphclust_model5_heavy
Thai_graphclust_model7_heavy
Or could we have a simple function get_lstm_model(model_name) on top of pick_lstm_model() that fills in the necessary parameters and calls pick_lstm_model()?
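To make the request concrete, here is a rough sketch of what such a wrapper could look like. It is only an illustration: `MODEL_PARAMS` and the `picker` argument are hypothetical names, and the table holds only the one entry I know works (the Thai codepoints model4 values from my program above). The entries for the other models are exactly what I am asking about, so they are left out rather than guessed.

```python
# Hypothetical lookup table from model name to pick_lstm_model() keyword arguments.
# Only this one entry is confirmed (it is the call used in the program above);
# the values for the other models are unknown and must be added by someone who knows them.
MODEL_PARAMS = {
    "Thai_codepoints_exclusive_model4_heavy": {
        "embedding": "codepoints",
        "train_data": "exclusive BEST",
        "eval_data": "exclusive BEST",
    },
}


def get_lstm_model(model_name, picker=None):
    """Look up the parameters for model_name and forward them to pick_lstm_model().

    `picker` is a hypothetical hook for substituting another callable (for
    example in tests); by default the real pick_lstm_model is used.
    """
    if model_name not in MODEL_PARAMS:
        raise KeyError("No known parameters for model %r" % model_name)
    if picker is None:
        # Imported lazily so the table can be inspected without the package installed.
        from lstm_word_segmentation.word_segmenter import pick_lstm_model as picker
    return picker(model_name=model_name, **MODEL_PARAMS[model_name])
```

With something like this, my test program would only need `word_segmenter = get_lstm_model("Thai_codepoints_exclusive_model4_heavy")`, and any model name without an entry in the table fails with a clear KeyError instead of a wrong parameter combination.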