pick_lstm_model parameters are too complicated to call #10

@FrankYFTang

I have the following simple program to try out all of the different models under

https://github.com/unicode-org/lstm_word_segmentation/tree/master/Models

It currently works for Thai_codepoints_exclusive_model4_heavy, but I have trouble figuring out which values need to be passed in for the other models.

# Lint as: python3
"""Read a file and print each line followed by its segmented result."""
import getopt
import sys

from lstm_word_segmentation.word_segmenter import pick_lstm_model

USAGE = 'test.py -i <inputfile> -o <outputfile>'


def main(argv):
    inputfile = ''
    outputfile = ''  # Parsed but not used yet; results go to stdout.
    try:
        opts, args = getopt.getopt(argv, "hi:o:", ["ifile=", "ofile="])
    except getopt.GetoptError:
        print(USAGE)
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            print(USAGE)
            sys.exit()
        elif opt in ("-i", "--ifile"):
            inputfile = arg
        elif opt in ("-o", "--ofile"):
            outputfile = arg
    print('Input file is "%s"' % inputfile)
    print('Output file is "%s"' % outputfile)

    # Read all lines up front; the input is assumed to be UTF-8 text.
    with open(inputfile, 'r', encoding='utf-8') as file1:
        lines = file1.readlines()

    word_segmenter = pick_lstm_model(model_name="Thai_codepoints_exclusive_model4_heavy",
                                     embedding="codepoints",
                                     train_data="exclusive BEST",
                                     eval_data="exclusive BEST")

    for line in lines:
        # Strip the trailing newline before segmenting.
        line = line.strip()
        print(line)
        print(word_segmenter.segment_arbitrary_line(line))


if __name__ == "__main__":
    main(sys.argv[1:])

Could you specify what values should be used for embedding, train_data and eval_data for the other models?

Burmese_codepoints_exclusive_model4_heavy
Burmese_codepoints_exclusive_model5_heavy
Burmese_codepoints_exclusive_model7_heavy
Burmese_genvec1235_model4_heavy
Burmese_graphclust_model4_heavy
Burmese_graphclust_model5_heavy
Burmese_graphclust_model7_heavy
Thai_codepoints_exclusive_model4_heavy
Thai_codepoints_exclusive_model5_heavy
Thai_codepoints_exclusive_model7_heavy
Thai_genvec123_model5_heavy
Thai_graphclust_model4_heavy
Thai_graphclust_model5_heavy
Thai_graphclust_model7_heavy

Or is there a simple way to have a small function

get_lstm_model(model_name) on top of pick_lstm_model() that fills in the necessary parameters and calls pick_lstm_model()?
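
A minimal sketch of what such a wrapper could look like, assuming a plain lookup table is acceptable. The names MODEL_KWARGS and the picker argument are hypothetical and only for illustration. The only entry filled in is the one call known to work from the program above; the values for the other models are exactly what this issue asks about, so they are left out rather than guessed.

```python
from typing import Callable, Dict, Optional

# Keyword arguments for pick_lstm_model(), keyed by model name.
# Only this entry is confirmed (it is the call that works in the program
# above); the other models still need their values filled in.
MODEL_KWARGS: Dict[str, Dict[str, str]] = {
    "Thai_codepoints_exclusive_model4_heavy": {
        "embedding": "codepoints",
        "train_data": "exclusive BEST",
        "eval_data": "exclusive BEST",
    },
}


def get_lstm_model(model_name: str, picker: Optional[Callable] = None):
    """Look up the arguments for model_name and forward them to pick_lstm_model().

    picker is a hypothetical hook that lets a stand-in be passed instead of
    the real pick_lstm_model, e.g. for testing without loading a model.
    """
    if model_name not in MODEL_KWARGS:
        raise KeyError("no known parameters for model %r" % model_name)
    if picker is None:
        # Imported lazily so the table can be used without the package installed.
        from lstm_word_segmentation.word_segmenter import pick_lstm_model as picker
    return picker(model_name=model_name, **MODEL_KWARGS[model_name])
```

With something like this, the program above would only need get_lstm_model("Thai_codepoints_exclusive_model4_heavy"), and supporting another model would mean adding one entry to the table.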
