We use the MeCab tokenizer to split Japanese sentences into individual words.

Issue: the word 食べてしまいます gets split into three tokens:

- 食べて
- しまい
- ます

This is rather difficult for readers to understand. Ideally we should use a better parser that understands Japanese conjugation at a higher level.
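As an interim workaround, short of switching parsers, we could post-process MeCab's output and glue auxiliary parts back onto the main verb. The sketch below is a minimal, hypothetical example: it assumes we already have (surface, part-of-speech) pairs from MeCab, and it uses the common IPA-dictionary POS labels 助動詞 (auxiliary verb) and 動詞,非自立 (non-independent verb) as the merge triggers; the exact labels depend on the dictionary in use.

```python
def merge_conjugations(tokens):
    """Merge conjugation fragments into the preceding word.

    tokens: list of (surface, pos) pairs, e.g. built from MeCab node
    surfaces and the leading fields of node.feature.
    Auxiliary verbs (助動詞) and non-independent verbs (動詞,非自立)
    are appended to the token before them instead of standing alone.
    """
    merged = []
    for surface, pos in tokens:
        is_auxiliary = pos.startswith("助動詞") or pos.startswith("動詞,非自立")
        if is_auxiliary and merged:
            prev_surface, prev_pos = merged[-1]
            merged[-1] = (prev_surface + surface, prev_pos)
        else:
            merged.append((surface, pos))
    return merged


# The problematic split from the report (POS labels are illustrative):
tokens = [("食べて", "動詞,自立"), ("しまい", "動詞,非自立"), ("ます", "助動詞")]
print([surface for surface, _ in merge_conjugations(tokens)])
# → ['食べてしまいます']
```

This keeps MeCab as the tokenizer but presents readers with whole conjugated forms; a parser with real conjugation awareness would still be the cleaner long-term fix.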