SNOWMAN, also called 雪だるま, is a Japanese word analyzer that segments Japanese text and tags each word with its part of speech. It has the following advantages:
  • It is a Web-based system; the system has been updated.
  • It merges orthographic variants of words; Japanese is reported to contain around 10% orthographic variants.
  • It detects idioms and functional expressions and treats each of them as one word.
  • It introduces some unique part-of-speech tags, such as quantifier, and double part-of-speech tags, such as "noun-adverb".
  • (planned) A word sense disambiguation module will be partially implemented.
  • and more.
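
As a rough illustration only, the two ideas of merging orthographic variants and treating idioms or functional expressions as one word could be sketched in Python as below. This is not SNOWMAN's implementation: the dictionaries `VARIANTS` and `EXPRESSIONS`, the function names, and the pre-segmented input are all hypothetical and chosen just to show the idea.

```python
# Hypothetical toy tables: illustrative assumptions, not SNOWMAN's dictionary.
# Each surface form maps to one representative spelling.
VARIANTS = {
    "りんご": "林檎",
    "リンゴ": "林檎",
    "林檎": "林檎",
}

# Multi-token expressions that should be output as a single word.
EXPRESSIONS = [
    ("に", "つい", "て"),   # functional expression "について"
    ("腹", "が", "立つ"),   # idiom "腹が立つ"
]


def merge_expressions(tokens, expressions=EXPRESSIONS):
    """Greedily merge the longest known expression starting at each position."""
    by_length = sorted(expressions, key=len, reverse=True)
    merged = []
    i = 0
    while i < len(tokens):
        for expr in by_length:
            if tuple(tokens[i:i + len(expr)]) == expr:
                merged.append("".join(expr))
                i += len(expr)
                break
        else:
            # No expression starts here, so keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


def normalize_variants(tokens, variants=VARIANTS):
    """Replace each token by its representative spelling when one is known."""
    return [variants.get(token, token) for token in tokens]


if __name__ == "__main__":
    tokens = ["リンゴ", "に", "つい", "て", "話す"]
    print(normalize_variants(merge_expressions(tokens)))
```

In this sketch, expressions are merged before variants are normalized so that the expression table only has to list one spelling per token; a real analyzer would need a far larger dictionary and would handle ambiguity that a greedy match ignores.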
You can use the system after registering below. The URL of the system is shown once registration is complete.

The SNOWMAN forum

For further information on SNOWMAN, please refer to the following paper:
(You can download the paper from our paper archive.)
  • Kazuhide Yamamoto, Yuki Miyanishi, Kanji Takahashi, Yoshiki Inomata, Yuki Mikami and Yuta Sudo. What We Need is Word, Not Morpheme; Constructing Word Analyzer for Japanese. Proceedings of the International Conference on Asian Language Processing (IALP 2015), pp.49-52 (2015.10)

Project Leader:

Nagaoka University of Technology