LSDB Archive

Field name	Description
Base form	Corresponding headword in the JST thesaurus
Surface form	The word itself
Left-context ID	MeCab internal ID for left context (see http://taku910.github.io/mecab/dic.html)
Right-context ID	MeCab internal ID for right context (see http://taku910.github.io/mecab/dic.html)
Cost	Occurrence cost of the word; the smaller the value, the more likely the word is to appear in a sentence
POS	Part of speech
POS subcategory 1	POS subcategory 1
POS subcategory 2	POS subcategory 2
POS subcategory 3	POS subcategory 3
Conjugation type	Conjugation type
Conjugation form	Conjugation form
Reading ('Furigana')	Reading of the headword. When Headword Flag is 'V', it may differ from the reading of the surface form.
Pronunciation	Automatically generated from Reading
Source dictionary	Always 'Thesaurus2015'.
ID in Source dictionary	ID in JST Thesaurus
J-GLOBAL ID	ID in J-GLOBAL
Headword Flag	C: the word's surface form is the same as the headword in the JST Thesaurus (or its corresponding hankaku form); V: otherwise
Category code	Category code of science fields in JST Thesaurus
Common word flag 1	1: the IPA dictionary has one or more entries for the surface form; 0: the IPA dictionary has no entry for the surface form
Common word flag 2	Based on "IPA dictionary analysis results". When Common word flag 1 is 1: the part of speech from the IPA dictionary analysis result. When Common word flag 1 is 0: UNKNOWN_1 if the result is a single unknown word; UNKNOWN_2 if the result is multiple tokens including an unknown word; MULTI_WORD if the result is multiple tokens found in the IPA dictionary
IPA dictionary analysis results	Results of morphological analysis with the original IPA dictionary (and with a version of the IPA dictionary in which zenkaku alphanumeric characters and symbols are converted into the corresponding hankaku characters). If the result is divided into multiple tokens, the tokens are whitespace-separated. The results are not manually corrected.
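
The field list above can be mapped onto a small record parser. The following is a minimal sketch, assuming the dictionary is distributed as a comma-separated file whose columns follow the order of the table above; neither the delimiter nor the column order is confirmed here. The names FIELDS, parse_row and read_entries, and every value in SAMPLE_LINE, are hypothetical and serve only as illustration.

```python
import csv
import io
from typing import Any, Dict, List

# Field names in the order the table lists them.
# ASSUMPTION: the file's columns appear in this same order.
FIELDS = [
    "base_form", "surface_form", "left_context_id", "right_context_id", "cost",
    "pos", "pos_sub1", "pos_sub2", "pos_sub3",
    "conjugation_type", "conjugation_form",
    "reading", "pronunciation",
    "source_dictionary", "source_id", "jglobal_id",
    "headword_flag", "category_code",
    "common_word_flag_1", "common_word_flag_2", "ipa_analysis",
]

# Hypothetical row; none of these values come from the real dictionary.
SAMPLE_LINE = (
    "headword,surface,1285,1285,5000,名詞,一般,*,*,*,*,"
    "ヘッドワード,ヘッドワード,Thesaurus2015,ID0001,JG0001,"
    "V,XX00,0,UNKNOWN_2,head word\n"
)


def parse_row(row: List[str]) -> Dict[str, Any]:
    """Turn one split row into a dict keyed by field name."""
    if len(row) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(row)}")
    entry: Dict[str, Any] = dict(zip(FIELDS, row))
    # The context IDs and the cost are numeric in MeCab dictionaries.
    for key in ("left_context_id", "right_context_id", "cost"):
        entry[key] = int(entry[key])
    # The IPA analysis result is whitespace-separated when it has several tokens.
    entry["ipa_tokens"] = entry["ipa_analysis"].split()
    return entry


def read_entries(text: str, delimiter: str = ",") -> List[Dict[str, Any]]:
    """Parse every non-empty line of `text` (ASSUMPTION: comma-delimited)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [parse_row(row) for row in reader if row]


if __name__ == "__main__":
    for entry in read_entries(SAMPLE_LINE):
        # Headword Flag 'V' marks a surface form that differs from the headword.
        print(entry["surface_form"], entry["headword_flag"], entry["ipa_tokens"])
```

If the actual file uses a different delimiter or column order, only the `delimiter` argument and the FIELDS list need to change.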