jpWaC-L0 (Japanese Web, v.difficult sentences)

Japanese Web texts with an automatically assigned difficulty level of '0' (very difficult). Part-of-speech tags and lemmas were assigned with ChaSen. The texts were crawled and annotated in 2007.

Counts
Tokens: 43763041
Words: 38838128
Sentences: 2627335
Documents: 46461
General info
Corpus description Document
Language: Japanese
Encoding: UTF-8
Compiled: 10/28/2017 18:28:12
Tagset Description
Lexicon sizes
word: 167744
lempos: 149129
tag: 67
ctag: 67
level: 5
lemma: 148318
lemma_lc: 147685

Structures and attributes