jpWaC-L2 (Japanese Web, intermediate sentences)

Japanese Web texts with an automatically assigned difficulty level of '2' (intermediate). Annotated with part-of-speech tags and lemmas using ChaSen. Crawled and annotated in 2007.

Counts
Tokens: 4608635
Words: 4052389
Sentences: 372421
Documents: 37365
General info
Corpus description: Document
Language: Japanese
Encoding: UTF-8
Compiled: 10/28/2017 18:27:13
Tagset: Description
Lexicon sizes
word: 12141
lempos: 7465
tag: 65
ctag: 65
level: 4
lc: 12141
lemma: 7103
lemma_lc: 7103

Structures and attributes