hrWaC (Croatian Web)

Croatian Web Corpus v2.2 (2014)

Counts
Tokens: 1,397,757,548
Words: 1,210,021,198
Sentences: 67,403,219
Paragraphs: 28,771,178
Documents: 3,611,090
General info
Corpus description: Document
Language: Croatian
Encoding: UTF-8
Compiled: 10/28/2017 20:14:36
Tagset: Description
Lexicon sizes
word
norm
lempos
tag
lc
lemma
lemma_lc
Tags legend
Noun: N.*
Noun proper: Np.*
Noun common: Nc.*
Verb: V.*
Adjective: A.*
Pronoun: P.*
Adverb: R.*
Preposition: S.*
Conjunction: C.*
Numeral: M.*
Particle: Q.*
Article: T.*
Interjection: I.*
Abbreviation: Y.*
Residual: X.*
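
The legend entries read as regular expressions matched against the value of the tag attribute. The following is a minimal sketch, assuming that interpretation, of how a tag string could be mapped to its legend category in Python. The function name classify_tag and the sample tags (Ncmsn, Npmsn, Vmr3s, Z) are illustrative assumptions and are not taken from this page.

```python
import re

# Legend patterns copied from the "Tags legend" list. The more specific
# proper/common noun patterns come first so they win over the generic N.*.
TAG_LEGEND = [
    ("Noun proper", r"Np.*"),
    ("Noun common", r"Nc.*"),
    ("Noun", r"N.*"),
    ("Verb", r"V.*"),
    ("Adjective", r"A.*"),
    ("Pronoun", r"P.*"),
    ("Adverb", r"R.*"),
    ("Preposition", r"S.*"),
    ("Conjunction", r"C.*"),
    ("Numeral", r"M.*"),
    ("Particle", r"Q.*"),
    ("Article", r"T.*"),
    ("Interjection", r"I.*"),
    ("Abbreviation", r"Y.*"),
    ("Residual", r"X.*"),
]


def classify_tag(tag):
    """Return the first legend label whose pattern matches the whole tag.

    Returns None when no legend pattern matches.
    """
    for label, pattern in TAG_LEGEND:
        if re.fullmatch(pattern, tag):
            return label
    return None


if __name__ == "__main__":
    # Hypothetical sample tags, used only to exercise the patterns.
    for sample in ["Ncmsn", "Npmsn", "Vmr3s", "Z"]:
        print(sample, "->", classify_tag(sample))
```

Because the lookup uses the first match, a tag starting with Np or Nc is reported under the narrower noun category, and a tag that fits none of the listed patterns returns None.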