hrWaC (Croatian Web)

Croatian Web Corpus v2.2 (2014)

Counts
Tokens: 1397757548
Words: 1210021198
Sentences: 67403219
Paragraphs: 28771178
Documents: 3611090
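The raw counts are easier to interpret as averages. The following is a minimal Python sketch that derives a few ratios from the figures listed above; the dictionary keys and the `ratio` helper are illustrative names, not part of the corpus interface.

```python
# Counts copied from the listing above.
counts = {
    "tokens": 1397757548,
    "words": 1210021198,
    "sentences": 67403219,
    "paragraphs": 28771178,
    "documents": 3611090,
}


def ratio(numerator, denominator):
    """Average number of `numerator` units per `denominator` unit."""
    return counts[numerator] / counts[denominator]


tokens_per_sentence = ratio("tokens", "sentences")
sentences_per_document = ratio("sentences", "documents")
tokens_per_document = ratio("tokens", "documents")

print(f"tokens per sentence:    {tokens_per_sentence:.2f}")
print(f"sentences per document: {sentences_per_document:.2f}")
print(f"tokens per document:    {tokens_per_document:.2f}")
```

These are corpus-wide means only; they say nothing about how sentence or document length is distributed.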
General info
Corpus description Document
Language: Croatian
Encoding: UTF-8
Compiled: 10/28/2017 20:14:36
Tagset Description
Lexicon sizes
word: 9778545
norm: 8248073
lempos: 9060101
tag: 735
lc: 8305309
lemma: 7792876
lemma_lc: 6914817
Tags legend
Noun: N.*
Noun proper: Np.*
Noun common: Nc.*
Verb: V.*
Adjective: A.*
Pronoun: P.*
Adverb: R.*
Preposition: S.*
Conjunction: C.*
Numeral: M.*
Particle: Q.*
Article: T.*
Interjection: I.*
Abbreviation: Y.*
Residual: X.*
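Each legend entry pairs a part-of-speech label with a regular expression over the tag attribute. As a rough illustration, the Python sketch below applies those patterns to a tag string. It assumes the pattern must match the whole tag, and the sample tags in the usage lines (`Ncmsn`, `Npmsn`, `Vmr3s`) are hypothetical values chosen for illustration, not taken from this page; `categories_for` is likewise an illustrative helper name.

```python
import re

# Legend from the listing above: category label -> regex over the tag attribute.
TAG_LEGEND = {
    "Noun": "N.*",
    "Noun proper": "Np.*",
    "Noun common": "Nc.*",
    "Verb": "V.*",
    "Adjective": "A.*",
    "Pronoun": "P.*",
    "Adverb": "R.*",
    "Preposition": "S.*",
    "Conjunction": "C.*",
    "Numeral": "M.*",
    "Particle": "Q.*",
    "Article": "T.*",
    "Interjection": "I.*",
    "Abbreviation": "Y.*",
    "Residual": "X.*",
}


def categories_for(tag):
    """Return every legend label whose pattern matches the entire tag.

    Assumption: patterns are anchored to the whole tag value, so
    re.fullmatch is used rather than re.search.
    """
    return [
        label
        for label, pattern in TAG_LEGEND.items()
        if re.fullmatch(pattern, tag)
    ]


# Hypothetical sample tags, for illustration only.
print(categories_for("Ncmsn"))
print(categories_for("Npmsn"))
print(categories_for("Vmr3s"))
```

Because `N.*` is a superset of `Np.*` and `Nc.*`, a noun tag matches both the general label and one of the two subtypes, which is why the helper returns a list rather than a single label.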

Structures and attributes