Janes Tweet (tweets 2013-2017)
Corpus of Slovene tweets (2013-2017) Janes Tweet v1.0
Counts |
Tokens | 151457091 |
Words | 109672826 |
Sentences | 20202197 |
Paragraphs | 11308823 |
Documents | 11308823 |
General info |
Corpus description |
Document |
Language | Slovenian |
Encoding | UTF-8 |
Compiled | 10/28/2017 19:10:00 |
Tagset |
Description |
Lexicon sizes |
word | |
norm | |
lempos | |
tag_en | |
tag | |
diff | |
lc | |
lemma | |
lemma_lc | |
Tags legend |
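noun (samostalnik) | S.* |
verb (glagol) | G.* |
adjective (pridevnik) | P.* |
adverb (prislov) | R.* |
pronoun (zaimek) | Z.* |
preposition (predlog) | D.* |
conjunction (veznik) | V.* |
particle (členek) | L.* |
interjection (medmet) | M.* |
numeral (števnik) | K.* |
abbreviation (okrajšava) | O.* |
unclassified (neuvrščeno) | N.* |
punctuation (ločilo) | U.* |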
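The legend above pairs each part-of-speech category with a regular expression over the `tag` attribute. As a rough illustration only, the Python sketch below looks up the category for a given tag by matching it against those patterns. The function name `tag_category`, the English labels (translations of the Slovene category names), and the sample tag values are assumptions made for this example and are not taken from the corpus documentation.

```python
import re
from typing import Optional

# Patterns copied from the tags legend; the English labels are translations
# added for this sketch and are not part of the corpus itself.
TAG_LEGEND = [
    ("noun", r"S.*"),
    ("verb", r"G.*"),
    ("adjective", r"P.*"),
    ("adverb", r"R.*"),
    ("pronoun", r"Z.*"),
    ("preposition", r"D.*"),
    ("conjunction", r"V.*"),
    ("particle", r"L.*"),
    ("interjection", r"M.*"),
    ("numeral", r"K.*"),
    ("abbreviation", r"O.*"),
    ("unclassified", r"N.*"),
    ("punctuation", r"U.*"),
]


def tag_category(tag: str) -> Optional[str]:
    """Return the legend category whose pattern matches the whole tag."""
    for label, pattern in TAG_LEGEND:
        if re.fullmatch(pattern, tag):
            return label
    return None  # tag does not start with any letter from the legend


# Illustrative tag values (assumed, not read from the corpus).
print(tag_category("Sozei"))
print(tag_category("U"))
```

The same patterns can in principle be used directly in a corpus query on the `tag` attribute; the exact query syntax depends on the interface and is not shown here.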
Lempos suffixes |
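noun (samostalnik) | -s |
verb (glagol) | -g |
adjective (pridevnik) | -p |
adverb (prislov) | -r |
pronoun (zaimek) | -z |
preposition (predlog) | -d |
conjunction (veznik) | -v |
particle (členek) | -l |
interjection (medmet) | -m |
numeral (števnik) | -k |
abbreviation (okrajšava) | -o |
unclassified (neuvrščeno) | -n |
punctuation (ločilo) | -u |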
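Each suffix in the table above is the lowercase form of the first letter of the corresponding tag pattern. Assuming a `lempos` value is simply the lemma followed by one of these suffixes (for example `hiša-s`), a minimal Python sketch for splitting such a value back into lemma and category could look like this. The helper name `split_lempos`, the English labels, and the sample values are hypothetical and only serve the illustration.

```python
# Suffix letters copied from the lempos suffix table; the English labels are
# translations added for this sketch.
LEMPOS_SUFFIXES = {
    "s": "noun",
    "g": "verb",
    "p": "adjective",
    "r": "adverb",
    "z": "pronoun",
    "d": "preposition",
    "v": "conjunction",
    "l": "particle",
    "m": "interjection",
    "k": "numeral",
    "o": "abbreviation",
    "n": "unclassified",
    "u": "punctuation",
}


def split_lempos(lempos: str) -> tuple:
    """Split an assumed 'lemma-x' value into (lemma, category label)."""
    # Split on the LAST hyphen so that lemmas containing hyphens stay intact.
    lemma, sep, suffix = lempos.rpartition("-")
    if not sep or suffix not in LEMPOS_SUFFIXES:
        raise ValueError(f"not a recognised lempos value: {lempos!r}")
    return lemma, LEMPOS_SUFFIXES[suffix]


# Illustrative values (assumed, not read from the corpus).
print(split_lempos("hiša-s"))
print(split_lempos("e-pošta-s"))
```

Splitting on the last hyphen is a deliberate choice here, since a lemma may itself contain a hyphen.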