Latest

corpus
Description:
The dataset represents the Twitter production in Slovenian in the period from 2018 until 2020. It consists of tweet IDs, retweet IDs, pseudo-anonymized user IDs, publication dates, and automatically assigned hate labels ...
This entry contains 1 file (182.04 MB).
 
Publicly Available; Distributed under Creative Commons; Attribution Required; Share Alike
toolService
Description:
The Q-CAT (Querying-Supported Corpus Annotation Tool) is a computational tool for manual annotation of language corpora, which also enables advanced queries on top of these annotations. The tool has been used in various ...
This entry contains 1 file (1.11 MB).
 
Publicly Available
corpus
Description:
Gos VideoLectures is an add-on to the Gos reference corpus of spoken Slovene (http://hdl.handle.net/11356/1040), and covers public academic speech. It can be used for training continuous speech recognition for Slovene ...
This entry contains 3 files (20.74 MB).
 
Publicly Available; Distributed under Creative Commons; Attribution Required

Most viewed

In the past week
lexicalConceptualResource
Description:
A lexicon of 751 emoji characters with automatically assigned sentiment. The sentiment is computed from 70,000 tweets, labeled by 83 human annotators in 13 European languages. The process and analysis of emoji sentiment ...
This entry contains 3 files (93.95 KB).
 
Publicly Available; Distributed under Creative Commons; Attribution Required; Share Alike
lexicalConceptualResource
Description:
hrLex is a large inflectional lexicon of the Croatian language in which each entry is an 8-tuple: (wordform, lemma, MSD, MSD features, UPOS, morphological features, frequency, per-million frequency). The (wordform, lemma, ...
This entry contains 1 file (51.95 MB).
 
Publicly Available; Distributed under Creative Commons; Attribution Required; Share Alike