What's New
lexicalConceptualResource

Description:
Launched in December 2004 by the Domestic Research Society, Razvezani jezik (The Unleashed Tongue) is the first user-generated online dictionary of the spoken Slovenian language. As a Wiki project, it allowed every visitor to ...
This item contains 1 file (1.89 MB).
Publicly Available


toolService

Description:
The monolingual Slovene RoBERTa (A Robustly Optimized Bidirectional Encoder Representations from Transformers) model is a state-of-the-art model representing words/tokens as contextually dependent word embeddings, used for ...
This item contains 4 files (423.5 MB).
Publicly Available



lexicalConceptualResource

Description:
This entry consists of a TSV file containing a list of 66,347 Slovene word pairs from the Sloleks Morphological Lexicon of Slovene (v2.0; http://hdl.handle.net/11356/1230) that have been automatically identified as ...
This item contains 1 file (2.83 MB).
Publicly Available



Most Viewed Items
Top Last Week
corpus

Description:
The corpus contains 256,567 documents from the Slovenian news portals 24ur, Dnevnik, Finance, Rtvslo, and Žurnal24. These portals contain political, business, economic and financial content. The submission contains 7 files: ...
This item contains 8 files (616.88 MB).
Publicly Available



lexicalConceptualResource

Description:
A lexicon of 751 emoji characters with automatically assigned sentiment. The sentiment is computed from 70,000 tweets, labeled by 83 human annotators in 13 European languages. The process and analysis of emoji sentiment ...
This item contains 3 files (93.95 KB).
Publicly Available



corpus

Description:
The dataset contains over 1.6 million tweets (tweet IDs), labeled with sentiment by human annotators. There are 15 Twitter corpora for the corresponding 15 European languages. The data can be used to train and evaluate ...
This item contains 16 files (49.38 MB).
Publicly Available


