What's New
corpus

Description:
The dataset consists of 7514 Slovenian news articles from the SentiNews 1.0 corpus (Bučar et al. 2017, http://hdl.handle.net/11356/1110) for which article keywords were available.
We provide the train and test data splits ...
This item contains 2 file(s) (6.05 MB).
Publicly Available



corpus

Description:
The Trendi corpus is a monitor corpus of Slovene. It contains news from 107 different media websites, published by 48 different publishers. Trendi 2022-05 covers the period from January 2019 to May 2022, complementing the ...
This item contains no files.
lexicalConceptualResource

Description:
Algemeen Nederlands Woordenboek (ANW). The ANW is a corpus-based, digital dictionary that describes contemporary Dutch in the Netherlands, Flanders, Suriname, and the Caribbean as comprehensively as possible. The language ...
This item contains no files.
Most viewed
In the past week
corpus

Description:
The 24sata news portal consists of a daily news portal and several smaller portals covering specific topics, such as automotive news, health, culinary content, and lifestyle advice. The dataset contains over ...
This item contains 2 file(s) (1.26 GB).
Publicly Available




corpus

Description:
ParlaMint 2.1 is a multilingual set of 17 comparable corpora containing parliamentary debates mostly starting in 2015 and extending to mid-2020, with each corpus being about 20 million words in size. The sessions in the ...
This item contains 18 file(s) (23.37 GB).
Publicly Available


lexicalConceptualResource

Description:
A lexicon of 751 emoji characters with automatically assigned sentiment.
The sentiment is computed from 70,000 tweets, labeled by 83 human annotators
in 13 European languages.
The process and analysis of emoji sentiment ...
This item contains 3 file(s) (93.95 KB).
Publicly Available


