What's New
corpus
Description:
The dataset contains social media posts from X and traditional media articles from online news sources related to the Slovenian commemorations of the Day of Resistance.
We used two types of data: For the social media ...
This item contains 2 files (2.5 MB).
Publicly Available
toolService
Description:
This is a retrained Slovenian model for the Trankit v1.1.1 library for multilingual natural language processing (https://pypi.org/project/trankit/), trained on the concatenation of the SSJ UD treebank of written Slovenian ...
This item contains 1 file (145.55 MB).
Publicly Available
corpus
Description:
This entry contains the first part of the audiobook "Sam bog naj jo bere" (Let only God read it) by author Alenka Čurin Janžekovič (COBISS ID: 277038339, ISBN: 978-961-291-543-8).
An extraordinary first-person account ...
This item contains 4 files (177.19 MB).
Publicly Available
Most Viewed Items
Top Last Week
corpus
Description:
This entry includes the first part of the e-book "Socialna omrežja" (Social networks) by author Aleš Jelenko (COBISS.SI-ID 270071555, ISBN 978-961-7272-26-0).
What do we say when we speak in a digital language? And what ...
This item contains 1 file (466.99 KB).
Publicly Available
corpus
Description:
COLESLAW 1.0 is a large-scale collection of Slovenian legal texts compiled from authoritative public sources. The corpus covers legislative, judicial, and governmental legal documents and is designed to support research ...
This item contains 1 file (1.24 GB).
Publicly Available
corpus
Description:
SloIE is a manually labelled dataset of Slovene idiomatic expressions. It contains 29,400 sentences with 75 different expressions that can occur with either a literal or an idiomatic meaning, with appropriate manual ...
This item contains 1 file (4.22 MB).
Publicly Available