What's New

corpus
Description:
This entry contains the first part of the audiobook "Deček na črnem konju" (The boy on the black horse) by author Leopold Suhodolčan (COBISS ID: 273737219, ISBN: 978-961-7194-60-9). A youth novel about love, friendship, ...
 This entry contains 8 files (126.66 MB).
 
Publicly Available. Distributed under Creative Commons: Attribution Required, Share Alike.
corpus
Description:
This entry contains the first part of the audiobook "Skriti dnevnik" (The hidden diary) by author Leopold Suhodolčan (COBISS ID: 272747267, ISBN: 978-961-7194-57-9). The story of a boy named Mirt, who decides to fulfill ...
 This entry contains 6 files (90.53 MB).
 
Publicly Available. Distributed under Creative Commons: Attribution Required, Share Alike.
corpus
Description:
This entry contains the first part of the audiobook "Naočnik in očalnik" (The Spectacle Man and the Glasses Man) by author Leopold Suhodolčan (COBISS ID: 274676739, ISBN: 978-961-7194-64-7). Detective Naočnik and Očalnik ...
 This entry contains 3 files (95.4 MB).
 
Publicly Available. Distributed under Creative Commons: Attribution Required, Share Alike.

Most Viewed

In the past week
corpus
Description:
SloIE is a manually labelled dataset of Slovene idiomatic expressions. It contains 29,400 sentences with 75 different expressions that can occur with either a literal or an idiomatic meaning, with appropriate manual ...
 This entry contains 1 file (4.22 MB).
 
Publicly Available. Distributed under Creative Commons: Attribution Required, Noncommercial, Share Alike.
corpus
Description:
A dataset of user comments, provided for research purposes for EMBEDDIA, a Horizon 2020 project, and extracted from the user comment database of the 24sata.hr news portal. 24sata.hr is the largest-circulation ...
 This entry contains 3 files (1.89 GB).
 
Publicly Available. Distributed under Creative Commons: Attribution Required, Noncommercial, No Derivative Works.
corpus
Description:
The dataset represents the Twitter production in Slovenian in the period from 2018 until 2020. It consists of tweet IDs, retweet IDs, pseudo-anonymized user IDs, publication dates, and automatically assigned hate labels ...
 This entry contains 1 file (182.04 MB).
 
Publicly Available. Distributed under Creative Commons: Attribution Required, Share Alike.