What's New
corpus
Description:
The dataset contains social media posts from X and traditional media articles from online news sources related to the Slovenian commemorations of the Day of Resistance.
We used two types of data: For the social media ...
This item contains 2 files (2.5 MB).
Publicly Available
toolService
Description:
This is a retrained Slovenian model for the Trankit v1.1.1 library for multilingual natural language processing (https://pypi.org/project/trankit/), trained on the concatenation of the SSJ UD treebank of written Slovenian ...
This item contains 1 file (145.55 MB).
Publicly Available
corpus
Description:
This entry contains the first part of the audiobook "Sam bog naj jo bere" (Let only God read it) by author Alenka Čurin Janžekovič (COBISS ID: 277038339, ISBN: 978-961-291-543-8).
An extraordinary first-person account ...
This item contains 4 files (177.19 MB).
Publicly Available
Most Viewed Items
Top Last Week
lexicalConceptualResource
Description:
Trilingual (EN-LT-NO) glossary of terms denoting phobia types, extracted from articles on the news sites "The Guardian" (English), "DELFI" (Lithuanian), and "Dagbladet" (Norwegian).
This item contains no files.
corpus
Description:
goo300k is a manually annotated reference corpus of historical Slovene. It contains 1,100 pages (about 300,000 tokens) sampled from 89 texts from the period 1584-1899.
Each text contains extensive meta-data and per-page ...
This item contains 2 files (8.9 MB).
Publicly Available
lexicalConceptualResource
Description:
The resource contains the English SimLex-999 word pairs (Hill et al. 2015) and their Slovene translations. In the translation process, the word pairs were first translated by two translators independently, and next, for the examples where ...
This item contains 3 files (37.3 KB).
Publicly Available