What's New

toolService
Description:
The LIST corpus extraction tool is a Java program for extracting lists from text corpora on the levels of characters, word parts, words, and word sets. It supports VERT and TEI P5 XML formats and outputs .CSV files that ...
This entry contains 1 file (231.07 MB).
 
Publicly Available
corpus
Description:
The Trendi corpus is a monitor corpus of Slovenian. It contains news articles from 107 media websites, published by 77 publishers. Trendi 2024-08 covers the period from January 2019 to August 2024, complementing the Gigafida ...
This entry contains no files.
toolService
Description:
This is a retrained Slovenian model for the Trankit v1.1.1 library for multilingual natural language processing (https://pypi.org/project/trankit/), trained on the concatenation of the SSJ UD treebank of written Slovenian ...
This entry contains 1 file (145.44 MB).
 
Publicly Available

Most Viewed

In the past week
corpus
Description:
PodzemniRadovi-sr-en, a bilingual aligned corpus of papers from the field of mining. Undeground-mining-sr-en: bilingual texts from the Underground Mining Engineering journal (55 papers from 8 issues), aligned at the sentence ...
This entry contains no files.
corpus
Description:
The CVET corpus contains 230 texts (around 175 thousand words) of varying length, published in the religious journal "Cvetje z vertov sv. Frančiška" between 1887 and 1916, when the magazine was edited by the linguist Fr. ...
This entry contains 4 files (15.02 MB).
 
Publicly Available, Distributed under Creative Commons, Attribution Required