What's New
corpus
Description:
This submission includes the first part of the audiobook "Svetišča narave" (Nature Sanctuaries) by author Irena Cerar (COBISS.ID: 275565059, ISBN: 978-961-291-542-1).
With the book Nature sanctuaries, Irena Cerar, the author ...
This entry contains 4 files (161.37 MB).
Publicly Available
corpus
Description:
The Disasters corpus in classical Arabic sources (DiCCAS) is designed to allow historians to compare different accounts and narratives of disasters in a variety of classical Arabic sources.
The corpus encompasses a ...
This entry contains 7 files (20.97 MB).
Publicly Available
corpus
Description:
This entry contains the first part of the audiobook "Cvetoča Slovenija" (Blooming Slovenia) by author Ivan Sivec (COBISS.ID: 275424259, ISBN: 978-961-291-539-1).
Many believe that the writer Ivan Sivec is one of the ...
This entry contains 3 files (102 MB).
Publicly Available
Most Viewed
In the past week
corpus
Description:
ParlaMint 5.0 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
This entry contains 31 files (5.94 GB).
Publicly Available
toolService
Description:
The LIST corpus extraction tool is a Java program for extracting lists from text corpora on the levels of characters, word parts, words, and word sets. It supports VERT and TEI P5 XML formats and outputs .CSV files that ...
This entry contains 1 file (231.07 MB).
Publicly Available
corpus
Description:
goo300k is a manually annotated reference corpus of historical Slovene. It contains 1,100 pages (about 300,000 tokens) sampled from 89 texts from the period 1584-1899.
Each text contains extensive meta-data and per-page ...
This entry contains 2 files (8.9 MB).
Publicly Available