What's New

corpus
Description:
The Trendi corpus is a monitor corpus of Slovenian. It contains news articles from 106 media websites, published by 57 publishers. Trendi 2025-05 covers the period from January 2019 to May 2025, complementing the Gigafida ...
 This item contains no files.
lexicalConceptualResource
Description:
This dataset contains lists of delexicalized dependency trees and subtrees extracted from the English UD GUM corpus, version 2.15 (http://hdl.handle.net/11234/1-5787), using the STARK tool (https://github.com/clarinsi/STARK). ...
 This item contains 6 files (42.39 MB).
 
Publicly Available; Distributed under Creative Commons; Attribution Required
lexicalConceptualResource
Description:
This dataset contains lists of delexicalized dependency trees and subtrees extracted from the Slovenian UD corpora SSJ (written) and SST (spoken), version 2.15 (http://hdl.handle.net/11234/1-5787), using the STARK tool ...
 This item contains 6 files (74.12 MB).
 
Publicly Available; Distributed under Creative Commons; Attribution Required

Most Viewed Items

Top Last Week
lexicalConceptualResource
Description:
A list of headwords from the collection "Besede slovenskega jezika" (Words of Slovenian Language).
 This item contains 1 file (997.48 KB).
 
Publicly Available; Distributed under Creative Commons; Attribution Required; Noncommercial
corpus
Description:
This dataset is an archive of reader comments on the Ekspress Meedia news site from 2009 to 2019, containing approximately 31M comments, mostly in Estonian, with some in Russian. Description of the Datasets. There ...
 This item contains 12 files (9.95 GB).
 
Publicly Available; Distributed under Creative Commons; Attribution Required; Noncommercial; No Derivative Works
corpus
Description:
The novel "1984" by George Orwell is the central component of the MULTEXT-East corpus. This parallel, sentence-aligned corpus contains the novel in the English original (about 100,000 words in length), and its translations ...
 This item contains 1 file (14.12 MB).
 
Academic Use; Attribution Required; Noncommercial