| dc.contributor.author | Krek, Simon |
| dc.contributor.author | Laskowski, Cyprian |
| dc.contributor.author | Robnik-Šikonja, Marko |
| dc.contributor.author | Kosem, Iztok |
| dc.contributor.author | Arhar Holdt, Špela |
| dc.contributor.author | Gantar, Polona |
| dc.contributor.author | Čibej, Jaka |
| dc.contributor.author | Gorjanc, Vojko |
| dc.contributor.author | Klemenc, Bojan |
| dc.contributor.author | Dobrovoljc, Kaja |
| dc.contributor.author | Pori, Eva |
| dc.contributor.author | Roblek, Rebeka |
| dc.contributor.author | Zgaga, Karolina |
| dc.contributor.author | Kamenšek, Urška |
| dc.contributor.author | Ponikvar, Primož |
| dc.contributor.author | Šešet, Jure |
| dc.contributor.author | Zaranšek, Petra |
| dc.date.accessioned | 2026-02-19T10:03:42Z |
| dc.date.available | 2026-02-19T10:03:42Z |
| dc.date.issued | 2026-02-17 |
| dc.identifier.uri | http://hdl.handle.net/11356/2092 |
| dc.description | Thesaurus of Modern Slovene is the largest automatically generated open-access collection of Slovene synonyms. The current version, 2.2, contains 102,068 keywords and 362,464 synonyms. Nearly 6,000 entries also contain antonyms. The Thesaurus includes two types of dictionary entries. Most are prepared entirely by automatic processes; however, a growing number have been manually divided into senses, with synonyms categorized under the relevant senses. Version 2.2 contains 7,465 sense-divided headwords, and 141 headwords contain sense-divided antonyms. The original data for the Thesaurus was sourced from two principal language resources: the Oxford®-DZS Comprehensive English-Slovenian Dictionary and the Gigafida 1.0 corpus of written Slovene. The links identified between synonyms were additionally confirmed using the Dictionary of Standard Slovenian Language (SSKJ). Data extraction and structure were based on the frequency and manner in which words co-occur in translation strings of the Oxford-DZS Dictionary. This information is the basis for discriminating between ‘core’ and ‘near’ synonyms, with ‘core’ synonyms exhibiting a stronger connection to the keyword. In the following step, an approach combining balanced co-occurrence graphs and the Personalized PageRank algorithm automatically divides the synonyms into subgroups and ranks them according to their degree of semantic relatedness to the keyword, as well as their frequency in language use. For the creation methodology, see Krek et al. (2017) in the provided references. |
| dc.language.iso | slv |
| dc.publisher | Centre for Language Resources and Technologies, University of Ljubljana |
| dc.relation.isreferencedby | https://elex.link/elex2023/wp-content/uploads/82.pdf |
| dc.relation.isreferencedby | https://elex.link/elex2017/wp-content/uploads/2017/09/paper05.pdf |
| dc.relation.replaces | http://hdl.handle.net/11356/1916 |
| dc.rights | Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) |
| dc.rights.uri | https://creativecommons.org/licenses/by-sa/4.0/ |
| dc.rights.label | PUB |
| dc.source.uri | https://viri.cjvt.si/sopomenke/eng/about |
| dc.subject | thesaurus |
| dc.subject | synonyms |
| dc.subject | antonyms |
| dc.title | Thesaurus of Modern Slovene 2.2 |
| dc.type | lexicalConceptualResource |
| metashare.ResourceInfo#ContentInfo.detailedType | thesaurus |
| metashare.ResourceInfo#ContentInfo.mediaType | text |
| has.files | yes |
| branding | CLARIN.SI data & tools |
| demo.uri | http://viri.cjvt.si/sopomenke/eng |
| contact.person | Simon Krek simon.krek@guest.arnes.si Centre for Language Resources and Technologies, University of Ljubljana |
| sponsor | University of Ljubljana P6-0215 Slovene Language - Basic, Contrastive, and Applied Studies nationalFunds |
| sponsor | University of Ljubljana I0-0022 Network of Research Infrastructure Centres (MRIC) nationalFunds |
| sponsor | ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds |
| sponsor | Ministry of Culture of the Republic of Slovenia JR-infrastruktura-SJ-2024-2025 Data completion and gamification of dictionary resources at CJVT UL (PODVIG) nationalFunds |
| size.info | 102068 entries |
| files.count | 1 |
| files.size | 10333947 |
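
The PageRank-based ranking step mentioned in the description can be illustrated with a minimal sketch. This is not the authors' pipeline: it is a generic power-iteration version of personalized PageRank on a tiny weighted graph, and the node labels (`keyword`, `syn_a`, `syn_b`, `syn_c`), the edge weights, the damping factor, and the function name are all hypothetical placeholders. It does not reproduce the balanced co-occurrence graphs or the frequency weighting described in Krek et al. (2017).

```python
def personalized_pagerank(graph, seed, damping=0.85, iterations=50):
    """Rank nodes by relatedness to `seed` using power iteration.

    `graph` maps each node to a dict {neighbour: weight}; every neighbour
    must also appear as a key. All teleport mass returns to `seed`, which
    is what makes the ranking "personalized".
    """
    nodes = list(graph)
    rank = {node: (1.0 if node == seed else 0.0) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: 0.0 for node in nodes}
        for node in nodes:
            neighbours = graph[node]
            total = sum(neighbours.values())
            if total == 0:
                # Dangling node: return its mass to the seed.
                new_rank[seed] += damping * rank[node]
                continue
            for neighbour, weight in neighbours.items():
                new_rank[neighbour] += damping * rank[node] * weight / total
        # Teleport step: the remaining mass always restarts at the seed.
        new_rank[seed] += 1.0 - damping
        rank = new_rank
    return rank


# Hypothetical toy graph: weights stand in for co-occurrence strength.
toy_graph = {
    "keyword": {"syn_a": 3, "syn_b": 1},
    "syn_a": {"keyword": 3, "syn_b": 1},
    "syn_b": {"keyword": 1, "syn_a": 1, "syn_c": 1},
    "syn_c": {"syn_b": 1},
}

scores = personalized_pagerank(toy_graph, seed="keyword")
# Order the candidate synonyms from most to least related to the keyword.
ranked = sorted((n for n in scores if n != "keyword"),
                key=scores.get, reverse=True)
print(ranked)
```

In this toy graph the candidate most strongly connected to the seed ranks first, and a candidate reachable only through another synonym ranks last, which loosely mirrors the distinction between closer and more distant synonyms.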
Files in this item
This item is publicly available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
- Name: CJVT-Thesaurus-of-Modern-Slovene-v2.2.zip
- Size: 9.86 MB
- Format: application/zip
- Description: dictionary + schema
- MD5: 6c0f8b721f2e8ee344a657a0fc13c20c
- Archive contents (CJVT-Thesaurus-of-Modern-Slovene-v2.2):
  - monolingual_dictionaries.xsd
  - inventory.xsd
  - CJVT-Thesaurus-of-Modern-Slovene-v2.2.xml
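
The MD5 value listed for the archive can be used to check a download for corruption. The following is a minimal sketch, assuming the ZIP file has been saved under its original name in the current working directory; the helper name `md5_of_file` and the chunk size are illustrative choices, not part of the repository.

```python
import hashlib
from pathlib import Path

# Values copied from the file listing in this record.
EXPECTED_MD5 = "6c0f8b721f2e8ee344a657a0fc13c20c"
ARCHIVE = Path("CJVT-Thesaurus-of-Modern-Slovene-v2.2.zip")  # assumed local path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    if ARCHIVE.exists():
        actual = md5_of_file(ARCHIVE)
        status = "OK" if actual == EXPECTED_MD5 else "MISMATCH"
        print(f"{ARCHIVE.name}: {status} ({actual})")
    else:
        print(f"{ARCHIVE.name} not found; download it first.")
```

MD5 is adequate here for detecting accidental corruption during transfer, but it is not a safeguard against deliberate tampering.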