dc.contributor.author | Čibej, Jaka |
dc.contributor.author | Arhar Holdt, Špela |
dc.contributor.author | Krek, Simon |
dc.date.accessioned | 2024-11-07T12:40:52Z |
dc.date.available | 2024-11-07T12:40:52Z |
dc.date.issued | 2024-11-07 |
dc.identifier.uri | http://hdl.handle.net/11356/1986 |
dc.description | This entry consists of a TSV file containing a list of 66,347 Slovene word pairs from the Sloleks Morphological Lexicon of Slovene (v2.0; http://hdl.handle.net/11356/1230) that have been automatically identified as morphologically related according to a number of manually designed morphological relation rules (e.g. "dež" -> "deževen", "pisati" -> "pisatelj", "prijatelj" -> "prijateljica").
Each line in the list contains the following columns:
- original lemma (e.g. "pisati"),
- related lemma (e.g. "pisatelj"),
- original lemma, automatically deconstructed into individual word parts (e.g. "pis_ati"),
- related lemma, automatically deconstructed into individual word parts (e.g. "pis_at_elj"),
- MTE-6 lexical features of the original lemma (e.g. "G"),*
- MTE-6 lexical features of the related lemma (e.g. "Som"),*
- ID of the original lemma from Sloleks 2.0,
- ID of the related lemma from Sloleks 2.0,
- the overlapping or central part (common to both the original and the related lemma; e.g. "pis"),
- the ID of the morphological relation rule used to identify the relation (e.g. "G.Som.5.2.1"),
- the morphological relation rule (e.g. "[G]_ati -> [G]_at_elj").
* MTE-6 refers to the MULTEXT-East Version 6 morphosyntactic specifications for Slovenian, available at http://nl.ijs.si/ME/V6/
Each rule constitutes a pattern to form a morphological relation. For instance, "[G]_ati -> [G]_at_elj" indicates that a verb (G) ending with the word part "ati" is related to the lemma formed by replacing "_ati" with "_at_elj".
Note that the list contains no proper nouns. It also contains no relations for 38 morphological rules that are included in the hierarchy of rules (listed in the accompanying file nssss_sloleks_word_relation_rules.tsv) but depend on additional rules that have not yet been implemented in the current version of the extraction process (such as irregular conversions in overlapping word parts: "gri_sti" - "griz_enj_e", "sneg" - "snež_ak").
Version 1.1 also contains manual evaluation scores for approximately 5,000 pairs which were sampled in a stratified manner (by rules). The pairs were reviewed by a linguist and assigned one of three scores (0 - inadequate; 1 - acceptable; 2 - adequate). |
dc.language.iso | slv |
dc.publisher | Centre for Language Resources and Technologies, University of Ljubljana |
dc.publisher | Jožef Stefan Institute |
dc.relation.isreferencedby | https://ebooks.uni-lj.si/ZalozbaUL/catalog/view/325/477/7316 |
dc.relation.replaces | http://hdl.handle.net/11356/1386 |
dc.rights | Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by-sa/4.0/ |
dc.rights.label | PUB |
dc.source.uri | http://slovnica.ijs.si/ |
dc.subject | word parts |
dc.subject | word relations |
dc.subject | morphology |
dc.subject | morphological rules |
dc.subject | derivational morphology |
dc.title | List of word relations from the Sloleks 2.0 lexicon 1.1 |
dc.type | lexicalConceptualResource |
metashare.ResourceInfo#ContentInfo.detailedType | wordList |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | CLARIN.SI data & tools |
contact.person | Jaka Čibej jaka.cibej@cjvt.si Centre for Language Resources and Technologies, University of Ljubljana |
sponsor | ARRS (Slovenian Research Agency) J6-8256 New grammar of contemporary standard Slovene: sources and methods nationalFunds |
sponsor | ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds |
size.info | 66347 entries |
files.count | 1 |
files.size | 2974867 |
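
The rule notation explained in dc.description ("[G]_ati -> [G]_at_elj") can be read as a suffix replacement on the segmented lemma. The following Python sketch illustrates that reading only; it is not the extraction process used to build the resource. The function name `apply_rule` and the regular expression are hypothetical, and the sketch assumes every rule has the shape "[X]_old -> [X]_new", which may not hold for all rules in the accompanying rules file.

```python
import re

# Assumed rule shape: "[X]_old -> [X]_new", as in the example
# "[G]_ati -> [G]_at_elj" from the description. Rules with any other
# shape are rejected rather than guessed at.
RULE_RE = re.compile(r"^\[(\w+)\](_\S*) -> \[(\w+)\](_\S*)$")


def apply_rule(segmented_lemma, rule):
    """Return the related segmented lemma, or None if the rule does not apply."""
    match = RULE_RE.match(rule)
    if match is None:
        raise ValueError(f"unsupported rule shape: {rule!r}")
    _, old_suffix, _, new_suffix = match.groups()
    if not segmented_lemma.endswith(old_suffix):
        return None
    stem = segmented_lemma[: -len(old_suffix)]
    return stem + new_suffix


related = apply_rule("pis_ati", "[G]_ati -> [G]_at_elj")
print(related)                   # pis_at_elj
print(related.replace("_", ""))  # pisatelj
```

Removing the underscores from the result gives the plain lemma, which matches the "pisati" -> "pisatelj" example in the description.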
Files in this item
This item is Publicly Available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
- Name: nssss_sloleks_word_relations_1.1.zip
- Size: 2.84 MB
- Format: application/zip
- Description: List of word relations and word relation rules from the Sloleks 2.0 lexicon 1.1 (TSV)
- MD5: cc52b4822d2ed40c8bf7af4cf88fc624
- nssss_sloleks_word_relations_1.1
  - nssss_sloleks_word_relation_rules_1.1.tsv (19 kB)
  - nssss_sloleks_word_relations_1.1.tsv (10 MB)
  - 00README.txt (6 kB)
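
As a rough illustration of the column layout given in the description, the relations file could be read along the following lines. This is a sketch under several assumptions: the field names in `COLUMNS` are hypothetical labels for the eleven described columns, the presence of a header row is not confirmed here, and version 1.1 may carry additional fields (such as the evaluation score) after these eleven. The sample row is assembled from the examples in the description, with placeholder values for the two Sloleks IDs.

```python
import csv
import io

# Hypothetical field names, following the column order in the description.
COLUMNS = [
    "original_lemma", "related_lemma",
    "original_parts", "related_parts",
    "original_msd", "related_msd",
    "original_id", "related_id",
    "common_part", "rule_id", "rule",
]


def read_relations(handle):
    """Yield one dict per TSV row, naming the first eleven fields."""
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < len(COLUMNS):
            continue  # skip blank or short lines
        # zip() ignores any extra trailing fields beyond the eleven named ones.
        yield dict(zip(COLUMNS, row))


# Sample row built from the description's examples; both IDs are placeholders.
sample = "\t".join([
    "pisati", "pisatelj", "pis_ati", "pis_at_elj", "G", "Som",
    "ID_ORIGINAL", "ID_RELATED", "pis", "G.Som.5.2.1",
    "[G]_ati -> [G]_at_elj",
]) + "\n"

for relation in read_relations(io.StringIO(sample)):
    print(relation["original_lemma"], "->", relation["related_lemma"],
          "via", relation["rule_id"])
```

To read the extracted file instead of the sample, pass a handle such as `open("nssss_sloleks_word_relations_1.1.tsv", encoding="utf-8", newline="")`; the UTF-8 encoding is an assumption, and 00README.txt should be checked for the actual header and column details.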