dc.contributor.author | Babič, Saša |
dc.contributor.author | Erjavec, Tomaž |
dc.contributor.author | Farič, Ana |
dc.contributor.author | Peče, Miha |
dc.date.accessioned | 2025-04-25T19:43:00Z |
dc.date.available | 2025-04-25T19:43:00Z |
dc.date.issued | 2025-04-25 |
dc.identifier.uri | http://hdl.handle.net/11356/2023 |
dc.description | The Uganke corpus contains 2,790 Slovenian riddles from the folklore collection of the Institute of Slovenian Ethnology. The riddles come from 171 sources: fieldwork, newspapers, journals, manuscripts and printed riddle collections from the 19th and 20th centuries. The material is categorised into eight types according to the content, semantics, length and presumed context of the riddle: true riddle, narrative true riddle, joking question, wisdom question, joking wisdom question, logical riddle, neck riddle and sexual riddle. Each riddle is split into a question part and an answer part, and each part is given in two transcriptions: a diplomatic transcription, which mirrors the riddle as it appears in the source document, and a critical transcription, which brings it closer to contemporary standard Slovenian orthography. The critical transcriptions have been automatically annotated with lemmas, MULTEXT-East morphosyntactic descriptions (https://nl.ijs.si/ME/V6/msd/html/msd-sl.html) and Universal Dependencies (https://universaldependencies.org/) using the CLASSLA toolchain (https://github.com/clarinsi/classla). The canonical encoding of the corpus is TEI, but the corpus is also distributed in two derived encodings. One consists of the riddles and the bibliography as two TSV files; the other is a vertical file with the linguistically annotated riddles, as used by CQP-type concordancers such as Sketch Engine. |
dc.language.iso | slv |
dc.publisher | ZRC SAZU |
dc.rights | Creative Commons - Attribution 4.0 International (CC BY 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ |
dc.rights.label | PUB |
dc.subject | riddles |
dc.subject | TEI |
dc.subject | folklore |
dc.subject | humour |
dc.title | Collection of Slovenian riddles Uganke 1.0 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | CLARIN.SI data & tools |
contact.person | Saša Babič sasa.babic@zrc-sazu.si ZRC SAZU |
contact.person | Miha Peče miha.pece@zrc-sazu.si ZRC SAZU |
sponsor | ARIS P6-0088 Ethnological, Anthropological and Folklore Studies Research on Everyday Life nationalFunds |
sponsor | DARIAH-SI - The Slovenian Digital Research Infrastructure for the Arts and Humanities Other |
sponsor | Jožef Stefan Institute CLARIN CLARIN.SI nationalFunds |
size.info | 2790 items |
files.count | 4 |
files.size | 3921346 |
featuredService.noske | search|https://www.clarin.si/ske/#dashboard?corpname=uganke |
Files in this item

- Name: Uganke.TEI.zip
- Size: 1.83 MB
- Format: application/zip
- Description: Corpus in canonical TEI encoding
- MD5: 50fda6c67aebfcaab5b6733f1d3cec38
- Uganke.TEI
- uganke-text.xml (2 MB)
- uganke-text.ana.xml (14 MB)
- uganke.ana.xml (17 kB)
- Schema
- tei_odds.rng (665 kB)
- tei_clarin.rnc (331 kB)
- tei_clarin.dtd (240 kB)
- tei_clarin.rng (707 kB)
- tei_clarin.xsd (756 kB)
- dcr.tmp (1 kB)
- xml.tmp (2 kB)
- uganke.xml (10 kB)
- uganke-viri.xml (25 kB)
- 00README.txt (1 kB)

- Name: Uganke.tsv.zip
- Size: 180.04 KB
- Format: application/zip
- Description: Corpus text and list of bibliographical sources as TSV files
- MD5: 1d9e98603480ab237d32dca6982f15f3
- Uganke.tsv
- uganke.tsv (648 kB)
- uganke-viri.tsv (19 kB)
- 00README.txt (694 B)

- Name: Uganke.CoNLL-U.zip
- Size: 745.27 KB
- Format: application/zip
- Description: Linguistically annotated corpus in CoNLL-U format
- MD5: 6c330dc95fb9f744815efafb7e66bb7e
- Uganke.CoNLL-U
- uganke.tsv (648 kB)
- uganke-viri.tsv (19 kB)
- uganke.conllu (4 MB)
- 00README.txt (620 B)
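
For readers who want to work with the CoNLL-U distribution without installing a dedicated library, the following is a minimal sketch of a reader based only on the public CoNLL-U layout (ten tab-separated columns per token, `#` comment lines, blank lines between sentences). The function name `read_conllu` and the dictionary layout are illustrative assumptions, not part of the corpus or of CLASSLA. Which metadata comments the file carries, and whether the XPOS column holds the MULTEXT-East descriptions, has not been checked here and should be confirmed against the archive's 00README.txt.

```python
# The ten column names defined by the CoNLL-U format, in order.
CONLLU_FIELDS = (
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
)


def read_conllu(lines):
    """Group CoNLL-U lines into sentences.

    `lines` is any iterable of strings, for example an open text file.
    Returns a list of dicts with the keys "comments" and "tokens".
    This helper is a hypothetical illustration, not an official API.
    """
    sentences = []
    comments, tokens = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if tokens:
                sentences.append({"comments": comments, "tokens": tokens})
            comments, tokens = [], []
        elif line.startswith("#"):
            # Sentence-level metadata, kept as plain text.
            comments.append(line[1:].strip())
        else:
            columns = line.split("\t")
            # Skip lines that do not have the ten expected columns.
            if len(columns) == len(CONLLU_FIELDS):
                tokens.append(dict(zip(CONLLU_FIELDS, columns)))
    if tokens:
        # The last sentence may not be followed by a blank line.
        sentences.append({"comments": comments, "tokens": tokens})
    return sentences
```

Assuming the archive has been unpacked so that `uganke.conllu` is in the working directory, the reader could be called as `read_conllu(open("uganke.conllu", encoding="utf-8"))`; the lemmas of the first sentence would then be available as `[t["lemma"] for t in sentences[0]["tokens"]]`.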

- Name: Uganke.vert.zip
- Size: 1.01 MB
- Format: application/zip
- Description: Corpus in vertical format
- MD5: b626a24dc44d7779d8d0543fd811f6c6
- Uganke.vert
- uganke.regi (3 kB)
- 00README.txt (1 kB)
- uganke.vert (11 MB)
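
The MD5 values listed for the four archives can be used to check that a download arrived intact. The following is a minimal sketch using only the Python standard library; the checksums are copied from the listing above, while the helper names `md5_of_file` and `verify_downloads` and the assumption that the archives sit in a single local directory under their original names are illustrative.

```python
import hashlib
from pathlib import Path

# MD5 values copied from the file listing in this record.
EXPECTED_MD5 = {
    "Uganke.TEI.zip": "50fda6c67aebfcaab5b6733f1d3cec38",
    "Uganke.tsv.zip": "1d9e98603480ab237d32dca6982f15f3",
    "Uganke.CoNLL-U.zip": "6c330dc95fb9f744815efafb7e66bb7e",
    "Uganke.vert.zip": "b626a24dc44d7779d8d0543fd811f6c6",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Compare each archive found in `directory` with its listed MD5.

    Archives that are not present are skipped. Returns a dict that maps
    each archive name found to True (match) or False (mismatch).
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = md5_of_file(path) == expected
    return results
```

Calling `verify_downloads("downloads")`, where `downloads` is a hypothetical folder name, returns one entry per archive found there. A `False` value suggests a truncated or corrupted download that should be fetched again.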