dc.contributor.author Babič, Saša
dc.contributor.author Peče, Miha
dc.contributor.author Erjavec, Tomaž
dc.contributor.author Ivančič Kutin, Barbara
dc.contributor.author Šrimpf Vendramin, Katarina
dc.contributor.author Kropej Telban, Monika
dc.contributor.author Jakop, Nataša
dc.contributor.author Stanonik, Marija
dc.date.accessioned 2022-01-11T15:58:06Z
dc.date.available 2022-01-11T15:58:06Z
dc.date.issued 2022-01-11
dc.identifier.uri http://hdl.handle.net/11356/1455
dc.description This corpus collects and annotates an extensive and highly valuable diachronic collection of Slovenian proverbs, in the making for 50 years and more at the ZRC SAZU Institute of Slovenian Ethnology. It consists of two parts: a structured bibliography of 2,515 items (1578-2010), covering printed books, journals, calendars, collecting campaigns in various journals, folklore fieldwork, personal notes, etc., which served as the sources of the proverbs, and the collection of paremiological units itself. Each unit is represented in two ways: as a diplomatic transcription from the source collection (owing to technical difficulties faced by the transcribers and to human error, the transcription of older texts is inconsistent) and as a critical transcription, which normalises the alphabet. The words of the critical transcriptions have also been automatically modernised to contemporary spelling and further annotated with lemmas, MULTEXT-East MSDs and Universal Dependencies using the CLASSLA toolchain. The canonical encoding of the corpus is TEI, but the corpus is also distributed in two derived encodings: the bibliography and sayings as two TSV files, and a vertical file, as used by CQP-type concordancers such as Sketch Engine.
dc.language.iso slv
dc.publisher ZRC SAZU
dc.publisher Jožef Stefan Institute
dc.relation.isreplacedby http://hdl.handle.net/11356/1853
dc.rights CLARIN.SI Licence ACA ID-BY-NC-INF-NORED 1.0
dc.rights.uri https://clarin.si/repository/xmlui/page/licence-aca-id-by-nc-inf-nored-1.0
dc.rights.label ACA
dc.source.uri https://isn2.zrc-sazu.si/en/programi-in-projekti/traditional-paremiological-units-in-dialogue-with-contemporary-use
dc.subject paremiology
dc.subject folk sayings
dc.subject TEI
dc.subject proverbs
dc.title Collection of Slovenian paremiological units Pregovori 1.0
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
contact.person Saša Babič sasa.babic@zrc-sazu.si ZRC SAZU
contact.person Tomaž Erjavec tomaz.erjavec@ijs.si Jožef Stefan Institute
sponsor ARRS (Slovenian Research Agency) J6-2579 Traditional Paremiological Units in Dialogue with Contemporary Use nationalFunds
size.info 36207 idiomaticExpressions
size.info 2515 entries
files.count 3
files.size 22705201
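
The description notes that one derived encoding is a vertical file for CQP-type concordancers. As a rough illustration of how such a file is usually read, the sketch below assumes the common vertical layout: one token per line with tab-separated attributes, and structural tags on their own lines. The sentence tag name `s`, the three-column layout and the sample text are placeholders invented for this example; they are not taken from Pregovori.vert.zip, whose actual columns should be checked in the corpus documentation.

```python
def parse_vertical(lines, structure="s"):
    """Yield one list of token rows for each <structure> ... </structure> block.

    Assumes the usual vertical layout: token lines hold tab-separated
    attributes, and structural tags sit alone on a line without tabs.
    The tag name "s" is an assumption, not a documented property of this corpus.
    """
    open_tags = ("<%s>" % structure, "<%s " % structure)
    close_tag = "</%s>" % structure
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        is_tag = line.startswith("<") and line.endswith(">") and "\t" not in line
        if is_tag:
            if line.startswith(open_tags):
                current = []          # start collecting tokens
            elif line == close_tag:
                if current is not None:
                    yield current     # emit the finished block
                current = None
            continue                  # other tags (e.g. <text>) are skipped
        if current is not None:
            current.append(line.split("\t"))


# Invented sample with placeholder values, for illustration only.
SAMPLE = (
    '<text id="hypothetical.1">\n'
    "<s>\n"
    "word1\tlemma1\ttag1\n"
    "word2\tlemma2\ttag2\n"
    "</s>\n"
    "</text>\n"
)

if __name__ == "__main__":
    for sentence in parse_vertical(SAMPLE.splitlines()):
        print(sentence)
```

For a real file, the same function can be given an open file handle instead of the sample lines, since it only iterates over lines.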


 Files in this item

Total size of all files: 21.65 MB
This item is for Academic Use and is licensed under CLARIN.SI Licence ACA ID-BY-NC-INF-NORED 1.0 (Inform Before Use, Attribution Required, Noncommercial).

Name Pregovori.TEI.zip
Size 12.73 MB
Format application/zip
Description Corpus in source TEI format
MD5 326aa38b784ae9cff84b30106c1a424c

Name Pregovori.tsv.zip
Size 1.34 MB
Format application/zip
Description Corpus in derived tabular format
MD5 2e01b983a3b1c573685ddec978b50a2b

Name Pregovori.vert.zip
Size 7.58 MB
Format application/zip
Description Corpus in derived vertical format
MD5 67a437af9a6184a8eba0169cd79a572e
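
The MD5 values listed for the three archives can be used to check that a download arrived intact. The following is a minimal sketch using only the Python standard library. The file names and checksums are copied from this record; the helper names `md5_of_file` and `verify_downloads`, and the assumption that the archives sit in one local directory, are illustrative choices rather than part of any repository tooling.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums as listed in this record.
EXPECTED_MD5 = {
    "Pregovori.TEI.zip": "326aa38b784ae9cff84b30106c1a424c",
    "Pregovori.tsv.zip": "2e01b983a3b1c573685ddec978b50a2b",
    "Pregovori.vert.zip": "67a437af9a6184a8eba0169cd79a572e",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file to True (match), False (mismatch) or None (absent)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.exists():
            results[name] = (md5_of_file(path) == expected)
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, outcome in verify_downloads().items():
        print(f"{name}: {labels[outcome]}")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a guarantee against deliberate tampering.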
