dc.contributor.author |
Babič, Saša |
dc.contributor.author |
Peče, Miha |
dc.contributor.author |
Erjavec, Tomaž |
dc.contributor.author |
Ivančič Kutin, Barbara |
dc.contributor.author |
Šrimpf Vendramin, Katarina |
dc.contributor.author |
Kropej Telban, Monika |
dc.contributor.author |
Jakop, Nataša |
dc.contributor.author |
Stanonik, Marija |
dc.date.accessioned |
2022-01-11T15:58:06Z |
dc.date.available |
2022-01-11T15:58:06Z |
dc.date.issued |
2022-01-11 |
dc.identifier.uri |
http://hdl.handle.net/11356/1455 |
dc.description |
This corpus collects and annotates the extensive and highly valuable diachronic collection of Slovenian proverbs compiled over more than 50 years at the ZRC SAZU Institute of Slovenian Ethnology. It comprises 2,515 structured bibliographical items (1578-2010) that served as sources of the proverbs (printed books, journals, calendars, collecting campaigns in various journals, folklore fieldwork, personal notes, etc.) and the collection of paremiological units itself. Each unit is represented in two ways: as a diplomatic transcription from the source collection (owing to technical difficulties faced by the transcribers and to human error, the transcription of older texts is inconsistent) and as a critical transcription, which normalizes the alphabet.
The words of the critical transcriptions have also been automatically modernised to contemporary spelling, and the modernised words have been further annotated with lemmas, MULTEXT-East MSDs, and Universal Dependencies using the CLASSLA toolchain.
The canonical encoding of the corpus is TEI, but the corpus is also distributed in two derived encodings: the bibliography and sayings as two TSV files, and a vertical file of the kind used by CQP-type concordancers such as Sketch Engine. |
dc.language.iso |
slv |
dc.publisher |
ZRC SAZU |
dc.publisher |
Jožef Stefan Institute |
dc.relation.isreplacedby |
http://hdl.handle.net/11356/1853 |
dc.rights |
CLARIN.SI Licence ACA ID-BY-NC-INF-NORED 1.0 |
dc.rights.uri |
https://clarin.si/repository/xmlui/page/licence-aca-id-by-nc-inf-nored-1.0 |
dc.rights.label |
ACA |
dc.source.uri |
https://isn2.zrc-sazu.si/en/programi-in-projekti/traditional-paremiological-units-in-dialogue-with-contemporary-use |
dc.subject |
paremiology |
dc.subject |
folk sayings |
dc.subject |
TEI |
dc.subject |
proverbs |
dc.title |
Collection of Slovenian paremiological units Pregovori 1.0 |
dc.type |
corpus |
metashare.ResourceInfo#ContentInfo.mediaType |
text |
has.files |
yes |
branding |
CLARIN.SI data & tools |
contact.person |
Saša Babič sasa.babic@zrc-sazu.si ZRC SAZU |
contact.person |
Tomaž Erjavec tomaz.erjavec@ijs.si Jožef Stefan Institute |
sponsor |
ARRS (Slovenian Research Agency) J6-2579 Traditional Paremiological Units in Dialogue with Contemporary Use nationalFunds |
size.info |
36207 idiomaticExpressions |
size.info |
2515 entries |
files.count |
3 |
files.size |
22705201 |