dc.contributor.author Dobrovoljc, Kaja
dc.contributor.author Roblek, Rebeka
dc.contributor.author Vianello, Chiara
dc.contributor.author Diaci, Ajda
dc.contributor.author Vuga, Zala
dc.date.accessioned 2020-01-06T08:41:50Z
dc.date.available 2020-01-06T08:41:50Z
dc.date.issued 2020-01-06
dc.identifier.uri http://hdl.handle.net/11356/1280
dc.description This document contains 1,891 formulaic sequences in standard written Slovenian, i.e. frequently recurring strings of two to five words, manually annotated for syntactic structure, pragmatic function, and dictionary relevance. The sequences, which meet a minimum frequency threshold of 20 per million, are drawn from the Frequency lists of word-level n-grams from lowercase word forms in Gigafida 2.0 (http://hdl.handle.net/11356/1274). The list is the union of the top 1,000 formulaic sequences ranked by frequency and by five association measures (Dice, t-test, MI, MI3, simple-LL). A related entry, "List of formulaic sequences in spoken Slovenian", is available at http://hdl.handle.net/11356/1279.
dc.language.iso slv
dc.publisher Jožef Stefan Institute
dc.publisher Centre for Language Resources and Technologies, University of Ljubljana
dc.relation.isreferencedby http://slovnica.ijs.si/wp-content/uploads/2019/12/NSSS_DS5-nizi_navodila_v6.pdf
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.source.uri http://slovnica.ijs.si/
dc.subject n-grams
dc.subject manual annotation
dc.subject multiword expressions
dc.subject formulaic language
dc.title List of formulaic sequences in standard written Slovenian
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType wordList
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
contact.person Kaja Dobrovoljc kaja.dobrovoljc@cjvt.si Centre for Language Resources and Technologies, University of Ljubljana
sponsor ARRS (Slovenian Research Agency) J6-8256 New grammar of contemporary standard Slovene: sources and methods nationalFunds
size.info 1891 expressions
files.count 1
files.size 303971
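
The selection step named in dc.description (the union of the top-ranked sequences under several rankings) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `top_k` and `union_of_top_k`, the toy phrases, and all scores are invented for illustration, and the frequency threshold and manual annotation are not modelled. It also shows why the union of several top-1,000 lists can hold more than 1,000 items.

```python
from typing import Dict, List, Set


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring sequences (ties broken alphabetically)."""
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [sequence for sequence, _ in ordered[:k]]


def union_of_top_k(rankings: Dict[str, Dict[str, float]], k: int) -> Set[str]:
    """Collect every sequence that reaches the top k under at least one ranking."""
    selected: Set[str] = set()
    for scores in rankings.values():
        selected.update(top_k(scores, k))
    return selected


# Hypothetical toy data: two rankings instead of six, invented scores.
toy_rankings = {
    "frequency": {"na primer": 310.0, "in tako naprej": 120.0, "po drugi strani": 95.0},
    "dice": {"na primer": 0.20, "in tako naprej": 0.61, "po drugi strani": 0.74},
}

candidates = union_of_top_k(toy_rankings, k=2)
print(sorted(candidates))
```

With k=2 each toy ranking contributes two sequences, but the union holds three, because the two rankings disagree about which sequences rank highest.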


Files in this item

Name formulaic-sequences_Gigafida_top1000.tsv
Size 296.85 KB
Format Unknown
Description List of manually annotated formulaic sequences from Gigafida 2.0.
MD5 59cab5fa3e20e6e265c74ddae3204151
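
A downloaded copy of the file can be checked against the MD5 checksum and the byte count (files.size) listed in this record. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_record` are illustrative, and the local path is an assumption about where the file was saved.

```python
import hashlib
from pathlib import Path

# Values copied from this record.
EXPECTED_MD5 = "59cab5fa3e20e6e265c74ddae3204151"
EXPECTED_SIZE = 303971  # bytes, from files.size


def md5_of_file(path, chunk_size: int = 65536) -> str:
    """Compute the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_md5: str = EXPECTED_MD5,
                   expected_size: int = EXPECTED_SIZE) -> bool:
    """Return True if both the file size and the MD5 digest match."""
    file_path = Path(path)
    if file_path.stat().st_size != expected_size:
        return False
    return md5_of_file(file_path) == expected_md5


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the file was downloaded.
    local_copy = Path("formulaic-sequences_Gigafida_top1000.tsv")
    if local_copy.exists():
        print("checksum OK" if matches_record(local_copy) else "checksum mismatch")
```

Checking the size first is a cheap way to reject a truncated download before hashing. MD5 is suitable here only for detecting accidental corruption, not for guarding against deliberate tampering.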
