Krek, Simon; Gantar, Polona; Kosem, Iztok; Dobrovoljc, Kaja; Arhar Holdt, Špela; Čibej, Jaka; Laskowski, Cyprian; Klemenc, Bojan; Krsnik, Luka
2021-03-16T08:36:14Z 2021-03-16T08:36:14Z 2021-03-09
dc.description Frequency lists of collocations were extracted from the Gigafida 2.1 Corpus of Written Standard Slovene using specialised scripts for extracting data from syntactically parsed corpora. The lists contain collocations with an absolute frequency of 10 or above, split into files corresponding to 81 predefined syntactic structures. A formal description of the syntactic structures, with information on the restrictions and representations applied to POS and dependency parsing annotations, is included in the dataset. The lists are sorted by the absolute frequency of the collocations and include frequency information on individual lemmas, together with the most frequent representative forms of the combined lemmas. The lists also include the logDice score for each collocation and the number of distinct forms of lemmas appearing in corpus hits for a particular collocation.
dc.language.iso slv
dc.publisher Centre for Language Resources and Technologies, University of Ljubljana
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.label PUB
dc.subject collocations
dc.subject syntactic structures
dc.title Frequency lists of collocations from the Gigafida 2.1 corpus
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType lexicon
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
contact.person Simon Krek Jožef Stefan Institute
sponsor ARRS (Slovenian Research Agency) J6-8256 New grammar of contemporary standard Slovene: sources and methods nationalFunds
sponsor ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds
sponsor University of Ljubljana P6-0215 Slovene Language - Basic, Contrastive, and Applied Studies nationalFunds
sponsor ARRS (Slovenian Research Agency) J6-8255 Collocations as a basis for language description: semantic and temporal perspectives nationalFunds
82 files
4002918 collocations
files.count 1
files.size 146338935
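
The description above mentions a logDice score for each collocation. As a minimal sketch, assuming the standard logDice definition (14 plus the base-2 logarithm of the Dice coefficient), the score could be recomputed from the frequency columns as follows. The record does not state the exact formula used by the extraction scripts, so treat this as an illustration rather than the authors' implementation; the function and parameter names are hypothetical.

```python
import math


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """Standard logDice: 14 + log2(2 * f_xy / (f_x + f_y)).

    f_xy is the co-occurrence frequency of the collocation,
    f_x and f_y are the frequencies of the two lemmas.
    Assumption: this is the textbook definition, not a formula
    confirmed by the dataset record.
    """
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("frequencies must be positive")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))
```

Under this definition the theoretical maximum is 14, reached when the two lemmas occur only together.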

 Files in this item

This item is publicly available and licensed under Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).
Collocation lists from Gigafida 2.1 in CSV format (139.56 MB)
 File Preview  
  • GF2.1-Collocations_JOS-structures
    • structure_41.csv (720 kB)
    • structure_19.csv (173 kB)
    • structure_57.csv (16 MB)
    • structure_95.csv (90 kB)
    • structure_30.csv (5 MB)
    • structure_103.csv (22 kB)
    • JOS_structures_2021-03-09.xsd (4 kB)
    • structure_46.csv (8 MB)
    • structure_84.csv (154 kB)
    • structure_35.csv (292 kB)
    • structure_73.csv (754 kB)
    • structure_108.csv (1 MB)
    • structure_89.csv (11 MB)
    • structure_24.csv (334 kB)
    • structure_13.csv (11 MB)
    • structure_78.csv (324 kB)
    • structure_51.csv (15 MB)
    • structure_29.csv (2 MB)
    • structure_40.csv (914 kB)
    • structure_18.csv (1 MB)
    • structure_94.csv (1 kB)
    • structure_102.csv (243 kB)
    • structure_45.csv (315 kB)
    • structure_83.csv (445 kB)
    • structure_99.csv (104 kB)
    • structure_34.csv (107 MB)
    • structure_72.csv (3 MB)
    • structure_107.csv (700 kB)
    • structure_88.csv (6 MB)
    • structure_23.csv (40 MB)
    • structure_39.csv (32 kB)
    • structure_12.csv (2 MB)
    • structure_77.csv (4 MB)
    • structure_50.csv (10 MB)
    • structure_28.csv (1 MB)
    • structure_17.csv (1 MB)
    • structure_55.csv (2 MB)
    • structure_93.csv (1 MB)
    • structure_101.csv (69 kB)
    • structure_44.csv (452 kB)
    • structure_82.csv (1 MB)
    • structure_98.csv (565 kB)
    • structure_71.csv (16 MB)
    • structure_106.csv (29 MB)
    • structure_49.csv (1 MB)
    • structure_87.csv (429 kB)
    • structure_22.csv (5 MB)
    • structure_38.csv (670 kB)
    • structure_76.csv (1 MB)
    • structure_27.csv (2 MB)
    • structure_16.csv (14 MB)
    • structure_54.csv (1 MB)
    • structure_92.csv (422 kB)
    • structure_100.csv (408 kB)
    • structure_43.csv (25 MB)
    • structure_81.csv (10 MB)
    • structure_32.csv (224 kB)
    • structure_70.csv (55 MB)
    • structure_105.csv (31 kB)
    • structure_48.csv (7 MB)
    • structure_86.csv (2 MB)
    • structure_37.csv (3 kB)
    • structure_75.csv (205 kB)
    • structure_26.csv (1 MB)
    • structure_15.csv (40 MB)
    • structure_53.csv (74 MB)
    • structure_91.csv (117 kB)
    • structure_69.csv (2 MB)
    • structure_42.csv (554 kB)
    • structure_80.csv (54 kB)
    • structure_96.csv (631 kB)
    • structure_31.csv (84 kB)
    • structure_104.csv (128 kB)
    • structure_47.csv (3 MB)
    • structure_85.csv (2 MB)
    • structure_36.csv (793 kB)
    • structure_74.csv (3 MB)
    • structure_25.csv (1 MB)
    • structure_14.csv (20 MB)
    • structure_52.csv (28 MB)
    • structure_90.csv (5 MB)
    • JOS_structures_2021-03-09.xml (2 MB)
    • structure_68.csv (1 MB)
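
Each structure_N.csv file listed above holds the collocations for one syntactic structure. A minimal sketch of reading such a file and keeping only collocations above a chosen frequency threshold might look like this. The column name `frequency`, the comma delimiter, and the sample rows are assumptions made purely for illustration; the actual header and layout should be checked against the downloaded files.

```python
import csv
import io
from typing import Iterable, Iterator


def filter_by_frequency(
    rows: Iterable[dict],
    min_freq: int,
    freq_col: str = "frequency",  # hypothetical column name
) -> Iterator[dict]:
    """Yield only the rows whose absolute frequency is at least min_freq."""
    for row in rows:
        if int(row[freq_col]) >= min_freq:
            yield row


# Invented toy data standing in for one structure_N.csv file.
sample = io.StringIO(
    "lemma_1,lemma_2,frequency\n"
    "a,b,250\n"
    "c,d,40\n"
    "e,f,100\n"
)

reader = csv.DictReader(sample, delimiter=",")  # delimiter is an assumption
kept = list(filter_by_frequency(reader, min_freq=100))
```

For a real file, `sample` would be replaced by `open(path, newline="", encoding="utf-8")`, with the encoding again being an assumption to verify.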
