
 
dc.contributor.author Krek, Simon
dc.contributor.author Gantar, Polona
dc.contributor.author Kosem, Iztok
dc.contributor.author Dobrovoljc, Kaja
dc.contributor.author Arhar Holdt, Špela
dc.contributor.author Čibej, Jaka
dc.contributor.author Laskowski, Cyprian
dc.contributor.author Klemenc, Bojan
dc.contributor.author Krsnik, Luka
dc.date.accessioned 2021-03-16T08:36:14Z
dc.date.available 2021-03-16T08:36:14Z
dc.date.issued 2021-03-09
dc.identifier.uri http://hdl.handle.net/11356/1415
dc.description Frequency lists of collocations were extracted from the Gigafida 2.1 Corpus of Written Standard Slovene (https://www.clarin.si/noske/run.cgi/corp_info?corpname=gfida21) using specialised scripts for extracting data from syntactically parsed corpora. The lists contain collocations with an absolute frequency of 10 or above, split into files corresponding to 81 predefined syntactic structures. The dataset includes a formal description of the syntactic structures, with information on the restrictions and representations applied to POS and dependency parsing annotations. The lists are sorted by absolute collocation frequency and include frequency information for the individual lemmas, together with the most frequent representative forms of the combined lemmas. The lists also include the logDice score for each collocation and the number of distinct forms of the lemmas appearing in corpus hits for that collocation.
dc.language.iso slv
dc.publisher Centre for Language Resources and Technologies, University of Ljubljana
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.source.uri http://slovnica.ijs.si/
dc.subject collocations
dc.subject syntactic structures
dc.title Frequency lists of collocations from the Gigafida 2.1 corpus
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType lexicon
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
contact.person Simon Krek simon.krek@ijs.si Jožef Stefan Institute
sponsor ARRS (Slovenian Research Agency) J6-8256 New grammar of contemporary standard Slovene: sources and methods nationalFunds
sponsor ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds
sponsor University of Ljubljana P6-0215 Slovene Language - Basic, Contrastive, and Applied Studies nationalFunds
sponsor ARRS (Slovenian Research Agency) J6-8255 Collocations as a basis for language description: semantic and temporal perspectives nationalFunds
size.info 82 files
size.info 4002918 collocations
files.count 1
files.size 146338935
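
The description above says that each collocation carries a logDice score. As a rough illustration, the sketch below computes logDice under its commonly used definition, 14 + log2(2 * f_xy / (f_x + f_y)). The record does not state exactly which frequencies the extraction scripts feed into the formula, so the function name, parameter names, and the sample numbers are assumptions made purely for illustration.

```python
import math


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """Return logDice for a collocation, assuming the usual definition.

    f_xy: frequency of the collocation (both lemmas together)
    f_x, f_y: frequencies of the two individual lemmas
    All three parameter names are hypothetical, not column names from the dataset.
    """
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("frequencies must be positive")
    # Dice coefficient 2*f_xy / (f_x + f_y), moved to a log2 scale with offset 14
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


# Made-up frequencies, not taken from the dataset:
print(log_dice(10, 20, 20))  # 13.0
```

Under this definition the theoretical maximum is 14, reached when the two lemmas occur only together.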


 Files in this item

This item is publicly available and licensed under:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Name: GF2.1-Collocations_JOS-structures.zip
Size: 139.56 MB
Format: application/zip
Description: Collocations lists from Gigafida 2.1 in CSV format
MD5: 8dd20dcfc7dc048ead42d45643bb0342
 File Preview  
  • GF2.1-Collocations_JOS-structures
    • structure_41.csv (720 kB)
    • structure_19.csv (173 kB)
    • structure_57.csv (16 MB)
    • structure_95.csv (90 kB)
    • structure_30.csv (5 MB)
    • structure_103.csv (22 kB)
    • JOS_structures_2021-03-09.xsd (4 kB)
    • structure_46.csv (8 MB)
    • structure_84.csv (154 kB)
    • structure_35.csv (292 kB)
    • structure_73.csv (754 kB)
    • structure_108.csv (1 MB)
    • structure_89.csv (11 MB)
    • structure_24.csv (334 kB)
    • structure_13.csv (11 MB)
    • structure_78.csv (324 kB)
    • structure_51.csv (15 MB)
    • structure_29.csv (2 MB)
    • structure_40.csv (914 kB)
    • structure_18.csv (1 MB)
    • structure_94.csv (1 kB)
    • structure_102.csv (243 kB)
    • structure_45.csv (315 kB)
    • structure_83.csv (445 kB)
    • structure_99.csv (104 kB)
    • structure_34.csv (107 MB)
    • structure_72.csv (3 MB)
    • structure_107.csv (700 kB)
    • structure_88.csv (6 MB)
    • structure_23.csv (40 MB)
    • structure_39.csv (32 kB)
    • structure_12.csv (2 MB)
    • structure_77.csv (4 MB)
    • structure_50.csv (10 MB)
    • structure_28.csv (1 MB)
    • structure_17.csv (1 MB)
    • structure_55.csv (2 MB)
    • structure_93.csv (1 MB)
    • structure_101.csv (69 kB)
    • structure_44.csv (452 kB)
    • structure_82.csv (1 MB)
    • structure_98.csv (565 kB)
    • structure_71.csv (16 MB)
    • structure_106.csv (29 MB)
    • structure_49.csv (1 MB)
    • structure_87.csv (429 kB)
    • structure_22.csv (5 MB)
    • structure_38.csv (670 kB)
    • structure_76.csv (1 MB)
    • structure_27.csv (2 MB)
    • structure_16.csv (14 MB)
    • structure_54.csv (1 MB)
    • structure_92.csv (422 kB)
    • structure_100.csv (408 kB)
    • structure_43.csv (25 MB)
    • structure_81.csv (10 MB)
    • structure_32.csv (224 kB)
    • structure_70.csv (55 MB)
    • structure_105.csv (31 kB)
    • structure_48.csv (7 MB)
    • structure_86.csv (2 MB)
    • structure_37.csv (3 kB)
    • structure_75.csv (205 kB)
    • structure_26.csv (1 MB)
    • structure_15.csv (40 MB)
    • structure_53.csv (74 MB)
    • structure_91.csv (117 kB)
    • structure_69.csv (2 MB)
    • structure_42.csv (554 kB)
    • structure_80.csv (54 kB)
    • structure_96.csv (631 kB)
    • structure_31.csv (84 kB)
    • structure_104.csv (128 kB)
    • structure_47.csv (3 MB)
    • structure_85.csv (2 MB)
    • structure_36.csv (793 kB)
    • structure_74.csv (3 MB)
    • structure_25.csv (1 MB)
    • structure_14.csv (20 MB)
    • structure_52.csv (28 MB)
    • structure_90.csv (5 MB)
    • JOS_structures_2021-03-09.xml (2 MB)
    • structure_68.csv (1 MB)
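
Since the record lists an MD5 checksum for the archive, a downloaded copy can be checked against it before unpacking. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_record` are hypothetical, and the local path is whatever location the archive was saved to.

```python
import hashlib

# MD5 value as listed in this record for GF2.1-Collocations_JOS-structures.zip
EXPECTED_MD5 = "8dd20dcfc7dc048ead42d45643bb0342"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_record("GF2.1-Collocations_JOS-structures.zip")` on the saved archive should return True for an intact download; reading in chunks keeps memory use low for a file of this size.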
