
 
dc.contributor.author Čibej, Jaka
dc.contributor.author Arhar Holdt, Špela
dc.contributor.author Dobrovoljc, Kaja
dc.contributor.author Krek, Simon
dc.date.accessioned 2020-02-19T11:09:29Z
dc.date.available 2020-02-19T11:09:29Z
dc.date.issued 2020-02-13
dc.identifier.uri http://hdl.handle.net/11356/1290
dc.description The lists contain consonant-vowel structures of all lemmas, word forms, and normalized word forms in the GOS 1.0 Corpus of Spoken Slovene (http://hdl.handle.net/11356/1040). In each unit, characters were converted as follows: C - consonant (in the lists with finegrained consonant categorization, consonants were divided into Z - sonorant, G - voiced obstruent, and K - voiceless obstruent), V - vowel, X - foreign consonant, Y - foreign vowel, S - symbol, P - punctuation, N - number, F - non-Latin-script character, ! - other. Each consonant-vowel structure is listed with its frequency in the corpus (i.e. the sum of the frequencies of all units corresponding to the structure) and with either the set of all its units (in the lists labeled "entire") or the set of its 30 most frequent units (in the lists labeled "short"), along with their part-of-speech categories and individual frequencies. The lists also give the number of unique units within each consonant-vowel structure. The lists were prepared from frequency lists extracted from GOS 1.0 using LIST (http://hdl.handle.net/11356/1276). A related resource is "Consonant-vowel structures in the Gigafida 2.0 corpus" (http://hdl.handle.net/11356/1289).
dc.language.iso slv
dc.publisher Centre for Language Resources and Technologies, University of Ljubljana
dc.publisher Jožef Stefan Institute
dc.relation.isreplacedby http://hdl.handle.net/11356/1367
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.source.uri http://slovnica.ijs.si/
dc.subject consonants
dc.subject vowels
dc.subject consonant-vowel structures
dc.subject GOS
dc.subject spoken Slovene
dc.subject sonorants
dc.subject obstruents
dc.subject frequency list
dc.title Consonant-vowel structures in the GOS 1.0 corpus
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType wordList
metashare.ResourceInfo#ContentInfo.mediaType text
hidden hidden
has.files yes
branding CLARIN.SI data & tools
contact.person Jaka Čibej jaka.cibej@cjvt.si Centre for Language Resources and Technologies, University of Ljubljana
sponsor ARRS (Slovenian Research Agency) J6-8256 New grammar of contemporary standard Slovene: sources and methods nationalFunds
files.count 7
files.size 3773624
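
The conversion and aggregation described in dc.description can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the letter sets below are an assumed, simplified categorization of native Slovene letters only, the function names `cv_structure` and `group_by_structure` are hypothetical, and the categories X, Y, S, P, and F are omitted (anything unrecognized falls through to "!"). The categorization actually used is the one distributed in GOS1.0_character_categorization.tsv.

```python
from collections import defaultdict

# ASSUMPTION: simplified letter sets for native Slovene letters only.
# The resource's real mapping is in GOS1.0_character_categorization.tsv.
VOWELS = set("aeiou")
SONORANTS = set("mnrlvj")
VOICED_OBSTRUENTS = set("bdgzž")
VOICELESS_OBSTRUENTS = set("ptksšcčfh")


def cv_structure(unit, finegrained=False):
    """Map each character of a unit to its category symbol."""
    symbols = []
    for ch in unit.lower():
        if ch in VOWELS:
            symbols.append("V")
        elif ch in SONORANTS:
            symbols.append("Z" if finegrained else "C")
        elif ch in VOICED_OBSTRUENTS:
            symbols.append("G" if finegrained else "C")
        elif ch in VOICELESS_OBSTRUENTS:
            symbols.append("K" if finegrained else "C")
        elif ch.isdigit():
            symbols.append("N")
        else:
            # X, Y, S, P and F are not modelled in this sketch.
            symbols.append("!")
    return "".join(symbols)


def group_by_structure(unit_freqs, finegrained=False, top_n=None):
    """Group units by structure; top_n=30 mimics the "short" lists."""
    groups = defaultdict(list)
    for unit, freq in unit_freqs.items():
        groups[cv_structure(unit, finegrained)].append((unit, freq))

    result = {}
    for structure, units in groups.items():
        # Most frequent units first; ties broken alphabetically.
        units.sort(key=lambda uf: (-uf[1], uf[0]))
        result[structure] = {
            "frequency": sum(freq for _, freq in units),
            "unique_units": len(units),
            "units": units if top_n is None else units[:top_n],
        }
    return result
```

Under these assumptions, `cv_structure("beseda")` yields `CVCVCV` with the robust categorization and `GVKVGV` with the finegrained one. Part-of-speech categories are left out of the sketch for brevity.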


 Files in this item

This item is
Publicly Available
and licensed under:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Name GOS1.0_cv_forms_robust.zip
Size 546.98 KB
Format application/zip
Description Consonant-vowel structures of word forms in GOS 1.0 (robust consonant categorization)
MD5 2ea66e10ae9513b3f7b5fe584f50e8f7
File preview
    • GOS1.0_cv_forms_robust_entire.tsv (1 MB)
    • GOS1.0_cv_forms_robust_short.tsv (587 kB)
Name GOS1.0_cv_forms_finegrained.zip
Size 963.23 KB
Format application/zip
Description Consonant-vowel structures of word forms in GOS 1.0 (finegrained consonant categorization)
MD5 46dfb7f542ef7a91a936d0f8ee3e5cfb
File preview
    • GOS1.0_cv_forms_finegrained_short.tsv (2 MB)
    • GOS1.0_cv_forms_finegrained_entire.tsv (2 MB)
Name GOS1.0_cv_lemmas_robust.zip
Size 285.35 KB
Format application/zip
Description Consonant-vowel structures of lemmas in GOS 1.0 (robust consonant categorization)
MD5 ab7e84b286f71c8cb0ce9c1be7732633
File preview
    • GOS1.0_cv_lemmas_robust_short.tsv (392 kB)
    • GOS1.0_cv_lemmas_robust_entire.tsv (566 kB)
Name GOS1.0_cv_lemmas_finegrained.zip
Size 485.04 KB
Format application/zip
Description Consonant-vowel structures of lemmas in GOS 1.0 (finegrained consonant categorization)
MD5 1e795c87f135c0af910701e1a3725332
File preview
    • GOS1.0_cv_lemmas_finegrained_entire.tsv (1 MB)
    • GOS1.0_cv_lemmas_finegrained_short.tsv (1 MB)
Name GOS1.0_cv_norms_robust.zip
Size 514.04 KB
Format application/zip
Description Consonant-vowel structures of normalized word forms in GOS 1.0 (robust consonant categorization)
MD5 4942151e3818eb9b1dde5c4ba3b55ef3
File preview
    • GOS1.0_cv_norms_robust_short.tsv (609 kB)
    • GOS1.0_cv_norms_robust_entire.tsv (1 MB)
Name GOS1.0_cv_norms_finegrained.zip
Size 887.48 KB
Format application/zip
Description Consonant-vowel structures of normalized word forms in GOS 1.0 (finegrained consonant categorization)
MD5 532b64e87fbbdcb7745307f537bdff53
File preview
    • GOS1.0_cv_norms_finegrained_short.tsv (2 MB)
    • GOS1.0_cv_norms_finegrained_entire.tsv (2 MB)
Name GOS1.0_character_categorization.tsv
Size 3.06 KB
Format Unknown
Description Categorization of characters in GOS 1.0
MD5 c944d2c29567bd486f72a4ee17a45ffd
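
Each file above is listed with an MD5 checksum, which can be used to check that a download arrived intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_md5` are hypothetical, and the local file path is assumed to be wherever the file was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so large archives need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected_hex):
    """Compare a file's MD5 digest with the value from the record."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_md5("GOS1.0_cv_lemmas_robust.zip", "ab7e84b286f71c8cb0ce9c1be7732633")` should return `True` for an uncorrupted copy of that archive saved in the current directory.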
