Show simple item record

 
dc.contributor.author Verdonik, Darinka
dc.contributor.author Potočnik, Tomaž
dc.contributor.author Sepesy Maučec, Mirjam
dc.contributor.author Erjavec, Tomaž
dc.date.accessioned 2018-08-22T21:03:57Z
dc.date.available 2018-08-22T21:03:57Z
dc.date.issued 2018-08-01
dc.identifier.uri http://hdl.handle.net/11356/1190
dc.description Gos VideoLectures is an add-on to the Gos reference corpus of spoken Slovene (http://hdl.handle.net/11356/1040) and covers public academic speech. The Gos VideoLectures corpus contains a selection of public lectures available through the web portal Videolectures.net, provided by the Jožef Stefan Institute; it comprises 37 lectures and 16 hours of speech. This resource contains only the annotated transcriptions of the corpus; the audio recordings are available at http://hdl.handle.net/11356/1189. All transcriptions for Gos VideoLectures were produced manually and carefully checked. The main guidelines for transcription were those of the Gos corpus (http://www.korpus-gos.net/Support/About). The transcriptions were made with the transcription tool Transcriber 1.5.1 (http://trans.sourceforge.net/en/presentation.php), which can also be used to read the transcriptions (.trs files) or export them to other formats. The transcriptions comprise the TRS files with tabular metadata and their conversions to TEI and to the CWB vertical file format. Each recording has two TRS files, one with the pronunciation-based and the other with the standardised/normalised transcription. The TEI and CWB encodings join these two transcriptions at the token level, with the normalised words also automatically PoS-tagged and lemmatised. The corpus can be used to train continuous speech recognition for the Slovene language, for phonetic research, or for any other research on Slovene academic speech.
dc.language.iso slv
dc.publisher Faculty of Electrical Engineering and Computer Science, University of Maribor
dc.relation.replaces http://hdl.handle.net/11356/1158
dc.relation.isreplacedby http://hdl.handle.net/11356/1223
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.rights.label PUB
dc.subject speech database
dc.subject spoken corpus
dc.subject academic speech
dc.subject speech transcription
dc.subject speech recognition
dc.subject TEI
dc.title Spoken corpus Gos VideoLectures 3.0 (transcription)
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
hidden hidden
has.files yes
branding CLARIN.SI data & tools
contact.person Darinka Verdonik; darinka.verdonik@um.si; Faculty of Electrical Engineering and Computer Science, University of Maribor
sponsor Republic of Slovenia, Ministry of Culture; 3340-15-141005; Project Gos Videolectures; nationalFunds
size.info 37 texts
size.info 1141 utterances
size.info 7858 sentences
size.info 126197 words
files.count 3
files.size 3739533
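
The description above notes that the .trs files can be read with Transcriber. For readers who prefer to process them programmatically, the following is a minimal sketch in Python. It assumes the usual Transcriber layout, in which each Turn element carries speaker and startTime attributes and Sync elements mark segment boundaries. The sample document, its Slovene words, and the function name read_segments are invented for illustration; the real files in this item may use additional elements and attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature .trs document, invented for this sketch.
# Real files from the archive are larger and may differ in detail.
SAMPLE_TRS = """<Trans audio_filename="example" version="1">
  <Speakers>
    <Speaker id="spk1" name="lecturer"/>
  </Speakers>
  <Episode>
    <Section type="report" startTime="0" endTime="5.0">
      <Turn speaker="spk1" startTime="0" endTime="5.0">
        <Sync time="0"/>
        dober dan
        <Sync time="2.5"/>
        danes bomo govorili
      </Turn>
    </Section>
  </Episode>
</Trans>
"""


def read_segments(trs_xml):
    """Return (speaker, start_time, text) for each Sync-delimited segment."""
    root = ET.fromstring(trs_xml)
    segments = []
    for turn in root.iter("Turn"):
        speaker = turn.get("speaker", "")
        start = float(turn.get("startTime", "0"))
        parts = [turn.text or ""]
        for child in turn:
            if child.tag == "Sync":
                # A Sync closes the running segment and opens a new one.
                text = " ".join("".join(parts).split())
                if text:
                    segments.append((speaker, start, text))
                start = float(child.get("time", "0"))
                parts = []
            # Text following any child element is stored in its tail.
            parts.append(child.tail or "")
        text = " ".join("".join(parts).split())
        if text:
            segments.append((speaker, start, text))
    return segments


if __name__ == "__main__":
    for speaker, start, text in read_segments(SAMPLE_TRS):
        print(speaker, start, text)
```

To read a file from disk instead of the inline sample, ET.parse(path).getroot() could replace ET.fromstring in the function body.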


 Files in this item

This item is publicly available and licensed under Creative Commons - Attribution 4.0 International (CC BY 4.0).
Name GosVL.TEI.zip
Size 1.79 MB
Format application/zip
Description Tagged and lemmatised transcriptions in TEI format
MD5 bd91625a112dca9fccc554090620f655
File preview
  • GosVL.TEI
    • GosVL34_druzb.tei.xml (168 kB)
    • GosVL31_plesn.tei.xml (197 kB)
    • GosVL12_cujec.tei.xml (793 kB)
    • GosVL11_lhise.tei.xml (121 kB)
    • GosVL30_kramb.tei.xml (375 kB)
    • GosVL16_stara.tei.xml (371 kB)
    • GosVL15_celia.tei.xml (223 kB)
    • GosVL35_vicar.tei.xml (330 kB)
    • GosVL02_kleme.tei.xml (137 kB)
    • GosVL07_kungf.tei.xml (868 kB)
    • GosVL32_temen.tei.xml (586 kB)
    • GosVL08_droge.tei.xml (190 kB)
    • GosVL13_menin.tei.xml (311 kB)
    • GosVL36_bozan.tei.xml (224 kB)
    • GosVL06_kzcoi.tei.xml (402 kB)
    • 00README.txt (179 B)
    • GosVL24_inter.tei.xml (386 kB)
    • GosVL10_partn.tei.xml (593 kB)
    • GosVL28_reber.tei.xml (263 kB)
    • GosVL29_hladn.tei.xml (377 kB)
    • GosVL03_medit.tei.xml (245 kB)
    • GosVL.tei.xml (9 kB)
    • GosVL37_zadel.tei.xml (1 MB)
    • GosVL17_inten.tei.xml (276 kB)
    • GosVL20_zumer.tei.xml (123 kB)
    • MTE-msd.tei.xml (466 kB)
    • GosVL33_erjav.tei.xml (174 kB)
    • GosVL18_aritm.tei.xml (302 kB)
    • GosVL04_fitot.tei.xml (132 kB)
    • schema
      • tei_gos.zip (70 kB)
      • tei_gos.rnc (211 kB)
      • tei_gos.dtd (176 kB)
      • tei_gos_schema.xml (4 kB)
      • tei_gos_doc.html (4 MB)
      • tei_gos.rng (455 kB)
    • GosVL22_siste.tei.xml (229 kB)
    • GosVL27_zagar.tei.xml (181 kB)
    • GosVL26_slova.tei.xml (363 kB)
    • GosVL01_pravo.tei.xml (488 kB)
    • GosVL21_poraz.tei.xml (145 kB)
    • GosVL14_karci.tei.xml (225 kB)
    • GosVL19_pujsk.tei.xml (223 kB)
    • GosVL09_ocean.tei.xml (467 kB)
    • GosVL25_nanom.tei.xml (243 kB)
    • 00INDEX.txt (6 kB)
    • GosVL05_zsrce.tei.xml (121 kB)
    • GosVL23_jeklo.tei.xml (120 kB)
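
The TEI files listed above hold the tagged and lemmatised tokens. The sketch below shows one way such tokens might be pulled out with Python's standard library. The element and attribute names (u for utterances, w for words, lemma and ana on each word) are common TEI usage but are an assumption here, as is the invented sample; the actual encoding is defined by the tei_gos schema shipped in the archive and should be checked against it.

```python
import xml.etree.ElementTree as ET

# The standard TEI namespace; the prefix "tei" is only a local alias.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical fragment, invented for this sketch. The attribute
# names and values are assumptions, not taken from the corpus.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <u who="#spk1">
      <w lemma="dober" ana="#A">dober</w>
      <w lemma="dan" ana="#N">dan</w>
    </u>
  </body></text>
</TEI>
"""


def tokens_by_utterance(tei_xml):
    """Return a list of (speaker, [(form, lemma, tag), ...]) per utterance."""
    root = ET.fromstring(tei_xml)
    result = []
    for utterance in root.iterfind(".//tei:u", TEI_NS):
        tokens = [
            (word.text or "", word.get("lemma", ""), word.get("ana", ""))
            for word in utterance.iterfind(".//tei:w", TEI_NS)
        ]
        result.append((utterance.get("who", ""), tokens))
    return result


if __name__ == "__main__":
    for speaker, tokens in tokens_by_utterance(SAMPLE_TEI):
        print(speaker, tokens)
```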
Name GosVL.vert.zip
Size 766.06 KB
Format application/zip
Description Tagged and lemmatised transcriptions in vertical format
MD5 b4d5796aef110ac0de24bf4bc7a74de5
File preview
  • GosVL.vert
    • gos_vl.vert (4 MB)
    • gos_vl.regi (4 kB)
    • 00INDEX.txt (6 kB)
    • 00README.txt (179 B)
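
A CWB vertical file such as gos_vl.vert generally holds one token per line with tab-separated attributes, interleaved with XML-like lines for structures. The following sketch reads that general shape. The column order used in the sample (pronunciation-based form, normalised form, lemma, tag) is an assumption made for illustration, as are the sample tokens; the actual attribute declarations should be taken from gos_vl.regi.

```python
# Hypothetical sample, invented for this sketch. The four columns
# (pronunciation form, normalised form, lemma, tag) are an assumed order.
SAMPLE_VERT = (
    '<text id="example">\n'
    '<u who="spk1">\n'
    "dobr\tdober\tdober\tA\n"
    "dan\tdan\tdan\tN\n"
    "</u>\n"
    "</text>\n"
)


def read_vertical(lines):
    """Return one tuple of attribute values per token line.

    Lines starting with '<' are treated as structural markup and skipped;
    this assumes a literal '<' token would be escaped in the file.
    """
    tokens = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("<"):
            continue
        tokens.append(tuple(line.split("\t")))
    return tokens


if __name__ == "__main__":
    for token in read_vertical(SAMPLE_VERT.splitlines()):
        print(token)
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly, which avoids loading the whole file into memory.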
Name GosVL.TRS.zip
Size 1.03 MB
Format application/zip
Description Transcriptions in TRS format with tabular metadata (in Slovene)
MD5 338bdef6da39e5d4ed6662b9eb0f25a4
File preview
  • GosVL.TRS
    • GosVL06_kzcoi_dis.txt (461 B)
    • GosVL24_inter_s3.trs (29 kB)
    • GosVL37_zadel_s3.trs (146 kB)
    • GosVL26_slova_g3.txt (148 B)
    • GosVL09_ocean_g1.txt (150 B)
    • GosVL24_inter_dis.txt (399 B)
    • 00INDEX.txt (6 kB)
    • GosVL09_ocean_s3.trs (43 kB)
    • GosVL06_kzcoi_s2.trs (33 kB)
    • GosVL35_vicar_dis.txt (475 B)
    • GosVL17_inten_s2.trs (19 kB)
    • GosVL23_jeklo_s2.trs (10 kB)
    • GosVL28_reber_s2.trs (26 kB)
    • GosVL18_aritm_dis.txt (743 B)
    • GosVL18_aritm_s2.trs (26 kB)
    • GosVL15_celia_g2.txt (287 B)
    • GosVL05_zsrce_dis.txt (473 B)
    • GosVL01_pravo_dis.txt (429 B)
    • GosVL34_druzb_s2.trs (14 kB)
    • GosVL03_medit_g1.txt (154 B)
    • GosVL20_zumer_dis.txt (758 B)
    • GosVL03_medit_s3.trs (22 kB)
    • GosVL20_zumer_s2.trs (10 kB)
    • GosVL29_hladn_g1.txt (149 B)
    • GosVL05_zsrce_g1.txt (155 B)
    • GosVL10_partn_g1.txt (154 B)
    • GosVL36_bozan_dis.txt (441 B)
    • GosVL14_karci_s2.trs (20 kB)
    • GosVL29_hladn_s3.trs (37 kB)
    • GosVL28_reber_dis.txt (454 B)
    • GosVL05_zsrce_s3.trs (11 kB)
    • GosVL10_partn_s3.trs (52 kB)
    • GosVL01_pravo_s2.trs (38 kB)
    • GosVL08_droge_dis.txt (410 B)
    • GosVL32_temen_g1.txt (149 B)
    • GosVL16_stara_g1.txt (259 B)
    • GosVL10_partn_dis.txt (447 B)
    • GosVL32_temen_s3.trs (52 kB)
    • GosVL16_stara_s3.trs (34 kB)
    • GosVL12_cujec_dis.txt (696 B)
    • GosVL35_vicar_g2.txt (154 B)
    • GosVL02_kleme_g1.txt (155 B)
    • GosVL24_inter_s2.trs (28 kB)
    • GosVL17_inten_dis.txt (705 B)
    • GosVL37_zadel_s2.trs (143 kB)
    • trans-14.dtd (2 kB)
    • GosVL02_kleme_s3.trs (14 kB)
    • GosVL03_medit_dis.txt (424 B)
    • GosVL26_slova_g2.txt (148 B)
    • GosVL37_zadel_dis.txt (437 B)
    • GosVL09_ocean_s2.trs (43 kB)
    • GosVL11_lhise_g2.txt (154 B)
    • GosVL15_celia_g1.txt (287 B)
    • GosVL07_kungf_dis.txt (433 B)
    • GosVL27_zagar_g1.txt (149 B)
    • GosVL30_kramb_dis.txt (422 B)
    • GosVL15_celia_s3.trs (25 kB)
    • GosVL07_kungf_g1.txt (154 B)
    • GosVL03_medit_s2.trs (21 kB)
    • GosVL27_zagar_s3.trs (16 kB)
    • GosVL07_kungf_s3.trs (71 kB)
    • GosVL34_druzb_dis.txt (443 B)
    • GosVL29_hladn_s2.trs (36 kB)
    • GosVL09_ocean_dis.txt (426 B)
    • GosVL12_cujec_g1.txt (288 B)
    • GosVL05_zsrce_s2.trs (11 kB)
    • GosVL11_lhise_dis.txt (490 B)
    • GosVL23_jeklo_dis.txt (734 B)
    • GosVL10_partn_s2.trs (50 kB)
    • GosVL21_poraz_g1.txt (156 B)
    • GosVL12_cujec_s3.trs (82 kB)
    • GosVL33_erjav_g1.txt (149 B)
    • GosVL31_plesn_g1.txt (148 B)
    • GosVL21_poraz_s3.trs (13 kB)
    • GosVL13_menin_dis.txt (695 B)
    • GosVL32_temen_s2.trs (51 kB)
    • GosVL16_stara_s2.trs (33 kB)
    • GosVL31_plesn_s3.trs (16 kB)
    • GosVL33_erjav_s3.trs (15 kB)
    • GosVL25_nanom_g1.txt (151 B)
    • GosVL22_siste_dis.txt (750 B)
    • GosVL35_vicar_g1.txt (154 B)
    • GosVL21_poraz_dis.txt (723 B)
    • GosVL19_pujsk_dis.txt (786 B)
    • GosVL04_fitot_g1.txt (154 B)
    • GosVL25_nanom_s3.trs (23 kB)
    • GosVL35_vicar_s3.trs (40 kB)
    • GosVL19_pujsk_g1.txt (287 B)
    • GosVL22_siste_g1.txt (285 B)
    • GosVL02_kleme_s2.trs (14 kB)
    • GosVL26_slova_g1.txt (148 B)
    • GosVL04_fitot_s3.trs (12 kB)
    • GosVL25_nanom_dis.txt (457 B)
    • GosVL36_bozan_g1.txt (149 B)
    • GosVL19_pujsk_s3.trs (19 kB)
    • GosVL08_droge_g1.txt (151 B)
    • GosVL22_siste_s3.trs (15 kB)
    • GosVL26_slova_s3.trs (42 kB)
    • GosVL11_lhise_g1.txt (154 B)
    • GosVL13_menin_g1.txt (285 B)
    • GosVL36_bozan_s3.trs (21 kB)
    • GosVL08_droge_s3.trs (26 kB)
    • GosVL30_kramb_g1.txt (149 B)
    • 00README.txt (179 B)
    • GosVL11_lhise_s3.trs (12 kB)
    • GosVL13_menin_s3.trs (29 kB)
    • GosVL33_erjav_dis.txt (447 B)
    • GosVL30_kramb_s3.trs (39 kB)
    • GosVL15_celia_s2.trs (25 kB)
    • GosVL02_kleme_dis.txt (455 B)
    • GosVL27_zagar_dis.txt (440 B)
    • GosVL32_temen_dis.txt (423 B)
    • GosVL27_zagar_s2.trs (15 kB)
    • GosVL07_kungf_s2.trs (68 kB)
    • GosVL26_slova_dis.txt (450 B)
    • GosVL31_plesn_dis.txt (463 B)
    • GosVL14_karci_dis.txt (764 B)
    • GosVL29_hladn_dis.txt (459 B)
    • GosVL26_slova_g4.txt (148 B)
    • GosVL06_kzcoi_g1.txt (155 B)
    • GosVL17_inten_g1.txt (286 B)
    • GosVL23_jeklo_g1.txt (150 B)
    • GosVL28_reber_g1.txt (149 B)
    • GosVL12_cujec_s2.trs (81 kB)
    • GosVL18_aritm_g1.txt (265 B)
    • GosVL06_kzcoi_s3.trs (34 kB)
    • GosVL21_poraz_s2.trs (13 kB)
    • GosVL17_inten_s3.trs (19 kB)
    • GosVL15_celia_dis.txt (742 B)
    • GosVL23_jeklo_s3.trs (10 kB)
    • GosVL34_druzb_g1.txt (150 B)
    • GosVL28_reber_s3.trs (27 kB)
    • GosVL18_aritm_s3.trs (27 kB)
    • GosVL31_plesn_s2.trs (16 kB)
    • GosVL33_erjav_s2.trs (15 kB)
    • GosVL34_druzb_s3.trs (15 kB)
    • GosVL20_zumer_g1.txt (264 B)
    • GosVL-README.trs.pdf (248 kB)
    • GosVL20_zumer_s3.trs (10 kB)
    • GosVL25_nanom_s2.trs (23 kB)
    • GosVL14_karci_g1.txt (286 B)
    • GosVL35_vicar_s2.trs (39 kB)
    • GosVL04_fitot_s2.trs (12 kB)
    • GosVL01_pravo_g1.txt (154 B)
    • GosVL14_karci_s3.trs (20 kB)
    • GosVL19_pujsk_s2.trs (19 kB)
    • GosVL22_siste_s2.trs (15 kB)
    • GosVL26_slova_s2.trs (42 kB)
    • GosVL01_pravo_s3.trs (38 kB)
    • GosVL36_bozan_s2.trs (21 kB)
    • GosVL08_droge_s2.trs (19 kB)
    • GosVL11_lhise_s2.trs (11 kB)
    • GosVL04_fitot_dis.txt (442 B)
    • GosVL13_menin_s2.trs (28 kB)
    • GosVL30_kramb_s2.trs (38 kB)
    • GosVL16_stara_dis.txt (754 B)
    • GosVL24_inter_g1.txt (153 B)
    • GosVL37_zadel_g1.txt (150 B)
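
Each of the three archives in this item is listed with an MD5 checksum, which can be used to confirm that a download is intact. The sketch below computes the digest with Python's hashlib and compares it with the values from this record. The function names md5_of_file and check_archives are hypothetical, and the code assumes the archives sit in a single local directory under their original names.

```python
import hashlib
import os

# Checksums copied from this record.
EXPECTED = {
    "GosVL.TEI.zip": "bd91625a112dca9fccc554090620f655",
    "GosVL.vert.zip": "b4d5796aef110ac0de24bf4bc7a74de5",
    "GosVL.TRS.zip": "338bdef6da39e5d4ed6662b9eb0f25a4",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_archives(directory):
    """Map each archive name to True only if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED.items():
        path = os.path.join(directory, name)
        results[name] = os.path.isfile(path) and md5_of_file(path) == expected
    return results
```

Calling check_archives(".") after downloading the archives into the current directory returns a dictionary with one boolean per archive; a False value means the file is missing or its digest differs from the one listed here.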
