
 
dc.contributor.author Verdonik, Darinka
dc.contributor.author Potočnik, Tomaž
dc.contributor.author Sepesy Maučec, Mirjam
dc.contributor.author Erjavec, Tomaž
dc.date.accessioned 2016-08-02T09:42:47Z
dc.date.available 2016-08-02T09:42:47Z
dc.date.issued 2016-08-01
dc.identifier.uri http://hdl.handle.net/11356/1069
dc.description Gos Videolectures is an add-on to the Gos reference speech corpus of Slovene (http://hdl.handle.net/11356/1040) and covers public academic speech. The Gos Videolectures recordings are a selection of public lectures available through the web portal Videolectures.net, provided by the Jožef Stefan Institute, and cover 4.5 hours of speech in their first release. This resource contains only the transcriptions of the corpus; the audio recordings are available at the CLARIN.SI handle http://hdl.handle.net/11356/1070. All transcriptions for Gos Videolectures were made manually and carefully checked. The main transcription guidelines were those of the Gos corpus (http://www.korpus-gos.net/Support/About). The transcriptions were made with the transcription tool Transcriber 1.5.1 (http://trans.sourceforge.net/en/presentation.php), which can also be used to read the transcriptions (.trs files) or export them to other formats. The transcriptions comprise the TRS files with tabular metadata and their conversions to TEI and to the CWB vertical file format. Each recording has two TRS files, one with the phonetic and the other with the normalised transcription. The TEI and CWB encodings join these two transcriptions at the token level, and the normalised words are also automatically PoS tagged and lemmatised. The corpus can be used for training continuous speech recognition for the Slovene language, for phonetic research, or for any other research on Slovene academic speech.
dc.language.iso slv
dc.publisher Faculty of Electrical Engineering and Computer Science, University of Maribor
dc.relation.isreplacedby http://hdl.handle.net/11356/1158
dc.rights Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-nc/4.0/
dc.rights.label PUB
dc.subject speech database
dc.subject spoken corpus
dc.subject academic speech
dc.subject speech transcription
dc.subject speech recognition
dc.subject TEI
dc.title Spoken corpus Gos VideoLectures 1.0 (transcription)
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType audio
hidden hidden
has.files yes
branding CLARIN.SI data & tools
contact.person Darinka Verdonik darinka.verdonik@um.si Faculty of Electrical Engineering and Computer Science, University of Maribor
sponsor Republic of Slovenia, Ministry of Culture 3340-15-141005 Project Gos Videolectures nationalFunds
size.info 11 texts
size.info 1984 sentences
size.info 38766 words
files.count 3
files.size 1466882
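
The description above says that each recording has two TRS files that Transcriber can read. As a minimal sketch of reading such a file without Transcriber, the Python below walks the XML and returns the text between synchronisation points. It assumes the usual Transcriber layout (`Turn` elements with `speaker` and `startTime` attributes, containing empty `Sync` elements with a `time` attribute); these names come from the Transcriber format, not from this record, so check them against the trans-14.dtd shipped in the archive. The function name `iter_segments` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def iter_segments(root):
    """Yield (speaker, time, text) for each Sync-delimited stretch of a Turn.

    Assumes the Transcriber layout: text lives in the tails of the empty
    child elements (Sync, Event, ...) inside each Turn.
    """
    for turn in root.iter("Turn"):
        speaker = turn.get("speaker", "")
        current_time = float(turn.get("startTime", "0"))
        parts = [turn.text or ""]
        for child in turn:
            if child.tag == "Sync":
                # A Sync closes the previous segment and opens a new one.
                text = " ".join("".join(parts).split())
                if text:
                    yield speaker, current_time, text
                current_time = float(child.get("time", current_time))
                parts = []
            # Text following any child element belongs to the open segment.
            parts.append(child.tail or "")
        text = " ".join("".join(parts).split())
        if text:
            yield speaker, current_time, text


# Invented sample, only to show the expected shape of the input.
SAMPLE_TRS = """<Trans><Episode><Section type="report" startTime="0" endTime="5">
<Turn speaker="spk1" startTime="0" endTime="5">
<Sync time="0"/>
dober dan
<Event desc="laugh" type="noise" extent="instantaneous"/>
vsem
<Sync time="2.5"/>
hvala
</Turn></Section></Episode></Trans>"""

if __name__ == "__main__":
    for segment in iter_segments(ET.fromstring(SAMPLE_TRS)):
        print(segment)
```

For a real file, `ET.parse(path).getroot()` can be passed to `iter_segments` in place of the parsed sample string.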


 Files in this item

 Download all files in the item (1.4 MB)
Name
GosVL.tei.zip
Size
762.03 KB
Format
application/zip
Description
Tagged and lemmatised transcriptions in TEI
MD5
a0ba4d7956b7bf143e7e23b1246f876e
 File preview
  • GosVL
    • GosVL06_kzcoi.tei.xml (401 kB)
    • GosVL02_kleme.tei.xml (137 kB)
    • GosVL11_lhise.tei.xml (121 kB)
    • GosVL10_partn.tei.xml (591 kB)
    • schema
      • tei_gos.zip (46 kB)
      • tei_gos.rnc (201 kB)
      • tei_gos.dtd (170 kB)
      • tei_gos_schema.xml (5 kB)
      • tei_gos_doc.html (2 MB)
      • tei_gos.rng (435 kB)
    • GosVL03_medit.tei.xml (244 kB)
    • MTE-msd.tei.xml (160 kB)
    • GosVL08_droge.tei.xml (188 kB)
    • GosVL04_fitot.tei.xml (131 kB)
    • GosVL09_ocean.tei.xml (466 kB)
    • GosVL01_pravo.tei.xml (487 kB)
    • GosVL05_zsrce.tei.xml (121 kB)
    • GosVL07_kungf.tei.xml (867 kB)
    • GosVL.tei.xml (7 kB)
Name
GosVL.vrt.zip
Size
204.44 KB
Format
application/zip
Description
Transcriptions in CQP/Manatee concordancer vertical format with registry file.
MD5
9c7e733bc2f47ddbf468c6bb60210af2
 File preview
  • GosVL
    • gos_vl.vert (1 MB)
    • gos_vl.registry (1 kB)
Name
GosVL.trs.zip
Size
466.03 KB
Format
application/zip
Description
Transcriptions in TRS format with tabular metadata (in Slovene)
MD5
ab63974789a70784a990430a14c65cd5
 File preview
  • GosVL
    • GosVL11_lhise_g1.txt (154 B)
    • GosVL02_kleme_dis.txt (455 B)
    • GosVL07_kungf_dis.txt (433 B)
    • GosVL04_fitot_s3.trs (12 kB)
    • GosVL01_pravo_s2.trs (38 kB)
    • GosVL08_droge_dis.txt (410 B)
    • GosVL10_partn_s2.trs (50 kB)
    • GosVL08_droge_s3.trs (26 kB)
    • GosVL05_zsrce_s3.trs (11 kB)
    • GosVL01_pravo_g1.txt (154 B)
    • GosVL06_kzcoi_dis.txt (461 B)
    • GosVL06_kzcoi_s3.trs (34 kB)
    • GosVL10_partn_g1.txt (154 B)
    • GosVL10_partn_dis.txt (447 B)
    • GosVL04_fitot_s2.trs (12 kB)
    • GosVL03_medit_dis.txt (424 B)
    • GosVL09_ocean_s3.trs (43 kB)
    • GosVL07_kungf_s3.trs (71 kB)
    • GosVL04_fitot_dis.txt (442 B)
    • GosVL04_fitot_g1.txt (154 B)
    • GosVL03_medit_s3.trs (22 kB)
    • GosVL02_kleme_s3.trs (14 kB)
    • GosVL08_droge_s2.trs (19 kB)
    • GosVL05_zsrce_s2.trs (11 kB)
    • GosVL06_kzcoi_s2.trs (33 kB)
    • GosVL-README.trs.pdf (248 kB)
    • GosVL01_pravo_dis.txt (429 B)
    • GosVL11_lhise_s3.trs (12 kB)
    • GosVL09_ocean_dis.txt (426 B)
    • GosVL08_droge_g1.txt (151 B)
    • GosVL05_zsrce_g1.txt (155 B)
    • GosVL06_kzcoi_g1.txt (155 B)
    • GosVL11_lhise_g2.txt (154 B)
    • GosVL09_ocean_s2.trs (43 kB)
    • GosVL07_kungf_s2.trs (68 kB)
    • GosVL05_zsrce_dis.txt (473 B)
    • trans-14.dtd (2 kB)
    • GosVL03_medit_s2.trs (21 kB)
    • GosVL02_kleme_s2.trs (14 kB)
    • GosVL09_ocean_g1.txt (150 B)
    • GosVL11_lhise_s2.trs (11 kB)
    • GosVL07_kungf_g1.txt (154 B)
    • GosVL01_pravo_s3.trs (38 kB)
    • GosVL03_medit_g1.txt (154 B)
    • GosVL02_kleme_g1.txt (155 B)
    • GosVL11_lhise_dis.txt (490 B)
    • GosVL10_partn_s3.trs (52 kB)
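
The record lists an MD5 checksum for each of the three archives. A minimal sketch of checking downloaded copies against those values is given below; the archive names and hashes are copied from the listing above, while the function names and the assumption that the archives sit in one local directory under their original names are illustrative only.

```python
import hashlib
from pathlib import Path

# Archive names and MD5 values as listed in this record.
EXPECTED_MD5 = {
    "GosVL.tei.zip": "a0ba4d7956b7bf143e7e23b1246f876e",
    "GosVL.vrt.zip": "9c7e733bc2f47ddbf468c6bb60210af2",
    "GosVL.trs.zip": "ab63974789a70784a990430a14c65cd5",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each archive name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

A mismatch here only indicates a corrupted or incomplete download; MD5 is not suitable as a guard against deliberate tampering.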
