dc.contributor.author | Božović, Petar |
dc.contributor.author | Erjavec, Tomaž |
dc.contributor.author | Tiedemann, Jörg |
dc.contributor.author | Ljubešić, Nikola |
dc.contributor.author | Gorjanc, Vojko |
dc.date.accessioned | 2018-03-20T10:53:17Z |
dc.date.available | 2018-03-20T10:53:17Z |
dc.date.issued | 2018-03-20 |
dc.identifier.uri | http://hdl.handle.net/11356/1176 |
dc.description | This corpus contains parallel English-Montenegrin subtitles collected by Petar Božović as part of the linguistic and translatological research for his PhD thesis "Audiovisual Translation and Elements of Culture: A Comparative Analysis of Transfer with Reception Study in Montenegro". The data and the permission to redistribute them were obtained from the Radio and Television of Montenegro (http://www.rtcg.me), the public service broadcaster of Montenegro. The corpus consists of English and Montenegrin subtitles of three TV series: House of Cards (686 minutes), Damages (2,878 minutes), and Tudors (1,999 minutes). In total, it covers 10 seasons, 110 episodes, and 5,563 minutes of running time. Sentence alignment and basic encoding were performed within the OPUS project (http://opus.nlpl.eu/MontenegrinSubs.php), while MSD tagging, lemmatisation, and TEI conversion were performed by the CLARIN.SI infrastructure. The English texts were tagged with TreeTagger (http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/) and the Montenegrin texts with ReLDI Tagger (https://github.com/clarinsi/reldi-tagger) using the Serbian language model. The TreeTagger (Penn Treebank) tagset was mapped to the SPOOK MSD tagset for English (https://nl.ijs.si/spook/msd/html-en/msd-en.html). The corpus is available in TEI format and in a derived vertical format used by CQP and Manatee (Sketch Engine). The alignments for the vertical files are given separately as tables linking the alignment elements of the two languages. |
dc.language.iso | cnr |
dc.language.iso | eng |
dc.publisher | Jožef Stefan Institute |
dc.relation.isreferencedby | http://www.sdjt.si/wp/wp-content/uploads/2018/09/JTDH-2018_Bozovic-et-al_Opus-MontenegrinSubs-1-0-First-electronic-corpus-of-the-Montenegrin-language.pdf |
dc.rights | Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by-sa/4.0/ |
dc.rights.label | PUB |
dc.source.uri | https://opus.nlpl.eu/MontenegrinSubs/corpus/version/MontenegrinSubs |
dc.subject | parallel corpus |
dc.subject | subtitles |
dc.subject | multilingual |
dc.title | English-Montenegrin parallel corpus of subtitles Opus-MontenegrinSubs 1.0 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | CLARIN.SI data & tools |
contact.person | Nikola Ljubešić nikola.ljubesic@ijs.si Jožef Stefan Institute |
sponsor | ARRS (Slovenian Research Agency) P2-103 Knowledge Technologies nationalFunds |
sponsor | Jožef Stefan Institute CLARIN CLARIN.SI nationalFunds |
size.info | 133547 units |
size.info | 853165 words |
files.count | 2 |
files.size | 13482052 |
featuredService.kontext | search|https://www.clarin.si/kontext/first_form?corpname=opusmonte_cnr |
featuredService.noske | search|https://www.clarin.si/ske/#dashboard?corpname=opusmonte_cnr |
Files in this item
Download all files in this item (12.86 MB)
This item is Publicly Available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)




- Name: OpusMonte.TEI.zip
- Size: 8.31 MB
- Format: application/zip
- Description: Corpus in TEI format
- MD5: 82405f90314ed95950f2577d9fc5939f
- OpusMonte.TEI
  - opusmonte_cnr.ana.xml (46 MB)
  - opusmonte_alg.xml (6 MB)
  - opus2vert.xsl (2 kB)
  - schema
    - tei_clarin.zip (42 kB)
    - tei_clarin_schema.xml (2 kB)
    - tei_clarin.rnc (184 kB)
    - tei_clarin.dtd (146 kB)
    - README.md (396 B)
    - tei_clarin_doc.html (1 MB)
    - tei_clarin.rng (377 kB)
  - opusmonte_en.ana.xml (65 MB)
  - 00README.txt (227 B)
  - opusmonte.xml (6 kB)

- Name: OpusMonte.vert.zip
- Size: 4.54 MB
- Format: application/zip
- Description: Corpus in derived vertical format
- MD5: 24596dab9f565c615929b492720538df
- OpusMonte.vert
  - opusmonte_en-cnr.tbl (770 kB)
  - opusmonte_en.vert (11 MB)
  - opusmonte_en.regi (2 kB)
  - opusmonte_cnr-en.tbl (770 kB)
  - opus2vert.xsl (2 kB)
  - opusmonte_cnr.vert (10 MB)
  - 00README.txt (227 B)
  - opusmonte_cnr.regi (2 kB)
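
The MD5 values listed for the two archives can be used to check that a download is intact. The following is a minimal Python sketch, not part of the record itself: the function names `md5_of_file` and `verify_downloads` are hypothetical, and it assumes the archives were saved under their listed names in a single directory.

```python
import hashlib
from pathlib import Path

# MD5 digests as published in the file listing of this record.
EXPECTED_MD5 = {
    "OpusMonte.TEI.zip": "82405f90314ed95950f2577d9fc5939f",
    "OpusMonte.vert.zip": "24596dab9f565c615929b492720538df",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each archive found in `directory` to True (digest matches) or False.

    Archives that are not present are skipped. The directory layout is an
    assumption made for this sketch.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        print(f"{name}: {'OK' if ok else 'MD5 MISMATCH'}")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the archive should be fetched again.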