dc.contributor.author | Erjavec, Tomaž |
dc.contributor.author | Bruda, Ştefan |
dc.contributor.author | Dimitrova, Ludmila |
dc.contributor.author | Ide, Nancy |
dc.contributor.author | Kaalep, Heiki-Jaan |
dc.contributor.author | Krstev, Cvetana |
dc.contributor.author | Orav, Heili |
dc.contributor.author | Oravecz, Csaba |
dc.contributor.author | Paldre, Leho |
dc.contributor.author | Petkevič, Vladimír |
dc.contributor.author | Priest-Dorman, Greg |
dc.contributor.author | Simov, Kiril |
dc.contributor.author | Sinapova, Lydia |
dc.contributor.author | Sokolovsky, Paul |
dc.contributor.author | Sryvkin, Sergey |
dc.contributor.author | Tufiş, Dan |
dc.contributor.author | Utka, Andrius |
dc.contributor.author | Villandi, Viire |
dc.contributor.author | Vitas, Duško |
dc.contributor.author | Vuković, Olga |
dc.date.accessioned | 2015-06-15T08:56:08Z |
dc.date.available | 2015-06-15T08:56:08Z |
dc.date.issued | 2010-05-14 |
dc.identifier.uri | http://hdl.handle.net/11356/1044 |
dc.description | The novel "1984" by George Orwell is the central component of the MULTEXT-East corpus. This parallel, sentence-aligned corpus contains the novel in the English original (about 100,000 words in length) and its translations into a number of languages. This version of the corpus contains only the structurally annotated texts, which mark up elements such as paragraphs, footnotes, and highlighted text. In terms of linguistic annotation, the texts mark names and sentences. The linguistically annotated texts are a separate submission (http://hdl.handle.net/11356/1043), which covers a somewhat different set of languages. |
dc.language.iso | bul |
dc.language.iso | ces |
dc.language.iso | eng |
dc.language.iso | est |
dc.language.iso | hun |
dc.language.iso | lit |
dc.language.iso | ron |
dc.language.iso | rus |
dc.language.iso | slv |
dc.language.iso | srp |
dc.publisher | Jožef Stefan Institute |
dc.relation | info:eu-repo/grantAgreement/EC/FP7/211938 |
dc.relation.isreferencedby | https://doi.org/10.1007/s10579-011-9174-8 |
dc.rights | Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-sa/4.0/ |
dc.rights.label | PUB |
dc.source.uri | http://nl.ijs.si/ME/Vault/V4/ |
dc.subject | parallel corpus |
dc.subject | multilingual |
dc.subject | TEI |
dc.title | MULTEXT-East "1984" document corpus 4.0 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | CLARIN.SI data & tools |
demo.uri | http://nl.ijs.si/ME/Vault/V4/doc/#sec-orwell |
contact.person | Tomaž Erjavec tomaz.erjavec@ijs.si Jožef Stefan Institute |
sponsor | EU Copernicus COP-106 MULTEXT-East: Multilingual Text Tools and Corpora for Central and Eastern European Languages Other |
sponsor | EU Copernicus TELRI Trans-European Language Resources Infrastructure Other |
sponsor | EU Copernicus CONCEDE Consortium for Central European Dictionary Encoding Other |
sponsor | FP7 Capacities MONDILEX Conceptual Modelling of Networking of Centres for High-Quality Research in Slavic Lexicography and Their Digital Resources euFunds info:eu-repo/grantAgreement/EC/FP7/211938 |
size.info | 10 texts |
size.info | 66500 sentences |
files.count | 1 |
files.size | 4848274 |
Files in this item
This item is Publicly Available and licensed under: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)





- Name: MTE1984-doc.zip
- Size: 4.62 MB
- Format: application/zip
- Description: TEI encoded texts and sentence alignments
- MD5: e9ffd235931ad1ad45916fb2245dc79d
- MTE1984-doc
  - oalg-bgen.xml (497 kB)
  - oalg-rosl.xml (488 kB)
  - oalg-etsl.xml (492 kB)
  - oalg-ensl.xml (498 kB)
  - orwl-et.xml (1007 kB)
  - orwl-bg.xml (1 MB)
  - oalg-cslt.xml (498 kB)
  - oalg-csen.xml (500 kB)
  - oalg-bgsr.xml (497 kB)
  - orwl-ro.xml (1 MB)
  - mte-cesdoc.xml (8 kB)
  - oalg-enlt.xml (499 kB)
  - oalg-etlt.xml (492 kB)
  - oalg-cssr.xml (498 kB)
  - oalg-bgcsenethultroslsr.xml (1 MB)
  - oalg-ltro.xml (488 kB)
  - mte-oalg.xml (7 kB)
  - oalg-husl.xml (500 kB)
  - orwl-cs.xml (965 kB)
  - orwl-hu.xml (1 MB)
  - oalg-bget.xml (491 kB)
  - oalg-rosr.xml (487 kB)
  - oalg-ensr.xml (499 kB)
  - oalg-etsr.xml (492 kB)
  - 00README.txt (716 B)
  - oalg-bgro.xml (487 kB)
  - orwl-sl.xml (912 kB)
  - oalg-slsr.xml (498 kB)
  - oalg-cset.xml (493 kB)
  - oalg-hult.xml (499 kB)
  - orwl-ru.xml (1 MB)
  - oalg-csro.xml (488 kB)
  - oalg-bgcs.xml (498 kB)
  - oalg-enet.xml (483 kB)
  - oalg-ltsl.xml (498 kB)
  - oalg-bghu.xml (499 kB)
  - schema
    - tei_mte.xsd (287 kB)
    - tei_mte.rng (236 kB)
    - tei_mte.rnc (117 kB)
    - xml.xsd (1 kB)
    - tei_mte.dtd (104 kB)
    - tei_mte_schema.xml (7 kB)
  - oalg-husr.xml (499 kB)
  - orwl-lt.xml (893 kB)
  - oalg-enro.xml (489 kB)
  - orwl-en.xml (981 kB)
  - oalg-etro.xml (483 kB)
  - oalg-cshu.xml (500 kB)
  - oalg-bgsl.xml (496 kB)
  - orwl-sr.xml (853 kB)
  - oalg-enhu.xml (501 kB)
  - oalg-cssl.xml (498 kB)
  - oalg-ethu.xml (494 kB)
  - oalg-huro.xml (490 kB)
  - oalg-ltsr.xml (497 kB)
  - oalg-bglt.xml (497 kB)
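
The MD5 value listed for MTE1984-doc.zip can be used to check a downloaded copy of the archive. The following is a minimal sketch in Python, assuming the archive has been saved in the current directory under its listed name. The helper name `md5_of_file` and the local path are illustrative assumptions and are not part of this record.

```python
import hashlib
import os

# MD5 value as listed in this record for MTE1984-doc.zip.
EXPECTED_MD5 = "e9ffd235931ad1ad45916fb2245dc79d"

# Assumption: the archive was downloaded to the current directory.
ARCHIVE_PATH = "MTE1984-doc.zip"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large files are not loaded
    into memory all at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    if not os.path.exists(ARCHIVE_PATH):
        print(f"{ARCHIVE_PATH} not found; download it first.")
    elif md5_of_file(ARCHIVE_PATH) == EXPECTED_MD5:
        print("Checksum matches the value listed in the record.")
    else:
        print("Checksum mismatch; the download may be incomplete or altered.")
```

A mismatch most often indicates a truncated download, in which case fetching the archive again is the usual remedy.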