dc.contributor.author Erjavec, Tomaž
dc.contributor.author Bruda, Ştefan
dc.contributor.author Dimitrova, Ludmila
dc.contributor.author Ide, Nancy
dc.contributor.author Kaalep, Heiki-Jaan
dc.contributor.author Krstev, Cvetana
dc.contributor.author Orav, Heili
dc.contributor.author Oravecz, Csaba
dc.contributor.author Paldre, Leho
dc.contributor.author Petkevič, Vladimír
dc.contributor.author Priest-Dorman, Greg
dc.contributor.author Simov, Kiril
dc.contributor.author Sinapova, Lydia
dc.contributor.author Sokolovsky, Paul
dc.contributor.author Sryvkin, Sergey
dc.contributor.author Tufiş, Dan
dc.contributor.author Utka, Andrius
dc.contributor.author Villandi, Viire
dc.contributor.author Vitas, Duško
dc.contributor.author Vuković, Olga
dc.date.accessioned 2015-06-15T08:56:08Z
dc.date.available 2015-06-15T08:56:08Z
dc.date.issued 2010-05-14
dc.identifier.uri http://hdl.handle.net/11356/1044
dc.description The novel "1984" by George Orwell is the central component of the MULTEXT-East corpus. This parallel, sentence-aligned corpus contains the novel in the English original (about 100,000 words in length) and its translations into a number of languages. This version of the corpus contains only structurally annotated texts, with elements such as the paragraph, the footnote, and highlighted text. In terms of linguistic annotation, the texts contain names and sentences. The linguistically annotated texts are a separate submission (http://hdl.handle.net/11356/1043), which covers a somewhat different set of languages.
dc.language.iso bul
dc.language.iso ces
dc.language.iso eng
dc.language.iso est
dc.language.iso hun
dc.language.iso lit
dc.language.iso ron
dc.language.iso rus
dc.language.iso slv
dc.language.iso srp
dc.publisher Jožef Stefan Institute
dc.relation info:eu-repo/grantAgreement/EC/FP7/211938
dc.relation.isreferencedby https://doi.org/10.1007/s10579-011-9174-8
dc.rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-nc-sa/4.0/
dc.rights.label PUB
dc.source.uri http://nl.ijs.si/ME/Vault/V4/
dc.subject parallel corpus
dc.subject multilingual
dc.subject TEI
dc.title MULTEXT-East "1984" document corpus 4.0
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
demo.uri http://nl.ijs.si/ME/Vault/V4/doc/#sec-orwell
contact.person Tomaž Erjavec tomaz.erjavec@ijs.si Jožef Stefan Institute
sponsor EU Copernicus COP-106 MULTEXT-East: Multilingual Text Tools and Corpora for Central and Eastern European Languages Other
sponsor EU Copernicus TELRI Trans-European Language Resources Infrastructure Other
sponsor EU Copernicus CONCEDE Consortium for Central European Dictionary Encoding Other
sponsor FP7 Capacities MONDILEX Conceptual Modelling of Networking of Centres for High-Quality Research in Slavic Lexicography and Their Digital Resources euFunds info:eu-repo/grantAgreement/EC/FP7/211938
size.info 10 texts
size.info 66500 sentences
files.count 1
files.size 4848274


 Files in this item

Name
MTE1984-doc.zip
Size
4.62 MB
Format
application/zip
Description
TEI encoded texts and sentence alignments
MD5
e9ffd235931ad1ad45916fb2245dc79d
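
The size and MD5 values listed in this record can be used to check a downloaded copy of the archive. The following is a minimal sketch, assuming the file has been saved locally under its listed name `MTE1984-doc.zip`; the helper names `md5_of_file` and `matches_record` are illustrative and not part of the repository.

```python
import hashlib
from pathlib import Path

# Values copied from this record (files.size and MD5).
EXPECTED_MD5 = "e9ffd235931ad1ad45916fb2245dc79d"
EXPECTED_SIZE = 4848274  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Check that both the byte size and the MD5 digest match the expected values."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    archive = Path("MTE1984-doc.zip")
    if archive.exists():
        print("OK" if matches_record(archive) else "MISMATCH")
    else:
        print(f"{archive} not found")
```

MD5 serves here only as an integrity check against transfer errors, since that is the checksum the record publishes.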
 File Preview  
  • MTE1984-doc
    • oalg-bgen.xml (497 kB)
    • oalg-rosl.xml (488 kB)
    • oalg-etsl.xml (492 kB)
    • oalg-ensl.xml (498 kB)
    • orwl-et.xml (1007 kB)
    • orwl-bg.xml (1 MB)
    • oalg-cslt.xml (498 kB)
    • oalg-csen.xml (500 kB)
    • oalg-bgsr.xml (497 kB)
    • orwl-ro.xml (1 MB)
    • mte-cesdoc.xml (8 kB)
    • oalg-enlt.xml (499 kB)
    • oalg-etlt.xml (492 kB)
    • oalg-cssr.xml (498 kB)
    • oalg-bgcsenethultroslsr.xml (1 MB)
    • oalg-ltro.xml (488 kB)
    • mte-oalg.xml (7 kB)
    • oalg-husl.xml (500 kB)
    • orwl-cs.xml (965 kB)
    • orwl-hu.xml (1 MB)
    • oalg-bget.xml (491 kB)
    • oalg-rosr.xml (487 kB)
    • oalg-ensr.xml (499 kB)
    • oalg-etsr.xml (492 kB)
    • 00README.txt (716 B)
    • oalg-bgro.xml (487 kB)
    • orwl-sl.xml (912 kB)
    • oalg-slsr.xml (498 kB)
    • oalg-cset.xml (493 kB)
    • oalg-hult.xml (499 kB)
    • orwl-ru.xml (1 MB)
    • oalg-csro.xml (488 kB)
    • oalg-bgcs.xml (498 kB)
    • oalg-enet.xml (483 kB)
    • oalg-ltsl.xml (498 kB)
    • oalg-bghu.xml (499 kB)
    • schema
      • tei_mte.xsd (287 kB)
      • tei_mte.rng (236 kB)
      • tei_mte.rnc (117 kB)
      • xml.xsd (1 kB)
      • tei_mte.dtd (104 kB)
      • tei_mte_schema.xml (7 kB)
    • oalg-husr.xml (499 kB)
    • orwl-lt.xml (893 kB)
    • oalg-enro.xml (489 kB)
    • orwl-en.xml (981 kB)
    • oalg-etro.xml (483 kB)
    • oalg-cshu.xml (500 kB)
    • oalg-bgsl.xml (496 kB)
    • orwl-sr.xml (853 kB)
    • oalg-enhu.xml (501 kB)
    • oalg-cssl.xml (498 kB)
    • oalg-ethu.xml (494 kB)
    • oalg-huro.xml (490 kB)
    • oalg-ltsr.xml (497 kB)
    • oalg-bglt.xml (497 kB)
