What's New

corpus
Description:
The Montenegrin web corpus meWaC was built by crawling the .me top-level domain in 2019. The corpus was near-deduplicated at the paragraph level, normalised via transliteration into the Latin script, and morphosyntactically ...
 This item contains 2 files (2.47 GB).
 
Publicly available. Distributed under Creative Commons: Attribution Required, Share Alike.

Most Viewed Items

Top Last Week
corpus
Description:
DGT-UD is a 2-billion-word, 23-language, parallel, syntactically parsed corpus. It consists of the JRC DGT translation memory of European law, automatically annotated with UDPipe 1.2 (http://ufal.mff.cuni.cz/udpipe) using ...
 This item contains 24 files (24.42 GB).
 
Publicly available. Distributed under Creative Commons: Attribution Required.