EU DGT-UD: Dutch
JRC EU DGT Translation Memory (2016) annotated with UD-Pipe: Dutch part
Counts |
Tokens | 96670999 |
Words | 81069362 |
Sentences | 5158569 |
Documents | 36071 |
General info |
Corpus description |
Document |
Language | Dutch |
Encoding | UTF-8 |
Compiled | 08/15/2018 09:59:18 |
Tagset |
Description |
Lexicon sizes |
word | 858678 |
lempos | 937993 |
tag | 323 |
pos | 16 |
feats | 250 |
deprel | 32 |
head_word | 608844 |
head_lempos | 633465 |
head_tag | 321 |
head_pos | 17 |
head_feats | 249 |
lc | 756179 |
lemma | 826925 |
lemma_lc | 742232 |
head_lc | 547103 |
head_lemma | 576962 |
head_lemma_lc | 527405 |
Tags legend |
Noun | NOUN.* |
Noun proper | PROPN.* |
Verb | VERB.* |
Adjective | ADJ.* |
Pronoun | PRON.* |
Adverb | ADV.* |
Adposition | ADP.* |
Coord_Conjunction | CCONJ.* |
Subord_Conjunction | SCONJ.* |
Numeral | NUM.* |
Particle | PART.* |
Determiner | DET.* |
Interjection | INTJ.* |
Symbol | SYM.* |
Residual | X.* |
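The legend entries read as regular expressions over tag values: a label applies when the tag starts with the given part-of-speech prefix. A minimal Python sketch of that lookup follows, assuming the patterns must match the whole tag string. The function name `legend_label` is hypothetical, the adjective pattern is written in upper case to follow the Universal Dependencies tag ADJ, and the sample tags in the usage lines are illustrative rather than taken from the corpus.

```python
import re
from typing import Optional

# Label -> pattern pairs, as listed in the tags legend.
# Assumption: the adjective pattern is upper-case (UD tag ADJ).
TAG_LEGEND = {
    "Noun": "NOUN.*",
    "Noun proper": "PROPN.*",
    "Verb": "VERB.*",
    "Adjective": "ADJ.*",
    "Pronoun": "PRON.*",
    "Adverb": "ADV.*",
    "Adposition": "ADP.*",
    "Coord_Conjunction": "CCONJ.*",
    "Subord_Conjunction": "SCONJ.*",
    "Numeral": "NUM.*",
    "Particle": "PART.*",
    "Determiner": "DET.*",
    "Interjection": "INTJ.*",
    "Symbol": "SYM.*",
    "Residual": "X.*",
}


def legend_label(tag: str) -> Optional[str]:
    """Return the first legend label whose pattern matches the whole tag."""
    for label, pattern in TAG_LEGEND.items():
        if re.fullmatch(pattern, tag):
            return label
    return None  # tag is not covered by the legend


# Illustrative tag strings (not drawn from the corpus itself).
for sample in ["NOUN", "PROPN", "ADJ", "PUNCT"]:
    print(sample, "->", legend_label(sample))
```

Tags outside the legend, such as the UD tags AUX and PUNCT, return `None` in this sketch.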
Structures and attributes