Files in this item

Download all files in this item (16.66 MB)
Name
imp_ngrams_word_1-5.zip
Size
3.01 MB
Format
application/zip
Description
1- to 5-grams of words (historical spelling) excluding punctuation. The minimum frequency threshold is 5.
MD5
9f5ef25bde132c88d881ac3f9302d554
File preview
    • sorted_cut-5_imp_word_c-no_n-5_t-1.txt (90 kB)
    • sorted_cut-5_imp_word_c-no_n-4_t-1.txt (555 kB)
    • sorted_cut-5_imp_word_c-no_n-2_t-1.txt (3 MB)
    • sorted_cut-5_imp_word_c-no_n-3_t-1.txt (2 MB)
    • sorted_cut-5_imp_word_c-no_n-1_t-1.txt (1 MB)
Name
imp_ngrams_norm_1-5.zip
Size
3 MB
Format
application/zip
Description
1- to 5-grams of normalized words (modernised spelling) excluding punctuation. The minimum frequency threshold is 5.
MD5
7e673805f64b452c63071fe25a7d4635
File preview
    • sorted_cut-5_imp_lc_c-no_n-5_t-1.txt (121 kB)
    • sorted_cut-5_imp_lc_c-no_n-3_t-1.txt (2 MB)
    • sorted_cut-5_imp_lc_c-no_n-4_t-1.txt (698 kB)
    • sorted_cut-5_imp_lc_c-no_n-2_t-1.txt (3 MB)
    • sorted_cut-5_imp_lc_c-no_n-1_t-1.txt (1 MB)
Name
imp_ngrams_word-norm-lemma-tag_1-5.zip
Size
8.61 MB
Format
application/zip
Description
1- to 5-grams of words with normalized form, lemma, and morphosyntactic tag, including punctuation. The minimum frequency threshold is 5.
MD5
e0f78e9be4bc742c92c0d17e2d556f48
File preview
    • sorted_cut-5_imp_word-lc-lemma-tag_c-yes_n-4_t-1.txt (5 MB)
    • sorted_cut-5_imp_word-lc-lemma-tag_c-yes_n-5_t-1.txt (1 MB)
    • sorted_cut-5_imp_word-lc-lemma-tag_c-yes_n-3_t-1.txt (11 MB)
    • sorted_cut-5_imp_word-lc-lemma-tag_c-yes_n-2_t-1.txt (15 MB)
    • sorted_cut-5_imp_word-lc-lemma-tag_c-yes_n-1_t-1.txt (4 MB)
Name
imp_AFL_norm_1-5_min5M.zip
Size
2.04 MB
Format
application/zip
Description
Adjusted frequency list for 1- to 5-grams of normalized words (modernised spelling) excluding punctuation. The minimum relative frequency threshold for substring reduction is 5. Column 1: n-gram; column 2: length of n-gram; column 3: adjusted corpus frequency.
MD5
79c94b6f4f462c7797e52724165087b7
File preview
    • sorted_cut-1_AFL_imp_lc_c-no_n-5_t-75.txt (5 MB)
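The three-column layout described for the adjusted frequency list (n-gram, n-gram length, adjusted corpus frequency) can be read with a short parser. This is only a sketch under assumptions the listing does not confirm: that columns are tab-separated, that the file is UTF-8 text, and that the adjusted frequency parses as a number. The function name `parse_afl_lines` is hypothetical.

```python
def parse_afl_lines(lines, sep="\t"):
    """Yield (ngram, length, adjusted_frequency) tuples from AFL-style lines.

    Assumption: columns are separated by `sep` (a tab by default); the
    listing does not state the delimiter.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # Split from the right so spaces inside the n-gram are preserved.
        ngram, length, freq = line.rsplit(sep, 2)
        yield ngram, int(length), float(freq)
```

Once the archive is extracted, the file can be streamed with `open("sorted_cut-1_AFL_imp_lc_c-no_n-5_t-75.txt", encoding="utf-8")` and passed directly to `parse_afl_lines`, since a file object iterates over its lines. If the actual delimiter differs, pass it as `sep`.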